{ "cells": [ { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# PROC SQL Partie_2" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Segmentation dans une variable" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Avec `proc SQL`, il est très facile de créer des catégories dans une table. Contrairement au _data step_, nous ne sommes plus obligés de conditionneur avec des `if`. Il suffit d'inclure toutes les conditions dans la clause `case`.\n", "\n", "* Dans l'exemple suivant, nous allons diviser nos assurés en trois groupes d'âge;\n", " 1. age $<$ 40 : EncoreJeune\n", " 2. age $\\leq$40 $<$ 60 : Pas Vieux\n", " 3. age $>$ 40 : Commence Vieux\n", "\n", "* Nous allons aussi afficher seulement la somme des coûts de sinistre\n", "* Finalement, pour montrer qu'on peut ajouter une variable qui est basée sur une variable nouvellement calculée dans la même requête, nous allons calculer le coût de sinistre avec un supplément de 2% sur 5 ans (une sorte d'inflation comme exemple) " ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "SAS Output\n", "\n", "\n", "\n", "
\n", "
\n", "
\n", "
\n", "

Créer des catégories

\n", "
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
numeropollanguetype_proftype_territoirecoutTotalcoutTotalInflaagecategorieAge
1FAutreSemi-urbain..50Pas Vieux
1FAutreSemi-urbain..50Pas Vieux
1FAutreSemi-urbain$243.86$269.2450Pas Vieux
5FTechnicienRural..40Pas Vieux
5FTechnicienRural..40Pas Vieux
\n", "
\n", "
\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "libname libSql \"data\";\n", "proc sql outobs=5;\n", " title 'Créer des catégories';\n", " select numeropol,\n", " langue,\n", " type_prof,\n", " type_territoire, \n", " sum(cout1, cout2, cout3,cout4,cout5,cout6,cout7)as coutTotal format=dollar15.2, \n", " calculated CoutTotal*(1.02**5) as coutTotalInfla format=dollar15.2, \n", " 2005 - year(date_naissance) as age,\n", " case \n", " when calculated age < 40 then 'EncoreJeune'\n", " when calculated age ge 40 and calculated age < 60 then 'Pas Vieux'\n", " when calculated age ge 60 then 'Commence Vieux'\n", " end as categorieAge\n", " from libSql.equipe_1\n", "quit; " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Triage des données" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Avec la procédure `PROC SQL`, le triage de données se fait facilement avec `ORDER BY`" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Trier par une colonne" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "SAS Output\n", "\n", "\n", "\n", "
\n", "
\n", "
\n", "
\n", "

données triées par le nombre de sinistre

\n", "
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
numeropolalimentationnbsin
43808Végétarien0
81812Végétarien0
22456Carnivore0
54340Carnivore0
31956Végétalien0
\n", "
\n", "
\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "proc sql outobs=5;\n", " title 'données triées par le nombre de sinistre';\n", " select numeropol,alimentation ,nbsin\n", " from libSql.equipe_1\n", " order by nbsin;\n", "quit; " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Trier par de multiples colonnes" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "SAS Output\n", "\n", "\n", "\n", "
\n", "
\n", "

données triées par nombre de sinistre et alimentation

\n", "
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
numeropolalimentationnbsin
85246Carnivore0
88270Carnivore0
82031Carnivore0
88695Carnivore0
81449Carnivore0
\n", "
\n", "
\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "proc sql outobs=5;\n", " title 'données triées par nombre de sinistre et alimentation';\n", " select numeropol,alimentation ,nbsin\n", " from libSql.equipe_1\n", " order by nbsin,alimentation;\n", "quit; " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Triage en ordre croissant ou décroissant" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "SAS Output\n", "\n", "\n", "\n", "
\n", "
\n", "

Table génerale contenant toutes les informations

\n", "
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
numeropolalimentationnbsin
18824Carnivore5
42279Végétarien5
37423Carnivore4
70558Carnivore4
55240Carnivore4
\n", "
\n", "
\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "proc sql outobs=5;\n", " title 'Table génerale contenant toutes les informations';\n", " select numeropol,alimentation ,nbsin\n", " from libSql.equipe_1\n", " order by nbsin desc,alimentation;\n", "quit; " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "L'ordre peut être dans n'importe quel ordre" ] }, { "cell_type": "code", "execution_count": 25, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "SAS Output\n", "\n", "\n", "\n", "
\n", "
\n", "

Table génerale contenant toutes les informations

\n", "
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
numeropolalimentationnbsin
18824Carnivore5
42279Végétarien5
390Carnivore4
7754Carnivore4
8747Végétarien4
\n", "
\n", "
\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "proc sql outobs=5;\n", " select numeropol,alimentation ,nbsin\n", " from libSql.equipe_1\n", " order by nbsin desc,numeropol asc;\n", "quit; \n", "title;" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Trier sur des colonnes avec valeur calculée\n", "On peut aussi trier sur des variables calculées! c'est formidable! réfléchissez bien lorsqu'on programmait jadis en `vba`. Calculons l'âge de nos assurés et faisons un tri sur cette variable calculée." ] }, { "cell_type": "code", "execution_count": 43, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "SAS Output\n", "\n", "\n", "\n", "
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
numeropolalimentationnbsinage
18824Carnivore536
42279Végétarien550
43163Végétalien430
68935Végétarien431
73779Végétarien434
\n", "
\n", "
\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 43, "metadata": {}, "output_type": "execute_result" } ], "source": [ "proc sql outobs=5;\n", " select numeropol,alimentation ,nbsin , \n", " 2005 - year(date_naissance) as age \n", " from libSql.equipe_1\n", " order by nbsin desc, age asc;\n", "quit; " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Au lieu de spécifier le nom de la colonne sur laquelle nous voulons appliquer un tri, nous pouvons le faire en indiquant le numéro de la colonne (position de la colonne)" ] }, { "cell_type": "code", "execution_count": 26, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "SAS Output\n", "\n", "\n", "\n", "
\n", "
\n", "

Créer des catégories

\n", "
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
numeropolalimentationnbsinage
18824Carnivore536
42279Végétarien550
43163Végétalien430
68935Végétarien431
73779Végétarien434
\n", "
\n", "
\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "proc sql outobs=5;\n", " title 'Créer des catégories';\n", " select numeropol,alimentation ,nbsin , \n", " 2005 - year(date_naissance) as age \n", " from libSql.equipe_1\n", " order by 3 desc, 4 asc;\n", "quit; " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Trier sur des colonnes non sélectionnées\n", "\n", "Il est aussi possible de faire un triage sur des colonnes non sélectionnées, par exemple ici, nous sélectionnons seulement `numeropol, alimentation ,nbsin , age` mais nous appliquons un tri sur type_territoire. " ] }, { "cell_type": "code", "execution_count": 27, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "SAS Output\n", "\n", "\n", "\n", "
\n", "
\n", "

Créer des catégories

\n", "
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
numeropolalimentationnbsinage
68406Carnivore184
73804Carnivore036
67988Végétalien030
49293Végétalien140
49083Carnivore043
\n", "
\n", "
\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "proc sql outobs=5;\n", " select numeropol,alimentation ,nbsin , \n", " 2005 - year(date_naissance) as age \n", " from libSql.equipe_1\n", " order by type_territoire desc;\n", "quit; \n", "title;" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# La sélection d'observation" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "## Sélectionner des observations selon leurs valeurs\n", "\n", "Le tableau suivant présente un sommaire des comparateurs logiques. ([Lien vers la documentation SAS](https://documentation.sas.com/?docsetId=lrcon&docsetTarget=p00iah2thp63bmn1lt20esag14lh.htm&docsetVersion=9.4&locale=en))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Comparison Operators
\n", "

Symbol

\n", "
\n", "

Mnemonic Equivalent

\n", "
\n", "

Definition

\n", "
\n", "

Example

\n", "
\n", "

=

\n", "
\n", "

EQ

\n", "
\n", "

equal to

\n", "
\n", "
a=3
\n", "
\n", "

^=

\n", "
\n", "

NE

\n", "
\n", "

not equal to1

\n", "
\n", "
a\n",
    "ne 3
\n", "
\n", "

¬=

\n", "
\n", "

NE

\n", "
\n", "

not equal to

\n", "
\n", "

\n", "
\n", "

~=

\n", "
\n", "

NE

\n", "
\n", "

not equal to

\n", "
\n", "

\n", "
\n", "

>

\n", "
\n", "

GT

\n", "
\n", "

greater than

\n", "
\n", "
num>5
\n", "
\n", "

<

\n", "
\n", "

LT

\n", "
\n", "

less than

\n", "
\n", "
num<8
\n", "
\n", "

>=

\n", "
\n", "

GE

\n", "
\n", "

greater than or equal\n", " to2

\n", "
\n", "
sales>=300
\n", "
\n", "

<=

\n", "
\n", "

LE

\n", "
\n", "

less than or equal to3

\n", "
\n", "
sales<=100
\n", "
\n", "

\n", "
\n", "

IN

\n", "
\n", "

equal to one of a list

\n", "
\n", "
num\n",
    "in (3, 4, 5)
\n", "
1 The symbol that you use\n", " for NE depends on your personal computer.
2 The symbol => is also\n", " accepted for compatibility with previous releases of SAS. It is not\n", " supported in WHERE clauses or in PROC SQL.
3 The symbol =< is also\n", " accepted for compatibility with previous releases of SAS. It is not\n", " supported in WHERE clauses or in PROC SQL.
" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Par exemple, sélectionnons les assurés ayant plus que 3 sinistres" ] }, { "cell_type": "code", "execution_count": 28, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "SAS Output\n", "\n", "\n", "\n", "
\n", "
\n", "

Créer des catégories

\n", "
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
numeropolalimentationnbsinage
80730Végétalien435
74259Végétarien447
73779Végétarien434
72873Carnivore447
72524Végétarien439
\n", "
\n", "
\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "proc sql outobs=5;\n", " title 'Créer des catégories';\n", " select numeropol,alimentation ,nbsin , \n", " 2005 - year(date_naissance) as age \n", " from libSql.equipe_1\n", " where nbsin gt 3\n", " order by 1 desc;\n", "quit; " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Sélection des observations selon plusieurs conditions\n", "\n", "Sélectionnons des assurés âgés de moins de 40 ans et ayant plus trois sinistres. N'oublions pas que la colonne `age` est une variable calculée!" ] }, { "cell_type": "code", "execution_count": 29, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "SAS Output\n", "\n", "\n", "\n", "
\n", "
\n", "

Créer des catégories

\n", "
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
numeropolalimentationnbsinage
80730Végétalien435
73779Végétarien434
72524Végétarien439
70256Végétarien437
68935Végétarien431
\n", "
\n", "
\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "proc sql outobs=5;\n", " title 'Créer des catégories';\n", " select numeropol,alimentation ,nbsin , \n", " 2005 - year(date_naissance) as age \n", " from libSql.equipe_1\n", " where nbsin gt 3 and calculated age lt 40\n", " order by 1 desc;\n", "quit; " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "## D'autres opérateurs conditionnels\n", "\n", "Avec l'opérateur l'opération `WHERE`, nous pouvons utiliser plusieurs opérateurs;" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "ANY\n", "\n", "specifies that at least one of a set of values obtained from a subquery must satisfy a given condition\n", "\n", "`where Population> any (select Population from sql.countries)`\n", "\n", "ALL\n", "\n", "specifies that all of the values obtained from a subquery must satisfy a given condition\n", "\n", "`where Population > all (select Population from sql.countries)`\n", "\n", "BETWEEN-AND\n", "\n", "tests for values within an inclusive range\n", "\n", "`where Population between 1000000 and 5000000`\n", "\n", "CONTAINS\n", "\n", "tests for values that contain a specified string\n", "\n", "`where Continent contains 'America';`\n", "\n", "EXISTS\n", "\n", "tests for the existence of a set of values obtained from a subquery\n", "\n", "`where exists (select * from sql.oilprod);`\n", "\n", "IN\n", "\n", "tests for values that match one of a list of values\n", "\n", "`where Name in ('Africa', 'Asia');`\n", "\n", "IS NULL or IS MISSING\n", "\n", "tests for missing values\n", "\n", "`where Population is missing;`\n", "\n", "LIKE\n", "\n", "tests for values that match a specified pattern ([footnote 1](#p114vm628bnawan1av42nioe1ule-note1))\n", "\n", "`where Continent like 'A%';`\n", "\n", "\\=\\*\n", "\n", "tests for values that sound like a specified value\n", "\n", "`where Name=* 'Tiland';`" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Exemple d'opérateur `IN`" ] }, { "cell_type": "code", "execution_count": 31, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "SAS Output\n", "\n", "\n", "\n", "
\n", "
\n", "

titre

\n", "
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
numeropollanguetype_profalimentation
5FTechnicienCarnivore
5FTechnicienVégétarien
5FTechnicienCarnivore
5FTechnicienCarnivore
5FTechnicienVégétarien
\n", "
\n", "
\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "proc sql outobs=5;\n", " title \"titre\";\n", " select \n", " numeropol, \n", " langue,\n", " type_prof,\n", " alimentation\n", " from libSql.equipe_1\n", " where type_prof in ('Informaticien','Technicien')\n", " order by numeropol;\n", "quit; " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Exemple d'opérateur `LIKE`" ] }, { "cell_type": "code", "execution_count": 32, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "SAS Output\n", "\n", "\n", "\n", "
\n", "
\n", "

titre

\n", "
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
numeropollanguetype_profalimentation
5FTechnicienCarnivore
5FTechnicienVégétarien
5FTechnicienCarnivore
5FTechnicienCarnivore
5FTechnicienVégétarien
\n", "
\n", "
\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "proc sql outobs=5;\n", " title \"titre\";\n", " select \n", " numeropol, \n", " langue,\n", " type_prof,\n", " alimentation\n", " from libSql.equipe_1\n", " where type_prof like \"%cien\"\n", " order by numeropol;\n", "quit; " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Sommaire des données" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Il est possible d'appliquer plusieurs fonctions [_aggregate_](https://go.documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/sqlproc/p0xwc73ekemkapn1byhceqrwrisb.htm#:~:text=The%20aggregate%20function%20instructs%20PROC,that%20are%20listed%20are%20calculated.) sur des colonnes, comme des moyennes, des MIN/Max...etc. La liste complète est dans le tableau ci-dessous." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "| Function | Definition |\n", "| :--- | :--- |\n", "| AVG, MEAN | mean or average of values |\n", "| COUNT, FREQ, N | number of nonmissing values |\n", "| CSS | corrected sum of squares |\n", "| CV | coefficient of variation (percent) |\n", "| MAX | largest value |\n", "| MIN | smallest value |\n", "| NMISS | number of missing values |\n", "| PRT | probability of a greater absolute value of Student's t |\n", "| RANGE | range of values |\n", "| STD | standard deviation |\n", "| STDERR | standard error of the mean |\n", "| SUM | sum of values |\n", "| SUMWGT | sum of the WEIGHT variable values (footnote 1) |\n", "| VAR | variance |" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "**Exemple**: Nous voudrions avoir un tableau avec contenant le numeropol, debut_pol, fin_pol et **la moyenne des coûts de sinistres 1 à 7**. Toutefois, nous voulons seulement les moyennes supérieures à 1000$" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "SAS Output\n", "\n", "\n", "\n", "
\n", "
\n", "

Coûts moyens des sinistres

\n", "
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
numeropoldebut_polfin_polcoutMoyen
1618/10/199817/10/1999$1,156.14
2916/02/199615/02/1997$5,088.87
2901/06/199915/02/2000$2,255.66
2916/02/200115/02/2002$24,046.30
6507/02/199506/02/1996$3,719.03
\n", "
\n", "
\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "proc sql outobs=5;\n", " title \"Coûts moyens des sinistres\";\n", " select \n", " numeropol, \n", " debut_pol, \n", " fin_pol, \n", " mean(cout1, cout2, cout3,cout4,cout5,cout6,cout7) as coutMoyen format=dollar15.2\n", " from libSql.equipe_1\n", " where calculated coutMoyen gt 1000;\n", "quit;" ] } ], "metadata": { "celltoolbar": "Slideshow", "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.5" }, "name": "_merged", "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": false, "sideBar": true, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": true, "toc_position": {}, "toc_section_display": true, "toc_window_display": true } }, "nbformat": 4, "nbformat_minor": 4 }