{ "cells": [ { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Permanent library" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "As we saw in the previous chapter, SAS must import data, whatever their format, before the user can run the desired analyses. To be more productive, however, it is simpler to store these data somewhere so that we do not have to repeat the import every time we need a data set, especially once we have modified it, for example by adding variables or running time-consuming computations." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "SAS lets us save these data in what is called a **library** (_**bibliothèque**_ in French)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "But first, let us recall how data are organized in SAS. Below is a parallel with what we are used to seeing in spreadsheets such as Microsoft Excel or [LibreOffice](https://www.libreoffice.org/discover/calc/) Calc:" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "\n", "\n", "\n", "\n", "\n", " \n", " \n", " \n", "
| Excel (or other spreadsheet) | SAS |
|---|---|
| sheet | data set |
| column | variable |
| row | observation |
" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Recall that we created a data set with a `DATA` step as follows:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "filename mesDos 'data/emprunt_bancaire.csv';\n", "data loan;\n", " infile mesDos dsd\n", " firstobs=2;\n", " input Loan_ID $ loan_status $ Principal terms age education $ Gender $;\n", "run;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "SAS actually saves the data set as `work.loan`. The `work` prefix before `loan` is the library reference (_libref_) of the temporary library that SAS creates by default. SAS places every data set there when the user does not specify another library.\n", "\n", "In fact, in all the examples so far, we created data sets in the temporary library named `WORK`. You can check this by printing the data as follows:\n" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "SAS Output\n", "\n", "\n", "\n", "
\n", "
\n", "

The SAS System

\n", "
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
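| Loan_ID | loan_status | Principal | terms | age | education | Gender |
|---|---|---|---|---|---|---|
| xqd20166 | PAIDOFF | 1000 | 30 | 45 | High Sch | male |
| xqd20168 | PAIDOFF | 500 | 30 | 50 | Bechalor | female |
| xqd20160 | PAIDOFF | 1000 | 30 | 33 | Bechalor | female |
| xqd20160 | PAIDOFF | 895 | 15 | 27 | college | male |
| xqd20160 | PAIDOFF | 1000 | 30 | 28 | college | female |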
\n", "
\n", "
\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "proc print data=work.loan noobs;\n", "run;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that here we write `work.` explicitly, even though we did not specify it when we created the data set." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## How do we create a permanent library?" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "We create a library by giving its name in the `LIBNAME` statement and specifying the directory in which we want to store it.\n", "\n", "The library name must not exceed 8 characters and must follow the SAS naming rules: it starts with a letter or an underscore and contains only letters, digits, and underscores." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Let us work through an example in which I create a permanent library called `act3035` in the `data` directory. I could just as well have placed the permanent library in another directory, such as `repertoire/sous_repertoire/data`." ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "SAS Output\n", "\n", "\n", "\n", "
\n", "
\n", "

The SAS System

\n", "
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
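| code_perm | examen1 | examen2 | examen3 |
|---|---|---|---|
| Diane Rossi | 67 | 68 | 70 |
| Stéphanie de la Gonzalez | 72 | 81 | 87 |
| Thierry Lacroix | 73 | 87 | 86 |
| Jérôme Blanc-Moreno | 73 | 66 | 77 |
| Danielle Le Guyon | 81 | 73 | 84 |
| Alfred Adam | 73 | 79 | 82 |
| Chantal Mahe | 75 | 77 | 69 |
| Eugène Dupont | 81 | 80 | 83 |
| Bertrand Pages Le Didier | 86 | 77 | 75 |
| Guy Gillet de la Valette | 78 | 69 | 80 |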
\n", "
\n", "
\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "libname act3035 'data'; /* repertoire/sous_repertoire/data */\n", "data act3035.notes_examens;\n", "infile datalines dsd;\n", " length code_perm $ 50;\n", " input code_perm $ examen1-examen3;\n", " datalines;\n", " \"Diane Rossi\",67,68,70\n", " \"Stéphanie de la Gonzalez\",72,81,87\n", " \"Thierry Lacroix\",73,87,86\n", " \"Jérôme Blanc-Moreno\",73,66,77\n", " \"Danielle Le Guyon\",81,73,84\n", " \"Alfred Adam\",73,79,82\n", " \"Chantal Mahe\",75,77,69\n", " \"Eugène Dupont\",81,80,83\n", " \"Bertrand Pages Le Didier\",86,77,75\n", " \"Guy Gillet de la Valette\",78,69,80\n", " ;\n", "proc print data=act3035.notes_examens noobs;\n", "run;" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "As explained in class, one advantage of getting into the habit of working with permanent libraries is that we do not have to redo the [ETL](https://www.sas.com/en_us/insights/data-management/what-is-etl.html) work (extracting the data, transforming them, and loading them).\n", "\n", "Moreover, with one or more **small** data sets, working with temporary libraries is not much of a penalty, and we can easily get by with the memory of our computers. However, as soon as we have larger databases, as is the case in practice, it is well worth creating permanent libraries and storing our data sets there. " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Describing the library" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "We can view a description of the data sets stored in the permanent libraries we have created with `PROC CONTENTS`."
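, "\n", "\n", "
Before describing a data set, it can help to confirm that the libref is still assigned in the current session. The lines below are a minimal sketch, assuming the libref `act3035` assigned by the `LIBNAME` statement earlier; the labels after `NOTE:` are only illustrative.

```sas
/* Sketch only: assumes the libref act3035 from the earlier LIBNAME statement. */

/* LIBREF returns 0 when the libref is currently assigned. */
%put NOTE: libref status = %sysfunc(libref(act3035));

/* PATHNAME returns the physical location behind the libref. */
%put NOTE: physical path = %sysfunc(pathname(act3035));

/* Write the attributes of the library (engine, physical name) to the log. */
libname act3035 list;
```

All three statements write to the SAS log only. A nonzero status means the `LIBNAME` statement has to be submitted again before the data sets in the library can be used.
"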
] }, { "cell_type": "code", "execution_count": 7, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "SAS Output\n", "\n", "\n", "\n", "
\n", "
\n", "

Listing All the SAS Data Sets in a Library

\n", "
\n", "
\n", "

The CONTENTS Procedure

\n", "
\n", "
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
| | | | |
|---|---|---|---|
| Data Set Name | ACT3035.NOTES_EXAMENS | Observations | 10 |
| Member Type | DATA | Variables | 4 |
| Engine | V9 | Indexes | 0 |
| Created | 09/09/2017 18:40:49 | Observation Length | 80 |
| Last Modified | 09/09/2017 18:40:49 | Deleted Observations | 0 |
| Protection | | Compressed | NO |
| Data Set Type | | Sorted | NO |
| Label | | | |
| Data Representation | SOLARIS_X86_64, LINUX_X86_64, ALPHA_TRU64, LINUX_IA64 | | |
| Encoding | utf-8 Unicode (UTF-8) | | |
\n", "
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
| Engine/Host Dependent Information | |
|---|---|
| Data Set Page Size | 65536 |
| Number of Data Set Pages | 1 |
| First Data Page | 1 |
| Max Obs per Page | 817 |
| Obs in First Data Page | 10 |
| Number of Data Set Repairs | 0 |
| Filename | /mnt/hgfs/myfolders/cody/librairies/notes_examens.sas7bdat |
| Release Created | 9.0401M4 |
| Host Created | Linux |
| Inode Number | 2221 |
| Access Permission | rwxrwx--- |
| Owner Name | root |
| File Size | 128KB |
| File Size (bytes) | 131072 |
\n", "
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
Alphabetic List of Variables and Attributes

| # | Variable | Type | Len |
|---|---|---|---|
| 1 | code_perm | Char | 50 |
| 2 | examen1 | Num | 8 |
| 3 | examen2 | Num | 8 |
| 4 | examen3 | Num | 8 |
\n", "
\n", "
\n", "
\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "proc contents data=act3035.notes_examens;\n", "run;" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "We can also describe all the data sets in a library at once by using the `_ALL_` keyword in place of the data set name. Adding the `NODS` option limits the output to the library directory and suppresses the details of each individual data set." ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "SAS Output\n", "\n", "\n", "\n", "
\n", "
\n", "

Listing All the SAS Data Sets in a Library

\n", "
\n", "
\n", "

The CONTENTS Procedure

\n", "
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
| Directory | |
|---|---|
| Libref | ACT3035 |
| Engine | V9 |
| Physical Name | /mnt/hgfs/myfolders/data |
| Filename | /mnt/hgfs/myfolders/data |
| Inode Number | 557 |
| Access Permission | rwxrwx--- |
| Owner Name | root |
| File Size | 1KB |
| File Size (bytes) | 576 |
\n", "
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
| # | Name | Member Type | File Size | Last Modified |
|---|---|---|---|---|
| 1 | NOTES_EXAMENS | DATA | 128KB | 01/16/2018 20:32:37 |
\n", "
\n", "
\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "title \"Listing All the SAS Data Sets in a Library\";\n", "proc contents data=act3035._all_ nods;\n", "run;" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# References\n", "Cody, Ron. \"Learning SAS® by Example: A Programmer's Guide.\" SAS Institute." ] } ], "metadata": { "celltoolbar": "Slideshow", "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.13" }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": true, "sideBar": true, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": true, "toc_position": {}, "toc_section_display": true, "toc_window_display": true } }, "nbformat": 4, "nbformat_minor": 4 }