{ "cells": [ { "cell_type": "markdown", "id": "ce64f7e8-3ad9-476a-8578-55e8bab3ab9a", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Matrix operations\n" ] }, { "cell_type": "markdown", "id": "5923b87a-bec7-45bf-b019-586afd3ac1cf", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## apply" ] }, { "cell_type": "markdown", "id": "9ec49f99-0af3-4128-b208-bdee4e7679c5", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "\n", "We have seen that operations can be performed on matrices by row or by column. However, not every statistical function has a ready-made row-wise or column-wise version such as `colMeans`. To apply other kinds of functions, we need the `apply` function.\n", "\n", "`apply` is used to apply a computation or an arbitrary function (`FUN`) over the rows or columns of a matrix (including arrays with more than two dimensions).\n", "\n", "Consider a matrix of the first 12 integers:" ] }, { "cell_type": "code", "execution_count": 2, "id": "1eb0d3ad-d207-41b1-b809-eb9b187b91c8", "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\t\n", "\t\n", "\t\n", "\n", "
1 4 7 10
2 5 8 11
3 6 9 12
\n" ], "text/latex": [ "\\begin{tabular}{llll}\n", "\t 1 & 4 & 7 & 10\\\\\n", "\t 2 & 5 & 8 & 11\\\\\n", "\t 3 & 6 & 9 & 12\\\\\n", "\\end{tabular}\n" ], "text/markdown": [ "\n", "| 1 | 4 | 7 | 10 | \n", "| 2 | 5 | 8 | 11 | \n", "| 3 | 6 | 9 | 12 | \n", "\n", "\n" ], "text/plain": [ " [,1] [,2] [,3] [,4]\n", "[1,] 1 4 7 10 \n", "[2,] 2 5 8 11 \n", "[3,] 3 6 9 12 " ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "m <- matrix(1:12, nrow = 3)\n", "m" ] }, { "cell_type": "markdown", "id": "21411475-cfa2-469f-87c7-1c8b472d90c7", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Let's compute the natural logarithm of each element of this matrix:" ] }, { "cell_type": "code", "execution_count": 3, "id": "a177615e-3711-4e1b-81e7-10a864e45e5c", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\t\n", "\t\n", "\t\n", "\n", "
0.00000001.386294 1.945910 2.302585
0.69314721.609438 2.079442 2.397895
1.09861231.791759 2.197225 2.484907
\n" ], "text/latex": [ "\\begin{tabular}{llll}\n", "\t 0.0000000 & 1.386294 & 1.945910 & 2.302585 \\\\\n", "\t 0.6931472 & 1.609438 & 2.079442 & 2.397895 \\\\\n", "\t 1.0986123 & 1.791759 & 2.197225 & 2.484907 \\\\\n", "\\end{tabular}\n" ], "text/markdown": [ "\n", "| 0.0000000 | 1.386294 | 1.945910 | 2.302585 | \n", "| 0.6931472 | 1.609438 | 2.079442 | 2.397895 | \n", "| 1.0986123 | 1.791759 | 2.197225 | 2.484907 | \n", "\n", "\n" ], "text/plain": [ " [,1] [,2] [,3] [,4] \n", "[1,] 0.0000000 1.386294 1.945910 2.302585\n", "[2,] 0.6931472 1.609438 2.079442 2.397895\n", "[3,] 1.0986123 1.791759 2.197225 2.484907" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "h <- apply(m, c(1, 2), log)  # MARGIN = c(1, 2): over rows and columns, i.e. element by element\n", "h" ] }, { "cell_type": "markdown", "id": "ff3bc8c6-8f3c-4305-b6fd-a4dc8974ba2f", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Let's create a `3 x 1` matrix holding the sum of each row:" ] }, { "cell_type": "code", "execution_count": 4, "id": "89e7216a-e22a-449c-82e3-e4e827842392", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\t\n", "\t\n", "\t\n", "\n", "
22
26
30
\n" ], "text/latex": [ "\\begin{tabular}{l}\n", "\t 22\\\\\n", "\t 26\\\\\n", "\t 30\\\\\n", "\\end{tabular}\n" ], "text/markdown": [ "\n", "| 22 | \n", "| 26 | \n", "| 30 | \n", "\n", "\n" ], "text/plain": [ " [,1]\n", "[1,] 22 \n", "[2,] 26 \n", "[3,] 30 " ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "h <- matrix(apply(m, 1, sum))\n", "h" ] }, { "cell_type": "markdown", "id": "a2965308-66cd-4591-b05c-5f9de2431e6b", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Compare this with the `rowSums` function we saw earlier, which returns the same sums as a plain vector:" ] }, { "cell_type": "code", "execution_count": 7, "id": "24fe9579-7416-476c-839e-672adf169d8f", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/html": [ "
    \n", "\t
  1. 22
  2. \n", "\t
  3. 26
  4. \n", "\t
  5. 30
  6. \n", "
\n" ], "text/latex": [ "\\begin{enumerate*}\n", "\\item 22\n", "\\item 26\n", "\\item 30\n", "\\end{enumerate*}\n" ], "text/markdown": [ "1. 22\n", "2. 26\n", "3. 30\n", "\n", "\n" ], "text/plain": [ "[1] 22 26 30" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "rowSums(m)" ] }, { "cell_type": "markdown", "id": "8dd9b1c9-daa1-47e2-bef1-bff8f6d1c059", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## lapply" ] }, { "cell_type": "markdown", "id": "f12f4875-aa81-48a1-999c-3411cb272bfd", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "The `lapply` function applies an arbitrary function (`FUN`) to every element of a vector or list `X` and returns the result as a list." ] }, { "cell_type": "markdown", "id": "f35a1be0-0e1f-4cd2-944c-d69a2b1e7563", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "In the following example, we have a list of three vectors {vecteur_1, vecteur_2, vecteur_3} of different lengths. We want the length of each element, returned as a **list**:" ] }, { "cell_type": "code", "execution_count": 4, "id": "48d83260-b2bd-4dcf-9c3a-6d12e1e2635c", "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/html": [ "
\n", "\t
$vecteur_1
\n", "\t\t
1
\n", "\t
$vecteur_2
\n", "\t\t
17
\n", "\t
$vecteur_3
\n", "\t\t
43
\n", "
\n" ], "text/latex": [ "\\begin{description}\n", "\\item[\\$vecteur\\_1] 1\n", "\\item[\\$vecteur\\_2] 17\n", "\\item[\\$vecteur\\_3] 43\n", "\\end{description}\n" ], "text/markdown": [ "$vecteur_1\n", ": 1\n", "$vecteur_2\n", ": 17\n", "$vecteur_3\n", ": 43\n", "\n", "\n" ], "text/plain": [ "$vecteur_1\n", "[1] 1\n", "\n", "$vecteur_2\n", "[1] 17\n", "\n", "$vecteur_3\n", "[1] 43\n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "x <- list(vecteur_1 = 1, vecteur_2 = 1:17, vecteur_3 = 55:97)\n", "lapply(x, FUN = length)" ] }, { "cell_type": "markdown", "id": "0d772600-3602-4443-910b-cc4283f348a5", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "In the result above, `lapply` returned a list of three elements, each holding the length of the corresponding vector." ] }, { "cell_type": "markdown", "id": "18aa9b71-197b-4ced-9add-5da7ea3c7c44", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Let's look at another example, where we want to draw four random samples of sizes {5, 6, 7, 8} from the vector `x = 1 2 3 4 5 6 7 8 9 10`. Each value in `5:8` is passed to `sample` as its `size` argument, since `x` is already supplied by name:" ] }, { "cell_type": "code", "execution_count": 5, "id": "e509b7cf-c4d5-4c04-b475-995554d5a221", "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/html": [ "
    \n", "\t
    1. \n", "\t
    2. 3
    3. \n", "\t
    4. 8
    5. \n", "\t
    6. 4
    7. \n", "\t
    8. 7
    9. \n", "\t
    10. 6
    11. \n", "
    \n", "
  1. \n", "\t
    1. \n", "\t
    2. 1
    3. \n", "\t
    4. 5
    5. \n", "\t
    6. 8
    7. \n", "\t
    8. 4
    9. \n", "\t
    10. 3
    11. \n", "\t
    12. 9
    13. \n", "
    \n", "
  2. \n", "\t
    1. \n", "\t
    2. 5
    3. \n", "\t
    4. 7
    5. \n", "\t
    6. 10
    7. \n", "\t
    8. 1
    9. \n", "\t
    10. 6
    11. \n", "\t
    12. 2
    13. \n", "\t
    14. 9
    15. \n", "
    \n", "
  3. \n", "\t
    1. \n", "\t
    2. 4
    3. \n", "\t
    4. 9
    5. \n", "\t
    6. 8
    7. \n", "\t
    8. 5
    9. \n", "\t
    10. 10
    11. \n", "\t
    12. 7
    13. \n", "\t
    14. 3
    15. \n", "\t
    16. 6
    17. \n", "
    \n", "
  4. \n", "
\n" ], "text/latex": [ "\\begin{enumerate}\n", "\\item \\begin{enumerate*}\n", "\\item 3\n", "\\item 8\n", "\\item 4\n", "\\item 7\n", "\\item 6\n", "\\end{enumerate*}\n", "\n", "\\item \\begin{enumerate*}\n", "\\item 1\n", "\\item 5\n", "\\item 8\n", "\\item 4\n", "\\item 3\n", "\\item 9\n", "\\end{enumerate*}\n", "\n", "\\item \\begin{enumerate*}\n", "\\item 5\n", "\\item 7\n", "\\item 10\n", "\\item 1\n", "\\item 6\n", "\\item 2\n", "\\item 9\n", "\\end{enumerate*}\n", "\n", "\\item \\begin{enumerate*}\n", "\\item 4\n", "\\item 9\n", "\\item 8\n", "\\item 5\n", "\\item 10\n", "\\item 7\n", "\\item 3\n", "\\item 6\n", "\\end{enumerate*}\n", "\n", "\\end{enumerate}\n" ], "text/markdown": [ "1. 1. 3\n", "2. 8\n", "3. 4\n", "4. 7\n", "5. 6\n", "\n", "\n", "\n", "2. 1. 1\n", "2. 5\n", "3. 8\n", "4. 4\n", "5. 3\n", "6. 9\n", "\n", "\n", "\n", "3. 1. 5\n", "2. 7\n", "3. 10\n", "4. 1\n", "5. 6\n", "6. 2\n", "7. 9\n", "\n", "\n", "\n", "4. 1. 4\n", "2. 9\n", "3. 8\n", "4. 5\n", "5. 10\n", "6. 7\n", "7. 3\n", "8. 6\n", "\n", "\n", "\n", "\n", "\n" ], "text/plain": [ "[[1]]\n", "[1] 3 8 4 7 6\n", "\n", "[[2]]\n", "[1] 1 5 8 4 3 9\n", "\n", "[[3]]\n", "[1] 5 7 10 1 6 2 9\n", "\n", "[[4]]\n", "[1] 4 9 8 5 10 7 3 6\n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "set.seed(123)\n", "lapply(5:8, sample, x = 1:10)" ] }, { "cell_type": "markdown", "id": "9a769800-e011-464b-909e-55b96ade514b", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## sapply\n", "\n", "In some cases we want to apply a function to a list but would rather have `R` return a vector than a list. The `sapply` function does exactly that: the result is **s**implified compared with that of `lapply`, hence the function's name." ] }, { "cell_type": "markdown", "id": "1d37cf6e-0212-4fd1-894c-aaa45a36c66c", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "The length of each element of our list:" ] }, { "cell_type": "code", "execution_count": 7, "id": "a0409877-d3f9-46fd-a5ed-4974f59e69b6", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/html": [ "
\n", "\t
vecteur_1
\n", "\t\t
1
\n", "\t
vecteur_2
\n", "\t\t
17
\n", "\t
vecteur_3
\n", "\t\t
43
\n", "
\n" ], "text/latex": [ "\\begin{description*}\n", "\\item[vecteur\\textbackslash{}\\_1] 1\n", "\\item[vecteur\\textbackslash{}\\_2] 17\n", "\\item[vecteur\\textbackslash{}\\_3] 43\n", "\\end{description*}\n" ], "text/markdown": [ "vecteur_1\n", ": 1vecteur_2\n", ": 17vecteur_3\n", ": 43\n", "\n" ], "text/plain": [ "vecteur_1 vecteur_2 vecteur_3 \n", " 1 17 43 " ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "sapply(x, FUN = length)" ] }, { "cell_type": "markdown", "id": "46a0a759-baca-45f3-a9b0-8886add5bbcf", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Or the sum of each element of our list `x`:" ] }, { "cell_type": "code", "execution_count": 6, "id": "b775216f-b77c-48c2-bafc-a2cf621755af", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/html": [ "
\n", "\t
vecteur_1
\n", "\t\t
1
\n", "\t
vecteur_2
\n", "\t\t
153
\n", "\t
vecteur_3
\n", "\t\t
3268
\n", "
\n" ], "text/latex": [ "\\begin{description*}\n", "\\item[vecteur\\textbackslash{}\\_1] 1\n", "\\item[vecteur\\textbackslash{}\\_2] 153\n", "\\item[vecteur\\textbackslash{}\\_3] 3268\n", "\\end{description*}\n" ], "text/markdown": [ "vecteur_1\n", ": 1vecteur_2\n", ": 153vecteur_3\n", ": 3268\n", "\n" ], "text/plain": [ "vecteur_1 vecteur_2 vecteur_3 \n", " 1 153 3268 " ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "sapply(x, FUN = sum)" ] }, { "cell_type": "markdown", "id": "7765ff08-60a0-4da7-b90e-c3dd5e41368c", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "If each call to the function returns a vector and all of these vectors have the same length, `sapply` returns a matrix, filled column by column as usual:" ] }, { "cell_type": "code", "execution_count": 7, "id": "29818deb-fe95-4be4-9a22-0c359455d997", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/html": [ "
    \n", "\t
    1. \n", "\t
    2. 6
    3. \n", "\t
    4. 10
    5. \n", "\t
    6. 3
    7. \n", "\t
    8. 2
    9. \n", "\t
    10. 9
    11. \n", "
    \n", "
  1. \n", "\t
    1. \n", "\t
    2. 10
    3. \n", "\t
    4. 7
    5. \n", "\t
    6. 9
    7. \n", "\t
    8. 1
    9. \n", "\t
    10. 3
    11. \n", "
    \n", "
  2. \n", "\t
    1. \n", "\t
    2. 8
    3. \n", "\t
    4. 2
    5. \n", "\t
    6. 3
    7. \n", "\t
    8. 9
    9. \n", "\t
    10. 1
    11. \n", "
    \n", "
  3. \n", "
\n" ], "text/latex": [ "\\begin{enumerate}\n", "\\item \\begin{enumerate*}\n", "\\item 6\n", "\\item 10\n", "\\item 3\n", "\\item 2\n", "\\item 9\n", "\\end{enumerate*}\n", "\n", "\\item \\begin{enumerate*}\n", "\\item 10\n", "\\item 7\n", "\\item 9\n", "\\item 1\n", "\\item 3\n", "\\end{enumerate*}\n", "\n", "\\item \\begin{enumerate*}\n", "\\item 8\n", "\\item 2\n", "\\item 3\n", "\\item 9\n", "\\item 1\n", "\\end{enumerate*}\n", "\n", "\\end{enumerate}\n" ], "text/markdown": [ "1. 1. 6\n", "2. 10\n", "3. 3\n", "4. 2\n", "5. 9\n", "\n", "\n", "\n", "2. 1. 10\n", "2. 7\n", "3. 9\n", "4. 1\n", "5. 3\n", "\n", "\n", "\n", "3. 1. 8\n", "2. 2\n", "3. 3\n", "4. 9\n", "5. 1\n", "\n", "\n", "\n", "\n", "\n" ], "text/plain": [ "[[1]]\n", "[1] 6 10 3 2 9\n", "\n", "[[2]]\n", "[1] 10 7 9 1 3\n", "\n", "[[3]]\n", "[1] 8 2 3 9 1\n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "(x <- lapply(rep(5, 3), sample, x = 1:10))" ] }, { "cell_type": "code", "execution_count": 8, "id": "18803165-bc7d-47eb-a2a8-601954227174", "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\n", "
2 11
3 32
6 73
9 98
10109
\n" ], "text/latex": [ "\\begin{tabular}{lll}\n", "\t 2 & 1 & 1 \\\\\n", "\t 3 & 3 & 2 \\\\\n", "\t 6 & 7 & 3 \\\\\n", "\t 9 & 9 & 8 \\\\\n", "\t 10 & 10 & 9 \\\\\n", "\\end{tabular}\n" ], "text/markdown": [ "\n", "| 2 | 1 | 1 | \n", "| 3 | 3 | 2 | \n", "| 6 | 7 | 3 | \n", "| 9 | 9 | 8 | \n", "| 10 | 10 | 9 | \n", "\n", "\n" ], "text/plain": [ " [,1] [,2] [,3]\n", "[1,] 2 1 1 \n", "[2,] 3 3 2 \n", "[3,] 6 7 3 \n", "[4,] 9 9 8 \n", "[5,] 10 10 9 " ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "sapply(x, sort)" ] }, { "cell_type": "markdown", "id": "e2f725d2-8a0a-437c-9a5b-3ba850fbe31d", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Other functions\n", "There are other ways to manipulate matrices, lists, vectors, and so on. This course has covered the three main functions; in some cases, however, you may need `vapply`, `mapply`, `Map`, `rapply`, or `tapply`, which all resemble the three functions covered here but offer more options or return their result in a different format." ] } ], "metadata": { "kernelspec": { "display_name": "R", "language": "R", "name": "ir" }, "language_info": { "codemirror_mode": "r", "file_extension": ".r", "mimetype": "text/x-r-source", "name": "R", "pygments_lexer": "r", "version": "4.1.2" } }, "nbformat": 4, "nbformat_minor": 5 }