"SciPy es un ecosistema para computo cientifico en python, esta constriuido sobre los arreglos de NumPy. Scipy incluye herramientas como Matplotlib, pandas , SymPy y scikit-learn. \n",
"\n",
"## 5.1 NumPy\n",
"NumPy es la base para todos los paquetes de computo científico en python, provee soporte para arreglos multidimensionales y matrices, junto con una amplia coleccion de funciones matematicas de alto nivel para operar con estos arreglos.\n",
"\n",
"### 5.1.1 numpy.array \n",
"El tipo de dato mas importante de numpy es **numpy.array** sus atibutos mas importantes son:\n",
"* numpy.array.**ndim**: -numero de dimensiones del arreglo.\n",
"* numpy.array.**shape**: Un tumpla indicando el tamaño del arreglo en cada dimension.\n",
"* numpy.array.**size**: El numero total elementos en el arreglo.\n",
"* numpy.array.**dtype**: El tipo de elemenos en el arreglo e.g. numpy.int32, numpy.int16, and numpy.float64.\n",
"* numpy.array.**itemsize**: el tamaño en bytes de cada elemento del arrglo.\n",
"* numpy.array.**data**: El bloque de memoria que contiene los datos del arreglo.\n"
"[**Producto punto**](https://en.wikipedia.org/wiki/Dot_product) y [**Multiplicacion Matricial**](https://en.wikipedia.org/wiki/Matrix_multiplication)"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"260"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"a@b == a.dot(b)\n",
"a@b"
]
},
{
"cell_type": "code",
"execution_count": 277,
"metadata": {},
"metadata": {},
"outputs": [],
"outputs": [],
"source": []
"source": [
"a = np.array([[1, 1],\n",
" [1, 1]])\n",
"\n",
"b = np.array([[4, 1], \n",
" [2, 2]]) "
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"#np.matmul(a, b) == a.dot(b)\n",
"#np.matmul(a, b)\n",
"#help(np.dot)"
]
},
{
"cell_type": "code",
"execution_count": 99,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[0 1 2 3 4 5 6 7 8]\n",
"[[0 1 2]\n",
" [3 4 5]\n",
" [6 7 8]]\n"
]
},
{
"data": {
"text/plain": [
"36"
]
},
"execution_count": 99,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"c= np.arange(9).reshape(3,3)\n",
"print(np.arange(9))\n",
"print(c)\n",
"c.sum()"
]
},
{
"cell_type": "code",
"execution_count": 100,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([ 9, 12, 15])"
]
},
"execution_count": 100,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"c.sum(axis=0) # Suma por Columna"
]
},
{
"cell_type": "code",
"execution_count": 101,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([ 3, 12, 21])"
]
},
"execution_count": 101,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"c.sum(axis=1) #Suma por Fila"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Elements, rows, columns, and subarrays."
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"54\n"
]
},
{
"data": {
"text/plain": [
"array([[ 0, 1, 2, 3, 4],\n",
" [10, 11, 12, 13, 14],\n",
" [20, 21, 22, 23, 24],\n",
" [30, 31, 32, 33, 34],\n",
" [40, 41, 42, 43, 44],\n",
" [50, 51, 52, 53, 54]], dtype=int8)"
]
},
"execution_count": 32,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"def f(x,y):\n",
" return 10*x+y\n",
"print(f(5,4))\n",
"b = np.fromfunction(f,(6,5),dtype=np.int8)\n",
"b\n",
"#help(np.fromfunction)"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 33,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"b[1][2] == b[1,2]"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([10, 11, 12, 13, 14], dtype=int8)"
]
},
"execution_count": 34,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"b[1,:]"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([ 2, 12, 22, 32, 42, 52], dtype=int8)"
]
},
"execution_count": 35,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"b[:,2]"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[ 3, 4],\n",
" [13, 14],\n",
" [23, 24],\n",
" [33, 34],\n",
" [43, 44],\n",
" [53, 54]], dtype=int8)"
]
},
"execution_count": 36,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"b[:,3:5]"
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[20, 21, 22],\n",
" [30, 31, 32],\n",
" [40, 41, 42],\n",
" [50, 51, 52]], dtype=int8)"
]
},
"execution_count": 37,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"b[2:,:3]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Iterando elementos"
]
},
{
"cell_type": "code",
"execution_count": 177,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[0 1 2 3 4]\n",
"a\n",
"[10 11 12 13 14]\n",
"a\n",
"[20 21 22 23 24]\n",
"a\n",
"[30 31 32 33 34]\n",
"a\n",
"[40 41 42 43 44]\n",
"a\n",
"[50 51 52 53 54]\n",
"a\n"
]
}
],
"source": [
"for row in b:\n",
" print(row)\n",
" print('a')"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0\n",
"1\n",
"2\n",
"3\n",
"10\n",
"11\n",
"12\n",
"13\n",
"20\n",
"21\n",
"22\n",
"23\n"
]
}
],
"source": [
"for element in b.flat:\n",
" print(element)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Cambio de forma"
]
},
{
"cell_type": "code",
"execution_count": 239,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"float64\n"
]
},
{
"data": {
"text/plain": [
"array([[4., 4., 0.],\n",
" [8., 0., 1.],\n",
" [9., 7., 3.],\n",
" [6., 4., 3.]])"
]
},
"execution_count": 239,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"a = np.floor(10*np.random.random((4,3)))\n",
"print(a.dtype)\n",
"a\n",
"#help(np.floor)"
]
},
{
"cell_type": "code",
"execution_count": 220,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(3, 4)"
]
},
"execution_count": 220,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"a.shape"
]
},
{
"cell_type": "code",
"execution_count": 245,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[4., 4., 0., 8.],\n",
" [0., 1., 9., 7.],\n",
" [3., 6., 4., 3.]])"
]
},
"execution_count": 245,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"a.reshape(3,4)"
]
},
{
"cell_type": "code",
"execution_count": 244,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[[4. 4. 0.]\n",
" [8. 0. 1.]\n",
" [9. 7. 3.]\n",
" [6. 4. 3.]]\n"
]
},
{
"data": {
"text/plain": [
"array([[4., 8., 9., 6.],\n",
" [4., 0., 7., 4.],\n",
" [0., 1., 3., 3.]])"
]
},
"execution_count": 244,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"print(a)\n",
"a.T"
]
},
{
"cell_type": "code",
"execution_count": 243,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[ True, True, True, True],\n",
" [ True, True, True, True],\n",
" [ True, True, True, True]])"
]
},
"execution_count": 243,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"a.transpose()==a.T"
]
},
{
"cell_type": "code",
"execution_count": 242,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(4, 3)\n"
]
},
{
"data": {
"text/plain": [
"(3, 4)"
]
},
"execution_count": 242,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"print(a.shape)\n",
"a.T.shape"
]
},
{
"cell_type": "code",
"execution_count": 266,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[[4., 4.],\n",
" [0., 8.],\n",
" [0., 1.]],\n",
"\n",
" [[9., 7.],\n",
" [3., 6.],\n",
" [4., 3.]]])"
]
},
"execution_count": 266,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# La dimencion con -1 se calcula automaticamente\n",
"a.reshape(-1,3,2) "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 5.2 Ejercicos\n",
"\n",
"### 5.2.1 Sin utilizar numpy escribe una funcion para obten el producto punto de dos vectores."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Trying:\n",
" A = [5, -10, 15] \n",
"Expecting nothing\n",
"ok\n",
"Trying:\n",
" B = [30, 35, -40]\n",
"Expecting nothing\n",
"ok\n",
"Trying:\n",
" ProdPunto(A, B)\n",
"Expecting:\n",
" -800\n",
"ok\n",
"Trying:\n",
" D = [2, 5.6, 9, 8, 10] \n",
"Expecting nothing\n",
"ok\n",
"Trying:\n",
" E = [1, 3, 2.4, 2, 11]\n",
"Expecting nothing\n",
"ok\n",
"Trying:\n",
" ProdPunto(D, E)\n",
"Expecting:\n",
" 166.39999999999998\n",
"ok\n",
"1 items had no tests:\n",
" __main__\n",
"1 items passed all tests:\n",
" 6 tests in __main__.ProdPunto\n",
"6 tests in 2 items.\n",
"6 passed and 0 failed.\n",
"Test passed.\n"
]
},
{
"data": {
"text/plain": [
"TestResults(failed=0, attempted=6)"
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"def ProdPunto(A,B):\n",
" '''Función que obtiene el producto punto de dos vectores A y B sin usar arreglos de numPy\n",
" \n",
" Args: \n",
" A, B (list): Listas que representan los vectores A, B con valores tipo float o int.\n",
" Deben tener la misma longitud, de lo contrario genera un mensaje de error.\n",
" \n",
" Returns:\n",
" float, int: El resultado del producto punto entre A y B\n",
" \n",
" Ejemplos:\n",
" >>> A = [5, -10, 15] \n",
" >>> B = [30, 35, -40]\n",
" >>> ProdPunto(A, B)\n",
" -800\n",
" \n",
" >>> D = [2, 5.6, 9, 8, 10] \n",
" >>> E = [1, 3, 2.4, 2, 11]\n",
" >>> ProdPunto(D, E)\n",
" 166.39999999999998\n",
" '''\n",
" if len(A) == len(B):\n",
" return sum( map((lambda x, y: x*y), A, B) )\n",
" else:\n",
" print(\"Error, los vectores son de diferente longitud\", len(A), '!=', len(B))\n",
"\n",
"import doctest\n",
"doctest.testmod(verbose=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 5.2.2 Sin utilizar numpy escribe una funcion que obtenga la multiplicacion de dos matrices.\n"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[28, 4, 5]\n",
"[8, 64, 50]\n",
"[-28, -4, -5]\n"
]
}
],
"source": [
"def MulMat(A, B):\n",
" '''Función que devuelve la multiplicación de dos matrices A y B sin usar arreglos numPy\n",
" \n",
" Args: \n",
" A, B (list): Lista de Arreglos que representan las matrices A, B con valores tipo float o int.\n",
" Columnas de A debe ser igual a Filas de B, de lo contrario genera un mensaje de error. \n",
" Tomar en cuenta que no siempre se cumple la propiedad conmutativa: MulMat(A, B) =! MulMat(B, A)\n",
" \n",
" Returns:\n",
" list: El resultado de la multiplicación entre la matriz A y la matriz B\n",
"### 5.2.3 Utiliza numpy para probar que las dos funciones anteriores dan el resultado correcto."
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"ProdPunto(a,b)= 166.39999999999998\n",
"np.dot(a,b)= 166.39999999999998\n",
"\n",
"ProdPunto(a,b)= -800\n",
"np.dot(a,b)= -800\n",
"\n",
"MulMat(A,B)=\n",
"[58, 64]\n",
"[139, 154]\n",
"np.matmul(A,B)=\n",
" [[ 58 64]\n",
" [139 154]]\n",
"\n",
"MulMat(E,D)=\n",
"[15942, 26525]\n",
"[-2018.8999999999996, 31705.8]\n",
"[62661, 8803]\n",
"[-26921.4, 4481.799999999999]\n",
"np.matmul(E,D)=\n",
" [[ 15942. 26525. ]\n",
" [ -2018.9 31705.8]\n",
" [ 62661. 8803. ]\n",
" [-26921.4 4481.8]]\n"
]
}
],
"source": [
"import numpy as np\n",
"\n",
"a = [2, 5.6, 9, 8, 10]\n",
"b = [1, 3, 2.4, 2, 11]\n",
"c = [5, -10, 15]\n",
"d = [30, 35, -40]\n",
"\n",
"if ProdPunto(a,b) == np.dot(a,b):\n",
" print('ProdPunto(a,b)=', ProdPunto(a,b))\n",
" print('np.dot(a,b)=', np.dot(a,b))\n",
"if ProdPunto(c,d) == np.dot(c,d):\n",
" print('\\nProdPunto(a,b)=', ProdPunto(c,d))\n",
" print('np.dot(a,b)=', np.dot(c,d))\n",
"\n",
"A = [[1,2,3],\n",
" [4,5,6]]\n",
"\n",
"B = [[7,8],\n",
" [9,10],\n",
" [11,12]]\n",
"\n",
"E=[[125, 216, 419],\n",
" [38.3, -516, 237],\n",
" [-209, 855, 601],\n",
" [403, 237, -50.6]]\n",
"\n",
"D=[[-73, 36],\n",
" [21, -28],\n",
" [49,67]]\n",
"\n",
"if (MulMat(A,B) == np.matmul(A,B)).all():\n",
" print('\\nMulMat(A,B)=')\n",
" for col in MulMat(A, B): print(col)\n",
" print('np.matmul(A,B)=\\n', np.matmul(A,B))\n",
"if (MulMat(E,D) == np.matmul(E,D)).all():\n",
" print('\\nMulMat(E,D)=')\n",
" for col in MulMat(E, D): print(col)\n",
" print('np.matmul(E,D)=\\n', np.matmul(E,D))\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 5.2.4 Utilizando solo lo visto hasta el momento de numpy escribe una funcion que encuentre la inversa de una matriz por el metodo de Gauss-Jordan.\n",
"[Wikipedia](https://en.wikipedia.org/wiki/Gaussian_elimination): En matemáticas, la eliminación de Gauss Jordan, llamada así en honor de Carl Friedrich Gauss y Wilhelm Jordan es un algoritmo del álgebra lineal que se usa para determinar las soluciones de un sistema de ecuaciones lineales, para encontrar matrices e inversas. Un sistema de ecuaciones se resuelve por el método de Gauss cuando se obtienen sus soluciones mediante la reducción del sistema dado a otro equivalente en el que cada ecuación tiene una incógnita menos que la anterior. El método de Gauss transforma la matriz de coeficientes en una matriz triangular superior. El método de Gauss-Jordan continúa el proceso de transformación hasta obtener una matriz diagonal"
"En python, pandas es una biblioteca de software escrita como extensión de NumPy para manipulación y análisis de datos. En particular, ofrece estructuras de datos y operaciones para manipular tablas numéricas y series temporales.\n",
"and expressive data structures designed to make working with “relational” or “labeled” data both easy and intuitive. Su objetivo es ser un bloque de construccion fundamental para realizar analisis de datos en el mundo real.\n",
"El nombre de la biblioteca deriva del término \"datos de panel\" (PANel DAta), término de econometría que designa datos que combinan una dimensión temporal con otra dimensión transversal.\n",
"\n",
"Pandas tiene dos typos de datos principales, **Series** (1D) y **DataFrame** (2D), *Dataframe* es un contenedr para *Series* y *Series* es un contenedor de escalares. \n",
"\n",
"### 5.3.1 Series\n",
"Series es un arreglo unidimensional etiquetado capaz de contener cualquier tipo de dato (Enteros, cadenas, punto flotante, objetos, etc), El eje de etiquetas es llamado indice (**index**).\n",