Learning NumPy

NumPy is a package for scientific computing with Python. NumPy has one main data structure named ndarray, which is an N-dimensional array. Each ndarray holds elements of a single data type, described by its dtype object.

To use NumPy, we import the module.

In [2]:
import numpy as np
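
The single-dtype rule can be checked with a short sketch. The variable names `ints` and `mixed` are illustrative only, and the integer dtype is set explicitly because the default integer size can differ between platforms.

```python
import numpy as np

# An ndarray stores elements of one dtype only.
ints = np.array([1, 2, 3], dtype=np.int64)
print(ints.dtype)    # int64
print(ints.ndim)     # 1 (number of dimensions)

# Mixing integers and floats upcasts every element to float64.
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)   # float64
```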

# Creating ndarray

Use the arange() function to create a range(start, end, step), where the start is inclusive and the end is exclusive. With one argument, arange(end), the start defaults to 0. With two arguments, arange(start, end), the step defaults to 1.

The shape attribute returns the shape of the ndarray as a tuple.

In [4]:
a=np.arange(5)
print(a)
a.shape
[0 1 2 3 4]
Out[4]:
(5,)
In [6]:
b=np.arange(10,32,3).reshape(2,4)
print(b)
b.shape
[[10 13 16 19]
[22 25 28 31]]
Out[6]:
(2, 4)
In [7]:
a=np.arange(0,20,2).reshape(5,2)
print(a)
a.shape
[[ 0  2]
[ 4  6]
[ 8 10]
[12 14]
[16 18]]
Out[7]:
(5, 2)
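
The three calling forms of arange() described above can be compared side by side. This is a minimal sketch; the variable names are illustrative.

```python
import numpy as np

one_arg = np.arange(5)             # arange(end): start defaults to 0
two_args = np.arange(2, 5)         # arange(start, end): step defaults to 1
three_args = np.arange(2, 10, 3)   # arange(start, end, step)

print(one_arg)      # [0 1 2 3 4]
print(two_args)     # [2 3 4]
print(three_args)   # [2 5 8]
```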

Use the linspace(start, end, num) function to generate an array in a way similar to arange(). However, the third argument is the number of elements rather than the step, and the start and end are both inclusive. The generated elements are evenly spaced between the start and the end.

In [48]:
a=np.linspace(1,23,12).reshape(3,4)
print(a)
a.shape
[[  1.   3.   5.   7.]
[  9.  11.  13.  15.]
[ 17.  19.  21.  23.]]
Out[48]:
(3, 4)
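
Because both endpoints are included, the spacing works out to (end - start) / (num - 1). The sketch below checks this on a small example; the values 0.0, 1.0, and 5 are arbitrary choices for illustration.

```python
import numpy as np

start, end, num = 0.0, 1.0, 5
pts = np.linspace(start, end, num)   # five values from 0.0 to 1.0
print(pts)

# Both endpoints are included, so the spacing is (end - start) / (num - 1).
spacing = (end - start) / (num - 1)
print(np.allclose(np.diff(pts), spacing))   # True
```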

The zeros((rows, cols), dtype) function creates a matrix of zeros whose size is given by the tuple (rows, cols).

In [24]:
z=np.zeros((3,4));z
Out[24]:
array([[ 0.,  0.,  0.,  0.],
[ 0.,  0.,  0.,  0.],
[ 0.,  0.,  0.,  0.]])

The ones((rows, cols), dtype) function creates a matrix of ones whose size is given by the tuple (rows, cols). By default the data type is float, but we can change it to integer by passing int as the second argument.

In [50]:
n=np.ones((2,3),int);n
Out[50]:
array([[1, 1, 1],
[1, 1, 1]])
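
The dtype argument works the same way for zeros() and ones(). The following sketch passes it by keyword; the choice of bool for the last array is only an illustration.

```python
import numpy as np

z = np.zeros((2, 3))                # dtype defaults to float64
zi = np.zeros((2, 3), dtype=int)    # integer zeros
o = np.ones((2, 3), dtype=bool)     # every element is True

print(z.dtype)    # float64
print(zi)
print(o.all())    # True
```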

To generate a random matrix, use the function random.random((rows, cols)). The values are floats drawn uniformly from the interval [0, 1).

In [49]:
r=np.random.random((2,3));r
Out[49]:
array([[ 0.53942954,  0.40109363,  0.63573783],
[ 0.04378601,  0.91120817,  0.38331728]])

To generate a matrix of random integers, use the function random.randint(start, end, num), which draws num integers. The start is inclusive but the end is exclusive. If you want the end to be inclusive, pass end+1.

In [53]:
r=np.random.randint(1,10,12).reshape(4,3);r
Out[53]:
array([[4, 7, 9],
[5, 5, 1],
[5, 9, 9],
[7, 9, 5]])
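
Random output changes on every run, which makes examples hard to compare. As a sketch, fixing the seed with np.random.seed() makes the draws repeatable, and passing a tuple as the size of randint() gives the final shape without a separate reshape(). The seed value 0 is an arbitrary choice.

```python
import numpy as np

np.random.seed(0)   # fixed seed so repeated runs give the same numbers

# Floats drawn from the half-open interval [0, 1).
f = np.random.random((2, 3))

# Integers from 1 to 9: the upper bound 10 is exclusive.
# A tuple as size gives the 4x3 shape directly, without reshape().
r = np.random.randint(1, 10, size=(4, 3))

print(f.shape, r.shape)              # (2, 3) (4, 3)
print(r.min() >= 1, r.max() <= 9)    # True True
```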

A Boolean array or matrix can be created by applying a condition to an array. The condition is tested element by element.

In [172]:
r=np.random.randint(1,10,12).reshape(4,3);
print(r)
r<7
[[4 3 5]
[4 2 6]
[1 7 9]
[9 7 7]]
Out[172]:
array([[ True,  True,  True],
[ True,  True,  True],
[ True, False, False],
[False, False, False]], dtype=bool)

A Boolean array is useful for masking or filtering: indexing with it returns a 1D array of the elements where the mask is True.

In [171]:
r=np.random.randint(1,10,12).reshape(4,3)
print(r)
m=(r<7)
print(m)
r[m]
[[2 2 4]
[8 3 6]
[6 5 7]
[1 7 6]]
[[ True  True  True]
[False  True  True]
[ True  True False]
[ True False  True]]
Out[171]:
array([2, 2, 4, 3, 6, 6, 5, 1, 6])
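
Since the examples above use random data, here is a small sketch with fixed values so the result can be checked by hand. It also shows that a mask can pick the elements to overwrite; the name `capped` is illustrative.

```python
import numpy as np

r = np.array([[2, 8, 4],
              [9, 3, 7]])
mask = r < 7

print(mask.sum())   # 3 elements satisfy the condition
print(r[mask])      # [2 4 3]

# A mask can also select the elements to overwrite.
capped = r.copy()
capped[~mask] = 7   # replace every value >= 7 with 7
print(capped)
```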

Still another way to create an ndarray is directly from a sequence of tuples or lists.

In [32]:
g = np.array([(1, 2, 3), [4, 5, 6], (7, 8, 9)]); g
Out[32]:
array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])

We can set the data type with the dtype argument when we define the ndarray.

In [36]:
g = np.array([(1, 2, 3), [4, 5, 6], (7, 8, 9)],dtype=complex); g
Out[36]:
array([[ 1.+0.j,  2.+0.j,  3.+0.j],
[ 4.+0.j,  5.+0.j,  6.+0.j],
[ 7.+0.j,  8.+0.j,  9.+0.j]])

The number of dimensions can be more than 2. The following is a three-dimensional array.

In [3]:
a = np.arange(30).reshape(2, 3, 5)
print(a)
[[[ 0  1  2  3  4]
[ 5  6  7  8  9]
[10 11 12 13 14]]

[[15 16 17 18 19]
[20 21 22 23 24]
[25 26 27 28 29]]]

In the array above, dim 0 selects one of the two blocks (pos 0 or pos 1), dim 1 selects one of the three rows inside a block (pos 0 to pos 2), and dim 2 selects one of the five positions inside a row (pos 0 to pos 4).
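
As a small sketch of how the three dimensions are addressed, the indices below follow the order (dim 0, dim 1, dim 2) of the 2x3x5 array.

```python
import numpy as np

a = np.arange(30).reshape(2, 3, 5)

print(a[1, 2, 3])         # 28: block 1, row 2, position 3
print(a[0].shape)         # (3, 5): fixing dim 0 leaves one block
print(a[:, 0, :].shape)   # (2, 5): first row of each block
print(a[:, :, 0].shape)   # (2, 3): first position of each row
print(a.ndim, a.shape)    # 3 (2, 3, 5)
```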

# Basic Operation

## Selection

We use colon notation to extract portions of an array and generate new ones. The colon notation is start:end:step. The start is inclusive while the end is exclusive. When the start is omitted, the default is 0. When the end is omitted, the default is the end of the array. When the step is omitted, the default is 1.

The following are examples on a 1D array.

In [183]:
a = np.arange(10, 15)
a
Out[183]:
array([10, 11, 12, 13, 14])
In [184]:
a[2]
Out[184]:
12
In [185]:
a[1:4]
Out[185]:
array([11, 12, 13])
In [186]:
a[:4]
Out[186]:
array([10, 11, 12, 13])
In [187]:
a[:4:2]
Out[187]:
array([10, 12])
In [188]:
a[:]
Out[188]:
array([10, 11, 12, 13, 14])
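
The defaults for the omitted parts of start:end:step can be seen in a few more slices of the same array. This is a minimal sketch that complements the cells above.

```python
import numpy as np

a = np.arange(10, 15)   # [10 11 12 13 14]

tail = a[3:]            # end omitted: runs to the end of the array
every_other = a[::2]    # start and end omitted: whole array, step 2
middle = a[1:4:2]       # all three parts given

print(tail)          # [13 14]
print(every_other)   # [10 12 14]
print(middle)        # [11 13]
```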

The following are examples on a 2D array.

In [194]:
A = np.arange(12).reshape(4,3)+1
A
Out[194]:
array([[ 1,  2,  3],
[ 4,  5,  6],
[ 7,  8,  9],
[10, 11, 12]])
In [199]:
A[0:2,0:2]
Out[199]:
array([[1, 2],
[4, 5]])
In [200]:
A[2:,2:]          # same as A[2:4,2:3]
Out[200]:
array([[ 9],
[12]])
In [210]:
A[[0,2],1:3]      # A[0,1:3]; A[2,1:3]
Out[210]:
array([[2, 3],
[8, 9]])
In [214]:
A[[0,3,2],[0,2,1]]  # A[0,0],A[3,2],A[2,1]
Out[214]:
array([ 1, 12,  8])
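
The last two cells behave differently: a list combined with a slice keeps a block, while two lists are paired up element by element. The sketch below contrasts the two on the same matrix; the names `pairs` and `block` are illustrative.

```python
import numpy as np

A = np.arange(12).reshape(4, 3) + 1

# Two index lists are paired up: elements (0, 0) and (2, 2).
pairs = A[[0, 2], [0, 2]]
print(pairs)   # [1 9]

# Selecting rows first and then columns gives the 2x2 block instead.
block = A[[0, 2]][:, [0, 2]]
print(block)
```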

## 2D <---> 1D

Use the ravel() method to convert a two-dimensional array into a one-dimensional array. The elements are read row by row.

In [173]:
g = np.array([(1, 2, 3), [4, 5, 6], (7, 8, 9)])
print(g)
g.ravel()
[[1 2 3]
[4 5 6]
[7 8 9]]
Out[173]:
array([1, 2, 3, 4, 5, 6, 7, 8, 9])

To convert back to a 2D array, use the reshape() method.

In [175]:
h=np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
h.reshape(3,3)
Out[175]:
array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
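
A round trip through ravel() and reshape() should return the original array. The sketch below checks this, and also shows two details not covered above: passing -1 lets NumPy infer one dimension, and a shape with the wrong number of elements raises ValueError.

```python
import numpy as np

g = np.array([(1, 2, 3), [4, 5, 6], (7, 8, 9)])
flat = g.ravel()

# Reshaping with the original shape restores the 2D array.
restored = flat.reshape(g.shape)
print(np.array_equal(restored, g))   # True

# -1 asks NumPy to infer that dimension from the array size.
inferred = flat.reshape(-1, 3)
print(inferred.shape)                # (3, 3)

# The new shape must hold exactly the same number of elements.
try:
    flat.reshape(2, 4)
    failed = False
except ValueError as err:
    failed = True
    print("reshape failed:", err)
```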

## Joining Arrays and Matrices

Horizontal concatenation uses the hstack() function. The arrays must have the same number of rows.

In [56]:
A = np.ones((3, 4),int)
B = np.zeros((3, 2),int)
C = np.hstack((A, B))
C
Out[56]:
array([[1, 1, 1, 1, 0, 0],
[1, 1, 1, 1, 0, 0],
[1, 1, 1, 1, 0, 0]])

Vertical concatenation uses the vstack() function. The arrays must have the same number of columns.

In [58]:
A = np.ones((2, 2),int)
B = np.zeros((3, 2),int)
C = np.vstack((A, B))
C
Out[58]:
array([[1, 1],
[1, 1],
[0, 0],
[0, 0],
[0, 0]])

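Multiple 1D arrays can be joined with column_stack(), which uses each array as a column, and row_stack(), which uses each array as a row. row_stack() is an alias of vstack(), and recent NumPy releases deprecate it in favor of vstack().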

In [61]:
a = np.array([1,3,2])
b = np.array([5,6,4])
c = np.array([7,9,8])
np.column_stack((a,b,c))
Out[61]:
array([[1, 5, 7],
[3, 6, 9],
[2, 4, 8]])
In [62]:
np.row_stack((a,b,c))
Out[62]:
array([[1, 3, 2],
[5, 6, 4],
[7, 9, 8]])
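
The stacking functions differ mainly in the axis they join along. As a sketch, the code below relates hstack() to np.concatenate() with axis=1 and checks the shapes that result; the transposed inputs to vstack() are only there to make the column counts match.

```python
import numpy as np

A = np.ones((3, 4), int)
B = np.zeros((3, 2), int)

# hstack joins along axis 1, so the row counts must match.
H = np.hstack((A, B))
print(H.shape)                                             # (3, 6)
print(np.array_equal(H, np.concatenate((A, B), axis=1)))   # True

# vstack joins along axis 0, so the column counts must match.
V = np.vstack((A.T, B.T))   # shapes (4, 3) and (2, 3)
print(V.shape)                                             # (6, 3)

# column_stack treats each 1D array as a column.
a = np.array([1, 3, 2])
b = np.array([5, 6, 4])
cols = np.column_stack((a, b))
print(cols.shape)                                          # (3, 2)
```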

## Splitting Matrix

In this section we will use the following matrix.

In [96]:
A = np.arange(24).reshape(4,6)
A
Out[96]:
array([[ 0,  1,  2,  3,  4,  5],
[ 6,  7,  8,  9, 10, 11],
[12, 13, 14, 15, 16, 17],
[18, 19, 20, 21, 22, 23]])

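Use the split(matrix, location, axis) function to separate a matrix or array into several parts. The location argument is either a list of indices at which to cut or a number of equal parts.

• to split the rows into groups, use axis=0
• to split the columns into groups, use axis=1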
In [97]:
[B,C,D]=np.split(A,[2,5],axis=1)
B
Out[97]:
array([[ 0,  1],
[ 6,  7],
[12, 13],
[18, 19]])
In [98]:
C
Out[98]:
array([[ 2,  3,  4],
[ 8,  9, 10],
[14, 15, 16],
[20, 21, 22]])
In [99]:
D
Out[99]:
array([[ 5],
[11],
[17],
[23]])
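
The index list passed to split() marks where each new part begins. The following sketch confirms that splitting at [2, 5] along axis=1 gives the same pieces as three column slices, and shows the integer form of the location argument along axis=0.

```python
import numpy as np

A = np.arange(24).reshape(4, 6)

# The index list [2, 5] cuts before columns 2 and 5.
B, C, D = np.split(A, [2, 5], axis=1)
print(np.array_equal(B, A[:, :2]))    # True
print(np.array_equal(C, A[:, 2:5]))   # True
print(np.array_equal(D, A[:, 5:]))    # True

# An integer asks for that many equal parts; here two blocks of two rows.
top, bottom = np.split(A, 2, axis=0)
print(top.shape, bottom.shape)        # (2, 6) (2, 6)
```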

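To separate a matrix or array into equal parts, use vsplit() to split vertically (by rows) and hsplit() to split horizontally (by columns). The second argument is the number of parts, which is two in the following cells.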

In [100]:
A = np.arange(24).reshape(4,6)
[B,C]=np.hsplit(A,2)
B
Out[100]:
array([[ 0,  1,  2],
[ 6,  7,  8],
[12, 13, 14],
[18, 19, 20]])
In [101]:
C
Out[101]:
array([[ 3,  4,  5],
[ 9, 10, 11],
[15, 16, 17],
[21, 22, 23]])
In [102]:
[D,E]=np.vsplit(A,2)
D
Out[102]:
array([[ 0,  1,  2,  3,  4,  5],
[ 6,  7,  8,  9, 10, 11]])
In [103]:
E
Out[103]:
array([[12, 13, 14, 15, 16, 17],
[18, 19, 20, 21, 22, 23]])
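
The number of parts does not have to be two. This sketch splits the same matrix into three and four parts, and shows that a count that does not divide the axis evenly raises ValueError.

```python
import numpy as np

A = np.arange(24).reshape(4, 6)

# hsplit splits along axis 1; three parts of two columns each.
left, middle, right = np.hsplit(A, 3)
print(left.shape, middle.shape, right.shape)   # (4, 2) (4, 2) (4, 2)

# vsplit splits along axis 0; four parts of one row each.
rows = np.vsplit(A, 4)
print(len(rows), rows[0].shape)                # 4 (1, 6)

# A count that does not divide the axis evenly raises ValueError.
try:
    np.hsplit(A, 4)
    uneven_failed = False
except ValueError as err:
    uneven_failed = True
    print("hsplit failed:", err)
```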

## Transpose

To swap the rows and columns of an array, use the transpose attribute array.T.

In [117]:
A=np.arange(1,24,2).reshape(3,4)
print(A)
print()
print(A.T)
[[ 1  3  5  7]
[ 9 11 13 15]
[17 19 21 23]]

[[ 1  9 17]
[ 3 11 19]
[ 5 13 21]
[ 7 15 23]]
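
Transposing moves the element at position (i, j) to position (j, i). The sketch below checks this on the same matrix, along with the fact that transposing twice gives back the original.

```python
import numpy as np

A = np.arange(1, 24, 2).reshape(3, 4)

print(A.shape, A.T.shape)          # (3, 4) (4, 3)
print(A[1, 2] == A.T[2, 1])        # True: element (i, j) moves to (j, i)
print(np.array_equal(A.T.T, A))    # True: transposing twice restores A
```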

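## Matrix Product

There are several ways to compute the matrix product C = AB. Note that in NumPy the * operator multiplies arrays element by element, so it does not give the matrix product.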

• C = A.dot(B)
• C = np.dot(A,B)
In [126]:
Z=np.arange(1,24,2).reshape(3,4)
[A,B]=np.split(Z,2,1)
print('A=',A); print(); print('B=',B)
A= [[ 1  3]
[ 9 11]
[17 19]]

B= [[ 5  7]
[13 15]
[21 23]]
In [123]:
C=np.dot(A,B.T)
print(C)
[[ 26  58  90]
[122 282 442]
[218 506 794]]
In [125]:
C=A.dot(B.T)
print(C)
[[ 26  58  90]
[122 282 442]
[218 506 794]]
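
A third spelling of the matrix product is the @ operator, which is not used in the original cells. The sketch below assumes Python 3.5 or later with a NumPy version that supports @, and contrasts it with the element-wise * operator.

```python
import numpy as np

Z = np.arange(1, 24, 2).reshape(3, 4)
A, B = np.split(Z, 2, axis=1)   # both have shape (3, 2)

# The @ operator gives the same matrix product as dot().
C = A @ B.T
print(np.array_equal(C, np.dot(A, B.T)))   # True
print(C.shape)                             # (3, 3)

# By contrast, * multiplies element by element and needs matching shapes.
E = A * B
print(E)
```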

Arrays of the same shape are added and subtracted element by element with the + and - operators.

In [127]:
Z=np.arange(1,24,2).reshape(3,4)
[A,B]=np.split(Z,2,1)
print('A=',A); print(); print('B=',B)
A= [[ 1  3]
[ 9 11]
[17 19]]

B= [[ 5  7]
[13 15]
[21 23]]
In [129]:
C=A+B
C
Out[129]:
array([[ 6, 10],
[22, 26],
[38, 42]])
In [130]:
D=A-B
D
Out[130]:
array([[-4, -4],
[-4, -4],
[-4, -4]])

## Decrement and Increment

The in-place operators +=, -=, and *= update every element of an array.

In [137]:
Z=np.arange(12).reshape(3,4)
Z
Out[137]:
array([[ 0,  1,  2,  3],
[ 4,  5,  6,  7],
[ 8,  9, 10, 11]])
In [138]:
Z += 1
Z
Out[138]:
array([[ 1,  2,  3,  4],
[ 5,  6,  7,  8],
[ 9, 10, 11, 12]])
In [139]:
Z -= 2
Z
Out[139]:
array([[-1,  0,  1,  2],
[ 3,  4,  5,  6],
[ 7,  8,  9, 10]])
In [140]:
Z *= 2
Z
Out[140]:
array([[-2,  0,  2,  4],
[ 6,  8, 10, 12],
[14, 16, 18, 20]])
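
The in-place operators modify the existing array, whereas an expression such as Z = Z + 1 builds a new one. A minimal sketch of the difference, using a second name `alias` (illustrative) for the same array:

```python
import numpy as np

Z = np.arange(12).reshape(3, 4)
alias = Z            # second name for the same array

Z += 1               # in place: the change is visible through alias
print(alias[0, 0])   # 1

Z = Z + 1            # builds a new array and rebinds the name Z
print(Z[0, 0])       # 2
print(alias[0, 0])   # 1: the original array is unchanged
```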

# Universal Functions

Universal functions operate on an array element by element and return a new array of the same shape. Examples include:

• log()
• sin()
• sqrt()
• exp()

In [148]:
Z=np.arange(6).reshape(2,3)+1
Z
Out[148]:
array([[1, 2, 3],
[4, 5, 6]])
In [149]:
np.log(Z)
Out[149]:
array([[ 0.        ,  0.69314718,  1.09861229],
[ 1.38629436,  1.60943791,  1.79175947]])
In [151]:
np.sin(Z)
Out[151]:
array([[ 0.84147098,  0.90929743,  0.14112001],
[-0.7568025 , -0.95892427, -0.2794155 ]])
In [152]:
np.sqrt(Z)
Out[152]:
array([[ 1.        ,  1.41421356,  1.73205081],
[ 2.        ,  2.23606798,  2.44948974]])
In [153]:
np.exp(Z)
Out[153]:
array([[   2.71828183,    7.3890561 ,   20.08553692],
[  54.59815003,  148.4131591 ,  403.42879349]])
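
To make "element by element" concrete, the sketch below compares np.sqrt() with an explicit Python loop over the same matrix, and checks that exp() undoes log() for each element. The comparison uses np.allclose() to allow for floating-point rounding.

```python
import math
import numpy as np

Z = np.arange(6).reshape(2, 3) + 1

# A universal function visits every element, like this explicit loop.
looped = np.array([[math.sqrt(x) for x in row] for row in Z])
print(np.allclose(np.sqrt(Z), looped))     # True

# exp() undoes log() element by element, up to rounding error.
print(np.allclose(np.exp(np.log(Z)), Z))   # True
```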

## Aggregated Methods

A NumPy array has several statistical methods that aggregate its elements. Called without arguments, each one reduces the whole array to a single number.

• sum()
• mean()
• min()
• max()
• std()
In [154]:
Z=np.arange(6).reshape(2,3)+1
Z
Out[154]:
array([[1, 2, 3],
[4, 5, 6]])
In [155]:
Z.sum()
Out[155]:
21
In [156]:
Z.mean()
Out[156]:
3.5
In [157]:
Z.std()
Out[157]:
1.707825127659933
In [158]:
Z.min()
Out[158]:
1
In [159]:
Z.max()
Out[159]:
6
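
The values returned by mean() and std() can be reproduced from their definitions. As a sketch, the code below recomputes them by hand; with default arguments std() is the population standard deviation, the square root of the mean squared deviation from the mean.

```python
import numpy as np

Z = np.arange(6).reshape(2, 3) + 1

# std() with default arguments is the population standard deviation:
# the square root of the mean squared deviation from the mean.
manual_std = np.sqrt(((Z - Z.mean()) ** 2).mean())
print(np.isclose(manual_std, Z.std()))   # True

# mean() is the sum divided by the number of elements.
manual_mean = Z.sum() / Z.size
print(manual_mean == Z.mean())           # True
```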

To aggregate with other functions, or to apply an aggregation function along a given axis, use apply_along_axis(aggrFunc, axis, arr).

• axis=0 to evaluate the elements column by column
• axis=1 to evaluate the elements row by row

The aggregation function can be any function that accepts a 1D array.

In [165]:
np.apply_along_axis(np.mean, axis=0, arr=Z)
Out[165]:
array([ 2.5,  3.5,  4.5])
In [167]:
np.apply_along_axis(np.sum, axis=1, arr=Z)
Out[167]:
array([ 6, 15])
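
Any function of a 1D array can be used, including one we write ourselves. In the sketch below, `value_range` is a hypothetical helper (not part of NumPy) that returns the maximum minus the minimum. For the built-in aggregations, passing the axis argument to the method gives the same per-column or per-row result directly.

```python
import numpy as np

Z = np.arange(6).reshape(2, 3) + 1

def value_range(v):
    """Hypothetical helper: spread of a 1D slice (max minus min)."""
    return v.max() - v.min()

col_range = np.apply_along_axis(value_range, axis=0, arr=Z)
row_range = np.apply_along_axis(value_range, axis=1, arr=Z)
print(col_range)   # [3 3 3]
print(row_range)   # [2 2]

# The built-in aggregations accept an axis argument directly.
row_sums = Z.sum(axis=1)
col_means = Z.mean(axis=0)
print(row_sums)    # [ 6 15]
print(col_means)   # [2.5 3.5 4.5]
```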

last update: August 2017

Cite this tutorial as: Teknomo,K. (2017) Learning Numpy (http://people.revoledu.com/kardi/tutorial/Python/)