Introduction to NumPy–Part 1

Before starting, I’d like to clarify that this blog will probably end up serving several purposes for its various contributors: a personal “this is interesting” blog, a home for Valkyrie Robotics updates, and a place to quickly provide conceptual overviews.

Today, however, I’d like to give an overview of NumPy, which, according to its website, is “the fundamental package for scientific computing with Python.” More bluntly, it’s a package for multidimensional arrays and matrices, along with operations on those datatypes.

If you need to do Linear Algebra in Python, it will probably be in the form of NumPy matrices.

This post assumes an understanding of Python 3; it is not a Python tutorial.

import numpy as np

We, of course, start out by importing NumPy. The convention is to use import numpy as np rather than from numpy import *, which would dump every NumPy name into your namespace.
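
One practical reason for the convention, sketched here as an illustration rather than the official rationale: NumPy defines functions such as sum whose names match Python built-ins but which behave differently. Keeping the np. prefix makes it obvious which one you are calling. The array grid is just made-up sample data.

```python
import numpy as np

grid = np.array([[1, 2], [3, 4]])

# Python's built-in sum iterates over the first axis, adding the rows together.
builtin_total = sum(grid)

# np.sum adds up every element by default.
numpy_total = np.sum(grid)

print(builtin_total.tolist())  # [4, 6]
print(numpy_total)             # 10
```

With a star import, the name sum would silently refer to NumPy's version for the rest of the module.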

NumPy ndarray: Multidimensional arrays

Arrays enable you to perform mathematical operations on whole blocks of data as if they were scalars.
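
As a rough sketch of what that means in practice, compare doubling every value in a plain list with doubling an array. The prices values are hypothetical sample data.

```python
import numpy as np

prices = [1.5, 2.0, 4.25]  # hypothetical sample data

# Plain Python: loop over every element explicitly.
doubled_list = [p * 2 for p in prices]

# NumPy: write the expression once and it is applied to every element.
doubled_arr = np.array(prices) * 2

print(doubled_list)           # [3.0, 4.0, 8.5]
print(doubled_arr.tolist())   # [3.0, 4.0, 8.5]
```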

# arange works similarly to Python's range function
arr0 = np.arange(1, 10, 2)
arr0
Out[2]: 

    array([1, 3, 5, 7, 9])
# create a 2D list in native Python
data = [[1, 2, 3, 4],
        [5, 6, 7, 9]]

# initialize a ndarray to that list
arr1 = np.array(data)

data2 = [6.8, 3, 0.5, 7]

arr2 = np.array(data2)
arr1
Out[4]: 

    array([[1, 2, 3, 4],
           [5, 6, 7, 9]])
arr1.ndim
Out[5]: 

    2
arr1.shape
Out[6]: 

    (2, 4)
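
To tie these attributes together, here is a small self-contained sketch. The names from_range, from_arange, fractional, and grid are mine, chosen for illustration. It also shows one place where arange goes beyond range: it accepts a fractional step.

```python
import numpy as np

# Same start/stop/step arguments as range, but the result is an ndarray.
from_range = list(range(1, 10, 2))
from_arange = np.arange(1, 10, 2)
print(from_arange.tolist() == from_range)  # True

# Unlike range, arange also accepts a non-integer step.
fractional = np.arange(0, 1, 0.25)
print(fractional.tolist())  # [0.0, 0.25, 0.5, 0.75]

grid = np.array([[1, 2, 3, 4],
                 [5, 6, 7, 8]])
print(grid.ndim)   # 2: the number of axes
print(grid.shape)  # (2, 4): 2 rows, 4 columns
print(grid.size)   # 8: the total number of elements
```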

np.array infers a datatype for the elements it contains. This is stored in a special dtype object. For example:

arr1.dtype
Out[7]: 

    dtype('int64')
arr2.dtype
Out[8]: 

    dtype('float64')
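
A quick sketch of how the inference plays out when the input is mixed, and how to override it. The exact integer dtype depends on your platform, so the sketch only checks that it is some integer type.

```python
import numpy as np

ints = np.array([1, 2, 3])       # all integers: an integer dtype
mixed = np.array([1, 2.5, 3])    # a single float promotes everything to float64
forced = np.array([1, 2, 3], dtype=np.float64)  # explicit dtype overrides inference

print(np.issubdtype(ints.dtype, np.integer))  # True
print(mixed.dtype)                            # float64
print(forced.dtype)                           # float64
```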

We can also use other functions to create new arrays, such as zeros, ones, or empty. The first two create arrays filled with zeros or ones respectively, whereas empty only allocates memory without initializing it. The contents of an empty array are therefore arbitrary, not necessarily zeros.

np.zeros(10)
Out[9]: 

    array([ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.])
np.ones((3, 6))
Out[10]: 

    array([[ 1.,  1.,  1.,  1.,  1.,  1.],
           [ 1.,  1.,  1.,  1.,  1.,  1.],
           [ 1.,  1.,  1.,  1.,  1.,  1.]])
np.empty((1, 4, 9))
Out[11]: 

    array([[[  6.91354255e-310,   1.09642495e-316,   2.82737564e-317,
               2.47032823e-323,   0.00000000e+000,   0.00000000e+000,
               0.00000000e+000,   6.91354255e-310,   0.00000000e+000],
            [  0.00000000e+000,  -2.77323087e-027,   0.00000000e+000,
               0.00000000e+000,  -8.12748304e-298,   0.00000000e+000,
               0.00000000e+000,   4.66932976e-088,   0.00000000e+000],
            [  0.00000000e+000,   6.91350373e-310,   0.00000000e+000,
               0.00000000e+000,   6.86025377e-057,   0.00000000e+000,
               0.00000000e+000,   5.14400001e+271,   0.00000000e+000],
            [  0.00000000e+000,  -4.82088475e+232,   9.53546696e-322,
               1.27837628e-316,   1.09643681e-316,   0.00000000e+000,
               0.00000000e+000,  -8.65683232e-271,   6.91354050e-310]]])

Here are some of the ways to construct arrays:

Function            Description
asarray             Convert input to an array if it is not already an array.
ones, ones_like     Create an array filled with ones of a given shape. ones_like uses the shape of an already existing array.
zeros, zeros_like   Create an array filled with zeros of a given shape. zeros_like uses the shape of an already existing array.
empty, empty_like   Create an array of the given shape but do not fill it with values; just allocate memory.
eye, identity       Construct an N x N identity matrix.
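
Here is a short sketch exercising a few of these constructors. The template array is made up for the example.

```python
import numpy as np

template = np.array([[1, 2, 3], [4, 5, 6]])

same_object = np.asarray(template)   # already an array, so no copy is made
from_list = np.asarray([1, 2, 3])    # a list is converted into a new array

ones = np.ones_like(template)        # same shape (and dtype) as template
zeros = np.zeros_like(template)
identity = np.eye(3)                 # 3 x 3, ones on the diagonal

print(same_object is template)                   # True
print(ones.shape, zeros.shape)                   # (2, 3) (2, 3)
print(np.array_equal(identity, np.identity(3)))  # True
```

Note that asarray hands back the very same object when given an array, so modifying the result modifies the original.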

Data Types for Arrays

Dtypes usually map directly onto an underlying machine representation (a 64-bit integer or a 64-bit float, for example), which makes it easy for NumPy to read and write data quickly. Feel free to look up all the different possibilities.

arr = np.array([1, 2, 16, 42, 54])

arr.dtype
Out[12]: 

    dtype('int64')
floats = arr.astype(np.float64)
floats.dtype
Out[13]: 

    dtype('float64')

astype creates a new array with every element cast to the new type, and raises an error if the cast fails (for example, a ValueError for a string that cannot be parsed as a number). For arrays, unlike matrices, arithmetic operations are simply applied to each element, like a for loop would do.
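
Before moving on to arithmetic, here is a small sketch of the casting behavior. The sample arrays and the failed flag are mine, for illustration only.

```python
import numpy as np

measurements = np.array([1.7, -2.9, 3.2])

# Float to int drops the fractional part (truncation toward zero, not rounding).
as_ints = measurements.astype(np.int64)
print(as_ints.tolist())        # [1, -2, 3]

# astype returns a new array; the original keeps its dtype.
print(measurements.dtype)      # float64

# Numeric strings can be parsed...
numeric_text = np.array(["1.5", "2", "-3.25"])
parsed = numeric_text.astype(np.float64)
print(parsed.tolist())         # [1.5, 2.0, -3.25]

# ...but non-numeric text cannot.
failed = False
try:
    np.array(["1.5", "oops"]).astype(np.float64)
except ValueError as err:
    failed = True
    print("conversion failed:", err)
```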

arr = np.array([[1., 2., 3.], [4., 5., 6.]])

arr
Out[14]: 

    array([[ 1.,  2.,  3.],
           [ 4.,  5.,  6.]])
# Not matrix multiplication
arr * arr
Out[15]: 

    array([[  1.,   4.,   9.],
           [ 16.,  25.,  36.]])
arr - arr
Out[16]: 

    array([[ 0.,  0.,  0.],
           [ 0.,  0.,  0.]])
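
To make the "not matrix multiplication" comment concrete, the sketch below contrasts the elementwise product with an actual matrix product. It assumes Python 3.5 or later, where the @ operator is available; the post itself only shows the elementwise form.

```python
import numpy as np

arr = np.array([[1., 2., 3.], [4., 5., 6.]])

elementwise = arr * arr        # shape (2, 3): each entry squared
matrix_product = arr @ arr.T   # shape (2, 2): rows dotted with rows
scaled = arr * 0.5             # a scalar is applied to every element

print(elementwise.tolist())     # [[1.0, 4.0, 9.0], [16.0, 25.0, 36.0]]
print(matrix_product.tolist())  # [[14.0, 32.0], [32.0, 77.0]]
print(scaled.tolist())          # [[0.5, 1.0, 1.5], [2.0, 2.5, 3.0]]
```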

In terms of NumPy array indexing, 1D arrays work similarly to lists in native Python:

arr = np.arange(10)

arr
Out[17]: 

    array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
arr[5]
Out[18]: 

    5
arr[5:8]
Out[19]: 

    array([5, 6, 7])
arr[5:8] = 12
arr
Out[21]: 

    array([ 0,  1,  2,  3,  4, 12, 12, 12,  8,  9])
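
The similarity to lists has limits. As a rough side-by-side sketch (the names values and list_accepts_scalar are mine), indexing and slicing look the same, but only the array accepts a scalar assigned to a slice:

```python
import numpy as np

values = list(range(10))
arr = np.arange(10)

# Shared behavior: indexing, negative indices, slicing.
print(values[5], arr[5])               # 5 5
print(values[-1], arr[-1])             # 9 9
print(values[5:8], arr[5:8].tolist())  # [5, 6, 7] [5, 6, 7]

# Difference: an array spreads a scalar across the whole slice...
arr[5:8] = 12

# ...whereas a list insists on an iterable on the right-hand side.
try:
    values[5:8] = 12
    list_accepts_scalar = True
except TypeError:
    list_accepts_scalar = False

print(arr.tolist())          # [0, 1, 2, 3, 4, 12, 12, 12, 8, 9]
print(list_accepts_scalar)   # False
```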

Slices are simply views of the original array. As a result, any change made to a slice shows up in the original array. No data is actually copied.

arr_slice = arr[5:8]

arr_slice[1] = 12345

arr
Out[22]: 

    array([    0,     1,     2,     3,     4,    12, 12345,    12,     8,     9])
arr_slice[:] = 64

arr
Out[23]: 

    array([ 0,  1,  2,  3,  4, 64, 64, 64,  8,  9])
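
If you do want an independent piece of the array, copy it explicitly. The sketch below contrasts a view with a copy, and with a plain Python list, whose slices always copy. np.shares_memory is used here only as a convenient check.

```python
import numpy as np

arr = np.arange(10)

view = arr[5:8]          # a view: shares memory with arr
copy = arr[5:8].copy()   # an independent copy

view[0] = 100            # visible through arr
copy[1] = -1             # not visible through arr

print(np.shares_memory(arr, view))  # True
print(np.shares_memory(arr, copy))  # False
print(arr.tolist())                 # [0, 1, 2, 3, 4, 100, 6, 7, 8, 9]

# Contrast: slicing a Python list always copies.
items = list(range(10))
part = items[5:8]
part[0] = 100
print(items[5])                     # 5
```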

In higher-dimensional arrays, the element at each index is itself a lower-dimensional array rather than a scalar. For instance, indexing a 2D array with a single index gives you a 1D array.

# Create an array using arange and then reshape to 2D
arr2d = np.arange(1, 10).reshape((3, 3))

arr2d
Out[24]: 

    array([[1, 2, 3],
           [4, 5, 6],
           [7, 8, 9]])

Indices are given in [row, column] order, counting from zero, rather than as (x, y) coordinates. So arr2d[0, 1] would give you 2, not 4, and arr2d[2, 1] would give you 8.
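
A small sketch to check those lookups, along with pulling out a whole row or column. The names row and column are mine.

```python
import numpy as np

arr2d = np.arange(1, 10).reshape((3, 3))

print(arr2d[0, 1])   # 2: row 0, column 1
print(arr2d[2, 1])   # 8: row 2, column 1
print(arr2d[1, 0])   # 4: row 1, column 0
print(arr2d[0][1])   # 2: same element, indexing one axis at a time

row = arr2d[2]        # a single index selects a whole row
column = arr2d[:, 1]  # a full slice on the first axis selects a column

print(row.tolist())     # [7, 8, 9]
print(column.tolist())  # [2, 5, 8]
```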

arr3d = np.arange(1, 13).reshape((2, 2, 3))
arr3d
Out[25]: 

    array([[[ 1,  2,  3],
            [ 4,  5,  6]],
    
           [[ 7,  8,  9],
            [10, 11, 12]]])
# Gives us a 2D array
arr3d[0]
Out[26]: 

    array([[1, 2, 3],
           [4, 5, 6]])
old = arr3d[0].copy()

arr3d[0] = 42

arr3d
Out[27]: 

    array([[[42, 42, 42],
            [42, 42, 42]],
    
           [[ 7,  8,  9],
            [10, 11, 12]]])
arr3d[0] = old
arr3d
Out[28]: 

    array([[[ 1,  2,  3],
            [ 4,  5,  6]],
    
           [[ 7,  8,  9],
            [10, 11, 12]]])

Another way to think of it: arr3d[1, 0] gives you all the values whose indices start with (1, 0), which is a 1D array.
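
Sketched out, each extra index removes one dimension from the result:

```python
import numpy as np

arr3d = np.arange(1, 13).reshape((2, 2, 3))

print(arr3d.shape)            # (2, 2, 3)
print(arr3d[1].shape)         # (2, 3): one index leaves a 2D array
print(arr3d[1, 0].shape)      # (3,): two indices leave a 1D array
print(arr3d[1, 0].tolist())   # [7, 8, 9]
print(arr3d[1, 0, 2])         # 9: three indices leave a single value
```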


In the next post, I’ll talk about more fancy indexing and mathematical operations, and then about NumPy matrices.