Statistics and machine learning both involve working with large quantities of data. To process such volumes efficiently, the data is generally stored in the form of matrices, which are usually called arrays in programming languages.
NumPy is Python's widely used library for scientific computing. It provides a high-performance multidimensional array object and tools for working with these arrays. NumPy implements several optimization techniques that help perform large computations faster.
NumPy is widely used to process one-dimensional and two-dimensional arrays.
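To make the point about faster computation concrete, here is a minimal sketch that computes the same sum of squares twice: once with an ordinary Python loop and once with a single NumPy expression. The data (the integers from 0 to 99,999) and the variable names are assumptions chosen only for illustration. The sketch checks that both approaches agree and does not measure time; on typical installations the NumPy version tends to be considerably faster, which you can confirm with the standard timeit module.

```python
import numpy as np

# Hypothetical data: the integers 0..99999, stored as 64-bit integers.
values = np.arange(100000, dtype=np.int64)

# Pure-Python approach: visit every element in an interpreted loop.
loop_total = 0
for v in values.tolist():
    loop_total += v * v

# NumPy approach: one expression applied to the whole array at once.
vector_total = int(np.sum(values * values))

print(loop_total == vector_total)  # prints True
```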
Using NumPy to process one-dimensional arrays
The following program creates a one-dimensional array and reads its elements:
import numpy as np
a = np.array([0, 2, 3])  # create a one-dimensional NumPy array
print(a.shape)  # prints (3,), meaning a 1-D array of length 3
print(a[0])  # prints 0
print(a[1])  # prints 2
print(a[2])  # prints 3
print(a)  # prints [0 2 3]
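Beyond reading single elements, a one-dimensional array also exposes a few attributes and supports slicing. The short sketch below reuses the same array; everything in it is standard NumPy, and the choice of what to show is only illustrative.

```python
import numpy as np

a = np.array([0, 2, 3])

print(a.ndim)   # prints 1: the number of dimensions
print(a.size)   # prints 3: the total number of elements
print(a[-1])    # prints 3: negative indices count from the end
print(a[1:])    # prints [2 3]: the elements from index 1 onward
print(a.sum())  # prints 5: the sum of all elements
```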
Using NumPy to process two-dimensional arrays
Two-dimensional arrays are widely used to store data for machine learning algorithms. Each row of the array stores a single data point, and each column holds one of that data point's features. Consider the following example:
import numpy as np
a = np.ones((2, 2))  # create a 2x2 array of floating-point ones: [[1., 1.], [1., 1.]]
a[0, 0] = 2  # the array is now [[2., 1.], [1., 1.]]
print(a[0, 0])  # prints 2.0
print(a[0, 1])  # prints 1.0
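The row-per-data-point layout described above can be sketched with a small made-up dataset. The numbers and the variable names (data, first_point, second_feature, feature_means) are hypothetical and exist only to show how rows and columns are selected.

```python
import numpy as np

# Hypothetical dataset: 3 data points (rows), each with 2 features (columns).
data = np.array([[5.1, 3.5],
                 [4.9, 3.0],
                 [6.2, 2.9]])

n_points, n_features = data.shape   # 3 rows and 2 columns
first_point = data[0]               # all features of the first data point
second_feature = data[:, 1]         # the second feature of every data point
feature_means = data.mean(axis=0)   # average of each feature over all points

print(n_points, n_features)
print(first_point)
print(second_feature)
print(feature_means)
```

Selecting with data[0] returns a whole row, while data[:, 1] returns a whole column, so per-point and per-feature computations need no explicit loops.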
Computations are simple to express with NumPy: arithmetic operators work element by element on arrays of the same shape. Consider the following example:
import numpy as np
x = np.array([0, 2, 3])
y = np.array([1, 4, 5])
print(x + y)  # prints [1 6 8]: element-wise addition of the arrays
print(x - y)  # prints [-1 -2 -2]: element-wise subtraction of the arrays
print(x * y)  # prints [ 0  8 15]: element-wise multiplication of the arrays
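It may help to contrast this element-wise behaviour with plain Python lists, where the same operators mean something different. The sketch below uses the same numbers as above; the names x_list and y_list are introduced here only for the comparison.

```python
import numpy as np

x_list = [0, 2, 3]
y_list = [1, 4, 5]
x = np.array(x_list)
y = np.array(y_list)

print(x_list + y_list)  # lists concatenate: [0, 2, 3, 1, 4, 5]
print(x + y)            # arrays add element by element: [1 6 8]
print(x_list * 2)       # a list is repeated: [0, 2, 3, 0, 2, 3]
print(x * 2)            # an array is scaled element by element: [0 4 6]
```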
Other operations using NumPy
NumPy is not limited to simple operations such as addition, subtraction, and multiplication. It also supports linear algebra operations such as the transpose and the dot product. Consider the following simple example:
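import numpy as np
# .T leaves a 1-D array unchanged, so x is created here as a 2-D array with one row
x = np.array([[0, 2, 3]])  # shape (1, 3)
print(x.T)  # prints the transpose of x, the column [[0], [2], [3]] with shape (3, 1)
x = np.array([1, 2])
y = np.array([3, 4])
print(np.dot(x, y))  # prints 11, the dot product 1*3 + 2*4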
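The same two operations also apply to two-dimensional arrays, where np.dot performs matrix multiplication. The following is a small sketch with a made-up 2x3 matrix named m, chosen only for illustration.

```python
import numpy as np

# Hypothetical 2x3 matrix.
m = np.array([[1, 2, 3],
              [4, 5, 6]])

print(m.shape)    # prints (2, 3)
print(m.T.shape)  # prints (3, 2): rows and columns are swapped

# Multiplying a (2, 3) matrix by its (3, 2) transpose gives a (2, 2) matrix.
product = np.dot(m, m.T)
print(product)    # the rows of the result are [14 32] and [32 77]
```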
Summary
NumPy makes it convenient to store and process data. This convenience lets us write machine learning code that focuses on the algorithm rather than on data handling, and the resulting code is cleaner, more intuitive, and easier to understand.