NumPy Tutorial for Beginners – Arrays, Functions & Operations

What is NumPy?

NumPy is a Python library for scientific computing that supports large, multi-dimensional arrays and matrices and a large collection of high-level mathematical functions to operate on these arrays. It is widely used in data analysis, machine learning, and scientific research.

NumPy provides a powerful array object used as a container for storing and manipulating large sets of numerical data. NumPy arrays are similar to Python lists but with additional functionality optimized for numerical operations. NumPy arrays allow you to perform mathematical operations on entire arrays, which is faster and more efficient than performing the same operations on individual elements.
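As a small illustration of the whole-array operations described above, the sketch below doubles a set of values first with a plain Python list and then with a NumPy array. The variable names and sample values are made up for this example.

```python
import numpy as np

prices = [10.0, 20.0, 30.0]   # plain Python list (illustrative data)

# With a list, each element has to be processed individually.
doubled_list = [p * 2 for p in prices]

# With a NumPy array, the operation applies to the whole array at once.
price_array = np.array(prices)
doubled_array = price_array * 2

print(doubled_list)    # [20.0, 40.0, 60.0]
print(doubled_array)   # [20. 40. 60.]
```

Both approaches give the same numbers; the array version needs no explicit loop.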

What Does NumPy Stand For?

NumPy stands for “Numerical Python.” It is a Python library that supports large, multi-dimensional arrays and matrices, along with a range of mathematical functions to manipulate and analyze them. NumPy was created in 2005, and since then it has become an essential component of the scientific Python ecosystem. The library is open-source and freely available to anyone who wants to use it. It is actively maintained and developed by a community of contributors, and it is used by millions of developers and scientists worldwide.

Why NumPy?

NumPy is a popular library for scientific computing in Python. It supports large, multi-dimensional arrays and matrices and a wide range of mathematical functions to operate on these arrays.

Here are some of the reasons why NumPy is used so extensively in scientific computing:

  1. Fast and efficient: NumPy's core is implemented in C, so operations on arrays run much faster than equivalent loops over Python data structures such as lists. This speed is essential when working with large datasets.
  2. Multi-dimensional arrays: NumPy provides a powerful array object for storing and manipulating large sets of numerical data. Mathematical operations apply to entire arrays at once, which is faster and more concise than processing individual elements in a Python loop.
  3. Broadcasting: NumPy’s broadcasting capability allows you to perform operations on arrays with different but compatible shapes without writing explicit loops. Broadcasting can significantly reduce the amount of code needed to perform complex operations.
  4. Mathematical functions: NumPy includes many high-level mathematical functions, including trigonometric, logarithmic, and exponential functions, linear algebra functions, statistical functions, and more. These functions are optimized for NumPy arrays and are much faster than applying Python's built-in equivalents element by element.
  5. Memory efficiency: NumPy arrays store elements of a single data type in a contiguous block of memory, without the per-element object overhead of a Python list. This makes them more memory-efficient than traditional Python data structures, which matters when working with large datasets.
  6. Integration with other libraries: NumPy is widely used in the scientific Python ecosystem and is often used in conjunction with other libraries, such as SciPy, pandas, and scikit-learn. NumPy arrays can be seamlessly integrated with these libraries, making it easier to perform complex scientific computations.
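The memory layout mentioned in the list above can be inspected directly on an array. The following is a minimal sketch; the array size of 1000 is an arbitrary choice for illustration.

```python
import numpy as np

a = np.arange(1000, dtype=np.int64)   # 1000 integers, 8 bytes each

print(a.itemsize)                 # 8     (bytes per element)
print(a.nbytes)                   # 8000  (bytes used by all elements)
print(a.flags["C_CONTIGUOUS"])    # True  (stored as one contiguous block)
```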

How to Install NumPy in Python?

NumPy is a Python package that can be installed using several methods, including:

Using pip

Pip is a package manager for Python that allows you to easily install, upgrade, and manage Python packages. To install NumPy using pip, open a command prompt or terminal window and enter the following command:

pip install numpy

This will download and install the latest version of NumPy from the PyPI (Python Package Index) repository.

Using Anaconda

Anaconda is a popular Python distribution with a wide range of scientific computing packages, including NumPy, SciPy, and Matplotlib. If you’re using Anaconda, NumPy is already included, and you can use it without needing to install it separately.

Using a package manager

If you’re using a Linux-based operating system, you can use the system’s package manager to install NumPy. For example, on Ubuntu or Debian, you can use the apt-get package manager to install NumPy by running the following command:

sudo apt-get install python3-numpy

From source

You can also install NumPy from source if you prefer. To do this, you’ll need to download the NumPy source code from the NumPy website, extract it, and then build and install it with pip. Older NumPy releases were installed by running python setup.py install, but recent releases use a Meson-based build that is driven through pip. Building from source also requires a C compiler. Here are the basic steps:

  • Download the NumPy source code from the NumPy website (https://numpy.org).
  • Extract the contents of the downloaded file to a directory of your choice.
  • Open a command prompt or terminal window and navigate to the directory where you extracted the NumPy source code.

Run the following command to install NumPy:

pip install .

Once NumPy is installed, you can import it into your Python code using the following command:

import numpy as np

This will allow you to use the NumPy functions and data structures in your code.
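A quick way to check that the installation worked is to import the library and print its version. This is only a sanity check; the version string you see depends on your installation.

```python
import numpy as np

print(np.__version__)        # version string; the value depends on your installation
print(np.array([1, 2, 3]))   # [1 2 3]
```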

What are NumPy Arrays?

NumPy arrays are a powerful and flexible data structure essential for scientific computing, data analysis, and machine learning. They provide a more efficient and optimized way to store and manipulate numerical data than traditional Python lists. Their support for multidimensional arrays and mathematical operations makes them a valuable tool for a wide range of applications.

NumPy Arrays in Python

In NumPy, there are several types of arrays you can create.

The main types of arrays in NumPy include:

One Dimensional (1D arrays)

A 1-dimensional array, also known as a 1D array or vector, is a NumPy array that contains a single sequence of elements arranged in a linear fashion. It is a type of array with only one axis, which can be accessed using indexing.

import numpy as np

arr = np.array([1, 2, 3, 4, 5])
print(arr)
output: [1 2 3 4 5]

2 Dimensional Arrays (2D Arrays)

A 2-dimensional array, also known as a 2D array or matrix, is a NumPy array that contains a collection of sequences arranged in a grid or table-like fashion. It is a type of array with two axes: the rows and the columns.

import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(arr)
output: [[1 2 3]
        [4 5 6]
        [7 8 9]]

Multi-Dimensional arrays

A multi-dimensional array, also known as an n-dimensional array, is a NumPy array that contains a collection of sequences arranged in a multi-dimensional grid or table-like fashion. It is an array with n-axes, where n is any positive integer.

import numpy as np

my_array = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
print(my_array)
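To see how the axes of such an array are organized, it can help to inspect its attributes. The sketch below reuses the same 3D array and indexes one element with one index per axis.

```python
import numpy as np

my_array = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])

print(my_array.ndim)       # 3          (number of axes)
print(my_array.shape)      # (2, 2, 2)  (size along each axis)
print(my_array.size)       # 8          (total number of elements)
print(my_array[1, 0, 1])   # 6          (block 1, row 0, column 1)
```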

Homogeneous arrays

In NumPy, homogeneous arrays refer to arrays that contain elements of the same data type. This means that every element in the array has the same data type, whether it is an integer, float, or any other data type supported by NumPy.

NumPy provides several functions for creating homogeneous arrays, including the array() function and various special functions for creating arrays with specific properties, such as zeros(), ones(), empty(), and full().

import numpy as np

my_array = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
print(my_array)
Output: 
[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]
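The creation functions mentioned above, zeros(), ones(), empty(), and full(), can be sketched as follows. The shapes and the fill value 7 are arbitrary choices for illustration.

```python
import numpy as np

zeros = np.zeros((2, 3))            # 2x3 array filled with 0.0
ones = np.ones((2, 3), dtype=int)   # 2x3 array filled with integer 1
sevens = np.full((2, 2), 7)         # 2x2 array filled with 7
scratch = np.empty((2, 2))          # 2x2 array with uninitialized contents

print(zeros.dtype)    # float64
print(sevens)
# [[7 7]
#  [7 7]]

# Mixed numeric input is converted to one common type.
mixed = np.array([1, 2.5])
print(mixed.dtype)    # float64
```

Note that empty() only allocates memory, so its contents are arbitrary until you assign values.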

Heterogeneous Arrays

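Heterogeneous arrays refer to arrays that hold elements of different types, such as a mixture of integers, floats, strings, or lists. Strictly speaking, every NumPy array has a single dtype; a heterogeneous array uses the object dtype, so each element is a reference to an arbitrary Python object. Object arrays lose most of NumPy's speed and memory advantages.

To create a heterogeneous array in NumPy, you can use the array() function with a list of objects of different data types and pass dtype=object. Recent NumPy versions raise an error for ragged input like the list below if dtype=object is omitted.

For example, the following code creates a 1D object array with elements of different data types:

import numpy as np

my_array = np.array([1, 2.5, "hello", [4, 5, 6]], dtype=object)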
print(my_array)
Output:

[1 2.5 'hello' list([4, 5, 6])]
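The sketch below shows what an object array actually stores: each element keeps its original Python type. It passes dtype=object explicitly, and the variable names are illustrative.

```python
import numpy as np

# dtype=object is given explicitly; each slot holds a reference to a Python object.
obj_array = np.array([1, 2.5, "hello", [4, 5, 6]], dtype=object)

print(obj_array.dtype)   # object
print(obj_array.shape)   # (4,)

type_names = [type(x).__name__ for x in obj_array]
print(type_names)        # ['int', 'float', 'str', 'list']
```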

Operations In NumPy

NumPy provides a wide range of operations that can be performed on arrays, making it a powerful tool for scientific computing and data analysis.

Here are some of the most common operations you can perform with NumPy arrays:

Indexing and Slicing:

Indexing in NumPy is similar to indexing in Python lists, with the added ability to index using multiple dimensions. To index a NumPy array, you can use square brackets and specify the index or indices of the element(s) you want to access.

For example, to access the element at row 0, column 1 of a 2D NumPy array called my_array, you can use the following code:

import numpy as np

my_array = np.array([[1, 2, 3], [4, 5, 6]])
print(my_array[0, 1])



Output: 
2

Slicing in NumPy allows you to extract subsets of an array. You can slice a NumPy array by specifying a range of indices for one or more array dimensions. The basic syntax for slicing is start:stop:step, where the start is the starting index of the slice, the stop is the ending index of the slice (exclusive), and step is the step size between elements in the slice.

For example, to extract the first two rows of a 2D NumPy array called my_array, you can use the following code:

import numpy as np

my_array = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(my_array[:2, :])


Output:

[[1 2 3]
 [4 5 6]]
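A few more slicing patterns, using the same 3x3 array, are sketched below: selecting a column, using a negative index, and using a step. The last lines show that a basic slice is a view onto the original data rather than a copy.

```python
import numpy as np

my_array = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

print(my_array[:, 1])       # [2 5 8]   every row, column 1
print(my_array[-1])         # [7 8 9]   last row
print(my_array[::2, ::2])   # rows 0 and 2, columns 0 and 2
# [[1 3]
#  [7 9]]

# A basic slice is a view: writing to it changes the original array.
first_row = my_array[0, :]
first_row[0] = 100
print(my_array[0, 0])       # 100
```

If you need an independent copy of a slice, call .copy() on it.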

Element-wise operations

Element-wise operations in NumPy refer to operations performed on each element of an array. These operations can be performed between arrays of the same shape or between an array and a scalar value.

The most basic element-wise operation in NumPy is addition. To add two arrays, they must have the same shape or shapes that are compatible under broadcasting. Here’s an example:

import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

c = a + b
print(c)

Output:

[5 7 9]
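The other arithmetic operators work element-wise in the same way. A short sketch with the same two arrays, including operations between an array and a scalar:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

print(a - b)    # [-3 -3 -3]
print(a * b)    # [ 4 10 18]
print(b / a)    # [4.  2.5 2. ]
print(a ** 2)   # [1 4 9]
print(a * 10)   # [10 20 30]
```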

Broadcasting

Broadcasting is a powerful feature in NumPy that allows arrays with different shapes to be combined in arithmetic operations. It eliminates the need for explicit loops or concatenation operations, making code more concise and efficient.

Broadcasting works by automatically aligning the shapes of two arrays in a way that makes them compatible for element-wise operations. The smaller array is “broadcast” across the larger array so that they have the same shape.

import numpy as np

a = np.array([1, 2, 3])
b = 2

c = a + b
print(c)


Output:

[3 4 5]
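Broadcasting also applies between arrays, not only between an array and a scalar. The sketch below adds a 1D row and a column to a 2D array, and shows that incompatible shapes raise an error. The array names and values are illustrative.

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6]])   # shape (2, 3)
row = np.array([10, 20, 30])                # shape (3,)
column = np.array([[100], [200]])           # shape (2, 1)

print(matrix + row)      # row is added to every row of matrix
# [[11 22 33]
#  [14 25 36]]

print(matrix + column)   # column is added to every column of matrix
# [[101 102 103]
#  [204 205 206]]

# Shapes (2, 3) and (2,) are not compatible, so NumPy raises ValueError.
try:
    matrix + np.array([1, 2])
    error_raised = False
except ValueError:
    error_raised = True
print(error_raised)      # True
```

Shapes are compared from the last axis backwards; two axes are compatible when their sizes are equal or one of them is 1.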

Reduction operations

Reduction operations in NumPy are used to compute aggregate values over an array. By default, these operations reduce an array to a single value by performing some mathematical operation on all of its elements; with the axis argument, they reduce along one dimension only.

Here are some of the commonly used reduction operations in NumPy:

  • Sum: Computes the sum of all elements in the array.

import numpy as np

a = np.array([1, 2, 3])
s = np.sum(a)
print(s)



Output :

6

  • Mean: Computes the arithmetic mean of all elements in the array.

import numpy as np

a = np.array([1, 2, 3])
m = np.mean(a)
print(m)


Output:

2.0

  • Standard deviation: Computes the standard deviation of all elements in the array.

import numpy as np

a = np.array([1, 2, 3])
sd = np.std(a)
print(sd)


Output:

0.816496580927726

  • Variance: Computes the variance of all elements in the array.

import numpy as np

a = np.array([1, 2, 3])
v = np.var(a)
print(v)


Output:

0.6666666666666666

  • Maximum and minimum: Computes the maximum and minimum values in the array.

import numpy as np

a = np.array([1, 2, 3])
max_val = np.max(a)
min_val = np.min(a)
print(max_val, min_val)


Output:

3 1
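The reductions above can also be applied along a single axis of a multi-dimensional array by passing the axis argument. The sketch below uses a small 2x3 array chosen for illustration. It also shows the ddof parameter of np.std: the default (ddof=0) gives the population standard deviation, while ddof=1 gives the sample standard deviation.

```python
import numpy as np

m = np.array([[1, 2, 3], [4, 5, 6]])

print(np.sum(m))            # 21             all elements
print(np.sum(m, axis=0))    # [5 7 9]        down each column
print(np.sum(m, axis=1))    # [ 6 15]        across each row
print(np.mean(m, axis=0))   # [2.5 3.5 4.5]  mean of each column

a = np.array([1, 2, 3])
sample_sd = np.std(a, ddof=1)   # divide by n - 1 instead of n
print(sample_sd)            # 1.0
```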

Data Type Objects in NumPy

Data Type Object (dtype) describes how the bytes in a fixed-size memory block should be interpreted as a particular data type. The dtype object is an essential component of NumPy as it provides the rules for interpreting the raw bytes of a memory block as a specific type of data.

NumPy provides a range of data types that can represent numbers, characters, and other kinds of data. Each data type is identified by a name, such as int16, and by a short character code that specifies the kind and size of the data.
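A dtype object can be created and inspected on its own. The sketch below looks at the 16-bit integer type; the attributes shown (name, kind, char, itemsize) are standard dtype attributes.

```python
import numpy as np

dt = np.dtype(np.int16)

print(dt.name)       # int16
print(dt.kind)       # i  (signed integer)
print(dt.char)       # h  (character code)
print(dt.itemsize)   # 2  (bytes per element)

# The string "i2" (integer, 2 bytes) describes the same type.
same_type = np.dtype("i2") == np.int16
print(same_type)     # True
```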

Integer Types

NumPy provides a range of integer data types, which are used to represent whole numbers. The following table shows some of the most commonly used integer types in NumPy:

Data Type    Description
int8         8-bit integer
int16        16-bit integer
int32        32-bit integer
int64        64-bit integer

import numpy as np

x = np.array([1, 2, 3], dtype=np.int16)
print(x.dtype)  

 output: int16
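The size of an integer type determines the range of values it can hold. As a quick sketch, np.iinfo reports the limits of each type:

```python
import numpy as np

limits = []
for int_type in (np.int8, np.int16, np.int32, np.int64):
    info = np.iinfo(int_type)
    limits.append((info.bits, info.min, info.max))
    print(info.bits, info.min, info.max)

# 8 -128 127
# 16 -32768 32767
# 32 -2147483648 2147483647
# 64 -9223372036854775808 9223372036854775807
```

Choosing the smallest type that fits your data saves memory, but values outside the range cannot be represented.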

Floating-point Types

NumPy provides a range of floating-point data types, which are used to represent real numbers. The following table shows some of the most commonly used floating-point types in NumPy:

Data Type    Description
float16      Half-precision floating-point
float32      Single-precision floating-point
float64      Double-precision floating-point
float128     Extended-precision floating-point (platform-dependent; not available on every system)

import numpy as np

x = np.array([1.0, 2.0, 3.0], dtype=np.float32)
print(x.dtype) 

output: float32
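The floating-point types differ in how many significant decimal digits they can represent. A minimal sketch using np.finfo, followed by a comparison showing that the same literal is stored differently at different precisions:

```python
import numpy as np

precisions = []
for float_type in (np.float16, np.float32, np.float64):
    info = np.finfo(float_type)
    precisions.append((info.bits, info.precision))
    print(info.bits, info.precision)   # size in bits, approximate decimal digits

# 16 3
# 32 6
# 64 15

# 0.1 cannot be stored exactly, and each precision rounds it differently.
same_value = bool(np.float32(0.1) == np.float64(0.1))
print(same_value)   # False
```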

Boolean Type

NumPy provides a boolean data type, which is used to represent Boolean values (True or False). The following table shows the Boolean type in NumPy:

Data Type    Description
bool_        Boolean (True or False)

import numpy as np

x = np.array([True, False, True], dtype=np.bool_)
print(x.dtype)   

output: bool
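Boolean arrays most often appear as the result of a comparison, and they can then be used as a mask to select elements. A short sketch with illustrative values:

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])

mask = a > 2        # element-wise comparison gives a Boolean array
print(mask)         # [False False  True  True  True]
print(a[mask])      # [3 4 5]   elements where the mask is True
print(mask.sum())   # 3         True counts as 1
```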

String Type

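NumPy provides fixed-length string data types, which are used to represent strings of characters. The following table shows the byte-string type in NumPy. It was also available under the alias string_ in older releases, but that alias was removed in NumPy 2.0:

Data Type    Description
bytes_       Fixed-length byte string

import numpy as np

x = np.array(['hello', 'world'], dtype=np.bytes_)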
print(x.dtype)   


output: |S5
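The length in a string dtype is fixed when the array is created, so longer values are truncated on assignment. The sketch below uses the dtype code "S5" (byte strings of at most 5 bytes) and, for comparison, the Unicode string type that NumPy picks by default for Python strings. The values are illustrative.

```python
import numpy as np

# Byte strings with a fixed length of 5.
b = np.array(["hello", "world"], dtype="S5")
b[0] = b"numpy-rocks"    # longer than 5 bytes, so it is truncated
print(b[0])              # b'numpy'

# Without an explicit dtype, Python strings become fixed-length Unicode.
u = np.array(["hello", "world"])
print(u.dtype.kind)      # U
print(u.dtype.itemsize)  # 20  (5 characters x 4 bytes each)
```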

Conclusion

NumPy is a powerful library for data analysis in Python. It provides many powerful features, such as broadcasting, reduction operations, and universal functions. Broadcasting allows you to perform operations on arrays with different shapes, while reduction operations allow you to aggregate data along one or more axes. Universal functions operate on NumPy arrays element-wise and are optimized for speed and efficiency.

NumPy is an essential library for any data scientist or analyst working with numerical data in Python. Its powerful features and efficient implementation make it a valuable tool for various applications, from scientific computing to machine learning and beyond.
