5.2. NumPy Function Reference and Usage Examples
5.2.1. Array Creation Functions
Array Creation Functions in NumPy are fundamental for initializing arrays with various contents and shapes. These functions simplify tasks such as creating arrays filled with specific values like zeros and ones or reshaping arrays to suit your needs. Explore the table below for clear descriptions and practical examples of each function [Harris et al., 2020, NumPy Developers, 2023].
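Function | Description | Example |
---|---|---|
numpy.array | Create an array from a Python list or tuple. | np.array([1, 2, 3]) |
numpy.zeros | Create an array filled with zeros. | np.zeros((3, 3)) |
numpy.ones | Create an array filled with ones. | np.ones((2, 4)) |
numpy.empty | Create an uninitialized array. | np.empty((2, 2)) |
numpy.arange | Create an array with evenly spaced values, given a step size. | np.arange(0, 10, 2) |
numpy.linspace | Create an array with a given number of evenly spaced values over a range. | np.linspace(0, 1, 5) |
numpy.eye | Create a 2-D array with ones on the diagonal. | np.eye(3) |
numpy.reshape | Reshape an array to a specified shape. | np.reshape(original_array, (2, 3)) |
numpy.transpose | Permute the dimensions of an array. | np.transpose(array_to_permute) |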
You can access a comprehensive list of functions and their usage by visiting the following link: Numpy Array Manipulation Routines.
import numpy as np
import pprint
def print_bold(txt):
    print("\033[1m" + txt + "\033[0m")
# Create an array from a Python list or tuple
arr = np.array([1, 2, 3])
print_bold("Array from a Python list or tuple:")
pprint.pprint(arr)
# Create an array filled with zeros
zeros_array = np.zeros((3, 3))
print_bold("\nArray filled with zeros:")
pprint.pprint(zeros_array)
# Create an array filled with ones
ones_array = np.ones((2, 4))
print_bold("\nArray filled with ones:")
pprint.pprint(ones_array)
# Create an uninitialized array
empty_array = np.empty((2, 2))
print_bold("\nUninitialized array:")
pprint.pprint(empty_array)
# Create an array with evenly spaced values
arange_array = np.arange(0, 10, 2)
print_bold("\nArray with evenly spaced values:")
pprint.pprint(arange_array)
# Create an array with evenly spaced values over a range
linspace_array = np.linspace(0, 1, 5)
print_bold("\nArray with evenly spaced values over a range:")
pprint.pprint(linspace_array)
# Create a 2-D array with ones on the diagonal
eye_array = np.eye(3)
print_bold("\n2-D array with ones on the diagonal:")
pprint.pprint(eye_array)
# Reshape an array to a specified shape
original_array = np.array([1, 2, 3, 4, 5, 6])
reshaped_array = np.reshape(original_array, (2, 3))
print_bold("\nReshaped array:")
pprint.pprint(reshaped_array)
# Permute the dimensions of an array
array_to_permute = np.array([[1, 2], [3, 4], [5, 6]])
transposed_array = np.transpose(array_to_permute)
print_bold("\nTransposed array:")
pprint.pprint(transposed_array)
Array from a Python list or tuple:
array([1, 2, 3])
Array filled with zeros:
array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]])
Array filled with ones:
array([[1., 1., 1., 1.],
       [1., 1., 1., 1.]])
Uninitialized array:
array([[2.33645657e-307, 2.67023123e-307],
       [6.23040373e-307, 1.60219035e-306]])
Array with evenly spaced values:
array([0, 2, 4, 6, 8])
Array with evenly spaced values over a range:
array([0.  , 0.25, 0.5 , 0.75, 1.  ])
2-D array with ones on the diagonal:
array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])
Reshaped array:
array([[1, 2, 3],
       [4, 5, 6]])
Transposed array:
array([[1, 3, 5],
       [2, 4, 6]])
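The two "evenly spaced" functions above are easy to confuse. The following is a minimal sketch of the difference, assuming the default settings of both functions; the variable names by_step and by_count are illustrative only.

```python
import numpy as np

# np.arange takes a step size and excludes the stop value.
by_step = np.arange(0, 1, 0.25)

# np.linspace takes a number of points and includes the stop value by default.
by_count = np.linspace(0, 1, 5)

print(by_step.tolist())   # [0.0, 0.25, 0.5, 0.75]
print(by_count.tolist())  # [0.0, 0.25, 0.5, 0.75, 1.0]
```

In short, np.arange is convenient when the spacing matters, and np.linspace when the number of points and both endpoints matter.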
Note
Creating an uninitialized array means allocating memory space for an array without initializing its elements to specific values. In other words, the values within the array are not set to any particular initial values like zeros, ones, or any other predefined values.
When you create an uninitialized array, the values in the array will contain whatever data happened to be in that memory location before, which could be any random or undefined data. This can be useful in situations where you plan to fill the array with values later in your code, and you want to save the time it takes to initialize all the elements to a specific value.
However, it’s essential to note that using an uninitialized array can lead to unpredictable behavior if you try to use its values before assigning meaningful data to them. Therefore, it’s crucial to initialize the array’s elements explicitly if you rely on specific initial values for your computations.
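As a minimal sketch of the pattern described in this note, the example below allocates an array with np.empty and assigns every element before reading any of them. The array name squares and the fill rule are illustrative assumptions, not part of the original example.

```python
import numpy as np

# Allocate without initializing: the contents are arbitrary until assigned.
squares = np.empty(5, dtype=np.int64)

# Assign every element before any value is read.
for i in range(squares.size):
    squares[i] = i ** 2

print(squares.tolist())  # [0, 1, 4, 9, 16]
```

If any element might be read before it is written, np.zeros is the safer choice.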
5.2.2. Array Manipulation Functions
Array Manipulation Functions in NumPy provide essential tools for modifying and combining arrays. These functions enable you to concatenate arrays along specified axes, stack arrays to create new dimensions, and split arrays into multiple sub-arrays. Explore the table below for detailed descriptions and practical examples of each function [Harris et al., 2020, NumPy Developers, 2023].
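Function | Description | Example |
---|---|---|
numpy.concatenate | Join arrays along a specified axis. | np.concatenate((array1, array2), axis=0) |
numpy.stack | Join arrays along a new axis. | np.stack((array1, array2)) |
numpy.split | Split an array into multiple sub-arrays. | np.split(array_to_split, 4) |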
You can access an extensive list of functions and their usage by visiting the following link: Numpy Array Manipulation Routines.
# Create arrays for manipulation
array1 = np.array([[1, 2], [3, 4]])
array2 = np.array([[5, 6], [7, 8]])
# Join arrays along a specified axis
concatenated_array = np.concatenate((array1, array2), axis=0)
print_bold("Concatenated Arrays:")
pprint.pprint(concatenated_array)
# Join arrays along a new axis
stacked_array = np.stack((array1, array2))
print_bold("\nStacked Arrays:")
pprint.pprint(stacked_array)
# Split an array into multiple sub-arrays
array_to_split = np.array([1, 2, 3, 4, 5, 6, 7, 8])
split_arrays = np.split(array_to_split, 4)
print_bold("\nSplit Arrays:")
for i, sub_array in enumerate(split_arrays):
    print(f"Split Array {i + 1}:")
    pprint.pprint(sub_array)
Concatenated Arrays:
array([[1, 2],
       [3, 4],
       [5, 6],
       [7, 8]])
Stacked Arrays:
array([[[1, 2],
        [3, 4]],
       [[5, 6],
        [7, 8]]])
Split Arrays:
Split Array 1:
array([1, 2])
Split Array 2:
array([3, 4])
Split Array 3:
array([5, 6])
Split Array 4:
array([7, 8])
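The effect of the axis argument is easiest to see from the resulting shapes. The sketch below reuses the same two 2x2 arrays under the illustrative names a and b, and also shows np.array_split, which (unlike np.split) accepts a length that does not divide evenly.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

rows = np.concatenate((a, b), axis=0)  # rows appended: shape (4, 2)
cols = np.concatenate((a, b), axis=1)  # columns appended: shape (2, 4)
stacked = np.stack((a, b), axis=0)     # new leading axis: shape (2, 2, 2)

print(rows.shape, cols.shape, stacked.shape)  # (4, 2) (2, 4) (2, 2, 2)

# np.split raises an error for an uneven division; np.array_split does not.
parts = np.array_split(np.arange(7), 3)
print([p.tolist() for p in parts])  # [[0, 1, 2], [3, 4], [5, 6]]
```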
5.2.3. Element-wise Operations
Element-wise operations apply a function to each element of an array, or to corresponding elements of two arrays. They include addition, subtraction, multiplication, division, exponential calculations, natural logarithms, square roots, and trigonometric functions like sine and cosine. Explore the table below for in-depth descriptions and practical examples of each operation [Harris et al., 2020, NumPy Developers, 2023].
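Function | Description | Example |
---|---|---|
numpy.add | Element-wise addition of two arrays. | np.add(array1, array2) |
numpy.subtract | Element-wise subtraction of two arrays. | np.subtract(array1, array2) |
numpy.multiply | Element-wise multiplication of two arrays. | np.multiply(array1, array2) |
numpy.divide | Element-wise division of two arrays. | np.divide(array1, array2) |
numpy.exp | Element-wise exponential function. | np.exp(array1) |
numpy.log | Element-wise natural logarithm. | np.log(array1) |
numpy.sqrt | Element-wise square root. | np.sqrt(array1) |
numpy.sin | Element-wise sine function (input in radians). | np.sin(array1) |
numpy.cos | Element-wise cosine function (input in radians). | np.cos(array1) |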
You can access a comprehensive list of functions and their usage by referring to the following link: Numpy Mathematical Functions.
# Create arrays for element-wise operations
array1 = np.array([1, 2, 3])
array2 = np.array([4, 5, 6])
# Element-wise addition of two arrays
addition_result = np.add(array1, array2)
print_bold("Element-wise Addition:")
pprint.pprint(addition_result)
# Element-wise subtraction of two arrays
subtraction_result = np.subtract(array1, array2)
print_bold("\nElement-wise Subtraction:")
pprint.pprint(subtraction_result)
# Element-wise multiplication of two arrays
multiplication_result = np.multiply(array1, array2)
print_bold("\nElement-wise Multiplication:")
pprint.pprint(multiplication_result)
# Element-wise division of two arrays
division_result = np.divide(array1, array2)
print_bold("\nElement-wise Division:")
pprint.pprint(division_result)
# Element-wise exponential function
exp_result = np.exp(array1)
print_bold("\nElement-wise Exponential:")
pprint.pprint(exp_result)
# Element-wise natural logarithm
log_result = np.log(array1)
print_bold("\nElement-wise Natural Logarithm:")
pprint.pprint(log_result)
# Element-wise square root
sqrt_result = np.sqrt(array1)
print_bold("\nElement-wise Square Root:")
pprint.pprint(sqrt_result)
# Element-wise sine function
sin_result = np.sin(array1)
print_bold("\nElement-wise Sine Function:")
pprint.pprint(sin_result)
# Element-wise cosine function
cos_result = np.cos(array1)
print_bold("\nElement-wise Cosine Function:")
pprint.pprint(cos_result)
Element-wise Addition:
array([5, 7, 9])
Element-wise Subtraction:
array([-3, -3, -3])
Element-wise Multiplication:
array([ 4, 10, 18])
Element-wise Division:
array([0.25, 0.4 , 0.5 ])
Element-wise Exponential:
array([ 2.71828183, 7.3890561 , 20.08553692])
Element-wise Natural Logarithm:
array([0. , 0.69314718, 1.09861229])
Element-wise Square Root:
array([1. , 1.41421356, 1.73205081])
Element-wise Sine Function:
array([0.84147098, 0.90929743, 0.14112001])
Element-wise Cosine Function:
array([ 0.54030231, -0.41614684, -0.9899925 ])
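The arithmetic functions above also have operator shorthands. The following sketch, using the illustrative names x and y, checks that the operators give the same results as the named functions and shows a scalar being applied to every element.

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([4, 5, 6])

# The arithmetic operators are shorthand for the element-wise functions.
same_add = np.array_equal(x + y, np.add(x, y))
same_mul = np.array_equal(x * y, np.multiply(x, y))
same_div = np.array_equal(x / y, np.divide(x, y))
print(same_add, same_mul, same_div)  # True True True

# A scalar operand is applied to every element.
scaled = x * 10
print(scaled.tolist())  # [10, 20, 30]

# exp and log are inverses of each other, up to floating-point rounding.
round_trip = np.allclose(np.log(np.exp(x)), x)
print(round_trip)  # True
```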
5.2.4. Statistical and Mathematical Functions
You can calculate the dot product, sum, mean, standard deviation, minimum, and maximum values of arrays. Additionally, perform element-wise comparisons, logical operations, and even conditional selection of elements. Dive into the table below for comprehensive descriptions and practical examples of each function [Harris et al., 2020, NumPy Developers, 2023].
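Function | Description | Example |
---|---|---|
numpy.dot | Dot product of two arrays. | np.dot(array1, array2) |
numpy.prod | Product of array elements. | np.prod(array1) |
numpy.sum | Sum of array elements. | np.sum(array1) |
numpy.mean | Mean (average) of array elements. | np.mean(array1) |
numpy.median | Median value of an array. | np.median(array1) |
numpy.std | Standard deviation of array elements. | np.std(array1) |
numpy.min | Minimum value in an array. | np.min(array1) |
numpy.max | Maximum value in an array. | np.max(array1) |
numpy.equal | Element-wise comparison of two arrays for equality. | np.equal(array1, array2) |
numpy.logical_and | Element-wise logical AND of two arrays. | np.logical_and(array1 > 2, array2 < 4) |
numpy.where | Return elements chosen from two arrays depending on a condition. | np.where(condition, array1, array2) |
numpy.argmax | Index of the maximum value in an array. | np.argmax(array1) |
numpy.argmin | Index of the minimum value in an array. | np.argmin(array1) |
numpy.histogram | Compute the histogram of a set of data. | np.histogram(data, bins=4) |
numpy.percentile | Compute the q-th percentile of the data. | np.percentile(data, q=25) |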
Remark
Mean (numpy.mean):
The mean, also known as the average, is a measure of central tendency.
It is calculated as the sum of all values in an array divided by the total number of values.
In NumPy, one can compute the mean using the numpy.mean function: mean = np.mean(array)
Mathematically, the mean (\(\mu\)) is defined as
\[\mu = \frac{\sum{x}}{N} \tag{5.1}\]
where:
\(\mu\) is the mean,
\(\sum{x}\) represents the sum of all values in the array, and
\(N\) is the total number of values in the array.
For more details, refer to this link.
Median (numpy.median):
The median is another measure of central tendency and is the middle value of a sorted dataset.
For an odd number of values, it’s the middle value. For an even number of values, it’s the average of the two middle values.
In NumPy, you can compute the median using the numpy.median function: median = np.median(array)
Mathematically, the median is defined as the middle value of a dataset after sorting. If there are \(N\) values:
For odd \(N\), the median is the value at position \((N + 1) / 2\).
For even \(N\), the median is the average of the values at positions \(N / 2\) and \((N / 2) + 1\) after sorting.
For more details, please see this link.
Standard Deviation (numpy.std):
The standard deviation measures the dispersion or spread of data points in a dataset.
It quantifies how much individual data points deviate from the mean.
In NumPy, you can compute the standard deviation using the numpy.std function: std_deviation = np.std(array)
Mathematically, the standard deviation (\(\sigma\)) is defined as:
\[\sigma = \sqrt{\frac{\sum{(x - \mu)^2}}{N - \text{ddof}}} \tag{5.2}\]
where:
\(\sigma\) is the standard deviation,
\(\sum{(x - \mu)^2}\) represents the sum of squared differences between each value (\(x\)) and the mean (\(\mu\)),
\(N\) is the total number of values in the array, and
ddof is the delta degrees of freedom, a correction subtracted from \(N\) in the divisor. The default value is zero.
For more details, please see this link.
Histogram (numpy.histogram):
numpy.histogram is a function used to compute the histogram of a dataset, which is a representation of the distribution of data.
It divides the data into bins or intervals and counts the number of data points that fall into each bin.
To create a histogram with ‘bins’ number of intervals:
Determine the range of your data, usually from the minimum (min_x) to the maximum (max_x) value in your dataset.
Divide the range into ‘bins’ equally spaced intervals.
Count how many data points from your dataset fall into each interval.
Percentile (numpy.percentile):
numpy.percentile is a function used to calculate the p-th percentile of a dataset, which is a measure of relative standing within the data.
Percentiles divide the data into 100 equal parts, and the p-th percentile represents the value below which p percent of the data falls.
To find the p-th percentile of a dataset x with NumPy’s default linear interpolation:
Sort the data in ascending order.
Calculate the rank (1-based position) of the percentile value using the formula:
\[\text{Rank} = \left(\frac{p}{100}\right) \cdot (N - 1) + 1 \tag{5.3}\]
where \(N\) is the total number of data points.
If the rank is an integer, the percentile value is the value at that rank in the sorted data.
If the rank is not an integer, interpolate linearly between the values at the floor(rank) and ceil(rank) positions.
Other rank conventions exist, such as \((p / 100) \cdot (N + 1)\), and they can give slightly different values on small datasets.
For example, the 50th percentile is the median, which is the value below which 50% of the data falls.
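The definitions in this remark can be checked by hand on a small dataset. The sketch below is illustrative only: the names ending in _by_hand are assumptions introduced here, and the percentile step assumes NumPy's default linear interpolation, which places the p-th percentile at zero-based position (p / 100) * (N - 1) in the sorted data.

```python
import numpy as np

data = np.array([1, 2, 2, 3, 3, 3, 4, 4, 4, 4])
n = data.size
s = np.sort(data)

# Mean: sum of all values divided by the number of values.
mean_by_hand = data.sum() / n

# Median: n is even here, so average the two middle values of the sorted data.
median_by_hand = (s[n // 2 - 1] + s[n // 2]) / 2

# Histogram: 4 equal-width bins from min to max, then count values per bin.
edges = np.linspace(s[0], s[-1], 5)
counts_by_hand = [
    int(np.sum((data >= lo) & (data < hi)))
    for lo, hi in zip(edges[:-2], edges[1:-1])
]
# The last bin also includes its right edge.
counts_by_hand.append(int(np.sum((data >= edges[-2]) & (data <= edges[-1]))))

# Percentile: interpolate linearly between the two neighbouring sorted values.
p = 25
pos = (p / 100) * (n - 1)
lower = int(np.floor(pos))
upper = min(lower + 1, n - 1)
percentile_by_hand = s[lower] + (pos - lower) * (s[upper] - s[lower])

print(mean_by_hand == np.mean(data))                            # True
print(median_by_hand == np.median(data))                        # True
print(counts_by_hand == np.histogram(data, bins=4)[0].tolist())  # True
print(np.isclose(percentile_by_hand, np.percentile(data, p)))   # True
```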
Note
The standard deviation is a measure of how data points deviate from the mean: it is the square root of the average squared deviation from the mean. In NumPy, with x = abs(a - a.mean())**2 for an array a, the average squared deviation is typically calculated as x.sum() / N, where N is the number of data points (N = len(x)). However, when the ‘ddof’ (Delta Degrees of Freedom) parameter is specified, it adjusts the divisor, which becomes N - ddof.
In standard statistical practice, setting ddof=1 gives an unbiased estimator of the variance of the population from which the sample was drawn. Alternatively, setting ddof=0 provides a maximum likelihood estimate of the variance, assuming normally distributed variables.
It’s important to note that the standard deviation is the square root of the estimated variance. Even when using ddof=1 for an unbiased variance estimate, the resulting standard deviation is not itself an unbiased estimate.
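The effect of ddof can be reproduced directly from the divisor. This is a minimal sketch with illustrative names (values, population_std, sample_std) comparing the hand calculation with np.std.

```python
import numpy as np

values = np.array([1, 2, 3, 4, 5])
n = values.size

# Squared deviations from the mean.
squared_dev = np.abs(values - values.mean()) ** 2

# ddof=0 (the default) divides by N; ddof=1 divides by N - 1.
population_std = np.sqrt(squared_dev.sum() / n)
sample_std = np.sqrt(squared_dev.sum() / (n - 1))

print(np.isclose(population_std, np.std(values)))      # True
print(np.isclose(sample_std, np.std(values, ddof=1)))  # True
print(sample_std > population_std)                     # True
```

A smaller divisor always gives a larger result, so the ddof=1 value is never below the ddof=0 value for the same data.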
# Create arrays for statistical and mathematical operations
array1 = np.array([1, 2, 3, 4, 5])
array2 = np.array([5, 4, 3, 2, 1])
# Dot product of two arrays
dot_product = np.dot(array1, array2)
print_bold("Dot Product of Arrays:")
pprint.pprint(dot_product)
# Product of array elements
product_result = np.prod(array1)
print_bold("\nProduct of Array Elements:")
pprint.pprint(product_result)
# Sum of array elements
sum_result = np.sum(array1)
print_bold("\nSum of Array Elements:")
pprint.pprint(sum_result)
# Mean (average) of array elements
mean_result = np.mean(array1)
print_bold("\nMean (Average) of Array Elements:")
pprint.pprint(mean_result)
# Median of array elements
median_result = np.median(array1)
print_bold("\nmedian of Array Elements:")
pprint.pprint(median_result)
# Standard deviation of array elements
std_deviation = np.std(array1)
print_bold("\nStandard Deviation of Array Elements:")
pprint.pprint(std_deviation)
# Minimum value in an array
min_value = np.min(array1)
print_bold("\nMinimum Value in Array:")
pprint.pprint(min_value)
# Maximum value in an array
max_value = np.max(array1)
print_bold("\nMaximum Value in Array:")
pprint.pprint(max_value)
# Element-wise comparison of two arrays for equality
equal_result = np.equal(array1, array2)
print_bold("\nElement-wise Comparison for Equality:")
pprint.pprint(equal_result)
# Element-wise logical AND of two arrays
logical_and_result = np.logical_and(array1 > 2, array2 < 4)
print_bold("\nElement-wise Logical AND:")
pprint.pprint(logical_and_result)
# Return elements chosen from two arrays depending on a condition
condition = (array1 > 2)
where_result = np.where(condition, array1, array2)
print_bold("\nElements Chosen Based on Condition:")
pprint.pprint(where_result)
# Index of the maximum value in an array
argmax_result = np.argmax(array1)
print_bold("\nIndex of Maximum Value:")
pprint.pprint(argmax_result)
# Index of the minimum value in an array
argmin_result = np.argmin(array1)
print_bold("\nIndex of Minimum Value:")
pprint.pprint(argmin_result)
# Compute the histogram of a set of data
data = np.array([1, 2, 2, 3, 3, 3, 4, 4, 4, 4])
histogram, bins = np.histogram(data, bins=4)
print_bold("\nHistogram:")
pprint.pprint(histogram)
print_bold("\nBins:")
pprint.pprint(bins)
# Compute the q-th percentile of the data
percentile = np.percentile(data, q=25)
print_bold("\n25th Percentile:")
pprint.pprint(percentile)
Dot Product of Arrays:
35
Product of Array Elements:
120
Sum of Array Elements:
15
Mean (Average) of Array Elements:
3.0
median of Array Elements:
3.0
Standard Deviation of Array Elements:
1.4142135623730951
Minimum Value in Array:
1
Maximum Value in Array:
5
Element-wise Comparison for Equality:
array([False, False, True, False, False])
Element-wise Logical AND:
array([False, False, True, True, True])
Elements Chosen Based on Condition:
array([5, 4, 3, 4, 5])
Index of Maximum Value:
4
Index of Minimum Value:
0
Histogram:
array([1, 2, 3, 4], dtype=int64)
Bins:
array([1. , 1.75, 2.5 , 3.25, 4. ])
25th Percentile:
2.25
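The examples above all use 1-D arrays. For 2-D arrays, many of these functions accept an axis argument; the sketch below uses a small illustrative array named table to show how the result changes.

```python
import numpy as np

table = np.array([[1, 2, 3],
                  [4, 5, 6]])

total = np.sum(table)              # all elements
column_sums = np.sum(table, axis=0)  # collapse rows: one value per column
row_means = np.mean(table, axis=1)   # collapse columns: one value per row
flat_argmax = np.argmax(table)     # index into the flattened array

print(int(total))            # 21
print(column_sums.tolist())  # [5, 7, 9]
print(row_means.tolist())    # [2.0, 5.0]
print(int(flat_argmax))      # 5
```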
Remark
NumPy’s np.where() function offers versatile capabilities for performing conditional operations and obtaining specific results based on conditions. Here are several distinct usages of np.where():
Basic Usage:
import numpy as np
arr = np.arange(1, 8, 1)
condition = arr > 3
indices = np.where(condition)
print(indices)
Output:
(array([3, 4, 5, 6], dtype=int64),)
In this fundamental example, np.where() is employed to locate the indices where the condition arr > 3 is satisfied. As a result, it returns the array indices [3, 4, 5, 6], indicating the positions where the condition holds true within the original array.
Return Values Based on Conditions:
import numpy as np
arr = np.arange(1, 8, 1)
result = np.where(arr > 3, 'Yes', 'No')
print(result)
Output:
['No' 'No' 'No' 'Yes' 'Yes' 'Yes' 'Yes']
In this usage, np.where() takes a condition, arr > 3, and returns ‘Yes’ when the condition is met (i.e., when elements are greater than 3) and ‘No’ otherwise. This provides a convenient way to generate an array of values based on the condition.
Using Multiple Conditions:
import numpy as np
arr = np.arange(1, 8, 1)
result = np.where((arr > 2) & (arr < 5), 'In Range', 'Out of Range')
print(result)
Output:
['Out of Range' 'Out of Range' 'In Range' 'In Range' 'Out of Range' 'Out of Range' 'Out of Range']
In this scenario, np.where() allows the combination of two conditions, (arr > 2) and (arr < 5), and returns ‘In Range’ when both conditions are satisfied. Conversely, it returns ‘Out of Range’ when the conditions are not met.
Show Array Values Based on Conditions:
import numpy as np
arr = np.arange(1, 8, 1)
print(arr[np.where(arr > 3)])
Output:
[4 5 6 7]
In this usage, np.where() is used to identify the positions in the array where the condition arr > 3 is true, and subsequently, the values at those positions are displayed. This facilitates the display of array elements that meet a specific condition, such as values greater than 3.
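For selecting values, indexing with the boolean condition itself gives the same result as going through np.where(). The sketch below, with illustrative variable names, compares the two and also shows the three-argument form replacing values in a numeric array.

```python
import numpy as np

arr = np.arange(1, 8)

# Selecting values: index positions from np.where versus a boolean mask.
via_where = arr[np.where(arr > 3)]
via_mask = arr[arr > 3]
print(np.array_equal(via_where, via_mask))  # True

# Three-argument form: keep values up to 3, replace larger ones with 3.
capped = np.where(arr > 3, 3, arr)
print(capped.tolist())  # [1, 2, 3, 3, 3, 3, 3]
```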
5.2.5. Linear Algebra Functions (Optional Content)
You can resize arrays to specific shapes, compute matrix inverses, determinants, eigenvalues, and singular value decompositions. Additionally, solve linear matrix equations with ease. Explore the table below for comprehensive descriptions and practical examples of each function [Harris et al., 2020, NumPy Developers, 2023]:
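Function | Description | Example |
---|---|---|
numpy.resize | Resize an array to a specified shape. | np.resize(array, (3, 3)) |
numpy.linalg.inv | Compute the multiplicative inverse of a matrix. | np.linalg.inv(matrix) |
numpy.linalg.det | Compute the determinant of a matrix. | np.linalg.det(matrix) |
numpy.linalg.eig | Compute the eigenvalues and right eigenvectors of a square array. | np.linalg.eig(array) |
numpy.linalg.svd | Singular value decomposition of a matrix. | np.linalg.svd(matrix) |
numpy.linalg.solve | Solve a linear matrix equation. | np.linalg.solve(matrix, vector) |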
You can access an extensive list of functions and their usage related to linear algebra by visiting the following link: Numpy Linear Algebra Functions.
# Create arrays and matrices for matrix operations
array = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
matrix = np.array([[1, 2], [3, 4]])
vector = np.array([1, 2])
# Resize an array to a specified shape
resized_array = np.resize(array, (3, 3))
print_bold("Resized Array:")
pprint.pprint(resized_array)
# Compute the multiplicative inverse of a matrix
matrix_inverse = np.linalg.inv(matrix)
print_bold("\nMatrix Inverse:")
pprint.pprint(matrix_inverse)
# Compute the determinant of a matrix
matrix_determinant = np.linalg.det(matrix)
print_bold("\nMatrix Determinant:")
pprint.pprint(matrix_determinant)
# Compute the eigenvalues and right eigenvectors of a square array
eigenvalues, eigenvectors = np.linalg.eig(array)
print_bold("\nEigenvalues:")
pprint.pprint(eigenvalues)
print_bold("\nEigenvectors:")
pprint.pprint(eigenvectors)
# Singular value decomposition of a matrix
U, S, VT = np.linalg.svd(matrix)
print_bold("\nSingular Value Decomposition (SVD):")
print("U:")
pprint.pprint(U)
print("S:")
pprint.pprint(S)
print("VT:")
pprint.pprint(VT)
# Solve a linear matrix equation
solution = np.linalg.solve(matrix, vector)
print_bold("\nLinear Equation Solution:")
pprint.pprint(solution)
Resized Array:
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])
Matrix Inverse:
array([[-2. ,  1. ],
       [ 1.5, -0.5]])
Matrix Determinant:
-2.0000000000000004
Eigenvalues:
array([ 1.61168440e+01, -1.11684397e+00, -4.22209278e-16])
Eigenvectors:
array([[-0.23197069, -0.78583024,  0.40824829],
       [-0.52532209, -0.08675134, -0.81649658],
       [-0.8186735 ,  0.61232756,  0.40824829]])
Singular Value Decomposition (SVD):
U:
array([[-0.40455358, -0.9145143 ],
       [-0.9145143 ,  0.40455358]])
S:
array([5.4649857 , 0.36596619])
VT:
array([[-0.57604844, -0.81741556],
       [ 0.81741556, -0.57604844]])
Linear Equation Solution:
array([0. , 0.5])
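Because results such as the determinant above carry floating-point rounding error, linear algebra outputs are usually verified with a tolerance rather than exact equality. The following sketch checks each result against its defining property; the names A, b, x and the other variables are illustrative.

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([1.0, 2.0])

# solve: the solution x should satisfy A @ x = b.
x = np.linalg.solve(A, b)
solve_ok = np.allclose(A @ x, b)

# inv: A times its inverse should be the identity matrix.
inverse_ok = np.allclose(A @ np.linalg.inv(A), np.eye(2))

# svd: the three factors should multiply back to A.
U, S, VT = np.linalg.svd(A)
svd_ok = np.allclose(U @ np.diag(S) @ VT, A)

# eig: each column v of V with eigenvalue w should satisfy A @ v = w * v.
w, V = np.linalg.eig(A)
eig_ok = np.allclose(A @ V, V * w)

print(solve_ok, inverse_ok, svd_ok, eig_ok)  # True True True True
```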
5.2.6. Random Number Generation
You can generate random values, integers, and sample elements from arrays. Explore the table below for comprehensive descriptions and practical examples of each function [Harris et al., 2020, NumPy Developers, 2023]:
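Function | Description | Example |
---|---|---|
numpy.random.rand | Random values in a given shape, drawn uniformly from [0, 1). | np.random.rand(3, 3) |
numpy.random.randint | Random integers from low (inclusive) to high (exclusive). | np.random.randint(1, 100, size=(2, 2)) |
numpy.random.choice | Randomly sample elements from an array. | np.random.choice(array, size=3, replace=False) |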
You can access an extensive list of functions and their usage related to random sampling by visiting the following link: Numpy Random Sampling Functions.
# Generate random values
random_values = np.random.rand(3, 3)
print_bold("Random Values (between 0 and 1):")
pprint.pprint(random_values)
# Generate random integers
random_integers = np.random.randint(1, 100, size=(2, 2))
print_bold("\nRandom Integers (between 1 and 99):")
pprint.pprint(random_integers)
# Randomly sample elements from an array
array = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
random_samples = np.random.choice(array, size=3, replace=False)
print_bold("\nRandomly Sampled Elements:")
pprint.pprint(random_samples)
Random Values (between 0 and 1):
array([[0.65160832, 0.24176568, 0.1957995 ],
       [0.03128251, 0.76379326, 0.64105564],
       [0.60250776, 0.45127229, 0.532373  ]])
Random Integers (between 1 and 99):
array([[37, 41],
       [22, 93]])
Randomly Sampled Elements:
array([9, 8, 4])
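The values above change on every run. As a sketch of one way to make such results repeatable, the example below uses NumPy's Generator interface (np.random.default_rng) with a fixed seed instead of the np.random functions shown above. The seed value 42 and the variable names are arbitrary choices made for illustration.

```python
import numpy as np

# A seeded generator produces the same sequence on every run.
rng = np.random.default_rng(seed=42)

uniform = rng.random((3, 3))                   # floats in [0, 1)
integers = rng.integers(1, 100, size=(2, 2))   # integers from 1 to 99
sample = rng.choice(np.arange(1, 11), size=3, replace=False)

# A second generator with the same seed repeats the first draw exactly.
rng_again = np.random.default_rng(seed=42)
repeatable = np.array_equal(uniform, rng_again.random((3, 3)))
print(repeatable)  # True
```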