In this NumPy Normalization tutorial, we are going to learn how to normalize an array using the NumPy library of Python. But before we hop on to that, let us first try to understand the definition and meaning of NumPy and Normalization.
- Normalization
- NumPy
- NumPy Functions
- Normalization of One Dimensional (1D) array
- Normalization of Two Dimensional (2D) array
Normalization
Generally, normalization is a process that rescales the values of a numeric attribute onto a common scale, typically the range from 0 to 1. Normalization puts values measured on different scales on a comparable footing across all features and records. Its advantages include reduced complexity, greater clarity, and higher quality data.
Data normalization is widely used in Machine Learning, where it makes model training less sensitive to the scale of the features. When training a model, we scale the data so that all the numeric values fall in the same range and the large values do not overwhelm the smaller ones. This helps the model converge to better weights, which in turn tends to produce a more accurate model.
The next question is how to perform data normalization. One way is to use Python with the NumPy library, which provides the linalg.norm() function. This function takes an array as input and returns its norm, a single number that measures the magnitude of the array. Dividing the array by that norm gives the normalized array; for an array of non-negative values, the results lie in the range 0 to 1. We will look at this in detail shortly. But before that, let us understand the meaning and applications of NumPy.
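The rescaling into the range 0 to 1 described above is commonly done with min-max scaling, which is a different technique from dividing by a norm. The following is a minimal sketch of it, not a method taken from this tutorial; the helper name min_max_scale and the handling of constant arrays are assumptions made for illustration.

```python
import numpy as np

def min_max_scale(values):
    """Rescale values linearly so the smallest becomes 0 and the largest becomes 1."""
    values = np.asarray(values, dtype=float)
    lowest, highest = values.min(), values.max()
    if highest == lowest:
        # Assumption: a constant array is mapped to all zeros to avoid dividing by zero
        return np.zeros_like(values)
    return (values - lowest) / (highest - lowest)

scaled = min_max_scale([10, 30, 50, 70, 90])
print(scaled)  # the values 0, 0.25, 0.5, 0.75 and 1
```

Unlike division by a norm, min-max scaling always maps the smallest value to exactly 0 and the largest to exactly 1.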
NumPy
NumPy, as the name suggests, stands for Numerical Python. NumPy is a third-party Python library (it is not part of the standard library) that is used for working with arrays. Since Python can already store a sequence of values in a list, why do we need NumPy? The reason is that NumPy provides a faster way to work with arrays than traditional lists.
To use NumPy in your system, you need to install the NumPy library using pip. Below is the command which is used to install NumPy in a system –
pip install numpy
After installing, we need to import this library into our program in order to use its functions. Below is the syntax for importing the NumPy library in Python –
import numpy
Now let us see an example of how to create a one-dimensional array using the NumPy library –
import numpy as np # importing numpy library
my_array = np.array([10, 30, 50, 70, 90]) # defining the input array
print("This is my array -", my_array) # printing the array
The output of the above program will be as follows –
This is my array - [10 30 50 70 90]
Let us see an example of how to create a two-dimensional array using the NumPy library –
import numpy as np # importing numpy library as np
two_d_array = np.array([[10, 30, 50, 70, 90], [20, 40, 60, 80, 100]]) # defining the 2D array
print("This is a two dimensional array -", two_d_array) # printing the array
The output of the above program will be as follows –
This is a two dimensional array - [[ 10  30  50  70  90]
 [ 20  40  60  80 100]]
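One practical difference between such arrays and plain lists is that arithmetic on an array applies to every element at once. The short sketch below illustrates this; the variable names are chosen only for the example.

```python
import numpy as np

values = [10, 30, 50, 70, 90]

# With a plain list, element-wise arithmetic needs an explicit loop
doubled_list = [value * 2 for value in values]

# With a NumPy array, the same operation is applied to every element at once
doubled_array = np.array(values) * 2

print(doubled_list)
print(doubled_array)
```

Both results hold the same numbers; the array version avoids the Python-level loop, which is where the speed advantage on large inputs comes from.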
NumPy Functions
The NumPy library contains various functions that make it easy to work with matrices, linear algebra, polynomials, and Fourier transforms. A few of them are listed below:
Add – numpy.add() function is used for performing the addition of two arrays.
Subtract – numpy.subtract() function is used for performing the subtraction of two arrays.
Multiply – numpy.multiply() function is used for performing the multiplication of two arrays.
Divide – numpy.divide() function is used for performing the division of two arrays.
Min – numpy.min() function is used to find the minimum value of an array.
Max – numpy.max() function is used to find the maximum value of an array.
Mean – numpy.mean() function is used to calculate the mean of an array.
Var – numpy.var() function is used to calculate the variance of an array.
Std – numpy.std() function is used to calculate the standard deviation of an array.
Dot – numpy.dot() function is used to find the dot product of two arrays.
Cross – numpy.cross() function is used to find the cross product of two arrays.
Inner – numpy.inner() function is used to perform the inner product of two arrays.
Outer – numpy.outer() function is used to perform the outer product of two arrays.
Transpose – numpy.transpose() function is used to generate the transposition of an array.
Concatenate – numpy.concatenate() function is used to concatenate two or more arrays.
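A few of the functions listed above can be tried on two small arrays. This is only an illustrative sketch; the arrays a and b are made up for the example.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

total = np.add(a, b)              # element-wise sum: 5, 7, 9
product = np.multiply(a, b)       # element-wise product: 4, 10, 18
average = np.mean(a)              # (1 + 2 + 3) / 3 = 2.0
dot = np.dot(a, b)                # 1*4 + 2*5 + 3*6 = 32
cross = np.cross(a, b)            # vector cross product: -3, 6, -3
joined = np.concatenate([a, b])   # 1, 2, 3, 4, 5, 6

print(total, product, average, dot, cross, joined)
```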
Similar to the above functions, the NumPy library also contains various functions for performing linear algebra calculations. These functions can be found in the linalg submodule, whose name stands for Linear Algebra. Let us see a few of the functions of the linalg submodule, which are mentioned below –
Det – numpy.linalg.det() function is used to compute the determinant of an array(matrix).
Inv – numpy.linalg.inv() function is used to compute the inverse of an array(matrix).
Eig – numpy.linalg.eig() function is used to compute the eigenvalues and the eigenvectors of a square array(matrix).
Norm – numpy.linalg.norm() function is used to find the norm of an array(matrix). This is the function which we are going to use to perform numpy normalization. This function takes an array or matrix as an argument and returns the norm of that array.
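The four linalg functions above can be illustrated on a small diagonal matrix, where the results are easy to check by hand. The matrix is an arbitrary choice for this sketch.

```python
import numpy as np

matrix = np.array([[4.0, 0.0],
                   [0.0, 3.0]])

determinant = np.linalg.det(matrix)                 # 4 * 3 = 12
inverse = np.linalg.inv(matrix)                     # diagonal entries 1/4 and 1/3
eigenvalues, eigenvectors = np.linalg.eig(matrix)   # eigenvalues are 4 and 3
frobenius_norm = np.linalg.norm(matrix)             # sqrt(4**2 + 3**2) = 5

print(determinant, frobenius_norm)
print(inverse)
print(eigenvalues)
```

Note that the determinant (12) and the norm (5) are different quantities, even though both are sometimes written with vertical bars.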
Now that we know which function to use to normalize an array, let us try to understand the theoretical concept behind the normalization of an array. Afterward, we will see how to write a complete normalization program for a one-dimensional array and for a two-dimensional array.
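The norm which we will be using in our code is called the Euclidean norm; when applied to a matrix it is known as the Frobenius norm. It is the square root of the sum of the squares of all the elements. The normalized array is obtained by dividing the input by this norm. The mathematical formula for normalizing a matrix is shown below –
$$\hat{v} = \frac{v}{|v|}$$
Where,
v cap ($\hat{v}$) – represents the normalized array or matrix.
v – represents the input array or matrix.
|v| – represents the Euclidean (Frobenius) norm of the input. It is not the determinant of the matrix.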
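As a small worked check of the formula, take the array [3, 4], whose Euclidean norm is sqrt(3² + 4²) = 5. The values are chosen only because the arithmetic comes out cleanly.

```python
import numpy as np

v = np.array([3.0, 4.0])
norm = np.linalg.norm(v)           # sqrt(3**2 + 4**2) = 5.0
v_hat = v / norm                   # 3/5 = 0.6 and 4/5 = 0.8
print(v_hat)
print(np.linalg.norm(v_hat))       # the normalized array has norm 1, up to rounding
```

Dividing by the norm therefore produces an array of length 1 that points in the same direction as the input.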
Now we have the idea and understanding of all the relevant terms and functions which are going to be used in our program of NumPy normalization of an array using Python. So let us see the implementation of the same by looking at the examples below –
1. Normalization of One Dimensional (1D) array –
a.) Normalization of a predefined 1D array –
import numpy as np # importing numpy library as np
pre_one_array = np.array([10, 20, 30, 40, 50]) # defining a 1D array
print(pre_one_array) # printing the array
norm = np.linalg.norm(pre_one_array) # To find the norm of the array
print(norm) # Printing the value of the norm
normalized_array = pre_one_array/norm # Formula used to perform array normalization
print(normalized_array) # printing the normalized array
The output of the above program will be as follows –
[10 20 30 40 50]
74.161984871
[ 0.13483997 0.26967994 0.40451992 0.53935989 0.67419986]
Here, as we can see, all the values of the output array lie between 0 and 1, and the output array as a whole has a Euclidean norm of 1. Hence, it is clear that the predefined input 1D array has been normalized successfully.
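One case the program above does not cover is an array whose values are all zero: its norm is 0, so the division is undefined. The sketch below wraps the same steps in a helper that guards against this. The function name normalize and the choice to return the zeros unchanged are assumptions made for illustration, not part of NumPy.

```python
import numpy as np

def normalize(array):
    """Divide an array by its Euclidean (Frobenius) norm."""
    array = np.asarray(array, dtype=float)
    norm = np.linalg.norm(array)
    if norm == 0:
        # Assumption: an all-zero input is returned unchanged instead of dividing by zero
        return array
    return array / norm

print(normalize([10, 20, 30, 40, 50]))
print(normalize([0, 0, 0]))
```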
b.) Normalization of a random 1D array –
If we want to normalize a 1D array that has random values then the below method will be used for the same –
import numpy as np # importing numpy library as np
# Defining a random array of 5 elements using the rand function of numpy's random submodule.
# rand returns values between 0 and 1, so multiplying by 10 gives values between 0 and 10.
ran_one_array = np.random.rand(5)*10
print(ran_one_array) # printing the array
norm = np.linalg.norm(ran_one_array) # To find the norm of the array
print(norm) # Printing the value of the norm
normalized_array = ran_one_array/norm # Formula used to perform array normalization
print(normalized_array) # printing the normalized array
The output of the above program will be as follows (the exact numbers differ on every run because the input is random) –
[ 2.66782852 6.70146289 5.38289872 0.52054369 9.62171167]
13.1852498544
[ 0.20233432 0.50825452 0.40825155 0.03947924 0.72973298]
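Here, as we can see, all the values of the output array lie between 0 and 1. This holds because the input values are non-negative; if the input contained negative values, the output would lie between -1 and 1. Hence, it is clear that the random input 1D array has been normalized successfully.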
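Because the input is random, the program above prints different numbers on every run. If repeatable output is wanted, one option is NumPy's seeded generator, np.random.default_rng. This is a sketch of that alternative rather than the method used above, and the seed value 42 is an arbitrary assumption.

```python
import numpy as np

# Assumption: the seed 42 is arbitrary; any fixed integer gives repeatable output
rng = np.random.default_rng(42)

ran_one_array = rng.random(5) * 10   # 5 values, each at least 0 and below 10
normalized_array = ran_one_array / np.linalg.norm(ran_one_array)

print(ran_one_array)
print(normalized_array)
```

Running the script again with the same seed reproduces the same input array and therefore the same normalized array.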
2. Normalization of Two Dimensional (2D) array –
a.) Normalization of a predefined 2D array –
import numpy as np # importing numpy library as np
pre_two_array = np.array([[10, 30, 50, 70, 90], [20, 40, 60, 80, 100], [5, 15, 25, 35, 45], [55, 65, 75, 85, 95], [11, 22, 33, 44, 55]]) # defining a 2D array having 5 rows and 5 columns
print(pre_two_array) # printing the array
norm = np.linalg.norm(pre_two_array) # To find the norm of the array
print(norm) # Printing the value of the norm
normalized_array = pre_two_array/norm # Formula used to perform array normalization
print(normalized_array) # printing the normalized array
The output of the above program will be as follows –
[[ 10 30 50 70 90]
[ 20 40 60 80 100]
[ 5 15 25 35 45]
[ 55 65 75 85 95]
[ 11 22 33 44 55]]
280.008928429
[[ 0.03571315 0.10713944 0.17856573 0.24999203 0.32141832]
[ 0.07142629 0.14285259 0.21427888 0.28570518 0.35713147]
[ 0.01785657 0.05356972 0.08928287 0.12499601 0.16070916]
[ 0.19642231 0.23213545 0.2678486 0.30356175 0.3392749 ]
[ 0.03928446 0.07856892 0.11785338 0.15713785 0.19642231]]
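Here, as we can see, all the values of the output array lie between 0 and 1. Note that np.linalg.norm() computed a single norm over the whole matrix, so every element was divided by the same number. Hence, it is clear that the predefined input 2D array has been normalized successfully.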
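When each row of a 2D array should be normalized separately, for example when every row is one sample, the axis and keepdims arguments of np.linalg.norm() can be used. The following sketch shows this variation on a small made-up matrix; it is an alternative to the whole-matrix normalization shown above, not a replacement for it.

```python
import numpy as np

matrix = np.array([[3.0, 4.0],
                   [6.0, 8.0]])

# axis=1 computes one norm per row; keepdims=True keeps the shape (2, 1)
# so that the division below broadcasts across each row
row_norms = np.linalg.norm(matrix, axis=1, keepdims=True)   # 5 and 10
row_normalized = matrix / row_norms                          # each row becomes 0.6, 0.8

print(row_normalized)
```

After this step every row has a Euclidean norm of 1 on its own, whereas the whole-matrix approach gives the matrix as a whole a Frobenius norm of 1.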
b.) Normalization of a random 2D array –
If we want to normalize a 2D array that has random values, then the below method will be used for the same –
import numpy as np # importing numpy library as np
# Defining a random array of 5 rows and 5 columns using the rand function of numpy's random submodule.
# rand returns values between 0 and 1, so multiplying by 10 gives values between 0 and 10.
ran_two_array = np.random.rand(5, 5)*10
print(ran_two_array) # printing the array
norm = np.linalg.norm(ran_two_array) # To find the norm of the array
print(norm) # Printing the value of the norm
normalized_array = ran_two_array/norm # Formula used to perform array normalization
print(normalized_array) # printing the normalized array
The output of the above program will be as follows (the exact numbers differ on every run because the input is random) –
[[ 4.57411295 8.65220668 9.63324979 1.9971668 3.23869927]
[ 0.84966168 5.90483284 0.47779068 3.28578339 2.45708816]
[ 5.85465399 4.49030481 9.12849734 9.05088372 2.16890579]
[ 1.24442784 3.31225636 5.72207596 3.9220778 1.45400695]
[ 5.49354678 3.63828521 3.66439748 3.75588512 4.4547876 ]]
25.1725603225
[[ 0.18171028 0.3437158 0.38268852 0.07933904 0.12865991]
[ 0.03375349 0.23457419 0.01898062 0.13053036 0.09760978]
[ 0.23258079 0.17838093 0.36263682 0.35955356 0.08616151]
[ 0.04943589 0.13158202 0.22731402 0.15580766 0.05776158]
[ 0.21823552 0.14453378 0.14557111 0.14920553 0.17696998]]
Here, as we can see, all the values of the output array lie between 0 and 1. Hence, it is clear that the random input 2D array has been normalized successfully.
With this, we have come to the end of this NumPy Normalization tutorial. We hope that you now understand the concept of NumPy Normalization. In this NumPy Normalization tutorial, we have covered the definition of normalization, its advantages, and its applications. We have also seen the definition and the usage of the NumPy library and its various other functions. Then we learned the theoretical concept and formula behind the normalization process. And last but not least, we implemented the normalization on a one-dimensional array as well as a two-dimensional array using the NumPy library of Python while verifying the respective outputs.