Managing data in Numpy- Part 2

This post is the second part of the series on data management in NumPy. We continue with a different set of functions, covering their syntax, code snippets and descriptions.
Hello Readers,

This article is part 2 of the 2-part series on data management functions in NumPy.
If you have not read the first part, you can head directly to this link and check it out: Managing data in Numpy- Part 1

The following are the NumPy functions and array methods which we will study:

5. concatenate()
6. vstack()
7. hstack()
8. flatten()
9. ravel()

5. concatenate()
np.concatenate() joins a sequence of NumPy arrays along an existing axis. The arrays are passed together as a tuple (or list) in the first argument. The axis parameter specifies the axis along which the arrays are joined; it defaults to 0, and for 2-D arrays 0 joins row-wise and 1 joins column-wise. The out parameter specifies a destination array in which to place the result.

Syntax: np.concatenate((a1, a2, ...), axis=0, out=None)
(a1, a2, ...): sequence of arrays to be joined; they must have the same shape except along the axis being joined
axis: specifies the axis along which the arrays should be concatenated (default 0)
out: specifies the destination array into which the result should be written

Following is the code snippet for np.concatenate():
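A minimal sketch, assuming two small 2x2 arrays named `a` and `b` (illustrative names and values, not taken from the original post). It shows joining along each axis and writing the result into a preallocated `out` array:

```python
import numpy as np

# Two illustrative 2x2 arrays (assumed example data)
a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

# axis=0 joins row-wise: the result has shape (4, 2)
rows = np.concatenate((a, b), axis=0)
print(rows)

# axis=1 joins column-wise: the result has shape (2, 4)
cols = np.concatenate((a, b), axis=1)
print(cols)

# out must already have the shape of the result
dest = np.empty((4, 2), dtype=a.dtype)
np.concatenate((a, b), axis=0, out=dest)
print(dest)
```

With `axis=0` the rows of `b` are appended below `a`; with `axis=1` the columns of `b` are appended to the right of `a`.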


6. vstack()
Like concatenate(), vstack() is used for joining arrays, but only vertically (row-wise). It takes a single positional parameter: the tuple of arrays to be joined.

Syntax: np.vstack(tup)
tup: sequence of ndarrays

Following is the code snippet for np.vstack():  
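A minimal sketch, assuming two 1-D arrays `a` and `b` of equal length (illustrative data, not from the original post). Each 1-D array is treated as one row of the result:

```python
import numpy as np

# Illustrative 1-D arrays (assumed example data)
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# Each 1-D array of length 3 becomes one row: shape (2, 3)
stacked = np.vstack((a, b))
print(stacked)

# A 2-D array can be extended with another row in the same way
grid = np.vstack((stacked, np.array([7, 8, 9])))
print(grid.shape)  # (3, 3)
```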



7. hstack()
Similar to vstack(), we have hstack(), which is used for stacking arrays horizontally (column-wise). It takes a single positional parameter: the tuple of arrays to be joined.

Syntax: np.hstack(tup)
tup: sequence of ndarrays

Following is the code snippet for np.hstack():
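A minimal sketch, assuming small illustrative arrays (not from the original post). It shows that 1-D inputs are joined end to end, while 2-D inputs are joined side by side:

```python
import numpy as np

# Illustrative 1-D arrays (assumed example data)
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# 1-D arrays are joined end to end: shape (6,)
flat = np.hstack((a, b))
print(flat)

# 2-D column arrays are placed side by side: shape (2, 2)
c = np.array([[1], [2]])
d = np.array([[3], [4]])
side_by_side = np.hstack((c, d))
print(side_by_side)
```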



8. flatten()
ndarray.flatten() returns a copy of the array collapsed into one dimension. So if the shape of the array is 3x3, the flattened array is one-dimensional with 9 elements, i.e. its shape is (9,).

Syntax: ndarray.flatten(order='C') 
order: {‘C’, ‘F’, ‘A’, ‘K’}, optional
‘C’ means to flatten in row-major (C-style) order.
‘F’ means to flatten in column-major (Fortran-style) order.
‘A’ means to flatten in column-major order if a is Fortran contiguous in memory, row-major order otherwise.
‘K’ means to flatten a in the order the elements occur in memory.
The default is ‘C’.
 
Following is the code snippet for ndarray.flatten():
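A minimal sketch, assuming an illustrative 3x3 array named `m` (not from the original post). It compares the default row-major order with column-major order:

```python
import numpy as np

# Illustrative 3x3 array (assumed example data)
m = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

# Default order='C' reads the array row by row
row_major = m.flatten()
print(row_major)        # [1 2 3 4 5 6 7 8 9]
print(row_major.shape)  # (9,)

# order='F' reads the array column by column
col_major = m.flatten(order='F')
print(col_major)        # [1 4 7 2 5 8 3 6 9]
```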


9. ravel()
np.ravel() returns a flattened 1-D array of the same type as the array passed to it. ravel() is similar to flatten(), but there is a subtle difference between them, which is covered later in this post.
 
Syntax: np.ravel(a,order='C')
a: Input array
order: same as the order parameter in ndarray.flatten()
 
Following is the code snippet for np.ravel():
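A minimal sketch, assuming an illustrative 2x3 array named `m` (not from the original post). It shows both the function form and the `order` parameter:

```python
import numpy as np

# Illustrative 2x3 array (assumed example data)
m = np.array([[1, 2, 3],
              [4, 5, 6]])

# Default order='C' reads the array row by row
r = np.ravel(m)
print(r)        # [1 2 3 4 5 6]
print(r.shape)  # (6,)

# order='F' reads the array column by column
r_f = np.ravel(m, order='F')
print(r_f)      # [1 4 2 5 3 6]
```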

 
Before closing the post, I want to draw your attention to something important.
We have seen how to use flatten and ravel, and both functions give the same output.

Why do we need 2 different function calls to get the same output?

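In short, flatten() always returns a copy of the data, while ravel() returns a view of the original array whenever possible. The difference is explained succinctly in the following post: ndarray.flatten vs np.ravel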
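A minimal sketch of how the two calls differ in practice, assuming a small C-contiguous array named `m` (illustrative data, not from the original post). For such an array, ravel() shares memory with the original, while flatten() does not:

```python
import numpy as np

# Illustrative C-contiguous 2x3 array (assumed example data)
m = np.array([[0, 1, 2],
              [3, 4, 5]])

f = m.flatten()  # always an independent copy
r = m.ravel()    # a view here, because m is contiguous

print(np.shares_memory(m, f))  # False
print(np.shares_memory(m, r))  # True

# Writing to the copy leaves the original untouched
f[0] = 100
print(m[0, 0])  # 0

# Writing to the view changes the original
r[0] = 99
print(m[0, 0])  # 99
```

When the input is not contiguous in the requested order, ravel() has to fall back to making a copy, so code should not rely on always getting a view.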

That's all for this post!
Thank you for reading.
If you have any suggestions regarding the post contents, or if you need more details on any other topic, please post them in the comments section.
Your suggestions are valuable, so they should not be missed.



References:
1. https://docs.scipy.org/doc/numpy/reference/generated/numpy.concatenate.html
2. https://docs.scipy.org/doc/numpy/reference/generated/numpy.vstack.html
3. https://docs.scipy.org/doc/numpy/reference/generated/numpy.hstack.html
4. https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.flatten.html
5. https://docs.scipy.org/doc/numpy/reference/generated/numpy.ravel.html
6. https://stackoverflow.com/questions/28930465/what-is-the-difference-between-flatten-and-ravel-functions-in-numpy
