educative.io

Regarding the Reshape Pivot Code

import pandas.util.testing as tm; tm.N = 3
import numpy as np
import pandas as pd

def unpivot(frame):
    N, K = frame.shape
    data = {'value': frame.values.ravel('F'),
            'variable': np.asarray(frame.columns).repeat(N),
            'date': np.tile(np.asarray(frame.index), K)}
    return pd.DataFrame(data, columns=['date', 'variable', 'value'])

df = unpivot(tm.makeTimeDataFrame())
print(df)

In the above code, why do we set tm.N = 3? (I know this is why we get only 3 records in the output, but why specifically N? If we use any other letter, we get January and February data.)

  1. Where can I find information about pandas.util.testing and N = 3?

What concept is used to create the data in the above code?

data = {'value': frame.values.ravel('F'),
        'variable': np.asarray(frame.columns).repeat(N),
        'date': np.tile(np.asarray(frame.index), K)}

What can I read to understand the above code?


Hi @shashidhar

This is Maham Amjad from Educative. I noticed your feedback here, and I’m glad you reached out to us.

The module pandas.util.testing is imported under the name tm. It has an attribute N whose default value is 30, so the statement tm.N = 3 sets it to 3. If you don’t set N, 30 records are generated. Right now, 3 records are shown for each of A, B, C and D. Try removing tm.N = 3 and running the code again: this time, 30 records will be printed for each of A, B, C and D.

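Replacing N with any other letter does nothing, and tm.N stays at its default of 30. You are only creating a new attribute on the module, and the code responsible for building the data frame never reads it, because the only name it looks up is tm.N. With 30 rows instead of 3, the dates extend well past the first few days, which is why the output then covers more than one month.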
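To illustrate why only the name N matters, here is a minimal sketch using a made-up stand-in module. The names fake_tm and make_rows are hypothetical and are not part of pandas; the point is only that a function reading a module-level N ignores any other attribute attached to that module.

```python
import types

# Hypothetical stand-in for a module like pandas.util.testing (not real pandas code).
fake_tm = types.ModuleType("fake_tm")
fake_tm.N = 30  # default row count, mirroring the default described above


def make_rows():
    # The function looks up the attribute named N, and nothing else.
    return list(range(fake_tm.N))


fake_tm.M = 3  # a new, unrelated attribute: no effect on make_rows
rows_after_m = len(make_rows())

fake_tm.N = 3  # the attribute the function actually reads
rows_after_n = len(make_rows())

print(rows_after_m, rows_after_n)  # 30 3
```

Setting M leaves the row count at 30, while setting N changes it to 3.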

For the second part of your question, the best way is to break the code down and print it piece by piece. Here’s what’s happening:

  • Try printing frame to see what the original table looks like. frame.values returns all the values as a 2D array: the first column holds all values of A, the second all values of B, the third all values of C, and the fourth all values of D. Next, we use ravel to flatten the 2D array into a 1D array. The argument 'F' means the elements are read in column-major (Fortran-style) order, so the resulting 1D array lists the values of column A first, then B, then C and then D. Check this link for more detail.

  • Next, we set the variable label for each of the values obtained above. We get the column names using frame.columns and convert them into an array using np.asarray(). We already know the values are in a 1D array with all three values of column A listed first, then B, then C and then D. So we call repeat(N) on the array of names, which gives [A A A B B B C C C D D D].

  • Lastly, we get the index of the original frame using frame.index, which holds the dates, and convert it into an array. We then call np.tile() on this array to repeat the whole array K times, once per column, so that every value is paired with its date.
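The three steps above can be tried in isolation with plain NumPy. This is a small sketch on made-up data: a 3x2 array standing in for frame.values, with placeholder labels "d1" to "d3" standing in for the dates.

```python
import numpy as np

# A tiny 3x2 stand-in for frame.values: rows are dates, columns are A and B.
values = np.array([[1, 10],
                   [2, 20],
                   [3, 30]])
columns = np.array(["A", "B"])
dates = np.array(["d1", "d2", "d3"])  # placeholder labels, not real dates
N, K = values.shape  # N = 3 rows, K = 2 columns

flat = values.ravel("F")    # column-major: all of A, then all of B
labels = columns.repeat(N)  # each column name repeated N times in a row
tiled = np.tile(dates, K)   # the whole date array repeated K times

print(flat.tolist())    # [1, 2, 3, 10, 20, 30]
print(labels.tolist())  # ['A', 'A', 'A', 'B', 'B', 'B']
print(tiled.tolist())   # ['d1', 'd2', 'd3', 'd1', 'd2', 'd3']
```

Note the difference between the two repetition calls: repeat duplicates each element in place, while np.tile repeats the array as a whole. Together they make position i in all three arrays describe the same cell of the original table.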

Finally, we make a new data frame in the long format, with the data we set above.
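If pandas.util.testing is not available in your pandas version, the same function can be tried on a small hand-built frame. This is a sketch under that assumption: the frame named wide and its values are made up for illustration and are not the random data that makeTimeDataFrame produces.

```python
import numpy as np
import pandas as pd


def unpivot(frame):
    N, K = frame.shape
    data = {
        "value": frame.values.ravel("F"),
        "variable": np.asarray(frame.columns).repeat(N),
        "date": np.tile(np.asarray(frame.index), K),
    }
    return pd.DataFrame(data, columns=["date", "variable", "value"])


# Hand-built wide frame: 3 dates as the index, 2 columns (illustrative values).
wide = pd.DataFrame(
    {"A": [1.0, 2.0, 3.0], "B": [10.0, 20.0, 30.0]},
    index=pd.date_range("2000-01-03", periods=3, freq="D"),
)

long = unpivot(wide)
print(wide)
print(long)
```

Printing both frames side by side shows each cell of the wide table becoming one row of the long table, with N * K rows in total.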

Thank you again for reaching out! Please feel free to reach out if you have any other queries or concerns.

Best Regards,
Maham Amjad | Developer Advocate
Educative Inc.
