
Pandas Dataframe To Code

If I have an existing pandas DataFrame, is there a way to generate Python code that, when executed in another Python script, will reproduce that DataFrame? e.g. In[1]: df Ou

Solution 1:

You could try to use the to_dict() method on DataFrame:

print("df = pd.DataFrame( %s )" % (str(df.to_dict())))

If your data contains NaNs, you'll have to replace them with float('nan'):

print("df = pd.DataFrame( %s )" % (str(df.to_dict()).replace(" nan", " float('nan')")))
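To check that the generated line really rebuilds the frame, it can be executed and compared with the original. The sketch below is only an illustration: the helper name dict_to_code and the sample data are hypothetical and not part of the answer above.

```python
import pandas as pd


def dict_to_code(df, name="df"):
    """Hypothetical helper: build a 'df = pd.DataFrame({...})' line from df.to_dict()."""
    # A bare nan is not a valid Python name, so swap it for float('nan').
    body = str(df.to_dict()).replace(" nan", " float('nan')")
    return f"{name} = pd.DataFrame({body})"


# Sample data (made up for this example), including one missing value.
original = pd.DataFrame({"a": [1.0, float("nan")], "b": ["x", "y"]})
code = dict_to_code(original)
print(code)

# Run the generated code in a separate namespace that only knows about pd.
namespace = {"pd": pd}
exec(code, namespace)
rebuilt = namespace["df"]
print(rebuilt)
```

Note that the text replacement would also touch string cells that happen to contain " nan", so this is best suited to simple numeric and string data.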

Solution 2:

Here's another approach that does not use dicts:

import numpy as np

def dataframe_to_code(df):
    data = np.array2string(df.to_numpy(), separator=', ')
    data = data.replace(" nan", " float('nan')")
    cols = df.columns.tolist()
    return f"""df = pd.DataFrame({data}, columns={cols})"""

The data.replace(" nan", " float('nan')") is optional and was inspired by madokis excellent answer.

Note that np.array2string only works for numpy versions 1.11 and higher.

I recommend using Black (https://github.com/psf/black) to format the output.
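One thing to watch with this approach: by default np.array2string abbreviates arrays of more than 1000 elements with "...", and it prints floats at the current print precision, so the generated code may not reproduce a large or high-precision frame exactly. The following sketch is a variant, not the code from the answer, that passes threshold=sys.maxsize to switch off the abbreviation; the function name dataframe_to_code_full and the test frame are assumptions made for illustration.

```python
import sys

import numpy as np
import pandas as pd


def dataframe_to_code_full(df):
    """Hypothetical variant of dataframe_to_code that never abbreviates the data."""
    # threshold=sys.maxsize keeps array2string from summarizing with "...".
    data = np.array2string(df.to_numpy(), separator=", ", threshold=sys.maxsize)
    data = data.replace(" nan", " float('nan')")
    cols = df.columns.tolist()
    return f"df = pd.DataFrame({data}, columns={cols})"


# 2000 elements, which is above the default summarization threshold of 1000.
big = pd.DataFrame(np.arange(2000).reshape(1000, 2), columns=["a", "b"])
code = dataframe_to_code_full(big)

# Execute the generated code and pick up the rebuilt frame.
namespace = {"pd": pd}
exec(code, namespace)
rebuilt = namespace["df"]
print(rebuilt.shape)
```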

Solution 3:

I have always used this code, which has helped me a lot:

import pickle

def gen_code(df):
    # embed the pickled bytes literal in a pickle.loads(...) expression
    return 'pickle.loads({})'.format(pickle.dumps(df))

code_string = gen_code(df)
code_string

You can now copy the displayed value of code_string and paste it into a string variable A, as follows:

import pickle
A = 'Paste your code_string here'
df = eval(A)

This has helped me copy and paste DataFrames across such platforms.
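The pickled bytes literal can contain quotes and backslashes, which makes it awkward to paste into another string. A possible variant, not part of the answer above, is to base64-encode the pickle so the pasted text contains only plain ASCII characters. The function name gen_code_b64 and the sample frame are hypothetical.

```python
import base64
import pickle

import pandas as pd


def gen_code_b64(df):
    """Hypothetical variant of gen_code: embed the pickle as base64 text."""
    payload = base64.b64encode(pickle.dumps(df)).decode("ascii")
    # The payload uses only letters, digits, '+', '/' and '=', so it is safe to quote.
    return f"pickle.loads(base64.b64decode('{payload}'))"


df = pd.DataFrame({"user": ["Bob", "Jane"], "income": [40000, 50000]})
code_string = gen_code_b64(df)

# In the other script, base64 and pickle must be imported before evaluating.
restored = eval(code_string)
print(restored)
```

As with any pickle, the resulting string should only be evaluated when it comes from a source you trust.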

Solution 4:

You can first save the dataframe you have, and then load it in another Python script when necessary. You can do this with two standard-library modules: pickle and shelve.

To do it with pickle:

import pandas as pd
import pickle
df = pd.DataFrame({'user': ['Bob', 'Jane', 'Alice'], 
                   'income': [40000, 50000, 42000]})
with open('dataframe', 'wb') as pfile:
    pickle.dump(df, pfile)           # save df in a file named "dataframe"

To read the dataframe in another file:

import pickle
with open('dataframe', 'rb') as pfile:
    df2 = pickle.load(pfile)        # read the dataframe stored in file "dataframe"
print(df2)

Output:

    user  income
0    Bob   40000
1   Jane   50000
2  Alice   42000
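pandas also ships its own wrappers around pickle, DataFrame.to_pickle and pd.read_pickle, which save the explicit open() calls. A minimal sketch, assuming a temporary directory and the file name "dataframe.pkl" purely for illustration:

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({'user': ['Bob', 'Jane', 'Alice'],
                   'income': [40000, 50000, 42000]})

# Write to a throwaway directory so the example leaves the working directory untouched.
path = os.path.join(tempfile.mkdtemp(), 'dataframe.pkl')
df.to_pickle(path)            # save df to the pickle file

df2 = pd.read_pickle(path)    # load it back, e.g. from another script
print(df2)
```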

To do it with shelve:

import pandas as pd
import shelve
df = pd.DataFrame({'user': ['Bob', 'Jane', 'Alice'], 
                   'income': [40000, 50000, 42000]})
with shelve.open('dataframe2') as shelf:
    shelf['df'] = df               # store the dataframe in file "dataframe2"

To read the dataframe in another file:

import shelve
with shelve.open('dataframe2') as shelf:
    print(shelf['df'])             # read the dataframe 

Output:

    user  income
0    Bob   40000
1   Jane   50000
2  Alice   42000
