
How To Fill The Missing Value When Constructing A Dataframe?

I'm using pandas to store a large but very sparse matrix (50,000 rows * 100,000 columns). Each element of this matrix is a float from 0.00 to 1.00. The original element values

Solution 1:

This may not help you directly, but it is expected behavior. To quote the pandas "Caveats and Gotchas" documentation:

When introducing NAs into an existing Series or DataFrame via reindex or some other means, boolean and integer types will be promoted to a different dtype in order to store the NAs.
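The promotion described in the quote can be reproduced in a few lines. This is only a sketch with a made-up integer Series (the names `s` and `r` are illustrative): reindexing to a label that does not exist introduces a NaN, so the integer dtype is promoted to a float dtype.

```python
import numpy as np
import pandas as pd

# Made-up integer Series with three labels
s = pd.Series([1, 2, 3], index=['a', 'b', 'c'], dtype=np.int64)
print(s.dtype)  # int64

# Reindexing adds label 'd', which has no value, so a NaN is introduced
r = s.reindex(['a', 'b', 'c', 'd'])
print(r.dtype)  # float64
```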

The comment by @EdChum provides the optimal solutions, but if you really have to work with dicts, you can try something like this:

from itertools import chain
import numpy as np
import pandas as pd

# Choose some default value
default = 0

# Prepare dict with defaults (dc is the dict of dicts from the question)
defaults = {k: default for k in chain(*(x.keys() for x in dc.values()))}

# Fill gaps if needed and construct data frame
df = pd.DataFrame(
    {k: {**defaults, **v} for k, v in dc.items()},  # Python 3 dict merge
    index=['a', 'b', 'c', 'd'], dtype=np.uint8)
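To see the approach end to end, here is a minimal self-contained sketch. The contents of `dc` are hypothetical (the question's actual data is not shown), and the second variant using `fillna` followed by `astype` is an alternative added for comparison, not something taken from the answer. It assumes the outer keys are columns and the inner keys are row labels.

```python
from itertools import chain

import numpy as np
import pandas as pd

# Hypothetical sparse input: outer keys -> columns, inner keys -> row labels
dc = {'x': {'a': 1, 'c': 3}, 'y': {'b': 2}, 'z': {'d': 4}}
index = ['a', 'b', 'c', 'd']

# Variant 1: pre-fill every inner dict with a default so no NaN is ever created
default = 0
defaults = {k: default for k in chain.from_iterable(dc.values())}
df = pd.DataFrame(
    {col: {**defaults, **rows} for col, rows in dc.items()},
    index=index, dtype=np.uint8)

# Variant 2 (assumed alternative): let pandas insert NaN, then fill and cast.
# This goes through an intermediate float64 frame.
alt = pd.DataFrame(dc, index=index).fillna(default).astype(np.uint8)

print(df.dtypes)
print(df)
```

Both variants should end with `uint8` columns. The first avoids the temporary float frame, while the second is shorter to write when the extra memory for the intermediate result is acceptable.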
