Pandas json_normalize Produces Confusing `KeyError` Message?
Solution 1:
In this case, I think you'd just use this:
In [57]: json_normalize(data[0]['events'])
Out[57]:
  group  schedule.ID schedule.date schedule.location.building  \
0     A          815    2015-08-27                        BDC
1     A          816    2015-08-27                        BDC

   schedule.location.floor
0                        5
1                        5
The meta paths ([['schedule','date']...]) are for specifying data at the same level of nesting as your records, i.e. at the same level as 'events'. It doesn't look like json_normalize handles dicts with nested lists particularly well, so you may need to do some manual reshaping if your actual data is much more complicated.
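To illustrate the difference between records and meta, here is a minimal sketch. The question's input isn't shown here, so the `data` structure below (including the `name` and `info` fields) is made up, shaped only to reproduce the output above. It assumes pandas 1.0 or later, where the function is available as `pd.json_normalize`.

```python
import pandas as pd

# Hypothetical input: the original data is not shown, so this shape is an assumption.
data = [
    {
        "name": "example",          # made-up top-level field
        "info": {"owner": "ops"},   # made-up nested top-level field
        "events": [
            {"group": "A",
             "schedule": {"ID": 815, "date": "2015-08-27",
                          "location": {"building": "BDC", "floor": 5}}},
            {"group": "A",
             "schedule": {"ID": 816, "date": "2015-08-27",
                          "location": {"building": "BDC", "floor": 5}}},
        ],
    }
]

# Flatten the event records directly; nested dicts become dotted column names.
flat = pd.json_normalize(data[0]["events"])

# Meta paths point at fields that sit next to 'events', not inside it.
with_meta = pd.json_normalize(
    data,
    record_path="events",
    meta=["name", ["info", "owner"]],
)

print(flat)
print(with_meta)
```

Paths like ['schedule', 'date'] live inside each event, so they come out as ordinary columns of the records rather than as meta.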
Solution 2:
I got the KeyError when the structure of the JSON was not consistent. Meaning, when one of the nested structures was missing from the JSON, I got a KeyError.
https://pandas.pydata.org/pandas-docs/stable/generated/pandas.io.json.json_normalize.html
Taking the examples on the pandas documentation site, if you remove the nested tag (counties) from one of the records, you will get a KeyError. To circumvent this, you might have to ignore the missing tag, or consider only the records that have the nested column/tag populated with data.
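A rough sketch of both workarounds, using made-up records loosely modelled on the counties example in the pandas docs (the values are placeholders, not the documentation's data). It assumes `pd.json_normalize` from pandas 1.0 or later with the default `errors='raise'`.

```python
import pandas as pd

# Made-up records; the second one has no 'counties' key.
records = [
    {"state": "Florida",
     "counties": [{"name": "Dade", "population": 12345},
                  {"name": "Broward", "population": 40000}]},
    {"state": "Ohio"},
]

# With an inconsistent structure, the record_path lookup is expected to fail.
try:
    pd.json_normalize(records, record_path="counties", meta=["state"])
    raised = False
except KeyError:
    raised = True

# Option 1: consider only the records that actually have the nested key.
complete = [r for r in records if "counties" in r]
filtered = pd.json_normalize(complete, record_path="counties", meta=["state"])

# Option 2: treat the missing key as an empty list so every record has the path.
padded = [{**r, "counties": r.get("counties", [])} for r in records]
filled = pd.json_normalize(padded, record_path="counties", meta=["state"])

print(raised)
print(filtered)
print(filled)
```

Either way, a record without counties contributes no rows to the result, so if you need to keep "Ohio" you would have to add it back yourself.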
Solution 3:
I had this same problem! This thread helped, especially parachute py's answer.
I found a solution using:
df.dropna(subset=*column(s) with nested data*)
then saving the resultant df as a new JSON file. Load the new JSON and now you'll be able to flatten the nested columns.
There's probably a more efficient way to get around this, but my solution works.
edit: forgot to mention, I tried using the errors='ignore' arg in json_normalize() and it didn't help.
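For what it's worth, here is a sketch of a variation on this approach that skips the save-and-reload step by passing the cleaned rows straight back to `pd.json_normalize`. That shortcut is my assumption, not what was described above, and the records and column names are made up. The last part shows that, per the pandas docs, `errors='ignore'` only covers keys listed in `meta`, which would explain why it didn't help with a missing nested column.

```python
import pandas as pd

# Made-up records; 'counties' is missing for Ohio, so it becomes NaN in the frame.
records = [
    {"state": "Florida",
     "counties": [{"name": "Dade"}, {"name": "Broward"}]},
    {"state": "Ohio"},
]
df = pd.DataFrame(records)

# Drop the rows where the nested column is empty.
cleaned = df.dropna(subset=["counties"])

# Variation: convert back to plain dicts instead of writing a new JSON file.
flat = pd.json_normalize(
    cleaned.to_dict(orient="records"),
    record_path="counties",
    meta=["state"],
)

# errors='ignore' tolerates a missing *meta* key (here 'state') and fills NaN,
# but it is not documented to cover a missing record_path.
no_state = [{"counties": [{"name": "Dade"}]}]
tolerant = pd.json_normalize(
    no_state, record_path="counties", meta=["state"], errors="ignore"
)

print(flat)
print(tolerant)
```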