Pandas Read_table With Duplicate Names
When reading a table while specifying duplicate column names (say, two distinct names, each used more than once), pandas 0.16.1 copies the last two columns of the data over and over again.
Solution 1:
Using duplicate values in indexes is inherently problematic.
They lead to ambiguity. Code that you think works fine can suddenly fail on DataFrames with non-unique indexes. argmax, for instance, can lead to a similar pitfall when DataFrames have duplicates in the index.
It's best to avoid putting duplicate values in (row or column) indexes if you can. If you need to use a non-unique index, use it with care. Double-check the effect duplicate values have on the behavior of your code.
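One way to do that double-checking is to ask pandas whether the labels are unique before relying on label-based selection. The following is a minimal sketch; the frame and its values are made up for illustration and are not taken from the question.

```python
import pandas as pd

# Hypothetical frame with a duplicated column label, for illustration only.
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=['one', 'two', 'one'])

# True only when every column label occurs exactly once.
columns_unique = df.columns.is_unique

# Labels that occur more than once.
duplicated_labels = df.columns[df.columns.duplicated()].unique().tolist()

# Selecting a duplicated label returns a DataFrame holding every matching
# column, not a single Series, which is where code often goes wrong.
selected = df['one']

print(columns_unique, duplicated_labels, type(selected).__name__, selected.shape)
```

The same checks apply to the row index through df.index.is_unique and df.index.duplicated().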
In this case, you could instead read the data without column names and assign the labels afterwards:
import pandas as pd
df = pd.read_csv('data', header=None)
df.columns = ['one', 'two', 'one', 'two', 'one']
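To see what this workaround produces, here is a small self-contained sketch. It assumes a comma-separated input with five columns; the in-memory sample below is invented and merely stands in for the 'data' file.

```python
import io

import pandas as pd

# Made-up sample standing in for the 'data' file (an assumption for this sketch).
raw = io.StringIO("1,2,3,4,5\n6,7,8,9,10\n")

# Read without names so pandas assigns default integer labels 0..4.
df = pd.read_csv(raw, header=None)

# Assign the duplicate labels afterwards; the underlying data is untouched.
df.columns = ['one', 'two', 'one', 'two', 'one']

# Every column labelled 'one' comes back, in file order.
ones = df['one']

print(df)
print(ones)
```

Each column keeps its own values, and df['one'] returns all three columns carrying that label rather than repeated copies of one column.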