Why Do I Only Get The Last Output In My Output File?
I tried to select particular columns, based on a list of column names, using pandas in Python 2.7. For example:
>>> df = pd.read_csv('database.csv')
A,B,C,D,E,F,G # A to
Solution 1:
You overwrite result on every iteration, which is why you only get the last result. You also don't need a loop; this will work:
df[name_list.index].to_csv('result.csv')
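The asker's loop is not reproduced here, so the following is only a guess at the pattern being described. It assumes a hypothetical list called wanted and a hypothetical loop that rebinds result on each pass, and contrasts that with selecting all the columns in one step:

```python
import io
import pandas as pd

df = pd.read_csv(io.StringIO("A,B,C,D,E,F,G\n1,2,3,4,5,6,7"))
wanted = ["B", "E", "F"]  # hypothetical list of column names

# Hypothetical loop: `result` is rebound on every pass, so only the
# column from the final iteration ("F") is left when the loop ends.
for col in wanted:
    result = df[[col]]
last_only = list(result.columns)

# Selecting with the whole list at once keeps every requested column.
all_at_once = list(df[wanted].columns)
```

Writing result out after (or inside) such a loop would leave only the last column in the file, which matches the symptom in the question title.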
Example:
In [21]:
import pandas as pd
import io
temp="""A,B,C,D,E,F,G
1,2,3,4,5,6,7"""
temp1="""Name
B
E
F"""
df = pd.read_csv(io.StringIO(temp))
print(df)
name = pd.read_csv(io.StringIO(temp1), index_col=[0])
name
   A  B  C  D  E  F  G
0  1  2  3  4  5  6  7
Out[21]:
Empty DataFrame
Columns: []
Index: [B, E, F]
In [20]:
df[name.index]
Out[20]:
   B  E  F
0  2  5  6
The above shows that you don't need to build another DataFrame just to collect the columns of interest. Once you have read in the names, you can pass their index to the original df to select those columns, and then write the result out to a CSV file.
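As a minimal end-to-end sketch of that read, select and write sequence, using the same in-memory data as the example: calling to_csv without a path returns the CSV text, which stands in for writing 'result.csv' here, and index=False is an assumption about the desired output (it drops the row-index column).

```python
import io
import pandas as pd

df = pd.read_csv(io.StringIO("A,B,C,D,E,F,G\n1,2,3,4,5,6,7"))
name = pd.read_csv(io.StringIO("Name\nB\nE\nF"), index_col=[0])

# Sub-select the columns listed in the index of `name`.
selected = df[name.index]

# With no path, to_csv returns the CSV as a string instead of writing a file.
# index=False is an assumption: it leaves the row index out of the output.
csv_text = selected.to_csv(index=False)
```

Passing a file name such as 'result.csv' as the first argument would write the same content to disk instead.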
EDIT
If you have duplicated entries in your index, you can call unique() to de-duplicate the values:
In [24]:
temp1="""Name
B
B
E
F"""
name = pd.read_csv(io.StringIO(temp1), index_col=[0])
print(name)
df[name.index.unique()]
Empty DataFrame
Columns: []
Index: [B, B, E, F]
Out[24]:
   B  E  F
0  2  5  6
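To see what unique() changes, this small sketch compares the selection with and without it on the same data. The variable names with_dupes and deduped are just illustrative:

```python
import io
import pandas as pd

df = pd.read_csv(io.StringIO("A,B,C,D,E,F,G\n1,2,3,4,5,6,7"))
name = pd.read_csv(io.StringIO("Name\nB\nB\nE\nF"), index_col=[0])

# Selecting with the raw index repeats column B once per duplicate entry.
with_dupes = df[name.index]

# unique() keeps the first occurrence of each label, in order.
deduped = df[name.index.unique()]
```

Without unique(), the written CSV would contain column B twice.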