Why I Only Get The Last Output In My Output File?

I tried to select particular columns based on a list of column names using pandas in Python 2.7. For example, >>> df = pd.read_csv('database.csv') A,B,C,D,E,F,G # A to

Solution 1:

You overwrite result on every iteration, which is why only the last result ends up in the file. You also don't need a loop at all; this will work:

df[name_list.index].to_csv('result.csv')

Example:

In [21]:

import pandas as pd
import io
temp="""A,B,C,D,E,F,G
1,2,3,4,5,6,7"""
temp1="""Name
B
E
F"""
df = pd.read_csv(io.StringIO(temp))
print(df)
name = pd.read_csv(io.StringIO(temp1), index_col=[0])
name
   A  B  C  D  E  F  G
0  1  2  3  4  5  6  7
Out[21]:
Empty DataFrame
Columns: []
Index: [B, E, F]
In [20]:

df[name.index]
Out[20]:
   B  E  F
0  2  5  6

The above shows that you don't need to create another DataFrame just to write out the columns of interest. Once you have read in your names, you can pass their index to sub-select those columns from the original df and then write the result out to a CSV file.
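To make the overwrite problem concrete, here is a minimal sketch, assuming Python 3 and a recent pandas. The question's original loop is not shown, so the loop below is a hypothetical reconstruction of the pattern, and the file names and the temporary directory are made up for illustration.

```python
import io
import os
import tempfile

import pandas as pd

df = pd.read_csv(io.StringIO("A,B,C,D,E,F,G\n1,2,3,4,5,6,7"))
names = pd.read_csv(io.StringIO("Name\nB\nE\nF"), index_col=[0])

# Hypothetical output locations in a throwaway directory.
out_dir = tempfile.mkdtemp()
loop_path = os.path.join(out_dir, "loop_result.csv")
direct_path = os.path.join(out_dir, "result.csv")

# Hypothetical loop: each to_csv call opens the file in write mode,
# so every iteration replaces what the previous one wrote.
for col in names.index:
    df[[col]].to_csv(loop_path, index=False)

# Single selection: pick all wanted columns at once and write once.
df[names.index].to_csv(direct_path, index=False)

loop_columns = list(pd.read_csv(loop_path).columns)
direct_columns = list(pd.read_csv(direct_path).columns)
print(loop_columns)    # ['F']
print(direct_columns)  # ['B', 'E', 'F']
```

Only the column written on the final iteration survives in the looped file, while the single selection keeps all three.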

EDIT

If you have duplicated entries in your index, you can call unique to de-duplicate the values; otherwise each duplicated name is selected more than once:

In [24]:

temp1="""Name
B
B
E
F"""
name = pd.read_csv(io.StringIO(temp1), index_col=[0])
print(name)
df[name.index.unique()]
Empty DataFrame
Columns: []
Index: [B, B, E, F]
Out[24]:
   B  E  F
0  2  5  6
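As a rough end-to-end sketch of the de-duplicated selection followed by the write, assuming Python 3: the in-memory buffer stands in for the output file so the snippet has no side effects, and the variable names are illustrative rather than taken from the question.

```python
import io

import pandas as pd

df = pd.read_csv(io.StringIO("A,B,C,D,E,F,G\n1,2,3,4,5,6,7"))
names = pd.read_csv(io.StringIO("Name\nB\nB\nE\nF"), index_col=[0])

# Without de-duplication, column B is selected twice.
duplicated_columns = list(df[names.index].columns)

# unique() keeps the first occurrence of each label, in order.
wanted = names.index.unique()

# Write to an in-memory buffer; pass a path such as 'result.csv'
# instead to write to disk.
buffer = io.StringIO()
df[wanted].to_csv(buffer, index=False)
csv_lines = buffer.getvalue().splitlines()

print(duplicated_columns)  # ['B', 'B', 'E', 'F']
print(csv_lines)           # ['B,E,F', '2,5,6']
```

Passing index=False leaves the row index out of the file; drop it if you want the index written as the first column.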
