2

I want to merge rows in csv files by matching the id with a given dictionary.

I have a dictionary: l= {2.80215: [376570], 0.79577: [378053], 22667183: [269499]}

I have a csv file.

          A        B        C        D      
2000-01-03 -0.59885 -0.18141 -0.68828 -0.77572
2000-01-04  0.83935  0.15993  0.95911 -1.12959
2000-01-05  2.80215 -0.10858 -1.62114 -0.20170
2000-01-06  0.71670 -0.26707  1.36029  1.74254
2000-01-07 -0.45749  0.22750  0.46291 -0.58431
2000-01-10 -0.78702  0.44006 -0.36881 -0.13884
2000-01-11  0.79577 -0.09198  0.14119  0.02668
2000-01-12 -0.32297  0.62332  1.93595  0.78024
2000-01-13  1.74683 -1.57738 -0.02134  0.11596

The output should be :

     A        B        C        D          E      F     
2000-01-03 -0.59885 -0.18141 -0.68828 -0.77572    0
2000-01-04  0.83935  0.15993  0.95911 -1.12959    0
2000-01-05  2.80215 -0.10858 -1.62114 -0.20170    376570 
2000-01-06  0.71670 -0.26707  1.36029  1.74254    0
2000-01-07 -0.45749  0.22750  0.46291 -0.58431    0
2000-01-10 -0.78702  0.44006 -0.36881 -0.13884    0
2000-01-11  0.79577 -0.09198  0.14119  0.02668    378053
2000-01-12 -0.32297  0.62332  1.93595  0.78024    0
2000-01-13  1.74683 -1.57738 -0.02134  0.11596    0 

I tried to do it this way:

import pandas
data = pandas.read_csv("thisfiel.csv")

data["F"] = data["B"].apply(lambda x: l[x])

But I couldn't get the intended results.

1
  • I think Andy's answer is the right approach; however, you shouldn't create a dict that stores its values as lists unless you have more than one value per entry. Commented Mar 10, 2014 at 18:46

2 Answers

2

If l were a DataFrame, you could do a merge:

In [11]: l_df = pd.DataFrame.from_dict(l, orient='index')

In [12]: l_df.columns = ['F']

In [13]: l_df
Out[13]: 
                     F
2.80215         376570
0.79577         378053
22667183.00000  269499

Merge on column A and the index of l_df:

In [14]: merged = df.merge(l_df, left_on='A', right_index=True, how='left')

In [15]: merged
Out[15]: 
                  A        B        C        D       F
2000-01-03 -0.59885 -0.18141 -0.68828 -0.77572     NaN
2000-01-04  0.83935  0.15993  0.95911 -1.12959     NaN
2000-01-05  2.80215 -0.10858 -1.62114 -0.20170  376570
2000-01-06  0.71670 -0.26707  1.36029  1.74254     NaN
2000-01-07 -0.45749  0.22750  0.46291 -0.58431     NaN
2000-01-10 -0.78702  0.44006 -0.36881 -0.13884     NaN
2000-01-11  0.79577 -0.09198  0.14119  0.02668  378053
2000-01-12 -0.32297  0.62332  1.93595  0.78024     NaN
2000-01-13  1.74683 -1.57738 -0.02134  0.11596     NaN

Note: NaN means missing here; you can fill those values in with fillna:

In [16]: merged['F'] = merged['F'].fillna(0)
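
For reference, the steps above can be put together as one script. This is only a sketch: it builds a few rows of the question's data inline instead of reading the CSV, and it assumes the floats in column A compare exactly equal to the dict keys, which may not hold for values parsed from a file.

```python
import pandas as pd

# A few rows copied from the question; the dates become the index.
df = pd.DataFrame(
    {
        "A": [-0.59885, 0.83935, 2.80215, 0.71670],
        "B": [-0.18141, 0.15993, -0.10858, -0.26707],
    },
    index=pd.to_datetime(["2000-01-03", "2000-01-04", "2000-01-05", "2000-01-06"]),
)

l = {2.80215: [376570], 0.79577: [378053], 22667183: [269499]}

# Dict keys become the index, the single list item becomes column F.
l_df = pd.DataFrame.from_dict(l, orient="index")
l_df.columns = ["F"]

# Left merge keeps every row of df; rows without a match get NaN in F.
merged = df.merge(l_df, left_on="A", right_index=True, how="left")

# Replace NaN with 0 and go back to integers (NaN forces a float column).
merged["F"] = merged["F"].fillna(0).astype(int)
```

Only the row whose A value is 2.80215 picks up a value from the dict here; the rest end up with 0.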

4 Comments

What if I need to match on column A? How can I merge it?
@user3378649 I don't follow. This is matching column A with the index of l_df.
What is l1? Is it l_df?
@user3378649 ah yes, my bad. I tried for a better naming convention but missed one.
1

Do this:

def getVal(x):
    try:
        return l[x][0]

    except KeyError:
        return 0

data['F'] = data['B'].map(getVal)
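
Here is a self-contained run of the same idea, as a rough sketch. The small frame is built inline from a few of the question's rows, and the lookup is applied to column A because that is where the dict keys appear in the posted sample; swap in whichever column actually holds the keys.

```python
import pandas as pd

l = {2.80215: [376570], 0.79577: [378053], 22667183: [269499]}

def getVal(x):
    try:
        return l[x][0]  # first (and only) item of the list stored under this key
    except KeyError:
        return 0        # no entry in the dict for this value

# A few rows from the question, built inline for the sketch.
data = pd.DataFrame(
    {
        "A": [-0.59885, 2.80215, 0.79577],
        "B": [-0.18141, -0.10858, -0.09198],
    }
)

# Look every value of A up in the dict; misses become 0.
data["F"] = data["A"].map(getVal)
```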

7 Comments

It didn't work! I still get zeros in the new column.
@user3378649 did you want zeros or NaN? You can change the get(x,0) to get(x,NaN)
Weird! I am getting the value "1" in every row of the new column.
Literally, I should match on B's value: if (B.value == l.item()), I should insert the item from the dict into the same row where the value was found in column B.
I think the issue is that l is a dict with list values.
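
To illustrate the list-values point, here is a small sketch. The names hit, flat, col and new_col are made up for the example, and it assumes each list holds exactly one item.

```python
import pandas as pd

l = {2.80215: [376570], 0.79577: [378053], 22667183: [269499]}

# With list values, a plain lookup hands back a list, not a number.
hit = l.get(2.80215, 0)

# Flatten the dict so each key maps straight to its single value.
flat = {key: values[0] for key, values in l.items()}

# Series.map accepts a dict; values with no matching key become NaN.
col = pd.Series([0.83935, 2.80215, 0.79577])
new_col = col.map(flat).fillna(0).astype(int)
```

With the flattened dict there is no need for a helper function or a try/except around the lookup.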