I am trying to left join these two DataFrames using landline.merge(AreaCode, how='left', left_on='source', right_on='codes'). However, the merged values are all null. What did I do wrong?

Edit 1: I used the following code to make sure the data types are the same.

landline['source'] = landline['source'].astype(str)
AreaCode['codes'] = AreaCode['codes'].astype(str)
  • Failed merge
    datetime            | source | Day       | code | area
0   2019-01-01 16:22:46 |        | Tuesday   | NaN  | NaN
1   2019-01-02 09:33:29 |        | Wednesday | NaN  | NaN
2   2019-01-02 09:44:46 |        | Wednesday | NaN  | NaN
  • landline dataframe
<class 'pandas.core.frame.DataFrame'>
Int64Index: 3562 entries, 0 to 7097
Data columns (total 3 columns):
 #   Column    Non-Null Count  Dtype 
---  ------    --------------  ----- 
 0   datetime  3555 non-null   object
 1   source    3562 non-null   object
 2   Day       3555 non-null   object
dtypes: object(3)
memory usage: 111.3+ KB
  • areacode dataframe
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 611 entries, 0 to 610
Data columns (total 2 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   codes   611 non-null    object
 1   area    611 non-null    object
dtypes: object(2)
memory usage: 9.7+ KB
  • Probably because all values in the 'source' column of landline are empty strings, while that is not true for 'codes' in areacode. So there was no match at all. Commented Nov 15, 2020 at 15:58
  • Make sure that the data types of the source and codes columns are the same. Commented Nov 15, 2020 at 16:03
  • However, I have also tried converting 'source' and 'codes' to the str data type, and the merged result is still empty. Commented Nov 15, 2020 at 16:05
  • @Jason, they both show up as 'object'. Is there anything I could do to give them a consistent data type? Commented Nov 15, 2020 at 16:06
  • Using .astype(str)? Commented Nov 15, 2020 at 16:08

1 Answer

There is nothing wrong with your merge call.

It may be that none of the values in the 'source' column actually match any value in 'codes'.

If you believe the values are present but still do not match, the values in the source column probably have a different type. In pandas, the object dtype does not guarantee that every value is a string; an object column can hold arbitrary Python objects, so it may contain mixed types.

For example:

import pandas as pd

df_l = pd.DataFrame({'source': [1,'1','2','2']})
df_r = pd.DataFrame({'codes':['1','2'], 'value':['x','y']})

df_l.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4 entries, 0 to 3
Data columns (total 1 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   source  4 non-null      object
dtypes: object(1)
memory usage: 160.0+ bytes

df_r.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2 entries, 0 to 1
Data columns (total 2 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   codes   2 non-null      object
 1   value   2 non-null      object
dtypes: object(2)
memory usage: 160.0+ bytes

df_l.merge(df_r, how='left', left_on=['source'], right_on=['codes'])
  source codes value
0      1   NaN   NaN
1      1     1     x
2      2     2     y
3      2     2     y

In the example above, df_l['source'] contains both the int 1 and the string '1', but info() reports the column as object. In the merge result, the row holding the string matches, while the row holding the int does not.
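One way to confirm this kind of mismatch is to count the Python types actually stored in the key column, then normalise both keys to str before merging. The following is a minimal sketch that reuses the df_l and df_r frames from the example; the names type_counts and merged are introduced here for illustration only.

```python
import pandas as pd

df_l = pd.DataFrame({'source': [1, '1', '2', '2']})
df_r = pd.DataFrame({'codes': ['1', '2'], 'value': ['x', 'y']})

# Count the Python type of each value in the object column.
# A mix of 'int' and 'str' here explains partial matches.
type_counts = df_l['source'].map(lambda v: type(v).__name__).value_counts()
print(type_counts)

# Convert both key columns to str so that 1 and '1' compare equal.
df_l['source'] = df_l['source'].astype(str)
df_r['codes'] = df_r['codes'].astype(str)

merged = df_l.merge(df_r, how='left', left_on='source', right_on='codes')
print(merged)
```

After the conversion, every row in this small example finds a match. On real data, rows can still come back as NaN if the key simply does not exist on the right side.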

3 Comments

Is it possible that this happens because some values in the left table cannot be matched in the right table? Could I convert the object columns to a string data type?
Yes, it can help if the values on both sides are the same. But if a value of source cannot be found in codes, the merge will obviously return NaN. I just noticed your edit 1, where you converted both columns to str. Does the problem persist even after converting?
I reran all the code and examined the values of source, and found that they are empty. I think that is why changing the data type did not help. Thanks a lot for your help!
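
The empty keys found here can be detected before merging. Below is a rough sketch, assuming small made-up landline and AreaCode frames (the sample values '021', '022', 'A' and 'B' are hypothetical, not taken from the question). It counts empty or whitespace-only keys and uses the indicator=True option of merge to show how many rows matched.

```python
import pandas as pd

# Hypothetical sample data standing in for the real frames.
landline = pd.DataFrame({
    'source': ['', ' ', '021'],
    'Day': ['Tuesday', 'Wednesday', 'Wednesday'],
})
AreaCode = pd.DataFrame({'codes': ['021', '022'], 'area': ['A', 'B']})

# Strip surrounding whitespace, then count keys that are empty strings.
source_clean = landline['source'].astype(str).str.strip()
n_empty = (source_clean == '').sum()
print(f"empty keys in source: {n_empty}")

# indicator=True adds a '_merge' column: 'both' for matched rows,
# 'left_only' for rows with no partner in AreaCode.
landline['source'] = source_clean
merged = landline.merge(AreaCode, how='left', left_on='source',
                        right_on='codes', indicator=True)
match_counts = merged['_merge'].value_counts()
print(match_counts)
```

If n_empty equals the number of rows, as it apparently did in the question, no dtype conversion can produce matches; the key column has to be populated first.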
