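I'm trying to merge two CSVs based on a condition. Each row of CSV2 has a 'KEYS' value and a list of 'TC_NUM' values; wherever one of those values matches the 'TC_NUM' in CSV1, the corresponding 'KEYS' value should be appended to CSV1 as a third column. The CSVs are very large, so this has to be done in code.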
df1 - CSV1:
ID TC_NUM
dialog_testcase_0101.0001_greeting.xml 101.0001
dialog_testcase_0101.0002_greeting.xml 101.0002
dialog_testcase_0101.0003_greeting.xml 101.0003
dialog_testcase_0101.0004_greeting.xml 101.0004
dialog_testcase_0101.0005_greeting.xml 101.0005
dialog_testcase_0101.0006_greeting.xml 101.0006
dialog_testcase_0901.0008_greeting.xml 901.0007
dialog_testcase_0101.0008_greeting.xml 101.0008
dialog_testcase_0501.001_greeting.xml 501.001
dialog_testcase_0801.0011_greeting.xml 801.0011
df2 - CSV2:
KEYS TC_NUM
FIT-3982 TC 101.0011, 101.0004
FIT-3980 TC 801.0011.901.007
FIT-3979 TC 101.0006, 501.001, 1907.0019, 1907.0020, 1907.0021
What I want:
csvFinal:
ID TC_NUM Keys
dialog_testcase_0101.0001_greeting.xml 101.0011 FIT-3982
dialog_testcase_0101.0002_greeting.xml 101.0002
dialog_testcase_0101.0003_greeting.xml 101.0006 FIT_3979
dialog_testcase_0101.0004_greeting.xml 101.0004 FIT-3982
dialog_testcase_0101.0005_greeting.xml 101.0005
dialog_testcase_0101.0006_greeting.xml 101.0011 FIT_3982
dialog_testcase_0901.0008_greeting.xml 901.0007 FIT_3979
dialog_testcase_0101.0008_greeting.xml 101.0008
dialog_testcase_0501.001_greeting.xml 501.001 FIT-3979
dialog_testcase_0801.0011_greeting.xml 801.0011 FIT-3980
My code:
import pandas as pd
# df1 and df2 are the DataFrames read from CSV1 and CSV2
mergedOpen = pd.merge(df1, df2, on=['TC_NUM'])
mergedOpen.set_index('TC_NUM', inplace=True)
mergedOpen.to_csv('MergedCSVOPEN.csv')
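
For reference, here is a minimal sketch of one way the matching described above might be done. It rests on several assumptions that are not confirmed by the data: that each TC_NUM cell in CSV2 is a comma-separated list with a leading "TC " prefix, that both TC_NUM columns are read as strings (for example with dtype=str) so that values like 501.001 are not altered, and that rows without a match should stay in the output with an empty key. The helper name attach_keys is hypothetical, and the frames below are small stand-ins built from a few of the rows shown above.

```python
import pandas as pd

# Small stand-ins for CSV1 and CSV2, using rows from the question.
# Assumption: TC_NUM is kept as text in both frames.
df1 = pd.DataFrame({
    "ID": [
        "dialog_testcase_0101.0004_greeting.xml",
        "dialog_testcase_0101.0005_greeting.xml",
        "dialog_testcase_0501.001_greeting.xml",
    ],
    "TC_NUM": ["101.0004", "101.0005", "501.001"],
})
df2 = pd.DataFrame({
    "KEYS": ["FIT-3982", "FIT-3979"],
    "TC_NUM": [
        "TC 101.0011, 101.0004",
        "TC 101.0006, 501.001, 1907.0019, 1907.0020, 1907.0021",
    ],
})


def attach_keys(df1, df2):
    """Hypothetical helper: left-join KEYS onto df1 via the TC_NUM lists in df2."""
    lookup = df2.copy()
    # Drop the assumed "TC " prefix, then split each cell into a list of numbers.
    lookup["TC_NUM"] = (
        lookup["TC_NUM"]
        .str.replace(r"^\s*TC\s*", "", regex=True)
        .str.split(",")
    )
    # One row per (KEYS, TC_NUM) pair, with surrounding spaces removed.
    lookup = lookup.explode("TC_NUM")
    lookup["TC_NUM"] = lookup["TC_NUM"].str.strip()
    # If a TC_NUM is listed under several keys, join them so df1 rows are not duplicated.
    lookup = lookup.groupby("TC_NUM", as_index=False)["KEYS"].agg(", ".join)
    # A left merge keeps every df1 row; unmatched rows get NaN in KEYS.
    return df1.merge(lookup, on="TC_NUM", how="left")


result = attach_keys(df1, df2)
print(result)
```

A plain pd.merge on TC_NUM compares whole cells, so a cell holding a list such as "TC 101.0011, 101.0004" would not equal "101.0004"; splitting and exploding the list first is what lets the individual numbers line up. The result could then be written out with to_csv as in the code above.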