I am looking to learn how to concatenate the values of multiple columns across rows in Python. I have a dataset which looks like this:
gene match_type drug sources pmids
ABO Definite CHEMBL50267 DrugBank 17139284|17016423
ABO Definite URIDINE_DIPHOSPHATE TdgClinicalTrial 17139284|17016423
ABO Definite CHEMBL439009 DrugBank 12972418
ABO Definite CHEMBL1232343 DrugBank NA
ABO Definite CHEMBL503075 DrugBank NA
I am trying to bring this into one row (concatenating the drug column, the sources column and the pmids column) to look like:
gene match_type drug sources pmids
ABO Definite CHEMBL1232343 CHEMBL439009 CHEMBL50267 CHEMBL503075 URIDINE_DIPHOSPHATE NA DrugBank TdgClinicalTrial DrugBank DrugBank DrugBank 0 12972418 17139284|17016423 17139284|17016423 NA NA
I have looked into using if statements with pandas.concat and .iterrows to go through every row, but I have gotten a bit lost, and I am not sure which functions I should have started with to achieve my goal. Any help in the right direction would be appreciated.
This is what I've tried, but a lot of it is wrong, if not everything:
for index, row in data.iterrows():
    if [1,2]==[2,1]:
        pd.concat(['drug'],['interaction_types'],['sources'],['pmids'],)
    else:
        print(row[:])
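For reference, one possible direction is to group the rows and join the strings in each column, rather than looping with .iterrows. The sketch below is only an illustration under some assumptions: it rebuilds the five sample rows from the table above by hand, it assumes the rows should be grouped on gene and match_type, it joins values with a single space, and it treats missing values as the literal string "NA". Only the columns shown in the table are included.

```python
import pandas as pd

# Sample rows copied from the table in the question (rebuilt by hand).
data = pd.DataFrame({
    "gene": ["ABO"] * 5,
    "match_type": ["Definite"] * 5,
    "drug": ["CHEMBL50267", "URIDINE_DIPHOSPHATE", "CHEMBL439009",
             "CHEMBL1232343", "CHEMBL503075"],
    "sources": ["DrugBank", "TdgClinicalTrial", "DrugBank",
                "DrugBank", "DrugBank"],
    "pmids": ["17139284|17016423", "17139284|17016423", "12972418",
              "NA", "NA"],
})

# Columns whose values should be concatenated within each group.
cols = ["drug", "sources", "pmids"]

# Assumption: real missing values (NaN) should appear as the text "NA",
# and everything must be a string before it can be joined.
data[cols] = data[cols].fillna("NA").astype(str)

# Assumption: one output row per (gene, match_type) pair.
combined = (
    data.groupby(["gene", "match_type"], as_index=False)
        .agg({col: " ".join for col in cols})
)

print(combined)
```

With this approach the values within each group stay in their original row order; they are not sorted alphabetically.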