I have the following dataframe (Positive Samples Dataframe), loaded from a txt file using pandas.
This positive samples dataframe has a column called Gene Set, where each entry is a list of genes. When I run postive_samples["Gene Set"] I get the following output:
['YAL004W', 'YLL024C']
['YAL005C', 'YLL024C']
['YAL005C', 'YMR006C']
['YAL005C', 'YOL090W']
['YAL009W', 'YBR074W']
['YAL009W', 'YER162C']
['YAL009W', 'YHL024W']
['YAL009W', 'YJL187C']
['YAL009W', 'YKR003W']
I also have another dataframe called new_expression_df (New Expression Dataframe), whose index is made up of the genes in the postive_samples["Gene Set"] column.
What I am trying to do is take the values stored in postive_samples["Gene Set"] as they are and look them up in the new_expression_df index with loc, inside a loop:
samples_column_list = ["GSM144760", "GSM144761", "GSM144762", "GSM144763", "GSM144764"]
for gene_class_column in postive_samples[['Gene Set']]:
    # Select column contents by column name using the [] operator
    geneSeriesObj = postive_samples[gene_class_column]
    gene_pairs = geneSeriesObj.values
    # Get gene pairs and locate their expression in the given samples
    for gene_pair in gene_pairs:
        new_expression_df.loc[gene_pair, samples_column_list]
I get a KeyError at the start of the iteration when I do this in a loop. Ideally, I want to take each gene set as a list and locate its values in the other dataframe by index.
But when I plug in each set by hand, as below, without a loop, it works just fine for the same values that give me the KeyError. So what am I doing wrong here?
new_expression_df.loc[['YAL002W','YBL001C'],samples_column_list]
I want to supply the row argument of loc dynamically, from the values of another dataframe's column, each of which is stored as a list.
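
For illustration, here is a minimal sketch of what such a dynamic lookup could look like. It rests on an unconfirmed assumption: because the dataframe was loaded from a txt file, the Gene Set entries may be strings such as "['YAL004W', 'YLL024C']" rather than real Python lists, in which case loc would treat the whole string as a single label. The toy data, the helper to_gene_list, and the dictionary gene_pair_expression are hypothetical names introduced only for this example.

```python
import ast

import pandas as pd

# Toy stand-ins for the real data (hypothetical values, two samples only).
samples_column_list = ["GSM144760", "GSM144761"]
new_expression_df = pd.DataFrame(
    [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],
    index=["YAL004W", "YLL024C", "YAL005C"],
    columns=samples_column_list,
)

# Assumption: reading from a text file left each gene set as a string.
postive_samples = pd.DataFrame(
    {"Gene Set": ["['YAL004W', 'YLL024C']", "['YAL005C', 'YLL024C']"]}
)


def to_gene_list(value):
    """Return a list of gene IDs from a list or from its string form."""
    if isinstance(value, str):
        # Safely parse the string "['A', 'B']" into the list ['A', 'B'].
        return ast.literal_eval(value)
    return list(value)


# Look up each gene pair's rows and keep the result, keyed by the pair.
gene_pair_expression = {}
for raw_pair in postive_samples["Gene Set"]:
    gene_pair = to_gene_list(raw_pair)
    gene_pair_expression[tuple(gene_pair)] = new_expression_df.loc[
        gene_pair, samples_column_list
    ]
```

Checking type(postive_samples["Gene Set"].iloc[0]) on the real data would show whether this assumption holds. If the entries are already lists, the helper leaves them unchanged and the KeyError has a different cause, for example a gene that is missing from the index.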