I just started using Python and am trying to convert some of my R code into Python. The task is relatively simple: I have many CSV files, each with a variable name (in this case cell lines) and values (IC50s). I need to pull out all variables, and their values, that are shared among all files. Some of these files share the same variables but format them differently. For example, in some files a variable is just "Cell_line" and in others it is "MEL:Cell_line". So, to make a direct string comparison, I first need to format them the same way, and hence am trying to use str.split() to do so. There is probably a much better way to do this, but for now I am using the following code:
import csv
import os
# Change working directory
os.chdir("/Users/joshuamannheimer/downloads")
file_name="NCI60_Bleomycin.csv"
with open(file_name) as csvfile:
    NCI_data=csv.reader(csvfile, delimiter=',')
    alldata={}
    for row in NCI_data:
        name_str=row[0]
        splt=name_str.split(':')
        n_name=splt[1]
        alldata[n_name]=row
name_str.split(':') returns a list of length 2. Since the portion I want is after the ":", I want the second element, which should be indexed as splt[1], since splt[0] is the first element in Python. However, when I run the code I get this error message: "IndexError: list index out of range". I'm taking the second element of a list of length 2, so I have no idea why it is out of range. Any help or suggestions would be appreciated.
Does every name have a ":" in it? The split works on "MEL:Cell_line"; it will fail on "Cell_line", as the list will only have splt[0]. You can use splt[-1] to always get the last element, however many there are.
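As a minimal sketch of that suggestion: the helper name normalize_name and the sample rows below are made up for illustration and are not from the original files. It relies only on the fact that split(':') returns a one-element list when the string contains no ":".

```python
def normalize_name(name_str):
    """Return the part after the last ':' or the whole string if there is none."""
    # "MEL:Cell_line".split(':') gives two pieces; "Cell_line".split(':') gives one.
    # Indexing with [-1] takes the last piece in both cases, so no IndexError.
    return name_str.split(':')[-1]


# Hypothetical rows standing in for what csv.reader would yield.
rows = [
    ["MEL:Cell_line", "1.2"],
    ["Cell_line_2", "3.4"],
]

alldata = {}
for row in rows:
    alldata[normalize_name(row[0])] = row

print(sorted(alldata))
```

With this, both the prefixed and unprefixed spellings end up under the same kind of key, so names can be compared directly across files.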