I have a txt file like this:
`Empty DataFrame
Columns: [0, 1, 2, 3, 4]
Index: []
Empty DataFrame
Columns: [0, 1, 2, 3, 4]
Index: []
0 1 2 \
46 RNA/4v6p.csv,46AA/U/551 RNA/4v6p.csv,46AA/A/33 RNA/4v6p.csv,46WW_cis
47 RNA/4v6p.csv,46AA/G/550 RNA/4v6p.csv,46AA/C/34 RNA/4v6p.csv,46WW_cis
48 RNA/4v6p.csv,46AA/A/553 RNA/4v6p.csv,46AA/U/30 RNA/4v6p.csv,46WW_cis
49 RNA/4v6p.csv,46AA/U/552 RNA/4v6p.csv,46AA/A/33 RNA/4v6p.csv,46WW_cis
50 RNA/4v6p.csv,46AA/U/1199 RNA/4v6p.csv,46AA/G/1058 RNA/4v6p.csv,46WW_cis
3 4
46 NaN NaN
47 NaN NaN
48 NaN NaN
49 NaN NaN
50 NaN NaN`
I want to read it into an array with three columns. So far I have tried `pd.read_csv(self.filename, delim_whitespace=True)`, but that raises a lot of errors when it reaches the `Empty DataFrame` parts. How can I make the program ignore those parts?
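One possible way to ignore those parts is to filter the lines before building the array. This is only a minimal sketch under assumptions: `parse_report` and `SKIP_PREFIXES` are hypothetical names, and it assumes the layout shown above, where every wanted row is an integer row label followed by exactly three fields without internal spaces, and wrapped header lines end with a backslash.

```python
# Prefixes of the three lines that pandas prints for an empty DataFrame.
SKIP_PREFIXES = ("Empty DataFrame", "Columns:", "Index:")


def parse_report(text):
    """Return the three data columns of each wanted row as a list of lists.

    Assumes the layout from the sample file: a wanted row is an integer
    row label followed by exactly three whitespace-separated fields.
    """
    rows = []
    for raw in text.splitlines():
        line = raw.strip().strip("`")
        # Skip blank lines, empty-DataFrame reports and wrapped headers.
        if not line or line.startswith(SKIP_PREFIXES) or line.endswith("\\"):
            continue
        parts = line.split()
        # The trailing "3 4" block (all NaN here) has fewer fields per line.
        if len(parts) == 4 and parts[0].isdigit():
            rows.append(parts[1:])
    return rows


def read_report(path):
    """Read a report file from disk and parse it with parse_report."""
    with open(path) as handle:
        return parse_report(handle.read())
```

If a DataFrame is needed afterwards, the returned list of rows can be passed to `pd.DataFrame(rows)`.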
Edit: The optimal solution would be for the file to contain no `Empty DataFrame` entries at all. The file is the result of searching through many files, some of which are empty. I thought I had filtered out the empty files by catching an exception, so that the results of searching empty files would not be stored. I suppose I did it the wrong way. Can somebody please correct me? Here is the code:

    import pandas as pd
    from numpy import mean as nm

    def find_same_direction_chain(self, results):
        # Split each "chain/base/number" string into separate columns.
        separation = lambda x: pd.Series([i for i in x.split('/')])
        left_chain = self.data[0].apply(separation)
        right_chain = self.data[1].apply(separation)
        i = 1
        try:
            while i < len(self.data[:]) - 5:
                # Compare the mean position of overlapping windows on both chains.
                if (nm(left_chain[2][i:i+3]) >= nm(left_chain[2][i+2:i+5])
                        and nm(right_chain[2][i:i+3]) >= nm(right_chain[2][i+2:i+5])
                        and len(self.data[:]) > 0):
                    if (nm(left_chain[2][i+2:i+5]) >= nm(left_chain[2][i+4:i+7])
                            and nm(right_chain[2][i+2:i+5]) >= nm(right_chain[2][i+4:i+7])):
                        results.chains.append(str(self.filename + ", " + str(i) + self.data[0:3][i:i+5]))
                    else:
                        pass
                i += 1
        except ValueError:
            # Files that could not be processed are meant to end up here.
            results.bin.append(self.filename)
        except TypeError:
            results.data_structure_error.append(self.filename)
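
A likely reason the `try`/`except` does not filter anything: slicing or converting an empty DataFrame to a string raises no exception, so neither `except` branch runs and the text `Empty DataFrame ...` is stored like any other result. Below is a minimal sketch of an explicit check instead. `Results`, `is_empty` and `store_chain` are hypothetical names used for illustration only, not part of the code above; the check relies on the `empty` property that pandas objects provide and falls back to `len()` for other containers.

```python
class Results:
    """Hypothetical stand-in for the results object used in the question."""

    def __init__(self):
        self.chains = []
        self.bin = []


def is_empty(data):
    """True for an empty DataFrame/Series or any other empty container."""
    empty = getattr(data, "empty", None)  # pandas objects expose .empty
    if empty is not None:
        return bool(empty)
    return len(data) == 0


def store_chain(results, filename, start, window):
    """Store a window of rows only if it actually contains data."""
    if is_empty(window):
        # No exception is raised for empty data, so it is checked explicitly.
        results.bin.append(filename)
        return False
    results.chains.append((filename, start, window))
    return True
```

Storing the filename, the start index and the rows as separate items, rather than concatenating them into one string, would also keep the output file easier to read back.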