I'm trying to compare two files and extract the lines of the first file whose first column (VarID) also appears in the second file. For example:
File 1:
VarID GeneID TaxName PfamName
3810359 1327 Isochrysidaceae Methyltransf_21&Methyltransf_22
6557609 5442 Peridiniales NULL
4723299 7370 Prorocentrum PEPCK_ATP
3019317 10454 Dinophyceae NULL
2821675 10965 Bacillariophyta PK;PK_C
5559318 12824 Dinophyceae Cyt-b5&FA_desaturase
File 2:
VarID
3810359
6557609
4723299
5893435
4852156
For the output, I want this file:
VarID GeneID TaxName PfamName
3810359 1327 Isochrysidaceae Methyltransf_21&Methyltransf_22
6557609 5442 Peridiniales NULL
4723299 7370 Prorocentrum PEPCK_ATP
I tried this code:
import sys

f1 = sys.argv[1]
f2 = sys.argv[2]
# Read data from the first file
file1_rows = []
with open(f1, 'r') as file1:
    for row in file1:
        file1_rows.append(row.split())
# Read data from the second file
file2_rows = []
with open(f2, 'r') as file2:
    for row in file2:
        file2_rows.append(row.split())
# Compare data and compute results
results = []
for row in file2_rows:
    if row[:1] in file1_rows:
        results.append(row[:4])
    else:
        results.append(row[:4])
# Print the results
for row in results:
    print(' '.join(row))
Can you please help me? Thank you!
Replace "if row[:1] in file1_rows:" with "if row[0] in file1_rows:". Also delete the else.
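One thing to watch out for: `file1_rows` is a list of lists, so testing a single ID (or a one-element slice) against it compares that value with whole rows and never matches. The loop also walks over file 2, so even on a match it would only print the VarID column. A minimal sketch of another way to do it, assuming both files are whitespace-separated and start with a header line: collect the IDs from file 2 into a set, then keep the rows of file 1 whose first field is in that set. The names `filter_by_ids` and `wanted` are made up for this example.

```python
import sys


def filter_by_ids(table_lines, id_lines):
    """Return the header of table_lines plus the rows whose first field is a wanted ID."""
    if not table_lines:
        return []
    # IDs from the second file; the first line is assumed to be its header
    wanted = {line.split()[0] for line in id_lines[1:] if line.strip()}
    # Keep the header of the first file as it is
    kept = [table_lines[0].rstrip("\n")]
    for line in table_lines[1:]:
        fields = line.split()
        # Compare only the first column (VarID) against the set of wanted IDs
        if fields and fields[0] in wanted:
            kept.append(" ".join(fields))
    return kept


def main(path1, path2):
    with open(path1) as file1, open(path2) as file2:
        for line in filter_by_ids(file1.readlines(), file2.readlines()):
            print(line)


# Runs only when called as: python script.py file1 file2
if __name__ == "__main__" and len(sys.argv) == 3:
    main(sys.argv[1], sys.argv[2])
```

Rows come out in the order they appear in file 1, and the set makes each lookup cheap even when file 2 is large. If your files are tab-separated and fields can contain spaces, `split()` and `" ".join()` would need to be changed accordingly.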