I am trying to create a simple line graph to compare a column from two files. I have written some code and would like to know how to skip the rows I don't need in my two .csv files. The code is as follows:
import numpy as np
import csv
from matplotlib import pyplot as plt

def read_cell(x, y):
    # Return the value in column x of row y (both 0-based) from the Gencode file
    with open('Illumina_Heart_Gencode_Paired_End_Novel_Junctions.csv', 'r') as f:
        reader = csv.reader(f)
        y_count = 0
        for n in reader:
            if y_count == y:
                cell = n[x]
                return cell
            y_count += 1

print(read_cell(6, 932))

def read_cell(x, y):
    # Same lookup for the RefSeq file (this redefines read_cell from here on)
    with open('Illumina_Heart_RefSeq_Paired_End_Novel_Junctions.csv', 'r') as f:
        reader = csv.reader(f)
        y_count = 0
        for n in reader:
            if y_count == y:
                cell = n[x]
                return cell
            y_count += 1

print(read_cell(6, 932))
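
# set1 and set2 were never defined in my code; assume they hold the rows of each file
with open('Illumina_Heart_Gencode_Paired_End_Novel_Junctions.csv', 'r') as f:
    set1 = list(csv.reader(f))
with open('Illumina_Heart_RefSeq_Paired_End_Novel_Junctions.csv', 'r') as f:
    set2 = list(csv.reader(f))

# Collect the 7th column (index 6) as floats, skipping non-numeric values
d1 = []
for i in set1:
    try:
        d1.append(float(i[6]))
    except ValueError:
        continue

d2 = []
for i in set2:
    try:
        d2.append(float(i[6]))
    except ValueError:
        continue

# Trim both lists to the length of the shorter one
min_len = min(len(d1), len(d2))
d1 = d1[:min_len]
d2 = d2[:min_len]

plt.plot(d1, d2, 'r*')
plt.plot(d1, d2, 'b-')
plt.xlabel('Data Set 1: PE_NJ')
plt.ylabel('Data Set 2: PE_SJ')
plt.show()
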
The first CSV file has 932 rows and the second has 99,154 rows. I am only interested in the first 932 rows of each file, and I want to compare the 7th column of the two files.
How do I go about doing that?
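
One way this could be sketched is with itertools.islice, which stops reading after a fixed number of rows, so the longer file is never loaded in full. The function name paired_column and its parameters are hypothetical, and the comma delimiter is an assumption about the files; rows where the chosen column is missing or non-numeric in either file are skipped as a pair so the two series stay aligned.

```python
import csv
from itertools import islice


def paired_column(path_a, path_b, column, n_rows, delimiter=","):
    """Pair up `column` (0-based) from the first `n_rows` rows of two files.

    Hypothetical helper: returns a list of (value_a, value_b) float tuples.
    A row pair is dropped if either value is missing or not numeric.
    """
    pairs = []
    with open(path_a, newline="") as fa, open(path_b, newline="") as fb:
        # islice stops each reader after n_rows rows
        rows_a = islice(csv.reader(fa, delimiter=delimiter), n_rows)
        rows_b = islice(csv.reader(fb, delimiter=delimiter), n_rows)
        for row_a, row_b in zip(rows_a, rows_b):
            try:
                pairs.append((float(row_a[column]), float(row_b[column])))
            except (ValueError, IndexError):
                continue
    return pairs
```

Calling paired_column(gencode_path, refseq_path, 6, 932) would then give the 7th-column values of the first 932 rows of each file, which can be unpacked with zip(*pairs) into two lists for plt.plot.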
The first file looks like this:
chr1 1718493 1718764 2 2 0 12 0 24
chr1 8928117 8930883 2 2 0 56 0 24
chr1 8930943 8931949 2 2 0 48 0 25
chr1 9616316 9627341 1 1 0 12 0 24
chr1 10166642 10167279 1 1 0 31 1 24
The second file looks like so:
chr1 880181 880421 2 2 0 15 0 21
chr1 1718493 1718764 2 2 0 12 0 24
chr1 8568735 8585817 2 2 0 12 0 21
chr1 8617583 8684368 2 2 0 14 0 23
chr1 8928117 8930883 2 2 0 56 0 24
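
The excerpts above appear to be separated by whitespace rather than commas. If the real files are delimited the same way (an assumption, since the excerpts may not show the actual delimiter), csv.reader with its default comma delimiter would return each row as a single field, and indexing column 6 would fail. A minimal sketch that splits on whitespace instead, using the first file's sample rows as input; column_from_lines is a hypothetical helper name:

```python
def column_from_lines(lines, column):
    """Extract `column` (0-based) as floats from whitespace-separated lines.

    Hypothetical helper: lines that are too short or non-numeric are skipped.
    """
    values = []
    for line in lines:
        fields = line.split()  # splits on any run of spaces or tabs
        try:
            values.append(float(fields[column]))
        except (ValueError, IndexError):
            continue
    return values


# Sample rows copied from the first file's excerpt
sample = [
    "chr1 1718493 1718764 2 2 0 12 0 24",
    "chr1 8928117 8930883 2 2 0 56 0 24",
    "chr1 8930943 8931949 2 2 0 48 0 25",
    "chr1 9616316 9627341 1 1 0 12 0 24",
    "chr1 10166642 10167279 1 1 0 31 1 24",
]
print(column_from_lines(sample, 6))  # [12.0, 56.0, 48.0, 12.0, 31.0]
```

Because an open file object yields lines, the same function could be passed itertools.islice(f, 932) to limit it to the first 932 rows.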