I have several paired-end fastq.gz files (both R1 and R2) in a directory named dir on a Linux system. It looks like this:
dir
|____sampleA_1.fastq.gz
|____sampleA_2.fastq.gz
|____sampleB_1.fastq.gz
|____sampleB_2.fastq.gz
|____sampleC_1.fastq.gz
|____sampleC_2.fastq.gz
I want to create a text file with the sample name as the first column, the path to the R1 fastq as the second column, and the path to the R2 fastq as the third column.
Inside dir, I tried the following:
find "$PWD" -name \*1.fastq.gz > list1.txt
find "$PWD" -name \*2.fastq.gz > list2.txt
With this approach I still have to merge the two files, add a header line, and create another column with the sample names. Is there a way to create the file with a single command instead?
The text file should look like this:
sample Second Third
sampleA dir/sampleA_1.fastq.gz dir/sampleA_2.fastq.gz
sampleB dir/sampleB_1.fastq.gz dir/sampleB_2.fastq.gz
sampleC dir/sampleC_1.fastq.gz dir/sampleC_2.fastq.gz
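One possible way to get this table in a single pass is a small shell loop over the R1 files. This is only a sketch: it assumes every file follows the <sample>_1.fastq.gz / <sample>_2.fastq.gz naming shown above, with exactly one pair per sample, and it writes tab-separated columns. make_sample_sheet is a made-up helper name, not an existing tool.

```shell
# Hypothetical helper: print one row per sample (name, R1 path, R2 path).
# Assumes the <sample>_1.fastq.gz / <sample>_2.fastq.gz scheme from the question.
make_sample_sheet() {
    local dir="$1" r1 r2 sample
    printf 'sample\tSecond\tThird\n'            # header row, as in the desired output
    for r1 in "$dir"/*_1.fastq.gz; do
        [ -e "$r1" ] || continue                # the glob matched nothing
        r2="${r1%_1.fastq.gz}_2.fastq.gz"       # expected path of the mate file
        if [ ! -e "$r2" ]; then
            printf 'warning: no R2 file for %s\n' "$r1" >&2
            continue                            # skip unpaired R1 files
        fi
        sample=$(basename "$r1" _1.fastq.gz)    # strip directory and suffix
        printf '%s\t%s\t%s\n' "$sample" "$r1" "$r2"
    done
}
```

Run from the parent directory as: make_sample_sheet dir > samples.txt. Pass "$PWD/dir" instead of dir if absolute paths are wanted. R1 files without a matching R2 are reported on stderr and left out of the table.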
sampleA_L001_R1_001.fastq.gz and sampleA_L001_R2_001.fastq.gz? Do you really only have one underscore (_) per file name? Also, can you be sure you will never have more than a single pair of reads per sample? Depending on coverage and the size of the assay's target regions, you can have several.
join might be able to do it.
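The join idea could look roughly like the sketch below. It prefixes each path with its sample name, sorts both lists, and lets join pair them on that first field. It assumes the simple <sample>_1.fastq.gz / <sample>_2.fastq.gz naming from the question and paths without spaces; join_sample_sheet is a made-up helper name.

```shell
# Hypothetical helper: pair R1 and R2 paths with join(1), keyed on the sample name.
# Assumes <sample>_1.fastq.gz / <sample>_2.fastq.gz names and no spaces in paths.
join_sample_sheet() {
    local dir="$1" lists mate
    lists=$(mktemp -d) || return 1
    for mate in 1 2; do
        # Turn "path/sampleA_1.fastq.gz" into "sampleA path/sampleA_1.fastq.gz".
        find "$dir" -name "*_${mate}.fastq.gz" |
            sed "s|.*/\(.*\)_${mate}\.fastq\.gz\$|\1 &|" |
            LC_ALL=C sort > "$lists/list${mate}.txt"
    done
    echo 'sample Second Third'                  # header row
    # join needs both inputs sorted on the key under the same collation.
    LC_ALL=C join "$lists/list1.txt" "$lists/list2.txt"
    rm -r "$lists"
}
```

Run as: join_sample_sheet dir > samples.txt. By default join silently drops samples that have only one of the two files. Because find recurses, a sample name that appears in more than one subdirectory would produce extra combined rows, so this only fits the one-pair-per-sample case.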