In the bash loop below I iterate over the pairs of .fastq files and use them in the commented-out command. The variable $pre holds the sample name and is extracted correctly; what I can't figure out is how to use it in the commented command only once. In the example below $pre is NA11111, but it is extracted twice, once for each file of the pair. Is there a way to use it only once in the command? I have tried removing the duplicates with awk and with cut, with no luck. Thank you :).
Bash
for file in /home/cmccabe/Desktop/fastq/*.fastq ; do
sample=${file%.fastq}
bname=$(basename "$sample")
pre=$(echo "$bname" | cut -d- -f1)
#bwa mem -M -t 16 /home/cmccabe/Desktop/NGS/picard-tools-1.140/resources/ucsc.hg19.fasta "$sample.fastq" "$sample" /home/cmccabe/Desktop/fastq/${pre}_aln.sam
echo "$sample.fastq"
echo "$sample"
echo "$pre"
done
current output
/home/cmccabe/Desktop/fastq/NA11111-100ng-E08A-C06_S5_L001_R1_001.fastq `this is $sample.fastq`
/home/cmccabe/Desktop/fastq/NA11111-100ng-E08A-C06_S5_L001_R1_001 `this is $sample`
NA11111 `this is $pre`
/home/cmccabe/Desktop/fastq/NA11111-100ng-E08A-C06_S5_L001_R2_001.fastq `this is $sample.fastq`
/home/cmccabe/Desktop/fastq/NA11111-100ng-E08A-C06_S5_L001_R2_001 `this is $sample`
NA11111 `this is $pre`
desired output
#bwa mem -M -t 16 /home/cmccabe/Desktop/NGS/picard-tools-1.140/resources/ucsc.hg19.fasta "$sample.fastq" "$sample" /home/cmccabe/Desktop/fastq/${pre}_aln.sam
$sample.fastq = /home/cmccabe/Desktop/fastq/NA11111-100ng-E08A-C06_S5_L001_R1_001.fastq
$sample = /home/cmccabe/Desktop/fastq/NA11111-100ng-E08A-C06_S5_L001_R1_001
$pre = NA11111
The `NA11111` value is seen in both the `R1` and `R2` files. So what logic is used to know whether R1 or R2 is the file you want? Or is it that once you find a file for `NA11111`, the other files with `NA11111` are to be discarded? If so, you could extract the `NA?????` values present, list the files with that prefix and keep only the first one (`head -1`).

The `bwa` command does a paired-end alignment using the R1 and R2 of the same sample, but the sample name is `NA11111`, which is duplicated. Am I missing something? Thank you :)
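
One possible way to get a single command per sample is to loop over the R1 files only and derive the matching R2 name from each one, so $pre is produced once per pair. The following is a minimal dry-run sketch, not a tested pipeline. It assumes the two mates differ only by `_R1_` versus `_R2_` in the file name, the function name `pair_commands` is made up for illustration, and it assumes `bwa mem` writes its SAM output to standard output, hence the redirect instead of a trailing file argument.

```shell
#!/usr/bin/env bash
# Dry-run sketch: print one bwa command per sample instead of one per file.
# Assumption: the two mates differ only by "_R1_" vs "_R2_" in the file name.

ref=/home/cmccabe/Desktop/NGS/picard-tools-1.140/resources/ucsc.hg19.fasta

pair_commands() {   # hypothetical helper name
    local dir=$1 r1 r2 bname pre
    for r1 in "$dir"/*_R1_*.fastq; do
        [ -e "$r1" ] || continue                # glob matched nothing
        bname=$(basename "$r1")
        r2=$dir/${bname/_R1_/_R2_}              # mate file, derived from the R1 name
        if [ ! -e "$r2" ]; then
            echo "no R2 mate for $r1, skipping" >&2
            continue
        fi
        pre=${bname%%-*}                        # text before the first "-", e.g. NA11111
        # Only print the command; nothing is executed here.
        echo "bwa mem -M -t 16 $ref $r1 $r2 > $dir/${pre}_aln.sam"
    done
}

pair_commands /home/cmccabe/Desktop/fastq
```

If the printed lines look right, the `echo` can be swapped for the real command. Because only R1 files drive the loop, each sample such as NA11111 appears once even though two files carry that prefix.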