0

I am using the regex below in my script to pick up files whose names end in _L001_R1_001.fastq or _L001_R2_001.fastq.

An R1 file should be assigned to readPair_1 and an R2 file to readPair_2, but the regex is not matching anything.

Can anyone please tell me what is wrong here?

My script:

#! /bin/bash -l

Proj_Dir="${se_ProjDir}/*.fastq"

for Dir in $Proj_Dir
do

        if [[ "$Dir" =~ _L.*_R1_001.fastq]]
        then

            readPair_1=$Dir
            echo $readPair_1

        fi
        if [[ "$Dir" =~ _L.*_R2_001.fastq]]
        then

            readPair_2=$Dir
            echo $readPair_2

        fi

done

Files:

Next-ID-1-MN-SM5144-170509-ABC_S1_L001_R1_001.fastq
Next-ID-1-MN-SM5144-170509-ABC_S1_L001_R2_001.fastq
Next-ID-1-MN-SM5144-170509-ABC_S2_L001_R1_001.fastq
Next-ID-1-MN-SM5144-170509-ABC_S2_L001_R2_001.fastq
Next-ID-1-MN-SM5144-170509-ABC_S3_L001_R1_001.fastq
Next-ID-1-MN-SM5144-170509-ABC_S3_L001_R2_001.fastq
5
  • Try _L[^_]*_R[0-9]+_001\.fastq\.gz. A $ at the end might also be useful to match only at the end of input. Commented Jun 1, 2017 at 21:02
  • When you say it's not working, what does that mean? Is it only matching some of the strings you want, or is it not matching anything? Your regex is imprecise due to unescaped .s but it looks like it should still match the R1_001 files. Commented Jun 1, 2017 at 21:06
  • Thanks for the comment. No, it's not matching anything. Commented Jun 1, 2017 at 21:09
  • What is the language you are using? Tag it. Commented Jun 1, 2017 at 21:09
  • Your code contains a wildcard pattern (aka glob), not a regular expression. Commented Jun 1, 2017 at 21:12
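
To illustrate the glob versus regex distinction raised in the last comment, here is a minimal sketch. It assumes Bash; matches_glob and matches_regex are hypothetical helper names used only for this example. Inside [[ ]], == compares against a wildcard pattern that must cover the whole string, while =~ applies a POSIX extended regular expression.

```shell
#!/bin/bash
# Hypothetical helpers, for illustration only.

matches_glob() {
    # With ==, the unquoted right-hand side is a glob: * means "any characters"
    # and the pattern has to cover the entire string.
    [[ $1 == *_L*_R1_001.fastq ]]
}

# Keeping the regex in a variable avoids quoting surprises inside [[ ]].
re='_L.*_R1_001\.fastq$'

matches_regex() {
    # With =~, the right-hand side is an extended regular expression:
    # .* means "any characters" and \. is a literal dot.
    [[ $1 =~ $re ]]
}

name='Next-ID-1-MN-SM5144-170509-ABC_S1_L001_R1_001.fastq'
matches_glob "$name" && echo "glob matches"
matches_regex "$name" && echo "regex matches"
```

Both forms can select the same files; the difference is only in the pattern syntax each operator expects.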

3 Answers

1

You need .gz at the end of your pattern. You're not getting any files at all:

Proj_Dir="${se_ProjDir}/*.fastq.gz"

You also need spaces before ]]:

if [[ "$Dir" =~ _L.*_R1_001.fastq ]]

and

if [[ "$Dir" =~ _L.*_R2_001.fastq ]]
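
Putting those fixes together, the loop could look roughly like the sketch below. It is not the asker's final script: classify_read is a hypothetical helper, the optional (\.gz)? group is an assumption so that both .fastq and .fastq.gz names are accepted, and se_ProjDir is assumed to be set elsewhere (the sketch falls back to the current directory).

```shell
#!/bin/bash
# Sketch: sort FASTQ paths into read 1 / read 2 using [[ =~ ]].

# classify_read is a hypothetical helper, not part of the original script.
classify_read() {
    local path=$1
    # Regexes live in variables; dots are escaped and the end is anchored.
    local re_r1='_L[0-9]+_R1_001\.fastq(\.gz)?$'
    local re_r2='_L[0-9]+_R2_001\.fastq(\.gz)?$'
    if [[ $path =~ $re_r1 ]]; then
        echo "R1"
    elif [[ $path =~ $re_r2 ]]; then
        echo "R2"
    else
        echo "none"
    fi
}

# se_ProjDir is assumed to be set by the caller; default to the current directory.
for file in "${se_ProjDir:-.}"/*.fastq*; do
    [[ -e $file ]] || continue   # skip the literal pattern when nothing matched
    case $(classify_read "$file") in
        R1) readPair_1=$file; echo "R1: $readPair_1" ;;
        R2) readPair_2=$file; echo "R2: $readPair_2" ;;
    esac
done
```

Note the space before each ]] and the quoting around the directory part of the glob, which keeps paths with spaces intact.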

0

Try:

L001_R[12]_001\.fastq\.gz$

This will look for either the R1 or R2 files, and ensure that that's how the filename string ends.
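
A quick way to check that pattern is the sketch below. It assumes Bash; is_read_file and the sample_... names are made up for the test and are not real files from the question.

```shell
#!/bin/bash
# Sketch: try the suggested regex against a few hypothetical file names.

re='L001_R[12]_001\.fastq\.gz$'

# is_read_file is a hypothetical helper; it succeeds when the name ends
# in L001_R1_001.fastq.gz or L001_R2_001.fastq.gz.
is_read_file() {
    [[ $1 =~ $re ]]
}

for name in \
    sample_S1_L001_R1_001.fastq.gz \
    sample_S1_L001_R2_001.fastq.gz \
    sample_S1_L001_R3_001.fastq.gz \
    sample_S1_L001_R1_001.fastq.gz.md5
do
    if is_read_file "$name"; then
        echo "match:    $name"
    else
        echo "no match: $name"
    fi
done
```

The [12] character class accepts only R1 or R2, and the trailing $ rejects names that carry anything after .fastq.gz.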


0

In Bash, the =~ operator succeeds when the regular expression matches any part of the string, so it does not have to match the whole string. If you want a whole-string match, anchor the regular expressions in the if statements explicitly: ^.*_L.*_R1_001\.fastq$ and ^.*_L.*_R2_001\.fastq$.
