
I have the script below, which loops through GTEx tissue annotation files and runs the program shown. How do I give --out a unique output name for each file? I am lost on this and still fairly new to bash. The TISSUE files look like this: Brain_Frontal_Cortex_BA9.genes.annot and Lung.genes.annot. The GTExV8 folder contains 48 such files that TISSUE will loop through. For these two examples, the output names should be Brain_Frontal_Cortex_BA9_emagma and Lung_emagma.

#!/bin/bash

for TISSUE in GTExV8/*;
do
./magma \
--bfile g1000_eur \
--gene-annot "$TISSUE" \
--pval Summary_Statistics_GWAS_2016/ALSGWAS.txt ncol=Ne \
--gene-settings adap-permp=10000 \
--out LABEL WITH EACH TISSUE LOOPED
done
  • Hi Ted, Could you add an example below. I have 48 tissues with variable names like Brain_Frontal_Cortex_BA9.genes.annot and Lung.genes.annot. Would your example work? Commented Aug 4, 2020 at 5:54
  • Hi Ted, I fixed this, sorry for the confusion. I want to loop through each tissue file, and extract the unique name for my output name in out. Commented Aug 4, 2020 at 6:07
  • Hi Ted, add them to question, I want them to be Brain_Frontal_Cortex_BA9_emagma and Lung_emagma Commented Aug 4, 2020 at 6:18
  • Hi Ted, There are 48 tissues total, I am providing two examples with variable name lengths. Thanks for your help and editing, much better! Commented Aug 4, 2020 at 6:26
  • Do you generally want the string between the last slash and the first dot after it? Please try to state a precise requirement; edit your question to keep it self-contained rather than heaping extra information in comments. Commented Aug 4, 2020 at 6:29

2 Answers

2

Using Parameter Expansion you can extract the part of the $TISSUE variable that you'd like to use in your output filename.

for TISSUE in GTExV8/*;
do
    # remove the directory part:
    outfile="${TISSUE##*/}"

    # remove the file extension:
    outfile="${outfile%%.genes.annot}"

    # add a filename ending:
    outfile="${outfile}_emagma"

    # use $outfile:
    ./magma ... --gene-annot "$TISSUE" --out "$outfile"
done
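To check the expansion without running magma, the same steps can be wrapped in a small function and tried on the two file names from the question. This is only a sketch: out_name is a hypothetical helper name, and the loop prints the derived names instead of calling ./magma.

```bash
#!/bin/bash
# out_name is a hypothetical helper, not part of the answer or of magma.
# It turns an annotation path into the label intended for --out.
out_name() {
    local name=${1##*/}          # drop everything up to the last slash
    name=${name%.genes.annot}    # drop the trailing .genes.annot
    printf '%s_emagma\n' "$name"
}

# Dry run on the two example names from the question.
for TISSUE in GTExV8/Brain_Frontal_Cortex_BA9.genes.annot GTExV8/Lung.genes.annot; do
    printf '%s -> %s\n' "$TISSUE" "$(out_name "$TISSUE")"
done
```

In the real loop, the printf line would be replaced by the ./magma call with --out "$(out_name "$TISSUE")".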

1 Comment

In case anyone is wondering, the basename variant of this answer (instead of parameter expansion) is: for TISSUE in GTExV8/*; do ./magma ... --gene-annot "$TISSUE" --out "$(basename "$TISSUE" .genes.annot)_emagma"; done. Parameter expansion is built into bash and is therefore preferred; basename is an external executable, so it starts a new process on every iteration.
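For anyone who wants to confirm that the two approaches agree, here is a rough comparison on the example names from the question. Both function names are made up for this illustration, and it assumes a basename that accepts a suffix as its second argument, as the POSIX utility does.

```bash
#!/bin/bash
# Both helpers are hypothetical names used only for this comparison.

# Variant 1: bash parameter expansion, no external process.
name_with_expansion() {
    local name=${1##*/}
    printf '%s_emagma\n' "${name%.genes.annot}"
}

# Variant 2: the external basename utility with a suffix argument.
name_with_basename() {
    printf '%s_emagma\n' "$(basename "$1" .genes.annot)"
}

for path in GTExV8/Brain_Frontal_Cortex_BA9.genes.annot GTExV8/Lung.genes.annot; do
    if [ "$(name_with_expansion "$path")" = "$(name_with_basename "$path")" ]; then
        echo "same result for $path"
    fi
done
```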
0

Not sure what you mean, but if your intention is to generate different names for the --out files based on the $tissue file name, and assuming BA9 is the variable part and you want to name the output after it, you can do:

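#!/bin/bash
for tissue in GTExV8/*;
do
# Split the path on "_" and "." into array a. IFS is set for read only,
# so it does not leak into the word splitting of later commands.
IFS='_.' read -r -a a <<< "$tissue"
./magma \
--bfile g1000_eur \
--gene-annot "$tissue" \
--pval Summary_Statistics_GWAS_2016/ALSGWAS.txt ncol=Ne \
--gene-settings adap-permp=10000 \
--out "Amygdala_emagma_${a[3]}"
done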

This splits the $tissue name on _ and . into the array a, so that ${a[3]} is the fourth field (BA9 for GTExV8/Brain_Frontal_Cortex_BA9.genes.annot).
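To see what the fourth field looks like for differently shaped names, the split can be tried on the two examples from the question. This is a sketch only: fourth_field is a hypothetical helper, and IFS is scoped to the read command so the rest of the script is unaffected.

```bash
#!/bin/bash
# fourth_field is a hypothetical helper: it splits its argument on "_" and "."
# and prints the element at index 3 (empty if there are fewer than four fields).
fourth_field() {
    local a
    IFS='_.' read -r -a a <<< "$1"
    printf '%s\n' "${a[3]}"
}

for tissue in GTExV8/Brain_Frontal_Cortex_BA9.genes.annot GTExV8/Lung.genes.annot; do
    printf '%s: fourth field is "%s"\n' "$tissue" "$(fourth_field "$tissue")"
done
```

A name such as Lung.genes.annot has fewer than four fields, so the index-based approach only suits file names that all share the same number of _-separated parts.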

3 Comments

Hi Diego, I think I understand what your example does. The 48 tissues are named differently. Here is another example: Lung.genes.annot. Could this example still work?
Hi Diego, I would want to extract this whole portion Brain_Frontal_Cortex_BA9 for the output name.
Probably fix the quoting here in case some file names contain shell metacharacters. Keep in mind that your solution should work for future visitors as well, whose requirements may differ from the OP's. See also When to wrap quotes around a shell variable?
