2

I have the following structure:

.
├── dag_1
│   ├── dag
│   │   ├── current
│   │   └── deprecated
│   └── sparkjobs
│       ├── current
│       │   └── spark_3.py
│       └── deprecated
│           ├── spark_1.py
│           └── spark_2.py
└── dag_2
    ├── dag
    │   ├── current
    │   └── deprecated
    └── sparkjobs
        ├── current
        │   └── spark_3.py
        └── deprecated
            ├── spark_1.py
            └── spark_2.py

I want to create a new folder containing only the current Spark jobs. My expected output folder is:

.
├── dag_1
│   └── spark_3.py
└── dag_2
    └── spark_3.py

I've tried to use

find /mnt/c/Users/User/Test/ -type f -wholename "sparkjob/current" | xargs -i cp {} /mnt/c/Users/User/Test/output/

However, my command does not copy any files and returns no error. How can I solve this?

4 Answers

2

Use this. The install command copies the input file into another directory structure and, with -D, creates any missing parent directories, much like mkdir -p would:

(You need wildcards (*) in the -wholename pattern for find to actually match files.)

find . -type f -wholename "*/sparkjobs/current/*" -exec bash -c '
    # $1 is a path such as ./dag_1/sparkjobs/current/spark_3.py
    dir=${1#./} dir=${dir%%/*} file=${1##*/}   # dir=dag_1 file=spark_3.py
    install -D "$1" "./$dir/$file"
' bash {} \;

Example of what is done:

install -D ./dag_2/sparkjobs/current/spark_3.py ./dag_2/spark_3.py
install -D ./dag_1/sparkjobs/current/spark_3.py ./dag_1/spark_3.py

The source path is only an example; a longer path works the same way.

2

First you should check what find returns by removing everything after |. You'll see find doesn't find any files. The reasons:

  • as the name implies, -wholename matches the whole name, so you need */sparkjob/current/*
  • according to your tree output, the folder is not named sparkjob but sparkjobs.
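
Both points can be checked on a throwaway copy of the layout. This is a minimal sketch, assuming a scratch directory from mktemp; the variable names root, bad and good are only illustrative:

```shell
# Scratch copy of the layout from the question (the temp dir is an assumption).
root=$(mktemp -d)
for d in dag_1 dag_2; do
    mkdir -p "$root/$d/sparkjobs/current" "$root/$d/sparkjobs/deprecated"
    touch "$root/$d/sparkjobs/current/spark_3.py" \
          "$root/$d/sparkjobs/deprecated/spark_1.py" \
          "$root/$d/sparkjobs/deprecated/spark_2.py"
done

# Pattern from the question: no wildcards and a misspelled directory name,
# so it cannot match any complete path.
bad=$(find "$root" -type f -wholename "sparkjob/current" | wc -l | tr -d ' ')

# Pattern that covers the whole path and uses the real directory name.
good=$(find "$root" -type f -wholename "*/sparkjobs/current/*" | wc -l | tr -d ' ')

echo "original pattern: $bad match(es), corrected pattern: $good match(es)"
```

Because find prints nothing for the original pattern, xargs has no input to act on, which is why the pipeline copies nothing and reports no error.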

I'd start with something like this:

find /mnt/c/Users/User/Test/ -type f -wholename "*/sparkjobs/current/*" -print0 | while IFS= read -r -d '' file; do
    echo mv "$file" "$(realpath "$(dirname "$file")"/../..)"
done

I added an echo so you can check all paths and commands are correct.

You may want to trade simplicity for performance. See https://mywiki.wooledge.org/BashFAQ/001 if performance is important (many files or frequent runs).
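
If performance does matter, one option is to let find pass many paths to a single shell with -exec ... {} + instead of handling one file at a time. The following is only a sketch under assumptions: src and out are hypothetical scratch directories standing in for the real source and output folders, and it uses cp into a separate output folder rather than the mv shown above:

```shell
# Hypothetical scratch directories; replace with the real source and output paths.
src=$(mktemp -d)
out=$(mktemp -d)

# Sample layout matching the question.
for d in dag_1 dag_2; do
    mkdir -p "$src/$d/sparkjobs/current" "$src/$d/sparkjobs/deprecated"
    touch "$src/$d/sparkjobs/current/spark_3.py" \
          "$src/$d/sparkjobs/deprecated/spark_1.py" \
          "$src/$d/sparkjobs/deprecated/spark_2.py"
done

# One sh process receives a batch of paths; the first argument is the output dir.
find "$src" -type f -path '*/sparkjobs/current/*' -exec sh -c '
    out=$1; shift
    for f in "$@"; do
        dag_dir=${f%/sparkjobs/current/*}   # strip the tail, leaving .../dag_1
        dag=${dag_dir##*/}                  # keep only the last component: dag_1
        mkdir -p "$out/$dag"
        cp "$f" "$out/$dag/"
    done
' sh "$out" {} +
```

Passing the paths as arguments also keeps file names with spaces or newlines intact, since they never go through a pipe.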

2 Comments

This will break on files with spaces.
@GillesQuenot Yes, changed.

1

You'll want to do the following (with bash rather than sh, because the ${f/pattern/} substitution is not POSIX):

mkdir -p ../new_folder
find . -type f \
       -path '*/sparkjobs/current/*' \
       -exec bash -c 'f=$1
                    # drop the "sparkjobs/current/" part of the path
                    new=${f/sparkjobs\/current\//}
                    dest="../new_folder/$(dirname "$new")"
                    mkdir -p "$dest"
                    cp -v "$f" "$dest"' bash '{}' \;

Output:

‘./dag_1/sparkjobs/current/spark_3.py’ -> ‘../new_folder/./dag_1/spark_3.py’
‘./dag_2/sparkjobs/current/spark_3.py’ -> ‘../new_folder/./dag_2/spark_3.py’

1

This looks pretty straightforward, assuming old_loc and new_loc hold the source and destination directories:

for d in "$old_loc"/dag_*
do mkdir -p "$new_loc/${d##*/}"
   cp "$d"/sparkjobs/current/spark_*.py "$new_loc/${d##*/}/"
done
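
A runnable sketch of the same loop on scratch directories from mktemp, with a guard added for dag_* directories that have no current job (an unmatched glob would otherwise be passed to cp literally). The sample layout here is an assumption made for illustration:

```shell
# Scratch source and destination (assumptions; use the real paths instead).
old_loc=$(mktemp -d)
new_loc=$(mktemp -d)

# Sample data: dag_1 has a current job, dag_2 has none.
mkdir -p "$old_loc/dag_1/sparkjobs/current" "$old_loc/dag_2/sparkjobs/current"
touch "$old_loc/dag_1/sparkjobs/current/spark_3.py"

for d in "$old_loc"/dag_*; do
    for f in "$d"/sparkjobs/current/spark_*.py; do
        [ -e "$f" ] || continue          # glob matched nothing; skip it
        mkdir -p "$new_loc/${d##*/}"     # ${d##*/} is the dag directory name
        cp "$f" "$new_loc/${d##*/}/"
    done
done
```

With this guard, a destination directory is only created for a dag that actually has a current job.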
