1

I'm monitoring a folder which is receiving log files. For each log file received, I need to send it to a remote server via SCP. The SCP transfer is done by the transfer.sh script. Since I need to perform a transfer for each file, it's probable that a single file may delay other new files. I would like to "create" a new parallel process for each file in my directory.

MONITOR_FOLDER='/repository/'
PATTERN='log_*'

    for log_file in $MONITOR_FOLDER$PATTERN     
        do              
            echo "$(date +%c) monitor() Processing $log_file CDR file..."
            parallel --will-cite -n0 "sh transfer.sh $log_file 1" ::: {1..1}
        done

The expansion of $MONITOR_FOLDER$PATTERN can return zero or more files. When there is more than one file, I want to create a parallel process per file. The following command displays the correct list:

ls $MONITOR_FOLDER | grep 'log_*' 

Question:

1) How can I pass each entry as a parameter to my shell script and, at the same time, start a new process for it, without using the loop?

  • Why not just run the command in the background by appending & to it? Commented May 19, 2015 at 1:58
  • Your greatest bottleneck is the network. Opening a parallel connection to the same server won't alleviate that. (In fact, multiple connections will have more overhead.) Commented May 19, 2015 at 1:59
  • However, you could try gzipping the file locally, piping it through ssh, and then ungzipping it on the remote end. That may be faster or slower. YMMV. Commented May 19, 2015 at 2:00
  • @jpaugh You can just enable SSH compression. Commented May 19, 2015 at 2:01
  • See wiki.ncsa.illinois.edu/display/~wglick/Parallel+Rsync Commented May 19, 2015 at 2:02

2 Answers

1

I'm monitoring a folder which is receiving log files. For each log file received, I need to send it to a remote server via SCP. The SCP transfer is done by the transfer.sh script.

That part is easy:

MONITOR_FOLDER='/repository/'
PATTERN='log_*'

parallel -j0 'echo "$(date +%c) monitor() Processing {} CDR file..."; sh transfer.sh {} 1' ::: $MONITOR_FOLDER$PATTERN

Or:

ls "$MONITOR_FOLDER" | grep '^log_' | parallel -j0 'echo "$(date +%c) monitor() Processing {} CDR file..."; sh transfer.sh '"$MONITOR_FOLDER"'{} 1'

Since I need to perform a transfer for each file, it's probable that a single file may delay other new files. I would like to "create" a new parallel process for each file in my directory.

This is also easy if you accept that a file may be copied more than once and that as many scp processes will run as there are files. Simply append & to the command:

MONITOR_FOLDER='/repository/'
PATTERN='log_*'

for log_file in $MONITOR_FOLDER$PATTERN       
    do              
        echo "$(date +%c) monitor() Processing $log_file CDR file..."
        sh transfer.sh "$log_file" 1 &
    done
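
As a minimal sketch of the same idea, the loop below starts one background job per file and then uses wait so the script does not exit before the jobs finish. The function transfer_stub and the temporary directory are hypothetical stand-ins for `sh transfer.sh` and /repository/, so the sketch can run without a remote server.

```shell
# Hypothetical stand-in for /repository/ so the sketch is safe to run.
work_dir=$(mktemp -d)
sent_log="$work_dir/sent.txt"

# Hypothetical stand-in for `sh transfer.sh FILE 1`; it only records the name.
transfer_stub() {
    sleep 0.1                      # pretend the copy takes some time
    echo "$1" >> "$sent_log"
}

touch "$work_dir/log_a" "$work_dir/log_b" "$work_dir/log_c"

for log_file in "$work_dir"/log_*; do
    transfer_stub "$log_file" &    # one background process per file
done
wait                               # block until every background job is done

bg_sent_count=$(wc -l < "$sent_log" | tr -d ' ')
echo "transferred $bg_sent_count files"
rm -rf "$work_dir"
```

Without the wait, the script could return while transfers are still running, which matters if it is called from cron or from another script.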

Now it gets more tricky if:

  • You want at most 12 scp processes running at the same time
  • You want to copy each file only once

But you can probably use this: http://www.gnu.org/software/parallel/man.html#EXAMPLE:-GNU-Parallel-as-dir-processor

inotifywait -q -m -r -e MOVED_TO -e CLOSE_WRITE --format %w%f $MONITOR_FOLDER |\
grep --line-buffered 'log_' | parallel -j12 'echo "$(date +%c) monitor() Processing {} CDR file..."; sh transfer.sh {} 1'

It will just sit there waiting for a new file to be written. So if you want to stop it, you will have to kill it.
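
If inotifywait or GNU parallel is not available, the two constraints can be approximated differently. The sketch below is an assumption-laden alternative, not the method above: it claims each file by renaming it into a holding directory (a rename within one filesystem is atomic, so a later pass cannot pick the same file up again) and caps concurrency with xargs -P, which GNU and BSD xargs support. The directory names are hypothetical, and `echo sent` stands in for `sh transfer.sh FILE 1`.

```shell
monitor_dir=$(mktemp -d)              # hypothetical stand-in for /repository/
claimed_dir="$monitor_dir/claimed"    # hypothetical holding area for claimed files
mkdir "$claimed_dir"
touch "$monitor_dir/log_1" "$monitor_dir/log_2" "$monitor_dir/log_3"

# Claim each file by moving it out of the monitored folder.
for log_file in "$monitor_dir"/log_*; do
    [ -e "$log_file" ] || continue    # pattern stays literal when nothing matches
    mv "$log_file" "$claimed_dir/"
done

# Run at most 12 jobs at once; `echo sent` stands in for the real transfer.
sent_lines=$(find "$claimed_dir" -type f -name 'log_*' | xargs -n 1 -P 12 echo sent)
claimed_sent=$(printf '%s\n' "$sent_lines" | wc -l | tr -d ' ')
left_behind=$(find "$monitor_dir" -maxdepth 1 -type f -name 'log_*' | wc -l | tr -d ' ')
echo "sent $claimed_sent files, $left_behind left in the monitored folder"
rm -rf "$monitor_dir"
```

Unlike the inotifywait pipeline, this processes only the files present when it runs, so it would have to be repeated (for example from cron) to pick up new arrivals.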

0

I think the problem is in your code:

 for log_file in $MONITOR_FOLDER$PATTERN

Please review how the for loop works in the shell.

For example:

 for i in 1 2 3 4 5     # iterates over the five words 1 to 5

but

  for i in $VAR    # iterates over the words that $VAR expands to

Thus, in your case, when no file matches the pattern, the variable log_file gets the literal value /repository/log_* rather than a file name.

To make your code work, you could write:

 for log_file in `ls $MONITOR_FOLDER$PATTERN`
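
As a rough sketch of how the unquoted pattern behaves in the loop: the shell expands it to the matching names and leaves it as a literal string only when nothing matches, so a simple existence test handles the empty case without parsing ls output. The temporary directory and the count_matches function are hypothetical and exist only for this illustration.

```shell
monitor_dir=$(mktemp -d)     # hypothetical stand-in for /repository/
pattern='log_*'

# Count the files the loop would actually process.
count_matches() {
    count=0
    for log_file in "$monitor_dir"/$pattern; do
        # With no match the pattern stays literal, so skip names that do not exist.
        [ -e "$log_file" ] || continue
        count=$((count + 1))
    done
    echo "$count"
}

empty_count=$(count_matches)                       # folder is still empty
touch "$monitor_dir/log_1" "$monitor_dir/log_2"
filled_count=$(count_matches)                      # two files now match
echo "before: $empty_count, after: $filled_count"
rm -rf "$monitor_dir"
```

In bash, `shopt -s nullglob` is another way to make a non-matching pattern expand to nothing.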
