0

I'm encountering a very strange thing while writing a shell script.

Initially I have the following script.

OUTPUT_FOLDER=/home/user/output
HOST_FILE=/home/user/host.txt
LOG_DIR=/home/user/log
SLEEPTIME=60

mkdir -p $OUTPUT_FOLDER

while true
do
    rm -f $OUTPUT_FOLDER/*
    DATE=`date +"%Y%m%d"`
    DATETIME=`date +"%Y%m%d_%H%M%S"`
    pssh -h $HOST_FILE -o $OUTPUT_FOLDER
    while read host
    do
        numline=0
        while read line
        do
            if [[ $numline == 0 ]]
            then
                FIELD1=`echo $line | cut -d':' -f2 | cut =d' ' -f2`
            fi
            ((numline+=1))
        done < ${OUTPUT_FOLDER}/${host}
        echo "$DATETIME,$FIELD1" >> ${LOG_DIR}/${DATE}.log
    done < $HOST_FILE
    sleep $SLEEPTIME
done

When I ran this script, every 60 seconds, I'll see the $DATETIME,$FIELD1 values in my log file.

What is very strange, is that every 30 seconds or so, after the first minute has passed, I'll see 20160920_120232,, meaning there was output where there shouldn't be, and I guess because I deleted the contents of my output folder, FIELD1 is empty.

What is even more strange, is while trying to debug this, I added a few more echo statements to print to my log file, and then I deleted those lines. However, they continued to be printed every 30 seconds or so, after the first minute has passed.

What is stranger still is that I then commented out everything inside the while true block, that is,

OUTPUT_FOLDER=/home/user/output
HOST_FILE=/home/user/host.txt
LOG_DIR=/home/user/log
SLEEPTIME=60

mkdir -p $OUTPUT_FOLDER

while true
do
: << 'END'
    rm -f $OUTPUT_FOLDER/*
    DATE=`date +"%Y%m%d"`
    DATETIME=`date +"%Y%m%d_%H%M%S"`
    pssh -h $HOST_FILE -o $OUTPUT_FOLDER
    while read host
    do
        numline=0
        while read line
        do
            if [[ $numline == 0 ]]
            then
                FIELD1=`echo $line | cut -d':' -f2 | cut =d' ' -f2`
            fi
            ((numline+=1))
        done < ${OUTPUT_FOLDER}/${host}
        echo "$DATETIME,$FIELD1" >> ${LOG_DIR}/${DATE}.log
    done < $HOST_FILE
    sleep $SLEEPTIME
END
done

Even with this script, where I'm expecting nothing to be printed to my log file, I see my previous echo statements that I have deleted, and the lines with the empty FIELD1. I have checked that I'm running the correct version of the script each time.

What is going on?

4
  • 2
    Can there be any background process you are not aware of? Commented Sep 20, 2016 at 10:36
  • Did you restarted the process after you made changes? And made sure the old process is really not running? Commented Sep 20, 2016 at 11:21
  • fuser -k /path/to/file, to ensure that everything writing to that file is dead, is your friend. Or just fuser /path/to/file, and making sure the list of processes with handles on it is what you expect. Commented Sep 21, 2016 at 1:01
  • Beyond that, right now this question is very much suffering from the lack of a reproducer -- if nobody else can actually see the problem, nobody can tell if their guesses are correct or if their proposed fix actually works. If still in a place of trying to pin down what's going on even after using fuser, I might pull out sysdig with a query to process all writes to the relevant file across the operating system (vs strace tracking only one process with very high performance overhead). Commented Sep 21, 2016 at 1:05

2 Answers 2

2

Am not sure, it could be a the actual reason for the screw-up. You have an incorrect usage of cut in line number 22 which could have mangled the FIELD1 inadvertently.

FIELD1=`echo $line | cut -d':' -f2 | cut =d' ' -f2`
#                                        ^ incorrect usage of the delimiter '-d' flag

which should have been used as

FIELD1=`echo $line | cut -d':' -f2 | cut -d' ' -f2`
#                                        ^ Proper usage of the '-d' flag with space delimiter

I tried to capture the piped command output to see if the last command could have succeeded and below was my observation.

echo $line | cut -d':' -f2 | cut =d' ' -f2;echo "${PIPESTATUS[@]}"
cut: =d : No such file or directory     # --> Error from the failed 'cut' command
0 0 1                                   # --> Return codes of each of the command being piped

Also, if you have been a bash purist, you have avoided the legacy `` style command expansion and used a $(cmd) syntax like,

FIELD1=$(echo "$line" | cut -d':' -f2 | cut -d' ' -f2)

Another simpler way would have been to avoid use of cut & echo and do pure bash string-manipulation.

Assuming your $line contains : and de-limited string e.g."What Hell:Hello World" and you are trying to extract the World part as I could see, just do

FIELD1="${line##* }"   # Strips-off characters until last occurrence of whitespace
Sign up to request clarification or add additional context in comments.

Comments

1

I tried to reproduce it and couldn't. Thre is no reason for this script (when commented as you have) to produce any output.

There are couple of cases:

  1. As others pointed out, there may be some other script writing to your log file
  2. You may have left few instances of this script running on other terminals
  3. You may be running this script from crontab once every 5 mins

In order to rule out possibilities, do a "ps -aef" and check out all the processes that could resemble the name of your script. To catch things from crontab, you may have to watch the output of "ps -aef" a big longer (or just check in the crontab entries).

1 Comment

Thanks! Turns out, I did have a few instances of this script running in the background. Once I killed them, the output is as expected.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.