
I have a text file named raw.txt with something like the following:

T DOTTY CRONO 52/50 53/40 54/30 55/20 RESNO NETKI
U CYMON DENDU 51/50 52/40 53/30 54/20 DOGAL BEXET
V YQX KOBEV 50/50 51/40 52/30 53/20 MALOT GISTI
W VIXUN LOGSU 49/50 50/40 51/30 52/20 LIMRI XETBO
X YYT NOVEP 48/50 49/40 50/30 51/20 DINIM ELSOX
Y DOVEY 42/60 44/50 47/40 49/30 50/20 SOMAX ATSUR
Z SOORY 43/50 46/40 48/30 49/20 BEDRA NERTU
A DINIM 51/20 52/30 50/40 47/50 RONPO COLOR
B SOMAX 50/20 51/30 49/40 46/50 URTAK BANCS
C BEDRA 49/20 50/30 48/40 45/50 VODOR RAFIN
D ETIKI 48/15 48/20 49/30 47/40 44/50 BOBTU JAROM
E 46/40 43/50 42/60 DOVEY
F 45/40 42/50 41/60 JOBOC
G 43/40 41/50 40/60 SLATN

I'm reading it into an array:

while read line; do
    set $line
    IFS=' ' read -a array <<< "$line"
done < raw.txt

I'm trying to replace every occurrence of [A-Z]{5} with a curl result, where the string matched by [A-Z]{5} is passed as a variable into the curl call.

First match to be replaced would be DOTTY. The call looks similar to curl -s http://example.com/api_call/DOTTY and the result is something like -55.5833 50.6333 which should replace DOTTY in the array.

So far I have been unable to match the desired string correctly and feed the match into curl.

Your help is greatly appreciated.

All the best, Chris

EDIT:

Solution

Working solution based on @Kevin's extensive answer and @Floris's hint about a possible carriage return in the curl result. This was indeed the case. Thank you! Combined with some tinkering on my side, I now got it to work.

#!/bin/bash
while read -r line; do
    # split the line into an array of words
    IFS=' ' read -r -a array <<< "$line"
    i=0
    for str in "${array[@]}"; do
        # anchored so that only exact five-letter words match
        if [[ "$str" =~ ^[A-Z]{5}$ ]]; then
            curl_tmp=$(curl -s "http://example.com/api_call/$str")
            # cut off the carriage return
            curl=${curl_tmp/$'\r'}
            # insert at given index
            array[$i]="$curl"
        fi
        let i++
    done
    # write to file, one array element per line
    for index in "${array[@]}"; do
        echo "$index"
    done >> "$WORK_DIR/nats.txt"
done < raw.txt
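
To try the replacement logic without network access, here is a minimal sketch of the same idea with the curl call swapped for a stub. The names lookup_fix and replace_fixes and the canned coordinates are hypothetical, chosen only for illustration; in real use lookup_fix would wrap the curl call shown above.

```shell
#!/bin/bash

# Hypothetical stand-in for: curl -s "http://example.com/api_call/$1"
# It returns canned coordinates with a trailing carriage return,
# mimicking the response described in the question.
lookup_fix() {
    printf '%s\r' "-55.5833 50.6333"
}

# Split a line into the global array "array" and replace every
# exact five-letter uppercase word with the looked-up value.
replace_fixes() {
    local i result
    IFS=' ' read -r -a array <<< "$1"
    for i in "${!array[@]}"; do
        if [[ "${array[i]}" =~ ^[A-Z]{5}$ ]]; then
            result=$(lookup_fix "${array[i]}")
            # strip the carriage return before storing the value
            array[i]=${result//$'\r'/}
        fi
    done
}

replace_fixes "V YQX KOBEV 50/50 MALOT"
printf '%s\n' "${array[@]}"
```

Iterating over the indices with "${!array[@]}" avoids the separate counter, and each coordinate pair stays a single array element even though it contains a space.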
2
  • What do you want to do with the output? How complex is the curl request? Have you gotten anywhere with the bash script yet? In principle, "replace $a with $b" is a simple sed "s/$a/$b/" statement, so if you can get the "thing to replace" into $a and the "output of curl" into $b, you're done. Something like b=$(curl $options mysite?$a) might do it... Commented Jan 14, 2014 at 0:26
  • Glad to hear you were able to figure it out! Commented Jan 14, 2014 at 16:50

4 Answers

2

I didn't change anything about your script except add the matching part, since it seems that's what you're needing help on:

#!/bin/bash
while read line; do
        set $line
        IFS=' ' read -a array <<< "$line"
        for str in ${array[@]}; do
                if [[ "$str" =~ [A-Z]{5} ]]; then
                        echo curl "http://example.com/api_call/$str"
                fi
        done
done < raw.txt

EDIT: Added the URL example you provided, with the variable in the URI. You can do whatever you need with the fetched output by changing it to do_something "$(curl ...)".

EDIT2: Since you're wanting to maintain the bash array you create from each line, how about this:

I'm not great at bash when it comes to arrays, so I expect someone to call me out on it, but this should work.

I've left some echos there so you can see what it's doing. The shift commands are to push the array index from the current location when the regex matches. The tmp variable to hold your curl output could probably be improved, but this should get you started, I hope.

removed temporarily to avoid confusion

EDIT3: Oops the above didn't actually work. My mistake. Let me try again here.

EDIT4:

#!/bin/bash
while read line; do
        set $line
        IFS=' ' read -a array <<< "$line"
        i=0
        # echo ${array[@]} below is just so you can see it before processing.  You can remove this
        echo "Array before processing: ${array[@]}"
        for str in ${array[@]}; do
                if [[ "$str" =~ [A-Z]{5} ]]; then
                        # replace the echo command below with your curl command
                        # ie - curl="$(curl http://example.com/api_call/$str)"
                        curl="$(echo 1234 -1234)"
                        if [[ "$flag" = "1" ]]; then
                                array=( ${adjustedArray[@]} )
                                push=$(( $push + 2 ));
                                let i++
                        else
                                push=1
                        fi
                        adjustedArray=( ${array[@]:0:$i} ${curl[@]} ${array[@]:$(( $i + $push)):${#array[@]}} )
                        #echo "DEBUG adjustedArray in loop: ${adjustedArray[@]}"
                        flag=1;
                fi
                let i++
        done
        unset flag
        echo "final: ${adjustedArray[@]}"
        # do further processing here
done < raw.txt

I know there's a smarter way to do this than the above, but we're getting into areas in bash where I'm not really suited to give advice. The above should work, but I'm hoping someone can do better.

Hope it helps, anyway

P.S. You should probably not use a shell script for this unless you really need to. Perl, PHP, or Python would make the code simple and readable.


4 Comments

The output of the curl needs to be substituted into the input string in place of the match of [A-Z]{5}. There are multiple matches in each line, so you have to iterate over them...
@Floris I see now. I missed that part.
@Kevin - Your example is quite close to what I'm trying to achieve – my problem now is how to replace the matched string in the array with the curled output?
@Chris did you check my other answer using just sed?
2

Since I misread the first time:

How about just using sed?

sed "s/\([A-Z]\{5\}\)/$(echo curl http:\\/\\/example.com\\/api_call\\/\\1)/g" /tmp/raw.txt

Try that, then try removing the echo. I'm not 100% sure about this, since I can't run it against the real domain.

EDIT: To be clear, the echo is only there so you can see what the command will do once the echo is removed.
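
One caveat: the shell expands $( ... ) once, before sed starts, so with the echo removed curl would run a single time with a literal \1 in the URL rather than once per match. A minimal sketch of a per-match alternative follows, assuming bash. fetch_coords is a hypothetical stub standing in for curl -s http://example.com/api_call/NAME, and its output format is made up for the demo.

```shell
#!/bin/bash

# Hypothetical stub standing in for: curl -s "http://example.com/api_call/$1"
fetch_coords() {
    echo "<coords-of-$1>"
}

# Replace each standalone five-letter uppercase word in a line,
# calling fetch_coords once per match.
replace_line() {
    local line=$1 name coords
    # -o prints each match on its own line, -w requires whole words
    for name in $(grep -Eow '[A-Z]{5}' <<< "$line"); do
        coords=$(fetch_coords "$name")
        # bash pattern substitution replaces the first occurrence
        line=${line/$name/$coords}
    done
    printf '%s\n' "$line"
}

replace_line "E 46/40 43/50 42/60 DOVEY"
```

The result is still a single string per line, so it would need to be split again if an array is required afterwards.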

5 Comments

the coordinates returned from curl are space separated which makes it difficult splitting the lines into an array further down the road.
Isn't that what you want though? "the result is something like -55.5833 50.6333 which should replace DOTTY in the array."
So I guess you want to maintain an array for further manipulation?
Yes, I'd like to maintain the array. I just want to swap the value in the array where [A-Z]{5} matches. I'm under the impression that my English is not up to the task. My apologies.
This is actually a very nice answer, although apparently it doesn't quite give Chris what he wanted. I am going to mark it as "useful" nonetheless.
2

create a file cmatch:

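#!/bin/bash

while read -r line
do
  # collect every standalone five-letter uppercase word on this line
  a=$(echo "$line" | grep -Eo '\b[A-Z]{5}\b')
  for v in $a
  do
   # progress message goes to stderr so it stays out of the output file
   echo "doing curl to replace $v in $line" >&2
   r=$(curl -s "http://example.com/api_call/$v")
   # join the curl result onto a single line
   r1=$(echo $r | xargs echo)
   # the sed expression is quoted so the space in the result does not split it
   line=$(echo "$line" | sed "s/$v/$r1/")
  done
  # print the line after all replacements have been made
  echo "$line"
done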

then call it with

chmod 755 cmatch
./cmatch < inputfile.txt > outputfile.txt

It will do what you asked

Notes:

  • the \b before and after the [A-Z]{5} ensures that ABCDEFG (which is not a five-letter word) will not match.
  • using egrep -o produces a list of matches, one per line
  • I loop over this list to allow the replacement of multiple matches in a line
  • I update the line for each match found using the result of the curl call
  • to keep code clean, I assign the result of the curl to an intermediate variable
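
To see the effect of the \b anchors from the first note, here is a small sketch comparing the pattern with and without them. The sample string is made up, and \b is an extension known to work in GNU grep; its behaviour in other grep implementations is an assumption.

```shell
#!/bin/bash

sample="A DINIM ABCDEFG 51/20 RONPO"

# Without \b, the first five letters of ABCDEFG are reported as a match.
unanchored=$(echo "$sample" | grep -Eo '[A-Z]{5}')

# With \b, only standalone five-letter words are reported.
anchored=$(echo "$sample" | grep -Eo '\b[A-Z]{5}\b')

# The variables are left unquoted on purpose so the matches print on one line.
echo "unanchored:" $unanchored
echo "anchored:  " $anchored
```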

Edit: I just saw the comments about arrays. I suggest taking the output of this script and converting it to an array if you want to do further manipulation...

More edits: If your curl command returns a multi-line string (which would explain the error you see), you can use the new line I introduced in the script to remove the newlines (essentially stringing all the arguments together):

echo $r | xargs echo

passes the whitespace-separated words of the result to echo as arguments, so the output ends up on a single line without the embedded line breaks. It's a fun way of getting rid of carriage returns.
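
As a more explicit alternative to the xargs trick, the unwanted characters can be deleted with tr or with bash parameter expansion (the latter is what the asker's final script does for the carriage return). A minimal sketch, where the raw value is a made-up stand-in for a curl response ending in a carriage return and a newline:

```shell
#!/bin/bash

# Made-up response: coordinates followed by a carriage return and a newline.
raw=$'-55.5833 50.6333\r\n'

# Option 1: delete CR and LF characters with tr.
clean_tr=$(printf '%s' "$raw" | tr -d '\r\n')

# Option 2: bash parameter expansion, no external process needed.
clean_pe=${raw//$'\r'/}
clean_pe=${clean_pe//$'\n'/}

# Brackets make any leftover invisible characters easier to spot.
printf '[%s]\n' "$clean_tr" "$clean_pe"
```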

3 Comments

Thank you for your script, but something doesn't play along nicely: $ ./label-replace.sh < raw.txt > output.txt sed: 1: "s/DOTTY/-55.5833": unterminated substitute in regular expression and this goes on for every match.
Could it be that your curl command doesn't return a single line but that it has a carriage return in it? That wasn't clear from your question but would explain the issue.
Did you ever try using my updated answer? I see that the speculation about carriage return was correct... The code edit I made after that actually took that into account.
0
#!/bin/bash


while read line;do
  set -- $line
  echo "second parm is $2"
  echo "do your curl here"
 done < afile.txt

5 Comments

He does not want the second parameter. He wants the first parameter that matches [A-Z]{5}.
The first parm is a single char; he needs to clarify his question.
Upon rereading, I fail English too. He wants all parameters that match [A-Z]{5}, and states so clearly in the question. If his curl call reversed characters, the output for the first line should be T YTTOD ONORC 52/50 53/40 54/30 55/20 ONSER IKTEN, with parameters 2, 3, 8 and 9 replaced.
@Amadan - that's how I read the question as well. So there are two problems: loop over all the matches in turn; make the curl call; replace the string with the result of the curl. Repeat for all matches. Repeat for all lines.
@Floris and @Amadan – the two-digit-slash-two-digit params are geographic coordinates which need reformatting by removing the slash. I left the slash in there for easier splitting into an array. As I now have all the params separated, I can start replacing the [A-Z]{5} (those are Fix-Names) with their coordinates as returned by the URI mentioned above.
