Using awk output for conditional statement

Question

I have some data with the following format:

 2     1  500  500  500
 3     1  500  500  500
 6     1  500  500  500
 8     1  500  500  500
 9     1  500  500  500
11     1  500  500  500
12     1  500  500  500
14     1  500  500  500
15     1  500  500  500
16     1  500  500  500
17     1  500  500  500
20     1  500  500  500
21     1  500  500  500
23     1  500  500  500
24     1  500  500  500
25     1  500  500  500
27     1  500  500  500
30     1  500  500  500
31     1  500  500  500
32     1  500  500  500
33     1  500  500  500
34     1  500  500  500
35     1  500  500  500
38     1  500  500  500
40     1  500  500  500
41     1  500  500  500
43     1  500  500  500
44     1  500  500  500
46     1  500  500  500
47     1  500  500  500

I want to change the 500 values to 100 only on the lines whose first column is between 11 and 40. For now I'm doing something like:

Numbers=($(seq 11 1 40))
File=filename.txt
for i in ${Numbers[*]}
do
        if [ $i == awk '{print $1}' $File ];then
        NumberLine=$(grep -n $i $File | cut -d : -f 1)
        sed -i "${NumberLine}s/500/100/" $File
        fi
done

Individually, each line seems to do what I want, but when I put them in the loop, I get the following error:

./changeRestraints.sh: line 5: [: too many arguments

I suspect that this has to do with my awk as a part of my conditional statement. How can I fix this to make this script run?

Thank you,

Numbers=($(seq 11 1 40)) creates an array, so a plain $Numbers expands only to its first element, 11. If you don't need an array, Numbers=$(seq 11 1 40) is enough.
– Luuk, Commented Apr 28, 2022 at 12:04

tshiono · Accepted Answer · 2022-04-28 23:24:26Z

5

Invoking awk and sed repeatedly inside a loop is inefficient. Please try:

awk '$1>10 && $1<=40 {gsub(/\<500\>/, "100")} 1' filename.txt

Please note that the regex operators \< and \> are GNU awk extensions that match word boundaries.

[EDIT]
The gsub() function is a variant of sub() that replaces every occurrence of the matched string, while sub() replaces only the first match. The relationship between sub() and gsub() is similar to that of s/regex/repl/ and s/regex/repl/g in sed.
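
A small sketch of that difference, using one sample line from the question. The variable names first_only and all_matches are only for illustration, and the word-boundary operators are left out so the snippet does not depend on GNU awk:

```shell
line='11     1  500  500  500'

# sub() replaces only the first match on the line
first_only=$(printf '%s\n' "$line" | awk '{ sub(/500/, "100") } 1')

# gsub() replaces every match on the line
all_matches=$(printf '%s\n' "$line" | awk '{ gsub(/500/, "100") } 1')

printf '%s\n' "$first_only" "$all_matches"
# 11     1  100  500  500
# 11     1  100  100  100
```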

If you want to use variables for the embedded numbers, you can make use of -v varname=value mechanism which can assign awk variables via the command line option:

#!/bin/bash

start=11    # bash variables
stop=40
from=500
to=100
awk -v start="$start" -v stop="$stop" -v from="$from" -v to="$to" '$1>=start && $1<=stop {gsub("\\<" from "\\>", to)} 1' filename.txt

When assigning -v start="$start", the left-hand start is an awk variable name and the right-hand "$start" is a bash variable. We can use the same name for both (although it may look confusing). You can also assign the variable an immediate value, such as -v start=11.
As we cannot embed an awk variable in the /regex/ literal syntax, we need to write "\\<" from "\\>" instead. The whitespace in between is just for readability; it is equivalent to "\\<"from"\\>".
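
Since awk prints to standard output rather than editing the file, one way to get the effect of sed -i is to write to a temporary file and move it into place. The following is only a sketch under some assumptions: the sample file and its three lines are made up for the demonstration, and the word-boundary operators are dropped for portability, so a value such as 1500 would also be affected.

```shell
#!/bin/bash
start=11; stop=40; from=500; to=100

# Hypothetical sample file, created only for this demonstration
tmpdir=$(mktemp -d)
file="$tmpdir/filename.txt"
printf '%s\n' ' 9     1  500  500  500' \
              '11     1  500  500  500' \
              '41     1  500  500  500' > "$file"

# Write the result to a temporary file, then replace the original
awk -v start="$start" -v stop="$stop" -v from="$from" -v to="$to" '
    $1 >= start && $1 <= stop { gsub(from, to) }  # dynamic regex, no word boundaries
    1                                             # print every line
' "$file" > "$file.tmp" && mv "$file.tmp" "$file"

result=$(cat "$file")
printf '%s\n' "$result"
rm -rf "$tmpdir"
```

Only the line whose first column is 11 should come out changed; the lines for 9 and 41 stay as they were.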

edited Apr 28, 2022 at 23:24

answered Apr 28, 2022 at 3:01

tshiono


3 Comments

German Barcenas Over a year ago

Technically you didn't solve the original question, but your solution was the easiest to implement and accomplished what I've set out to do originally. Thank you! I didn't know about gsub. I did see some people discuss sub when I was looking around for how to do this. Why use gsub vs sub in this case?

German Barcenas Over a year ago

Also, a quick question: how can I make the numbers inside the awk command variables? Currently I'm trying to let a user change the 11-40 range, and possibly the 500 and 100 as well, but those are less important. I'm thinking of something like: awk "$1>="$Start" && $1<="$Stop" {gsub(/\<1000\>/, "500")} 1" filename.txt

German Barcenas Over a year ago

THANK YOU SO MUCH! I have never quite understood the -v flag in awk. Your edit makes so much sense, and teaches me a lot for future applications as well.

user1934428 · Accepted Answer · 2022-04-28 07:53:21Z

2

You never actually execute awk. You compare $i to the string awk, and then the test command finds additional arguments which it can't handle. Therefore you get the too many arguments error.

You need to run awk in order to get its output, for instance with command substitution:

if [ "$i" = "$(awk ... )" ]
then
  ...
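
A minimal sketch of such a test, assuming a small made-up sample file. Here awk is asked for a single value (the first column of the matching line, if any), so the comparison has exactly one word on each side:

```shell
File=$(mktemp)   # hypothetical sample file for the demonstration
printf '%s\n' ' 9     1  500  500  500' '11     1  500  500  500' > "$File"

i=11
# $(...) runs awk and substitutes its output; the quotes keep it a single argument
if [ "$i" = "$(awk -v n="$i" '$1 == n { print $1; exit }' "$File")" ]; then
    status="found"
else
    status="missing"
fi
echo "$i: $status"
# 11: found
rm -f "$File"
```

Without the $(...), the words awk, '{print $1}' and the file name are passed to [ as separate arguments, which is what triggers the "too many arguments" error.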

answered Apr 28, 2022 at 7:53

user1934428


1 Comment

German Barcenas Over a year ago

You technically answered the original question, but it turns out my code is flawed as pointed out by some others.

The fourth bird · Accepted Answer · 2022-04-28 08:47:31Z

You could also use sed, where the pattern ^(40|[23][0-9]|1[1-9])[[:space:]] matches a number from 11 to 40 at the start of the line, followed by a whitespace character.

sed -E '/^(40|[23][0-9]|1[1-9])[[:space:]]/s/500/100/g' file

Output

 2     1  500  500  500
 3     1  500  500  500
 6     1  500  500  500
 8     1  500  500  500
 9     1  500  500  500
11     1  100  100  100
12     1  100  100  100
14     1  100  100  100
15     1  100  100  100
16     1  100  100  100
17     1  100  100  100
20     1  100  100  100
21     1  100  100  100
23     1  100  100  100
24     1  100  100  100
25     1  100  100  100
27     1  100  100  100
30     1  100  100  100
31     1  100  100  100
32     1  100  100  100
33     1  100  100  100
34     1  100  100  100
35     1  100  100  100
38     1  100  100  100
40     1  100  100  100
41     1  500  500  500
43     1  500  500  500
44     1  500  500  500
46     1  500  500  500
47     1  500  500  500
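
One way to convince yourself that the pattern covers exactly 11 through 40 is to run it against a generated list of numbers. This is only a sketch: the trailing " x" is dummy text added so that each number is followed by whitespace, as in the real data.

```shell
pattern='^(40|[23][0-9]|1[1-9])[[:space:]]'

# Numbers 1..50, each followed by a space and a dummy column
matched=$(seq 1 50 | awk '{ print $1 " x" }' | grep -E "$pattern" | awk '{ print $1 }')

first=$(printf '%s\n' "$matched" | head -n 1)
last=$(printf '%s\n' "$matched" | tail -n 1)
count=$(printf '%s\n' "$matched" | wc -l | tr -d ' ')

echo "matched $count numbers, from $first to $last"
# matched 30 numbers, from 11 to 40
```

The three alternatives split the range by tens: 1[1-9] covers 11-19, [23][0-9] covers 20-39, and 40 is listed on its own.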

Luuk · Accepted Answer · 2022-04-28 12:12:52Z

1

Another way to do it in gawk (the regex is anchored so that values such as 111 or 400 are not matched):

gawk '$1~/^(1[1-9]|[23][0-9]|40)$/{ $3=$4=$5=100 }1' file

To preserve the original spacing:

gawk '$1~/^(1[1-9]|[23][0-9]|40)$/{ gsub(/\<500\>/,"100") }1' file
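
The spacing differs because assigning to a field makes awk rebuild the record with OFS (a single space by default), whereas gsub() edits the record text in place. A short sketch on one sample line, without the range test and without the GNU word-boundary operators:

```shell
line='11     1  500  500  500'

# Field assignment: the record is rebuilt, runs of spaces collapse to one
rebuilt=$(printf '%s\n' "$line" | awk '{ $3 = $4 = $5 = 100 } 1')

# gsub() on the whole record: the original spacing is kept
preserved=$(printf '%s\n' "$line" | awk '{ gsub(/500/, "100") } 1')

printf '%s\n' "$rebuilt" "$preserved"
# 11 1 100 100 100
# 11     1  100  100  100
```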

answered Apr 28, 2022 at 12:12

Luuk


2 Comments

German Barcenas Over a year ago

Can you explain about how that gawk command works? What does [23] mean here for example?

Luuk Over a year ago

regex101 has a good explanation of regular expressions; see regex101.com/r/Y7oZva/1 (the explanation is in the upper right corner).

ufopilot · Accepted Answer · 2022-04-28 12:49:27Z

1

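With sed, using an address range from the first line starting with 11 to the next line starting with 40 (this relies on the file being sorted and containing both lines):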

sed -i '/^11/,/^40/s/500/100/g' filename.txt

answered Apr 28, 2022 at 12:49

ufopilot


RARE Kpop Manifesto · Accepted Answer · 2022-04-29 00:44:16Z

1

Maybe something like this:

mawk 'gsub("500", (_=+$1)<11 || (40<_) ? "&" : "100")'

 2     1  500  500  500
 3     1  500  500  500
 6     1  500  500  500
 8     1  500  500  500
 9     1  500  500  500
11     1  100  100  100
12     1  100  100  100
14     1  100  100  100
15     1  100  100  100
16     1  100  100  100
17     1  100  100  100
20     1  100  100  100
21     1  100  100  100
23     1  100  100  100
24     1  100  100  100
25     1  100  100  100
27     1  100  100  100
30     1  100  100  100
31     1  100  100  100
32     1  100  100  100
33     1  100  100  100
34     1  100  100  100
35     1  100  100  100
38     1  100  100  100
40     1  100  100  100
41     1  500  500  500
43     1  500  500  500
44     1  500  500  500
46     1  500  500  500
47     1  500  500  500

If you prefer an FS + OFS based approach, then:

 mawk 'NF *= (OFS = (_=+$1)<11||(40<_) ? FS : __)^!__' FS='500' __='100'

answered Apr 29, 2022 at 0:44

RARE Kpop Manifesto


3 Comments

The fourth bird Over a year ago

Can you explain in short what the second pattern does? What is the use of the leading NF *= and what does this part do ^! ?

RARE Kpop Manifesto Over a year ago

__ is pre-defined as 100; negating it with ! gives zero (0), and ANYTHING to the 0-th power becomes 1, at least in the awk world, even +/- INF or NAN, even "helloworld"^0, which becomes a numeric 1. I'll let theorists debate the merit of 0^0. I'm only setting OFS based on whether the value should be kept at 500 or changed to 100 depending on the range criteria, so the expression becomes NF *= (new OFS)^0 -> NF *= 1, which is the same as evaluating (NF+=0), (NF=NF), or ($1=$1) (the most frequently used syntax, but all 4 have the IDENTICAL effect of rewriting all FS splits as OFS, typically tab+spaces into 1 space).

RARE Kpop Manifesto Over a year ago

@The fourth bird : the typical way awk is explained to those less familiar is ' / regex-pattern / { action } ' . While that suffices for most use cases, it's also not entirely accurate. It's better phrased as ' < any syntax combo resulting in boolean evaluation ; defaulting to true when missing > { any combo of statements ; defaulting to "print" } ' . And that's what a lot of my code is: maximizing the upper boolean layer, if possible, then leveraging the default print. AWK "boolean" means empty string or zero is false, all else true : (awk '0' <<< 123) prints nothing, but (awk '"0"' <<< 123) does.
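
A few pattern-only programs illustrate that description. This is just a sketch with made-up input; the variable names are for illustration only:

```shell
# A pattern with no action uses the default action, print
always=$(printf '123\n' | awk '1')            # numeric 1 is true: the line is printed
never=$(printf '123\n' | awk '0')             # numeric 0 is false: nothing is printed
second=$(printf 'a\nb\nc\n' | awk 'NR == 2')  # any expression can act as the pattern

# A string constant is tested for emptiness rather than numeric value,
# which is the "0" case mentioned in the comment above
string0=$(printf '123\n' | awk '"0"')

printf 'always=[%s] never=[%s] second=[%s] string0=[%s]\n' \
       "$always" "$never" "$second" "$string0"
```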
