I have some data with the following format:

 2     1  500  500  500
 3     1  500  500  500
 6     1  500  500  500
 8     1  500  500  500
 9     1  500  500  500
11     1  500  500  500
12     1  500  500  500
14     1  500  500  500
15     1  500  500  500
16     1  500  500  500
17     1  500  500  500
20     1  500  500  500
21     1  500  500  500
23     1  500  500  500
24     1  500  500  500
25     1  500  500  500
27     1  500  500  500
30     1  500  500  500
31     1  500  500  500
32     1  500  500  500
33     1  500  500  500
34     1  500  500  500
35     1  500  500  500
38     1  500  500  500
40     1  500  500  500
41     1  500  500  500
43     1  500  500  500
44     1  500  500  500
46     1  500  500  500
47     1  500  500  500

I want to change the 500 values to 100 only on the lines whose first column is between 11 and 40. For now I'm doing something like:

Numbers=($(seq 11 1 40))
File=filename.txt
for i in ${Numbers[*]}
do
        if [ $i == awk '{print $1}' $File ];then
        NumberLine=$(grep -n $i $File | cut -d : -f 1)
        sed -i "${NumberLine}s/500/100/" $File
        fi
done

Individually, each line seems to do what I want, but when I put them in the loop, I get the following error:

./changeRestraints.sh: line 5: [: too many arguments

I suspect that this has to do with my awk as a part of my conditional statement. How can I fix this to make this script run?

Thank you,

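1 Comment

Numbers=($(seq 11 1 40)) does create an array holding every value from 11 to 40, and ${Numbers[*]} in the for loop expands to all of them. Only a plain $Numbers would expand to the first element, 11. A scalar assignment, Numbers=$(seq 11 1 40), also works with an unquoted expansion in the loop. Commented Apr 28, 2022 at 12:04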

6 Answers

5

Invoking awk and sed repeatedly inside a loop is inefficient. Please try:

awk '$1>10 && $1<=40 {gsub(/\<500\>/, "100")} 1' filename.txt

Please note that the regex operators \< and \> are GNU awk extensions that match word boundaries.

[EDIT]
The gsub() function is a variant of sub() that replaces every occurrence of the matched string, while sub() replaces only the first match. The relationship between sub() and gsub() is similar to that of s/regex/repl/ and s/regex/repl/g in sed.
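
To see the difference concretely, here is a small sketch on one sample row from the question. It uses plain /500/ rather than the GNU-only \<500\> so that it should run under any awk; the variable names are only for illustration.

```shell
line='11     1  500  500  500'

# sub() changes only the first 500 on the line
first_only=$(printf '%s\n' "$line" | awk '{ sub(/500/, "100") } 1')

# gsub() changes every 500 on the line
every_match=$(printf '%s\n' "$line" | awk '{ gsub(/500/, "100") } 1')

printf '%s\n' "$first_only" "$every_match"
# 11     1  100  500  500
# 11     1  100  100  100
```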

If you want to use variables for the embedded numbers, you can use the -v varname=value command-line option, which assigns awk variables:

#!/bin/bash

start=11    # bash variables
stop=40
from=500
to=100
awk -v start="$start" -v stop="$stop" -v from="$from" -v to="$to" '$1>=start && $1<=stop {gsub("\\<" from "\\>", to)} 1' filename.txt
  • When assigning -v start="$start", the left-hand start is an awk variable name and the right-hand "$start" is a bash variable. They can share a name (although that may look confusing). You can also assign a literal value, such as -v start=11.
  • Because the /regex/ literal syntax cannot include an awk variable, we need to write "\\<" from "\\>" instead. The spaces in between are only for readability; the expression is equivalent to "\\<"from"\\>".
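
As a rough sketch of how the -v mechanism could be wrapped for reuse, the hypothetical helper below (change_range is not part of the answer) takes the range and the two values as arguments and reads the data from standard input. It assumes the value to replace never appears inside a longer number, so it uses a plain gsub(from, to) without the GNU-only word-boundary operators.

```shell
# Hypothetical helper: on rows whose first column lies in [start, stop],
# replace every occurrence of "from" with "to"; all other rows pass through.
# Assumption: "from" never occurs inside a longer number such as 1500.
change_range() {
    awk -v start="$1" -v stop="$2" -v from="$3" -v to="$4" \
        '$1 >= start && $1 <= stop { gsub(from, to) } 1'
}

result=$(printf '%s\n' \
    ' 9     1  500  500  500' \
    '11     1  500  500  500' \
    '41     1  500  500  500' | change_range 11 40 500 100)
printf '%s\n' "$result"
```

With a real file this would be called as change_range 11 40 500 100 < filename.txt.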

3 Comments

Technically you didn't solve the original question, but your solution was the easiest to implement and accomplished what I've set out to do originally. Thank you! I didn't know about gsub. I did see some people discuss sub when I was looking around for how to do this. Why use gsub vs sub in this case?
Also, a quick question: how can I make the numbers inside the awk command variables? Currently I'm trying to let a user change the 11-40 range. Possibly the 500 and 100 as well, but those are less important. I'm thinking something like: awk "$1>="$Start" && $1<="$Stop" {gsub(/\<1000\>/, "500")} 1" filename.txt
THANK YOU SO MUCH! I have never quite understood the -v flag in awk. Your edit makes so much sense, and teaches me a lot for some future applications as well.
2

You never actually execute awk. You compare $i to the string awk, and then the test command finds additional arguments which it can't handle. Therefore you get the too many arguments error.

You need to run awk in order to get its output, for instance with command substitution:

if [ "$i" = "$(awk ... )" ]
then
  ...
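
Note that even with command substitution, awk '{print $1}' prints every first-column value, so comparing that output with a single number would still not behave as intended. One possible way to ask "does this number occur in column 1?" is to let awk answer through its exit status. The sketch below assumes that approach; the helper name has_id and the two sample rows are made up for illustration.

```shell
# Hypothetical helper: exit 0 if some input row has a first column
# numerically equal to $1, exit 1 otherwise.
has_id() {
    awk -v n="$1" '$1 == n { found = 1 } END { exit !found }'
}

data=' 9     1  500  500  500
11     1  500  500  500'

report=$(for i in 9 10 11; do
    if printf '%s\n' "$data" | has_id "$i"; then
        echo "$i present"
    else
        echo "$i absent"
    fi
done)
printf '%s\n' "$report"
# 9 present
# 10 absent
# 11 present
```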

1 Comment

You technically answered the original question, but it turns out my code is flawed as pointed out by some others.
1

You could also use sed, where the pattern ^(40|[23][0-9]|1[1-9])[[:space:]] matches a number from 11 to 40 at the start of the line, followed by a whitespace character.

sed -E '/^(40|[23][0-9]|1[1-9])[[:space:]]/s/500/100/g' file

Output

 2     1  500  500  500
 3     1  500  500  500
 6     1  500  500  500
 8     1  500  500  500
 9     1  500  500  500
11     1  100  100  100
12     1  100  100  100
14     1  100  100  100
15     1  100  100  100
16     1  100  100  100
17     1  100  100  100
20     1  100  100  100
21     1  100  100  100
23     1  100  100  100
24     1  100  100  100
25     1  100  100  100
27     1  100  100  100
30     1  100  100  100
31     1  100  100  100
32     1  100  100  100
33     1  100  100  100
34     1  100  100  100
35     1  100  100  100
38     1  100  100  100
40     1  100  100  100
41     1  500  500  500
43     1  500  500  500
44     1  500  500  500
46     1  500  500  500
47     1  500  500  500
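
To check which first-column values the address pattern accepts, one can feed it the numbers 1 to 50, each followed by a dummy field because the pattern requires trailing whitespace. In the alternation, 1[1-9] covers 11-19, [23][0-9] covers 20-39 ([23] means "either the digit 2 or the digit 3"), and 40 is listed on its own. This is only a sanity-check sketch; the dummy field x and the variable name are made up.

```shell
# Print the numbers from 1..50 whose line is selected by the address pattern,
# joined by single spaces.
matched=$(seq 1 50 | sed 's/$/ x/' \
    | sed -E -n '/^(40|[23][0-9]|1[1-9])[[:space:]]/p' \
    | awk '{ printf "%s%s", sep, $1; sep = " " } END { print "" }')
echo "$matched"
```

The result should be exactly the numbers 11 through 40.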

1

Another way to do it in gawk:

gawk '$1~/^(1[1-9]|[23][0-9]|40)$/{ $3=$4=$5=100 }1' file

If you want to preserve the original spacing:

gawk '$1~/^(1[1-9]|[23][0-9]|40)$/{ gsub(/\<500\>/,"100") }1' file
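
One detail worth checking with a field test like $1 ~ /re/ is that the regex succeeds if it matches anywhere inside the field. For the question's data that makes no difference, but with larger IDs an unanchored alternation would also select values such as 111 or 400. The sketch below compares an unanchored and an anchored form on a few made-up IDs; it uses plain awk since nothing here is gawk-specific.

```shell
ids=$(printf '%s\n' 5 11 111 140 400)

# Unanchored: matches any field that merely contains 11-19, 20-39 or 40
unanchored=$(printf '%s\n' "$ids" | awk '$1 ~ /1[1-9]|[23][0-9]|40/' | paste -sd ' ' -)

# Anchored with ^( ... )$: the whole field must be a number from 11 to 40
anchored=$(printf '%s\n' "$ids" | awk '$1 ~ /^(1[1-9]|[23][0-9]|40)$/' | paste -sd ' ' -)

echo "unanchored: $unanchored"   # unanchored: 11 111 140 400
echo "anchored:   $anchored"     # anchored:   11
```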

2 Comments

Can you explain about how that gawk command works? What does [23] mean here for example?
regex101 has a good explanation about regular expression, see: regex101.com/r/Y7oZva/1 explanation in upper right corner.
1

With sed, using an address range from the first line starting with 11 to the next line starting with 40:

sed -i '/^11/,/^40/s/500/100/g' filename.txt
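
One caveat with an address range like /^11/,/^40/: sed starts at the first line matching the first pattern and stops at the next line matching the second, and if no later line matches, the range runs to the end of the input. The sketch below illustrates this with made-up two-column rows, once with a row 40 present and once without.

```shell
cmd='/^11/,/^40/s/500/100/g'

# Row 40 exists: the range ends there and row 41 is left alone
with_40=$(printf '%s\n' '9 500' '11 500' '40 500' '41 500' | sed "$cmd")

# No row 40: the range never closes, so row 41 is changed as well
without_40=$(printf '%s\n' '9 500' '11 500' '38 500' '41 500' | sed "$cmd")

printf '%s\n' "$with_40" '--' "$without_40"
```

So this form is only safe when rows 11 and 40 are both known to be present; a numeric comparison on the first column avoids that assumption.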

1

Maybe something like this:

mawk 'gsub("500", (_=+$1)<11 || (40<_) ? "&" : "100")'

 2     1  500  500  500
 3     1  500  500  500
 6     1  500  500  500
 8     1  500  500  500
 9     1  500  500  500
11     1  100  100  100
12     1  100  100  100
14     1  100  100  100
15     1  100  100  100
16     1  100  100  100
17     1  100  100  100
20     1  100  100  100
21     1  100  100  100
23     1  100  100  100
24     1  100  100  100
25     1  100  100  100
27     1  100  100  100
30     1  100  100  100
31     1  100  100  100
32     1  100  100  100
33     1  100  100  100
34     1  100  100  100
35     1  100  100  100
38     1  100  100  100
40     1  100  100  100
41     1  500  500  500
43     1  500  500  500
44     1  500  500  500
46     1  500  500  500
47     1  500  500  500

If you prefer an FS + OFS based approach, then:

 mawk 'NF *= (OFS = (_=+$1)<11||(40<_) ? FS : __)^!__' FS='500' __='100'

3 Comments

Can you explain in short what the second pattern does? What is the use of the leading NF *= and what does this part do ^! ?
__ is pre-defined as 100; negating it gives zero (0), and ANYTHING to the 0-th power becomes 1, at least in the awk world, even +/- INF or NAN. Even "helloworld"^0 becomes a numeric 1. I'll let theorists debate the merit of 0^0. I'm only setting OFS based on whether it should stay 500 or change to 100 depending on the range criteria, so the expression becomes NF *= (OFS-new)^0 -> NF *= 1, which is the same as evaluating (NF+=0), (NF=NF), or ($1=$1) (the most frequently used syntax; all 4 have the IDENTICAL effect of reformatting all FS splits into OFS, typically tabs and spaces into 1 space)
@The fourth bird : the typical way awk is explained to those less familiar with it is ' / regex-pattern / { action } ' . While that suffices for most use cases, it's also not entirely accurate. It's better phrased as ' < any syntax combo resulting in boolean evaluation ; defaulting to true when missing > { any combo of statements ; defaulting to "print" } ' . And that's what a lot of my code does: maximizing the upper boolean layer, if possible, then leveraging the default print. AWK "boolean" means empty-string or zero as false, all else true : (awk '0' <<< 123) prints nothing, but (awk '"0"' <<< 123) does.
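
The point about the pattern slot accepting any expression can be tried out directly. The sketch below reproduces the two cases from the comment with a pipe instead of a here-string, and adds a third case with made-up input showing a consequence for a bare gsub(...) pattern: gsub() returns the number of substitutions, so a row containing no 500 evaluates to 0 and is not printed.

```shell
input='a 500
b 7'

numeric_zero=$(printf '123\n' | awk '0')      # numeric 0 is false: prints nothing
string_zero=$(printf '123\n' | awk '"0"')     # non-empty string is true: prints 123
gsub_as_pattern=$(printf '%s\n' "$input" | awk 'gsub("500", "100")')

printf '[%s] [%s] [%s]\n' "$numeric_zero" "$string_zero" "$gsub_as_pattern"
# [] [123] [a 100]
```

In the question's data every row contains 500, so no rows are lost there.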
