Removing chars until a numeric is found from array item Bash

Question

I have a line of text in a text file. The line looks something like this:

xxxx,xxxxx,xxxxxx,xxxxx,xxxx,NL-1111 xx,xxxx,xxx

The NL- is an identifier for the country so this could be anything. I would like to remove the NL- part from the line so it looks like this:

xxxx,xxxxx,xxxxxx,xxxxx,xxxx,1111 xx,xxxx,xxx

And write the file afterwards.

Thanks in advance.

@ChrisMaes Yes, I played with sed and awk, but I'm not sure which methods to use. I don't work with bash that often.
– Florian Schaal, Commented Jan 14, 2015 at 8:15

Bentoy13 · Accepted Answer · 2015-01-14 08:30:24Z

2

Another solution, close to the sed ones but using perl:

perl -i -pe "s/(?<=,)[a-zA-Z]{2}-//g" file.txt

It uses a lookbehind assertion, so you don't need to repeat the comma in the replacement part.

answered Jan 14, 2015 at 8:30

Bentoy13


2 Comments

bgoldst Over a year ago

This is a good solution, +1. It works on all platforms, unlike sed-in-place (sed -i); see my answer for details.

Bentoy13 Over a year ago

@bgoldst There are also some differences between platforms with perl -i: on some (my Windows cmd, for example), you need to specify a backup suffix after -i (running without a backup is not possible). With that, it works everywhere, but you get a backup file.

Jasen · Accepted Answer · 2015-01-14 08:27:00Z

2

Something like this, using sed:

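sed -i 's/,[A-Z][A-Z]-\([0-9]\+\)/,\1/i' file.txt

,[A-Z][A-Z]-\([0-9]\+\) searches for a comma, two letters, a dash, and one or more digits. (No comma is required after the digits, since in the question's example they are followed by a space.)

,\1 keeps only the comma and the digits.

i ignores case on the letters.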

Thank you to @chris for proofreading.
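A portable sketch of the same idea, run on standard input so no file is changed. It anchors only on the comma and the first digit after the dash, and spells out both letter cases, so it avoids \+ and the i flag, neither of which is specified by POSIX sed. The sample line is made up for illustration.

```shell
# Hypothetical sample line modelled on the question's example.
sample='aaaa,bbbbb,cccccc,ddddd,eeee,NL-1111 xx,ffff,ggg'

# Match: comma, two letters, dash, one digit. Keep: the comma and that digit.
result=$(printf '%s\n' "$sample" | sed 's/,[A-Za-z][A-Za-z]-\([0-9]\)/,\1/')

printf '%s\n' "$result"
```

Requiring a digit after the dash means a field such as ",ab-cd" would be left untouched.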

edited Jan 14, 2015 at 8:27

answered Jan 14, 2015 at 8:21

Jasen


1 Comment

Chris Maes Over a year ago

better solution with regex. is the < necessary?

bgoldst · Accepted Answer · 2015-01-14 08:33:31Z

2

I think the simplest solution here is to read the file into a shell variable, then write it back immediately, using the pattern-substitution form of parameter expansion:

line="$(<file)"; echo "${line/[a-zA-Z][a-zA-Z]-}" >|file;

I would warn you against solutions that use sed-in-place functionality. I've found that sed behavior differs on different platforms with respect to the -i option. On Mac you have to give an empty argument ('') to the -i option, while on Cygwin you must not have an empty argument following the -i. To get platform compatibility you'd have to test what platform you're on.
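One way to sidestep the -i differences altogether is to skip -i: write the edited text to a temporary file and copy it back. This is a minimal sketch, assuming mktemp is available; strip_prefix_in_place is a hypothetical helper name and the demo lines are made up.

```shell
# Hypothetical helper: rewrite FILE without relying on sed -i.
strip_prefix_in_place() {
    file=$1
    tmp=$(mktemp) || return 1
    # Write the edited text to a temporary file first, then copy it back.
    # Copying (rather than mv) keeps the original file's permissions.
    sed 's/,[A-Za-z][A-Za-z]-/,/' "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Demo on a throwaway file with two made-up lines.
demo=$(mktemp)
printf '%s\n' 'aaaa,NL-1111 xx,bbb' 'cccc,BE-2222,ddd' > "$demo"
strip_prefix_in_place "$demo"
result=$(cat "$demo")
printf '%s\n' "$result"
rm -f "$demo"
```

The sed expression here uses only POSIX features, so the same function should behave the same way with GNU and BSD sed.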

edited Jan 14, 2015 at 8:33

answered Jan 14, 2015 at 8:19

bgoldst


4 Comments

Chris Maes Over a year ago

not sure this will work since there is 'XXX,XXX,...' on the same line before NL-

bgoldst Over a year ago

It works. The dash at the end of the pattern protects the rest of the string.

Bentoy13 Over a year ago

Won't work if you have a prefix with more than 2 characters: it deletes only the last two characters plus the dash. Instead the prefix should be untouched.

bgoldst Over a year ago

The prefix is only two characters. Probably ISO 3166-1 alpha-2.
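The behaviour discussed in these comments can be checked with a short sketch. Pattern substitution is a bash feature rather than POSIX sh, so the expansion is run through bash -c explicitly; both sample lines are made up for illustration.

```shell
# Apply the answer's expansion to one line, inside bash.
strip_first_prefix() {
    bash -c 'line=$1; printf "%s\n" "${line/[a-zA-Z][a-zA-Z]-}"' _ "$1"
}

# Two-letter prefix: earlier fields contain no dash, so only "NL-" matches.
two=$(strip_first_prefix 'aaaa,bbbbb,cccccc,ddddd,eeee,NL-1111 xx,ffff,ggg')

# Three-letter prefix: only the last two letters and the dash are removed.
three=$(strip_first_prefix 'aaaa,NLD-1111 xx,bbb')

printf '%s\n' "$two" "$three"
```

The expansion replaces only the first match in the whole variable, so this approach fits the single-line file described in the question.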

Chris Maes · Accepted Answer · 2015-01-14 08:26:01Z

1

sed might do the trick: strip a two-letter country prefix such as "NL-" or "BE-" that follows a comma, keeping the comma, on any line of the file:

sed -i 's/,[A-Z][A-Z]-/,/' file.txt

edited Jan 14, 2015 at 8:26

answered Jan 14, 2015 at 8:18

Chris Maes


3 Comments

Florian Schaal Over a year ago

Yes, this works if the NL- part is static. The problem is that it could be anything: NL-, BE-, DE-, etc.

Bentoy13 Over a year ago

I think you should not remove the comma.

Chris Maes Over a year ago

thanks for the comments; I adapted it slightly so it won't remove the comma, and will remove other countries as well... The regex by @Jasen might be better...
