Add Regex Matches as new columns to the csv file [Batch Scripting]

Question

I have a .csv file, and I need to append the regex matches from each line as new columns after the original columns. Here is part of the file:

"Event";"User";"Description"   
"stock_change";"[email protected]";"Change Product Teddy-Bear (Shop ID: AR832H0823)"
"stock_update";"[email protected]";"Update Product 30142_Pen (Shop ID: GI8759)"

Here are the two regex patterns whose extracted results I want to add from each row as new columns (one column per pattern):

(?<=Product\s)\w.*?(?=\s*\(Shop)

(?<=Shop ID:\s)\w.*?(?=\))

The result should look like this (the header row is not important):

"stock_change";"[email protected]";"Change Product Teddy-Bear (Shop ID: AR832H0823)";"Teddy-Bear";"AR832H0823"  
"stock_update";"[email protected]";"Update Product 30142_Pen (Shop ID: GI8759)";"30142_Pen";"GI8759"

Sorry, I'm very new to Batch scripting. Thanks in advance.

dbenham · Accepted Answer · 2016-03-06 02:50:21Z

1

Windows batch does not have a native regex find/replace utility. The only regex utility is FINDSTR, and that is extremely limited and non-standard, and it can only print out entire lines that match the search - it cannot print out just the matching portion.

You could use PowerShell.

But I would use JREPL.BAT - a purely script-based utility (hybrid JScript/batch) that works on any Windows machine from XP onward. It uses ECMA regular expressions, so there is no look-behind, but capture groups give it plenty of power to do the task.

jrepl "Product\s(\S+?)\s*\(Shop ID:\s(.*?)\)\q$" "$&;\q$1\q;\q$2\q" /a /x /f test.csv /o -

The /a switch discards unchanged lines, which effectively removes the header line. The /o - option overwrites the original file with the output. The /x switch enables extended escape sequences, thus enabling \q for ".

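Use call jrepl if you put the command in a batch script. JREPL is itself a batch file, so without call, control would not return to your script after it finishes.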

Full documentation is available from the command line via jrepl /?, or jrepl /?? for paged output.
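For readers who want to check the capture-group approach without JREPL, here is a minimal sketch of the same idea in Python (assuming Python is available; it is not part of the original answer). The pattern is the one from the JREPL call above with \q written as a literal quote, and the address user@example.com is a placeholder, since the real addresses are hidden in the sample data.

```python
import re

# Two capture groups replace the look-behind/look-ahead from the question.
# The trailing '"$' anchors the match to the closing quote of the last field.
PATTERN = re.compile(r'Product\s(\S+?)\s*\(Shop ID:\s(.*?)\)"$')

def add_columns(line):
    """Append the two captured values as quoted columns; None if nothing matched."""
    # \g<0> is the whole match, playing the role of $& in the JREPL replacement.
    new_line, count = PATTERN.subn(r'\g<0>;"\1";"\2"', line)
    # Returning None for unchanged lines mirrors the /a switch (header is dropped).
    return new_line if count else None

rows = [
    '"Event";"User";"Description"',
    '"stock_change";"user@example.com";"Change Product Teddy-Bear (Shop ID: AR832H0823)"',
]
for row in rows:
    out = add_columns(row)
    if out is not None:
        print(out)
```

Only the data row is printed, with ;"Teddy-Bear";"AR832H0823" appended.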

edited Mar 6, 2016 at 2:50

answered Mar 6, 2016 at 2:35

dbenham


1 Comment

Behnam2016 Over a year ago

Thanks dbenham, I was surprised when I saw that JREPL tool was developed by you :) , wondering if I can use the same tool for this issue too? stackoverflow.com/questions/35828741/…

Lars Fischer · Accepted Answer · 2016-03-05 20:47:30Z

0

You can do it with this GNU sed command:

sed -r 's/^.*Product (.+) \(Shop ID: (.+)\)"$/&;\"\1\";\"\2\"/g' shop.csv

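The -r option enables extended regular expressions.
It captures the text between "Product " and " (Shop ID: " into \1, and the text between "(Shop ID: " and the closing )" into \2.
The replacement uses & (the whole match, which here is the whole line) and appends a string made up of \1 and \2.
Lines that do not match, such as the header, are printed unchanged.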
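The substitution can also be tried without sed. The following is a rough Python counterpart, offered as a sketch rather than a replacement for the command above; the function name append_matches and the address user@example.com are made up for illustration.

```python
import re

# Same expression as the sed command; both (.+) groups are greedy.
SED_LIKE = re.compile(r'^.*Product (.+) \(Shop ID: (.+)\)"$')

def append_matches(lines):
    """Yield each line, with the two captured values appended when the pattern matches."""
    for line in lines:
        line = line.rstrip("\r\n")
        # \g<0> stands in for sed's &. As with sed, a line that does not
        # match (for example the header) passes through unchanged.
        yield SED_LIKE.sub(r'\g<0>;"\1";"\2"', line)

sample = [
    '"Event";"User";"Description"\n',
    '"stock_update";"user@example.com";"Update Product 30142_Pen (Shop ID: GI8759)"\n',
]
for out in append_matches(sample):
    print(out)
```

Unlike the JREPL variant with /a, the header row is kept in the output here.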

edited Mar 5, 2016 at 20:47

answered Mar 5, 2016 at 20:43

Lars Fischer


1 Comment

SomethingDark Over a year ago

Worth noting that this is an external program that he's going to have to download.

Aacini · Accepted Answer · 2016-03-06 05:32:19Z

This problem may be solved in a very simple way without a regex with this Batch file:

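@echo off

rem Split each data row into its three fields, skipping the header row.
(for /F "skip=1 tokens=1-3 delims=;" %%a in (input.csv) do (
   rem Tokenize the quoted Description on spaces and parentheses.
   rem Token 3 is the product name and token 6 is the Shop ID.
   for /F "tokens=3,6 delims=() " %%d in (%%c) do (
      echo %%a;%%b;%%c;"%%d";"%%e"
   )
)) > output.csv
move /Y output.csv input.csv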

Result:

"stock_change";"[email protected]";"Change Product Teddy-Bear (Shop ID: AR832H0823)";"Teddy-Bear";"AR832H0823"
"stock_update";"[email protected]";"Update Product 30142_Pen (Shop ID: GI8759)";"30142_Pen";"GI8759"

However, if some lines do not follow the format of the example data (lines that a regex could process correctly, but this code cannot), then this code will need adjusting. For example, a product name containing a space shifts the token positions, so tokens 3 and 6 no longer hold the product name and Shop ID. Depending on how much the data varies, the problem may not be solvable with a pure Batch file.
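To see why the token positions matter, the tokenizing logic can be mimicked in a few lines of Python. This is only a sketch of how FOR /F splits text (runs of delimiter characters count as one separator); the helper names for_f_tokens and split_and_append and the address user@example.com are hypothetical.

```python
def for_f_tokens(text, delims):
    """Split text the way FOR /F does: consecutive delimiters act as one separator."""
    for ch in delims:
        text = text.replace(ch, " ")
    return text.split()

def split_and_append(line):
    """Append tokens 3 and 6 of the Description field as two new quoted columns."""
    fields = line.split(";")
    if len(fields) < 3:
        return None
    event, user, description = fields[:3]
    # FOR /F strips the surrounding quotes when given a quoted string.
    parts = for_f_tokens(description.strip('"'), "() ")
    if len(parts) < 6:
        return None
    # Batch tokens are 1-based, so tokens 3 and 6 are indexes 2 and 5 here.
    return f'{event};{user};{description};"{parts[2]}";"{parts[5]}"'

row = '"stock_change";"user@example.com";"Change Product Teddy-Bear (Shop ID: AR832H0823)"'
print(split_and_append(row))

# A product name with a space shifts every later token by one position.
shifted = for_f_tokens("Change Product Teddy Bear (Shop ID: AR832H0823)", "() ")
print(shifted[2], shifted[5])
```

With the hypothetical product name Teddy Bear, the third and sixth tokens become Teddy and ID:, which is the kind of input where a regex is the safer choice.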
