-2

I have a semicolon-delimited (;) CSV file that uses double quotes ("") as the text qualifier. However, some fields contain ; or " characters, which break the rows. How can I use a batch script to replace those characters inside each field of each row while keeping the field delimiter (;) and the text qualifier ("") intact? For example, replace ; inside a field with | and double quotes with single quotes.

Note: We can rely on the ";" sequence between every two fields. Each field starts and ends with a double quote, so ";" can serve as an imaginary delimiter in the solution.

Here is an example of my CSV rows with corrupted fields:

"Event";"User";"Description"   
"stock_change";"[email protected]";"Change Product Teddy;Bear (Shop ID: "AR832H0823")"
"stock_update;change";"[email protected]";"Update Product "30142_Pen" (Shop ID: GI8759)"
  • That's not a question but a task request; please share what you have tried so far; remember that SO is not a free code writing service... Commented Mar 6, 2016 at 16:04

2 Answers

0
@ECHO Off
SETLOCAL
SET "sourcedir=U:\sourcedir"
SET "destdir=U:\destdir"
SET "filename1=%sourcedir%\q35828741.txt"
SET "outfile=%destdir%\outfile.txt"
FOR /L %%f IN (1,1,3) DO SET "field%%f="
(
FOR /f "usebackqdelims=" %%a IN ("%filename1%") DO (
 FOR %%b IN (%%a) DO CALL :process %%b
)
)>"%outfile%"

GOTO :EOF

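REM Called once per field; the first two calls only store their field and return.
:process
IF NOT DEFINED field1 SET "field1=%~1"&GOTO :EOF 
IF NOT DEFINED field2 SET "field2=%~1"&GOTO :EOF 
SET "field3=%~1"
REM Replace each colon in field3 with the placeholder '' (two apostrophes).
:repcwp
FOR /f "tokens=1*delims=:" %%f IN ("%field3%") DO (
 SET "field3=%%g"
 IF DEFINED field3 (SET "field3=%%f''%%g"&GOTO repcwp) ELSE (SET "field3=%%~f")
)
REM In every field, replace ; with | and " with '.
set "field1=%field1:;=|%"
set "field1=%field1:"='%"
set "field2=%field2:;=|%"
set "field2=%field2:"='%"
set "field3=%field3:;=|%"
set "field3=%field3:"='%"
REM Write the row, turning the '' placeholder back into a colon.
ECHO "%field1:''=:%";"%field2:''=:%";"%field3:''=:%"
REM Clear the fields so the next row starts empty.
FOR /L %%f IN (1,1,3) DO SET "field%%f="
GOTO :eof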

You would need to change the settings of sourcedir and destdir to suit your circumstances.

I used a file named q35828741.txt containing your data for my testing.

The script produces the file defined as %outfile%.

It processes each line of the file, presuming it is well-constructed.

Use a simple for loop to deliver the three fields to the procedure :process. The lines are each of the form "data1"separator"data2"separator"data3"

Within :process, accumulate the data to field1..3

Since the common substring-replace mechanism (%var:from=to%) uses : as its own separator, replace each : with a distinct string ''. This is only done for field3 since it appears from the sample data that it is the only field that may contain colons. If colons may appear in the other fields, it's simply a matter of following the bouncing ball.

Having replaced all the colons, replace the semicolons and rabbit's-ears as required, then in the echo which outputs the data to the destination file, replace any '' with colon.

This makes a number of assumptions, including that the data contains no % or other awkward characters and that there are no instances of :: in the data.
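
For comparison, the same per-field replacement can be sketched outside batch. The Python snippet below is only an illustration of the logic, not a translation of the script above. It assumes every row starts and ends with a double quote and that the three-character sequence ";" never occurs inside a field; the name clean_line is made up for this example.

```python
def clean_line(line: str) -> str:
    """Replace ; with | and " with ' inside each field of one row.

    Assumption: the row looks like "f1";"f2";...;"fn", so the sequence
    '";"' can be trusted as the real field separator.
    """
    line = line.rstrip()  # ignore trailing whitespace and the newline
    if len(line) < 2 or not (line.startswith('"') and line.endswith('"')):
        raise ValueError("row does not start and end with a double quote")

    # Drop the outer quotes, then split on the trusted separator.
    fields = line[1:-1].split('";"')

    # Clean the characters that would otherwise break the row.
    cleaned = [f.replace(";", "|").replace('"', "'") for f in fields]

    # Put the original delimiter and text qualifier back.
    return '"' + '";"'.join(cleaned) + '"'


if __name__ == "__main__":
    row = '"stock_update;change";"[email protected]";"Update Product "30142_Pen" (Shop ID: GI8759)"'
    print(clean_line(row))
```

Because the split happens on the whole ";" sequence rather than on a single character, colons, percent signs and repeated characters need no special handling in this sketch. A field that itself contains ";" would still be split incorrectly.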

0

I don't understand why you would want to convert teddy;bear to teddy|bear, but... OK.

As requested in a comment at https://stackoverflow.com/a/35822437/1012053, you can use the /T option of my JREPL.BAT utility to perform the following find/replace (earlier find/replace pairs take precedence):

  • " at beginning of line, or ";" anywhere, or " at end of line ==> leave as is
  • " any place else ==> convert to '
  • ; any place else ==> convert to |
jrepl "^\q|\q;\q|\q$ \q ;" "$& ' |" /x /t " " /f test.csv /o -
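
To make the precedence of the three rules concrete, here is a rough equivalent written as a Python regular expression. It is a sketch of the rules listed above, not of how JREPL works internally; the name translate is made up, and stripping trailing whitespace before matching is an added assumption so that the closing quote is recognized as the end of the line.

```python
import re

# Alternatives are tried left to right at each position, so the
# "leave as is" cases win over the plain quote and plain semicolon.
PATTERN = re.compile(r'(^"|";"|"$)|(")|(;)')


def translate(line: str) -> str:
    """Apply the three find/replace rules to one row."""
    def repl(match: re.Match) -> str:
        if match.group(1) is not None:
            return match.group(1)  # structural quote or ";" separator: keep
        if match.group(2) is not None:
            return "'"             # any other double quote
        return "|"                 # any other semicolon

    # Assumption: trailing whitespace is not part of the data.
    return PATTERN.sub(repl, line.rstrip())


if __name__ == "__main__":
    print(translate('"a;b";"c "d" e"'))
```

For the row "a;b";"c "d" e" this yields "a|b";"c 'd' e": the outer quotes and the ";" separator survive, while the inner semicolon and quotes are converted.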
