
I have to restart the batch processing of a CSV file several times to recover from certain errors. I don't want to reprocess the lines that were already processed successfully, since that could waste a lot of time. When a crash happens, an error message is printed to my log file that looks in part like this:

.... Error found in row 3611. Exception ...

So I need to read the log file, find that row number, and then restart my process at that line, or right after it if I can't recover from the error. I have some code I can run to try to recover from the error. I'll need to copy my CSV file from that line down and then rename the files so the new file has the same name as the original, but I'd like to keep the original file too, maybe with a date/time string appended to the filename. So, my questions: how do I find that row number programmatically, and how do I then use it to copy the file from that line down? That is, if the number is 3611, I want to skip the first 3610 lines when I copy the file.

I need a batch script that will run on Windows XP without any extras installed: no Unix utilities, no PowerShell, just basic batch.

Thanks

UPDATE: here is what my batch file looks like:

@echo. >> dataupdatelog.txt
@echo ============================================ >> dataupdatelog.txt
@echo %date% - %time% >> dataupdatelog.txt
@echo ============================================ >> dataupdatelog.txt
@echo. >> dataupdatelog.txt

RENAME PlayerSyncLog.txt PlayerSyncLog_%date:~-4,4%%date:~-7,2%%date:~0,2%_%time:~0,2%%time:~3,2%%time:~6,2%.TXT

rem call download.bat >> dataupdatelog.txt
SET CSVFILE=smallfromwebsite.csv
call PlayerSync.exe -SYNC %CSVFILE%
rem i need a delay.bat here to allow the log file to get written before i try to parse it
:loop

findstr /c:"Import Successful!" "PlayerSyncLog.txt" >nul 2>&1 && (
    rem tail.bat
    GOTO FOUND  
) || (
    rem only loop for errors of type PK_MemberNumHistory
    findstr /c:"PK_MemberNumHistory" "PlayerSyncLog.txt" >nul 2>&1 && (
        fix.bat
        RENAME PlayerSyncLog.txt PlayerSyncLog_%date:~-4,4%%date:~-7,2%%date:~0,2%_%time:~0,2%%time:~3,2%%time:~6,2%.TXT
        call PlayerSync.exe -SYNC %CSVFILE%
        rem need a delay.bat here
        GOTO loop 
    )
)
:FOUND
rem call backitup.bat >> dataupdatelog.txt
rem call upload.bat >> dataupdatelog.txt
rem call uploadlogs.bat >> dataupdatelog.txt

and here is what a line in my csv file looks like:

2004031,Robby,Brown,65 Lonely St.,Peterborough,,a2d3f4,,,,01/01/1952,01/01/1900,06/18/2013,,2/31/1969,4445556677,[email protected],,

and here is what the last lines in my log file look like after a crash:

12/17/2013 12:52:07: 19994017 updated successfully.
12/17/2013 12:52:07: 19999919 updated successfully.
12/17/2013 11:51:12: Violation of PRIMARY KEY constraint 'PK_MemberNumHistory'. Cannot insert duplicate key in object 'Players'.
The statement has been terminated.. Error found in row 12345. Exception Stack Trace:    at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnec... etc

So in this case I want to find the string "Error found in row 12345" in my log, read the number 12345, copy my CSV from line 12346 onward (i.e. trim the first 12345 lines from my CSV file), start processing again, and loop until the whole CSV file is processed.

UPDATE 2: new scripts, main.bat:

@echo. >> dataupdatelog.txt
@echo ============================================ >> dataupdatelog.txt
@echo %date% - %time% >> dataupdatelog.txt
@echo ============================================ >> dataupdatelog.txt
@echo. >> dataupdatelog.txt

RENAME PlayerSyncLog.txt PlayerSyncLog_%date:~-4,4%%date:~-7,2%%date:~0,2%_%time:~0,2%%time:~3,2%%time:~6,2%.TXT

rem call download.bat >> dataupdatelog.txt
SET CSVFILE=fromwebsite.csv
call PlayerSync.exe -SYNC %CSVFILE%
:loop

findstr /c:"Import Successful!" "PlayerSyncLog.txt" >nul 2>&1 && (
    tail.bat
    goto success    
) || (
    rem only loop for errors of type PK_MemberNumHistory
    findstr /c:"PK_MemberNumHistory" "PlayerSyncLog.txt" >nul 2>&1 && (

        Fix.bat
        copycsv.bat 
        RENAME PlayerSyncLog.txt PlayerSyncLog_%date:~-4,4%%date:~-7,2%%date:~0,2%_%time:~0,2%%time:~3,2%%time:~6,4%.TXT

        call PlayerSync.exe -SYNC %CSVFILE% 
        goto loop 
    )
    findstr /c:"PK_PlayerInfo" "PlayerSyncLog.txt" >nul 2>&1 && (

        Fix.bat
        copycsv.bat 
        RENAME PlayerSyncLog.txt PlayerSyncLog_%date:~-4,4%%date:~-7,2%%date:~0,2%_%time:~0,2%%time:~3,2%%time:~6,4%.TXT

        call PlayerSync.exe -SYNC %CSVFILE% 
        goto loop 
    ) || (
        echo "some other error"
        goto eof
    )
)   

:success

call backitup.bat >> dataupdatelog.txt
call upload.bat >> dataupdatelog.txt
call uploadlogs.bat >> dataupdatelog.txt

and the csv rewriter copycsv.bat:

@echo off
for /F "tokens=10 delims= " %%a in ('type PlayerSyncLog.txt ^| find /i "error found"') do set $row=%%a
set /a $row="%$row:.=%"
for /f "skip=%$row:.=% delims=" %%a in (fromwebsite.csv) do echo %%a>>newfile.csv

RENAME fromwebsite.csv "fromwebsite_%date:~-4,4%%date:~-7,2%%date:~0,2%_%time:~0,2%%time:~3,2%%time:~6,4%.TXT"
RENAME newfile.csv fromwebsite.csv

So far so good: it works, except that after the call to copycsv.bat the next call to PlayerSync seems to be skipped and the goto is ignored, so I hit the echo "some other error" line and then the goto eof works. I think what's happening is that the subsequent call to PlayerSync fails (it also hits another error) but delays writing to the log for a second after it returns, so the attempt to find an error in the log fails because it hasn't been written yet. How can I build in a delay of a few seconds? Thanks

  • Are the lines numbered in the file? Have you tried anything yet? Commented Dec 19, 2013 at 15:58
  • The lines are not numbered in the CSV file. No, I have no idea how to get the number. Commented Dec 19, 2013 at 16:25
  • MORE +n will allow you to start at a specified line. See MORE /? Commented Dec 19, 2013 at 16:29
  • more, of course, I should have known that, but how do I get the number? While the CSV lines are not numbered, they do begin with a sequential number which is copied to the log file each time a line is processed successfully, e.g. in the log I have lines like this: 12/19/2013 11:33:47: 68030932 updated successfully. So maybe I could read the 68030932 number, search for it in the CSV file, and restart processing after that line. Not sure that would be easier though, plus there's a chance that consecutive lines in the CSV would crash and I'd have nothing in my log to go on when I restart. Commented Dec 19, 2013 at 16:44
  • Another problem I'm having is that the processing utility I use will sometimes quit in the shell but keep running in the background for a few seconds before writing to the log, so my batch script continues running and tries to read the log, only to find it empty. So I need to delay continued execution for a few seconds before I try to read the log. I tried using a ping script for that but it seems to crash my batch script. I guess this is a problem for another thread though. Commented Dec 19, 2013 at 16:49

3 Answers

1

It's difficult to give an answer without knowing what your batch file does, the size and structure of your CSV, and so on.

But one idea is to get the row number like this:

@echo off
for /F "tokens=5 delims= " %%a in ('type YourLog.txt ^| find /i "error"') do set $row=%%a
echo %$row%

Then set the tokens option to the value you retrieved, using ; or , as the delimiter, whichever your CSV file uses.

for /f "tokens=%$row:.=%,* delims=;" %%a in (file.csv) do (
    set $start=%%a
    set $rest=%%b
)
echo Error Row=%$start%
echo Rest of the line : %$rest%

That only works IF THE ROWS ARE ON THE SAME LINE, and it is subject to the size limit of an environment variable in batch (for the *).

EDIT :

OK, if the value in the error message corresponds to the line number of the failing row in your .CSV, you can just read the .CSV, skipping the number of lines given in the error message plus one, and generate a new .CSV file.

@echo off
for /F "tokens=10 delims= " %%a in ('type YourLog.txt ^| find /i "error"') do set $row=%%a
set /a $row="%$row:.=%"+1
for /f "skip=%$row% delims=" %%a in (file.csv) do echo %%a>>newfile.csv

And then work with the newfile.csv .
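
For the delay asked about in the question's second update, one common approach on Windows XP is to pause with ping, since timeout.exe is not part of XP. The following is only a sketch: the file name delay.bat is borrowed from the rem comments in the question, and the 5 second wait is an arbitrary assumption.

```batch
@echo off
rem delay.bat (hypothetical helper, name taken from the question's rem comments)
rem -n 6 sends 6 echo requests about one second apart, so roughly 5 seconds pass.
rem ping.exe is spelled out so that a script named ping.bat in the current
rem directory cannot be picked up instead of the real program.
ping.exe -n 6 127.0.0.1 >nul
```

It would be invoked from main.bat as call delay.bat right after each call to PlayerSync.exe. The call keyword matters for every helper script: when one batch file runs another without call, control does not return to the first one, which may also explain the skipped commands after Fix.bat and copycsv.bat.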


5 Comments

interesting I'll have to study these answers. I'll update my post to show my batch file.
My CSV line number will not be the same as the line number the error message occurs on in my log. However, the CSV line number is given in the error log message, e.g. ".... Error found in row 3611. Exception ..."; 3611 is the line in the CSV file that hit the problem.
To clarify: the line number of the error message in the log will not be the same as the line number of the CSV row that has a problem, so I need to parse the number out of the log text. If I understand your script, it assumes the line numbers are the same.
No. If the ERROR ROW VALUE is 1000, then my script will skip 1000+1 lines of your .CSV file and save the rest in NewFile.csv. It doesn't count the line numbers of your log file.
That's great, thanks, it seems to work well. Please see my update 2: it has my scripts and my last lingering issue, a needed delay after the subsequent calls to PlayerSync.exe.
0

Here is one way to do it:

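@echo off
setlocal enabledelayedexpansion

rem findstr /n prefixes each match with "N:"; for a line like
rem ".... Error found in row 3611. Exception" token 6 is then "3611."
for /f "tokens=6 delims= " %%a in ('Findstr /i /n "error found" error.txt') do (
    set num=%%a&set num=!num:.=!
    set /a num+=1
    echo !num!
    rem Number every line of the file, then split "N:text" at the first colon only
    for /f "tokens=1* delims=:" %%i in ('Findstr /n "^" File1.txt') do (
        if %%i GEQ !num! (
            REM Do the rest of your processing.
            echo %%j
        )
    )
)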
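
Because the token position depends on what precedes the number in the log line, a variant that does not count tokens may be more robust. This is an untested sketch: it assumes the log is PlayerSyncLog.txt and that the message always contains the literal text "Error found in row " followed by the number, as in the question's sample. The variable names line, tail and row are made up for illustration, and a log line containing double quotes could break it.

```batch
@echo off
setlocal
set "line="
rem Keep the last log line that contains the marker text
for /f "delims=" %%L in ('findstr /c:"Error found in row" PlayerSyncLog.txt') do set "line=%%L"
if not defined line (
    echo No "Error found in row" message in the log.
    goto :eof
)
rem Drop everything up to and including the marker, leaving "12345. Exception ..."
set "tail=%line:*Error found in row =%"
rem The first token, split on dots and spaces, is the row number
for /f "delims=. " %%N in ("%tail%") do set "row=%%N"
echo Row number: %row%
```

The resulting %row% could then be used as the skip count for the CSV copy, in place of the token-based value.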

5 Comments

Thanks, trying this now, will let you know how it goes.
The command Findstr /i /n "error found" file.txt finds lines with the word "found", not just "Error found". Is there a way to change that? Also, the num it gets is the line number the error message occurs on, not the 12345 number that appears in the message and represents the line in the CSV file.
I used Findstr /i /n /c:"Error found" and it worked to find the error message. Now to parse out the line number.
Knowing the format of the entire line containing the error text is needed, to parse the number. At least the part before the number.
example error line : The statement has been terminated.. Error found in row 18. Exception Stack Trace: at... My updated answer has a solution that so far seems to find the line number in the error message, from sachadee above.
0

I would suggest looking in a different direction than batch scripting to solve your problem.

It seems that your process is running from PlayerSync.exe where, according to your log file, you are intermittently running into primary key violations. Given that you're on a Windows machine, perhaps you're using some type of SSIS (Integration Services) or other ETL tool.

Depending on your bandwidth and knowledge of that executable, you should recompile the executable or ETL routine so that it redirects "bad" rows instead of failing the entire process. This cuts down on the custom code you need to maintain. For example, in SSIS you can configure INSERT statements to either "Ignore", "Redirect", or "Fail" on various events such as Error (key violations) or Truncation.

1 Comment

Yes, but my access to the machines running the script is very limited. I cannot even examine the database, so recompiling PlayerSync.exe or even seeing its code is not possible. And I know nothing about SSIS and ETL anyway.
