My current shell script is:

for j in *.jpg
do
  tesseract $j $j
done

where tesseract converts jpg files into text files. With this script, if there is a file HAHA.jpg, the output file name becomes HAHA.jpg.txt, but I want it to be just HAHA.txt.

Is there a way to make the output file name HAHA.txt instead of HAHA.jpg.txt?

3 Answers

2

If you have a shell variable j, you can strip a suffix matching a given pattern as follows:

${j%%.jpg}

Here %% indicates that the longest matching suffix should be removed, and .jpg is the pattern ("a dot, followed by three letters: j, p, and g"). Because this pattern contains no wildcards, ${j%.jpg} (shortest match) gives the same result.
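
Applied to the loop from the question, this could look like the sketch below. It assumes tesseract takes the output base name as its second argument and appends .txt itself, which matches the HAHA.jpg.txt behaviour described in the question. The sample and stripped variables are only there for illustration.

```shell
#!/bin/sh
# Try the expansion on a sample name first.
sample="HAHA.jpg"
stripped="${sample%%.jpg}"
echo "$stripped"   # prints HAHA

for j in *.jpg
do
  [ -e "$j" ] || continue       # no .jpg files: the glob stays literal, so skip it
  tesseract "$j" "${j%%.jpg}"   # tesseract adds .txt to this base name
done
```

Quoting "$j" keeps file names with spaces intact.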

1 Comment

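To strip any extension, you can use ${j%.*}. Note that ${j%%.*} removes everything from the first dot onward, so a name like a.b.jpg would be cut down to a.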
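
With a wildcard pattern, the difference between % and %% shows up as soon as a name has more than one dot. A quick sketch, using a made-up file name:

```shell
#!/bin/sh
name="scan.page1.jpg"   # hypothetical file name with two dots
short="${name%.*}"      # shortest suffix matching .* is removed
long="${name%%.*}"      # longest suffix matching .* is removed
echo "$short"           # prints scan.page1
echo "$long"            # prints scan
```

So % is usually the one you want when only the last extension should go.
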
0

Add an mv line after your tesseract command:

for j in *.jpg
do
  tesseract "$j" "$j"
  # tesseract wrote "$j.txt" (e.g. HAHA.jpg.txt); rename it to HAHA.txt
  mv "${j}.txt" "${j/jpg/txt}"
done

Tesseract writes its output to HAHA.jpg.txt, but the variable $j still contains HAHA.jpg, so ${j}.txt names the file that was just created.
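
For context: ${j/jpg/txt} is a bash pattern substitution that replaces the first occurrence of jpg in the value with txt. It is not part of POSIX sh, and it would also hit a "jpg" that appears earlier in the file name. A variant that only touches the trailing extension, sketched here with a made-up file name, could use suffix stripping for the mv target instead:

```shell
#!/bin/sh
j="jpg_scan.jpg"            # hypothetical name that contains "jpg" twice
produced="${j}.txt"         # the file that tesseract "$j" "$j" would write
wanted="${j%.jpg}.txt"      # strip the trailing .jpg, then add .txt
echo "$produced -> $wanted" # prints jpg_scan.jpg.txt -> jpg_scan.txt

for j in *.jpg
do
  [ -e "$j" ] || continue   # skip the literal pattern when nothing matches
  tesseract "$j" "$j"
  mv -- "${j}.txt" "${j%.jpg}.txt"
done
```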

1 Comment

Thank you! Could you tell me what the ${j/jpg/txt} part does? I am quite new to shell scripting.
0

Using basename:

for j in *.jpg
do
  tesseract "$j" "$(basename -s .jpg "$j")"
done
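
A note on portability: the -s option is available in GNU and BSD basename but is not in POSIX. The two-argument form, basename NAME SUFFIX, is the POSIX one. A small sketch of that form, with a made-up path to show that basename also drops the directory part:

```shell
#!/bin/sh
j="scans/HAHA.jpg"               # hypothetical path with a directory
base="$(basename "$j" .jpg)"     # directory and .jpg suffix are both removed
echo "$base"                     # prints HAHA

for j in *.jpg
do
  [ -e "$j" ] || continue        # skip the literal pattern when nothing matches
  tesseract "$j" "$(basename "$j" .jpg)"
done
```

Because the directory is stripped, the .txt file lands in the current directory, which is the same place as the image for a plain *.jpg glob.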
