I was tasked with running a script developed by someone else. It's quite simple, but it's a bash script and I have almost never touched Linux, so I'm not sure how to proceed. I was able to install WSL so I can run bash on Windows, but now I have to run a specific Python script from inside the bash script. The script goes like this:

#!/bin/bash
BASE_DIR=dir

find $BASE_DIR -type f | grep '\.pdf' | while read pdf_filename; do
  filebase=`echo $pdf_filename | cut -d '.' -f 1`
  txt_filename="$filebase.txt"
  echo "Processing $pdf_filename..."
  pdf2txt.py $pdf_filename > $txt_filename
  echo "Done!"

done

It should run the pdf2txt.py script, but I'm getting this error:

convert_all.sh: line 8: pdf2txt.py: command not found

So I'm not sure how to connect bash to my Python installation; I'm guessing bash isn't able to find it. Ideally, I would like to link it to this project's virtual environment. Any ideas on how to proceed?

Edit:

This is my current error, after trying what I described in my reply to @DV82XL:

/mnt/c/Users/jeco_/Desktop/Otros repositorios/sesgo_medios/Code/hello.py: line 1: $'\r': command not found
/mnt/c/Users/jeco_/Desktop/Otros repositorios/sesgo_medios/Code/hello.py: line 2: syntax error near unexpected token `"hello world"'
/mnt/c/Users/jeco_/Desktop/Otros repositorios/sesgo_medios/Code/hello.py: line 2: `print("hello world")'
  • Instead of just calling pdf2txt.py directly, you may need to use python pdf2txt.py .... Python scripts are not initially set up to be directly executable. Also, you'll need to make sure your PATH includes the location of pdf2txt.py. Commented May 28, 2020 at 23:44
  • I added it to the path and followed your advice and got convert_all.sh: line 8: python: command not found Commented May 28, 2020 at 23:52
  • @JuanC: I bet that the python executable is not in your PATH, or that pdf2txt.py has an incorrect shebang (#!) line. Did you check this? Commented May 29, 2020 at 6:42
  • @JuanC : Also, since you now get a different error message than before, you should update your question with this, or ask a new question, instead of just leaving it in a comment, since you have a slightly different problem now. Commented May 29, 2020 at 6:44

1 Answer

Can you convert the bash script to Python? That way you can easily run it on Windows or Linux without WSL.
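
A minimal sketch of what such a port might look like, assuming pdf2txt.py is reachable on PATH and prints the extracted text to standard output, as the bash script implies. The function names (find_pdfs, txt_path_for, convert_all) and the converter parameter are hypothetical, chosen only for illustration. Unlike the bash version, this matches only names that end in .pdf and replaces only the final extension, and paths containing spaces are passed as a single argument.

```python
import subprocess
from pathlib import Path


def find_pdfs(base_dir):
    """Recursively list regular files whose name ends in .pdf under base_dir."""
    return sorted(p for p in Path(base_dir).rglob("*.pdf") if p.is_file())


def txt_path_for(pdf_path):
    """Return the .txt path next to a PDF (report.pdf -> report.txt)."""
    return Path(pdf_path).with_suffix(".txt")


def convert_all(base_dir, converter=("pdf2txt.py",)):
    """Run the converter on every PDF and save its stdout as a .txt file.

    `converter` is the command prefix (an assumption of this sketch); the
    PDF path is appended as the last argument.
    """
    for pdf in find_pdfs(base_dir):
        print(f"Processing {pdf}...")
        with open(txt_path_for(pdf), "w") as out:
            # check=True raises CalledProcessError if the converter fails
            subprocess.run([*converter, str(pdf)], stdout=out, check=True)
        print("Done!")


# Example call, using the same placeholder directory as the bash script:
# convert_all("dir")
```

If the script cannot be launched by name (for example on Windows without WSL), one option is to pass the interpreter and the script location explicitly, e.g. converter=("python", "/path/to/pdf2txt.py"), where the path is a placeholder.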

If you must run the bash script in WSL, make sure Python is installed in WSL:

  • Type type -a python or type -a python3. This will give you the interpreter path.

If it doesn't show up, you will need to install Python on WSL:

sudo apt update && sudo apt upgrade
sudo apt install python3 python3-pip ipython3

Then do these things:

  1. Make sure the directory that contains the Python interpreter is in the PATH env var by typing echo $PATH. If it's not there, add it by typing export PATH="$PATH:/usr/bin" (PATH entries are directories, not the executable itself) or add that line to ~/.profile. On Linux, /usr/bin is usually included by default.
  2. Add the directory containing the script to the PATH env var if you want to run it from anywhere.
  3. Type python --version or python3 --version to check the version and confirm that the interpreter is found.
  4. Add a shebang with the interpreter path at the start of the python script:
    • #!/path/to/interpreter
    • Typically: #!/usr/bin/python3
    • For specific interpreter version: #!/usr/bin/python2.7
  5. Make the script executable: chmod +x pdf2txt.py

Now you should be able to run pdf2txt.py directly instead of python pdf2txt.py.
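
The steps above can be checked with a small diagnostic. This is a sketch, not a definitive tool: the helper names (check_script, is_on_path, to_unix_line_endings) are hypothetical. It also looks for Windows (CRLF) line endings, because the messages in the question's edit ($'\r': command not found, followed by a syntax error on a print line) are what bash typically prints when it interprets a file with Windows line endings itself instead of handing it to Python.

```python
import os
import shutil
from pathlib import Path


def check_script(path):
    """Return a list of likely reasons `path` cannot be run directly from bash."""
    problems = []
    data = Path(path).read_bytes()
    first_line = data.split(b"\n", 1)[0]
    if not first_line.startswith(b"#!"):
        problems.append("missing shebang")      # step 4
    if b"\r\n" in data:
        problems.append("CRLF line endings")    # file was saved with Windows line endings
    if not os.access(path, os.X_OK):
        problems.append("not executable")       # step 5 (chmod +x)
    return problems


def is_on_path(command):
    """True if `command` can be found through PATH (steps 1 and 2)."""
    return shutil.which(command) is not None


def to_unix_line_endings(path):
    """Rewrite the file in place with LF line endings."""
    p = Path(path)
    p.write_bytes(p.read_bytes().replace(b"\r\n", b"\n"))
```

For example, check_script("pdf2txt.py") returns an empty list when none of these three problems is detected, and is_on_path("pdf2txt.py") tells you whether bash would find the script by name.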

Hint: In WSL, you can access your Windows files at /mnt/c/Users/<user>/path/to/file if you need to.

If this doesn't work, please let us know which Linux distro/version you are running and what version of Python these scripts require.

7 Comments

Thanks a lot! I'll try your answer and report back
Do you know what this line does: filebase=`echo $pdf_filename | cut -d '.' -f 1`? It's the only one I can't decode; otherwise I would translate the script to Python.
echo $pdf_filename sends the PDF filename to the next stage of the pipeline, which calls cut. -d sets the delimiter to split on, in this case a period, and -f selects which field to keep, in this case the first one. So this cut command splits the filename at the periods and keeps only what comes before the first one, i.e. the path without its extension. In Python, Path(filename).stem from pathlib (in the standard library since Python 3.4; pathlib2 is a backport for older versions) gives the file name without its last extension; note that stem also drops the directory part.
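
To make the comparison concrete, here is a short sketch. cut_first_field is a hypothetical helper that mimics cut -d '.' -f 1, and the sample path is made up for illustration; the two pathlib calls are standard library.

```python
from pathlib import Path


def cut_first_field(filename):
    """Mimic `cut -d '.' -f 1`: keep everything before the first period."""
    return filename.split(".", 1)[0]


name = "docs/report.v2.pdf"  # made-up example path

print(cut_first_field(name))                   # docs/report
print(Path(name).stem)                         # report.v2
print(Path(name).with_suffix("").as_posix())   # docs/report.v2
```

For a name with a single period the cut version and with_suffix("") agree; they differ when the path contains more than one period, and stem additionally discards the directory.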
Hey @DV82XL, sorry for taking so long. I tried your solution with a simpler Python script called "hello.py", whose only content is print('hello world'). Still, I'm getting the error I added to my original question. Do you know what might be happening?
I made it work in the end! I'm not sure what did it, because I tried many things, but it was something along the lines of adding the shebang to the Python script and the shell script, exporting a couple of paths, and a couple of other things. Thank you very much, your answer was quite comprehensive and helped me a lot.