I'm trying to automate parsing XML files to MySQL using a simple shell script. The parser works fine: it takes ABC123.xml and outputs ABC123.sql.

I've tried to write a shell script that will grab the files in a directory, and pass the file name to the script. In the files, the prefix ABC is always present, but the numbers change, and are not sequential. So what I want is, for example, take file ABC*.xml and output ABC*.sql.

Here is what I have:

for file in $(ABC*.xml); do

temp="java -cp XMLParser-2.0-SNAPSHOT.jar:$MYSQL_CJ Engine.xmlparsers.Parser 
-url='jdbc:mysql://localhost/DBNAME?useUnicode=true&characterEncoding=UTF-8&user=USERNAME&password=PASSWORD' 
-file='/mnt/volume/SQL/$file.sql' /mnt/volume/XML/$file.xml"
eval $temp
done

The parser runs and I get an output file ABC*.sql, and nothing else.

I have also tried

for file in *; do

and this just processes all the files in the directory of the script.

Previously I ran a script that worked perfectly with files that had sequential suffixes, and the solution was:

for i in $( seq 1 500 ); do

temp="java -cp XMLParser-2.0-SNAPSHOT.jar:$MYSQL_CJ Engine.xmlparsers.Parser 
-url='jdbc:mysql://localhost/DBNAME?useUnicode=true&characterEncoding=UTF-8&user=USERNAME&password=PASSWORD' 
-file='/mnt/volume/SQL/ABC-$i.sql' /mnt/volume/XML/ABC-$i.xml"
eval $temp
done

Any ideas would be helpful. Thanks

UPDATE

Many thanks for the initial comments.

I tried $(ls ABC*.xml) as squeamishossifrage suggested, but that gives an error ls: cannot access ABC*.xml: No such file or directory.

I would love to upload directly to MySQL, but each XML file has several dozen elements, with many child elements, which are broken out into different tables.

The function of the variable, as user1934428 requested, is as follows:

Let's say I have three XML files:

ABC12345.xml
ABC98172.xml
ABC7211891.xml

The parser should read these files and parse the elements into SQL and output

ABC12345.sql
ABC98172.sql
ABC7211891.sql

What I want the shell script to do is feed every XML file in this directory to the parser, with $file.xml as the input file and $file.sql as the output file.

As I mentioned above, this works perfectly if I have files with sequential suffixes, for example

ABC-1.xml
ABC-2.xml
ABC-3.xml

using for i in $( seq 1 3 ); do outputs

ABC-1.sql
ABC-2.sql
ABC-3.sql

where ABC-$i.xml is the input and ABC-$i.sql is the output.

What I cannot figure out is how to make the equivalent of for i in $( seq 1 3 ); do when I do not know the file names.

SOLUTION

Many thanks for the comments from everyone, especially Charles and Alexei. The final solution I used is what is provided by Alexei below. I don't seem to have the capability to up-vote here, or I would have done that.

  • Perhaps you could change for file in $(ABC*.xml) to for file in $(ls ABC*.xml). But why not just import the XML files directly into MySQL? Commented Dec 10, 2015 at 11:04
  • @squeamishossifrage or just not open a new subshell for no reason... Commented Dec 10, 2015 at 11:09
  • Do you just want for file in ABC*.xml; do? Commented Dec 10, 2015 at 11:54
  • What is $(ABC*.xml) supposed to do? This expression would first expand to all the xml files starting with ABC, then picking up the first one in the expanded list, and try to execute it as a program. Perhaps you just wanted to have for file in ABC*.xml; do ??? Commented Dec 10, 2015 at 13:23
  • @squeamishossifrage, using ls at all here is broken; see mywiki.wooledge.org/ParsingLs -- and completely unnecessary, as ABC*.xml will evaluate to the correct list as a glob. Commented Dec 11, 2015 at 2:17
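
To make the last comment concrete, here is a small sketch comparing the two loop forms. The temporary directory and the file names (one of them deliberately containing a space) are made up for illustration; nothing here touches the real XML files.

```shell
# Sketch only: the demo directory and file names are hypothetical.
demo=$(mktemp -d)
touch "$demo/ABC 1.xml" "$demo/ABC2.xml"

# A glob in the for list yields exactly one word per matching file.
glob_count=0
for f in "$demo"/ABC*.xml; do
  glob_count=$((glob_count + 1))
done

# The output of ls is split on whitespace, so "ABC 1.xml" becomes two words.
ls_count=0
for f in $(ls "$demo"/ABC*.xml); do
  ls_count=$((ls_count + 1))
done

echo "glob: $glob_count items, ls: $ls_count items"
rm -rf "$demo"
```

With two files on disk, the glob loop runs twice while the ls loop runs three times, which is the kind of breakage the ParsingLs page describes.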

1 Answer

Synopsis:

  1. Use shopt -s nullglob so that an unmatched pattern expands to nothing instead of the literal string ABC*.xml. If the loop then never runs, the pattern matched no files, which suggests ABC is not the common leading prefix of the file names in the directory being searched. Ref

  2. Within the for loop, use base="${file%.*}" to strip the extension and keep the base name. Ref

  3. If you want single quotes to appear inside your eval, escape them: \'

  4. Don't use eval (if possible)

    • Execute the command directly. Variables used in arguments should be expanded inside double quotes ("$MYSQL_CJ", "$base.xml") to prevent word splitting and accidental globbing. Note that ~ is not expanded inside double quotes (e.g. "~/$base.xml" will not work); if you need it, leave the tilde outside the quotes (~/"$base.xml") or use "$HOME/$base.xml".
    • See Charles Duffy comments
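
Points 1 and 2 can be tried in isolation before involving the parser. The following is a minimal sketch for bash; sql_names_for is a hypothetical helper invented for this illustration, and the demo files live in a throwaway temporary directory. It only prints the .sql names that the loop would produce.

```shell
# Sketch only: sql_names_for is a hypothetical helper, not part of the parser.
# It prints the .sql name that corresponds to each ABC*.xml file in a directory.
sql_names_for() {
  local dir=$1 file base
  shopt -s nullglob               # unmatched pattern -> the loop runs zero times
  for file in "$dir"/ABC*.xml; do
    base=${file##*/}              # drop the directory part
    base=${base%.*}               # drop the .xml extension
    printf '%s.sql\n' "$base"
  done
  shopt -u nullglob               # restore the default behaviour
}

# Demo in a temporary directory with made-up file names.
demo_dir=$(mktemp -d)
touch "$demo_dir/ABC12345.xml" "$demo_dir/ABC98172.xml" "$demo_dir/other.xml"
sql_names_for "$demo_dir"
rm -rf "$demo_dir"
```

Calling the helper on an empty directory prints nothing, which is the nullglob behaviour described in point 1; without nullglob the loop would run once with the literal pattern as the file name.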

Place in a shell script or run from bash (replace /mnt/volume/XML and /mnt/volume/SQL with the absolute paths to your input and output directories):

shopt -s nullglob  # an unmatched pattern expands to nothing, not the literal ABC*.xml
for file in /mnt/volume/XML/ABC*.xml; do
  filename=$(basename "$file")   # strip the directory, e.g. ABC12345.xml
  filename="${filename%.*}"      # strip the extension, e.g. ABC12345
  java -cp "XMLParser-2.0-SNAPSHOT.jar:$MYSQL_CJ" Engine.xmlparsers.Parser \
    -url="jdbc:mysql://localhost/DBNAME?useUnicode=true&characterEncoding=UTF-8&user=USERNAME&password=PASSWORD" \
    -file="/mnt/volume/SQL/$filename.sql" "$file"
done
shopt -u nullglob  # disable nullglob again

Test:

touch Harrison_Wells.xml
shopt -s nullglob
for file in Harrison_*.xml; do
  filename="${file%.*}"
  echo "The 'Reverse-Flash' is $filename.are_we_right"
done
shopt -u nullglob
21 Comments

Why keep the OP's use of eval, and bugs (including security bugs) caused by same?
...to be more specific, and give an example of the severity of this concern: What happens if someone runs touch $'ABC$(rm -rf ..)\'$(rm -rf ..)\'.xml' before invoking the code you're showing here? And that's a relatively tame case -- there are tricks that can be used to generate / characters (ie. uuencode/uudecode), to refer to parent directories / the root / etc.
You're welcome to contribute to the answer. Eval needs to be used carefully. However, I think you'll need to provide better justification. First, the OP did not ask for advice on the use of eval. Second, you did not provide an alternative to his solution that would be meaningfully safer, which needs to include his use case. Third, the example you provide is unconvincing. Preventing users from deleting data is the use case for permissions. All imo, happy to learn something new.
Edited to state that eval may not be the best solution. Charles, if you have any comments, please add your contributions.
The example is convincing if the script is run by a user with more privileges than the one who creates the files in question (or, for an even more pathological case, with more privileges than the user providing data used to name the files in question; consider if files are uploaded via FTP or named per data submitted to a CGI script); should any of the above be true, this becomes a privilege escalation bug. In any event, I'm glad to provide an eval-free answer; frankly, I thought the construction of one to be so obvious that my own guidance wouldn't be needed beyond showing the necessity.
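
For readers who want to see the concern without risking any data, here is a harmless sketch of the same mechanism. The file name is a made-up string (no file is created), and the embedded command is only an echo; the point is the difference between building a string for eval and passing the name as a quoted argument.

```shell
# Sketch only: a hypothetical hostile file name; nothing is created on disk.
name='ABC$(echo INJECTED).xml'

# Building a command string and handing it to eval re-parses the name,
# so the embedded command substitution is executed.
cmd="printf '%s\n' $name"
eval_out=$(eval "$cmd")

# Passing the name as a quoted argument treats it as plain data.
direct_out=$(printf '%s\n' "$name")

echo "with eval:    $eval_out"
echo "without eval: $direct_out"
```

The eval version prints ABCINJECTED.xml because the text inside $( ) was run as a command, while the direct version prints the name unchanged. Replace the echo with something destructive and a more privileged user running the script, and you have the privilege escalation scenario described above.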