I have a list of possible Python import statements that I need to parse in JavaScript. I found the post "regex to parse import statements in python" and adapted its regex for JavaScript, but for some reason not all of the statements are parsed.

Here is the test:

const re = /^(?:from[ ]+(\S+)[ ]+)?import[ ]+(\S+)(?:[ ]+as[ ]+\S+)?[ ]*$/g;

const lines = ['import numpy as np',
'import pandas as pd',
'import pkg.mod1, pkg.mod2',
'from pkg.mod2 import Bar as Qux',
'from abc.lmn import pqr',
'from abc.lmn import pqr as xyz',
'import mod',
'from mod import s, foo',
'from mod import *',
'from pkg.mod3 import *',
'from mod import s as string, a as alist',
'import re, json'];

for (var i = 0; i < lines.length; i++){
const res = re.exec(lines[i]);
console.log(res);
}

Ideally, the code would extract the names of the packages that need to be loaded (not the modules), but it's okay if it at least works on all the examples.

Ideal expected result:

  • 'numpy',
  • 'pandas',
  • 'pkg',
  • 'pkg',
  • 'abc',
  • 'abc',
  • 'mod',
  • 'mod',
  • 'mod',
  • 'pkg',
  • 'mod',
  • ['re', 'json']
  • When regex is your hammer everything looks like a thumb. Maybe a simple parser instead? Commented Jun 18, 2021 at 18:42
  • @DaveNewton what do you mean by "simple parser"? Commented Jun 18, 2021 at 18:54
  • To be fully clear, please share the result expected from your examples. Commented Jun 18, 2021 at 19:22
  • I'm not sure how else to explain it. Commented Jun 18, 2021 at 20:04
  • @WiktorStribiżew You are right, I added this example into the list! Commented Jun 18, 2021 at 20:14

3 Answers

You can use this regex:

/^(?:from\s+(\w+)(?:\.\w+)?\s+)?import\s+([^\s,.]+)(?:\.\w+)?/

Code:

const lines = ['import numpy as np',
'import pandas as pd',
'import pkg.mod1, pkg.mod2',
'from pkg.mod2 import Bar as Qux',
'from abc.lmn import pqr',
'from abc.lmn import pqr as xyz',
'import mod',
'from mod import s, foo',
'from mod import *',
'from pkg.mod3 import *',
'from mod import s as string, a as alist',
'import re, json'];

const re = /^(?:from\s+(\w+)(?:\.\w+)?\s+)?import\s+([^\s,.]+)(?:\.\w+)?((\s*,\s*\w+)*$)?/;

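const results = [];
lines.forEach(el => {
  const m = el.match(re);
  // m[1] is the package after "from"; without "from", m[2] is the first
  // imported name and m[3] is any trailing ", name" list (e.g. ", json").
  if (m)
    results.push(m[1] === undefined ? m[2] + (m[3] === undefined ? "" : m[3]) : m[1]);
});

console.log(results);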

2 Comments

Thanks, really good job! I added one more example in your snippet, can you have a look?
OK, that makes it a bit more complicated. Try my updated answer.
You can try the following, but it doesn't match your last line, since you edited the question afterward (see the update at the end).

const re = /(import|from)\s+([^\s\.]+)/;

const lines = [
'import numpy as np',
'import pandas as pd',
'import pkg.mod1, pkg.mod2',
'from pkg.mod2 import Bar as Qux',
'from abc.lmn import pqr',
'from abc.lmn import pqr as xyz',
'import mod',
'from mod import s, foo',
'from mod import *',
'from pkg.mod3 import *',
'from mod import s as string, a as alist'
];

for (var i = 0; i < lines.length; i++){
    // console.log(lines[i]);
    const res = re.exec(lines[i]);
    console.log(res[2]);
}

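This is easier to explain than yours, which did not work.

(import|from) : matches import or from

\s+ : one or more whitespace characters

[^\s\.]+ : one or more characters that are neither whitespace nor a dot

Also beware of using the /g flag in a loop: with /g, exec resumes from the regex's lastIndex, so a regex reused across different strings can miss matches. See:

Why does a RegExp with global flag give wrong results?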
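
To illustrate the /g pitfall, here is a minimal sketch. The pattern is a simplified one chosen for this example (not the one from the question), and the variable names are hypothetical. A global regex stores the end of its last match in lastIndex and starts the next exec from there, even on a different string.

```javascript
// A regex with the g flag remembers where its last match ended.
const globalRe = /^import\s+(\w+)/g;

const first = globalRe.exec('import numpy as np');
// lastIndex now points just past "import numpy".
const indexAfterFirst = globalRe.lastIndex;

// The search on the next string starts at that stale lastIndex,
// so the ^ anchor can no longer match and exec returns null.
const second = globalRe.exec('import pandas as pd');

// Without the g flag every call starts from index 0.
const plainRe = /^import\s+(\w+)/;
const third = plainRe.exec('import numpy as np');
const fourth = plainRe.exec('import pandas as pd');

console.log(first[1], indexAfterFirst, second, third[1], fourth[1]);
```

Dropping the g flag, or resetting lastIndex to 0 before each exec call, avoids the skipped match.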

Update to match your last line

const re = /(import|from)\s+([^\.]+?[^,])(\s|\.|$)/;

This is just the regex; I did not put the last result in an array, since you should know how to do that and you already have the other answer.
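
For completeness, one possible way to turn the captured group into an array is sketched below. The helper name topLevelNames is hypothetical and not part of the answer; it simply splits the second capture group of the updated regex on commas.

```javascript
// Updated regex from above, without the g flag so it is safe to reuse.
const importRe = /(import|from)\s+([^\.]+?[^,])(\s|\.|$)/;

// Hypothetical helper: returns the captured name(s) as an array,
// or an empty array when the line does not match.
function topLevelNames(line) {
  const m = importRe.exec(line);
  if (!m) return [];
  // m[2] is e.g. "numpy", "pkg" or "re, json"; split comma-separated lists.
  return m[2].split(/\s*,\s*/);
}

console.log(topLevelNames('import re, json'));
console.log(topLevelNames('from pkg.mod3 import *'));
```

Note that this pattern needs at least two characters after the keyword, so a line such as 'import a' would not match and would yield an empty array here.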

Would this be enough? /(?:import|from)\s+(\w+)/

2 Comments

I didn't try but I believe it wouldn't be enough.
Ok, I tried and it failed in half of the cases.
