
I need to get a list of all %-style placeholders in a format string, together with their conversion types.

For example, "There're %(num_items)d items in the %(container)s" should yield (('num_items', 'd'), ('container', 's')).

What I tried:

1) I tried looking into the source code and found that the

PyObject *
PyString_Format(PyObject *format, PyObject *args)

function does % interpolation at the C level.

2) I also searched PyPI and found a parse lib that, like string.Formatter.parse, handles {}-style strings, which is not what I need.

Warning: a quick regexp is unlikely to cover the full syntax of %-substitution, and full coverage is what I need.

Similar question: How can I find all placeholders for str.format in a python string using a regex?

Update

It seems to be solvable pretty well with a reasonably complex regexp, so it will make a nice homework task.

I'll accept this as an answer in two days and I don't anticipate any new answers to the question.

Update2

Is the question so localized that it will never be useful to anyone else (except maybe those taking the same class)? If so, vote to close.

(from Please clarify the policy on homework questions)

  • Yes, that's a very nice behavior to vote to close without leaving a comment Commented Aug 31, 2015 at 18:55
  • It appears you're asking for a library, that should explain the close vote Commented Aug 31, 2015 at 18:58
  • @Tim Castelijns Yes, probably. I've carefully reworded the question to avoid such allegations. Commented Aug 31, 2015 at 19:00
  • I would phrase it like this: "A regex is unlikely to cover all syntax of %-substitution, so I'm looking for another solution", removing anything that might look like you're asking for a library Commented Aug 31, 2015 at 19:03
  • @Tim Castelijns Thanks, fixed Commented Aug 31, 2015 at 19:10

2 Answers

import re

s = "There're %(num_items)d items in the %(container)s"
# Extracts only the names inside %(...); conversion types are not captured.
print(re.findall(r'%\((.*?)\)', s))

6 Comments

Thanks, but this regexp does not cover all syntax of %-substitution
What would it miss aside from something like %s, and what would you want to extract in that case?
@chepner Please, have a look at python help on '% interpolation' for all kinds of tokens potentially present in the interpolated string. I need to extract (name, type) tuples.
Edit your question. It only indicates you want the names following the %.
@chepner thanks, edited. And yes, it would miss %s. And would find aaa in %%(aaa)
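
The two gaps named in the comment above can be reproduced with a short check. This is only an illustration of the pattern from this answer; the name NAIVE and the sample strings are made up for the demonstration.

```python
import re

# The pattern from the answer above, compiled under a hypothetical name.
NAIVE = re.compile(r'%\((.*?)\)')

# A positional placeholder has no parenthesized key, so nothing is found.
missed = NAIVE.findall("%s items")

# "%%" is an escaped percent sign, so "(aaa)" is literal text here,
# yet the pattern still reports "aaa" as a placeholder name.
false_positive = NAIVE.findall("100%%(aaa)")

print(missed)          # []
print(false_positive)  # ['aaa']
```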

I ended up with this regexp:

re.findall(r'%\(([^)]+)\)[0-9]*(?:\.[0-9]*)?([diouxXeEfFgGcrs%])', a)

as a sensible approximation to the problem (matching 5 tokens out of 7).
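
For comparison, here is a sketch of a pattern that follows the documented layout of a conversion specifier more closely: optional mapping key, flags, width, precision, length modifier, then the conversion type. It is an approximation, not a replacement for Python's own parser. The names PLACEHOLDER and find_placeholders are hypothetical, and the 'a' conversion is assumed to be wanted (it exists only in Python 3).

```python
import re

# Hypothetical helper pattern, assembled piece by piece.
PLACEHOLDER = re.compile(
    r"%"
    r"(?:\((?P<name>[^)]*)\))?"       # optional mapping key, e.g. (num_items)
    r"[#0\- +]*"                      # conversion flags
    r"(?:\*|\d+)?"                    # minimum field width
    r"(?:\.(?:\*|\d*))?"              # precision
    r"[hlL]?"                         # length modifier (accepted, ignored by Python)
    r"(?P<type>[diouxXeEfFgGcrsa%])"  # conversion type; 'a' is Python 3 only
)

def find_placeholders(fmt):
    """Return (name, type) tuples; name is None for positional placeholders."""
    found = []
    for match in PLACEHOLDER.finditer(fmt):
        if match.group("type") == "%":
            continue  # "%%" is a literal percent sign, not a placeholder
        found.append((match.group("name"), match.group("type")))
    return found

print(find_placeholders("There're %(num_items)d items in the %(container)s"))
# [('num_items', 'd'), ('container', 's')]
```

Because "%%" is consumed as a single match and then skipped, a string such as "%%(aaa)s" yields no placeholders. Malformed specifiers that Python itself would reject are simply not matched, so this sketch does not validate the format string.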

3 Comments

What's the extra stuff beyond %\(([^)]+)\) ?
@sln That's to match %(x)12.3f. But I don't want to match %(x)12.3f only. I want to match any kind of stuff capable of being interpolated in a string. Plus I've updated the question a little bit: I found out that type information is also useful for me.
Oh, sort of like printf
