Inverse of string format in python

Question

In Python, we can use str.format to construct a string like this:

string_format + value_of_keys = formatted_string

Eg:

FMT = '{name:} {age:} {gender}'                   # string_format
VoK = {'name':'Alice', 'age':10, 'gender':'F'}    # value_of_keys
FoS = FMT.format(**VoK)                           # formatted_string

In this case, formatted_string = 'Alice 10 F'

I am wondering whether there is a way to get value_of_keys back from formatted_string and string_format. It would be a function Fun such that

VoK = Fun('{name:} {age:} {gender}', 'Alice 10 F')
# the value of VoK is expected to be {'name':'Alice', 'age':10, 'gender':'F'}

Is there any way to get this function Fun?

ADDED :

I would like to add that '{name:} {age:} {gender}' and 'Alice 10 F' are just the simplest example. A realistic situation could be more difficult; for instance, the space delimiter may not exist.

And mathematically speaking, most of the cases are not reversible, such as:

FMT = '{key1:}{key2:}'
FoS = 'HelloWorld'

The VoK could be any of the following:

{'key1':'Hello','key2':'World'}
{'key1':'Hell','key2':'oWorld'}
....

So to make this question well defined, I would like to add two conditions:

1. There is always a delimiter between any two keys.
2. No delimiter appears inside any value in value_of_keys.

In this case, this question is solvable (Mathematically speaking) :)

Another example shown with input and expected output:

In '{k1:}+{k2:}={k3:}', '1+1=2'    Out {'k1':1,'k2':1, 'k3':2}
In 'Hi, {k1:}, this is {k2:}', 'Hi, Alice, this is Bob' Out {'k1':'Alice', 'k2':'Bob'}

Only if there is a consistent delimiter (like the space) and certain order (where the keys are ordered exactly as the values are) in the formatted string can you reliably extract something useful. Are those conditions present?
– JacobIRR, Commented Jan 31, 2018 at 7:03

Sweeper · Accepted Answer · 2018-01-31 07:16:24Z

3

You can indeed do this, but with a slightly different format string, called regular expressions.

Here is how you do it:

import re
# this is how you write your "format"
regex = r"(?P<name>\w+) (?P<age>\d+) (?P<gender>[MF])"
test_str = "Alice 10 F"
groups = re.match(regex, test_str)

Now you can use groups to access all the components of the string:

>>> groups.group('name')
'Alice'
>>> groups.group('age')
'10'
>>> groups.group('gender')
'F'

Regex is a very cool thing. I suggest you learn more about it online.
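
To get the dictionary the question asks for, rather than one group at a time, the match object's groupdict() method can be used. This is a minimal sketch that reuses the regex above; the int conversion is an added assumption, since captured groups are always strings.

```python
import re

regex = r"(?P<name>\w+) (?P<age>\d+) (?P<gender>[MF])"
match = re.match(regex, "Alice 10 F")

# groupdict() maps every named group to the text it captured
values = match.groupdict()
# captured text is always str, so any type conversion is done by hand
values["age"] = int(values["age"])
print(values)  # {'name': 'Alice', 'age': 10, 'gender': 'F'}
```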

answered Jan 31, 2018 at 7:16

Sweeper


3 Comments

spring cc Over a year ago

I edited the question to make it well defined. In your case the solution is not as general as I expected: we don't want to assume the delimiter is a space and write a specific regex by hand, right?

Sweeper Over a year ago

@springcc If you know the delimiter, just replace the spaces with that delimiter. You can use a more general regex like (?P<name>\S+) (?P<age>\S+) (?P<gender>\S+) if you don't want to write a new regex each time, but this obviously captures a lot more stuff.

spring cc Over a year ago

Yep, but the latter case is indeed my original purpose...To find a general solution about the inverse problem of string format...

spring cc · Accepted Answer · 2018-02-01 09:10:31Z

2

I wrote a function and it seems to work:

import re

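def Fun(fmt,res):

    # a replacement field looks like {key} or {key:spec}
    reg_keys = '{([^{}:]+)[^{}]*}'   # captures the key name
    reg_fmts = '{[^{}:]+[^{}]*}'     # matches the whole field
    pat_keys = re.compile(reg_keys)
    pat_fmts = re.compile(reg_fmts)

    keys = pat_keys.findall(fmt)     # field names, in order
    lmts = pat_fmts.split(fmt)       # literal text (delimiters) between fields
    temp = res
    values = []
    for lmt in lmts:
        if not len(lmt)==0:
            # the text before the next delimiter is the next value
            value,temp = temp.split(lmt,1)
            if len(value)>0:
                values.append(value)
    # whatever is left after the last delimiter is the last value
    if len(temp)>0:
        values.append(temp)
    return dict(zip(keys,values))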

Usage:

eg1:

fmt = '{k1:}+{k2:}={k:3}'
res = '1+1=2'
print Fun(fmt,res)
>>>{'k2': '1', 'k1': '1', 'k': '2'}

eg2:

fmt = '{name:} {age:} {gender}'
res = 'Alice 10 F'
print Fun(fmt,res)
>>>{'gender': 'F', 'age': '10', 'name': 'Alice'}

eg3:

fmt = 'Hi, {k1:}, this is {k2:}'
res = 'Hi, Alice, this is Bob'
print Fun(fmt,res)
>>>{'k2': 'Bob', 'k1': 'Alice'}
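
A related sketch, not part of the answer above: the standard library's string.Formatter().parse can split the format string into literal text and field names, and the pieces can then be assembled into a regex. The name inverse_format is hypothetical. It assumes Python 3.4+ (for re.fullmatch), named fields that are valid identifiers, and the question's two conditions (a delimiter between any two fields, and no delimiter inside a value). All values come back as strings.

```python
import re
from string import Formatter

def inverse_format(fmt, text):
    """Recover field values from text that fmt.format(...) could have produced."""
    pattern = []
    for literal, field, _spec, _conversion in Formatter().parse(fmt):
        # literal text between fields must match exactly
        pattern.append(re.escape(literal))
        if field is not None:
            # each named field becomes a lazy named group
            pattern.append('(?P<{}>.+?)'.format(field))
    match = re.fullmatch(''.join(pattern), text, flags=re.DOTALL)
    return match.groupdict() if match else None

print(inverse_format('{k1:}+{k2:}={k3:}', '1+1=2'))
# {'k1': '1', 'k2': '1', 'k3': '2'}
print(inverse_format('Hi, {k1:}, this is {k2:}', 'Hi, Alice, this is Bob'))
# {'k1': 'Alice', 'k2': 'Bob'}
```

The function returns None when the text does not fit the format, so callers can tell a failed match from an empty result.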

answered Feb 1, 2018 at 9:10

spring cc


2 Comments

Brett7533 Over a year ago

You need to change reg_keys = '{([^{}:]+)[^{}]*}' to reg_keys = r'{([^{}:]+)[^{}]*}'; otherwise it will fail if '{{' is in fmt.

spring cc Over a year ago

@Brett7533 Hi, I just tested in Python 2.7 and it works without the 'r'. However, it also works with the 'r'.

Moinuddin Quadri · Accepted Answer · 2018-01-31 07:15:44Z

1

There is no way for python to determine how you created the formatted string once you get the new string.

For example, once you format "{something} {otherthing}" with values that contain spaces and get the resulting string, you cannot tell whether a given word was part of {something} or {otherthing}.
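
A quick check of this ambiguity, as a minimal sketch (the sample values are made up for illustration): two different sets of values produce exactly the same string, so no function can tell them apart afterwards.

```python
fmt = '{something} {otherthing}'

# two different inputs whose values contain spaces
a = fmt.format(something='New York', otherthing='City')
b = fmt.format(something='New', otherthing='York City')

print(a)       # New York City
print(a == b)  # True
```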

However you may use some hacks if you know about the format of the new string and there is consistency in the result.

For example, in your given example: if you are sure that you'll have word followed by space, then a number, then again a space and then a word, then you may use below regex to extract the values:

>>> import re
>>> my_str = 'Alice 10 F'

>>> re.findall(r'(\w+)\s(\d+)\s(\w+)', my_str)
[('Alice', '10', 'F')]

In order to get the desired dict from this, you may update the logic as:

>>> my_keys = ['name', 'age', 'gender']

>>> dict(zip(my_keys, re.findall(r'(\w+)\s(\d+)\s(\w+)', my_str)[0]))
{'gender': 'F', 'age': '10', 'name': 'Alice'}

edited Jan 31, 2018 at 7:15

answered Jan 31, 2018 at 7:10

Moinuddin Quadri


4 Comments

spring cc Over a year ago

Hi, I edited the question. A space delimiter cannot always be assumed, so this solution may not be as general as I expected.

Moinuddin Quadri Over a year ago

@springcc the simple answer to your edit is that it is not possible unless you know of some way to differentiate both words.

spring cc Over a year ago

Hi, with two conditions (1. there is always a delimiter between any two keys, and 2. no delimiter appears inside any value in value_of_keys) this question is solvable (mathematically speaking, it's easy to prove).

Moinuddin Quadri Over a year ago

If there are delimiters, you can use the delimiters to split the string. However for your example of '{key1:}{key2:}' - > 'HelloWorld'. There is no way to get back "Hello" and "World"

IMCoins · Accepted Answer · 2018-01-31 07:24:51Z

1

I suggest another approach to this problem using **kwargs, such as...

def fun(**kwargs):
    # render each keyword argument as "key:value", separated by spaces
    pairs = ['{}:{}'.format(key, value) for key, value in kwargs.items()]
    return '{' + ' '.join(pairs) + '}'


print(fun(name='Alice', age='10', gender='F'))
# outputs (Python 3.6+): {name:Alice age:10 gender:F}
# older versions may order the pairs differently, e.g. {gender:F age:10 name:Alice}

NOTE: before Python 3.6, kwargs is a plain dict and does not preserve the order of the keyword arguments; from Python 3.6 onward it keeps the order in which they were passed. If order is something you wish to keep on older versions, it is easy to build a work-around.

edited Jan 31, 2018 at 7:24

answered Jan 31, 2018 at 7:19

IMCoins


Mark Ransom · Accepted Answer · 2018-01-31 07:40:40Z

1

This code produces strings for all the values, but it does split the string into its constituent components. It depends on the delimiter being a space, and none of the values containing a space. If any of the values contains a space this becomes a much harder problem.

>>> delimiters = ' '
>>> d = {k: v for k,v in zip(('name', 'age', 'gender'), 'Alice 10 F'.split(delimiters))}
>>> d
{'name': 'Alice', 'age': '10', 'gender': 'F'}
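
When the delimiters are several different single characters, the same idea can be sketched with re.split and a character class. This is an illustration under the same assumption that no value contains a delimiter; the keys and the '+=' delimiter set are taken from the '1+1=2' example in the question.

```python
import re

keys = ('k1', 'k2', 'k3')
delimiters = '+='

# build a character class such as [\+=] that matches any one delimiter
pattern = '[' + re.escape(delimiters) + ']'
d = dict(zip(keys, re.split(pattern, '1+1=2')))
print(d)  # {'k1': '1', 'k2': '1', 'k3': '2'}
```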

edited Jan 31, 2018 at 7:40

answered Jan 31, 2018 at 7:11

Mark Ransom


5 Comments

spring cc Over a year ago

Hi, I think we don't need to assume that there is a space delimiter. I edited the question to make it well defined.

Mark Ransom Over a year ago

@springcc and I just made an edit to allow any arbitrary character as a delimiter.

spring cc Over a year ago

For this case: FMT = '{k1:}+{k2:}={k3}' and FoS = '1+1=2', your solution does not work :(

Mark Ransom Over a year ago

@springcc if you're going to move the goalposts, nobody can answer your question.

spring cc Over a year ago

Just to make it clear, this is indeed my original purpose. What I expected in the very beginning is the general solution for the inverse problem of string format.

Brett7533 · Accepted Answer · 2018-01-31 10:27:17Z

For your requirement, I have a solution. The idea is:

1. Change all delimiters to the same delimiter.
2. Split the input strings by that delimiter.
3. Get the keys.
4. Get the values.
5. Zip the keys and values into a dict.

import re
from collections import OrderedDict

def Func(data, delimiters, delimiter):
    # change all delimiters to delimiter
    for d in delimiters:
        data[0] = data[0].replace(d, delimiter)
        data[1] = data[1].replace(d, delimiter)

    # get keys with '{}'
    keys = data[0].split(delimiter)
    # if string starts with delimiter remove first empty element
    if keys[0] == '':
        keys = keys[1:]

    # get keys without '{}'
    p = re.compile(r'{([\w\d_]+):*.*}')
    keys = [p.match(x).group(1) for x in keys]

    # get values
    vals = data[1].split(delimiter)
    # if string starts with delimiter remove first empty element
    if vals[0] == '':
        vals = vals[1:]

    # pack to a dict
    result_1 = dict(zip(keys, vals))

    # if you need Ordered Dict
    result_2 = OrderedDict(zip(keys, vals))

    return result_1, result_2

The usage:

In_1 = ['{k1}+{k2:}={k3:}', '1+2=3']
delimiters_1 = ['+', '=']
result = Func(In_1, delimiters_1, delimiters_1[0])
# Out_1 = {'k1':1,'k2':2, 'k3':3}
print(result)


In_2 = ['Hi, {k1:}, this is {k2:}', 'Hi, Alice, this is Bob']
delimiters_2 = ['Hi, ', ', this is ']
result = Func(In_2, delimiters_2, delimiters_2[0])
# Out_2 = {'k1':'Alice', 'k2':'Bob'}
print(result)

The output:

({'k3': '3', 'k2': '2', 'k1': '1'}, 
OrderedDict([('k1', '1'), ('k2', '2'), ('k3', '3')]))

({'k2': 'Bob', 'k1': 'Alice'}, 
OrderedDict([('k1', 'Alice'), ('k2', 'Bob')]))

Hi, I posted an answer; it doesn't require writing the delimiters by hand.

Vikas Periyadath · Accepted Answer · 2018-01-31 07:11:43Z

0

Try this:

import re


def fun():
   k = 'Alice 10 F'
   c = '{name:} {age:} {gender}'
   l = re.sub('[:}{]', '', c)
   d={}
   for i,j in zip(k.split(), l.split()):
       d[j]=i
   print(d)

You can turn the hard-coded strings into parameters of fun as you wish. It takes the same strings you want to pass in and gives a dict like this:

{'name': 'Alice', 'age': '10', 'gender': 'F'}
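
A parameterized variant of the same idea might look like the sketch below. The signature fun(fmt, text) is an assumption, not the answer's own code, and it still relies on whitespace being the only delimiter in both strings.

```python
import re

def fun(fmt, text):
    # strip the braces and colons so only the field names remain
    names = re.sub('[:}{]', '', fmt).split()
    # pair each name with the whitespace-separated value at the same position
    return dict(zip(names, text.split()))

print(fun('{name:} {age:} {gender}', 'Alice 10 F'))
# {'name': 'Alice', 'age': '10', 'gender': 'F'}
```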

answered Jan 31, 2018 at 7:11

Vikas Periyadath


Hacker · Accepted Answer · 2022-01-11 20:11:44Z

0

I think the only right answer is that what you are searching for isn't really possible in general. You just don't have enough information. A good example is:

#python 3
a="12"
b="34"
c="56"
string=f"{a}{b}{c}"
dic = fun("{a}{b}{c}",string)

Now dic might be {"a":"12","b":"34","c":"56"} but it might as well just be {"a":"1","b":"2","c":"3456"}. So any universal reversed format function would ultimately fail because of this ambiguity. You could obviously force a delimiter between each variable, but that would defeat the purpose of the function.
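
To see how large the ambiguity is, here is a small sketch that enumerates every way to cut the string into three pieces. The helper name candidate_splits is hypothetical, and it assumes every value is non-empty; allowing empty values would add even more candidates.

```python
from itertools import combinations

def candidate_splits(text, n_fields):
    """Yield every way to cut text into n_fields non-empty pieces."""
    for cuts in combinations(range(1, len(text)), n_fields - 1):
        bounds = (0,) + cuts + (len(text),)
        yield [text[i:j] for i, j in zip(bounds, bounds[1:])]

splits = list(candidate_splits("123456", 3))
print(len(splits))                   # 10
print(['12', '34', '56'] in splits)  # True
print(['1', '2', '3456'] in splits)  # True
```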

I know this was already stated in the comments, but it should also be added as an answer for future visitors.

answered Jan 11, 2022 at 20:11

Hacker
