Each row in my dataframe has a string containing some URL query parameters, e.g. flt=promotionflag%3A1%3Borganicfilter%3AOrganic&sortBy=MOST_POPULAR. The flt section could contain multiple parameters as in this example.
I want to parse this string into multiple columns like:
| flt_promotionflag | flt_organicfilter | sortBy |
|---|---|---|
| 1 | Organic | MOST_POPULAR |
There could be lots of different filters, so I don't want to hardcode them as column names. If a column with that filter name already exists, I want to put the value in the existing column; if there isn't one, I want to create it.
I've written some code that creates a dictionary in the structure I want in a new column but I think that's probably an unnecessary step.
    def createDict(string):
        try:
            # Split "a=1&b=2" into {"a": "1", "b": "2"}
            d = dict(x.split("=", 1) for x in string.strip("&").split("&"))
            if 'flt' in d:
                # flt holds "name%3Avalue" pairs joined by "%3B" (encoded ":" and ";")
                d['flt'] = dict(x.split("%3A", 1) for x in d['flt'].split("%3B"))
            return d
        except (AttributeError, ValueError):
            # Non-string input or a malformed pair: leave the cell empty
            return None

    df['Parsed params'] = df['URL Query Parameters'].apply(createDict)
How do I get the data I want in the right columns?
urllib.parse.parse_qs()
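To make that suggestion concrete, here is a minimal sketch of how `urllib.parse.parse_qs` could replace the manual splitting. It assumes the `flt` value always decodes to `name:value` pairs separated by `;`, as in the example string. The helper name `parse_params` and the `flt_` column prefix are illustrative choices, not part of any library.

```python
from urllib.parse import parse_qs


def parse_params(query):
    """Flatten one query string into a {column_name: value} dict."""
    if not isinstance(query, str):
        return {}  # missing values (None/NaN) give an empty row
    flat = {}
    # parse_qs splits on "&" and "=" and percent-decodes the values,
    # so "%3A" arrives as ":" and "%3B" as ";".
    for key, values in parse_qs(query).items():
        value = values[-1]  # assumption: keep the last value if a key repeats
        if key == "flt":
            # Assumed layout: "name:value" pairs separated by ";"
            for pair in value.split(";"):
                if not pair:
                    continue  # skip empty segments such as a trailing ";"
                name, _, filter_value = pair.partition(":")
                flat[f"flt_{name}"] = filter_value
        else:
            flat[key] = value
    return flat


row = parse_params(
    "flt=promotionflag%3A1%3Borganicfilter%3AOrganic&sortBy=MOST_POPULAR"
)
print(row)
```

Because each row becomes a flat dict, the intermediate dictionary column is probably not needed. With pandas, `pd.DataFrame(df['URL Query Parameters'].map(parse_params).tolist(), index=df.index)` should turn the dicts into one column per key, leaving NaN where a row lacks a filter, and `df.join(...)` can attach the result to the original frame. New filter names then become new columns without being hardcoded.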