WHAT I HAVE:
import pandas as pd
inp = [{'long string':'ha: (tra: 1 la: 2) \n hi: (tra: 1 la: 2) \n ho: (tra: 1 la: 2)'},
{'long string':'hi: (tra: 1 la: 2) \n ha: (tra: 1 la: 2) \n ho: (tra: 1 la: 2)'},
{'long string':'ho: (tra: 1 la: 2) \n hi: (tra: 1 la: 2) \n ha: (tra: 1 la: 2)'}]
df = pd.DataFrame(inp)
df
GIVES
                                         long string
0  ha: (tra: 1 la: 2) \n hi: (tra: 1 la: 2) \n ho...
1  hi: (tra: 1 la: 2) \n ha: (tra: 1 la: 2) \n ho...
2  ho: (tra: 1 la: 2) \n hi: (tra: 1 la: 2) \n ha...
WHAT I WANT
inp = {'ha-tra':['1', '1', '1'], 'ha-la':['2', '2', '2'], 'hi-tra':['1', '1', '1'], 'hi-la':['2', '2', '2'],'ho-tra':['1', '1', '1'], 'ho-la':['2', '2', '2']}
df = pd.DataFrame(inp)
df
GIVES
  ha-tra ha-la hi-tra hi-la ho-tra ho-la
0      1     2      1     2      1     2
1      1     2      1     2      1     2
2      1     2      1     2      1     2
CONTEXT
Each row holds one long string. From it, I want to extract the score for every combination of an outer key (ha, hi, ho) and an inner key (tra, la), with one column per combination. The problem is that the outer keys (ha, hi, ho) do not appear in the same order in every row.
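One possible approach, as a rough sketch: parse each string into a dict keyed by "outer-inner", so the order inside the string no longer matters, then build the new frame from those dicts. This assumes every block looks like `name: (key: value key: value)` with no nested parentheses and no spaces inside the values. The names `GROUP`, `PAIR`, `parse_row`, `wanted` and `result` are made up for this example.

```python
import re
import pandas as pd

inp = [{'long string': 'ha: (tra: 1 la: 2) \n hi: (tra: 1 la: 2) \n ho: (tra: 1 la: 2)'},
       {'long string': 'hi: (tra: 1 la: 2) \n ha: (tra: 1 la: 2) \n ho: (tra: 1 la: 2)'},
       {'long string': 'ho: (tra: 1 la: 2) \n hi: (tra: 1 la: 2) \n ha: (tra: 1 la: 2)'}]
df = pd.DataFrame(inp)

# Assumed format: an outer name, a colon, then a parenthesised list of "key: value" pairs.
GROUP = re.compile(r'(\w+):\s*\(([^)]*)\)')
PAIR = re.compile(r'(\w+):\s*(\S+)')

def parse_row(text):
    """Return {'ha-tra': '1', 'ha-la': '2', ...} for one long string."""
    out = {}
    for outer, inner in GROUP.findall(text):
        for key, value in PAIR.findall(inner):
            out[f'{outer}-{key}'] = value
    return out

# Fix the column order explicitly; a combination missing from a row becomes NaN.
wanted = ['ha-tra', 'ha-la', 'hi-tra', 'hi-la', 'ho-tra', 'ho-la']
result = pd.DataFrame([parse_row(s) for s in df['long string']], index=df.index)
result = result.reindex(columns=wanted)
print(result)
```

The values stay strings, as in the desired output above; `result.astype(int)` would convert them if every cell is filled with a whole number.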