Looks like you want a multi-level dataframe from the string. Here's how I would do it.
- Step 1: Split the string by
'\r\n'. Then for each value, split by
','
- Step 2: The above step will create a list of list. Element #0 has 4
items and element #1 has 2 items. The rest have 3 items each and is
the actual data
- Step 3: Convert the data into a dictionary from element #3 onwards.
Use values in element #2 as keys for the dictionary (namely
x data
and y data). To ensure you have key:[list of values], use the
dict.setdefault(key,[]).append(value). This will ensure the data
is created as a `key:[list of values]' dictionary.
- Step 4: Create a normal dataframe using the dictionary as all the
values are stored as key and values in the dictionary.
- Step 5: Now that you have the dictionary, you want to create the
MultiIndex. Convert the column to MultiIndex.
Putting all this together, the code is:
import pandas as pd
text = ',file_05,,\r\nx data,y data\r\n-970.0,-34.12164,\r\n-959.0,-32.37526,\r\n-949.0,-30.360199,\r\n-938.0,-28.74816,\r\n-929.0,-27.53912,\r\n-920.0,-25.92707,\r\n-911.0,-24.31503,\r\n-900.0,-23.64334,\r\n-891.0,-22.29997,'
line_text = [txt.split(',') for txt in text.split('\r\n')]
dct = {}
for x,y,z in line_text[2:]:
dct.setdefault(line_text[1][0], []).append(x)
dct.setdefault(line_text[1][1], []).append(y)
df = pd.DataFrame(dct)
df.columns = pd.MultiIndex.from_tuples([(line_text[0][i],line_text[1][i]) for i in [0,1]])
print (df)
Output of this will be:
file_05
x data y data
0 -970.0 -34.12164
1 -959.0 -32.37526
2 -949.0 -30.360199
3 -938.0 -28.74816
4 -929.0 -27.53912
5 -920.0 -25.92707
6 -911.0 -24.31503
7 -900.0 -23.64334
8 -891.0 -22.29997