Say I have a nested list with an unequal number of elements in the second layer, like a=[[1,2,3],[4,5],[6,7,8,9]]. I also have a corresponding list of day labels, like b=[['Mon','Tues','Wed'],['Mon','Wed'],['Mon','Tues','Wed','Thur']]. I would like to convert a and b to pandas dataframes since it is able to take in unequal rows, then combine a and b into one dataframe and merge on the three date columns to find the common dates and their corresponding values. However, I am not sure how to convert the nested lists to dataframes. I tried converting them to np.array, but it is unable to hold unequal rows.
- Show what you have tried. – Merlin, Jul 5, 2016 at 4:06
- 'dataframes since it is able to take in unequal rows' Really, since when? – Merlin, Jul 5, 2016 at 4:09
- @A1122 Is your problem constructing a dataframe from a nested dict, or something else? If it is only building a df from a nested dict, then it's pretty simple. – min2bro, Jul 5, 2016 at 4:14
- For up to two levels of nesting you can use pd.DataFrame.from_dict(); for three levels of nesting you need to write a few lines. – min2bro, Jul 5, 2016 at 4:15
2 Answers
The best I could come up with is to zip each record into a dictionary, create a single-row dataframe from it, then concat (or outer-join) these rows together. Here's the code:
import pandas as pd

a = [[1,2,3],[4,5],[6,7,8,9]]
b = [['Mon','Tues','Wed'],['Mon','Wed'],['Mon','Tues','Wed','Thur']]
rows = []
for values, days in zip(a, b):
    d = dict(zip(days, values))  # map each day name to its value
    rows.append(pd.DataFrame(d, index=[0]))  # dataframe for 1 row
df = pd.concat(rows)  # missing days are filled with NaN
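This gives the following (older pandas versions sort the columns alphabetically, as here; newer ones keep them in order of first appearance):
   Mon  Thur  Tues  Wed
0    1   NaN   2.0    3
0    4   NaN   NaN    5
0    6   9.0   7.0    8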
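To get from this frame to the dates common to all three lists, one possible follow-up is to drop every column that contains a NaN. This is only a sketch and not part of the original answer; the name `common` and the use of `ignore_index=True` are choices made for the illustration.

```python
import pandas as pd

a = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
b = [['Mon', 'Tues', 'Wed'], ['Mon', 'Wed'], ['Mon', 'Tues', 'Wed', 'Thur']]

# One single-row frame per record, with day names as columns.
rows = [pd.DataFrame(dict(zip(days, values)), index=[0]) for values, days in zip(a, b)]

# Stack the rows; days missing from a record become NaN.
df = pd.concat(rows, ignore_index=True)

# A column without any NaN is a day that appears in every record.
common = df.dropna(axis=1)
print(common)
```

With the sample data, only the Mon and Wed columns survive, each holding one value per original list.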
I guess something like [dict(zip(keysA,keysB)) for keysA,keysB in zip(a,b)] would do it, although it has nothing to do with pandas or numpy:
>>> a=[[1,2,3],[4,5],[6,7,8,9]]
>>> b=[['Mon','Tues','Wed'],['Mon','Wed'],['Mon','Tues','Wed','Thur']]
>>> print([dict(zip(keysA,keysB)) for keysA,keysB in zip(a,b)])
[{1: 'Mon', 2: 'Tues', 3: 'Wed'}, {4: 'Mon', 5: 'Wed'}, {6: 'Mon', 7: 'Tues', 8: 'Wed', 9: 'Thur'}]
Or maybe you want the day name as the key instead of the number; it's not really clear from your question.
(something like df = pd.DataFrame([dict(zip(keysB,keysA)) for keysA,keysB in zip(a,b)]))
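For the merge on day columns that the question describes, a possible sketch pairs the lists with the same zip(a, b) but builds one two-column frame per record and inner-joins them on the day. The column names `day` and `value_<i>` are invented for this example, and this is one way to do it under those assumptions, not the answerer's method.

```python
from functools import reduce

import pandas as pd

a = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
b = [['Mon', 'Tues', 'Wed'], ['Mon', 'Wed'], ['Mon', 'Tues', 'Wed', 'Thur']]

# One frame per record: the day labels and that record's values.
# 'day' and 'value_<i>' are hypothetical column names.
frames = [
    pd.DataFrame({'day': days, f'value_{i}': values})
    for i, (values, days) in enumerate(zip(a, b))
]

# An inner merge keeps only the days present in every frame.
merged = reduce(lambda left, right: left.merge(right, on='day', how='inner'), frames)
print(merged)
```

Each inner list keeps its own length here, so no padding is needed before the merge; the days that are not shared simply drop out.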