I have measurements taken at two time points, with three replicates each:
name t value replicate
foo 1 0.5 a
foo 1 0.55 b
foo 1 0.6 c
foo 2 0.7 a
foo 2 0.71 b
foo 2 0.72 c
bar 1 0.1 a
bar 1 0.12 b
bar 1 0.3 c
bar 2 0.4 a
bar 2 0.45 b
bar 2 0.44 c
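To make this reproducible without a separate file, the same table can be loaded from an inline string. This is only a sketch: the names `raw` and `df` are placeholders, and it assumes the columns are tab-separated, as in my data.txt.

```python
import io

import pandas

# The same measurements as in the table above, tab-separated,
# embedded as a string so no data.txt file is needed.
raw = (
    "name\tt\tvalue\treplicate\n"
    "foo\t1\t0.5\ta\n"
    "foo\t1\t0.55\tb\n"
    "foo\t1\t0.6\tc\n"
    "foo\t2\t0.7\ta\n"
    "foo\t2\t0.71\tb\n"
    "foo\t2\t0.72\tc\n"
    "bar\t1\t0.1\ta\n"
    "bar\t1\t0.12\tb\n"
    "bar\t1\t0.3\tc\n"
    "bar\t2\t0.4\ta\n"
    "bar\t2\t0.45\tb\n"
    "bar\t2\t0.44\tc\n"
)

# read_csv accepts any file-like object, so wrap the string in StringIO
df = pandas.read_csv(io.StringIO(raw), sep="\t")
print(df.head())
```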
I want to parse it into a DataFrame and get the mean and standard deviation of the replicates for each time point (the "t" column) and each sample (the "name" column). This can be done with:
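import numpy as np
import pandas
df = pandas.read_table("data.txt", sep="\t")
# Aggregate only the numeric column; "replicate" is a text label
g = df.groupby(["name", "t"])[["value"]]
new_df = g.agg([np.mean, np.std])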
The problem is that new_df has a hierarchical index:
           value
            mean       std
name t
bar  1  0.173333  0.110151
     2  0.430000  0.026458
foo  1  0.550000  0.050000
     2  0.710000  0.010000
How can I get a flat DataFrame instead, where the mean and std values are just regular columns? I tried reset_index(), but that only moves name and t into columns; the two-level value/mean/std header stays:
>>> new_df.reset_index()
  name  t     value
               mean       std
0  bar  1  0.173333  0.110151
1  bar  2  0.430000  0.026458
2  foo  1  0.550000  0.050000
3  foo  2  0.710000  0.010000
I'd like the final DataFrame to have the columns sample, t, mean, std (or value_mean, value_std). How can this be done in pandas?
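To make the target concrete, here is a minimal sketch of the two shapes I'm after. It assumes that selecting the "value" column before aggregating is acceptable, and that joining the two header levels with "_" is a reasonable way to build the prefixed names. The names `flat` and `wide` are placeholders, and I don't know whether this is the idiomatic approach.

```python
import pandas

# Same measurements as in the table at the top of the question
df = pandas.DataFrame({
    "name": ["foo"] * 6 + ["bar"] * 6,
    "t": [1, 1, 1, 2, 2, 2] * 2,
    "value": [0.5, 0.55, 0.6, 0.7, 0.71, 0.72,
              0.1, 0.12, 0.3, 0.4, 0.45, 0.44],
    "replicate": list("abc") * 4,
})

# Shape 1: aggregating a single column (a Series) yields one-level
# column names, so reset_index() gives name, t, mean, std.
flat = df.groupby(["name", "t"])["value"].agg(["mean", "std"]).reset_index()
print(flat.columns.tolist())

# Shape 2: keep the two-level header, then join each (column, statistic)
# pair into a single name such as value_mean.
wide = df.groupby(["name", "t"])[["value"]].agg(["mean", "std"])
wide.columns = ["_".join(pair) for pair in wide.columns]
wide = wide.reset_index()
print(wide.columns.tolist())
```

Both results have one row per (name, t) pair and an ordinary integer index.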