I am pretty new to Python and hence I need your help on the following:

I have two tables (dataframes):

Table 1 has all the data and looks like this:

Table1

The GenDate column has the generation day. The Date column has dates. Column D onwards holds the different values.

I also have the following table:

Table 2

Column I has "keywords" that can be found in the headers of Table 1. Column K has dates that should match the dates in column C of Table 1.

My goal is to produce a table like the following:

Table 3

I have omitted a few columns for illustration purposes.

Every column in Table 1 should be split based on the Type that is written in its header.

Ex. A_Weeks: Weeks corresponds to 3 splits: Week1, Week2 and Week3.

Each one of these splits has a specific Date.

In the new table, 3 columns should be created, using A_ followed by the split name:

A_Week1, A_Week2 and A_Week3.

For each of these columns, the value that corresponds to the Date of each split should be used.
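
To make the naming rule concrete, here is a rough sketch in plain Python. The splits dictionary and its dates are made-up placeholders (the real ones come from Table 2), and expand_header is a hypothetical helper, not a pandas function.

```python
# Hypothetical split table: Type -> list of (split name, date).
# The dates are placeholders for illustration only.
splits = {"Weeks": [("Week1", 10), ("Week2", 20), ("Week3", 30)]}


def expand_header(header):
    """Return (new_column_name, date) pairs for a Table 1 header such as 'A_Weeks'."""
    # 'A_Weeks' -> prefix 'A', type 'Weeks'
    prefix, type_ = header.split("_", 1)
    # one new column per split of that type, named prefix + '_' + split name
    return [(f"{prefix}_{name}", date) for name, date in splits[type_]]


result = expand_header("A_Weeks")
print(result)
```

Each pair says which new column to create and which Date row of Table 1 its value should be taken from.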

I hope the explanation is good.

Thanks

  • Check out pd.pivot_table. (Commented Oct 10, 2018 at 11:45)

1 Answer

You can get the desired table with the following code (follow the comments and check the pandas API reference to learn about the functions used):

import numpy as np
import pandas as pd

# initial data
t_1 = pd.DataFrame(
    {'GenDate': [1, 1, 1, 2, 2, 2],
     'Date': [10, 20, 30, 10, 20, 30],
     'A_Days': [11, 12, 13, 14, 15, 16],
     'B_Days': [21, 22, 23, 24, 25, 26],
     'A_Weeks': [110, 120, 130, 140, np.nan, 160],
     'B_Weeks': [210, 220, 230, 240, np.nan, 260]})
# split definitions: which Date belongs to which Split of each Type
t_2 = pd.DataFrame(
    {'Type': ['Days', 'Days', 'Days', 'Weeks', 'Weeks'],
     'Split': ['Day1', 'Day2', 'Day3', 'Week1', 'Week2'],
     'Date': [10, 20, 30, 10, 30]})

# create multiindex
t_1 = t_1.set_index(['GenDate', 'Date'])
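# pivot the 'Date' level of the MultiIndex: unstack it from the index into the columns,
# then drop every column that contains at least one NaN (dropna defaults to how='any');
# in this sample data that removes the Weeks columns for Date 20, which has no split in t_2
tt_1 = t_1.unstack().dropna(axis=1)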

# tt_1 is what you need with multi-level column labels

# map to rename columns
t_2 = t_2.set_index(['Type'])
mapping = {
    type_: dict(zip(
        t_2.loc[type_, :].loc[:, 'Date'],
        t_2.loc[type_, :].loc[:, 'Split']))
    for type_ in t_2.index.unique()}

# new column names
new_columns = list()
for letter_type, date in tt_1.columns.values:
    letter, type_ = letter_type.split('_')
    new_columns.append('{}_{}'.format(letter, mapping[type_][date]))

tt_1.columns = new_columns
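
As a possible alternative, here is a minimal sketch that reaches the same table with melt, merge and pivot. It does not rely on NaN values to decide which dates are kept: the inner merge with t_2 keeps only the (Type, Date) pairs that have a split. The intermediate names long_df, merged, Letter and new_col are my own choices for illustration, and the sketch assumes every header has the form letter_Type.

```python
import numpy as np
import pandas as pd

# same sample data as above
t_1 = pd.DataFrame(
    {'GenDate': [1, 1, 1, 2, 2, 2],
     'Date': [10, 20, 30, 10, 20, 30],
     'A_Days': [11, 12, 13, 14, 15, 16],
     'B_Days': [21, 22, 23, 24, 25, 26],
     'A_Weeks': [110, 120, 130, 140, np.nan, 160],
     'B_Weeks': [210, 220, 230, 240, np.nan, 260]})
t_2 = pd.DataFrame(
    {'Type': ['Days', 'Days', 'Days', 'Weeks', 'Weeks'],
     'Split': ['Day1', 'Day2', 'Day3', 'Week1', 'Week2'],
     'Date': [10, 20, 30, 10, 30]})

# long format: one row per (GenDate, Date, original column)
long_df = t_1.melt(id_vars=['GenDate', 'Date'], var_name='col', value_name='value')
# split a header such as 'A_Days' into letter 'A' and type 'Days'
long_df[['Letter', 'Type']] = long_df['col'].str.split('_', n=1, expand=True)
# keep only the (Type, Date) pairs listed in t_2 and pick up the split name
merged = long_df.merge(t_2, on=['Type', 'Date'], how='inner')
merged['new_col'] = merged['Letter'] + '_' + merged['Split']
# back to wide format: one row per GenDate, one column per letter/split
result = merged.pivot(index='GenDate', columns='new_col', values='value')
print(result)
```

With this approach a missing value on a date that does have a split stays in the result as NaN instead of removing the whole column.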