
I have two dataframes: one holds individual data, and the other is a configuration rule for that data. These are the dataframes:

df1:

  employee_Id first_Name  last_Name    email_Address   
0       E1000       Manas         Jani      [email protected]
1       E2000         Jim         Kong      [email protected]
2       E3000       Olila   Jayavarman      [email protected]
3       E4000        Lisa     Kopkingg      [email protected]
4       E5000     Kishore      Pindhar      [email protected]
5       E6000        Gobi        Nadar      [email protected]

df2:

  Input_file_name Is_key Config_file_name           Value
0     Employee ID      Y      employee_Id  idTypeCode:001
4        EntityID      N        entity_Id    entity_Id:01

I need the resulting individual dataframe to look like this:

Result_df:

  employee_Id first_Name  last_Name    email_Address      idTypeCode  entity_Id
0       E1000       Manas         Jani      [email protected]         001         01
1       E2000         Jim         Kong      [email protected]         001         01
2       E3000       Olila   Jayavarman      [email protected]         001         01
3       E4000        Lisa     Kopkingg      [email protected]         001         01
4       E5000     Kishore      Pindhar      [email protected]         001         01
5       E6000        Gobi        Nadar      [email protected]         001         01

I can't work out how to turn the entries of the Value column (for example idTypeCode:001) into new columns of the final dataframe.

  • Oh! So the configuration dataframe is a rule file which says that on the input file the column name is "Employee ID" but it should be "employee_Id" really. So it does not have an actual ID under it, if you get what I mean. Commented Nov 28, 2018 at 19:06

1 Answer


What you want to do is not entirely clear, but I hope this helps.

First, work on the configuration dataframe (df2) to extract the name:value pairs from its Value column.

import pandas as pd
import io

# test data
zz = """Input_file_name Is_key Config_file_name           Value
0     Employee ID      Y      employee_Id  idTypeCode:001
4        Entity ID      N        entity_Id    entity_Id:01
"""

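# sep=r'\s+' splits on whitespace (delim_whitespace is deprecated in recent pandas)
df = pd.read_table(io.StringIO(zz), sep=r'\s+')

# split "name:value" into two columns, then transpose so names and values become rows
extract = df['Value'].str.split(':', expand=True).transpose()
# use the first row (the names) as column labels, then drop that row
extract.columns = extract.iloc[0]
extract = extract.drop(extract.index[0]).reset_index(drop=True)
print(extract)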

# 0 idTypeCode entity_Id
# 0        001        01
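
As a possible alternative to the transpose step, the same pairs can be collected into a plain dict. This is only a sketch: the `config` frame below is a hand-built stand-in for df2 (an assumption, reduced to the columns used here), and `constants` is an illustrative name.

```python
import pandas as pd

# Hypothetical stand-in for df2; only the columns needed here are included.
config = pd.DataFrame({
    "Config_file_name": ["employee_Id", "entity_Id"],
    "Value": ["idTypeCode:001", "entity_Id:01"],
})

# Split each "name:value" string on the first colon only.
pairs = config["Value"].str.split(":", n=1, expand=True)

# Column 0 holds the new column names, column 1 the constant values.
constants = dict(zip(pairs[0], pairs[1]))
print(constants)
# {'idTypeCode': '001', 'entity_Id': '01'}
```

Splitting with `n=1` keeps any further colons inside the value part, which may or may not matter for your real configuration file.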

Then combine the two dataframes.

# test data
zz = """employee_Id first_Name  last_Name    email_Address   
0       E1000       Manas         Jani      [email protected]
1       E2000         Jim         Kong      [email protected]
2       E3000       Olila   Jayavarman      [email protected]
3       E4000        Lisa     Kopkingg      [email protected]
4       E5000     Kishore      Pindhar      [email protected]
5       E6000        Gobi        Nadar      [email protected]
"""
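empl = pd.read_table(io.StringIO(zz), sep=r'\s+')

# concatenate column-wise (no ignore_index, which would replace the column names with 0..5),
# then forward-fill the single row of extracted values down every row
pd.concat([empl, extract], axis=1).ffill()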

#   employee_Id first_Name   last_Name email_Address idTypeCode entity_Id
# 0       E1000      Manas        Jani   [email protected]        001        01
# 1       E2000        Jim        Kong   [email protected]        001        01
# 2       E3000      Olila  Jayavarman   [email protected]        001        01
# 3       E4000       Lisa    Kopkingg   [email protected]        001        01
# 4       E5000    Kishore     Pindhar   [email protected]        001        01
# 5       E6000       Gobi       Nadar   [email protected]        001        01
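
If the extracted values are held in a dict rather than a one-row dataframe, `DataFrame.assign` can broadcast them to every row without the concat and forward-fill. A minimal sketch, assuming a hand-built `empl` frame (shortened to three rows, with the email column left out) and a `constants` dict written out by hand:

```python
import pandas as pd

# Hypothetical, shortened copy of the employee data (email column omitted).
empl = pd.DataFrame({
    "employee_Id": ["E1000", "E2000", "E3000"],
    "first_Name": ["Manas", "Jim", "Olila"],
    "last_Name": ["Jani", "Kong", "Jayavarman"],
})

# Assumed result of parsing the Value column of the configuration frame.
constants = {"idTypeCode": "001", "entity_Id": "01"}

# Each scalar is broadcast to all rows as a new column.
result = empl.assign(**constants)
print(result)
```

This keeps the values as strings, so the leading zeros in 001 and 01 are preserved.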