2

I need to create lookup tables in Python from a CSV, built from the unique values in my columns. The example is attached. I have a Name column that holds the name of the model. For each model, I need a dictionary with the title from the Variable column, the key from the Level column and the value from the Value column. I'm thinking the best thing is a dictionary of dictionaries. I will use this lookup table in the future to multiply the values together based on the keys.

Here is code to generate sample data set:

import pandas as pd

Name = ['model1', 'model1', 'model1', 'model2', 'model2', 'model2',
        'model1', 'model1', 'model1', 'model1',
        'model2', 'model2', 'model2', 'model2']
Variable = ['channel_model', 'channel_model', 'channel_model',
            'channel_model', 'channel_model', 'channel_model',
            'driver_age', 'driver_age', 'driver_age', 'driver_age',
            'driver_age', 'driver_age', 'driver_age', 'driver_age']
channel_Level = ['Dir', 'IA', 'EA', 'Dir', 'IA', 'EA', '21', '22', '23', '24', '21', '22', '23', '24']
Value = [1.11, 1.18, 1.002, 2.2, 2.5, 2.56, 1.1, 1.2, 1.3, 1.4, 2.1, 2.2, 2.3, 2.4]
df = {'Name': Name, 'Variable': Variable, 'Level': channel_Level, 'Value': Value}
factor_table = pd.DataFrame(df)

I have read the following but it hasn't yielded great results: Python Creating Dictionary from excel data

I've also tried:

import pandas as pd
factor_table = pd.read_excel('...\\factor_table_example.xlsx')

#define function to be used multiple times
def factor_tables(file, model_column, variable_column, level_column, value_column):
    for i in file[model_column]:
        for row in file[variable_column]:
            lookup = {}
            lookup = dict(zip(file[level_column], file[value,column]))

This yields the error: `dict expected at most 1 arguments, got 2`

What I would ultimately like is: {'model2': {'channel': {'EA': 1.002, 'IA': 1.18, 'DIR': 1.11}}, 'model1': {'channel': {'EA': 1.86, 'IA': 1.66, 'DIR': 1.64}}}

5
  • 1
    It may be a bit easier to use a list of dictionaries. I haven't tested it yet, but that error could come from the fact that you aren't supplying a key: value pair, so it's only getting a key and no value. A structure like [{'model2': {'channel': {'EA': 1.002, 'IA': 1.18, 'DIR': 1.11}}}, {'model1': {'channel': {'EA': 1.86, 'IA': 1.66, 'DIR': 1.64}}}] may suit your needs better. Commented Jun 25, 2018 at 14:11
  • Typo fixed. When I run the code now and assign the result to a variable, I don't get an error, I just get None. Is my for loop sequence off? Hi @C.Nivs, I understand what you're communicating but can't conceptualize how the loop should run... I'm a Python novice. Commented Jun 25, 2018 at 14:14
  • 1
    you have to return something from your function for starters Commented Jun 25, 2018 at 14:21
  • 1
    then don't create a dictionary, update it with the values. and edit your question because "dict expected at most 1 arguments, got 2" doesn't make any sense with the corrected code either. Commented Jun 25, 2018 at 14:22
  • @Jean-FrançoisFabre, don't I need a dictionary for the lookup table? Commented Jun 25, 2018 at 14:22

2 Answers

1

Using collections.defaultdict, you can create a nested dictionary while iterating your dataframe. Then realign into a list of dictionaries via a list comprehension.

from collections import defaultdict

tree = lambda: defaultdict(tree)

d = tree()
for row in factor_table.itertuples(index=False):
    d[(row.Name, row.Variable)].update({row.Level: row.Value})

res = [{k[0]: {k[1]: dict(v)}} for k, v in d.items()]

print(res)

[{'model1': {'channel_model': {'Dir': 1.110, 'EA': 1.002, 'IA': 1.180}}},
 {'model2': {'channel_model': {'Dir': 2.200, 'EA': 2.560, 'IA': 2.500}}},
 {'model1': {'driver_age': {'21': 1.100, '22': 1.200, '23': 1.300, '24': 1.400}}},
 {'model2': {'driver_age': {'21': 2.100, '22': 2.200, '23': 2.300, '24': 2.400}}}]
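
If a single nested dictionary (model, then variable, then level) is preferred over a list, the same idea can be sketched as below. This is a minimal sketch on a shortened copy of the question's sample rows, written with plain tuples so it runs without pandas; the names `rows`, `nested` and `lookup` are illustrative only. With the DataFrame, the loop could presumably iterate over `factor_table.itertuples(index=False)` instead.

```python
from collections import defaultdict

# Shortened sample rows (Name, Variable, Level, Value) copied from the question's data
rows = [
    ('model1', 'channel_model', 'Dir', 1.11),
    ('model1', 'channel_model', 'IA', 1.18),
    ('model2', 'channel_model', 'Dir', 2.2),
    ('model1', 'driver_age', '21', 1.1),
    ('model2', 'driver_age', '21', 2.1),
]

# Build model -> variable -> level -> value
nested = defaultdict(lambda: defaultdict(dict))
for name, variable, level, value in rows:
    nested[name][variable][level] = value

# Convert to plain dicts so a misspelled key raises KeyError
# instead of silently creating an empty entry
lookup = {name: dict(variables) for name, variables in nested.items()}

print(lookup['model1']['channel_model']['IA'])  # 1.18
```

Converting back to plain dicts at the end is a design choice: a defaultdict would quietly insert empty entries on a mistyped lookup, which is easy to miss in a factor table.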

3 Comments

Thanks @jpp. For some reason, I'm still getting a TypeError: 'NoneType' object is not callable
@Jordan, Can't replicate. I ran this code straight after factor_table = pd.DataFrame(df) as you've defined it.
I don't know what happened, but I restarted the kernel and it worked! Thank you.
1

It looks like your error could be coming from this line:

lookup = dict(zip(file[level_column], file[value,column]))

Here file[value,column] indexes file with the pair value, column instead of the single key value_column, which is most likely a typo. The loop you might be looking for looks like this:

def factor_tables(file, model_column, variable_column, level_column, value_column):
    lookup = {}

    # group the rows by model so each model gets its own level -> value mapping
    for model, rows in file.groupby(model_column):

        lookup[model] = dict(zip(rows[level_column], rows[value_column]))

    return lookup

This will return to you a single dictionary with keys corresponding to individual (and unique) models:

{'model_1':{'level_col': 'val_col'}, 'model_2':...}

Allowing you to use:

lookups.get('model_1')  # {'level_col': 'val_col'}

If you need the variable_column, you can wrap it one level deeper:

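def factor_tables(file, model_column, variable_column, level_column, value_column):
    lookup = {}

    # group by model and variable so each pair gets its own level -> value mapping
    for (model, variable), rows in file.groupby([model_column, variable_column]):

        lookup.setdefault(model, {})[variable] = dict(zip(rows[level_column], rows[value_column]))

    return lookup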
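
Since the question mentions multiplying the values together based on the keys, here is a rough sketch of how a nested lookup of this shape might be used. The `lookup` literal is a hand-written excerpt of the sample data, and `combined_factor` is a hypothetical helper name, not something from the question or from pandas.

```python
# Hand-written excerpt of the sample data in the shape
# model -> variable -> level -> value (assumed structure)
lookup = {
    'model1': {
        'channel_model': {'Dir': 1.11, 'IA': 1.18, 'EA': 1.002},
        'driver_age': {'21': 1.1, '22': 1.2},
    },
}


def combined_factor(lookup, model, levels):
    """Multiply the factors picked out by `levels` (variable -> level) for one model."""
    result = 1.0
    for variable, level in levels.items():
        # A missing model, variable or level raises KeyError here
        result *= lookup[model][variable][level]
    return result


# Factor for an 'IA' channel and a 22-year-old driver under model1: 1.18 * 1.2
factor = combined_factor(lookup, 'model1', {'channel_model': 'IA', 'driver_age': '22'})
print(factor)
```

Letting a missing key raise KeyError, rather than defaulting to 1.0, makes an unknown level visible instead of silently leaving the result unchanged.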

2 Comments

Thank you. I'm getting a 'NoneType' object is not callable error when trying to run the function on a data frame.
I have now added some code to generate a sample data set like the one I'm working with. @C.Nivs
