Dynamically create string from pandas column

I have three data frames, shown below. The first is the data and the second is the anomalies:

import pandas as pd

d = {'10028': [0], '1058': [25], '20120': [29], '20121': [22], '20122': [0], '20123': [0], '5043': [0], '5046': [0]}

df1 = pd.DataFrame(data=d)

The anomalies data frame is a mirror copy of df1, except that each value is 0 or 1: 1 marks an anomaly and 0 a non-anomaly.

d = {'10028': [0], '1058': [1], '20120': [1], '20121': [0],'20122': [0], '20123': [0], '5043': [0], '5046': [0]}

df2 = pd.DataFrame(data=d)

The third data frame looks like this:

d = {'10028': ['US,IN'], '1058': ['NA, JO, US'], '20120': [''], '20121': ['US,PK'],'20122': ['IN'], '20123': ['Us,LN'], '5043': ['AI,AL'], '5046': ['AA,AB']}

df3 = pd.DataFrame(data=d)

I am converting these into a specific format with the code below:

details = (
        '\n' + 'Metric Name' + '\t' + 'Count' + '\t' + 'Anomaly' + '\t' + 'Country' 
        '\n' + '10028:' + '\t'+ '\t' + str(df1.tail(1)['10028'][0]) + '\t' + str(df2['10028'][0]) + '\t'+ str(df3['10028'][0]) + 
        '\n' + '1058:' + '\t' + '\t' + str(df1.tail(1)['1058'][0]) + '\t' + str(df2['1058'][0]) + '\t'+ str(df3['1058'][0]) +
        '\n' + '20120:' + '\t' +'\t' + str(df1.tail(1)['20120'][0]) + '\t' + str(df2['20120'][0]) + '\t'+ str(df3['20120'][0]) +
        '\n' + '20121:' + '\t' + '\t' +str(round(df1.tail(1)['20121'][0], 2)) + '\t' + str(df2['20121'][0]) + '\t'+ str(df3['20121'][0]) +
        '\n' + '20122:' + '\t' + '\t' +str(round(df1.tail(1)['20122'][0], 2)) + '\t' + str(df2['20122'][0]) + '\t'+str(df3['20122'][0]) +
        '\n' + '20123:' + '\t' + '\t' +str(round(df1.tail(1)['20123'][0], 3)) + '\t' + str(df2['20123'][0]) + '\t'+str(df3['20123'][0]) +
        '\n' + '5043:' + '\t' + '\t' +str(round(df1.tail(1)['5043'][0], 3)) + '\t' + str(df2['5043'][0]) + '\t'+str(df3['5043'][0]) +
        '\n' + '5046:' + '\t' + '\t' +str(round(df1.tail(1)['5046'][0], 3)) + '\t' + str(df2['5046'][0]) + '\t'+str(df3['5046'][0]) +
        '\n\n' + 'message:' + '\t' +
        'Something wrong with the platform as there is a spike in [values where anomalies == 1].'
            )

The problem is that the column names change on every run. In this run they are '10028', '1058', '20120', '20121', '20122', '20123', '5043', '5046', but in the next run they might be '10029', '1038', '20121', '20122', '20123', '5083', '5946'.

How can I build details dynamically from whatever columns are present in the data frames, without hard-coding them? In the message, I also want to include the names of the columns whose anomaly value is 1.

The values will always be either 1 or 0 for df1 and df2, and either a list or blank for df3.

Expected Output:- (image not reproduced here)

For two data frames I have a working solution, shown below:

# first part of the string
s = '\n' + 'Metric Name' + '\t' + 'Count' + '\t' + 'Anomaly'

# dynamically add the data
# (Series.iteritems() was removed in pandas 2.0; items() works in old and new versions)
for idx, val in df1.iloc[-1].items():
    s += f'\n{idx}\t{val}\t{df2[idx][0]}'

# last part
s += ('\n\n' + 'message:' + '\t' +
      'Something wrong with the platform as there is a spike in [values where anomalies == 1].'
     )

If a matching value is not present, it should print null.
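The two-frame loop above could be extended to a third frame, and to columns that are missing from one of the frames, roughly as follows. This is only a sketch: the helper name last_value and the extra column '11111' are made up for illustration, and it assumes each frame's last row is the one to report.

```python
import pandas as pd

# Small sample frames; '11111' exists only in df1 to exercise the "null" case
df1 = pd.DataFrame({'10028': [0], '1058': [25], '11111': [7]})
df2 = pd.DataFrame({'10028': [0], '1058': [1]})
df3 = pd.DataFrame({'10028': ['US,IN'], '1058': ['NA, JO, US']})

def last_value(df, col, default='null'):
    """Return the last-row value of `col`, or `default` if the column is absent."""
    return df[col].iloc[-1] if col in df.columns else default

# Union of the column names from all three frames, in first-seen order
columns = list(dict.fromkeys([*df1.columns, *df2.columns, *df3.columns]))

lines = ['Metric Name\tCount\tAnomaly\tCountry']
for col in columns:
    lines.append(
        f'{col}:\t{last_value(df1, col)}\t{last_value(df2, col)}\t{last_value(df3, col)}'
    )

details = '\n' + '\n'.join(lines)
print(details)
```

Looking a column up with a default, rather than indexing it directly, is what avoids a KeyError when one frame has a column the others lack.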

1 Answer

To obtain the expected result, you can do the following. The input data must be dictionaries as shown in the question; if it is not, please provide the real input data.

import pandas as pd

final_d = []
d = {'10028': 0, '1058': 25, '20120': 29, '20121': 22,'20122': 0, '20123': 0, '5043': 0, '5046': 0}
final_d.append(d)

d = {'10028': 0, '1058': 1, '20120': 1, '20121': 0,'20122': 0, '20123': 0, '5043': 0, '5046': 0, '91111':0}
final_d.append(d)

d = {'10028': ['US','IN'], '1058': ['NA', 'JO', 'US'], '20120': [''], '20121': ['US','PK'],'20122': ['IN'], '20123': ['Us','LN'], '5043': ['AI','AL'], '5046': ['AA','AB'], '00000':['kk','dd','ee']}
final_d.append(d)

# Now, we will merge the dictionaries on key
data = {}
for i, dt in enumerate(final_d):
    for k, v in dt.items():
        # Every metric starts with one blank slot per dictionary,
        # so all rows have the same length and need no extra padding
        row = data.setdefault(k, [''] * len(final_d))
        # Lists (country codes) become a comma-separated string
        row[i] = ','.join(v) if isinstance(v, list) else v

# Creating the base dataframe
df = pd.DataFrame.from_dict(data)

# Converting the column headers (metric names) into a row in the dataframe
df = pd.concat([pd.DataFrame.from_dict({k:[v] for k,v in zip(df.columns.tolist(), df.columns.tolist())}), df], ignore_index=True)

# removing column names
df.columns = [''] * len(df.columns)

# organising the dataframe according to your required output
result = df.T.reset_index(drop=True)

# Adding the column names as required
result.columns = ['Metric Name', 'Count', 'Anomaly', 'Country']

# Voila!
print(result.to_string(index=False))

The generated dataframe:

Metric Name Count Anomaly   Country
      10028     0       0     US,IN
       1058    25       1  NA,JO,US
      20120    29       1          
      20121    22       0     US,PK
      20122     0       0        IN
      20123     0       0     Us,LN
       5043     0       0     AI,AL
       5046     0       0     AA,AB
      91111             0          
      00000                kk,dd,ee
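The question also asks for a message naming the metrics whose anomaly flag is 1, which the table alone does not give. One possible way to build it is sketched below, assuming the anomaly data is a single-row DataFrame like df2 in the question. The "No anomalies detected." fallback text is a placeholder of mine, not something from the question.

```python
import pandas as pd

# Anomaly flags as in the question (shortened): 1 marks an anomaly
df2 = pd.DataFrame({'10028': [0], '1058': [1], '20120': [1], '20121': [0]})

# Last row of the anomaly frame as a Series indexed by metric name
flags = df2.iloc[-1]

# Keep only the metric names whose flag equals 1
anomalous = flags[flags == 1].index.tolist()

if anomalous:
    message = ('Something wrong with the platform as there is a spike in '
               + ', '.join(anomalous) + '.')
else:
    # Placeholder wording for the case with no flagged metrics
    message = 'No anomalies detected.'

print(message)
```

Because the names come from the index of the filtered Series, nothing is hard-coded and the message follows whatever columns are present in a given run.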
7 Comments

Hey Anurag, thank you for your answer. But how will we handle the KeyError? Suppose there is one extra column, 11111, in df1; how would we handle that?
That would be an easy fix, let me update the answer
Finalized from my end! Have a look!
Kindly upvote/accept iff this answers your question!
It's giving me an error! You built that from dictionaries, but I am doing this with df columns. TypeError: 'int' object is not subscriptable