How to convert R code syntax into Python syntax using Pandas data frame?

Question

Let's say we have the following code in R. What would be its equivalent pandas DataFrame syntax/method in Python?

network_tickets <- contains(comcast_data$CustomerComplaint, match = 'network', ignore.case = T)
internet_tickets <- contains(comcast_data$CustomerComplaint, match = 'internet', ignore.case = T)
billing_tickets <- contains(comcast_data$CustomerComplaint, match = 'bill', ignore.case = T)
email_tickets <- contains(comcast_data$CustomerComplaint, match = 'email', ignore.case = T)
charges_ticket <- contains(comcast_data$CustomerComplaint, match = 'charge', ignore.case = T)
    
comcast_data$ComplaintType[internet_tickets] <- "Internet"
comcast_data$ComplaintType[network_tickets] <- "Network"
comcast_data$ComplaintType[billing_tickets] <- "Billing"
comcast_data$ComplaintType[email_tickets] <- "Email"
comcast_data$ComplaintType[charges_ticket] <- "Charges"
    
comcast_data$ComplaintType[-c(internet_tickets, network_tickets, billing_tickets,
                              charges_ticket, email_tickets)] <- "Others"

I was able to convert the first set of operations to Python like this:

network_tickets = df.ComplaintDescription.str.contains('network', regex=True, case=False)

But I am struggling with the next step: using a boolean mask like this one (for example internet_tickets, built the same way) to assign the value "Internet" to a new pandas DataFrame column, ComplaintType. In R, it seems you can do that in a single line.

I am not sure how to do this in Python in a single line of code. I tried the following, but each one raised an error:

a) df['ComplaintType'].apply(internet_tickets) = "Internet"
b) df['ComplaintType'] = df.apply(internet_tickets)
c) df['ComplaintType'] = internet_tickets.apply("Internet")

I think we could first create a new column in the DataFrame:

df['ComplaintType'] = internet_tickets

But I am not sure about the next steps.

jezrael · Accepted Answer · 2021-10-27 10:40:33Z

1

Use Series.str.contains with DataFrame.loc to set the values, looping over a list of categories:

import pandas as pd

df = pd.DataFrame(data={"ComplaintDescription": ["BiLLing is super", "email", "new"]})

# Each category name doubles as the case-insensitive search term.
L = ["Internet", "Network", "Billing", "Email", "Charges"]
for val in L:
    df.loc[df['ComplaintDescription'].str.contains(val, case=False), 'ComplaintType'] = val

# Rows that matched nothing are still NaN; label them "Others".
df['ComplaintType'] = df['ComplaintType'].fillna('Others')
print(df)
  ComplaintDescription ComplaintType
0     BiLLing is super       Billing
1                email         Email
2                  new        Others

EDIT:

If you need to set each value separately:

df.loc[df['ComplaintDescription'].str.contains('network', case=False), 'ComplaintType'] = "Network"
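
For reference, here is a minimal sketch of how the whole R snippet from the question might be translated line by line, keeping one named mask per keyword. The sample rows are made up for illustration, the column name ComplaintDescription follows the question's Python attempt, and na=False is an added assumption so that missing descriptions count as "no match":

```python
import pandas as pd

# Made-up sample rows; the column name follows the question's Python snippet.
df = pd.DataFrame({
    "ComplaintDescription": [
        "Network is down", "Slow INTERNET", "Wrong bill",
        "Email not working", "Extra charges", "Rude technician",
    ]
})
desc = df["ComplaintDescription"]

# One boolean mask per keyword, mirroring the R contains(...) calls.
network_tickets = desc.str.contains("network", case=False, na=False)
internet_tickets = desc.str.contains("internet", case=False, na=False)
billing_tickets = desc.str.contains("bill", case=False, na=False)
email_tickets = desc.str.contains("email", case=False, na=False)
charges_ticket = desc.str.contains("charge", case=False, na=False)

# Start with the default label, then overwrite matching rows
# in the same order as the R assignments (later ones win on overlap).
df["ComplaintType"] = "Others"
df.loc[internet_tickets, "ComplaintType"] = "Internet"
df.loc[network_tickets, "ComplaintType"] = "Network"
df.loc[billing_tickets, "ComplaintType"] = "Billing"
df.loc[email_tickets, "ComplaintType"] = "Email"
df.loc[charges_ticket, "ComplaintType"] = "Charges"

print(df)
```

Setting "Others" first replaces the negative-index step in the R code: any row that none of the masks touch simply keeps the default.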

edited Oct 27, 2021 at 10:40

answered Oct 27, 2021 at 9:45

jezrael


5 Comments

ManiK Over a year ago

ComplaintType would be a new DataFrame column, filled based on variables like internet_tickets, service_tickets, etc., wherever the value is True, i.e. there was a string/expression match.

jezrael Over a year ago

@ManiK - Got it, you need a new column.

ManiK Over a year ago

Thanks, one twist though: the list L is just the set of category names, but the regex search could be on different words. For example, for "Network" the search terms could be network, netwrk, wifi, bandwidth, etc., all assigned to the common category "Network". So I think the search terms and the category labels should be kept separate rather than coming from the same list. Say the data is {"ComplaintDescription":["BiLLing is super","email","new", "bill", "netwrk", "old"]}; then we want ComplaintType to be: Billing, Email, Others, Billing, Network, Others
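
One possible way to separate the search terms from the category labels, as described in this comment, is a dict that maps each label to its keywords. This is only a sketch: the keywords dict is hypothetical (the "Network" terms are the ones listed in the comment, the others are guessed from the expected output), and the sample data is the one given in the comment:

```python
import re
import pandas as pd

# Sample data taken from the comment above.
df = pd.DataFrame(
    {"ComplaintDescription": ["BiLLing is super", "email", "new", "bill", "netwrk", "old"]}
)

# Hypothetical mapping: category label -> search terms for that category.
keywords = {
    "Network": ["network", "netwrk", "wifi", "bandwidth"],
    "Billing": ["bill"],
    "Email": ["email"],
}

df["ComplaintType"] = "Others"  # default for rows that match nothing
for category, words in keywords.items():
    # Join the terms into one alternation pattern, escaping regex metacharacters.
    pattern = "|".join(re.escape(w) for w in words)
    mask = df["ComplaintDescription"].str.contains(pattern, case=False, regex=True, na=False)
    df.loc[mask, "ComplaintType"] = category

print(df["ComplaintType"].tolist())
```

If a description matches several categories, the category that comes last in the dict wins, so the dict order acts as a priority list.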

jezrael Over a year ago

@ManiK - I am confused. Does contains(comcast_data$CustomerComplaint, match = 'network', ignore.case = T) match network, netwrk, wifi and bandwidth? Or do you need code that behaves differently from the R version?

ManiK Over a year ago

Hey, I think I got the answer. All I needed was df.loc[rows, columns] = value:

df.loc[internet_tickets, 'ComplaintType'] = "Internet"
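
Since the question asked for a single assignment, one alternative to repeated df.loc calls is numpy.select, which is a different technique from the one in the answer. The sketch below uses made-up sample rows; note that numpy.select picks the first matching condition, whereas with sequential df.loc assignments the last match wins:

```python
import numpy as np
import pandas as pd

# Made-up sample rows for illustration.
df = pd.DataFrame({"ComplaintDescription": ["Network and billing issue", "email", "new"]})
desc = df["ComplaintDescription"]

# Conditions are checked in order; the first True one decides the label.
conditions = [
    desc.str.contains("bill", case=False, na=False),
    desc.str.contains("network", case=False, na=False),
    desc.str.contains("email", case=False, na=False),
]
labels = ["Billing", "Network", "Email"]

# Rows where no condition holds get the default label.
df["ComplaintType"] = np.select(conditions, labels, default="Others")

print(df)
```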
