Use Python code in R in order to change a dataframe

Question

Is it possible to use Python code in R to modify or create tables?

Here is an example of what I do in Python. I start from this dataframe:

  species family    Events      groups
1     SP1      A     10,22          G1
2     SP1      B         7          G2
3     SP1    C,D 4,5,6,1,3 G3,G4,G5,G6
4     SP2      A     22,10          G1
5     SP2    D,C 6,5,4,3,1 G4,G6,G5,G3
6     SP3      C 4,5,3,6,1    G3,G6,G5
7     SP3      E         7          G2
8     SP3      A        10          G1
9     SP4      C      7,22         G12

Using this Python code:

g = df['groups'].apply(lambda x: set(x.split(',')))   # explode into sets
# keep the larger set from g containing the current one and make it back a string
g2 = g.apply(lambda s: ','.join(sorted(
    g[g.apply(lambda x: x.issuperset(s))].max())))

resul = df[['species', 'family', 'Events']].groupby(g2).agg(
    lambda x: ','.join(sorted(set((i for j in x for i in j.split(',')))))
    ).reset_index()

I can transform it into:

       species family     Events       groups
0  SP1,SP2,SP3      A      10,22           G1
1          SP4      C       22,7          G12
2      SP1,SP3    B,E          7           G2
3  SP1,SP2,SP3    C,D  1,3,4,5,6  G3,G4,G5,G6

Is there a way to call this Python code from R and get the result directly in R?

I need to work in R, but I am much more familiar with Python.

The dataframe as a Python dictionary:

{'species': {1: 'SP1', 2: 'SP1', 3: 'SP1', 4: 'SP2', 5: 'SP2', 6: 'SP3', 7: 'SP3', 8: 'SP3', 9: 'SP4'}, 'family': {1: 'A', 2: 'B', 3: 'C,D', 4: 'A', 5: 'D,C', 6: 'C', 7: 'E', 8: 'A', 9: 'C'}, 'Events': {1: '10,22', 2: '7', 3: '4,5,6,1,3', 4: '22,10', 5: '6,5,4,3,1', 6: '4,5,3,6,1', 7: '7', 8: '10', 9: '7,22'}, 'groups': {1: 'G1', 2: 'G2', 3: 'G3,G4,G5,G6', 4: 'G1', 5: 'G4,G6,G5,G3', 6: 'G3,G6,G5', 7: 'G2', 8: 'G1', 9: 'G12'}}
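For anyone who wants to reproduce the transformation, here is a minimal sketch that rebuilds the dataframe from the dictionary above and applies the same grouping idea. It assumes pandas is installed. The helper names `largest_superset` and `join_unique` are made up for this illustration, and picking the biggest containing set with `max(..., key=len)` is a swapped-in variant of the `.max()` call in the question, not the original code.

```python
import pandas as pd

# Data copied from the dictionary above.
df = pd.DataFrame({
    'species': {1: 'SP1', 2: 'SP1', 3: 'SP1', 4: 'SP2', 5: 'SP2',
                6: 'SP3', 7: 'SP3', 8: 'SP3', 9: 'SP4'},
    'family': {1: 'A', 2: 'B', 3: 'C,D', 4: 'A', 5: 'D,C',
               6: 'C', 7: 'E', 8: 'A', 9: 'C'},
    'Events': {1: '10,22', 2: '7', 3: '4,5,6,1,3', 4: '22,10', 5: '6,5,4,3,1',
               6: '4,5,3,6,1', 7: '7', 8: '10', 9: '7,22'},
    'groups': {1: 'G1', 2: 'G2', 3: 'G3,G4,G5,G6', 4: 'G1', 5: 'G4,G6,G5,G3',
               6: 'G3,G6,G5', 7: 'G2', 8: 'G1', 9: 'G12'},
})

# Split each 'groups' string into a set of group names.
g = df['groups'].apply(lambda x: set(x.split(',')))


def largest_superset(s):
    """Return the biggest set in g that contains s, as a sorted string."""
    # Assumption: "biggest" is decided by set size (hypothetical helper).
    candidates = [x for x in g if x.issuperset(s)]
    return ','.join(sorted(max(candidates, key=len)))


def join_unique(col):
    """Merge comma-separated cells into one sorted, de-duplicated string."""
    return ','.join(sorted({i for j in col for i in j.split(',')}))


# One grouping key per row, then merge the other columns within each key.
g2 = g.apply(largest_superset)
result = (df[['species', 'family', 'Events']]
          .groupby(g2)
          .agg(join_unique)
          .reset_index())
print(result)
```

Values are sorted as strings, which is why `22` comes before `7` in the merged `Events` cell for `G12`.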

The same dataframe in R (dput output):
structure(list(species = structure(c(1L, 1L, 1L, 2L, 2L, 3L, 
3L, 3L, 4L), .Label = c("SP1", "SP2", "SP3", "SP4"), class = "factor"), 
    family = structure(c(1L, 2L, 4L, 1L, 5L, 3L, 6L, 1L, 3L), .Label = c("A", 
    "B", "C", "C,D", "D,C", "E"), class = "factor"), Events = structure(c(2L, 
    7L, 5L, 3L, 6L, 4L, 7L, 1L, 8L), .Label = c("10", "10,22", 
    "22,10", "4,5,3,6,1", "4,5,6,1,3", "6,5,4,3,1", "7", "7,22"
    ), class = "factor"), groups = structure(c(1L, 3L, 4L, 1L, 
    6L, 5L, 3L, 1L, 2L), .Label = c("G1", "G12", "G2", "G3,G4,G5,G6", 
    "G3,G6,G5", "G4,G6,G5,G3"), class = "factor")), class = "data.frame", row.names = c(NA, 
-9L))

Irfaan · Accepted Answer · 2021-03-05 21:22:36Z

1

You can use the reticulate package in R, as in the example below:

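# Point reticulate at the Python interpreter to use (adjust the path for your machine)
Sys.setenv(RETICULATE_PYTHON = "C:\\Users\\anaconda3\\python.exe")
library(reticulate)

# Python source, held in an R string
str1 <- "
import pandas as pd

cars = {'Brand': ['Honda Civic', 'Toyota Corolla', 'Ford Focus', 'Audi A4'],
        'Price': [22000, 25000, 27000, 35000]}

df = pd.DataFrame(cars, columns=['Brand', 'Price'])

print(df)
"

# Run the Python code in the embedded interpreter
py_run_string(str1)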
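To apply this to the dataframe in the question, one option is to keep the Python logic in its own file and expose it as a function. The sketch below is only an illustration: the file name `merge_groups.py` and the function name `merge_by_group` are made up, and the grouping rule (largest containing set, chosen by size) is a simplified variant of the code in the question. The cast to `str` is a defensive assumption, since R factor columns may arrive in Python as pandas Categorical columns.

```python
# merge_groups.py (hypothetical file name)
import pandas as pd


def merge_by_group(df):
    """Merge rows whose 'groups' set is contained in a larger 'groups' set."""
    # Assumption: columns may be Categorical (R factors), so work on plain strings.
    df = df.astype(str)

    # One set of group names per row.
    sets = [set(s.split(',')) for s in df['groups']]

    # For each row, use the largest set that contains its own set as the key.
    keys = [','.join(sorted(max((x for x in sets if x >= s), key=len)))
            for s in sets]
    key = pd.Series(keys, index=df.index, name='groups')

    def join_unique(col):
        # Merge comma-separated cells into one sorted, de-duplicated string.
        return ','.join(sorted({i for j in col for i in j.split(',')}))

    return (df[['species', 'family', 'Events']]
            .groupby(key)
            .agg(join_unique)
            .reset_index())
```

From R, reticulate's `source_python("merge_groups.py")` should then make the function callable as `merge_by_group(df)`, with the data frame converted to pandas on the way in and back to an R data frame on the way out.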


user3054605 · Accepted Answer · 2021-03-05 13:15:10Z

1

You can use the reticulate package: https://rstudio.github.io/reticulate/
