Is it possible to use Python code in R to modify or create new tables?

Here is an example of code I use in Python.

Here is a dataframe:

  species family    Events      groups
1     SP1      A     10,22          G1
2     SP1      B         7          G2
3     SP1    C,D 4,5,6,1,3 G3,G4,G5,G6
4     SP2      A     22,10          G1
5     SP2    D,C 6,5,4,3,1 G4,G6,G5,G3
6     SP3      C 4,5,3,6,1    G3,G6,G5
7     SP3      E         7          G2
8     SP3      A        10          G1
9     SP4      C      7,22         G12

With this code in Python:

g = df['groups'].apply(lambda x: set(x.split(',')))   # split each 'groups' string into a set
# for each row, keep the largest set in g that contains the current one and turn it back into a string
g2 = g.apply(lambda s: ','.join(sorted(
    g[g.apply(lambda x: x.issuperset(s))].max())))

resul = df[['species', 'family', 'Events']].groupby(g2).agg(
    lambda x: ','.join(sorted(set((i for j in x for i in j.split(',')))))
    ).reset_index()

I can transform it into:

       species family     Events       groups
0  SP1,SP2,SP3      A      10,22           G1
1          SP4      C       22,7          G12
2      SP1,SP3    B,E          7           G2
3  SP1,SP2,SP3    C,D  1,3,4,5,6  G3,G4,G5,G6

Is there a way to call the Python code and produce this result directly in R?

I need to work in R, but I'm much more familiar with Python.

The dataframe as a Python dictionary:

{'species': {1: 'SP1', 2: 'SP1', 3: 'SP1', 4: 'SP2', 5: 'SP2', 6: 'SP3', 7: 'SP3', 8: 'SP3', 9: 'SP4'}, 'family': {1: 'A', 2: 'B', 3: 'C,D', 4: 'A', 5: 'D,C', 6: 'C', 7: 'E', 8: 'A', 9: 'C'}, 'Events': {1: '10,22', 2: '7', 3: '4,5,6,1,3', 4: '22,10', 5: '6,5,4,3,1', 6: '4,5,3,6,1', 7: '7', 8: '10', 9: '7,22'}, 'groups': {1: 'G1', 2: 'G2', 3: 'G3,G4,G5,G6', 4: 'G1', 5: 'G4,G6,G5,G3', 6: 'G3,G6,G5', 7: 'G2', 8: 'G1', 9: 'G12'}}

The equivalent dataframe in R:
structure(list(species = structure(c(1L, 1L, 1L, 2L, 2L, 3L, 
3L, 3L, 4L), .Label = c("SP1", "SP2", "SP3", "SP4"), class = "factor"), 
    family = structure(c(1L, 2L, 4L, 1L, 5L, 3L, 6L, 1L, 3L), .Label = c("A", 
    "B", "C", "C,D", "D,C", "E"), class = "factor"), Events = structure(c(2L, 
    7L, 5L, 3L, 6L, 4L, 7L, 1L, 8L), .Label = c("10", "10,22", 
    "22,10", "4,5,3,6,1", "4,5,6,1,3", "6,5,4,3,1", "7", "7,22"
    ), class = "factor"), groups = structure(c(1L, 3L, 4L, 1L, 
    6L, 5L, 3L, 1L, 2L), .Label = c("G1", "G12", "G2", "G3,G4,G5,G6", 
    "G3,G6,G5", "G4,G6,G5,G3"), class = "factor")), class = "data.frame", row.names = c(NA, 
-9L))

2 Answers

You can use the reticulate package in R, as in the example below:

Sys.setenv(RETICULATE_PYTHON = "C:\\Users\\anaconda3\\python.exe")
library(reticulate)

str1 = "

import pandas as pd
cars = {'Brand': ['Honda Civic','Toyota Corolla','Ford Focus','Audi A4'],
    'Price': [22000,25000,27000,35000]}

df = pd.DataFrame(cars, columns = ['Brand', 'Price'])

print (df)

"

py_run_string(str1)
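
For the data in the question, one option is to put the transformation into a Python function in its own file, which reticulate's source_python() can then load so the function is callable from R. The block below is only a minimal sketch under some assumptions: pandas is installed in the Python environment that reticulate uses, and the file name merge_groups.py and the function name merge_groups are hypothetical. It also picks the largest containing set with max(..., key=len) rather than calling .max() on a Series of sets as the question's code does.

```python
# merge_groups.py (hypothetical file name)
import pandas as pd


def merge_groups(df):
    """Merge rows whose 'groups' set is contained in a larger row's set."""
    # R factors may arrive as pandas Categoricals; work on plain strings.
    df = df.astype(str)

    # One set of group labels per row.
    sets = [frozenset(s.split(",")) for s in df["groups"]]

    # For each row, the grouping key is the largest set (by size) that
    # contains the row's own set, written back as a sorted string.
    keys = [
        ",".join(sorted(max((t for t in sets if t >= s), key=len)))
        for s in sets
    ]
    key = pd.Series(keys, index=df.index, name="groups")

    def join_unique(col):
        # Split every cell on commas, deduplicate, sort, and rejoin.
        return ",".join(sorted({v for cell in col for v in cell.split(",")}))

    return (
        df[["species", "family", "Events"]]
        .groupby(key)
        .agg(join_unique)
        .reset_index()
    )
```

From R, calling source_python("merge_groups.py") and then merge_groups(df) should return an R data frame, since reticulate converts data frames to pandas DataFrames and back. Note that values are sorted as strings, so "22,7" comes out in that order, as in the expected output in the question.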
You can use the reticulate package: https://rstudio.github.io/reticulate/
