
Case:

I'm creating dataframes programmatically in Python using globals().

In the code below, I'm creating 5 dataframes whose names start with 'PREFIX' in caps, followed by a letter, and end with a suffix.

R

library(reticulate)
repl_python()

Python

import os
import pandas as pd

letters = ('a','b','c','d','e')
df_names = []

for ele in letters:
  name = 'PREFIX_{}_suffix'.format(ele)
  globals()[name] = pd.DataFrame(columns = ['col_a', 'col_b']).astype(str)
  df_names.append(name)
print(df_names)
['PREFIX_a_suffix', 'PREFIX_b_suffix', 'PREFIX_c_suffix', 'PREFIX_d_suffix', 'PREFIX_e_suffix']

Request:

I would like to select the dataframes whose names start with a prefix (ideally with the regular expression ^PREFIX) and move those specific dataframes from reticulate's Python environment to the R environment programmatically.

For the sake of the task, I have added the dataframe variable names to df_names. However, using regex is highly encouraged.

I know the variables are stored in the py object and can be accessed with $, but I'm not sure how to select the dataframes iteratively and move them from Python's environment to R's environment all at once.


In R, I usually use ls(pattern=<regex>) to select objects in the R environment.

In Python, you can list variables using locals(); see this thread.

This thread discusses passing Python functions from R to Python.

1 Answer

Here is my solution using regex:

In python:

  • Create a regex pattern that matches the names of the variables you want
  • Apply the pattern to the output of dir(), which lists the names defined in the current Python scope
  • Save the matching variable names (the dataframes) in a list

import re

r = re.compile("^PREFIX")
py_dfs = list(filter(r.match, dir())) # fetch defined variables from python's env
print(py_dfs)
['PREFIX_a_suffix', 'PREFIX_b_suffix', 'PREFIX_c_suffix', 'PREFIX_d_suffix', 'PREFIX_e_suffix']
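
Filtering dir() matches on names only, so any non-dataframe variable that happens to start with PREFIX would be selected too. A minimal sketch of a stricter selection follows, assuming you are willing to filter on the object type as well. The helper select_by_prefix, the stand-in class Frame and demo_namespace are hypothetical names used for illustration, not part of the original answer. Frame only replaces pd.DataFrame so that the sketch runs without pandas.

```python
import re

def select_by_prefix(namespace, pattern, kind=object):
    """Return {name: obj} for names matching `pattern` whose value is a `kind`."""
    rx = re.compile(pattern)
    return {
        name: obj
        for name, obj in namespace.items()
        if rx.search(name) and isinstance(obj, kind)
    }

# Hypothetical stand-in for pandas.DataFrame, so the sketch runs without pandas.
class Frame:
    pass

# Hypothetical namespace; in a real session this would be globals().
demo_namespace = {
    "PREFIX_a_suffix": Frame(),
    "PREFIX_b_suffix": Frame(),
    "PREFIX_note": "a string, not a frame",  # right prefix, wrong type
    "other_PREFIX": Frame(),                 # right type, wrong position
}

selected = select_by_prefix(demo_namespace, r"^PREFIX", kind=Frame)
print(sorted(selected))  # ['PREFIX_a_suffix', 'PREFIX_b_suffix']
```

In the session from the question, the call would presumably be select_by_prefix(globals(), r"^PREFIX", kind=pd.DataFrame). Keeping the result as a dict also bundles the objects with their names. On the R side, reticulate should hand such a dict over as a named list, but that conversion is an assumption worth checking in your own setup.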

In R:

  • Access the Python list that holds the selected variable names
  • Evaluate each Python object by name with reticulate::py_eval and convert it to an R object with reticulate::py_to_r
  • Use assign to create R variables with the same names as the Python variables (dataframes)

for (name in py$py_dfs) {
  # Evaluate the Python variable by name and convert it to an R data.frame
  r_df <- py_to_r(py_eval(name))
  # Create an R variable with the same name as the Python variable
  assign(name, r_df)
}

> ls(pattern="^PREFIX")
[1] "PREFIX_a_suffix" "PREFIX_b_suffix" "PREFIX_c_suffix" "PREFIX_d_suffix" "PREFIX_e_suffix"
> dim(PREFIX_a_suffix)
[1] 0 2
> class(PREFIX_a_suffix)
[1] "data.frame"