I have a dataframe, df, and a list of strings, cols_needed, naming the columns I want to keep in df. The column names in df do not match the strings in cols_needed exactly, so I cannot simply take the intersection of the two. However, each column I want has a name that contains one of the strings in cols_needed as a substring. I tried using str.contains but couldn't get it to work. How can I subset df based on cols_needed?
import pandas as pd
df = pd.DataFrame({
    'sim-prod1': [1, 2],
    'sim-prod2': [3, 4],
    'sim-prod3': [5, 6],
    'sim_prod4': [7, 8]
})
cols_needed = ['prod1', 'prod2']
# What I want to obtain:
   sim-prod1  sim-prod2
0          1          3
1          2          4
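
One possible approach, offered as a sketch rather than a definitive answer: keep every column whose name contains at least one of the strings in cols_needed. The names matching, subset, pattern, and subset_regex are illustrative, not from the original post. The second variant uses str.contains on df.columns with a regex alternation, escaping each string so that any special characters are matched literally.

```python
import re

import pandas as pd

df = pd.DataFrame({
    'sim-prod1': [1, 2],
    'sim-prod2': [3, 4],
    'sim-prod3': [5, 6],
    'sim_prod4': [7, 8],
})
cols_needed = ['prod1', 'prod2']

# Variant 1: plain substring test in a list comprehension.
# A column is kept if any needed string occurs anywhere in its name.
matching = [c for c in df.columns if any(s in c for s in cols_needed)]
subset = df[matching]

# Variant 2: str.contains on the column Index.
# Join the needed strings into one regex ("prod1|prod2"), escaping each
# so characters such as "." or "+" are treated literally.
pattern = '|'.join(re.escape(s) for s in cols_needed)
subset_regex = df.loc[:, df.columns.str.contains(pattern)]

print(subset)
```

Both variants do substring matching, so 'prod1' would also select a column named 'sim-prod10' if one existed. If that matters, a stricter test such as c.endswith(s) may be a better fit.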