
I have a list of files with names such as

topaccount_2015_09_individuals
topaccount_2015_12_individuals
...
topaccount_2021_12_individuals

which are subsets of

topaccount_2015_09
topaccount_2015_12
...
topaccount_2021_12

I want to loop over them and do some data manipulation, so I created a list:

known_series = known['Address']
y = ['2015_09', '2015_12', '2016_03', '2016_06', '2016_09', '2016_12', '2017_03', '2017_06', '2017_09', '2017_12',
     '2018_03', '2018_06', '2018_09', '2018_12', '2019_03', '2019_06', '2019_09', '2019_12' , '2020_03', '2020_06', '2020_09', '2020_12', 
     '2021_03', '2021_03', '2021_06', '2021_09', '2021_12']

for q in y:
    topaccount_[q]_individuals = topaccount_[q][~topaccount_[q]['address'].isin(known_series)]
    topaccount_[q]_individuals = topaccount_[q]_individuals.reset_index(drop=True)

But it gives me an error. What am I doing wrong? (known_series is already defined in the script.)

UPDATE: I followed the suggestion below, but I have one more problem: how do I refer to the master dataframe from which I am extracting the _individuals dataframes?

y = ['2015_09', '2015_12', '2016_03', '2016_06', '2016_09', '2016_12', '2017_03', '2017_06', '2017_09', '2017_12',
     '2018_03', '2018_06', '2018_09', '2018_12', '2019_03', '2019_06', '2019_09', '2019_12' , '2020_03', '2020_06', '2020_09', '2020_12', 
     '2021_03', '2021_03', '2021_06', '2021_09', '2021_12']

file_individuals = []
file = []

for x in y:
    file_individuals.append(f'topaccount_{x}_individuals')
    file.append(f'topaccount_{x}')
    
print(file_individuals)
print(file)
    
for file_individuals in file_individuals:
    file_individuals = topaccount_[q][~topaccount_[q]['address'].isin(known_series)]
    file_individuals = file_individuals[~file_individuals['address'].isin(coinmarketcap_series)]
    file_individuals = file_individuals[~file_individuals['address'].isin(tord_series)]
    file_individuals = file_individuals[~file_individuals['address'].isin(exchanges_series)]
    file_individuals = file_individuals.reset_index(drop=True)

REUPDATE

d = {}
names=[]
for x in y:
     d['ind'] = f"topaccount_{x}_individuals"
     d['top'] = f"topaccount_{x}"
     names.append(d)
    
for n in names:
    n['ind'] = n['top'][~n['top']['address'].isin(known_series)] 

and I get the following error:

   n['ind'] = n['top'][~n['top']['address'].isin(known_series)]

TypeError: string indices must be integers
  • Maybe known_series is already defined, but your example is not reproducible. You can't create variables dynamically like topaccount_[q]_individuals. Commented Mar 21, 2022 at 9:42
  • So I have to do everything manually? Commented Mar 21, 2022 at 9:50
  • No, you can use a dictionary topaccount_individuals indexed by y. Commented Mar 21, 2022 at 9:52
  • You mean this file topaccount_[q]? Commented Mar 21, 2022 at 10:19
  • Exactly. I am not sure how to address them, since their dates have to match (i.e. extracting top_account_2012_12_individuals from top_account_2012_12). Commented Mar 21, 2022 at 10:22

1 Answer


Something like this, then use the name list later:

name = []
for x in y:
    name.append(f'topaccount_{x}_individuals') 
    
print(name)

['topaccount_2015_09_individuals', 'topaccount_2015_12_individuals', 'topaccount_2016_03_individuals', 'topaccount_2016_06_individuals', 'topaccount_2016_09_individuals', 'topaccount_2016_12_individuals', 'topaccount_2017_03_individuals', 'topaccount_2017_06_individuals', 'topaccount_2017_09_individuals', 'topaccount_2017_12_individuals', 'topaccount_2018_03_individuals', 'topaccount_2018_06_individuals', 'topaccount_2018_09_individuals', 'topaccount_2018_12_individuals', 'topaccount_2019_03_individuals', 'topaccount_2019_06_individuals', 'topaccount_2019_09_individuals', 'topaccount_2019_12_individuals', 'topaccount_2020_03_individuals', 'topaccount_2020_06_individuals', 'topaccount_2020_09_individuals', 'topaccount_2020_12_individuals', 'topaccount_2021_03_individuals', 'topaccount_2021_03_individuals', 'topaccount_2021_06_individuals', 'topaccount_2021_09_individuals', 'topaccount_2021_12_individuals']

Alternatively,

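names = []
for x in y:
    # Create a fresh dict for each period; appending one shared dict
    # would leave every list entry pointing at the last period's names.
    d = {}
    d['ind'] = f"topaccount_{x}_individuals"
    d['top'] = f"topaccount_{x}"
    names.append(d)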

for n in names:
    n['ind'] = n['top'].....
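
Note that the values stored under 'ind' and 'top' above are plain strings (names), not DataFrames, so an expression like n['top']['address'] indexes a string with a string and raises "TypeError: string indices must be integers". A minimal sketch of the dictionary approach suggested in the comments follows. It assumes the master DataFrames can be stored in a dict keyed by period; the names topaccount, topaccount_individuals and periods, as well as the sample data, are hypothetical stand-ins for what the real script already has.

```python
import pandas as pd

# Hypothetical stand-ins: in the real script known_series and the master
# DataFrames already exist.
known_series = pd.Series(["a1", "a2"])
periods = ["2015_09", "2015_12"]

# Keep the master DataFrames in a dict keyed by period instead of in
# separately named variables such as topaccount_2015_09.
topaccount = {
    "2015_09": pd.DataFrame({"address": ["a1", "b1", "c1"]}),
    "2015_12": pd.DataFrame({"address": ["a2", "b2"]}),
}

# Build the matching "individuals" DataFrames under the same keys,
# so each subset is paired with its master by period.
topaccount_individuals = {}
for q in periods:
    master = topaccount[q]
    filtered = master[~master["address"].isin(known_series)]
    topaccount_individuals[q] = filtered.reset_index(drop=True)

print(topaccount_individuals["2015_09"])
```

Because both dicts share the same keys, topaccount_individuals[q] always corresponds to topaccount[q], and further exclusion lists can be applied inside the same loop with additional isin filters.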

3 Comments

The very first line reads data from the master file, "topaccount_2015_12[~topaccount_2015_12['address'].isin(known_series)]". How can I approach this? Do I put a loop inside a loop?
You can create these file names beforehand and then use them in the code.
Yes, I understand that, but these are two different sets of files. Code:
    y = ['2015_09', '2015_12', '2016_03', '2016_06', '2016_09', '2016_12', '2017_03', '2017_06', '2017_09', '2017_12', '2018_03', '2018_06', '2018_09', '2018_12', '2019_03', '2019_06', '2019_09', '2019_12' , '2020_03', '2020_06', '2020_09', '2020_12', '2021_03', '2021_03', '2021_06', '2021_09', '2021_12']
    file_individuals = []
    file = []
    for x in y:
        file_individuals.append(f'topaccount_{x}_individuals')
        file.append(f'topaccount_{x}')
