
I need help generating a dynamic list of files to be opened by pandas.read_csv()

from os import listdir
from os.path import isfile, join
import pandas as pd

_FilePath = r'\\Ezquest\Quality Control\Transend Programs\ConversionTest'
_ExtensionToLookFor = '.csv'
# Returning all files in folder ** If the folder needs to be changed - _FilePath #
_FileReturn = [_Files for _Files in listdir(_FilePath) if isfile(join(_FilePath, _Files))]
_FileReturn = pd.DataFrame(_FileReturn, columns=['Files'])
# Returning only CSV files (endswith avoids str.contains treating '.csv' as a regex) #
_FileReturn = _FileReturn[_FileReturn['Files'].str.endswith(_ExtensionToLookFor)]

class SpcFormat:
    def __init__(self, _FileReturn):
        self._FileReturn = _FileReturn
    def DataframeCreation(self):
        _FileReturn = self._FileReturn
        for i in _FileReturn:
            StartInt = 1

I am having trouble accomplishing this; near the bottom I am trying to iterate over the list and name each DataFrame after its count position.

So in pseudocode it should go like this:

For Files in _FileReturn:
    Create New DataFrame(StartInt) = pandas.read_csv(DataFrame(IntPosition) + _ExtensionToLookFor)
    StartInt += 1  // Add one

Thanks !
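Python has no clean way to create numbered variable names at runtime; a dictionary keyed by the running counter is the idiomatic equivalent of the pseudocode above. A minimal, self-contained sketch, with a throwaway folder and made-up file names so it runs anywhere (the real folder path from the question is not assumed):

```python
import tempfile
from glob import glob
from os.path import join

import pandas as pd

# Build a throwaway folder with two sample CSV files so the sketch is runnable
folder = tempfile.mkdtemp()
for name in ('alpha.csv', 'beta.csv'):
    with open(join(folder, name), 'w') as f:
        f.write('a,b\n1,2\n3,4\n')

# Dictionary version of the pseudocode: one DataFrame per file,
# keyed by the running counter (StartInt starts at 1, as in the question)
frames = {}
start_int = 1
for path in sorted(glob(join(folder, '*.csv'))):
    frames['csv_' + str(start_int)] = pd.read_csv(path)
    start_int += 1  # add one

print(sorted(frames))         # ['csv_1', 'csv_2']
print(frames['csv_1'].shape)  # (2, 2)
```

The dictionary grows with however many CSV files the folder happens to contain, which covers the "size has to be dynamic" requirement.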

*Edit: For clarity, what I am trying to do is check a folder, return all the files in the folder, filter by a specific file type, and THEN dynamically create DataFrames with names that follow the count of CSV files retrieved.*

from glob import glob

_FilePath = r'\\Ezquest\Quality Control\Transend Programs\ConversionTest'
# Returning all files in folder ** If the folder needs to be changed - _FilePath #
_FileReturn = glob(_FilePath + '\\' + '*.csv')
#_FileReturn = [_Files for _Files in listdir(_FilePath) if isfile(join(_FilePath, _Files))]
_FileReturn = pd.DataFrame(_FileReturn, columns=['Files'])

# Attempting to create one DataFrame per CSV file #
_Files = {
    'csv_' + str(_FileReturnName): pd.read_csv(_FileReturn['Files'], sep=',', encoding='latin')
    for _FileReturnName in range(len(_FileReturn['Files']))
}

The above code includes part of an answer from @J. Doe, although it is now raising the following traceback:

Traceback (most recent call last):

  File "<ipython-input-3-a7027d4eb492>", line 3, in <module>
    for _FileReturnName in range(len(_FileReturn['Files']))

  File "<ipython-input-3-a7027d4eb492>", line 3, in <dictcomp>
    for _FileReturnName in range(len(_FileReturn['Files']))

  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\parsers.py", line 702, in parser_f
    return _read(filepath_or_buffer, kwds)

  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\parsers.py", line 413, in _read
    filepath_or_buffer, encoding, compression)

  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\common.py", line 232, in get_filepath_or_buffer
    raise ValueError(msg.format(_type=type(filepath_or_buffer)))

ValueError: Invalid file path or buffer object type: <class 'pandas.core.series.Series'>

Any further help would be appreciated!

  • I don't know what you are trying to accomplish. Post an example and the desired output. Commented Jun 17, 2019 at 14:58
  • @ZarakiKenpachi -- I want to: 1) Look in a folder 2) Retrieve all the files in folder 3) Filter files by file type 4) Create new dataframes according to how many files I have ** The dataframes named according to the number of files in the folder that has been retrieved - So if I have 5 csv files they are named csv(1), csv(2)... etc Commented Jun 17, 2019 at 15:01
  • It's not clear to me either. You want to get all filenames of csv files in a directory. You got that, right? You're stuck with what follows, which is the creation of the dataframes itself. But what exactly do you want to do? Commented Jun 17, 2019 at 15:07
  • @89f3a1c Sorry for my poor explanation - I have a DataFrame with all the file names - that's gravy, but now I want to generate a separate DataFrame for each entry in my file list - so if I have 6 csv files, 6 dataframes get generated, one attached to each file - the stipulation is that the number of CSV files can change, so the size has to be dynamic Commented Jun 17, 2019 at 15:12

1 Answer


You can consider the following:

import pandas as pd
from glob import glob

path = 'path/to/csv/folder/'
# get all csv files in the folder
files = glob(path + '*.csv')
# create a dictionary of DataFrames, one per csv file
df_dict = {'csv_' + str(k): pd.read_csv(files[k], sep=',', encoding='latin')
           for k in range(len(files))}
# then check each dataframe by indexing, e.g. df_dict['csv_0']

2 Comments

Works like a charm! Thanks!
It was working on my test set - now I am trying to apply it to my real data and _FileReturn = glob(_FilePath + '*.csv') raises TypeError: can't concat str to bytes. Any ideas?
