
I need help generating a dynamic list of files to be opened by pandas.read_csv()

from os import listdir
from os.path import isfile, join
import pandas as pd

_FilePath = r'\\Ezquest\Quality Control\Transend Programs\ConversionTest'
_ExtensionToLookFor = '.csv'
# Returning all files in folder ** If the folder needs to be changed - _FilePath #
_FileReturn = [_Files for _Files in listdir(_FilePath) if isfile(join(_FilePath, _Files))]
_FileReturn = pd.DataFrame(_FileReturn, columns=['Files'])
# Returning only CSV files (endswith avoids str.contains treating '.csv' as a regex) #
_FileReturn = _FileReturn[_FileReturn['Files'].str.endswith(_ExtensionToLookFor)]

class SpcFormat:
    def __init__(self, _FileReturn):
        self._FileReturn = _FileReturn
    def DataframeCreation(self):
        _FileReturn = self._FileReturn
        for i in _FileReturn:
            StartInt = 1

I am having trouble accomplishing this; near the bottom I am trying to iterate over the list and name each DataFrame after its count position.

So in pseudocode it should go like this:

For Files in _FileReturn:
    Create New DataFrame(StartInt) = pandas.read_csv(DataFrame(IntPosition) + _ExtensionToLookFor)
    StartInt += 1  // Add one

Thanks !
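Python has no clean way to create numbered variable names at runtime; a dictionary keyed by the running counter is the idiomatic equivalent of the pseudocode above. A minimal, self-contained sketch, with a throwaway folder and made-up file names so it runs anywhere (the real folder path from the question is not assumed):

```python
import tempfile
from glob import glob
from os.path import join

import pandas as pd

# Build a throwaway folder with two sample CSV files so the sketch is runnable
folder = tempfile.mkdtemp()
for name in ('alpha.csv', 'beta.csv'):
    with open(join(folder, name), 'w') as f:
        f.write('a,b\n1,2\n3,4\n')

# Dictionary version of the pseudocode: one DataFrame per file,
# keyed by the running counter (StartInt starts at 1, as in the question)
frames = {}
start_int = 1
for path in sorted(glob(join(folder, '*.csv'))):
    frames['csv_' + str(start_int)] = pd.read_csv(path)
    start_int += 1  # add one

print(sorted(frames))         # ['csv_1', 'csv_2']
print(frames['csv_1'].shape)  # (2, 2)
```

The dictionary grows with however many CSV files the folder happens to contain, which covers the "size has to be dynamic" requirement.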

*Edit: For clarity, what I am trying to do is check a folder, return all the files in the folder, filter by a specific file type, and THEN dynamically create DataFrames with names that follow the count of CSV files retrieved.*

from glob import glob

_FilePath = r'\\Ezquest\Quality Control\Transend Programs\ConversionTest'
# Returning all files in folder ** If the folder needs to be changed - _FilePath #
_FileReturn = glob(_FilePath + '\\' + '*.csv')
#_FileReturn = [_Files for _Files in listdir(_FilePath) if isfile(join(_FilePath, _Files))]
_FileReturn = pd.DataFrame(_FileReturn, columns=['Files'])

# Attempting to create one DataFrame per CSV file #
_Files = {
    'csv_' + str(_FileReturnName): pd.read_csv(_FileReturn['Files'], sep=',', encoding='latin')
    for _FileReturnName in range(len(_FileReturn['Files']))
}

The above code includes part of an answer from @J. Doe, although it is now raising the following traceback:

Traceback (most recent call last):

  File "<ipython-input-3-a7027d4eb492>", line 3, in <module>
    for _FileReturnName in range(len(_FileReturn['Files']))

  File "<ipython-input-3-a7027d4eb492>", line 3, in <dictcomp>
    for _FileReturnName in range(len(_FileReturn['Files']))

  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\parsers.py", line 702, in parser_f
    return _read(filepath_or_buffer, kwds)

  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\parsers.py", line 413, in _read
    filepath_or_buffer, encoding, compression)

  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\common.py", line 232, in get_filepath_or_buffer
    raise ValueError(msg.format(_type=type(filepath_or_buffer)))

ValueError: Invalid file path or buffer object type: <class 'pandas.core.series.Series'>

Any further help would be appreciated!

  • I don't know what you are trying to accomplish. Post an example and the desired output. Commented Jun 17, 2019 at 14:58
  • @ZarakiKenpachi -- I want to: 1) Look in a folder 2) Retrieve all the files in folder 3) Filter files by file type 4) Create new dataframes according to how many files I have ** The dataframes named according to the number of files in the folder that has been retrieved - So if I have 5 csv files they are named csv(1), csv(2)... etc Commented Jun 17, 2019 at 15:01
  • It's not clear to me either. You want to get all filenames of csv files in a directory. You got that, right? You're stuck with what follows, which is the creation of the dataframes itself. But what exactly do you want to do? Commented Jun 17, 2019 at 15:07
  • @89f3a1c Sorry for my poor explanation - I have a DataFrame with all the file names - that's gravy, but now I want to generate a separate DataFrame for each entry in my file list - so if I have 6 csv files, 6 dataframes get generated, one attached to each file - the stipulation is that the number of CSV files can change, so the size has to be dynamic Commented Jun 17, 2019 at 15:12

1 Answer


You can consider the following:

import pandas as pd
from glob import glob

path = 'path/to/csv/folder/'
# get all csv files in the folder
files = glob(path + '*.csv')
# create a dictionary of DataFrames, one per csv file
df_dict = {'csv_' + str(k): pd.read_csv(files[k], sep=',', encoding='latin')
           for k in range(len(files))}
# then check each dataframe by indexing, e.g. df_dict['csv_0']

2 Comments

Works like a charm! Thanks!
It was working on my test set - now I am trying to apply it to my real data and _FileReturn = glob(_FilePath + '*.csv') raises TypeError: can't concat str to bytes. Any ideas?
