I have multiple CSV files arranged as a star schema. To perform analytics using Python, is it better to combine all of these files into one CSV file, or to read the data from each file separately and then do the analytics? Most examples I have found online combine everything into one file first and then run the analysis. However, combining all the CSV files would eliminate my star schema. Each CSV file currently has approximately 25,000 rows and 10 columns, and is around 7 MB in size. Thank you in advance for your help.
-
@RoadRunner Should I combine all files into one big file, or read multiple files and then do analytics from the multiple files? – pack24, Jul 4, 2018 at 4:13
-
How many CSV files are there? Is it a big issue if you remove your star schema? I'm assuming you have 6 CSV files, from your previous question. If that is the case and you combine the files, the result will be around 42 MB, which shouldn't be a problem, and you only have to read one file. Otherwise, just read the files separately. – RoadRunner, Jul 4, 2018 at 4:19
-
@RoadRunner Thank you for your help! I'll combine all files into one and proceed. – pack24, Jul 4, 2018 at 4:37
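
For the other option mentioned above, reading the files separately, here is a minimal sketch of keeping the star schema and joining only when a question needs it. The table contents, the column names (`sale_id`, `product_id`, `amount`, `category`) and the helper names are made up for illustration; the real files would be opened from disk rather than held in strings. It uses only the standard library's `csv` module, though the same join is commonly done with a pandas merge.

```python
import csv
import io
from collections import defaultdict

# Hypothetical stand-ins for one fact file and one dimension file of the
# star schema; in practice these would come from open("...csv") calls.
FACT_CSV = """sale_id,product_id,amount
1,10,5.0
2,11,7.5
3,10,2.5
"""
DIM_CSV = """product_id,category
10,books
11,games
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def total_by_category(fact_rows, dim_rows):
    """Join fact rows to the dimension on product_id, then sum amount per category."""
    # Build a lookup from the dimension table: product_id -> category.
    category_of = {row["product_id"]: row["category"] for row in dim_rows}
    totals = defaultdict(float)
    for row in fact_rows:
        totals[category_of[row["product_id"]]] += float(row["amount"])
    return dict(totals)


totals = total_by_category(read_rows(FACT_CSV), read_rows(DIM_CSV))
print(totals)
```

With files of about 25,000 rows each, either approach should fit comfortably in memory, so the choice is mostly about whether the separate tables are still useful for other questions.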