I have a dataset of more than 300k files which I need to read and append to a list of DataFrames.
import os
import pandas as pd

# collect the path of every file in the corpus directory
corpus_path = "data"
article_paths = [os.path.join(corpus_path, p) for p in os.listdir(corpus_path)]

# read each file into a DataFrame and keep them all in a list
doc = []
for path in article_paths:
    dp = pd.read_table(path, header=None, encoding='utf-8', quoting=3, error_bad_lines=False)
    doc.append(dp)
Is there a faster way to do this? The current method takes more than an hour.
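One direction I have been considering (an untested sketch, assuming the files are independent and that per-file parsing overhead is the main cost rather than disk I/O) is to spread the reads across worker processes with concurrent.futures; the chunksize value here is just a guess to tune:

import os
from concurrent.futures import ProcessPoolExecutor

import pandas as pd

corpus_path = "data"
article_paths = [os.path.join(corpus_path, p) for p in os.listdir(corpus_path)]

def read_article(path):
    # Same read call as the sequential version, pulled into a function so
    # it can be mapped across worker processes.
    # (error_bad_lines is deprecated in newer pandas; on_bad_lines='skip'
    # may be needed there.)
    return pd.read_table(path, header=None, encoding='utf-8',
                         quoting=3, error_bad_lines=False)

if __name__ == "__main__":
    with ProcessPoolExecutor() as executor:
        # chunksize batches several paths per task to reduce scheduling overhead
        doc = list(executor.map(read_article, article_paths, chunksize=256))

Would something like this be the right approach, or is there a better way?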