
I have a dataset of more than 300k files that I need to read and append to a list.

import os
import pandas as pd

corpus_path = "data"
article_paths = [os.path.join(corpus_path, p) for p in os.listdir(corpus_path)]

doc = []
for path in article_paths:
    dp = pd.read_table(path, header=None, encoding='utf-8', quoting=3, error_bad_lines=False)
    doc.append(dp)

Is there a faster way to do this? The current method takes more than an hour.

  • If you have an SSD, then you can try threads. Otherwise, probably not. Commented Feb 24, 2018 at 15:34
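For reference, here is a minimal sketch of the thread-based approach the comment suggests, using concurrent.futures from the standard library. It assumes article_paths is built as in the question, and the worker count of 8 is an arbitrary starting point rather than a recommendation.

from concurrent.futures import ThreadPoolExecutor

import pandas as pd

def read_file(path):
    # Same read call as in the question.
    return pd.read_table(path, header=None, encoding='utf-8', quoting=3, error_bad_lines=False)

# Threads mainly help when the reads are I/O-bound (e.g. on an SSD),
# since the parsing itself is still limited by the GIL.
with ThreadPoolExecutor(max_workers=8) as executor:
    doc = list(executor.map(read_file, article_paths))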

1 Answer


You can use the multiprocessing module to read the files in parallel.

import os
from multiprocessing import Pool

import pandas as pd

def readFile(path):
    return pd.read_table(path, header=None, encoding='utf-8', quoting=3, error_bad_lines=False)

nprocs = os.cpu_count()  # number of processors
result = list(Pool(processes=nprocs).imap(readFile, article_paths))
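If a single DataFrame is wanted rather than a list of per-file frames, they can be concatenated afterwards. The following sketch is not part of the original answer; it assumes article_paths is defined as in the question, and it wraps the pool in an if __name__ == '__main__': guard, which multiprocessing needs on platforms that use the spawn start method (e.g. Windows).

if __name__ == '__main__':
    # Use a context manager so the worker processes are cleaned up,
    # then combine the per-file DataFrames into one.
    with Pool(processes=nprocs) as pool:
        frames = pool.map(readFile, article_paths)
    corpus = pd.concat(frames, ignore_index=True)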