I am creating a string that is about 30 million words long. As you can imagine, this takes absolutely forever to build with a for loop that appends roughly 100 words at a time. Is there a way to represent the string in a more memory-friendly way, such as a numpy array? I have very little experience with numpy.
```python
# df is a pandas DataFrame whose 'text' column holds the tweets
bigStr = ''
for tweet in df['text']:
    bigStr = bigStr + ' ' + tweet  # copies the entire accumulated string each iteration
len(bigStr)
```
`bigStr` is, and will be, a regular Python `str` value, no matter what compatible type `tweet` may have.
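What actually makes the loop slow is that every `bigStr + ' ' + tweet` copies the whole accumulated string, so the total work grows quadratically with the number of tweets. The usual fix is `str.join`, which gathers all the pieces and allocates the final string in a single pass. Here is a minimal sketch, assuming `df['text']` contains only plain strings; the tiny example DataFrame is just a stand-in for your data:

```python
import pandas as pd

# Stand-in data; in your case df already exists with millions of words of tweets.
df = pd.DataFrame({'text': ['first tweet', 'second tweet', 'third tweet']})

# join sizes and fills the result once, instead of re-copying bigStr
# on every loop iteration.
bigStr = ' '.join(df['text'])

print(len(bigStr))
```

Note that the loop version also prepends a space before the first tweet; if you need byte-for-byte identical output, use `' ' + ' '.join(df['text'])`. And if the column might contain NaN or non-string values, clean it first, e.g. `' '.join(df['text'].dropna().astype(str))`.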