I have written a parser that extracts information from raw HTML source code and returns it as a tuple. I need to call this function in a loop and use the returned tuples to build a DataFrame (each call's return value becomes one row). Here's what I have done:
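import pandas as pd
import leveldb

# parser, index and db are defined elsewhere (see the note below)
df = pd.DataFrame(columns=index)  # start from an empty frame
for key, value in db.RangeIter():
    html = db.Get(key)  # same bytes as `value`, fetched again by key
    result = parser(html)
    # copies the whole frame on every iteration, so the loop is quadratic
    df = df.append(pd.Series(result, index=index), ignore_index=True)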
Note that parser and index are already defined, and db is a leveldb object that stores all the links and their corresponding HTML source code. My question is: what is a more efficient way to construct that DataFrame? Thanks!
Do you want a single column of tuples, or len(tuple) columns? If the former, you're probably better off just appending to a simple list, then converting that list to a series after the for loop.
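For the case where each tuple element should become its own column, the same list-first idea can be sketched as below. This is only a sketch under assumptions: build_frame, toy_parser and the sample pairs are made-up names for illustration, and the real parser, index and db from the question would take their place. It also sidesteps DataFrame.append, which was removed in pandas 2.0.

```python
import pandas as pd

def build_frame(pairs, parser, index):
    """Parse every (key, html) pair and build the DataFrame in one step.

    `pairs` is any iterable of (key, html) tuples (hypothetical helper,
    not part of pandas or leveldb).
    """
    # Collect plain tuples first; appending to a list is cheap.
    rows = [parser(html) for _key, html in pairs]
    # Build the frame once, instead of copying it on every iteration.
    return pd.DataFrame(rows, columns=index)

# Stand-in parser and data, for illustration only.
def toy_parser(html):
    return (len(html), html.upper())

pairs = [("a", "<p>x</p>"), ("b", "<div></div>")]
df = build_frame(pairs, toy_parser, ["length", "upper"])
print(df)
```

With the objects from the question, the call would presumably look like build_frame(db.RangeIter(), parser, index), since the loop there already unpacks key/value pairs from RangeIter().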