I have extracted some HTML using BeautifulSoup and written a function that keeps only the useful information. I intend to run this function for multiple keywords and put the results in a DataFrame. However, I cannot get all the lists into the pandas DataFrame.
Example:
words = ['header', 'title', 'number']
The following code gets me a list of all headers, a list of all titles, and a list of all numbers; the lists are all the same length.
def create_list(x):
    column = []
    BRKlist = BRK.find_all(x)
    for n in BRKlist:
        drop_beginning = r'<'+x+'>'
        drop_end = r'</'+x+'>'
        no_beginning = re.sub(drop_beginning, '', str(n))
        final = re.sub(drop_end, '', str(no_beginning))
        column.append(final)
    print(column)
This code outputs:
['header1', 'header2', 'header3']
['title1', 'title2', 'title3']
['number1', 'number2', 'number3']
I am looking for a way to combine these into a single DataFrame that looks like this:
| header | title | number |
|---|---|---|
| header1 | title1 | number1 |
| header2 | title2 | number2 |
| header3 | title3 | number3 |
Getting the lists was no problem, but when I make an empty DataFrame:
df = pd.DataFrame({x: []})
and try to append the columns, I get the following error:
TypeError: unhashable type: 'list'
Is there any way to circumvent this, or any other/easier way to "append columns"?
Do you create the DataFrame inside create_list or outside? As it stands, this function doesn't return anything; it just prints the lists.
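To illustrate the point about returning the list, here is a minimal sketch of one possible approach, not a definitive fix. It assumes BRK is a BeautifulSoup object; the SAMPLE markup below is hypothetical and only stands in for the real page. The function returns its list instead of printing it, and the DataFrame is built in one step from a dictionary that maps each keyword to its list. The TypeError in the question most likely comes from using a list as a dictionary key in `{x: []}`, which this construction avoids because every key is a single string.

```python
import re

import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the real page behind BRK.
SAMPLE = """
<item><header>header1</header><title>title1</title><number>number1</number></item>
<item><header>header2</header><title>title2</title><number>number2</number></item>
<item><header>header3</header><title>title3</title><number>number3</number></item>
"""
BRK = BeautifulSoup(SAMPLE, "html.parser")


def create_list(x):
    """Return the contents of every <x> tag in BRK as a list of strings."""
    column = []
    for n in BRK.find_all(x):
        # Strip the opening and closing tags, as in the original function.
        no_beginning = re.sub(r'<' + x + r'>', '', str(n))
        final = re.sub(r'</' + x + r'>', '', no_beginning)
        column.append(final)
    return column  # return instead of print, so the caller can use the list


words = ['header', 'title', 'number']

# One dictionary entry per keyword: the key becomes the column name,
# the value is the list of strings for that column.
df = pd.DataFrame({word: create_list(word) for word in words})
print(df)
```

Because all lists have the same length, pandas can line them up as columns directly, so there is no need to start from an empty DataFrame and append. If the tags have no attributes or nested tags, `n.get_text()` would likely give the same strings without the regular expressions.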