I wish to extract the data from a txt file which is given below and store in to a pandas Dataframe that has 8 columns.
Lorem | Ipsum | is | simply | dummy
text | of | the | printing | and
typesetting | industry. | Lorem
more | recently | with | desktop | publishing | software | like | Aldus
Ipsum | has | been | the | industry's
standard | dummy | text | ever | since | the | 1500s
took | a | galley | of | type | and
scrambled | it | to | make | a | type | specimen | book
It | has | survived | not | only | five | centuries, | but
the | leap | into | electronic | typesetting
remaining | essentially | unchanged
It | was | popularised | in | the | 1960s | with | the
Lorem | Ipsum | passages, | and
PageMaker | including | versions | of | Lorem | Ipsum
Data on each line is separated by a pipe sign which refers to a data inside each cell of a row and column. My end goal is to have the data inserted in dataframe as per below format.
Column 1 | Column 2 | Column 3 | Column 4 | Column 5 | Column 6 | Column 7 | Column 8
-------------------------------------------------------------------------------------
Lorem | Ipsum | is | simply | dummy |
text | of | the | printing | and |
typesetting| industry. | Lorem |
more | recently | with | desktop | publishing| software | like | Aldus |
and so on.....
I performed below but I am unable to add data dynamically into dataframe.
import pandas as pd
with open(file) as f:
data = f.read().split('\n')
columns = ['Column 1', 'Column 2', 'Column 3', 'Column 4', 'Column 5', 'Column 6', 'Column 7', 'Column 8']
df = pd.DataFrame(columns=columns)
for i in data:
row = i.split(' | ')
df = df.append({'Column 1': f'{row[0]}', 'Column 2': f'{row[1]}', 'Column 3': f'{row[2]}', 'Column 4': f'{row[3]}', 'Column 5': f'{row[4]}'}, ignore_index = True)
Above is manual way of adding row's cells to a dataframe, but I require the dynamic way i.e. how do append the rows so as whatever may be number of cells in row, it may get added.