I'm new to Python. As part of a Selenium web scraping project, I managed to pull the data I need and turned it into a list, as below:

clean_data = ['06/25/21 (w)', '1', '105', '382', '0.27', '11,396', '8,654', '1.32', '40.56%', '07/02/21 (w)', '8', '43', '80', '0.54', '6,480', '6,288', '1.03', '32.19%', '07/09/21 (w)', '15', '30', '251', '0.12', '1,062', '458', '2.32', '30.51%', '07/16/21 (m)', '22', '198', '235', '0.84', '87,464', '74,588', '1.17', '31.20%', '07/23/21 (w)', '29', '16', '28', '0.57', '1,043', '1,387', '0.75', '32.20%', '07/30/21 (w)', '36', '33', '15', '2.20', '686', '482', '1.42', '32.21%', '08/20/21 (m)', '57', '111', '171', '0.65', '1,211', '951', '1.27', '32.86%', '10/15/21 (m)', '113', '5', '41', '0.12', '16,005', '10,111', '1.58', '34.58%', '12/17/21 (m)', '176', '76', '258', '0.29', '35,904', '59,572', '0.60', '35.43%', '01/21/22 (m)', '211', '6', '72', '0.08', '2,124', '6,998', '0.30', '34.90%', '01/20/23 (m)', '575', '15', '19', '0.79', '2,697', '2,217', '1.22', '34.75%']

The original table has 9 columns and can have n rows depending on source data.

I would like to turn this list into a table with 9 columns and n rows. The table should look something like this:

06/25/21 (w) ___ 1 ___ 105 ___ 382 ___ 0.27 ___ 11,396 ___ 8,654 ___ 1.32 ___ 40.56%
07/02/21 (w) ___ 8 ___ 43 ___ 80 ___ 0.54 ___ 6,480 ___ 6,288 ___ 1.03 ___ 32.19%
07/16/21 (m) ___ 22 ___ 198 ___ 235 ___ 0.84 ___ 87,464 ___ 74,588 ___ 1.17 ___ 31.20%
07/23/21 (w) ___ 29 ___ 16 ___ 28 ___ 0.57 ___ 1,043 ___ 1,387 ___ 0.75 ___ 32.20%
07/30/21 (w) ___ 36 ___ 33 ___ 15 ___ 2.20 ___ 686 ___ 482 ___ 1.42 ___ 32.21%
08/20/21 (m) ___ 57 ___ 111 ___ 171 ___ 0.65 ___ 1,211 ___ 951 ___ 1.27 ___ 32.86%
10/15/21 (m) ___ 113 ___ 5 ___ 41 ___ 0.12 ___ 16,005 ___ 10,111 ___ 1.58 ___ 34.58%
12/17/21 (m) ___ 176 ___ 76 ___ 258 ___ 0.29 ___ 35,904 ___ 59,572 ___ 0.60 ___ 35.43%
01/21/22 (m) ___ 211 ___ 6 ___ 72 ___ 0.08 ___ 2,124 ___ 6,998 ___ 0.30 ___ 34.90%
01/20/23 (m) ___ 575 ___ 15 ___ 19 ___ 0.79 ___ 2,697 ___ 2,217 ___ 1.22 ___ 34.75%

Any guidance would be highly appreciated.

Many thanks,
MT

  • Do you need to use Selenium to scrape the table from the website, or can you use pandas: pd.read_html(...)? Commented Jun 24, 2021 at 15:52

1 Answer

If you just want to format the clean_data list as a text table, you can do:

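N = 9  # number of columns per row
# Pad every cell to the length of the longest item in the list, plus two spaces.
max_len = max(len(w) for w in clean_data)
f = "{:<" + str(max_len + 2) + "}"

# Take N items at a time and print them as one left-aligned row.
for i in range(0, len(clean_data), N):
    print((f * N).format(*clean_data[i : i + N]))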

Prints:

06/25/21 (w)  1             105           382           0.27          11,396        8,654         1.32          40.56%        
07/02/21 (w)  8             43            80            0.54          6,480         6,288         1.03          32.19%        
07/09/21 (w)  15            30            251           0.12          1,062         458           2.32          30.51%        
07/16/21 (m)  22            198           235           0.84          87,464        74,588        1.17          31.20%        
07/23/21 (w)  29            16            28            0.57          1,043         1,387         0.75          32.20%        
07/30/21 (w)  36            33            15            2.20          686           482           1.42          32.21%        
08/20/21 (m)  57            111           171           0.65          1,211         951           1.27          32.86%        
10/15/21 (m)  113           5             41            0.12          16,005        10,111        1.58          34.58%        
12/17/21 (m)  176           76            258           0.29          35,904        59,572        0.60          35.43%        
01/21/22 (m)  211           6             72            0.08          2,124         6,998         0.30          34.90%        
01/20/23 (m)  575           15            19            0.79          2,697         2,217         1.22          34.75%        
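
The loop above pads every cell to the width of the longest item in the whole list. If narrower output is preferred, one possible variation is to size each column by its own longest cell. The following is only a sketch: `sample` is a two-row excerpt of `clean_data` used for illustration, and the names `rows`, `widths` and `lines` are not part of the original code.

```python
N = 9  # number of columns per row

# Two rows copied from clean_data, kept short for illustration.
sample = [
    '06/25/21 (w)', '1', '105', '382', '0.27', '11,396', '8,654', '1.32', '40.56%',
    '07/02/21 (w)', '8', '43', '80', '0.54', '6,480', '6,288', '1.03', '32.19%',
]

# Split the flat list into rows of N items each.
rows = [sample[i:i + N] for i in range(0, len(sample), N)]

# zip(*rows) iterates column by column; take the longest cell in each column.
widths = [max(len(cell) for cell in col) for col in zip(*rows)]

# Left-align each cell to its column width and join with two spaces.
lines = [
    "  ".join(cell.ljust(w) for cell, w in zip(row, widths)).rstrip()
    for row in rows
]
print("\n".join(lines))
```

This assumes the list length is a multiple of N; a trailing partial row would be silently truncated by `zip(*rows)` when the widths are computed.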

EDIT: To create a dataframe from clean_data:

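import pandas as pd

N = 9  # number of columns per row
tmp = []
# Slice the flat list into sublists of N items, one sublist per row.
for i in range(0, len(clean_data), N):
    tmp.append(clean_data[i : i + N])

# A list of lists becomes one DataFrame row per sublist.
df = pd.DataFrame(tmp)
print(df)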

Prints:

               0    1    2    3     4       5       6     7       8
0   06/25/21 (w)    1  105  382  0.27  11,396   8,654  1.32  40.56%
1   07/02/21 (w)    8   43   80  0.54   6,480   6,288  1.03  32.19%
2   07/09/21 (w)   15   30  251  0.12   1,062     458  2.32  30.51%
3   07/16/21 (m)   22  198  235  0.84  87,464  74,588  1.17  31.20%
4   07/23/21 (w)   29   16   28  0.57   1,043   1,387  0.75  32.20%
5   07/30/21 (w)   36   33   15  2.20     686     482  1.42  32.21%
6   08/20/21 (m)   57  111  171  0.65   1,211     951  1.27  32.86%
7   10/15/21 (m)  113    5   41  0.12  16,005  10,111  1.58  34.58%
8   12/17/21 (m)  176   76  258  0.29  35,904  59,572  0.60  35.43%
9   01/21/22 (m)  211    6   72  0.08   2,124   6,998  0.30  34.90%
10  01/20/23 (m)  575   15   19  0.79   2,697   2,217  1.22  34.75%
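
Every value in the DataFrame above is still a string, so each row occupies 9 separate cells but none of them can be used in arithmetic yet. A possible follow-up, sketched under the assumption that columns 1 to 7 are plain numbers with optional thousands separators and column 8 is a percentage, is to convert them after building the frame. `sample` is a two-row excerpt of `clean_data` used for illustration; the column meanings are not given in the question, so the integer column labels are kept as-is.

```python
import pandas as pd

N = 9  # number of columns per row

# Two rows copied from clean_data, kept short for illustration.
sample = [
    '06/25/21 (w)', '1', '105', '382', '0.27', '11,396', '8,654', '1.32', '40.56%',
    '07/02/21 (w)', '8', '43', '80', '0.54', '6,480', '6,288', '1.03', '32.19%',
]

# One sublist of N items per row, so each value lands in its own cell.
rows = [sample[i:i + N] for i in range(0, len(sample), N)]
df = pd.DataFrame(rows)
print(df.shape)

# Assumption: columns 1..7 are numeric once thousands separators are removed.
for col in range(1, 8):
    df[col] = pd.to_numeric(df[col].str.replace(",", "", regex=False))

# Assumption: column 8 is a percentage; strip the sign and keep the number.
df[8] = df[8].str.rstrip("%").astype(float)

print(df.dtypes)
```

Checking `df.shape` is a quick way to confirm that the frame really has 9 columns rather than one cell per row.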

1 Comment

Hi Andrej, I tried to turn this into a DataFrame, but it seems like each row ends up in one cell. How can I turn each row into 9 separate cells?
