1

I'm trying to write a Python script that extracts Wi-Fi scan data from a txt file into a CSV file.

Here is the txt data:

Wed Oct  7 09:00:01 UTC 2020

BSS 02:ca:fe:ca:ca:40(on ap0_1)
freq: 2422
capability: IBSS (0x0012)
signal: -60.00 dBm
primary channel: 3
last seen: 30 ms ago
BSS ac:86:74:0a:73:a8(on ap0_1)
TSF: 229102338752 usec (2d, 15:38:22)
freq: 2422
capability: ESS (0x0421)
signal: -62.00 dBm
primary channel: 3

I need to write the txt data to a CSV file in this format:

 Time                         | BSS                         | freq | capability    | signal | primary channel |
 -----------------------------+-----------------------------+------+---------------+--------+-----------------+
 Wed Oct  7 09:00:01 UTC 2020 | 02:ca:fe:ca:ca:40(on ap0_1) | 2422 | IBSS (0x0012) | -60.00 |               3 |
                              | ac:86:74:0a:73:a8(on ap0_1) | 2422 | ESS (0x0421)  | -62.00 |               3 |

This is my unfinished code:

import csv
import re

fieldnames = ['TIME', 'BSS', 'FREQ','CAPABILITY', 'SIGNAL', 'CHANNEL']

re_fields = re.compile(r'({})+:\s(.*)'.format('|'.join(fieldnames)), re.I)

with open('ap0_1.txt') as f_input, open('ap0_1.csv', 'w', newline='') as f_output:
    csv_output = csv.DictWriter(f_output, fieldnames= fieldnames)
    csv_output.writeheader()
    start = False

    for line in f_input:
        line = line.strip()

        if len(line):
            if 'BSS' in line:
                if start:
                    start = False
                    block.append(line)
                    text_block = '\n'.join(block)

                    for field, value in re_fields.findall(text_block):
                        entry[field.upper()] = value

                    if line[0] == 'on ap0_1':
                        entry['BSS'] = block[0]

                    csv_output.writerow(entry)

                else:
                    start = True
                    entry = {}
                    block = [line]
            elif start:
                block.append(line)

When I run it, the data isn't placed correctly.

Please let me know how to fix this. I'm just a beginner in programming and would appreciate any advice. Thank you.

4
  • Please add the desired and observed output for the input samples to your question. Commented Oct 19, 2020 at 4:11
  • Hello Klaus D. I've added the desired output. Commented Oct 19, 2020 at 4:20
  • The question is confusing. You say "here is the data", and you also say "the data is in this format", and those two examples are wildly different. What does the input data actually look like? Commented Oct 19, 2020 at 4:40
  • Hi John Gordon, I'm sorry for confusing you. I've edited the question. Commented Oct 19, 2020 at 5:00

3 Answers

1

Using str.startswith

Ex:

import csv

fieldnames = ('TIME', 'BSS', 'freq','capability', 'signal', 'primary channel')
with open(filename) as f_input, open(outfile,'w', newline='') as f_output:
    csv_output = csv.DictWriter(f_output, fieldnames= fieldnames)
    csv_output.writeheader()
    result = {"TIME": next(f_input).strip()}   #Get Time, First Line
    for line in f_input:
        line = line.strip()
        if line.startswith(fieldnames):
            if line.startswith('BSS'):
                key, value = line.split(" ", 1)
            else:
                key, value = line.split(": ")
            result[key] = value
            
    csv_output.writerow(result)
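
The check line.startswith(fieldnames) works because str.startswith accepts a tuple of prefixes and returns True if any one of them matches. A small illustration on lines taken from the question (the names sample and matches exist only for this sketch):

```python
fieldnames = ('TIME', 'BSS', 'freq', 'capability', 'signal', 'primary channel')

# A few lines from the question's input, already stripped
sample = [
    'BSS 02:ca:fe:ca:ca:40(on ap0_1)',
    'freq: 2422',
    'last seen: 30 ms ago',
    'primary channel: 3',
]

# str.startswith tries each prefix in the tuple in turn
matches = [line.startswith(fieldnames) for line in sample]
print(matches)  # [True, True, False, True]
```

Note that the argument has to be a tuple (or a single string); passing a list raises TypeError.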

EDIT as per comment

If you have multiple blocks of the above text

import re
import csv

filename = 'ap0_1.txt'   # input file, as named in the question
outfile = 'ap0_1.csv'    # output file

week_ptrn = re.compile(r"(Mon|Tue|Wed|Thu|Fri|Sat|Sun)\b")
fieldnames = ('TIME', 'BSS', 'freq', 'capability', 'signal', 'primary channel')

with open(filename) as f_input, open(outfile, 'w', newline='') as f_output:
    csv_output = csv.DictWriter(f_output, fieldnames=fieldnames)
    csv_output.writeheader()
    result = []         # one dict per BSS block
    current_time = ''   # most recent timestamp line
    for line in f_input:
        line = line.strip()
        if week_ptrn.match(line):
            current_time = line
        elif line.startswith('BSS '):
            # every BSS line starts a new row
            result.append({'TIME': current_time, 'BSS': line.split(' ', 1)[1]})
        elif result and line.startswith(fieldnames[2:]):
            key, value = line.split(': ', 1)
            result[-1][key] = value

    csv_output.writerows(result)
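
The desired output in the question shows the signal as a bare number (-60.00), while the input carries a unit (-60.00 dBm). If that matters, the unit could be stripped before the row is written. A minimal sketch, where strip_unit is a hypothetical helper that is not part of the code above:

```python
def strip_unit(value):
    """Return the leading token of a value such as '-60.00 dBm'."""
    return value.split(' ', 1)[0]

# Example row as the parser would build it from the sample input
row = {'freq': '2422', 'signal': '-60.00 dBm'}
row['signal'] = strip_unit(row['signal'])
print(row['signal'])  # -60.00
```

This assumes the number always comes first and is separated from the unit by a space, as in the sample data.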

3 Comments

Hi Rakesh, the code only extracts the 1st BSS but not the 2nd BSS.
Sorry, I do not understand. Do you have multiple blocks of the above string in the text file?
Yes, each block starts with BSS and ends with primary channel.
0

Your pattern searches for a line labelled "TIME:", but there is no such label in the input data; the timestamp is a bare line. So an empty TIME column in the output is to be expected.

I think the following lines also have a problem.

            if line[0] == 'on ap0_1':
                entry['BSS'] = block[0]

My guess is that you wanted to detect the on ap0_1 part of BSS ac:86:74:0a:73:a8(on ap0_1). But line is a string, so line[0] is only its first character, 'B', and the comparison is never true. It should be changed like this:

            if 'on ap0_1' in block[0]:
                entry['BSS'] = block[0][4:].lstrip()
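
Both points can be checked in an interactive session. A short sketch using the question's own pattern and one BSS line from the sample (the names first, is_match, bss and labels are only for illustration):

```python
import re

fieldnames = ['TIME', 'BSS', 'FREQ', 'CAPABILITY', 'SIGNAL', 'CHANNEL']
re_fields = re.compile(r'({})+:\s(.*)'.format('|'.join(fieldnames)), re.I)

line = 'BSS ac:86:74:0a:73:a8(on ap0_1)'

# Indexing a string yields a single character, so this test can never pass
first = line[0]
is_match = (line[0] == 'on ap0_1')

# Dropping the 'BSS ' prefix keeps the address and the interface
bss = line[4:].lstrip()

# Only 'label: value' lines match; the timestamp and BSS lines carry no label
text = 'Wed Oct  7 09:00:01 UTC 2020\n' + line + '\nfreq: 2422'
labels = [name.upper() for name, _ in re_fields.findall(text)]

print(first, is_match, bss, labels)
```

Here only the freq line is picked up by the pattern, which is why the TIME and BSS columns have to be filled in separately.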

0

Here is my version of the code.

import csv

fieldnames = ['TIME', 'BSS', 'FREQ', 'CAPABILITY', 'SIGNAL', 'CHANNEL']
days = ('Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun')

with open('ap0_1.txt') as f_input, open('ap0_1.csv', 'w', newline='') as f_output:
    # extrasaction='ignore' drops keys such as TSF that are not in fieldnames
    csv_output = csv.DictWriter(f_output, fieldnames=fieldnames, extrasaction='ignore')
    csv_output.writeheader()

    time = ''   # most recent timestamp line
    row = {}
    for line in f_input:
        line = line.strip()
        if not line:
            continue
        if line.startswith(days):
            time = line
        elif line.startswith('BSS '):
            # not sure how you define the start of a new block; here it is a 'BSS' line
            if row:
                csv_output.writerow(row)
            row = {'TIME': time, 'BSS': line.split(' ', 1)[1]}
        elif row and ': ' in line:
            key, value = line.split(': ', 1)   # split exactly once
            # 'primary channel' becomes 'CHANNEL'; other keys are just upper-cased
            row[key.split()[-1].upper()] = value
    if row:
        csv_output.writerow(row)

It looks like '\n' creates blank rows.

1 Comment

Hi @kate-melnykova, when I tried running the code, it said block['TIME'] = line is not defined.
