
I have a string:

 Station Disconnect:1.3.6.1.4.1.11.2.14.11.15.2.75.3.2.0.8 StaMAC:00:9F:0B:00:38:B8 BSSID:00 9F Radioid:2

I want to split this string so that the result looks like this:

'Station Disconnect:1.3.6.1.4.1.11.2.14.11.15.2.75.3.2.0.8' 'StaMAC:00:9F:0B:00:38:B8' 'BSSID:00 9F' 'Radioid:2'

I tried this regex: msgRegex = re.compile('[\w\s]+:'), and I also tried the split function, but neither gives this result. How can I do this? Please help. Thank you.

  • Simple split with space. str.split(' ') Commented Feb 6, 2017 at 10:34
  • It does not give proper output @VivekKumar Commented Feb 6, 2017 at 10:41
  • I used split function and regex also msgRegex = re.compile('[\w\s]+:') @RedBassett Commented Feb 6, 2017 at 10:46
  • You can use the "edit" button below your original question to add this to the question! Commented Feb 6, 2017 at 10:49
  • You can use split and then manually rejoin the pieces that were broken apart, such as Station Disconnect: and BSSID:00 9F. Commented Feb 6, 2017 at 11:06

2 Answers


From what I can see, the problem is that some values (the hex ones, such as BSSID:00 9F) contain whitespace.

Because of that, I believe a plain splitting approach will not work here. Instead, match the tokens with a regex like

(?<!\S)\b([^:]+):((?:[a-fA-F0-9]{2}(?:[ :][a-fA-F0-9]{2})*|\S)+)\b

See the regex demo

Python code:

import re

rx = r"(?<!\S)\b([^:]+):((?:[a-fA-F0-9]{2}(?:[ :][a-fA-F0-9]{2})*|\S)+)\b"
ss = ["Station Disconnect:1.3.6.1.4.1.11.2.14.11.15.2.75.3.2.0.8 StaMAC:00:9F:0B:00:38:B8 BSSID:00 9F Radioid:2",
    "Station Deassoc:1.3.6.1.4.1.11.2.14.11.15.2.75.3.2.0.5 StaMac1:40:83:DE:34:04:75 StaMac2:40:83:DE:34:04:75 UserName:4083de340475 StaMac3:40:83:DE:34:04:75 VLANId:1 Radioid:2 SSIDName:Devices SessionDuration:12 APID:CN58G6749V AP Name:1023-noida-racking-zopnow BSSID:BC:EA:FA:DC:A6:F1"]
for s in ss:
    matches = re.findall(rx, s)
    print(matches)

Result:

[('Station Disconnect', '1.3.6.1.4.1.11.2.14.11.15.2.75.3.2.0.8'), ('StaMAC', '00:9F:0B:00:38:B8'), ('BSSID', '00 9F'), ('Radioid', '2')]
[('Station Deassoc', '1.3.6.1.4.1.11.2.14.11.15.2.75.3.2.0.5'), ('StaMac1', '40:83:DE:34:04:75'), ('StaMac2', '40:83:DE:34:04:75'), ('UserName', '4083de340475'), ('StaMac3', '40:83:DE:34:04:75'), ('VLANId', '1'), ('Radioid', '2'), ('SSIDName', 'Devices'), ('SessionDuration', '12'), ('APID', 'CN58G6749V'), ('AP Name', '1023-noida-racking-zopnow'), ('BSSID', 'BC:EA:FA:DC:A6:F1')] 

NOTE: If you need no tuples in the result, remove the capturing parentheses from the pattern.

Pattern details:

  • (?<!\S)\b - start of string or whitespace, followed by a word boundary (the next char must be a letter, digit or _)
  • ([^:]+) - Capturing group #1: 1+ chars other than :
  • : - a colon
  • ((?:[a-fA-F0-9]{2}(?:[ :][a-fA-F0-9]{2})*|\S)+) - Capturing group 2 matching one or more occurrences of:
    • [a-fA-F0-9]{2}(?:[ :][a-fA-F0-9]{2})* - 2 hex chars followed by zero or more occurrences of a space or : and 2 hex chars
    • | - or
    • \S - a non-whitespace char
  • \b - trailing word boundary.
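
To get back the 'key:value' strings the question asks for, the tuples returned by re.findall can be rejoined with a colon. This is a minimal sketch that reuses the pattern above; the names PAIR_RX and split_pairs are made up for illustration and are not part of the original answer.

```python
import re

# Same pattern as in the answer above.
PAIR_RX = re.compile(
    r"(?<!\S)\b([^:]+):((?:[a-fA-F0-9]{2}(?:[ :][a-fA-F0-9]{2})*|\S)+)\b"
)

def split_pairs(line):
    """Return the 'key:value' chunks found in line (hypothetical helper)."""
    # findall yields (key, value) tuples because the pattern has two groups.
    return [f"{key}:{value}" for key, value in PAIR_RX.findall(line)]

s = ("Station Disconnect:1.3.6.1.4.1.11.2.14.11.15.2.75.3.2.0.8 "
     "StaMAC:00:9F:0B:00:38:B8 BSSID:00 9F Radioid:2")
print(split_pairs(s))
```

If a lookup table is more convenient than a list, dict(PAIR_RX.findall(s)) should give a key-to-value mapping, assuming no key repeats within one line.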

5 Comments

'Desc:Unknown Failure APID:CN58G6749V BSSID:BC:EA:FA:DC:A6:F0 AuthMode: 1 AP MAC:BC:EA:FA:DC:A6:E0' It doesn't work for this type of line.
The output should be: ['Desc:Unknown Failure' 'APID:CN58G6749V' 'BSSID:BC:EA:FA:DC:A6:F0' 'AuthMode: 1' 'AP MAC:BC:EA:FA:DC:A6:E0']
Just a sec: how do you delimit the key from the value? That example ruins the whole logic, and keys cannot be distinguished from the values with regex since both keys and values can have arbitrary number of whitespace inside them.
Ok, a regex is still possible, but only if all the keys are known. If you can provide their list, I can adjust the pattern.
The keys are not fixed. They may change, or their order may change.
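
For readers whose logs do have an enumerable key set (which, per the last comment, is not the asker's situation), the "known keys" idea from this thread could be sketched as below. KNOWN_KEYS and split_on_known_keys are hypothetical names, and the key list is only an assumption taken from the sample line in the comment.

```python
import re

# Hypothetical key list, copied from the sample line in the comment above.
KNOWN_KEYS = ["Desc", "APID", "BSSID", "AuthMode", "AP MAC"]

def split_on_known_keys(line, keys=KNOWN_KEYS):
    """Split line at whitespace that is directly followed by 'known key:'."""
    # Longest keys first, so a key is never shadowed by a shorter prefix of it.
    alternation = "|".join(
        re.escape(k) for k in sorted(keys, key=len, reverse=True)
    )
    boundary = re.compile(r"\s+(?=(?:" + alternation + r"):)")
    return boundary.split(line.strip())

line = ("Desc:Unknown Failure APID:CN58G6749V BSSID:BC:EA:FA:DC:A6:F0 "
        "AuthMode: 1 AP MAC:BC:EA:FA:DC:A6:E0")
print(split_on_known_keys(line))
```

Because the split points are anchored on the keys rather than on the shape of the values, spaces inside values such as "Unknown Failure" or " 1" are left alone.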

In this particular case you can implement it like so:

import re

a = 'Station Disconnect:1.3.6.1.4.1.11.2.14.11.15.2.75.3.2.0.8 StaMAC:00:9F:0B:00:38:B8 BSSID:00 9F Radioid:2'
print(re.split(r'(?<=[A-Z0-9]) (?=[A-Z])', a))

Output:

['Station Disconnect:1.3.6.1.4.1.11.2.14.11.15.2.75.3.2.0.8', 'StaMAC:00:9F:0B:00:38:B8', 'BSSID:00 9F', 'Radioid:2']

Regex:

(?<=[A-Z0-9]) - Positive lookbehind for A-Z or 0-9

(space) - a single literal space character

(?=[A-Z]) - Positive look ahead for A-Z
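
This split depends on every value ending in an uppercase letter or digit and every key being a single word that starts with an uppercase letter. A small sketch of where that assumption breaks, using a fragment of the longer "Station Deassoc" sample line from this page (the variable names are illustrative only):

```python
import re

fragment = ("SSIDName:Devices SessionDuration:12 APID:CN58G6749V "
            "AP Name:1023-noida-racking-zopnow BSSID:BC:EA:FA:DC:A6:F1")

parts = re.split(r"(?<=[A-Z0-9]) (?=[A-Z])", fragment)
for part in parts:
    print(repr(part))
# 'SSIDName:Devices SessionDuration:12'   <- value ends in lowercase, no split
# 'APID:CN58G6749V'
# 'AP'                                    <- two-word key is cut in half
# 'Name:1023-noida-racking-zopnow BSSID:BC:EA:FA:DC:A6:F1'
```

So the lookaround split fits the exact string in the question, but values ending in a lowercase letter and keys containing a space both defeat it.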

3 Comments

It still gives the wrong output for this line: line = ' Station Deassoc:1.3.6.1.4.1.11.2.14.11.15.2.75.3.2.0.5 StaMac1:40:83:DE:34:04:75 StaMac2:40:83:DE:34:04:75 UserName:4083de340475 StaMac3:40:83:DE:34:04:75 VLANId:1 Radioid:2 SSIDName:Devices SessionDuration:12 APID:CN58G6749V AP Name:1023-noida-racking-zopnow BSSID:BC:EA:FA:DC:A6:F1'
Give me some time, i'll have a look.
Have you checked this problem? Thank you @MYGz
