I am trying to parse a config settings file that I am getting from stdout with an ssh script. I need to get these into key/value pairs. The config settings look something like this:

OUTPUT SETTINGS

show all    <==== TRYING TO KEEP THIS LINE FROM BEING PARSED
Active System Configuration     <==== TRYING TO KEEP THIS LINE FROM BEING PARSED
# General Information
   Unit undefined
   Subdivision undefined
   Address undefined
   Site ID undefined
   Device ID 0
# Application FOO Information
   FOO BAR AAA 0000
   FOO Checkin 0000
# LSD Status Information
   LSD not configured/built for vital parameters.
# System Time
   Local Time 01-08-14 16:13:50
   Time sync Source None
# Last Reset:
   A Processor:01-08-14 16:04:31 -- App Select Alarm Not Cleared
   B Processor:01-08-14 16:04:26 -- A Processor Initiated Reset
# Active Alarms:
   01-08-14 16:04:33 -- App Select Required
# Comm Settings - Port 1
   MAC Address 00:00:00:00:01:D3
   IP Address 172.168.0.11
   SubnetMask 255.255.255.0
   DCDC Server Enabled
   DCDC Server IP Pool Start 172.168.0.11
   DCDC Server IP Pool End 172.168.0.43
   DCDC Server Default Gateway 0.0.0.0
# Comm Settings - Port 2
   MAC Address 00:00:00:00:01:D3
   IP Address 172.168.0.11
   SubnetMask 255.255.255.0
   DCDC Server Enabled
   DCDC Server IP Pool Start 172.168.0.11
   DCDC Server IP Pool End 172.168.0.44
   DCDC Server Default Gateway 0.0.0.0
   Default Gateway 0.0.0.0
# Comm Settings - Routing Table
   Route #1 - Disabled
   Route #2 - Disabled
   Route #3 - Disabled
   Route #4 - Disabled
   Route #5 - Disabled
   Route #6 - Disabled
   Route #7 - Disabled
   Route #8 - Disabled
# Comm Settings - HTTP Settings
   HTTP TCP Port# 1000
   Inactivity timeout 60
   Trusted Source 1 Status Disabled
   Trusted Source 1 IP Addr 0.0.0.0
   Trusted Source 1 Net Mask 0.0.0.0
   Trusted Source 2 Status Disabled
   Trusted Source 2 IP Addr 0.0.0.0
   Trusted Source 2 Net Mask 0.0.0.0
# Comm Settings - Count Settings
   Count Port 1 Enabled
   Count Port 2 Enabled
   Inactivity timeout 0
   HTTP TCP Port# 23
   Trusted Source 1 Status Disabled
   Trusted Source 1 IP Addr 0.0.0.0
   Trusted Source 1 Net Mask 0.0.0.0
   Trusted Source 2 Status Disabled
   Trusted Source 2 IP Addr 0.0.0.0
   Trusted Source 2 Net Mask 0.0.0.0
# Comm Settings - SSH Settings
   SSH Port 1 Enabled
   SSH Port 2 Enabled
   SSH Inactivity timeout 0
   SSH Server Port# 10
# Comm Settings - Diagnostic Port Settings
   Bad Rate 57000
   Parity None
   Data Bits 8
   Stop Bits 1
   Flow Control Disabled
# Executive Information
   PN 050000-000
   Ver 8.09Bld0000F
   Module KMO-3
   Processor A
   Copyright FOO(C)2013
   TXT AAA0AAA0
#
   PN 050000-000
   Ver 8.09Bld0000F
   Module KMO-3
   Processor B
   Copyright FOO(C)2013
   TXT ABB0ABB0
#
   PN 050000-000
   Ver 8.09Bld0000F
   Module KMO-3
   Processor C
   Copyright FOO(C)2013
   TXT BCC0BCC0
#
   HPN 202551-001
   Ver 1.1
   Module CDU
   Processor DF123000
   Ref U2
   Copyright FOO(C)2013
   Datecode 060808

# Boot Information
   PN 072000-000
   Ver 5.12Bld002
   Module FOO-3
   Processor A
   Copyright FOO(C)2012
   TXT  DCC0DCC0
#
   PN 072000-000
   Ver 5.12Bld002
   Module FOO-3
   Processor B
   Copyright FOO(C)2012
   TXT  EFF0EFF0
#
   PN 072000-000
   Ver 5.12Bld002
   Module FOO-3
   Processor C
   Copyright FOO(C)2012
   TXT  EEE0EEE0
# BAR Application
   BAR MAP file not loaded
   BAR CONFIG file not loaded
# ROK Key Management Configuration
   Encrypted CARR Key (No CARR Key Present)
   Encrypted CARR TXT (No CARR Key Present)
   Pending Encrypted CARR Key (No Future CARR Key Present)
   Pending Encrypted CARR TXT (No Future CARR Key Present)
   RC2 Key File TXT (No RC2 Key Present)
# Vital Application Information
   Name VVDefault App
   Index 0
   EPT TXT 2578
   EPT Checkin 80DC
# Non-Vital Application Information
   Name BBDefault App
   Index 0
   EPT TXT 521D
   EPT Checkin 64E0
# ROK Vital Configuration
   ROK not configured/build for vital parameters.
# ROK Non-Vital Configuration
   ROK not configured/built for non-vital parameters.
# SNMP General Configuration
   Build incomplete - ZZ2 module may not present.
 SSH>      <==== TRYING TO KEEP THIS LINE FROM BEING PARSED

PARSER

# BNF for data

# dataGroups ::= "#" + Optional(restOfLine)
# keyword ::= ( alpha+ )+
# value ::= ( alpha+ )
# configDef ::= Group(keyname + value)


hashmark = Literal('#').suppress()
snmpCommand = Literal("show all").suppress()
sshResidue = Literal("SSH>").suppress()
keyname = Word(alphas,alphanums+'-')
value = Combine(empty + SkipTo(LineEnd()))
GCONF = Keyword("#")
configDef = Group(GCONF + value("name") + \
    Dict(OneOrMore(Group(~GCONF + keyname + value))))
configDef = Group(value("name") + \
    Dict(OneOrMore(Group(keyname + value))))

configDef.ignore(snmpCommand)
configDef.ignore(sshResidue)
configDef.ignore(hashmark)

# parse the data
ifcdata = OneOrMore(configDef).parseString(data)

for ifc in ifcdata:
    print ifc.dump()

Above is what I have so far using pyparsing. I have been reading through Getting Started with Pyparsing but am still getting hung up. Right now EVERYTHING gets parsed, even the "show all" and "Active System Configuration" lines. I want to omit those lines and then group the settings by the "#" symbol, since that is the only consistent identifier. I need the parsed data to look something like this:

PARSED DATA

['General Information',['Unit', 'undefined',],['Subdivision', 'undefined',],['Address', 'undefined'],['Site ID','undefined'],['Device ID', '0']]
['Application FOO Information',['FOO BAR', 'AAA 0000'],['FOO Checkin', '0000']]
['LSD Status Information', ['LSD', 'not configured/built for vital parameters.']]
['System Time', ['Local Time', '01-08-14 16:13:50'],['Time sync Source', 'None']]
['Last Reset:', ['A Processor', '01-08-14 16:04:31 -- App Select Alarm Not Cleared'],['B Processor', '01-08-14 16:04:26 -- A Processor Initiated Reset']]
['Active Alarms:', ['01-08-14 16:04:33', 'App Select Required']]
.... and so on

I am playing with pyparsing for this because of this post over here. I really like this module. Any help is greatly appreciated. Thanks!

  • My pyparsing is a little rusty, but you should be able to create a list of key-value, with keys as those lines starting with #{somestring}, and a value of whatever you need to parse of the substructure. You could pull out the Executive Information section this way, and then finding everything after Ver should be easy. Commented Jan 8, 2014 at 21:33
  • @EelcoHoogendoorn I think I understand and I will give that a whirl. I'll update how it goes. Thanks! Commented Jan 9, 2014 at 1:50
  • @urbanrunic How do you propose to separate "FOO BAR AAA 0000" into "FOO BAR" + "AAA 0000"? And more generally, how do you want to divide each key-value line into a key and a value? If you don't have a general rule for this that will work across the entire config file (and it doesn't look like you do), you'll either need to write pyparsing rules for each individual category or handle each individual category separately after running the config file through pyparsing. Commented Jan 10, 2014 at 21:37
  • @senshin I have a feeling that this is going to need pyparsing rules for each category. I'm hoping that isn't the case, but lines like "FOO BAR AAA 0000" and a few others, where it's hard to find a general rule for separating key from value, make it difficult not to. Commented Jan 10, 2014 at 21:41
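
To make the point from the comments concrete, here is a rough sketch of what per-category splitting rules might look like. It uses plain regular expressions from the standard library rather than pyparsing, and the names RULES and split_setting are hypothetical, invented for this illustration. The three patterns are guesses based only on the sample output above, so treat them as a starting point, not as a rule set known to cover the whole file.

```python
import re

# Hypothetical per-category rules (an assumption, not part of the original post).
# Each pattern must define a "key" group and a "value" group.
RULES = {
    # last whitespace-separated token is the value, everything before it is the key
    "General Information": re.compile(r"(?P<key>.+)\s+(?P<value>\S+)$"),
    # key and value are separated by the first colon
    "Last Reset:": re.compile(r"(?P<key>[^:]+):(?P<value>.+)$"),
    # key is "FOO" plus one more word, the rest is the value
    "Application FOO Information": re.compile(r"(?P<key>FOO \S+)\s+(?P<value>.+)$"),
}


def split_setting(category, line):
    """Split one settings line into (key, value) using the rule for its category.

    Lines in categories without a rule, or lines the rule does not match,
    are returned whole with a value of None.
    """
    text = line.strip()
    rule = RULES.get(category)
    match = rule.match(text) if rule else None
    if match is None:
        return (text, None)
    return (match.group("key").strip(), match.group("value").strip())


print(split_setting("General Information", "   Site ID undefined"))
print(split_setting("Application FOO Information", "   FOO BAR AAA 0000"))
print(split_setting("LSD Status Information", "   LSD not configured/built for vital parameters."))
```

A table of rules keyed by category name keeps the ambiguous cases explicit: "FOO BAR AAA 0000" can only be split correctly because its category says where the key ends.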

1 Answer

Consider this:

from pyparsing import (Combine, Group, LineEnd, Literal, OneOrMore, Optional,
                       Regex, SkipTo, Word, alphanums, alphas)
import re

data = ... # data goes here

date_regex = re.compile(r'\d\d-\d\d-\d\d')
time_regex = re.compile(r'\d\d:\d\d:\d\d')
pairs = [{'category': 'General Information',
          'kv': Group(Word(alphanums) + Word(alphanums))},
         {'category': 'Last Reset:',
          'kv': Group(Word(alphas, max=1) + Word(alphas)) + Literal(':').suppress()
                + Group(Regex(date_regex) + Regex(time_regex)
                + Optional(SkipTo(LineEnd())))
          }
         ]
# build list of categories with associated parsing rules
categories = [Word("# ").suppress() + x['category']
              + OneOrMore(Group(x['kv']))
              for x in pairs]
# fallback for categories you don't have specific rules for
categories.append(Word("#").suppress() + Optional(SkipTo(LineEnd())) +
                  Group(OneOrMore(Combine(Word(alphanums) + SkipTo(LineEnd()))))
                  )
# OR all the categories together
categories_ored = categories[0]
for c in categories[1:]:
    categories_ored |= c
configDef = OneOrMore(categories_ored)
suppress_tokens = ["show all", "SSH>", "Active System Configuration"]
suppresses = [Literal(x).suppress() for x in suppress_tokens]
for s in suppresses:
    configDef.ignore(s)

result = configDef.parseString(data)
for e in result:
    print(e)

This gives you the following result:

General Information
[['Unit', 'undefined']]
[['Subdivision', 'undefined']]
[['Address', 'undefined']]
[['Site', 'ID']]
[['undefined', 'Device']]
[['ID', '0']]
Application FOO Information
['FOO BAR AAA 0000', 'FOO Checkin 0000']
LSD Status Information
['LSD not configured/built for vital parameters.']
System Time
['Local Time 01-08-14 16:13:50', 'Time sync Source None']
Last Reset:
[['A', 'Processor'], ['01-08-14', '16:04:31', '-- App Select Alarm Not Cleared']]
[['B', 'Processor'], ['01-08-14', '16:04:26', '-- A Processor Initiated Reset']]
Active Alarms:
['01-08-14 16:04:33 -- App Select Required']
Comm Settings - Port 1
['MAC Address 00:00:00:00:01:D3', 'IP Address 172.168.0.11', 'SubnetMask 255.255.255.0', 'DCDC Server Enabled', 'DCDC Server IP Pool Start 172.168.0.11', 'DCDC Server IP Pool End 172.168.0.43', 'DCDC Server Default Gateway 0.0.0.0']
Comm Settings - Port 2
['MAC Address 00:00:00:00:01:D3', 'IP Address 172.168.0.11', 'SubnetMask 255.255.255.0', 'DCDC Server Enabled', 'DCDC Server IP Pool Start 172.168.0.11', 'DCDC Server IP Pool End 172.168.0.44', 'DCDC Server Default Gateway 0.0.0.0', 'Default Gateway 0.0.0.0']
Comm Settings - Routing Table
['Route #1 - Disabled', 'Route #2 - Disabled', 'Route #3 - Disabled', 'Route #4 - Disabled', 'Route #5 - Disabled', 'Route #6 - Disabled', 'Route #7 - Disabled', 'Route #8 - Disabled']
Comm Settings - HTTP Settings
['HTTP TCP Port# 1000', 'Inactivity timeout 60', 'Trusted Source 1 Status Disabled', 'Trusted Source 1 IP Addr 0.0.0.0', 'Trusted Source 1 Net Mask 0.0.0.0', 'Trusted Source 2 Status Disabled', 'Trusted Source 2 IP Addr 0.0.0.0', 'Trusted Source 2 Net Mask 0.0.0.0']
Comm Settings - Count Settings
['Count Port 1 Enabled', 'Count Port 2 Enabled', 'Inactivity timeout 0', 'HTTP TCP Port# 23', 'Trusted Source 1 Status Disabled', 'Trusted Source 1 IP Addr 0.0.0.0', 'Trusted Source 1 Net Mask 0.0.0.0', 'Trusted Source 2 Status Disabled', 'Trusted Source 2 IP Addr 0.0.0.0', 'Trusted Source 2 Net Mask 0.0.0.0']
Comm Settings - SSH Settings
['SSH Port 1 Enabled', 'SSH Port 2 Enabled', 'SSH Inactivity timeout 0', 'SSH Server Port# 10']
Comm Settings - Diagnostic Port Settings
['Bad Rate 57000', 'Parity None', 'Data Bits 8', 'Stop Bits 1', 'Flow Control Disabled']
Executive Information
['PN 050000-000', 'Ver 8.09Bld0000F', 'Module KMO-3', 'Processor A', 'Copyright FOO(C)2013', 'TXT AAA0AAA0']

['PN 050000-000', 'Ver 8.09Bld0000F', 'Module KMO-3', 'Processor B', 'Copyright FOO(C)2013', 'TXT ABB0ABB0']

['PN 050000-000', 'Ver 8.09Bld0000F', 'Module KMO-3', 'Processor C', 'Copyright FOO(C)2013', 'TXT BCC0BCC0']

['HPN 202551-001', 'Ver 1.1', 'Module CDU', 'Processor DF123000', 'Ref U2', 'Copyright FOO(C)2013', 'Datecode 060808']
Boot Information
['PN 072000-000', 'Ver 5.12Bld002', 'Module FOO-3', 'Processor A', 'Copyright FOO(C)2012', 'TXT  DCC0DCC0']

['PN 072000-000', 'Ver 5.12Bld002', 'Module FOO-3', 'Processor B', 'Copyright FOO(C)2012', 'TXT  EFF0EFF0']

['PN 072000-000', 'Ver 5.12Bld002', 'Module FOO-3', 'Processor C', 'Copyright FOO(C)2012', 'TXT  EEE0EEE0']
BAR Application
['BAR MAP file not loaded', 'BAR CONFIG file not loaded']
ROK Key Management Configuration
['Encrypted CARR Key (No CARR Key Present)', 'Encrypted CARR TXT (No CARR Key Present)', 'Pending Encrypted CARR Key (No Future CARR Key Present)', 'Pending Encrypted CARR TXT (No Future CARR Key Present)', 'RC2 Key File TXT (No RC2 Key Present)']
Vital Application Information
['Name VVDefault App', 'Index 0', 'EPT TXT 2578', 'EPT Checkin 80DC']
Non-Vital Application Information
['Name BBDefault App', 'Index 0', 'EPT TXT 521D', 'EPT Checkin 64E0']
ROK Vital Configuration
['ROK not configured/build for vital parameters.']
ROK Non-Vital Configuration
['ROK not configured/built for non-vital parameters.']
SNMP General Configuration
['Build incomplete - ZZ2 module may not present.']

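I've implemented specific parsing rules for two categories in pairs (General Information and Last Reset:), and added a fallback for the categories that don't have specific rules yet (the categories.append() part). The fallback keeps each settings line as a single unsplit string. Note that the General Information rule, which is just two Word(alphanums) tokens, is too simple for multi-word keys: as the output shows, it splits "Site ID undefined" and "Device ID 0" in the wrong places, so that rule still needs refining. The ignore() calls keep the lines you don't want ("show all", "SSH>", "Active System Configuration") out of the parsing output. I hope this helps.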
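
For comparison, the grouping step on its own (drop the unwanted lines, start a new group at every "#" line) can also be sketched without pyparsing. This is a minimal plain-Python illustration, not the pyparsing approach shown above. The names NOISE and group_sections are hypothetical, and it assumes every unwanted line begins with one of the three strings listed in suppress_tokens.

```python
# Assumption: these prefixes identify every line that should be dropped.
NOISE = ("show all", "Active System Configuration", "SSH>")


def group_sections(text):
    """Return a list of (header, [setting lines]) tuples, one per "#" line."""
    sections = []
    for raw in text.splitlines():
        line = raw.strip()
        # skip blank lines and the command echo / prompt residue
        if not line or line.startswith(NOISE):
            continue
        if line.startswith("#"):
            # a bare "#" starts an unnamed group with an empty header
            sections.append((line.lstrip("#").strip(), []))
        elif sections:
            # settings lines belong to the most recent header
            sections[-1][1].append(line)
    return sections


sample = """show all
Active System Configuration
# General Information
   Unit undefined
   Site ID undefined
# Comm Settings - Routing Table
   Route #1 - Disabled
#
   PN 050000-000
 SSH>
"""

for header, settings in group_sections(sample):
    print(repr(header), settings)
```

Because only the start of each stripped line is checked, settings such as "Route #1 - Disabled" and "SSH Port 1 Enabled" are kept, while the "SSH>" prompt is dropped. Splitting each settings line into a key and a value is still a separate, per-category problem.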

1 Comment

Thank you, this is great. This is my first stab at parsing. I started with only regex parsing and then found pyparsing, which I thought might make this simple, until I started to dig in. Your help is greatly appreciated and puts me on the right track. Thank you again!
