I'm sure this is a basic question, but I have spent about an hour on it already and can't quite figure it out. I'm parsing smartctl output, and here is a sample of the data I'm working with:

smartctl 5.41 2011-06-09 r3365 [x86_64-linux-2.6.32-39-pve] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Device Model:     TOSHIBA MD04ACA500
Serial Number:    Y9MYK6M4BS9K
LU WWN Device Id: 5 000039 5ebe01bc8
Firmware Version: FP2A
User Capacity:    5,000,981,078,016 bytes [5.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   8
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Thu Jul  2 11:24:08 2015 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

What I'm trying to achieve is pulling out the device model (for some devices it's a single word; for others, such as this one, it's two words), serial number, time, and a couple of other fields. I assume it would be easiest to capture everything after the colon, but how do I get rid of the variable amount of whitespace?

Here is the relevant code I currently came up with:

deviceModel = ""
serialNumber = ""
lines = infoMessage.split("\n")
for line in lines:
    parts = line.split()
    if str(parts):
        if parts[0] == "Device Model:     ":
            deviceModel = parts[1]
        elif parts[0] == "Serial Number:    ":
            serialNumber = parts[1]
vprint(3, "Device model: %s" %deviceModel)
vprint(3, "Serial number: %s" %serialNumber)

The error I keep getting is:

File "./tester.py", line 152, in parseOutput
if parts[0] == "Device Model:     ":
IndexError: list index out of range

I get what the error is saying (kinda), but I'm not sure what else the range could be, or if I'm even attempting this in the right way. Looking for guidance to get me going in the right direction. Any help is greatly appreciated.

Thanks!

3
  • 2
    If somelist[0] throws an IndexError, the list is empty. Note that if str(somelist): will always be true, as '[]' isn't ''; you mean if somelist:. Commented Jul 2, 2015 at 15:32
  • 2
    Try printing parts and see what you get. Commented Jul 2, 2015 at 15:34
  • 1
    This is a job for Regex! Commented Jul 2, 2015 at 15:37

7 Answers

2

The IndexError occurs when split() returns an empty list and you access its first element with parts[0]. This happens when there is nothing to split (an empty line).

No need for regular expressions:

deviceModel = ""
serialNumber = ""
lines = infoMessage.split("\n")

for line in lines:
    if line.startswith("Device Model:"):
        deviceModel = line.split(":")[1].strip()
    elif line.startswith("Serial Number:"):
        serialNumber = line.split(":")[1].strip()

print("Device model: %s" %deviceModel)
print("Serial number: %s" %serialNumber)
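If you end up needing more fields (the question also mentions the time), the same idea can be generalized. This is only a minimal sketch, assuming every line of interest has the form "Label: value"; parse_info and sample are hypothetical names, not part of the original code. It uses str.partition, which splits at the first colon only, so values that themselves contain colons (such as the time) stay intact:

```python
def parse_info(text):
    """Return a dict mapping each 'Label: value' line to its stripped value."""
    fields = {}
    for line in text.splitlines():
        # partition splits at the first colon only; sep is '' if none is found
        label, sep, value = line.partition(":")
        label = label.strip()
        # skip lines without a colon and keep the first occurrence of a label
        if sep and label not in fields:
            fields[label] = value.strip()
    return fields


# Hypothetical sample, shortened from the output in the question
sample = """=== START OF INFORMATION SECTION ===
Device Model:     TOSHIBA MD04ACA500
Serial Number:    Y9MYK6M4BS9K

Local Time is:    Thu Jul  2 11:24:08 2015 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
"""

info = parse_info(sample)
print("Device model: %s" % info.get("Device Model", ""))
print("Local time: %s" % info.get("Local Time is", ""))
```

Using info.get() with a default means a missing field gives an empty string instead of raising a KeyError. Whether to keep the first or the last occurrence of a repeated label such as "SMART support is" is a design choice; this sketch keeps the first.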

1 Comment

Dude, you freaking rock. This works perfectly, and I understand it very well! Thanks so much!!
0

I guess your problem is the empty line in the middle, because splitting it gives an empty list:

>>> '\n'.split()
[]

You can do something like this:

>>> f = open('a.txt')
>>> lines = f.readlines()
>>> deviceModel = [line for line in lines if 'Device Model' in line][0].split(':')[1].strip()
# 'TOSHIBA MD04ACA500'
>>> serialNumber = [line for line in lines if 'Serial Number' in line][0].split(':')[1].strip()
# 'Y9MYK6M4BS9K'
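Note that the [0] above raises an IndexError if no line matches. A small sketch of one way around that, assuming an empty string is an acceptable fallback; find_field is a hypothetical helper name, and the lines list stands in for what f.readlines() would return:

```python
def find_field(lines, label):
    """Return the text after 'label:' on the first matching line, or ''."""
    prefix = label + ":"
    # generator of stripped values for every line that starts with the prefix
    matches = (line[len(prefix):].strip()
               for line in lines if line.startswith(prefix))
    # next() with a default avoids an exception when nothing matches
    return next(matches, "")


lines = [
    "Device Model:     TOSHIBA MD04ACA500\n",
    "\n",
    "Serial Number:    Y9MYK6M4BS9K\n",
]

print(find_field(lines, "Device Model"))
print(find_field(lines, "Firmware Version"))
```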

0

Try using regular expressions:

import re

r = re.compile(r"^[^:]*:\s+(.*)$")   # raw string so \s reaches the regex engine unchanged
m = r.match("Device Model:     TOSHIBA MD04ACA500")
print(m.group(1))   # Prints "TOSHIBA MD04ACA500"
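To run a pattern like this over the whole output rather than a single line, something along these lines might work. It is a sketch under assumptions: FIELD_RE and sample are hypothetical names, and the pattern is a variation of the one above that also captures the label and, with re.MULTILINE, matches line by line:

```python
import re

# Hypothetical pattern: label up to the first colon, then the rest of the line
FIELD_RE = re.compile(r"^([^:\n]+):[ \t]+(.*)$", re.MULTILINE)

sample = (
    "Device Model:     TOSHIBA MD04ACA500\n"
    "\n"
    "Serial Number:    Y9MYK6M4BS9K\n"
)

# findall returns (label, value) tuples; blank lines simply do not match
fields = dict(FIELD_RE.findall(sample))
print(fields.get("Device Model", ""))
print(fields.get("Serial Number", ""))
```

One thing to be aware of: building a dict this way keeps the last value when a label appears more than once.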

0

Not sure what version you're running, but on 2.7, line.split() is splitting the line by word, so

>>> parts = line.split()
>>> parts
['Device', 'Model:', 'TOSHIBA', 'MD04ACA500']

You can also try line.startswith() to find the lines you want: https://docs.python.org/2/library/stdtypes.html#str.startswith

2 Comments

I am running 2.7, and @InSilico gave me an example using line.startswith() which worked like a charm.
Just saw his reply, that is pretty nice!
0

The way I would debug this is by printing out parts at every iteration. Try that and show us what the list is when it fails.

Edit: Your problem is most likely what @jonrsharpe said. parts is probably an empty list when it gets to an empty line, and str(parts) will just return '[]', which is truthy because it is a non-empty string. Try to test that.

0

I think it would be far easier to use regular expressions here.

import re

for line in lines:
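    # Split the string into at most two parts at the first colon
    # that is followed by one or more whitespace characters
    parts = re.split(r':\s+', line, maxsplit=1)
    # Lines without such a colon come back as a single part, so check the length
    if len(parts) == 2: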
        if parts[0] == "Device Model":
            deviceModel = parts[1]
        elif parts[0] == "Serial Number":
            serialNumber = parts[1]

Mind you, if you only care about the two fields, startswith might be better.

0

When you split the blank line, parts is an empty list. You try to accommodate that by checking for an empty list, but you convert the list to a string first, and '[]' is a non-empty string, so the condition is always True.

>>> s = []
>>> bool(s)
False
>>> str(s)
'[]'
>>> bool(str(s))
True
>>> 

Change if str(parts): to if parts:.

Many would say that using a try/except block is the idiomatic way to handle this in Python:

for line in lines:
    parts = line.split()
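    try:
        # split() breaks on whitespace, so the label spans parts[0] and parts[1]
        if parts[0] == "Device" and parts[1] == "Model:":
            # the model may be more than one word, so rejoin the rest
            deviceModel = " ".join(parts[2:])
        elif parts[0] == "Serial" and parts[1] == "Number:":
            serialNumber = parts[2]
    except IndexError:
        # blank line or a line with too few words; nothing to extract
        pass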
