Extract a column from a string in Python

Question

I run a command remotely using Python and this is the output I get:

Vserver   Volume       Aggregate    State      Type       Size  Available Used%
--------- ------------ ------------ ---------- ---- ---------- ---------- -----
vs_cfm06  Available    aggr_backup_1 online    RW        100GB    66.37GB   33%
vs_cfm06  Discovery    aggr_backup_1 online    RW        100GB    66.36GB   33%
vs_cfm06  NonDebugCF01 aggr_backup_1 online    RW        100GB    64.63GB   35%
vs_cfm06  NonDebugCF01_BACKUP aggr_backup_1 online RW      5GB     4.75GB    5%
vs_cfm06  Software     aggr_backup_1 online    RW        100GB    65.08GB   34%
vs_cfm06  Template     aggr_backup_1 online    RW        100GB    66.35GB   33%
vs_cfm06  breakresetafterfaildelCF01 aggr_backup_1 online RW 100GB 69.52GB  30%
vs_cfm06  breakresetafterfaildelCF01_BACKUP aggr_backup_1 online RW 5GB 4.75GB  5%
vs_cfm06  rootvol      aggr_backup_1 online    RW          1GB    972.5MB    5%
vs_cfm06  vol          aggr_backup_1 online    RW          1GB    972.6MB    5%
10 entries were displayed.

How do I extract one column from this so my output is something like this:

Available   
Discovery    
NonDebugCF01 
NonDebugCF01_BACKUP 
Software     
Template     
breakresetafterfaildelCF01 
breakresetafterfaildelCF01_BACKUP 
rootvol      
vol

The code to run the command and print the output is this:

def get_volumes(usrname, ip):

    raw_output = ru.run('volume show', user=usrname, host=ip, set_e=False)  # logs onto netapp and runs command

    print(raw_output)

When I run print type(raw_output) it says it's unicode. Any help would be much appreciated.

Have you tried anything in the direction of your goal?

– Bhargav Rao, Commented Jan 15, 2015 at 13:52

So, is that table stored in a string object?

– nbro, Commented Jan 15, 2015 at 13:53

gboffi · Accepted Answer · 2015-01-15 15:21:47Z

Reading Columns from a File

A text file is inherently row-oriented: when you open it in a text editor, what you see and operate on are lines of text.

This inherent structure is reflected in an idiomatic way of slurping the contents of a text file in Python:

data = [line for line in open(fname)]

Here data is a list of strings, one for each row of the file.

Sometimes the text is more structured and you can see that there is a columnar organization in it. For the sake of simplicity, say that we have

an initial line of headers,
possibly some lines of junk and
a number of lines containing the actual data.

Moreover, we assume that every relevant line contains the same number of columns.

An idiom that you can use is

data = [line.split() for line in open(fname)]

Here data is a list of lists: one sublist for each row of the file, each sublist holding the strings obtained by splitting that row into columns.
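
Since the output in the question is held in a string rather than a file, the same idiom can be applied to the string's lines. The following is a minimal sketch, not a definitive implementation: the helper name `extract_column`, its parameters, and the trimmed `sample` text are made up for illustration. It assumes that data rows have exactly as many whitespace-separated fields as the header, which is how it drops the "entries were displayed" footer.

```python
def extract_column(text, index, skip=2):
    """Return field `index` of every data row in a whitespace-separated table."""
    # Split every non-blank line into its whitespace-separated fields.
    rows = [line.split() for line in text.splitlines() if line.strip()]
    width = len(rows[0])  # number of fields in the header line
    # Skip the header and separator lines, and ignore rows (such as a
    # footer) whose field count differs from the header's.
    return [row[index] for row in rows[skip:] if len(row) == width]


# Trimmed, illustrative copy of the output shown in the question.
sample = """\
Vserver   Volume       Aggregate    State      Type       Size  Available Used%
--------- ------------ ------------ ---------- ---- ---------- ---------- -----
vs_cfm06  Available    aggr_backup_1 online    RW        100GB    66.37GB   33%
vs_cfm06  NonDebugCF01_BACKUP aggr_backup_1 online RW      5GB     4.75GB    5%
vs_cfm06  rootvol      aggr_backup_1 online    RW          1GB    972.5MB    5%
3 entries were displayed.
"""

volumes = extract_column(sample, 1)
print("\n".join(volumes))
```

In the question's function, `extract_column(raw_output, 1)` would play the same role, provided no field ever contains spaces.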

Reordering in Columns

While you can access every single data item by data[row][column], it may be more convenient to refer to data using the headers, as in data['Aggregate'][5]. In Python, to address data using a string you usually use a dictionary, and you can build a dictionary using what is called a dictionary comprehension:

n = 2 # number of leading non-data lines (header and separator) in your example data
data_by_rows = [line.split() for line in open(fname)]
data_by_cols = {col[0]:list(col[n:]) for col in zip(*data_by_rows)}

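This works because the idiom zip(*list_of_rows) transposes the rows and gives you the columns (in Python 3, zip returns an iterator, so wrap it in list() to see the result).

>>> a = [[1,2,3],[10,20,30]]
>>> list(zip(*a))
[(1, 10), (2, 20), (3, 30)]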
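
One caveat is that zip() stops at the shortest row, so a short footer line such as "10 entries were displayed." would silently truncate the set of columns. The sketch below applies the transposition to a string and filters such rows out first. The function name `columns_by_header` and the trimmed `sample` are illustrative assumptions, and the approach only holds when no field contains spaces.

```python
def columns_by_header(text, skip=2):
    """Map each header name to the list of values found in its column."""
    rows = [line.split() for line in text.splitlines() if line.strip()]
    header = rows[0]
    # Keep only rows with as many fields as the header; this drops the
    # footer, which would otherwise make zip() cut the table short.
    body = [row for row in rows[skip:] if len(row) == len(header)]
    # zip(*body) transposes rows into columns; pair each with its header.
    return {name: list(values) for name, values in zip(header, zip(*body))}


# Trimmed, illustrative copy of the output shown in the question.
sample = """\
Vserver   Volume       Aggregate    State      Type       Size  Available Used%
--------- ------------ ------------ ---------- ---- ---------- ---------- -----
vs_cfm06  Available    aggr_backup_1 online    RW        100GB    66.37GB   33%
vs_cfm06  NonDebugCF01_BACKUP aggr_backup_1 online RW      5GB     4.75GB    5%
vs_cfm06  rootvol      aggr_backup_1 online    RW          1GB    972.5MB    5%
3 entries were displayed.
"""

table = columns_by_header(sample)
print(table["Volume"])
print(table["Used%"][0])
```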

Moving On

What we have seen is simple and convenient if the file format is simple and the manipulations you want to do are not involved. For more complex formats and/or manipulation requirements, Python offers a number of options, either in the standard library

the csv module eases the task of reading (and writing) comma- or tab-separated values files,

or as optional modules

the numpy module, aimed at numerical analysis, has facilities for slurping all data from a text file and putting them in an array structure,
the pandas module, aimed at data analysis and modeling and built on numpy, also has facilities to turn a structured text file into a dataframe structure.

Also, data_by_cols needs braces instead of brackets. I could not edit it myself since the fix would be less than 6 characters.

qsantos · Accepted Answer · 2015-01-15 15:12:34Z

There are two handy functions for what you want: readlines() splits a file into lines, and str.split() splits a string (by default, using any whitespace as separator).

with open("input.txt") as f:
    lines = f.readlines()

for line in lines[2:]:
    columns = line.split()
    if len(columns) > 1:  # skip blank lines, which have no second field
        print(columns[1])

An alternative way to do it without using readlines() would be:

with open("input.txt") as f:
    content = f.read()  # does not detect lines

lines = content.split("\n")
for line in lines[2:]:
    columns = line.split()
    if len(columns) > 1:  # skip blank lines, e.g. the empty string left after a trailing "\n"
        print(columns[1])

Finally, you may be handling text whose line termination is "\n" (GNU/Linux), "\r\n" (Windows) or "\r" (classic Mac OS). One way to split on all three is the re module:

import re

with open("input.txt") as f:
    content = f.read()  # does not detect lines

# Try "\r\n" first so a Windows line ending is not split into two breaks.
lines = re.split(r"\r\n|\r|\n", content)
for line in lines[2:]:
    columns = line.split()
    if len(columns) > 1:  # skip blank lines
        print(columns[1])
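
As a simpler alternative to a regular expression, the built-in str.splitlines() recognizes "\n", "\r\n" and "\r" (along with a few less common line boundaries) and does not leave an empty string after a trailing newline. A minimal sketch with a made-up string that mixes all three terminators:

```python
# Illustrative text mixing Unix, Windows and classic Mac OS line endings.
content = "header\n------\r\nrow one\rrow two\n"

lines = content.splitlines()
print(lines)  # ['header', '------', 'row one', 'row two']
```

The same call works directly on a string variable such as the raw output in the question, with no file involved.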

answered Jan 15, 2015 at 14:16, edited Jan 15, 2015 at 15:12 – qsantos

4 Comments

Jennifer M Over a year ago

but the raw output is a variable, not a file

qsantos Over a year ago

Then you can use variable.split("\n") instead of readlines(). I'll put it in the answer.

Neetz Over a year ago

what if there is a blank data in some column?

qsantos Over a year ago

My answer was specifically for the original question, where columns are misaligned. If your columns are well aligned, you can just use slices to get the relevant data for each line; for instance column4 = line[42:57]. Now, if your columns are both misaligned and include empty fields, it will get very tricky to parse correctly.
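
For the well-aligned case, one way to avoid hard-coding offsets such as line[42:57] is to read the column extents from the dashed separator line. The sketch below is only an illustration under assumptions: the function name `slice_columns` and the small table are invented, the table is built with str.format so that it is guaranteed to be aligned, and the approach would not work on the question's output, where long volume names push later fields out of position.

```python
import re


def slice_columns(text):
    """Parse a fixed-width table whose second line is a ruler of dashes."""
    lines = text.splitlines()
    header, ruler, body = lines[0], lines[1], lines[2:]
    # Each run of dashes in the ruler marks the extent of one column.
    spans = [m.span() for m in re.finditer(r"-+", ruler)]
    names = [header[a:b].strip() for a, b in spans]
    # Slice every data line at the same offsets; blank fields become "".
    return [
        {name: line[a:b].strip() for name, (a, b) in zip(names, spans)}
        for line in body
    ]


# Hypothetical aligned table with one empty State field.
ROW = "{:<9} {:<7} {:<4} {:>5}"
sample = "\n".join([
    ROW.format("Volume", "State", "Type", "Size"),
    ROW.format("-" * 9, "-" * 7, "-" * 4, "-" * 5),
    ROW.format("rootvol", "online", "RW", "1GB"),
    ROW.format("scratch", "", "RW", "5GB"),  # State left blank
])

rows = slice_columns(sample)
for row in rows:
    print(row)
```

Unlike line.split(), slicing keeps the empty State field in its own column instead of shifting the remaining fields to the left.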
