Reading data blocks from a file in Python

Question

I'm new to Python and am trying to read "blocks" of data from a file. The file looks something like this:

# Some comment
# 4 cols of data --x,vx,vy,vz
# nsp, nskip =           2          10


#            0   0.0000000


#            1           4
 0.5056E+03  0.8687E-03 -0.1202E-02  0.4652E-02
 0.3776E+03  0.8687E-03  0.1975E-04  0.9741E-03
 0.2496E+03  0.8687E-03  0.7894E-04  0.8334E-03
 0.1216E+03  0.8687E-03  0.1439E-03  0.6816E-03


#            2           4
 0.5056E+03  0.8687E-03 -0.1202E-02  0.4652E-02
 0.3776E+03  0.8687E-03  0.1975E-04  0.9741E-03
 0.2496E+03  0.8687E-03  0.7894E-04  0.8334E-03
 0.1216E+03  0.8687E-03  0.1439E-03  0.6816E-03


#          500  0.99999422


#            1           4
 0.5057E+03  0.7392E-03 -0.6891E-03  0.4700E-02
 0.3777E+03  0.9129E-03  0.2653E-04  0.9641E-03
 0.2497E+03  0.9131E-03  0.7970E-04  0.8173E-03
 0.1217E+03  0.9131E-03  0.1378E-03  0.6586E-03

and so on

Now I want to be able to specify and read only one block of data out of these many blocks. I'm using numpy.loadtxt('filename', comments='#') to read the data, but it loads the whole file in one go. I searched online and found that someone has created a patch for the numpy I/O routine that supports reading blocks, but it's not in mainstream numpy.

It's much easier to choose blocks of data in gnuplot, but then I'd have to write the routine to plot the distribution functions. If I can figure out how to read specific blocks, it would be much easier in Python. Also, I'm moving all my visualization codes from IDL and gnuplot to Python, so it would be nice to have everything in Python instead of scattered across multiple packages.

I thought about calling gnuplot from within Python, plotting a block to a table and assigning the output to an array in Python. But I'm just getting started and could not figure out the syntax to do it.

Any ideas or pointers for solving this problem would be a great help.

So you want the user to specify, say, a pair of values (i,j) and read all the lines between the line "# i j" and the next blank line?
– Pascal Bugnion, Commented May 9, 2012 at 8:22
Almost! What I want is to be able to specify i,j, where i is the starting block and j is the final block, and a block is defined as rows separated by two or more blank rows.
– toylas, Commented May 9, 2012 at 16:13

Emmanuel · Accepted Answer · 2012-05-09 17:23:57Z

5

A quick basic read:

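>>> def read_blocks(input_file, i, j):
    empty_lines = 0
    blocks = []
    with open(input_file) as f:
        for line in f:
            stripped = line.strip()
            # Check for empty/commented lines (lines read from a file
            # keep their '\n', so strip before testing for emptiness)
            if not stripped or stripped.startswith('#'):
                # If 1st one: new block
                if empty_lines == 0:
                    blocks.append([])
                empty_lines += 1
            # Non empty line: add line in current(last) block
            else:
                empty_lines = 0
                if not blocks:  # file starts directly with data
                    blocks.append([])
                blocks[-1].append(line.rstrip('\n'))
    return blocks[i:j + 1]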

>>> for block in read_blocks(s, 1, 2):
    print('-> block')
    for line in block:
        print(line)


-> block
 0.5056E+03  0.8687E-03 -0.1202E-02  0.4652E-02
 0.3776E+03  0.8687E-03  0.1975E-04  0.9741E-03
 0.2496E+03  0.8687E-03  0.7894E-04  0.8334E-03
 0.1216E+03  0.8687E-03  0.1439E-03  0.6816E-03
-> block
 0.5057E+03  0.7392E-03 -0.6891E-03  0.4700E-02
 0.3777E+03  0.9129E-03  0.2653E-04  0.9641E-03
 0.2497E+03  0.9131E-03  0.7970E-04  0.8173E-03
 0.1217E+03  0.9131E-03  0.1378E-03  0.6586E-03
>>>

Now I guess you can use numpy to read the lines...
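The numpy step mentioned above could look something like the following. This is only a sketch: the helper name blocks_to_array is made up for illustration, and it assumes every block has the same number of rows and every line the same number of whitespace-separated columns.

```python
import numpy as np

def blocks_to_array(blocks):
    """Turn a list of blocks (each a list of text lines) into a 3D float array.

    Hypothetical helper: assumes all blocks have the same shape.
    """
    return np.array(
        [[[float(x) for x in line.split()] for line in block]
         for block in blocks]
    )

# Two shortened blocks, in the form read_blocks() returns them
blocks = [
    [" 0.5056E+03  0.8687E-03 -0.1202E-02  0.4652E-02",
     " 0.3776E+03  0.8687E-03  0.1975E-04  0.9741E-03"],
    [" 0.5057E+03  0.7392E-03 -0.6891E-03  0.4700E-02",
     " 0.3777E+03  0.9129E-03  0.2653E-04  0.9641E-03"],
]
data = blocks_to_array(blocks)
print(data.shape)  # (2, 2, 4): block, row, column
```

With this layout, data[k] is the 2D array for block k and data[k][:, 0] is its first column.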


4 Comments

toylas Over a year ago

Thanks a lot Emmanuel!! It worked almost exactly out of the box. I did need to learn a few bits and take care of spaces in my files etc. I also implemented Pascal's suggestion to split the line before appending it to the block. Now the final code gives me a 3D array, with each block being a plane in the 3D array. Thanks a lot for all the help! I wish I could learn Python a little faster.

toylas Over a year ago

Is there a more elegant way of implementing this line? b=a.split();b=np.array(b);b=b.astype(float)

Emmanuel Over a year ago

Thanks for validating! Concerning your question, I don't see why b=np.array(a.split()).astype(float) would not work.

toylas Over a year ago

Thanks! It seems like a much cleaner way than what I had.

Pascal Bugnion · Accepted Answer · 2012-05-09 17:05:03Z

1

The following code should probably get you started. You will probably need the re module.

You can open the file for reading using:

f = open("file_name_here")

You can read the file one line at a time by using

line = f.readline()

To jump to the next line that starts with a "#", you can use:

while line and not line.startswith("#"):  # readline() returns "" at end of file
    line = f.readline()

To parse a line that looks like "# i j", you could use the following regular expression:

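import re

# Raw string; \s*$ rejects lines like "# 500 0.99999422" (second value not an integer)
is_match = re.match(r"#\s+(\d+)\s+(\d+)\s*$", line)
if is_match:
    i = int(is_match.group(1))
    j = int(is_match.group(2))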

See the documentation for the "re" module for more information on this.
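As a small worked example of the header parsing, here is a sketch that wraps the regular expression in a function and tries it on lines from the question's file. The name parse_header is made up for illustration, and anchoring the pattern with \s*$ is an assumption about which comment lines should count as block headers.

```python
import re

# '#', then exactly two integers, then optional whitespace up to end of line
HEADER = re.compile(r"#\s+(\d+)\s+(\d+)\s*$")

def parse_header(line):
    """Return (i, j) as ints for a '# i j' line, or None for anything else."""
    match = HEADER.match(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))

print(parse_header("#            1           4\n"))  # (1, 4)
print(parse_header("#          500  0.99999422\n"))  # None
print(parse_header("# Some comment\n"))              # None
```

Without the anchor, the second line would match too, because \d+ would stop at the "0" in front of the decimal point.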

To parse a block, you could use the following bit of code:

block = [] # block[i][j] will contain element i,j in your block
while line.strip(): # read until next blank line (or end of file)
    block.append([float(x) for x in line.split()])
    # splits each line on whitespace and turns all elements to float
    line = f.readline()

You can then turn your block into a numpy array if you want:

block = np.array(block)

This assumes you have imported numpy as np. If you want to read multiple blocks between i and j, just put the above code for reading one block into a function and call it multiple times.
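Putting the pieces together, a function along these lines might do the job. It is a rough sketch rather than tested production code: read_block is a made-up name, the header pattern is anchored so that only "# i j" lines with two integers start a block, and io.StringIO stands in for a real file opened with open().

```python
import io
import re

HEADER = re.compile(r"#\s+(\d+)\s+(\d+)\s*$")

def read_block(f):
    """Read the next block from an open file.

    Returns ((i, j), rows), where rows is a list of lists of floats,
    or None if there is no further '# i j' header in the file.
    """
    # Skip ahead to the next '# i j' header line
    header = None
    for line in f:
        match = HEADER.match(line)
        if match:
            header = (int(match.group(1)), int(match.group(2)))
            break
    if header is None:
        return None
    # Collect data rows until a blank line or end of file
    rows = []
    for line in f:
        if not line.strip():
            break
        rows.append([float(x) for x in line.split()])
    return header, rows

# Shortened sample in the question's format (only some rows kept)
sample = io.StringIO(
    "# Some comment\n"
    "\n"
    "#            0   0.0000000\n"
    "\n"
    "#            1           4\n"
    " 0.5056E+03  0.8687E-03 -0.1202E-02  0.4652E-02\n"
    " 0.3776E+03  0.8687E-03  0.1975E-04  0.9741E-03\n"
    "\n"
    "#            2           4\n"
    " 0.2496E+03  0.8687E-03  0.7894E-04  0.8334E-03\n"
)
first = read_block(sample)
second = read_block(sample)
print(first[0], len(first[1]))  # (1, 4) 2
print(second[0])                # (2, 4)
print(read_block(sample))       # None
```

Calling read_block repeatedly on the same file object walks through the blocks in order, so reading blocks i to j amounts to skipping the first few results and keeping the rest.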

Hope this helps!


1 Comment

toylas Over a year ago

Thanks a lot Pascal! I took the idea of splitting the line from your post and used it in Emmanuel's suggestion. Ultimately I will implement your suggestion of parsing the text from the '#' lines too, but right now I just have to get working code and make some plots within this week! Sigh...
