I would like to extract the numeric data from a text file, discarding the text lines it contains, using Python.

I have a file, say ABC.txt, as follows:

STEP = 1

22.530183726628522 0.0000000000000000
22.530183726628522 0.0000000000000000
22.530183726628522 0.0000000000000000
22.530183726628522 0.0000000000000000
22.530183726628522 0.0000000000000000
22.530183726628522 0.0000000000000000
22.530183726628522 0.0000000000000000
22.530183726628522 0.0000000000000000
STEP = 2

22.530183726628522 0.0000000000000000
22.530183726628522 0.0000000000000000
22.530183726628522 0.0000000000000000
22.530183726628522 0.0000000000000000
22.530183726628522 0.0000000000000000
22.530183726628522 0.0000000000000000
22.530183726628522 0.0000000000000000
22.530183726628522 0.0000000000000000
STEP = 3

22.530183726628522 0.0000000000000000
22.530183726628522 0.0000000000000000
22.530183726628522 0.0000000000000000
22.530183726628522 0.0000000000000000
22.530183726628522 0.0000000000000000
22.530183726628522 0.0000000000000000
22.530183726628522 0.0000000000000000
22.530183726628522 0.0000000000000000

Disregarding the 'STEP = ' lines and the blank line that follows each of them, I want to store all the numeric data in a NumPy array.

I tried the following script, which worked:

import numpy as np

with open("ABC.txt", "r") as f:
    lines = f.readlines()

data = np.zeros([24, 2])

kk = 0

for ii in range(3):
    # each block is 10 lines long: a STEP line, a blank line, then 8 data lines
    for jj in range(10*ii + 2, 10*ii + 9 + 1):
        data[kk, :] = np.fromstring(lines[jj], dtype=float, sep=' ')
        kk = kk + 1

Is there a more direct way of doing this?

  • Where are you disregarding the step? Commented Feb 17, 2021 at 3:39
  • @MadPhysicist I think the two loops are iterating over the right line numbers. Commented Feb 17, 2021 at 3:43
  • A common way is to read the file line by line. If the line has data, split it and append the result to a list. np.array(alist, dtype=float) will convert the list of lists to a numeric array. The step lines can be ignored or used to start a new group. Commented Feb 17, 2021 at 3:55
  • Sorry it's late. Of course they are Commented Feb 17, 2021 at 4:08
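
The line-by-line approach suggested in the comments could be sketched roughly as follows. This is only an illustration: the function name `group_by_step` and the sample lines are made up for the example, and it assumes every header line begins with the text `STEP`. Each header starts a new group, so the rows stay separated per step.

```python
def group_by_step(lines):
    """Return a list of groups; each group is a list of rows of floats."""
    groups = []
    for line in lines:
        stripped = line.strip()
        if not stripped:
            continue  # skip blank lines
        if stripped.startswith("STEP"):
            groups.append([])  # a STEP header starts a new group
            continue
        if not groups:
            groups.append([])  # tolerate data that appears before any header
        groups[-1].append([float(tok) for tok in stripped.split()])
    return groups


# Hypothetical sample standing in for the contents of ABC.txt
sample_lines = [
    "STEP = 1\n", "\n", "22.5 0.0\n", "22.5 0.0\n",
    "STEP = 2\n", "\n", "23.5 1.0\n",
]
groups = group_by_step(sample_lines)
```

When every step has the same number of rows, passing `groups` to `np.array(groups, dtype=float)` should give an array of shape (steps, rows, 2); flattening the groups into one list first gives the plain two-column array asked for in the question.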

2 Answers

You can try this:

import re
import numpy as np

with open("ABC.txt") as f:
    s = f.read()

# get a list of all lines of the text file that start with a digit
lines = re.findall(r"^\d.*", s, re.M)

# split every line on whitespace and convert
# the resulting substrings into floats
numlist = [list(map(float, line.split())) for line in lines]

# convert the resulting list of lists of floats into a numpy array
data = np.array(numlist)
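
As a possible alternative to the regex, NumPy's own text reader may be able to do the filtering. The sketch below is not part of the original answer: it assumes that every non-numeric line begins with the literal text `STEP`, so that `np.loadtxt` can treat those lines as comments through its `comments` parameter. The in-memory `sample` string is a made-up stand-in for ABC.txt.

```python
import io

import numpy as np

# Hypothetical sample with the same layout as ABC.txt
sample = """STEP = 1

22.5 0.0
22.5 0.0
STEP = 2

23.5 1.0
23.5 1.0
"""

# Everything from "STEP" to the end of the line is treated as a comment;
# the resulting empty lines and the blank lines are skipped by loadtxt.
data = np.loadtxt(io.StringIO(sample), comments="STEP")
```

With a real file, the `io.StringIO(sample)` argument would be replaced by the file name, for example `"ABC.txt"`.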

2 Comments

Thank you! Could you please explain what we are doing here?
I added comments to the code. Let me know if this is sufficient.

Alternatively, if you don't have access to external libraries and still want to perform this task, you can do the following:

with open("ABC.txt", "r") as f:
    lines = f.readlines()

arr = list()

for line in lines:
    if line[0].isdecimal():  # keep only lines that begin with a decimal digit
        arr.append([float(x) for x in line.split()])  # convert each field to a float

The above can also be written as a list comprehension; both give the same result:

arr1 = [[float(x) for x in line.split()] for line in lines if line[0].isdecimal()]
