I've been stumped on this one for a bit now. I have the following string:

LAT:  6.90N    LON: 80.58E    ELEV: 1097.6M

I need to extract 6.90N, 80.58E, and 1097.6M.

The problem is that I iterate through other files with similar formats. A few of those files have missing values or placeholder characters (e.g. ***** if no value is present).

I want to be able to capture these as best as possible. Is there a way to write a regular expression to capture the values between LAT:, LON:, and ELEV: without including the spaces?

  • regex101.com/r/xJ4sF5/2 Commented Apr 20, 2015 at 21:12
  • Also check out stream-based parsing. I find that much faster for things like this. Commented Apr 20, 2015 at 21:13
  • Can you show a line, which you do not want to be matched? Are the values tab separated? Commented Apr 20, 2015 at 21:14

3 Answers

How about this:

>>> import re
>>> s = "LAT: 6.90N LON: 80.58E ELEV: 1097.6M"
>>> m = re.findall(r'(\d+\.\d+[A-Z])', s)
>>> print(m)
['6.90N', '80.58E', '1097.6M']

Broken down:

(            # start of capturing group
\d+          # one or more digits
\.           # a literal dot (escaped)
\d+          # one or more digits
[A-Z]        # one uppercase letter
)            # end of capturing group
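
The pattern above only matches well-formed numbers, so a field holding ***** or nothing at all is silently skipped and you can no longer tell which value belongs to which label. Here is a rough sketch of a label-anchored variant that might handle those cases. The names FIELD_RE and extract_fields are made up for illustration, and treating a run of asterisks as "no value" is an assumption based on the question, not something tested against the real files.

```python
import re

# Hypothetical pattern: a label, optional spaces/tabs, then the next
# non-space token. The lookahead stops an empty field from swallowing
# the following label as its "value".
FIELD_RE = re.compile(r'(LAT|LON|ELEV):[ \t]*((?!LAT:|LON:|ELEV:)\S*)')

def extract_fields(line):
    """Map each label found in the line to its value, or None if absent or masked."""
    fields = {}
    for label, value in FIELD_RE.findall(line):
        # Assumption: an empty token or a run of asterisks means "no value".
        if not value or set(value) == {'*'}:
            fields[label] = None
        else:
            fields[label] = value
    return fields

print(extract_fields("LAT:  6.90N    LON: 80.58E    ELEV: 1097.6M"))
print(extract_fields("LAT: *****    LON: 80.58E    ELEV: "))
```

Keying the result by label means a missing value shows up as None instead of shifting the other values around. A label that does not appear in the line at all is simply absent from the dict.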

2 Comments

I think you could get away with just re.findall("(\d+\.\d+[A-Z])",s)
@JoranBeasley Right you are.

You don't need a regex for this:

input_str = 'LAT:  6.90N    LON: 80.58E    ELEV: 1097.6M'
# Split into strings separated by whitespace
parts = input_str.split()
# Take every other item from the list, skipping the first
lat, lon, elev = parts[1::2]

If every line consists of label/value pairs separated by whitespace, but the set of labels can differ from line to line, you can just build a dictionary:

def line_to_dict(input_str):
  # Pair each label (even positions) with the value after it (odd positions)
  parts = input_str.split()
  return dict(zip(parts[::2], parts[1::2]))
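
As a usage sketch, the dictionary can be post-processed to deal with the ***** placeholders mentioned in the question. The drop_placeholders helper below is hypothetical, and it assumes a placeholder is always a run of asterisks. Note that this split-based approach still assumes every label is followed by some token; a completely blank value would shift the pairs.

```python
def line_to_dict(input_str):
    # Pair each label (even positions) with the value after it (odd positions)
    parts = input_str.split()
    return dict(zip(parts[::2], parts[1::2]))

def drop_placeholders(fields):
    """Strip the trailing colon from labels and turn all-asterisk values into None."""
    return {
        key.rstrip(':'): (None if set(value) == {'*'} else value)
        for key, value in fields.items()
    }

row = drop_placeholders(line_to_dict('LAT:  *****    LON: 80.58E    ELEV: 1097.6M'))
print(row['LAT'], row['LON'], row['ELEV'])
```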


Given your current question, there is no need for re. I would just do it like this:

s = 'LAT: 6.90N LON: 80.58E ELEV: 1097.6M'
l = s.split()
if l[1] != '*'*len(l[1]):
    print(l[1], l[3], l[5])
