Python: elegant and code-saving way to split a string into a list

Question

I have a string:

mydata
'POINT (558750.3267372231900000 6361788.0628051758000000)'

I am looking for a code-saving way to convert it into a structure with numeric values, such as

(g, (x,y))

where:

g = geometry (POINT)
x = x coordinate
y = y coordinate

I am using

mydata.split(" ")
['POINT', '(558750.3267372231900000', '6361788.0628051758000000)']

but after that I need several more lines of code to get x and y.

What about storing the data in a list as point objects? Shapely provides a method to parse your point strings for you: pypi.python.org/pypi/Shapely
– dm03514, commented Dec 6, 2012 at 17:52

Jon Clements · Accepted Answer · 2012-12-06 17:47:31Z

3

Step by step:

>>> s = 'POINT (558750.3267372231900000 6361788.0628051758000000)'
>>> word, points = s.split(None, 1)
>>> word
'POINT'
>>> points
'(558750.3267372231900000 6361788.0628051758000000)'
>>> points = points.strip('()').split()
>>> points
['558750.3267372231900000', '6361788.0628051758000000']
>>> x, y = (float(i) for i in points)
>>> x
558750.3267372232
>>> y
6361788.062805176
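The steps above can be collected into one small function. This is only a minimal sketch: the name `parse_point` is made up here for illustration, and it assumes the input always has the form `NAME (x y)`.

```python
def parse_point(s):
    """Split 'NAME (x y)' into (name, (x, y)) with float coordinates."""
    # Split once on whitespace: the geometry name, then the parenthesised part.
    word, points = s.split(None, 1)
    # Drop the surrounding parentheses and convert each number to float.
    x, y = (float(p) for p in points.strip('()').split())
    return word, (x, y)


result = parse_point('POINT (558750.3267372231900000 6361788.0628051758000000)')
print(result)
```

The generator expression raises a ValueError if the string does not contain exactly two numbers, which may or may not be the behaviour you want.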

answered Dec 6, 2012 at 17:47

Jon Clements

Lev Levitsky · Accepted Answer · 2012-12-06 17:50:04Z

3

Regex can spare you some typing here:

In [1]: import re

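In [2]: def nice_tuple(s):
   ...:     # [ ()]+ never matches an empty string, so the split behaves the same on Python 3.7+
   ...:     g, x, y, _ = re.split(r'[ ()]+', s)
   ...:     return g, tuple(map(float, (x, y)))
   ...: 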

In [3]: nice_tuple('POINT (558750.3267372231900000 6361788.0628051758000000)')
Out[3]: ('POINT', (558750.3267372232, 6361788.062805176))

answered Dec 6, 2012 at 17:50

Lev Levitsky

Claudiu · Accepted Answer · 2012-12-06 18:08:55Z

2

If your data is always in that exact format, it's easy:

>>> def parse_data(d):
    geom, xs, ys = d.split()
    return (geom, (float(xs[1:]), float(ys[:-1])))

>>> mydata
'POINT (558750.3267372231900000 6361788.0628051758000000)'
>>> parse_data(mydata)
('POINT', (558750.32673722319, 6361788.0628051758))
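If the data might not always be in that exact format, a few checks make the failure explicit instead of producing a confusing float() error. This is a sketch under that assumption; `parse_data_checked` is a hypothetical name, not part of the answer above.

```python
def parse_data_checked(d):
    """Like parse_data, but fail loudly on unexpected input."""
    parts = d.split()
    # Expect exactly three whitespace-separated fields: NAME, "(x" and "y)".
    if len(parts) != 3:
        raise ValueError('expected "NAME (x y)", got %r' % d)
    geom, xs, ys = parts
    if not (xs.startswith('(') and ys.endswith(')')):
        raise ValueError('missing parentheses in %r' % d)
    # Strip the opening and closing parenthesis before converting.
    return geom, (float(xs[1:]), float(ys[:-1]))


checked = parse_data_checked('POINT (1.5 2.5)')
print(checked)
```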

edited Dec 6, 2012 at 18:08

answered Dec 6, 2012 at 17:52

Claudiu

Ashwini Chaudhary · Accepted Answer · 2012-12-06 17:54:47Z

1

Using regex:

In [58]: import re

In [59]: g, (x, y) = (re.findall(r"[A-Za-z]+", mydata)[0],
    ...:              [float(n) for n in re.findall(r"[\d+.]+", mydata)])

In [60]: g
Out[60]: 'POINT'

In [61]: x
Out[61]: 558750.3267372232

In [62]: y
Out[62]: 6361788.062805176

Using str.strip() and str.split():

In [35]: mydata='POINT (558750.3267372231900000 6361788.0628051758000000)'

In [39]: data=mydata.split(None,1)

In [40]: data
Out[40]: ['POINT', '(558750.3267372231900000 6361788.0628051758000000)']

In [41]: g,[x,y]=data[0], map(lambda x: float(x.strip("()")), data[1].split())

In [42]: g,x,y
Out[42]: ('POINT', 558750.3267372232, 6361788.062805176)

edited Dec 6, 2012 at 17:54

answered Dec 6, 2012 at 17:46

Ashwini Chaudhary

Cameron Sparr · Accepted Answer · 2012-12-06 18:00:12Z

1

I would use .translate and .split:

In [126]: mydata = 'POINT (558750.3267372231900000 6361788.0628051758000000)'

In [127]: mysplitdata = mydata.translate(str.maketrans('', '', '()')).split()  # Python 3; on Python 2 use mydata.translate(None, '()')

In [128]: mysplitdata
Out[128]: ['POINT', '558750.3267372231900000', '6361788.0628051758000000']

In [129]: g,x,y = mysplitdata[0],float(mysplitdata[1]),float(mysplitdata[2])

In [130]: outdata = (g, (x,y))

In [131]: outdata
Out[131]: ('POINT', (558750.3267372232, 6361788.062805176))

answered Dec 6, 2012 at 18:00

Cameron Sparr

Hemesh Singh · Accepted Answer · 2012-12-06 18:02:17Z

1

Recently I created an application in Python where I did almost the same thing. Here is a class I created to parse WKT files.

link

Hope you find it useful. See line number 136 for usage. You can use this class to read Linestrings and Multilinestrings as well.
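For readers without access to the linked class, here is a rough sketch of how a Linestring could be split by hand. It is not the linked class: `parse_linestring` is a hypothetical name, and it assumes the common WKT form `LINESTRING (x1 y1, x2 y2, ...)` with two values per point.

```python
def parse_linestring(s):
    """Split 'LINESTRING (x1 y1, x2 y2, ...)' into (name, [(x, y), ...])."""
    # Split once on whitespace: the geometry name, then the coordinate list.
    name, body = s.split(None, 1)
    # Remove the outer parentheses, then split the points on commas.
    pairs = body.strip().strip('()').split(',')
    coords = [tuple(float(v) for v in pair.split()) for pair in pairs]
    return name, coords


line = parse_linestring('LINESTRING (30 10, 10 30, 40 40)')
print(line)
```

Nested geometries such as Multilinestrings have inner parentheses and would need more than this.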

answered Dec 6, 2012 at 18:02

Hemesh Singh

Mark · Accepted Answer · 2012-12-06 18:24:27Z

1

import re

found = re.match(r'([a-zA-Z]*) \(([0-9\.]*) ([0-9\.]*)\)', mydata)
found.group(1), (float(found.group(2)), float(found.group(3)))

That's probably the shortest one, don't know about elegant.
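A slightly longer variant of the same idea, as a sketch: the pattern is compiled once, the whole string must match, and the number part also accepts a sign and an exponent. The names `NUMBER`, `POINT_RE` and `parse_with_regex` are made up for this example, and the number syntax is an assumption about what the data may contain.

```python
import re

# Assumed number syntax: optional sign, digits with optional fraction,
# optional exponent (for example -1.5e3).
NUMBER = r'[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?'
POINT_RE = re.compile(r'([A-Za-z]+)\s*\(\s*(%s)\s+(%s)\s*\)' % (NUMBER, NUMBER))


def parse_with_regex(s):
    """Return (name, (x, y)) or raise ValueError if s does not match."""
    m = POINT_RE.fullmatch(s.strip())
    if m is None:
        raise ValueError('not a two-coordinate geometry string: %r' % s)
    g, x, y = m.groups()
    return g, (float(x), float(y))


parsed = parse_with_regex('POINT (-1.5e3 2)')
print(parsed)
```

Unlike re.match, fullmatch rejects trailing garbage after the closing parenthesis.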

edited Dec 6, 2012 at 18:24

answered Dec 6, 2012 at 17:56

Mark

1 Comment

Lev Levitsky Over a year ago

You still have to convert x and y to float.

Mark · Accepted Answer · 2012-12-06 18:24:46Z

1

v = mydata.split()
g = v[0]
x = float(v[1].strip('('))
y = float(v[2].strip(')'))
(g, (x, y))

Code saving yes, elegant not so much

edited Dec 6, 2012 at 18:24

answered Dec 6, 2012 at 17:45

Mark
