
I’m trying to split downloaded data into a 2D array with different data types. The downloaded data looks like this:

000|17:40
000|17:45
010|17:50
025|17:55
056|18:00
178|18:05
202|18:10
203|18:15
190|18:20
072|18:25
013|18:30
002|18:35
000|18:40
000|18:45
000|18:50
000|18:55
000|19:00
000|19:05
000|19:10
000|19:15
000|19:20
000|19:25
000|19:30
000|19:35
000|19:40

I’m using the following code to parse this into a two dimensional array:

#!/usr/bin/python

import urllib2

response = urllib2.urlopen('http://gps.buienradar.nl/getrr.php?lat=52&lon=4')
html = response.read()
htmlsplit = []

for record in html.split("\r\n"):
    htmlsplit.append(record.split("|"))

print htmlsplit

This is working great, but as expected, it treats everything as a string. I’ve found some examples that split into integers. That would be great if both sides were integers, but in my case it’s an integer | string (or maybe some kind of Python time format).

How can I split this directly into different data types?

1 Comment
  • What kind of array? module array.array (weird)? List? Numpy array? Commented Jun 17, 2014 at 21:26

2 Answers


Something like this?

for record in html.split("\r\n"):  # beware, newlines are treacherous!
    s = record.split("|")
    htmlsplit.append((int(s[0]), s[1]))

Just write a parser for each record if your data is this simple. However, I would add a try/except clause to catch errors from non-conforming lines, empty lines, etc., which may be present in the data; the code above is fragile. Also, you might want to split on only \n and then clean your strings with strip() (i.e. replace s[1] with s[1].strip()). The integer conversion takes care of the left-hand side automatically, since int() ignores surrounding whitespace.
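A defensive version of that loop might look like the following sketch (the records list and variable names are illustrative, not from the original code):

```python
# Sample input including a malformed line and a trailing empty line,
# as might come from splitting the downloaded text on newlines.
records = "000|17:40\r\n025|17:55\r\nbadline\r\n".split("\r\n")

parsed = []
for record in records:
    try:
        value, timestamp = record.split("|")
        parsed.append((int(value), timestamp.strip()))
    except ValueError:
        # Raised both when the line has no "|" (unpack fails) and when
        # the left field is not an integer; skip such lines.
        continue

print(parsed)  # [(0, '17:40'), (25, '17:55')]
```

Catching ValueError covers both failure modes at once, so empty lines and junk lines are silently dropped instead of crashing the loop.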


3 Comments

Hi DrV, thank you! I've changed it to this: for record in html.splitlines(): s = record.split("|"); htmlsplit.append((int(s[0]), s[1])). I used splitlines on the advice from Aaron Hall below, and added the : you forgot ;) One question: do I need to "free" the temporary s to rule out memory leaks?
I will use try/except, thanks for pointing that out. (This is my first time on Stack Overflow, and what a horrible reply editor this is; I hope I can make sense without the line breaks.)
@Satoer I just omitted the colon to see if you are awake :) (Fixed now.) No need to free variables in Python; if you are not using them anymore (nothing references them), a big yellow lorry with the text "Garbage Collector" comes and picks them up. I suggest you do some reading on GC and Python's "everything is an object" model, as understanding the basics is sometimes useful. BTW, Aaron Hall's solution is in a way more Pythonic than mine; once you learn the basics, you'll learn to love the nice modules available!

Use str.splitlines instead of splitting on \r\n, and use the csv module to iterate over the lines:

import csv
txt = '000|17:40\n000|17:45\n000|17:50\n000|17:55\n000|18:00\n000|18:05\n000|18:10\n000|18:15\n000|18:20\n000|18:25\n000|18:30\n000|18:35\n000|18:40\n000|18:45\n000|18:50\n000|18:55\n000|19:00\n000|19:05\n000|19:10\n000|19:15\n000|19:20\n000|19:25\n000|19:30\n000|19:35\n000|19:40\n'

reader = csv.reader(txt.splitlines(), delimiter='|')
column1 = []
column2 = []
for c1, c2 in reader:
    column1.append(c1)
    column2.append(c2)

You can also use the DictReader:

import StringIO
reader2 = csv.DictReader(StringIO.StringIO(txt), 
                         fieldnames=['int', 'time'], 
                         delimiter='|')

column1 = []
column2 = []
for row in reader2:
    column1.append(row['time'])
    column2.append(row['int'])
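Since the asker wondered about a Python time format: both fields can be converted to real Python types as they are read, for instance with int() and datetime.strptime. A minimal sketch building on the DictReader example above (the 'int' and 'time' field names are the ones chosen there):

```python
import csv
from datetime import datetime

txt = '000|17:40\n010|17:50\n'

reader = csv.DictReader(txt.splitlines(),
                        fieldnames=['int', 'time'],
                        delimiter='|')

# Convert each row: left field to int, right field to a datetime.time
rows = [(int(row['int']), datetime.strptime(row['time'], '%H:%M').time())
        for row in reader]
```

After this, rows holds tuples like (0, datetime.time(17, 40)), so you can do arithmetic on the values and compare the times directly.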

1 Comment

Hi Aaron, thanks for the splitlines advice. I've discovered that this keeps the array free of a trailing empty record. Thanks for the solution, but the solution from DrV does exactly what I need.
