How to create a list out of excel sheet values with Python?

Question

I'm pretty new to all this python and Excel things. I have a really big excel sheet with 1 million rows, my goal is to create a list with a few for list (I think that is what I need to do) and the problem is the follow:

I have something like this:

I want to add to a list all the POINT_Y and POINT_X (gps coordinates) where the objectid is the same and when the objectid changes I want to add those values to another list.

I only manage to get the data from the excel, my problem is in the construction of the lists.

import pandas
from shapely.geometry import Point

polygon_sheet =pandas.read_excel("C:...\coords.xlsx")
for row in polygon_sheet.index:
    data = (polygon_sheet['POINT_Y'][row]) + ", " + \
           (polygon_sheet['POINT_X'][row])

I'm sure I need a nested for loop but I don't know how to obtain what I want, that is something like this:

id1 = [[41.15226404, 25.41327506], [41.15221095, 25.41363158], [41.15220779, 25.41395401], [...(all coords from id1)]]

id2= [[41.11617356, 25.53488120], [41.11541559, 25.53517417], [41.11494893, 25.53531324], [...(all coords from id2]]

And so on until all ID's are iterated.

Hope I could explain what I expect correctly. Appreciate any help

where do you want to store those output lists ? to another pandas df ? or in a dict ? — kederrac
– kederrac, Commented Sep 16, 2019 at 10:29
if you want to store in a dict than each id is a key from dict — kederrac
– kederrac, Commented Sep 16, 2019 at 10:33

kederrac · Accepted Answer · 2019-09-16 11:07:29Z

1

you can apply a custom function over each row in order to create a dict with your desired output:

from collections import defaultdict

id_points = defaultdict(list)

def extrct_points(row):
    o_id, point_x, point_y = row
    id_points[o_id].append([point_x, point_y])

polygon_sheet.apply(extrct_points, axis=1)

for example:

polygon_sheet = pd.DataFrame({'OBJECTID': [1, 1,  2, 2, 2, 3], 'POINT_X':[0.23, 0.123, 0.1, 0.1, 0.2, 1],
                  'POINT_Y':[0.223, 0.3123, 0.11, 0.11, 0.22, 11]})

from pprint import pprint

polygon_sheet.apply(extrct_points, axis=1)

pprint(id_points)

output:

defaultdict(<class 'list'>,
            {1.0: [[0.23, 0.223], [0.123, 0.3123]],
             2.0: [[0.1, 0.11], [0.1, 0.11], [0.2, 0.22]],
             3.0: [[1.0, 11.0]]})

edited Sep 16, 2019 at 11:07

answered Sep 16, 2019 at 10:46

kederrac

17.4k6 gold badges36 silver badges58 bronze badges

Sign up to request clarification or add additional context in comments.

Comments

Kostas Charitidis · Accepted Answer · 2019-09-16 10:46:15Z

1

You can create a dict of lists.Try something like this:

import pandas
from shapely.geometry import Point

dt = {}
polygon_sheet =pandas.read_excel("C:...\coords.xlsx")
for row in polygon_sheet.index:
    try:
        dt[polygon_sheet['ID'][row]] += [[polygon_sheet['POINT_Y'][row], polygon_sheet['POINT_X'][row]]]
    except:
        dt.update({polygon_sheet['ID'][row]: [[polygon_sheet['POINT_Y'][row], polygon_sheet['POINT_X'][row]]]})

print(dt)

edited Sep 16, 2019 at 10:46

answered Sep 16, 2019 at 10:27

Kostas Charitidis

3,1231 gold badge15 silver badges23 bronze badges

Collectives™ on Stack Overflow

How to create a list out of excel sheet values with Python?

2 Answers 2

Comments

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Related