Intersection Area of Two Shapefiles - Python

Question

I have two shapefiles in python and I would like to find the area of all spaces they overlap.

I can use sjoin from geopandas to get areas where they join, but for locations where there are multiple overlaps I would like to only keep the one with the largest area.

municipality = gpd.read_file(muni_file)
soil_type = gpp.read_file(soil)
combined = gpd.sjoin(municipality,soil_type,how="left",op="intersects")

With OGR I can get the area of a polygon as below

from osgeo import ogr

wkt = "POLYGON ((1162440.5712740074 672081.4332727483, 1162440.5712740074 647105.5431482664, 1195279.2416228633 647105.5431482664, 1195279.2416228633 672081.4332727483, 1162440.5712740074 672081.4332727483))"
poly = ogr.CreateGeometryFromWkt(wkt)

So I am wondering if there is a way to take my combined shapefile and have the area where the two intersect so that I only keep the maximum one for each municipality.

Bob Haffner · Accepted Answer · 2017-06-10 21:37:40Z

Yes, I believe you can loop thru combined via apply and get the size of each of the intersections.

start with reindex of combined as I'm assuming they're are duplicates from sjoin()

combined = combined.reset_index()

Then define a helper function (get_size_of_intersection) then we'll loop thru combined and apply get_size_of_intersection() and create a new series called intersection_size

some notes:

-combined will have the geometry of municipality

-combined will have a column/series called index_right which will be the index of soil_type

-since these are shapely objects we're dealing with we can leverage intersection() and the area property

def get_size_of_intersection(row, soil_type):
    return row['geometry'].intersection(soil_type['geometry'].iloc[int(row['index_right'])]).area

combined['intersection_size'] = combined.apply(lambda row : 
                                       get_size_of_intersection(row, soil_type), axis=1)

We'll create another series called max_intersection_size. Here I'm assuming municipality has some sort of 'name' series that we can group on and apply max()

combined['max_intersection_size'] = combined.groupby('name')['intersection_size'].transform(max)

Then using boolean indexing we get our desired data

(ie where intersection_size equals max_intersection_size)

filter = combined['intersection_size'] == combined['max_intersection_size']
combined[filter]

Collectives™ on Stack Overflow

Intersection Area of Two Shapefiles - Python

1 Answer 1

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

Comments

Your Answer

Sign up or log in

Post as a guest

Related