0

I have two shapefiles in python and I would like to find the area of all spaces they overlap.

I can use sjoin from geopandas to get areas where they join, but for locations where there are multiple overlaps I would like to only keep the one with the largest area.

municipality = gpd.read_file(muni_file)
soil_type = gpp.read_file(soil)
combined = gpd.sjoin(municipality,soil_type,how="left",op="intersects")

With OGR I can get the area of a polygon as below

from osgeo import ogr

wkt = "POLYGON ((1162440.5712740074 672081.4332727483, 1162440.5712740074 647105.5431482664, 1195279.2416228633 647105.5431482664, 1195279.2416228633 672081.4332727483, 1162440.5712740074 672081.4332727483))"
poly = ogr.CreateGeometryFromWkt(wkt)

So I am wondering if there is a way to take my combined shapefile and have the area where the two intersect so that I only keep the maximum one for each municipality.

1 Answer 1

3

Yes, I believe you can loop thru combined via apply and get the size of each of the intersections.

start with reindex of combined as I'm assuming they're are duplicates from sjoin()

combined = combined.reset_index()

Then define a helper function (get_size_of_intersection) then we'll loop thru combined and apply get_size_of_intersection() and create a new series called intersection_size

some notes:

-combined will have the geometry of municipality

-combined will have a column/series called index_right which will be the index of soil_type

-since these are shapely objects we're dealing with we can leverage intersection() and the area property

def get_size_of_intersection(row, soil_type):
    return row['geometry'].intersection(soil_type['geometry'].iloc[int(row['index_right'])]).area

combined['intersection_size'] = combined.apply(lambda row : 
                                       get_size_of_intersection(row, soil_type), axis=1)

We'll create another series called max_intersection_size. Here I'm assuming municipality has some sort of 'name' series that we can group on and apply max()

combined['max_intersection_size'] = combined.groupby('name')['intersection_size'].transform(max)

Then using boolean indexing we get our desired data

(ie where intersection_size equals max_intersection_size)

filter = combined['intersection_size'] == combined['max_intersection_size']
combined[filter]
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.