I am unable to combine (merge / cross join) a DataFrame and a nested list under one condition: if a nearest zip from the nested list is equal to the row's actual zip, it should not appear in the nearest-zip field. How can I get to the desired output?

The code I have so far:

print(test_df)
print(type(test_df))
n_zip = []  # one list of nearby zip codes per row
for x in range(5):
    # columns 1 and 2 hold BEGIN_LAT and BEGIN_LON
    nearest_result = search.by_coordinates(test_df.iloc[x, 1], test_df.iloc[x, 2], radius=30, returns=3)
    n_zip.append([res.zipcode for res in nearest_result])
print(n_zip)
print(type(n_zip))

The dataframe and nested list: [image]

Desired Output: [image]

1 Answer

A simpler approach may exist, but as a first attempt, start by dropping the 'NEAREST_ZIP' column:

>>> print(test_df)  # /!\ dropped 'NEAREST_ZIP'
   ID  BEGIN_LAT  BEGIN_LON  ZIP_CODE
0   0    30.9958   -87.2388     36441
1   1    42.5589   -92.5000     50613
2   2    42.6800   -91.9000     50662
3   3    37.0800   -97.8800     67018
4   4    37.8200   -96.8200     67042
>>> # used nzip:
>>> nzip = [[36441, 32535, 36426],
             [50613, 50624, 50613],  # i guess there was a typo in your code here
             [50662, 50641, 50671],
             [67018, 67003, 67049],
             [67042, 67144, 67074]]

>>> # build a `closest` dataframe:
>>> closest = pd.DataFrame(data={k: (v1, v2) for k, v1, v2 in nzip}).T.stack().reset_index().drop(columns=['level_1'])
>>> closest.columns = ['ZIP_CODE', 'NEAREST_ZIP']
>>> # merging
>>> test_df.merge(closest)
   ID  BEGIN_LAT  BEGIN_LON  ZIP_CODE  NEAREST_ZIP
0   0    30.9958   -87.2388     36441        32535
1   0    30.9958   -87.2388     36441        36426
2   1    42.5589   -92.5000     50613        50624
3   1    42.5589   -92.5000     50613        50613
4   2    42.6800   -91.9000     50662        50641
5   2    42.6800   -91.9000     50662        50671
6   3    37.0800   -97.8800     67018        67003
7   3    37.0800   -97.8800     67018        67049
8   4    37.8200   -96.8200     67042        67144
9   4    37.8200   -96.8200     67042        67074
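
Note that the merged output above still contains a row where NEAREST_ZIP equals ZIP_CODE (50613), which the question asks to hide. A minimal sketch of that filter, assuming the merged frame has the same column names as above (the small `merged` frame below is only a stand-in built from the first rows of that output):

```python
import pandas as pd

# Stand-in for the result of test_df.merge(closest), first four rows only
merged = pd.DataFrame({
    "ID": [0, 0, 1, 1],
    "ZIP_CODE": [36441, 36441, 50613, 50613],
    "NEAREST_ZIP": [32535, 36426, 50624, 50613],
})

# Keep only the rows whose nearest zip differs from the row's own zip
filtered = merged[merged["NEAREST_ZIP"] != merged["ZIP_CODE"]].reset_index(drop=True)
print(filtered)
```

If the zip codes come back as strings from the lookup while ZIP_CODE is numeric, both columns would need the same dtype before comparing.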

1 Comment

My nzip can have up to 30,000 elements. Is there another way to build the result, rather than writing "closest = pd.DataFrame(data={k: (v1, v2) for k, v1, v2 in nzip}).T.stack().reset_index().drop(columns=['level_1'])"?
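
One possible alternative for a large list is to skip the intermediate dictionary and attach the nested list as a column, then use `DataFrame.explode`. This is only a sketch under two assumptions that the question does not confirm: `nzip[i]` belongs to row `i` of `test_df`, and the zip codes have the same type as ZIP_CODE. The inner lists may have any length.

```python
import pandas as pd

test_df = pd.DataFrame({
    "ID": [0, 1, 2],
    "BEGIN_LAT": [30.9958, 42.5589, 42.6800],
    "BEGIN_LON": [-87.2388, -92.5000, -91.9000],
    "ZIP_CODE": [36441, 50613, 50662],
})
nzip = [[36441, 32535, 36426],
        [50613, 50624, 50613],
        [50662, 50641, 50671]]

# Assumption: nzip is in the same order as the rows of test_df
out = test_df.copy()
out["NEAREST_ZIP"] = pd.Series(nzip, index=out.index)

# One output row per (row, nearby zip) pair
out = out.explode("NEAREST_ZIP").reset_index(drop=True)

# Drop pairs where the nearby zip is the row's own zip
out = out[out["NEAREST_ZIP"] != out["ZIP_CODE"]].reset_index(drop=True)
out["NEAREST_ZIP"] = out["NEAREST_ZIP"].astype(int)
print(out)
```

Because the row's own zip is removed by the comparison, it does not matter whether it appears first in each inner list, more than once, or not at all.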
