2

I'm just getting up to speed on Pandas and cannot resolve one issue. I have a list of Counties in NY State. If the County is one of the 5 boroughs, I want to change the county name to New York, otherwise I leave it alone. The following gives the idea, but is not correct.

EDIT - so if the counties in the County column of the first few rows were Albany, Allegheny, Bronx before the change, they would be Albany, Allegheny, New York after the change

# clean up county names
# 5 boroughs must be combined to New York City
# eliminate the word county
nyCounties = ["Kings", "Queens", "Bronx", "Richmond", "New York"]

nypopdf['County'] = ['New York' for nypopdf['County'] in nyCounties else   
nypopdf['County']]
3
  • 1
    can you show your expected output from that list ? Commented Oct 27, 2018 at 15:10
  • also - this comment in your code looks like a mistake #nypopdf['County'] = ['New York' for nypopdf['County'] in nyCounties else . you Probably meant to have #nypopdf['County'] = ['New York' for nypopdf['County'] in nyCounties else nypopdf['County']] Commented Oct 27, 2018 at 15:11
  • nypopdf['County'] = ['New York' for nypopdf['County'] in nyCounties else nypopdf['County']] results in a syntax error at else Commented Oct 27, 2018 at 15:17

1 Answer 1

1

A small mockup:

In [44]: c = ['c', 'g']
In [45]: df = pd.DataFrame({'county': list('abccdefggh')})
In [46]: df['county'] = df['county'].where(~df['county'].isin(c), 'N')
In [47]: df
Out[47]:   county
         0      a
         1      b
         2      N
         3      N
         4      d
         5      e
         6      f
         7      N
         8      N
         9      h

So this is using pd.Series.where ~df['county'].isin(c) selects rows that are not in the list c (the ~ at the start is the 'not' operation), the second argument is the value to replace with (when the condition is False).

To fit your example:

nypopdf['County'] = nypopdf['County'].where(~nypopdf['County'].isin(nyCounties), 'New York')

or

nypopdf['County'].where(~nypopdf['County'].isin(nyCounties), 'New York', inplace=True)

Complete example:

nypopdf = pd.DataFrame({'County': ['Albany', 'Allegheny', 'Bronx']})
nyCounties = ["Kings", "Queens", "Bronx", "Richmond", "New York"]
print(nypopdf)
      County
0     Albany
1  Allegheny
2      Bronx
nypopdf['County'].where(~nypopdf['County'].isin(nyCounties), 'New York', inplace=True)
print(nypopdf)
      County
0     Albany
1  Allegheny
2   New York
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.