How to replace specific entries of a NumPy array based on its content

Question

Let's say I have a simple matrix stored as an ndarray (just an example of what part of the data might look like):

import numpy as np
a = np.asarray([['1.0', 'Miami'],
   ['2.0', 'Boston'],
   ['1.4', 'Miami']])

I want to do data analysis on this complex data set ;) - I want to transform 'Miami' into 0 and 'Boston' into 1 in order to use a really fancy ML algorithm.

What is a good way to accomplish this in Python?
I am not asking for the obvious approach of iterating and using a dictionary or an if statement to replace each entry; I'd like to know whether there's a better way using NumPy or native Python.

Andy Hayden · Accepted Answer · 2013-06-16 19:27:32Z

pandas is a good tool for this.
First convert the array to a DataFrame:

In [11]: import pandas as pd

In [12]: df = pd.DataFrame(a, columns=['value', 'city'])

and then replace entries from the city column:

In [13]: df.city = df.city.replace({'Miami': 0, 'Boston': 1})

In [14]: df
Out[14]:
  value city
0   1.0    0
1   2.0    1
2   1.4    0
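Note that `a` is an array of strings, so the value column above still holds strings after the conversion. A minimal self-contained sketch of the same idea follows, assuming you also want numeric values; it uses `Series.map` instead of `replace` (a swapped-in alternative, not what the answer above uses), and the `city_codes` name is only illustrative:

```python
import numpy as np
import pandas as pd

a = np.asarray([['1.0', 'Miami'],
                ['2.0', 'Boston'],
                ['1.4', 'Miami']])

# Illustrative mapping taken from the question: Miami -> 0, Boston -> 1.
city_codes = {'Miami': 0, 'Boston': 1}

df = pd.DataFrame(a, columns=['value', 'city'])

# The source array holds strings, so convert the numeric column explicitly.
df['value'] = df['value'].astype(float)

# Look up each city label in the dictionary; labels missing from
# city_codes would become NaN rather than being left unchanged.
df['city'] = df['city'].map(city_codes)
```

With `map`, a city that is missing from the dictionary shows up as NaN, which makes an incomplete mapping easy to spot before the data reaches a model.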


1 Comment

mfcabrera Over a year ago

I thought there was a way of doing it really clean without using a library. But this looks OK. I am going to start using Pandas.
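For reference, a NumPy-only variant is possible as well. The following is a rough sketch, not part of the accepted answer: it assumes the Miami -> 0, Boston -> 1 mapping from the question, and the names `city_codes`, `lookup` and `numeric` are made up for illustration. It relies on `np.unique(..., return_inverse=True)`, so the Python-level loop runs only over the distinct labels, not over every row:

```python
import numpy as np

a = np.asarray([['1.0', 'Miami'],
                ['2.0', 'Boston'],
                ['1.4', 'Miami']])

# Illustrative mapping taken from the question: Miami -> 0, Boston -> 1.
city_codes = {'Miami': 0, 'Boston': 1}

cities = a[:, 1]

# labels: sorted distinct city names; inverse: for each row, the index
# of its city within labels.
labels, inverse = np.unique(cities, return_inverse=True)

# Build a lookup table aligned with labels, then index it with inverse
# to get one code per row.
lookup = np.array([city_codes[label] for label in labels])
codes = lookup[inverse]

# Combine the float values and the integer codes into one numeric array.
values = a[:, 0].astype(float)
numeric = np.column_stack([values, codes])
```

If any integer labels will do, `inverse` can be used directly, but the codes then follow the sorted order of the names (Boston before Miami) and not the order asked for in the question.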
