Say I have a simple matrix stored as an ndarray (just an example of what part of the data might look like):

import numpy as np
a = np.asarray([['1.0', 'Miami'],
                ['2.0', 'Boston'],
                ['1.4', 'Miami']])

I want to do data analysis on this complex data set ;) - I want to map 'Miami' to 0 and 'Boston' to 1 so that I can use a really fancy ML algorithm.

What is a good way to accomplish this in Python?
(I am not asking for the obvious approach of iterating and using a dictionary or an if statement to replace each entry; I am asking whether there is a better way using NumPy or native Python.)

1 Answer

pandas is a good tool for this.
First convert the array to a DataFrame:

In [11]: import pandas as pd

In [12]: df = pd.DataFrame(a, columns=['value', 'city'])

and then replace entries from the city column:

In [13]: df.city = df.city.replace({'Miami': 0, 'Boston': 1})

In [14]: df
Out[14]:
  value city
0   1.0    0
1   2.0    1
2   1.4    0
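If you would rather not write the mapping by hand, here is a minimal sketch using pd.factorize, which assigns integer codes in order of first appearance (so 'Miami' becomes 0 and 'Boston' becomes 1 for this sample). The names codes and labels are just illustrative. Note that the value column is still made of strings after the DataFrame is built from a string ndarray, so the sketch also converts it to float, assuming that is what the ML step needs.

```python
import numpy as np
import pandas as pd

a = np.asarray([['1.0', 'Miami'],
                ['2.0', 'Boston'],
                ['1.4', 'Miami']])

df = pd.DataFrame(a, columns=['value', 'city'])

# factorize returns integer codes plus the distinct labels,
# numbered in the order they first appear in the column
codes, labels = pd.factorize(df['city'])
df['city'] = codes

# the numbers arrived as strings, so convert them explicitly
df['value'] = df['value'].astype(float)

print(df)
print(list(labels))
```

Keeping labels around lets you translate the codes back to city names later.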

1 Comment

I thought there was a really clean way of doing it without using a library. But this looks OK. I am going to start using pandas.
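For what it's worth, a NumPy-only version is possible with np.unique and return_inverse=True. This is just a sketch, not necessarily what the asker had in mind, and it comes with one caveat: np.unique sorts the labels, so here 'Boston' gets 0 and 'Miami' gets 1, the reverse of the mapping in the question. The names labels, codes, values and encoded are illustrative.

```python
import numpy as np

a = np.asarray([['1.0', 'Miami'],
                ['2.0', 'Boston'],
                ['1.4', 'Miami']])

# labels holds the sorted distinct cities; codes holds, for each row,
# the index of its city within labels
labels, codes = np.unique(a[:, 1], return_inverse=True)

# the first column is stored as strings, so convert it to float
values = a[:, 0].astype(float)

# combine both columns into a single numeric array
encoded = np.column_stack([values, codes])

print(labels)
print(encoded)
```

If the exact numbering matters (Miami as 0), an explicit mapping is still needed; the sketch only guarantees that each distinct city gets its own integer.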
