Replace values in array using mask and other array

Question

I have a 1D "from"-array (call it "frm") containing values with an associated Boolean mask-array: "mask" (same shape as frm). Then I have a third "replace" array: "repl", also 1D but shorter in length than the other two.

With these, I would like to generate a new array ("to") which contains the frm values except where mask==True, in which case it should take the values from repl, in order. (Note that the number of True elements in mask equals the length of repl.)

I was looking for a "clever" numpy way of implementing this? I looked at methods like np.where, np.take, np.select, np.choose but none seem to "fit the bill"?

"Cutting to the code", here's what I have thus far. It works fine but doesn't seem "Numpythonic"? (or even Pythonic for that matter)

frm  = [1, 2, 3, 4, 5]
mask = [False, True, False, True, True]
repl = [200, 400, 500]
i = 0; to = []
for f,m in zip(frm,mask):
    if m:
        to.append(repl[i])
        i += 1
    else:
        to.append(f)
print(to)

Yields: [1, 200, 3, 400, 500]

(Background: the reason I need to do this is that I'm subclassing the Pandas pd.DataFrame class and need a "setter" for the Columns/Index. As a pd.Index does not support item assignment, I need to first copy the index/column array, replace some of the elements in the copy based on the mask, and then have the setter set the complete new value. Let me know if anyone knows a more elegant solution to this.)

sacuL · Accepted Answer · 2018-08-31 22:42:08Z


`numpy` solution:

It's pretty straightforward, like this:

import numpy as np

# convert frm to a numpy array:
frm = np.array(frm)
# create a copy of frm so you don't modify the original array:
to = frm.copy()

# index to with the mask, and assign your replacement values:
to[mask] = repl

Then to returns:

>>> to
array([  1, 200,   3, 400, 500])
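As a minimal sketch, the same masked assignment can be wrapped in a small helper that also checks the precondition stated in the question (the number of True elements in mask equals the length of repl). The function name replace_masked and its error messages are hypothetical, not part of numpy:

```python
import numpy as np

def replace_masked(frm, mask, repl):
    """Return a copy of frm with the masked positions filled from repl, in order.

    Hypothetical helper for illustration; only the indexing step comes
    from the answer above.
    """
    frm = np.asarray(frm)
    mask = np.asarray(mask, dtype=bool)
    repl = np.asarray(repl)
    if mask.shape != frm.shape:
        raise ValueError("mask must have the same shape as frm")
    if np.count_nonzero(mask) != repl.size:
        raise ValueError("repl must have one value per True element in mask")
    to = frm.copy()   # leave the input untouched
    to[mask] = repl   # boolean-mask assignment fills positions in order
    return to

frm = [1, 2, 3, 4, 5]
mask = [False, True, False, True, True]
repl = [200, 400, 500]
to = replace_masked(frm, mask, repl)
print(to)
```

The explicit length check is a design choice: it fails with a clear message before any assignment happens, rather than relying on numpy's own shape-mismatch error.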

`pandas` solution:

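If your values are in a DataFrame column (named 'column' here), you can use loc:

df.loc[mask,'column'] = repl

This replaces the values of 'column' in the rows where mask is True with the values from repl, in order.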
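A minimal sketch of the loc approach. The example frames from the answer are not shown, so the data here is an assumption that reuses the values from the question; the column name 'column' is taken from the answer's snippet:

```python
import pandas as pd

# Assumed example data: the question's frm values in a single column.
df = pd.DataFrame({"column": [1, 2, 3, 4, 5]})
mask = [False, True, False, True, True]
repl = [200, 400, 500]

# Assign repl, in order, to the rows where mask is True.
df.loc[mask, "column"] = repl
print(df)
```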

edited Aug 31, 2018 at 22:42

answered Aug 31, 2018 at 22:34


3 Comments

Hans Bouwmeester Over a year ago

Re the numpy solution: Nice! Here I am looking for "special methods" completely overlooking the fact that I can simply assign to a variable using the mask for the indexing! :-)

Hans Bouwmeester Over a year ago

Re the Pandas solution: I'm aware of using "loc" for this for the contents of a DataFrame. As far as I can tell, there's no equivalent for the Axes (the "index" and "column" names, not the actual values inside the dataframe). For example: df.columns[3] works. But df.columns[3] = "new-name" gives TypeError: "Index does not support mutable operations" (which prompted me to take it into numpy for the solution).

sacuL Over a year ago

Oh I guess I misunderstood what you were going for... yeah it's probably best to do it in numpy and use the resulting array as your index (IIUC)
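The approach agreed on in these comments could be sketched as follows: copy the column labels into a numpy array, replace the masked labels there, and assign the whole array back. The DataFrame contents and the label names are assumptions made up for illustration:

```python
import numpy as np
import pandas as pd

# Assumed example frame; only the column labels matter here.
df = pd.DataFrame([[1, 2, 3, 4, 5]], columns=["a", "b", "c", "d", "e"])
mask = np.array([False, True, False, True, True])
repl = ["B", "D", "E"]

# An Index is immutable, so work on a writable copy of its labels.
cols = df.columns.to_numpy(copy=True)
cols[mask] = repl

# Assigning a full array of labels replaces the Index in one step.
df.columns = cols
print(list(df.columns))
```

The same pattern should apply to df.index, since it is also a pd.Index.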
