
I have a 1D numpy array containing some audio data. I'm doing some processing and want to replace certain parts of the data with white noise. The noise should, however, be shorter than the replaced part. Generating the noise is not a problem, but I'm wondering what the easiest way to replace the original data with the noise is. My first thought of doing data[10:110] = noise[0:10] does not work due to the obvious dimension mismatch.

What's the easiest way to replace a part of a numpy array with another part of different dimension?

edit: The data is uncompressed PCM data that can be up to an hour long, taking up a few hundred MB of memory. I would like to avoid creating any additional copies in memory.

2 Answers

5

What advantage does a numpy array have over a python list for your application? I think one of the weaknesses of numpy arrays is that they are not easy to resize:

http://mail.python.org/pipermail/python-list/2008-June/1181494.html

Do you really need to reclaim the memory from the segments of the array you're shortening? If not, maybe you can use a masked array:

http://docs.scipy.org/doc/numpy/reference/maskedarray.generic.html

When you want to replace a section of your signal with a shorter section of noise, replace the first chunk of the signal, then mask out the remainder of the removed signal.
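The masked-array idea above could look roughly like this. It is only a sketch: the ten-sample array stands in for the audio and the two -1.0 values stand in for the noise, neither of which comes from the question. Keep in mind that the mask is itself a boolean array with one byte per sample, so this approach is not entirely free of extra memory.

```python
import numpy as np

data = np.arange(10, dtype=np.float64)   # stand-in for the audio samples (assumption)
noise = np.array([-1.0, -1.0])           # stand-in for the white noise (assumption)

masked = np.ma.masked_array(data)        # wraps data without copying it
masked[2:4] = noise                      # overwrite the first chunk of the removed region
masked[4:7] = np.ma.masked               # hide the rest of the removed region

print(masked.count())                    # number of samples that are still visible

# compressed() builds a new 1D array of the unmasked samples, so only call it
# when a contiguous result is really needed.
shortened = masked.compressed()
print(shortened)
```

Reductions such as masked.mean() skip the hidden samples, so for many processing steps the compressed copy may not be needed at all.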

EDIT: Here's some clunky numpy code that doesn't use masked arrays, and doesn't allocate more memory. It also doesn't free any memory for the deleted segments. The idea is to replace data that you want deleted by shifting the remainder of the array, leaving zeros (or garbage) at the end of the array.

import numpy
a = numpy.arange(10)
# [0 1 2 3 4 5 6 7 8 9]
## Replace a[2:7] with length-2 noise:
insert = -1 * numpy.ones((2))
new = slice(2, 4)
old = slice(2, 7)
#Just to indicate what we'll be replacing:
a[old] = 0
# [0 1 0 0 0 0 0 7 8 9]
a[new] = insert
# [0 1 -1 -1 0 0 0 7 8 9]
#Shift the remaining data over:
a[new.stop:(new.stop - old.stop)] = a[old.stop:]
# [0 1 -1 -1 7 8 9 7 8 9]
#Zero out the dangly bit at the end:
a[(new.stop - old.stop):] = 0
# [0 1 -1 -1 7 8 9 0 0 0]
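The same shifting trick can be wrapped in a small helper. This is a sketch, not a tested library routine: the name replace_shorter and its arguments are made up for illustration. It computes the new length explicitly instead of using a negative slice bound, which also keeps it working when the insert is exactly as long as the replaced region.

```python
import numpy as np

def replace_shorter(a, start, stop, insert):
    """Overwrite a[start:stop] with the shorter `insert`, shift the tail left
    in place, zero the leftover end, and return the new logical length.
    (Hypothetical helper, named here only for illustration.)"""
    n = len(insert)
    removed = (stop - start) - n          # how many samples the signal shrinks by
    if removed < 0:
        raise ValueError("insert must not be longer than the replaced region")
    new_len = len(a) - removed
    a[start:start + n] = insert           # write the replacement
    a[start + n:new_len] = a[stop:]       # shift the remaining data over
    a[new_len:] = 0                       # zero out the unused end
    return new_len

a = np.arange(10)
new_len = replace_shorter(a, 2, 7, np.array([-1, -1]))
trimmed = a[:new_len]                     # a view on the same buffer, not a copy
print(trimmed)
```

Slicing with a[:new_len] gives a view, so the shortened signal can be passed on without duplicating the samples. One caveat: when the shifted source and destination ranges overlap, NumPy may, as far as I know, buffer the tail internally to keep the result correct, so for very large arrays the shift itself is not guaranteed to be allocation-free.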

2 Comments

The library I use to read audio data returns a numpy array. Additionally the data is hundreds of megabytes in size, so I think the overhead of using a normal list would be quite large in this case.
Ok, I added a hacky way to replace part of the array with a shorter array.
0

not entirely familiar with numpy but can't you just break down the data array into pieces that are the same size as the noise array and set each data piece to the noise piece. for example:

data[10:20] = noise[0:10]
data[20:30] = noise[0:10]

etc., etc.?

you could loop like this:

for x in range(10, 110, 10):
    data[x:10+x] = noise[0:10]

UPDATE:

if you want to shorten the original data array, you could do this:

data = numpy.concatenate((data[:10], noise[:10]))

this will truncate the data array and append the noise after the 10th sample (with numpy arrays a plain + would add the two pieces element by element instead of joining them). you could then append the rest of the original data array to the new array if you need it.
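A minimal sketch of the full splice with numpy.concatenate, which joins the pieces end to end (on numpy arrays, + is element-wise addition, not joining). The array sizes and the int16 dtype are assumptions chosen only to mirror the data[10:110] example from the question.

```python
import numpy as np

data = np.arange(200, dtype=np.int16)    # stand-in for the PCM samples (assumption)
noise = np.zeros(10, dtype=np.int16)     # stand-in for the generated noise (assumption)

# Keep data[:10], put 10 noise samples where data[10:110] was, keep the rest.
shortened = np.concatenate((data[:10], noise[:10], data[110:]))

print(len(shortened))
# concatenate allocates a fresh array, so the result does not share memory
# with the original data.
print(np.shares_memory(shortened, data))
```

Because the result is a new array, both the original and the shortened copy exist in memory until the original is released.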

4 Comments

I might have been unclear. I don't want to repeat the noise. I want to remove the original data and instead put in the noise. After the operation the original array should be shorter than in the beginning because the noise part is shorter than the replaced part.
check my update, maybe i don't understand the question fully. but have a look and let me know.
Thanks, that is what I need. But I think this will create a completely new array, which would be a waste of memory since it's pretty much the same data as in the original.
actually, now that i think of it, you can just overwrite the original data array. no wasted memory.
