I have a fairly simple question, but I can't find a clean way to solve it. I would like to delete a number of rows from my dataframe based on their value in a specific column (id), but each entry in my list should delete only one matching row (preferably chosen at random). Here is an example:
I have the following list of ids that I want to delete:
idsToDelete = [1,2,2,3,3]
In other words, I would like to delete one random row with id 1, two random rows with id 2, and two random rows with id 3.
I have the following dataframe:
import numpy as np
import pandas as pd

list1 = np.array([[1,0],[1,0],[2,0],[2,0],[2,0],[2,0],[3,0],[3,0],[3,0]])
df = pd.DataFrame(list1, columns=["id","class"])

id | class
------ | ------
1 | 0
1 | 0
2 | 0
2 | 0
2 | 0
2 | 0
3 | 0
3 | 0
3 | 0

My goal is to get this dataframe:

id | class
------ | ------
1 | 0
2 | 0
2 | 0
3 | 0

Any ideas?
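
One possible approach, as a minimal sketch rather than a definitive answer: count how often each id appears in idsToDelete, then pick that many row labels per id at random (without replacement) and drop them. The names counts, rng, to_drop and result, and the fixed seed, are illustrative and not part of the question. Capping the count with min() is an assumption about what should happen if the list asks for more rows than exist for an id.

```python
from collections import Counter

import numpy as np
import pandas as pd

list1 = np.array([[1, 0], [1, 0], [2, 0], [2, 0], [2, 0], [2, 0], [3, 0], [3, 0], [3, 0]])
df = pd.DataFrame(list1, columns=["id", "class"])
idsToDelete = [1, 2, 2, 3, 3]

# Number of rows to remove per id: {1: 1, 2: 2, 3: 2}
counts = Counter(idsToDelete)

# Seed chosen only to make the sketch reproducible (assumption)
rng = np.random.default_rng(0)

to_drop = []
for id_value, n in counts.items():
    # Index labels of all rows that carry this id
    candidates = df.index[df["id"] == id_value].to_numpy()
    # Assumption: never try to remove more rows than exist for this id
    n = min(n, len(candidates))
    if n > 0:
        # Sample n distinct row labels at random
        to_drop.extend(rng.choice(candidates, size=n, replace=False).tolist())

# Drop the sampled rows and renumber the remaining ones
result = df.drop(index=to_drop).reset_index(drop=True)
print(result)
```

With the sample data, result keeps one row with id 1, two rows with id 2 and one row with id 3. Which of the rows are removed depends on the seed.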