Replace strings from column with unique random strings

Question

I have a CSV file with multiple columns, one of which consists of strings.

I start by reading the CSV file and keeping just two columns:

df = pd.read_csv("MyDATA_otherstring.csv", usecols=["describe_file", "data_numbers"])

This is the output

                    describe_file  data_numbers
0  This is the start of the story        7309.0
1  This is the start of the story          35.0
2  This is the start of the story         302.0
3                  Difficult part        7508.5
4                  Difficult part         363.0

In around 10k rows, there are around 150 unique strings. These strings appear multiple times within the file.

My goal: filter by the first string, for example 'This is the start of the story', and replace it with a random string.

I want to run over all the strings in that column and replace each of them with a unique string.

I have looked into the random library and some questions that have been asked here, but unfortunately I have not found anything that would help me.

Please be more specific about what research you have done, and what you’ve tried. You could at the very least provide the data in a more convenient or practical format.
– AMC, Commented Mar 14, 2020 at 2:23

Nicolas Gervais · Accepted Answer · 2020-03-13 19:59:52Z

1

This is your example:

import pandas as pd
import numpy as np
from string import ascii_lowercase

df = pd.DataFrame([['This is the start of the story']*3 + ['Difficult part']*2, 
    np.random.rand(5)], index=['describe_file', 'data_numbers']).T

                    describe_file data_numbers
0  This is the start of the story     0.825913
1  This is the start of the story     0.704422
2  This is the start of the story      0.91563
3                  Difficult part     0.192693
4                  Difficult part     0.795088

This is how you can do it:

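# Draw one random 10-letter string per distinct value, then join it back onto every row
new_names = df.groupby('describe_file')['describe_file'].apply(
    lambda x: ''.join(np.random.choice(list(ascii_lowercase), 10)))
df.describe_file = df.join(new_names, on='describe_file', rsuffix='_NEW')['describe_file_NEW']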

The result:

  describe_file data_numbers
0    skgfdrsktw     0.204907
1    skgfdrsktw     0.399947
2    skgfdrsktw     0.990196
3    rziuoslpqn     0.930852
4    rziuoslpqn     0.210122
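If you need the replacements to be guaranteed distinct, one option is to build an explicit mapping from each original string to a random one and apply it with `Series.map`. With 10 random letters, two groups drawing the same string is extremely unlikely, but nothing in the one-liner above rules it out. The following is only a sketch: `random_token` and `anonymize_column` are hypothetical helper names, and the data is the sample from the question.

```python
import random
import string

import pandas as pd


def random_token(length, rng):
    """Return a random lowercase string of the given length."""
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def anonymize_column(values, length=10, seed=None):
    """Map every distinct value in a Series to its own random token.

    Returns the replaced Series and the mapping that was used.
    """
    rng = random.Random(seed)
    mapping = {}
    used = set()
    for value in values.unique():
        token = random_token(length, rng)
        # Redraw on the (very unlikely) chance of a collision.
        while token in used:
            token = random_token(length, rng)
        used.add(token)
        mapping[value] = token
    return values.map(mapping), mapping


df = pd.DataFrame({
    "describe_file": ["This is the start of the story"] * 3 + ["Difficult part"] * 2,
    "data_numbers": [7309.0, 35.0, 302.0, 7508.5, 363.0],
})

new_column, mapping = anonymize_column(df["describe_file"], seed=0)
df["describe_file"] = new_column
print(df)
```

Keeping `mapping` around also lets you translate the random strings back to the originals later, which the join-based version does not give you directly.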

edited Mar 13, 2020 at 19:59

answered Mar 13, 2020 at 19:32

Nicolas Gervais


7 Comments

E199504 Over a year ago

Thank you for your answer. However, I was trying to find something that would do this for all the strings. If I have to copy and paste every string into the code, I won't save any more time than by just using the find and replace option in Excel.

Nicolas Gervais Over a year ago

like this? (see answer). i.e., make random strings for an entire column?

E199504 Over a year ago

And the string "this is the start of the story" should be replaced by one string, not a different one every time.

E199504 Over a year ago

Example: replace every "this is the start of the story" with "kkbim", and every "difficult part" with some other string.

Nicolas Gervais Over a year ago

got it. hope this is what you're expecting (see edit)


Ruthger Righart · Accepted Answer · 2020-03-13 20:35:44Z

0

The previous answer by @Nicolas Gervais is fine, but after reading the question several times, I interpret it as asking to replace 'This is the start of the story' with a random string while leaving the rest ('Difficult part') as it is. The following command, which uses .replace(), does that.

df['describe_file'].apply(lambda x: x.replace('This is the start of the story', ''.join(np.random.choice(list(ascii_lowercase), 10))))

0        glhrtqwlnl
1        qxrklnxhoj
2        kszgtysptj
3    Difficult part
4    Difficult part
Name: describe_file, dtype: object
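Note that the lambda above draws a new random string for every row, which is why rows 0 to 2 all differ. If every occurrence should get the same replacement, as the comments on the question suggest, a possible variation is to generate the string once and pass it to `Series.replace`, which swaps whole values that match exactly. This is a sketch on the sample data; `target`, `token` and `replaced` are names chosen here for illustration.

```python
from string import ascii_lowercase

import numpy as np
import pandas as pd

df = pd.DataFrame({
    "describe_file": ["This is the start of the story"] * 3 + ["Difficult part"] * 2,
    "data_numbers": [7309.0, 35.0, 302.0, 7508.5, 363.0],
})

target = "This is the start of the story"

# Draw the random string once, outside any per-row function.
token = "".join(np.random.choice(list(ascii_lowercase), 10))

# Series.replace swaps values equal to `target`; other rows are untouched.
replaced = df["describe_file"].replace(target, token)
print(replaced)
```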

answered Mar 13, 2020 at 20:35

Ruthger Righart
