Pandas code using regex to replace all single-quoted strings with another literal not working

Question

I am trying to update a CSV file in which each line contains multiple single-quoted strings, replacing each of those strings with a literal. However, the output puts all the data into just the first row. Can someone suggest what the issue is in the code below?

import pandas as pd
import re

df=pd.read_csv("t1.csv");
col1=df['col1']
col2=re.sub(r'\'([^\']*)\'','const',str(col1))
col3 = pd.Series(col2)

df['col1']=col3
df.to_csv('t_u.csv')
exit()

The file t1.csv has data like below:

col1
This one has 'many' 'such' 'quotes' in it.
Now it does not.
But 'this' 'one' does 'have' it 'too'.

The generated output looks like this, which is wrong because all the data ends up in a single row:

    col1
0   "0    This one has const const const in it.
1                              Now it does not.
2        But const const does const it const.
Name: col1, dtype: object"
1   
2

All three lines were combined into a single row in the final output, whereas I want the resulting CSV to keep the same format: three lines, with the required changes applied.

Please provide the output of df.to_dict('tight') after the import (df=pd.read_csv("t1.csv")) for reproducibility.
– mozway, Commented Apr 28, 2024 at 18:21
Hi mozway, thanks for your input. I added that line and it ran without any issues, but the code still gives the same output. That particular line did not generate any output when I ran the file.
– Nirav Shah, Commented Apr 28, 2024 at 18:30
You're doing str(col1), which converts the whole column into one string. Instead, see "pandas applying regex to replace values". See also the docs: "10 minutes to pandas", § String Methods.
– wjandrea, Commented Apr 28, 2024 at 18:54
BTW, welcome to Stack Overflow! Check out the tour, and How to Ask for tips, like how to write a good title. ("not working" is not descriptive enough.)
– wjandrea, Commented Apr 28, 2024 at 18:55
@NiravShah this command is not supposed to solve your issue; it returns a dictionary that you should add to your question as an edit. If you run a script, use print(df.to_dict('tight')). Most likely, you need df['col1'] = df['col1'].str.replace(r'\'([^\']*)\'', 'const', regex=True)
– mozway, Commented Apr 28, 2024 at 18:55
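
To illustrate the point made in the comments about str(col1), here is a minimal sketch. It assumes the same three rows as t1.csv, built inline rather than read from disk, and the names as_text, wrapped and broken are only illustrative. It shows that str() on a Series returns the printable representation of the whole column as one Python string, so wrapping it in pd.Series gives a one-element Series. Assigning that back aligns on the index, which leaves rows 1 and 2 empty.

```python
import pandas as pd

# Sample rows copied from the question's t1.csv (built inline for this sketch).
df = pd.DataFrame({"col1": [
    "This one has 'many' 'such' 'quotes' in it.",
    "Now it does not.",
    "But 'this' 'one' does 'have' it 'too'.",
]})

# str() on a Series returns its printed representation: one multi-line string
# that includes the index labels and the "Name: col1, ..." footer.
as_text = str(df["col1"])

# Wrapping that string in a Series gives a single element at index 0.
wrapped = pd.Series(as_text)

# Assignment aligns on the index, so only row 0 receives a value;
# rows 1 and 2 have no matching label and become NaN.
broken = df.copy()
broken["col1"] = wrapped

print(type(as_text).__name__)         # str
print(len(wrapped), len(df))          # 1 3
print(broken["col1"].isna().tolist()) # [False, True, True]
```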

mozway · Accepted Answer · 2024-04-28 18:59:10Z

You most likely want to use str.replace with a regex:

df['col1'] = df['col1'].str.replace(r'\'([^\']*)\'', 'const', regex=True)

Output:

0    This one has const const const in it.
1                         Now it does not.
2     But const const does const it const.
Name: col1, dtype: object
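
For completeness, here is a sketch of the whole round trip under a few assumptions: the CSV content is supplied through io.StringIO instead of the t1.csv file so the snippet is self-contained, the pattern is written with double quotes as r"'[^']*'" (it matches the same text as the escaped version above, without the unused capture group), and index=False is passed to to_csv so the row index is not written as an extra column. The last choice is a suggestion and was not part of the original code.

```python
import io

import pandas as pd

# Stand-in for t1.csv, using the sample rows from the question.
csv_text = (
    "col1\n"
    "This one has 'many' 'such' 'quotes' in it.\n"
    "Now it does not.\n"
    "But 'this' 'one' does 'have' it 'too'.\n"
)
df = pd.read_csv(io.StringIO(csv_text))

# Vectorised, per-row regex replacement: every '...' run becomes the word const.
df["col1"] = df["col1"].str.replace(r"'[^']*'", "const", regex=True)

# With no path given, to_csv returns the CSV text; index=False omits the 0, 1, 2 column.
out = df.to_csv(index=False)
print(out)
```

To write a file instead, pass a path, for example df.to_csv('t_u.csv', index=False). The result then has one header line followed by three data lines, matching the layout of the input.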

