I'm analyzing sales data I got from receipts. All bought items are in one column as one string like this:

'1 x Sandwich, "2 x Coffee, with cream", 1 x Apple pie'

I want to separate the items so I can count how many were bought. A simple string.split(',') won't do, because some item names contain commas themselves. Luckily, those names are enclosed in double quotes and 'normal' names are not.

How can I replace the commas within double quotes and not the commas separating items?

If the commas inside names were changed into colons, for example, the string could then be split with string.split(','). So the desired output would be something like this:

'1 x Sandwich, "2 x Coffee: with cream", 1 x Apple pie'

There might be other solutions, but this problem got me thinking about replacing very specific characters.

    The csv module should be able to parse that. Commented May 11, 2022 at 15:12

3 Answers

1
text = '1 x Sandwich, "2 x Coffee, with cream", 1 x Apple pie'

def comma_changer(text):
    chars = list(text)
    quote_counter = 0
    for i, char in enumerate(chars):
        if char == '"':
            quote_counter += 1
        # An odd quote count means we are inside a quoted item name.
        elif char == "," and quote_counter % 2 == 1:
            chars[i] = ":"
    return "".join(chars)

comma_changer(text)  # '1 x Sandwich, "2 x Coffee: with cream", 1 x Apple pie'
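
The same idea can probably be written with a regular expression instead of a quote counter. This is only a sketch: the helper name replace_quoted_commas is made up for illustration, and it assumes the double quotes are balanced and never escaped inside an item name.

```python
import re

text = '1 x Sandwich, "2 x Coffee, with cream", 1 x Apple pie'

def replace_quoted_commas(s, replacement=":"):
    # Match every "..." span and swap commas only inside that match.
    # Assumes balanced, unescaped double quotes.
    return re.sub(r'"[^"]*"', lambda m: m.group(0).replace(",", replacement), s)

result = replace_quoted_commas(text)
print(result)  # 1 x Sandwich, "2 x Coffee: with cream", 1 x Apple pie
```

Commas outside the quoted spans are never matched, so they stay available as item separators.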
4 Comments

This did the trick! If I understand correctly: the function goes through the string character by character, and when it finds a double quote it starts changing commas to colons until it finds the next one. While the number of double quotes seen so far is odd, the code is still inside a product name and keeps replacing commas. Is my assessment correct?
Please explain your code and show us the output.
@larwain It is still not clear what you are asking for. Sorry.
@Iarwain Yep ;)
0

You need to tell it to split on a specific character. In this case, try string.split('"').
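
A rough sketch of how that could work, assuming the quotes are balanced: after splitting on '"', the pieces alternate between text outside quotes (even indexes) and text inside quotes (odd indexes), so only the odd pieces need their commas replaced. The variable names here are just for illustration.

```python
text = '1 x Sandwich, "2 x Coffee, with cream", 1 x Apple pie'

# Splitting on the quote character yields alternating pieces:
# even index = outside quotes, odd index = inside quotes.
parts = text.split('"')

# Replace commas only in the pieces that were inside quotes.
parts = [p.replace(",", ":") if i % 2 == 1 else p for i, p in enumerate(parts)]

# Put the quotes back by joining on the same character.
result = '"'.join(parts)
print(result)  # 1 x Sandwich, "2 x Coffee: with cream", 1 x Apple pie
```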

0

Your input is invalid because of one missing closing " and one missing opening ".

"1 x Sandwich, "2 x Coffee, with cream", 1 x Apple pie"
             ^                           ^

I am using Python's csv module here. The skipinitialspace option is very important because you have a space after each comma, which is unusual in CSV files.

#!/usr/bin/env python3
import io
import csv

your_invalid_input = '"1 x Sandwich, "2 x Coffee, with cream", 1 x Apple pie"'
valid_input        = '"1 x Sandwich", "2 x Coffee, with cream", "1 x Apple pie"'

# this simulates a file object
raw_data = io.StringIO(valid_input)

csv_reader = csv.reader(raw_data,
                        delimiter=',',
                        skipinitialspace=True)

for line in csv_reader:
    print(line)

The output is

['1 x Sandwich', '2 x Coffee, with cream', '1 x Apple pie']

1 Comment

I placed the outer quotes only to indicate a string. When reading the CSV with Pandas, I get no errors related to this. Everything between the first and last double quote is one string in a column. Also, I don't want to split up amounts and item names. Ideally, the output in list form would be ['1 x Sandwich', '2 x Coffee: with cream', '1 x Apple pie']
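
For what it's worth, the string exactly as given in the question seems to parse with the csv module too, as long as skipinitialspace=True is set. A minimal sketch, assuming each cell holds one such string; csv.reader accepts any iterable of lines, so a one-element list stands in for a file here.

```python
import csv

text = '1 x Sandwich, "2 x Coffee, with cream", 1 x Apple pie'

# skipinitialspace=True drops the space after each comma, so the quoted
# item is recognised as a quoted field and its inner comma is kept.
row = next(csv.reader([text], skipinitialspace=True))

# Optional: turn the commas inside item names into colons.
items = [item.replace(",", ":") for item in row]
print(items)  # ['1 x Sandwich', '2 x Coffee: with cream', '1 x Apple pie']
```

The amounts stay attached to the item names, since the only split happens on the commas outside the quotes.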
