How to find a specific string containing a substring in Python

Question

I have a similar problem to the one in this question: find position of a substring in a string

The difference is that I don't know what my "mystr" is. I know my substring, but the string in the input file could be any number of words in any order. I only know that one of those words includes the substring cola.

For example a csv file: fanta,coca_cola,sprite in any order.

If my substring is "cola", then how can I write code that does

mystr.find('cola')

or

match = re.search(r"[^a-zA-Z](cola)[^a-zA-Z]", mystr)

or

if "cola" in mystr

When I don't know what my "mystr" is?

this is my code:

import csv

with open('first.csv', 'rb') as fp_in, open('second.csv', 'wb') as fp_out:
        reader = csv.DictReader(fp_in)
        rows = [row for row in reader]
        writer = csv.writer(fp_out, delimiter = ',')

        writer.writerow(["new_cola"])

        def headers1(name):
            if "cola" in name:
                    return row.get("cola")


        for row in rows:
                writer.writerow([headers1("cola")])

and the first.csv:

fanta,cocacola,banana
0,1,0
1,2,1

so it prints out

new_cola
""
""

when it should print out

new_cola
1
2

What do these numbers in first.csv mean? Are they the desired results?
– user3378649, Commented Apr 15, 2014 at 11:01
You should explain how you get mystr and why you expect 1, 2 under "new_cola".
– user3378649, Commented Apr 15, 2014 at 11:04
When you call headers1("cola"), of course "cola" in name is true: name == "cola"! I think you need to rethink your approach. Try looking at what is actually in rows. mystr is just a filler variable; it is whatever string you are trying to process, in this case name.
– jonrsharpe, Commented Apr 15, 2014 at 11:05
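
To make the last comment concrete, here is a minimal Python 3 sketch of what each row from csv.DictReader looks like for the first.csv shown in the question. The in-memory io.StringIO source and the variable names are assumptions made for illustration. The keys are the full header names, so an exact lookup for "cola" finds nothing, and the substring test has to be applied to each key instead.

```python
import csv
import io

# Sample data copied from the question's first.csv
data = "fanta,cocacola,banana\n0,1,0\n1,2,1\n"

reader = csv.DictReader(io.StringIO(data))
first_row = next(reader)

# The keys are the complete header names, so "cola" is not one of them
print(first_row.get("cola"))  # None

# The substring test has to run against each header name
matching = [key for key in first_row if "cola" in key]
print(matching)  # ['cocacola']
```

A missing key returns None, which the csv writer emits as an empty field. That matches the empty output shown in the question.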

Stephan Kulla · Accepted Answer · 2014-04-15 12:55:30Z


Here is a working example (Python 2):

import csv

with open("first.csv", "rb") as fp_in, open("second.csv", "wb") as fp_out:
        reader = csv.DictReader(fp_in)
        writer = csv.writer(fp_out, delimiter = ",")

        writer.writerow(["new_cola"])

        def filter_cola(row):
            for k,v in row.iteritems():
                if "cola" in k:
                    yield v

        for row in reader:
            writer.writerow(list(filter_cola(row)))

Notes:

rows = [row for row in reader] is unnecessary and inefficient: it copies every row from the reader into a list, which consumes a lot of memory for large files.
Instead of return row.get("cola") you meant return row.get(name).
In the statement return row.get("cola") the function relies on row, a variable defined outside its own scope.
You can also use the Unix tool cut if you already know the column's position (here the second field). For example:
```
cut -d "," -f 2 < first.csv > second.csv
```
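
The working example above targets Python 2 (binary file modes and iteritems). A rough Python 3 equivalent might look like the sketch below. The function names filter_columns and copy_matching_columns and their parameters are made up for illustration and are not part of the original answer.

```python
import csv

def filter_columns(row, substring):
    """Yield the values whose column name contains substring."""
    for key, value in row.items():
        if substring in key:
            yield value

def copy_matching_columns(in_path, out_path, substring, header):
    """Write the values of all matching columns to out_path under one header."""
    # In Python 3 the csv module works on text files opened with newline=""
    with open(in_path, newline="") as fp_in, \
            open(out_path, "w", newline="") as fp_out:
        reader = csv.DictReader(fp_in)
        writer = csv.writer(fp_out)
        writer.writerow([header])
        # Iterate over the reader directly instead of building a list first
        for row in reader:
            writer.writerow(list(filter_columns(row, substring)))
```

With the first.csv from the question, calling copy_matching_columns("first.csv", "second.csv", "cola", "new_cola") should write the header new_cola followed by the values 1 and 2.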


2 Comments

user3454635 Over a year ago

Thanks, this was helpful, but what if I had two filters? for row in reader: writer.writerow(list(filter_cola(row)), list(filter_fanta(row))) gives me an error (writerow takes only 1 argument). What am I not understanding here?

Stephan Kulla Over a year ago

writer.writerow(list(filter_cola(row)) + list(filter_fanta(row))) – you have to concatenate the two returned lists with +
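
A small Python 3 sketch of that concatenation, assuming a single hypothetical helper filter_by_substring in place of the separate filter_cola and filter_fanta functions, and an in-memory buffer instead of second.csv:

```python
import csv
import io

def filter_by_substring(row, substring):
    """Return the values whose column name contains substring."""
    return [value for key, value in row.items() if substring in key]

# One row as csv.DictReader would produce it for the question's first.csv
row = {"fanta": "0", "cocacola": "1", "banana": "0"}

buffer = io.StringIO()
writer = csv.writer(buffer)

# writerow expects one sequence, so the two lists are joined with + first
writer.writerow(filter_by_substring(row, "cola") + filter_by_substring(row, "fanta"))
print(buffer.getvalue())
```

The values appear in the order the lists are concatenated, so the cola column comes before the fanta column here.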
