I am reading a file from the web row by row, and each row comes back as a list. Each row has three columns separated by this pattern: +++$+++.
This is my code:
with closing(requests.get(url, stream=True)) as r:
    reader = csv.reader(codecs.iterdecode(r.iter_lines(), 'latin-1'))
    for i, row in enumerate(reader):
        if i < 5:
            t = row[0].split('(\s\+{3}\$\+{3}\s)+')
            print(t)
I have tried to split each row with this statement in Python 3.6 and can't get it to work. Any suggestions are much appreciated.
The rows:
['m0 +++$+++ 10 things i hate about you +++$+++ http://www.dailyscript.com/scripts/10Things.html']
['m1 +++$+++ 1492: conquest of paradise +++$+++ http://www.hundland.org/scripts/1492-ConquestOfParadise.txt']
['m2 +++$+++ 15 minutes +++$+++ http://www.dailyscript.com/scripts/15minutes.html']
['m3 +++$+++ 2001: a space odyssey +++$+++ http://www.scifiscripts.com/scripts/2001.txt']
['m4 +++$+++ 48 hrs. +++$+++ http://www.awesomefilm.com/script/48hours.txt']
This is my regex expression:
row[0].split('(\s\+{3}\$\+{3}\s)+')
Each row has only one element, row[0].
When I print the result, the row has not been split.
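.split() on a string isn't a regex match at all - it's literally looking for the string (\s\+{3}\$\+{3}\s)+ ! You want re.split(r'(\s\+{3}\$\+{3}\s)+', row[0]) instead. Note that because the pattern is wrapped in a capturing group, re.split also returns the separators in the result; write the group as (?:...) to drop them.
Alternatively, use row[0].split(" +++$+++ "), since nothing you're doing here appears to benefit from the power of regular expressions.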
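To make the difference concrete, here is a minimal sketch that runs the variants discussed above on one sample line copied from the question. The names line, SEP, unsplit, fields, fields_re and with_seps are illustrative only and do not come from the original code.

```python
import re

# One sample line copied from the question's data.
line = "m0 +++$+++ 10 things i hate about you +++$+++ http://www.dailyscript.com/scripts/10Things.html"

# The literal separator, including the surrounding spaces.
SEP = " +++$+++ "

# str.split treats its argument as a literal string, not a regex,
# so the pattern text is never found and the line comes back unsplit.
unsplit = line.split(r'(\s\+{3}\$\+{3}\s)+')

# Plain literal split: no regex needed.
fields = line.split(SEP)

# re.split with a non-capturing group returns only the fields.
fields_re = re.split(r'(?:\s\+{3}\$\+{3}\s)+', line)

# re.split with a capturing group also returns the matched separators.
with_seps = re.split(r'(\s\+{3}\$\+{3}\s)+', line)

print(len(unsplit), fields, len(with_seps))
```

For a fixed separator like this one, the plain str.split version is probably the simplest choice; the regex form only pays off if the separator can vary.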