I have a string containing multiple URLs that I extracted using BeautifulSoup, and I want to split each of these URLs to extract the year and date (both appear in the URL path).
print(dat)
http://www.foo.com/2016/01/0124
http://www.foo.com/2016/02/0122
http://www.foo.com/2016/02/0426
http://www.foo.com/2016/03/0129
.
.
I tried the following but it only retrieves the first:
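from urlparse import urlparse  # Python 2; on Python 3: from urllib.parse import urlparse
parsed = urlparse(dat)
path = parsed[2]  # the path component, i.e. everything after www.foo.com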
pathlist = path.split("/")
['', '2016', '01', '0124']
So I am only getting a result for the first URL in the string. How can I parse all of the URLs and store the results so I can extract information from them? I would like to know how many links there are for each year and month.
Also, strangely, after doing this, print(dat) only shows the first element, http://www.foo.com/2016/01/0124. It seems that urlparse does not work on multiple URLs at once.
Use a for loop around your script. What is dat (what is the result of type(dat))?
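For illustration, here is a minimal sketch of that loop. It assumes dat is a single string with one URL per line, as the print(dat) output in the question suggests; if type(dat) turns out to be a list, iterate over it directly instead of splitting. The helper name split_url_paths is made up for this example, and the import is the Python 3 one.

```python
from collections import Counter
from urllib.parse import urlparse  # Python 2: from urlparse import urlparse

# Assumed sample: `dat` is one newline-separated string, as in the question.
dat = """http://www.foo.com/2016/01/0124
http://www.foo.com/2016/02/0122
http://www.foo.com/2016/02/0426
http://www.foo.com/2016/03/0129"""


def split_url_paths(text):
    """Return the non-empty path segments of every URL in `text`."""
    paths = []
    for url in text.split():  # split on any whitespace, including newlines
        parts = urlparse(url).path.split("/")
        paths.append([p for p in parts if p])  # drop the empty leading segment
    return paths


paths = split_url_paths(dat)

# Count links per (year, month); assumes every path starts with year/month.
counts = Counter((p[0], p[1]) for p in paths)

for (year, month), n in sorted(counts.items()):
    print(year, month, n)
```

urlparse handles one URL per call, which is why each URL is parsed separately inside the loop. With the four sample URLs above, this prints one link for 2016/01, two for 2016/02, and one for 2016/03.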