I get an error when I try to parse "bloomberg" out of self.web_url. The type of self.web_url is unicode, so I assume that might be the reason. However, I do not know how to implement a type conversion, if one is necessary, or what else to do.

self.web_url = "http://www.bloomberg.com"
start = "http:/www."
end = ".com"
print type(self.web_url)
web_name = re.search('%s(.*)%s' % (start, end), self.web_url).group(1)

2 Answers


You get the error because there is no match: re.search returns None, and calling .group(1) on None fails. Your pattern is incorrect because it matches a single /, while there are two /s after http:. You need to fix the pattern as heemayl suggests, or use an alternative urlparse-based solution to get the netloc part and then take the text between the first and last dots (either with find and rfind, or with a regex):

import urlparse, re  # Python 2; in Python 3 the module is urllib.parse
path = urlparse.urlparse("http://www.bloomberg.com")
# Slice between the first and the last dot of the host name
print(path.netloc[path.netloc.find(".")+1:path.netloc.rfind(".")]) # => bloomberg
# or a regex:
print(re.sub(r"\A[^.]*\.(.*)\.[^.]*\Z", r"\1", path.netloc)) # => bloomberg
# or Regex 2:
mObj = re.search(r"\.(.*)\.", path.netloc)
if mObj:
    print(mObj.group(1)) # => bloomberg

Regex 1 - \A[^.]*\.(.*)\.[^.]*\Z - matches the start of the string (\A), then 0+ non-dot chars ([^.]*), then a dot (\.), then captures any 0+ chars other than a newline into Group 1 ((.*)), then matches a dot and 0+ non-dot chars up to the very end of the string (\Z).

Regex 2 just matches the first . followed by any 0+ chars up to the last ., capturing what is between the two dots into Group 1.
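To see the "no match" failure from the question in isolation, here is a minimal sketch. It assumes the pattern that the question's format string produces, and first_group_or_none is a hypothetical helper name, not part of re:

```python
import re

url = "http://www.bloomberg.com"

# Pattern built by the question's code: only one "/" after "http:",
# so it cannot match a URL that contains "http://".
broken = re.search(r"http:/www.(.*).com", url)
print(broken)  # None

error_message = None
try:
    broken.group(1)
except AttributeError as exc:
    # Calling .group on None is what raises the error.
    error_message = str(exc)
    print(error_message)


def first_group_or_none(pattern, text):
    """Hypothetical helper: return Group 1 of the first match, or None."""
    match = re.search(pattern, text)
    return match.group(1) if match else None


print(first_group_or_none(r"\.(.*)\.", "www.bloomberg.com"))  # bloomberg
```

Checking the match object before calling .group, as Regex 2 above does, avoids the exception whenever the input does not fit the pattern.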


You are missing a / in start:

start = 'http://www.'

Also note that . has a special meaning in regex: it is a token that matches any single character (except a newline by default), not a literal dot. You need to escape it as \. to make it literal.
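A quick way to check the difference, as a small sketch with made-up test strings (the variable names are only for illustration):

```python
import re

# Unescaped: "." matches any character, so "X" is accepted in its place.
unescaped_hit = bool(re.search(r"www.bloomberg", "wwwXbloomberg"))
# Escaped: "\." matches only a literal dot.
escaped_miss = bool(re.search(r"www\.bloomberg", "wwwXbloomberg"))
escaped_hit = bool(re.search(r"www\.bloomberg", "www.bloomberg"))

print(unescaped_hit, escaped_miss, escaped_hit)  # True False True
```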

So it is better to write (as raw strings, so that the backslash reaches the regex engine unchanged):

start = r"http://www\."
end = r"\.com"
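If you would rather not escape by hand, re.escape can do it for you. This is only a sketch: it assumes the start and end values from the question with the missing slash added, and a plain variable instead of self.web_url:

```python
import re

url = "http://www.bloomberg.com"
start = "http://www."  # plain text, not yet a regex
end = ".com"

# re.escape turns every special character (such as ".") into a literal.
pattern = "%s(.*)%s" % (re.escape(start), re.escape(end))
match = re.search(pattern, url)

# Guard against None so a non-matching URL does not raise an error.
web_name = match.group(1) if match else None
print(web_name)  # bloomberg
```

This keeps start and end readable as ordinary strings and leaves the escaping to the library.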

2 Comments

And what about a dot?
@WiktorStribiżew Which one?
