
I want to break down a URL and extract the information I need. Breaking up the URL was easy, but I'm not sure how to extract the part I need.

Below is the code that breaks up the URL. I want to extract the destination ID and save it in dest_id. In the example URL below, it is '1504033' (the value after 'destination-id=').

url = 'https://www.hotels.com/search.do?resolved-location=CITY%3A1504033%3AUNKNOWN%3AUNKNOWN&destination-id=1504033&q-destination=Las%20Vegas,%20Nevada,%20United%20States%20of%20America&q-check-in=2019-10-12&q-check-out=2019-10-13&q-rooms=1&q-room-0-adults=2&q-room-0-children=0'
url_break = url.split('%')

I know how to access an element by index, but that won't always work, because the part I need isn't always at index 5 (it can be at index 3 or 4).

1 Answer

Do not split the URL yourself; use a library designed for parsing URLs, such as urllib.parse from the standard library:

url = 'https://www.hotels.com/search.do?resolved-location=CITY%3A1504033%3AUNKNOWN%3AUNKNOWN&destination-id=1504033&q-destination=Las%20Vegas,%20Nevada,%20United%20States%20of%20America&q-check-in=2019-10-12&q-check-out=2019-10-13&q-rooms=1&q-room-0-adults=2&q-room-0-children=0'

from urllib import parse

k = parse.urlsplit(url)
params = parse.parse_qs(k.query) 

print(params) 

Output:

{'resolved-location': ['CITY:1504033:UNKNOWN:UNKNOWN'], 
 'destination-id': ['1504033'], 
 'q-destination': ['Las Vegas, Nevada, United States of America'], 
 'q-check-in': ['2019-10-12'], 'q-check-out': ['2019-10-13'], 
 'q-rooms': ['1'], 'q-room-0-adults': ['2'], 'q-room-0-children': ['0']}

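and access the dictionary. parse_qs maps each key to a list of values, so take the first element to get the string itself:

dest_id = params["destination-id"][0]  # '1504033'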
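
The steps above can be wrapped in a small helper. This is only a minimal sketch: the function name get_query_value and its default argument are hypothetical and not part of urllib, and the URL is a shortened version of the one from the question. It returns the first value for a key regardless of where that key appears in the query string.

```python
from urllib import parse


def get_query_value(url, key, default=None):
    """Return the first value of `key` in the URL's query string, or `default`.

    Hypothetical helper for illustration; not part of urllib.
    """
    query = parse.urlsplit(url).query        # everything after the '?'
    values = parse.parse_qs(query).get(key)  # list of values, or None if absent
    return values[0] if values else default


# Shortened version of the URL from the question.
url = ('https://www.hotels.com/search.do'
       '?resolved-location=CITY%3A1504033%3AUNKNOWN%3AUNKNOWN'
       '&destination-id=1504033&q-rooms=1')

dest_id = get_query_value(url, 'destination-id')
print(dest_id)  # 1504033
```

Because the lookup is by parameter name, it does not matter whether destination-id is the second, third, or last parameter in the URL.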