I am trying to parse the EPG data at the link below. When I fetch the HTML with the code below, all the programme data is missing. I realise this is because it is loaded asynchronously by JavaScript, but I cannot work out in Chrome DevTools which request is the API call, as the page loads a lot of resources at once:

import requests

url = 'https://mi.tv/ar/programacion/lunes'
headers = {
    'Accept': 'text/html, */*; q=0.01',
    'Referer': url,  # the page the request would be issued from
    'sec-ch-ua': '" Not A;Brand";v="99", "Chromium";v="90", "Google Chrome";v="90"',
    'sec-ch-ua-mobile': '?0',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.212 Safari/537.36',
    'X-KL-Ajax-Request': 'Ajax_Request',
    'X-Requested-With': 'XMLHttpRequest'
    }

r = requests.get(url=url, headers=headers)
rr = r.text
print(rr)

Can anyone identify the correct API endpoint for me? I can see API parameters in the HTML, but I have not been able to assemble them into a working URL, and I cannot see any request with that URL root in Chrome DevTools.

1 Answer

The following shows the correct URL to request and how to return the listings as a dict keyed by channel name:

import requests
from bs4 import BeautifulSoup as bs
from pprint import pprint

headers = {'User-Agent': 'Mozilla/5.0'}
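# The programme data comes from the async guide endpoint, not from the page URL
r = requests.get('https://mi.tv/ar/async/guide/all/lunes/60', headers=headers)
soup = bs(r.content, 'lxml')
# Map each channel name to a list of (time, title) pairs
listings = {c.select_one('h3').text: list(zip([i.text for i in c.select('.time')],
                                              [i.text for i in c.select('.title')]))
            for c in soup.select('.channel')}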
pprint(listings)
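
If you want to reuse this for other days, or check the parsing without hitting the site, the same logic can be split into two small functions. This is only a sketch under some assumptions: `guide_url` and `parse_listings` are hypothetical helper names, the idea that other day names can be swapped into the path is a guess based on the two URLs above, the trailing `60` is copied as-is because its meaning is not stated, and `SAMPLE_HTML` is made-up markup that only mimics the class names used in the selectors. It uses the built-in `html.parser` so that `lxml` is not required.

```python
from bs4 import BeautifulSoup


def guide_url(day):
    """Build the async guide URL for a day name such as 'lunes'.

    Assumption: only the day segment changes; 'ar' and the trailing
    '60' are kept exactly as they appear in the working URL.
    """
    return f"https://mi.tv/ar/async/guide/all/{day}/60"


def parse_listings(html):
    """Return {channel name: [(time, title), ...]} from guide markup."""
    soup = BeautifulSoup(html, "html.parser")
    listings = {}
    for channel in soup.select(".channel"):
        name = channel.select_one("h3")
        if name is None:
            # Skip channel blocks that have no heading to use as a key
            continue
        times = [t.get_text(strip=True) for t in channel.select(".time")]
        titles = [t.get_text(strip=True) for t in channel.select(".title")]
        # zip pairs the n-th time with the n-th title within one channel
        listings[name.get_text(strip=True)] = list(zip(times, titles))
    return listings


# Made-up markup for illustration only; the real response may differ
SAMPLE_HTML = """
<ul>
  <li class="channel">
    <h3>Canal A</h3>
    <span class="time">20:00</span><span class="title">Noticias</span>
    <span class="time">21:00</span><span class="title">Cine</span>
  </li>
  <li class="channel">
    <h3>Canal B</h3>
    <span class="time">20:30</span><span class="title">Futbol</span>
  </li>
</ul>
"""

if __name__ == "__main__":
    print(guide_url("lunes"))
    print(parse_listings(SAMPLE_HTML))
```

For the live data you would pass `requests.get(guide_url('lunes'), headers=headers).text` to `parse_listings`. Note that `zip` silently stops at the shorter list, so if a programme is ever missing a time or a title the pairs for that channel could be misaligned or truncated.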