How to Get data-* attributes when web scraping using python requests (Python Requests Creating Some Issues)

Question

How can I get the value of data-d1-value when I am using the Python requests library?

requests.get(URL) does not return the data-* attributes on the div, even though they are present in the original web page.

The web page is as follows:

<div id="test1" class="class1" data-d1-value="150">
180
</div>

The code I am using is:

import requests
from bs4 import BeautifulSoup

req = requests.get(url)  # url is the address of the page above
soup = BeautifulSoup(req.text, 'lxml')
d1_value = soup.find('div', {'class':"class1"})
print(d1_value)

The result I get is:

<div id="test1" class="class1">
180
</div>

When I debugged this, I found that requests.get(URL) does not return the full div: only the id and class come back, not the data-* attributes.

How should I modify my code to get the full value?

For a better example, in my case the URL is: https://www.moneycontrol.com/india/stockpricequote/oil-drillingexploration/oilnaturalgascorporation/ONG

The div has class="inprice1 nsecp", and the value of data-numberanimate-value is what I am trying to fetch.

Thanks in advance :)

Thanks for adding information, but what is the URL or the response from the request? Just in case, please read this: How to create a Minimal, Reproducible Example. Thanks
– HedgeHog, Commented Jan 22, 2021 at 11:35
So, if there is no data attribute in the response, it might be that the website serves dynamic content that requests cannot get. To check this, please provide the URL you are requesting.
– HedgeHog, Commented Jan 22, 2021 at 11:42
Thanks, I'll read the link you just shared. In my case the link is: moneycontrol.com/india/stockpricequote/oil-drillingexploration/… The div has class="inprice1 nsecp", and the value of data-numberanimate-value is what I am trying to fetch.
– Xavier, Commented Jan 22, 2021 at 11:50
Thanks for improving the question, it looks much better and more detailed. Take a look at my edit, based on this new information.
– HedgeHog, Commented Jan 22, 2021 at 12:14

HedgeHog · Accepted Answer · 2021-01-22 12:12:13Z

1

EDIT

The website's response differs depending on how it is requested. In your case, using requests, the value you are looking for is served this way:

<div class="inprice1 nsecp" id="nsecp" rel="92.75">92.75</div>

So you can get it from the rel or from the text:

soup.find('div', {'class':"inprice1"})['rel']
soup.find('div', {'class':"inprice1"}).get_text()

Example

import requests
from bs4 import BeautifulSoup

req = requests.get('https://www.moneycontrol.com/india/stockpricequote/oil-drillingexploration/oilnaturalgascorporation/ONG')

soup = BeautifulSoup(req.text, 'lxml')

print('rel: ' + soup.find('div', {'class':"inprice1"})['rel'])
print('text: ' + soup.find('div', {'class':"inprice1"}).get_text())

Output

rel: 92.75
text: 92.75
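
Since the same div can carry the price in different places depending on how the page was fetched, a defensive lookup may be useful. The following is only a sketch: extract_price is a hypothetical helper name, and the fallback order (data-numberanimate-value, then rel, then the text) is an assumption based on the two responses described in this answer. It uses the built-in html.parser so it does not depend on lxml.

```python
from bs4 import BeautifulSoup


def extract_price(html):
    """Return the price from the first div with class inprice1, or None."""
    soup = BeautifulSoup(html, "html.parser")
    div = soup.find("div", class_="inprice1")
    if div is None:
        return None
    # Tag.get() returns None for a missing attribute instead of raising
    # KeyError, so the lookup can fall through to the next candidate.
    for name in ("data-numberanimate-value", "rel"):
        value = div.get(name)
        if value:
            return value
    # Last resort: the visible text of the div.
    return div.get_text(strip=True) or None


# The static markup that requests receives, as shown above.
static_html = '<div class="inprice1 nsecp" id="nsecp" rel="92.75">92.75</div>'
print(extract_price(static_html))
```

With this approach the same function works on both req.text and driver.page_source, and a missing attribute yields None rather than an exception.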

To get a response that matches the source you see when you inspect the page in the browser, you can use Selenium, which loads the page in a real browser:

Example

from selenium import webdriver
from bs4 import BeautifulSoup
from time import sleep

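# Raw string so the backslashes in the Windows path are not treated as escapes.
# Note: newer Selenium 4 releases expect a Service object instead of executable_path.
driver = webdriver.Chrome(executable_path=r'C:\Program Files\ChromeDriver\chromedriver.exe')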
url = "https://www.moneycontrol.com/india/stockpricequote/oil-drillingexploration/oilnaturalgascorporation/ONG"

driver.get(url)
sleep(2)

soup = BeautifulSoup(driver.page_source, "lxml")
print(soup.find('div', class_='inprice1 nsecp')['data-numberanimate-value'])
driver.close()

To get the attribute value, just add ['data-d1-value'] to the result of your find():

Example

from bs4 import BeautifulSoup

html='''
<div id="test1" class="class1" data-d1-value="150">
180
</div>
'''

soup = BeautifulSoup(html, 'lxml')
d1_value = soup.find('div', {'class':"class1"})['data-d1-value']
print(d1_value)
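
If the class is shared by several elements, it may help to select the div by the data attribute itself. The sketch below shows two ways that, as far as I know, BeautifulSoup supports: attrs={name: True} and a CSS attribute selector. The second div (id="test2") is a hypothetical addition to the sample markup, only there to show that it is skipped.

```python
from bs4 import BeautifulSoup

html = '''
<div id="test1" class="class1" data-d1-value="150">
180
</div>
<div id="test2" class="class1">
200
</div>
'''

soup = BeautifulSoup(html, "html.parser")

# attrs={name: True} matches any tag that has the attribute, whatever its value.
by_attrs = soup.find("div", attrs={"data-d1-value": True})

# The equivalent CSS attribute selector.
by_css = soup.select_one("div[data-d1-value]")

print(by_attrs["data-d1-value"])
print(by_css["id"])
```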

edited Jan 22, 2021 at 12:12

answered Jan 22, 2021 at 10:47

HedgeHog


4 Comments

Xavier Over a year ago

Yes, you are correct, but requests.get(URL) itself doesn't return the data attributes; it just returns: <div id="test1" class="class1"> 180 </div>

HedgeHog Over a year ago

@Xavier: Then you should improve your question and add this information, so that everybody knows and can help, please. Thanks

Xavier Over a year ago

I have done that, hope that clears some doubts.

HedgeHog Over a year ago

Take a look at my edit; I hope it helps you understand and decide which way to go.

sunilbaba · Accepted Answer · 2021-01-22 10:55:36Z

0

You are seeing this issue because you didn't retrieve the other attributes that were defined on the div.

The code below retrieves all of the attributes defined on the div, including the custom ones:

from bs4 import BeautifulSoup
s = '<div id="test1" class="class1" data-d1-value="150">180</div>'
soup = BeautifulSoup(s, 'html.parser')  # name the parser explicitly to avoid a warning

attributes_dictionary = soup.find('div',{'class':"class1"}).attrs
print(attributes_dictionary)
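
Building on the .attrs dictionary, here is a small sketch that keeps only the data-* entries. data_attributes is a hypothetical helper name, not part of BeautifulSoup.

```python
from bs4 import BeautifulSoup


def data_attributes(tag):
    """Return only the data-* attributes of a BeautifulSoup tag as a dict."""
    return {name: value for name, value in tag.attrs.items()
            if name.startswith("data-")}


s = '<div id="test1" class="class1" data-d1-value="150">180</div>'
div = BeautifulSoup(s, "html.parser").find("div", class_="class1")
print(data_attributes(div))
```

Note that this only helps when the attributes are actually present in the HTML that was parsed; it cannot recover attributes that the server never sent.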

answered Jan 22, 2021 at 10:55

sunilbaba


1 Comment

Xavier Over a year ago

Yes, you are correct, but requests.get(URL) itself doesn't return the data attributes; it just returns: <div id="test1" class="class1"> 180 </div>

DisappointedByUnaccountableMod · Accepted Answer · 2021-01-22 20:36:25Z

0

You can get the data from the HTML, or you can get it directly from the API that the page calls.

This is an example:

Website is: Money Control

If you open the developer tools in your browser and select the Network tab, you can see the requests that the website makes:

See image

In the request headers you can see the URL of the API: priceapi.moneycontrol.com.

This is an unusual case, because the API is open, and usually it isn't.

You can read the price from it:

If you parse the JSON response into a Python dict called payload, you can access the price with:

payload['data']['pricecurrent']
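
As a rough sketch of the parsing step in Python: the sample body below is hypothetical and contains only the data and pricecurrent keys mentioned above, and current_price is an illustrative helper name. The real response likely has more fields, and the exact endpoint path has to be copied from the Network tab.

```python
import json

# Hypothetical response body; only the data -> pricecurrent path comes
# from the answer above, the value is a placeholder.
sample_body = '{"data": {"pricecurrent": "92.75"}}'


def current_price(body):
    """Return data.pricecurrent from a JSON string, or None if it is missing."""
    payload = json.loads(body)
    data = payload.get("data") or {}  # tolerate a missing or null "data" key
    return data.get("pricecurrent")


print(current_price(sample_body))
```

When fetching with requests, response.json() would give the same dict as json.loads(response.text).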

edited Jan 22, 2021 at 20:36

DisappointedByUnaccountableMod


answered Jan 22, 2021 at 13:16

Javix64

