I am trying to scrape information from this page using Selenium on Python 3.6:

http://aogweb.state.ak.us/PoolStatistics/Pool/Overview?poolNo=60100

I need to extract the text in the Location box: "Eastern Arctic Slope". My code finds the tag but returns an empty string. I tried several ways of locating it (XPath, by_id, by_class) and none of them works. I would appreciate your help!

My code:

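import time as t

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

# Run Chrome without opening a window
options = Options()
options.add_argument("--headless")
driver = webdriver.Chrome(executable_path='C:/webdrivers/chromedriver.exe', options=options)

url = 'http://aogweb.state.ak.us/PoolStatistics/Pool/Overview?poolNo=60100'
driver.get(url)
t.sleep(5)  # crude wait for the page's scripts to finish

location = driver.find_elements_by_xpath('//*[@id="location"]')

print(len(location))
print(location[0].text)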

It returns a length of 1 and an empty string. Why is it not getting the text?

  • Pseudo-element! You need to execute a script. Commented Apr 10, 2020 at 18:45
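The comment above suggests executing a script. A minimal sketch of that idea follows, under the assumption that the element's content is simply not exposed through `.text` (for example because it is a form field or is not visibly rendered). The helper name `read_element_text` is hypothetical, and whether any of these fallbacks returns the value on this particular page is not confirmed here.

```python
def read_element_text(driver, element):
    """Return the first non-empty text found for a Selenium WebElement.

    Hypothetical helper: tries progressively less strict ways of reading
    an element's content.
    """
    # 1) Rendered, visible text (this is what `.text` returns).
    text = element.text
    if text and text.strip():
        return text.strip()

    # 2) Form fields such as <input> and <textarea> keep content in `value`.
    value = element.get_attribute("value")
    if value and value.strip():
        return value.strip()

    # 3) Ask the browser directly; textContent also covers hidden nodes.
    content = driver.execute_script("return arguments[0].textContent;", element)
    return (content or "").strip()
```

With the question's code this would be called as `read_element_text(driver, location[0])`. If the text really comes from a CSS pseudo-element such as `::before`, `textContent` will not include it and a different script would be needed.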

1 Answer

You are actually dealing with a hidden pseudo-element whose content is fetched from a back-end API by a JavaScript function. I was able to track down the XHR request by watching the Network monitor in the browser's developer tools.

The code below calls that API directly, which gets you the value without a browser.

import requests
import json

def main(url):
    r = requests.get(url).json()
    for item in r:
        print(item['PoolLocation'])
        # print(item.keys())  # each item is a plain dict once the JSON is parsed


main("http://aogweb.state.ak.us/PoolStatistics/Pool/GetPoolById?poolNo=60100")

Output:

Eastern Arctic Slope
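To reuse this for other pools, the URL building and the field lookup can be separated from the network call. The sketch below uses only the standard library; the helper names `build_pool_url` and `pool_locations` are illustrative, and it is an assumption that the endpoint accepts other `poolNo` values in the same way.

```python
from urllib.parse import urlencode

# Endpoint taken from the answer above.
BASE_URL = "http://aogweb.state.ak.us/PoolStatistics/Pool/GetPoolById"


def build_pool_url(pool_no):
    """Build the API URL for a given pool number."""
    return f"{BASE_URL}?{urlencode({'poolNo': pool_no})}"


def pool_locations(records):
    """Return the PoolLocation of every record (None if the key is missing)."""
    return [record.get("PoolLocation") for record in records]
```

With `requests`, this could be used as `pool_locations(requests.get(build_pool_url(60100)).json())`.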

For a nicely readable, indented format (here r is the parsed JSON returned by .json() inside main, not the Response object itself):

print(json.dumps(r, indent=4))

Output:

[
    {
        "FieldPool": 60100,
        "WebTitle": "Badami, Badami Oil Pool",
        "Pool_Operator": "Savant Alaska LLC.",
        "Well_Operator": "Conoco Inc.",
        "Well_Nm": "BADAMI 1",
        "Well_Permit": "1891170",
        "Well_API": "50-029-22017-00-00",
        "Wh_Sec": 9,
        "Wh_Twpn": 9,
        "Wh_Twpd": "N",
        "Wh_RngN": 20,
        "Wh_RngD": "E",
        "Wh_Pm": "U",
        "DTD": 13595,
        "TVD": 12911,
        "Dt_Effect": "1990-04-27T00:00:00",
        "PoolStatus": "Producing",
        "PoolLocation": "Eastern Arctic Slope",
        "Text_Summary": "\\\"The Badami Oil Pool was discovered in 1990 and developed through a drilling program lasting from 1997 to 1998.|1|  The 
pool has now been penetrated by 19 well bores, most of which are clustered near the center of the Badami Unit. There are eight additional wells that lie inside of, or within four miles of, the unit boundaries. The Badami Oil Pool is defined in Conservation Order No. 402C, issued September 4, 2012.\r\n\r\nRegular production from the pool began on August 23, 1998,|2|  and peaked at average rate of 7,450 barrels of oil per day (BOPD) during September 1998.|3|  However, production rapidly declined to 3,300 BOPD by January 1999, and the field was shut in from February 4|4|  through April 30, 1999. After facilities were upgraded and remediated, production was restarted on May 1, 1999,|5|  and jumped to an average of nearly 5,300 BOPD during July of 1999. However, by year-end 1999, it declined to an average of less than 3,000 BOPD, and by July 2003 field production averaged less than 1,300 BOPD from six wells.|6|  In August 2003, the Regulatory Commission of Alaska approved BP\u2019s request to temporarily shut down the Badami 
oil pipeline and gas products pipeline for approximately two years.|7|  BP shut-in production and placed the facilities in \u201cwarm shutdown\u201d that same month.|8| \r\n\r\nRegular production from the Badami Pool resumed in September 2005. In October of that year, the pool averaged 1,785 BOPD from five producing wells, but by December average oil production declined to 1,437 BOPD from six producing wells. By July 2007, average oil production declined to 876 BOPD from four producing wells. BP shut the field in to recharge during late August 2007.|9||10| BP joined with Savant Alaska 
LLC, a subsidiary of Savant Resources LLC, in April 2008 to conduct engineering, permitting and inspection operations with the intent of restarting 
Badami.|11|  Late in 2008, Arctic Slope Regional Corporation joined with BP and Savant to revitalize production from the pool by drilling horizontal wells and hydraulically fracturing them.|12|  The pool was returned to regular production in November 2010, and for the first six months of 2011 the pool averaged 1,020 BOPD with no water cut. Since that time, the number 
of producing wells has increased from four to seven and, for the last quarter of 2015, pool production averaged 856 BOPD.  Over the first five months of 2019, the pool averaged 507 BOPD and a water cut of 1.2 percent. For 
the last six months of 2019, the pool averaged 493 BOPD. Savant added a second producer, Badami B1-07, in May 2018, which increased production by about 1,570 BOPD.  Production peaked in January 2019 at 1,754 BOPD, and for 
the first six months of 2019 the pool averaged 1,332 BOPD.|13|  \r\n\r\nAccording to the Alaska Department of Natural Resources\u2019 Case File Number ADL 367011, Tennessee-based Miller Energy Resources Inc. acquired 100% 
interest in Savant Alaska LLC effective December 1, 2014.|14|  Since early 2016, Miller Energy conducts business in Alaska as Glacier Oil & Gas Corp. |15| \\\"",
        "Text_Geology": "The Badami reservoir comprises several separate turbidite sandstone reservoirs assigned to the Tertiary-aged Canning Formation. These sandstone reservoirs were deposited largely as amalgamated channel sands|16|  within mud-dominated submarine fan systems.|17| Published descriptions suggest the reservoirs are complex, comprising 61 identified fans laid down during seven depositional events. Reservoir quality sands are thin  and discontinuous  reservoir quality sands.|18||19| No single well has encountered all of the identified fan systems; the Badami No. 1 exploratory well reportedly penetrated the most complete section.|20| The Badami Oil Pool is defined as the accumulation of hydrocarbons common to and correlating with the interval between the measured depths of 9,500 feet and 
11,500 feet in the Badami No. 1 well.|21|  The reservoir sandstones are very fine-to-fine grained and moderately sorted.|22|  Porosity ranges from 15 to 21 percent, permeability ranges from 1 to 400 md,|23|  and oil gravity reportedly ranges from 19 to 30 degrees API|24||25|.",
        "FolderName": "Badami,Badami_Oil",
        "PoolType": "OIL"
    }
]
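The two long text fields dominate that dump, so a trimmed preview can be easier to scan. This is a minimal sketch; the helper name `preview` and the 60-character cutoff are illustrative choices, not part of the API.

```python
import json


def preview(records, max_len=60):
    """Return a copy of the records with long string values shortened."""

    def shorten(value):
        # Only strings are trimmed; numbers and other types pass through.
        if isinstance(value, str) and len(value) > max_len:
            return value[:max_len] + "..."
        return value

    return [{key: shorten(value) for key, value in record.items()} for record in records]


def dump_preview(records, max_len=60):
    """Serialize the shortened records with the same indent as above."""
    return json.dumps(preview(records, max_len), indent=4)
```

Calling `print(dump_preview(r))` on the parsed JSON keeps every key but cuts `Text_Summary` and `Text_Geology` down to one line each.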

7 Comments

Hello, thanks for posting this. I receive the following error when I run this code: File "C:\Users\...\Python36\lib\json\decoder.py", line 357, in raw_decode raise JSONDecodeError("Expecting value", s, err.value) from None json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0). And when executing print(json.dumps(r, indent=4)), I receive the following error: Object of type 'Response' is not JSON serializable. I am not very proficient in JSON. What can cause these problems?
print(r) returns the following: <Response [200]>
@RusLan I'm currently online on my phone. I'll check this once I'm at my laptop later.
@RusLan I've just checked and it works without any issues! Which Python version are you using?
@ αԋɱҽԃ αмєяιcαη I use 3.6. I read a bit about pseudo-elements and I still don't get why Selenium returns an empty string. I thought Selenium should be able to deal with JavaScript. I have never experienced this with Selenium before. Why can the JSON approach handle this when Selenium can't?
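On the two errors reported in the comments: "Expecting value: line 1 column 1 (char 0)" means the response body was empty or was not JSON (an HTML page, for instance), even though the status was 200. "Object of type 'Response' is not JSON serializable" means json.dumps was given the Response object rather than the parsed result of .json(). What the server actually sent in that case is not known, so the sketch below only shows one way to find out; the helper name `parse_json_body` is hypothetical.

```python
import json


def parse_json_body(status_code, content_type, body):
    """Parse a response body as JSON, or raise an error showing what came back."""
    try:
        return json.loads(body)
    except json.JSONDecodeError as exc:
        # Include enough context to see whether the server sent HTML,
        # an empty body, or something else entirely.
        preview = body[:200].strip() or "<empty body>"
        raise ValueError(
            f"Expected JSON but got status={status_code}, "
            f"content-type={content_type!r}, body starts with: {preview!r}"
        ) from exc
```

With `requests`, it could be called as `parse_json_body(r.status_code, r.headers.get("Content-Type"), r.text)`, where `r` is the Response returned by `requests.get(url)`.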