I want to scrape https://2gis.kz, but I get an error whenever I use .text or any other method of extracting text from an element located by class name.

I type a search query such as "fitness".

My window variable is set like this:

all_cards = driver.find_elements(By.CLASS_NAME,"_1hf7139")
for card_ in all_cards:
    card_.click()
    window = driver.find_element(By.CLASS_NAME, "_18lzknl")

This is a simplified version of how I open the mini-window that contains all of the essential information. Below is the code where I try to extract the text from the phone number holder.

    texts = window.find_elements(By.CLASS_NAME,'_b0ke8')

    print(texts) # this prints something, so I conclude the elements are accessible
    try:
        print(texts.text)
    except:
        print(".text")
    try:
        print(texts.text())
    except:
        print(".text()")
    try:
        print(texts.get_attribute("innerHTML"))
    except:
        print('getAttribute("innerHTML")')
    try:
        print(texts.get_attribute("textContent"))
    except:
        print('getAttribute("textContent")')
    try:
        print(texts.get_attribute("outerHTML"))
    except:
        print('getAttribute("outerHTML")')

Hi everyone, I solved the issue. .text was not working for some reason; my guess is that the developers somehow protect the information from being read with this method. I used

get_attribute("innerHTML") # as far as I know, this returns the HTML inside the matched element

and now it works like a charm.

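    # Requires "import io" and "import re" at the top of the script.
    texts = window.find_elements(By.TAG_NAME, "bdo")

    # Append the digits of each phone number to t.txt, one number per line.
    # The with block closes the file, so no explicit f.close() is needed.
    with io.open("t.txt", "a", encoding="utf-8") as f:
        for text in texts:
            nums = re.sub("[^0-9]", "", text.get_attribute("innerHTML"))
            f.write(nums + '\n')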
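
The digit-stripping step from the snippet above can be tried without a browser. This is only a minimal sketch: it assumes the innerHTML of each bdo element is the phone number as plain text, and the helper name digits_only and the sample strings are made up for illustration.

```python
import re


def digits_only(html: str) -> str:
    """Drop every character that is not an ASCII digit."""
    return re.sub(r"[^0-9]", "", html)


# Invented sample strings; the real markup on the page may differ.
samples = ["+7 (727) 123-45-67", "8 700 000 00 00"]
numbers = [digits_only(s) for s in samples]
print(numbers)
```

If the markup ever contained nested tags with digits in their attributes, those digits would be kept as well, so reading get_attribute("textContent") instead of innerHTML may be the safer input for this step.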

So the problems were:

  1. I was trying to print a list of elements directly with print(texts).
  2. Even when I printed each element of texts in a for loop, I got an error related to UTF-8 encoding.

I hope someone finds this useful and does not spend a lot of time on such a simple bug.

1 Answer

The find_elements method returns a list of web elements, so this

texts = window.find_elements(By.CLASS_NAME,'_b0ke8')

gives you texts as a list of web elements.
You cannot read .text directly on a list; it is a property of a single web element.
To get the text of each element, iterate over the list and read .text from each one, like this:

text_elements = window.find_elements(By.CLASS_NAME,'_b0ke8')
for element in text_elements:
    print(element.text)
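
To see why the attempts in the question all fell into their except branches, here is a small sketch that runs without Selenium. FakeElement is a hypothetical stand-in that only mimics the .text property of a web element, and the phone numbers are invented.

```python
class FakeElement:
    """Hypothetical stand-in for a Selenium web element (only .text is mimicked)."""

    def __init__(self, text):
        self.text = text


# find_elements returns a list; a plain list imitates that here.
elements = [FakeElement("+7 727 000 00 00"), FakeElement("+7 701 000 00 00")]

try:
    print(elements.text)  # the list itself has no .text attribute
except AttributeError as exc:
    error_message = str(exc)
    print("AttributeError:", error_message)

# Reading .text from each element in the list works.
texts = [element.text for element in elements]
print(texts)
```

Catching AttributeError by name, instead of using a bare except, prints the actual cause and makes this kind of mistake much easier to spot.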

Also, I'm not sure about the locators you are using.
The class names _1hf7139, _18lzknl and _b0ke8 seem to be dynamic, i.e. they may change between browsing sessions.

4 Comments

I thought the same way you did, but it did not help me. As for the locators, so far they work fine for me; Selenium can locate these elements without any problems. I was wondering whether the developers of the website are using something that prevents me from extracting the data.
What exactly do you mean by "But it did not help me"? What is the problem?
Thank you for your help. Now it works. I am going to attach my solution.
If it works, please accept the answer. You can also update your question with the working solution; please don't post the working solution as an answer.