
I'm trying to scrape the follower count with Selenium, but int() raises a ValueError even though the text is clearly a number:

Code trials:

follower_count = int(browser.find_element_by_xpath('/html/body/div/div/div/div[2]/main/div/div/div/div[1]/div/div[2]/div/div/div[1]/div/div[5]/div[2]/a/span[1]/span').text)
following_count = int(browser.find_element_by_xpath('/html/body/div/div/div/div[2]/main/div/div/div/div[1]/div/div[2]/div/div/div[1]/div/div[5]/div[1]/a/span[1]/span').text)

Comments:
  • 1,961 has a comma which you should deal with. Commented Jan 31, 2021 at 2:05
  • You can use the regular-expression function re.sub to clean your string so that only digits remain. After importing re: newstring = re.sub('[^0-9]', '', oldstring) Commented Jan 31, 2021 at 2:15
  • @luthervespers I get this:
    Traceback (most recent call last):
      File "C:\Users\Desktop\InstaPy-master\quickstart.py", line 79, in <module>
        followers_count = re.sub('[^0-9]','', follower_count)
      File "C:\Users\AppData\Local\Programs\Python\Python38\lib\re.py", line 208, in sub
        return _compile(pattern, flags).sub(repl, string, count)
    TypeError: expected string or bytes-like object
    Commented Jan 31, 2021 at 20:10
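
The TypeError in the last comment is what re.sub raises when its third argument is not a string. One possible cause, and this is only an assumption because the surrounding code is not shown, is that follower_count held the element (or an already converted value) rather than the element's .text. A minimal sketch of the regular-expression approach that checks the type first; digits_only is a hypothetical helper name, not part of Selenium or InstaPy:

```python
import re


def digits_only(raw_text):
    """Strip every non-digit character and convert the remainder to int."""
    # re.sub only accepts str or bytes, so fail early with a clearer message.
    if not isinstance(raw_text, str):
        raise TypeError(f"expected str, got {type(raw_text).__name__}")
    cleaned = re.sub(r"[^0-9]", "", raw_text)
    return int(cleaned)


print(digits_only("1,961"))
```

Note that this pattern also drops decimal points and letters, so it is only suitable when the text is a plain integer with separators.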

1 Answer

The extracted text, i.e. 1,961, contains a , character, so you can't pass it directly to int(): Python's int() does not accept thousands separators and raises a ValueError.


Solution

You need to remove the , character from the text 1,961 with replace() first and then call int() as follows:

  • Code Block:

    # count = browser.find_element_by_xpath('/html/body/div/div/div/div[2]/main/div/div/div/div[1]/div/div[2]/div/div/div[1]/div/div[5]/div[2]/a/span[1]/span').text
    count = "1,961"
    print(int(count.replace(",","")))
    print(type(int(count.replace(",",""))))
    
  • Console Output:

    1961
    <class 'int'>
    

This use case

Applied to your code, the two lines become:

follower_count = int(browser.find_element_by_xpath('/html/body/div/div/div/div[2]/main/div/div/div/div[1]/div/div[2]/div/div/div[1]/div/div[5]/div[2]/a/span[1]/span').text.replace(",", ""))
following_count = int(browser.find_element_by_xpath('/html/body/div/div/div/div[2]/main/div/div/div/div[1]/div/div[2]/div/div/div[1]/div/div[5]/div[1]/a/span[1]/span').text.replace(",", ""))
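
Since both lines repeat the same cleanup, it could be factored into a small helper. The following is a sketch only: count_from_element and FakeElement are hypothetical names introduced for illustration, and FakeElement merely stands in for a Selenium element so the snippet runs without a browser.

```python
from dataclasses import dataclass


def count_from_element(element):
    """Read element.text, drop thousands separators, and return an int."""
    return int(element.text.strip().replace(",", ""))


@dataclass
class FakeElement:
    """Hypothetical stand-in for a Selenium WebElement (demonstration only)."""
    text: str


print(count_from_element(FakeElement("1,961")))

# With a real driver, assuming follower_xpath holds the XPath string used above:
# follower_count = count_from_element(browser.find_element_by_xpath(follower_xpath))
```

The helper assumes the text is a plain integer with comma separators; abbreviated counts would still raise a ValueError.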
