0

I am trying to scrape coinmarketcap.com with Selenium to retrieve each coin's name, market cap, price and circulating supply. So far I have not succeeded: I can only retrieve 11 altcoins and no more. I have also looked into several ways to render JavaScript (which I presume coinmarketcap is built with). Here is the start of my code:

from selenium import webdriver

driver = webdriver.Chrome(r'C:\Users\Ejer\PycharmProjects\pythonProject\chromedriver')
driver.get('https://coinmarketcap.com/')

Crypto = driver.find_elements_by_xpath("//div[contains(concat(' ', normalize-space(@class), ' '), 'sc-16r8icm-0 sc-1teo54s-1 lgwUsc')]")
#price = driver.find_elements_by_xpath('//td[@class="cmc-link"]')
#coincap = driver.find_elements_by_xpath('//td[@class="DAY"]')

# Collect the text of every matched element
CMC_list = [c.text for c in Crypto]
print(CMC_list)

driver.close()

My goal is to store the names, market cap, price and circulating supply in a dataframe so I can apply machine learning methods and analyze the data, so I am open to any suggestions. Thanks in advance.

2 Answers

1

I faced the same problem and added page scrolling before the Crypto = driver.find_elements_by_xpath... line, like this:

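import time

SCROLL_PAUSE_TIME = 0.5  # seconds; assumed value, the original post does not give one

# Scroll down one viewport at a time so the remaining rows get loaded
for _ in range(15):
    driver.execute_script("window.scrollBy(0, window.innerHeight)")
    time.sleep(SCROLL_PAUSE_TIME)
Crypto = driver.find_elements_by_xpath('//div[@class="sc-16r8icm-0 sc-1teo54s-0 dBKWCw"]')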

On my laptop, scrolling down the page 13 times is enough to load all 100 coins; I use 15 just to be sure. The next step is to get the refreshed content. Perhaps I will have to repeat the scrolling every 1 or 2 minutes to get it. This is my first post here, and inserting the code was hard enough. I hope it's useful.
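Since the number of scrolls needed probably depends on the window size, a possible refinement is to keep scrolling until the number of loaded rows stops growing instead of hard-coding 15. The following is only a sketch: scroll_until_stable, count_rows, pause, patience and max_scrolls are names made up for illustration, and the only Selenium call it relies on is the same driver.execute_script() used above.

```python
import time


def scroll_until_stable(driver, count_rows, pause=1.0, patience=3, max_scrolls=50):
    """Scroll one viewport at a time until the row count stops growing.

    driver      -- any object offering Selenium's execute_script() method
    count_rows  -- hypothetical callback: takes the driver, returns the
                   number of rows currently loaded
    patience    -- consecutive scrolls without new rows before giving up
    Returns the last observed row count.
    """
    last = count_rows(driver)
    stalled = 0
    for _ in range(max_scrolls):
        driver.execute_script("window.scrollBy(0, window.innerHeight)")
        time.sleep(pause)  # give the page time to load the next rows
        current = count_rows(driver)
        if current == last:
            stalled += 1
            if stalled >= patience:
                break
        else:
            stalled = 0
            last = current
    return last


# Possible usage with the locator from this answer (untested assumption):
# scroll_until_stable(
#     driver,
#     lambda d: len(d.find_elements_by_xpath(
#         '//div[@class="sc-16r8icm-0 sc-1teo54s-0 dBKWCw"]')),
# )
```

The patience counter is there because a single scroll does not always bring in new rows, so stopping at the first unchanged count could end the loop too early.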


0

To retrieve the list of coin names, you need to close the cookie bar, close the popup, and use WebDriverWait with visibility_of_all_elements_located(). You can use either of the following locator strategies:

  • Using CSS_SELECTOR and get_attribute("innerHTML"):

    driver.get("https://coinmarketcap.com/")
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "div.cmc-cookie-policy-banner__close"))).click()
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//button/b[text()='No, thanks']"))).click()
    print([my_elem.get_attribute("innerHTML") for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "table.cmc-table tbody tr td > a p[color='text']")))])
    
  • Using XPATH and text attribute:

    driver.get("https://coinmarketcap.com/")
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "div.cmc-cookie-policy-banner__close"))).click()
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//button/b[text()='No, thanks']"))).click()
    print([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//table[contains(@class, 'cmc-table')]//tbody//tr//td/a//p[@color='text']")))])
    driver.quit()
    
  • Console Output:

    ['Bitcoin', 'Ethereum', 'XRP', 'Tether', 'Litecoin', 'Bitcoin Cash', 'Chainlink', 'Cardano', 'Polkadot', 'Binance Coin', 'Stellar', 'USD Coin', 'Bitcoin SV']
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
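
To get from lists of scraped strings like the one above to the dataframe the question asks for, the text values first have to be turned into numbers. Below is a minimal sketch of that step, assuming the price, market cap and supply cells arrive as strings such as '$1,000.50' or '100 BTC' (the site's real formatting may differ). parse_number and to_records are hypothetical helper names, and the sample values are placeholders, not real market data.

```python
import re


def parse_number(text):
    """Return the first number in text as a float, or None if there is none.

    Thousands separators are assumed to be commas, e.g. '$1,000.50'.
    """
    match = re.search(r"\d[\d,]*(?:\.\d+)?", text)
    if match is None:
        return None
    return float(match.group().replace(",", ""))


def to_records(names, prices, market_caps, supplies):
    """Combine four parallel lists of scraped strings into one dict per coin."""
    return [
        {
            "name": name,
            "price": parse_number(price),
            "market_cap": parse_number(cap),
            "circulating_supply": parse_number(supply),
        }
        for name, price, cap, supply in zip(names, prices, market_caps, supplies)
    ]


# Placeholder input, for illustration only
records = to_records(
    ["Bitcoin", "Ethereum"],
    ["$1,000.50", "$20.25"],
    ["$2,000,000", "$300,000"],
    ["100 BTC", "5,000 ETH"],
)
print(records[0])
```

A list of dicts in this shape can be passed directly to pandas.DataFrame(records), which gives one row per coin and one column per key.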
    
