My main HTML page has an iframe in it, and I need to get the text Code: LWBAD that lives inside that iframe.

Below is the source of my main HTML page, which contains the iframe:

<td class="centerdata flag"><iframe style="width: 200px; height: 206px;" scrolling="no" src="https://www.example.com/test/somewhere" ></iframe></td>

The iframe page (the redirect link) has this HTML source:

<body>
<a href="http://www.test2.com" target="_blank">
<img src="https://img2.test2.com/LWBAD-1.jpg"></a>
<br/>Code: LWBAD

So far I can get the complete page source from my main html page.

from bs4 import BeautifulSoup
from selenium import webdriver
import time
import html5lib

driver_path = '/usr/local/bin/chromedriver 2'
driver = webdriver.Chrome(driver_path)
driver.implicitly_wait(10)

driver.get('http://example.com')
try:
    time.sleep(4)
    iframe = driver.find_elements_by_tag_name('iframe')
    driver.switch_to_default_content()

    output = driver.page_source

    print (output)

finally:
    driver.quit();

*The URLs are not accessible from outside my network, which is why I used example.com.

  • You don't switch to the iframe anywhere. Commented Aug 14, 2018 at 9:55
  • @Guy I'm new to Python, do u mind showing me where switch to the frame should go ? Commented Aug 14, 2018 at 9:59

2 Answers

You should use:

iframe = driver.find_elements_by_tag_name('iframe')[0]
driver.switch_to.frame(iframe)
 #  your work to extract link
driver.switch_to_default_content()

For multiple iframe URLs:

find_elements_by_tag_name returns a list, so use a for loop:

iframe = driver.find_elements_by_tag_name('iframe')
for i in iframe:
    driver.switch_to.frame(i)
    #  your work to extract link
driver.switch_to_default_content()

To get only the text, use

text = driver.find_element_by_tag_name('body').text

after driver.switch_to.frame(i).


7 Comments

Nice ! @Nihail what if I have multiple iframe urls ? what do I need to change to loop all iframe urls ? And how to print only text ? 'cause now it prints the whole html source.
what is the text? which do you want
My output looks like this: </style> </head> <body> <a href="http://www.test2.com" target="_blank"><img src="https://img2.test2.com/LWBAD-1.jpg"></a><br />Code: LWBAD </body></html> But I want to print out only the Code: LWBAD
? any ideas anyone ?
no luck :( works only when there's a single url :( stale element reference: element is not attached to the page document

Try this:

iframe = driver.find_elements_by_tag_name('iframe')
for i in range(0, len(iframe)):
    f = driver.find_elements_by_tag_name('iframe')[i]
    driver.switch_to.frame(f)
    #  your work to extract link
    text = driver.find_element_by_tag_name('body').text
    print(text)
    driver.switch_to_default_content()
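
For readers on recent Selenium 4 releases, where the find_elements_by_* helpers and switch_to_default_content() are no longer available, the same loop might look roughly like the sketch below. The function name collect_iframe_texts and the TAG_NAME constant are hypothetical names chosen for illustration, and the sketch only handles top-level iframes, not nested ones.

```python
# Assumption: Selenium 4 style locators, passed as (by, value) pairs.
# "tag name" is the string value of selenium.webdriver.common.by.By.TAG_NAME,
# spelled out here so the function itself has no import dependency.
TAG_NAME = "tag name"


def collect_iframe_texts(driver):
    """Return the body text of every top-level iframe on the current page."""
    texts = []
    frame_count = len(driver.find_elements(TAG_NAME, "iframe"))
    for index in range(frame_count):
        # Look the iframe up again from the main document on each pass,
        # mirroring the approach in the answer above.
        frame = driver.find_elements(TAG_NAME, "iframe")[index]
        driver.switch_to.frame(frame)
        try:
            texts.append(driver.find_element(TAG_NAME, "body").text)
        finally:
            # Always return to the main document before the next lookup.
            driver.switch_to.default_content()
    return texts


# Hypothetical usage with a real browser session:
#   from selenium import webdriver
#   driver = webdriver.Chrome()
#   driver.get("http://example.com")
#   print(collect_iframe_texts(driver))
#   driver.quit()
```

The try/finally keeps the driver from being left inside a frame if reading the body fails, which is the situation that produced the stale element error with the earlier loop.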

10 Comments

there you go ! What was the issue though ? Working damn smooth now.
After driver.switch_to_default_content(), the driver needs to find the iframe all over again, so I look it up again with f = driver.find_elements_by_tag_name('iframe')[i]
I don't have an IDE to test my code, otherwise I would have suggested this from the start.
Ahhhh! You are the man! Last question, a simple one: how can I strip the Code: part out of the output so that I have only the clean LWBAD?
Do you need to remove Code: every time?
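
One possible way to strip the Code: prefix asked about above, as a minimal sketch. The helper name extract_code is hypothetical, and it assumes the body text always has the form "Code: <value>" with no spaces inside the value.

```python
import re


def extract_code(body_text):
    """Return the value that follows 'Code:' in the text, or None if absent."""
    match = re.search(r"Code:\s*(\S+)", body_text)
    return match.group(1) if match else None


print(extract_code("Code: LWBAD"))
```

Inside the loop, this would be applied to the text read from the iframe body, for example print(extract_code(text)).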