I have HTML like this:

<div id="ctl00_ContentPlaceHolder1_pnlRequirement" class="BlockContent">

            <h2 class="Header">Requirements for Canadian Students</h2>
            <div class="Content"><p></p><p>For admission to King's University College, applicants will have completed their Ontario Secondary School Diploma (OSSD) with at least six Grade 12U/M courses including Grade 12U English. Students applying from other provinces in Canada can contact the Office of Enrolment Services at King's or review our website: <a target="_blank" href="https://www.kings.uwo.ca/future-students/admissions/admission-requirements/high-school/">https://www.kings.uwo.ca/future-students/admissions/admission-requirements/high-school/</a>. The minimum grade average required for most programs is 79%.</p><p></p></div>

</div>
<div id="ctl00_ContentPlaceHolder1_pnlIRequirement" class="BlockContent">

            <h2 class="Header">Requirements for International Students</h2>
            <div class="Content"><p></p><p>Admissions requirements will vary by country curriculum. Please refer to <a target="_blank" href="https://www.kings.uwo.ca/future-students/admissions/admission-requirements/international-students/">https://www.kings.uwo.ca/future-students/admissions/admission-requirements/international-students/</a>. If you do not see your curriculum listed, please contact the Office of Enrolment Services directly at <a target="_blank" href="https://www.kings.uwo.ca/">kings.uwo.ca</a> or by phone at (519) 433-3491. Applicants will also be required to provide proof of English language proficiency (ELP) if English is not their first language. Please refer to our website for ELP requirements:  <a target="_blank" href="https://www.kings.uwo.ca/future-students/admissions/admission-requirements/english-proficiency/">https://www.kings.uwo.ca/future-students/admissions/admission-requirements/english-proficiency/</a>.</p><p></p></div>

</div>

I'm trying to get all of the text inside each Content block in Python, as in the code below, but I'm lost as to how it should work. Any help is appreciated. Many thanks!

Requirements_for_Canadian_Students = ''.join(response.css("#ctl00_ContentPlaceHolder1_pnlRequirement .Content *::text").getall())
Requirements_for_International_Students = ''.join(response.css("#ctl00_ContentPlaceHolder1_pnlIRequirement .Content *::text").getall())

1 Answer

How about using XPath and the string() function? It returns the concatenated text of the first matched node and all of its descendants:

Requirements_for_Canadian_Students = response.xpath('string(//h2[.="Requirements for Canadian Students"]/following-sibling::div[1])').get()
Requirements_for_International_Students = response.xpath('string(//h2[.="Requirements for International Students"]/following-sibling::div[1])').get()

1 Comment

Thanks very much, I will try it.
