I have been at this for a while and am throwing in the towel and asking for help. I am trying to scrape this page; specifically, I want to access every table row that contains information, as highlighted in green in the following picture. I do not need the table headers, just the rows.

[Screenshot: the table rows to extract, highlighted in green]

With Scrapy I am able to get to each section area (where it says "Main Campus") with the following selector:

response.css('.datadisplaytable .datadisplaytable')

I use .datadisplaytable twice because the tables I am trying to select are nested inside another table with that class. From there, the logical way to reach the table rows I am after seemed to be the following selector:

response.css('.datadisplaytable .datadisplaytable tbody:nth-child(2)')

However, I get nothing with this selector. What am I doing wrong?

1 Answer

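Your selector is a bit off. tbody:nth-child(2) matches a <tbody/> tag that is the second child of its parent, which is not what you want. You want the rows inside each table, starting from the second one: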

.datadisplaytable .datadisplaytable tbody tr:nth-child(n+2)

That will get you all the rows, and skip the header for each table.

5 Comments

I tried that in the scrapy shell and got nothing back.
tr:nth-child(n+2) selects all tr children that are the 2nd element or later among their siblings.
@ehThind I'm not familiar with Scrapy, I only verified this by going to my console on that page and checking the result of $('.datadisplaytable .datadisplaytable tbody tr:nth-child(n+2)').
.datadisplaytable .datadisplaytable tr:nth-child(n+2) without the tbody seems to work; however, it still returns the headers as well.
.datadisplaytable .datadisplaytable tr:nth-child(3) works. Thank you very much!
