I am working on a web scraping problem in Ruby. I have seen several related questions and answers, but none of them deal with HTML generated by a JavaScript framework, and I cannot figure out how to handle it. I just want to select the relevant HTML and return an array of objects. My script and the HTML are below. The elements holding the name, currency, and balance values have similar HTML classes. How can this be done?
content = document.css("div.acc-list").map do |parameters|
  name = parameters.at_css("p.s3.bold.row.acDesc").text.strip, # argument?
  currency = parameters.at_css(".row.ccy").text.strip, # argument?
  balance = parameters.at_css(".row.acyOpeningBal").text.strip # argument?
  Account.new name, currency, balance
end
pp content
These HTML elements are nested inside several other elements with their own classes, which I think is due to the framework. However, they are all inside a <div class="acc-list">...</div>, and I think selecting "div.acc-list" for the "content" variable is correct.
<!-- HTML for name -->
<td bindonce="" ng-repeat="col in gridOptions.columns" sg-bind-html-compile="col.cellTemplate" bo-class="col.className" bo-style="{width: col.remWidth }"
class="ng-scope icon-two-line-col" style="width: 17.3333rem;">
<div style="width: 17.333333333333332rem" class="first-cell cellText ng-scope">
<i bo-class="{'active':row.selected }" class="i-32 active icon i-circle-account"></i>
<div class="info-wrapper" style="">
<p class="s3 bold" bo-bind="row.acDesc">Name_value</p> # value
<a ui-sref="app.layout.ACCOUNTS.DETAILS.{ID}({id:'091601003439274'})" href="/Bank/accounts/details/BG37FINV91503006938102">
<span bo-bind="row.iban">BG37FINV91503006938102</span>
<i class="i-arrow-right-5x8"></i>
</a>
</div>
</div>
</td>
<!-- HTML for currency -->
<td bindonce="" ng-repeat="col in gridOptions.columns" sg-bind-html-compile="col.cellTemplate" bo-class="col.className" bo-style="{width: col.remWidth }"
class="ng-scope" style="width: 4.4rem;">
<div style="width: 4.4rem" class="text-center cellText ng-scope">
<span bo-bind="row.ccy">EUR</span> # value
</div>
</td>
<!-- HTML for balance -->
<td bindonce="" ng-repeat="col in gridOptions.columns" sg-bind-html-compile="col.cellTemplate" bo-class="col.className" bo-style="{width: col.remWidth }"
class="ng-scope" style="width: 8.73333rem;">
<div style="width: 8.733333333333333rem" class="text-right cellText ng-scope">
<span bo-bind="row.acyAvlBal | sgCurrency">1 523.08</span> # value
</div>
</td>
row.acDesc is not a CSS class, so at_css definitely won't be able to use it. I would try Nokogiri's xpath syntax, which can do more complex selectors based on arbitrary attributes such as bo-bind.