Image explaining the data to be extracted

I'm trying to extract a value from a web page (marked in red in the image) using the HtmlUnit library for Java, but I can't get that particular value.

WebClient webClient = new WebClient(BrowserVersion.CHROME);
Thread.sleep(5000);
HtmlPage page = webClient.getPage("https://earth.nullschool.net/#current/wind/isobaric/500hPa/orthographic=-283.71,14.19,2183/loc=76.850,11.440");
Thread.sleep(5000);
System.out.println(page.asXml());

I checked the HTML that was printed to the console, and it doesn't contain the value:

<p>
  <span id="location-wind" class="location">
          </span>
  <span id="location-wind-units" class="location text-button">
          </span>
</p>

1 Answer

This happens because these fields are filled in via JavaScript. When the page first loads, they are empty. You can verify this by viewing the page source and searching for the prefix id="location (without a closing quote, so that it matches both spans).
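To make that check concrete, here is a minimal sketch in plain Java that tests whether a span with a given id contains any text in a chunk of raw HTML. The class name EmptySpanCheck and the method spanText are made up for illustration, and the regular expression is only good enough for simple, non-nested markup like the snippet in the question; it is not a general HTML parser.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EmptySpanCheck {

    /**
     * Returns the trimmed text inside the first <span> whose id attribute
     * equals the given id, or null if no such span is found.
     * Assumes the span has no nested <span> elements.
     */
    public static String spanText(String html, String id) {
        Pattern p = Pattern.compile(
                "<span\\b[^>]*\\bid=\"" + Pattern.quote(id) + "\"[^>]*>(.*?)</span>",
                Pattern.DOTALL);
        Matcher m = p.matcher(html);
        return m.find() ? m.group(1).trim() : null;
    }

    public static void main(String[] args) {
        // The fragment printed by page.asXml() in the question.
        String html = "<p>\n"
                + "  <span id=\"location-wind\" class=\"location\">\n"
                + "          </span>\n"
                + "  <span id=\"location-wind-units\" class=\"location text-button\">\n"
                + "          </span>\n"
                + "</p>";
        String text = spanText(html, "location-wind");
        System.out.println(text == null ? "span not found"
                : text.isEmpty() ? "span is empty"
                : "span contains: " + text);
    }
}
```

Run against the fragment from the question, this reports that the span is empty, which is consistent with the value being filled in only after the scripts have run.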

The page makes two additional HTTP requests to fetch dynamic data:

  1. https://earth.nullschool.net/data/earth-topo.json?v3
  2. https://gaia.nullschool.net/data/gfs/current/current-wind-isobaric-500hPa-gfs-0.5.epak
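If you want to inspect these two responses yourself, they can be downloaded without HtmlUnit. The following is a rough sketch using java.net.http.HttpClient (Java 11 or later). The class name DataFetcher and its method names are invented for this example, and it assumes the server answers a plain GET without special headers or cookies, which I have not verified.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class DataFetcher {

    // The two URLs listed above.
    public static final String TOPO_URL =
            "https://earth.nullschool.net/data/earth-topo.json?v3";
    public static final String WIND_URL =
            "https://gaia.nullschool.net/data/gfs/current/current-wind-isobaric-500hPa-gfs-0.5.epak";

    /** Builds a plain GET request for the given URL. */
    public static HttpRequest buildRequest(String url) {
        return HttpRequest.newBuilder(URI.create(url)).GET().build();
    }

    /** Downloads the response body as raw bytes; fails on any non-200 status. */
    public static byte[] download(HttpClient client, String url)
            throws IOException, InterruptedException {
        HttpResponse<byte[]> response =
                client.send(buildRequest(url), HttpResponse.BodyHandlers.ofByteArray());
        if (response.statusCode() != 200) {
            throw new IOException("HTTP " + response.statusCode() + " for " + url);
        }
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        for (String url : new String[] {TOPO_URL, WIND_URL}) {
            byte[] body = download(client, url);
            System.out.println(url + " -> " + body.length + " bytes");
        }
    }
}
```

The bodies are kept as bytes rather than strings because the second file does not look like plain text.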

Somewhere in this data (about 1.2 MB combined) should be the value you're looking for. Your best bet is to use a tool (perhaps an online one) to generate Java classes from the JSON, or to study the JSON and write code that extracts the specific value you're after.

That is, if the value is in the JSON at all, which I'm not convinced of. The EPAK file appears to be some sort of binary format with embedded JSON, but I couldn't determine whether the value is in there.
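One way to probe a file like this is to scan its bytes for the first balanced {...} run and print it. The sketch below does only that. It assumes nothing about the real EPAK layout (which is not documented here), the class and method names are invented, and a stray '{' byte in the binary part could produce a false match or no match at all.

```java
import java.nio.charset.StandardCharsets;

public class EmbeddedJsonFinder {

    /**
     * Returns the first balanced {...} run found in data, decoded as UTF-8,
     * or null if none is found. Braces inside double-quoted strings are ignored.
     */
    public static String firstJsonObject(byte[] data) {
        int start = -1;          // index of the opening brace, -1 until found
        int depth = 0;           // current brace nesting depth
        boolean inString = false;
        boolean escaped = false; // previous byte was a backslash inside a string
        for (int i = 0; i < data.length; i++) {
            byte b = data[i];
            if (start < 0) {
                if (b == '{') {
                    start = i;
                    depth = 1;
                }
                continue;
            }
            if (inString) {
                if (escaped) {
                    escaped = false;
                } else if (b == '\\') {
                    escaped = true;
                } else if (b == '"') {
                    inString = false;
                }
            } else if (b == '"') {
                inString = true;
            } else if (b == '{') {
                depth++;
            } else if (b == '}' && --depth == 0) {
                return new String(data, start, i - start + 1, StandardCharsets.UTF_8);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Synthetic input: two binary bytes, a small JSON object, one trailing byte.
        byte[] json = "{\"a\":{\"b\":1}}".getBytes(StandardCharsets.UTF_8);
        byte[] sample = new byte[json.length + 3];
        sample[0] = 0x01;
        sample[1] = 0x02;
        System.arraycopy(json, 0, sample, 2, json.length);
        sample[sample.length - 1] = (byte) 0xFF;
        System.out.println(firstJsonObject(sample)); // prints {"a":{"b":1}}
    }
}
```

If something readable comes out of the real file, it can be handed to a JSON library of your choice to see which fields it actually carries.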

Another approach is to use Selenium: let it drive a real browser that runs the page's JavaScript, then read the value from the rendered page.
