
I want to extract data from HTML using Java. I tried Jsoup, but so far I have been unable to extract the correct data. Here is the HTML snippet I am trying to extract the data from:

<a href="javascript:;" id="listen_880966" onclick="MP3PREVIEWPLAYER.showHiddePlayer(880966, 'http://mksh.free.fr/' + 'lol/mp3/Paint_It_Black/18_the_black_dahlia_murder_-_paint_it_black_(rolling_stones)-bfhmp3.mp3')" title="Listen Paint it Black    The Black Dahlia Murder   Great Metal Covers 36" class="button button-s button-1 listen "   >

I want the link ('http://mksh.free.fr/' + 'lol/mp3/Paint_It_Black/18_the_black_dahlia_murder_-_paint_it_black_(rolling_stones)-bfhmp3.mp3') and the title to be extracted into separate variables. Sample code along with the answer would be really helpful.

  • First of all, can you show us what you tried? Commented Jun 28, 2013 at 12:38

1 Answer


You can use regular expressions to pull out the section you want. Then you can use something like String.split(delimiter) on the matched text to extract the specific pieces. See the String.split() documentation for details on that method.

import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Main
{
    public static void main (String[] args)
    {
        String mydata = "<a href=\"javascript:;\" id=\"listen_880966\" onclick=\"MP3PREVIEWPLAYER.showHiddePlayer(880966, 'http://mksh.free.fr/' + 'lol/mp3/Paint_It_Black/18_the_black_dahlia_murder_-_paint_it_black_(rolling_stones)-bfhmp3.mp3')\" title=\"Listen Paint it Black    The Black Dahlia Murder   Great Metal Covers 36\" class=\"button button-s button-1 listen \"   >";
        // The two quoted halves of the URL, joined by '+' in the onclick handler.
        // Literal dots and the plus sign are escaped so they match only themselves.
        Pattern pattern = Pattern.compile("'http://mksh\\.free\\.fr/'\\s\\+\\s'[\\(\\).A-Za-z0-9/_-]+'");
        // The whole title attribute: everything up to the closing double quote.
        Pattern title = Pattern.compile("title=\"[^\"]*\"");
        Matcher matcher = pattern.matcher(mydata);
        if (matcher.find())
        {
            // Prints the matched URL expression, quotes and '+' included.
            System.out.println(matcher.group(0));
        }
        matcher = title.matcher(mydata);
        if (matcher.find())
        {
            // Prints the matched attribute, including the leading title=.
            System.out.println(matcher.group(0));
        }
    }
}
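
The split step mentioned above is not shown in the code, so here is a minimal sketch of how it might look. The class name SplitExample and the helpers extractLink and extractTitle are hypothetical names chosen for illustration, and the patterns assume the markup always has the same shape as the snippet in the question (single-quoted URL halves joined by +, double-quoted title attribute).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SplitExample {
    // Matches the two single-quoted URL halves joined by '+' in the onclick handler.
    private static final Pattern LINK =
            Pattern.compile("'http://mksh\\.free\\.fr/'\\s*\\+\\s*'[^']+'");
    // Matches the whole title attribute, quotes included.
    private static final Pattern TITLE = Pattern.compile("title=\"[^\"]*\"");

    // Hypothetical helper: returns the joined URL, or null if nothing matches.
    static String extractLink(String html) {
        Matcher m = LINK.matcher(html);
        if (!m.find()) {
            return null;
        }
        // Splitting "'a' + 'b'" on ' yields ["", "a", " + ", "b"].
        String[] parts = m.group().split("'");
        return parts[1] + parts[3];
    }

    // Hypothetical helper: returns the title text, or null if nothing matches.
    static String extractTitle(String html) {
        Matcher m = TITLE.matcher(html);
        if (!m.find()) {
            return null;
        }
        // Splitting title="x" on " yields ["title=", "x"]; an empty title yields only ["title="].
        String[] parts = m.group().split("\"");
        return parts.length > 1 ? parts[1] : "";
    }

    public static void main(String[] args) {
        String html = "<a href=\"javascript:;\" id=\"listen_880966\" onclick=\"MP3PREVIEWPLAYER.showHiddePlayer(880966, 'http://mksh.free.fr/' + 'lol/mp3/Paint_It_Black/18_the_black_dahlia_murder_-_paint_it_black_(rolling_stones)-bfhmp3.mp3')\" title=\"Listen Paint it Black    The Black Dahlia Murder   Great Metal Covers 36\" class=\"button button-s button-1 listen \"   >";
        String link = extractLink(html);   // the two URL halves joined together
        String title = extractTitle(html); // the text between the title quotes
        System.out.println(link);
        System.out.println(title);
    }
}
```

This keeps the link and the title in separate variables, as the question asks. Regular expressions are brittle against markup changes, so for anything beyond a fixed snippet like this one an HTML parser is usually the safer choice for reading the attributes, with a regex applied only to the onclick value.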

