I've looked all over the web for a reason why I can't remove duplicates from my array. I am aware of array_unique but it's not working.

I have an array called $links, which is populated from the links found in HTML source code.

If I then echo it out with:

$regexp = "<a\s[^>]*href=(\"??)([^\" >]*?)\\1[^>]*>(.*)<\/a>";
if(preg_match_all("/$regexp/siU", $html, $links, PREG_SET_ORDER)) {
    foreach($links as $link) {
        echo($link[2]."<br />"); // $link[2] = address, $link[3] = text
    }
}

it outputs each link on a new line perfectly, but there are duplicates.

$link[2] is the only dimension of the array I am interested in, and I cannot remove its duplicates. Can anyone help?

This is what is echoed out, for example

/
/register
/forgotpass
/
/news
/news
/news/submit_article
#
/recruiters/companies
/recruiters/headhunters
/jobs
/jobs
/jobs/submit_job
/about
/jobs
/jobs/submit_job
/resume
/register
/fullmap

I would hope to see

/
/register
/forgotpass
/news
/news/submit_article
#
/recruiters/companies
/recruiters/headhunters
/jobs
/about
/jobs/submit_job
/resume
/fullmap

FIXED BY JON - ANSWER:

$regexp = "<a\s[^>]*href=(\"??)([^\" >]*?)\\1[^>]*>(.*)<\/a>";
if(preg_match_all("/$regexp/siU", $html, $links, PREG_SET_ORDER)) {
    $unique_links = array_unique(array_map(function($item) { return $item[2]; }, $links));
    foreach($unique_links as $link) {
        echo($link."<br />");
    }
}
  • Can you provide a sample for $html that demonstrates this problem? Commented Dec 8, 2011 at 23:41
  • Also, can you give some more detail on your expected output? Just an array of unique URIs? Do you also need the text? Which piece of text do you want back if there are two links with the same URI but different text? Commented Dec 8, 2011 at 23:42

2 Answers


You can simply extract the "column" you are interested in from the array (index 2 of every match) and call array_unique on that:

$unique_links = array_unique(array_map(function($item) { return $item[2]; }, $links));

This code will only run on PHP >= 5.3 because of the anonymous function syntax. On earlier PHP versions you can use create_function instead (note that create_function was deprecated in PHP 7.2 and removed in PHP 8.0).

See it in action.
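
On PHP 5.5 and later, array_column can stand in for the array_map callback. The sketch below assumes a small hand-written $links array in the shape that preg_match_all produces with PREG_SET_ORDER (the sample values are made up for illustration). array_values is added only to renumber the keys, since array_unique keeps the original ones.

```php
<?php
// Hypothetical sample data shaped like PREG_SET_ORDER matches:
// [0] = full match, [1] = quote, [2] = address, [3] = text.
$links = array(
    array('<a href="/">Home</a>', '"', '/', 'Home'),
    array('<a href="/news">News</a>', '"', '/news', 'News'),
    array('<a href="/">Logo</a>', '"', '/', 'Logo'),
    array('<a href="/jobs">Jobs</a>', '"', '/jobs', 'Jobs'),
    array('<a href="/news">More news</a>', '"', '/news', 'More news'),
);

// Pull out index 2 of every match, then drop repeated addresses.
$addresses = array_column($links, 2);
$unique_links = array_values(array_unique($addresses));

foreach ($unique_links as $link) {
    echo $link . "<br />\n";
}
```

array_unique keeps the first occurrence of each value, so the addresses stay in the order they first appeared in the page.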

1 Comment

Incredible, works perfectly, thanks SO much - updating question with example answer, thanks again!

If all you want to do is filter duplicate entries, you can do so by loading them into an array and using in_array() to determine if they're already there.

$regexp = "<a\s[^>]*href=(\"??)([^\" >]*?)\\1[^>]*>(.*)<\/a>";
if(preg_match_all("/$regexp/siU", $html, $links, PREG_SET_ORDER)) {
  $filter_links = array();
  foreach ($links as $link) {
    if (!in_array($link[2], $filter_links)) {
      $filter_links[] = $link[2];
      echo $link[2]."<br />";
    }
  }
}

EDIT: But the answer above is a better solution :)
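
For long link lists, in_array rescans the filter array on every iteration. A minimal sketch of the same loop that uses array keys as a lookup table with isset might look like this. The $addresses sample and the $seen name are assumptions for illustration and are not part of the original code.

```php
<?php
// Hypothetical addresses, as they would come out of $link[2].
$addresses = array('/', '/register', '/', '/news', '/news', '#');

$seen = array();          // address => true, used only for lookups
$filter_links = array();  // unique addresses in first-seen order

foreach ($addresses as $address) {
    if (!isset($seen[$address])) {
        $seen[$address] = true;
        $filter_links[] = $address;
    }
}

echo implode("<br />\n", $filter_links);
```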
