Say I have a list of genres that looks something like:

$genres = array(
    'soul', 
    'soul jazz', 
    'blues', 
    'jazz blues', 
    'rock', 
    'indie', 
    'cool jazz', 
    'rock-blues');

...And so on, for 762 values. How can I organize these genres into categories?

For example, I would want the Blues category to contain 'blues', 'jazz blues', and 'rock-blues'. I would want the Jazz category to contain 'soul jazz', 'jazz blues', and 'cool jazz'.

Any and all help is appreciated.

4 Comments
  • Can you elaborate how you get this array? Is this from a database? Commented Dec 27, 2013 at 1:30
  • Sounds like you want to represent a hierarchy. Commented Dec 27, 2013 at 1:32
  • @Machavity: From an API. Commented Dec 27, 2013 at 1:38
  • Mike, what was the problem with similar_text? Commented Dec 27, 2013 at 12:57

2 Answers

1

Using preg_match would be one of the best ways to solve your problem.

<?php
$categories = array("blues", "jazz");
$genres = array("soul", "soul jazz", "blues", "jazz blues", "rock", "indie", "cool jazz", "rock-blues");
$arr = array();    // category => list of matching genres
$others = array(); // genres that match no category
foreach ($genres as $genre) {
    $matched = false;
    foreach ($categories as $category) {
        // Match the category as a whole word; preg_quote escapes any regex metacharacters.
        if (preg_match("/\\b" . preg_quote($category, "/") . "\\b/", $genre)) {
            $arr[$category][] = $genre;
            $matched = true;
        }
    }
    if (!$matched) {
        $others[] = $genre;
    }
}
ksort($arr);
$arr["others"] = $others;
unset($genre, $matched, $category, $others);
print_r($arr);
?>

The result will be:

Array
(
    [blues] => Array
        (
            [0] => blues
            [1] => jazz blues
            [2] => rock-blues
        )

    [jazz] => Array
        (
            [0] => soul jazz
            [1] => jazz blues
            [2] => cool jazz
        )

    [others] => Array
        (
            [0] => soul
            [1] => rock
            [2] => indie
        )

)
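
As a hedged variation on the approach above, the same loop can be wrapped in a function that also matches case-insensitively, which may matter if the API returns genres such as "Soul Jazz". The function name groupByKeyword and the mixed-case sample data are illustrative assumptions, not part of the original answer:

```php
<?php
// Hypothetical helper: groups each genre under every category keyword it contains.
function groupByKeyword(array $genres, array $categories): array
{
    $grouped = array();
    $others = array();
    foreach ($genres as $genre) {
        $matched = false;
        foreach ($categories as $category) {
            // preg_quote escapes regex metacharacters in the keyword;
            // the "i" flag makes the whole-word match case-insensitive.
            $pattern = '/\b' . preg_quote($category, '/') . '\b/i';
            if (preg_match($pattern, $genre) === 1) {
                $grouped[$category][] = $genre;
                $matched = true;
            }
        }
        if (!$matched) {
            $others[] = $genre;
        }
    }
    ksort($grouped);
    $grouped['others'] = $others;
    return $grouped;
}

$result = groupByKeyword(
    array('Soul Jazz', 'blues', 'jazz blues', 'rock', 'rock-blues'),
    array('blues', 'jazz')
);
print_r($result);
```

A genre that contains several keywords, such as 'jazz blues', lands in every matching category, which is the behaviour the question asks for.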

1 Comment

As much as I dislike using regex, I actually got this one to work exactly how I desired. Thank you, Sharanya
1

Given some seeds:

$seeds = array('blues','jazz',...);

Then just compute its nearest:

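$categories = array();
foreach ($genres as $v) {
    $similarity = 0;
    $k = 0; // falls back to the first seed when nothing is similar
    foreach ($seeds as $kk => $vv) {
        // similar_text() returns the number of matching characters
        $current = similar_text($v, $vv);
        if ($current > $similarity) {
            $similarity = $current;
            $k = $kk;
        }
    }
    // key by seed name so the result matches the output below
    $categories[$seeds[$k]][] = $v;
}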

At this point your $genres are labeled in $categories:

Array
(
    [blues] => Array
        (
            [0] => soul
            [1] => blues
            [2] => jazz blues
            [3] => rock
            [4] => indie
            [5] => rock-blues
        )

    [jazz] => Array
        (
            [0] => soul jazz
            [1] => cool jazz
        )

)

Tested code at codepad: http://codepad.org/HCPcO4Iy

PS. Clearly, if you have only those two seeds (blues and jazz) and you have to cluster the genre "jazz blues", it will be assigned to one or the other without any real logic behind the choice.
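
To make that caveat concrete, here is a minimal sketch that prints the similar_text scores of 'jazz blues' against each seed. It assumes only the two seeds named above; the $scores array and its 'chars' and 'percent' keys are illustrative names:

```php
<?php
$seeds = array('blues', 'jazz');
$genre = 'jazz blues';

$scores = array();
foreach ($seeds as $seed) {
    // similar_text() returns the number of matching characters and
    // writes a length-normalized percentage into its third argument.
    $chars = similar_text($genre, $seed, $percent);
    $scores[$seed] = array('chars' => $chars, 'percent' => round($percent, 2));
}
print_r($scores);
```

Both seeds occur verbatim in the genre, yet 'blues' scores 5 matching characters against 4 for 'jazz', so the nearest-seed loop files 'jazz blues' under blues only.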

4 Comments

The solution is based on the $similarity level, so "clues" may get included in the blues category, and it will be much dearer to this function than "jazz blues".
@Stoic: That is actually something I want. Testing this out now
@stoic: if that's the problem, you only need to use another function instead of similar_text to get the similarity that you want more...
@link: removed the downvote since OP states, he wants the same. probably, I misinterpreted the question. sorry, about that, and upvoted for reminding me about similar_text :)
