I'm reading an RSS feed and outputting it on a page, and I need to take a substring of the <description> tag and store it as a variable (and then convert to a different time format, but I can figure that out myself). Here's a sample of the data I'm working with:

<description>&lt;b&gt;When:&lt;/b&gt; Tuesday, November 03, 2015 - 6:00 PM - 8:00 PM&lt;br&gt;&lt;b&gt;Where:&lt;/b&gt; Adult Literacy Classroom (Lower Level) dedicated in honor of Eleanor Moore&lt;br&gt;&lt;br&gt;Clases de preparaci&#243;n para el GED &#150; grupos de estudio para ayudar con sus habilidades y preparaci&#243;n para obtener su diploma de equivalencia de escuela. Las clases se llevaran a cabo en espa&#241;ol, seg&#250;n la materia (escritura, literatura, estudios sociales, ciencias, matem&#225;ticas y la constituci&#243;n) &lt;br /&gt;&lt;br /&gt;GED preparation classes &#150; Study groups to help build your skills that will prepare you to get your high school equivalency diploma. Classes are taught in Spanish by subject area (writing, literature, social studies, science, math and the constitution)&lt;br /&gt;</description>

I've already got everything within the description tag as a variable. I just need to grab the string Tuesday, November 03, 2015 - 6:00 PM - 8:00 PM, but I can't figure out how to do that. I have a feeling PHP's explode might work, but I'm terrible with regex. I'll keep working on it and post back my progress, but any help would be greatly appreciated.

By the way, I'm using this method to get the data: http://bavotasan.com/2010/display-rss-feed-with-php/


Thanks to @Bomberis123, I was able to do exactly what I needed to. My code may be a little messy, but I figured I'd share it for anyone who needs to do something similar:

<?php
$next_up_at_rss_feed = new DOMDocument();
$next_up_at_rss_feed->load("http://host7.evanced.info/waukegan/evanced/eventsxml.asp?ag=&et=&lib=0&nd=30&feedtitle=Waukegan+Public+Library%3CBR%3ECalendar+of+Programs+%26+Events&dm=rss2&LangType=0");
$next_up_at_posts = array();
foreach ($next_up_at_rss_feed->getElementsByTagName("item") as $node) {
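    $description = $node->getElementsByTagName("description")->item(0)->nodeValue;
    // preg_match() returns the match count, not the text; skip items with no date.
    if (!preg_match("/((\s)([^\<])+)/", $description, $matches, PREG_OFFSET_CAPTURE, 3)) { continue; }
    $date = trim($matches[0][0]); // drop the leading space captured by \s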
    $next_up_at_post = array (
        "title" => $node->getElementsByTagName("title")->item(0)->nodeValue,
        "date" => $date,
        "link" => $node->getElementsByTagName("guid")->item(0)->nodeValue,
    );
    array_push($next_up_at_posts, $next_up_at_post);
}
$next_up_at_limit = min(4, count($next_up_at_posts)); // never read past the end of the array
for ($next_up_at_counter = 0; $next_up_at_counter < $next_up_at_limit; $next_up_at_counter++) {
    // get each value from the array;
    $title = str_replace(" & ", " &amp; ", $next_up_at_posts[$next_up_at_counter]["title"]);
    $link = $next_up_at_posts[$next_up_at_counter]["link"];
    $date_raw = $next_up_at_posts[$next_up_at_counter]["date"];

    // separate out the date so it can be formatted
    $date_array = explode(" - ", $date_raw);

    // set up various formats for date
    $date = $date_array[0];
    $date_time = strtotime($date);
    $date_iso = date("Y-m-d", $date_time);
    $date_pretty = date("F j", $date_time);

    // set up various formats for start time
    $start = $date_array[1];
    $start_time = strtotime($start);
    $start_iso = date("H:i", $start_time);
    $start_pretty = date("g:ia", $start_time);

    // set up various formats for end time
    $end = $date_array[2];
    $end_time = strtotime($end);
    $end_iso = date("H:i", $end_time);
    $end_pretty = date("g:ia", $end_time);

    // display the data
    echo "<article class='mini-article'><header class='mini-article-header'>";
    echo "<h6 class='mini-article-heading'><a href='{$link}' target='_blank'>{$title}</a></h6>";
    echo "<p class='mini-article-sub-heading'><a href='{$link}' target='_blank'><time datetime='{$date_iso}T{$start_iso}-06:00'>{$date_pretty}, {$start_pretty} - {$end_pretty}</time></a></p>";
    echo "</header></article>";
}
?>

3 Answers

Try this regex. You can run it with PHP's preg_match and use the first group: https://regex101.com/r/fI8nU9/1

$subject = "<description>&lt;b&gt;When:&lt;/b&gt; Tuesday, November 03, 2015 - 6:00 PM - 8:00 PM&lt;br&gt;&lt;b&gt;Where:&lt;/b&gt; Adult Literacy Classroom (Lower Level) dedicated in honor of Eleanor Moore&lt;br&gt;&lt;br&gt;Clases de preparaci&#243;n para el GED &#150; grupos de estudio para ayudar con sus habilidades y preparaci&#243;n para obtener su diploma de equivalencia de escuela. Las clases se llevaran a cabo en espa&#241;ol, seg&#250;n la materia (escritura, literatura, estudios sociales, ciencias, matem&#225;ticas y la constituci&#243;n) &lt;br /&gt;&lt;br /&gt;GED preparation classes &#150; Study groups to help build your skills that will prepare you to get your high school equivalency diploma. Classes are taught in Spanish by subject area (writing, literature, social studies, science, math and the constitution)&lt;br /&gt;</description>";
$pattern = '/((\s)([^&])+)/';
preg_match($pattern, $subject, $matches, PREG_OFFSET_CAPTURE, 3);
echo $matches[0][0];

4 Comments

Thanks much, that's perfect :)
Glad I helped, Cheers ;) @Rev
Quick related question, I'm now trying to get just the date as a separate variable, and so far I've got ((\s)([^-)])+), which gets it, but has a trailing space. How can I check for ` -` with regex?
Hmm, I think a better way would be: after you get the main "date - time" string, just split it into an array using "-". It will give you (date, time1, time2 with AM/PM). @Rev
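
For anyone following along, here is a rough sketch of both ideas from these comments: splitting on " - " and trimming, or using a lookahead so the regex stops before the " -" without capturing the trailing space. The sample string and variable names are only for illustration.

```php
<?php
// Sample value as the regex above returns it (note the leading space).
$date_raw = " Tuesday, November 03, 2015 - 6:00 PM - 8:00 PM";

// Option 1: split on the " - " separator and trim each piece.
$parts = array_map('trim', explode(" - ", $date_raw));
list($date, $start, $end) = $parts;

// Option 2: a lazy match followed by a lookahead for " - ",
// so the capture ends right before the separator.
preg_match('/\s(.+?)(?=\s-\s)/', $date_raw, $m);
$date_from_regex = $m[1];

echo $date, "\n";
echo $start, "\n";
echo $end, "\n";
echo $date_from_regex, "\n";
```

The explode version is probably the easier one to maintain, since it yields the start and end times in the same step.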

Hurray, something I can help with and my first StackOverflow answer! Try something like this. It does use regex but just a couple simple pieces of syntax you can pick up.

$data = "<description>&lt;b&gt;When:&lt;/b&gt; Tuesday, November 03, 2015 - 6:00 PM - 8:00 PM&lt;br&gt;&lt;b&gt;Where:&lt;/b&gt; Adult Literacy Classroom (Lower Level) dedicated in honor of Eleanor Moore&lt;br&gt;&lt;br&gt;Clases de preparaci&#243;n para el GED &#150; grupos de estudio para ayudar con sus habilidades y preparaci&#243;n para obtener su diploma de equivalencia de escuela. Las clases se llevaran a cabo en espa&#241;ol, seg&#250;n la materia (escritura, literatura, estudios sociales, ciencias, matem&#225;ticas y la constituci&#243;n) &lt;br /&gt;&lt;br /&gt;GED preparation classes &#150; Study groups to help build your skills that will prepare you to get your high school equivalency diploma. Classes are taught in Spanish by subject area (writing, literature, social studies, science, math and the constitution)&lt;br /&gt;</description>";
$regex = "~<description>&lt;b&gt;When:&lt;/b&gt; (.+?)&lt;br&gt;&lt;b&gt;Where:&lt;/b&gt;~";
preg_match($regex,$data,$match);
echo $match[1];

I tested this and it works.

In this instance, you just set up $regex with what you expect the raw string to look like, with ~ on either end and (.+?) where the part you want to extract is.
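
One caveat if the string comes from DOMDocument's nodeValue, as in the question's code: the XML parser has already decoded the entities, so the text contains literal <b> and <br> tags rather than &lt;b&gt;. A minimal sketch of the same idea for that case, assuming the shortened, decoded sample below ($decoded and $when are illustrative names):

```php
<?php
// Assumed input: the description after entity decoding, e.g. from DOMNode::nodeValue.
$decoded = "<b>When:</b> Tuesday, November 03, 2015 - 6:00 PM - 8:00 PM<br>"
         . "<b>Where:</b> Adult Literacy Classroom (Lower Level)<br>";

// Same approach as above: literal text around a lazy capture group.
$regex = '~<b>When:</b>\s*(.+?)<br>~';

if (preg_match($regex, $decoded, $match)) {
    $when = $match[1];
    echo $when, "\n";
} else {
    $when = null; // no "When:" line found
}
```

Checking the return value of preg_match before reading $match avoids an undefined index notice when an item has no "When:" line.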


I am far from an expert on regexp, but this might be something for the more paranoid programmer:

$s = '<description>&lt;b&gt;When:&lt;/b&gt; Tuesday, November 03, 2015 - 6:00 PM - 8:00 PM&lt;br&gt;&lt;b&gt;Where:&lt;/b&gt; Adult Literacy Classroom (Lower Level) dedicated in honor of Eleanor Moore&lt;br&gt;&lt;br&gt;Clases de preparaci&#243;n para el GED &#150; grupos de estudio para ayudar con sus habilidades y preparaci&#243;n para obtener su diploma de equivalencia de escuela. Las clases se llevaran a cabo en espa&#241;ol, seg&#250;n la materia (escritura, literatura, estudios sociales, ciencias, matem&#225;ticas y la constituci&#243;n) &lt;br /&gt;&lt;br /&gt;GED preparation classes &#150; Study groups to help build your skills that will prepare you to get your high school equivalency diploma. Classes are taught in Spanish by subject area (writing, literature, social studies, science, math and the constitution)&lt;br /&gt;</description>';
$a = array();
$p = '/(Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday),\s'
    .'(January|February|March|April|May|June|July|August|September|October|November|December)\s'
    .'[0-3][0-9],\s[1-2][0-9]{3}\s-\s'    // Day of month and year
    .'[0-2]?[0-9]:[0-5][0-9]\s[AP]M\s-\s' // Start time
    .'[0-2]?[0-9]:[0-5][0-9]\s[AP]M/';    // End time
preg_match( $p, $s, $a, PREG_OFFSET_CAPTURE );
echo $a[0][0];

Tested and working...

This will catch a date formatted as described, somewhere in the text.
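
Since the format is that strict, the matched string could probably be turned into real date objects with DateTime::createFromFormat instead of strtotime. This is only a sketch: the America/Chicago time zone is my assumption (the question hard-codes -06:00), and the variable names are made up for the example.

```php
<?php
$when = "Tuesday, November 03, 2015 - 6:00 PM - 8:00 PM";
list($day, $startRaw, $endRaw) = explode(" - ", $when);

// Assumption: the events are in US Central time.
$tz = new DateTimeZone("America/Chicago");

// l = weekday name, F = month name, d = day, Y = year, g:i A = 12-hour time.
$format = "l, F d, Y g:i A";

$start = DateTime::createFromFormat($format, "$day $startRaw", $tz);
$end   = DateTime::createFromFormat($format, "$day $endRaw", $tz);

// createFromFormat() returns false when the string does not fit the format.
if ($start !== false && $end !== false) {
    echo $start->format("c"), "\n";   // ISO 8601 with UTC offset
    echo $end->format("g:ia"), "\n";
}
```

Using a named time zone means the UTC offset follows daylight saving time automatically, rather than being fixed in the markup.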
