
I am working on scraping and then parsing an HTML string to get the two URL parameters inside the href. After scraping the element I need, $description, the full string ready for parsing is:

<a target="_blank" href="CoverSheet.aspx?ItemID=18833&amp;MeetingID=773">Description</a><br>

Below I use the explode function to split the $description string on the = delimiter. I then try to explode the result again on the double quote delimiter.

Problem I need to solve: I want to only print the numbers for MeetingID parameter before the double quote, "773".

<?php
echo "Description is: " . htmlentities($description); // prints the string referenced above
$htarray = explode('=', $description); // explode the $description string which includes the link. ... then, find out where the MeetingID is located
echo $htarray[4] .  "<br>"; // this will print the string which includes the meeting ID: "773">Description</a><br>"

$meetingID = $htarray[4];
echo "Meeting ID is " . substr($meetingID,0,3); 
?>

The above echo statement using substr works to print the meeting ID, 773.

However, I want to make this bulletproof: if the MeetingID parameter exceeds 999, we would need 4 characters. That's why I want to split on the double quote, so that it prints all the digits before it.

Below I try to isolate everything before the double quote, but it isn't working correctly yet.

<?php
 $htarray = explode('"', $meetingID); // split the $meetingID string based on the " delimiter
 echo "Meeting ID0 is " . $meetingID[0] ; // this prints just the first number, 7
 echo "Meeting ID1 is " . $meetingID[1] ; // this prints just the second number, 7
 echo "Meeting ID2 is " . $meetingID[2] ; // this prints just the third number, 3

?>

Question, why is the array $meetingID[0] not printing the THREE numbers before the delimiter, ", but rather just printing a single number? If the explode function works properly, shouldn't it be splitting the string referenced above based on the double quotes, into just two elements? The string is

"773">Description</a><br>"

So I can't understand why, when echoing after the explode with the double quote delimiter, it only prints one digit at a time.

  • "why is the array $meetingID[0] not printing the THREE numbers before the delimiter" -- because $meetingID is the string. The exploded array is $htarray. I think you're looking for $htarray[0]? Commented Sep 27, 2020 at 18:55
  • You are right! Thank you, problem solved. Commented Sep 27, 2020 at 19:07
  • If you can write that as an answer, I can mark it as the correct response. Commented Sep 27, 2020 at 19:07
  • You would normally be better off processing HTML with something like DOMDocument, see stackoverflow.com/questions/3577641/…. Commented Sep 27, 2020 at 19:16
  • @NigelRen thanks I am using the PHP Simple Dom Parser but trying to get more advanced once I have the strings. Commented Sep 27, 2020 at 19:19

2 Answers


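You're getting the wrong result because you're indexing the wrong variable. explode() returns a new array, which you stored in $htarray; $meetingID is still the original string, and indexing a string with [n] returns the single character at that offset.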

$htarray = explode('"', $meetingID);

echo "Meeting ID0 is " . $meetingID[0] ; // this prints just the first number, 7
echo "Meeting ID1 is " . $meetingID[1] ; // this prints just the second number, 7
echo "Meeting ID2 is " . $meetingID[2] ; // this prints just the third number, 3

echo "Meeting ID is " . $htarray[0] ; // this prints 773
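To check that splitting on the double quote also handles longer IDs, here is a minimal self-contained sketch. The four-digit value 1042 and the variable name $parts are made up for illustration; the fragment mimics what is left in $meetingID after the first explode on =.

```php
<?php
// Hypothetical fragment with a four-digit ID (not from the question).
$meetingID = '1042">Description</a><br>';

// explode() returns a new array; $meetingID itself stays a string.
$parts = explode('"', $meetingID);

echo $meetingID[0], "\n"; // string offset: one character, "1"
echo $parts[0], "\n";     // first array element: "1042"
```

Because the split happens at the quote rather than at a fixed length, the result does not depend on how many digits the ID has.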

There's an easier way to do this though, using regular expressions:

$description = '<a target="_blank" href="CoverSheet.aspx?ItemID=18833&amp;MeetingID=773">Description</a><br>';

$meetingID = "Not found";
if (preg_match('/MeetingID=([0-9]+)/', $description, $matches)) {
    $meetingID = $matches[1];
}

echo "Meeting ID is " . $meetingID;
// this prints 773 or Not found if $description does not contain a (numeric) MeetingID value
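As one of the comments on the question suggests, the href can also be read with DOMDocument instead of matching against the raw HTML. The following is only a sketch under a few assumptions: the dom extension is available, the link is the first a element in the fragment, and the parameter is spelled MeetingID exactly. The names $doc, $link, $query and $params are illustrative.

```php
<?php
$description = '<a target="_blank" href="CoverSheet.aspx?ItemID=18833&amp;MeetingID=773">Description</a><br>';

// Collect libxml warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($description);
libxml_clear_errors();

$meetingID = null;
$link = $doc->getElementsByTagName('a')->item(0);
if ($link !== null) {
    // getAttribute() returns the decoded value, so &amp; is already &.
    $query = parse_url($link->getAttribute('href'), PHP_URL_QUERY);
    // Split the query string into an associative array.
    parse_str((string) $query, $params);
    $meetingID = $params['MeetingID'] ?? null;
}

echo "Meeting ID is " . ($meetingID ?? "Not found");
```

This approach does not depend on the order of the parameters or on the number of = signs in the markup, at the cost of a little more code than the regular expression.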


There is a very easy way to do it:

Your string:

$str ='<a target="_blank" href="CoverSheet.aspx?ItemID=18833&amp;MeetingID=773">Description</a><br>';

Extract the substring:

$params = substr( $str, strpos( $str, 'ItemID'), strpos( $str, '">') - strpos( $str, 'ItemID') );

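You will get a substring like this (the &amp; entity comes through unchanged from the raw HTML; html_entity_decode() turns it into a plain &):

ItemID=18833&amp;MeetingID=773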

Now do whatever you want to do!
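One possible next step is to turn that substring into an array of parameters. This is a sketch that assumes both the ItemID marker and the closing "> are present in the string (strpos returns false otherwise, which is not handled here). The names $start, $end and $values are illustrative.

```php
<?php
$str = '<a target="_blank" href="CoverSheet.aspx?ItemID=18833&amp;MeetingID=773">Description</a><br>';

$start  = strpos($str, 'ItemID');
$end    = strpos($str, '">', $start); // first "> after the query string starts
$params = substr($str, $start, $end - $start);

// The raw HTML still contains &amp;, so decode it before parsing.
parse_str(html_entity_decode($params), $values);

echo $values['MeetingID'];
```

Here $values holds both ItemID and MeetingID as strings, whatever their length.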
