I am developing a plugin for my WordPress site. I want to select all non-empty paragraph elements.

Here is my code:



function my_php_custom_function($content){

 // Create a new DOMDocument instance
 $dom = new DOMDocument();

 // Load the HTML content into the DOMDocument
 $dom->loadHTML(mb_convert_encoding($content, 'HTML-ENTITIES', 'UTF-8'), LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

 // Create a DOMXPath object to query the DOM
 $xpath = new DOMXPath($dom);

 // Find all non-empty p elements in the content
 $p_elements = $xpath->query('//p[string-length(normalize-space()) > 0]');

 // A the_content filter must return the content
 return $content;
}


add_filter('the_content', 'my_php_custom_function');

The $p_elements variable also contains the paragraphs that I created just by pressing Enter. When I inspect the DOM, they show up as <p>&nbsp;</p>.

  • &nbsp; is not the result of a carriage return, and &nbsp; correctly evaluates as non-empty. Where does the content of your $content variable come from? Commented Mar 22, 2024 at 7:26
  • I have updated my code, @Repox. I don't want to target paragraphs that contain only `&nbsp;`. Commented Mar 22, 2024 at 8:08
  • So just check if the content is &nbsp; Commented Mar 22, 2024 at 8:14

1 Answer

You're likely using some sort of WYSIWYG editor for your content, which in some cases produces elements containing only &nbsp;.

To get non-empty P elements while ignoring P elements that contain only &nbsp;, your XPath could look like the following:

//p[normalize-space() and not(normalize-space(.) = '&nbsp;')]

Updated answer:

Apparently, the &nbsp; is stored in the DOMDocument as the UTF-8 byte sequence c2a0 (as shown by bin2hex()). Using this knowledge, we can pass it to the query as hexadecimal escapes instead (\xC2\xA0).
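A minimal sketch of how that byte sequence can be checked. The one-paragraph input string is made up for illustration and is not the asker's actual content:

```php
<?php
// Parse a paragraph that contains nothing but the &nbsp; entity.
$dom = new DOMDocument();
$dom->loadHTML('<p>&nbsp;</p>', LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

// textContent returns the decoded character, not the entity text.
$text = $dom->getElementsByTagName('p')->item(0)->textContent;

// Dump the raw bytes as hex; U+00A0 encoded as UTF-8 is c2a0.
$hex = bin2hex($text);
echo $hex, PHP_EOL;
```

So by the time XPath sees the node, there is no "&nbsp;" string left to compare against, only the decoded character.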

Your query would then look something like the following:

$p_elements = $xpath->query('//p[normalize-space() and not(normalize-space(.) = "'."\xC2\xA0".'")]');

While not pretty (due to all the escaping), it works in my small tests.
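Note that XPath knows nothing about HTML entities, so a literal "&nbsp;" inside the expression is compared as six ordinary characters and never matches. The equality check above also only covers a paragraph holding exactly one non-breaking space. A possible variation, as a sketch rather than a tested drop-in, is to map U+00A0 to a regular space with translate() before normalize-space() runs. The helper name query_non_blank_paragraphs and the demo HTML are hypothetical, and the demo wraps the paragraphs in a single div so the parsed structure is unambiguous:

```php
<?php
// Hypothetical helper (not from the original answer): selects every <p>
// whose text is more than whitespace and/or non-breaking spaces.
function query_non_blank_paragraphs(DOMXPath $xpath)
{
    // U+00A0 as UTF-8 bytes, matching how DOMDocument stores text.
    $nbsp = "\xC2\xA0";

    // translate() turns each non-breaking space into a regular space, so
    // normalize-space() can strip it like any other whitespace. An empty
    // result is falsy in an XPath predicate, which drops the paragraph.
    $expr = '//p[normalize-space(translate(., "' . $nbsp . '", " "))]';

    return $xpath->query($expr);
}

// Demo input (ASCII only, so no encoding conversion is needed here).
$html = '<div><p>Hello</p><p>&nbsp;</p><p> &nbsp; &nbsp;</p><p></p><p>A&nbsp;B</p></div>';

$dom = new DOMDocument();
$dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

// Collect the text of the paragraphs that survive the filter.
$texts = [];
foreach (query_non_blank_paragraphs(new DOMXPath($dom)) as $p) {
    $texts[] = $p->textContent;
}
```

In this demo only the "Hello" paragraph and the "A&nbsp;B" paragraph are selected. Non-breaking spaces inside real text are left untouched in the document, because translate() only affects the string used for the test.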

1 Comment

I have tried this, but it doesn't work as expected. I am still getting paragraphs that contain only &nbsp;. Here is my updated code: $p_elements = $xpath->query('//p[normalize-space() and not(normalize-space(.) = "&nbsp;")]');
