0

I'm trying to use regular expressions in JS, but I'm probably missing something because I can't get them to work. I have a couple of regular expressions that work fine in PHP (used with preg_match), but when I use exactly the same expressions in JS, I get no matches.

Here is an example. I'm trying to parse this page:

https://www.coolblue.be/fr/rechercher?query=GTX+1060&trier=prix-les-moins-chers

My code:

var pattern = '/<a class=\"product__title js-product-title\" href=\"(.*)\" data-trackclickevent=\"(.*)\">[\n\r\s]+(.*)[\n\r\s]+<\/a>/gi';
var found = content.match(pattern);

The variable content contains the full source code of the page. I dumped it to the console to make sure it was populated, and I see, for example (the markup is messy, but I took it from the page mentioned above without changing anything):

<div class="product__titles"><div class="js-product-feature-title"></div><a class="product__title js-product-title" href="/fr/produit/654109" data-trackclickevent="Internal Search, Product, Oehlbach BTX 1000 (654109) - Product title">
                Oehlbach BTX 1000
            </a></div><div class="product__review-rating"><div class="review-rating alt-compact"><div class="review-rating--rating">

When I test my regular expression on https://regex101.com/, it also works, but somehow in JS it doesn't.

Any idea what I'm missing?

Thanks,

Laurent

4
  • 3
    This is called parsing. Don't use Regular Expressions for parsing HTML documents. Use a DOM parser instead. Commented Feb 17, 2018 at 15:35
  • 2
    Don't use Regular Expressions for parsing HTML documents. Commented Feb 17, 2018 at 15:39
  • 1
    Even though the regex is a killer, it works when switching from PHP to JavaScript in regex101. Commented Feb 17, 2018 at 15:40
  • Thanks for pointing out the obvious :) I'll give it a try with DOM parsing. Commented Feb 17, 2018 at 16:53

2 Answers

1

Ok, here is how I solved it.

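// Parse the HTML string into a detached element so it can be queried like a DOM.
var el = document.createElement('html');
el.innerHTML = content;
// Collect every product title link by its class name.
var all_links = el.getElementsByClassName("product__title");

Here, content must hold the HTML source of the page as a string.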
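
To get from the collected elements to the values the question was after, something like the following might work. It is only a sketch: extractProductLinks is a hypothetical helper name, not part of the original solution, and it assumes each matched element is an anchor with an href attribute, as in the sample markup from the question.

```javascript
// Hypothetical helper (not from the original answer): reads the href and the
// trimmed text of every element with the given class found under `root`.
// `root` can be any object exposing getElementsByClassName, e.g. an Element
// or a Document.
function extractProductLinks(root, className = "product__title") {
  return Array.from(root.getElementsByClassName(className), (a) => ({
    href: a.getAttribute("href"),   // e.g. "/fr/produit/654109"
    title: a.textContent.trim(),    // strips the newlines and indentation around the name
  }));
}
```

In a browser, calling extractProductLinks(el) with the el element built above should return one { href, title } object per product link.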

1

In JavaScript, regular expressions are used mostly with two methods: test and replace. test just tells you whether its argument matches the regular expression, whereas replace takes a second parameter: the string that replaces the matched text. Like most string methods, replace returns a new string; it does not change the input. For example:

document.write(/cats/i.test("Cats are fun. I like cats.")); // writes "true"

And replace:

document.write("Cats are fun. I like cats.".replace(/cats/gi, "dogs")); // writes "dogs are fun. I like dogs."

Also, in JavaScript a literal closing bracket "]" has to be escaped inside a character class, as below:

\[([^\]\s]+).([^\]]+)\]
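
As for why the pattern from the question finds nothing: it is written as a quoted string, and when match receives a string it builds the expression with new RegExp(string), so the surrounding slashes and the trailing gi become literal text to match. The sketch below contrasts that with a regex literal. The sample markup is copied from the question, while the non-greedy groups and the use of matchAll are my own assumptions about what was intended. A pattern like this remains fragile if the attribute order or class names change.

```javascript
// Sample markup copied from the question; `content` stands in for the full page source.
const content = `<div class="product__titles"><div class="js-product-feature-title"></div><a class="product__title js-product-title" href="/fr/produit/654109" data-trackclickevent="Internal Search, Product, Oehlbach BTX 1000 (654109) - Product title">
                Oehlbach BTX 1000
            </a></div>`;

// The question's pattern, quoted as a string. match() turns it into
// new RegExp(string), so "/" and "/gi" are searched for literally, and "\s"
// inside the quotes collapses to a plain "s".
const asString = '/<a class=\"product__title js-product-title\" href=\"(.*)\" data-trackclickevent=\"(.*)\">[\n\r\s]+(.*)[\n\r\s]+<\/a>/gi';
const stringResult = content.match(asString); // null: the page never contains "/<a ... /gi"

// The same idea as a regex literal (no quotes). Non-greedy groups (.*?) keep
// each capture inside its own attribute; \s already covers \n and \r.
const pattern = /<a class="product__title js-product-title" href="(.*?)" data-trackclickevent="(.*?)">\s+(.*?)\s+<\/a>/gi;

// matchAll yields one match per product, with the capture groups preserved.
const products = Array.from(content.matchAll(pattern), (m) => ({
  href: m[1],
  trackEvent: m[2],
  title: m[3],
}));

console.log(stringResult);
console.log(products);
```

Note that match with the g flag returns only the full matched substrings, which is why the sketch uses matchAll (or, equivalently, a loop over pattern.exec) to reach the captured groups.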
