According to the accepted answer to this question, the following is the syntax for removing the last instance of a certain character from a string (in this case, I want to remove the last &):

function remove(string) {
  string = string.replace(/&([^&]*)$/, '$1');
  return string;
}
console.log(remove("height=74&width=12&"));

But I'm trying to fully understand why it works.

According to regex101.com,

/&([^&]*)$/

& matches the character & literally (case sensitive)

1st Capturing Group ([^&]*) Match a single character not present in the list below [^&]*

* Quantifier — Matches between zero and unlimited times, as many times as possible, giving back as needed (greedy)

& matches the character & literally (case sensitive)

$ asserts position at the end of the string, or before the line terminator right at the end of the string (if any)

So if we're matching the character & literally with the first &, then why are we also "matching a single character not present in the following list"?

Seems counterproductive.

And then, "$ asserts position at the end of the string" - what does this mean? That it starts searching for matches from the back of the string first?

And finally, what is the $1 doing in the replaceValue? Why is it $1 instead of an empty string? ""

  • In human language, your pattern says to match &, then match and capture what follows that &, until the end of the string. The way the pattern is written, the & must be the last one in the input. Then, it replaces everything with just the capture group, effectively deleting the last &. Commented Aug 12, 2019 at 14:14
  • So, for my particular string - "height=74&width=12&" - it matches the first &, and then matches and captures all of this: width=12&? And then it removes the & from that capture group, since & is what's declared outside of the capture group, initially? Commented Aug 12, 2019 at 14:31
  • No, for the example input from your comment above, only the final & would be removed, see the demo. Commented Aug 12, 2019 at 14:33
  • Because the ([^&]*) captures all characters except for &? Why is only the final & removed? Commented Aug 12, 2019 at 14:43

1 Answer

I think the solution to that problem is different from the one you want:

That regex will replace the last "&" no matter where it is, in the middle or at the end of the string.

If you apply this regex to these two examples, you will see that the first one gets "incorrectly" replaced:

height=74&width=12&test=1
height=74&width=12&test=1&

They become:

height=74&width=12test=1
height=74&width=12&test=1
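The two results above can be reproduced with a quick check. This is only a minimal sketch; the helper name removeLastAmp is made up for illustration and simply wraps the regex from the question.

```javascript
// Hypothetical helper (name chosen for illustration) wrapping the question's regex.
function removeLastAmp(str) {
  // Match the last "&" plus the "&"-free tail after it,
  // then put back only the captured tail ($1).
  return str.replace(/&([^&]*)$/, '$1');
}

// The last "&" sits in the middle, so it is removed from the middle.
console.log(removeLastAmp('height=74&width=12&test=1'));  // height=74&width=12test=1

// The last "&" is the trailing one, so only that one is removed.
console.log(removeLastAmp('height=74&width=12&test=1&')); // height=74&width=12&test=1
```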

So to remove an "&" only when it is the last character of the string, all you need is:

string.replace(/&$/, '');
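As a rough illustration of the difference, here is the simpler pattern run against the same two inputs. The helper name removeTrailingAmp is an assumption made for this sketch, not something from the question.

```javascript
// Hypothetical helper: removes "&" only if it is the very last character.
function removeTrailingAmp(str) {
  return str.replace(/&$/, '');
}

// No trailing "&", so the string comes back unchanged.
console.log(removeTrailingAmp('height=74&width=12&test=1'));  // height=74&width=12&test=1

// Trailing "&" is stripped; the inner ones are left alone.
console.log(removeTrailingAmp('height=74&width=12&test=1&')); // height=74&width=12&test=1
```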

Now, if you want to replace the last occurrence of "&" no matter where it is, here is how that regex works:

$1 refers to the first capturing group: whatever the parentheses in ([^&]*) match is stored in $1. This is an oversimplification.

&([^&]*)$

& matches a literal "&". The capturing group that follows then matches any number of characters (zero or more) that are not "&" (explained later), up to the end of the string, or the end of a line if you use the m flag (/m). Whatever this group captures is available as $1 in the replacement.

So, if you walk through this logic, you will see that it always matches the last & together with the text to its right (which contains no "&"), and replaces the whole match with just that text.

&(<nothing-like-a-&>*)<until-we-reach-the-end> is replaced by whatever was found inside (<nothing-like-a-&>*), which is $1. Because * means zero or more times, the capturing group $1 will sometimes be empty.
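To make the role of $1 concrete, it can help to look at what the pattern actually matches and captures. The snippet below is a small sketch using String.prototype.match; the variable names are chosen only for this example.

```javascript
const pattern = /&([^&]*)$/;

// Last "&" in the middle: the group captures the tail after it.
const m1 = 'height=74&width=12&test=1'.match(pattern);
console.log(m1[0]); // &test=1   (the whole match)
console.log(m1[1]); // test=1    (capture group 1, i.e. $1)

// Last "&" at the very end: the group matches zero characters.
const m2 = 'height=74&width=12&'.match(pattern);
console.log(m2[0]); // &
console.log(m2[1] === ''); // true (the capture is an empty string)

// Replacing with '' drops the whole match, tail included;
// replacing with '$1' puts the tail back and drops only the "&".
const withEmpty = 'height=74&width=12&test=1'.replace(pattern, '');
const withGroup = 'height=74&width=12&test=1'.replace(pattern, '$1');
console.log(withEmpty); // height=74&width=12
console.log(withGroup); // height=74&width=12test=1
```

This is why the replacement is '$1' rather than an empty string: the match covers more than the & itself, so the captured tail has to be written back.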


NOT EQUAL TO part: the regex uses [^]. In simple terms, [] represents a set of individual characters. For example, [ab] and [ba] mean the same thing: match a single "a" or "b". Inside the brackets you can also use ranges such as 0 to 9, as in [0-9ba], which matches any single character from 0 to 9, "a", or "b".

The "^" in [^] negates the set, so it matches any character that is not in the set; [^0-9], for example, matches any character that is not a digit. In your regex, [^&] is used to match any character that is not a "&".
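A few quick checks of character sets and negated sets, as a sketch; the sample strings are invented for this example.

```javascript
// [ab] and [ba] are the same set: one character, either "a" or "b".
const setMatchesB = /[ab]/.test('b');   // true
const setMatchesC = /[ba]/.test('c');   // false

// Ranges can be mixed with single characters inside the brackets.
const rangeMatches7 = /[0-9ba]/.test('7'); // true

// [^0-9] matches any character that is not a digit,
// so removing all such characters leaves only the digits.
const digitsOnly = 'a1b2c3'.replace(/[^0-9]/g, '');

// [^&]* grabs as many non-"&" characters as it can, then stops at "&".
const beforeAmp = 'x=1&y=2'.match(/[^&]*/)[0];

console.log(setMatchesB, setMatchesC, rangeMatches7); // true false true
console.log(digitsOnly); // 123
console.log(beforeAmp);  // x=1
```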


3 Comments

Thank you for your explanation! So essentially, doing string.replace(/&$/, "") is like saying, replace the last character of the string with an empty string only if the last character is &.
Yes, the $ represents a position: in regex it is the end of the string or line. So &$ looks for an "&" that is immediately followed by the end of the string. You can also use ^, which represents the start of the string or line, so ^& matches an & at the start of the string.
I will add more to the answer; I see I did not explain the [^&]* part well.
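The anchors discussed in these comments can be tried out as follows. This is a sketch in JavaScript with made-up sample strings; it also shows the effect of the m flag on $.

```javascript
// ^& only matches an "&" at the very start of the string.
const leadingRemoved = '&a=1&'.replace(/^&/, '');

// &$ only matches an "&" at the very end of the string.
const trailingRemoved = '&a=1&'.replace(/&$/, '');

// Without the m flag, $ in JavaScript matches only at the end of the whole input.
const lastLineOnly = 'a&\nb&'.replace(/&$/, '');

// With the m flag (plus g to replace every match), $ also matches before each line break.
const everyLine = 'a&\nb&'.replace(/&$/gm, '');

console.log(leadingRemoved);               // a=1&
console.log(trailingRemoved);              // &a=1
console.log(JSON.stringify(lastLineOnly)); // "a&\nb"
console.log(JSON.stringify(everyLine));    // "a\nb"
```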
