1

I am completely new to regex and need some help with the following problem. I need to replace each < and > with a new line but keep the string between them. So

<'sample text'><'sample text 2'>

becomes

'sample text'
'sample text 2'
  • What is the language you're using? Also, do you need a new line for the first < and the last >? Commented Feb 2, 2017 at 19:04
  • Just replace >< with a newline, no regular expression needed. Commented Feb 2, 2017 at 19:04
  • The general answer about how to keep parts of a string when doing a regular expression replacement of other parts is to use a capture group for the parts you want to keep, and back-references in the replacement. Commented Feb 2, 2017 at 19:05
  • How should the result look for this input: <'sample text'> <'sample text 2'> some text <'sample text 3'> <> ? Commented Feb 2, 2017 at 19:12
  • niitaku: I'm using PowerShell, and the first and last group of <> shouldn't produce new lines. barmar: I will try that and see if it works with the text file I am using. RomanPerekhrest: it should look like 'sample text' new line 'sample text 2' some text new line 'sample text 3' new line "empty line". Commented Feb 2, 2017 at 19:19

3 Answers

2
\<([^>]*)\>

This regex captures the text between < and > in a capture group, which you can then reference in the replacement string, followed by a newline:

\1\n

Check it out here.

EDIT:

In PowerShell

PS C:\Users\shtabriz> $string = "<'sample text'><'sample text 2'>"
PS C:\Users\shtabriz> $regex = "\<([^>]*)\>"
PS C:\Users\shtabriz> [regex]::Replace($string, $regex, '$1'+"`n")
'sample text'
'sample text 2'
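
Note that the replacement above also appends a newline after the last group. If that trailing newline is unwanted, one possible variation (not part of the original answer, and assuming the groups are directly adjacent as in the sample input) is to replace only the >< boundaries and then strip the outer brackets. A minimal sketch:

```powershell
$string = "<'sample text'><'sample text 2'>"

# Turn each "><" boundary into a newline, then remove the leading "<"
# and the trailing ">" so that no extra newline is left at either end.
$result = $string -replace '><', "`n" -replace '^<|>$', ''

$result
```

This relies only on the -replace operator, chained twice; input with text or spaces between the groups would need a different pattern.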

0

This works for me in Textpad:

Example:

String:

" 1) Navigate to record. 2) Navigate to the tab and select. 3) Click the field. 4) Click on the tab and scroll."

Note: For the search/replace below, do NOT include the quotes; I used them to show the presence of a space in the search term.

Search: "[0-9]+) "

Replace: "\n$0"

Resulting String:

1) Navigate to record.
2) Navigate to the tab and select.
3) Click the field.
4) Click on the tab and scroll.
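
For comparison, here is a rough PowerShell counterpart of this search/replace. It assumes the .NET regex flavor, in which a literal ) must be escaped as \) and $0 refers to the whole match; the variable names $text and $result are only illustrative.

```powershell
$text = ' 1) Navigate to record. 2) Navigate to the tab and select. 3) Click the field. 4) Click on the tab and scroll.'

# '\) ' matches a literal ")" followed by a space.
# '$0' re-inserts the whole match after the newline; the substitution string
# is built from an interpolated newline and a verbatim '$0'.
$result = $text -replace '[0-9]+\) ', ("`n" + '$0')

$result
```

The space that precedes each number is not part of the match, so it stays at the end of the previous line.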

0

To complement Shawn Tabrizi's helpful answer with a more PowerShell-idiomatic solution and some background information:

PowerShell surfaces the functionality of the .NET System.Text.RegularExpressions.Regex.Replace() method ([regex]::Replace(), from PowerShell) via its own -replace operator.
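
As a small illustration of that relationship (the sample strings are made up), the sketch below runs the same replacement through both routes and also shows one difference worth knowing: -replace is case-insensitive by default, whereas [regex]::Replace() is case-sensitive unless options are passed.

```powershell
$in = "<'a'><'b'>"

# Both routes use the same .NET regex engine and give the same result here.
$viaMethod   = [regex]::Replace($in, '<(.*?)>', '[$1]')
$viaOperator = $in -replace '<(.*?)>', '[$1]'

# Case handling differs by default:
$insensitive = 'ABC' -replace 'b', 'x'            # operator: matches 'B'
$sensitive   = [regex]::Replace('ABC', 'b', 'x')  # method: no match
$cSensitive  = 'ABC' -creplace 'b', 'x'           # case-sensitive operator variant
```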

The most concise solution (but see below for potential pitfalls):

# Note the escaped "$" ("`$")
"<'sample text'><'sample text 2'>" -replace '<(.*?)>', "`$1`n"

Output:

'sample text'
'sample text 2'
  • $1 is a numbered capture-group substitution, referring to what the 1st (and only) capture group inside the regex ((...)) captured, which are the strings between < and > (.*? is a non-greedy expression that matches any run of characters but stops once the next construct, > in this case, is found).

    • However, inside a double-quoted string ("..."), also known as an expandable string, $1 would be interpreted as a PowerShell variable reference, so the $ character must be escaped in order to be preserved, using the backtick (`), PowerShell's general escape character: "`$1"

    • Conversely, if you want the .NET API not to interpret a $ character in the substitution string, use $$ (either $$ inside '...', or "`$`$" inside "...") - but note that inside the regex operand a verbatim $ must be escaped as \$.

  • "`n" is a PowerShell escape sequence that can be used inside expandable strings (only) - see the conceptual about_Special_Characters help topic.
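
To make the escaping rules above concrete, here is a short sketch; the sample strings and variable names are made up for illustration.

```powershell
# '$$' in the substitution operand produces one literal '$' in the output;
# the '$1' that follows is still treated as capture group 1.
$withDollar = 'cost: 5' -replace '(\d+)', '$$$1'        # -> cost: $5

# In the regex operand, a literal '$' is written as '\$'.
$noDollar = 'cost: $5' -replace '\$(\d+)', '$1 USD'     # -> cost: 5 USD

# Inside "...", the backtick keeps '$1' away from PowerShell's own expansion,
# so the .NET API receives '[$1]' as the substitution string.
$escaped = "<'a'>" -replace '<(.*?)>', "[`$1]"          # -> ['a']
```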

Caveat:

  • While convenient here, there are pitfalls with respect to using expandable strings as the regexes and substitution operands, as it isn't always obvious what PowerShell expands (interpolates) up front, and what the .NET API ends up seeing as a result.

  • Therefore, it is generally preferable to use single-quoted strings ('...', also known as verbatim strings) - both for the substitution operand and the regex itself, and - if needed - use an expression ((...)) to build the overall string, which allows you to separate the verbatim (pass-through) parts from interpolated parts.

This is what Shawn did in his answer; translated to a -replace operation:

# Note the expression used to build the substitution string
# from a verbatim ('...') and an interpolated ("...") part.
"<'sample text'><'sample text 2'>" -replace '<(.*?)>', ('${1}' + "`n")

Another option, using -f, the format operator:

"<'sample text'><'sample text 2'>" -replace '<(.*?)>', ("{0}`n" -f '${1}')

Note the use of ${1} instead of just $1: Enclosing the number / name of the referenced capture group in {...} disambiguates it from the characters that follow, which avoids another pitfall, as the following example shows (incidentally, PowerShell's own variable references can be disambiguated the same way):

# FAILS and results in 'f$142', because the .NET API sees
# '$142' as the substitution string, and there is no 142nd capture group.
$suffix = '42'; 'foo' -replace '(oo)', ('$1' + $suffix)

# OK, with disambiguation via {...} -> 'foo42'
$suffix = '42'; 'foo' -replace '(oo)', ('${1}' + $suffix)
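
The same {...} syntax also covers the two other cases mentioned above: named capture groups and PowerShell's own variable references. A minimal sketch, with a hypothetical group name (tail) chosen only for illustration:

```powershell
$suffix = '42'

# A named capture group is referenced as '${name}' in the substitution string.
$named = 'foo' -replace '(?<tail>oo)', ('${tail}' + $suffix)   # -> foo42

# PowerShell's own variables can be delimited the same way inside "...":
# without the braces, "$suffixpx" would refer to a variable named $suffixpx.
$css = "${suffix}px"                                            # -> 42px
```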
