Select a string in regex with ruby

Question

I have to clean a string passed as a parameter by removing all lowercase letters and all special characters except:

+
|
^
space
=>
<=>

So I have this string passed as a parameter:

aA azee + B => C=

and I need to clean this string to get this result:

A + B => C

I tried:

string.gsub(/[^[:upper:][+|^ ]]/, "")

Output: "A  + B  C"

I don't know how to match the multi-character strings => and <=> with a regex in Ruby.

I know that if I add = and > to the character class, as in string.gsub(/[^[:upper:][+|^ =>]]/, ""), the trailing = in my input string will be kept too.

(<?=>)|[^[:upper:]+|^ ] replace with $1?

– ctwheels, Commented Apr 9, 2018 at 15:46
Why does your string contain those extra characters?

– Stefan, Commented Apr 9, 2018 at 15:56

Sweeper · Accepted Answer · 2018-04-09 20:24:10Z

5

You can try an alternative approach: matching everything you want to keep then joining the result.

You can use this regex to match everything you want to keep:

[A-Z\d+| ^]|<?=>

As you can see, this just uses | and [] to list everything you want to keep: uppercase letters, digits, +, |, space, ^, => and <=>.

Example:

"aA azee + B => C=".scan(/[A-Z\d+| ^]|<?=>/).join()

Output:

"A  + B => C"

Note that there are 2 consecutive spaces between "A" and "+". If you don't want that you can call String#squeeze.
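A minimal sketch of the same scan/join approach with the squeeze step applied, assuming only runs of spaces should be collapsed. The names KEEP and clean are illustrative and not part of the original answer. Passing " " to squeeze matters here: with no argument, squeeze collapses every run of repeated characters, so a doubled letter such as "AA" would be shortened as well.

```ruby
# Tokens to keep: uppercase ASCII letters, digits, +, |, space, ^, and the
# multi-character operators => and <=>.
KEEP = /[A-Z\d+| ^]|<?=>/

# Hypothetical helper: collect every kept token, glue the tokens back
# together, then collapse each run of spaces into a single space.
def clean(input)
  input.scan(KEEP).join.squeeze(" ")
end

puts clean("aA azee + B => C=")  # prints: A + B => C
```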

edited Apr 9, 2018 at 20:24

answered Apr 9, 2018 at 15:52

Sweeper


5 Comments

Aleksei Matiushkin Over a year ago

join does not require a default argument to be explicitly passed and squeezing would probably make sense afterwards.

Aleksei Matiushkin Over a year ago

this is nevertheless the best approach AFAICT.

Vincent Cheloudiakoff Over a year ago

Thank you, i think that it's the best approach to solve my problem !

Cary Swoveland Over a year ago

I would think the regex should also include \d.

ctwheels Over a year ago

/[A-Z\d+| ^]|<?=>/ is faster and also includes the ^ character that you forgot :)

ctwheels · Accepted Answer · 2018-04-09 15:58:53Z

1

See regex in use here

(<?=>)|[^[:upper:]+|^ ]

(<?=>) captures <=> or => into capture group 1.
[^[:upper:]+|^ ] matches any character that is not an uppercase letter, +, |, ^ or a space. ([:upper:] behaves like A-Z on ASCII input, but it also covers non-ASCII uppercase letters.)

See code in use here

p "aA azee + B => C=".gsub(/(<?=>)|[^[:upper:]+|^ ]/, '\1')

Result: "A  + B => C" (with two spaces between A and +, because both original spaces are kept)
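To spell out why '\1' works as the replacement, here is a small sketch. The names UNWANTED and strip_unwanted are hypothetical, and the trailing squeeze(" ") is an added assumption to get single spaces. When the first alternative matches, group 1 holds the operator and is written back; when the negated class matches, group 1 is empty, so the character is simply dropped.

```ruby
# Group 1 captures the operators to keep; the negated class matches every
# single character that should be dropped.
UNWANTED = /(<?=>)|[^[:upper:]+|^ ]/

# Hypothetical helper: '\1' re-inserts the operator when group 1 matched and
# expands to an empty string when the negated class matched instead.
def strip_unwanted(input)
  input.gsub(UNWANTED, '\1').squeeze(" ")
end

puts strip_unwanted("aA azee + B => C=")  # prints: A + B => C
puts strip_unwanted("xA <=> yB=")         # prints: A <=> B
```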

answered Apr 9, 2018 at 15:58

ctwheels


1 Comment

Cary Swoveland Over a year ago

I prefer this solution because it explicitly excludes characters, as opposed to including what is inferred to be the strings to be kept. Also, the POSIX expression for uppercase letters has wider applicability than A-Z.

Cary Swoveland · Accepted Answer · 2019-11-02 04:55:33Z

r = /[a-z\s[:punct:]&&[^+ |^]]/

"The cat, 'Boots', had 9+5=4 ^lIVEs^ leF|t.".gsub(r,'')
  #=> "T  B  9+54 ^IVE^ F|"

The regular expression reads, "Match lowercase letters, whitespace and punctuation that are not the characters '+', ' ', '|' and '^'." Within a character class, && is the set intersection operator. Here it intersects the set of characters that match a-z\s[:punct:] with the set that matches [^+ |^]. (Note that this includes whitespace other than spaces.) For more information, search for "character classes also support the && operator" in the Regexp documentation.
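For a smaller illustration of && on its own, here is a sketch that is separate from the regex above. The constant name CONSONANTS is made up for the example. It intersects the lowercase letters with "anything except a vowel", so only lowercase consonants match.

```ruby
# a-z intersected with [^aeiou]: matches lowercase consonants only.
CONSONANTS = /[a-z&&[^aeiou]]/

# The space survives because it is not in a-z to begin with.
puts "hello world".gsub(CONSONANTS, "")  # prints: eo o
```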

I have not included '=>' and '<=>' as those, unlike '+', ' ', '|' and '^', are multi-character strings and therefore require a different approach than simply removing certain characters.
