
I have the following inputs, from which I want to extract the unit
(expected output, e.g.: g, l, kg, ml, l) and the quantity if present (20 in the last input).

  1. 0,5g
  2. 500l
  3. 1000kg
  4. 20,5ml
  5. 20x0,50l (1 l = 1,70 €) zzgl. 3,10€ Pfand

In the simple case, I am doing the following:

Input: 500g

console.log("500g".replace(/ *\([^)]*\) */g, "") // remove brackets
      .replace(/[0-9]/g, "") // remove number eg. 500
      .replace(/\s/g, ""))
  

output: g ( works )

Input: 0,5g

console.log("0,5g".replace(/ *\([^)]*\) */g, "") // remove brackets
  .replace(/[0-9]/g, "") // remove number eg. 500g
  .replace(/\s/g, ""))
    

output: ,g ( breaks )

Input: 20x0,50l (1 l = 1,70 €) zzgl. 3,10€ Pfand

  console.log("20x0,50l (1 l = 1,70 €) zzgl. 3,10€ Pfand".replace(/ *\([^)]*\) */g, "") // remove brackets
      .replace(/[0-9]/g, "") // remove number eg. 500g
      .replace(/\s/g, ""))

output: x,lzzgl.,€Pfand ( breaks )

  • Did any of the posted answers work out? Commented Jan 18, 2021 at 16:17

3 Answers


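Instead of using replace, you might want to use match, which returns an array containing the full match followed by the capture groups (or null if nothing matched).

See the Mozilla (MDN) article on String.prototype.match.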


Regex Patterns:

For matching values and units:

([\d,\.]+)\s*(g|kg|l|ml)

More units can be added in the last group.

Example:

"20x0,50l (1 l = 1,70 €) zzgl. 3,10€ Pfand".match(/([\d,\.]+)\s*(g|kg|l|ml)/)

returns
(3) ...:
    0: "0,50l" // full match
    1: "0,50" // value
    2: "l" // unit
    ...

For matching just units (though this is kind of unnecessary, the previous regex matches values in group 1, units in group 2 at the same time):

(?<=[\d,\.]+)\s*(g|kg|l|ml)
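As a rough illustration of the units-only pattern, here is a small sketch that runs it over the inputs from the question. It assumes an engine with lookbehind support (ES2018 or later); the names `unitOnly` and `sampleInputs` are mine and only serve the example.

```javascript
// Units-only pattern: the unit must be preceded by digits, commas or periods.
const unitOnly = /(?<=[\d,.]+)\s*(g|kg|l|ml)/;

const sampleInputs = [
  "0,5g",
  "500l",
  "1000kg",
  "20,5ml",
  "20x0,50l (1 l = 1,70 €) zzgl. 3,10€ Pfand",
];

// Take capture group 1 (the unit), or null when nothing matched.
const foundUnits = sampleInputs.map((s) => {
  const m = s.match(unitOnly);
  return m ? m[1] : null;
});

console.log(foundUnits); // [ 'g', 'l', 'kg', 'ml', 'l' ]
```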

For matching quantities:

([\d,\.]+)(?:x|\*)

Example:

"20x0,50l (1 l = 1,70 €) zzgl. 3,10€ Pfand".match(/([\d,\.]+)(?:x|\*)/)

returns
(2) ...:
    0: "20x" // full match
    1: "20" // quantity
    ...
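The two patterns can also be wrapped in one small helper that returns the unit and the quantity together. This is only a sketch: the function name `extractUnitAndQuantity` and the shape of the returned object are assumptions made for the example.

```javascript
const VALUE_UNIT = /([\d,.]+)\s*(g|kg|l|ml)/; // value in group 1, unit in group 2
const QUANTITY = /([\d,.]+)(?:x|\*)/;         // quantity in group 1

// Returns { unit, quantity }; a field is null when its pattern does not match.
const extractUnitAndQuantity = (input) => {
  const valueUnit = input.match(VALUE_UNIT);
  const quantity = input.match(QUANTITY);
  return {
    unit: valueUnit ? valueUnit[2] : null,
    quantity: quantity ? quantity[1] : null,
  };
};

console.log(extractUnitAndQuantity("20x0,50l (1 l = 1,70 €) zzgl. 3,10€ Pfand"));
// { unit: 'l', quantity: '20' }
console.log(extractUnitAndQuantity("0,5g"));
// { unit: 'g', quantity: null }
```

Both fields are returned as strings, exactly as they appear in the input.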

EDIT: to further elaborate on my comment

var units = ["g", "kg", "l", "ml"];
var re = new RegExp(`([\\d,\\.]+)\\s*(${units.join("|")})`);

Then using re for matching:

"20x0,50l (1 l = 1,70 €) zzgl. 3,10€ Pfand".match(re)

Works the same, but is more maintainable.
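Two things are worth guarding against when the pattern is built from an array. A unit containing a regex metacharacter would change the meaning of the pattern, and because the first matching alternative wins, a short unit listed before a longer one with the same prefix (a hypothetical "m" before "ml") would shadow it. A minimal sketch that handles both; `escapeRegExp` and `buildUnitRegex` are hypothetical helper names, not part of the code above.

```javascript
// Escape characters that have a special meaning inside a regular expression.
const escapeRegExp = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

// Build the value/unit pattern from a list of units.
// Longer units are placed first so that "ml" is tried before "m".
const buildUnitRegex = (unitList) => {
  const alternatives = [...unitList]
    .sort((a, b) => b.length - a.length)
    .map(escapeRegExp)
    .join("|");
  return new RegExp(`([\\d,.]+)\\s*(${alternatives})`);
};

const unitRe = buildUnitRegex(["g", "kg", "l", "m", "ml"]);
const sample = "20,5ml".match(unitRe);
console.log(sample[1], sample[2]); // 20,5 ml
```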



6 Comments

Super thanks for a new solution. I want to avoid doing something like this g|kg|l|ml since I am not aware of the units which will come as input beforehand
Well, you could just define all the units in an array and define the regex dynamically, inserting | between every unit, or in that case element in the array, converting it to a string. Regex is, in general, something hard to maintain. If you can avoid using Regex, you should; it's more of a last resort, or should be, anyway.
@SaurabhKumar You need to group the units somehow, so if that is not maintainable, use an array of units, e.g. ["l", "kg", ...]
@TARN4T1ON I think there is an unnecessary backslash before the period (\.) in /([\d,\.]+)(?:x|\*)/. Please correct me if I am wrong. Thanks in advance.
@SaurabhKumar You're right, but it doesn't affect the functionality in any way. To each their own, but I always go the safe route; \. is guaranteed to not be a wildcard. I won't say anything about improving readability, though; we're talking about regex, after all.

You could use 2 capturing groups to extract the unit and the optional quantity (such as 20):

(?:(\d+)x\d+(?:,\d+)?)?\d+(k?g|g|m?l)

Capture group 1 optionally matches the quantity like 20 if present, capture group 2 matches the unit.

Explanation

  • (?: Non capture group to make the whole part optional
    • (\d+)x Capture group 1, match 1+ digits followed by x
    • \d+(?:,\d+)? Match 1+ digits and an optional decimal part
  • )? Close group and make it optional
  • \d+ Match 1+ digits
  • (k?g|g|m?l) Capture group 2, match any of the listed alternatives


const regex = /(?:(\d+)x\d+(?:,\d+)?)?\d+(k?g|g|m?l)/g;
[
  "0,5g",
  "500l",
  "1000kg",
  "20,5ml",
  "20x0,50l (1 l = 1,70 €) zzgl. 3,10€ Pfand",
].forEach(s => console.log(Array.from(s.matchAll(regex), m => [m[1] ? m[1] : "", m[2]])));
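For readability, the same idea can be written with named capture groups. This is a sketch, assuming an environment that supports named groups (ES2018 or later); the group names `count` and `unit` and the function `parseUnit` are labels chosen for this example. The separate `g` alternative is left out here because `k?g` already covers it.

```javascript
// Same structure as the pattern above, with named groups.
const namedRegex = /(?:(?<count>\d+)x\d+(?:,\d+)?)?\d+(?<unit>k?g|m?l)/;

// Returns { count, unit }, or null when the input contains no unit.
// count is null when there is no "20x"-style prefix.
const parseUnit = (s) => {
  const match = namedRegex.exec(s);
  if (!match) return null;
  const { count, unit } = match.groups;
  return { count: count === undefined ? null : Number(count), unit };
};

console.log(parseUnit("20x0,50l (1 l = 1,70 €) zzgl. 3,10€ Pfand")); // { count: 20, unit: 'l' }
console.log(parseUnit("0,5g")); // { count: null, unit: 'g' }
```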


Something like this might do, extracting units, quantities, and doses:

const convert = (input) => {
  const match = /(?:(\d+)x)?\s*([\d\,.]+)([a-z]+)/i .exec (input)
  return match
    ? {
        quantity: Number(match [2].replace(/\,/g, '.')), 
        unit: match [3], 
        ...(match [1] ? {doses: Number(match [1].replace(/,/g, '.'))} : {})
      }
    : {}
}

const inputs = ['500g', '0,5g', '500l', '1000kg', '20,5ml', 
                '20x0,50l (1 l = 1,70 €) zzgl. 3,10€ Pfand']

inputs .forEach (
  input => console .log (`"${input}" --> ${JSON.stringify(convert(input))}`)
)

Note that the number handling is naïve. It just looks for any combination of digits, commas and periods, which means that it might also accept '12,345,6.7.8'. I'm sure we could fix that up if it was an issue for your data.
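One possible way to tighten the number handling is sketched below. It assumes the data uses at most one decimal separator (comma or period) and no thousands separators, and it relies on a lookbehind (ES2018 or later) so that the match cannot start in the middle of a longer run of digits and separators. The name `strictConvert` is hypothetical.

```javascript
const strictConvert = (input) => {
  // (?<![\d.,])       the match may not start inside a number-like run
  // (?:(\d+)x)?       optional doses prefix such as "20x"
  // (\d+(?:[.,]\d+)?) digits with at most one decimal separator
  // ([a-z]+)          the unit
  const match = /(?<![\d.,])(?:(\d+)x)?\s*(\d+(?:[.,]\d+)?)([a-z]+)/i.exec(input);
  if (!match) return {};
  return {
    quantity: Number(match[2].replace(",", ".")),
    unit: match[3],
    ...(match[1] ? { doses: Number(match[1]) } : {}),
  };
};

console.log(strictConvert("20x0,50l (1 l = 1,70 €) zzgl. 3,10€ Pfand"));
// { quantity: 0.5, unit: 'l', doses: 20 }
console.log(strictConvert("12,345,6.7.8kg"));
// {}
```

Whether rejecting such inputs outright is the right behavior depends on the data; this version simply refuses anything that does not look like a single plain number.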

Our regex looks like this:

   /(?:(\d+)x)?\s*([\d\,.]+)([a-z]+)/i
//   \_______/ \_/ \______/ \_____/\_/
//       |      |     |        |    +--- Case insensitive, accepts 'KG' as well as 'kg'
//       |      |     |        +-------- Capturing group for units, composed of letters.
//       |      |     +----------------- Capturing group for quantity, composed of
//       |      |                        digits, commas, and periods.
//       |      +----------------------- Optional space after dosage
//       +------------------------------ Optional non-capturing group with 
//                                         - a capturing group of digits
//                                         - the literal character `x`

Note, an earlier version, which didn't deal with doses and didn't convert quantities and doses to numbers, was simpler:

const convert = (input) => {
  const match = /([\d\,.]+)([a-z]+)(?![a-z0-9])/i .exec (input)
  return match
    ? {qty: match [1], unit: match [2]}
    : {}
}

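And of course, if we simply want the units, we could do something like this (falling back to an empty array so that an input without a match yields undefined instead of throwing):

const convert2 = (input) => (/(?:(?:\d+)x)?\s*(?:[\d\,.]+)([a-z]+)/i .exec (input) || []) [1]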

const inputs = ['500g', '0,5g', '500l', '1000kg', '20,5ml', 
                '20x0,50l (1 l = 1,70 €) zzgl. 3,10€ Pfand']

inputs .forEach (
  input => console .log (`"${input}" --> ${JSON.stringify(convert2(input))}`)
)
