5

Is there a way to obtain:

"[][][]".split('[]')
#=> ["", "", ""]

instead of

#=>[]

without having to write a function?

The behavior is surprising here because sometimes irb would respond as expected:

"[]a".split('[]')
#=> ["", "a"]
4
  • 5
    Welcome to Stack Overflow. This really sounds like an XY Problem, where you're asking us how to do Y instead of X. Where does the data/string come from? Do you control its generation? It's very rare to see something like that in code. Please read "How to Ask", including the links at the bottom of the page, and "minimal reproducible example" along with meta.stackexchange.com/q/66377/153968 Commented Feb 15, 2016 at 16:58
  • 6
    I disagree with the idea that this is unclear: the example is clear and minimal, and it makes sense that String#split would work in the way the OP expects. There's also a clear, specific answer (see Jordan's and my answers). Commented Feb 15, 2016 at 17:13
  • Thank you all for replying, even if some of you think the question wasn't "correct". Commented Feb 16, 2016 at 7:47
  • 1
    This is a perfect question. I find it annoying that someone would actively discourage the exact type of question we want to see on SO. Commented May 12, 2020 at 17:23

2 Answers

10

From the docs:

If the limit parameter is omitted, trailing null fields are suppressed. If limit is a positive number, at most that number of fields will be returned (if limit is 1, the entire string is returned as the only entry in an array). If negative, there is no limit to the number of fields returned, and trailing null fields are not suppressed.

And so:

"[][][]".split("[]", -1)
# => ["", "", "", ""]

This yields four empty strings rather than your three, but if you think about it, it's the only result that makes sense. If you split ",,," on each comma, you would expect to get four empty strings as well, since there's one empty item "before" the first comma and one "after" the last.
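
The comma analogy can be checked directly in irb. This is a minimal sketch that assumes nothing beyond Ruby's built-in String#split; the variable names are only for illustration:

```ruby
# Three delimiters separate four (empty) fields when trailing
# empty fields are kept with a negative limit.
comma_fields = ",,,".split(",", -1)
p comma_fields   # => ["", "", "", ""]

# With the limit omitted, every trailing empty field is suppressed,
# which here leaves nothing at all.
comma_default = ",,,".split(",")
p comma_default  # => []

# A leading empty field is kept even without a limit,
# which is why "[]a".split('[]') looked "as expected".
leading_kept = ",a".split(",")
p leading_kept   # => ["", "a"]
```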


2 Comments

Seems we're looking at the same questions today! A+, would upvote again.
Thanks, it was exactly what I was looking for. I don't know why I didn't look in the docs. And actually, getting ["", "", "", ""] is what makes more sense.
3

String#split takes two arguments: a pattern to split on, and a limit to the number of results returned. In this case, limit can help us.

The documentation for String#split says:

If the limit parameter is omitted, trailing null fields are suppressed. If limit is a positive number, at most that number of fields will be returned (if limit is 1, the entire string is returned as the only entry in an array).

The key phrase here is "trailing null fields are suppressed". In other words, if you have extra, empty matches at the end of the string, they'll be dropped from the result unless you have set a limit.

Here's an example:

"[]a[][]".split("[]")
#=> ["", "a"]

You might expect to get ["", "a", "", ""], but because trailing null fields are suppressed, everything after the last non-empty match (the a) is dropped.

We could set a limit, and only get that many results:

"[]a[][]".split("[]", 3)
#=> ["", "a", "[]"]

In this case, since we've asked for 3 results, the final [] is not treated as a delimiter and is left as part of the last result. This is useful when we know how many results we expect, but not so useful in your specific case.
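
To see how a positive limit caps the number of fields, here is a small sketch using the same input string with smaller limits (the variable names are illustrative only):

```ruby
input = "[]a[][]"

# A limit of 1 returns the whole string as the only entry.
one_field = input.split("[]", 1)
p one_field   # => ["[]a[][]"]

# A limit of 2 splits once; the remainder is left unsplit.
two_fields = input.split("[]", 2)
p two_fields  # => ["", "a[][]"]
```

Each time the limit is reached, whatever remains of the string, delimiters included, becomes the final field.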

Fortunately, the docs continue:

If negative, there is no limit to the number of fields returned, and trailing null fields are not suppressed.

In other words, we can pass a limit of -1, and get all the matches, even the trailing empty ones:

"[]a[][]".split('[]', -1)
#=> ["", "a", "", ""]

This even works when all the matches are empty:

"[][][]".split('[]', -1)
#=> ["", "", "", ""]
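
If you really do want exactly three empty strings, as in the question, one possible approach is to drop the last field after splitting. This is only a sketch: it assumes the string always ends with the delimiter, and it is not something String#split offers directly:

```ruby
fields = "[][][]".split("[]", -1)

# Drop the final field, i.e. the empty one "after" the last delimiter.
# Assumption: the input always ends with "[]".
without_last = fields[0...-1]
p without_last  # => ["", "", ""]
```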
