14

There are some discussions here, and utility functions, for splitting strings, but I need an ad-hoc one-liner for a very simple task.

I have the following string:

local s = "one;two;;four"

I want to split it on ";" and end up with { "one", "two", "", "four" }.

So I tried to do:

local s = "one;two;;four"

local words = {}
for w in s:gmatch("([^;]*)") do table.insert(words, w) end

But the result (the words table) is { "one", "", "two", "", "", "four", "" }. That's certainly not what I want.

Now, as I remarked, there are some discussions here on splitting strings, but they have "lengthy" functions in them and I need something succinct. I need this code for a program where I show the merit of Lua, and if I add a lengthy function to do something so trivial it would go against me.

1
  • [^;]* is perfectly happy matching zero non-semicolon characters, so Lua produces an empty match each time it gets to a delimiter. You can use "[^;]+" instead for a slightly better result, but there are reasons the lua-users.org/wiki/SplitJoin page of the lua-users wiki runs as long as it does when talking about splitting strings. Commented Nov 11, 2013 at 14:18
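
One way to see where the empty strings come from is to ask string.find what the pattern matches from a given start position. The sketch below (variable names are only illustrative) shows that at index 4, where the first ";" sits, "[^;]*" succeeds with an empty match. As far as I understand the Lua 5.1 to 5.3 behavior, gmatch resumes at the end of the previous match, accepts that empty match, and only then steps past the delimiter, which would explain the extra "" after each word.

```lua
local s = "one;two;;four"

-- s:find(pattern, init) returns the start and end index of the first match
-- found at or after position init.
local word_start, word_end = s:find("[^;]*", 1)
print(word_start, word_end)    -- start 1, end 3: the word "one"

-- Position 4 holds the first ";". The pattern still succeeds there,
-- but it consumes nothing, so the end index is one less than the start.
local empty_start, empty_end = s:find("[^;]*", 4)
print(empty_start, empty_end)  -- start 4, end 3: an empty match

-- Starting just after the delimiter finds the next word.
local next_start, next_end = s:find("[^;]*", 5)
print(s:sub(next_start, next_end))  -- two
```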

3 Answers

24
local s = "one;two;;four"
local words = {}
for w in (s .. ";"):gmatch("([^;]*);") do 
    table.insert(words, w) 
end

Appending one extra ; turns the string into "one;two;;four;", so every field, including the empty one, is now followed by a delimiter. The pattern "([^;]*);" then captures each field: the longest possible (greedy) run of zero or more characters other than ;, followed by a ;.

Test:

for n, w in ipairs(words) do
    print(n .. ": " .. w)
end

Output:

1: one
2: two
3:
4: four
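
To check how the same trick behaves at the edges of the input, here is a minimal sketch that wraps the loop in a helper and runs it on a few strings. The names split_semicolons and show are hypothetical and exist only for this illustration; the pattern is the one from the answer.

```lua
-- Hypothetical helper: the loop from the answer, wrapped in a function.
local function split_semicolons(s)
    local words = {}
    -- Append a trailing ";" so every field is terminated by a delimiter.
    for w in (s .. ";"):gmatch("([^;]*);") do
        words[#words + 1] = w
    end
    return words
end

-- Hypothetical helper: print the input, the field count, and the fields.
local function show(s)
    local words = split_semicolons(s)
    print(string.format("%q -> %d fields: [%s]", s, #words, table.concat(words, "|")))
end

show("one;two;;four")  -- 4 fields, the third one empty
show(";a")             -- 2 fields, the first one empty
show("a;")             -- 2 fields, the last one empty
show("")               -- 1 field, which is empty
```

Because the delimiter is part of every match, no match is ever empty, so this loop should not depend on how a given Lua version treats empty matches in gmatch.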

6 Comments

Wow, thanks. Your solution works perfectly! (I won't close this question yet: if somebody could explain to me why my original code returns spurious empty strings I'd be grateful.)
@NiccoloM. Remember that * matches zero or more characters. The empty string at each position I marked with $ in one$;two$;$;four$ is also a match.
But what about one$;$two$;$;$fo$ur$? Why is the zero match only before ; ? Why isn't it also after the ;, and between every two letters?
@NiccoloM. Because * is greedy: it matches as many characters as possible. The non-greedy counterpart that matches zero or more is -.
It's worth noting that Lua patterns are not actually regular expressions. You will notice many differences between 'standard' regexp implementations and how Lua patterns operate.
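
The difference between the greedy * and the non-greedy - mentioned in the comments can be sketched on the string from the question. The variable names are only illustrative.

```lua
local s = "one;two;;four"

-- Greedy: ".*" grabs as much as it can, then backs off just far enough
-- for the final ";" in the pattern to match the last ";" in the string.
local greedy = s:match("^(.*);")
print(greedy)  -- one;two;

-- Non-greedy: ".-" grabs as little as it can, so it stops at the first ";".
local lazy = s:match("^(.-);")
print(lazy)    -- one
```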
0

Just changing * to + works.

local s = "one;two;;four"
local words = {}
for w in s:gmatch("([^;]+)") do 
    table.insert(words, w) 
    print(w)
end

The magic character * matches 0 or more occurrences, so when Lua meets a ';' it treats that position as an empty match, one that contains zero characters from [^;].

Sorry for my carelessness: words[3] should be an empty string, which the + version does not produce. However, when I run the original code from the question in the Lua 5.4 interpreter, everything works.

code here

running result here (I have to put links because of lack of reputation)

2 Comments

This does not give the OP's desired output, they want an empty string at index 3. { "one", "two", "", "four" }
@Nifim Sorry, I didn't read the question carefully. But when I use the Lua 5.4 interpreter, the original code suddenly works!?
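
The question reports seven results for the original loop, while this answer reports the desired four on Lua 5.4, which suggests that gmatch handles empty matches differently across Lua versions. A minimal way to check what your own interpreter does is sketched below. No expected output is given because it depends on the interpreter; the assumption that only the version explains the difference is mine.

```lua
local s = "one;two;;four"
local words = {}

-- The original loop from the question, unchanged apart from the append idiom.
for w in s:gmatch("([^;]*)") do
    words[#words + 1] = w
end

-- _VERSION is a standard global holding the interpreter version string.
print(_VERSION, #words .. " matches")

-- %q quotes each result so that empty strings are visible in the output.
for i, w in ipairs(words) do
    print(i, string.format("%q", w))
end
```

If the count varies between the interpreters you need to support, a pattern that consumes the delimiter as part of each match avoids relying on this behavior.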
-2
function split(str,sep)
    local array = {}
    local reg = string.format("([^%s]+)",sep)
    for mem in string.gmatch(str,reg) do
        table.insert(array, mem)
    end
    return array
end
local s = "one;two;;four"
local array = split(s,";")

for n, w in ipairs(array) do
    print(n .. ": " .. w)
end

result:

1: one
2: two
3: four

3 Comments

Your answer should contain an explanation of your code and a description how it solves the problem.
This doesn't work as OP expected, because the empty string between ;; isn't captured.
I see. Now I understand what you mean.
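
For a general split(str, sep) that keeps empty fields, one possible sketch combines this function's shape with the trailing-delimiter trick from the accepted answer. It assumes sep is a single character; the escaping step and the local names are my own additions and not part of any answer above.

```lua
-- Sketch: split str on a single-character separator, keeping empty fields.
local function split(str, sep)
    -- Escape punctuation so separators such as "." or "%" are taken
    -- literally instead of being read as pattern magic characters.
    local esc = sep:gsub("%p", "%%%0")
    local fields = {}
    -- Append one separator so every field, even the last, is terminated.
    for field in (str .. sep):gmatch("([^" .. esc .. "]*)" .. esc) do
        fields[#fields + 1] = field
    end
    return fields
end

for n, w in ipairs(split("one;two;;four", ";")) do
    print(n .. ": " .. w)
end
```

With the input from the question this should print four lines, the third one holding the empty field between the two adjacent semicolons.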
