There's a file I'm trying to use for a list of strings that has the following rules:

  1. Cannot begin or end with an unescaped comma.
  2. A comma is escaped by a preceding comma.
  3. Strings are separated by unescaped commas.
  4. Everything else is absolutely face-value.

I've been fiddling around with some VB.NET code to parse a file like this and split it up into either a String() or a List(Of String), but it's gotten to be a little annoying. It's not that I can't figure this out; it's that I don't want to write crap code. If it's unnecessarily confusing, unnecessarily slow, or anything else like that, it's not good enough.

Now, I know this almost starts to sound a little like a Code Review question, but I'm really starting to think that maybe a good regex would work better than trying to do this programmatically. Unfortunately, regexes are not easy to work with, and while using one to split on a comma may be a trivial matter, getting it to also ignore double commas and such is a bit more of an issue, at least for somebody who's not used to regexes.

How do you do this (properly) in VB.NET? In particular, I'm having a little bit of trouble putting together a wildcard that'll match anything at all but a comma. It's also taking me a little while to find out whether rule #1 has to be verified programmatically, or whether it can be done in the regex itself at the same time as the split operation.

EDIT

I just "woke up" and realized that this syntax is ambiguous, since in an odd-numbered series of three or more commas, you don't know what's escaped and what isn't. I'm just going to accept the current answer and move on.

  • @AvinashRaj I've got a regex reference in front of me; I'm still looking through it and trying to put something together to try. In particular, it's taking me a little while to find something that says "any character at all but this particular one". Commented Oct 13, 2014 at 19:14

1 Answer

I haven't used VB.NET in a long time, but I wouldn't go the regex way.

What about splitting the string by "," ...

Dim parts As String() = s.Split(New Char() {","c})

You will get a list of items. Now you only need to take care of the empty items (which come from escaped commas) and join them with the correct preceding item.

PS: not sure if split gives you empty items in case of ",,"
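
For what it's worth, String.Split does keep empty entries unless StringSplitOptions.RemoveEmptyEntries is passed, so ",," produces an empty item between the two commas. A minimal sketch of the split-and-merge idea follows. The names SplitEscaped and CommaListDemo are made up for illustration, and the code assumes that a run of commas is paired up left to right, which is only one possible reading of the rules.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Text

Module CommaListDemo
    ' Hypothetical helper: splits on single commas and treats ",," as a literal comma.
    ' Assumption: comma runs are paired left to right.
    Function SplitEscaped(s As String) As List(Of String)
        Dim parts As String() = s.Split(","c)   ' empty entries are kept
        Dim result As New List(Of String)()
        Dim current As New StringBuilder(parts(0))
        Dim i As Integer = 1

        ' The comma being examined sits between parts(i - 1) and parts(i).
        While i < parts.Length
            If parts(i).Length = 0 AndAlso i + 1 < parts.Length Then
                ' Another comma follows immediately: an escaped pair.
                current.Append(","c)
                current.Append(parts(i + 1))
                i += 2
            Else
                ' A lone comma: close the current string and start the next one.
                result.Add(current.ToString())
                current = New StringBuilder(parts(i))
                i += 1
            End If
        End While

        result.Add(current.ToString())
        Return result
    End Function

    Sub Main()
        ' Prints "a,b" and then "c".
        For Each item As String In SplitEscaped("a,,b,c")
            Console.WriteLine(item)
        Next
    End Sub
End Module
```

With this pairing, "a,,,b" comes out as "a," and "b". An empty string in the result can only appear first or last, and it means the input began or ended with an unescaped comma (rule 1), so the caller can check for that and reject the input.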


2 Comments

Is using this and the concatenation a lot more efficient than regex?
The code will be simpler to understand than a regex; I'm not even sure this can be done via regex. It's probably faster too, but that's hard to tell without testing.
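
For comparison, here is a rough sketch of how a regex might handle it. The character class [^,] matches any single character except a comma, so the pattern below matches a run of "non-comma character or doubled comma". SplitEscapedRegex and RegexCommaListDemo are illustrative names only. The sketch does not check rule 1, and it resolves odd-length comma runs left to right, which is an assumption rather than something the rules specify.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module RegexCommaListDemo
    ' Hypothetical helper: each match is one string from the list.
    ' (?:[^,]|,,)+  =  one or more of: any non-comma character, or an escaped ",,".
    Function SplitEscapedRegex(s As String) As List(Of String)
        Dim result As New List(Of String)()
        For Each m As Match In Regex.Matches(s, "(?:[^,]|,,)+")
            ' Commas inside a match only occur in pairs, so unescaping is a plain replace.
            result.Add(m.Value.Replace(",,", ","))
        Next
        Return result
    End Function

    Sub Main()
        ' Prints "a,b" and then "c".
        For Each item As String In SplitEscapedRegex("a,,b,c")
            Console.WriteLine(item)
        Next
    End Sub
End Module
```

Because unmatched commas are simply skipped, a leading or trailing unescaped comma goes unnoticed here, so rule 1 would need a separate check. Which version is faster is untested.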
