I am using joker83's answer to this question: Regular expression for parsing CSV in PHP. However, it does not correctly parse a CSV string in which a field value contains a comma. Is it possible to refine this regexp to solve the problem?

Explanation of the pattern from joker83, /,(?=(?:[^\"])*(?![^\"]))/:
1. ,(?=x) means a comma that is followed by the pattern x (a lookahead).
2. [^\"] means any character other than a double quote.
3. (?:[^\"]) groups the parenthesized subpattern without capturing it into the resulting match array.
4. * means 0 or more repetitions of the preceding pattern.
5. (x)* means 0 or more repetitions of the pattern x.
6. (?![^\"]) is a negative lookahead: the next character must not be a non-quote character (i.e. it must be a double quote, or the end of the string).
7. Taken together, the pattern matches a comma that is followed by zero or more characters other than a double quote, and then by a double quote or the end of the string.

As you can see, if the CSV string is 120,"I love ""Lexi Belle"", ""Proxy Paige""","good stuff", then applying this regexp with preg_split gives 4 fields (i.e. 120 | "I love ""Lexi Belle"" |  ""Proxy Paige""" | "good stuff") rather than the correct 3 fields.
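A minimal sketch to reproduce the problem, assuming the CSV line is held in a single-quoted PHP string. Every comma in a string is followed by some run of non-quote characters and then either a double quote or the end of the string, so the lookahead appears to succeed at every comma and the split is the same as splitting on plain commas. The variable names are illustrative only.

```php
<?php
// The CSV line from the question (single-quoted, so the double quotes are literal).
$line = '120,"I love ""Lexi Belle"", ""Proxy Paige""","good stuff"';

// joker83's pattern, unchanged.
$pattern = '/,(?=(?:[^\"])*(?![^\"]))/';

$fields = preg_split($pattern, $line);
$commas = substr_count($line, ',');

print_r($fields);
// One field per comma plus one means that every comma was treated as a separator.
echo count($fields), ' fields, ', $commas, " commas\n";
```

Comparing count($fields) with substr_count($line, ',') + 1 is a quick way to check whether a candidate pattern is skipping any commas at all.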

Note: I'm using PHP 5.2.6. I can't upgrade to a newer version because I spent a lot of time installing an oci8 build that can read Oracle 8i on Windows, and I can't get it installed correctly on newer versions of PHP.
Note: I can't use fgetcsv() either, because the input CSV file contains LF characters inside field values, and fgetcsv() splits the record at the newline in the middle of such a field.

  • "I am using the answer of joker83 in this question": you should not. Commented Oct 13, 2015 at 14:31
  • As my notes show, his answer is the most suitable one for my case. Commented Oct 13, 2015 at 14:44

2 Answers

You can use this regex:

/,(?=([^\"]*\"[^\"]*\")*[^\"]*$)/

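It comes from this Stack Overflow entry: Java: splitting a comma-separated string but ignoring commas in quotes (written for Java, but the same pattern can be used with preg_split). The lookahead only accepts a comma that is followed by an even number of double quotes up to the end of the string, so commas inside a quoted field are skipped.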

On a simplified version of your string (120,"I love Lexi Bell, Proxy Paige","good stuff") it gives:

array(3) {
  [0]=>
  string(3) "120"
  [1]=>
  string(31) ""I love Lexi Bell, Proxy Paige""
  [2]=>
  string(12) ""good stuff""
}

Note that the fields still keep their surrounding double quotes.
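If the surrounding quotes are not wanted, one possible follow-up step is sketched below. It assumes the usual CSV convention that a quoted field escapes a double quote by doubling it. The helper name split_csv_line is hypothetical and not part of the answer, and the code sticks to array() and foreach so that it should also parse on PHP 5.2.

```php
<?php
// Hypothetical helper (not from the answer): split one CSV line, then unquote each field.
function split_csv_line($line)
{
    // A comma is a separator only if an even number of double quotes follows it.
    $parts = preg_split('/,(?=(?:[^"]*"[^"]*")*[^"]*$)/', $line);

    $fields = array(); // array() rather than [] so this also parses on PHP 5.2
    foreach ($parts as $part) {
        $len = strlen($part);
        if ($len >= 2 && $part[0] === '"' && $part[$len - 1] === '"') {
            // Drop the surrounding quotes and collapse each "" into a single ".
            $part = str_replace('""', '"', substr($part, 1, -1));
        }
        $fields[] = $part;
    }
    return $fields;
}

$fields = split_csv_line('120,"I love ""Lexi Belle"", ""Proxy Paige""","good stuff"');
print_r($fields);
```

The pattern relies on the quotes in the line being balanced; a line with a stray double quote would be split in unexpected places, so this is a sketch for well-formed input rather than a full CSV parser.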

1 Comment

Thanks! This is what I need. I accept this as the best answer.

Why don't you use str_getcsv?

$string = '120,"I love Lexi Bell, Proxy Paige","good stuff"';
$parsedCsv = str_getcsv($string);
print_r($parsedCsv);
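One caveat: str_getcsv() was added in PHP 5.3.0, so it is not available on the PHP 5.2.6 mentioned in the question. A possible workaround, sketched here under that assumption, is to fall back to fgetcsv() on a php://temp stream. The wrapper name str_getcsv_compat is hypothetical, and the fallback branch has not been tried on 5.2.6 or with fields that contain LF characters.

```php
<?php
// Hypothetical wrapper name; str_getcsv(), fopen(), fwrite(), rewind(),
// fgetcsv() and fclose() are the real PHP functions it relies on.
function str_getcsv_compat($input)
{
    if (function_exists('str_getcsv')) {
        // PHP 5.3+: pass delimiter, enclosure and escape explicitly.
        return str_getcsv($input, ',', '"', '\\');
    }
    // Older PHP: write the string to a temporary stream and parse it with fgetcsv().
    $handle = fopen('php://temp', 'r+');
    fwrite($handle, $input);
    rewind($handle);
    $row = fgetcsv($handle, 0, ',', '"');
    fclose($handle);
    return $row;
}

print_r(str_getcsv_compat('120,"I love Lexi Bell, Proxy Paige","good stuff"'));
```

Unlike the regex split, both branches return the fields with the surrounding quotes already removed.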
