Edit: previous versions clobbered the newline that would, I assume, be in the file. Fixed now.
This is probably too much on the "if in doubt, use brute force" side, but it works:
regex = r"""(?<=["'])[^\S\n]+(?=["'])|(?<=["'])[^\S\n]+(?=\d)|(?<=\d)[^\S\n]+(?=\d|\.\d)|(?<=(?<=\w|\d)\d)[^\S\n]+(?=["'])|(?<=["'\d])[^\S\n]*,[^\S\n]*"""
It leaves commas that are inside strings alone, and it handles numbers with a leading dot (like .085).
To get the output you want:
re.sub(regex, ",", original_string)
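Put together as a self-contained snippet, using the test string from the timing code below as a stand-in for your input (the exact result of course depends on your data):
import re
regex = r"""(?<=["'])[^\S\n]+(?=["'])|(?<=["'])[^\S\n]+(?=\d)|(?<=\d)[^\S\n]+(?=\d|\.\d)|(?<=(?<=\w|\d)\d)[^\S\n]+(?=["'])|(?<=["'\d])[^\S\n]*,[^\S\n]*"""
original_string = """1,' unchanged 1' " unchanged 2 " 2.009,-2e15 35 " fad!" ' dfgsdfg ' , 'asdfasdf' " fasf , , asfa" "2 fs", .085 .835"""
print(re.sub(regex, ",", original_string))  # separators between quoted strings and numbers collapse to single commas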
For a rough idea of the performance [1], on an Ivy Bridge Celeron:
import timeit
s = """\
import re
s = \"\"\"1,' unchanged 1' " unchanged 2 " 2.009,-2e15 35 " fad!" ' dfgsdfg ' , 'asdfasdf' " fasf , , asfa" "2 fs", .085 .835\"\"\"
rgex = re.compile(r\"\"\"(?<=["'])\s+(?=["'])|(?<=["'])\s+(?=\d)|(?<=\d)\s+(?=\d|\.\d)|(?<=(?<=\w|\d)\d)\s+(?=["'])|(?<=["'\d])\s*,\s*\"\"\")
re.sub(rgex, ",", s)
"""
print("1k iterations: ", timeit.timeit(stmt=s, number=1000))
print("10k iterations: ", timeit.timeit(stmt=s, number=10000))
print("100k iterations: ", timeit.timeit(stmt=s, number=100000))
print("200k iterations: ", timeit.timeit(stmt=s, number=200000))
print("300k iterations: ", timeit.timeit(stmt=s, number=300000))
gives:
1k iterations: 0.0494868220000626
10k iterations: 0.4617418729999372
100k iterations: 4.604098313999884
200k iterations: 9.197777003000056
300k iterations: 13.79744054799994
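Note that the statement above also recompiles the pattern on every iteration, since the import, the test string and the re.compile call are all inside stmt. If you only want to time the substitution itself, one option (just a sketch) is to move everything else into timeit's setup argument:
import timeit
setup = """\
import re
s = \"\"\"1,' unchanged 1' " unchanged 2 " 2.009,-2e15 35 " fad!" ' dfgsdfg ' , 'asdfasdf' " fasf , , asfa" "2 fs", .085 .835\"\"\"
rgex = re.compile(r\"\"\"(?<=["'])\s+(?=["'])|(?<=["'])\s+(?=\d)|(?<=\d)\s+(?=\d|\.\d)|(?<=(?<=\w|\d)\d)\s+(?=["'])|(?<=["'\d])\s*,\s*\"\"\")
"""
print("100k iterations: ", timeit.timeit(stmt="rgex.sub(',', s)", setup=setup, number=100000))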
Interestingly, the third-party regex module, which (as far as I understand) is supposed to be more performant and to eventually replace the standard library's re, came out roughly two times slower here.
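For reference, that comparison was done by swapping the module in, roughly along these lines (regex is meant to be API-compatible with re), reusing original_string from the snippet above:
import regex  # the third-party module: pip install regex
rgx = regex.compile(r"""(?<=["'])[^\S\n]+(?=["'])|(?<=["'])[^\S\n]+(?=\d)|(?<=\d)[^\S\n]+(?=\d|\.\d)|(?<=(?<=\w|\d)\d)[^\S\n]+(?=["'])|(?<=["'\d])[^\S\n]*,[^\S\n]*""")
rgx.sub(",", original_string)  # same call as before, only the module differs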
[1]: It's not a realistic test, since it just runs the substitution on the same short string over and over, but I was in a hurry. I later tried something a bit better, with a string consisting of 200,000 and then 300,000 copies of that line, and it came out roughly the same: ~8 seconds for 200,000 lines and ~12 seconds for 300,000.
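Something along these lines, in case anyone wants to reproduce that second test (how the big string gets built here is an approximation, not the exact code I used):
import re
import timeit
line = """1,' unchanged 1' " unchanged 2 " 2.009,-2e15 35 " fad!" ' dfgsdfg ' , 'asdfasdf' " fasf , , asfa" "2 fs", .085 .835"""
big = "\n".join([line] * 200000)  # one multi-line string: 200,000 copies of the test line
rgex = re.compile(r"""(?<=["'])[^\S\n]+(?=["'])|(?<=["'])[^\S\n]+(?=\d)|(?<=\d)[^\S\n]+(?=\d|\.\d)|(?<=(?<=\w|\d)\d)[^\S\n]+(?=["'])|(?<=["'\d])[^\S\n]*,[^\S\n]*""")
print(timeit.timeit(lambda: rgex.sub(",", big), number=1))  # a single pass over the whole string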