I have some unstructured data like this:
test1 21;
test2 22;
test3 [ 23 ];
I want to remove the unnecessary whitespace and convert each line into a two-item list. The expected output looks like this:
['test1', '21']
['test2', '22']
['test3', ['23']]
Right now I am using re.sub to collapse the unnecessary whitespace:
re.sub(r"\s+", " ", z.rstrip('\n').lstrip(' ').rstrip(';')).split(' ')
This collapses each run of whitespace into a single space, which is fine. The problem is the third example: there is whitespace after the opening bracket and before the closing bracket, and I want to remove it. Because that whitespace survives, the split returns '[' and ']' as separate items, and I have not been able to fix this with the regex above.
This is the output I am currently getting:
['test1', '21']
['test2', '22']
['test3', '[', '23', ']']
You may check the example here on pythontutor.
['23'] is the result of building a list, which is not really a regex strong point.
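One way to get the nested list is to let the regex do only the splitting and build the list in plain Python afterwards. The following is a minimal sketch, not a definitive solution: `parse_line` is a hypothetical helper name, and it assumes every line has exactly one key followed by either a bare value or a single bracketed group.

```python
import re

def parse_line(line):
    # Hypothetical helper: drop the newline, surrounding spaces and trailing ';'
    cleaned = line.strip().rstrip(';').strip()
    # Split into key and remainder on the first run of whitespace only
    # (assumes the line contains at least two whitespace-separated parts)
    key, value = re.split(r"\s+", cleaned, maxsplit=1)
    # A value of the form "[ ... ]" becomes a nested list of its items
    bracketed = re.fullmatch(r"\[\s*(.*?)\s*\]", value)
    if bracketed:
        return [key, bracketed.group(1).split()]
    return [key, value]

lines = ["test1 21;\n", "test2   22;\n", "test3 [ 23 ];\n"]
for z in lines:
    print(parse_line(z))
```

With the three sample lines from the question, this should print the expected two-item rows, with the third row holding a nested list. Splitting with maxsplit=1 keeps the bracketed part in one piece, so the brackets never end up as separate items.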