Hi I have a "sudo" csv file that looks something like this:
id, Wave ID, Time Stamp, Number of Samples, Sample Data Array
123, 317, 1567191561.8044672, 128, 79, 17, 162, 165, 66, 3, 40, 191, 68, 56, 59, 142, 143, 7, 150, 14, 120, 172, 76, 167, 55, 27, 198, 115, 50, 87, 38, 185, 199, 74, 43, 4, 133, 114, 89, 10, 136, 46, 85, 187, 182, 170, 149, 9, 25, 128, 39, 175, 102, 45, 33, 35, 129, 156, 20, 118, 108, 72, 111, 99, 122, 140, 93, 155, 54, 63, 189, 173, 171, 134, 163, 159, 91, 193, 64, 8, 97, 34, 80, 11, 121, 145, 190, 135, 144, 31, 29, 179, 125, 116, 196, 67, 152, 112, 148, 103, 132, 106, 78, 75, 28, 174, 119, 98, 110, 86, 123, 141, 84, 83, 178, 12, 169, 113, 48, 131, 52, 180, 100, 117, 6, 77, 69, 146, 18, 157, 127, 164
123, 20, 1567191562.0020044, 16, 779, 788, 801, 817, 835, 855, 875, 895, 916, 933, 946, 956, 963, 965, 962, 952
123, 20, 1567191561.8064446, 0,
123, 317, 1567191561.8044672, 100, 132, 48, 195, 78, 190, 124, 38, 99, 87, 1, 66, 6, 106, 18, 180, 197, 59, 148, 41, 128, 125, 194, 175, 81, 21, 115, 184, 30, 71, 77, 166, 3, 107, 114, 52, 55, 186, 5, 103, 145, 19, 8, 69, 64, 122, 90, 129, 83, 165, 79, 178, 2, 14, 74, 25, 133, 147, 158, 75, 146, 20, 140, 101, 97, 10, 143, 88, 50, 168, 112, 118, 9, 137, 155, 24, 89, 144, 16, 13, 156, 196, 113, 183, 34, 120, 142, 130, 49, 86, 46, 138, 191, 192, 189, 70, 123, 159, 108, 7, 95
So the first 4 columns are a normal csv then whatever remains is a list of some length. The "Number of Samples" column denotes how long the list is and each line is ended with a new line character.
The Final dataframe would look something like:
id, Wave ID, Time Stamp, Sample Data Array
123, 317, 1567191561.8044672, [1,2,3,4,5,...]
123, 317, 1567191561.8044672, [1,2,3,4,5,...]
123, 20, 1567191561.8044672, []
123, 317, 1567191563.8044672, [1,2,3,4]
Is there some way to import this using read_csv in pandas or something else? I wrote a simple parser that reads the file line by line but its pretty slow. Would prefer to have a pandas dataframe at the end so I can do some group by/sorting on the columns.
Thanks