I have a long list of data in which the meaningful values are sandwiched between runs of 0 values. Here is what it looks like:
0
0
1
0
0
2
3
1
0
0
0
0
1
0
The lengths of the zero runs and of the meaningful sequences are variable. I want to extract each meaningful sequence into its own row of a dataframe. For example, the data above would be extracted to this:
1
2 3 1
1
I used this code to 'slice' the meaningful data:
import pandas as pd
import numpy as np
raw = pd.read_csv('data.csv')
df = pd.DataFrame(index=np.arange(0, 10000),columns = ['DT01', 'DT02', 'DT03', 'DT04', 'DT05', 'DT06', 'DT07', 'DT08', 'DT02', 'DT09', 'DT10', 'DT11', 'DT12', 'DT13', 'DT14', 'DT15', 'DT16', 'DT17', 'DT18', 'DT19', 'DT20',])
a = 0
b = 0
n=0
for n in range(0,999999):
    if raw.iloc[n].values > 0:
        df.iloc[a,b] = raw.iloc[n].values
        a=a+1
        if raw [n+1] == 0:
            b=b+1
            a=0
But I keep getting KeyError: n, where n is the row right after the first row whose value is different from 0.
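A likely cause, as far as I can tell: on a DataFrame, `raw[n+1]` looks up a column labelled `n+1` rather than a row, whereas `raw.iloc[n+1]` selects a row by position. Here is a minimal sketch that seems to reproduce the error. The small frame and the column name `value` are made-up stand-ins for `data.csv`, which I am assuming has a single column.

```python
import pandas as pd

# Hypothetical stand-in for data.csv: a single column named "value".
raw = pd.DataFrame({"value": [0, 0, 1, 0, 0, 2, 3, 1, 0]})

n = 2  # position of the first non-zero value

try:
    raw[n + 1]  # column lookup: there is no column labelled 3
    key_error = None
except KeyError as exc:
    key_error = exc

# Positional row lookup, which is what the loop appears to intend.
next_value = raw.iloc[n + 1, 0]
```

With this stand-in data, the bracket lookup raises a `KeyError` while the `.iloc` lookup returns the value in the next row.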
Where is the problem with my code? And is there any way to improve it in terms of speed and memory cost? Thank you very much.
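For reference, one possible loop-free direction is to label each run of non-zero values and group on that label. This is only a sketch under assumptions: the function name `extract_runs` is hypothetical, the input is taken to be a single column of numbers, and shorter runs are padded with NaN because a dataframe needs rows of equal width.

```python
import pandas as pd

def extract_runs(values):
    """Return a DataFrame with one row per run of consecutive non-zero values."""
    s = pd.Series(values).reset_index(drop=True)
    nonzero = s != 0
    # A run starts where a non-zero value follows a zero (or the very beginning).
    starts = nonzero & ~nonzero.shift(fill_value=False)
    # Cumulative count of starts gives each run its own id; keep non-zero rows only.
    run_id = starts.cumsum()[nonzero]
    kept = s[nonzero]
    rows = [group.tolist() for _, group in kept.groupby(run_id)]
    # Rows of unequal length are padded with NaN by the DataFrame constructor.
    return pd.DataFrame(rows)

sample = [0, 0, 1, 0, 0, 2, 3, 1, 0, 0, 0, 0, 1, 0]
result = extract_runs(sample)
```

With the real file this might be called as `extract_runs(raw.iloc[:, 0])`, again assuming the data sits in the first column. The masking and labelling steps are vectorized, so only the final grouping iterates, once per run rather than once per row.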