
I am trying to decrease the processing time of the std function below. Is there a module I could import that would speed it up? It calculates the standard deviation over a rolling window of 10,000 values, one window at a time. Although the function is already fast, I am hoping to cut the processing time roughly in half. The function converts the result to a NumPy array at the end.

Variables:

import numpy as np
import pandas as pd

FILE = 'input.csv'
data = pd.read_csv(FILE, low_memory=False)
# reverse the row order of the table
data1 = data.iloc[::-1].reset_index(drop=True)
# 'Close' column as a NumPy array
PC_list = np.array(data1['Close'])
number = 10000

Function

std = pd.Series(PC_list).rolling(number).std().dropna().to_numpy()
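
For reference, pandas' rolling std computes the sample standard deviation (ddof=1 by default), which any faster replacement needs to match. The same line with the default spelled out:

std = pd.Series(PC_list).rolling(number).std(ddof=1).dropna().to_numpy()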

Performance:

(screenshot of the %timeit output omitted)

Sample of the PC_list array:

[386.63 386.63 386.63 386.63 386.63 386.63 386.63 386.63 386.63 386.63
 386.63 386.63 386.63 386.63 386.63 386.63 386.63 386.63 386.63 386.63
 378.03 378.03 378.03 378.03 378.03 378.03 378.03 378.03 373.   370.69
 370.13 370.13 369.73 369.73 375.41 375.41 375.41 375.   375.   375.
 376.95 376.95 376.95 376.95 376.95 376.95 376.95 376.95 376.95 376.95
 376.95 376.95 376.95 376.95 376.95 376.95 376.95 376.95 376.95 376.95
 376.95 376.95 376.95 376.95 376.95 376.95 376.95 376.95 376.95 376.94
 376.52 376.52 376.52 376.52 376.52 376.52 376.52 376.52 376.52 376.52
 376.52 376.52 371.32 371.32 371.32 371.32 371.32 371.32 371.32 371.32
 370.96 370.96 370.96 377.09 377.09 377.09 377.09 378.97 374.39 374.4 ]

1 Answer


Numba

Use a single-pass algorithm

import numpy as np
from numba import njit

@njit
def std(a, k):
    n = len(a)
    m = n - k + 1              # number of complete windows
    k_ = k - 1                 # ddof=1 divisor, matching pandas' rolling std
    mu  = np.zeros(m, np.float64)
    var = np.zeros(m, np.float64)
    mu[0]  = a[:k].sum() / k
    var[0] = ((a[:k] - mu[0]) ** 2).sum() / k_

    for i in range(1, m):
        old = a[i-1]           # value leaving the window
        new = a[i+k-1]         # value entering the window
        d = new - old
        mu[i] = mu[i-1] + d / k
        old_ = mu[i-1]
        new_ = mu[i]
        # single-pass update of the windowed sum of squared deviations
        var[i] = var[i-1] + d * (new + old - new_ - old_) / k_
    return mu, var ** 0.5
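
The variance update comes from the identity S = Σx² − k·μ² for the window's sum of squared deviations: sliding the window by one element gives

S_new − S_old = (new² − old²) − k(μ_new² − μ_old²) = (new − old)(new + old − μ_new − μ_old),

since k·(μ_new − μ_old) = new − old; dividing by k − 1 yields the var[i] update above. As a quick sanity check, here is a brute-force comparison on toy data (a minimal sketch; ddof=1 matches the divisor used in the function):

a = np.random.randn(50)
mu, sig = std(a, 10)
ref = np.array([a[i:i+10].std(ddof=1) for i in range(len(a) - 9)])
print(np.allclose(sig, ref))  # expect True up to floating-point error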

Prime the compilation of the function

std(np.arange(100.0), 10);  # float input, so the compiled signature matches the float64 data below

Create test data

np.random.seed([3, 14])
s = pd.Series(np.random.randn(1_000_000))

Use function

mu, sig = std(s.to_numpy(), 1000)

These results are close to s.rolling(1000).std().dropna() but there are small numerical differences.

s.rolling(1000).std().dropna().sub(sig).plot()

(plot of the pointwise differences omitted)
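
In place of the plot, a numeric summary of the same comparison (a sketch; the exact magnitude depends on the data):

diff = s.rolling(1000).std().dropna().to_numpy() - sig
print(np.abs(diff).max())  # expected to be small floating-point drift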

The timing results

%timeit s.rolling(1000).std().dropna()
%timeit pd.Series(std(s.to_numpy(), 1000)[1])

28 ms ± 189 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
10.3 ms ± 42.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

You can see what Pandas is doing deep inside the bowels... here

It isn't dissimilar but does a few more overhead checks that cost some time. Ultimately, it is likely safer to just use what Pandas has. But if you need something that is a tad quicker, well there you go.


1 Comment

Thanks! Do you think it would be possible to use GPU acceleration with this function?
