RollingKurt
Description
The RollingKurt class computes the excess kurtosis of a data sequence over a moving window of fixed size. Kurtosis measures the “tailedness” of a distribution. The excess form is shifted so that a normal distribution scores zero, which means positive values indicate heavier tails than normal and negative values indicate lighter tails. A bias (sample) correction is also applied, making the estimate more accurate when the window contains few samples.
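As a point of reference, the quantity described above can be written out for a single window. This is a minimal sketch, assuming the bias correction is the standard adjusted Fisher-Pearson estimator (the one pandas uses for `kurt`); the function name `sample_excess_kurtosis` is hypothetical and not part of screamer.

```python
import math

def sample_excess_kurtosis(values):
    """Bias-corrected excess kurtosis of one window (assumed estimator, see lead-in)."""
    n = len(values)
    if n < 4:
        # The correction factor divides by (n - 2) * (n - 3).
        return math.nan
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    if m2 == 0.0:
        return math.nan  # constant window: kurtosis is undefined
    g2 = m4 / (m2 * m2) - 3.0  # uncorrected excess kurtosis
    return (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6.0)

print(sample_excess_kurtosis([1.0, 2.0, 3.0, 4.0, 5.0]))
```

For evenly spaced values such as 1 through 5 the result is negative, reflecting tails lighter than those of a normal distribution.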
Parameters:
window_size: Specifies the size of the rolling window.
start_policy: Defines how the function handles the initial phase when fewer than window_size data points are available. This parameter accepts one of the following three values:
"strict": Returns NaN for all calculations until window_size elements have been processed.
"expanding": Adapts the computation by dynamically reducing the window size to include all available data, starting from a single point and growing until window_size is reached.
"zero": Simulates a full initial window of zeros, effectively pre-filling the data stream with window_size zeros before processing the actual input.
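The three start policies differ only in which values the statistic sees during the warm-up phase. The sketch below illustrates that description in plain Python. The helper `effective_windows` is hypothetical and is not screamer's implementation. It returns the window contents for each input index, with `None` standing in for a NaN result.

```python
def effective_windows(data, window_size, start_policy="strict"):
    """List the values each output would be computed from (None means NaN)."""
    if start_policy not in ("strict", "expanding", "zero"):
        raise ValueError(f"unknown start_policy: {start_policy!r}")
    data = list(data)
    windows = []
    for i in range(len(data)):
        start = i + 1 - window_size
        if start >= 0:
            # Full window available: all policies agree from here on.
            windows.append(data[start:i + 1])
        elif start_policy == "strict":
            windows.append(None)
        elif start_policy == "expanding":
            # Use every value seen so far.
            windows.append(data[:i + 1])
        else:
            # "zero": pad with the zeros that have not yet been pushed out.
            windows.append([0.0] * (-start) + data[:i + 1])
    return windows

for policy in ("strict", "expanding", "zero"):
    print(policy, effective_windows([1, 2, 3], 2, policy))
```

Note that a kurtosis estimate with sample correction needs at least four values, so with "expanding" the earliest outputs may still be undefined even though a window exists.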
Usage Example and Plot
Below is an example of using RollingKurt to calculate the rolling kurtosis of a randomly generated dataset (a random walk), along with a plot illustrating its output.
import numpy as np
import plotly.graph_objects as go
from plotly.subplots import make_subplots
from screamer import RollingKurt
# Generate example data
data = np.cumsum(np.random.normal(size=300))
# Create subplots with specified row heights and shared x-axis
fig = make_subplots(
    rows=2, cols=1,
    shared_xaxes=True,
    row_heights=[2/3, 1/3],
    vertical_spacing=0.1
)
# Add traces for each subplot
fig.add_trace(go.Scatter(y=data, mode='lines', name='Input Data'), row=1, col=1)
fig.add_trace(go.Scatter(y=RollingKurt(30)(data), mode='lines', name='Rolling Kurtosis', line=dict(color='red')), row=2, col=1)
# Update layout with titles and axis labels
fig.update_layout(
    title="Rolling Kurtosis with Window Size 30",
    xaxis_title="Index",
    yaxis=dict(title="Input Data"),
    yaxis2=dict(title="Rolling Kurtosis", range=[-2, 4]),
    margin=dict(l=20, r=20, t=80, b=20),
    legend=dict(orientation="h", yanchor="bottom", y=1.02, xanchor="right", x=1)
)
fig.show()
Implementation Details
Algorithm
RollingKurt implements cyclic buffers to accumulate windowed statistics.
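One way such a cyclic buffer can work is sketched below. This is an illustration under stated assumptions, not screamer's actual code: the class `RingKurtSketch` is hypothetical, it keeps running sums of the first four powers of the windowed values, it behaves like the "strict" start policy, and it assumes the adjusted Fisher-Pearson correction.

```python
import math

class RingKurtSketch:
    """Rolling bias-corrected excess kurtosis with constant work per update."""

    def __init__(self, window_size):
        self.buf = [0.0] * window_size  # cyclic buffer of the current window
        self.pos = 0                    # slot that the next value overwrites
        self.count = 0                  # number of values seen, capped at window_size
        self.s1 = self.s2 = self.s3 = self.s4 = 0.0  # running power sums

    def update(self, x):
        n = len(self.buf)
        if self.count == n:
            old = self.buf[self.pos]    # value leaving the window
            self.s1 -= old
            self.s2 -= old ** 2
            self.s3 -= old ** 3
            self.s4 -= old ** 4
        else:
            self.count += 1
        self.buf[self.pos] = x
        self.s1 += x
        self.s2 += x ** 2
        self.s3 += x ** 3
        self.s4 += x ** 4
        self.pos = (self.pos + 1) % n
        if self.count < n or n < 4:
            return math.nan             # "strict" behaviour during warm-up
        mean = self.s1 / n
        m2 = self.s2 / n - mean ** 2
        m4 = (self.s4 / n - 4 * mean * self.s3 / n
              + 6 * mean ** 2 * self.s2 / n - 3 * mean ** 4)
        if m2 <= 0.0:
            return math.nan             # constant window: undefined
        g2 = m4 / (m2 * m2) - 3.0
        return (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6.0)

roll = RingKurtSketch(5)
outputs = [roll.update(v) for v in [10.0, 1.0, 2.0, 3.0, 4.0, 5.0]]
print(outputs[-1])
```

Raw power sums are simple but can lose precision when values are large relative to their spread. A production implementation might update centered moments instead.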
Complexity
Time Complexity:
O(1) per new element, since each update only adjusts the statistics accumulated in the cyclic buffers.
Space Complexity:
O(window_size), as only elements within the current window are stored.
Performance
Short streams (n=1,000): 120% faster than Pandas Rolling kurt
Longer streams (n=1,000,000): 400% faster than Pandas Rolling kurt