Clip
Description
The Clip class restricts each value in a data sequence to a specified range. Values outside the lower and upper bounds are clipped to the nearest boundary, so the processed data stays within the desired range. This is useful when extreme values or outliers need to be capped before further analysis.
Parameters:
lower (optional): The minimum allowable value. If a data point is below this threshold, it will be set to lower. If unspecified, there is no lower bound.
upper (optional): The maximum allowable value. If a data point exceeds this threshold, it will be set to upper. If unspecified, there is no upper bound.
NaN handling: NaN values are not modified by this function and remain as NaN if present in the input data.
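The bound and NaN semantics described above can be illustrated with NumPy's np.clip, used here only as a stand-in. That Clip returns the same values for these inputs is an assumption based on the description, not a verified result.

```python
import numpy as np

# Sample input with a NaN and values on both sides of the bounds
data = np.array([-3.0, -0.5, np.nan, 0.5, 3.0])

# Both bounds: values outside [-1, 1] move to the nearest bound
both = np.clip(data, -1.0, 1.0)

# Lower bound only: None leaves the upper side unbounded
lower_only = np.clip(data, -1.0, None)

# Upper bound only: None leaves the lower side unbounded
upper_only = np.clip(data, None, 1.0)

print(both)
print(lower_only)
print(upper_only)
```

In each result the NaN entry stays NaN, and values already inside the range are left untouched.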
Usage Example and Plot
Below is an example of using Clip to constrain data to fall between -1 and 1, along with a plot illustrating the effect.
import numpy as np
import plotly.graph_objects as go
from screamer import Clip
# Generate example data
data = np.random.normal(size=50)
# Clip data to fall between -1 and 1
clipped_data = Clip(lower=-1, upper=1)(data)
# Create plot with input data and clipped data
fig = go.Figure()
fig.add_trace(go.Scatter(y=data, mode='lines+markers', name='Input Data'))
fig.add_trace(go.Scatter(y=clipped_data, mode='lines+markers', name='Clipped Data', line=dict(color='red')))
fig.update_layout(
    title="Data Clipping with Bounds (-1, 1)",
    xaxis_title="Index",
    yaxis_title="Value",
    yaxis=dict(range=[-2, 2]),  # Limit y-axis to show the clipping effect clearly
    margin=dict(l=20, r=20, t=80, b=20),
    legend=dict(orientation="h", yanchor="bottom", y=1.02, xanchor="right", x=1)
)
fig.show()
Implementation Details
Algorithm
The Clip function performs an element-wise check on each value in the input data:
If a lower bound (lower) is specified, any value below this threshold is replaced with lower.
If an upper bound (upper) is specified, any value exceeding this threshold is replaced with upper.
This element-wise operation ensures that each data point falls within the specified range, with minimal computational overhead.
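The element-wise check can be sketched in plain Python as follows. The function clip_values is a hypothetical reference written for illustration, not the library's implementation, and it assumes lower is not greater than upper when both are given.

```python
import math
from typing import Iterable, List, Optional


def clip_values(values: Iterable[float],
                lower: Optional[float] = None,
                upper: Optional[float] = None) -> List[float]:
    """Hypothetical reference: clip each value to [lower, upper]."""
    out = []
    for x in values:
        # Comparisons with NaN are always False, so NaN passes through unchanged
        if lower is not None and x < lower:
            x = lower
        elif upper is not None and x > upper:
            x = upper
        out.append(x)
    return out


result = clip_values([-2.0, -0.5, float("nan"), 0.5, 2.0], lower=-1.0, upper=1.0)
print(result)  # [-1.0, -0.5, nan, 0.5, 1.0]
```

Each value needs at most two comparisons and no stored history, which matches the constant per-element cost described for the algorithm.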
Complexity
Time Complexity: O(1) per element.
Space Complexity: O(1).
Performance
Clip runs at a speed comparable to NumPy's clip and is approximately 10x faster than pandas' clip.
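One way to check such a comparison on your own machine is a small timing harness like the sketch below. The helper time_clippers, the array size, and the repeat count are all illustrative assumptions. pandas and screamer are timed only if they can be imported, and the measured ratios will vary with hardware and input size.

```python
import timeit

import numpy as np


def time_clippers(n_points: int = 100_000, repeats: int = 20) -> dict:
    """Return average seconds per call for each clip implementation found."""
    data = np.random.normal(size=n_points)
    candidates = {"numpy": lambda: np.clip(data, -1, 1)}

    try:
        import pandas as pd
        series = pd.Series(data)
        candidates["pandas"] = lambda: series.clip(lower=-1, upper=1)
    except ImportError:
        pass  # pandas not installed; skip it

    try:
        from screamer import Clip
        # Construct and call as in the usage example; this includes setup cost
        candidates["screamer"] = lambda: Clip(lower=-1, upper=1)(data)
    except ImportError:
        pass  # screamer not installed; skip it

    return {name: timeit.timeit(fn, number=repeats) / repeats
            for name, fn in candidates.items()}


if __name__ == "__main__":
    for name, seconds in time_clippers().items():
        print(f"{name}: {seconds * 1e6:.1f} us per call")
```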