aerosandbox.tools.statistics.time_series_uncertainty_quantification
Module Contents
Functions
estimate_noise_standard_deviation | Estimates the standard deviation of the random noise in a time-series dataset.
bootstrap_fits | Bootstraps a time-series dataset and fits splines to each bootstrap resample.
Attributes
- aerosandbox.tools.statistics.time_series_uncertainty_quantification.estimate_noise_standard_deviation(data, estimator_order=None)
Estimates the standard deviation of the random noise in a time-series dataset.
Relies on several assumptions:
- The noise is normally-distributed and independent between samples (i.e., white noise).
- The noise is stationary and homoscedastic (i.e., the noise standard deviation is constant).
- The noise is uncorrelated with the signal.
- The sample rate of the data is significantly higher than the highest-frequency component of the signal. (In practice, this ratio need not be more than ~5:1, if higher-order estimators are used. At a minimum, however, this ratio must be greater than 2:1, corresponding to the Nyquist frequency.)
The algorithm used in this function is a highly-optimized version of the math described in this repository, part of an upcoming paper: https://github.com/peterdsharpe/aircraft-polar-reconstruction-from-flight-test
- Parameters:
data (aerosandbox.numpy.ndarray) – A 1D NumPy array of time-series data.
estimator_order (int) – The order of the estimator to use. Higher orders are generally more accurate, up to the point where sample error starts to dominate. If None, a reasonable estimator order will be chosen automatically.
- Return type:
float
Returns: An estimate of the standard deviation of the data’s noise component.
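To illustrate the idea behind this kind of estimator, here is a minimal sketch of a finite-difference noise estimate written with only the Python standard library. It is not the library's implementation: the helper name `difference_noise_stdev` and its `order` argument are hypothetical, and the approach (take order-th differences, which suppress a smooth signal while scaling white-noise variance by a known binomial factor) is an assumption about one common way to do this under the assumptions listed above.

```python
import math
import random


def difference_noise_stdev(data, order=2):
    """Hypothetical helper: estimate white-noise standard deviation
    from order-th finite differences of a 1D sequence."""
    n = len(data)
    if n <= order:
        raise ValueError("Need more samples than the estimator order.")

    # Alternating binomial coefficients of the order-th finite difference,
    # e.g. order=2 gives [1, -2, 1].
    coeffs = [(-1) ** k * math.comb(order, k) for k in range(order + 1)]

    # Sum of squared finite differences over every full window.
    sum_sq = 0.0
    for i in range(n - order):
        d = sum(c * data[i + k] for k, c in enumerate(coeffs))
        sum_sq += d * d

    # For independent noise with variance s^2, each difference has variance
    # s^2 * sum(c^2), and sum(c^2) equals comb(2 * order, order).
    return math.sqrt(sum_sq / (n - order) / math.comb(2 * order, order))


# Synthetic data: a slow sine wave plus Gaussian noise of known size.
rng = random.Random(0)
n_samples = 5000
true_sigma = 0.1
data = [
    math.sin(2 * math.pi * i / n_samples) + rng.gauss(0.0, true_sigma)
    for i in range(n_samples)
]

estimate = difference_noise_stdev(data, order=2)
print(f"true sigma = {true_sigma}, estimated sigma = {estimate:.4f}")
```

Because the sine wave here is sampled far above its own frequency, its second differences are negligible, so the estimate is driven almost entirely by the noise. That is the reason for the sample-rate assumption above.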
- aerosandbox.tools.statistics.time_series_uncertainty_quantification.bootstrap_fits(x, y, x_noise_stdev=0.0, y_noise_stdev=None, n_bootstraps=2000, fit_points=300, spline_degree=3, normalize=None)
Bootstraps a time-series dataset and fits splines to each bootstrap resample.
- Parameters:
x (aerosandbox.numpy.ndarray) – The independent variable (e.g., time) of the dataset. A 1D NumPy array.
y (aerosandbox.numpy.ndarray) – The dependent variable (e.g., altitude) of the dataset. A 1D NumPy array.
n_bootstraps (int) – The number of bootstrap resamples to create.
fit_points (Union[int, Iterable[float], None]) –
An optional variable that determines what to do with the splines after they are fit:
- If an integer, the splines will be evaluated at a linearly-spaced vector of points between the minimum and maximum x-values of the dataset, with the number of points equal to fit_points. This is the default.
- If an iterable of floats (e.g. a 1D NumPy array), the splines will be evaluated at those points.
- If None, the splines won’t be evaluated, and instead the splines are returned directly.
spline_degree (int) – The degree of the splines to fit.
normalize (bool) –
Whether or not to normalize the data before fitting. If True, the data will be normalized to the range [0, 1] before fitting, and the splines will be un-normalized before being returned. If False, the data will not be normalized before fitting.
If None (the default), the data will be normalized if and only if fit_points is not None.
x_noise_stdev (Union[None, float]) –
y_noise_stdev (Union[None, float]) –
- Return type:
Union[Tuple[aerosandbox.numpy.ndarray, aerosandbox.numpy.ndarray], List[aerosandbox.tools.pretty_plots.utilities.natural_univariate_spline.NaturalUnivariateSpline]]
Returns: One of the following, depending on the value of fit_points:
- If fit_points is an integer or array, then this function returns a tuple of NumPy arrays:
  - x_fit: A 1D NumPy array with the x-values at which the splines were evaluated.
  - y_bootstrap_fits: A 2D NumPy array of shape (n_bootstraps, len(x_fit)) with the y-values of the splines evaluated at each bootstrap resample and at each x-value.
- If fit_points is None, then this function returns a list of n_bootstraps splines, each of which is a NaturalUnivariateSpline, which is a subclass of scipy.interpolate.UnivariateSpline with more sensible extrapolation.
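As a rough illustration of the bootstrap-and-fit pattern and of the (x_fit, y_bootstrap_fits) return shape described above, the following standard-library sketch resamples (x, y) pairs with replacement and fits each resample. It deliberately swaps the spline for a least-squares straight line to stay dependency-free, and resampling pairs is an assumption rather than a description of how this function resamples internally. The name `bootstrap_line_fits` and its `seed` argument are hypothetical.

```python
import random
import statistics


def bootstrap_line_fits(x, y, n_bootstraps=200, fit_points=5, seed=0):
    """Hypothetical stand-in: bootstrap (x, y) pairs and fit a straight
    line (not a spline) to each resample."""
    rng = random.Random(seed)
    n = len(x)

    # Linearly spaced evaluation points between min(x) and max(x).
    x_min, x_max = min(x), max(x)
    x_fit = [
        x_min + (x_max - x_min) * j / (fit_points - 1)
        for j in range(fit_points)
    ]

    y_bootstrap_fits = []
    for _ in range(n_bootstraps):
        # Draw n indices with replacement to form one bootstrap resample.
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        ys = [y[i] for i in idx]

        # Ordinary least-squares line through the resampled points.
        x_mean = statistics.fmean(xs)
        y_mean = statistics.fmean(ys)
        sxx = sum((xi - x_mean) ** 2 for xi in xs)
        sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(xs, ys))
        slope = sxy / sxx if sxx > 0 else 0.0

        # One row per resample, one column per evaluation point.
        y_bootstrap_fits.append(
            [y_mean + slope * (xf - x_mean) for xf in x_fit]
        )

    return x_fit, y_bootstrap_fits


# Synthetic data: y = 2x + 1 plus Gaussian noise.
rng = random.Random(1)
x = [i / 49 for i in range(50)]
y = [2.0 * xi + 1.0 + rng.gauss(0.0, 0.1) for xi in x]

x_fit, y_bootstrap_fits = bootstrap_line_fits(x, y)

# Median across resamples at each evaluation point (one value per column).
median_fit = [statistics.median(column) for column in zip(*y_bootstrap_fits)]
print(len(y_bootstrap_fits), len(x_fit), median_fit)
```

Taking percentiles down each column of the result, rather than only the median, gives a pointwise uncertainty band around the fitted curve.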