Timeseries


  1. What is a time series?

  2. STL, X11, and SEATS (decomposition methods)

TOOLS

  1. Semi-supervised with DTAIDistance - active semi-supervised clustering

Time series patterns:

  • A trend exists when there is a long-term increase or decrease in the data.

  • A seasonal pattern occurs when a time series is affected by seasonal factors such as the time of the year or the day of the week, e.g., monthly sales induced by the change in cost at the end of the calendar year.

  • A cycle occurs when the data exhibit rises and falls that are not of a fixed period - sometimes lasting years.

But correlation can lie: very different-looking datasets can all share the same 0.8 correlation.

Autocorrelation measures the linear relationship between lagged values of a time series.

Lag 8, for example, can be strongly correlated with the present, with a high autocorrelation of 0.83.

Forecasting methods

  • Average: forecasts of all future values are equal to the mean of the historical data.

  • Naive: forecasts are simply set to the value of the last observation.

  • Seasonal naive: forecasts are set equal to the last observed value from the same season of the year.

  • Drift: a variation on the naive method that allows the forecasts to increase or decrease over time; the drift is set to the average change seen in the historical data.
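As a rough illustration (a minimal NumPy sketch, not from the source), all four baselines fit in a few lines:

```python
import numpy as np

def baseline_forecasts(y, horizon, season_length=12):
    """Four naive baseline forecasts for a 1-D array of historical values."""
    y = np.asarray(y, dtype=float)
    average = np.full(horizon, y.mean())               # mean of all history
    naive = np.full(horizon, y[-1])                    # last observation
    seasonal_naive = np.array(                         # same season, last cycle
        [y[-season_length + (h % season_length)] for h in range(horizon)]
    )
    slope = (y[-1] - y[0]) / (len(y) - 1)              # average historical change
    drift = y[-1] + slope * np.arange(1, horizon + 1)
    return {"average": average, "naive": naive,
            "seasonal_naive": seasonal_naive, "drift": drift}
```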

Data Transformations

  • Log

  • Box-Cox

  • Back-transform

  • Calendrical adjustments

  • Inflation adjustment
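For example, a Box-Cox transform and its back-transform can be done with SciPy (a minimal sketch; `boxcox` estimates lambda by maximum likelihood, and the sample values are made up):

```python
import numpy as np
from scipy.stats import boxcox
from scipy.special import inv_boxcox

y = np.array([112., 118., 132., 129., 121., 135., 148., 148., 136., 119.])
y_t, lmbda = boxcox(y)           # lambda estimated by maximum likelihood
y_back = inv_boxcox(y_t, lmbda)  # back-transform to the original scale
assert np.allclose(y, y_back)
```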

SPLITTING TIME SERIES DATA
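A minimal sketch of walk-forward splitting with scikit-learn's TimeSeriesSplit; the `gap` parameter (available in scikit-learn >= 0.24) leaves a buffer between the train and test folds:

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

X = np.arange(20).reshape(-1, 1)          # placeholder time-ordered samples
tscv = TimeSeriesSplit(n_splits=4, gap=2)  # gap of 2 samples between train/test
for train_idx, test_idx in tscv.split(X):
    print(train_idx[-3:], "->", test_idx)  # training always precedes testing
```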

  • Dummy variables: Sunday, Monday, Tuesday, Wednesday, Thursday, Friday - no Saturday!

  • Notice that only six dummy variables are needed to code seven categories, because the seventh category (here, Saturday) is specified when all the dummy variables are set to zero. Many beginners try to add a seventh dummy variable for the seventh category; this is known as the "dummy variable trap" because it will cause the regression to fail (see the sketch after this list).

  • Outliers: If there is an outlier in the data, rather than omit it, you can use a dummy variable to remove its effect. In this case, the dummy variable takes value one for that observation and zero everywhere else.

  • Public holidays: For daily data, the effect of public holidays can be accounted for by including a dummy variable predictor taking value one on public holidays and zero elsewhere.

  • Easter: is different from most holidays because it is not held on the same date each year and the effect can last for several days. In this case, a dummy variable can be used with value one where any part of the holiday falls in the particular time period and zero otherwise.

  • Trading days: The number of trading days in a month can vary considerably and can have a substantial effect on sales data. To allow for this, the number of trading days in each month can be included as a predictor. An alternative that allows for the effects of different days of the week uses the following predictors: number of Mondays in the month; number of Tuesdays in the month; ...; number of Sundays in the month.

  • Advertising: advertising spend for the previous month; advertising spend for two months previously.
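A minimal pandas sketch of avoiding the dummy variable trap; `drop_first=True` keeps six dummies for the seven day-of-week categories:

```python
import pandas as pd

dates = pd.date_range("2024-01-01", periods=14, freq="D")
df = pd.DataFrame({"day": dates.day_name()})
# drop_first=True keeps six dummies for seven categories,
# avoiding the "dummy variable trap" described above
dummies = pd.get_dummies(df["day"], drop_first=True)
print(dummies.columns.tolist())  # six columns, one category is the baseline
```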

Rolling window analysis

“Compute parameter estimates over a rolling window of a fixed size through the sample. If the parameters are truly constant over the entire sample, then the estimates over the rolling windows should not be too different. If the parameters change at some point during the sample, then the rolling estimates should capture this instability.”

Moving average window

A moving average window is used to estimate the trend-cycle.

  • Window size 3, 5, 7, or 9? If it is too large it will flatten the curve; if too small it will closely track the actual curve.

  • A two-tier moving average: first a window of 4, then a window of 2 on the resulting moving average (see the sketch below).
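A minimal pandas sketch of the two-tier (2x4) moving average described above; the series values are made up:

```python
import pandas as pd

s = pd.Series([5, 7, 6, 8, 9, 11, 10, 12, 13, 15])
ma4 = s.rolling(window=4).mean()       # first tier: window of 4
ma2x4 = ma4.rolling(window=2).mean()   # second tier: window of 2 on the result
```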

Decomposition

Weighted “window”

Time Series Components

  1. Level. The baseline value for the series if it were a straight line.

  2. Trend. The optional and often linear increasing or decreasing behavior of the series over time.

  3. Seasonality. The optional repeating patterns or cycles of behavior over time.

  4. Noise. The optional variability in the observations that cannot be explained by the model.

All time series have a level, most have noise, and the trend and seasonality are optional.
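These components can be extracted with statsmodels' `seasonal_decompose` (a minimal sketch on a synthetic monthly series with a linear trend and yearly seasonality):

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

idx = pd.date_range("2020-01-01", periods=48, freq="MS")
y = pd.Series(np.arange(48) + 10 * np.sin(np.arange(48) * 2 * np.pi / 12), index=idx)
res = seasonal_decompose(y, model="additive", period=12)
trend, seasonal, resid = res.trend, res.seasonal, res.resid  # level+trend, seasonality, noise
```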

One-step forecast using a window of “1” and a typical sample “time, measure1, measure2”:

  • Linear/nonlinear classifiers: predict a single output value using the previous line (t), i.e., “measure1(t), measure2(t), measure1(t+1), measure2(t+1) (as the class)”.

  • Neural networks: predict multiple output values, i.e., “measure1(t), measure2(t), measure1(t+1) (class 1), measure2(t+1) (class 2)”.

One-Step Forecast: This is where the next time step (t+1) is predicted.

Multi-Step Forecast: This is where two or more future time steps are to be predicted.

Multi-step forecast using a window of “1” and a typical sample “time, measure1”, i.e., using the current value as input, we label it with the two future values:

  • “measure1(t), measure1(t+1) (class 1), measure1(t+2) (class 2)”
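A minimal pandas sketch of this sliding-window framing; column names are illustrative:

```python
import pandas as pd

df = pd.DataFrame({"measure1": [1, 2, 3, 4, 5],
                   "measure2": [10, 20, 30, 40, 50]})
framed = df.copy()
framed["measure1_t+1"] = df["measure1"].shift(-1)  # one-step target
framed["measure1_t+2"] = df["measure1"].shift(-2)  # second target for multi-step
framed = framed.dropna()  # rows at the end have no future labels
```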

  1. sliding-window methods - convert a sequential supervised problem into a classical supervised problem

  2. recurrent sliding windows

  3. hidden Markov models

  4. maximum entropy Markov models

  5. input-output Markov models

  6. conditional random fields

  7. graph transformer networks

STATIONARY TIME SERIES

Ways to make a series stationary, or to test whether it is:

  1. Difference the series: T+1 minus T.

  2. Use a bigger lag to support seasonal changes.

  3. pandas .diff()

  4. Plot a histogram; plot log(X) as well.

  5. Test for the unit-root null hypothesis, i.e., use the Augmented Dickey-Fuller test to determine whether a sample originates from a stationary or a non-stationary (seasonal/trend) time series.
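A minimal sketch of point 5 with statsmodels' `adfuller`, applied before and after differencing (the random-walk series is synthetic):

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

y = np.random.randn(200).cumsum()        # a random walk: non-stationary
stat, pvalue, *_ = adfuller(y)
print(pvalue)                            # large p-value: cannot reject a unit root

y_diff = np.diff(y)                      # first difference, like pandas .diff()
stat_d, pvalue_d, *_ = adfuller(y_diff)
print(pvalue_d)                          # small p-value: stationary after differencing
```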

SHORT TIME SERIES

    1. Autoregression (AR)

    2. Moving Average (MA)

    3. Autoregressive Moving Average (ARMA)

    4. Autoregressive Integrated Moving Average (ARIMA)

    5. Seasonal Autoregressive Integrated Moving-Average (SARIMA)

    6. Seasonal Autoregressive Integrated Moving-Average with Exogenous Regressors (SARIMAX)

    7. Vector Autoregression (VAR)

    8. Vector Autoregression Moving-Average (VARMA)

    9. Vector Autoregression Moving-Average with Exogenous Regressors (VARMAX)

    10. Simple Exponential Smoothing (SES)

    11. Holt-Winters’ Exponential Smoothing (HWES)
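Most of these are available in statsmodels; a minimal SARIMAX sketch (the series and the orders are placeholders, not tuned values):

```python
import numpy as np
from statsmodels.tsa.statespace.sarimax import SARIMAX

y = np.random.randn(120).cumsum()                   # placeholder series
model = SARIMAX(y, order=(1, 1, 1),                 # (p, d, q)
                seasonal_order=(1, 1, 1, 12))       # (P, D, Q, s)
fitted = model.fit(disp=False)
forecast = fitted.forecast(steps=12)                # 12 steps ahead
```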

Kalman filters in matlab

Predicting actual values of a time series using observations.

LSTM for time series

There are three types of gates within an LSTM unit:

  • Forget Gate: conditionally decides what information to throw away from the block.

  • Input Gate: conditionally decides which values from the input should update the memory state.

  • Output Gate: conditionally decides what to output based on input and the memory of the block.

Using an LSTM to predict sunspots; makes some use of autocorrelation.
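A minimal Keras sketch of a one-step LSTM forecaster (data and hyperparameters are placeholders; the gates described above live inside the LSTM layer):

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

window, n_features = 12, 1
X = np.random.rand(100, window, n_features)  # placeholder input windows
y = np.random.rand(100)                      # next-step targets

model = Sequential([
    LSTM(32, input_shape=(window, n_features)),  # forget/input/output gates inside
    Dense(1),                                    # one-step forecast
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=5, verbose=0)
```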

CLASSIFICATION

CLUSTERING TS

ANOMALY DETECTION TS

    1. You can feed RANSAC with tsfresh/tslearn features.

  1. STL decomposition (see the sketch after this list).

  2. Sliding windows
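A minimal sketch of STL-based anomaly detection with statsmodels: decompose the series, then flag large residuals (the weekly-seasonal series, injected anomaly, and 3-sigma threshold are all illustrative):

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import STL

idx = pd.date_range("2020-01-01", periods=120, freq="D")
y = pd.Series(np.sin(np.arange(120) * 2 * np.pi / 7)
              + 0.1 * np.random.randn(120), index=idx)
y.iloc[60] += 5                               # inject an anomaly

resid = STL(y, period=7).fit().resid          # remove trend and seasonality
z = (resid - resid.mean()) / resid.std()      # z-score the residuals
anomalies = y[np.abs(z) > 3]                  # flag residuals beyond 3 sigma
```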

Dynamic Time Warping (DTW)

DTW, i.e., how to compute a better distance between two time series.

Myth 1: The ability of DTW to handle sequences of different lengths is a great advantage, and therefore the simple lower bound that requires different-length sequences to be reinterpolated to equal length is of limited utility [10][19][21]. In fact, as we will show, there is no evidence in the literature to suggest this, and extensive empirical evidence presented here suggests that comparing sequences of different lengths and reinterpolating them to equal length produce no statistically significant difference in accuracy or precision/recall.

Myth 2: Constraining the warping paths is a necessary evil that we inherited from the speech processing community to make DTW tractable, and we should find ways to speed up DTW with no (or larger) constraints [19]. In fact, the opposite is true. As we will show, the 10% constraint on warping inherited blindly from the speech processing community is actually too large for real world data mining.

Myth 3: There is a need (and room) for improvements in the speed of DTW for data mining applications. In fact, as we will show here, if we use a simple lower bounding technique, DTW is essentially O(n) for data mining applications. At least for CPU time, we are almost certainly at the asymptotic limit for speeding up DTW.

  1. Another function for dtw distance in python
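For example, computing a DTW distance with DTAIDistance (a minimal sketch; `distance_fast` uses the C implementation described below):

```python
import numpy as np
from dtaidistance import dtw

s1 = np.array([0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 1.0, 0.0])
s2 = np.array([0.0, 1.0, 2.0, 0.0, 0.0, 0.0, 0.0, 0.0])

d = dtw.distance(s1, s2)            # pure-Python implementation
d_fast = dtw.distance_fast(s1, s2)  # faster C implementation
path = dtw.warping_path(s1, s2)     # optimal alignment between the two series
```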

SKtime - an sklearn-based API; integrates algorithms from tsfresh and tslearn.

A great introduction into time series (really good): explains the basics of time series prediction, splitting, next step, delayed step, multi-step, and deseasoning.

TSFresh - extracts 1,200 features and filters them using FDR, for time series classification, etc.

tslearn - DTW, shapes, shapelets (Keras layer), time series k-means/clustering/SVM/SVR/KNN/barycenters/PAA/SAX.

DTAIDistance - a library for time series distances (e.g., Dynamic Time Warping) used in the DTAI Research Group. The library offers a pure Python implementation and a faster implementation in C. The C implementation has only Cython as a dependency. It is compatible with NumPy and pandas and is implemented to avoid unnecessary data copy operations.

Darts is a Python library for user-friendly forecasting and anomaly detection on time series.

  • Identify anomalies, outliers, or abnormal behaviour (see, for example, the Examples).

The recommended method for performing active semi-supervised clustering using DTAIDistance is to use COBRAS for time series clustering: https://github.com/ML-KULeuven/cobras. COBRAS is a library for semi-supervised time series clustering using pairwise constraints, which natively supports both dtaidistance.dtw and kshape.

Affine warp, a neural net with time warping - part of a manuscript that focuses on analysis of large-scale neural recordings (though this code can also be applied to many other data types).

NeuralWarp: Time-Series Similarity with Warping Networks.

An interesting idea - “The approach is to come up with a list of features that captures the temporal aspects so that the autocorrelation information is not lost.” Basically, take sequence features and create (auto)correlated new variables using a time window, i.e., “time series forecasts as regression that factor in autocorrelation as well.” We can transform raw features into other types of features that explain the relationship in time between features. We measure success using loss functions: MAE, RMSE, MAPE, RMSEP, AC-ERROR-RATE.

On how to define “time series” dummy variables that utilize the beginning/end of certain holiday events, including important information on what NOT to filter even if it seems insignificant, such as zero sales that may indicate a relationship to many sales the following day.

Some statistical measures: mean, median, percentiles, IQR, standard deviation, and bivariate statistics (correlation between variables).

Bivariate formula: this correlation measures the extent of a linear relationship between two variables; a high number means high correlation between the two. The value of r always lies between -1 and 1, with negative values indicating a negative (decreasing) relationship and positive values a positive (increasing) one.
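For reference, the standard bivariate (Pearson) correlation is:

$$r = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n}(y_i - \bar{y})^2}}$$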

White noise has an autocorrelation of 0.

SK-lego - scikit-lego with a decay estimator.

A visual example of the ARIMA algorithm - captures the time series trend or forecast.

Creating curves to explain a complex seasonal fit.

About ML methods for sequential supervised learning - six methods that have been applied to solve sequential supervised learning problems (listed above).

A time series without a trend or seasonality is stationary; in other words, a non-stationary series has a trend or seasonality.

There are ways to remove the trend and seasonality, i.e., take the difference between time points.

Shay on stationary time series, AR, ARMA (amazing), and more.

Pmdarima’s auto_arima function is extremely useful when building an ARIMA model, as it helps us identify the most optimal p, d, q parameters and returns a fitted ARIMA model.
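A minimal pmdarima sketch (the series is a placeholder):

```python
import numpy as np
import pmdarima as pm

y = np.random.randn(120).cumsum() + 50          # placeholder series
model = pm.auto_arima(y, seasonal=True, m=12,   # m: seasonal period
                      stepwise=True, suppress_warnings=True)
print(model.order, model.seasonal_order)        # the chosen (p, d, q) and (P, D, Q, s)
forecast = model.predict(n_periods=12)
```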

A video that explains the concept, etc. (1 out of 55 videos).

Yes, you can use the DTW approach for classification and clustering of time series. I've compiled the following resources, which are focused on this very topic (I've recently answered a similar question, but not on this site, so I'm copying the contents here for everybody's convenience):

  • UCR Time Series Classification/Clustering

  • Time Series Classification and Clustering with Python

  • Capital Bikeshare: Time Series Clustering

  • Time Series Classification and Clustering

  • Dynamic Time Warping using rpy and Python

  • Mining Time-series with Trillions of Points: Dynamic Time Warping at Scale

  • Time Series Analysis and Mining in R (to add R to the mix)

  • And, finally, two tools implementing/supporting DTW, to top it off.

(Shay Palachi), part 1, part 2.

A sklearn-like toolkit with an amazing intro; various algorithms for non-seasonal and seasonal data, transformers, and ensembles (on GitHub).


Ransac - random sample consensus for outlier detection; a good baseline (six linked resources).


Anomaly detection for time series (anomatools), recommended by DTAIDistance.

Anomaly detection using STL decomposition.

Forecasting using ARIMA.

Auto-ARIMA.

Twitter’s ESD test for outliers, using z-score and t-test.
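A minimal sketch of the related modified z-score (Iglewicz & Hoaglin), which uses the median and MAD and is more robust to outliers than the plain z-score (the data and the common 3.5 threshold are illustrative):

```python
import numpy as np

def modified_z_scores(x):
    """Modified z-score: 0.6745 * (x - median) / MAD."""
    x = np.asarray(x, dtype=float)
    med = np.median(x)
    mad = np.median(np.abs(x - med))  # median absolute deviation
    return 0.6745 * (x - med) / mad

x = np.array([1.0, 1.1, 0.9, 1.2, 1.0, 9.0])
outliers = x[np.abs(modified_z_scores(x)) > 3.5]  # flags the 9.0
```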

Another ESD test inside.


Time2vec, with a paper (for deep learning, as a layer).

Mentions PrunedDTW, SparseDTW, and FastDTW.

Python code on git.



Linked resources

  • medium
  • A LightGBM Autoregressor — Using Sktime
  • SKtime-DL - using Keras and DL
  • TSFresh
  • DTAIDistance
  • DTAI Research Group
  • dtaidistance.clustering.hierarchical
  • Darts
  • Forecasting models
  • Examples
  • dtaidistance.clustering.kmeans
  • dtaidistance.clustering.medoids
  • anomatools package
  • https://github.com/ML-KULeuven/cobras
  • Affine warp
  • Neural warp
  • NeuralWarp
  • A great introduction into time series
  • Interesting idea
  • Time series patterns
  • Some statistical measures
  • Forecasting methods
  • Data Transformations
  • Transforming time series data to tabular (in order to use tabular-based approaches)
  • With a gap
  • now with even timeseries split by group
  • Evaluate forecast accuracy
  • Rolling window analysis
  • Moving average window
  • Visual example
  • Creating
  • scikit-lego with a decay estimator
  • Time Series Components
  • This article explains
  • What is?
  • remove the trend and seasonality
  • Shay on stationary time series, AR, ARMA
  • STL
  • Short time series
  • PDarima
  • Min sample size for short seasonal time series
  • More mastery on short time series
  • Using Kalman filters
  • Kalman filters in matlab
  • LSTM for time series
  • Part 1
  • Part 2
  • Stackexchange
  • main page
  • software page
  • corresponding paper
  • a blog post
  • another blog post
  • ipython notebook
  • another blog post
  • another blog post
  • yet another blog post
  • R package
  • Python module
  • Clustering time series, subsequences with a rolling window, the pitfall
  • Clustering using tslearn
  • K-means for variable length
  • notebook
  • What is a stationary process
  • stationary time series analysis
  • mastery on ARIMAs
  • TS anomaly algos (STL, trees, ARIMA)
  • AD techniques, 2, 3
  • Z-score, modified z-score and IQR; an intro to why the z-score is not robust
  • Adtk
  • Awesome TS anomaly detection
  • Transfer learning toolkit, paper and benchmarks
  • Ransac is a good baseline; Ransac, 2, 3, 4
  • Anomaly detection for time series
  • anomatools
  • AD where anomalies coincide with seasonal peaks!!
  • AD challenges: stationarity, seasonality, trend
  • RT anomaly detection for time series (Pinterest)
  • AD
  • Solving sliding window problems
  • Rolling window regression, 1, 2
  • 1, 2, 3
  • Twitter’s ESD test, here
  • Minimal sample size for seasonal forecasting
  • Golden signals, youtube
  • Graph-based Anomaly Detection and Description: A Survey
  • Time2vec, paper
  • The three myths of using DTW
  • Youtube - explains everything
  • Python code
  • good tutorial
  • Medium
  • DTW in TSLEARN
  • DynamicTimeWarping
  • Time Series Hierarchical Clustering using Dynamic Time Warping in Python
  • notebook
  • K-Means with DTW, probably fixed-length vectors, using tslearn
  • With time series
  • Random walk
  • Time series decomposition book
  • Mastery on TS decomposition
  • TSlearn