The Digamma.ai Time Series Analysis Framework
The traditional development process for time series analysis is complex and time-consuming. At Digamma.ai, we set out to make that process faster and more efficient, and built our own proprietary time series analysis framework.
What is Time Series Analysis?
A time series is a series of data points indexed in time order. The daily values of the Dow Jones Industrial Average and a car's engine temperature measurements taken from a sensor are both examples of time series. Time series analysis includes methods for analyzing such data to extract meaningful statistics, predict future values or detect anomalies. Stock price prediction, equipment malfunction detection and cyber attack monitoring are a few examples of how time series analysis can be applied to real-life scenarios.

[Chart: California Monthly Home Sales Volume]

[Chart: Seattle Daily CO2 data, previous 3 days]

Why Use Digamma.ai’s Time Series Analysis Framework?
Conducting time series analysis is often a laborious undertaking. In practice, it is not a matter of simply developing an algorithm and letting a computer do the work. First, the data must be prepared and cleaned. After that, repetitive, time-consuming, hands-on trial-and-error experiments have to be run manually. Our proprietary, scalable framework changes all of this by ‘gluing’ various time series methods together, automating them and letting you apply them iteratively to refine your model. With the framework, you can bootstrap your big data analysis, assess it quickly and get to market faster than your competitors.
Our Framework
Our framework consists of four parts: data preprocessing, exploratory analysis, anomaly detection and forecasting. Each part includes a variety of models and algorithms, each customizable with its own options and parameters. A typical workflow requires multiple iterations: trying different models and algorithms, tweaking their parameters and applying them in many different combinations. Done manually, this is time-consuming; our framework automates the entire process. It lets you apply a variety of effective methods quickly, iteratively and easily to refine your model instead of relying on onerous trial-and-error experiments. The framework scales to very large datasets and can be deployed to the cloud, including Amazon AWS and other providers.

Data Preprocessing

  • Parsing of many common formats (e.g. CSV, JSON and SQL databases).
  • Missing-value imputation techniques specially designed for time-dependent data.
  • Resampling techniques (i.e. fixing or changing the sampling frequency); support for non-uniformly sampled data and changes in sampling rate.
  • Quantization methods developed specifically for time series.
  • Handling of gaps in measurements (i.e. long sequences of absent values).
  • Normalization.
  • A variety of filters (e.g. low-pass, LOWESS).
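
To give a flavor of this kind of preprocessing step, here is a minimal sketch using pandas. The file name, column names and window sizes are illustrative assumptions, not the framework's actual API.

    import pandas as pd

    # Parse a CSV with a timestamp column and use it as a sorted time index.
    df = pd.read_csv("sensor_readings.csv", parse_dates=["timestamp"])
    df = df.set_index("timestamp").sort_index()

    # Put non-uniformly sampled data onto a fixed 1-minute grid.
    series = df["temperature"].resample("1min").mean()

    # Impute short gaps by time-based interpolation; leave long gaps untouched.
    series = series.interpolate(method="time", limit=5)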

Exploratory Analysis

  • Spectral analysis (Fourier and wavelet analysis, linear and non-linear filtering).
  • Feature extraction (scalar and sequential features).
  • Dimensionality reduction methods for multivariate time series.
  • Clustering (of individual points and of whole sequences).
  • Time series decomposition (seasonal-trend decomposition (STL), robust PCA methods, singular spectrum analysis (SSA)).
  • Sequential pattern mining (WINEPI algorithm).
  • Descriptive statistics.
  • Data visualization via a variety of plots and charts that help the user assess the data quickly.
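
As one example of an exploratory step, the sketch below runs a seasonal-trend (STL) decomposition with statsmodels on a synthetic monthly series. It illustrates the kind of decomposition listed above rather than the framework's own routine.

    import numpy as np
    import pandas as pd
    from statsmodels.tsa.seasonal import STL

    # Synthetic monthly series: upward trend + yearly seasonality + noise.
    idx = pd.date_range("2015-01-01", periods=96, freq="MS")
    values = (np.linspace(100, 160, 96)
              + 10 * np.sin(2 * np.pi * np.arange(96) / 12)
              + np.random.normal(0, 2, 96))
    series = pd.Series(values, index=idx)

    # STL splits the series into trend, seasonal and residual components.
    result = STL(series, period=12).fit()
    trend, seasonal, residual = result.trend, result.seasonal, result.resid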

Anomaly Detection

  • Support for univariate and multivariate time series.
  • Detection of single isolated anomaly points (e.g. outliers).
  • Detection of changes in trends and dynamics.
  • Detection of anomalous subsequences.
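
A minimal sketch of one common way to flag isolated anomaly points is a rolling z-score detector, shown below. It is purely illustrative; the framework's own detectors and parameters are not shown here.

    import numpy as np
    import pandas as pd

    def rolling_zscore_outliers(series, window=48, threshold=3.0):
        """Flag points that deviate strongly from the local rolling mean."""
        mean = series.rolling(window, min_periods=window // 2).mean()
        std = series.rolling(window, min_periods=window // 2).std()
        z = (series - mean) / std
        return z.abs() > threshold

    # Example: a quiet series with one injected spike at position 250.
    s = pd.Series(np.random.normal(0, 1, 500))
    s.iloc[250] = 12.0
    anomalies = s[rolling_zscore_outliers(s)]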

Forecasting

  • One-step and dynamic forecasts.
  • Trend and seasonality handling.
  • Linear models (the whole SARIMA family).
  • Variance modelling (the GARCH family of models).
  • Kalman filtering.
  • Non-linear autoregressive models.
  • Neural networks.
  • Fuzzy logic systems.
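
As a sketch of the SARIMA-style forecasting listed above, the snippet below fits statsmodels' SARIMAX to a synthetic monthly series. The model orders are placeholders, not tuned values, and the code is illustrative rather than the framework's API.

    import numpy as np
    import pandas as pd
    from statsmodels.tsa.statespace.sarimax import SARIMAX

    # Synthetic monthly series (same shape as in the decomposition sketch above).
    idx = pd.date_range("2015-01-01", periods=96, freq="MS")
    series = pd.Series(np.linspace(100, 160, 96)
                       + 10 * np.sin(2 * np.pi * np.arange(96) / 12)
                       + np.random.normal(0, 2, 96), index=idx)

    # Placeholder orders; in practice these come out of model selection and tuning.
    model = SARIMAX(series, order=(1, 1, 1), seasonal_order=(1, 1, 1, 12))
    fitted = model.fit(disp=False)

    one_step = fitted.get_prediction(dynamic=False).predicted_mean  # one-step-ahead, in sample
    future = fitted.forecast(steps=12)                              # 12 months ahead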

How Our Methodology Works

First, we obtain the data from a client in whatever format they provide. We use automatic routines to transform this raw data into a standardized format to make processing easier. Then we start a typical data science processing flow: cleaning the data (e.g. removing invalid values from a time series), fixing non-uniformly sampled parts (data is often collected at regular time intervals but can be interrupted by errors and maintenance) and normalizing value ranges. Series often differ hugely in scale, some varying between 0 and 1 and others between 0 and 1,000,000,000.
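
A minimal sketch of the normalization idea, assuming simple min-max scaling (one of several possible choices), shows how series on wildly different scales become directly comparable:

    import pandas as pd

    def min_max_normalize(series):
        """Rescale a series to the [0, 1] range."""
        low, high = series.min(), series.max()
        return (series - low) / (high - low)

    small = pd.Series([0.1, 0.4, 0.9])
    large = pd.Series([2.0e8, 5.0e8, 9.5e8])
    small_n, large_n = min_max_normalize(small), min_max_normalize(large)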

Next, we begin the exploratory process, where we analyze the data, draw charts and histograms, extract task-related features and calculate various statistics. Then we proceed to model selection and tuning. We have models for prediction, filtering and anomaly/change detection. Tuning is completely automatic, and our experts perform the model selection. After that, we evaluate the results according to various metrics. Here the first iteration ends. From this point, we analyze the results and, if they are not adequate, tweak the previous stages to improve overall performance; this begins the next iteration.
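
The evaluation step amounts to comparing forecasts against a held-out window with standard error metrics. The small sketch below (with made-up numbers) shows the kind of metrics involved; the framework's actual metric set is broader.

    import numpy as np
    import pandas as pd

    def evaluate_forecast(actual, predicted):
        """Standard accuracy metrics computed over a held-out test window."""
        err = actual - predicted
        return {
            "MAE": err.abs().mean(),
            "RMSE": float(np.sqrt((err ** 2).mean())),
            "MAPE_%": float((err.abs() / actual.abs()).mean() * 100),
        }

    actual = pd.Series([102.0, 108.0, 115.0, 111.0])
    predicted = pd.Series([100.0, 110.0, 113.0, 114.0])
    metrics = evaluate_forecast(actual, predicted)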

This process is generally linear, as one stage comes after the other. In practice, however, each stage runs with multiple sets of parameters, and the result of every run is passed to the next stage. In this way the process is more like a graph or a tree than a sequence, somewhat similar to how computing nodes are organized in a neural network. The framework also stores all intermediate results, so if only the second-to-last stage of the pipeline is changed, only the last two stages are re-computed.
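
One way to picture this caching behaviour is a chain of stages where each stage memoizes its output and recomputes only when it, or something upstream of it, has changed. The sketch below is purely illustrative and not the framework's internal design; stage names and functions are made up.

    class Stage:
        """One pipeline stage that memoizes its last output."""
        def __init__(self, name, func):
            self.name, self.func = name, func
            self.cache = None
            self.dirty = True  # True when (re-)computation is needed

        def run(self, data):
            if self.dirty or self.cache is None:
                self.cache = self.func(data)
                self.dirty = False
            return self.cache

    def run_pipeline(stages, data):
        """Once one stage recomputes, every later stage must recompute too."""
        recompute = False
        for stage in stages:
            recompute = recompute or stage.dirty
            if recompute:
                stage.dirty = True
            data = stage.run(data)
        return data

    # Three stages; after the first full run, marking only the middle stage as
    # changed re-runs just that stage and the one after it.
    stages = [Stage("clean", lambda d: [x for x in d if x is not None]),
              Stage("features", lambda d: [(x, x * x) for x in d]),
              Stage("model", lambda d: sum(x for x, _ in d) / len(d))]
    result = run_pipeline(stages, [1, 2, None, 4])
    stages[1].dirty = True                      # tweak the "features" stage
    result = run_pipeline(stages, [1, 2, None, 4])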

Reports

It is important to present your team’s data scientists with a comprehensive and actionable set of results from a time series analysis. All reports in our framework are delivered as PDF documents and include a variety of information in the form of plots, charts and tables. The framework is interactive and can be used by data scientists as a tool to perform a variety of time series analysis tasks.

Our report helps your data scientists assess whether a time series analysis project is on track. If it is not, the report can guide your team in making adjustments where necessary to optimize the process and produce more accurate results.

In addition to a final report, an exploratory report is generated at the early stages of the process to provide a quick glimpse into your dataset. It also allows a quick assessment of data quality, quantities, and core characteristics such as range and variance.

Finally, our framework lets your data scientists receive iterative reports while the data analysis and modeling pipeline is being built on existing pieces of data. The iterative reports provide the results of each algorithm's application, along with metrics showing how your data scientists’ models perform.

EXPLORATORY REPORT: TABLE OF CONTENTS

  1. Filtering
       • Butterworth
       • Total variation denoising
  2. Anomaly Detection
  3. Basic info
  4. Preprocessing
       • Imputation
       • Ordering
       • Resampling
  5. Prediction
       • Neo-fuzzy neuron
       • ARIMA
       • SARIMAX
       • Prediction comparison
  6. Multivariate Anomaly Detection

To download a sample Exploratory Report generated by our Time Series Analysis Framework, please provide your contact details below. Your contact information will not be shared and will be used only to send you the report.