Project Description

Many methods for time series clustering have been proposed in the literature. However, most of them either struggle to compute a large number of clusters or are computationally expensive. In addition, they do not consider latent information. The objective of this research is to develop a time series clustering method that extends the idea of co-clustering (Laclau et al., 2017) to time series using Optimal Transport (OT) (Villani, 2008; Santambrogio, 2015), while critically analyzing the state of the art in both theory and industry.

The following tasks should be carried out over a period of 2 to 4 months:

  • To propose a method capable of clustering time series data using OT and causal modeling while making use of side information such as categories, timestamps and customer location. This method has to provide results at least comparable to those of Dynamic Time Warping (DTW);
  • To review the existing theory on time series clustering, co-clustering and causality;
  • To identify the unique characteristics of the cloud computing industry for time series analysis;
  • To compare the developed method, theoretically and practically, against Dynamic Time Warping, K-Shape, Soft-DTW and other state-of-the-art methods.
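
As an illustration of the practical comparison in the last task, the sketch below places a textbook DTW distance next to a simple one-dimensional OT distance and uses each of them to assign series to fixed reference series. It is only a minimal sketch under stated assumptions, not the method to be developed: the OT variant compares the empirical distributions of the values of two equal-length series (so it ignores temporal order), and the function names `dtw_distance`, `wasserstein_1d` and `assign_to_medoids`, as well as the toy data, are hypothetical.

```python
from math import inf


def dtw_distance(a, b):
    """Classic dynamic-programming DTW with absolute-difference local cost."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # step in a only
                                 cost[i][j - 1],      # step in b only
                                 cost[i - 1][j - 1])  # step in both
    return cost[n][m]


def wasserstein_1d(a, b):
    """1-Wasserstein distance between the empirical value distributions.

    Assumption: both series have the same length and uniform weights, in
    which case the optimal plan matches sorted values one to one.
    """
    if len(a) != len(b):
        raise ValueError("this sketch only handles equal-length series")
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)


def assign_to_medoids(series, medoids, dist):
    """Label each series with the index of its nearest medoid under dist."""
    return [min(range(len(medoids)), key=lambda k: dist(s, medoids[k]))
            for s in series]


# Hypothetical toy data: two low-valued "bump" series and two high-valued ones.
series = [
    [0, 1, 2, 3, 2, 1],
    [0, 0, 1, 2, 3, 2],
    [5, 5, 6, 5, 5, 6],
    [6, 5, 5, 6, 5, 5],
]
medoids = [series[0], series[2]]

labels_dtw = assign_to_medoids(series, medoids, dtw_distance)
labels_ot = assign_to_medoids(series, medoids, wasserstein_1d)
print(labels_dtw, labels_ot)
```

On this toy data both distances separate the low-valued series from the high-valued ones. The sorted-values shortcut costs O(n log n) per pair, whereas the DTW table costs O(nm), which is one reason OT-based distances are attractive when many series have to be compared.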

The object of the research is the reduction of uncertainty in demand forecasting through time series clustering in a large-scale data setting within the industry. The subject of the research is time series clustering using OT and causal modeling techniques to improve forecasts while helping to understand possible causal relations.

References

  1. Bachoc, F., Suvorikova, A., Loubes, J.-M., and Spokoiny, V. (2018). Gaussian Process Forecast with multi-dimensional distributional entries. ArXiv e-prints.
  2. Gretton, A., Borgwardt, K. M., Rasch, M. J., Schölkopf, B., and Smola, A. (2012). A kernel two-sample test. J. Mach. Learn. Res., 13:723–773.
  3. Huang, B., Zhang, K., and Schölkopf, B. (2015). Identification of time-dependent causal model: A Gaussian process treatment. In Proceedings of the 24th International Conference on Artificial Intelligence, IJCAI’15, pages 3561–3568. AAAI Press.
  4. Pearl, J. (2009). Causality: Models, Reasoning and Inference. Cambridge University Press, New York, NY, USA, 2nd edition.
  5. Rasmussen, C. E. and Williams, C. K. I. (2005). Gaussian Processes for Machine Learning. MIT Press.
  6. Santambrogio, F. (2015). Optimal Transport for Applied Mathematicians: Calculus of Variations, PDEs, and Modeling. Progress in Nonlinear Differential Equations and Their Applications. Springer International Publishing.
  7. Villani, C. (2008). Optimal Transport: Old and New. Grundlehren der mathematischen Wissenschaften. Springer Berlin Heidelberg.