Ask HN: What is the SOTA in short-term forecasting?

1 point

2 years ago

I've been working on my own on some personal projects on value prediction for multivariate, mixed-frequency data: thousands of high-frequency readings of power generation and many exogenous variables, used to estimate a set of target variables some hours in advance. That is how I stumbled upon the world of time series forecasting with deep learning [1].

First of all: do enterprises (if they actually use deep learning; in my experience it kind of works) retrain the neural network at each step, whenever a new row of data arrives? And, if so, do they use all the data as the training set and validate on nothing?
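To make the first question concrete, here is a minimal sketch of walk-forward (expanding-window) evaluation, which is one common way to retrain at every step and still get an out-of-sample error. The model is refit on everything seen so far, asked for the next value, and scored before that value joins the training set. `fit_mean_model`, `walk_forward` and the toy series are hypothetical stand-ins, not anything from [1]; a real setup would put the neural network's training step where the averaging is.

```python
from statistics import mean


def fit_mean_model(history):
    """Hypothetical stand-in for a training step: average the last 3 values."""
    level = mean(history[-3:])
    return lambda: level  # the "model" forecasts the same level for the next step


def walk_forward(series, min_train=3):
    """Retrain at each step on all data so far, then score the next point."""
    errors = []
    for t in range(min_train, len(series)):
        model = fit_mean_model(series[:t])  # training set: everything before t
        forecast = model()                  # one-step-ahead forecast
        errors.append(abs(series[t] - forecast))  # series[t] was never trained on
    return errors


# Toy data, invented for illustration only.
series = [10.0, 12.0, 11.0, 13.0, 14.0, 13.5, 15.0]
errors = walk_forward(series)
print("out-of-sample MAE:", mean(errors))
```

Under this scheme nothing is held out permanently, yet every error in `errors` comes from a point the model had not seen when it made the forecast.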

Second: do they apply some kernel to the loss so that errors on more recent data are penalized more heavily? That would seem extremely logical to me.
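As an illustration of the second question, here is a sketch of one possible kernel: exponentially decaying sample weights applied to a squared-error loss. The exponential shape, the `half_life` parameter and both function names are assumptions chosen for the example; other kernels (linear, step) would slot in the same way.

```python
def recency_weights(n, half_life):
    """Weight 1 for the newest sample, halving every `half_life` steps back in time."""
    return [0.5 ** ((n - 1 - i) / half_life) for i in range(n)]


def weighted_mse(y_true, y_pred, weights):
    """Mean squared error where each sample's error is scaled by its weight."""
    total = sum(weights)
    return sum(w * (t - p) ** 2 for w, t, p in zip(weights, y_true, y_pred)) / total


# Toy example: the same unit error, placed on the oldest vs. the newest sample.
weights = recency_weights(4, half_life=1)
y_true = [0.0, 0.0, 0.0, 0.0]
loss_old = weighted_mse(y_true, [1.0, 0.0, 0.0, 0.0], weights)
loss_new = weighted_mse(y_true, [0.0, 0.0, 0.0, 1.0], weights)
print(loss_old, loss_new)
```

With these weights, a mistake on the newest sample costs more than the same mistake on the oldest one, which is the behaviour the question asks about.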

Third: which architecture is most often used? I've been using an LSTM, but the SOTA architectures in [1] are far more advanced.

Fourth: since we are dealing with paths that have a strong stochastic component, does it make more sense to pursue a binary output with a cross-entropy loss instead of predicting the value of the target? As in: I will make this decision if the value exceeds 'x', so let me train the NN to output the probability of exceeding 'x'. Or maybe output the p-confidence intervals, for example, instead of naively trying to guess the value.
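The two alternatives in the fourth question can be written down as loss functions. The sketch below shows (a) binary cross-entropy on "did the value exceed x" labels, and (b) the pinball (quantile) loss, which is one standard way to train a model to output a quantile; a lower and an upper quantile together give an interval. Pinball loss is my substitution for "p-confidence intervals" and may not be exactly what is meant; all function names and numbers are hypothetical.

```python
import math


def exceedance_labels(values, threshold):
    """1.0 where the target exceeded the decision threshold, else 0.0."""
    return [1.0 if v > threshold else 0.0 for v in values]


def binary_cross_entropy(labels, probs, eps=1e-12):
    """Mean cross-entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))
    return total / len(labels)


def pinball_loss(y_true, y_pred, q):
    """Mean pinball loss for quantile level q, with 0 < q < 1."""
    total = 0.0
    for y, yhat in zip(y_true, y_pred):
        diff = y - yhat
        total += max(q * diff, (q - 1.0) * diff)
    return total / len(y_true)


# Toy values, invented for illustration only.
labels = exceedance_labels([1.0, 5.0, 3.0], threshold=2.0)
bce_good = binary_cross_entropy(labels, [0.1, 0.9, 0.8])
bce_bad = binary_cross_entropy(labels, [0.9, 0.1, 0.2])
under = pinball_loss([10.0], [8.0], q=0.9)  # forecast below the outcome
over = pinball_loss([10.0], [12.0], q=0.9)  # forecast above the outcome
print(bce_good, bce_bad, under, over)
```

At q = 0.9 the pinball loss punishes under-forecasting far more than over-forecasting, which pushes the output toward the 90th percentile instead of a single point guess.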

[1] https://github.com/thuml/Time-Series-Library