pyfreak182 t1_j8vpx4e wrote
Reply to [Discussion] Time Series methods comparisons: XGBoost, MLForecast, Prophet, ARIMAX? by RAFisherman
In case you are not familiar, there are also Time2Vec embeddings for Transformers. It would be interesting to see how that architecture compares as well.
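For reference, here is a minimal PyTorch sketch of a Time2Vec layer (the module name, shapes, and random initialization are my own simplifications of Kazemi et al., 2019, not a reference implementation). It learns one linear component plus k sinusoidal components per timestamp; the result is typically concatenated with or added to the token embeddings before the Transformer:

```python
import torch
import torch.nn as nn

class Time2Vec(nn.Module):
    """Time2Vec embedding: one linear term plus k periodic (sine) terms,
    each with a learned frequency and phase."""
    def __init__(self, k: int):
        super().__init__()
        # linear component
        self.w0 = nn.Parameter(torch.randn(1))
        self.b0 = nn.Parameter(torch.randn(1))
        # k periodic components
        self.w = nn.Parameter(torch.randn(k))
        self.b = nn.Parameter(torch.randn(k))

    def forward(self, tau: torch.Tensor) -> torch.Tensor:
        # tau: (batch, seq_len, 1) scalar timestamps
        linear = self.w0 * tau + self.b0             # (batch, seq_len, 1)
        periodic = torch.sin(self.w * tau + self.b)  # (batch, seq_len, k)
        return torch.cat([linear, periodic], dim=-1) # (batch, seq_len, k + 1)

# Example: embed integer time steps
t = torch.arange(10, dtype=torch.float32).reshape(1, 10, 1)
emb = Time2Vec(k=7)(t)  # shape (1, 10, 8)
```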
pyfreak182 t1_j9slq8a wrote
Reply to [D] "Deep learning is the only thing that currently works at scale" by GraciousReformer
It helps that the math behind backpropagation (i.e. matrix multiplications) is easily parallelizable. In the forward pass, the computations for different training examples are independent of each other, so a whole batch can be processed in parallel as a single matrix multiplication. The same is true for the backward pass, where the per-example gradients within a batch can be computed independently and then accumulated.
And we have hardware accelerators like GPUs that are designed to perform large numbers of parallel computations efficiently.
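To make that concrete, here is a toy PyTorch sketch (the model and shapes are invented for illustration) showing that the batched forward pass is a single matrix multiplication, and that per-example gradients within a batch are independent:

```python
import torch

# Toy one-layer linear model: the whole batch's forward pass is a single
# matrix multiplication, which the GPU executes in parallel.
batch, d_in, d_out = 256, 512, 128
X = torch.randn(batch, d_in)
W = torch.randn(d_in, d_out, requires_grad=True)

Y = X @ W               # forward: all 256 examples at once
loss = Y.pow(2).mean()  # toy loss
loss.backward()         # backward: dL/dW = X^T @ dL/dY, again one matmul

# Per-example gradients are independent: accumulating gradients computed
# one example at a time matches the batched gradient.
W2 = W.detach().clone().requires_grad_(True)
for x in X:  # x: (d_in,)
    ((x.unsqueeze(0) @ W2).pow(2).mean() / batch).backward()

print(torch.allclose(W.grad, W2.grad, atol=1e-5))  # True (up to float error)
```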
The success of deep learning is just as much about implementation as it is about theory.