pyfreak182 t1_j8vpx4e wrote
Reply to [Discussion] Time Series methods comparisons: XGBoost, MLForecast, Prophet, ARIMAX? by RAFisherman
In case you are not familiar, there are also Time2Vec embeddings for Transformers. It would be interesting to see how that architecture compares as well.
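For reference, here is a minimal PyTorch sketch of a Time2Vec layer (the module name, shapes, and random initialization are my own simplifications of Kazemi et al., 2019, not a reference implementation). It learns one linear component plus k sinusoidal components per timestamp; the result is typically concatenated with or added to the token embeddings before the Transformer:

```python
import torch
import torch.nn as nn

class Time2Vec(nn.Module):
    """Time2Vec embedding: one linear term plus k periodic (sine) terms,
    each with a learned frequency and phase."""
    def __init__(self, k: int):
        super().__init__()
        # linear component
        self.w0 = nn.Parameter(torch.randn(1))
        self.b0 = nn.Parameter(torch.randn(1))
        # k periodic components
        self.w = nn.Parameter(torch.randn(k))
        self.b = nn.Parameter(torch.randn(k))

    def forward(self, tau: torch.Tensor) -> torch.Tensor:
        # tau: (batch, seq_len, 1) scalar timestamps
        linear = self.w0 * tau + self.b0             # (batch, seq_len, 1)
        periodic = torch.sin(self.w * tau + self.b)  # (batch, seq_len, k)
        return torch.cat([linear, periodic], dim=-1) # (batch, seq_len, k + 1)

# Example: embed integer time steps
t = torch.arange(10, dtype=torch.float32).reshape(1, 10, 1)
emb = Time2Vec(k=7)(t)  # shape (1, 10, 8)
```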
pyfreak182 t1_j9slq8a wrote
Reply to [D] "Deep learning is the only thing that currently works at scale" by GraciousReformer
It helps that the math behind backpropagation (i.e. matrix multiplications) is easily parallelizable. In the forward pass, the computations for different training examples are independent of each other, so a whole batch can be processed in parallel as a single matrix multiplication. The same is true for the backward pass, where the per-example gradients within a batch can be computed independently and then accumulated.
And we have hardware accelerators like GPUs that are designed to perform large numbers of parallel computations efficiently.
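To make that concrete, here is a toy PyTorch sketch (the model and shapes are invented for illustration) showing that the batched forward pass is a single matrix multiplication, and that per-example gradients within a batch are independent:

```python
import torch

# Toy one-layer linear model: the whole batch's forward pass is a single
# matrix multiplication, which the GPU executes in parallel.
batch, d_in, d_out = 256, 512, 128
X = torch.randn(batch, d_in)
W = torch.randn(d_in, d_out, requires_grad=True)

Y = X @ W               # forward: all 256 examples at once
loss = Y.pow(2).mean()  # toy loss
loss.backward()         # backward: dL/dW = X^T @ dL/dY, again one matmul

# Per-example gradients are independent: accumulating gradients computed
# one example at a time matches the batched gradient.
W2 = W.detach().clone().requires_grad_(True)
for x in X:  # x: (d_in,)
    ((x.unsqueeze(0) @ W2).pow(2).mean() / batch).backward()

print(torch.allclose(W.grad, W2.grad, atol=1e-5))  # True (up to float error)
```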
The success of deep learning is just as much about implementation as it is about theory.