weeeeeewoooooo t1_jabotnq wrote on February 28, 2023 at 7:16 AM

Reply to comment by Etterererererer in [P] [R] Neural Network in Fortran! by Etterererererer

There is a lot of outdated code in physics and chemistry. It is in Fortran because no one wants to rewrite it.

It is actually more difficult to parallelize Fortran code on supercomputers than it is with modern distributed computing libraries like HPX, which is a C++ library. Fortran will also perform much worse, as you won't beat the DAG schedulers in HPX.

Fortran is also slower, because you can't write generic high-performance code. This is also true of C. You need something akin to templating, which allows you to optimize expressions at compile time like you can in C++. You need generic code for all but the most trivial operations, otherwise it is very difficult and time-consuming to build more complex operations like those that go into neural networks.

Additionally, CUDA is native C++ (and again can be optimized generically, which you can't do once you are behind C or Fortran APIs). If you want to seriously take advantage of vectorization, you should look at using GPUs. If a place has a supercomputer, it probably also has a GPU cluster.

All these reasons are why C++ is the king of performance programming.

weeeeeewoooooo t1_j8xoy8u wrote on February 17, 2023 at 6:15 PM

Reply to comment by BenXavier in [Discussion] Time Series methods comparisons: XGBoost, MLForecast, Prophet, ARIMAX? by RAFisherman

This is a great question. Steve Brunton has some great videos about dynamical systems and their properties that are very accessible. This one I think does a good job showing the behavioral relationship between the eigenvalues and the underlying system: https://youtu.be/XXjoh8L1HkE

Recursive application of a system (model) over a "long" period of time gets rid of transients, so the system will fall onto its governing attractors, which are generally dictated by its eigenvalues. The recursive application also helps isolate the system so you are observing the model running autonomously, rather than being driven by external inputs. This helps you tease out how expressive your model actually is versus how dependent it is on you feeding it the target system's observations, which helps reduce overfitting and bias.
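To make the eigenvalue point concrete, here is a minimal sketch using a toy linear system x_{t+1} = A x_t. The matrix (a scaled rotation), the helper names `spectral_radius` and `iterate`, and the step count are all assumptions chosen for illustration; real models are usually nonlinear, so this only shows the basic mechanism.

```python
import numpy as np

def spectral_radius(A):
    """Largest eigenvalue magnitude of the update matrix A."""
    return np.max(np.abs(np.linalg.eigvals(A)))

def iterate(A, x0, n_steps):
    """Apply x <- A @ x repeatedly, with no external input."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_steps):
        x = A @ x
    return x

# Illustrative system: a rotation by theta, scaled up or down.
theta = 0.3
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])
x0 = np.array([1.0, 0.0])

for scale in (0.95, 1.0, 1.05):
    A = scale * rotation
    final_norm = np.linalg.norm(iterate(A, x0, 200))
    print(f"radius={spectral_radius(A):.2f}  |x_200|={final_norm:.3e}")
```

With a spectral radius below 1 the state decays toward zero, at exactly 1 it keeps oscillating, and above 1 it blows up. That is the long-run behavior you only see once the system is left to run on its own.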

weeeeeewoooooo t1_j8wyqaj wrote on February 17, 2023 at 3:28 PM

Reply to [Discussion] Time Series methods comparisons: XGBoost, MLForecast, Prophet, ARIMAX? by RAFisherman

You should probably try all four. There are some simple ways for you to do comparisons yourself. You can easily compare time-series models and the robustness of their training by using them to recursively predict the future, feeding their outputs back into themselves (regardless of whether they were trained in that fashion).
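A rough sketch of that recursive (free-running) prediction loop is below. The `rollout` helper and the `step` callable are hypothetical names, not part of XGBoost, MLForecast, Prophet, or any ARIMAX library; you would wrap your own fitted model's one-step prediction in `step`. The toy AR(2) model is only there so the example runs on its own.

```python
import numpy as np

def rollout(step, history, n_steps):
    """Free-run a one-step model by feeding its own predictions back in.

    step:     hypothetical callable mapping a window (1-D array) to the next value
    history:  observed values, used only to seed the first window
    n_steps:  number of future values to generate
    """
    window = np.asarray(history, dtype=float).copy()
    preds = np.empty(n_steps)
    for i in range(n_steps):
        preds[i] = step(window)
        # Slide the window: drop the oldest value, append the prediction.
        window = np.append(window[1:], preds[i])
    return preds

# Toy one-step model: an AR(2) recursion that reproduces a pure sinusoid.
omega = 0.3
ar2 = lambda w: 2.0 * np.cos(omega) * w[-1] - w[-2]

seed = np.sin(omega * np.arange(2))   # two observed points
free_run = rollout(ar2, seed, 200)    # everything after that is self-fed
```

After the seed window, the model never sees an observation again, so whatever `free_run` looks like is down to the model's own dynamics.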

This will expose the properties of the eigenvalues of the model itself. Failure of a time-series model to match the larger eigenvalues of a system means it is failing the fundamentals and not able to capture the most basic global properties of the system you are trying to fit.
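For a linear autoregressive model you can read the eigenvalues off directly. The sketch below assumes a plain least-squares AR fit and a noise-free sinusoid as the "system"; `fit_ar` and `companion_eigenvalues` are illustrative helpers, not functions from any forecasting package. For nonlinear models there is no such shortcut, and the long self-fed rollout is the practical way to see the same thing.

```python
import numpy as np

def fit_ar(series, order):
    """Least-squares fit of x[t] = a1*x[t-1] + ... + ap*x[t-p]."""
    x = np.asarray(series, dtype=float)
    # Column k holds the series lagged by k+1 steps.
    X = np.column_stack(
        [x[order - k - 1 : len(x) - k - 1] for k in range(order)]
    )
    y = x[order:]
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coeffs

def companion_eigenvalues(coeffs):
    """Eigenvalues of the companion matrix of the fitted AR recursion."""
    p = len(coeffs)
    C = np.zeros((p, p))
    C[0, :] = coeffs            # first row carries the AR coefficients
    C[1:, :-1] = np.eye(p - 1)  # sub-diagonal shifts the window
    return np.linalg.eigvals(C)

omega = 0.3
signal = np.sin(omega * np.arange(500))

eig = companion_eigenvalues(fit_ar(signal, 2))
print("magnitudes:", np.abs(eig))
print("angles:    ", np.angle(eig))
```

A sustained oscillation at frequency `omega` corresponds to a complex pair of eigenvalues with magnitude 1 and angle ±`omega`. A fit whose dominant eigenvalues land well inside or outside the unit circle will decay or explode when run on its own output.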

You don't necessarily have to do any fancy calculations. If the models fail to maintain the same qualitative patterns apparent in the original data over long periods of self-input, then they are failing to capture the underlying dynamics. Many models eventually explode or collapse onto some trivial attractor (like a simple cycle or a fixed value). This is a red flag that either the model is inadequate or training has failed you.
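If you would rather not eyeball every plot, a crude automatic check is possible. The function below is a hypothetical heuristic with arbitrary thresholds (`blowup`, `flat`, `tail` are assumptions, not established defaults), and it only catches explosion and collapse to a constant; it will not flag a model that settles into a simple cycle.

```python
import numpy as np

def diagnose_free_run(free_run, reference, tail=0.25, blowup=10.0, flat=0.05):
    """Rough label for a self-fed forecast compared with the original data."""
    free_run = np.asarray(free_run, dtype=float)
    reference = np.asarray(reference, dtype=float)

    # Exploded: non-finite values, or far outside the range of the data.
    if (not np.all(np.isfinite(free_run))
            or np.max(np.abs(free_run)) > blowup * np.max(np.abs(reference))):
        return "exploded"

    # Collapsed: the last part of the run has almost no variation left.
    n_tail = max(2, int(len(free_run) * tail))
    if np.std(free_run[-n_tail:]) < flat * np.std(reference):
        return "collapsed to a fixed value"

    return "sustained"

t = np.arange(400)
reference = np.sin(0.3 * t)

print(diagnose_free_run(reference, reference))
print(diagnose_free_run(0.95 ** t * reference, reference))
print(diagnose_free_run(1.05 ** t * reference, reference))
```

A "sustained" label only means the run did not obviously fail. You still need to compare its qualitative pattern against the original signal.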

A simple dummy test for this would be training on something like a spin glass or Lorenz attractor, any kind of chaotic system really. Or just look along any interesting dimension of the data that you are using. A good model, when recursively applied to itself, will behave very similarly to the original signal, regardless of phase.
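Here is a minimal sketch of generating Lorenz data for such a test, using the classic parameters (sigma = 10, rho = 28, beta = 8/3) and a fixed-step RK4 integrator. The step size, run length, and initial conditions are assumptions. The second trajectory is not a trained model; it is a slightly perturbed copy of the system, standing in for a "perfect" model that starts at a slightly different phase.

```python
import numpy as np

def lorenz(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz equations."""
    x, y, z = state
    return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])

def integrate(state0, n_steps, dt=0.01):
    """Fixed-step fourth-order Runge-Kutta integration."""
    traj = np.empty((n_steps + 1, 3))
    traj[0] = state0
    for i in range(n_steps):
        s = traj[i]
        k1 = lorenz(s)
        k2 = lorenz(s + 0.5 * dt * k1)
        k3 = lorenz(s + 0.5 * dt * k2)
        k4 = lorenz(s + dt * k3)
        traj[i + 1] = s + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return traj

a = integrate(np.array([1.0, 1.0, 1.0]), 5000)
b = integrate(np.array([1.0, 1.0, 1.0 + 1e-8]), 5000)  # tiny perturbation

# Point-by-point the two runs drift apart, but their overall
# statistics (after discarding the transient) stay comparable.
separation = np.linalg.norm(a - b, axis=1)
print("max pointwise separation:", separation.max())
print("std of x:", a[1000:, 0].std(), b[1000:, 0].std())
```

This is why pointwise error is the wrong yardstick over long horizons on a chaotic system. What you want from a self-fed model is the same kind of trajectory on the same attractor, not the same values at the same time steps.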