
AltruisticNight8314 t1_j1ohh7u wrote

What hardware would be required to i) train or ii) fine-tune weights (i.e. run a few epochs on my own data) for medium-sized transformers (500M-15B parameters)?

I do research in proteomics, and I have a very specific problem where fine-tuning the weights of a pretrained transformer (such as ESM-2) might work well.

Of course, there's always the poor man's alternative of building a supervised model on the embeddings returned by the encoder.
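
To make that fallback concrete, here is a minimal sketch of what I mean, under loud assumptions: the embeddings below are random placeholders standing in for whatever a frozen encoder would return (no real ESM-2 call is made), the task is binary, and the helper names (`make_placeholder_embeddings`, `train_logistic_head`, `predict_proba`) are hypothetical. The point is only that the encoder weights are never touched, so the trainable part is a small head on fixed vectors.

```python
import math
import random


def predict_proba(weights, bias, x):
    """Probability of class 1 for one embedding vector (stable sigmoid)."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_logistic_head(embeddings, labels, epochs=200, lr=0.1):
    """Fit a logistic-regression head on fixed embeddings with
    full-batch gradient descent. The encoder is not involved here."""
    dim = len(embeddings[0])
    n = len(embeddings)
    weights = [0.0] * dim
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y in zip(embeddings, labels):
            err = predict_proba(weights, bias, x) - y
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        # Average the gradient over the batch before stepping.
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def make_placeholder_embeddings(n_per_class=50, dim=8, seed=0):
    """ASSUMPTION: synthetic stand-in for encoder output. In practice
    these would be per-sequence embeddings computed once and cached."""
    rng = random.Random(seed)
    X, y = [], []
    for label, mean in ((0, -1.0), (1, 1.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(mean, 1.0) for _ in range(dim)])
            y.append(label)
    return X, y


X, y = make_placeholder_embeddings()
weights, bias = train_logistic_head(X, y)
accuracy = sum(
    (predict_proba(weights, bias, x) >= 0.5) == bool(t) for x, t in zip(X, y)
) / len(y)
print(f"training accuracy on placeholder data: {accuracy:.2f}")
```

With real data, the embeddings would be computed once and cached, and the head would be evaluated on a held-out split rather than on the training set as done here.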