Submitted by SimonJDPrince t3_10jlq1q in MachineLearning
arsenyinfo t1_j5m33oc wrote
As a practitioner, I am surprised to see no chapter on fine-tuning.
SimonJDPrince OP t1_j5o0orv wrote
Can you give me an example of a review article or chapter in another book that covers roughly what you expect to see?
NoRexTreX t1_j5o9hd8 wrote
I can't point to example material beyond the Hugging Face documentation, but leveraging pre-trained models is the big thing right now, so if your book doesn't mention it, it's missing the hottest topic. Also worth a mention: AdapterHub.
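For instance, here is a minimal sketch of what "leveraging a pre-trained model" looks like with the Hugging Face transformers library (the checkpoint name and the two-class task are placeholder assumptions, not from the thread):

```python
# Hedged sketch: start from pretrained weights instead of a random network.
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",  # pretrained backbone; only the new head is randomly initialized
    num_labels=2,         # placeholder: assumes a binary classification task
)
# From here, fine-tune on task-specific data with an ordinary training loop.
```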
arsenyinfo t1_j5r4b5l wrote
Random ideas off the top of my head:
- an intro to why transfer learning works;
- the old but good https://cs231n.github.io/transfer-learning/;
- the concept of catastrophic forgetting;
- some intuition for answering empirical questions, e.g. which layers should be frozen, how to adapt the learning rate, etc. (a sketch follows this list).
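To make that last point concrete, a hedged PyTorch sketch (the model, the choice of which layers to freeze, and the learning rates are all illustrative assumptions, not recommendations):

```python
# Sketch: freeze early layers, use a smaller learning rate for the
# pretrained backbone than for the newly added head.
import torch
import torchvision

model = torchvision.models.resnet18(weights="IMAGENET1K_V1")
model.fc = torch.nn.Linear(model.fc.in_features, 10)  # new head; 10 classes assumed

# Freeze everything except the last residual block and the new head.
for name, param in model.named_parameters():
    if not (name.startswith("layer4") or name.startswith("fc")):
        param.requires_grad = False

# Discriminative learning rates: lower for pretrained weights, higher for the head.
optimizer = torch.optim.AdamW([
    {"params": model.layer4.parameters(), "lr": 1e-4},
    {"params": model.fc.parameters(), "lr": 1e-3},
])
```

The usual intuition (as in the cs231n notes above) is that early layers carry generic features and should move slowly or not at all, while the randomly initialized head needs a larger learning rate.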
SimonJDPrince OP t1_j5taba8 wrote
Thanks. This is useful.
Apprehensive-Grade81 t1_j5oijh0 wrote
I'd definitely like something like this. Maybe SOTA benchmarks as well.
new_name_who_dis_ t1_j5oix1c wrote
Fine-tuning isn't any different from just training, is it? You just don't start with a random network; beyond that (and the size of the dataset), fine-tuning doesn't really involve anything new.
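Concretely, a fine-tuning loop is just a training loop with a pretrained initialization. A self-contained sketch (dummy tensors stand in for the task dataset; all hyperparameters are placeholders):

```python
# Sketch: the only fine-tuning-specific line is loading pretrained weights.
import torch
import torchvision

model = torchvision.models.resnet18(weights="IMAGENET1K_V1")  # pretrained, not random
model.fc = torch.nn.Linear(model.fc.in_features, 10)  # new head for an assumed 10-class task

# Dummy stand-in for a (typically small) task-specific dataset.
dataloader = torch.utils.data.DataLoader(
    torch.utils.data.TensorDataset(
        torch.randn(32, 3, 224, 224), torch.randint(0, 10, (32,))
    ),
    batch_size=8,
)

optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
loss_fn = torch.nn.CrossEntropyLoss()

model.train()
for images, labels in dataloader:  # from here on, an entirely ordinary training loop
    optimizer.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()
    optimizer.step()
```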
SimonJDPrince OP t1_j5olux3 wrote
That was kind of my impression. And I do discuss this in the chapters on transformers and regularization. Was wondering if there is more to it.