derpderp3200 OP t1_j2avw24 wrote
Reply to comment by realjunkman in [D] Has any research been done to counteract the fact that each training datapoint "pulls the model in a different direction", partly undoing learning until shared features emerge? by derpderp3200
Interesting! I thought about something similar, a "no parameter is left unused" rule during training, but using unused regions for fine-tuning sounds like a much more clever application of the principle.
realjunkman t1_j2cc4nm wrote
It was a presentation I saw at EMNLP this past year. I’ll try and look for it, but if I don’t report back… it was a presentation during day 3!