[D] Has any research been done to counteract the fact that each training datapoint "pulls the model in a different direction", partly undoing learning until shared features emerge?
Submitted by derpderp3200 t3_zwd49c on December 27, 2022 at 11:01 AM in MachineLearning
20 comments
velcher t1_j217txd wrote on December 28, 2022 at 10:37 PM
Gradient Surgery for Multi-Task Learning: https://arxiv.org/abs/2001.06782
This is related work from multi-task RL. From what I remember, though, my impression was that it only moderately helps multi-task RL.
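For anyone who wants the gist of the linked paper without reading it: the core idea (PCGrad) is that when two task gradients conflict, meaning their dot product is negative, you project one onto the normal plane of the other before combining them. Below is a minimal pure-Python sketch of that projection step. The function names `dot` and `pcgrad` and the toy gradients are my own for illustration, not the paper's code, and applying the same trick to per-datapoint gradients instead of per-task gradients would be an extrapolation that the paper does not claim.

```python
import random

def dot(a, b):
    """Dot product of two equal-length vectors given as lists."""
    return sum(x * y for x, y in zip(a, b))

def pcgrad(grads, rng=None):
    """Sketch of PCGrad-style gradient surgery (hypothetical helper).

    grads: list of per-task gradient vectors (lists of floats).
    Returns the sum of the task gradients after each one has had its
    conflicting components projected away.
    """
    rng = rng or random.Random(0)  # fixed seed is an illustrative choice
    projected = []
    for i, g in enumerate(grads):
        g = list(g)  # work on a copy; projections use the ORIGINAL other grads
        others = [j for j in range(len(grads)) if j != i]
        rng.shuffle(others)  # other tasks are visited in random order
        for j in others:
            gj = grads[j]
            d = dot(g, gj)
            if d < 0:  # conflict: remove the component of g along gj
                scale = d / dot(gj, gj)
                g = [x - scale * y for x, y in zip(g, gj)]
        projected.append(g)
    # Combine the de-conflicted gradients coordinate by coordinate.
    return [sum(col) for col in zip(*projected)]

# Two toy "task" gradients that point in partly opposite directions.
print(pcgrad([[1.0, 0.0], [-1.0, 1.0]]))  # → [0.5, 1.5]
```

In the toy case a plain sum would give [0.0, 1.0], so the first coordinate cancels completely; after the projection, neither modified gradient has a negative dot product with the other task's original gradient. Gradients that do not conflict pass through unchanged.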