magpiesonskates

magpiesonskates t1_j1u1oag wrote on December 27, 2022 at 11:38 AM

Reply to [D] Has any research been done to counteract the fact that each training datapoint "pulls the model in a different direction", partly undoing learning until shared features emerge? by derpderp3200

This is only true if you use a batch size of 1. With randomly sampled batches, each update follows the mean gradient over the batch, so the per-example pulls in different directions should largely average out.
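
To make the averaging concrete, here is a rough sketch in NumPy, not taken from any paper or library. It assumes a toy linear-regression problem with squared loss (the data, sizes, and helper names `per_example_grads` and `mean_cosine_to_full_gradient` are all made up for illustration) and measures how well a randomly sampled batch gradient lines up with the full-dataset gradient at different batch sizes.

```python
import numpy as np

def per_example_grads(w, X, y):
    # Gradient of 0.5 * (x @ w - y)**2 for every row of X: residual * x.
    residual = X @ w - y
    return residual[:, None] * X

def mean_cosine_to_full_gradient(grads, batch_size, n_trials, rng):
    # Average cosine similarity between a random batch's mean gradient
    # and the mean gradient over the whole dataset.
    full = grads.mean(axis=0)
    full_unit = full / np.linalg.norm(full)
    sims = []
    for _ in range(n_trials):
        idx = rng.choice(len(grads), size=batch_size, replace=False)
        g = grads[idx].mean(axis=0)
        sims.append(g @ full_unit / np.linalg.norm(g))
    return float(np.mean(sims))

# Hypothetical toy data: 2000 examples, 20 features, noisy linear targets.
rng = np.random.default_rng(0)
n, d = 2000, 20
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + rng.normal(scale=1.0, size=n)
w = np.zeros(d)  # untrained model, so individual gradients disagree a lot

grads = per_example_grads(w, X, y)
cos_by_batch = {
    b: mean_cosine_to_full_gradient(grads, b, 500, rng) for b in (1, 8, 64)
}
for b, c in cos_by_batch.items():
    print(f"batch size {b:>3}: mean cosine to full gradient = {c:.3f}")
```

In this setup the cosine similarity should climb toward 1 as the batch size grows: single examples point all over the place, while larger batches mostly agree with the overall descent direction. The averaging reduces the noise rather than removing it, so some disagreement remains at any finite batch size.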