TheLionKing2020
TheLionKing2020 t1_iu1bcw8 wrote
Reply to [D] [R] Large-scale clustering by jesusfbes
Well, you don't need to train on all of this data.
First take samples of 10k, 50k and 100k and see whether the results differ. Do you get a different number of clusters?
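A rough sketch of that check in Python, assuming scikit-learn is available. The comment does not name a clustering algorithm, so k-means with a silhouette-based choice of k is only a stand-in here, and the helper names `best_k` and `cluster_counts_by_sample_size` are made up for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score


def best_k(X, k_range=range(2, 9), seed=0):
    """Return the k in k_range with the highest silhouette score (assumed criterion)."""
    scores = {}
    for k in k_range:
        labels = KMeans(n_clusters=k, n_init=3, random_state=seed).fit_predict(X)
        # Score on at most 2000 points so the check stays cheap on big samples.
        scores[k] = silhouette_score(
            X, labels, sample_size=min(len(X), 2000), random_state=seed
        )
    return max(scores, key=scores.get)


def cluster_counts_by_sample_size(X, sizes=(10_000, 50_000, 100_000), seed=0):
    """Cluster random subsamples of increasing size and report the chosen k for each."""
    rng = np.random.default_rng(seed)
    results = {}
    for n in sizes:
        n = min(n, len(X))  # never ask for more rows than exist
        idx = rng.choice(len(X), size=n, replace=False)
        results[n] = best_k(X[idx], seed=seed)
    return results


if __name__ == "__main__":
    # Synthetic stand-in data; sizes are scaled down so the demo runs quickly.
    X, _ = make_blobs(n_samples=20_000, centers=4, n_features=10, random_state=0)
    print(cluster_counts_by_sample_size(X, sizes=(1_000, 5_000, 10_000)))
```

If the chosen k is the same at every sample size, a larger sample is unlikely to change the picture much; if it keeps shifting, that is a sign the sample is still too small or the structure is not stable.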
TheLionKing2020 t1_iu3g14t wrote
Reply to comment by jesusfbes in [D] [R] Large-scale clustering by jesusfbes
Also, before running tests on more than 100k samples, check whether you can reduce the dimensionality: feature selection, dropping low-variance features, PCA, etc.
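Something like the following could do the reduction step, again assuming scikit-learn. The thresholds (`min_variance=1e-3`, 95% explained variance) and the helper name `build_reducer` are arbitrary choices for illustration, not values from this thread. Note that the variance filter runs on the raw features, since it is scale-dependent, and scaling happens only afterwards.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.feature_selection import VarianceThreshold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def build_reducer(min_variance=1e-3, explained_variance=0.95):
    """Drop near-constant features, standardize, then keep enough principal
    components to explain the requested fraction of variance."""
    return make_pipeline(
        VarianceThreshold(threshold=min_variance),
        StandardScaler(),
        # A float n_components in (0, 1) selects components by explained variance.
        PCA(n_components=explained_variance, svd_solver="full"),
    )


if __name__ == "__main__":
    # Synthetic stand-in: 20 features driven by 3 latent factors, plus 5 constant columns.
    rng = np.random.default_rng(0)
    latent = rng.normal(size=(1000, 3))
    mixing = rng.normal(size=(3, 20))
    informative = latent @ mixing + 0.01 * rng.normal(size=(1000, 20))
    X = np.hstack([informative, np.ones((1000, 5))])

    reducer = build_reducer()
    X_reduced = reducer.fit_transform(X)
    print(X.shape, "->", X_reduced.shape)
```

Fit the reducer on one of the smaller samples first, then reuse it with `reducer.transform(...)` on the larger ones so every run is clustered in the same reduced space.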