sapnupuasop
sapnupuasop t1_iu0pqof wrote
Reply to [D] [R] Large-scale clustering by jesusfbes
Does clustering on hundreds of features make sense? Maybe Spark could solve your problem, but you'd have to check whether it has implementations of the other algorithms you need.
sapnupuasop t1_iu0zjy5 wrote
Reply to comment by jesusfbes in [D] [R] Large-scale clustering by jesusfbes
Yeah, I was thinking of the curse of dimensionality. With standard Euclidean distance, for example, distances in high dimensions lose their meaning, but there are surely other distance measures that could work there. Btw, I have used sklearn to cluster a couple million rows successfully.
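To make the "distances lose their meaning" point concrete, here is a rough sketch (my own illustration, not something from the thread's data). It assumes points drawn uniformly at random from the unit hypercube and measures how much farther the farthest point is from a query than the nearest one. The function name `relative_contrast` and the sample sizes are made up for the example.

```python
import numpy as np


def relative_contrast(n_points, n_dims, seed=0):
    """Return (max_dist - min_dist) / min_dist for a random query point.

    Assumption: data is uniform in the unit hypercube. Real data with
    structure may behave much better than this worst-case style setup.
    """
    rng = np.random.default_rng(seed)
    points = rng.random((n_points, n_dims))
    query = rng.random(n_dims)
    # Euclidean distance from the query to every point
    dists = np.linalg.norm(points - query, axis=1)
    return (dists.max() - dists.min()) / dists.min()


if __name__ == "__main__":
    for d in (2, 10, 100, 1000):
        print(f"dims={d:5d}  relative contrast={relative_contrast(1000, d):.3f}")
```

If the contrast shrinks toward zero as the dimension grows, the nearest and farthest neighbours are almost equally far away, which is the situation where Euclidean-based clustering tends to struggle. Reducing the dimension first or trying a different distance measure may help, depending on the data.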