

   comp.ai      Awaiting the gospel from Sarah Connor      1,954 messages   


   Message 1,135 of 1,954   
   Ted Dunning to mathlover   
   Re: Possible to Find the Clusters One by   
   23 Jul 06 09:29:25   
   
   From: ted.dunning@gmail.com   
      
   mathlover wrote:   
   > ... in the problem I am working on simple k-means clustering attains
   > satisfying quality.
   >   
   > However, because of the very large size of the problem it takes a lot   
   > of time to find all the clusters (I mean using k-means).   
      
   Actually, it sounds like you just need a really fast version of   
   k-means.   
      
   That is much more easily come by than what you are asking for.  The
   problem with what you are asking for is that the positions of the
   unwanted clusters help define the desired clusters.
      
   Fast k-means algorithms avoid making multiple passes through all of   
   your data.  Instead, they make multiple passes through a subset of the   
   data (a randomized subset, of course) until the cluster centroids are   
   fairly well defined and then they simply classify the remaining data   
   points by making a single pass through them.   
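
A rough sketch of that idea in Python, standard library only. Everything
in it is an assumption made for illustration: the function names, the
`sample_size` and `iters` parameters, plain Lloyd-style iterations on the
subset, and random points from the subset as starting centroids. It is
not any particular published fast k-means variant.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centroids):
    """Index of the centroid closest to point."""
    return min(range(len(centroids)), key=lambda i: sq_dist(point, centroids[i]))

def kmeans(points, k, iters=20, rng=random):
    """Plain k-means (Lloyd iterations) on a small list of points."""
    # Assumption: start from k distinct points drawn at random.
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        for i, g in enumerate(groups):
            if g:  # an empty cluster keeps its previous centroid
                centroids[i] = [sum(c) / len(g) for c in zip(*g)]
    return centroids

def subsample_kmeans(data, k, sample_size, iters=20, seed=0):
    """Fit centroids on a random subset, then label all data in one pass."""
    rng = random.Random(seed)
    sample = rng.sample(data, min(sample_size, len(data)))
    # Multiple passes, but only over the subset.
    centroids = kmeans(sample, k, iters, rng)
    # Single pass over the full data set to classify every point.
    labels = [nearest(p, centroids) for p in data]
    return centroids, labels
```

The expensive part (repeated passes) costs roughly `iters * sample_size * k`
distance computations, and the full data set is touched only once. How
large the subset needs to be depends on the data; the sketch makes no
attempt to check whether the centroids have actually settled.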
      
   Also, if you really have a large data set then you probably don't have   
   to cluster all of your data to find the clusters of interest.  A subset   
   should do as well.   
      
   [ comp.ai is moderated ... your article may take a while to appear. ]   
      
   --- SoupGate-Win32 v1.05   
    * Origin: you cannot sedate... all the things you hate (1:229/2)   


