comp.ai

Ted Dunning to All

Re: clustering criterion

17 Oct 07 23:49:38

   From: ted.dunning@gmail.com   
      
   With web sessions, the best proximity data you can get (in my opinion)   
   is derived by building user or session models and then seeing how well   
   the model predicts other sessions.  Depending on the cardinality of   
   your event set, you may need to use a latent variable model to deal   
   with sparsity.  If you have only a few (hundreds, say) pages that all   
   get pretty good traffic then you may be able to model their visits   
   explicitly.  There are many forms of latent models possible, but the   
   latent Dirichlet work on hidden Markov models for text clustering   
   might be of particular interest to you.   
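      
   A minimal sketch of the model-based proximity idea above, for the
   small-page-set case where visits can be modeled explicitly. The
   specifics are illustrative assumptions, not something stated in the
   post: each session gets an add-alpha smoothed page-visit distribution,
   and the proximity of two sessions is the symmetrised average
   log-likelihood of each session under the other's model. The function
   names and the toy sessions are hypothetical, and sessions are assumed
   to be non-empty.

```python
import math
from collections import Counter


def fit_unigram(session, vocab, alpha=1.0):
    """Add-alpha smoothed page-visit distribution for a single session."""
    counts = Counter(session)
    total = len(session) + alpha * len(vocab)
    return {page: (counts[page] + alpha) / total for page in vocab}


def avg_log_likelihood(model, session):
    """Mean per-event log-probability of `session` under `model`."""
    return sum(math.log(model[page]) for page in session) / len(session)


def proximity_matrix(sessions, alpha=1.0):
    """Pairwise proximity: how well each session's model predicts the other."""
    vocab = sorted({page for s in sessions for page in s})
    models = [fit_unigram(s, vocab, alpha) for s in sessions]
    n = len(sessions)
    prox = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Symmetrise: average of "i predicts j" and "j predicts i".
            prox[i][j] = 0.5 * (
                avg_log_likelihood(models[i], sessions[j])
                + avg_log_likelihood(models[j], sessions[i])
            )
    return prox


# Toy data (hypothetical): two shopping-like sessions and one support-like.
sessions = [
    ["home", "search", "product", "cart"],
    ["home", "search", "product", "product"],
    ["help", "contact", "help", "faq"],
]
prox = proximity_matrix(sessions)
```

   Higher (less negative) values mean the two sessions predict each
   other better. With a large or sparse page set, the per-session
   distribution would be replaced by a latent variable model, as the
   post suggests; this sketch does not cover that case.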
      
   > Thanks. Actually, my dataset is web sessions. I computed a similarity
   > matrix between sessions. Since I don't know the labels of the sessions
   > beforehand, I'm kind of confused about the criterion. By the way, does
   > anyone know a good algorithm for this kind of matrix clustering?
   > Thanks in advance.
      