comp.ai | Message 1,545 of 1,954
Ted Dunning to All
Re: clustering criterion
17 Oct 07 23:49:38
From: ted.dunning@gmail.com

With web sessions, the best proximity data you can get (in my opinion)
is derived by building user or session models and then seeing how well
the model predicts other sessions. Depending on the cardinality of
your event set, you may need to use a latent variable model to deal
with sparsity. If you have only a few (hundreds, say) pages that all
get pretty good traffic then you may be able to model their visits
explicitly. There are many forms of latent models possible, but the
latent Dirichlet work on hidden Markov models for text clustering
might be of particular interest to you.

> thanks. actually my dataset is web sessions. i computed a similarity
> matrix between sessions. since i don't know the label of sessions
> priorly, i'm kind of confused about the criterion. by the way, does
> anyone know good algorithm for this kind of matrix clustering?
> thanks in advance.
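
As a rough illustration of the "build a session model, then see how well it predicts other sessions" idea for the simple case of a small page set modeled explicitly, here is a minimal sketch. It is not the latent variable model mentioned above. It assumes each session is a non-empty list of page names, fits a smoothed multinomial over pages per session, and scores proximity as the average log-likelihood in both directions. The names `fit_page_model`, `avg_log_likelihood`, `proximity`, the smoothing constant `alpha`, and the toy sessions are all made up for the example.

```python
import math
from collections import Counter


def fit_page_model(session, vocab, alpha=1.0):
    """Smoothed multinomial over pages, estimated from a single session.

    alpha is an additive (Laplace-style) smoothing constant, so pages the
    session never visited still get non-zero probability.
    """
    counts = Counter(session)
    total = len(session) + alpha * len(vocab)
    return {page: (counts[page] + alpha) / total for page in vocab}


def avg_log_likelihood(model, session):
    """Mean per-event log-probability of a (non-empty) session under a model."""
    return sum(math.log(model[page]) for page in session) / len(session)


def proximity(a, b, vocab, alpha=1.0):
    """Symmetric score: how well each session's model predicts the other.

    Values are log-likelihoods, so they are negative; closer to zero
    means the two sessions predict each other better.
    """
    a_predicts_b = avg_log_likelihood(fit_page_model(a, vocab, alpha), b)
    b_predicts_a = avg_log_likelihood(fit_page_model(b, vocab, alpha), a)
    return 0.5 * (a_predicts_b + b_predicts_a)


# Toy sessions, invented for illustration only.
sessions = {
    "s1": ["home", "search", "item", "item", "cart"],
    "s2": ["home", "search", "item", "cart", "checkout"],
    "s3": ["help", "faq", "faq", "contact"],
}
vocab = sorted({page for s in sessions.values() for page in s})

for x, y in [("s1", "s2"), ("s1", "s3"), ("s2", "s3")]:
    print(x, y, round(proximity(sessions[x], sessions[y], vocab), 3))
```

With a large or sparse page set, per-session counts like these become unreliable, which is where the latent variable models mentioned above would come in. The pairwise scores can be collected into a matrix and handed to whatever clustering method you prefer.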