From: ted.dunning@gmail.com   
      
   On Aug 16, 8:16 pm, "adshaikh.hip...@googlemail.com"   
    wrote:   
   >   
   > ... I have an interesting problem here.   
      
   Indeed. Interesting and very well stated.   
      
   > 1) ... million events N...usually inter arrival times of packets   
      
   OK.   
      
   And presumably we can say that there is some hidden state of the   
   packet source that is determining the distribution of the inter-   
   arrival time as well as some other observables such as source IP,   
   protocol or packet length. These other observables can be considered   
   inputs or outputs as your problem needs determine. I will assume   
   there that you have disentangled the problem so that you are looking   
   at the inter-arrival times of a particular kind of packet from a   
   particular source. Your actual problem will probably be somewhat more   
   complex.   
      
   > ... I have a Markov model with states which are related   
   > to the number of bins I shrink down to. But doing this will give me   
   > only a very basic fit of the model.   
      
   But this basic idea can be very nicely extended to give you a much   
   better fit.   
      
   Instead of binning the inter-arrival times, suppose that you describe   
   the distribution with some plausible distribution. And distribution   
   that is limited to non-negative values that has sufficient flexibility   
   is probably pretty much OK. Thus, you might use gamma, or log-normal   
   or whatever.   
      
   Now, what I would do here is assume that there are some reasonably   
   small (but undetermined) number of discrete internal states and that   
   each of these internal states has an associated distribution for inter-   
   arrival time. The internal states should also have transition   
   probabilities. This basically just is a reframing of a hidden Markov   
   model so that it has continuous outputs rather than discrete outputs.   
      
   The goal here is to learn the parameters of the distribution for each   
   state, the transition probabilities and then to have a viable   
   estimation process for determining the internal state. As inputs to   
   the process, you need some sort of weakly informative prior for the   
   inter-arrival distributions and for the transition matrix. The inter-   
   arrival prior is easy enough since you know the range of values that   
   you expect to see. The transition matrix is also pretty simple since   
   you can adapt the recent work on Bayesian HMM's pretty directly.   
      
   >From this, it is pretty straightforward to do a Gibb's sampler.   
   Sample distributions and transition matrix from the priors, then   
   estimate hidden states, estimate parameters based on hidden states,   
   lather, rinse, repeat. This will probably be a bit slow, but should   
   converge pretty well for small numbers of internal states. The   
   posterior marginal likelihood for different numbers of states can be   
   used to select how many states you think your data will justify.   
      
   The next level of wonderment could be had by using a Dirichlet process   
   to describe the number of internal states. This moves your effort to   
   a non-parametric level and would allow it to learn some pretty   
   elaborate structures.   
      
   Take, for example, a case where you have a source that generates   
   events at one of two rates. Moreover, it tends to spend a fair bit of   
   time in the faster generating state once it gets into it (i.e. spews   
   packets at high rate for a while) but then spends a long time   
   quiescent (i.e. goes quiet). If you use a Poisson distribution, then   
   both states will have pretty variable packet rates, but you could use   
   the Gamma distribution so that the quiescent state could be highly   
   variable (i.e. Poisson) while the fast rate could be very nearly   
   periodic.   
      
   On the initial sampling with two states, one state will have a higher   
   rate distribution than the other and therefore most of the fast events   
   will be assigned to it. From this, you will get pretty good samples   
   for the model parameters which will give you good assignments and so   
   on. The transition from fast to slow would be very unlikely (to allow   
   long bursts) but transition from slow to fast would essentially be   
   certain.   
      
   Models with one state would not model the data well and thus would   
   have low marginal probability while models with three or more states   
   would model the data well, but would have large amounts of co-   
   dependent parameters leading to low marginal posterior probability.   
   The posterior on number of states would have a sharp peak at two   
   states and the posteriors on the inter-arrival models would also have   
   sharp peaks if the rates are highly distinct. The sharpness of these   
   peaks would give you valuable information about how well your overall   
   model fits the data.   
      
   a   
      
   [ comp.ai is moderated ... your article may take a while to appear. ]   
      
   --- SoupGate-Win32 v1.05   
    * Origin: you cannot sedate... all the things you hate (1:229/2)   
|