From: ben.usenet@bsb.me.uk   
      
   Tim Rentsch writes:   
      
   > ram@zedat.fu-berlin.de (Stefan Ram) writes:   
   >   
   >> ram@zedat.fu-berlin.de (Stefan Ram) writes:   
   >>   
   >>> Given n times of the 24-hour day, print their average.   
   >>> For example, the average of "eight o'clock" and   
   >>> "ten o'clock" (n=2) would be "nine o'clock".   
   >>> (You can choose any representation, for example "HH:MM"   
   >>> or "seconds since midnight".)   
   >>   
   >> Thanks for all replies!   
   >>   
   >> I waited a few days before answering to allow   
   >> sufficient time to think about the problem.   
   >>   
   >> There were not enough tests written and run. As a result,   
   >> the puzzle has not yet been solved (unless I have overlooked   
   >> a contribution or misworded expectations).   
   >>   
   >> So, here are two possible test cases.   
   >>   
   >> average( 23.5, 1.5 )== 0.5   
   >> average( 11.5, 13.5 )== 12.5   
   >>   
   >> (I use hours as units, so "0.5" means, "half past midnight".)   
   >>   
   >> I hope that these test cases encode sensible expectations   
   >> for an average of two times on a 24-hour clock in the spirit   
   >> of the example given in the OP, which was, "the average of   
   >> eight o'clock and ten o'clock would be nine o'clock", since   
   >> these test cases just have rotated that example by 3.5 and   
   >> 15.5 hours.   
   >>   
   >> I believe that I have not seen an algorithm so far in this   
   >> thread that would pass these tests.   
   >   
   > As before, the problem is underspecified.   
      
   Some remarks not specifically in reply to you, Tim...   
      
   The input is a collection, t(n), of n > 1 numbers in [0, 24). The   
   average should be a number, A, in [0, 24) that minimises   
      
    Sum_{i=1,n} distance(A, t(i))   
      
   (or Sum_{i=1,n} difference(A, t(i))^2 if you prefer to think in terms of
   variance). So far, this is just what an average is. The key question
   is: which distance (or difference) is it whose sum (or sum of squares)
   we want to minimise? For times, I would say it is the length of the
   shorter arc round an imaginary 24-hour clock face.
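
   As a minimal sketch of that distance (the names arc_distance and
   total_distance are hypothetical, not from the thread, and times are
   assumed to be given in hours in [0, 24)):

```python
def arc_distance(a, b, period=24.0):
    """Length of the shorter arc between two times on a circular clock."""
    d = abs(a - b) % period
    return min(d, period - d)


def total_distance(candidate, times, period=24.0):
    """Sum of shorter-arc distances from a candidate average to each time."""
    return sum(arc_distance(candidate, t, period) for t in times)


# 23:30 and 01:30 are two hours apart across midnight, not 22 hours.
print(arc_distance(23.5, 1.5))            # 2.0
# Half past midnight is a far better "average" than half past noon.
print(total_distance(0.5, [23.5, 1.5]))   # 2.0
print(total_distance(12.5, [23.5, 1.5]))  # 22.0
```

   With this distance, a candidate of 0.5 scores much better than 12.5
   for the inputs 23.5 and 1.5, which matches the first test case
   quoted above.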
      
   The problem has a natural interpretation in terms of angles. Whatever   
   the circular quantity is, convert the values to unit vectors round a   
   circle. For times of day, just scale [0, 24) to [0, 2*pi). The average   
   is then just the direction of the average vector, converted back to a   
   time of day.   
      
   Sometimes that vector has zero length, and the average is undefined, but   
   otherwise the length of the vector gives an indication of the   
   variability of the data.   
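
   The vector approach described above might be sketched as follows.
   This is only an illustration under stated assumptions: the function
   name circular_mean_hours, the eps threshold for "zero length", and
   the choice to return None for an undefined average are all
   hypothetical, not something given in the thread. Note also that the
   direction of the mean vector is not, in general, guaranteed to be
   exactly the value that minimises the summed arc distances; for the
   two-value test cases quoted above the two agree.

```python
import math


def circular_mean_hours(times, period=24.0, eps=1e-12):
    """Return (mean, length) for times on a circular clock.

    mean is a time in [0, period), or None when the mean vector is
    (numerically) zero and the average is undefined.
    length is the mean vector's length, in [0, 1]; values near 1 mean
    the times are tightly clustered, values near 0 mean widely spread.
    """
    if not times:
        raise ValueError("need at least one time")
    scale = 2.0 * math.pi / period
    # Average the unit vectors that point at each time on the clock face.
    x = sum(math.cos(t * scale) for t in times) / len(times)
    y = sum(math.sin(t * scale) for t in times) / len(times)
    length = math.hypot(x, y)
    if length < eps:  # assumed threshold for "zero length"
        return None, length
    mean = (math.atan2(y, x) / scale) % period
    if mean >= period:  # guard against rounding up to exactly `period`
        mean = 0.0
    return mean, length


for sample in ([8.0, 10.0], [23.5, 1.5], [11.5, 13.5], [0.0, 12.0]):
    print(sample, circular_mean_hours(sample))
```

   For the first three samples the mean comes out (to within rounding)
   as 9, 0.5 and 12.5, so both quoted test cases pass. The last sample,
   midnight and noon, gives a vector of essentially zero length, and
   the sketch reports the average as undefined.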
      
   Why do I consider this a reasonable interpretation of the problem?   
   Well, given a list of times of day when a train is observed to pass some   
   station, the circular 24-hour-time average should be our best estimate   
   of the scheduled time.   
      
   Obviously there are other possible readings of the problem, but I was   
   not able to justify any of them as useful for any real-world   
   applications. This is a case where I hope I am wrong and there /are/   
   other circular averages with practical interpretations.   
      
   --   
   Ben.   
      