From: news.dead.person.stones@darjeeling.plus.com   
      
   On 24/12/2022 14:21, Tim Rentsch wrote:   
   > Ben Bacarisse writes:   
   >   
   >> Tim Rentsch writes:   
   >>   
   >>> ram@zedat.fu-berlin.de (Stefan Ram) writes:   
   >>>   
   >>>> ram@zedat.fu-berlin.de (Stefan Ram) writes:   
   >>>>   
   >>>>> Given n times of the 24-hour day, print their average.   
   >>>>> For example, the average of "eight o'clock" and   
   >>>>> "ten o'clock" (n=2) would be "nine o'clock".   
   >>>>> (You can choose any representation, for example "HH:MM"   
   >>>>> or "seconds since midnight".)   
   >>>>   
   >>>> Thanks for all replies!   
   >>>>   
   >>>> I waited a few days before answering to allow   
   >>>> sufficient time to think about the problem.   
   >>>>   
   >>>> There were not enough tests written and run. As a result,   
   >>>> the puzzle has not yet been solved (unless I have overlooked   
   >>>> a contribution or misworded expectations).   
   >>>>   
   >>>> So, here are two possible test cases.   
   >>>>   
   >>>> average( 23.5, 1.5 )== 0.5   
   >>>> average( 11.5, 13.5 )== 12.5   
   >>>>   
   >>>> (I use hours as units, so "0.5" means, "half past midnight".)   
   >>>>   
   >>>> I hope that these test cases encode sensible expectations   
   >>>> for an average of two times on a 24-hour clock in the spirit   
   >>>> of the example given in the OP, which was, "the average of   
   >>>> eight o'clock and ten o'clock would be nine o'clock", since   
   >>>> these test cases just have rotated that example by 3.5 and   
   >>>> 15.5 hours.   
   >>>>   
   >>>> I believe that I have not seen an algorithm so far in this   
   >>>> thread that would pass these tests.   
   >>>   
   >>> As before, the problem is underspecified.   
   >>   
   >> Some remarks not specifically in reply to you, Tim...   
   >   
   > I didn't really say very much. Your comments are a lot more   
   > interesting.   
   >   
   >> The input is a collection, t(n), of n > 1 numbers in [0, 24). The   
   >> average should be a number, A, in [0, 24) that minimises   
   >>   
   >> Sum_{i=1,n} distance(A, t(i))   
   >>   
   >> (or Sum_{i=1,n} difference(A, t(i))^2 if you prefer to think in terms   
   >> of variance). So far, this is just what an average is. The key point   
   >> is what is the distance (or difference) whose sum (or sum of squares)   
   >> we want to minimise? For times, I would say it is the length of the   
   >> shorter arc round an imaginary 24-hour clock face.   
   >   
   > Minimizing the square of the distance (aka length of the shorter arc)   
   > is one metric, and I think a reasonable one. (Minimizing the absolute   
   > value of the distance gives a result more like the median than the   
   > mean, IIANM.)   
      
   Yes - perhaps Ben was thinking of making   
      
    Sum_{i=1,n} signed_distance(A, t(i)) = 0   
      
   rather than minimising the (absolute) distance. That would match up with   
   minimising the sum of   
   squares of the distance (mean). Or he just meant the median as you say, which   
   is different from the   
   mean. Both have plausible "average" properties, and meet my "minimal"   
   criteria for what I think   
   anyone would require of an "average" applying to a modular arithmetic:   
      
   - if all x_i are within 5 minutes of each other [ALLOWING FOR WRAPPING   
    AROUND THE END OF THE MODULO LOOP] then the "average" will be within   
    those same 5 minutes. NOT e.g. 12 hours or 4 hours or whatever away.   
      
   - if all x_i are translated by a fixed offset like 1 hour, the average   
    will translate by the same offset, i.e. it maintains the same "modulo"   
    relation with the translated x_i.   
      
   hmm, thinking now I'll add the similar:   
      
   - if all x_i are "reflected" about a chosen time, then the new average   
    will be the reflected time of the original average.   
      
   (those aren't intended to be a definition of a "modular average", just to   
   exclude suggestions that   
   blatently lack the right symmetry for such an average.)   
      
   >   
   >> The problem has a natural interpretation in terms of angles. Whatever   
   >> the circular quantity is, convert the values to unit vectors round a   
   >> circle. For times of day, just scale [0, 24) to [0, 2*pi). The   
   >> average is then just the direction of the average vector, converted   
   >> back to a time of day.   
   >   
   > Averaging the "time unit vectors" is another plausible metric.   
      
   Yes I think this one would work best in most situations, given that the OP   
   didn't have a precise   
   mathematical requirement for his solution - probably he wants something that   
   just works reasonably   
   as expected. (It's my favourite answer.)   
      
   Another way of thinking of this approach is "balancing" the 24-hour modular   
   circle, where we've put   
   unit weights at each of the times x_i. E.g. we look for a balancing line   
   through the centre of the   
   circle. [Note times on the circle go from 1-24, not 1-12 like a normal   
   clock.] Also, thinking like   
   this, it's equivalent to (conventional) averaging of the sin of the angular   
   differences from the   
   average (balancing) point, since the sin will give the perpendicular distance   
   of the weight from the   
   balancing line. So it still fits with the general scheme of minimising   
   distance(x_i, v)^2, but with   
   a slightly different metric (distance function) in use.   
      
   As sin(x) is close to x when x is small, the vector average solution would   
   match closely with the   
   mean (by arc length) solution when the times are close together, but diverge   
   more noticeably when   
   the times are spread out...   
      
      
   Mike.   
      
   --- SoupGate-Win32 v1.05   
    * Origin: you cannot sedate... all the things you hate (1:229/2)   
|