home bbs files messages ]

Forums before death by AOL, social media and spammers... "We can't have nice things"

   comp.programming      Programming issues that transcend langua      57,431 messages   

[   << oldest   |   < older   |   list   |   newer >   |   newest >>   ]

   Message 56,824 of 57,431   
   Ben Bacarisse to Mike Terry   
   Re: Another little puzzle (1/2)   
   24 Dec 22 22:30:55   
   
   From: ben.usenet@bsb.me.uk   
      
   Mike Terry  writes:   
      
   > On 24/12/2022 14:21, Tim Rentsch wrote:   
   >> Ben Bacarisse  writes:   
   >>   
   >>> Tim Rentsch  writes:   
   >>>   
   >>>> ram@zedat.fu-berlin.de (Stefan Ram) writes:   
   >>>>   
   >>>>> ram@zedat.fu-berlin.de (Stefan Ram) writes:   
   >>>>>   
   >>>>>> Given n times of the 24-hour day, print their average.   
   >>>>>> For example, the average of "eight o'clock" and   
   >>>>>> "ten o'clock" (n=2) would be "nine o'clock".   
   >>>>>> (You can choose any representation, for example "HH:MM"   
   >>>>>> or "seconds since midnight".)   
   >>>>>   
   >>>>>    Thanks for all replies!   
   >>>>>   
   >>>>>    I waited a few days before answering to allow   
   >>>>>    sufficient time to think about the problem.   
   >>>>>   
   >>>>>    There were not enough tests written and run.  As a result,   
   >>>>>    the puzzle has not yet been solved (unless I have overlooked   
   >>>>>    a contribution or misworded expectations).   
   >>>>>   
   >>>>>    So, here are two possible test cases.   
   >>>>>   
   >>>>> average( 23.5,  1.5 )==  0.5   
   >>>>> average( 11.5, 13.5 )== 12.5   
   >>>>>   
   >>>>>    (I use hours as units, so "0.5" means, "half past midnight".)   
   >>>>>   
   >>>>>    I hope that these test cases encode sensible expectations   
   >>>>>    for an average of two times on a 24-hour clock in the spirit   
   >>>>>    of the example given in the OP, which was, "the average of   
   >>>>>    eight o'clock and ten o'clock would be nine o'clock", since   
   >>>>>    these test cases just have rotated that example by 3.5 and   
   >>>>>    15.5 hours.   
   >>>>>   
   >>>>>    I believe that I have not seen an algorithm so far in this   
   >>>>>    thread that would pass these tests.   
   >>>>   
   >>>> As before, the problem is underspecified.   
   >>>   
   >>> Some remarks not specifically in reply to you, Tim...   
   >> I didn't really say very much.  Your comments are a lot more   
   >> interesting.   
   >>   
   >>> The input is a collection, t(n), of n > 1 numbers in [0, 24).  The   
   >>> average should be a number, A, in [0, 24) that minimises   
   >>>   
   >>>    Sum_{i=1,n} distance(A, t(i))   
   >>>   
   >>> (or Sum_{i=1,n} difference(A, t(i))^2 if you prefer to think in terms   
   >>> of variance).  So far, this is just what an average is.  The key point   
   >>> is what is the distance (or difference) whose sum (or sum of squares)   
   >>> we want to minimise?  For times, I would say it is the length of the   
   >>> shorter arc round an imaginary 24-hour clock face.   
   >> Minimizing the square of the distance (aka length of the shorter arc)   
   >> is one metric, and I think a reasonable one.  (Minimizing the absolute   
   >> value of the distance gives a result more like the median than the   
   >> mean, IIANM.)   
   >   
   > Yes - perhaps Ben was thinking of aking   
   >   
   >    Sum_{i=1,n} signed_distance(A, t(i))  =  0   
   >   
   > rather than minimising the (absolute) distance.  That would match up   
   > with minimising the sum of squares of the distance (mean).  Or he just   
   > meant the median as you say, which is different from the mean.  Both   
   > have plausible "average" properties,   
      
   Well, I was trying to be deliberately vague (hence my talking about "an   
   average" rather than mean, median or whatever) but I was no vague enough   
   with the formula.   
      
   I wasn't thinking of a zero net distance because some metrics (though   
   probably not applicable in this situation) don't always allow for a zero   
   sum.  I should have referred to minimising some kind distance sum, be it   
   the absolute value of a signed sum or, as in the solution I was   
   proposing, the sum of directional differences (i.e. full vectors).   
      
   > and meet my "minimal" criteria   
   > for what I think anyone would require of an "average" applying to a   
   > modular arithmetic:   
   >   
   > -   if all x_i are within 5 minutes of each other [ALLOWING FOR WRAPPING   
   >     AROUND THE END OF THE MODULO LOOP] then the "average" will be within   
   >     those same 5 minutes.   NOT e.g. 12 hours or 4 hours or whatever away.   
      
   And when the x_i are widely spaced the average might be considered   
   nearly meaningless.  If you have {0, 8, 16} hours then what is the   
   "average"?  Are there cases where every set of circular values has a   
   meaningful "average"?  If not, you really need more than a single number   
   answer.   
      
   (These are not questions to you, specifically, I'm interested to know of   
   interpretations that go beyond the way I see things.)   
      
   > -   if all x_i are translated by a fixed offset like 1 hour, the average   
   >     will translate by the same offset, i.e. it maintains the same "modulo"   
   >     relation with the translated x_i.   
   >   
   > hmm, thinking now I'll add the similar:   
   >   
   > -   if all x_i are "reflected" about a chosen time, then the new average   
   >     will be the reflected time of the original average.   
   >   
   > (those aren't intended to be a definition of a "modular average", just   
   > to exclude suggestions that blatently lack the right symmetry for such   
   > an average.)   
   >   
   >>> The problem has a natural interpretation in terms of angles.  Whatever   
   >>> the circular quantity is, convert the values to unit vectors round a   
   >>> circle.  For times of day, just scale [0, 24) to [0, 2*pi).  The   
   >>> average is then just the direction of the average vector, converted   
   >>> back to a time of day.   
   >> Averaging the "time unit vectors" is another plausible metric.   
   >   
   > Yes I think this one would work best in most situations, given that   
   > the OP didn't have a precise mathematical requirement for his solution   
   > - probably he wants something that just works reasonably as expected.   
   > (It's my favourite answer.)   
   >   
   > Another way of thinking of this approach is "balancing" the 24-hour   
   > modular circle, where we've put unit weights at each of the times x_i.   
   > E.g. we look for a balancing line through the centre of the circle.   
      
   Is that the same?  For {0, 8, 16} any of the three lines through 0, 8 or   
   16 balances, but the vector sum interpretation is undefined.  And for   
   {0, 6, 12, 18} there are two balance lines, neither of which seems to be   
   a natural answer for the average.   
      
   (Also, every balance line gives two answers, so how do you pick?)   
      
   > [Note times on the circle go from 1-24, not 1-12 like a normal clock.]   
   > Also, thinking like this, it's equivalent to (conventional) averaging   
   > of the sin of the angular differences from the average (balancing)   
   > point, since the sin will give the perpendicular distance of the   
   > weight from the balancing line.  So it still fits with the general   
   > scheme of minimising distance(x_i, v)^2, but with a slightly different   
   > metric (distance function) in use.   
   >   
   > As sin(x) is close to x when x is small, the vector average solution   
   > would match closely with the mean (by arc length) solution when the   
   > times are close together, but diverge more noticeably when the times   
   > are spread out...   
      
   --   
   Ben.   
      
   --- SoupGate-Win32 v1.05   
    * Origin: you cannot sedate... all the things you hate (1:229/2)   

[   << oldest   |   < older   |   list   |   newer >   |   newest >>   ]


(c) 1994,  bbs@darkrealms.ca