From: news.dead.person.stones@darjeeling.plus.com   
      
   On 24/12/2022 22:30, Ben Bacarisse wrote:   
   > Mike Terry writes:   
   >   
   >> On 24/12/2022 14:21, Tim Rentsch wrote:   
   >>> Ben Bacarisse writes:   
   >>>   
   >>>> Tim Rentsch writes:   
   >>>>   
   >>>>> ram@zedat.fu-berlin.de (Stefan Ram) writes:   
   >>>>>   
   >>>>>> ram@zedat.fu-berlin.de (Stefan Ram) writes:   
   >>>>>>   
   >>>>>>> Given n times of the 24-hour day, print their average.   
   >>>>>>> For example, the average of "eight o'clock" and   
   >>>>>>> "ten o'clock" (n=2) would be "nine o'clock".   
   >>>>>>> (You can choose any representation, for example "HH:MM"   
   >>>>>>> or "seconds since midnight".)   
   >>>>>>   
   >>>>>> Thanks for all replies!   
   >>>>>>   
   >>>>>> I waited a few days before answering to allow   
   >>>>>> sufficient time to think about the problem.   
   >>>>>>   
   >>>>>> There were not enough tests written and run. As a result,   
   >>>>>> the puzzle has not yet been solved (unless I have overlooked   
   >>>>>> a contribution or misworded expectations).   
   >>>>>>   
   >>>>>> So, here are two possible test cases.   
   >>>>>>   
   >>>>>> average( 23.5, 1.5 )== 0.5   
   >>>>>> average( 11.5, 13.5 )== 12.5   
   >>>>>>   
   >>>>>> (I use hours as units, so "0.5" means, "half past midnight".)   
   >>>>>>   
   >>>>>> I hope that these test cases encode sensible expectations   
   >>>>>> for an average of two times on a 24-hour clock in the spirit   
   >>>>>> of the example given in the OP, which was, "the average of   
   >>>>>> eight o'clock and ten o'clock would be nine o'clock", since   
   >>>>>> these test cases just have rotated that example by 3.5 and   
   >>>>>> 15.5 hours.   
   >>>>>>   
   >>>>>> I believe that I have not seen an algorithm so far in this   
   >>>>>> thread that would pass these tests.   
   >>>>>   
   >>>>> As before, the problem is underspecified.   
   >>>>   
   >>>> Some remarks not specifically in reply to you, Tim...   
   >>> I didn't really say very much. Your comments are a lot more   
   >>> interesting.   
   >>>   
   >>>> The input is a collection, t(n), of n > 1 numbers in [0, 24). The   
   >>>> average should be a number, A, in [0, 24) that minimises   
   >>>>   
   >>>> Sum_{i=1,n} distance(A, t(i))   
   >>>>   
   >>>> (or Sum_{i=1,n} difference(A, t(i))^2 if you prefer to think in terms   
   >>>> of variance). So far, this is just what an average is. The key point   
   >>>> is what is the distance (or difference) whose sum (or sum of squares)   
   >>>> we want to minimise? For times, I would say it is the length of the   
   >>>> shorter arc round an imaginary 24-hour clock face.   
   >>> Minimizing the square of the distance (aka length of the shorter arc)   
   >>> is one metric, and I think a reasonable one. (Minimizing the absolute   
   >>> value of the distance gives a result more like the median than the   
   >>> mean, IIANM.)   
   >>   
   >> Yes - perhaps Ben was thinking of aking   
   >>   
   >> Sum_{i=1,n} signed_distance(A, t(i)) = 0   
   >>   
   >> rather than minimising the (absolute) distance. That would match up   
   >> with minimising the sum of squares of the distance (mean). Or he just   
   >> meant the median as you say, which is different from the mean. Both   
   >> have plausible "average" properties,   
   >   
   > Well, I was trying to be deliberately vague (hence my talking about "an   
   > average" rather than mean, median or whatever) but I was no vague enough   
   > with the formula.   
   >   
   > I wasn't thinking of a zero net distance because some metrics (though   
   > probably not applicable in this situation) don't always allow for a zero   
   > sum. I should have referred to minimising some kind distance sum, be it   
   > the absolute value of a signed sum or, as in the solution I was   
   > proposing, the sum of directional differences (i.e. full vectors).   
   >   
   >> and meet my "minimal" criteria   
   >> for what I think anyone would require of an "average" applying to a   
   >> modular arithmetic:   
   >>   
   >> - if all x_i are within 5 minutes of each other [ALLOWING FOR WRAPPING   
   >> AROUND THE END OF THE MODULO LOOP] then the "average" will be within   
   >> those same 5 minutes. NOT e.g. 12 hours or 4 hours or whatever away.   
   >   
   > And when the x_i are widely spaced the average might be considered   
   > nearly meaningless. If you have {0, 8, 16} hours then what is the   
   > "average"? Are there cases where every set of circular values has a   
   > meaningful "average"? If not, you really need more than a single number   
   > answer.   
   >   
   > (These are not questions to you, specifically, I'm interested to know of   
   > interpretations that go beyond the way I see things.)   
   >   
   >> - if all x_i are translated by a fixed offset like 1 hour, the average   
   >> will translate by the same offset, i.e. it maintains the same "modulo"   
   >> relation with the translated x_i.   
   >>   
   >> hmm, thinking now I'll add the similar:   
   >>   
   >> - if all x_i are "reflected" about a chosen time, then the new average   
   >> will be the reflected time of the original average.   
   >>   
   >> (those aren't intended to be a definition of a "modular average", just   
   >> to exclude suggestions that blatently lack the right symmetry for such   
   >> an average.)   
   >>   
   >>>> The problem has a natural interpretation in terms of angles. Whatever   
   >>>> the circular quantity is, convert the values to unit vectors round a   
   >>>> circle. For times of day, just scale [0, 24) to [0, 2*pi). The   
   >>>> average is then just the direction of the average vector, converted   
   >>>> back to a time of day.   
   >>> Averaging the "time unit vectors" is another plausible metric.   
   >>   
   >> Yes I think this one would work best in most situations, given that   
   >> the OP didn't have a precise mathematical requirement for his solution   
   >> - probably he wants something that just works reasonably as expected.   
   >> (It's my favourite answer.)   
   >>   
   >> Another way of thinking of this approach is "balancing" the 24-hour   
   >> modular circle, where we've put unit weights at each of the times x_i.   
   >> E.g. we look for a balancing line through the centre of the circle.   
   >   
   > Is that the same? For {0, 8, 16} any of the three lines through 0, 8 or   
   > 16 balances, but the vector sum interpretation is undefined. And for   
   > {0, 6, 12, 18} there are two balance lines, neither of which seems to be   
   > a natural answer for the average.   
      
   Yes, it's logically the same, in so far as WHEN the x_i aren't already   
   balanced [aka they /have/ a   
   meaningful average] we get the same average both ways. (Adding vectors is   
   effectively a way of   
   calculating moments of the weights in the balancing view of things.)   
      
   With modular arithmetic there is a scenario where everything is already   
   balanced, and so it isn't   
   clear what an average would represent, if anything. Both {0, 8, 16} and {0,   
   6, 12, 18} are already   
   perfectly balanced - ANY line through the origin would balance for either   
   example. That is just a   
      
   [continued in next message]   
      
   --- SoupGate-Win32 v1.05   
    * Origin: you cannot sedate... all the things you hate (1:229/2)   
|