home bbs files messages ]

Forums before death by AOL, social media and spammers... "We can't have nice things"

   comp.ai      Awaiting the gospel from Sarah Connor      1,954 messages   

[   << oldest   |   < older   |   list   |   newer >   |   newest >>   ]

   Message 907 of 1,954   
   Marina Sapir to jbromer@isp.com   
   Re: Algorithm/Theory help: Patterns, com   
   06 Feb 06 00:39:27   
   
   From: marina@sapir.us   
      
   jbromer@isp.com wrote:   
   > Would you mind providing the particulars of the example you mentioned   
   > in your message.  I am having trouble understanding your paper, and I   
   > am not able to understand your example either.   
   >   
   > For example, you might list the three patterns you mentioned and then   
   > show us how you would, "code each string by a new string of length n,   
   > where  on i-th position you put the number of i-th patterns of change   
   > is in this string."   
   >   
   > Finally could you give us an example of what Wittkowski calls mU-stat.   
   >   
   > Jim Bromer   
   >   
      
   Yes.   
      
   Suppose, you have   
   - an original important string A and   
   - the set X of many other strings, which you want to compare whith the   
   string A and order by their distance from A.   
      
   Next, suppose, you have three patterns, which characterize distinctions   
   between strings:   
   - insertion,   
   - deletion,   
   - transposition.   
      
   Further assume you have an algorithm, which takes two strings and tells   
   you, how many insertions, deletions, transpositions one needs to do, to   
   transform the second string into the first one.   
      
   Let b, c are two of the strings from the set X. The string  b needs 3   
   insertions, 2 deletions and 0 transpositions to be transformed into   
   string A. The string c need 0 insertions, 0 deletions and 2   
   transpositions. You can not tell, which of these strings is further   
   from A, because you do not know what is more important: transpositions   
   or deletions or insertions, and you do not want to assign arbitrary   
   weights to these patterns.   
      
   So, what you can do, you code each string in X by the numbers you get   
   from your algorithm for this string. For example, the string b will be   
   coded by the code-string   
   <3, 2, 0>,   
   (3 insertions, 2 deletions and 0 transpositions) the string c will be   
   coded by the string <0, 0, 2>.   
      
   We say the string x is "further" than the string y, if the string x   
   requires at least as many of each transformations as the string y, and   
   the strings are not equal.   
   For example, we agree that if one string requires 2 insertions and 2   
   deletions, and another string requires 2 insertions and 3 deletions,   
   the second string is further from A.   
      
   For the code-strings it means that the string x is "larger" than the   
   code-string y, if, in each position, the number in the code-string x is   
   not smaller than the number in the code-string y; and the code-strings   
   are not equal.   
      
   The strinngs b, c in our example above are not comparable: one requires   
   more transpositions, another requires more insertions. But if you have   
   large and diverse set of strings X, there will be comparable strings.   
   Now, for each code-string s from X, calculate, how many code-strings   
   are "larger" and how many code-strings are "smaller" in X. The   
   difference between these two numbers is what is called mU-stat for the   
   string s.   
      
   Now, each string is characterized by its own number, mUstat fro this   
   string. The larger is the number, the closer is the string to A,   
   comparing with other strings. At this point, every two strings in X   
   became comparable by their statistics (mU-stat). To do this comparison,   
   you did not have to make any arbitrary assumption about importance of   
   each pattern.   
      
   In my paper, I was talking about binary patterns of two classes. I used   
   this mU-stat for classification.   
      
   It works!   
      
      
   [ comp.ai is moderated.  To submit, just post and be patient, or if ]   
   [ that fails mail your article to , and ]   
   [ ask your news administrator to fix the problems with your system. ]   
      
   --- SoupGate-Win32 v1.05   
    * Origin: you cannot sedate... all the things you hate (1:229/2)   

[   << oldest   |   < older   |   list   |   newer >   |   newest >>   ]


(c) 1994,  bbs@darkrealms.ca