

   comp.editors      What? Edlin ain't good enough for you?      123,932 messages   


   Message 123,312 of 123,932   
   Janis Papanagnou to Eli the Bearded   
   Re: [vim] Jumping from current Unicode s   
   28 Dec 23 04:40:44   
   
   From: janis_papanagnou+ng@hotmail.com   
      
   On 28.12.2023 03:36, Eli the Bearded wrote:   
   > In comp.editors, Janis Papanagnou wrote:
   >> In Vim I frequently jump from string to the next equal string using the   
   >> commands '*' (forward search'n'jump) and '#' (backward search'n'jump).   
   >>   
   >> With Unicode characters that doesn't seem to always work (at least not   
   >> per default).   
   >>   
   >> In the following (UTF-8 encoded) test sample there is one subset of   
   >> Omega words where * and # work correctly and one where they don't
   >> (starting with the cursor on the first letter of any word)   
   >>   
   >>     Ωmega Ωmega Ωmega Ωmega Ωmega Ωmega Ωmega Ωmega   
   >   
   > This is like complaining that a search for "MISS" does not also match   
   > "МІЅЅ". They are completely different strings that just happen to look   
   > alike with certain font choices.   
      
   No, unfortunately you seem to have MISSed the point. It's not about
   strings that look the same but differ. It's about the same Vim
   operations (* and #) behaving differently on _two types_ of words.
      
   Try copying and pasting the line into a Vim session, move the cursor
   onto the first character of the first word, and type * repeatedly.
   Then do the same starting from the first character of the third word,
   and observe the difference! Tell me what you think about that.
      
   (You can adjust the test-case to use these two letters in different   
   contexts, or work on single characters.)   
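
   To check which of the two letters a given test word actually starts
   with, a small Python sketch along these lines may help. It is not part
   of the original test; the helper name describe() and the two sample
   words are assumptions made for illustration, built from the code
   points named in the quoted text (ohm sign, Greek capital letter omega).

```python
import unicodedata

def describe(word):
    """Return (code point, Unicode name) pairs for each character in word."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in word]

# Hypothetical sample words: same glyphs in many fonts, different first code point.
ohm_word = "\u2126mega"    # starts with OHM SIGN
greek_word = "\u03A9mega"  # starts with GREEK CAPITAL LETTER OMEGA

for w in (ohm_word, greek_word):
    print(describe(w))

# Compared code point by code point, the two words are different strings.
print(ohm_word == greek_word)

# After NFC normalization the ohm sign is replaced by the Greek letter,
# so the normalized words compare equal.
print(unicodedata.normalize("NFC", ohm_word) == greek_word)
```

   This only identifies the characters; it says nothing about how Vim
   classifies them when * and # build their search pattern.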
      
   Janis   
      
   > Some of those are "ohm sign", "Latin   
   > small letter m", "Latin small letter e", "Latin small letter g", "Latin   
   > small letter a" and the others are "Greek capital letter omega",   
   > "Latin small letter m", "Latin small letter e", "Latin small letter g",   
   > "Latin small letter a".   
   >   
   > Your "difference is only the encoding" fails to grasp that Unicode is   
   > semiotics aware, even if users might not be.   
   >   
   > Elijah   
   > ------   
   > https://www.unicode.org/reports/tr36/#visual_spoofing   
   >   
      



(c) 1994,  bbs@darkrealms.ca