

   alt.comp.os.windows-11      Steaming pile of horseshit Windows 11      4,852 messages   


   Message 4,706 of 4,852   
   Maria Sophia to Maria Sophia   
   Re: PSA: HTML fragment mode interaction    
   15 Feb 26 21:27:35   
   
   XPost: alt.comp.os.windows-10, alt.comp.microsoft.windows   
   From: mariasophia@comprehension.com   
      
   Maria Sophia wrote:   
   > Without everyone's help, particularly that of Lawrence, Paul & Carlos, I   
   > never would have gotten this far in testing and explaining how it works.   
      
   This reminds me, in a small way, of how I felt when reading Einstein's 1916   
   book (later revised in the early 1920s, which lost copyright 100 years   
   later) in that every revelation reveals a new mystery to resolve next.   
      
   Keeping in mind that the whole thing started when I pasted Chromium text
   into Notepad++, which caused Ctrl+A to stop working, here is a short
   explanation.
      
   Windows assigns every standard clipboard format a numeric ID:
    CF_TEXT (1) is the old ANSI text format.
    CF_OEMTEXT (7) is the old OEM codepage text format.
    CF_UNICODETEXT (13) is the modern Unicode text format.
    CF_LOCALE (16) tells Windows what language or locale the text came from.
    etc.
      
   While NirSoft InsideClipboard shows those IDs, it turns out that Chromium
   registers its own formats by name, so Windows assigns each name an ID from
   the registered-format range (0xC000 to 0xFFFF), such as 49426 or 49683.
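   
   As a rough sketch of the ID scheme described above, the following Python
   classifies a format ID as one of the standard formats or as a
   registered-by-name format. The helper name describe_format is made up for
   illustration and is not a Windows API; the four standard IDs and the
   0xC000 to 0xFFFF registered range are the documented Windows values.

```python
# Standard clipboard format IDs as defined in the Windows headers.
STANDARD_FORMATS = {
    1: "CF_TEXT",
    7: "CF_OEMTEXT",
    13: "CF_UNICODETEXT",
    16: "CF_LOCALE",
}

# Formats registered by name receive IDs in this range.
REGISTERED_FIRST = 0xC000  # 49152
REGISTERED_LAST = 0xFFFF   # 65535


def describe_format(format_id: int) -> str:
    """Classify a clipboard format ID (hypothetical helper, not a Windows API)."""
    if format_id in STANDARD_FORMATS:
        return STANDARD_FORMATS[format_id]
    if REGISTERED_FIRST <= format_id <= REGISTERED_LAST:
        return "registered by name"
    return "other or unknown"


if __name__ == "__main__":
    for fid in (1, 13, 49426, 49683):
        print(fid, describe_format(fid))
```

   The two large IDs from InsideClipboard fall in the registered range, which
   is why they can differ from one session or machine to another.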
      
   Notepad++ does not use most of those CF clipboard formats directly.   
   Notepad++ almost always asks Windows only for CF_UNICODETEXT.   
      
   The important detail is that Windows uses the HTML Format entry to generate   
   the plain text that Notepad++ receives. That conversion step is where the   
   invisible Ctrl+A land mine comes from.   
      
   That means the presence of HTML Format changes how the plain   
   text is produced, even though Notepad++ never reads the HTML itself.   
      
   When the Ctrl+B macro in shortcuts.xml rewrites the clipboard, it removes   
   HTML Format and all Chromium internal formats.   
      
   With only plain text formats left, Windows no longer has to convert from   
   HTML, so the plain text becomes clean and Ctrl+A works again.   
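   
   The macro itself is not shown in this post, so the following is only a
   model of the idea, assuming the clipboard can be pictured as a dictionary
   of format name to data. The function name keep_plain_text_only and the
   Chromium placeholder entry are invented for illustration; a real
   implementation would go through the Windows clipboard API instead.

```python
# Formats treated as "plain text" in this model.
PLAIN_TEXT_FORMATS = {"CF_TEXT", "CF_OEMTEXT", "CF_UNICODETEXT", "CF_LOCALE"}


def keep_plain_text_only(clipboard: dict) -> dict:
    """Return a copy of the modelled clipboard with only plain text formats.

    Hypothetical helper: it drops "HTML Format" and any other entry that
    is not one of the plain text formats listed above.
    """
    return {
        name: data
        for name, data in clipboard.items()
        if name in PLAIN_TEXT_FORMATS
    }


if __name__ == "__main__":
    before = {
        "CF_UNICODETEXT": "hello world",
        "HTML Format": b"Version:0.9 ...",
        "Chromium internal format (placeholder name)": b"\x00\x01",
    }
    after = keep_plain_text_only(before)
    print(sorted(after))
```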
      
   But what exactly is causing Ctrl+A to stop working in Notepad++?   
   The reason Ctrl+A dies is not that HTML is pasted into the file.   
      
   The problem actually happens earlier, inside Windows, when Windows converts   
   the HTML Format entry into CF_UNICODETEXT for Notepad++.   
      
   When Chromium puts HTML Format on the clipboard, Windows must run its   
   HTML-to-text converter. That converter uses the StartHTML, EndHTML,   
   StartFragment, and EndFragment offsets inside the HTML Fragment block.   
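   
   To make those four offsets concrete, here is a minimal sketch that builds
   an HTML Format payload and reads the offsets back. It assumes the
   documented layout of the format: a text header followed by UTF-8 HTML,
   with every offset counted in bytes from the start of the payload. The
   function names are made up for illustration, and this is not how Chromium
   or Windows implement it.

```python
# Header template: each offset is a zero-padded, fixed-width decimal, so
# the header length does not change once the real values are filled in.
HEADER = (
    "Version:0.9\r\n"
    "StartHTML:{:010d}\r\n"
    "EndHTML:{:010d}\r\n"
    "StartFragment:{:010d}\r\n"
    "EndFragment:{:010d}\r\n"
)
PREFIX = "<html><body><!--StartFragment-->"
SUFFIX = "<!--EndFragment--></body></html>"
KEYS = (b"StartHTML", b"EndHTML", b"StartFragment", b"EndFragment")


def build_html_format(fragment: str) -> bytes:
    """Build an HTML Format payload with byte offsets that match the data."""
    header_len = len(HEADER.format(0, 0, 0, 0).encode("utf-8"))
    prefix = PREFIX.encode("utf-8")
    body = fragment.encode("utf-8")
    suffix = SUFFIX.encode("utf-8")
    start_html = header_len
    start_fragment = start_html + len(prefix)
    end_fragment = start_fragment + len(body)
    end_html = end_fragment + len(suffix)
    header = HEADER.format(start_html, end_html, start_fragment, end_fragment)
    return header.encode("utf-8") + prefix + body + suffix


def read_offsets(payload: bytes) -> dict:
    """Parse the four offsets out of the header lines."""
    offsets = {}
    for line in payload.split(b"\r\n"):
        key, sep, value = line.partition(b":")
        if sep and key in KEYS:
            offsets[key.decode("ascii")] = int(value)
        if len(offsets) == len(KEYS):
            break
    return offsets


def extract_fragment(payload: bytes) -> str:
    """Slice the fragment out of the payload using the byte offsets."""
    offsets = read_offsets(payload)
    return payload[offsets["StartFragment"]:offsets["EndFragment"]].decode("utf-8")


if __name__ == "__main__":
    data = build_html_format("h\u00e9llo <b>world</b>")
    print(read_offsets(data))
    print(extract_fragment(data))
```

   Because the offsets count bytes and not characters, any non-ASCII text in
   the fragment shifts them, which is one easy way for a producer to get
   them wrong.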
      
   If those offsets are wrong, or if the HTML fragment is malformed, the   
   converter can produce a CF_UNICODETEXT stream with hidden control   
   characters, mismatched boundaries, or an unexpected buffer length.   
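   
   One way to check pasted text for hidden characters like these is to scan
   it for Unicode control (Cc) and format (Cf) code points. This is only a
   sketch: the function name is made up, and treating tab, CR and LF as
   harmless is an assumption, not something established in the testing
   described here.

```python
import unicodedata

# Whitespace controls that are normal in text files (assumed harmless here).
ALLOWED = {"\t", "\n", "\r"}


def find_hidden_characters(text: str) -> list:
    """Return (index, code point, name) for each control or format character."""
    hits = []
    for index, ch in enumerate(text):
        if ch in ALLOWED:
            continue
        # Cc = control characters such as NUL, Cf = invisible format
        # characters such as ZERO WIDTH SPACE.
        if unicodedata.category(ch) in ("Cc", "Cf"):
            name = unicodedata.name(ch, "<unnamed>")
            hits.append((index, f"U+{ord(ch):04X}", name))
    return hits


if __name__ == "__main__":
    sample = "ab\x00c\u200bd"
    for hit in find_hidden_characters(sample):
        print(hit)
```

   Running it on text saved from the paste should show whether anything
   invisible actually came along with it.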
      
   Notepad++ receives that CF_UNICODETEXT stream and loads it into its   
   internal Scintilla buffer. If the buffer contains an unexpected control   
   sequence or a broken length field, Scintilla can fail to compute the   
   full document range.   
      
   Bingo!   
      
   When that happens, Ctrl+A does not select the whole buffer because   
   Scintilla thinks the document ends earlier than it actually does.   
      
   The Ctrl+B macro fixes the issue because it wipes the clipboard and   
   replaces it with plain text only (among other things that it does).   
      
   With no HTML Format present, Windows does not run the HTML-to-text   
   converter again, so the CF_UNICODETEXT stream is finally clean   
   and Scintilla can compute the correct document length.   
      
   Once the buffer is clean, Ctrl+A works again.   
   Whew!   
      
   Given this took me hours to debug & resolve, the whole point of this PSA is   
   to help the next person not have to do all the work that I just had to do!   
   --   
   "Everything should be made as simple as possible, but not simpler."   
      
   --- SoupGate-Win32 v1.05   
    * Origin: you cannot sedate... all the things you hate (1:229/2)   



(c) 1994,  bbs@darkrealms.ca