|    alt.comp.os.windows-10    |    Steaming pile of horseshit Windows 10    |    197,590 messages    |
|    Message 197,374 of 197,590    |
|    Maria Sophia to Maria Sophia    |
|    Re: PSA: HTML fragment mode interaction     |
|    15 Feb 26 21:57:21    |
XPost: alt.comp.os.windows-11, alt.comp.microsoft.windows
From: mariasophia@comprehension.com

Maria Sophia wrote:
> Since I covered the topic at the level of a published paper, there's really
> not much more left to discuss, as far as I can tell. It's all fixed now.

All that was left after fixing the issue was understanding what actually
went wrong in the first place (which killed Ctrl+A in Notepad++).

It appears that Windows does not convert the HTML Format entry into text
until an application explicitly asks for a text format.

So the corruption would happen at the moment Notepad++ requests
CF_UNICODETEXT.

The sequence (as far as I can reconstruct it) is...

1. With Ctrl+C, Chromium places several formats on the clipboard,
   including HTML Format, CF_UNICODETEXT, and its internal metadata.

2. With Ctrl+V, Notepad++ asks Windows:
   "Give me CF_UNICODETEXT."

3. Windows sees that HTML Format is available and may choose to generate
   the CF_UNICODETEXT stream by converting the HTML fragment.

   Kaboom!

4. That conversion step can produce a corrupted CF_UNICODETEXT stream.
   The corruption is not visible text, which is why I couldn't "see" it.
   It is a bad length field or a hidden control character (apparently).

   Is that a bug?
   I don't know.

5. Scintilla loads that corrupted stream into its internal buffer.
   But the buffer boundaries are now wrong, so Ctrl+A fails because
   Scintilla thinks the document ends earlier than it actually does.

So why didn't I see it in the Notepad++ hex editor?

The HTML is never pasted into the file, so it can't be seen.
But it affects the text Windows hands to Notepad++ at paste time.

Well then, why does adding and deleting a character fix it?

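(An aside on step 4 first. One way to test the "hidden control character" guess is to save the pasted text and scan it for invisible characters. What follows is only a minimal Python sketch of that idea. The function name find_hidden_chars and the sample string are made up for illustration, and it can only inspect text that actually reached a file or a string, so it says nothing about Scintilla's in-memory structures.)

```python
import unicodedata

# Whitespace controls that are normal in plain text and should not be flagged.
ALLOWED_CONTROLS = {"\t", "\n", "\r"}


def find_hidden_chars(text):
    """Return (index, 'U+XXXX', name) for each invisible control/format character."""
    hits = []
    for index, ch in enumerate(text):
        if ch in ALLOWED_CONTROLS:
            continue
        # Cc = control characters (e.g. NUL), Cf = format characters
        # (e.g. zero-width space, byte order mark).
        if unicodedata.category(ch) in ("Cc", "Cf"):
            # Control characters have no Unicode name, hence the fallback.
            name = unicodedata.name(ch, "<unnamed control>")
            hits.append((index, f"U+{ord(ch):04X}", name))
    return hits


if __name__ == "__main__":
    # Hypothetical sample standing in for text saved after a paste.
    sample = "first line\r\nsecond\u200bline\x00tail"
    for hit in find_hidden_chars(sample):
        print(hit)
```

An empty result would match what the hex editor showed: nothing wrong in the text itself. Which brings back the question of why adding and deleting a character fixes it.
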
Because the corruption lives only in Scintilla's internal buffer
structures, not in the visible text. When the macro inserts a space,
Scintilla is forced to rebuild its entire buffer. That rebuild wipes out
the corrupted boundary. Removing the space forces a second rebuild,
which simply restores the original content. The second rebuild is not
needed for the fix; it is only needed to undo the temporary change.

After that, the macro selects all and cuts the text. Cutting forces
Windows to create a brand new clipboard entry. This new clipboard entry
contains only plain text formats, because Scintilla does not generate
HTML Format or any Chromium internal formats.

I don't know if this is a bug or not, as all I know, in the end, is...
1. The corrupted CF_UNICODETEXT stream from the original paste is gone.
2. The clipboard now contains only clean plain text.
3. Scintilla now has a clean buffer with correct boundaries.
4. Ctrl+A works again.

Woo hoo!

So the fix works because Windows created the problem when converting the
HTML fragment into text, which corrupted Scintilla's internal buffer.
Adding and removing a character forces Scintilla to rebuild its buffer,
and cutting the text forces Windows to rebuild the clipboard without
HTML Format. The corruption cannot survive those two rebuilds.

I think we explained it as simply as we could, but not simpler.
-- 
How wonderful that we have met with a paradox.
Now we have some hope of making progress.

--- SoupGate-Win32 v1.05
 * Origin: you cannot sedate... all the things you hate (1:229/2)
(c) 1994, bbs@darkrealms.ca