|    comp.databases.oracle    |    Overblown overpriced overengineered SHIT    |    2,288 messages    |
|    Message 2,102 of 2,288    |
|    Frank van Bortel to Server Applications    |
|    Re: Oracle Text: Indexing UTF8 or UTF16    |
|    19 May 05 10:00:46    |
From: frank.van.bortel@gmail.com

Server Applications wrote:
> Hello
>
> I am trying to build a system where I can full-text index documents with
> UTF8 or UTF16 data using Oracle Text. I am doing the filtering in a
> third-party component outside the database, so I don't need filtering in
> Oracle, only indexing.
> If I put file references to the filtered files in the database and index
> these (using FILE_DATASTORE), everything works fine. But I would rather put
> the filtered data in the database and index it from there (using the
> PROCEDURE_FILTER). This gives me some problems when the data is actually
> Unicode data.
> The interface for the procedure in the PROCEDURE_FILTER does not allow the
> data to be output as NCLOB or NVARCHAR, only as CLOB or VARCHAR. Indexing
> the data directly in the table (using e.g. a NULL_FILTER or CHARSET_FILTER)
> has the same effect. If I try to index a column of the type NCLOB or
> NVARCHAR, the index creation gives me an error telling me that it is an
> invalid column type.
>
> I have tried to create a database with the UTF8 character set, expecting
> that the CLOB column type could then contain the UTF8 data, and that the
> indexing would then recognize the Unicode characters in the data. This does
> not give any errors, but none of the Unicode strings in the data are
> contained in the index; only the strings in English (or ASCII, strings with
> characters all within 1 byte) are contained in the index afterwards.
>
> Is it not possible to index data directly in a column (using either
> CHARSET_FILTER, NULL_FILTER or PROCEDURE_FILTER) that is in UTF8 or UTF16
> format?
>
>
> Thanks in advance for any comments.
>
> /David
>
>

This ng is dead - repost in cdo.server

--
Regards,
Frank van Bortel

--- SoupGate-Win32 v1.05
 * Origin: you cannot sedate... all the things you hate (1:229/2)
(c) 1994, bbs@darkrealms.ca