
   comp.ai      Awaiting the gospel from Sarah Connor      1,954 messages   


   Message 1,011 of 1,954   
   Ted Dunning to All   
   Re: A Newbie Question - Matching Two Tex   
   20 Apr 06 02:03:18   
   
   From: ted.dunning@gmail.com   
      
   There are a number of toolkits available from research groups.  Try
   searching Google for "named entity recognition".
      
   The methods used are varied.  Having a large dictionary of alternative   
   names is a good start.  Defining patterns of usage that surround the   
   kinds of entities that you are looking for will help find entities not   
   in the dictionary (both for extending the dictionary after human review   
   and for actually finding entities).  For example, in your field, you
   might have lots of statements of the sort "Excavations were conducted   
   at ...".  The text after this will tend to be the name of an excavation   
   site or at least a reference to a site that was mentioned shortly   
   before.   
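
   As a rough illustration of combining the two ideas (a dictionary of
   alternative names plus a surrounding usage pattern), here is a minimal
   Python sketch.  The site names, the single trigger phrase and the
   function name find_entities are all made up for the example; a real
   toolkit would use far larger dictionaries and many more patterns.

```python
import re

# Hypothetical dictionary of alternative names -> canonical name.
GAZETTEER = {
    "jericho": "Jericho",
    "tell es-sultan": "Jericho",
    "knossos": "Knossos",
}

# Hypothetical usage pattern: a run of capitalized words that follows
# the trigger phrase is treated as a possible site name.
TRIGGER = re.compile(
    r"Excavations were conducted at ((?:[A-Z][\w'-]*)(?: [A-Z][\w'-]*)*)"
)


def find_entities(text):
    """Return (known, candidates).

    known      -- canonical names whose aliases appear in the text
    candidates -- names found only via the pattern, for human review
    """
    known, candidates = [], []
    lowered = text.lower()

    # Dictionary lookup: match each alias as a whole word or phrase.
    for alias, canonical in GAZETTEER.items():
        if re.search(r"\b" + re.escape(alias) + r"\b", lowered):
            if canonical not in known:
                known.append(canonical)

    # Pattern lookup: keep only names the dictionary does not cover.
    for match in TRIGGER.finditer(text):
        name = match.group(1)
        if name.lower() not in GAZETTEER and name not in candidates:
            candidates.append(name)

    return known, candidates


if __name__ == "__main__":
    sample = ("Excavations were conducted at Tell Brak in 1937. "
              "Earlier work at Jericho is well known.")
    print(find_entities(sample))
```

   The candidates list is what you would hand to a human reviewer before
   adding anything to the dictionary.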
      
   I am not an expert on these systems, however, so you probably should be   
   looking at the literature.  The research group at Sheffield (where I   
   got my degree, btw) has done some excellent work on building NLE   
   recognizers that do not use large entity dictionaries, but the best   
   systems use patterns plus a large dictionary.
      
   [ comp.ai is moderated.  To submit, just post and be patient, or if ]   
   [ that fails mail your article to , and ]   
   [ ask your news administrator to fix the problems with your system. ]   
      


