From: pornchannel@yahoo.com   
      
   "Gilad Novik" wrote in message news:...   
   > Hi,   
   >   
   > I want to build a tool for mining data from an HTML page. I want the user to
   > select an element on a web page and train my application to recognize it
   > in later versions of that page. For example, suppose the user wants to
   > extract some data from a financial page: his total balance, plus the table
   > of the latest transactions. He would highlight those elements
   > inside the HTML page. The application should then analyze the
   > HTML element structure and learn how to find it in similar pages (even
   > when they are not identical). What I really need is an algorithm to
   > "understand" a single element (by its structure, position in the page, or any
   > other method), and then look at a new page and choose the most
   > similar element (which should probably be the right one).
   >   
   > Does anyone have an idea for this?
   >   
   > Regards,   
   > Gilad Novik   
   >   
      
   try:   
      
   Wrapper Induction for Information Extraction   
   Nicholas Kushmerick   
      
   http://citeseer.nj.nec.com/kushmerick97wrapper.html   
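
For the "choose the most similar element" step described in the question, a rough sketch might look like the following. This is not Kushmerick's wrapper induction algorithm; it is only a simple similarity heuristic using the Python standard library. The names ElementCollector, similarity, and find_most_similar, the dictionary layout, and the 0.6/0.4 weights are all assumptions made up for illustration.

```python
from difflib import SequenceMatcher
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not stay on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ElementCollector(HTMLParser):
    """Record each element's tag path from the root, its attributes, and its text."""

    def __init__(self):
        super().__init__()
        self.stack = []     # elements that are currently open
        self.elements = []  # every element seen, in document order

    def handle_starttag(self, tag, attrs):
        path = [e["tag"] for e in self.stack] + [tag]
        elem = {"tag": tag, "path": path, "attrs": dict(attrs), "text": ""}
        self.elements.append(elem)
        if tag not in VOID_TAGS:
            self.stack.append(elem)

    def handle_endtag(self, tag):
        # Close the nearest matching open element; ignore stray end tags.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i]["tag"] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        if self.stack:
            self.stack[-1]["text"] += data.strip()


def parse(html):
    """Return the list of element records for an HTML string."""
    collector = ElementCollector()
    collector.feed(html)
    collector.close()
    return collector.elements


def similarity(a, b):
    """Score two element records in [0, 1] by tag path and attribute overlap."""
    path_score = SequenceMatcher(None, a["path"], b["path"]).ratio()
    keys = set(a["attrs"]) | set(b["attrs"])
    if keys:
        same = sum(1 for k in keys
                   if k in a["attrs"] and k in b["attrs"]
                   and a["attrs"][k] == b["attrs"][k])
        attr_score = same / len(keys)
    else:
        attr_score = 1.0
    # Arbitrary illustrative weights: structure counts a bit more than attributes.
    return 0.6 * path_score + 0.4 * attr_score


def find_most_similar(example, page_html):
    """Return the element of page_html that best matches the example record."""
    return max(parse(page_html), key=lambda e: similarity(example, e))
```

Here the user's highlighted element would be one record taken from parse() on the original page, and find_most_similar() would be run against each updated page. A real tool would likely also compare sibling position and nearby label text, which this sketch leaves out.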
      
      