comp.ai | Awaiting the gospel from Sarah Connor | 1,954 messages
Message 170 of 1,954
Markus to All
Re: Finding an HTML element
30 Nov 03 00:29:10
From: markus-1977@gmx.net

> I want to build a tool for data mining from an html page. I want the user to
> select an element from a web page, and train my application to recognize it
> in its later updates. For example, suppose the user wants to extract some
> data from a financial page. He wants to extract his total balance, plus the
> table of the last transactions. What he should do is highlight the elements
> inside the html page. After that, the application should analyze the
> html element structure and learn how to find it in similar pages (even
> when they are not identical). What I really need is an algorithm to
> "understand" a single element (by its structure, position in the page, or
> any other method), and then I want to look in a new page and choose the most
> similar element (which should probably be the right one).

It seems you are trying to "learn" a structure, for example a grammar for
a pattern language. There are a number of algorithms out there that can
learn text patterns nicely.

I've seen something like what you described before; I think it was
with the Lexikon Project at DFKI (www.dfki.de). I don't know of any
publications off the top of my head, though.

Markus

[ comp.ai is moderated. To submit, just post and be patient, or if ]
[ that fails mail your article to
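To make the "choose the most similar element" idea concrete, here is a rough sketch in Python using only the standard library. It is not the grammar-learning approach mentioned above and not anything from the DFKI work; it is a much simpler stand-in that compares elements by their tag path from the document root and by their attributes. The names (`ElementCollector`, `similarity`, `find_most_similar`), the 0.6/0.4 weights, and the two sample pages are all made up for illustration.

```python
from difflib import SequenceMatcher
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not stay "open".
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ElementCollector(HTMLParser):
    """Record each element with its root-to-element tag path, attributes, and direct text."""

    def __init__(self):
        super().__init__()
        self.open = []      # currently open elements, outermost first
        self.elements = []  # every element seen, in document order

    def handle_starttag(self, tag, attrs):
        element = {
            "tag": tag,
            "path": [e["tag"] for e in self.open] + [tag],
            "attrs": dict(attrs),
            "text": "",
        }
        self.elements.append(element)
        if tag not in VOID_TAGS:
            self.open.append(element)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; ignore stray closing tags.
        if any(e["tag"] == tag for e in self.open):
            while self.open.pop()["tag"] != tag:
                pass

    def handle_data(self, data):
        if self.open:
            self.open[-1]["text"] += data


def parse(html):
    collector = ElementCollector()
    collector.feed(html)
    collector.close()
    return collector.elements


def similarity(a, b):
    """Score in [0, 1]: tag-path similarity plus attribute overlap (weights are arbitrary)."""
    path_score = SequenceMatcher(None, a["path"], b["path"]).ratio()
    pairs_a, pairs_b = set(a["attrs"].items()), set(b["attrs"].items())
    union = pairs_a | pairs_b
    attr_score = len(pairs_a & pairs_b) / len(union) if union else 1.0
    return 0.6 * path_score + 0.4 * attr_score


def find_most_similar(example, html):
    """Return the element of `html` that scores highest against `example`."""
    return max(parse(html), key=lambda candidate: similarity(example, candidate))


# Hypothetical pages: the second one has an extra banner and paragraph.
TRAIN_PAGE = ('<html><body><div id="main"><h1>Account</h1>'
              '<span class="balance">1,200.00</span></div></body></html>')
NEW_PAGE = ('<html><body><div class="banner">Ad</div><div id="main">'
            '<h1>Account</h1><p>Hello</p>'
            '<span class="balance">980.50</span></div></body></html>')

# The "user selection" is simulated by picking the balance element by class.
selected = next(e for e in parse(TRAIN_PAGE) if e["attrs"].get("class") == "balance")
best = find_most_similar(selected, NEW_PAGE)
print(best["path"], best["text"])
```

A real tool would probably also want sibling position, surrounding text, and some tolerance for renamed classes, and would need a threshold below which it reports "not found" rather than returning the best of a bad lot.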