comp.misc
General topics about computers not cover
21,759 messages
Message 21,230 of 21,759
Ben Collver to All
Search 101: How to Find the Lost Web (2/
08 Jul 25 14:11:46
   [continued from previous message]   
      
   will overwhelm you with spam about the covid vax. But if you search   
   on Twitter, limiting the date range to summer 2019, you will get   
   precisely the consensus that existed at the time, and nothing else.   
   There's no contamination, because no one can fake a summer 2019   
   Tweet. If the Tweet is dated June, July or August 2019, then that's   
   when it was published, and its content is what it contained back   
   then. This is very different from the output of web search engines,   
   where random third parties control the information sources, and you   
   have all manner of people manipulating both post content and post   
   dates in a bid to win traffic.   
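
   The date restriction described above can be typed straight into the
   Twitter search box using the `since:` and `until:` operators. Below
   is a minimal sketch that builds such a query and a matching search
   URL. The search term "vaccine" and both helper function names are
   assumptions made for illustration, and the URL layout reflects how
   the site has worked at the time of writing, so it may change.

```python
from datetime import date
from urllib.parse import urlencode


def twitter_date_query(terms: str, start: date, end: date) -> str:
    """Add since:/until: operators to restrict results to a date window."""
    if end <= start:
        raise ValueError("end must be later than start")
    return f"{terms} since:{start.isoformat()} until:{end.isoformat()}"


def twitter_search_url(query: str) -> str:
    """URL-encode the query; f=live requests the chronological 'Latest' tab."""
    return "https://twitter.com/search?" + urlencode({"q": query, "f": "live"})


# Hypothetical example: confine a search to summer 2019.
query = twitter_date_query("vaccine", date(2019, 6, 1), date(2019, 9, 1))
print(query)
print(twitter_search_url(query))
```

   The same operators work when typed by hand, so the code is only a
   convenience for repeating the search across several date windows.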
      
   BUSTIN' MYTHS   
   =============   
      
   The value of being able to filter spam chronologically is immense,   
   and it can completely demolish virtually any myth built by the   
   information machine. In the 2010s I got curious as to the origins of   
   the Twitter hashtag. I wanted to know who invented the idea.   
   Wikipedia and a clutch of other sites assured me it was Chris   
   Messina. How predictable, I thought. Twitter hashtag invented by a   
   high-profile, privileged dude with connections galore. But my life   
   experience told me that *well-connected dudes with high public
   profiles are far more often associated with taking credit for
   inventions than with actually inventing them*. So I decided to check
   out the story on Twitter itself.
      
   Because of Twitter's chronological integrity and the fact that I   
   could restrict the period of investigation to a time before Messina's   
   claim, I was able to establish that Messina did not in fact invent   
   the Twitter hashtag. I wrote a post documenting the truth back in   
   2016. Sadly, it's been one of the least visited posts I've ever   
   written. The search engines are quite happy with wall-to-wall
   regurgitations of the Wikipedia line. But the post does demonstrate   
   how much more accurate Twitter can be as an information source than a   
   typical web search engine. And whilst single Tweets are limited (by   
   character-count) in their ability to elaborate on a story,   
   collectively they can prove extremely thorough in the picture they   
   provide.   
      
      
      
   > These obscure search engines are incredibly refreshing to use,   
   > because they deliberately punish the exact, cash-crazed ideology   
   > that Google goes out of its way to reward.   
      
   Twitter also affords us a directional filter on information. By   
   default, we only really see what influential voices are saying. But   
   we can filter a Twitter Advanced Search to show only the replies TO   
   those influential voices. That directional filter can serve as an   
   ideological filter and quickly take us to the opposing views which a   
   web search engine can easily hide.   
      
   This works brilliantly where marketing or propaganda is strong. For   
   example, a brand is only ever going to tell you what it gets right.   
   Never what it gets wrong. The brand will typically use SEO strategies   
   with web search engines, to ensure that its official messaging   
   occupies the whole front page, and that the more negative feedback is   
   buried under a continuous spew of marketing. But using Twitter Search   
   we can completely filter out the brand's own messaging and search   
   only the replies to it. This gives a much truer picture of the   
   brand's performance, and we additionally get to see whether the brand   
   addresses issues raised by members of the public, or simply ignores   
   them.   
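
   The directional filter described above maps onto two Twitter search
   operators: `to:` keeps Tweets addressed to an account, and `-from:`
   drops the account's own posts. Here is a minimal sketch of a query
   builder. The function name, the handle "examplebrand" and the term
   "refund" are all hypothetical.

```python
def replies_to_query(account: str, terms: str = "") -> str:
    """Build a query for Tweets sent TO an account, minus its own Tweets."""
    handle = account.lstrip("@")
    parts = [f"to:{handle}", f"-from:{handle}"]
    if terms:
        # Optional keywords narrow the replies to one topic.
        parts.insert(0, terms)
    return " ".join(parts)


# Hypothetical brand handle and complaint keyword.
print(replies_to_query("@examplebrand", "refund"))
```

   Pasting the resulting string into the search box should show what
   the public says to the brand rather than what the brand says about
   itself.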
      
      
      
      
      
   > It's no longer about the consumer. It's 50% an elitist closed shop   
   > in which Amazon, eBay, YouTube and Co. win by default, and 50% a   
   > "which established e-corp can bribe the most PR7s and pump the most   
   > elaborate data graph into Silicon Valley?" contest.   
      
   CUSTOMISED SEARCH   
   =================   
      
   Instances of the decentralised search engine Searx (listed here--page   
   requires JavaScript) are often recommended as an alternative to   
   bigger web search engines. But it's rarely explained how the search   
   capabilities offered by Searx can be rigorously customised to focus   
   on the best sources of information for a given subject.   
      
      
      
   Searx is all about metasearch. That is, compiling results from a   
   variety of different search indexes. But with Searx, you can choose   
   which indexes you want to query. If you've explored and tested   
   various instances of Searx, you've probably noticed that the search   
   results can be vastly different from one instance to the next. That's   
   because each one is set up by its administrator to query a different   
   selection of indexes. But the range of sources a Searx instance   
   queries is also open to user-customisation. By going into the   
   *Preferences*, you can define exactly whose results you want, and   
   whose you don't.   
      
   I'll use Searx Belgium as an example, because I've found it to be   
   reliable. There are tabs along the top of the results page that   
   denote categories of search. Once you've entered a search term and   
   have a results list on screen, you'll see that the results list is   
   headed with horizontal selection options such as General, Images,   
   Videos, News, etc. Unlike with Google, you can simultaneously choose   
   as many or as few of these search categories as you like. Just select   
   the tab or tabs you want and then re-click the Start Search button.   
      
      
      
   > The Searx Preferences page illustrates just how many different   
   > search resources there are, and names them so we can investigate   
   > them in their own right.   
      
   Let's say you de-selected the *General* tab--which is selected by   
   default--and instead selected the *Social Media* tab. You'll see a   
   dramatic change in the results. Rather than being sourced from   
   Google, Wikipedia, etc (which are Searx Belgium's default sources for   
   General search), the results are now solely coming from Reddit (which   
   is Searx Belgium's default source for Social Media).   
      
   I really like having the option to get a selection of results solely   
   from Reddit, because community Q&A discussion is broadly a lot more   
   genuine than the output of some listicle merchant whose real goal is   
   not to help you solve a problem, but to pocket some commission from   
   Amazon. Even if the contributors on Reddit are not experts (and   
   sometimes they are), collectively they're likely to get you closer to   
   a real solution than an expert blogger who isn't even trying to help.   
      
   True, we could confine Google or DuckDuckGo search results to Reddit   
   by prefixing our search term with *site:reddit.com*--and this is one
   of the few really reliable techniques left for filtering out the
   annoying spam on major web search engines. But we've come to expect
   greater convenience than having to type a website domain into a   
   search box, and that's what the tab system on Searx gives us.   
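
   For anyone who does want the manual route, the *site:* prefix is
   easy to script. This is a small sketch, assuming DuckDuckGo's usual
   `?q=` URL format. The function names and the example search term are
   made up for illustration.

```python
from urllib.parse import urlencode


def site_query(domain: str, terms: str) -> str:
    """Prefix the search terms with a site: restriction."""
    return f"site:{domain} {terms}"


def duckduckgo_url(query: str) -> str:
    """URL-encode the query for a DuckDuckGo search link."""
    return "https://duckduckgo.com/?" + urlencode({"q": query})


# Hypothetical search term.
query = site_query("reddit.com", "fix squeaky door hinge")
print(query)
print(duckduckgo_url(query))
```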
      
   Out of the box, the Searx instance in our example already offers some   
   easy ways to customise the search results for specific needs. But by   
   pitching into the *Preferences*, we can further tailor the sources   
   for each of those category tabs. For example, we could restrict the
   image search sources solely to Unsplash, or Flickr. That filters out
   all of the news site spam, and the results come predominantly from
   photography enthusiasts instead.
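
   The same choice of category and sources can also be expressed in the
   search URL, since Searx accepts `categories` and `engines` query
   parameters. The sketch below only builds the URL and makes no
   request. The instance address `https://searx.example.org` is a
   placeholder, and the engine names "unsplash" and "flickr" are
   assumed to be enabled on the instance; the names each instance
   accepts are shown on its *Preferences* page.

```python
from urllib.parse import urlencode


def searx_search_url(base_url, terms, categories=(), engines=()):
    """Build a Searx search URL limited to chosen categories and engines."""
    params = {"q": terms}
    if categories:
        params["categories"] = ",".join(categories)
    if engines:
        # Engine names must match those configured on the instance.
        params["engines"] = ",".join(engines)
    return base_url.rstrip("/") + "/search?" + urlencode(params)


# Placeholder instance and hypothetical search term.
url = searx_search_url(
    "https://searx.example.org",
    "lighthouse at dusk",
    categories=["images"],
    engines=["unsplash", "flickr"],
)
print(url)
```

   A URL like this can be bookmarked, which gives a one-click version
   of a customised search without touching the saved preferences.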
      
   > Independence from major web search is something we can, and should,   
   > try to build progressively.   
      
   Incidentally, if you do make any changes in Searx *Preferences*,   
      
   [continued in next message]   
      
   --- SoupGate-Win32 v1.05   
    * Origin: you cannot sedate... all the things you hate (1:229/2)