   comp.lang.c      Meh, in C you gotta define EVERYTHING      243,242 messages   

   Message 242,378 of 243,242   
   Keith Thompson to All   
   Re: is_binary_file()   
   08 Dec 25 14:43:58   
   
   From: Keith.S.Thompson+u@gmail.com   
      
   Michael Sanders  writes:   
   [...]   
      
   For yet another set of unreliable heuristics for guessing whether a file   
   is text or binary, you can take a look at Perl's built-in "-T" and "-B"   
   operators.   
      
           The "-T" and "-B" tests work as follows. The first block   
           or so of the file is examined to see if it is valid   
           UTF-8 that includes non-ASCII characters. If so, it's a   
           "-T" file. Otherwise, that same portion of the file is   
           examined for odd characters such as strange control codes   
           or characters with the high bit set. If more than a third   
           of the characters are strange, it's a "-B" file; otherwise   
           it's a "-T" file. Also, any file containing a zero byte   
           in the examined portion is considered a binary file. (If   
           executed within the scope of a use locale which includes   
           "LC_CTYPE", odd characters are anything that isn't a   
           printable nor space in the current locale.) If "-T" or   
           "-B" is used on a filehandle, the current IO buffer is   
           examined rather than the first block. Both "-T" and "-B"   
           return true on an empty file, or a file at EOF when testing   
           a filehandle. Because you have to read a file to do the "-T"   
           test, on most occasions you want to use a "-f" against the   
           file first, as in "next unless -f $file && -T $file".   
      
   It's not clear how big a "block" is.  For an empty file, both -T   
   and -B are true.  I don't know whether there are other cases where   
   both are true, or where both are false.   
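      
   For comparison, here is a rough C sketch of the byte-counting part of   
   that description.  It is not Perl's actual code.  The function name   
   looks_like_text() is made up, the UTF-8 check and the locale case are   
   left out entirely, and the set of control characters treated as   
   ordinary (tab, newline, CR, form feed, backspace) is my own assumption.   

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not Perl's implementation: guess whether the
   first n bytes of a file look like text, per the quoted description.
   The UTF-8 validity test and the locale-dependent case are omitted. */
bool looks_like_text(const unsigned char *buf, size_t n)
{
    size_t odd = 0;

    assert(buf != NULL || n == 0);

    /* An empty sample counts as text here (Perl reports both -T and
       -B as true for an empty file). */
    if (n == 0)
        return true;

    for (size_t i = 0; i < n; i++) {
        unsigned char c = buf[i];

        if (c == 0)
            return false;   /* any zero byte: treat as binary */
        if (c > 127)
            odd++;          /* high bit set */
        else if (c < 32 && c != '\t' && c != '\n' && c != '\r'
                 && c != '\f' && c != '\b')
            odd++;          /* assumed set of "strange" control codes */
    }

    /* Binary if more than a third of the bytes are odd, i.e. text
       when 3 * odd <= n; written as a division to avoid overflow. */
    return odd <= n / 3;
}
```

   To use it on a file, open the file in binary mode ("rb"), fread() some   
   leading portion into an unsigned char buffer, and pass the buffer and   
   the byte count returned by fread().  How much to read is a guess, since   
   the documentation doesn't say how big a "block" is.   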
      
   --   
   Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com   
   void Void(void) { Void(); } /* The recursive call of the void */   
      


(c) 1994,  bbs@darkrealms.ca