   comp.lang.c      Meh, in C you gotta define EVERYTHING      243,242 messages   

   Message 242,355 of 243,242   
   James Kuyper to Michael Sanders   
   Re: is_binary_file()   
   06 Dec 25 20:37:22   
   
   From: jameskuyper@alumni.caltech.edu   
      
   On 2025-12-05 20:05, Michael Sanders wrote:   
   > Am I close? Missing anything you'd consider to be (or not) needed?   
   >   
   >    
   >   
   > /*   
   >  * Checks if a file is likely a binary by examining its content   
   >  * for NULL bytes (0x00) or unusual control characters.   
      
   NULL is a macro that expands to a null pointer constant. I think you   
   mean "null character". This isn't just nit-picking - C is a   
   case-sensitive language, so it's essential to pay attention to case.   
      
   >  * Returns 0 if text, 1 if binary or file open failure.   
   >  */   
      
   You should return a distinct value for file open failure - a file that   
   cannot be opened cannot be determined to be either a text or a binary file.   
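      
   A minimal sketch of a three-way result, assuming a simple check for   
   null characters stands in for the real classification. The names   
   file_kind, FILE_TEXT, FILE_BINARY, FILE_ERROR and classify_file() are   
   hypothetical and do not come from the original code:   

```c
#include <stdio.h>

/* Hypothetical result codes: failure is distinct from both answers. */
enum file_kind { FILE_ERROR = -1, FILE_TEXT = 0, FILE_BINARY = 1 };

/* Sketch only: treats any null character as evidence of binary data. */
enum file_kind classify_file(const char *path)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return FILE_ERROR;      /* cannot open: neither text nor binary */

    enum file_kind kind = FILE_TEXT;
    int c;
    while ((c = getc(fp)) != EOF) {
        if (c == '\0') {        /* null character found */
            kind = FILE_BINARY;
            break;
        }
    }
    if (ferror(fp))
        kind = FILE_ERROR;      /* a read error is also not an answer */
    fclose(fp);
    return kind;
}
```

   With this shape a caller can tell "binary" apart from "could not be   
   examined" instead of seeing 1 for both.   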
      
   You really cannot distinguish with certainty whether a file is a text   
   file or a binary file based solely upon the contents. A file whose   
   format is an array of two-byte 2's complement little-endian integers   
   would normally be considered binary, yet it might happen to contain   
   integers whose bytes all happen to be printable characters.   
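      
   To illustrate, here is a small sketch that stores 16-bit integers as   
   little-endian byte pairs and then tests every byte with isprint().   
   The helper names encode_le16() and all_printable() are made up for   
   this example, and the remark after the code assumes an   
   ASCII-compatible character set:   

```c
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/* Store each value as two bytes, low byte first (little-endian).
   out must have room for 2 * n bytes. */
void encode_le16(const int16_t *values, size_t n, unsigned char *out)
{
    for (size_t i = 0; i < n; i++) {
        /* Conversion to uint16_t yields the 2's complement bit pattern. */
        uint16_t u = (uint16_t)values[i];
        out[2 * i]     = (unsigned char)(u & 0xFFu);
        out[2 * i + 1] = (unsigned char)(u >> 8);
    }
}

/* Returns 1 if every byte is a printing character, 0 otherwise. */
int all_printable(const unsigned char *buf, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (!isprint(buf[i]))
            return 0;
    return 1;
}
```

   Encoding the three integers 25928, 27756 and 8559 this way gives the   
   six bytes of "Hello!" in ASCII, so a content-only check would call   
   that data text even though it was written as integers.   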
      
   The standard does not define what a "binary file" is. However, it does   
   provide a promise that applies only to streams in text mode, which   
   depends upon what was written to that file:   
      
   "Data read in from a text stream will necessarily compare equal to the   
   data that were earlier written out to that stream only if: the data   
   consist only of printing characters and the control characters   
   horizontal tab and new-line; no new-line character is immediately   
   preceded by space characters; and the last character is a new-line   
   character." (7.23.2p2).   
      
   I believe it therefore makes sense to consider something to be a text   
   file if it meets those requirements, and otherwise is a binary file.   
   Note that the last requirement implies that an empty file cannot qualify   
   as text - at a minimum, it must contain a new-line character.   
      
   This implies the use of the isprint() function; the only other   
   characters you need to handle specifically are '\t', '\n', and ' '.   
   Since the result returned by isprint() is locale-dependent, the program   
   should, at least optionally, use setlocale().   
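      
   The three conditions from the quoted paragraph can be sketched as a   
   check over an in-memory buffer. The function name is_text_buffer() is   
   hypothetical, and the sketch assumes the bytes are as a text stream   
   would deliver them, with each line ending in a single '\n':   

```c
#include <ctype.h>
#include <stddef.h>

/* Returns 1 if buf satisfies all three conditions, 0 otherwise. */
int is_text_buffer(const unsigned char *buf, size_t n)
{
    /* An empty buffer, or one not ending in new-line, does not qualify. */
    if (n == 0 || buf[n - 1] != '\n')
        return 0;

    for (size_t i = 0; i < n; i++) {
        unsigned char c = buf[i];
        if (c == '\n') {
            /* No new-line may be immediately preceded by a space. */
            if (i > 0 && buf[i - 1] == ' ')
                return 0;
        } else if (c != '\t' && !isprint(c)) {
            /* Only printing characters, tab and new-line are allowed. */
            return 0;
        }
    }
    return 1;
}
```

   A caller that wants locale-aware results would typically call   
   setlocale(LC_CTYPE, "") before using it; without that call the "C"   
   locale applies.   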
      


(c) 1994,  bbs@darkrealms.ca