

   comp.lang.c      Meh, in C you gotta define EVERYTHING      243,242 messages   


   Message 242,392 of 243,242   
   Keith Thompson to Michael Sanders   
   Re: is_binary_file()   
   09 Dec 25 15:42:59   
   
   From: Keith.S.Thompson+u@gmail.com   
      
   Michael Sanders writes:   
   > On Mon, 8 Dec 2025 18:44:33 +0000, bart wrote:   
   >> It's not clear what the actual problem is. What is the use-case   
   >> for a function that tells you whether any file /might/ be a   
   >> text-file based on speculative analysis of its contents? Is   
   >> the result /meant/ to be fuzzy?   
   >   
   > Hey bart.   
   >   
   > What I mean is that since I have not yet defined a canonical   
   > standard for my program, the goal here (to determine if my code   
   > can parse the file) is unclear.   
   >   
   > It means I need to plan much more *before* I write more code, no   
   > mean feat when one is excited & ready to jump in =)   
      
   You say you want to parse the file.  That implies that you expect   
   the file to have a certain format/syntax, and for parsing to fail   
   on a file that doesn't satisfy the syntax.   
      
   In that case, I speculate that determining whether the file is   
   text or binary is not useful.  The way to determine whether you   
   can parse it is simply to try to parse it, and see whether that   
   succeeds or fails.   
      
   For example, if I want to parse a file containing a C translation   
   unit, I can feed it to a C compiler (or just a parser if I have   
   one).  If the file contains non-text bytes, that's just a special   
   case of a syntactically incorrect input, and the parser will   
   detect it.  It should work similarly for whatever format you're   
   trying to parse.  I doubt that you need to distinguish between   
   incorrect input that's pure text and incorrect input that's   
   "binary".   
      
   If I'm right about this (which is by no means certain), you could   
   have saved a lot of time by telling us up front *why* you want to   
   distinguish between "text" and "binary" files.   
      
   On the other hand, I've seized on the word "parse", and I may be   
   reading too much into it.   
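      
   A minimal sketch of the "just try to parse it" approach, in C.   
   The key=value format and the name parse_kv are invented purely   
   for illustration; nothing in the thread specifies the real   
   format.  It assumes the default "C" locale, where isprint() and   
   isalpha() reject bytes outside ASCII.  A stray NUL or control   
   byte is rejected by the same checks that reject a malformed line,   
   so no separate text/binary test is needed.   

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical format, invented for illustration: zero or more lines
   of the form key=value, where key is [A-Za-z_][A-Za-z0-9_]* and
   value is any run of printable characters.  Empty lines are allowed. */

/* Returns 0 if buf parses, -1 otherwise.  On failure, *err_off (if
   not NULL) is set to the offset of the first offending byte, or to
   len if the input ended too early. */
int parse_kv(const unsigned char *buf, size_t len, size_t *err_off)
{
    size_t i = 0;

    while (i < len) {
        if (buf[i] == '\n') {           /* empty line */
            i++;
            continue;
        }

        /* key: a letter or underscore, then letters/digits/underscores */
        if (!(isalpha(buf[i]) || buf[i] == '_'))
            goto fail;
        while (i < len && (isalnum(buf[i]) || buf[i] == '_'))
            i++;

        /* separator */
        if (i == len || buf[i] != '=')
            goto fail;
        i++;

        /* value: printable characters up to end of line or end of input;
           "binary" bytes fail here like any other syntax error */
        while (i < len && buf[i] != '\n') {
            if (!isprint(buf[i]))
                goto fail;
            i++;
        }
        if (i < len)
            i++;                        /* consume the newline */
    }
    return 0;

fail:
    if (err_off != NULL)
        *err_off = i;
    return -1;
}
```

   A caller would read the file (opened in binary mode, so no bytes   
   are translated) into a buffer, call parse_kv, and report the   
   offset on failure.  Whether the bad byte was "text" or "binary"   
   never has to be decided.   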
      
   --   
   Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com   
   void Void(void) { Void(); } /* The recursive call of the void */   
      


