

   comp.lang.c      Meh, in C you gotta define EVERYTHING      243,242 messages   


   Message 242,407 of 243,242   
   Michael Sanders to Michael Sanders   
   Re: is_binary_file()   
   10 Dec 25 18:41:22   
   
   From: porkchop@invalid.foo   
      
   On Wed, 10 Dec 2025 11:35:48 -0000 (UTC), Michael Sanders wrote:   
      
   > Yes. Here's my 2nd attempt...   
   >   
   > [...]   
      
   This is the last version from me (I have to pivot to other things).   
      
   The main change is a lookup table, which ought to make   
   future extensions easier...   
      
   Earnest thanks to each & all =)   
      
   #include <stdio.h>   // FILE, fopen, fread, fclose   
   #include <stddef.h>  // size_t   
      
   // is_text_file()   
   // Returns:   
   //   -1 : could not open file   
   //    0 : is NOT a text file (binary indicators found)   
   //    1 : is PROBABLY a text file (no strong binary signatures)   
      
   int is_text_file(const char *path) {   
       FILE *f = fopen(path, "rb");   
       if (!f) return -1;   
      
       unsigned char chunk[4096]; // 4KB   
       size_t n, i;   
      
       // Look Up Table: 1 = allowed in text, 0 = binary indicator   
       // Allows TAB(0x09), LF(0x0A), CR(0x0D), printable ASCII (0x20–0x7E)   
       static const unsigned char LUT[128] = {   
           0,0,0,0,0,0,0,0,0,1,1,0,0,1,0,0, // 0x00–0x0F   
           0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, // 0x10–0x1F   
           1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, // 0x20–0x2F   
           1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, // 0x30–0x3F   
           1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, // 0x40–0x4F   
           1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, // 0x50–0x5F   
           1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, // 0x60–0x6F   
           1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0  // 0x70–0x7F, last 0 = DEL   
       };   
      
       while ((n = fread(chunk, 1, sizeof(chunk), f)) > 0) {   
           for (i = 0; i < n; i++) {   
               if (chunk[i] < 128 && !LUT[chunk[i]]) {   
                   fclose(f);   
                   return 0; // binary indicator found   
               }   
               // bytes >= 128 are accepted as probably text   
           }   
       }   
      
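       int err = ferror(f); // fread() returns 0 on EOF and on error alike   
       fclose(f);   
       if (err) return -1; // read error: reported like an open failure   
       return 1; // probably text   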
   }   
      
   --   
   :wq   
   Mike Sanders   
      
   --- SoupGate-Win32 v1.05   
    * Origin: you cannot sedate... all the things you hate (1:229/2)   



(c) 1994,  bbs@darkrealms.ca