comp.lang.c

bart to Richard Heathfield

Re: is_binary_file()

10 Dec 25 22:37:48

   From: bc@freeuk.com   
      
   On 10/12/2025 19:42, Richard Heathfield wrote:   
   > On 10/12/2025 17:18, Scott Lurndal wrote:   
   >> Michael S writes:
   >>> On Wed, 10 Dec 2025 15:07:30 GMT   
   >>> scott@slp53.sl.home (Scott Lurndal) wrote:   
   >>>   
   >>>> Michael Sanders writes:
   >>>>> On Sat, 6 Dec 2025 02:00:22 -0000 (UTC), Lew Pitcher wrote:   
   >>>>>> I should have added that I feel that you probably haven't really   
   >>>>>> defined /what/ "text file" means, and that has interfered with   
   >>>>>> the development of this function. As Keith pointed out, the task   
   >>>>>> of distinguishing between a "text" file and a "binary" file is not   
   >>>>>> easy. I'll add that a lot of the difficulty stems from the fact   
   >>>>>> that there are many definitions (some conflicting) of what a "text"   
   >>>>>> file actually contains.   
   >>>>>   
   >>>>> Yes. Here's my 2nd attempt following the template (of thinking)   
   >>>>> you've suggested...   
   >>>>   
   >>>> The problem with all of your attempts is the performance   
   >>>> issue.  Success requires reading every single byte of the   
   >>>> file, one byte at a time.   The word 'slow' is not sufficient   
   >>>> to describe how bad the performance will be for a very large   
   >>>> file.   
   >>>>   
   >>>> At a minimum, dump the stdio double-buffered byte-by-byte   
   >>>> algorithm and use mmap().   
   >>>>   
   >>>   
   >>> I suggest doing actual speed measurements before making bold
   >>> claims like the above. Don't trust your intuition!
   >>   
   >> I have, more than once, done such measurements after mmap()   
   >> was introduced in SVR4 circa 1989 (ported from SunOS).   
   >>   
   >> On a single-user system, running a single job, the difference   
   >> for smaller files is in the noise.   For larger files, or when   
   >> the system is heavily loaded or multiuser, it can be significant.   
   >   
   > 1989 is 36 years ago. Technology has moved on. If your file is
   > too slow to read, get yourself a real computer.
   >   
   > On my very ordinary desktop machine, I just freq'd[1] a 7,032,963,565-   
   > byte file in 12.256 seconds. That's 573,838,410 bytes per second. It's a   
   > damn sight faster than I could do by hand.   
   >   
   > How, exactly, are you using `slow'?   
   >   
      
   A getc loop took 4.3 seconds to read a 192MB file from SSD, on my   
   Windows PC.   
      
   Under WSL it took 8.4 seconds (8.4 s real, 0.5 s user).
      
   However reading it all in one go took 0.14 seconds.   
      
   I guess not all 'getc' implementations are the same.   
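
For anyone who wants to repeat the comparison, here is a minimal sketch of the two approaches, not the code that produced the timings above. It assumes a regular file smaller than LONG_MAX bytes and a platform where fseek(SEEK_END)/ftell on a binary stream reports the file size. The function names count_nul_getc and count_nul_bulk are made up for illustration, and counting zero bytes is only a stand-in for whatever per-byte test is_binary_file() would apply.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Count zero bytes by calling getc() once per byte. Returns -1 on error. */
long count_nul_getc(const char *path)
{
    assert(path != NULL);
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;

    long nuls = 0;
    int c;
    while ((c = getc(fp)) != EOF)
        if (c == 0)
            nuls++;

    int failed = ferror(fp);
    fclose(fp);
    return failed ? -1 : nuls;
}

/* Count zero bytes after reading the whole file with a single fread().
   Assumption: fseek/ftell sizing works, i.e. a regular file < LONG_MAX. */
long count_nul_bulk(const char *path)
{
    assert(path != NULL);
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;

    if (fseek(fp, 0, SEEK_END) != 0) { fclose(fp); return -1; }
    long size = ftell(fp);
    if (size < 0) { fclose(fp); return -1; }
    rewind(fp);

    /* Allocate at least one byte so an empty file is not an error. */
    unsigned char *buf = malloc(size > 0 ? (size_t)size : 1);
    if (buf == NULL) { fclose(fp); return -1; }

    size_t got = fread(buf, 1, (size_t)size, fp);
    fclose(fp);

    long nuls = 0;
    for (size_t i = 0; i < got; i++)
        if (buf[i] == 0)
            nuls++;

    free(buf);
    return got == (size_t)size ? nuls : -1;
}
```

Both functions should return the same count for the same file, so any difference you measure is in how the bytes are fetched. Time each call with a wall-clock timer rather than CPU time alone; the real/user split in the WSL figures above suggests most of the time can be spent outside user code.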
      