comp.arch
BGB to All
Re: Random/OT: Low sample rate audio wei
11 Sep 25 02:05:59
   From: cr88192@gmail.com   
      
   On 9/10/2025 8:33 PM, Lawrence D’Oliveiro wrote:   
   > On Sat, 6 Sep 2025 14:19:40 -0500, BGB wrote:   
   >   
   >> But, there is some "weird hacks" that can be done in audio processing   
   >> when downsampling that seems to notably increase intelligibility at an   
   >> 8kHz sample rate ...   
   >   
   > There are digital encoding formats used with mobile phones that are   
   > optimized for speech. Ever heard a call where the other end sounded every   
   > now and then like they were underwater? That’s the kind of compression   
   > artifact you get.   
      
   Looking into it a bit, apparently a lot of current phone-class audio   
   codecs are based on running a model of the human vocal tract and then   
   adding white noise to make it sound more natural (with some apparently   
   partly based on vocoder technology).   
      
   But, in my case, I don't really hear speech effectively over phones, I   
   mostly hear a lot of warbling that I am left trying to decipher over all   
   the hiss.   
      
      
   As noted, the filtering hack mostly kept to normal PCM handling, but I   
   soon realized it can't work as a general solution to "stuff sounding   
   bad" at an 8kHz sample rate.   
      
      
      
   When I looked into it, 4-channel sinewave synthesis is possible, but:   
      Quality is still poor;   
      At a 125Hz update frequency and 16 bits per sinewave, it still takes   
      around 8kbps.   
      
   It needs roughly 16 bits to encode both the frequency and amplitude of   
   each sinewave to an acceptable degree.   
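   As a rough check of the bitrate arithmetic above (a sketch only; the   
   function name is hypothetical and this is not the actual encoder):   

```python
def synth_bitrate_bps(channels, bits_per_wave, update_hz):
    """Bits per second when each sinewave sends one parameter word per update."""
    return channels * bits_per_wave * update_hz


# 4 sinewaves, 16 bits each (frequency + amplitude), updated 125 times a second
print(synth_bitrate_bps(4, 16, 125))  # 4 * 16 * 125 = 8000
```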
      
   When fiddling with it, I ended up finding an OK strategy of:   
      Sample for 12 sinewaves, dividing the 2-8 kHz range into roughly 1/6   
      octave chunks (picking the loudest wave within each chunk);   
      Pick the 4 loudest waves from the 12 sampled.   
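   The selection step above could be sketched roughly as follows. This   
   assumes the input is already a list of (frequency, amplitude) peaks   
   from some spectral analysis; the function names and the exact band   
   edges are illustrative assumptions, not the code actually used:   

```python
import math


def band_index(freq_hz, lo_hz=2000.0, hi_hz=8000.0, bands=12):
    """Log-spaced band number; 12 bands over 2 octaves is 1/6 octave each."""
    pos = math.log2(freq_hz / lo_hz) / math.log2(hi_hz / lo_hz)
    return min(int(pos * bands), bands - 1)


def pick_sinewaves(peaks, keep=4, lo_hz=2000.0, hi_hz=8000.0, bands=12):
    """peaks: iterable of (freq_hz, amplitude) pairs.

    Keeps the loudest peak in each band, then returns the `keep` loudest
    of those band winners, loudest first.
    """
    loudest = [None] * bands
    for freq, amp in peaks:
        if not (lo_hz <= freq < hi_hz):
            continue  # outside the analyzed range
        b = band_index(freq, lo_hz, hi_hz, bands)
        if loudest[b] is None or amp > loudest[b][1]:
            loudest[b] = (freq, amp)
    winners = [p for p in loudest if p is not None]
    winners.sort(key=lambda p: p[1], reverse=True)
    return winners[:keep]
```

   Picking one winner per band first keeps a single loud region of the   
   spectrum from taking all 4 slots.   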
      
      
   I was experimenting with pushing the scheme I mentioned else-thread to   
   around 6 kbps, which (last I messed with it) still generates some truly   
   awful audio quality.   
      
   Posted an example to my twitter feed:   
   https://x.com/cr88192/status/1965694742186049683   
      
   It does sound a fair bit better with a 16kHz sampling rate (12kbps), but   
   is still notably inferior to 8kHz 2-bit ADPCM (16kbps).   
      
      
   The 6kbps case is interesting as it gets a 2-minute song into around   
   96K, which is kinda pushing into MIDI territory. But, MIDI would have   
   sounded better (though, no real obvious way to auto-convert PCM audio   
   into MIDI commands).   
      
   Well, unless maybe doing something like sinewave synthesis but then   
   trying to convert the sine waves into Note On/Off commands. Though,   
   naively mapping sinewave synthesis to MIDI commands would likely add a   
   fair bit of bulk and overhead.   
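   To illustrate the naive mapping, here is a minimal sketch. It uses   
   the standard equal-temperament relation (A4 = 440 Hz = MIDI note 69);   
   the frame format and function names are assumptions made up for the   
   example, and pitch error from rounding to the nearest semitone is   
   ignored:   

```python
import math


def freq_to_midi_note(freq_hz):
    """Nearest MIDI note number for a frequency, clamped to 0..127."""
    note = int(round(69 + 12 * math.log2(freq_hz / 440.0)))
    return max(0, min(127, note))


def amp_to_velocity(amp, full_scale=1.0):
    """Map a linear amplitude to a MIDI velocity in 1..127."""
    return max(1, min(127, int(round(127 * amp / full_scale))))


def frames_to_events(frames):
    """frames: list of frames, each a list of (freq_hz, amplitude).

    Returns (frame_index, "on" | "off", note, velocity) tuples, emitting
    an event only when a note appears or disappears between frames.
    """
    events = []
    active = set()
    for i, frame in enumerate(frames):
        wanted = {}
        for freq, amp in frame:
            note = freq_to_midi_note(freq)
            wanted[note] = max(wanted.get(note, 0), amp_to_velocity(amp))
        wanted_notes = set(wanted)
        for note in sorted(active - wanted_notes):
            events.append((i, "off", note, 0))
        for note in sorted(wanted_notes - active):
            events.append((i, "on", note, wanted[note]))
        active = wanted_notes
    return events
```

   Every sinewave that drifts by a semitone costs a Note Off plus a Note   
   On, which is where the bulk would come from.   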
      
      
      
   It is possible that I may need to take a different approach to   
   generating the pattern table.   
      
   Initial approach:   
      Fill it with sine-waves;   
      Didn't work very well.   
      
   Current strategy:   
      Start with a table of 16-bit patterns (curated manually);   
      Map each to samples, 0=full negative, 1=full positive;   
      Run N passes of averaging;   
      Generate a pattern table with 4-bits per pattern sample.   
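   A minimal sketch of those steps, under stated assumptions: the   
   averaging kernel (3-tap, 1-2-1, with wraparound), the bit order, and   
   the mapping of -1..+1 onto 0..15 are all guesses for illustration,   
   and the function names are hypothetical:   

```python
def pattern_to_samples(bits16):
    """Bit 0 is the first sample; 0 -> full negative, 1 -> full positive."""
    return [1.0 if (bits16 >> i) & 1 else -1.0 for i in range(16)]


def smooth(samples, passes):
    """Run `passes` rounds of 1-2-1 averaging, wrapping at the ends."""
    n = len(samples)
    for _ in range(passes):
        samples = [
            (samples[(i - 1) % n] + 2.0 * samples[i] + samples[(i + 1) % n]) / 4.0
            for i in range(n)
        ]
    return samples


def quantize4(samples):
    """Map -1.0..+1.0 onto 4-bit values 0..15."""
    return [max(0, min(15, int(round((s + 1.0) * 7.5)))) for s in samples]


def build_table(patterns, passes=2):
    """One row of sixteen 4-bit samples per 16-bit input pattern."""
    return [quantize4(smooth(pattern_to_samples(p), passes)) for p in patterns]
```

   More passes round off the square edges further, at the cost of   
   making the patterns more alike.   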
      
   Possible pattern-table generation strategy (not yet tried):   
      Use the sign of each sample relative to the base curve to generate   
      a 16-bit key;   
      Average the relative values for each key, keeping track of relative   
      usage frequency;   
      Pick the top N, or else merge similar patterns until fewer than 256   
      or so remain.   
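   Since this has not been tried, the following is only a sketch of one   
   way to read it. It assumes the audio is already split into 16-sample   
   blocks with a matching base-curve value per sample; all names are   
   hypothetical, and the "merge similar patterns" alternative is left   
   out:   

```python
from collections import defaultdict


def sign_key(block, base):
    """16-bit key: bit i is set when sample i is at or above the base curve."""
    key = 0
    for i, (s, b) in enumerate(zip(block, base)):
        if s >= b:
            key |= 1 << i
    return key


def collect_patterns(blocks, bases):
    """Return {key: (mean residual per sample, usage count)}."""
    sums = defaultdict(lambda: [0.0] * 16)
    counts = defaultdict(int)
    for block, base in zip(blocks, bases):
        key = sign_key(block, base)
        acc = sums[key]
        for i in range(16):
            acc[i] += block[i] - base[i]
        counts[key] += 1
    return {k: ([v / counts[k] for v in sums[k]], counts[k]) for k in counts}


def top_patterns(stats, limit=256):
    """Keep the `limit` most frequently used keys, most used first."""
    ranked = sorted(stats.items(), key=lambda kv: kv[1][1], reverse=True)
    return ranked[:limit]
```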
      
   Note that any sounds much over ~ 250Hz at an 8000Hz sample rate are   
   being generated from the pattern table.   
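   One way to arrive at a figure like ~250Hz, assuming (this is a guess,   
   not stated above) that the base curve carries one point per 16-sample   
   pattern, is the Nyquist limit of that base curve:   

```python
def base_curve_nyquist_hz(sample_rate_hz, samples_per_pattern):
    """Highest frequency a base curve with one point per pattern can carry."""
    points_per_second = sample_rate_hz / samples_per_pattern
    return points_per_second / 2.0


print(base_curve_nyquist_hz(8000, 16))  # 8000 / 16 / 2 = 250.0
```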
      
      
   But, it is possible this approach may be a lost cause (it may not be   
   possible to get anywhere near acceptable quality at these bitrates).   
      
      
   Note that I don't want something significantly more complicated or   
   expensive than ADPCM (so, ideally no entropy coding or fancy transforms   
   on the decoder side...).   
      
   To be useful, would need to either:   
      Do better than ADPCM at a similar bitrate;   
      Achieve bitrates lower than what is possible with ADPCM.   
      
   Was partly looking at the latter, but to be useful it needs to have some   
   level of "passable" quality, which I have yet to achieve at this target   
   (eg, particularly at 6 kbps).   
      
   ...   
      