|    comp.arch    |    Apparently more than just beeps & boops    |    131,241 messages    |
|    Message 129,645 of 131,241    |
|    BGB to All    |
|    Re: Random/OT: Low sample rate audio wei    |
|    11 Sep 25 02:05:59    |
From: cr88192@gmail.com

On 9/10/2025 8:33 PM, Lawrence D’Oliveiro wrote:
> On Sat, 6 Sep 2025 14:19:40 -0500, BGB wrote:
>
>> But, there is some "weird hacks" that can be done in audio processing
>> when downsampling that seems to notably increase intelligibility at an
>> 8kHz sample rate ...
>
> There are digital encoding formats used with mobile phones that are
> optimized for speech. Ever heard a call where the other end sounded every
> now and then like they were underwater? That’s the kind of compression
> artifact you get.

Looking into it some, apparently a lot of the current phone-class audio
codecs are based on running a model of the human vocal tract and then
adding white noise to make it sound more natural (with some apparently
partly based on vocoder technology).

But, in my case, I don't really hear speech effectively over phones; I
mostly hear a lot of warbling that I am left trying to decipher over all
the hiss.


As noted, the filtering hack mostly kept to normal PCM handling, but I
soon realized it can't work as a general solution to "stuff sounding bad"
at an 8kHz sample rate.


When I was looking into it, 4-channel sinewave synthesis is possible, but:
  Quality is still poor;
  At a 125Hz update frequency, at 16 bits per sinewave, it still takes
  around 8kbps (4 * 16 * 125 = 8000 bits/sec).

It needs roughly 16 bits to encode both the frequency and amplitude of
each sinewave to an acceptable degree.

When fiddling with it, I ended up finding an OK strategy of:
  Sample for 12 sinewaves, dividing the 2-8 kHz range into roughly 1/6
  octave chunks (picking the loudest wave within each chunk);
  Pick the top 4 loudest waves from the 12 sampled.

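For illustration, the band-then-top-4 selection described above could be sketched roughly as follows. This is not the actual encoder code: it assumes the spectral peaks have already been found somehow (FFT, filter bank, whatever) and arrive as (frequency, amplitude) pairs, and the function names and defaults are made up for the example. The band range is left as a parameter.

```python
def band_edges(lo_hz=2000.0, hi_hz=8000.0, bands=12):
    """Log-spaced band edges; 2-8 kHz split 12 ways gives 1/6-octave bands."""
    ratio = (hi_hz / lo_hz) ** (1.0 / bands)
    return [lo_hz * ratio ** i for i in range(bands + 1)]


def pick_sinewaves(peaks, keep=4, lo_hz=2000.0, hi_hz=8000.0, bands=12):
    """peaks: list of (freq_hz, amplitude) pairs (assumed input format).

    Keeps the loudest peak in each band, then returns the `keep` loudest
    of those band winners, loudest first.
    """
    edges = band_edges(lo_hz, hi_hz, bands)
    winners = []
    for lo, hi in zip(edges, edges[1:]):
        in_band = [p for p in peaks if lo <= p[0] < hi]
        if in_band:
            winners.append(max(in_band, key=lambda p: p[1]))
    winners.sort(key=lambda p: p[1], reverse=True)
    return winners[:keep]


def bitrate_bps(waves=4, bits_per_wave=16, updates_per_sec=125):
    """Raw bitrate when every wave is re-sent on every update."""
    return waves * bits_per_wave * updates_per_sec
```

Note that picking per band first means a fairly loud peak can still get dropped if a louder one sits in the same 1/6-octave chunk, which keeps the 4 surviving waves spread out in frequency rather than clustered around one strong formant.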
I was experimenting with pushing the scheme I mentioned else-thread to
around 6 kbps, which (last I messed with it) still generates some truly
awful audio quality.

Posted an example to my twitter feed:
https://x.com/cr88192/status/1965694742186049683

It does sound a fair bit better with a 16kHz sampling rate (12kbps), but
is still notably inferior to 8kHz 2-bit ADPCM (16kbps).


The 6kbps case is interesting as it gets a 2-minute song into around
96K, which is kinda pushing into MIDI territory. But, MIDI would have
sounded better (though, there is no real obvious way to auto-convert PCM
audio into MIDI commands).

Well, unless maybe doing something like sinewave synthesis but then
trying to convert the sine waves into Note On/Off commands. Though,
naively mapping sinewave synthesis to MIDI commands would likely add a
fair bit of bulk and overhead.


It is possible that I may need to take a different approach to
generating the pattern table.

Initial approach:
  Fill it with sine-waves;
  Didn't work very well.

Current strategy:
  Start with a table of 16-bit patterns (curated manually);
  Map each to samples, 0=full negative, 1=full positive;
  Run N passes of averaging;
  Generate a pattern table with 4 bits per pattern sample.

Possible pattern-table generation strategy (not yet tried):
  Use the sign of each sample relative to the base curve to generate a
  16-bit key;
  Average the relative values for each key, keeping track of relative
  usage frequency;
  Pick the top-N, or else merge similar patterns until there are fewer
  than 256 or so.

Note that any sounds much over ~ 250Hz at an 8000 sample rate are being
generated from the pattern table.

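As a rough sketch of the "current strategy" steps listed above (expand 16-bit patterns to samples, smooth, quantize to 4 bits): the details here are assumptions for illustration, not the actual tool. In particular, the averaging is taken to be a 3-tap neighbour average with cyclic wrap-around, bits are read MSB-first, and the function names are made up.

```python
def pattern_to_samples(bits16):
    """Expand a 16-bit pattern to 16 samples: 0 -> -1.0, 1 -> +1.0.

    Bit order (MSB first) is an assumption.
    """
    return [1.0 if (bits16 >> (15 - i)) & 1 else -1.0 for i in range(16)]


def smooth(samples, passes=2):
    """Each pass averages every sample with its two neighbours.

    Cyclic wrap at the ends is an assumption; the pattern is treated
    as one period of a repeating waveform.
    """
    n = len(samples)
    for _ in range(passes):
        samples = [
            (samples[(i - 1) % n] + samples[i] + samples[(i + 1) % n]) / 3.0
            for i in range(n)
        ]
    return samples


def quantize4(samples):
    """Map the range [-1.0, 1.0] onto 4-bit codes 0..15."""
    return [min(15, max(0, int(round((s + 1.0) * 7.5)))) for s in samples]


def build_table(patterns, passes=2):
    """patterns: iterable of 16-bit ints -> list of 16-entry 4-bit rows."""
    return [quantize4(smooth(pattern_to_samples(p), passes)) for p in patterns]
```

With 16 samples at 4 bits each, every table entry packs into 64 bits, so a 256-entry table would be 2K, which seems cheap enough to keep the decoder in ADPCM-like territory.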
But, it is possible this approach may be a lost cause (could not be made
to give anywhere near acceptable quality at these bitrates).


Note that I don't want something significantly more complicated or
expensive than ADPCM (so, ideally no entropy coding or fancy transforms
on the decoder side...).

To be useful, it would need to either:
  Do better than ADPCM at a similar bitrate;
  Achieve bitrates lower than what is possible with ADPCM.

Was partly looking at the latter, but to be useful it needs to have some
level of "passable" quality, which I have yet to achieve at this target
(e.g., particularly at 6 kbps).

...