linux.debian.kernel
Debian kernel discussions
2,884 messages
Message 1,010 of 2,884
james young to pronoiac@gmail.com
Bug#1116067: linux-image-6.1.0-32-amd64:
15 Oct 25 04:10:01
   XPost: linux.debian.bugs.dist   
   From: pronoiac@gmail.com   
      
   Whoops, I hadn't intended to top-post... I'll do it correctly this time.   
      
   On Thu, Sep 25, 2025 at 12:57 PM james young  wrote:   
   >   
   > I'm not sure what timeline to expect for a response.   
   > Would a tarball of the outer image preserve everything needed for diagnostics?
   >   
   > -James   
   >   
   > On Tue, Sep 23, 2025 at 2:10 PM james young  wrote:   
   > >   
   > > I hit an issue with btrfs compression; I reported it to Debian, which   
   > > I was using, and they suggested that I take it upstream.   
   > >   
   > > Thanks, Salvatore. My apologies to everyone if I misunderstood.   
   > >   
   > > -James   
   > >   
   > > On Tue, Sep 23, 2025 at 1:50 PM Salvatore Bonaccorso wrote:
   > > >   
   > > > Control: tags -1 + moreinfo   
   > > >   
   > > > Hi James,   
   > > >   
   > > > On Tue, Sep 23, 2025 at 08:04:25PM +0200, James Young wrote:   
   > > > > Package: src:linux   
   > > > > Version: 6.1.129-1   
   > > > > Severity: normal   
   > > > > X-Debbugs-Cc: pronoiac@gmail.com   
   > > > >   
   > > > > Dear Maintainer,   
   > > > >   
   > > > >   
   > > > > * What led up to the situation?   
   > > > > We made empty files in a loop, in parallel, under CPU and I/O load.   
   > > > > We had an outer Btrfs image file with compression, which contained a Btrfs image file, which contained billions of empty files.
   > > > > We wrote around 100TB to the inner image file.   
   > > > > Around 60TB in, compression quietly shut off.   
   > > > > We ran out of space; both mounts presented I/O errors.
   > > > >   
   > > > > * What exactly did you do (or not do) that was effective (or ineffective)?
   > > > >   * I unmounted the inner and outer images.
   > > > >     I didn't take note of memory usage before this point.
   > > > >   * I dumped debug info for the outer image with `btrfs inspect-internal dump-tree --dfs ...`.
   > > > >   * We started a btrfsck (twice, actually; breadth-first hit memory limits, I think).
   > > > >     After that, I learned about `btrfs check`, but didn't interrupt the btrfsck, due to the sunk cost fallacy.
   > > > >     The btrfsck is still running. It's of extremely dubious value now.
   > > > >   * I checked the kernel logs.
   > > > >     I grepped for btrfs, the mount points, compress, and zstd. I didn't find a smoking gun in the right timeframe.
   > > > >   
   > > > > not done yet:   
   > > > > * mount the outer image   
   > > > > * reboot
   > > > > * try a newer kernel. We're currently on kernel 6.1.129; we could go to a newer 6.1 or 6.12 kernel
   > > > > * redo live file system compression, with e.g. `btrfs filesystem defrag -czstd`
   > > > > * fstrim the outer image   
   > > > >   
   > > > > goals:   
   > > > > * work out what happened.   
   > > > > How can we help?   
   > > > > * help avoid it happening again, to others   
   > > > > * salvage what we can   
   > > > >   
   > > > > I've run `bugreport` as a non-privileged user. Let me know if root access would give a fuller picture.
   > > >   
   > > > I believe the best thing you could do here is to contact the
   > > > upstream people directly. get_maintainer.pl and the MAINTAINERS file
   > > > have:
   > > >   
   > > > BTRFS FILE SYSTEM   
   > > > M:      Chris Mason    
   > > > M:      Josef Bacik    
   > > > M:      David Sterba    
   > > > L:      linux-btrfs@vger.kernel.org   
   > > > S:      Maintained   
   > > > W:      https://btrfs.readthedocs.io   
   > > > Q:      https://patchwork.kernel.org/project/linux-btrfs/list/   
   > > > C:      irc://irc.libera.chat/btrfs   
   > > > T:      git git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git   
   > > > F:      Documentation/filesystems/btrfs.rst   
   > > > F:      fs/btrfs/   
   > > > F:      include/linux/btrfs*   
   > > > F:      include/trace/events/btrfs.h   
   > > > F:      include/uapi/linux/btrfs*   
   > > >   
   > > > So I would suggest that you contact the above maintainers, including the
   > > > list.
   > > >   
   > > > Please keep this downstream bugreport as well in the recipients list.   
   > > >   
   > > > Regards,   
   > > > Salvatore   
      
   I made a tarball of the file system, then mounted and looked at the   
   file systems.   
   I attempted to recompress (with btrfs defrag) and fstrim, with little   
   success in freeing up space.   
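
For readers following along, a recompress-and-trim pass of the kind described above might look roughly like the sketch below. It is not the exact sequence that was run: the mount point `/mnt/outer` is a placeholder, and the `run` helper and `DRY_RUN` switch are illustrative additions so that the script only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch only: /mnt/outer is a hypothetical mount point for the outer image.
MNT="${MNT:-/mnt/outer}"
DRY_RUN="${DRY_RUN:-1}"

# Print each command; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

recompress_and_trim() {
    # Space usage before, for comparison.
    run btrfs filesystem usage "$MNT"
    # Rewrite existing file extents with zstd compression, recursively.
    run btrfs filesystem defragment -r -czstd "$MNT"
    # Ask the file system to discard unused blocks.
    run fstrim -v "$MNT"
    # Space usage after.
    run btrfs filesystem usage "$MNT"
}

recompress_and_trim
```

Comparing the two `btrfs filesystem usage` outputs shows how much space, if any, the pass actually freed.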
      
   I started btrfs check with the progress option; within two hours, it   
   had gotten to “[2/7] checking extents, 82 items checked”.   
   I confused the extents with the compressed chunk length - 128KiB - so   
   that seemed woefully low on progress.   
   Over a week later, it’s still "82 items checked".   
   It’s still taking CPU (3% right now) and gigs of memory; it’s doing   
   something, though slowly.   
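
One way to tell "slow" from "stalled" is to log the process's CPU share and memory over time. The following is a minimal sketch, assuming a procps-style `ps`; the function names, the PID argument, and the sampling interval are all hypothetical.

```shell
#!/bin/sh
# Sketch only: log elapsed time, CPU share and resident memory of a
# long-running process (for example a `btrfs check`) at intervals.

# Print one sample: epoch seconds, elapsed time, %CPU, RSS in MiB.
# Returns non-zero if the process no longer exists.
sample() {
    pid="$1"
    # etime = wall-clock time since start
    # pcpu  = CPU time divided by elapsed time (a lifetime average)
    # rss   = resident set size in KiB
    set -- $(ps -o etime= -o pcpu= -o rss= -p "$pid")
    [ "$#" -eq 3 ] || return 1
    echo "$(date +%s) elapsed=$1 cpu=$2% rss_mib=$(( $3 / 1024 ))"
}

# Sample repeatedly until the process exits.
# Usage: watch_pid PID [INTERVAL_SECONDS]
watch_pid() {
    pid="$1"
    interval="${2:-600}"
    while sample "$pid"; do
        sleep "$interval"
    done
}
```

Something like `watch_pid 12345 600 >> check-progress.log` (with the real PID) would give a record of whether memory use is still growing, which is weak evidence that the check is still working rather than stuck.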
      
   So, a few questions:
   * is this business as usual for a btrfs check?   
   * is this a clue about what happened?   
   * is this a symptom?   
      
   If the duration of a check is a useful metric for file system robustness,
   is it something I could / should experiment with shortening? For example:
   * run `sync`   
   * periodically pause writes, to let the buffers empty   
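
The two ideas in the list above could be combined in the file-creation loop roughly as follows. This is a serial sketch, not the original parallel workload; the function name, directory, batch size, and pause length are hypothetical.

```shell
#!/bin/sh
# Sketch only: create empty files in batches, syncing and pausing between
# batches so dirty buffers can drain.

# Usage: make_files DIR TOTAL BATCH [PAUSE_SECONDS]
make_files() {
    dir="$1"; total="$2"; batch="$3"; pause="${4:-0}"
    mkdir -p "$dir" || return 1
    i=0
    while [ "$i" -lt "$total" ]; do
        : > "$dir/f$i"             # create one empty file
        i=$(( i + 1 ))
        if [ $(( i % batch )) -eq 0 ]; then
            sync                   # flush dirty data to disk
            sleep "$pause"         # pause writes before the next batch
        fi
    done
    sync                           # final flush for any partial batch
}
```

For example, `make_files /mnt/inner/test 1000000 10000 5` would sync and pause for five seconds after every 10,000 files. Whether this shortens a later `btrfs check` is exactly the open question; the sketch only makes the experiment repeatable.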
      
      
   [continued in next message]   
      