

   alt.os.development      Operating system development chatter      4,255 messages   


   Message 4,157 of 4,255   
   BGB to John Ames   
   Re: z/PDOS-generic   
   22 Jul 24 14:16:26   
   
   From: cr88192@gmail.com   
      
   On 7/22/2024 9:51 AM, John Ames wrote:   
   > On Fri, 19 Jul 2024 23:21:22 GMT   
   > scott@slp53.sl.home (Scott Lurndal) wrote:   
   >   
   >>    Poor performance, silly filename length limitations.   
   >   
   > I dunno, 8.3 is downright spacious compared to a number of actual   
   > mainframe operating systems...   
   >   
      
   Looking some, it seems:   
      MS-DOS: 8.3   
      Commodore: 16.0
      Apple ProDOS: 15.0
      Apple Macintosh: 31.0 (HFS)   
      Early Unix: 14 (~ N.M where N+M+1 <= 14)   
      
   Whereas TENEX and some others were 6 character.   
      OS4000: 8 character   
      RT-11, TOPS-10 (and others): 6.3 (early VAX/VMS was 9.3)
        It seems 6.3 was fairly common on DEC OS's.   
      
   Others:   
      ISO 9660: 30 (variable format, similar to Unix)
      UDF: 255   
      FAT32 and NTFS: 255 (UTF-16)
      EXT2/3/4: 255 (UTF-8)
      
   For most uses, a 32 character limit would probably be fine.   
      
      
   In many Apple systems, file type and similar was given in hidden
   metadata (kept in the directory entry, and on the Macintosh alongside
   a "resource fork") rather than encoded in the filename via a file
   extension or similar. This seems to be a bit of weirdness fairly
   specific to Apple systems.
      
      
   For an experimental filesystem design of mine (not used much as of yet),   
   I had used 48-character base names (sufficient "most of the time"), with   
   an optional encoding for longer names.   
      
   Names are basically free-form and follow Unix-like conventions, albeit
   with semi-mandatory file extensions more like in Windows land: binaries
   typically use '.exe' and '.dll' extensions. However, unlike in
   Unix-style shells, the file extension is not usually given when
   invoking a command; it is inferred when loading the program.
      
      
   However, it allows longer names using a scheme similar to FAT32 LFNs,
   just with names encoded as UTF-8. Otherwise, the design is roughly an
   intermediate between EXT2 and NTFS, while trying to avoid the sorts
   of needless complexity seen in NTFS. The LFNs can be omitted, in
   which case the name limit is 48 bytes as UTF-8.
      
      
   For directories, I went with organizing directory entries in an AVL tree:
      Typical directories are not big enough to justify the relative
        complexity of a B-Tree (unless aggregating the entire directory
        tree structure into a shared B-Tree).
      I had gone the route of using disk blocks to encode directories.
      Many directories are still big enough that linear search is
        undesirable.
      
   Hashed directory lookup seems to be popular, but I went with AVL here   
   (but, with balancing requirements relaxed to depth +/- 3 rather than +/-   
   1, to reduce the number of rotations needed).   
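
   As a rough illustration of the relaxed balancing (not the actual
   on-disk code, which is not shown in this post), here is a minimal
   in-memory sketch in C. The node layout, the int keys, and all function
   names are assumptions made for the example; a real version would
   compare 48-byte names and would presumably link entries stored in
   disk blocks rather than heap pointers.

```c
#include <assert.h>
#include <stdlib.h>

#define BAL_LIMIT 3   /* relaxed limit from the post; classic AVL uses 1 */

struct node {
    int key;              /* stand-in for the 48-byte name */
    int depth;            /* depth of the subtree rooted at this node */
    struct node *l, *r;
};

static int depth_of(const struct node *n) { return n ? n->depth : 0; }

static void fix_depth(struct node *n)
{
    int dl = depth_of(n->l), dr = depth_of(n->r);
    n->depth = 1 + (dl > dr ? dl : dr);
}

static struct node *rot_right(struct node *n)
{
    struct node *p = n->l;
    n->l = p->r;
    p->r = n;
    fix_depth(n);
    fix_depth(p);
    return p;
}

static struct node *rot_left(struct node *n)
{
    struct node *p = n->r;
    n->r = p->l;
    p->l = n;
    fix_depth(n);
    fix_depth(p);
    return p;
}

/* Rotate only once the depth difference exceeds BAL_LIMIT. */
static struct node *rebalance(struct node *n)
{
    int bal;
    fix_depth(n);
    bal = depth_of(n->l) - depth_of(n->r);
    if (bal > BAL_LIMIT) {
        if (depth_of(n->l->l) < depth_of(n->l->r))
            n->l = rot_left(n->l);      /* left-right case */
        return rot_right(n);
    }
    if (bal < -BAL_LIMIT) {
        if (depth_of(n->r->r) < depth_of(n->r->l))
            n->r = rot_right(n->r);     /* right-left case */
        return rot_left(n);
    }
    return n;
}

/* Insert a key (duplicates are ignored) and return the new subtree root. */
static struct node *insert(struct node *n, int key)
{
    if (!n) {
        n = calloc(1, sizeof *n);
        assert(n != NULL);
        n->key = key;
        n->depth = 1;
        return n;
    }
    if (key < n->key)
        n->l = insert(n->l, key);
    else if (key > n->key)
        n->r = insert(n->r, key);
    return rebalance(n);
}

/* Largest depth difference between siblings anywhere in the tree. */
static int max_imbalance(const struct node *n)
{
    int bal, a, b;
    if (!n)
        return 0;
    bal = abs(depth_of(n->l) - depth_of(n->r));
    a = max_imbalance(n->l);
    b = max_imbalance(n->r);
    if (a > bal) bal = a;
    if (b > bal) bal = b;
    return bal;
}
```

   Inserting keys in sorted order (the worst case for an unbalanced
   tree) should leave max_imbalance() at or below BAL_LIMIT, since a
   rotation is only triggered once a node drifts past +/- 3.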
      
      
   For directory lookups, generally the tree is walked using a specialized
   version of "strncmp()" over the 48 character base-name. Names are
   encoded as UTF-8, and the "strncmp()" variant is designed to always
   treat 'char' as unsigned (ISO C specifies this for the standard version
   as well, but a hand-rolled or non-conforming version could give
   different results based on the signedness of 'char' or other factors).
      
   Though, "memcmp()" could probably be used and would give the same   
   results here (with names NUL padded to 48 bytes as-needed).   
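
   A minimal sketch of such a comparison in C, assuming names are stored
   NUL padded in a 48-byte field as described above. The function names
   (name_pack, name_cmp) are made up for this example and are not taken
   from the actual implementation.

```c
#include <assert.h>
#include <string.h>

#define NAME_LEN 48   /* base-name field size from the post */

/* Copy a name into a fixed 48-byte field, NUL padded.
   Returns 0 on success, -1 if the name does not fit (a longer name
   would need the optional long-name encoding instead). */
static int name_pack(unsigned char dst[NAME_LEN], const char *src)
{
    size_t n;
    assert(dst != NULL && src != NULL);
    n = strlen(src);
    if (n > NAME_LEN)
        return -1;
    memset(dst, 0, NAME_LEN);
    memcpy(dst, src, n);
    return 0;
}

/* strncmp-like compare over the 48-byte field that always compares
   bytes as unsigned, so UTF-8 lead bytes (>= 0x80) sort after ASCII. */
static int name_cmp(const unsigned char *a, const unsigned char *b)
{
    for (int i = 0; i < NAME_LEN; i++) {
        if (a[i] != b[i])
            return (a[i] < b[i]) ? -1 : 1;
        if (a[i] == 0)
            break;          /* both names ended at the same position */
    }
    return 0;
}
```

   Because both fields are NUL padded to the full 48 bytes,
   memcmp(a, b, NAME_LEN) should give a result with the same sign as
   name_cmp(a, b).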
      
      
   As I saw it, fully variable length directory entries (like seen in EXT2)   
   are also undesirable.   
   So, in this case, directory entries are 64 bytes, with 48 bytes for the   
   name, and the rest for tree management data and holding inode index.   
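
   One hypothetical way to lay out such a 64-byte entry in C. Only the
   48-byte name field and the 64-byte total come from the description
   above; the remaining field names, their widths, and the use of 32-bit
   links are guesses made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FS_NAME_LEN 48

struct fs_dirent {
    uint8_t  name[FS_NAME_LEN]; /* UTF-8 base name, NUL padded */
    uint32_t inode;             /* index into the inode table */
    uint32_t left;              /* hypothetical: entry index of left child */
    uint32_t right;             /* hypothetical: entry index of right child */
    uint8_t  depth;             /* hypothetical: subtree depth for balancing */
    uint8_t  flags;             /* hypothetical: e.g. "has long name" */
    uint16_t reserved;          /* pads the entry out to 64 bytes */
};

/* Compile-time checks that the layout matches the stated sizes. */
static_assert(sizeof(struct fs_dirent) == 64, "dirent must be 64 bytes");
static_assert(offsetof(struct fs_dirent, inode) == FS_NAME_LEN,
              "name field comes first");

/* Fixed-size entries pack evenly into a block, so entry i of a block
   sits at byte offset i * sizeof(struct fs_dirent). */
static size_t fs_dirents_per_block(size_t block_size)
{
    return block_size / sizeof(struct fs_dirent);
}
```

   With fixed-size entries, a tree link can be a plain entry index
   instead of a byte offset, which is part of what variable-length
   entries would make harder.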
      
   Another major structure is the inode table, which:   
      Is semi-recursive: the inode table itself has an inode,
        and is allocated much like a file.
      Inodes are built from a tagged structure.   
        Partially inspired by NTFS.   
      Currently uses a block-allocation scheme similar to EXT2.   
        Small table of block indices:   
          Index 0..15: Points directly at target block.
          Index 16..23: One level of indirection.   
          Index 24..27: Two levels of indirection.   
          Index 28/29: Three levels of indirection.   
          Index 30: Four levels of indirection.   
          Index 31: Five levels of indirection.   
        Span-based allocation was a close second place.   
          The tagged inode structure could also allow for span-based files.   
          But, I went with an EXT2 like scheme for now.   
            Span based allocation would have been more complicated.   
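
   The slot layout above can be sketched in C as follows. The function
   names and the per_table parameter (number of block indices that fit
   in one indirect block) are assumptions; the post does not give a
   block size or index width. The block count is only an upper bound,
   since it ignores the unused ("shadowed") entries in the deeper tables.

```c
#include <assert.h>
#include <stdint.h>

/* Indirection depth for a slot of the 32-entry block table
   (0 means the slot points directly at a data block). */
static int slot_level(int slot)
{
    assert(slot >= 0 && slot < 32);
    if (slot < 16) return 0;    /* slots 0..15  */
    if (slot < 24) return 1;    /* slots 16..23 */
    if (slot < 28) return 2;    /* slots 24..27 */
    if (slot < 30) return 3;    /* slots 28, 29 */
    return slot - 26;           /* slot 30 -> 4, slot 31 -> 5 */
}

/* Upper bound on the data blocks reachable through one slot:
   per_table raised to the slot's indirection depth.
   Note: this can overflow uint64_t for very large per_table values. */
static uint64_t slot_max_blocks(int slot, uint64_t per_table)
{
    uint64_t n = 1;
    for (int i = slot_level(slot); i > 0; i--)
        n *= per_table;
    return n;
}
```

   For example, with a hypothetical 128 indices per indirect block, a
   two-level slot could reach at most 128 * 128 = 16384 data blocks.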
      
   The current implementation mostly assumes 512 byte inodes, but   
   technically it is variable.   
      
   In the block indirection tables, unlike EXT2, the lower-levels of   
   indirection have "shadowed" spaces in the higher levels of indirection.   
   This was mostly for the sake of simplicity (it seemed simpler to just waste
   some of the table entries than to go the EXT2 route). Theoretically, the   
   deeper tables could mirror the shallower tables, but this wasn't done in   
   the current implementation (easier to not bother).   
      
   As in EXT2 and similar filesystems, the first 16 inodes are
   currently special/reserved, and used mostly to encode filesystem
   metadata (inode table, inode bitmap, root directory, block bitmap, ...).
   One minor difference is that block numbering is relative to
   the start of the partition (so, for example, block 0 in this case is a
   NULL block, but technically the superblock exists at this location).
   Higher numbered inodes would be used for files and similar.
      
   For now, the special inodes are identified by magic index, unlike the   
   NTFS MFT which encodes a name for these special entries (maybe later   
   could add a "magic ID" tag or similar).   
      
   TODO might be to consider file compression. No immediate plans for   
   journaling support.   
      
      
      
   While a case could have been made for "just use EXT2 or similar", my
   main development system is Windows, so pretty much any choice (other
   than FAT32 or NTFS) is about the same level of hassle.
      
   So:   
      FAT32, mostly what I had ended up using thus far.   
        But, with some hacks to support things like symlinks and similar.   
      NTFS, possible.
        Main issue is that it has too much needless complexity.
      EXT2, mostly more sane than NTFS, but still some questionable choices.
      ExFAT, doesn't address the issues in my case.
        Basically FAT but with redesigned directories.
        Still patent encumbered.   
        (For FAT32 and the core of NTFS, patents have expired).   
      
      
   Thus far, I have been using FAT32, but the cruft needed to add things
   like symlinks and similar on top of it is ugly.
      
   ...   
      



(c) 1994,  bbs@darkrealms.ca