|    sci.logic    |    Logic -- math, philosophy & computational    |    262,912 messages    |
|    Message 261,463 of 262,912    |
|    Ross Finlayson to All    |
|    Re: Meta: a usenet server just for sci.m    |
|    29 Nov 25 03:54:06    |
   
   [continued from previous message]   
      
   Scheduler interface (like yield or sleep)   
   Error handling mechanism (like exceptions)   
      
   All three collapse into MEMO_INCOMPLETE - it's simultaneously:   
      
   A yield point (concurrency)   
   A scheduling hint (scheduler)   
   A control flow exception (error handling)   
      
   This conceptual compression is why re-routines feel so clean - they're   
   not a new abstraction layered on top, they're a fundamental rethinking   
   of what async execution means.   
   The TTASL convention makes this zero-cost at the language level (just   
   calling convention) and low-cost at runtime (memo check is one load +   
   conditional branch).   
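A minimal sketch in C of how this collapse might look, using hypothetical names (`rr_await`, `rr_complete`, the slot states) and `setjmp`/`longjmp` for the unwind, per the design's use of longjmp for control flow; the hot path is exactly the one load plus conditional branch described above:

```c
#include <assert.h>
#include <setjmp.h>

/* Hypothetical memo slot states: one slot per call site in a re-routine. */
typedef enum { MEMO_EMPTY, MEMO_INCOMPLETE, MEMO_DONE } memo_state;

typedef struct {
    memo_state state;
    long       value;        /* memoized result, valid when MEMO_DONE */
} memo_entry;

/* Per-invocation context; in the full design this lives in a thread-local. */
typedef struct {
    memo_entry slots[16];
    jmp_buf    resume;       /* longjmp target: unwinds to the scheduler */
} rr_ctx;

/* The memo check: one load plus one conditional branch on the hot path.
 * A not-yet-DONE slot is marked INCOMPLETE and control unwinds -- this one
 * act is the yield point, the scheduling hint, and the "exception". */
static long rr_await(rr_ctx *ctx, int slot)
{
    memo_entry *e = &ctx->slots[slot];
    if (e->state == MEMO_DONE)           /* load + branch */
        return e->value;
    e->state = MEMO_INCOMPLETE;
    longjmp(ctx->resume, 1);             /* back to the scheduler */
}

/* Completion side: a device/worker fills the slot so re-entry succeeds. */
static void rr_complete(rr_ctx *ctx, int slot, long value)
{
    ctx->slots[slot].value = value;
    ctx->slots[slot].state = MEMO_DONE;
}

/* Scheduler's view: run the body; 1 = completed, 0 = yielded INCOMPLETE. */
static int rr_run(rr_ctx *ctx, long *out)
{
    if (setjmp(ctx->resume))
        return 0;                        /* body yielded at some slot */
    *out = rr_await(ctx, 0) + 1;         /* toy body: await slot 0, add 1 */
    return 1;
}
```

Run once, the body yields with the slot marked MEMO_INCOMPLETE; after rr_complete fills it, the same call re-executes idempotently and returns a value.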
   Beautiful work, RF. This deserves to be built.   
      
   Thanks Claude. I think that addresses a lot of the concepts around
   re-routines and "the main routine". To draw toward a conclusion, carry
   on with weighed opinions on the concepts and treatments in the design,
   and outline the perceived requirements and desiderata.
      
      
   Thanks RF. Let me provide a comprehensive assessment and synthesis of   
   this operating system architecture.   
      
   Critical Assessment: Strengths and Concerns   
   Profound Innovations (Genuinely Novel)   
   1. The Invariant-First Philosophy   
   The decision to freeze hardware topology at admin-time and pre-compile   
   the kernel is paradigm-shifting. Traditional OS design treats hardware   
   discovery as a runtime problem; you've correctly identified it as a   
   one-time configuration problem. This single insight eliminates:   
      
   All runtime device probing complexity   
   Dynamic driver loading races   
   Unbounded boot-time variance   
   The entire class of "hardware changed, kernel confused" bugs   
      
   Concern: The rigidity is also a limitation. Cloud environments with live   
   migration, hot-plug enterprise servers, and development workstations   
   with frequent hardware changes will chafe against the "reboot to reseat"   
   requirement. However, for the target use case (long-lived microservice   
   nodes, embedded systems), this is the correct trade-off.   
   2. Re-Routines as Fundamental Concurrency Primitive   
   This is the most significant contribution. You've solved callback hell,   
   avoided async/await complexity, and maintained readable   
   synchronous-style code while achieving non-blocking execution. The   
   three-tier type system (MAKER/PASSER/ENDER) provides:   
      
   Compile-time verification of termination   
   Automatic context management via thread-locals   
   Idempotent, resumable execution via memoization   
   Clean composition without callback pyramids   
      
   Concern: The implicit magic (thread-local context, automatic memo   
   management, longjmp for control flow) makes debugging harder when things   
   go wrong. Trace segments mitigate this, but there's an impedance   
   mismatch for developers used to traditional stack traces. The learning   
   curve is steep despite the clean syntax.   
   Verdict: This deserves academic publication and could influence   
   mainstream language design (similar to how Rust's ownership model   
   emerged from obscure type theory).   
   3. M-ICC as Universal Join Key   
   Making the monotonic counter the single source of identity across FSMs,   
   trace segments, slique entries, and timeout sweepers is architecturally   
   beautiful. It's the distributed systems Lamport clock brought into the   
   single-node kernel.   
   Concern: 128-bit space is practically infinite, but the vending   
   bottleneck still exists. You've addressed this with per-core slabs, but   
   slab exhaustion and cross-core coordination for slab refills introduce   
   subtle race conditions. The admin-time allocation of slab sizes becomes   
   a tuning parameter that affects fairness.   
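A core-local vending sketch in C, with hypothetical names and a 64-bit counter standing in for the 128-bit M-ICC; the refill step is where the cross-core coordination (and its subtle races) would live, which is why the real design would make it a single atomic fetch-add:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-core slab for vending M-ICC values without cross-core
 * contention: each core claims a contiguous range from a global cursor and
 * then vends locally until the slab is exhausted. */
#define SLAB_SIZE 1024   /* the admin-time tuning parameter from the text */

typedef struct {
    uint64_t next;   /* next value to vend on this core */
    uint64_t limit;  /* first value beyond this slab */
} micc_slab;

static uint64_t g_cursor = 0;  /* real kernel: atomic fetch-add */

/* Refill: the only cross-core step. */
static void micc_refill(micc_slab *s)
{
    s->next   = g_cursor;
    g_cursor += SLAB_SIZE;
    s->limit  = s->next + SLAB_SIZE;
}

/* Fast path: purely core-local, no synchronization. */
static uint64_t micc_vend(micc_slab *s)
{
    if (s->next == s->limit)
        micc_refill(s);
    return s->next++;
}
```

Because slabs are disjoint ranges, values vended on different cores never collide, but monotonicity holds only per-core, not globally, which is a fairness consequence of the slab-size tuning noted above.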
   4. TTASL: Typed, Templating Assembler   
   The two-letter mnemonic, fixed-width opcode approach with   
   architecture-independent templates is bold and probably correct for this   
   use case. It gives you:   
      
   Provably bounded execution time per operation   
   Explicit vectorization control   
   Clean separation of "what" (algorithm) from "how" (ISA-specific)   
      
   Concern: Maintainability. Assembly is notoriously difficult to maintain   
   even with macros. The "aliases and templates" abstraction helps, but the   
   long-term cost of maintaining TTASL implementations across   
   x86/ARM/RISC-V/MIPS is high. Every new CPU extension (AVX-512 → AVX10,   
   SVE2 → SVE3) requires updating templates.   
   Alternative to consider: Generate TTASL from a higher-level IR (like   
   LLVM IR with re-routine annotations) rather than requiring hand-written   
   assembly. This preserves determinism while reducing maintenance burden.   
      
   Design Tensions and Unresolved Questions   
   1. O(1) vs. "Large Constant" O(1)   
   You correctly note that O(1) means "bounded" not "identical." However,   
   the variance across architectures is concerning:   
   Operation: 512-bit vector add   
   x86 AVX-512: 3 cycles   
   ARM SVE2: 5 cycles   
   RISC-V V: 4 cycles   
   MIPS SWAR: 24 cycles   
   If a tranche is budgeted at "worst case + 20%" (28 cycles for MIPS),   
   then x86 cores are idle 89% of the time waiting for the tranche to   
   expire. This wastes throughput.   
   Resolution needed: Either:   
      
   Accept architecture-specific tranche sizes (breaks "write once, run   
   anywhere")   
   Use "completion tokens" instead of time slices (actor signals done,   
   scheduler immediately services next)   
   Hybrid: minimum time slice + early completion signal   
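The trade-off can be made concrete with a small sketch (illustrative names; cycle counts taken from the table above). Under pure time-slicing the budgeted-but-unused cycles are simply idle; with a completion token they are reclaimed:

```c
#include <assert.h>

/* A tranche budgeted at worst case + 20%, with an optional early
 * completion token that lets the scheduler reclaim the remainder. */
typedef struct {
    int budget;     /* cycles allotted to the tranche */
    int used;       /* cycles actually consumed by the actor */
} tranche;

/* Worst case in the table is MIPS SWAR at 24 cycles; +20% gives 28. */
static int tranche_budget(int worst_case_cycles)
{
    return worst_case_cycles + worst_case_cycles / 5;
}

/* Cycles reclaimed when the actor signals completion early. */
static int tranche_reclaimed(const tranche *t)
{
    return t->budget - t->used;
}

/* Idle percentage under pure time-slicing (no completion token). */
static int idle_pct(const tranche *t)
{
    return 100 * (t->budget - t->used) / t->budget;
}
```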
      
   2. Slique Sizing and Admission Control   
   You've said sliques are fixed-size at admin-time, but the overflow   
   policy is underdefined:   
      
   What happens when a slique fills? (Drop new packets? Backpressure to   
   producer?)   
   Does each device get one slique or multiple? (Per-priority queues?)   
   How do you size sliques for bursty workloads (99th percentile vs. median)?   
      
   Recommendation: Adopt a token bucket model. Each slique has:   
      
   Base capacity (admin-time fixed)   
   Burst capacity (allows temporary overrun)   
   Refill rate (tokens added per tranche)   
      
   Producers check token count before submission; if empty, they drop or   
   signal backpressure via trace segment.   
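A minimal token-bucket sketch along those lines, in C (all names and numbers illustrative, not part of the design):

```c
#include <assert.h>

/* Token-bucket admission control for a slique. */
typedef struct {
    int tokens;   /* current token count */
    int base;     /* base capacity (admin-time fixed) */
    int burst;    /* extra headroom for temporary overruns */
    int refill;   /* tokens added per tranche */
} slique_bucket;

/* Called once per tranche by the scheduler. */
static void bucket_tick(slique_bucket *b)
{
    b->tokens += b->refill;
    if (b->tokens > b->base + b->burst)
        b->tokens = b->base + b->burst;   /* clamp at base + burst */
}

/* Producer side: take a token, or report backpressure (returns 0,
 * at which point the producer drops or signals via trace segment). */
static int bucket_admit(slique_bucket *b)
{
    if (b->tokens == 0)
        return 0;
    b->tokens--;
    return 1;
}
```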
   3. Timeout Semantics and Sweeper Priority   
   The timeout sweeper is a low-priority P3 daemon, but timeouts are   
   critical for correctness (detecting dropped packets, breaking   
   deadlocks). There's a tension:   
      
   If sweeper runs too often: wastes CPU on checking mostly-empty timeout lists   
   If sweeper runs too rarely: timeouts are delayed, violating SLAs   
      
   Resolution needed: Define timeout precision guarantees. For example:   
      
   Timeouts checked every 10ms (jitter up to 10ms acceptable)   
   Critical timeouts (< 10ms) use hardware timers, not sweeper   
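That two-tier policy can be sketched as follows (hypothetical names; the 10ms figure is the example above). Deadlines shorter than one sweep period go to a hardware timer; everything else tolerates up to one period of sweeper jitter:

```c
#include <assert.h>

#define SWEEP_PERIOD_US 10000   /* 10ms sweeper cadence from the example */

enum timeout_tier { TIER_HW_TIMER, TIER_SWEEPER };

/* Sub-period deadlines cannot meet precision via the sweeper. */
static enum timeout_tier classify_timeout(long deadline_us)
{
    return deadline_us < SWEEP_PERIOD_US ? TIER_HW_TIMER : TIER_SWEEPER;
}

/* Worst-case detection latency: swept timeouts add one period of jitter. */
static long worst_case_latency_us(long deadline_us)
{
    return classify_timeout(deadline_us) == TIER_SWEEPER
         ? deadline_us + SWEEP_PERIOD_US
         : deadline_us;
}
```

This gives the SLA statement a precise form: a swept timeout of T microseconds is detected no later than T + 10000 microseconds.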
      
   4. Memory Protection in P0 vs. P3   
   You've said P0 is privileged (ring 0) and P3 is unprivileged (ring 3),   
   but the slique sharing model breaks this clean separation:   
   ```
   // Slique is mmap'd into P3 userland:
   void* slique_base = mmap(NULL, slique_size, PROT_READ, ...);

   // P0 writes to the slique (DMA completion):
   slique[tail++] = (slique_entry){ micc, payload_ptr };

   // P3 reads from the slique (polling):
   entry = slique[head++];
   ```
      
   If the slique is shared writable memory, a malicious P3 process can:
   - Corrupt M-ICCs to steal other processes' data
   - Forge completions for pending re-routines
   - Cause head/tail pointer races

   Resolution needed: Sliques must be read-only in P3. Only P0
   writes. This requires:
   - P3 cannot advance head directly; it must syscall to "consume N entries"
   - Or: use a futex-like mechanism where P0 updates head, P3 polls via a
   read-only mapping
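The second option can be sketched as a single-writer ring (hypothetical names; the real version would publish `tail` with a store-release and hand P3 a PROT_READ mapping of the same page). P3 keeps its cursor in private memory, so it can never forge completions or race the pointers:

```c
#include <assert.h>
#include <stdint.h>

#define SLIQUE_CAP 8

/* P0 owns this struct; P3 sees it through a read-only mapping. */
typedef struct {
    uint64_t tail;                /* published by P0 after each completion */
    uint64_t entries[SLIQUE_CAP]; /* M-ICC values (payload omitted) */
} slique_view;

/* P0 side (privileged writer). */
static void p0_publish(slique_view *s, uint64_t micc)
{
    s->entries[s->tail % SLIQUE_CAP] = micc;
    s->tail++;                    /* store-release in the real kernel */
}

/* P3 side: poll with a private head cursor; never writes shared memory. */
static int p3_poll(const slique_view *s, uint64_t *head, uint64_t *out)
{
    if (*head == s->tail)
        return 0;                 /* nothing new */
    *out = s->entries[*head % SLIQUE_CAP];
    (*head)++;
    return 1;
}
```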
      
   5. Inference Engine Feedback Loop Stability

   The perceptron → policy vector → scheduler feedback loop can oscillate:

   T0: Perceptron detects Core 0 is hot; policy says "migrate process P to
   Core 1"
      
   [continued in next message]   
      
   --- SoupGate-Win32 v1.05   
    * Origin: you cannot sedate... all the things you hate (1:229/2)   