muta...@gmail.com wrote:   
   > On Friday, July 23, 2021 at 5:32:07 AM UTC+10, anti...@math.uni.wroc.pl   
   wrote:   
   >   
   > > That has at least 3 different solutions: relocation (base)   
   > > register, paging and position independent code. Note that   
   >   
   > Thanks for explaining the underlying theory to me!!!   
   >   
   > > the relocation register is set up by the operating system, so you   
   > > can enlarge it without changing the application (of course the   
   > > operating system must know (possibly compute) the size of the   
   > > relocation register). For many years paging has been the preferred   
   > > method to run multiple applications.   
   >   
   > Ok. But that wasn't available on the 8086. Maybe there was   
   > some way "fake paging" could have been created. Regardless,   
   > we ended up with the "relocation (base) register" option, which   
   > is presumably what a segment register is considered to be.   
      
   There is an important difference: segment registers on the 8086   
   can be treated as "just another register"; in particular, a   
   program may use them to store arbitrary 16-bit numbers   
   (and some did so). A relocation register is handled by the   
   OS. Since the relocation register is managed by the OS, the OS   
   can reliably move programs, swap them out and load them in a   
   different place, etc. This is not possible on the 8086.   
      
   > > > As opposed to what alternative for large memory model   
   > > > 8086 programs? There's some advantage to restricting   
   > > > them to 1 MiB instead of letting them fly on an 80386?   
   >   
   > > The simple fact is that a program which has more than 1M of code   
   > > will not run on a 1M machine.   
   >   
   > This is true, but I would have stopped right here. There is   
   > nothing special about 1 M. The problem could be restated   
   > as 2 M instead. The proper thing to do is abstract the   
   > situation right here.   
      
   The abstract result is: to address M locations in memory   
   you need at least n bits, where n is the smallest number such   
   that 2 to the power n is greater than or equal to M. There is   
   no way to avoid this limitation. What you can do is hide the   
   extra bits in some place. The Polish K-202 micro used 16 "normal"   
   address bits, but had an extra register providing 8 upper   
   bits. The advantage was that the creator of the K-202 could   
   boast about addressing 16M of memory; otherwise this was   
   basically useless (the base configuration had 4k of memory).   
      
   PCs for many years used 16-bit DMA chips. That is, the DMA   
   chip managed the 16 low address lines. The upper bits were   
   provided by a separate register (4-bit for the 8086, 8-bit   
   for the 286, possibly bigger later). This was a major   
   pain in the OS, as the OS had to allocate buffers that   
   avoided crossing 64k boundaries. For example, to have   
   decent floppy support Linux at startup allocated a   
   buffer for a single track (later it would be tricky   
   to find a suitable piece of memory). Those problems   
   were resolved by better DMA in the PCI era...   
      
   > > More generally, if a program   
   > > really needs more than 1M of data, it will not run on a 1M   
   > > machine. In such a case, why bother with segments?   
   >   
   > Because the Norks may produce an 8086+ with 5-bit   
   > segment shifts giving you 2 M, tomorrow. For no change   
   > whatsoever to the application program.   
   >   
   > > What remain are programs that are small enough to fit   
   > > in 1M and do some useful work and which can also do   
   > > useful work on larger datasets. IME a significant percentage   
   > > of such programs involved a largish array; they would fail   
   > > with large data in the large model.   
   >   
   > There is a more significant percentage that WORK with   
   > the large memory model, which is why they WORK, even   
   > with 1 M.   
   >   
   > > At best you could   
   > > use the huge model, at the cost of a significant slowdown.   
   >   
   > Huge is very rare. Turbo C++, a very popular compiler,   
   > doesn't even generate suitable code.   
   >   
   > > So you deal with a small class of programs. And apparently   
   > > you did not realize that if you _have to_ deal with   
   > > segments (say for compatibility with the 8086), then what   
   > > the 286 and 386 did is much better.   
   >   
   > I don't know what you are talking about. The usage of   
   > segments that I described seems to be the most   
   > appropriate solution.   
   >   
   > Are you talking about effectively having two executables   
   > combined into one? I don't consider that to be superior   
   > to the design I outlined.   
      
   No. I mean that the 286 gets rid of the concept of "segment   
   shift" and allows placing the segment origin at an arbitrary place.   
      
   Concerning "two executables combined into one": the actual   
   application code was one. But clearly, some low-level   
   issues depended on the processor. In an ideal world such   
   dependencies would be handled by the OS. But DOS did not   
   handle them. So programs bundled an extension providing the   
   needed OS support. Or if you prefer, you can say that the   
   program bundled its own OS. Note that "bundling"   
   was purely for the convenience of users and distributors:   
   instead of loading a separate "DOS extender" users   
   dealt with a single program.   
      
   And if you go beyond DOS, early Windows contained   
   appropriate support so that the same application could   
   run in real mode or protected 16-bit mode.   
      
   > > [S/360]   
   > > In instruction set, IBM missed PC relative instructions   
   >   
   > Is that really a problem? I don't see anything particularly   
   > wrong with the assembler generated by GCC.   
      
   If you ignore practical issues like compiler complexity   
   and program size, then you do not need PC-relative   
   addressing. Ignoring such issues may get you TSS-360.   
   Lynn Wheeler (who frequently posts in alt.folklore.computers)   
   wrote about a benchmark comparing TSS-360 and CP-67 on   
   the same machine: IIRC TSS-360 could handle 4 simulated   
   users, CP-67 could handle 40. Of course, both are   
   complex programs and there may be many reasons for the   
   difference. But one thing stands out: CP-67 depended   
   on paging. TSS-360 used position-independent code.   
   This is similar to what Linux ELF is doing now. In   
   64-bit Linux position-independent code leads to a few percent   
   slowdown (in 32-bit mode it is worse, but should not   
   exceed 20%). But because of the lack of PC-relative   
   addressing and IBM program conventions, TSS code was   
   reported as much slower than regular 360 code.   
      
   And regarding compilers, gcc has a not so trivial   
   part handling reloading of base registers on the 370.   
   I discovered a bug in this code: when a procedure   
   was big enough this code got confused and the base   
   registers were wrong. I also provided a fix, but the   
   IBM folks fixed it in their own way and the IBM   
   version is included in the distributed gcc.   
   Anyway, for a different architecture the compiler   
   would be smaller and easier to write.   
      
   > > and usefulness of larger immediates.   
   >   
   > Ditto. Yes, you can pile on loads of features, but are they   
   > really necessary? To achieve what purpose? You think   
   > S/360 applications would be 10% faster if IBM had thought   
   > of this? 50% faster?   
      
   It is hard to quantify separate features. In the case of   
   TSS-360 all of them together made a 10-times difference. One   
   contributing factor was that TSS code was bigger; later reports   
   claim that TSS gained a lot of speed by going to bigger memory.   
   More memory meant less paging, but smaller code   
   [continued in next message]   
      
   --- SoupGate-Win32 v1.05   
    * Origin: you cannot sedate... all the things you hate (1:229/2)   