Doubt it, since the modern x86 is actually a RISC processor that runs microcode that emulates the old CISC hardware you're talking about.
This is kinda correct, but unfortunately it has nothing to do with the plethora of memory addressing modes the x86 arch provides. On x86, memory is managed and accessed by the MMU, not by the instruction pipeline.
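Just so we are talking about the same thing, this is the family of addressing modes I mean (NASM-style syntax, purely illustrative):

    ; effective address = base + index*scale + displacement
    mov eax, [ebx]                ; register indirect
    mov eax, [ebx + 8]            ; base + displacement
    mov eax, [ebx + esi*4]        ; base + scaled index (array striding)
    mov eax, [ebx + esi*4 + 12]   ; base + scaled index + displacement
    mov eax, [0x0040309C]         ; plain absolute address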
This used to be really hard to explain to the general public, but nowadays, with the proliferation of smartphone and Pi-style SoCs (Systems on Chip), it should not be alien to you to think of CPUs, especially x86 ones, as SoCs - which they indeed are.
Case in point: the PIC and APIC (x86's programmable interrupt controllers) used to be separate chips on the mobo (up to the 486/Pentium realm, if I remember correctly; the APIC arrived with the Pentium-era multi-processor setups (usually 2 CPUs then) and was still a separate part on the mobo). These days the APIC is a core part of the CPU, and per core(!) (pun intended): if you have an 8-core die, it has 8 APICs embedded - just think about it. The APIC does not care about the CPU's addressing mode; it works the same whether it lives in the CPU or on the mobo, only now it is part of each core.
Same as the APIC, the x86 MMU does not care about the other parts of the processor. In fact the modern x86 MMU and address-generation machinery form their own unit, with their own ALU (an arithmetic/logic number cruncher dedicated to addresses), at least since the 486 (maybe even the 386; LEA itself has been there since the 8086, by the way).
Naturally it's "ALU powers" are a bit weird, because given "ALU" is intended for computing memory addresses (shifts, offsets, array striding and array addressing) and not general math. Even then, there were dudes who came up with a way to use this internal MMU's ALU for "fucking numeric computations" by the means of abusing aforementioned LEA instruction (Load Effective Address) ie. they instructed CPU's MMU to compute "memory address" but they treated the input and output of address as purely numerical data. Because MMU ALU was much simpler, meaner and leaner, this allowed wizards to do certain(!) computations on it faster than on that one, actual, main integer ALU, right on die (and that would be per core, these days). People are weird beasts, search about it, its really interesting story of insanity. I bet my 2 cents those LEA tricks work the same even these days, same as the day when they were first discovered.
I don't remember it being used for anything real back then, besides curiosity, but such code is definitely out there somewhere in the world serving some insane purpose right now - in fact, compilers routinely emit LEA for plain arithmetic these days, so in a way the trick went mainstream.
Finally, x86s became RISCy CISCs around the time of the Pentium Pro and Pentium II/III, and then the Pentium 4. We who lived back then all know what an utter POS failure of RISC wanking that Pentium 4 was. Anyway, all three of them booted in real mode and ran real-mode DOS just fine. So I hope you see why your RISCy-CISCy argument is (correctly) invalid: it is completely orthogonal to the question of x86 addressing modes. Also, this RISCy-CISCy thing is not implemented only in microcode; there are other physical dependencies - your view is very neat and too simplistic, and this is the dark and murky reality we are talking about. But again, x86 addressing has nothing to do with the RISC/CISC matter at hand.
Such is simply the nature of the CPU business. Once you ship something you can never go back, because you never know who has started using it. It's the same with syscalls in OS kernels: those too are words you can never take back. So once something works, it must keep working the same way ad infinitum.
Now, about x86's bitness: yes, 16 (real), 32 (protected), 64 (long) is effectively an input parameter to the instruction decoder, and it is "lifted" from the "segment" (a useless concept to many now) - but above real mode, segments really are just selectors into a segment descriptor table. The actual segment definition in the in-memory table is much, much longer than any segment register on x86, yet, as stated in the documentation, it is loaded into internal segment registers/caches (and the MMU and the respective CPU units) and then cached in those components. This comes straight from the Intel documentation. That means the instruction decoder knows whether it is decoding 16-, 32- or 64-bit code independently of the other CPU/SoC "components", by caching that value inside itself whenever it is re-configured. Depending on that value, all that RISCy-CISCy business of yours properly adjusts itself. Again, there is no mutual exclusion between RISCy-CISCy and addressing mode.
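To make the selector/descriptor split concrete, here is roughly what a flat 4 GB data descriptor looks like in a GDT (NASM syntax; the labels are mine, the field layout follows the Intel docs). The 16-bit segment register only ever holds the selector (here 0x08); everything else lives in this 8-byte table entry and gets pulled into the CPU's hidden segment cache when the register is loaded:

    gdt_start:
        dq 0                        ; mandatory null descriptor (selector 0x00)
    flat_data:                      ; selector 0x08: data, base 0, limit 4 GB
        dw 0xFFFF                   ; limit, bits 0..15
        dw 0x0000                   ; base, bits 0..15
        db 0x00                     ; base, bits 16..23
        db 10010010b                ; present, ring 0, data segment, writable
        db 11001111b                ; G=1 (4 KB granularity), 32-bit, limit bits 16..19
        db 0x00                     ; base, bits 24..31
    gdt_end:

    gdt_descriptor:
        dw gdt_end - gdt_start - 1  ; GDT limit
        dd gdt_start                ; GDT base (linear address)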
Exactly this design is what allowed for the so-called UNREAL mode: you jumped into PMODE, set the segment register limits (best to 4 GB, i.e. the 32-bit max), and fell back into 16-bit mode. Because segment registers in RMODE are not real segment selectors, just 16-bit shifters added to the 16-bit offset registers, after the return from PMODE the internal segment machinery, including the MMU, remained set with the PMODE limits and kept working that way. No pixie dust either. This worked 100% on Cyrix, AMD, VIA and Intel, so it's a fundamental property of "real" x86 architecture.
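Here is a minimal sketch of that dance (NASM, 16-bit real-mode code), assuming the flat 4 GB data descriptor from the sketch above sits at selector 0x08 and that A20 is already enabled. Real code adds a prefetch-flushing jump and more care; treat this as an illustration, not a recipe:

    [bits 16]
        cli
        lgdt [gdt_descriptor]       ; point the CPU at our little GDT (kept in low memory)
        mov  eax, cr0
        or   al, 1                  ; set PE -> protected mode
        mov  cr0, eax
        mov  bx, 0x08               ; selector of the flat 4 GB data descriptor
        mov  ds, bx                 ; hidden part of DS now caches limit = 4 GB
        mov  es, bx
        and  al, 0xFE               ; clear PE -> back to real mode
        mov  cr0, eax
        xor  bx, bx                 ; real-mode segment loads update only the base...
        mov  ds, bx                 ; ...the cached 4 GB limit silently stays
        mov  es, bx
        sti
        mov  eax, 0x00200000        ; 2 MB - way past the 1 MB real-mode ceiling
        mov  byte [eax], 0xAA       ; faults in plain real mode, just works under UNREAL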
There was something insanely cool and magical about seeing most* of DOS working normally (mostly the IO, which was what you cared about most - at least that worked), yet being able to reach anywhere between 0 and 4 GB with a value in EAX anytime you wanted. It made some of us, teenagers at the time, extremely "horny". I remember doing that shit with tutorials from the internet (dial-up, saved onto a floppy at the library) with Turbo Pascal and QBASIC/QuickBasic. Aah, happy times.
* - on that note: yes, under UNREAL all the software that relied on automatic segment wraparound imploded at said wraparound spot (which stopped happening, because the limit was now 4 GB instead of 64 KB), but in practice, given DOS's strictly single-user/single-program nature, it turned out to be a smaller problem than people thought it would be.
Unfortunately, with the 386, Virtual 8086 mode was introduced (used heavily later by Windows 3.x/9x and similar "OS" software). This was the first attempt at "virtualization" in x86 space, and a very limited one at that. In Intel tradition it was quite incomplete and fucked up too, like the first PMODE introduction on the 286. It properly emulated only bare 8086 software (so no PMODE/UNREAL), which broke plenty of advanced DOS applications.
There were protocols to run PMODE software (i.e. Doom/Duke/Blood) alongside Windows 9x when it was lifted from DOS (through DPMI), but that did not emulate UNREAL mode, causing the extinction of a whole realm of cool UNREAL-based software, often specialty software like 3D Studio (before MAX), some Ultima games, and a few others.
For example, DOSBox avoids all these shenanigans by emulating a whole 486/Pentium in software(!) - including our beloved UNREAL!!!
By now you can guess I was very anti-Windows back then: we only had one family computer, and as a hardcore DOS-UNREALIST/PMODIST I was democratically overruled by my mother and younger brother into shitty Windows 95. By force.
Now, why am I talking about all this in such detail? As you can see, I am a bit passionate about these things, so from time to time I check the developments, and as far as I know UNREAL still works on the latest Intels (not sure about AMDs, but 99% probably there too).
Last time I checked, a few people had tried "long unreal", i.e. 64-bit UNREAL, but this time AMD did a really good job (with 64-bit x86), so the hardware barrier between RMODE, PMODE, UNREAL and LONG is, unfortunately, airtight. There is simply no way to manipulate the CPU's internal units and have them remain set differently across the barrier jump - the jump from other modes into LONG is complete, and it resets all the affected CPU sub-components to "correct" values when you go through the hyper-jump.
This is a really sad development in my eyes: it ultimately took all the joy out of the x86 platform. This act also "closes" the era of great joyful hobbyist exploration and evolution, of great inventors of software like PKZIP, of the initial cambrian explosion of ISVs and the era of "inventive PC software" of the 90s. You see, in DOS you really are the ultimate master of the machine. It's hidden from the plebs by the command interpreter, but it's true. You are completely free, as in real American-eagle-style freedom. You are the GOD. It was a blind, under-thought experiment on IBM's part that allowed this freedom to happen by chance, and it gave rise to software "inventors" and DOS - which in turn gave rise to the corruption that is the Microsoft corporation. And Microsoft is smart: they knew that users with real freedom could easily displace them, so they did everything in their might to take over the freedom that was given and make themselves indispensable in the process. Freedom was closed. Freedom eventually won, overwhelmingly, but in a completely different way. Since the introduction of the 64-bit barrier, the old-school DOS-style freedom is dead, and a new 64-bit freedom, Linux, arrived - but that is a different kind of freedom.
This LONG mode barrier is also what killed 64-bit DOS, by the way, and officially made classic DOS a really dead OS. This is why you'll never see a 64-bit DOS, be it FreeDOS or something else, on a "normal" (Intel, AMD, VIA) x86 platform.
To locate the relevant spacetime in the CPU time machine's warp table: this is the post-Athlon epoch, circa 2003, so it happened almost 20 years ago. I guess many people reading this board were not even born back then. That is the spot where the "DOS war" was lost.
After the introduction of the LONG barrier, the USER is not the GOD anymore; the OS is the new GOD. At least not in any simple way (you can still raw-boot into 64-bit by hand - but you'll end up without OS services, and we are so addicted to those now that it's a very tough proposition).
And then there's all the other shit they added on top, such as the negative CPU rings and management engine hardware.
You are not entirely wrong, but you are fundamentally mistaken. I bet from the previous paragraphs you are starting to get the hang of the CPU business, and especially the specialty CPU business that the shitty x86 platform is.
While Microsoft, of the Wintel coalition, is slowly killing off DOS and 32-bit Windows, they are very slow at it, and it seems they don't have as much power over it as they used to have. The world is a strange place (no PlayStation pun intended).
Analyze this:
E.g. NTVDM, the V8086 component in 32-bit Windows kernels that provided the shitty virtual DOS machines we talked about above - when did it disappear? As far as I know it was simply never carried over to the 64-bit editions at all, while the 32-bit builds kept it to the end. Sorry, I haven't used Windows since the last and best Windows ever - XP - so I have no clue about the Windows minutiae drama and when exactly it happened, but I remember reading articles about the removal, and the overall sentiment was kinda sour. V8086 does not work directly in LONG mode; you have to step down to PMODE and run V8086 from there. Too much of a hack for 64-bit OSes in Microsoft's eyes.
Even though Microsoft removed access to the feature and DOSBox naturally took over, CPU cores still express this hardware circuitry in PMODE even in 2023. So although Microsoft killed it, it's not like CPU manufacturers will now remove it. Remember, from the CPU manufacturer's POV it's part of the platform forever - even when vestigial.
DOS-style booting was not supposed to work since the introduction of native 64-bit UEFI, thus on anything post-2003 (let's give UEFI developers a chance to catch up: so anything post-2008) - yet as the video proves, UEFI and mobo manufacturers still bundle the respective compatibility modules into their mobos' UEFI installs in 2023 as well (you probably won't find INT 13h emulation in 100 EUR toilet-paper-grade Intel notebooks and tablets, though).
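For reference, this is the kind of ancient service that compatibility layer keeps alive - a real-mode BIOS disk read through INT 13h (NASM syntax; the values and the disk_error label are just illustrative, and ES is assumed to already point at the buffer segment):

        mov ah, 0x02        ; BIOS function: read sectors (CHS)
        mov al, 1           ; read one sector
        mov ch, 0           ; cylinder 0
        mov cl, 1           ; sector 1 (sectors are 1-based)
        mov dh, 0           ; head 0
        mov dl, 0x80        ; drive: first hard disk
        mov bx, 0x7E00      ; ES:BX = destination buffer
        int 0x13            ; call the (emulated) BIOS disk service
        jc  disk_error      ; carry set = the read failed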
What younger people don't understand about CPUs is how the actual CPU production process works: there is nothing in the 486 that is somehow "obsolete" today. The 486 is a perfectly viable processor, even these days.
You can express a Pentium from 1993 today the same way. That is actually what the Intel Quark was: Quark was literally a mid-90s Pentium-class core, expressed in 32 nanometers with a 400 MHz clock. Think about it: these ran naturally at around 100-133 MHz when released and came in packages 4-5 cm on a side, on I-don't-even-know-how-many micrometers. Now it fits into a sub-5-millimeter package in the photo on Wikipedia. Imagine how that very same CPU would fit into a modern 5 nm process; it would probably fit a 1x1 mm package. And that is the whole platform you need to run Duke or Doom.
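Back of the envelope, naively assuming die area scales with the square of the feature size and taking the mid-90s 0.35 micron Pentium at roughly 90 mm² (round numbers, not datasheet figures):

    (350 nm / 32 nm)^2 ≈ 120   ->  90 mm² / 120  ≈ 0.75 mm² at 32 nm
    (350 nm /  5 nm)^2 ≈ 4900  ->  90 mm² / 4900 ≈ 0.02 mm² at 5 nm

So yes, at 5 nm the logic of the whole thing would be a speck well under a millimeter on a side; the package and the pins would be bigger than the silicon itself.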
For example, Unix was written on a PDP-11, which took up a whole basement room. A PDP-11 expressed in a modern 5 nm process (albeit a much more complex task than expressing an already-existing Pentium, due to the PDP-11's nature - it's a discrete-logic machine built from simple TTL chips rather than a single CPU chip - uff, cringe writing it) would probably be as big as the pointy tip of a needle - i.e. the whole room shrinks to the microscopic point of a needle's spike.
Imagine what kind of hypercore machine could be produced that way: you could fit an absurd cluster of Unix V nodes - easily tens of thousands of them, depending on how much memory each one gets - on a 1x1 cm die. It's insane. A modern GPU pixel shader core is more complex than this whole PDP-11 computer, and there are thousands of them in a GPU.
Now, about the other parts of modern x86 CPUs: almost all pleb-oriented modern Intels sport a SoC-style graphics card embedded right on the CPU ... with OpenGL and the shader cores mentioned above. Think about it.
Do you seriously think that somebody in their right mind at Intel tinkers with the real-mode circuitry from '89 in the x86 line? Or at AMD for that matter? Of course not! That part was last refactored and RISCy-CISCyfied, as you said, around 2003, and since then it has been expressed as-is. It gets expressed from 20-year-old Verilog or HDL code, or whatever they use, which is part of the modern codebase (of course), and nobody touches it unless there is a really fucking ultra-big reason to touch it.
They know it's good, that it works fine, and that it is battle-tested. They have batteries of tests they run it through, and that's how they know it's okay. Nobody just wanders into some obscure directory in the CPU's source code and changes something because they don't like it.
There is substantial value - trillions of dollars' worth - attached to that part, so unless you have a really fucking big reason and giant rocketeer balls of steel, you just don't fuck with it.
The hardware world is still a world where things have to be done the old 70s-80s-90s way, like, for real.
Because once the production line starts, you are spewing out millions of chips a day and you cannot easily fix them. When errors happen, they are etched a billion times into the nanomatter of the silicon - what are you going to do once the line is running? Take a microscope and nano-resolder them by hand? In million-sized batches? No, you throw them out, they are fucked anyway; you prepare a new batch, hope you fixed the problem, and never fuck around with what works, ever again. That's how it's really done.
That is why x86 CPUs still boot in real mode.
Also you are misunderstanding ME.
Negative rings were always a thing; they were mapped out by tinkerers and coreboot writers and are pretty well understood (given their nature). This stuff is used by firmware - by the BIOS back then and by UEFI these days.
That is not how the Intel ME "protection" works.
If you notice, as the good boys from libreboot discovered, Intel's "protection" activates only after some minutes of CPU runtime. This is supposedly done so a sysadmin can fix the firmware in the time allotted, but it is also done that way because it is the least invasive to the rest of a complex machine like a modern CPU. Nobody will mess with ancient CPU IC parts because of some shitty corporate "protection".
Given all of the above, it is most probably a highly specialized piece of circuitry within the CPU, probably expressed as its own "core" on the die, "disconnected" from everything else; it activates only at a certain point in time and checks whether the encrypted "kingdom keys" have somehow been supplied to the CPU (probably stored in some cache or hidden register). If not, it will indiscriminately shut down power to the whole CPU package, killing all cores and everything living inside along with it. It certainly does not mess with the MMU, ALU, FPU or anything else - parts which live their own lives, multiplied by the number of cores present.
Intel ME is an OS and firmware running on a separate "smartphone-level" SoC on the machine's motherboard (in the chipset), powered by "parasitic power".
What confuses beginners is that UEFI and ME share the same "disk": the ROM chip from which they both load themselves is a single chip for the whole machine. But this is only a cost-saving measure.
Think of it like two distributed and unrelated computers sharing the same clustered, networked storage disk, formatted with two partitions: one for ME and the other for UEFI. Both are micro-OS installations in their own right. But physically they are two separate systems connected to the same ROM: one is the ME (with its own CPU and MEM on its SoC) and the other is the CPU+MEM of the host machine.
So it's SoC-to-SoC-style inter-computer communication. ME does not run on the host CPU+MEM; it runs on its own hardware, alongside(!) the host CPU and MEM. Thus it's not RING-minus-something (-1, -2). It is simply something completely different and orthogonal - it really is a different machine inside your machine. The ME, however, besides having its own memory, is also connected to host memory - and that is what makes ME such a fucking piece of shit and pain in the ass to deal with.
The host CPU+MEM cannot "look" into ME's memory, because ME's memory hangs off the ME's own SoC lines and is "unwired" from the host CPU. There is really no physical path there, thus no way for the host CPU to inspect the ME. However, the ME can, through its own dedicated ME-to-host-memory lines, scan the host's memory space easily. That's the whole sneakiness of it.
Younger/modern people don't realize how parallel all this circuitry is. The ME can be talking to memory, reading it byte by byte behind the CPU's back, and the CPU wouldn't notice shit. That is the ME problem.
So the ME is a black box whose contents nobody knows, and it can access everything within the machine: MEM, GPU, NET, HDD, USB - you name it - and nobody notices, because these components are not designed to notify the CPU that "while dealing with you, I am also doing completely different business with somebody else (the ME) behind your back, honey".
In a way it's pure genius. You can have a "shadow" USB device that talks only and directly with the ME, and nothing in the system will see it. From the outside it might look like a dead thumbstick that does nothing, doesn't even blink its LED, but in fact it might be exchanging data with the ME, hidden. That is what makes the ME so scary. What have you plugged into your PC in the last 24 hours? Did it work? And whether it did or didn't, how do you know it didn't also do something else entirely? We are fucked.
But this, again, has nothing to do with the CPU in question. The CPU side of the "protection" mechanism works differently: it "works" by simple denial of function. An "unprotected" machine simply turns itself off.
All that is needed for the machine to remain running is for the ME SoC core to upload the unlocking "kingdom keys" into the CPU within the first 30 minutes of runtime. That is all. This approach is the safest and least invasive of all the solutions possible.
If you could somehow upload the kingdom keys yourself, directly with assembly code, the result would be the same. Unfortunately, the protocol and everything else in this exchange is secret, very closely guarded by Intel.
There won't be a leak, because Intel is not stupid and because it makes both components: the CPUs and the ME SoCs.
So they can have the crypto parts of both components intimately matched and expressed directly in the hardware of both, which makes them "undumpable" by firmware- and software-dumping means.
This has proved to work: as of yet nobody has cracked the mechanism. Unless somebody either leaks both parts of the hardware design (CPU's and ME's) and how it's done from inside Intel corp, or somehow breaks the mechanism's crypto exchange thanks to some crypto error/bug, this solution is un-fixable by design.
This is also probably the most closely guarded secret at Intel, as many premium features ride on this shit working, like SGX and encrypted VM enclaves, and other corporate and military shit.
It's stuff at the level of spies, national armies and three-letter agencies (mostly USA). People dabbling in this shit have unexpected heart attacks or suddenly drive off the road. Getting it out would mean dealing with that kind of crap. That's simply too crazy a bar to deal with.
It's really sad what PCs have become: from tools of freedom to tools of oppression (the real kind).
In the meantime, we see FPGAs getting more affordable every day. I saw the j-core people say in videos that a set of j-core ASIC-style masks can be made - if you know how to walk the industry and who to talk to - for under 20K dollars.
That was pre-RU-UA war. For a few dozen grand more, you can have TSMC, or anybody in the space really, make you a run of 100K units. This is starting to get close to hobbyist reality, and we are seriously entering the most interesting era in hardware ever.
I wish there were more young people interested in bringing some modernized-Pentium hybrid into retro-gaming.
MiSTer already has a working 486 core, which is dearest to my heart - I bet someone experienced and talented could make it run at 400 MHz as an ASIC.
If we got some capable youngsters to make a serious Pentium/Quark equivalent, we could say fuck you and goodbye to Intel and AMD, at least in the retro-gaming area. These things, made on "obsolete" 32 nm processes, would still be tiny and would have a minuscule power draw, in the range of a couple dozen watts at most.
What I would propose would be to add some bastardized support for UNREAL LONG, where you could address 4 GB+ memory and modern sticks; such a box could be a seriously viable platform for retro gaming and crazy DIY computer projects.
Not sure how that would pan out in other areas, but at least it would make old DOS fresh again.
If this dream CPU had software-controllable frequency multipliers, you could step execution speed up and down between 30 MHz and 400 MHz for most games, and I think it would be crazy effective as a hardware "DOSBox".
Albeit there are people messing with this, the fact that progress is so slow and we don't see something like a PC equivalent of the Amiga Vampire (or other Amiga "accelerators") suggests that the PC niche is probably missing the crucial people to do this, or there is not much interest in going this way (or is it PC "master race" complacency?), which is sad.
For example, we are still missing a fully compatible "dumb" but open VGA card. There were some attempts in the past, but they failed; it seems people started working on it too soon.
In the end, the original 8088 IBM PC was a haphazard hodge-podge of parts smashed together in just a few months, so there should be enough talent in this modern world to make this "dream" platform again.
Maybe a Pentium-class CPU is still too high a target, and the PC community has too few people messing with this. Or maybe even modern FPGAs are too small to express a Pentium-class chip yet? Not my area of research.
But given there is a pretty big retro-computing and vapor-everything craze around YouTube these days, I hope somebody picks up the torch one day and makes it happen eventually: a free-again x86-like platform with a 64-bit memory interface for us poor plebs.
It would certainly be a godsend to many - and such an "ultimate Doom/Quake machine" would certainly be the ultimate piece of retro-gaming kit.