Comments


anlumo t1_ix30oys wrote

This reads like a journalist with no tech knowledge heard a drunk engineer ramble about the M1 emulation extensions over a beer and then wrote down what he could remember.

89

gurenkagurenda t1_ix34gl0 wrote

Yeah, the original post is a great read, but this digest is so confused. What’s actually going on is that there’s a rarely used behavior in x86 dating back to the 8080. Rosetta 2 needs to support it for those rare cases, and it does so via undocumented hardware support for the feature in the M1.

The clever bit is that the extension doing that emulation is undocumented, so once Rosetta 2 is deprecated, Apple Silicon can finally cast off this decades-old backwards-compatibility cruft.

Edit: to clarify, it’s actually two features, but the situation with each is virtually identical; some stuff you don’t care about gets calculated as a side effect of every add/subtract/compare.

51

irkli t1_ix5pl2n wrote

Ahhh ... a dim memory fog rises... Yeah, as an assembly programmer (from minicomputer days through the 286), I can recall there was an issue with the flags between the 8080 and x86, one I won't bother looking up.

PUSH PSW

Pushed the accumulator (A, 8 bits) and the program status word (the flags: carry, etc.) at once, and there were side effects too boring to recall. Ugh, glad the good old days are gone; who needs that shit!

5

FictitiousThreat t1_ix2yjjl wrote

It’s like finding out your new Mercedes will run on whale oil.

35

Cultural-Interview77 t1_ix38gaf wrote

Not really running on whale oil, more like an additional x86 instruction set, like the one found on IBM's early-90s PowerPC 615 prototypes, which would have allowed both x86 and PPC instructions to run on the same chip... if Microsoft hadn't said "fuck off" to IBM, of course.

4

mtaw t1_ix5hsib wrote

Long explanation, since this article sucks:

#BCD

In the olden days of computing (the 1960s and 1970s) it was common to represent numbers as "Binary-Coded Decimal": rather than storing a value as one pure binary number, you store it as decimal digits, with each digit encoded as a 4-bit binary number.

E.g. decimal 10 in binary is 1010, but in BCD it's 0001 0000 - the first 4 bits for the digit in the tens place and the last 4 bits for the digit in the ones place. Clearly this takes up more space in memory, but it made converting numbers to and from human-readable decimal much faster and simpler. So some early machines worked with BCD natively.
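
If you want to see it in code, here's a quick C sketch of my own (the function name is just for illustration, not from the original post):

```c
#include <stdio.h>
#include <stdint.h>

/* Pack a two-digit decimal number into BCD: one decimal digit per 4-bit nibble. */
static uint8_t to_packed_bcd(uint8_t n)   /* n must be 0..99 */
{
    return (uint8_t)(((n / 10) << 4) | (n % 10));
}

int main(void)
{
    uint8_t n = 10;
    printf("plain binary: 0x%02X (bits 00001010)\n", (unsigned)n);
    printf("packed BCD:   0x%02X (bits 00010000)\n", (unsigned)to_packed_bcd(n));
    return 0;
}
```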

The 8080 (and other early microprocessors) had instructions to do BCD arithmetic in addition to ordinary binary arithmetic, probably to make it easier to port code from older systems; most code written for these machines never used them.

#Flags

It's not the instructions themselves that are the problem here, though, but a couple of processor flags. These are bits that get updated after certain instructions to indicate whether something-or-other happened during the operation. The exact flags differ between processors just as instruction sets do, but a couple of typical/common ones are the overflow flag, which gets set to 1 if the result of an arithmetic operation overflowed (i.e. was too big to fit in the register where it was stored), and the zero flag, which gets set to 1 if the result of an operation was zero.

The state of the flags is used to control the conditional flow of a program. More concretely, on x86 the instructions JO (Jump if Overflow) and JZ (Jump if Zero) will jump to a different place in the program code if the overflow or zero flag is set, respectively.
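
Roughly, in C (just to illustrate the idea of flags as arithmetic side effects, not how the silicon literally does it):

```c
#include <stdint.h>
#include <stdbool.h>

/* An 8-bit add that also records two flags as side effects, the way a CPU would. */
static uint8_t add8(uint8_t a, uint8_t b, bool *zf, bool *of)
{
    uint8_t result = (uint8_t)(a + b);
    *zf = (result == 0);                              /* zero flag: result was 0                  */
    *of = ((~(a ^ b)) & (a ^ result) & 0x80) != 0;    /* overflow flag: signed result went wrong  */
    return result;
}

/* "JZ target" is then just: if (zf) goto target; */
```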

Now to tie this all together: the 8080 had an auxiliary-carry flag, which lives on in x86 as AF (the Adjust Flag). It gets set after an arithmetic operation when a carry or borrow happened between the lower 4 bits of a byte and the upper 4. It's there for BCD arithmetic: so you know whether the low decimal digit changed or overflowed and needs fixing up (the 8080 and x86 have a decimal-adjust instruction, DAA, for exactly that).
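
In software, emulators usually recover AF with a small bit trick; here's a sketch of that (my own illustration, not claimed to be what Rosetta 2 actually emits):

```c
#include <stdint.h>
#include <stdbool.h>

/* AF after "result = a + b": was there a carry out of bit 3 into bit 4,
 * i.e. did the low BCD digit spill over into the high one? */
static bool adjust_flag_after_add(uint8_t a, uint8_t b, uint8_t result)
{
    return ((a ^ b ^ result) & 0x10) != 0;
}
```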

A second flag that comes in here is PF (Parity Flag), which is set based on the number of binary 1s in the result of an operation: 1 if that count is even, 0 if it's odd. This was needed in the "olden days" for things like serial communication, where data integrity was checked by verifying the parity, which was always supposed to come out one way or the other.

This, too, became mostly obsolete pretty quickly, as most microcomputers ended up with dedicated chips for serial communication (UARTs), so the processor could keep running the program instead of dealing with mundane tasks like parity-checking.
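
PF is easy to describe but annoying to recompute after every single operation; in software it looks something like this (again just an illustrative C sketch):

```c
#include <stdint.h>
#include <stdbool.h>

/* x86 PF: set when the low byte of a result contains an even number of 1 bits. */
static bool parity_flag(uint8_t low_byte)
{
    int ones = 0;
    for (int bit = 0; bit < 8; bit++)
        ones += (low_byte >> bit) & 1;
    return (ones % 2) == 0;
}
```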

#Where Apple comes into it

Newer architectures (designed in the last 35 years or so) don't have BCD instructions or these flags, since their purpose is obsolete. But on x86 they're still there, and even though they're rarely used, they have to be updated after every operation regardless, in case the program reads them.

So given that Rosetta 2 translates x86 code directly into ARM code, the fact that ARM does not have those flags is a big issue: if you need even one or two extra instructions after every arithmetic instruction just to update the flags, you're slowing the code down 2-3x. (That's why emulation is typically slow: even with a far more powerful processor than the one you're emulating, the lack of a one-to-one functional equivalence between instructions often means you need many instructions to emulate a single one on the original hardware.)

But if you're designing your own chip, it's relatively easy to add support for those extra flags in hardware. (It doesn't slow anything down significantly; it's just a few hundred more transistors.) So Apple made it a lot easier, and above all faster, to translate x86 to ARM by having their own ARM chips secretly implement these flags.

Flags on ARM live in a status register named NZCV, after the Negative, Zero, Carry and oVerflow flags it holds. The register is nominally 64 bits wide, since it's a 64-bit processor, but only the top 4 bits of the lower half (bits 31, 30, 29, 28) are used for those flags; most of it is unused. What Apple did was quietly add the PF and AF flags as bits 27 and 26 of that register and use them when translating x86 code into ARM, consulting them only when necessary, so the translated code runs much faster than if it had to compute those flags in software after every arithmetic operation.
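
Laid out as bit positions, that's roughly this (the N/Z/C/V positions are architecturally documented; the PF/AF positions are the undocumented part and come from the linked write-up, so treat them as a best guess):

```c
#include <stdint.h>

/* AArch64 NZCV register layout, plus the two extra x86-style flag bits
 * the write-up says Apple tucked into the otherwise-unused space. */
#define FLAG_N  (UINT64_C(1) << 31)  /* Negative             (documented)                 */
#define FLAG_Z  (UINT64_C(1) << 30)  /* Zero                 (documented)                 */
#define FLAG_C  (UINT64_C(1) << 29)  /* Carry                (documented)                 */
#define FLAG_V  (UINT64_C(1) << 28)  /* oVerflow             (documented)                 */
#define FLAG_PF (UINT64_C(1) << 27)  /* x86 Parity flag      (undocumented, per the post) */
#define FLAG_AF (UINT64_C(1) << 26)  /* x86 Adjust/aux-carry (undocumented, per the post) */
```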

28

mroboto2016 t1_ix30nvp wrote

When I was working on my AS degree in IT, I was poking around in Windows files. This was a few years back, but they were (maybe still are?) running print spooler code from the first versions of Windows.

I guess if it ain't broke, don't fix it.

18

MadMadBunny t1_ix475w9 wrote

Wait—that was it? The intro was the whole article?

2