> The CPU still fetches the target into the instruction cache before the protection kicks in.
> In Phantom, ordinary instructions, including a no-op, can be misinterpreted by the CPU as branches, triggering speculative behavior the program never asked for.
Is the idea that you combine these two to execute a BTB-style attack? Is there a world in which speculative cache fetching is still fine as long as it's non-exploitable, or is it always a risk and the performance cost of fixing the hardware negligible?
> The Fractal team showed that the conditional branch predictor has no privilege isolation at all
This one seems more serious. Now that it’s confirmed, does it provide a map for how to exploit it in a real system or is this non-exploitable in practice because of OS design choices around migration?
To give an analogy, it almost feels like removing the protection circuitry from a Li-Ion battery and then testing if it can catch fire, and observing that it does. Should it really worry users?
I would say it's like calling the battery a fire hazard if the vents don't work, but actually that's not analogous because the necessity for vents doesn't merely arise from the need to protect against bad design of the protection circuitry. They're needed for safety even if your circuitry design is flawless. So the analogy is actually kind of poor in that regard.
An obvious example is web browsers, where a vulnerability can easily be uninteresting because it lives in a sandboxed process… until you find a sandbox escape, then it is critical.
As long as you suspect there may be other vulnerabilities in the other layers, it is worthwhile investigating and fixing them, because defence in depth only works until someone manages to put together a full chain.
A mistake is a mistake, whether you have a way to reproduce it right now or not. It's pretty much a given that whatever means you have right now to reproduce the problem will evolve and broaden the scope. Also, if you haven't found a way to reproduce the problem, it doesn't mean it doesn't exist: it takes a lot more effort to prove that it's impossible to reproduce than to simply fail to reproduce it.
It's research, which often involves a ton of work for zero payoff. It's usually thankless and unrewarding work, done on the off chance that there is some exploit to be had.
It's hard for me to justify the tremendous effort of implementing the OS from scratch instead of adding the functionality you need to, for example, Linux or xv6.
> (it) exposes primitives that let a single experiment switch privilege levels at runtime while executing the same instructions in the same address space.
I think it can be achieved with the following Linux modifications:
- make all executable pages executable both in user and kernel mode
- define a new syscall number, let's call it 'fractal'
- upon an 'svc' trap (syscall), if it's a fractal syscall, just branch to the instruction after the 'svc' (still in kernel mode! no 'eret', as opposed to non-fractal syscalls)
and.. that's it?
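For what it's worth, here is a toy Python model of that dispatch logic, just to make the control flow concrete. It is not kernel code: `Cpu`, `svc`, and `SYS_FRACTAL` are made-up names, the syscall number is arbitrary, and the model ignores page permissions (the first bullet), stacks, and everything else a real kernel would have to handle.

```python
from dataclasses import dataclass

EL0, EL1 = 0, 1        # user / kernel exception levels (AArch64 naming)
SYS_FRACTAL = 0x1000   # hypothetical syscall number, not a real Linux one


@dataclass
class Cpu:
    pc: int = 0
    el: int = EL0
    elr: int = 0       # saved return address, in the spirit of ELR_EL1
    spsr: int = EL0    # saved privilege level, in the spirit of SPSR_EL1


def svc(cpu: Cpu, nr: int) -> None:
    """Model an 'svc' executed at cpu.pc, plus the modified handler."""
    # Trap entry: save the return state and switch to kernel mode.
    cpu.elr = cpu.pc + 4   # AArch64 instructions are 4 bytes wide
    cpu.spsr = cpu.el
    cpu.el = EL1

    if nr == SYS_FRACTAL:
        # Proposed path: plain branch to the instruction after 'svc',
        # with no 'eret', so the same code keeps running at EL1.
        cpu.pc = cpu.elr
        return

    # Ordinary path: (syscall work elided), then 'eret' restores
    # both the return address and the caller's privilege level.
    cpu.pc = cpu.elr
    cpu.el = cpu.spsr
```

Calling `svc(cpu, 64)` on a `Cpu` that starts at EL0 leaves it at EL0 on the next instruction, while `svc(cpu, SYS_FRACTAL)` leaves it on the next instruction at EL1. What the sketch cannot show is whether the real hardware and the rest of the kernel tolerate running user pages at EL1, which is presumably where the "that's it?" gets complicated.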
> [...] they usually run their experiments on top of an operating system that was never built for the job. They open up macOS or Linux, patch the kernel by hand, and hope the modifications hold. The approach is unstable, hard to reproduce, and on Apple’s platforms, slated for deprecation.
I'd also like to hear more about why that's a problem, not because I disagree, but because I don't know jack about this and it's fascinating. That said, I can imagine at least a few advantages to a purpose-built OS:
* It's not a general purpose OS. It doesn't have to support 10,000,000 different accessories, just enough to get the kernel booted so researchers can interact with the hardware.
* You don't have to deal with general purpose constraints here. Who needs something like a fair scheduler when the goal is to give researchers direct access to the hardware for minutes at a time?
* If broad hardware support and universal use case support aren't goals, you can write something vastly simpler that basically loads a program and turns it loose on the underlying bare metal. I imagine that'd make repeatability vastly easier, with no "oops, an Ethernet packet came in so I need to service that mid-test" interrupt{,ion}s.
Those would seem like good reasons to make a minimal kernel that doesn't get between the researchers and their work.