> However, I would like to point out that Apple isn't totally wrong here, because the accessibility API is unfortunately far too broadly scoped: it gives you access to literally everything on the computer. You can take screenshots, listen, and move the cursor... This is completely ridiculous, and the proper engineering solution would be to phase out the accessibility API and replace it with something narrowly scoped, so that specific permissions can be granted individually.

If you don't have the use of your hands, you want that. The whole point of accessibility APIs is to allow arbitrary control of your computer via novel means. One of the big selling points of Dragon NaturallySpeaking is the ability to tell your computer to do things based on descriptions, without a mouse: "open Outlook", "click compose", "select subject", "type foo", etc. Unfortunately, modern software breaks this a lot. Chrome and anything Electron-based don't provide any accessibility information to the OS. The interior of the window, excluding the tab bar, is a void. Yes, Chrome has a built-in screen reader, as do a number of Electron apps. But if you aren't blind and want to use something like Dragon, it doesn't work. Canvas-based apps are often the same.

And no, the solution here is not computer vision with an LLM. Text and buttons rendered on my computer exist in memory somewhere as text and buttons. We should not need to convert them to pixels and then lossily back again to recover text and buttons. We should just expose things to the accessibility API and not guess.

reply
> Chrome and anything Electron-based don't provide any accessibility information to the OS

Are we sure about this? At least on Windows, NVDA works fine with Chrome and with Electron apps.

reply
> And no the solution here is not computer vision with an LLM.

Also, even if you hypothetically wanted to use computer vision with an LLM… what API is that LLM going to use to take screenshots and click on stuff?

reply
> However, I would like to point out that Apple isn't totally wrong here, because the accessibility API is unfortunately far too broadly scoped: it gives you access to literally everything on the computer. You can take screenshots, listen, and move the cursor...

I want apps to be able to do that!

reply
Yes, but it's miffing to open Privacy & Security and see dozens of apps pretending to need “accessibility” features. Apple has a dozen-plus categories there, but many of the power-user apps I want specifically need accessibility.

Is there an opinionated reason not to break out capabilities?

reply
> Is there an opinionated reason not to break out capabilities?

If you have a disability and need tools to use your computer, the last thing you want is for those tools to be not only off by default but also complicated and involved to turn on.

reply
Is there a reason a capability has to be covered by only a single permission? Why not have one accessibility permission that covers all that and then a bunch of individual permissions for non-accessibility apps?
reply
Apple doesn’t provide another API for this, so apps have to use the one that’s available.
reply
Then they should use an appropriately scoped API, as OP suggested.
reply
Controlling my computer is an appropriate scope for an accessibility tool.
reply
deleted
reply
Isn’t that just deliberate on their part? As in, they genuinely don’t want developers to use these APIs and only allow them for accessibility use cases.
reply
deleted
reply
deleted
reply
Gradually improve? How many more decades is it reasonable to wait? They are what they are, and hoping for change makes no sense to me.
reply
Thanks for sharing this. The "phase out the broadly-scoped Accessibility API and replace with narrower permissions" point is exactly the right structural fix. Right now developers have to declare a permission far broader than they actually need, and from the outside the criteria for what counts as legitimate use aren't clearly defined. It's interesting that your iOS app got through but the macOS one didn't. WhisperPad is Mac-only and I haven't gone through the iOS path, so your experience there is useful data. The "demonstrable accessibility" criterion seems to be where everything bottlenecks.
reply