Azure Doc Intelligence charges $1.50 per 1,000 pages. Was that an annual/recurring license?
Would you mind sharing your OCR model? I'm using Azure for now, as I want to focus on building the functionality first, but would later opt for a local model.
Just try GLM-OCR if you want to get started quickly. It has good layout quality and good text recognition quality, and they actually tested the setup on Apple Silicon laptops. It works out of the box, without the long yak-shaving setups I ran into with some other models I tried. Chandra is even more accurate on text, but its layout bounding boxes are worse and it runs very slowly unless you can set up batched inference with vLLM on CUDA. (I tried to get batching to run with vllm-mlx so it could work entirely on macOS, but a day spent shaving the yak with Claude Opus's help went nowhere.)
If you just want to transcribe documents, you can also try end-to-end models like olmOCR 2. I need pipeline models that expose inner details of document layout because I need to segment and restructure page contents for further processing. The end-to-end models just "magically" turn page scans into complete Markdown or HTML documents, which is more convenient for some uses but not mine.
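To make "segment and restructure" concrete, here is a rough sketch of the kind of post-processing that layout output makes possible. The `Block` schema, the `kind` labels, and the helper names are all made up for illustration; this is not GLM-OCR's or Chandra's actual output format, and the reading-order sort is a naive single-column assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    """One layout region from a pipeline OCR model (hypothetical schema)."""
    page: int
    kind: str  # hypothetical labels: "heading", "paragraph", "page_header", ...
    bbox: tuple[float, float, float, float]  # (x0, y0, x1, y1), origin at top-left
    text: str


@dataclass
class Section:
    title: str
    body: list[str] = field(default_factory=list)


def reading_order(blocks: list[Block]) -> list[Block]:
    # Naive single-column order: by page, then top edge, then left edge.
    return sorted(blocks, key=lambda b: (b.page, b.bbox[1], b.bbox[0]))


def split_into_sections(blocks: list[Block]) -> list[Section]:
    """Group paragraphs under the most recent heading, dropping other block kinds."""
    sections = [Section(title="")]  # holds any text that appears before the first heading
    for block in reading_order(blocks):
        if block.kind == "heading":
            sections.append(Section(title=block.text))
        elif block.kind == "paragraph":
            sections[-1].body.append(block.text)
        # Anything else (running headers, page numbers, ...) is skipped here.
    return [s for s in sections if s.title or s.body]


if __name__ == "__main__":
    demo = [
        Block(1, "paragraph", (50, 200, 500, 260), "First para."),
        Block(1, "heading", (50, 100, 500, 130), "Intro"),
        Block(1, "page_header", (50, 20, 500, 40), "Running header"),
        Block(2, "heading", (50, 80, 500, 110), "Methods"),
        Block(2, "paragraph", (50, 150, 500, 210), "Second para."),
    ]
    for section in split_into_sections(demo):
        print(section.title, section.body)
```

With an end-to-end model you only get the final Markdown or HTML, so a step like dropping running headers by region type or regrouping text by position has nothing to work from.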