Doesn't look very expensive to me. An LLM capable of this level of summarization can run in ~12GB of GPU-connected RAM, and only needs that while it's running a prompt.
The cheapest small hosted LLMs (GPT-4.1 Nano, Google Gemini 1.5 Flash-8B) are cheap enough to run that a typical prompt costs less than 1/100th of a cent.
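For scale, here's a back-of-envelope sketch of that per-prompt cost. The per-million-token rates below are illustrative assumptions in the rough ballpark of that pricing tier, not quoted from either vendor's pricing page, so check the current rates before relying on them:

```python
# Back-of-envelope cost of one call to a cheap hosted LLM.
# ASSUMED rates, roughly the GPT-4.1 Nano / Gemini 1.5 Flash-8B tier:
INPUT_PRICE_PER_M = 0.10   # assumed $ per 1M input tokens
OUTPUT_PRICE_PER_M = 0.40  # assumed $ per 1M output tokens

def prompt_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of a single prompt at the assumed rates."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# A short summarization call: ~500 tokens in, ~100 tokens out.
cost = prompt_cost(500, 100)
print(f"${cost:.6f} per prompt")  # $0.000090 — under 1/100th of a cent
```

Even a 10x larger prompt stays well under a tenth of a cent at these rates, which is the point: the marginal cost per call is noise.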
Yes! And also, Apple loves selling expensive hardware and has zero shyness about asking people to pay a few thousand bucks to buy into part of their ecosystem.
They could easily offer an on-prem family 'AI' product: a box you plop in your house and plug into your router, which handles all AI processing for the whole family and uses a secure VPN to reach any of your devices outside the LAN.
If such a product delivered JUST what this guy's cool hack provides, and made Siri not a stupid piece of sh*t for my family, I'd buy it for $1999 even if I knew it cost Apple $700 to make.