Why I Chose SQLite Over Core Data for Orchard
In the previous post, I wrote about the challenge of monitoring Mac health from inside the App Store sandbox — how Orchard collects real signals (memory pressure, disk space, running apps, launch and quit events) using only APIs Apple permits in a fully sandboxed environment. Once I’d established what data Orchard could collect, the next question was obvious: where does it all go?
Every non-trivial Apple app eventually confronts that question, and for most developers, Core Data is the default answer. It’s deeply integrated into Xcode, it has decades of community knowledge behind it, and Apple continues to invest in it. So why did I walk away from it for Orchard?
The short answer: Core Data was designed for object graphs. Orchard needs a time-series log.
The Shape of Orchard’s Data
To understand the choice, you need to understand what Orchard is actually doing underneath. Every 15 to 30 minutes, the app captures a snapshot of your Mac: memory pressure state, free disk space, which apps are running, and overall system state. It also logs discrete events (every app launch and quit) in real time via NSWorkspace notifications.
Over a week of normal use, that’s thousands of rows across a handful of tables. None of it is a complex object graph. It’s append-only event data, with occasional reads to compute aggregates. That’s a workload SQLite handles well: cheap inserts, indexed range scans over time, and aggregates computed directly in SQL.
The schema reflects this directly. Most of it is a small set of flat, append-only tables: an app launch/quit event log (mac_app_events), periodic system snapshots (mac_usage_snapshots), the installed-app inventory (apps), and the recommendation engine’s outputs (compatibility_checks). The high-volume, privacy-sensitive tables (the event log and the snapshots) stay entirely local; a separate, smaller set of tables describes the devices you own and the synthesised summary that syncs. There are no relationships to navigate and no managed object contexts to reason about.
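To make "flat and append-only" concrete, here is a rough sketch of what two of those tables might look like. The table names come from this post, but every column name, type, and constraint below is an illustrative assumption, not Orchard's actual schema. It is written against Python's built-in sqlite3 module only so the SQL can be run anywhere; Orchard itself is Swift on top of SQLiteData.

```python
import sqlite3

# Hypothetical column layout. Only the table names appear in the post;
# the columns, CHECK constraints, and indexes are assumptions.
SCHEMA = """
CREATE TABLE IF NOT EXISTS mac_app_events (
    id          INTEGER PRIMARY KEY,
    bundle_id   TEXT NOT NULL,
    event_type  TEXT NOT NULL CHECK (event_type IN ('launch', 'quit')),
    occurred_at TEXT NOT NULL              -- 'YYYY-MM-DD HH:MM:SS', UTC
);
CREATE TABLE IF NOT EXISTS mac_usage_snapshots (
    id                INTEGER PRIMARY KEY,
    captured_at       TEXT NOT NULL,       -- 'YYYY-MM-DD HH:MM:SS', UTC
    memory_pressure   TEXT NOT NULL
        CHECK (memory_pressure IN ('normal', 'warning', 'critical')),
    free_disk_bytes   INTEGER NOT NULL,
    running_app_count INTEGER NOT NULL
);
-- Time-range scans are the dominant read pattern, so index the timestamps.
CREATE INDEX IF NOT EXISTS idx_events_time ON mac_app_events (occurred_at);
CREATE INDEX IF NOT EXISTS idx_snapshots_time ON mac_usage_snapshots (captured_at);
"""


def open_store(path=":memory:"):
    """Open (or create) the local store and make sure the tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def log_event(conn, bundle_id, event_type, occurred_at):
    """Append one launch/quit row; existing rows are never updated."""
    with conn:  # commits on success, rolls back if the insert fails
        conn.execute(
            "INSERT INTO mac_app_events (bundle_id, event_type, occurred_at) "
            "VALUES (?, ?, ?)",
            (bundle_id, event_type, occurred_at),
        )
```

Nothing here references another table by foreign key, which is the point: each row is a self-contained fact with a timestamp.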
Why Core Data Wasn’t the Right Fit
Core Data is excellent at what it does — managing the lifecycle of persistent objects, handling relationships, powering list UIs with fetched results controllers. But Orchard’s access patterns are almost entirely analytical. I’m not loading objects to display or edit. I’m asking questions like: What was the average number of concurrent apps running over the last 30 days? How frequently did memory pressure hit critical in the past two weeks?
Writing those queries in Core Data means fighting the abstraction. You end up in NSPredicate territory, building an NSExpressionDescription for each aggregate, wrestling with fetch request configuration, and often pulling more data into memory than you intended. SQLite with direct queries is more natural for this workload, and significantly easier to reason about when something goes wrong.
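For a sense of how little ceremony those two questions need in SQL, here is a minimal sketch. It assumes a mac_usage_snapshots table with captured_at, memory_pressure, and running_app_count columns; those column names are hypothetical, as is the analytics helper. Python's sqlite3 module is used only to keep the example runnable, and the SQL is the part that matters.

```python
import sqlite3


def analytics(conn, now):
    """Answer the two questions from the text for a given 'now' timestamp.

    Timestamps are assumed to be stored as 'YYYY-MM-DD HH:MM:SS' text, the
    same format SQLite's datetime() returns, so string comparison sorts
    chronologically. Both values are None when no rows fall in the window.
    """
    # Average number of concurrently running apps over the last 30 days.
    avg_apps = conn.execute(
        """
        SELECT AVG(running_app_count)
        FROM mac_usage_snapshots
        WHERE captured_at >= datetime(?, '-30 days')
        """,
        (now,),
    ).fetchone()[0]

    # Share of snapshots in the last 14 days taken under critical pressure.
    # The comparison evaluates to 0 or 1, so AVG yields a fraction.
    critical_share = conn.execute(
        """
        SELECT AVG(memory_pressure = 'critical')
        FROM mac_usage_snapshots
        WHERE captured_at >= datetime(?, '-14 days')
        """,
        (now,),
    ).fetchone()[0]

    return avg_apps, critical_share


def demo_connection():
    """Build an in-memory table with a few made-up snapshots."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mac_usage_snapshots ("
        "captured_at TEXT NOT NULL, memory_pressure TEXT NOT NULL, "
        "running_app_count INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO mac_usage_snapshots VALUES (?, ?, ?)",
        [
            ("2026-03-30 10:00:00", "normal", 10),
            ("2026-03-29 10:00:00", "critical", 14),
            ("2026-03-10 10:00:00", "normal", 6),      # inside 30 days only
            ("2026-01-01 10:00:00", "critical", 100),  # outside both windows
        ],
    )
    return conn
```

Each question is one SELECT with a WHERE clause on the timestamp, and the database does the aggregation without materialising any rows as objects.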
Schema migrations were also a concern. Core Data migrations can be fragile, particularly as a schema evolves across multiple app versions in the wild. SQLite migrations, handled explicitly, are predictable. I know exactly what’s happening.
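One common way to handle migrations explicitly is to keep an ordered list of SQL statements and record progress in SQLite's built-in PRAGMA user_version. The sketch below shows that general pattern; it is not how Orchard or SQLiteData actually performs migrations, and the two migration statements and their columns are made up for illustration.

```python
import sqlite3

# Hypothetical migrations. Position in the list + 1 is the schema version
# that the statement produces. Shipped entries are never edited, only appended.
MIGRATIONS = [
    "CREATE TABLE mac_usage_snapshots ("
    "id INTEGER PRIMARY KEY, captured_at TEXT NOT NULL, "
    "free_disk_bytes INTEGER NOT NULL)",
    "ALTER TABLE mac_usage_snapshots "
    "ADD COLUMN memory_pressure TEXT NOT NULL DEFAULT 'normal'",
]


def migrate(conn):
    """Apply every migration newer than the stored version, one at a time."""
    conn.isolation_level = None  # manage BEGIN/COMMIT ourselves
    current = conn.execute("PRAGMA user_version").fetchone()[0]
    for version in range(current + 1, len(MIGRATIONS) + 1):
        conn.execute("BEGIN")
        try:
            conn.execute(MIGRATIONS[version - 1])
            # PRAGMA does not accept bound parameters; version is an int
            # produced by range(), so formatting it in is safe here.
            conn.execute(f"PRAGMA user_version = {version}")
            conn.execute("COMMIT")
        except Exception:
            conn.execute("ROLLBACK")
            raise
    return conn.execute("PRAGMA user_version").fetchone()[0]
```

Running migrate on an already-current database is a no-op, and a database that is several versions behind simply replays the steps it is missing in order.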
The Point-Free SQLite Library
Rather than using raw SQLite calls, I’m building on the Point-Free SQLiteData library, which includes a SyncEngine for iCloud. This is what makes the CloudKit integration tractable without writing a custom sync layer from scratch — and without touching CKRecord, CKShare, or CKOperation directly.
The library lets me define my schema with the @Table macro in Swift and write type-safe queries via StructuredQueries. All CloudKit record mapping, conflict resolution, and share management are handled by the SyncEngine internally.
The privacy architecture falls out of this naturally. Raw event data — every app launch, every memory pressure reading — stays in local SQLite on your Mac and never leaves it. Only synthesised summary data crosses to iCloud — chiefly a device_summary row carrying aggregated health metrics and the current upgrade verdict, alongside basic device-inventory records. The iOS companion app (planned for January 2027) will read from this summary. Your granular event log is never uploaded.
Honesty as an Architectural Constraint
One design decision that shaped the schema more than I expected: the compatibility engine’s three-bucket model.
Every installed app is classified as confirmed compatible (✓), confirmed incompatible (⚠), or unverified (?). The verdict is conservative — only confirmed signals drive Safe, Caution, or Wait — but unknowns are never hidden. They’re surfaced honestly, ranked by how frequently you actually use each app. If you launch Final Cut Pro every day and Orchard can’t confirm it’s compatible with macOS 27, the verdict reasoning says so explicitly: “You use Final Cut Pro daily, but Orchard can’t confirm it’s compatible — check with the developer before upgrading.”
This matters for the schema because compatibility_checks needs to store not just the verdict, but the reasoning — including the count of unverified apps and the specific heavy-use unknowns that were flagged. Honest output requires honest data.
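A rough sketch of what "storing the reasoning" could look like follows. The three buckets and the idea of ranking unknowns by usage come from the description above; everything else is an assumption made for illustration: the AppStatus type, the launches_per_week field, the heavy_use threshold, the field names of the result, and in particular the rule that maps confirmed signals to Safe, Caution, or Wait. It is in Python purely so it can be run as-is.

```python
from dataclasses import dataclass


@dataclass
class AppStatus:
    name: str
    status: str             # "compatible", "incompatible", or "unverified"
    launches_per_week: int


def build_check(apps, heavy_use=5):
    """Build one compatibility_checks-style record from per-app statuses.

    The verdict rule below is hypothetical. It only illustrates that
    confirmed signals drive the verdict while unknowns are reported
    alongside it rather than hidden.
    """
    incompatible = [a for a in apps if a.status == "incompatible"]
    # Unknowns are ranked by how often the user actually launches them.
    unverified = sorted(
        (a for a in apps if a.status == "unverified"),
        key=lambda a: a.launches_per_week,
        reverse=True,
    )

    if any(a.launches_per_week >= heavy_use for a in incompatible):
        verdict = "Wait"      # a heavily used app is confirmed broken
    elif incompatible:
        verdict = "Caution"   # something is confirmed broken, but rarely used
    else:
        verdict = "Safe"      # no confirmed incompatibilities

    return {
        "verdict": verdict,
        "unverified_count": len(unverified),
        "heavy_use_unknowns": [
            a.name for a in unverified if a.launches_per_week >= heavy_use
        ],
    }
```

Note that an unverified app never changes the verdict in this sketch; it only shows up in unverified_count and heavy_use_unknowns, which is what lets the reasoning text call it out by name.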
The Practical Outcome
The analysis engine reads from mac_usage_snapshots and mac_app_events to compute the aggregate metrics that feed the recommendation engine. The recommendation engine produces its Safe / Caution / Wait verdict and writes it to compatibility_checks. Everything flows in one direction, the tables have clear owners, and the queries are straightforward SQL.
None of that requires an object graph. It requires good data design and the right tool for the job.
If you’re building something similar — monitoring tools, health apps, anything with an event log at its core — I’d encourage you to start with the same question I did: is this an object graph, or is it a series of facts over time? The answer should drive the choice.
Next up: the Orchard Report — turning all this local data into a single-page PDF you can hand to an Apple Store employee when you’re choosing your next Mac.
It launches this fall alongside macOS 27. Follow along at theorchard.app, where you can “Follow the Build” and sign up to be notified about early access this summer!
Thanks for reading.