Idea to App Store in 72 Hours: Building Needlebook
App development in the age of AI
Earlier today I launched needlebook, an app for needlepoint creators to track their projects and connect with others. The complete development process, from the initial idea to iOS App Store approval, took just under three days. In this post I'll share how I did it!
The Idea
My girlfriend is an avid needlepointer, and with the unlimited free time of my garden leave I wanted to build something she could use for her hobby. She walked me through her process -- picking threads, tracking projects, finding new canvases -- and I figured there was a niche here for an app.
The initial requirements were straightforward:
- Project tracking, including photos and threads used
- A way to keep track of her thread stash, and which ones are low or missing for her projects
- A social feed to share projects and find inspiration
With this in mind I got to work!
The Tech
I consider myself an old-school developer, and it's personally taken me quite a while to warm up to the ideas of no-code and low-code development.
I built the Market Hours engine from scratch in C++ just for the thrill, which earned me some ridicule from my friends. Don't @ me.
For needlebook, though, I decided to take the plunge and see how far I could get with agentic, low-code tools. On the AI front, I settled on a stack of:
- Claude Opus 4.6 for initial prototyping in Claude Chat, and miscellaneous development tasks using Claude Code.
- GPT-5.4 Codex for code generation and debugging.
- Gemini Pro 3.1 for asset generation, including the app icon and splash screen.
Day One
I worked with Claude to quickly outline a spec for the app based on my initial feature set. The iteration was rapid and felt conversational -- it essentially involved filling in gaps in a markdown document until I had something that felt complete enough to build an MVP from. All in all, I was ready to jump to development within an hour.
Initial Development
I picked the tech stack up front:
- React Native for the app buildout
- Expo to speed up development and avoid the hassle of native builds while prototyping
- Self-hosted Supabase for the backend, including authentication and database
- Jenkins for CI/CD, also self-hosted on my own hardware
- DigitalOcean for hosting the backend (I've been a user for over a decade and love dumping my projects onto their droplets)
From there, I switched to Claude Code to start building the app. My first prompt was essentially just "build the app in the spec". Within an hour I had a working prototype I could run on a simulator. This is where I've always seen the power of AI code generation -- no one enjoys writing boilerplate code, and being able to generate a working scaffold of the app in minutes was fantastic. Alongside the front-end code, it also generated a suite of migrations to set up the database schema.
I wanted to take it a level further, though. In the world of agentic programming, I figured I would let Claude attempt to set up the back-end integration itself too. I provisioned a sandboxed Linux environment on my DigitalOcean droplet and gave Claude full access (bypass those permissions, baby!). I prompted it to set up Supabase on my server, run the migrations, and connect it to the app. It was honestly a little scary watching it rapid-fire commands on my server, but this task took it just a few minutes to complete, and I had a working back-end hooked up to the app by the end of it. Insane!
While the prompts were running, I spent time doing the annoying peripheral work of setting up the app in App Store Connect, setting up SSO with Google and Apple, and configuring the Jenkins pipeline. I actually gave Claude Cowork's computer use feature a shot at setting up Jenkins for me, but unfortunately the tech isn't quite there yet -- I got impatient after watching it fail to scroll down in my browser for 20 minutes and just set it up myself.
Asset Generation
I wanted a clean look for the app, and I'm as far from a designer as one can be. I initially worked with Claude and ChatGPT for icon generation, but the results were poor and iterating felt like pulling teeth. Maybe I should've predicted this given the success of Google's recent models, but Gemini Pro 3.1 crushed it on the first try. Iterating on the style and making minor adjustments was a breeze. I had my app icon ready to go in half an hour.
Day Two
With the basic app built out, I dove into fixing bugs and refining the user experience. The code generated by Claude was solid, but certainly not perfect. I found myself in a rhythm of collecting 3-4 issues, giving them in bullets to Claude Code, and letting it generate fixes for them while I scouted for more.
Wrangling Multiple Agents
As the app shaped up into something more usable, I also began tracking features I wanted to add. When I knew the fixes Claude was working on in a session were in a different domain, I'd spin up a new session for the new feature. At my peak I probably had 5-6 different sessions concurrently working on different features and bug fixes. It was a little overwhelming at times, but the workflow felt natural after a while.
I also quickly realized that Claude and GPT have different strengths.
Claude was fantastic at UI/UX. The interfaces it created all felt very friendly and 'human', for lack of a better way to explain it. It was slow, however, and prone to making mistakes. I had to hand-hold it through a lot of the development.
GPT, on the other hand, was a monster when it came to complex logic and database queries. It would absolutely blaze through any task I gave it, and the code was usually solid enough that I could just skim it and trust it to work. I really appreciated its frequent use of web searches to verify documentation and find solutions to errors it encountered. My main gripe with GPT is that the UI-focused tasks I gave it resulted in very 'robotic' interfaces.
CI/CD Pipeline
With the app in a pretty good place, I turned my attention to CI/CD. I already had the skeleton of a pipeline created, and I was hoping I could wire in automated builds and TestFlight deployment relatively quickly (as I'd previously done with Market Hours).
I use my headless Mac Mini for building and deploying iOS apps, and configured it as the agent for the Jenkins pipeline. I set up a GitHub webhook to trigger the pipeline on pushes to the main branch, and let it run. With Expo as the backbone of the app, I figured the build process would be dead simple after checking out the code:
eas build --platform ios --profile production --local --output ./build.ipa --non-interactive
eas submit --platform ios --path ./build.ipa --non-interactive
Of course, it didn't prove to be that easy. I had to wrangle some of eas's dependencies on the Mac Mini, and it took quite a while to actually get a clean build.
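A quick preflight check on the build agent can surface missing dependencies before a long build starts. The sketch below is not the script used for needlebook; the tool list (node, eas, fastlane, pod, xcodebuild) is an assumption about what a local iOS build typically needs, and missing_tools is a hypothetical helper name.

```shell
#!/bin/sh
# Preflight sketch for a CI build agent: report required CLI tools that
# are not on PATH. The tool list below is an assumption, not a canonical list.

# Print the name of every argument that is not a command available on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Assumed requirements for a local iOS build; adjust for your own setup.
REQUIRED_TOOLS="node eas fastlane pod xcodebuild"

# Word splitting on $REQUIRED_TOOLS is intentional here.
missing=$(missing_tools $REQUIRED_TOOLS)

if [ -n "$missing" ]; then
  # In a real pipeline you would probably fail the stage at this point.
  echo "Missing build tools:" $missing >&2
else
  echo "All build tools found."
fi
```

Running this as the first pipeline step makes a misconfigured agent fail in seconds rather than partway through a build.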
Once I got the build working, I entered Apple credentials hell. I was told that Fastlane was supposed to make all this stress go away, but it ended up being just as bad.
An extremely frustrating issue I continually run into is Apple's codesigning tools not working in a headless environment with local trust stores. I ended up frankensteining a solution I had used for prior projects, where I manually import Apple's root CAs and create a temporary keychain for the build process. It really shouldn't be this hard.
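For readers who haven't seen the temporary-keychain pattern, here is a rough sketch of its general shape using macOS's security tool. It is not my actual script: the keychain name, passwords, certificate paths, and the run, setup_keychain, and teardown_keychain helpers are all hypothetical placeholders. It defaults to a dry run that only prints the commands.

```shell
#!/bin/sh
# Sketch: temporary keychain for headless codesigning on macOS.
# Every name, path, and password below is a hypothetical placeholder.

KEYCHAIN="${KEYCHAIN:-ci-build.keychain-db}"
KEYCHAIN_PASSWORD="${KEYCHAIN_PASSWORD:-change-me}"
P12_PATH="${P12_PATH:-./dist-cert.p12}"      # signing certificate + key
P12_PASSWORD="${P12_PASSWORD:-}"
APPLE_CA_DIR="${APPLE_CA_DIR:-./apple-ca}"   # downloaded Apple CA .cer files

# With DRY_RUN=1 (the default) commands are printed instead of executed.
# Note that the printed commands include the passwords.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

setup_keychain() {
  run security create-keychain -p "$KEYCHAIN_PASSWORD" "$KEYCHAIN"
  # Lock on sleep and after a one-hour timeout.
  run security set-keychain-settings -lut 3600 "$KEYCHAIN"
  run security unlock-keychain -p "$KEYCHAIN_PASSWORD" "$KEYCHAIN"
  # Import any Apple CA certificates found in APPLE_CA_DIR.
  for cert in "$APPLE_CA_DIR"/*.cer; do
    [ -e "$cert" ] || continue
    run security import "$cert" -k "$KEYCHAIN"
  done
  # Import the signing identity and allow codesign to use it.
  run security import "$P12_PATH" -k "$KEYCHAIN" -P "$P12_PASSWORD" -T /usr/bin/codesign
  run security set-key-partition-list -S apple-tool:,apple: -s -k "$KEYCHAIN_PASSWORD" "$KEYCHAIN"
  # Put the temporary keychain on the search list, keeping the login keychain.
  run security list-keychains -d user -s "$KEYCHAIN" login.keychain-db
}

teardown_keychain() {
  run security delete-keychain "$KEYCHAIN"
}

setup_keychain
```

Set DRY_RUN=0 on a Mac to execute the commands for real, and call teardown_keychain once the build finishes so the keychain doesn't outlive the job.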
Debugging my Kernel?
When I finally resolved the codesigning issues, I immediately ran into issues using eas submit to upload the build to App Store Connect. The upload was continuously stalling at 0% with no error messages, despite the same operation working fine from my MacBook Pro. I attempted switching to a manual Apple-esque approach of using xcrun altool to upload the build, but after getting cryptic errors (Trace/BPT Trap: 5 anyone?) I decided to let Codex have a run at debugging.
After a prolonged round of tests, experiments and analysis, Codex discovered that the issue was isolated to eas's use of node-fetch for the upload process, which uses HTTP/1.1. Somehow, large uploads (>6MB) were stalling indefinitely on PUT requests (but not POST!) on the Mac Mini, and not on my MacBook Pro. An even deeper dive (I literally had Codex build a testing environment to drill into this) revealed the issue was a strange interaction with the RACK TCP loss detection algorithm in the latest versions of macOS. The solution was manually setting net.inet.tcp.rack=0 to disable RACK, which resolved the issue.
For anyone curious, this was happening on an M4 Mac Mini running on macOS 26.2.
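If you want to check whether your own machine shows the same stall, a reproduction along these lines might help. This is a hedged sketch, not the test environment Codex built: UPLOAD_URL is a placeholder for an endpoint you control that accepts both PUT and POST, and make_payload and try_upload are hypothetical helper names.

```shell
#!/bin/sh
# Sketch: does a large HTTP/1.1 PUT stall on this machine while POST works?
# UPLOAD_URL must point at a test endpoint you control (placeholder).

# make_payload FILE SIZE_MIB: create a zero-filled file of SIZE_MIB mebibytes.
make_payload() {
  dd if=/dev/zero of="$1" bs=1048576 count="$2" 2>/dev/null
}

# try_upload METHOD FILE URL: send FILE over HTTP/1.1, giving up after 60s.
# curl exits with status 28 when --max-time is exceeded.
try_upload() {
  curl --silent --output /dev/null --http1.1 --max-time 60 \
       --request "$1" --data-binary "@$2" "$3"
}

if [ -n "${UPLOAD_URL:-}" ]; then
  payload=$(mktemp)
  make_payload "$payload" 8   # above the ~6MB threshold from the post
  for method in PUT POST; do
    try_upload "$method" "$payload" "$UPLOAD_URL"
    echo "$method exit status: $?"
  done
  rm -f "$payload"
else
  echo "Set UPLOAD_URL to run the upload comparison."
fi

# To inspect or change the setting named in the post (macOS only):
#   sysctl net.inet.tcp.rack
#   sudo sysctl -w net.inet.tcp.rack=0   # runtime change, lost on reboot
```

A PUT that exits with status 28 while the POST succeeds would match the behavior described above.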
Day Three
With the TestFlight upload working, I gave my girlfriend an early build to test out. She was thrilled to see her ideas come to life, but she found no fewer than 50 bugs and missing features in the app within the first hour of testing. Seriously, I just counted them.
Crushing Bugs
I locked in for the next few hours and followed the multiple-agent workflow I described earlier to grind through the issues. Claude was perfect for the minor UI/UX tweaks, but I ended up relying on Codex for the more expansive features and complex bugs. With my CI/CD pipeline working, it was easy for me to push fixes and get her new builds to mess with.
Launch!
I'd be lying if I said I'm not still getting a slow trickle of bugs and feature requests from my girlfriend right now, but once I was at a point where I felt the app was stable enough for a wider audience, I submitted it for App Store review. To my surprise, it entered the review process almost immediately and was approved an hour later. You can check it out here!
Final Thoughts
Building needlebook was a really fun way to learn the ins and outs of agentic programming. I was especially impressed by the agents' ability to SSH into my server and set up the back-end with minimal guidance, which is a use case I hadn't seen before. The CI/CD process was definitely the most painful part of the build, but it was a great learning experience, and hopefully we'll see a patch in the Darwin TCP stack ASAP!