Vibe Coding without the Vibes - AI Driven Development
Recently a colleague recommended Anthropic’s Claude Code to me. I’ve been using AI for a while now, mostly to accelerate smaller tasks such as cronjobs, Python scripts, and so on. Essentially, it had replaced Stack Overflow for those small tasks I’d normally need a quick Google refresher on. What I hadn’t figured out was how to apply AI to larger, more structured projects; the kind that have become popular on LinkedIn under the banner of “Vibe-Driven Development”. When I heard about Claude, I figured I’d give it a shot at a task I knew how to do - I knew the best practices and what to watch out for, and it had a few moving parts without being a huge challenge.
Want to just see the final product?
The Task
I need a cheap, easy, and user-friendly way to send and receive files to clients and other small businesses who don’t already have such a solution. It should look professional, and be secure enough that I’m not worried about breaking an NDA, but it doesn’t need TS/SCI levels of security. It also needs to be cheap - ideally, sitting in the AWS free tier, since it’ll only get used a few times per month. Now, there are a ton of great resources online on how to do this; in my head I had settled on: static website in an S3 bucket, API Gateway with a Cognito authorizer for user management, and Lambdas to generate pre-signed S3 URLs. You can google this; I think about half the internet has done it before. However, I couldn’t find anyone who actually open-sourced the full E2E solution in an easy package.
The AI
As I mentioned, I wanted to put Claude to the task, but first I needed to sign up. I tried with my company’s email, using my company’s phone number for verification. Well, that number was rejected; I’m guessing because it’s a VoIP line, not a standard cellphone. Oh well, I next tried my personal number.
And was instantly banned, before I could even log in.
I’m really not sure what happened. Maybe they didn’t like my email - claude@mschultz.consulting - or maybe the fact that I tried my VoIP line first. Either way, that was a dead end. I called up my wife and asked her to sign up with her Gmail and cell number while she was waiting for her drink at Starbucks.
And she was instantly banned too.
Anthropic doesn’t seem to like shared IPs, as far as I could tell from others’ experiences online. At this point, I wasn’t sure how I could even use it. Then I checked: Claude Sonnet 4.5 had just gone live on Bedrock, and Claude Code supports AWS Bedrock. Excellent. I went ahead and created a Bedrock API key per the docs and was off to the races. Kind of. It turns out that unless you have a pretty high spend with AWS, the default limits are very low - 2 requests per minute - and not adjustable via Service Quotas. They say you can request a rate limit increase via a support ticket; I’m still waiting to hear back well over a week later.
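For anyone taking the same route, pointing Claude Code at Bedrock is a handful of environment variables per the Claude Code docs. The region and model ID below are what I’d expect at time of writing - check the Bedrock console for the current inference profile ID in your region:

```shell
# Tell Claude Code to use Bedrock instead of Anthropic's API.
export CLAUDE_CODE_USE_BEDROCK=1
export AWS_REGION=us-east-1

# The Bedrock API key from the AWS console (or use normal AWS credentials).
export AWS_BEARER_TOKEN_BEDROCK="<your-bedrock-api-key>"

# Cross-region inference profile for Sonnet 4.5; verify the exact ID
# in the Bedrock console, as these change with model releases.
export ANTHROPIC_MODEL="us.anthropic.claude-sonnet-4-5-20250929-v1:0"
```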
So knowing that this was going to take a while, I went ahead and got started.
The Setup
I really wanted to see what agentic AI could do if given free rein. However, if it went off the rails, I didn’t want an unexpected AWS bill in the mail either, so I started with some guardrails:
A new VM was spun up entirely for this purpose
I created a new GitLab project for it to use, plus a project access token with ‘Owner’ rights and API scope rather than a personal token
Installed ntfy.sh to get async updates on my main laptop
I also configured the GitLab MCP connection - later, this was removed because the auth kept needing to be refreshed and it was a hassle.
Turns out, Claude was surprisingly good at interacting with the GitLab API directly.
The Vibes
I figured, let’s treat this thing like an intern or junior developer. I have a lot of experience leading inexperienced devs, and wanted to see how it performed in this scenario. Spoiler alert: it actually turned out quite well. I explained my use case in detail, and what I thought I wanted, and asked for its thoughts and opinions. It came up with some minor suggestions, such as a mono-repo vs. a split N-tier repo, which I accepted. Then I asked it to generate a list of tasks it thought needed to be done in order to get the job fully complete. During this process, I went back and forth with it on acceptable limitations - no admin users, backend access via the S3 API directly, etc. - and what would constitute “Done”.
I then told it to generate all of the GitLab issues necessary. It did so quite well; each ticket was well defined, and the work was broken up into reasonable pieces, if not exactly how I would’ve done it. But I wanted to see what it came up with, not just find a really long way around using the GitLab API myself.
Once the task was broken down into its component parts, I had it go through each ticket individually and perform the work needed. For each ticket, it was instructed to, via CLAUDE.md:
Check out main
Pull down latest
Cut a new branch
Do work, including testing
Push to origin
Submit a MR
Notify and wait for review
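Concretely, the per-ticket loop amounted to something like the following sketch. The project URL, token variable, and ntfy topic are all placeholders, not my actual setup:

```shell
# Sketch of the per-ticket flow from CLAUDE.md; $TICKET, $PROJECT_ID,
# $GITLAB_TOKEN, and $NTFY_TOPIC are placeholders.
git checkout main
git pull origin main
git checkout -b "issue-${TICKET}"

# ...do the work, run the tests...

git push -u origin "issue-${TICKET}"

# Open an MR via the GitLab REST API using the project access token.
curl --request POST \
  --header "PRIVATE-TOKEN: ${GITLAB_TOKEN}" \
  --data "source_branch=issue-${TICKET}&target_branch=main&title=Resolve issue ${TICKET}" \
  "https://gitlab.example.com/api/v4/projects/${PROJECT_ID}/merge_requests"

# Ping me for review via ntfy.sh, then stop and wait.
curl -d "MR for issue ${TICKET} ready for review" "https://ntfy.sh/${NTFY_TOPIC}"
```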
Now, if this seems like I’m having an AI do the standard SDLC, you’re right. One thing I realized is that this process gives you a ton of benefits. For one, you can pause it at any time to course correct; I often gave it slight updates to fix small missed requirements or misunderstandings, which kept it on the rails. It also meant I was able to fully review the code before it was deployed, ensuring any major security vulnerabilities were fixed. And the AI worked incredibly well with this flow, almost like it was trained on it.
Funny how that works.
Regardless, aside from minor corrections and deviations, it honestly performed the task admirably. Once the app was deployed and… just worked as expected, I was floored. Usually I’d have to go in and manually correct a bunch of blatantly wrong things; in this case, the only things wrong were requirements I had missed in the original spec. Scope creep hits the best of us, I guess. That being said, it wasn’t entirely smooth sailing.
The Struggles
I want to be clear: a lot of how this went was influenced by the fact that I have literal decades of experience as a software engineer, cloud engineer, and systems architect. I’m not sure what would happen if you just put “Create me a file transfer app” into Claude, and I probably won’t spend the money to find out. I was able to treat it like a fellow SWE, and it worked out well, but there were several pretty painful points:
When writing the GitLab CI, I often had to step in and paste necessary info. It really didn’t know how to use the GitLab-managed Tofu state; after pointing it at the container repository and documentation, it figured it out. But that took quite a lot of time and tokens compared to pasting from Stack Overflow.
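For reference, the part it struggled with boils down to wiring OpenTofu’s http backend to GitLab’s state endpoint inside a CI job, roughly like this (the `CI_*` variables are predefined by GitLab; `TF_STATE_NAME` is whatever you name your state):

```shell
# Inside a GitLab CI job: point the http backend at GitLab-managed state.
tofu init \
  -backend-config="address=${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/terraform/state/${TF_STATE_NAME}" \
  -backend-config="lock_address=${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/terraform/state/${TF_STATE_NAME}/lock" \
  -backend-config="unlock_address=${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/terraform/state/${TF_STATE_NAME}/lock" \
  -backend-config="username=gitlab-ci-token" \
  -backend-config="password=${CI_JOB_TOKEN}" \
  -backend-config="lock_method=POST" \
  -backend-config="unlock_method=DELETE" \
  -backend-config="retry_wait_min=5"
```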
It continually forgot where it was in the filesystem. This was a much bigger deal than I realized at the time; about a quarter of all filesystem requests failed and had to be redone. Given the API rate limits, that took a real toll on both how long this took and how expensive it was.
It hallucinated a few times; the one that comes to mind was during a security audit, when it kept flagging path traversal attacks. However, as the backend was S3 through the Python3 boto API, these weren’t really exploitable. It never could exploit the vulnerability, but it kept insisting it was there, and I kept the checks in just in case I’d missed something.
The API rate limits are rough. Really rough. If you want to use Claude, do your best to go through Anthropic directly.
The Result
Overall, I’m pretty happy with this experience. It’s really opened my eyes to how powerful agentic AI can be in this space, especially on a task with a known problem space. It was interesting to work on a client’s request while this ran in the background on my personal laptop; it greatly accelerated my productivity. The full task took me about a week of calendar time, around 12-13 hours total. That’s honestly about what it would’ve taken me to do this task myself without any help, but in the meantime I was helping clients, researching, or reading a book - it probably only took 2-3 hours of actual effort on my part. I keep repeating myself, but those rate limits really slowed this down.
For anyone curious about the cost, my AWS Bedrock bill came out to $17.67. Not terrible, but also much higher than I expected. It’s about double what /cost reported in Claude Code, and I have no explanation, since Bedrock’s pricing is roughly the same as Anthropic’s per million tokens. Still, essentially $18 for the experience, plus a file portal so I don’t need to encrypt-then-email sensitive documents, is absolutely worth it.
This experiment reinforced that AI agents can really change how a developer or team works - fast, flexible, but still needing guidance. In a way, that’s the essence of what I’d call Vibe-Driven Development: trusting the AI to take initiative while keeping a steady human hand on the wheel. Assuming you have the right people at the helm, tools like this will continue to be transformative and incredibly helpful.