NetWatchly - From Vibe Coding to Structured AI Development

NetWatchly is a simple website monitoring service. Nothing fancy. But the real story is not the product itself - it is how it was built.

How AI Got Me

I was at QCon London 2026, and most of the talks were great. But I got sick of all the AI hype. AI buzzwords were everywhere, even where they didn't make any sense. For example, I was at a talk about eBPF, which is a really cool technology. The presentation was great, but the only thing related to AI was in the demo. They demonstrated how eBPF can be used for pre- and post-processing of commands, for example one that makes an HTTP request. And instead of processing a simple HTTP call made with curl, they processed the call an AI agent makes to its API. That was "enough" to justify the AI buzzword in the description.

But after a while, I could not stand aside and watch the world change. I had to admit - it is a revolution. So I decided to give AI-driven development a try.

I have spent more than twenty years building software in Java. This time around, I wanted to try something different - to see if I could build a real product with AI doing all the implementation work, at least from a coding perspective. Also, I wanted to explore the magic world of agents, skills, and tools. Previously, I had used AI coding agents for simple tasks, but never for an entire project, and I didn't know how to organize and structure the process. So I decided to experiment and see how it would go.

From the start, I set one rule: I would not manually write application code. Instead, I was going to play the role of a product manager, maybe a team lead - writing specifications and overseeing the implementation process. I might manually test from time to time, but I would not write code myself.

I chose technologies I barely knew: Go for the backend and React/TypeScript for the frontend. There were two reasons for that. First, I wanted to see how far I could get without even looking at the code. I knew that if I chose Java, I would be tempted to jump in and change the code myself. Second, and almost equally important, I had to optimize resource usage, since I had decided to host on Railway on a minimal budget. It seems that the happy days of developers not caring much about resource utilization and memory footprint are coming to an end. GPUs and memory became surprisingly expensive once humanity discovered it could use AI to generate infinite amounts of dancing chicken videos. So I had to choose technologies that are not too resource-hungry. Go and React/TypeScript (served as static assets) seemed like great choices for that.

The Honeymoon Phase: Vibe Coding

The first phase was surprisingly chaotic and extremely productive. Using GitHub Copilot and mostly Claude Sonnet models, I could describe features in a few sentences and receive working implementations within minutes. Monitoring logic, dashboards, authentication, APIs - features appeared faster than I could properly validate them. I was not even instructing the AI to write tests. I had no code checks, no reviews, nothing - just me confirming that the code worked. Those were the happy days, when I was just adding features and not worrying about quality, security, or maintainability. It was exciting to see how quickly I could build something with the help of AI.

The Reality Check: Why Vibe Coding Doesn't Scale

At a certain point, I realized I needed a far more structured approach. The project was growing, the codebase was getting bigger and more complex, and inconsistent implementations started appearing. Some features were implemented with tests, some without. Some had good error handling, some didn't. Some decisions were genuinely wrong. To give one example, one page already loaded 24 hours of monitoring data by default, and the UI only needed additional buttons for zooming into the last 1, 6, or 12 hours. Instead of filtering the already loaded data on the frontend, the implementation introduced separate API endpoints and additional database queries for every time range. The feature worked correctly, but it clearly showed that a technically working solution is not always the right architectural solution. Over time it became obvious that the AI agent needed much more than simple prompts in order to deliver reliable code consistently. A larger and more complex codebase required careful planning, implementation, reviews, testing, and architectural oversight. It also required the AI agent to have deep knowledge of the project itself. In other words, a structured process became a must. Vibe coding was not working well anymore.
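To make the time-range example concrete, here is a minimal sketch of the frontend-only alternative. It is not the actual NetWatchly code: the `Sample` shape, its field names, and the `filterByRange` helper are all assumptions made for illustration. The idea is simply that once 24 hours of data are in memory, a 1, 6, or 12 hour view is a filter over that array, with no extra endpoint or database query.

```typescript
// Hypothetical shape of one monitoring sample; the field names are assumptions,
// not taken from the NetWatchly codebase.
interface Sample {
  timestamp: number; // Unix epoch, in milliseconds
  responseTimeMs: number;
  up: boolean;
}

// The zoom levels mentioned in the text, plus the default 24-hour view.
type RangeHours = 1 | 6 | 12 | 24;

const HOUR_MS = 60 * 60 * 1000;

// Returns the samples that fall within the last `hours` hours, relative to `now`.
// The input array is not modified.
function filterByRange(
  samples: readonly Sample[],
  hours: RangeHours,
  now: number = Date.now(),
): Sample[] {
  const cutoff = now - hours * HOUR_MS;
  return samples.filter((s) => s.timestamp >= cutoff && s.timestamp <= now);
}

// Usage: pretend the page already loaded one sample per hour for the last 24 hours.
const loadedAt = Date.now();
const loaded: Sample[] = Array.from({ length: 24 }, (_, i) => ({
  timestamp: loadedAt - i * HOUR_MS,
  responseTimeMs: 100 + i,
  up: true,
}));

// Clicking the "6h" button only narrows the data that is already in memory.
const lastSixHours = filterByRange(loaded, 6, loadedAt);
console.log(`showing ${lastSixHours.length} of ${loaded.length} samples`);
```

In a React component, a filter like this would typically be recomputed whenever the selected range changes. Whether that is cheap enough depends on how many samples a page holds, which is an assumption here as well.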

Incidentally, around the same time, token usage for the Sonnet model in GitHub Copilot skyrocketed. I managed to burn through my entire monthly budget in several days without realizing it. So I decided to move from GitHub Copilot to Claude Code.

Enforcing Order: My SDD Workflow

I started by documenting the current state of the project: architecture, functional and non-functional requirements, basic coding guidelines, and so on. Then I invested quite some time in writing instructions, skills, and agents. I referenced the documentation I had created so that it became part of what the AI agent is aware of at all times. This effort required a lot of back and forth until I got it right - quite exhausting, but worth it. Finally, I introduced a structured implementation workflow based on Specification-Driven Development (SDD).

Instead of starting with prompts, development started with specifications: feature requirements, acceptance criteria, test plans, and so on. I usually invest way more time in writing specifications than it takes to implement the feature. But it is a good investment, especially on large projects - like writing good documentation, it takes time, but it pays off in the long run.

Because the entire codebase was AI-generated, I introduced strict CI-like static code validation: automated tests, coverage checks, linters, vulnerability scanning, CodeQL analysis, Trivy, and multiple review phases. The more autonomy the AI had, the more important it became to have good gatekeepers.

My implementation workflow now includes careful planning, implementation, code reviews, UI testing, pull request creation, static code validation, review feedback, fixes, and merge approval. As part of the workflow, GitHub Copilot reviews the PR. And that is one of the parts I like the most - watching different AI agents collaborate through pull requests. One model generates changes, another reviews them and points out problems. Then the coding agent analyzes the feedback and makes the necessary adjustments. The entire workflow often looks like human collaboration, and I just supervise the process and jump in only when something looks suspicious.

This phase was quite challenging. I had to understand how Claude Code works, how to write good instructions, and what agents, skills, rules, and memory files are. I had to experiment a lot, try different approaches, and fail many times until I got it right. It was not as exciting as vibe coding, but I learned a lot. Looking back, this was where the real work happened.

TL;DR: How Did I Improve the Process?

I improved my process in several ways:

Documented the project: Provided permanent project context for the AI.
Enforced code quality: Added instructions for coding, reviewing, testing, and security.
Added static validation: Integrated linters, CodeQL, and vulnerability scanners.
Defined an SDD workflow: Structured the pipeline into phases like planning, coding, UI testing, and PR workflows.
Automated code reviews: Used GitHub Copilot to review Claude Code's pull requests.

For more technical details about the process, check out my Tech Story.

What's Next: The New (and Boring) Reality of Software Engineering

After I set up the process, the excitement faded a bit. I got the sense that the magic was gone, replaced by boring reality - looking at the console, overseeing the process, pressing buttons here and there. And I'm afraid that might be the future of software development: writing specifications, overseeing the process, reviewing the code. But I'm bad at predicting the future, so chances are high that I'm wrong. Nobody knows what will happen a few years from now, let alone a decade.

If you are curious about AI-assisted development, explore NetWatchly, send feedback, or suggest ideas. I am still experimenting and improving both the product and development workflows.

Support this AI Experiment

NetWatchly is free and will stay free while the project evolves.

If the tool is useful to you, you can support further development with a donation.

Buy me a beer →