Surviving the Rewrite - Managing Risk and AI Memory Loss in Large-Scale Development

TL;DR: I recently undertook a project that terrifies most engineers: rewriting a massive, critical infrastructure automation tool from scratch. I moved from legacy Bash to Python without writing a single line of code by hand, relying entirely on AI agents. Here is how I managed the risk, the architecture, and the “memory loss” of LLMs to build a production-grade tool.

The Stakes

This wasn’t a simple CRUD app. This tool manages infrastructure for multiple teams. A logic error here doesn’t just throw a stack trace; it could wipe an entire environment or cause immediate customer impact. ...

December 18, 2025 · 5 min · Vignesh Ragupathy

Building AI for Observability with AWS Bedrock

In my previous post, I wrote about closing the last mile of observability with AI. The core idea was simple: we already have plenty of metrics, logs, and traces, but the real challenge is turning them into insights and answers that engineers can act on.

In that post, I highlighted two main gaps:

Connector layer – bridging multiple observability tools like Prometheus, Thanos, Elastic, etc.
Insight layer – going beyond raw queries to provide real context and recommendations.

Now, I’ve been experimenting with AWS Bedrock, and it feels like a natural way to solve both layers. ...

September 4, 2025 · 2 min · Vignesh Ragupathy

Closing the Last Mile of Observability with AI

Over the years, observability has grown in ways I couldn’t have imagined when I first started working in this space. Thanks to OpenTelemetry, we now have a standard way to collect traces, metrics, and logs. Tools like Grafana, Prometheus, Jaeger and Elasticsearch make it easy to store and visualize that data. But here’s the truth I keep coming back to: Even with all the dashboards and alerts, something is still missing. ...

September 1, 2025 · 4 min · Vignesh Ragupathy