I've been given vague tasks before, but “here’s a demo app, good luck” was a new one.
The assignment was simple on the surface: migrate data out of a legacy application. The catch was that there was almost no documentation, no architectural overview, and no explanation of how the data model mapped to real-world actions in the application. All I had was access to the database schema and a working demo of the software.
At first glance, it seemed like the obvious place to start was the schema itself. Open the tables, inspect the columns, follow the foreign keys, and piece together how the system worked. That approach can get you part of the way there. But the problem with guessing from a schema is that tables rarely tell you why the data exists. They tell you how it is stored, not what behavior created it.
In simple systems that gap is manageable. In legacy systems it becomes a serious problem.
As applications grow over time, the likelihood that data ends up exactly where you would expect it becomes very low. Fields get repurposed. Tables accumulate edge cases. Business logic gets scattered across application code, stored procedures, and background jobs. What looks straightforward in a schema can actually represent a surprisingly complicated set of behaviors.
Just staring at tables and columns can get you moving, but once you reach the deeper parts of a legacy application, progress slows down quickly. I realized pretty early that guessing my way through the schema was going to be slow and error-prone.
So I built some tooling to help.
I tend to learn by touching things, breaking them, and watching what happens. Some people are very good at logically stepping through schemas and relationships until the system clicks. My brain works differently. I need to see the full picture. When I click a button in an application and can immediately trace that action to database behavior, the system suddenly makes sense.
That realization led to a simple insight: behavior leaves a trail.
Every action in an application eventually turns into database queries. Creating a record, submitting a form, updating a field, all of it ends up as a series of reads and writes. If you can capture those queries as they happen, you can start mapping user intent to the underlying data model.
In other words, instead of asking "what do these tables mean?", you ask "what queries fire when I do this action?"
At the end of the day, most applications are variations of CRUD systems. The user interface may be complex, but the real behavior eventually shows up at the database layer. That is where the truth lives.
With that in mind, I built a small tool to watch the database while I interacted with the application.
Speed is always the goal in a data migration. The faster you can understand how the system works, the faster you can map the data confidently. The tool was meant to cut down the time spent guessing.
The concept was simple. The program continuously polls the MariaDB general query log and displays queries in real time through a small Tkinter window. My assumption was that queries firing within roughly a second of each other likely belong to the same action, so they get grouped automatically.
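As a sketch of that grouping idea (the tool's actual internals aren't shown here, so the function name and example queries are my own): MariaDB can be told to write every statement to the `mysql.general_log` table, and the poller can then bucket entries by comparing timestamps, starting a new group whenever the gap exceeds about a second.

```python
from datetime import datetime, timedelta

# Enabling the log server-side (done once, requires admin privileges):
#   SET GLOBAL log_output = 'TABLE';
#   SET GLOBAL general_log = 'ON';
# Each statement then lands in mysql.general_log with an event_time,
# which a poller can read on a short interval.

def group_by_gap(entries, gap_seconds=1.0):
    """Bucket (timestamp, sql) pairs: a new group starts whenever the
    time since the previous query exceeds gap_seconds."""
    groups = []
    current = []
    previous = None
    for ts, sql in sorted(entries):
        if previous is not None and (ts - previous).total_seconds() > gap_seconds:
            groups.append(current)
            current = []
        current.append(sql)
        previous = ts
    if current:
        groups.append(current)
    return groups

t0 = datetime(2024, 1, 1, 12, 0, 0)
entries = [
    (t0, "INSERT INTO patients ..."),
    (t0 + timedelta(milliseconds=300), "INSERT INTO audit_log ..."),
    (t0 + timedelta(seconds=5), "UPDATE records ..."),
]
print(group_by_gap(entries))
# the first two queries fall into one group; the later UPDATE gets its own
```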
For anything that falls outside that window but still seems related, I can manually select queries, group them together, give the group a name, and add notes. Screenshots can be attached if necessary. Everything gets stored locally in a JSON file.
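A minimal version of that local store might look like the following. The field names and file name here are illustrative, not necessarily the tool's actual format:

```python
import json
from pathlib import Path

CATALOG = Path("query_catalog.json")

def save_group(name, queries, notes="", screenshots=None):
    """Append a named query group, with optional notes and screenshot
    paths, to a local JSON catalog file."""
    catalog = json.loads(CATALOG.read_text()) if CATALOG.exists() else []
    catalog.append({
        "name": name,
        "queries": queries,
        "notes": notes,
        "screenshots": screenshots or [],
    })
    CATALOG.write_text(json.dumps(catalog, indent=2))

save_group(
    "create_patient",
    ["INSERT INTO patients ...", "INSERT INTO audit_log ..."],
    notes="Fires when the New Patient form is submitted.",
)
```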
The real value is the workflow it creates.
I perform an action in the application and watch the queries appear in real time. Then I save that group of queries as a named reference.
Create a patient. Save those queries.
Submit a form. Save those queries.
Update a record. Save those queries.
Over time, you build a catalog of how the application behaves under the hood. Instead of guessing which tables are involved in an operation, you can see exactly what the system touches. That makes data mapping far more confident and dramatically reduces the amount of guesswork involved in a migration.
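Once groups are saved, even a crude pass over them pays off. For instance, a rough regex (illustrative only, not a real SQL parser, and it will miss dialect corner cases) can list which tables a saved action touches:

```python
import re

def tables_touched(queries):
    """Extract table names that follow FROM/JOIN/INTO/UPDATE keywords.
    A rough heuristic for mapping an action to the tables it hits."""
    pattern = re.compile(r"\b(?:FROM|JOIN|INTO|UPDATE)\s+`?(\w+)`?", re.IGNORECASE)
    found = []
    for sql in queries:
        for name in pattern.findall(sql):
            if name not in found:
                found.append(name)
    return found

group = [
    "INSERT INTO patients (name) VALUES ('...')",
    "UPDATE visit_counts SET total = total + 1",
    "SELECT id FROM audit_log WHERE patient_id = 42",
]
print(tables_touched(group))  # ['patients', 'visit_counts', 'audit_log']
```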
It is also worth mentioning that the tool itself only took a couple of hours to build. Claude handled most of the scaffolding.
Before modern AI tooling, something like this would probably have been a weekend project at minimum, and honestly I may not have built it at all for a single migration. The cost to benefit ratio would have been questionable.
That calculation has changed.
Being able to spin up small, purpose built tools quickly is now a real advantage. Instead of forcing yourself to work around limitations, you can build something tailored to the exact problem in front of you. In this case, that small investment of time made the migration process significantly faster and more reliable.
More importantly, it created a repeatable approach.
Whenever you encounter a black box system with little documentation, you can map behavior instead of guessing structure. Watch what the system does. Capture the queries. Build your understanding from real interactions instead of assumptions.
AI assisted development makes this kind of experimentation easier than it has ever been. We can quickly build tools that help us move closer to the goal and spend our time on the truly difficult parts of the problem, rather than writing boilerplate code that a machine can generate in seconds.