Data migration is one of the highest risk activities in software development. You are taking data that a business depends on, transforming it, and moving it to a new system, often while the old system is still running. Get it right and nobody notices. Get it wrong and you lose records, break workflows, violate compliance requirements, and potentially shut down operations.
We have migrated databases with tens of millions of rows, moved data between incompatible systems, merged data from multiple sources into unified schemas, and done all of it without losing a single record. The process is not glamorous, but it is critical, and the difference between success and disaster comes down to planning.
Why Migrations Fail
Before we cover how to do it right, let us look at why migrations go wrong. The failure modes are predictable.
Inadequate data profiling. Teams start migrating without fully understanding the source data. They discover halfway through that 15% of records have null values in fields that the target system requires. Or that dates are stored in 4 different formats across the dataset. Or that "state" sometimes contains full state names and sometimes contains abbreviations. Every data quality issue you discover during migration instead of before migration adds time and risk.
No validation framework. If you cannot prove that the migrated data matches the source data, you do not know if the migration succeeded. "It looks right" is not validation. You need automated, row level verification that accounts for every record and every field transformation.
Missing rollback plan. If the migration fails at step 7 of 10, what happens? Can you reverse it? How long does the rollback take? Is the source system still intact? Teams that do not plan for failure end up in a state where both the old and new systems contain partial data and neither is reliable.
Underestimating transformation complexity. Moving data from one system to a structurally identical system is straightforward. That almost never happens. Real migrations involve schema changes, data type conversions, business rule changes, record merging, deduplication, and mapping between different data models. Each transformation is a potential point of failure.
Not accounting for data created during migration. If your migration takes 6 hours and the source system is still accepting new data during that window, what happens to the records created after your migration snapshot? This is the "delta problem," and ignoring it means data loss.
Phase 1: Discovery and Profiling
Every successful migration starts with understanding what you are actually moving. This phase typically takes 1 to 2 weeks for a medium complexity migration, and it is time well spent.
Profile the source data comprehensively. For every table and every field, document: the data type, the range of values, the percentage of null or empty values, the distribution of values, and any patterns or anomalies. Run queries like `SELECT COUNT(*), COUNT(DISTINCT field), MIN(field), MAX(field), COUNT(*) FILTER (WHERE field IS NULL)` across every column. This gives you a statistical fingerprint of your data.
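A profiling pass like the one described above can be wrapped in a small helper and run against every column. This is a minimal sketch using Python's built-in `sqlite3` against an in-memory database; the `customers` table and its messy `state` column are invented for illustration, and the `FILTER` clause from the query above is written as a `SUM(CASE ...)` so it runs on older SQLite versions too.

```python
import sqlite3

def profile_column(conn, table, column):
    """Return a statistical fingerprint for one column: total rows,
    distinct values, min/max, and null count."""
    row = conn.execute(
        f"SELECT COUNT(*), COUNT(DISTINCT {column}), MIN({column}), "
        f"MAX({column}), SUM(CASE WHEN {column} IS NULL THEN 1 ELSE 0 END) "
        f"FROM {table}"
    ).fetchone()
    return {"rows": row[0], "distinct": row[1], "min": row[2],
            "max": row[3], "nulls": row[4]}

# Demo against an in-memory table with a deliberately messy column:
# full state names mixed with abbreviations, plus a null.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, state TEXT)")
conn.executemany("INSERT INTO customers VALUES (?, ?)",
                 [(1, "CA"), (2, "California"), (3, None), (4, "NY")])
print(profile_column(conn, "customers", "state"))
```

Running this across every column, and saving the output, gives you the statistical fingerprint to compare against after migration.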
Document relationships and dependencies. Foreign keys are the obvious ones, but many systems have implicit relationships: records linked by matching email addresses or business identifiers rather than formal foreign key constraints. Map every relationship, explicit and implicit. If you migrate table A without its related records in table B, you have broken referential integrity.
Identify data quality issues early. Duplicates, orphaned records, inconsistent formats, values that violate business rules in the new system: all of these need to be cataloged and addressed before migration. Do you clean the data in the source system, during migration, or in the target system? Each approach has trade-offs, and the decision needs to be deliberate.
Establish record counts as your baseline. Before migration, you must know exactly how many records exist in every table. After migration, you compare. If the counts do not match and you cannot account for every difference, the migration is not complete. This is your most basic validation checkpoint, and we have seen teams skip it.
Phase 2: Design the Migration Strategy
With a clear picture of the source data, you can design the migration approach.
Choose your migration pattern. The three primary patterns are:
1. Big bang migration: Everything moves at once during a scheduled downtime window. Simpler to execute but requires downtime and leaves no room for error during the cutover.
2. Trickle migration (phased): Data moves in batches over days or weeks while both systems run. More complex but reduces risk and minimizes downtime.
3. Parallel run: Both systems operate simultaneously with data synchronized between them for a validation period before cutting over to the new system. Most expensive but lowest risk for mission critical data.
For most applications we build, we use a hybrid approach: migrate historical data in batches beforehand, then run a final delta migration during a short maintenance window. This minimizes downtime while keeping the cutover simple.
Design the transformation layer. Every field mapping between source and target needs to be documented and coded. Create explicit transformation functions for each mapping: date format conversions, enum value changes, name parsing, address normalization, currency conversions. Each function should be unit tested independently. A solid database schema design in the target system makes these transformations much cleaner.
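As a sketch of what those transformation functions might look like, here are two of the mappings named above written as small pure functions that can each be unit tested in isolation. The source date formats and the `STATE_ABBREV` table are hypothetical examples, not a complete mapping; in a real migration they would come straight out of your profiling results.

```python
from datetime import datetime

def convert_date(value: str) -> str:
    """Normalize the source date formats found during profiling to
    ISO 8601. Raise on anything unexpected rather than guessing."""
    for fmt in ("%m/%d/%Y", "%Y-%m-%d", "%d-%b-%Y"):
        try:
            return datetime.strptime(value, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    raise ValueError(f"Unrecognized date format: {value!r}")

# Partial map for illustration only; a real one covers every value
# the profiling phase actually found in the source data.
STATE_ABBREV = {"california": "CA", "new york": "NY", "texas": "TX"}

def normalize_state(value: str) -> str:
    """Map full state names to abbreviations; pass abbreviations through."""
    cleaned = value.strip()
    if len(cleaned) == 2:
        return cleaned.upper()
    return STATE_ABBREV[cleaned.lower()]
```

Because each function is deterministic and side effect free, a failing unit test points at exactly one mapping rather than somewhere inside a monolithic migration script.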
Plan for the delta. If you cannot freeze the source system during migration, you need a strategy for capturing changes. Common approaches include: change data capture (CDC) using database triggers or log tailing, timestamp based incremental sync (migrate records modified after the initial snapshot), or a short freeze window where you block writes, run the delta migration, and then cut over.
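The timestamp based approach can be sketched in a few lines. This example assumes an `updated_at` column on the source table and a unique `id` in the target; the `users` table, its columns, and the SQLite `INSERT OR REPLACE` upsert are illustrative stand-ins for whatever your systems actually provide.

```python
import sqlite3

def sync_delta(source, target, table, since):
    """Copy rows modified after the snapshot timestamp: a minimal
    sketch of timestamp based incremental sync."""
    rows = source.execute(
        f"SELECT id, name, updated_at FROM {table} WHERE updated_at > ?",
        (since,),
    ).fetchall()
    # INSERT OR REPLACE upserts on the primary key, so rows changed
    # since the snapshot overwrite their stale copies in the target.
    target.executemany(f"INSERT OR REPLACE INTO {table} VALUES (?, ?, ?)", rows)
    return len(rows)

source = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")
for db in (source, target):
    db.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, updated_at TEXT)"
    )
source.executemany("INSERT INTO users VALUES (?, ?, ?)",
                   [(1, "Ada", "2024-01-01"), (2, "Bo", "2024-01-05")])
target.execute("INSERT INTO users VALUES (1, 'Ada', '2024-01-01')")

# The initial snapshot was taken 2024-01-02; only row 2 changed after it.
moved = sync_delta(source, target, "users", "2024-01-02")
```

Note the caveat that makes CDC attractive for busy systems: a timestamp filter misses hard deletes, so if the source allows deletion you need tombstones or log based capture instead.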
Phase 3: Build and Test
This is where the migration pipeline gets built, tested, and validated repeatedly before it touches production data.
Build the migration pipeline as code. Your migration should be a repeatable, automated process, not a series of manual SQL scripts run by hand. Use a scripting language (Python, Node.js, or even well structured SQL scripts) that can be run, rerun, and audited. Version control everything.
Create a validation suite. At minimum, validate: total record counts per table, checksums or hash values for critical fields, referential integrity between related tables, business rule compliance in the target system (no required fields left null, no values outside expected ranges), and specific spot checks on known complex records.
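The first two checks in that list, record counts and field checksums, can be combined into one fingerprint per table. This is a sketch, not a full validation suite: it assumes the first column gives a stable sort order in both systems, and the `orders` table is invented for the demo.

```python
import hashlib
import sqlite3

def table_fingerprint(conn, table, columns):
    """Row count plus a SHA-256 digest over every field, in primary
    key order, so source and target can be compared exactly."""
    digest = hashlib.sha256()
    count = 0
    query = f"SELECT {', '.join(columns)} FROM {table} ORDER BY {columns[0]}"
    for row in conn.execute(query):
        digest.update(repr(row).encode("utf-8"))
        count += 1
    return count, digest.hexdigest()

source = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")
for db in (source, target):
    db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
    db.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 9.5), (2, 20.0)])

# Identical data produces identical fingerprints; corrupt a single
# field in the target and the digests diverge even though counts match.
target.execute("UPDATE orders SET total = 19.99 WHERE id = 2")
```

Matching counts with mismatched digests is exactly the failure that "it looks right" validation never catches: every record arrived, but not every field survived the transformation.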
Run the full migration on a staging copy at least 3 times. The first run reveals mapping errors and data quality issues you missed. The second run validates your fixes. The third run confirms timing and gives you a realistic execution time estimate. Three runs is the minimum. Five is better for high risk migrations.
Document the execution runbook. Step by step instructions for the production migration, including: pre migration checks, migration execution commands, validation queries, rollback procedures, and sign off criteria. The runbook should be detailed enough that someone who did not build the pipeline can execute it. This aligns with the kind of operational discipline we bring to ongoing system management.
Phase 4: Execute and Validate
Production migration day follows the runbook precisely.
Take a full backup of the target system before starting. This is your rollback safety net. Verify the backup is restorable. A backup you have not tested is not a backup, it is a hope. We covered backup strategies in detail in our database backup strategy guide.
Run pre migration validation. Confirm source record counts, verify the target system is clean, confirm the backup completed, and verify that all stakeholders are aware of the migration window.
Execute the migration and monitor actively. Watch for errors, track progress against expected timing, and have the rollback procedure ready. If something unexpected happens in the first 10% of the migration, stop and evaluate rather than pushing through.
Run post migration validation immediately. Record counts match? Checksums match? Referential integrity intact? Business rules satisfied? Critical records spot checked? Only after validation passes should you declare the migration complete.
Monitor the target system for 48 to 72 hours after cutover. Some issues only surface when real users interact with the migrated data: edge cases that your validation did not cover, performance issues from different data distributions, or application bugs triggered by data format differences.
Common Pitfalls to Avoid
Do not migrate directly in production without staging rehearsals. This is the most dangerous shortcut and we have seen it go wrong catastrophically.
Do not rely on the application layer for data integrity. Validate at the database level. Applications have bugs. Database constraints do not lie.
Do not ignore character encoding. UTF-8, Latin-1, and Windows-1252 all look similar until they do not. Special characters, accented names, and emoji will break if encoding is not handled explicitly.
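The danger is that a wrong codec rarely raises an error; it just silently corrupts the data. A two-line Python illustration:

```python
# The same bytes decode without error under the wrong codec and
# silently corrupt the value; nothing raises, the data just rots.
raw = "José".encode("utf-8")              # b'Jos\xc3\xa9'

assert raw.decode("utf-8") == "José"      # correct
assert raw.decode("latin-1") == "JosÃ©"   # mojibake, but no exception
```

This is why encoding checks belong in the validation suite: a record count will not catch it, but a field checksum will.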
Do not forget about generated IDs. If the source system uses auto increment IDs and the target system does too, you will have ID conflicts. Either map old IDs to new IDs with a lookup table or use a separate identifier namespace.
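The lookup table approach can be sketched as follows. Everything here is illustrative: `insert_row` stands in for whatever insert call your target system exposes, and the fake auto-increment counter simulates the target assigning fresh IDs.

```python
def migrate_with_id_map(source_rows, insert_row):
    """Insert each source row, let the target assign a fresh ID, and
    record old -> new so foreign keys can be rewritten afterwards."""
    id_map = {}
    for row in source_rows:
        id_map[row["id"]] = insert_row(row)
    return id_map

# Fake target whose auto-increment sequence happens to start at 100.
next_ids = iter(range(100, 200))
id_map = migrate_with_id_map(
    [{"id": 7, "name": "Ada"}, {"id": 9, "name": "Bo"}],
    lambda row: next(next_ids),
)

# Rewrite a child table's foreign keys through the lookup table so
# references follow the records to their new identifiers.
orders = [{"order_id": 1, "customer_id": 9}]
for order in orders:
    order["customer_id"] = id_map[order["customer_id"]]
```

Keep the mapping table around after cutover; it is invaluable for tracing a target record back to its source during the post-migration monitoring window.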
Do not skip the rollback test. Practice the rollback at least once on staging. Knowing you can undo the migration gives you the confidence to proceed and the safety net if you need it.
Data migration is methodical, detail oriented work. It is not exciting, but it is the kind of engineering discipline that separates professional development teams from freelancers who wing it. If you are planning a migration and want a team that treats your data with the seriousness it deserves, reach out to us. We do not lose records.