At Loadsmart, Elasticsearch is a mission-critical resource that powers search functionality across multiple teams and applications. From helping shippers find available loads to enabling internal teams to query operational data efficiently, it plays a central role in delivering fast, reliable access to the data our business depends on.
This post walks you through our migration from AWS Elasticsearch to Elastic Cloud: the “why”, the challenges, the wins, and the lessons we wish we’d known earlier.
Back in 2021, Elastic relicensed its software from Apache 2.0 to a dual license: the Server Side Public License (SSPL) and the Elastic License. This moved a significant software suite, including Elasticsearch and Kibana, to a more restrictive license model.
AWS responded by forking the codebase and launching OpenSearch.
That put Loadsmart and many other companies at a crossroads:
We chose the third option, Elastic Cloud, and here’s why:
After extensive research and careful review, Elastic Cloud checked the right boxes:
✅ Compatibility with our existing Python clients and APIs.
✅ Access to newer versions offering security patches, bug fixes, and performance improvements.
✅ A fully managed service maintained by the core developers of Elasticsearch.
✅ Public benchmarks show a clear and significant performance gap between Elasticsearch and OpenSearch, particularly at scale.
✅ OpenSearch only seemed cheaper because we ran it on a legacy Elasticsearch engine. Moving to the official fork would roughly double resource needs: an 8 GB Elastic Cloud node would need about 16 GB on OpenSearch, erasing any savings, as Elastic Cloud’s pricing calculator confirms.
This cross-team effort between the Platform team and Product Engineering teams was centered on migrating production and staging Elasticsearch clusters without disrupting users. We deliberately excluded major refactoring, OpenSearch support, or early adoption of Elasticsearch 8, choosing instead to focus on a safe, incremental shift.
The Platform team engaged with the Product Engineering teams, set shared expectations, and kept everyone informed via Slack updates and weekly check-ins. Clear success metrics helped us track progress across environments.
The migration strategy emphasized automation and rollback capabilities. We used Elastic Curator to take hourly snapshots and built robust Python workflows for restoration, validation, and data consistency checks.
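To make the restore step concrete, here is a minimal sketch of how a workflow might pick the snapshot to restore from. It is not our actual tooling: the function names and the one-hour freshness threshold are illustrative assumptions, and the input is assumed to be a list of snapshot entries shaped like the Elasticsearch get-snapshot API response (`snapshot`, `state`, `end_time_in_millis`).

```python
from typing import Any, Dict, List, Optional

Snapshot = Dict[str, Any]


def latest_successful_snapshot(snapshots: List[Snapshot]) -> Optional[Snapshot]:
    """Return the most recently finished snapshot whose state is SUCCESS."""
    successful = [s for s in snapshots if s.get("state") == "SUCCESS"]
    # default=None covers the case where no snapshot completed successfully.
    return max(successful, key=lambda s: s["end_time_in_millis"], default=None)


def is_fresh(snapshot: Snapshot, now_millis: int, max_age_minutes: int = 60) -> bool:
    """Check that a snapshot finished within the allowed age window.

    The 60-minute default mirrors an hourly snapshot schedule; it is an
    assumption for this sketch, not a documented setting.
    """
    age_millis = now_millis - snapshot["end_time_in_millis"]
    return 0 <= age_millis <= max_age_minutes * 60 * 1000
```

A restore routine could refuse to proceed when `latest_successful_snapshot` returns `None` or when `is_fresh` is false, which keeps a stale or partial snapshot from silently becoming the rollback point.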
Instead of manual checks, we built automated validations to compare document counts, field mappings, and shard layouts across clusters. Although Elasticsearch snapshots already ensure consistency, we added these extra controls to catch any writes that might slip in during the cutover. If anything was off, our pre-migration snapshots were ready for a rollback, a safety net we didn’t end up needing, but one that gave us confidence throughout the process.
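A comparison like this can be sketched in a few lines of Python. The example below is an illustration, not our production validator: it assumes each cluster has already been summarized into a plain dictionary keyed by index name, with hypothetical `doc_count`, `mappings`, and `primary_shards` entries gathered from the cluster beforehand.

```python
from typing import Any, Dict, List

# Hypothetical summary shape:
# {index_name: {"doc_count": int, "mappings": dict, "primary_shards": int}}
ClusterSummary = Dict[str, Dict[str, Any]]

CHECKED_KEYS = ("doc_count", "mappings", "primary_shards")


def compare_clusters(source: ClusterSummary, target: ClusterSummary) -> List[str]:
    """Return a list of human-readable mismatches; empty means the clusters agree."""
    problems: List[str] = []
    for index in sorted(set(source) | set(target)):
        if index not in target:
            problems.append(f"{index}: missing on target")
            continue
        if index not in source:
            problems.append(f"{index}: unexpected on target")
            continue
        for key in CHECKED_KEYS:
            expected = source[index].get(key)
            actual = target[index].get(key)
            if expected != actual:
                problems.append(f"{index}: {key} differs ({expected!r} != {actual!r})")
    return problems
```

Returning a list of mismatches rather than a single boolean makes it easier to tell a late write (a document count off by a few) apart from a structural problem such as a missing index or a changed mapping.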
We strictly followed a playbook:
The queue grew briefly as the system continued reading from the legacy cluster. Once the secret was updated to point to the new cluster, writes seamlessly shifted over. No downtime, no issues. This smooth transition was possible because the application uses an asynchronous routine to write to the cluster, allowing it to buffer and retry without blocking the main flow.
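The buffer-and-retry behavior described above can be illustrated with a small `asyncio` sketch. This is a simplified stand-in, not the application's real writer: `drain`, the `send` callback, and the retry parameters are all hypothetical names chosen for the example, and `ConnectionError` stands in for whatever error the real client raises when the cluster is unreachable.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List

Doc = Dict[str, object]


async def drain(
    queue: "asyncio.Queue[Doc]",
    send: Callable[[Doc], Awaitable[None]],
    retry_delay: float = 0.5,
    max_attempts: int = 5,
) -> List[Doc]:
    """Send every queued document, retrying on connection errors.

    Returns the documents that still failed after max_attempts tries.
    """
    failed: List[Doc] = []
    while not queue.empty():
        doc = queue.get_nowait()
        for attempt in range(1, max_attempts + 1):
            try:
                await send(doc)
                break
            except ConnectionError:
                if attempt == max_attempts:
                    failed.append(doc)
                else:
                    # Back off a little longer after each failed attempt.
                    await asyncio.sleep(retry_delay * attempt)
        queue.task_done()
    return failed
```

Because writes wait in the queue instead of blocking the request path, a short window in which the target cluster is unavailable (for example, while a secret is being rotated) only shows up as a temporarily longer queue.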
After migrating the search service to the new Elastic Cloud cluster, we saw immediate and measurable improvements. Most notably, the p50 latency (median response time) for one of our highest-traffic endpoints dropped dramatically.
As shown in the chart below, before the migration on March 20, the p50 latency typically hovered around 90 ms. After the migration, and especially following the resolution of an audit log issue in early April, latency dropped significantly to around 40 ms, with reduced variability and improved stability.
This performance gain is a direct result of the improved infrastructure and optimized query execution on the new cluster. The migration not only ensured continuity with no downtime but also delivered a tangible improvement in request performance.
But the benefits go beyond just speed:
Migrating to Elastic Cloud was more than a version upgrade; it was a strategic move forward in stability, performance, and operational simplicity. By moving off AWS Elasticsearch and onto Elastic’s managed service, we positioned ourselves to scale smarter, innovate faster, and operate more securely.
Throughout the migration, we focused on automation, cross-team coordination, and risk mitigation. The results? Zero downtime, significantly improved latency, and a future-proofed foundation for more advanced use cases like vector search and real-time observability.
We are excited to take advantage of the newest features. Just as important, we’re carrying forward the lessons from this migration, which will shape how we approach future projects, system upgrades, and architectural evolution.
The migration highlighted several best practices and lessons that proved critical to its success:
If you’ve faced similar (or different) challenges when choosing among Elasticsearch options, we'd love to hear from you and share knowledge.
Curator and index lifecycle management | Elastic docs