Like many fast-growing engineering organizations, our microservices architecture evolved organically over the years. What started as a deliberate move away from a monolith to enable team autonomy and faster deployments had grown into a sprawling ecosystem of services.
Several factors prompted us to take action:
The issue wasn't about having too many services, but rather which ones we could safely consolidate or eliminate.
Rather than relying on intuition or anecdotal evidence, we developed a data-driven scoring system to evaluate each service objectively. Our primary goal was to establish an initial filter using a "decommissioning probability score" to help us determine which services to address first.
We collected three categories of metrics for each service over the last year (2024):
Usage metrics
Cost Metrics
Maintenance Metrics
Other metrics could have been used as well, such as the number of deployments, the number of incidents, and the percentage of out-of-date dependencies; however, we decided to stick with the list above, as it is better suited to our context.
Before applying our scoring formula, we normalized all raw metric values to a 0-1 interval to ensure fair comparison across vastly different scales.
We used min-max normalization across our entire service portfolio: normalized_value = (value - min_value) / (max_value - min_value).
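The min-max step can be sketched in a few lines of Python (the example values are illustrative, not our actual data):

```python
def min_max_normalize(values):
    """Scale a list of raw metric values to the 0-1 interval."""
    lo, hi = min(values), max(values)
    if hi == lo:  # all services share the same value; treat as neutral
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Example: total yearly cost (USD) for four hypothetical services
costs = [1200, 300, 4800, 300]
print(min_max_normalize(costs))  # → [0.2, 0.0, 1.0, 0.0]
```

Note that min-max normalization is sensitive to outliers: a single very expensive service compresses everything else toward zero, which is worth keeping in mind when interpreting the scores.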
However, these metrics had opposite relationships to decommissioning probability. For Total Cost, higher values directly indicated candidates for removal: expensive services with low returns were prime targets. For the Usage and Maintenance metrics, the logic was inverted: higher values indicated a healthy, actively used service that should not be decommissioned. Therefore, we applied 1 - normalized_value to these three metrics, ensuring that low activity translated into high decommissioning scores.

This inversion was critical: a service with minimal traffic and few code changes should score high for removal, while a high-traffic, actively maintained service should score low.
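Given a metric already normalized to 0-1, the inversion itself is a one-liner:

```python
def invert(normalized):
    """Flip a 0-1 normalized metric so that low activity yields a high
    decommissioning contribution and high activity yields a low one."""
    return [1.0 - v for v in normalized]

# A service at 25% of the portfolio's peak traffic contributes 0.75,
# the busiest service contributes 0.0, and an idle one contributes 1.0
print(invert([0.25, 1.0, 0.0]))  # → [0.75, 0.0, 1.0]
```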
We then applied the following score for each metric:
We combined all costs into a single metric because our primary focus was service usage rather than cost reduction.
Finally, we applied the following decommissioning score formula for each service:
Decommissioning Score = (0.30 × Total Cost) + (0.20 × # PRs merged) + (0.30 × # of web requests received) + (0.20 × # of messages processed)

The resulting score is scaled to a 0-100 range. We defined a score greater than 80 as indicating a high likelihood that the service can be decommissioned. A score greater than 50 suggests that further investigation is warranted, while scores below that threshold are not considered significant.
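Putting the pieces together, here is a sketch of the full score. The weights come from the formula above; the dictionary keys, the helper name, and the 0-100 scaling (assumed so that the weighted sum of 0-1 values matches the 80/50 thresholds) are our own illustrative choices:

```python
# Weights from the formula; inverted metrics reward LOW activity
WEIGHTS = {
    "total_cost": 0.30,    # higher cost → higher score (not inverted)
    "prs_merged": 0.20,    # inverted
    "web_requests": 0.30,  # inverted
    "messages": 0.20,      # inverted
}
INVERTED = {"prs_merged", "web_requests", "messages"}

def decommissioning_score(normalized_metrics):
    """normalized_metrics maps metric name -> 0-1 normalized value.
    Returns a score on a 0-100 scale (higher = stronger candidate)."""
    total = 0.0
    for name, weight in WEIGHTS.items():
        value = normalized_metrics[name]
        if name in INVERTED:
            value = 1.0 - value
        total += weight * value
    return total * 100

# A costly service with little traffic or maintenance activity:
# 0.3*0.9 + 0.2*0.9 + 0.3*0.95 + 0.2*1.0 = 0.935 → 93.5, above the 80 cutoff
score = decommissioning_score(
    {"total_cost": 0.9, "prs_merged": 0.1, "web_requests": 0.05, "messages": 0.0}
)
print(score >= 80)  # → True
```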
The scoring system flagged 8% of our services as highly likely decommissioning candidates, with another 44% warranting further investigation.
Even after applying the initial score as a filter, a critical analysis was still lacking: product features in those services. Is the feature that the service is supposed to deliver still in use? Is it still relevant for our customers? Do we have any plans to leverage it in the future?
We engaged in various research activities to collect insights from Product Managers and stakeholders. We also conducted and documented a thorough technical assessment of each service. This process eliminated several more candidates, leaving 16 out of 45 services identified for decommissioning.
We implemented the following strategy to decommission the remaining services:
For deprecated services:
We have decommissioned 12 out of 44 services, with 4 remaining to be decommissioned later. This results in a 29% reduction in services for one team and a 37% reduction for another.
In terms of savings, we estimated the following costs:
The biggest takeaway: architecture reviews should be a regular, scheduled practice - not something we do when complexity becomes painful.
It's tempting to look back and label the creation of these services as "over-engineering." That would be incorrect and unfair to the engineers who made those decisions.
When these services were created, they addressed real problems:
The lesson: Good architectural decisions can become wrong architectural decisions as context changes. This isn't failure — it's evolution.
Software architecture isn't "done". It requires ongoing attention and optimization, just like code refactoring. Without this project, our complexity would have continued growing linearly while our ability to manage it grew sub-linearly — a recipe for future technical debt and reduced competitiveness.
We learned that:
This project was just the first step. We plan to decommission the remaining four services, evolve this work, and make it a regular part of our engineering culture.
Reducing our microservices complexity was more than a cost-saving exercise — it was a strategic investment in our engineering organization's future effectiveness. By approaching the problem systematically with data-driven scoring, careful validation, and phased execution, we reduced complexity while maintaining system reliability.
The most important lesson? Architecture, like code, requires continuous refactoring. The services we decommissioned weren't mistakes — they were correct decisions that had outlived their usefulness. Recognizing when to evolve or eliminate architectural patterns is just as important as knowing when to introduce them.
Have you gone through a similar architecture consolidation project? What metrics did you find most valuable? I'd love to hear about your experiences in the comments.
Like to solve challenges like this one? We have many open positions at the moment. Check out our engineering culture and the careers page.