Breaking Vendor Lock-In in a Live Production System

A real-world migration off a black-box mobility platform.

Context

I joined an early-stage, VC-backed tech mobility company as its first technical leader.

The company operated daily transportation services using a third-party Transportation Management System (TMS). This system was white-labeled at the interface level only.

The provider:

did not provide access to source code,
did not provide access to databases,
did not provide infrastructure control,
did not provide technical documentation,
did not allow customization beyond standard client configuration.

The company interacted with the platform only through the same interfaces available to any external client.

Internally, the company managed routes, drivers, customer support, and operations, but did not own or control the software systems enabling those activities.

The vendor system was therefore:

operationally usable,
technically opaque,
and structurally rigid with respect to the company’s specific use cases.

At the time I joined, the organization had no internal technical DNA. The focus was entirely operational and business-driven, with no existing engineering culture, internal systems ownership, or long-term technical roadmap.

The company was positioning itself as a growth-oriented technology platform, with the objective of owning its technology as a differentiating asset over time.

Scale at Entry

At the beginning of the transition:

~10 bus lines
~100 daily riders

By the end of the migration:

100+ bus lines
~1,000 daily riders

The system scaled during the in-housing effort.

Constraints

Live production system with daily bookings
No acceptable downtime
No acceptable data loss
No access to vendor source code, databases, or infrastructure
No vendor-side customization or documentation
No internal backend or infrastructure to extend
Engineering team hired from scratch:
- 5 engineers total
- new country, new hiring market
- budget constrained to one senior engineer initially
No dedicated infrastructure or operations engineering roles

The system could not pause. The organization could not absorb shock.

Objective

Progressively internalize the core technical stack while maintaining uninterrupted operations.

The effort was treated as sequenced in-housing, not a rewrite.

Execution Overview

The transition unfolded over ~18 months through incremental steps, each designed to reduce dependency without increasing operational risk.

Phase 1 — System, Business, and Landscape Understanding

The first phase focused on building a complete understanding of the environment before making structural changes.

This included:

analyzing how the third-party TMS was used in practice,
identifying rigidity points caused by the lack of customization,
understanding the business model and go-to-market strategy,
reviewing fundraising materials and positioning,
studying the competitive landscape to understand:
what comparable players owned internally,
what they outsourced,
and how platform ownership evolved as they scaled.

Because no code, data, or documentation was accessible from the provider, this analysis was necessarily external and behavioral, based on:

system outputs,
operational workflows,
and observed constraints.

Based on this analysis, the decision to progressively in-house the platform was made and sequenced.

No development occurred during this phase.

Phase 2 — Owning the Rider Application

The first system to be rebuilt internally was the rider mobile application.

Although the backend logic and data remained external at this stage, this was the first part of the stack to become fully owned by the company.

This served two purposes:

establishing internal ownership over a production-critical surface,
creating early internal momentum by demonstrating that ownership was possible without disrupting operations.

The application remained compatible with the external provider interfaces (API endpoints).

Phase 3 — Internal Authority Over Identity, Users, and Access

Internal backend services were introduced for:

authentication,
authorization,
user management,
booking creation.

From this point:

Users were created internally first, then propagated to the external TMS through its API.

Authentication and authorization decisions were made internally first, then validated again on the provider side to satisfy legacy execution constraints.

Bookings followed the same pattern:

created internally,
then created on the provider system for execution.

Internal systems replicated:

the data layer,
and the business logic inferred from observed behavior of the external platform.

The provider system continued to execute operational flows, but internal systems became authoritative for identity, user state, and access control.

Payments, wallets and promotion logic were subsequently internalized.

This resulted in:

internal authority over identity, users, access control, and pricing,
mirrored state and logic across internal and external systems,
progressive reduction of dependency without disrupting operations.

Phase 4 — Progressive Internal Adoption

Internal capabilities were exposed to teams incrementally using lightweight internal tooling.

Adoption followed a staged sequence:

customer support,
operations,
broader internal teams.

This minimized friction and avoided forcing workflow changes prematurely.

Custom back-office interfaces were introduced only after internal usage patterns stabilized.

Phase 5 — Parallel Operations for Core Operational Modules

Operational modules were migrated using parallel execution.

This included:

stops,
line and trip management,
supply and supplier management,
driver-related operational logic.

Two systems ran simultaneously:

the internal system,
the external TMS.

Routing rules determined which system handled a given request based on context and operation type.

Modules were migrated only once they could be unplugged without fallback.

Phase 6 — Driver Application In-Housing

The driver application was rebuilt after identity, authorization, booking, payment, and operational logic were already internalized.

This removed the final externally dependent surface of the platform.

Result

After ~18 months:

full ownership of rider and driver applications,
full ownership of identity, authorization, booking, payment, and operational logic,
dependency on the external TMS removed progressively,
zero downtime throughout the transition,
zero data loss,
operations scaled from ~10 to 100+ bus lines and from ~100 to ~1,000 daily riders without increasing operational cost.

The system evolved from reliance on an opaque third-party platform to an internally owned, interoperable ecosystem while remaining continuously operational.

Closing Note

This case illustrates how technical sovereignty can be established incrementally under live operational constraints even when no vendor code, data, or documentation is accessible, through careful sequencing, parallel systems, and controlled exposure.