How Uber handles millions of ride/food requests efficiently part 1

Meg's tech corner
The Startup
Published in
6 min readOct 25, 2021

--

A detailed introduction to Uber’s fulfillment architecture

Photo by Priscilla Du Preez on Unsplash

As described in [1], fulfillment service is to “capture a consumer’s intent and fulfill it by matching it with the right set of providers”. For example, one possible customers’ intent is to travel from one location to another location and providers are the available drivers close to the customer. The ultimate goal of the fulfillment service is to efficiently find available nearby drivers for users.

In this series of two blogs, we will look deep into how Uber architectures its fulfillment service and scales it as number of users grow. In this blog, we will look at the data modeling and architecture of the previous generation of the fulfillment service whereas in the next blog we will talk about why and how Uber migrated the fulfillment service to Google’s cloud Spanner as its users grow.

Second part of the blog can be found here: https://medium.com/nerd-for-tech/how-uber-handles-millions-of-ride-food-requests-efficiently-part-2-270f84d2c3c0

High level overview

High level architecture, source: [1]

The figure above shows the high level design of the previous generation of Uber’s fulfillment service. At its core are two data models: supply entity and trip entity. Trip entity captures a unit of work, such as traveling from one place to another place while the supply entity describes a person who can fulfill the work.

Trip entity

Trip entity is modeled as a list of waypoints where a waypoint consists of a location and a set of tasks we can perform at the location. Below is an example definition of the Trip Entity.

Disclaimer: the code in this blog is how I would implement Uber’s fulfillment services based on information in [1] and [2]. I don’t work for Uber and I don’t know how it is actually implemented.

enum Task {
PICK_UP
DROP_OFF
}
class Location {
long longitude;
long latitude;
}
class WayPoint {
Location location;
Task task;
}
class Trip {
List<WayPoint> points;
}
// A simple trip can contain two WayPoints, one with PICK_UP task
// and one with DROP_OFF task.

Supply entity

Supply entity represents a list of tasks the driver needs to fulfill. For example, if a driver is driving a customer to their destination and has accepted a request to pick up a new customer, the supply entity for the driver will have two WayPoints, one for DROP_OFF the current customer and one for PICK_UP the new customer.

Trip and supply services

Trip and supply entity are managed by trip service and supply service correspondingly. Two services are share-nothing micro-services and synchronization of data happens at the storage level. Check the infrastructure section below for more information.

Geospatial index

Geospatial index stores the location information of the drivers and customers. It is used to match a customer with the drivers, that is, to find nearby available drivers for users. Geospatial index is at the core of the matching process and therefore its performance is critical.

For Uber, a geolocation encoding called H3 [2] is used. As shown in the figure below, the whole world is divided into a list of hexagons, called cells. Each hexagon cell is assigned a unique identifier, in the form of a string.

H3 divides the map into hexagon cells, source: [2]

H3 library provides functions to efficiently convert a location (longitude and latitude) to H3 cell identifier and convert the H3 cell identifier back to the location. [3]. Below is an example invocation of the H3 library.

function example() {
const lat = 37.7955;
const lng = -122.3937;
const res = 10;
return h3.geoToH3(lat, lng, res);
}
// output: "8a283082a677fff"

With geospatial index, we don’t need any special database support to store or retrieve data. Therefore any database can satisfy the functional requirement, (but not necessarily the performance!). If we use a relational database, we could define the following schema to store the driver information. location field stores the h3 cell identifier.

create table driver (
driver_id INT NOT NULL,
given_name VARCHAR(100),
surname VARCHAR(100),
location VARCHAR(30),
available BOOLEAN,
...
);
create index driver_by_location_and_availability on table driver (location, available);

When the location of the driver is updated, the supply entity service implementation can be as simple as a database update statement to update the location.

Matching

Matching is the process to find available drivers for a customer. With geospatial encoding, the matching processes only requires two steps.

Step 1: find cells of interest

Figure on the right shows the user and the region of interest

Suppose the user is on the Market street and we want to search all the available drivers in the region enclosed by the red circle. What we will do is first call the h3.geoToH3() function to get the cell identifier of the user’s location. Then we call h3.kRing() to find the identifiers of the 6 hexagons that are adjacent to the cell containing the user. (The definition of the kRing() function is shown below.) In total we will have 7 strings which covers the area inside the red circle, and we will store it in an array called cells_of_interest.

Definition of kRing function:

List<String> kRing(String origin, int k);

Step 2: search the database

We could simply use a database query to find the matching drivers

SELECT driver_id
FROM driver USING INDEX (driver_by_location_and_availability)
WHERE available AND location IN cells_of_interest;

Infrastructure

Uber infrastructure, source: [1]

The figure above shows the infrastructure of the fulfillment service. It has three major components, services to run application logic, database to store data and transaction manager to ensure consistency.

Uber uses Pod [6] to run its share-nothing micro-services. Consistent hashing is used to shard the work across different service instances. Consistent hashing also improves the hit ratio of the in-memory cache. Besides in-memory cache, Redis provides another layer of caching.

The data is stored in the Cassandra database [7], a NoSQL database. Given the volume of Uber’s data and update rate, NoSQL would better meet its performance requirements. Meanwhile, the service also writes replay log in Kafka. Replay log records what changes are made to the database. For example, one record could note that the location of the driver changes from “axxx” to “ayyy”. If the writes to the database fails, we can simply rerun the commands stored in Kafka to bring the database to a consistent state.

Two mechanisms work together to implement transactions on top of the NoSQL database. A serial queue on each service instance is used to order the incoming requests and Saga [5] is used to implement distributed transaction. Distributed transaction is needed when we need to update multiple entries atomically. For example, when a driver accepts an order, we need to update the driver’s supply entity and user’s trip entity inside a single transaction. Otherwise the database can be left in an inconsistent state, on the user’s side, the request is accepted by the driver but the on driver’s side, the request is not accepted.

Continue reading

Second part of the blog, which covers how Uber scales the fulfillment service with Google’s cloud Spanner database, can be found here: https://medium.com/nerd-for-tech/how-uber-handles-millions-of-ride-food-requests-efficiently-part-2-270f84d2c3c0

References

[1] https://eng.uber.com/fulfillment-platform-rearchitecture/

[2] https://eng.uber.com/h3/

[3] https://h3geo.org/docs/

[4] https://h3geo.org/docs/api/traversal

[5] https://camel.apache.org/components/latest/eips/saga-eip.html

[6] https://kubernetes.io/docs/concepts/workloads/pods/

[7] https://cassandra.apache.org/_/index.html

--

--