Maptype — fast doc-value lookups for map data in Elasticsearch
Guenther Starnberger, Tech Lead
- Jul 30, 2019
The ability to quickly serve search results is essential for Yelp. Ranking performance has a significant impact on the time it takes to process a search request, and it’s crucial for fast ranking that we can quickly look up the data that’s fed into our machine learning models. Per-document map-like data is especially challenging in this regard, as traditional approaches often read and decode the full map only to look up a single value within that map later on. MapType is a custom Elasticsearch datatype that provides an optimized look-up approach for such data.
Let’s start by discussing a concrete use case using map-like lookups: For each stored business that offers delivery, we want to store a mapping from the geobox that the business delivers to, to the average time it takes to complete a delivery to that geobox. This allows us to provide the average delivery time for a particular address as input to our ranking model. Figure 1 shows how that data is stored in the index.
One main concern regarding map storage is how to find an efficient storage format that can quickly search data for a particular map key. In this blog post, we’ll start with a short introduction into how Elasticsearch internally stores data, and then further discuss how maptype enables low-latency lookups for this particular type of use case.
While a regular inverted index maps from a term to a document id (encoded in the postings list), doc-values map from document id to doc-value of that document. Internally, doc-values are stored in a column-store on a per-segment basis. As an example, let’s see how the doc value for the rating of a business is represented:
Due to the columnar property of doc-values, the actual physical layout looks like [5, 3.5, 4.2] — with each doc-value for a particular field sorted by document id and placed next to each other.
Internally, Lucene uses a set of compression techniques to reduce the amount of storage required for doc-values. For example, in the case of numeric doc-values, Lucene tries to find a common denominator by which it can divide all of the values, and only encodes the offset from one number to the next, further reducing the amount of bits required to store that number.
Serializing maps for efficient lookups
Our first high-level goal is to find an approach that allows us to serialize the map data-structure into a field that we can store as doc-value. One important characteristic of our use case is that we usually only look up one key per deserialized map. Because of this, using a traditional encoding format such as JSON, Protocol Buffers, or Avro is not a good choice, as they would deserialize the whole map, only to have to look up single elements later on.
Instead, we would use a format similar to FlatBuffers or Cap’n Proto, as they support “zero-copy” deserialization. Instead of first decoding the serialized data into memory, they directly execute operations on the serialized data. For example, in the case of a map lookup, only the accessed element would have to be serialized, thereby considerably improving performance in cases where only a single element is required.
Our first prototype was implemented based on Cap’n Proto, but two issues convinced us that a custom implementation would be a better choice. First, we would need to pre-compile our schema. This works well if we know beforehand which types we want to support, but doesn’t work so well if we want to allow users to choose the types during index creation. Second, we have a very specific use case for which we can optimize our format, compared to general encoding formats that need to support a much wider range of use cases.
As a consequence, we implemented our own serialization format from scratch, but reused some of Elasticsearch’s functionality to, e.g., efficiently encode a number in VLong or ZLong format. The next sections describe our format and implementation.
The maptype format is based on multiple layers:
The bottom layer provides an efficient way to store variable length byte-arrays in a list and to enable random-access lookups of individual list items.
The middle layer uses two lists to implement a map data-structure from a variable length byte-array to another variable length byte-array: The first list stores the keys of the map in sorted order and the second stores the values of the map in the order that corresponds to the sorted key.
The top layer allows us to serialize and deserialize custom data-types (such as integer values and geohashes) from and to the underlying byte-arrays that are used as keys and values in the lower layers.
Encoding a list of variable-length byte arrays
There are two cases for encoding a list of potentially variable-length elements. In the first case, all variable elements are the same size, so we have a header section (as depicted in Figure 2) that stores (i) the total number of elements and (ii) the number of bits required by each element. This information allows us to calculate the offset value when we want to address an element with a given index.
The second case, as shown in Figure 3, is more complex: If elements have variable lengths, we cannot calculate the offset position based on the index and size of each element. Instead, we must use an array of fixed size pointers that point to the last byte of each element. While the pointers are a fixed size, we use the minimum number of bits required to represent the last pointer. For example, if the last pointer points to byte 700, we would use 10 bits per pointer (which allows us to represent numbers up to 1024). We point to the last byte of each element instead of the first since it’s implicitly known that the first byte of the first element starts at the position right after the pointers.
To simplify the handling of this logic in our code implementation subclasses, we
AbstractList<byte>, which allows us to use Java Collections
functionalities such as Collections.binarySearch.
Encoding a map of variable-length byte arrays keys to variable-length byte array values
The next layer uses two instances of our
to provide a map interface. The basic idea is that we can store sorted keys in
one list and values in the other in an order that corresponds to the order of
If we want to look up an element, we first search for the key via a binary search in the key list. If we find the key, we look up the value with the same index as the key and return it.
Mapping byte arrays to custom data types
The top layer allows us to map between types such as strings, geoboxes, and byte arrays used as the key and value type in the underlying map. This can best be explained with a concrete example:
Let’s assume the key consists of two elements: A geohash and an integer number representing the time of day (as hour from 0-23). The value also consists of two elements: The time a delivery takes to a geohash at that particular time, as well as the average rating (from 1-5) for that delivery. Using JSON during indexing and query-time, here’s how an instance of this map could look:
We immediately notice two things:
First, although the key of the map contains two different elements, it is represented as a single string. This is required, as we need to be able to encode the map as JSON during index and query time. However, on an internal level, the two elements are split and encoded as their respective types.
Second, we assign a label to the elements in the value, representing the value as something similar to a struct or namedtuple. This simplifies the usage of the data-type, as fields are available under a descriptive label instead of a random offset. Internally, the offset added by the labels is negligible: The values are still stored concatenated to each other in a byte array, but the order of the values is given by the sorted labels.
Putting it all together — ES maptype plugin
The implementation of the ES maptype plugin takes all of the encoding logic described in the previous section and wraps it inside an Elasticsearch plugin that stores the encoded byte array as a doc value.
Here’s an example schema for a maptype encoded field:
To index values to that field, we can directly post the JSON-encoded version as shown in the previous section.
If we want to access the average delivery time for geobox
9qdex at 10 am in
a painless script, we can access that value via the following statement:
As our underlying code only implements a Map interface without decoding all of
the values into a, e.g., HashMap, this statement only needs to decode the values
for the given key: First
9qdex|10 is split into the geohash
the number 10, then both values are serialized into bytes with their respective
encoders. Afterwards, we look up those serialized bytes via a binary search and
— if they exist — find the bytes for the corresponding key and deserialize the
key with the respective decoder for each type.
MapType is currently in production use at Yelp and an open-source release of the plug-in is in the works. If you want to learn more about MapType or the Yelp ranking platform in general, we’ll be hosting a meetup on August 1, 2019 at Yelp HQ in San Francisco: https://www.meetup.com/Elasticsearch-San-Francisco/events/263053170/
Join us to build the Ranking Platform at Yelp
We are building a reliable and standardized platform for search and ranking applications at Yelp. If you are curious to learn more, Apply here!View Job