Queries are very slow in local Vespa

Background: Vespa has the index and the attribute concepts for indexing, exemplified in the schema below:

schema doc {
  document doc {
    field license type string {
      indexing: summary | attribute
    }
    field title type string {
      indexing: summary | index
    }
  }
  rank-profile my-ranking {
    first-phase { expression: nativeRank(title) }
  }
}

The title field will be tokenized and stemmed because of the default match setting for index fields, the type of processing you expect for text matching. By default, index fields are matched using text match, which operates per unit/token/atom. Index fields can be searched in potentially sub-linear time thanks to inverted index structures.
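As a sketch, a text-match query against the title index field could be expressed directly in YQL like this (the term is illustrative; because of stemming, it would also match inflected forms such as plurals):

```
/search/?yql=select id,title from doc where title contains "ranking"
```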

The license field is defined without index, but with attribute. An attribute field is kept in memory at all times. Its default matching mode differs from that of index fields and is geared toward exact matching: no tokenization and no stemming. Also, by default there are no inverted index-like structures for attribute fields. One can add them with attribute: fast-search, at the cost of higher memory usage. This, together with slower overall indexing, is the primary reason why attribute fields in Vespa do not use fast-search by default.

Now, let us look at what this means for search performance. Assume we have 10M documents stored in a Vespa cluster using the simplified schema:

/search/?query=license:unk&yql=select id,license from doc where userQuery();&ranking=my-ranking

The above request searches the license attribute field, but since the field has no inverted index-like structures, the search is linear in the 10M documents. In other words, the search process traverses all 10M documents, reads the value of the license field from memory, and compares it with the input query term. If the term matches the value, the document is exposed to the ranking profile. This is not particularly efficient. If we change the attribute definition to also include fast-search, Vespa will build inverted index-like data structures:

 field license type string {
   indexing: summary | attribute
   attribute: fast-search
 }

If we deploy this and follow the procedure for changes that require restart, our query will use the inverted index structure. The search process then changes from linear traversal to looking up the term in the dictionary and traversing its posting list. If the term ‘unk’ occurs in fewer than 10M documents, the search becomes sub-linear and fewer than 10M documents are exposed to ranking.

The above was a simplified example. What if we have a more complex query searching multiple fields?

/search/?query=license:unk title:foo&yql=select id,license from doc where userQuery();&ranking=my-ranking

In the above example, we search using AND: license must contain unk, and title must contain the token foo (tokenized text match). In this case (and others), the search process builds a query execution plan for how to efficiently match the query against the indexes. When the license field does not have fast-search, the query plan assumes that the term occurs in all documents (the worst case). However, since the query includes a term over a field with an index defined, the planner knows an upper bound on the number of hits, and the overall search becomes sub-linear. Had we instead used OR to combine license:unk with title:foo, the search would become linear, as we are asking for documents where either license contains unk or the token foo occurs in the title (logical disjunction).
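As a sketch, the two cases can also be written out explicitly in YQL instead of relying on userQuery() (the terms are illustrative):

```
AND: sub-linear, since the indexed title term bounds the candidate set
/search/?yql=select id,license from doc where license contains "unk" and title contains "foo"

OR: linear when license lacks fast-search, since every document must be checked
/search/?yql=select id,license from doc where license contains "unk" or title contains "foo"
```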

How to debug matching and ranking performance of single queries?

Run a representative query with the intended ranking profile and compare totalCount with the number of documents searched. This information is provided in the result template (see coverage for the number of documents searched). The overall cost of a query depends heavily on totalCount: a higher totalCount means more documents are exposed to first-phase ranking. If a query is slow but matches relatively few documents, that is a clear indication that matching has hit a linear scan path. Ranking complicates this picture, however, as the complexity of the first-phase ranking expression also impacts performance. Vespa has a built-in ‘unranked’ ranking profile, which can be used to separate matching cost from ranking cost. Using the unranked profile gives the lower bound of search latency, without any first-phase ranking, so we can debug matching performance alone.
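For example, running the earlier query with the built-in unranked profile instead of my-ranking isolates matching cost:

```
/search/?query=license:unk&yql=select id,license from doc where userQuery();&ranking=unranked
```

If this query is still slow, the bottleneck is matching (e.g. a linear scan over an attribute without fast-search), not the ranking expression.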

Query tracing:

The content nodes' query plan for a query can be inspected by adding &tracelevel=6&trace.timestamps=1 to the search request. The query blueprint from every content node involved in the query is then included in the traced response. In the blueprint query tree trace there is a docid_limit, which is the number of documents indexed plus one (docids count from 1), so for our case with 10M documents it would be 10M + 1. If estHits at the top root of the query tree is equal to the docid_limit, the overall complexity is linear.
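Putting it together, the earlier example request with tracing enabled would look roughly like:

```
/search/?query=license:unk&yql=select id,license from doc where userQuery();&ranking=my-ranking&tracelevel=6&trace.timestamps=1
```

Then inspect the trace in the response for the blueprint, and compare estHits at the top root of the query tree against docid_limit.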

[Query blueprint trace: example of linear matching]

In this example, the top root of the query tree estimates that the total number of hits will be equal to the docid_limit (which is the number of documents indexed). This indicates linear matching.

[Query blueprint trace: example of sub-linear matching]

In this example, the top root of the query tree estimates a much lower number of hits, due to the presence of a must-have term in the query tree whose field has either index or attribute: fast-search. This restricts the query to match fewer documents than indexed, and matching and ranking become sub-linear.
