Context: Feast has some really useful documentation around how point-in-time joins work.

For model training, you want to get all the offline features and real-time features that are used in production.

WyvernAPI provides the get_historical_features function to retrieve features for a specific entity (or set of entities) as of the time a user request happened. It covers the batch features for that entity (or entity set), which is what Feast's get_historical_features does, and it also covers the real-time features that are logged by Wyvern.

For example, let’s say we had a user request like this:

{
    "request_id": "example_request_id",
    "api_source": "/api/v1/product-search-ranking",
    "candidates": [{"product_id": "p_1"}, {"product_id": "p_2"}],
    "user": {"user_id": "u_1"},
    "query": {"query": "chocolate"}
}

The request data in a notebook may look like this, and all of this information would be supplied to get_historical_features:

Input Dataframe (entities):

timestamp            request             product  brand  user  query      was_clicked  was_ordered
2023-07-07T22:01:00  example_request_id  p_1      b_1    u_1   chocolate  0            0
2023-07-07T22:01:00  example_request_id  p_2      b_1    u_1   chocolate  1            0
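
As a minimal sketch, the input dataframe above could be built in a notebook with pandas as shown here. The column names and values come straight from the example table. The variable name entities is only an illustrative choice, and parsing the timestamp column into datetimes is an assumption about what is convenient, not a documented requirement of Wyvern.

```python
import pandas as pd

# One row per (request, candidate product), mirroring the example table.
# `entities` is an illustrative variable name, not something Wyvern requires.
entities = pd.DataFrame(
    {
        "timestamp": pd.to_datetime(
            ["2023-07-07T22:01:00", "2023-07-07T22:01:00"]
        ),
        "request": ["example_request_id", "example_request_id"],
        "product": ["p_1", "p_2"],
        "brand": ["b_1", "b_1"],
        "user": ["u_1", "u_1"],
        "query": ["chocolate", "chocolate"],
        # Labels observed for each candidate after the request was served.
        "was_clicked": [0, 1],
        "was_ordered": [0, 0],
    }
)

print(entities)
```

Each candidate in the request becomes its own row, so the user, query, and timestamp values repeat across the rows of one request.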

The goal of the get_historical_features call is to retrieve all of the features that were available to the machine learning model at the time of the above request. For this example input, it retrieves all requested product, user, query, and combination features as they were at 10:01 PM on July 7th, 2023.
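
To make the "as they were at the time of the request" idea concrete, here is a small sketch of a point-in-time join using pandas merge_asof. It is not Wyvern's or Feast's implementation. The product_features table, its feature_timestamp column, the orders_last_7d feature, and all of its values are hypothetical and exist only to illustrate the lookup.

```python
import pandas as pd

# Entity rows stamped with the request time (subset of the example input).
entities = pd.DataFrame(
    {
        "timestamp": pd.to_datetime(
            ["2023-07-07T22:01:00", "2023-07-07T22:01:00"]
        ),
        "product": ["p_1", "p_2"],
    }
)

# Hypothetical feature log: each row is a feature value that became
# available at feature_timestamp. Names and numbers are made up.
product_features = pd.DataFrame(
    {
        "feature_timestamp": pd.to_datetime(
            [
                "2023-07-06T00:00:00",
                "2023-07-07T00:00:00",
                "2023-07-08T00:00:00",  # after the request, must be ignored
                "2023-07-07T12:00:00",
            ]
        ),
        "product": ["p_1", "p_1", "p_1", "p_2"],
        "orders_last_7d": [10, 12, 15, 3],
    }
)

# For every entity row, take the latest feature row for the same product
# whose feature_timestamp is at or before the request timestamp.
joined = pd.merge_asof(
    entities.sort_values("timestamp"),
    product_features.sort_values("feature_timestamp"),
    left_on="timestamp",
    right_on="feature_timestamp",
    by="product",
    direction="backward",
)

print(joined[["product", "feature_timestamp", "orders_last_7d"]])
```

The row for p_1 dated July 8th is skipped because it did not exist yet when the request was served. Using it in training would leak future information into the model.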

Besides this input dataframe, the list of features has to be passed to the get_historical_features call as well.