Streaming Catalog Overview


Streaming Catalog

The Streaming Catalog lets customers send new and updated catalog items (product, category, content, and so on) to Personalization Cloud in real time. It determines what has changed and sends out only those changes, so pricing and other time-sensitive information can be updated in seconds. The first consumer of the Streaming Catalog is Find, which now indexes all information in near real time.

[Figure: Streaming Catalog architecture]

The diagram below provides an overview of the key services of the Streaming Catalog. Catalog items from the streaming-ingest service are consumed by the streaming engine, which validates and transforms the data as required. The streaming engine sends only the deltas (changes) to the "out" topic, which is then replicated to the front-end data centers for Find to index.

[Figure: Streaming Catalog key services]

The key services of Streaming Catalog include:

Streaming-property service
- Lets you define custom attributes with any supported data type
- Lets you set search attributes
Streaming-engine
- Validates and transforms items
- Sends only "deltas" (changes) to streaming.engine.out
Find
- Consumes the items that are ready on the "out" topic in real time on the front-end data centers
Legacy Catalog Adaptor
- Consumes the items that are ready on the "out" topic
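
The idea of sending only deltas can be illustrated with a minimal sketch. This is not the streaming engine's actual implementation; the class name, method names, and the in-memory store are hypothetical and exist only to show the principle of comparing an incoming item with the last known version and emitting just the changed fields. Field removal is not handled in this sketch.

```python
from typing import Any, Dict, List, Optional, Tuple


def compute_delta(previous: Optional[Dict[str, Any]],
                  incoming: Dict[str, Any]) -> Dict[str, Any]:
    """Return only the fields of `incoming` that are new or changed."""
    if previous is None:
        # A brand-new item: every field counts as a change.
        return dict(incoming)
    return {
        key: value
        for key, value in incoming.items()
        if key not in previous or previous[key] != value
    }


class DeltaEngine:
    """Hypothetical stand-in for the delta step of a streaming engine."""

    def __init__(self) -> None:
        self._store: Dict[str, Dict[str, Any]] = {}   # last known item state
        self.out: List[Tuple[str, Dict[str, Any]]] = []  # stand-in for the "out" topic

    def ingest(self, item_id: str, item: Dict[str, Any]) -> Dict[str, Any]:
        delta = compute_delta(self._store.get(item_id), item)
        if delta:
            # Remember the new state and publish only what changed.
            self._store.setdefault(item_id, {}).update(item)
            self.out.append((item_id, delta))
        return delta
```

With this sketch, re-sending an unchanged item produces an empty delta and nothing is appended to `out`, while a price change produces a delta containing only the price.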

Using Streaming Catalog API

The following steps summarize how to use the Streaming Catalog.

1. Create Property Definition Collections

The first step is to create property definition collections (streaming-property service) for each item type (product, category, and region). A property definition is the schema for an attribute and must be defined before creating a snapshot. Each item type should have its own property definition collection.

When a property definition collection is created, canonical and derived properties are created in it automatically. Custom properties can then be added to the collection, and any supported data type (for example, integer or string) can be assigned to a custom property.
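
To show how property definitions act as a schema, here is a minimal sketch that checks an item's attributes against a set of definitions. The function name, the dictionary layout, and the error messages are assumptions made for illustration; only the "integer" and "string" type names come from the text above, and the real service may validate differently.

```python
from typing import Any, Callable, Dict, List

# Type checks for the two data types named in the text. Other supported
# types would be added here; this table is illustrative, not exhaustive.
TYPE_CHECKS: Dict[str, Callable[[Any], bool]] = {
    # bool is a subclass of int in Python, so exclude it explicitly.
    "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "string": lambda v: isinstance(v, str),
}


def validate_item(definitions: Dict[str, str], item: Dict[str, Any]) -> List[str]:
    """Return a list of validation errors (empty if the item is valid).

    `definitions` maps a property name to its data type name.
    """
    errors: List[str] = []
    for name, value in item.items():
        data_type = definitions.get(name)
        if data_type is None:
            errors.append(f"{name}: no property definition")
            continue
        check = TYPE_CHECKS.get(data_type)
        if check is None:
            errors.append(f"{name}: unsupported data type {data_type}")
        elif not check(value):
            errors.append(f"{name}: expected {data_type}")
    return errors
```

For example, with definitions `{"title": "string", "stock": "integer"}`, an item whose `stock` is the string `"3"` would be reported as `stock: expected integer`.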

When a property definition collection is first created, it is in the "creating" state. In this state, users can modify property definitions. Once the collection is "published", existing property definitions can no longer be modified. New property definitions can be added in either the "creating" or the "published" state.

The "product" property definition collection includes Find search attribute settings. These attributes must be set by using the streaming-property API and not by using the portal.
To create a snapshot, property definition collections must be in the "published" state.
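
The state rules for property definition collections can be sketched as a small in-memory model. The class, its methods, and the exception type are hypothetical and do not reflect the streaming-property API; the sketch only encodes the rules stated above: definitions can be added in both states, but modified only while the collection is in the "creating" state.

```python
from dataclasses import dataclass, field
from typing import Dict


class CollectionStateError(Exception):
    """Raised when an operation is not allowed in the current state."""


@dataclass
class PropertyDefinitionCollection:
    item_type: str                      # for example "product"
    state: str = "creating"             # "creating" or "published"
    definitions: Dict[str, str] = field(default_factory=dict)  # name -> data type

    def add_definition(self, name: str, data_type: str) -> None:
        # Adding is allowed in both "creating" and "published" states.
        if name in self.definitions:
            raise ValueError(f"definition {name!r} already exists")
        self.definitions[name] = data_type

    def modify_definition(self, name: str, data_type: str) -> None:
        # Modifying is allowed only before the collection is published.
        if self.state != "creating":
            raise CollectionStateError("published definitions cannot be modified")
        if name not in self.definitions:
            raise KeyError(name)
        self.definitions[name] = data_type

    def publish(self) -> None:
        self.state = "published"
```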

2. Create Snapshots

The next step is to create a product snapshot, which is an instance of the catalog. If regions are required, a place snapshot is also created. It is recommended to create the place snapshot with the default "region" property definition collection. The product snapshot is referenced when sending product or category information through the Streaming Catalog. A place snapshot is used for managing region metadata such as currency or location. Region items must be in the place snapshot if region overrides are to be used in the product catalog.

A product snapshot must be associated with "published" product and category property definition collections. A place snapshot must be associated with a region property definition collection.

There are four states for snapshots: Creating, Complete, Active, and Archive.

The "Creating" state is for testing and making modifications as required. Items that are ingested into a "Creating" snapshot go only to the view store; they are not ingested into Find or the Legacy Catalog (Recommend, Discover).

The "Complete" state requires "published" property definition collections. Ingested items go to the Legacy Catalog (Postgres) and are indexed in real time in the non-production Find index. When data is ready to be picked up by the Legacy Catalog Adaptor (Recommend), it is recommended to create a new snapshot and change its state to "Complete".

The "Active" state is considered the production version of the catalog. For Find, all information will be indexed in the production Find indexes. If there is an existing "Active" snapshot, it will be archived when a new snapshot is activated.
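
The snapshot lifecycle described above can be modeled with a short sketch. The registry class and its method names are hypothetical, and the text does not spell out every allowed transition, so the sketch enforces only the two rules it does state: "Complete" requires published property definition collections, and activating a snapshot archives any snapshot that was previously "Active".

```python
from enum import Enum
from typing import Dict


class SnapshotState(Enum):
    CREATING = "Creating"
    COMPLETE = "Complete"
    ACTIVE = "Active"
    ARCHIVE = "Archive"


class SnapshotRegistry:
    """Hypothetical in-memory tracker of snapshot states."""

    def __init__(self) -> None:
        self.states: Dict[str, SnapshotState] = {}

    def create(self, snapshot_id: str) -> None:
        self.states[snapshot_id] = SnapshotState.CREATING

    def complete(self, snapshot_id: str, collections_published: bool) -> None:
        if snapshot_id not in self.states:
            raise KeyError(snapshot_id)
        if not collections_published:
            raise ValueError(
                "Complete requires published property definition collections")
        self.states[snapshot_id] = SnapshotState.COMPLETE

    def activate(self, snapshot_id: str) -> None:
        if snapshot_id not in self.states:
            raise KeyError(snapshot_id)
        # Archive whichever snapshot is currently Active, then promote.
        for other_id, state in list(self.states.items()):
            if state is SnapshotState.ACTIVE and other_id != snapshot_id:
                self.states[other_id] = SnapshotState.ARCHIVE
        self.states[snapshot_id] = SnapshotState.ACTIVE
```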

3. Add Catalog Items

The third step is to add catalog items (that is, products and categories) using the streaming-ingest service. Each request references a snapshot ID, and the state of that snapshot determines whether an item goes to the Legacy Catalog (Postgres database, used by Recommend) and where it is indexed by Find.
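
As a quick reference, the routing by snapshot state can be written as a lookup table. This is a sketch that lists only the destinations the text names explicitly for each state; the destination labels and the function name are made up for illustration, and the text does not say where items in an "Archive" snapshot go or whether an "Active" snapshot also feeds the Legacy Catalog, so those cases are left out rather than guessed.

```python
from typing import Dict, Tuple

# Destinations explicitly named in the text for each snapshot state.
# Other destinations may apply in practice; they are not documented here.
DESTINATIONS_BY_STATE: Dict[str, Tuple[str, ...]] = {
    "Creating": ("view store",),
    "Complete": ("Legacy Catalog", "Find non-production index"),
    "Active": ("Find production index",),
}


def destinations_for(state: str) -> Tuple[str, ...]:
    """Return the documented destinations for an item in a snapshot state."""
    try:
        return DESTINATIONS_BY_STATE[state]
    except KeyError:
        raise ValueError(
            f"no routing described for snapshot state {state!r}") from None
```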

Please refer to the documentation for further details.

4. Streaming Status Service

The streaming-status service lets users see the status of a transaction. Calling the service with a tracking ID returns a status for each Streaming Catalog service that was involved in processing the item. Each status includes the number of items processed and the processing time in milliseconds.
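
The per-service statuses could be summarized on the client side along the following lines. The record fields, the "SUCCESS" status value, and the function name are assumptions for illustration only; the actual response format of the streaming-status service is defined in its documentation and may differ.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Union


@dataclass(frozen=True)
class ServiceStatus:
    service: str          # for example "streaming-ingest" or "streaming-engine"
    status: str           # hypothetical values such as "SUCCESS" or "FAILED"
    items_processed: int
    duration_ms: int


def summarize(statuses: List[ServiceStatus]) -> Dict[str, Union[bool, int, Optional[str]]]:
    """Condense the per-service statuses for one tracking ID."""
    slowest = max(statuses, key=lambda s: s.duration_ms, default=None)
    return {
        "all_succeeded": all(s.status == "SUCCESS" for s in statuses),
        "total_duration_ms": sum(s.duration_ms for s in statuses),
        "slowest_service": slowest.service if slowest else None,
    }
```

A summary like this makes it easy to spot which service in the pipeline failed or took the longest for a given transaction.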

5. Streaming View Service

The streaming-view service displays the ingested items that have been successfully processed into the catalog.

Detailed information on the streaming-view service can be found in the documentation.