RichRelevance

Product Catalog Feed

Overview

The catalog feed provides the Personalization Cloud with all product and category information needed to generate and display personalized recommendations. The feed is typically uploaded once per day to the Algonomy SFTP server.

The catalog feed consists of multiple files, designed to be easy to generate from your catalog database with SQL queries (for example, in PostgreSQL or MySQL) or with the standard export functionality provided by your ecommerce platform.

Uploading the Feed

Before scheduling the feed for regular uploading, you should first generate a sample feed. This should be provided to your Algonomy integration consultant for validation. The feed will be evaluated for syntactical correctness, but we are not able to validate for business correctness. Therefore it is important that you ensure that all URLs and other data are accurate.

Note: The files need to be saved in UTF-8 encoding without a BOM (byte order mark).
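
The encoding requirement above can be checked before upload. The following is a minimal Python sketch, not an official tool: the helper names write_feed_file and has_utf8_bom are hypothetical, and the use of "\n" line endings is an assumption that you should confirm with your integration consultant.

```python
import codecs
from pathlib import Path


def write_feed_file(path: Path, lines: list[str]) -> None:
    """Write feed rows as UTF-8 without a byte order mark."""
    # The "utf-8" codec never emits a BOM; "utf-8-sig" would prepend one.
    with open(path, "w", encoding="utf-8", newline="\n") as handle:
        for line in lines:
            handle.write(line + "\n")


def has_utf8_bom(path: Path) -> bool:
    """Return True if the file starts with the UTF-8 byte order mark."""
    with open(path, "rb") as handle:
        return handle.read(3) == codecs.BOM_UTF8
```

Running has_utf8_bom on each exported file before compressing the feed is a cheap way to catch exports from tools that add a BOM by default.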

After your team has validated the feed content and structure, send the sample to the Algonomy integration team for processing. Once the test file has successfully been processed, the production file should be scheduled for upload on the agreed schedule, typically once per day in the early morning (12 am to 4 am).

The single compressed file containing all relevant exported files should be sent to the Algonomy SFTP server (your integration team will provide SFTP credentials).

Note: The Catalog Feed should not contain any personally identifiable information and as such should not pose a security risk of any kind. 

Catalog Enrichment

As of the October 2023 release, the platform can enrich your catalog using Algonomy's proprietary natural language processing, which creates descriptive data about products in the form of product attributes and their values. You can find more information about the catalog enrichment feature here.

Note: Synchronizing the catalog enrichment attributes with the Product catalog requires additional activation. Please contact Algonomy support or your Algonomy consultant for more details.

OR

You can import a file (separate from the compressed Catalog Feed file) that has additional information about the products in the form of product attributes and values. The catalog enrichment file should be compressed in zip format and sent to the Algonomy SFTP server. Please reach out to your Algonomy representative or raise a support ticket to activate this feature for your site.

File Format Example

The compressed file should follow the pattern enrichment_full_sitename_YYYY_MM_DD.*.zip. You can download a zipped sample file here.

This is an example of passing single-value attributes in the row-based file format:

product_id|attr_name|attr_value
100|size|s
100|color|red

This is an example of passing multi-valued attributes in individual rows. The list value delimiter is a period by default. It can be customized, but it cannot be the same as the column delimiter:

product_id|attr_name|attr_value
100|size|xl.l.m.s.xs
100|color|green.blue.red

For externally imported files, use the same column delimiter that is configured for your regular Catalog feeds. By default, this is the pipe (|).
Currently, the platform supports only the row-based file format for enrichment attributes.

Note: Once the file is imported, the enrichment properties are visible in the Accepted tab on the catalog enrichment page in the portal. By default, the platform rolls up attributes as 'Accepted' attributes. You can review the attributes, decline them, or move accepted attributes to new ones. Any manual changes to the attributes are reflected in the Product catalog when the platform processes the next product feed file.

Sample Catalog Feed Files

You can download a zipped sample file here. It contains all the required files plus the often-used attribute feed file and may be used by your IT or integration team as a starting point.

Feed Files

Feed File: Catalog Feed (compressed)
Required: Yes
Details: This is the compressed file that contains all of the files listed below. It should be uploaded at least once a day, as either a .zip or a .gz archive.
Filename pattern: catalog_full_sitename_YYYY_MM_DD.zip or catalog_full_sitename_YYYY_MM_DD.gz

Feed File: Product Feed File
Required: Yes
Details: List of all products sold on your site that can be recommended and can have recommendations.
Filename pattern: product_full_sitename_YYYY_MM_DD.txt

Feed File: Category Feed File
Required: Yes
Details: A hierarchical description of your catalog's category structure.
Filename pattern: category_full_sitename_YYYY_MM_DD.txt

Feed File: Product-Category Mapping Feed File
Required: Yes
Details: A mapping of products to relevant categories. One product can map to more than one category.
Filename pattern: product_in_category_sitename_YYYY_MM_DD.txt

Feed File: Product Attribute Feed File
Required: No
Details: Product attributes.
Filename pattern: product_attribute_sitename_YYYY_MM_DD.txt

Note: The Catalog Feed can be compressed as either a .zip or .gz file.
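
As an illustration of the filename patterns listed above, here is a minimal Python sketch that builds the dated file names and bundles them into a .zip archive. The function names, the REQUIRED_KINDS and OPTIONAL_KINDS lists, and the folder layout are assumptions made for this example. They are not part of any Algonomy tooling.

```python
import zipfile
from datetime import date
from pathlib import Path

# File-name prefixes taken from the filename patterns listed above.
REQUIRED_KINDS = ["product_full", "category_full", "product_in_category"]
OPTIONAL_KINDS = ["product_attribute"]


def feed_filename(kind: str, sitename: str, day: date, ext: str = "txt") -> str:
    """Build a name such as product_full_sitename_2024_01_31.txt."""
    return f"{kind}_{sitename}_{day:%Y_%m_%d}.{ext}"


def build_catalog_zip(folder: Path, sitename: str, day: date) -> Path:
    """Bundle the day's feed files found in `folder` into one .zip archive."""
    archive = folder / feed_filename("catalog_full", sitename, day, "zip")
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as bundle:
        for kind in REQUIRED_KINDS + OPTIONAL_KINDS:
            member = folder / feed_filename(kind, sitename, day)
            if member.exists():
                bundle.write(member, arcname=member.name)
            elif kind in REQUIRED_KINDS:
                raise FileNotFoundError(f"required feed file missing: {member.name}")
    return archive
```

The optional attribute file is simply skipped when it is absent, while a missing required file stops the build so that an incomplete feed is not uploaded.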

File Specification

See below for the specifications for each file.

The column delimiter in each file must be a pipe (|) unless a different delimiter has been agreed. If you are not able to use a pipe as the delimiter, alert your Algonomy representative, who will work with you to accommodate your specific needs.

Please take care to ensure that delimiters cannot occur within column values. If this happens, the row will appear to have one too many columns to the interpreter, an error will occur, and the feed will fail. 
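
A stray delimiter inside a value shows up as a row with more columns than the header. The following minimal Python sketch (the function name find_bad_rows is hypothetical) flags such rows before upload by comparing each row's column count with the header's.

```python
def find_bad_rows(lines: list[str], delimiter: str = "|") -> list[int]:
    """Return 1-based line numbers whose column count differs from the header."""
    expected = len(lines[0].split(delimiter))
    return [
        number
        for number, line in enumerate(lines[1:], start=2)
        if len(line.split(delimiter)) != expected
    ]
```

For example, a product named "Hat | Scarf Set" would add an extra column to its row, and that row's line number would be returned.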

Important: If the field is optional, leave it blank/null if you want to retain the default value listed in the table below. Only populate the field if you want to change the default value.

Note: The column headers are case-sensitive.

Note: The column headers must appear in the order shown in this documentation.

Product Feed File

Filename: product_full_sitename_YYYY_MM_DD.txt

Name Type Required Definition
product_id ASCII Yes

Identifier for the product to be used for onsite JavaScript integration as well as in the RichRelevance Dashboard. Maximum length: 100 characters.

NOTE: If a product appears as recommendable in a product_full file on day 1 but is omitted on day 2, the feed processor automatically sets that product's recommendability to FALSE. It also erases all of the attributes that were added in the first feed.

name Text Yes The name of the product as it will be displayed in recommendations on site and in the Personalization Cloud Dashboard. Max length: 255 characters. If missing, name is not updated if the product is already in the system.
product_parent_id ASCII No

Note: A Product Parent ID should only be provided if ensemble functionality is used. This is not a necessary feature for all merchants. For more information, you can read an explanation of ensemble functionality here.

Defines the parent of the child product. Multiple parents can be given by using the standard value separator, which is a period (.) by default. For products that are not part of an ensemble, leave the parent ID empty.

price Number Yes

The price of the product, expressed as a real number. It is used for generating recommendations, applying price filters, and displaying in onsite recommendations. Example: 8.99

recommendable Boolean Yes

Is a product recommendable? This is often used to mark products that are out of stock as not recommendable. If the product is recommendable, use one of the following values (values are case-insensitive):

1, true, on, Y

If the product is not recommendable, use one of the following values (values are case-insensitive):

0, false, off, N

Note: If your site uses region files, the product will only be recommended if recommendable is set to true in the product full feed file and in_stock is set to true in the product region feed file. If recommendable is set to false, that product will not be recommended no matter what in_stock is set to.

image_url Text Yes

The site URL used to display the product image in recommendations. Maximum length: 255 characters. We recommend submitting only the portion of the URL that constitutes the path from the domain name, because the domain can be inserted separately and this reduces the file size. If you must include the domain, we recommend omitting the protocol to avoid security warnings on mixed http/https pages.

link_url Text Yes

The site URL used to link the product recommendation to the Item Page. Maximum length: 255 characters. We recommend submitting only the portion of the URL that constitutes the path from the domain name, because the domain can be inserted separately and this reduces the file size. If you must include the domain, we recommend omitting the protocol to avoid security warnings on mixed http/https pages.

rating Number No

The overall rating for a product. Rating is a decimal value. If a previous value exists and a rating is not provided in the current feed, this value is not updated. Defaults to -1.0 (no rating).

num_reviews Integer No

The number of reviews available for a product. If not provided, the value is not updated. This must be an integer. Defaults to 0.

brand Text No

The product brand. Maximum length: 255 characters.

description Text No

To be used only for Find.

sale_price Number No

This is the sale price of the product. Only use this if a product is “on sale” (different than regular pricing). Merchandising rules will use this price if it is set in the feed.

  • If sale_price contains a valid value ("0" or "12.35", for example), the price is converted into cents and stored in the catalog, overwriting the previous value.
  • If the sale price isn’t set, merchandising rules use the list price instead.
  • If some products have sale prices, and others don’t, leave this field blank for products that don’t have a sale price.
  • To remove or reset a sale price, leave this field empty. This stores the product's sale_price as NULL.

start_date Date No

Specifies the date on which the product became available for purchase on the merchant’s site or catalog.

  • Used by New Arrivals strategies, which recommend products that have been made available for purchase in the last few days (see Recommendation Strategies for more detail)
  • Format: YYYY-MM-DD

sale_price_min Number No

Minimum sale price. Example: 8.99.

sale_price_max Number No

Maximum sale price. Example: 8.99.

list_price_min Number No

Minimum list price. Example: 8.99.

list_price_max Number No

Maximum list price. Example: 8.99.

Note: A price must be defined for the sale price to be used. The price min/max fields are a method for handling SKU-level pricing. To use one or both of the ranges, all four min/max fields must be added to the feed in the correct order, even if only one of the ranges is in use.
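
The sale_price rules in the table above (a valid value is stored in cents, an empty field is stored as NULL) can be pictured with the following minimal Python sketch. It is only an illustration of the described behavior, not the platform's code: the function name is hypothetical, and the half-up rounding is an assumption because the rounding rule is not documented here.

```python
from decimal import Decimal, ROUND_HALF_UP
from typing import Optional


def sale_price_to_cents(raw: str) -> Optional[int]:
    """Convert a sale_price field to integer cents, or None when it is empty."""
    value = raw.strip()
    if not value:
        # An empty field removes or resets the sale price (stored as NULL).
        return None
    # Decimal avoids binary floating-point drift in values such as 12.35.
    cents = (Decimal(value) * 100).quantize(Decimal("1"), rounding=ROUND_HALF_UP)
    return int(cents)
```

Under these assumptions, "12.35" becomes 1235, "0" becomes 0, and an empty field becomes None.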

Example

product_id|name|price|recommendable|image_url|link_url
0001|Brown Pea Coat|299.99|true|assets/images/brown-pea-coat-27765.jpg|products/brown-pea-coat-27765.html
0002|Blue Fedora|49.99|true|assets/images/blue-fedora-28873.jpg|products/blue-fedora-28873.html
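
The accepted recommendable values and the region rule described in the table above can be sketched in Python as follows. The function names are hypothetical, and the sketch assumes that in_stock in the product region feed accepts the same boolean values as recommendable.

```python
TRUE_VALUES = {"1", "true", "on", "y"}
FALSE_VALUES = {"0", "false", "off", "n"}


def parse_flag(raw: str) -> bool:
    """Interpret a boolean feed value case-insensitively."""
    value = raw.strip().lower()
    if value in TRUE_VALUES:
        return True
    if value in FALSE_VALUES:
        return False
    raise ValueError(f"not a recognised boolean value: {raw!r}")


def is_recommended(recommendable: str, in_stock: str) -> bool:
    """A product is recommended only when both flags are true."""
    return parse_flag(recommendable) and parse_flag(in_stock)
```

With this reading, a product with recommendable set to false is never recommended, whatever in_stock says.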

Category Feed File

Filename: category_full_sitename_YYYY_MM_DD.txt (required)

Name Type Required? Definition
category_id ASCII Yes

Identifier for the category to be used for onsite JavaScript integration and in the RichRelevance Dashboard. Max length: 400 characters.

Note: Category ID cannot have more than one Parent ID.

parent_id ASCII Yes

Category ID of the parent category. This is used to build the category hierarchy which can be used to construct merchandising controls for groups of categories. If no parent, then leave the field empty. Max length: 400 characters.

name Text Yes

Name of the category to be displayed in recommendations and in the RichRelevance dashboard. This name should be shopper-friendly, because some strategies display category names on your site. HTML-entity encode single and double quotes.

category_link_url Text No

The destination a user is sent to when clicking on a recommendation of a category.

category_image_url Text No

The category image URL.

Example

category_id|parent_id|name
1005||Men&#39;s
1006||Women&#39;s
2005|1005|Jeans
2006|1006|Jeans
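
Two of the rules above lend themselves to a quick pre-upload check: quotes in category names should be HTML-entity encoded, and every non-empty parent_id should refer to a category that exists. The following Python sketch is one hedged way to do this. Both function names are hypothetical, and the choice of &#39; and &quot; as the entities is an assumption.

```python
def encode_quotes(text: str) -> str:
    """HTML-entity encode single and double quotes only."""
    return text.replace('"', "&quot;").replace("'", "&#39;")


def orphan_categories(rows: list[tuple[str, str, str]]) -> list[str]:
    """Return category IDs whose parent_id is not defined in the same file.

    Each row is (category_id, parent_id, name); an empty parent_id marks a root.
    """
    known = {category_id for category_id, _, _ in rows}
    return [
        category_id
        for category_id, parent_id, _ in rows
        if parent_id and parent_id not in known
    ]
```

An empty result from orphan_categories means the hierarchy in the file is self-contained.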

Product-Category Mapping Feed File

Filename: product_in_category_sitename_YYYY_MM_DD.txt

Name Type Required? Definition
category_id ASCII Yes

Identifier for the category to be used for onsite JavaScript integration and in the RichRelevance Dashboard. Max length: 400 characters.

Note: Category ID cannot have more than one Parent ID.

product_id ASCII Yes

Identifier for the product to be used for onsite JavaScript integration as well as in the RichRelevance Dashboard. Maximum length: 100 characters.

Example

category_id|product_id
3005|0002
3007|0001
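
Because the mapping file refers to IDs defined in the product and category files, it can be useful to check the references before upload. This Python sketch is a sanity check you might run yourself. The function name is hypothetical, and this document does not state how the platform treats unknown IDs.

```python
def unknown_references(
    mapping_rows: list[tuple[str, str]],
    product_ids: set[str],
    category_ids: set[str],
) -> list[tuple[str, str]]:
    """List (column, value) pairs in the mapping that match no known ID.

    Each mapping row is (category_id, product_id), in the column order above.
    """
    problems = []
    for category_id, product_id in mapping_rows:
        if category_id not in category_ids:
            problems.append(("category_id", category_id))
        if product_id not in product_ids:
            problems.append(("product_id", product_id))
    return problems
```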

Product Attribute Feed File

Filename: product_attribute_sitename_YYYY_MM_DD.txt

Note: For sites with minimal attribute requirements, this file may be omitted.

The Product Attribute File allows users to specify values for attributes that can then be used to inform the recommendations made by Personalization Cloud. Attributes like region, language, and color can all be specified in this file.  

Users can pass custom compatibility lists as an attribute in the product attribute feed file. Product compatibility mapping (MCMP) defines compatibility through an attribute named MCMP.NAME. Because compatibility is a special attribute, its name must be prefixed by "MCMP.". For example, in MCMP.accessories, accessories is the name of the compatibility attribute.

The attribute values (and multi-valued attributes) can be listed in either a column-based format or a row-based format. The system automatically detects the format based on the header. It should be noted, however, that the column-based format does not support localization (the use of region/language).

Column-based format

Name Type Required? Definition
product_id ASCII Yes

Identifier for the product to be used for onsite JavaScript integration as well as in the RichRelevance Dashboard. Maximum length: 100 characters.

attribute.[attribute key 1] Text Yes First distinct attribute key
attribute.[attribute key 2] Text Yes Second distinct attribute key
...
attribute.[attribute key N] Text Yes Nth distinct attribute key (N = number of distinct attribute keys)

Example
product_id|attribute.color|attribute.size|attribute.fabric
100|blue|medium|cotton
101|red|large|wool

Row-based format

Name Type Required? Definition
product_id ASCII Yes Identifier for the product as defined in the products file
attr_name Text Yes Unique attribute key. HTML-entity encode single and double quotes.
attr_value Text Yes Value of the attribute. HTML-entity encode single and double quotes.

Examples

This is an example of passing single-value attributes in the row-based file format:

product_id|attr_name|attr_value
100|size|s
100|color|red

This is an example of passing multi-valued attributes in individual rows. The list value delimiter is a period by default. It may be customized when creating the feed profile, but it cannot be the same as the column delimiter:

product_id|attr_name|attr_value
100|size|xl.l.m.s.xs
100|color|green.blue.red

Multiple Attribute Values

A single attribute may take more than a single item as a value. Sometimes items may have more than one possible size, or color, for example. In cases like this, you can specify a list of values for the attribute.

The list delimiter must not occur within any attribute value and must not be the same as any other delimiter used in the file. Currently, the default delimiter for lists is the period. However, the list value delimiter may be changed when creating the feed profile. Alert your Algonomy integration consultant if you believe this will be necessary.

Note: Because the default delimiter is a period, values sent as decimal numbers (prices, for example) will be misinterpreted as two integers instead of a single float if the delimiter is left at default. Be sure to discuss with your Algonomy integration consultant if you will be passing decimal numbers in your attribute feed.

Examples
Row-based:
product_id|attr_name|attr_value
100|color|green.blue.red
100|size|xl.l.m.s.xs
100|fabric|cotton
Column-based:
product_id|attribute.color|attribute.size|attribute.fabric
100|green.blue.red|xl.l.m.s.xs|cotton
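
The decimal pitfall described in the note above is easy to reproduce. The following Python sketch only illustrates list splitting and does not reproduce the platform's parser. The function names are hypothetical, and "^" stands in for whatever custom list delimiter you might agree on.

```python
def split_values(raw: str, list_delimiter: str = ".") -> list[str]:
    """Split a multi-valued attribute field into its individual values."""
    return raw.split(list_delimiter)


def join_values(values: list[str], list_delimiter: str = ".") -> str:
    """Join values into one field, refusing values that contain the delimiter."""
    for value in values:
        if list_delimiter in value:
            raise ValueError(f"value {value!r} contains the list delimiter")
    return list_delimiter.join(values)
```

With the default period, split_values("8.99") yields two parts ("8" and "99"), while split_values("8.99", "^") keeps the value intact. The check in join_values catches this kind of value while the file is being generated.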

Localized Attributes

You can signify that certain attributes are enabled only in the context of certain regions or certain languages. This is useful when, for example, you have regulatory attributes that differ across countries, or sales that run across different dates in different physical stores. By default, when there is a conflict between a region-dependent attribute and a language-dependent attribute, the region-dependent attribute takes precedence. If you have a specific need that requires language to take precedence, speak with your Algonomy representative about changing this setting. 

Localized attributes persist from feed upload to feed upload. If you want to erase one of these attributes, or remove its influence and allow the non-localized version of the attribute to take precedence in all regions, include the attribute one more time with a localization_type of "-1".

Note: To leverage localized attributes in Find, the attribute first needs to be added without localization before it can be localized. In other words, it should have a value under the site default language or region, then you can add regionalized attributes. Referencing the second example below, this can be achieved by having a record such as:  100|on_sale|true|| prior to adding 100|on_sale|true|region|us.

Example
Row-Based Example with Localization 
product_id|attr_name|attr_value|localization_type|localization_value
100|on_sale|true|region|us
100|on_sale|false|region|de
100|on_sale|true|region|fr
200|has_documentation|true|language|zh-CN
200|has_documentation|false|language|en-US
Row-Based Example with Localization leveraged for Find
product_id|attr_name|attr_value|localization_type|localization_value
100|on_sale|true||
100|on_sale|true|region|us
100|on_sale|false|region|de
100|on_sale|true|region|fr
200|has_documentation|true||
200|has_documentation|true|language|zh-CN
200|has_documentation|false|language|en-US
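
To make the default precedence rule concrete (region before language), here is a minimal Python sketch of how a value might be looked up from rows like the ones above. It is one reading of the rule, not the platform's actual lookup: the function name is hypothetical, and falling back to the non-localized value when nothing matches is an assumption.

```python
from typing import Optional

# (product_id, attr_name, attr_value, localization_type, localization_value)
Row = tuple[str, str, str, str, str]


def resolve(
    rows: list[Row],
    product_id: str,
    attr_name: str,
    region: Optional[str] = None,
    language: Optional[str] = None,
) -> Optional[str]:
    """Pick a value for one attribute: region first, then language, then default."""
    default = by_region = by_language = None
    for pid, name, value, loc_type, loc_value in rows:
        if pid != product_id or name != attr_name:
            continue
        if loc_type == "region" and loc_value == region:
            by_region = value
        elif loc_type == "language" and loc_value == language:
            by_language = value
        elif loc_type == "":
            default = value
    for candidate in (by_region, by_language, default):
        if candidate is not None:
            return candidate
    return None
```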

Common Attributes

Margins

Margin-based strategies use the MARGIN attribute to determine the order of the products recommended.

A margin value is the margin (profit) that the item generates for the retailer, expressed as a decimal fraction between 0 and 1. For example, a 75% margin is written as .75.

The attribute name is case sensitive, so using a column-based format, the column name would be attribute.MARGIN. An example of this would be as follows:

product_id|attribute.MARGIN
100|.75

That same example, in a row-based format:

product_id|attr_name|attr_value
100|MARGIN|.75
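
A margin expressed as a whole-number percentage (75 instead of .75) is an easy mistake to make. The following Python sketch (the function name is hypothetical) checks that a MARGIN value parses as a number between 0 and 1 inclusive. Whether the boundary values 0 and 1 themselves are accepted is an assumption.

```python
def is_valid_margin(raw: str) -> bool:
    """Return True if the field parses as a number between 0 and 1 inclusive."""
    try:
        value = float(raw)
    except ValueError:
        return False
    return 0.0 <= value <= 1.0
```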

Short Names

The attribute name “Product Short Name” has a special function. Products with a matching “Product Short Name” value will never be recommended together. This is most commonly used to prevent two similar products from being recommended side by side.

This attribute name is case-sensitive and requires spaces between the words. There are two valid names: "Product Short Name" and "PRODUCT SHORT NAME".

In a column-based feed, the column name would be "attribute.Product Short Name" or "attribute.PRODUCT SHORT NAME".

An example using the column-based format would be:

product_id|attribute.PRODUCT SHORT NAME
100|7-inch Tablet

And using the row-based format:

product_id|attr_name|attr_value
100|PRODUCT SHORT NAME|7-inch Tablet

Reserved Words

These names are already used by the Algonomy system and cannot be used as attribute names:

  • product_ancestors_id
  • product_brand
  • product_canonical_id
  • product_categories
  • product_end_date
  • product_external_id
  • product_feed_date
  • product_genre
  • product_genre_id
  • product_image_id
  • product_link_id
  • product_name
  • product_num_reviews
  • product_pricecents
  • product_price_range_max_cents
  • product_price_range_min_cents
  • product_rating
  • product_recommendable
  • product_release_date
  • product_saleprice_cents
  • product_sale_price_range_max_cents
  • product_sale_price_range_min_cents
  • product_substitutes
  • product_type
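
The reserved names above can be checked automatically when an attribute file is generated. This is a minimal Python sketch with a hypothetical function name. It compares names exactly as written, and whether the platform also rejects differently cased variants is not stated in this document.

```python
RESERVED_NAMES = {
    "product_ancestors_id", "product_brand", "product_canonical_id",
    "product_categories", "product_end_date", "product_external_id",
    "product_feed_date", "product_genre", "product_genre_id",
    "product_image_id", "product_link_id", "product_name",
    "product_num_reviews", "product_pricecents",
    "product_price_range_max_cents", "product_price_range_min_cents",
    "product_rating", "product_recommendable", "product_release_date",
    "product_saleprice_cents", "product_sale_price_range_max_cents",
    "product_sale_price_range_min_cents", "product_substitutes",
    "product_type",
}


def reserved_attribute_names(names: list[str]) -> list[str]:
    """Return the attribute names that collide with reserved words."""
    return [name for name in names if name in RESERVED_NAMES]
```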

Best Practice for IDs

When deciding which values to send to Algonomy as IDs for products, categories, etc., we suggest you stick to alphanumeric ASCII strings with a maximum length of 100 characters. Spaces and most other special characters are allowed, but to avoid confusion, it is best not to include the characters commonly used by our feed ingestion system as delimiters in IDs (| ; . ^ , ~ `§ \ and /).
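
These suggestions can be turned into a simple pre-upload check. The Python sketch below uses a hypothetical function name and reports the problems it finds for a single ID. The character set is copied from the list above, and the 100-character default follows the suggestion in this section.

```python
# Characters listed above as commonly used delimiters.
DELIMITER_CHARS = set("|;.^,~`§\\/")


def check_id(identifier: str, max_length: int = 100) -> list[str]:
    """Return a list of human-readable problems found in an ID."""
    problems = []
    if len(identifier) > max_length:
        problems.append(f"longer than {max_length} characters")
    if not identifier.isascii():
        problems.append("contains non-ASCII characters")
    found = sorted(DELIMITER_CHARS & set(identifier))
    if found:
        problems.append("contains delimiter characters: " + " ".join(found))
    return problems
```

An empty list means the ID follows the suggestions. A category ID could be checked with max_length=400 to match the limit given in the Category Feed File section.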
