Product Catalog Feed
Overview
The catalog feed provides the Personalization Cloud with all product and category information needed to generate and display personalized recommendations. The feed is typically uploaded once per day to the Algonomy FTP server.
The catalog feed consists of multiple files. They are designed to be easy to generate from your catalog database with PostgreSQL or MySQL commands, or with the standard export functionality provided by your ecommerce platform.
Uploading the Feed
Before scheduling the feed for regular uploading, you should first generate a sample feed. This should be provided to your Algonomy integration consultant for validation. The feed will be evaluated for syntactical correctness, but we are not able to validate for business correctness. Therefore it is important that you ensure that all URLs and other data are accurate.
Note: The files need to be saved in UTF-8 encoding without a BOM (byte order mark).
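As a quick pre-upload check of the encoding requirement above, the following sketch tests whether a file's bytes are valid UTF-8 and do not start with a BOM. The function name is illustrative only and is not part of any Algonomy tooling.

```python
import codecs


def is_valid_feed_encoding(data: bytes) -> bool:
    """Return True if data is valid UTF-8 and does not start with a BOM."""
    if data.startswith(codecs.BOM_UTF8):
        return False  # a byte order mark is present
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False  # not valid UTF-8
    return True
```

In Python, writing a file with encoding="utf-8" produces no BOM, whereas encoding="utf-8-sig" adds one.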
After your team has validated the feed content and structure, send the sample to the Algonomy integration team for processing. Once the test file has been processed successfully, schedule the production file for upload on the agreed schedule, typically once per day in the early morning (12 a.m. to 4 a.m.).
The single compressed file containing all relevant exported files should be sent to the Algonomy FTP server (your integration team will provide FTP credentials).
Note: We do not support SFTP uploads at this time. The Catalog Feed should not contain any personally identifiable information and as such should not pose a security risk of any kind. However, if you require a higher level of security when transferring files, you may choose to upload the feed files over FTPS.
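If you choose FTPS, an upload could look roughly like the sketch below, which uses Python's standard ftplib. The host, credentials, and target directory come from your integration team; the function name and the ftps_factory parameter (included only so the function can be exercised without a live server) are assumptions made for this illustration.

```python
from ftplib import FTP_TLS
from pathlib import Path


def upload_feed(archive: Path, host: str, user: str, password: str,
                ftps_factory=FTP_TLS) -> None:
    """Upload one compressed catalog feed over FTPS (explicit TLS)."""
    with ftps_factory(host) as ftps:
        ftps.login(user, password)  # FTP_TLS secures the control channel first
        ftps.prot_p()               # also encrypt the data channel
        with archive.open("rb") as fh:
            # Assumption: the file goes to the login directory as-is.
            ftps.storbinary(f"STOR {archive.name}", fh)
```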
Sample Files
You can download a zipped sample file here. It contains all the required files plus the often-used attribute feed file and may be used by your IT or integration team as a starting point.
Feed Files
Feed File | Details | Filename pattern |
---|---|---|
Catalog Feed (compressed) | This is the compressed file that consists of all files listed below. This should be uploaded at least once a day, either as a .zip or .gz archive. Required. | catalog_full_sitename_YYYY_MM_DD.zip or catalog_full_sitename_YYYY_MM_DD.gz |
Product Feed File | List of all products sold on your site that can be recommended and can have recommendations. Required. | product_full_sitename_YYYY_MM_DD.txt |
Category Feed File | A hierarchical description of your catalog's category structure. Required. | category_full_sitename_YYYY_MM_DD.txt |
Product-Category Mapping Feed File | A mapping of products to relevant categories. One product can map to more than one category. Required. | product_in_category_sitename_YYYY_MM_DD.txt |
Product Attribute Feed File | Product attributes. | product_attribute_sitename_YYYY_MM_DD.txt |
Note: The Catalog Feed can be compressed as either a .zip or .gz file.
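The filename patterns above can be produced programmatically. The sketch below builds dated filenames and zips all .txt feed files in a directory into the daily archive. The function names are illustrative, and placing the files at the root of the archive is an assumption; confirm the expected layout with your integration consultant.

```python
import zipfile
from datetime import date
from pathlib import Path


def feed_filename(kind: str, site: str, day: date, ext: str = "txt") -> str:
    """Return a name such as product_full_sitename_2024_01_31.txt."""
    return f"{kind}_{site}_{day:%Y_%m_%d}.{ext}"


def build_catalog_archive(feed_dir: Path, site: str, day: date) -> Path:
    """Zip every .txt feed file in feed_dir into the daily catalog archive."""
    archive = feed_dir / feed_filename("catalog_full", site, day, ext="zip")
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(feed_dir.glob("*.txt")):
            # Assumption: feed files sit at the archive root, not in a folder.
            zf.write(path, arcname=path.name)
    return archive
```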
File Specification
See below for the specifications for each file.
The delimiter in the file must be a pipe (|) unless another delimiter is required. If, for some reason, you are not able to use a pipe as delimiter, alert your Algonomy representative who will work with you to accommodate your specific needs.
Please take care to ensure that delimiters cannot occur within column values. If this happens, the row will appear to have one too many columns to the interpreter, an error will occur, and the feed will fail.
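A stray delimiter inside a value shows up as a row with more columns than the header. The following sketch, with an illustrative function name, reports such rows before upload. It treats the first non-empty line as the header.

```python
def find_malformed_rows(lines, delimiter="|"):
    """Return (line_number, column_count) for rows whose width differs from the header."""
    expected = None
    malformed = []
    for number, line in enumerate(lines, start=1):
        row = line.rstrip("\r\n")
        if not row:
            continue  # ignore blank lines
        count = len(row.split(delimiter))
        if expected is None:
            expected = count  # the first non-empty line is the header
        elif count != expected:
            malformed.append((number, count))
    return malformed
```

For example, a product named "Hat | Scarf Set" in a three-column file would be reported as having four columns.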
Important: If the field is optional, leave it blank/null if you want to retain the default value listed in the table below. Only populate the field if you want to change the default value.
Note: The column headers are case-sensitive.
Note: The column headers need to be in the order as shown in the documentation.
Product Feed File
Filename: product_full_sitename_YYYY_MM_DD.txt
Name | Type | Required | Definition |
---|---|---|---|
product_id | ASCII | Yes | NOTE: If a product exists in a product_full file on day 1 as recommendable but is omitted on day 2, the {rr} processor will automatically set this product's recommendability to FALSE. It will also erase all attributes that were added in the first feed. |
name | Text | Yes | The name of the product as it will be displayed in recommendations on site and in the Personalization Cloud Dashboard. Max length: 255 characters. If missing, name is not updated if the product is already in the system. |
product_parent_id | ASCII | No | |
price | Number | Yes | |
recommendable | Boolean | Yes | |
image_url | Text | Yes | |
link_url | Text | Yes | |
rating | Number | No | |
num_reviews | Integer | No | |
brand | Text | No | |
description | Text | No | |
sale_price | Number | No | |
start_date | Date | No | |
sale_price_min | Number | No | |
sale_price_max | Number | No | |
list_price_min | Number | No | |
list_price_max | Number | No |
Note: Price must be defined for Sale Price to be used. Price min/max is a method for handling SKU-level pricing. To use one or both of the ranges, all four min/max fields must be added to the feed in the correct order, even if only one of the ranges is in use.
Example
product_id|name|price|recommendable|image_url|link_url
0001|Brown Pea Coat|299.99|true|assets/images/brown-pea-coat-27765.jpg|products/brown-pea-coat-27765.html
0002|Blue Fedora|49.99|true|assets/images/blue-fedora-28873.jpg|products/blue-fedora-28873.html
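A file like the example above could be generated along the following lines. This is a minimal sketch that writes only the required columns, in the order used by the example. The function name, the dictionary input, the lowercase true/false rendering, and the newline-terminated lines are assumptions made for illustration.

```python
REQUIRED_COLUMNS = ["product_id", "name", "price", "recommendable",
                    "image_url", "link_url"]


def format_product_feed(products, delimiter="|"):
    """Render product dicts as the text of a product feed file."""
    lines = [delimiter.join(REQUIRED_COLUMNS)]
    for product in products:
        values = []
        for column in REQUIRED_COLUMNS:
            value = product[column]
            if isinstance(value, bool):
                text = "true" if value else "false"
            else:
                text = str(value)
            if delimiter in text:
                # A delimiter inside a value would add a column and fail the feed.
                raise ValueError(f"{column} contains the delimiter: {text!r}")
            values.append(text)
        lines.append(delimiter.join(values))
    return "\n".join(lines) + "\n"
```

Write the result with encoding="utf-8" so that no BOM is added.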
Category Feed File
Filename: category_full_sitename_YYYY_MM_DD.txt (required)
Name | Type | Required? | Definition |
---|---|---|---|
category_id | ASCII | Yes | |
parent_id | ASCII | Yes | |
name | Text | Yes | |
category_link_url | Text | No | |
category_image_url | Text | No |
Example
category_id|parent_id|name
1005||Men's
1006||Women's
2005|1005|Jeans
2006|1006|Jeans
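To sanity-check the hierarchy before upload, you could verify that every non-empty parent_id refers to a category in the same file. The sketch below assumes, as in the example above, that a blank parent_id marks a top-level category. All function names are illustrative.

```python
def parse_categories(text, delimiter="|"):
    """Parse category feed text into a list of dicts keyed by column name."""
    lines = [line for line in text.splitlines() if line]
    header = lines[0].split(delimiter)
    return [dict(zip(header, line.split(delimiter))) for line in lines[1:]]


def orphaned_categories(categories):
    """Return IDs of categories whose parent_id is not defined in the feed."""
    known = {c["category_id"] for c in categories}
    return [c["category_id"] for c in categories
            if c["parent_id"] and c["parent_id"] not in known]


def category_path(category_id, categories):
    """Return the names from the top-level category down to category_id."""
    by_id = {c["category_id"]: c for c in categories}
    names, seen, current = [], set(), category_id
    while current:
        if current in seen:
            raise ValueError(f"cycle detected at category {current}")
        seen.add(current)
        node = by_id[current]
        names.append(node["name"])
        current = node["parent_id"]  # blank means top level (assumption)
    return " > ".join(reversed(names))
```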
Product-Category Mapping Feed File
Filename: product_in_category_sitename_YYYY_MM_DD.txt
Name | Type | Required? | Definition |
---|---|---|---|
category_id | ASCII | Yes | |
product_id | ASCII | Yes |
Example
category_id|product_id
3005|0002
3007|0001
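Because one product can map to more than one category, it can be useful to group the mapping by product when reviewing a feed. The following sketch does that for the two-column layout shown above. The function name is illustrative.

```python
from collections import defaultdict


def categories_by_product(text, delimiter="|"):
    """Group a product-category mapping feed by product_id."""
    lines = [line for line in text.splitlines() if line]
    header = lines[0].split(delimiter)
    if header != ["category_id", "product_id"]:
        raise ValueError(f"unexpected header: {header}")
    grouped = defaultdict(list)
    for line in lines[1:]:
        category_id, product_id = line.split(delimiter)
        grouped[product_id].append(category_id)
    return dict(grouped)
```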
Product Attribute Feed File
Filename: product_attribute_sitename_YYYY_MM_DD.txt
Note: For sites with minimal attribute requirements, this file may be omitted.
The Product Attribute File allows users to specify values for attributes that can then be used to inform the recommendations made by Personalization Cloud. Attributes like region, language, and color can all be specified in this file.
Users can pass custom compatibility lists as an attribute in the product attribute feed file. Product compatibility mapping (MCMP) allows users to define compatibility through the attribute feed using MCMP.NAME. Because compatibility is a special attribute, its name must be prefixed with "MCMP.", for example, MCMP.accessories. Here, accessories is the name of the compatibility attribute.
The attribute values (and multi-valued attributes) can be listed in either a column-based format or a row-based format. The system automatically detects the format based on the header. It should be noted, however, that the column-based format does not support localization (the use of region/language).
Column-based format
Name | Type | Required? | Definition |
---|---|---|---|
product_id | ASCII | Yes | |
attribute.[attribute key1] | Text | Yes | First unique attribute key |
attribute.[attribute key 2] | Text | Yes | Second distinct attribute key |
… | |||
attribute.[attribute key N] | Text | Yes | Nth distinct attribute (N=number of distinct attribute keys) |
Example
product_id|attribute.color|attribute.size|attribute.fabric
100|blue|medium|cotton
101|red|large|wool
Row-based format
Name | Type | Required? | Definition |
---|---|---|---|
product_id | ASCII | Yes | Identifier for the product as defined in the products file |
attr_name | Text | Yes | Unique attribute key. HTML-entity encode single and double quotes. |
attr_value | Text | Yes | Value of the attribute. HTML-entity encode single and double quotes. |
Examples
This is an example of simple passing of single-value attributes in the row-based file format:
product_id|attr_name|attr_value
100|size|s
100|color|red
The list value delimiter is a period by default but may be customized when creating the feed profile. It cannot be the same as the column delimiter. This is an example of passing multi-valued attributes in individual rows:
product_id|attr_name|attr_value
100|size|xl.l.m.s.xs
100|color|green.blue.red
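The two layouts can be told apart from the header line alone. The sketch below shows one way a pre-upload check might classify a header. It is not Algonomy's detection logic, and the function name is illustrative.

```python
def detect_attribute_format(header_line, delimiter="|"):
    """Classify an attribute feed header as 'row' or 'column' based."""
    columns = header_line.strip().split(delimiter)
    if columns[:3] == ["product_id", "attr_name", "attr_value"]:
        return "row"  # optional localization columns may follow
    if (len(columns) > 1 and columns[0] == "product_id"
            and all(c.startswith("attribute.") for c in columns[1:])):
        return "column"
    raise ValueError(f"unrecognized attribute feed header: {header_line!r}")
```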
Multiple Attribute Values
A single attribute may take more than a single item as a value. Sometimes items may have more than one possible size, or color, for example. In cases like this, you can specify a list of values for the attribute.
The list must use a delimiter that does not exist in attribute values and should not be a list delimiter used elsewhere in the file. Currently the default delimiter for lists is the period. However, the list value delimiter may be changed to something else when creating the feed profile. Alert your Algonomy integration consultant if you believe this will be necessary.
Note: Because the default delimiter is a period, values sent as decimal numbers (prices, for example) will be misinterpreted as two integers instead of a single float if the delimiter is left at default. Be sure to discuss with your Algonomy integration consultant if you will be passing decimal numbers in your attribute feed.
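The pitfall is easy to reproduce. In this sketch, split_values is an illustrative helper that splits a raw attribute value on the list delimiter, and the caret is only an example of an alternative delimiter.

```python
def split_values(raw, list_delimiter="."):
    """Split a raw attribute value into its list items."""
    return raw.split(list_delimiter)


colors = split_values("green.blue.red")              # three separate values
broken_price = split_values("19.99")                 # split into "19" and "99"
intact_price = split_values("19.99", list_delimiter="^")  # stays "19.99"
```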
Examples
Row-based:
product_id|attr_name|attr_value
100|color|green.blue.red
100|size|xl.l.m.s.xs
100|fabric|cotton
Column-based:
product_id|attribute.color|attribute.size|attribute.fabric
100|green.blue.red|xl.l.m.s.xs|cotton
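The two examples carry the same information. The following sketch converts column-based text into the row-based layout, which is needed if you later want to add localization. The function name is illustrative, and skipping empty cells is an assumption.

```python
def column_to_row(text, delimiter="|"):
    """Convert column-based attribute feed text to the row-based layout."""
    prefix = "attribute."
    lines = [line for line in text.splitlines() if line]
    header = lines[0].split(delimiter)
    names = [column[len(prefix):] for column in header[1:]]
    rows = [delimiter.join(["product_id", "attr_name", "attr_value"])]
    for line in lines[1:]:
        product_id, *values = line.split(delimiter)
        for name, value in zip(names, values):
            if value:  # assumption: an empty cell means no value for the product
                rows.append(delimiter.join([product_id, name, value]))
    return "\n".join(rows)
```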
Localized Attributes
You can signify that certain attributes are enabled only in the context of certain regions or certain languages. This is useful when, for example, you have regulatory attributes that differ across countries, or sales that run across different dates in different physical stores. By default, when there is a conflict between a region-dependent attribute and a language-dependent attribute, the region-dependent attribute takes precedence. If you have a specific need that requires language to take precedence, speak with your Algonomy representative about changing this setting.
Localized attributes persist from feed upload to feed upload. If you want to erase one of these attributes, or remove its influence and allow the non-localized version of the attribute to take precedence in all regions, include the attribute one more time with a localization_type of "-1".
Note: To leverage localized attributes in Find, the attribute first needs to be added without localization before it can be localized. In other words, it should have a value under the site default language or region, then you can add regionalized attributes. Referencing the second example below, this can be achieved by having a record such as: 100|on_sale|true|| prior to adding 100|on_sale|true|region|us.
Example
Row-Based Example with Localization
product_id|attr_name|attr_value|localization_type|localization_value
100|on_sale|true|region|us
100|on_sale|false|region|de
100|on_sale|true|region|fr
200|has_documentation|true|language|zh-CN
200|has_documentation|false|language|en-US
Row-Based Example with Localization leveraged for Find
product_id|attr_name|attr_value|localization_type|localization_value
100|on_sale|true||
100|on_sale|true|region|us
100|on_sale|false|region|de
100|on_sale|true|region|fr
200|has_documentation|true||
200|has_documentation|true|language|zh-CN
200|has_documentation|false|language|en-US
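To make the default precedence concrete, the sketch below resolves an attribute for a visitor's region and language: a region match wins over a language match, which wins over the non-localized value. It only illustrates the documented default and is not Algonomy's implementation. The function names are hypothetical, and the "-1" removal marker is not modeled.

```python
def parse_rows(text, delimiter="|"):
    """Parse localized row-based feed text into 5-tuples, skipping the header."""
    lines = [line for line in text.splitlines() if line]
    return [tuple(line.split(delimiter)) for line in lines[1:]]


def resolve_attribute(rows, product_id, attr_name, region=None, language=None):
    """Pick the value that applies, preferring region, then language, then default."""
    by_region = by_language = default = None
    for pid, name, value, loc_type, loc_value in rows:
        if pid != product_id or name != attr_name:
            continue
        if loc_type == "region" and loc_value == region:
            by_region = value
        elif loc_type == "language" and loc_value == language:
            by_language = value
        elif loc_type == "":
            default = value  # non-localized row
    for candidate in (by_region, by_language, default):
        if candidate is not None:
            return candidate
    return None
```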
Common Attributes
Margins
Margin-based strategies use the MARGIN attribute to determine the order of the products recommended.
A margin value is the margin (profit) the item generates for the retailer, expressed as a decimal between 0 and 1 (for example, .75 for 75%).
The attribute name is case sensitive, so using a column-based format, the column name would be attribute.MARGIN. An example of this would be as follows:
product_id|attribute.MARGIN
100|.75
That same example, in a row-based format:
product_id|attr_name|attr_value
100|MARGIN|.75
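A simple range check can catch margins sent as whole percentages (75 instead of .75). This sketch treats 0 and 1 as valid endpoints, which is an assumption. The function name is illustrative.

```python
def is_valid_margin(raw):
    """Return True if raw parses as a number between 0 and 1 inclusive."""
    try:
        value = float(raw)
    except ValueError:
        return False
    return 0.0 <= value <= 1.0
```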
Short Names
The attribute name “Product Short Name” has a special function. Products with a matching “Product Short Name” value will never be recommended together. This is most commonly used to prevent two similar products from being recommended side by side.
This attribute name is case-sensitive and requires spaces between the words. There are two valid names: "Product Short Name" and "PRODUCT SHORT NAME".
In a column-based feed, the column name would be "attribute.Product Short Name" or "attribute.PRODUCT SHORT NAME".
An example using the column-based format would be:
product_id|attribute.PRODUCT SHORT NAME
100|7-inch Tablet
And using the row-based format:
product_id|attr_name|attr_value
100|PRODUCT SHORT NAME|7-inch Tablet
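The effect of the short name on a recommendation list can be pictured with the sketch below, which keeps only the first product for each short name in a ranked list. It illustrates the described behavior and is not how Personalization Cloud implements it. The function name and inputs are hypothetical.

```python
def dedupe_by_short_name(ranked_ids, short_names):
    """Keep the highest-ranked product for each Product Short Name value."""
    seen = set()
    kept = []
    for product_id in ranked_ids:
        short = short_names.get(product_id)
        if short is not None:
            if short in seen:
                continue  # a product with the same short name is already shown
            seen.add(short)
        kept.append(product_id)
    return kept
```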
Reserved Words
These names are already used by the Algonomy system and cannot be used as attribute names:
- product_ancestors_id
- product_brand
- product_canonical_id
- product_categories
- product_end_date
- product_external_id
- product_feed_date
- product_genre
- product_genre_id
- product_image_id
- product_link_id
- product_name
- product_num_reviews
- product_pricecents
- product_price_range_max_cents
- product_price_range_min_cents
- product_rating
- product_recommendable
- product_release_date
- product_saleprice_cents
- product_sale_price_range_max_cents
- product_sale_price_range_min_cents
- product_substitutes
- product_type
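A feed generator can check attribute names against this list before writing the file. The sketch below uses exact, case-sensitive matching, which is an assumption, and expects plain attribute names without the "attribute." column prefix. The function name is illustrative.

```python
RESERVED_NAMES = frozenset({
    "product_ancestors_id", "product_brand", "product_canonical_id",
    "product_categories", "product_end_date", "product_external_id",
    "product_feed_date", "product_genre", "product_genre_id",
    "product_image_id", "product_link_id", "product_name",
    "product_num_reviews", "product_pricecents",
    "product_price_range_max_cents", "product_price_range_min_cents",
    "product_rating", "product_recommendable", "product_release_date",
    "product_saleprice_cents", "product_sale_price_range_max_cents",
    "product_sale_price_range_min_cents", "product_substitutes",
    "product_type",
})


def reserved_attribute_names(attr_names):
    """Return the attribute names that collide with reserved words."""
    return [name for name in attr_names if name in RESERVED_NAMES]
```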
Best Practice for IDs
When deciding which values to send to Algonomy as IDs for products, categories, and so on, we suggest you stick to alphanumeric ASCII strings with a maximum length of 100 characters. Spaces and most other special characters are allowed, but to avoid confusion, it is best not to include the characters commonly used as delimiters by our feed ingestion system in IDs (| ; . ^ , ~ `§ \ and /).
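These recommendations can be turned into a simple pre-upload check. The sketch below returns warnings rather than errors, since most characters are allowed. The function name is illustrative, and the character set is one reading of the list above, treating the backtick and § as separate characters.

```python
# Characters commonly used as delimiters (assumed reading of the list above).
DISCOURAGED_CHARS = frozenset("|;.^,~`§\\/")
MAX_ID_LENGTH = 100


def id_warnings(value):
    """Return human-readable warnings for an ID, or an empty list if none apply."""
    warnings = []
    if not value:
        warnings.append("ID is empty")
    if len(value) > MAX_ID_LENGTH:
        warnings.append(f"ID is longer than {MAX_ID_LENGTH} characters")
    if not value.isascii():
        warnings.append("ID contains non-ASCII characters")
    found = sorted(set(value) & DISCOURAGED_CHARS)
    if found:
        warnings.append("ID contains delimiter characters: " + " ".join(found))
    return warnings
```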