Data Requirements
Ready to get started integrating your data into Curate? Check out data typing and schemas below.
Data Types
Data types classify the format of data within a field in the schemas shared below. If you’re not familiar with data typing, here’s a quick crash course!
These types ensure the data intake has a predictable format for the service. This allows us not only to validate data faster, but also to eliminate errors BEFORE they impact your experience within the system. For example, an “Integer” represents a whole number, whereas a “Double” represents any number, whole or fractional. Take 42 as an example: an Integer never includes a decimal point, so 42 is a valid Integer value. As a Double, the same value is read as 42.0.
You may be wondering, why does this matter? Well! Imagine you are expecting to do some basic maths from the file you’re creating, and you’ve got two products with two different revenue values:
| Name | Revenue |
|---|---|
| Product 1 | $36.12 |
| Product 2 | $12.99 |
Simple, right? $49.11. Now imagine Product 2’s value was TWELVE. In a lot of software, adding a string to anything else produces concatenation, so the result becomes 36.12TWELVE.
Another common case where this can impact your data: if we didn’t enforce a Double type and this value was read as an Integer, you now have a problem. Do you round Product 2 to 12 or 13? Do you drop the decimal value entirely and leave it as 12? This seems like a small change, but across large data sets it can change the results of KPIs dramatically. Losing $0.99 on every transaction for every product would, over time, mean A LOT of missing revenue.
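The two pitfalls above can be reproduced in a few lines. This is a minimal Python sketch for illustration only; the variable names are hypothetical and it says nothing about how Curate processes files internally. Python itself refuses to add a number to a string, so the concatenation case is shown with both values stored as strings.

```python
# Revenue values read as Doubles add up as expected.
product_1 = 36.12
product_2 = 12.99
total = round(product_1 + product_2, 2)  # 49.11

# If both values arrive as strings, "+" joins the text instead of adding.
joined = "36.12" + "TWELVE"  # "36.12TWELVE"

# Forcing a Double into an Integer changes the value either way.
truncated = int(product_2)  # 12 (decimal part dropped)
rounded = round(product_2)  # 13 (rounded to the nearest whole number)
```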
String
A string is any data contained within a set of quotation marks (" "). Strings are often used for words or sentences where the value must be saved exactly as the user inputs it (including punctuation!). Strings can contain any other type of data, but they won’t be used as that data type (see the concatenation example above), so they are used for organizing data, not calculating results.
- `"HIVERYMART"` is your typical valid string.
- `"3.14"`, `"3"`, and even `" !"` or `"🐝"` are valid strings; they just can’t be used for maths.
- `HIVERYMART` is NOT a valid string; it’s missing its quotation marks.
Integer
An “Integer” is a simple way to represent whole numbers (positive or negative), meaning numbers without any decimal or fractional parts. For example, -2, -1, 0, 1, and 2 are all integers, whereas 3.14 is not. Integer data types are used when you want to explicitly reject fractional numbers or not be slowed down by calculating them.
- `1` or `-1` are valid integers.
- `0` is a valid integer, though it represents nothing.
- `3.14` is NOT a valid integer; it’s a double.
Double
A “Double” is a way to represent numbers, whole OR fractional, using decimal points. This means all integers are valid doubles, but not all doubles are valid integers. It’s called “double” because it holds numbers with “double the precision” of a single-precision floating-point number.
- `3.14` and `-3.14` are valid doubles.
- `3` is a valid double (representing `3.0`) AND a valid integer.
- `TWELVE` is NOT a valid double; it’s a string.
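The three type rules above can be sketched as simple checks on raw text values, for example before a file is uploaded. This is only an illustration: the function names are hypothetical, and the accepted number formats (an optional minus sign, digits, and an optional decimal part) are an assumption rather than Curate’s actual validation rules.

```python
import re

# Assumed text formats: optional minus sign, digits, optional decimal part.
INTEGER_PATTERN = re.compile(r"-?[0-9]+")
DOUBLE_PATTERN = re.compile(r"-?[0-9]+(\.[0-9]+)?")


def is_valid_integer(text: str) -> bool:
    """Return True for whole numbers such as 1, -1, or 0."""
    return INTEGER_PATTERN.fullmatch(text) is not None


def is_valid_double(text: str) -> bool:
    """Return True for whole or fractional numbers such as 3, 3.14, or -3.14."""
    return DOUBLE_PATTERN.fullmatch(text) is not None


def is_valid_string(text: str) -> bool:
    """Return True when the value is wrapped in double quotation marks."""
    return len(text) >= 2 and text.startswith('"') and text.endswith('"')
```

With these checks, `3` passes as both an integer and a double, `3.14` passes only as a double, and `TWELVE` passes neither numeric check.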
File Schemas
Below you’ll find each file explained so you know just what to do when integrating your data!
Products
A list of all products along with their corresponding UPCs, regardless of their current assortment status. This list should encompass details such as category, subcategory, brand, sub-brand, manufacturer, pack group, and all relevant product dimensions.
| Field | Description | Example | Default |
|---|---|---|---|
| upc | Integer used as a unique product identifier. UPCs should be aligned in each input file, including check digits. | 036000241457 | N/A - Required Field |
| id | String used as a unique product identifier to isolate items from a UPC. If you don’t have a specific use case for this, you may ignore it. | "123" | UPC Value |
| name | String providing a product name. Must be less than 70 characters | CLIF BLDRS MINT 6PK | N/A - Required Field |
| category | String used to define the product category. Must be less than 70 characters | Energy Bar | Defaults to ‘Unknown’ |
| subcategory | String used to define the product’s sub-category. Must be less than 70 characters | Whey-Based | Defaults to ‘Unknown’ |
| subsubcategory | String used to further define the product’s sub-category attributes. Must be less than 70 characters | Low-Sugar | Defaults to ‘Unknown’ |
| brand | String used to define the product brand. Must be less than 70 characters | CLIF | Defaults to ‘Unknown’ |
| subbrand | String used to define the product’s sub-brand. Must be less than 70 characters | CLIF BUILDERS | Defaults to ‘Unknown’ |
| manufacturer | String used to define the manufacturer of the product. Must be less than 70 characters | CLIF BAR & COMPANY | N/A - Required Field |
| packgroup | String used to define the type of packaging and/or count of the item. | 12OZ 6PK BOX | Defaults to ‘Unknown’ |
| size | Double used to specify the item’s weight or volume. | 12 | Defaults to ‘1.0’ |
| uom | String used to define the unit of measurement associated with the size column. | OZ | Defaults to ‘UNK’ |
| case_size | Integer used to define the number of items in a case. | 5 | Defaults to ‘1’ |
| usable_life | Integer used to define the item’s shelf life, in days. | 999 | Defaults to ‘999’ |
| height | Double used to define the height of the product, in inches. | 8.25 | Defaults to ‘1.0111’ |
| width | Double used to define the width of the product, in inches. | 11.32 | Defaults to ‘1.0111’ |
| depth | Double used to define the depth of the product, in inches. | 5.75 | Defaults to ‘1.0111’ |
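As a rough illustration of how the defaults and constraints in the Products table fit together, here is a hedged Python sketch. The `prepare_product` helper and its constants are hypothetical names for this example; the default values and the 70-character limit are copied from the table, but the actual intake logic may differ.

```python
# Defaults copied from the Products table above.
PRODUCT_DEFAULTS = {
    "category": "Unknown", "subcategory": "Unknown", "subsubcategory": "Unknown",
    "brand": "Unknown", "subbrand": "Unknown", "packgroup": "Unknown",
    "size": 1.0, "uom": "UNK", "case_size": 1, "usable_life": 999,
    "height": 1.0111, "width": 1.0111, "depth": 1.0111,
}
REQUIRED_FIELDS = ("upc", "name", "manufacturer")
LENGTH_LIMITED_FIELDS = (
    "name", "category", "subcategory", "subsubcategory",
    "brand", "subbrand", "manufacturer",
)
MAX_TEXT_LENGTH = 70  # values must be shorter than this


def prepare_product(row: dict) -> dict:
    """Check required fields and text lengths, then fill in the defaults."""
    missing = [field for field in REQUIRED_FIELDS if not row.get(field)]
    if missing:
        raise ValueError(f"Missing required fields: {missing}")
    too_long = [
        field for field in LENGTH_LIMITED_FIELDS
        if len(str(row.get(field, ""))) >= MAX_TEXT_LENGTH
    ]
    if too_long:
        raise ValueError(f"Must be less than {MAX_TEXT_LENGTH} characters: {too_long}")
    provided = {key: value for key, value in row.items() if value not in (None, "")}
    product = {**PRODUCT_DEFAULTS, **provided}
    product.setdefault("id", product["upc"])  # id defaults to the UPC value
    return product


# The UPC is kept as text in this sketch so the leading zero is preserved.
example = prepare_product({
    "upc": "036000241457",
    "name": "CLIF BLDRS MINT 6PK",
    "manufacturer": "CLIF BAR & COMPANY",
})
```

Here `example["category"]` falls back to `"Unknown"` and `example["id"]` falls back to the UPC, while the three required fields must always be supplied.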
Stores
A numbered list of stores that includes geographical information such as region, state, distribution center, cluster, and latitude and longitude.
| Name | Description | Example | Default | Constraints |
|---|---|---|---|---|
| store_number | Unique store identifier. | 1 | | Cannot be empty. The type of data is an Integer |
| name | The store’s name. | HIVERYMART HQ | N/A | String |
| region | Any type of region—can be geographic, retailer, socioeconomic, numeric, etc. | RURAL | N/A | String |
| store_type | Any store-level attribution, i.e. store size, demographics, or population density. | Neighborhood | supermarket | String |
| distribution_center | The distribution center (DC) that feeds the store. | DC1 | N/A | String |
| cluster | If using clusters, enter the planogram id or cluster name/id. | HQAREA | 0 | String |
| state | The state the store is in. | NC | N/A | String |
| latitude | Useful for geographical insights. | 35.5403 | 0.0 | Double |
| longitude | Useful for geographical insights. | -79.7480 | 0.0 | Double |
| address | Address of the store. | 100 MAIN ST | N/A | String |
| post_code | Postal or zip code for that store. | 27341 | | String |
Movement
Mapping group UPCs to store number, weekly movement data, and pricing information.
| Name | Description | Example | Default | Constraints |
|---|---|---|---|---|
| store_number | A unique store identifier. | 1 | | Cannot be empty. The type of data is an Integer |
| product_key | The UPC or ID of the product. Note, the product_key should align with the UPC or ID that was provided in the products-master tab, including check digits. | 036000241457 | | Cannot be empty. The type of data is a String |
| week_start_date | The week start date in year-month-day format. Any day of the week may be used as the week start date, as long as it is consistent. | "2021-01-31" | | String. Format="YYYY-MM-DD". Note: Excel will reformat |
| units | Number of units sold for the week starting at date. | 21 | 0.0 | Double |
| revenue | Sales revenue for the week starting at date. | 84.55 | 0.0 | Double |
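The `week_start_date` rules (year-month-day text, and the same weekday throughout) can be checked with the standard library. This is a minimal sketch with hypothetical function names, not the service’s own validation.

```python
from datetime import date, datetime
from typing import Iterable


def parse_week_start(text: str) -> date:
    """Parse a year-month-day date; raises ValueError for layouts such as 01/31/2021."""
    return datetime.strptime(text, "%Y-%m-%d").date()


def week_starts_are_consistent(week_start_dates: Iterable[str]) -> bool:
    """Return True when every week_start_date falls on the same day of the week."""
    weekdays = {parse_week_start(text).weekday() for text in week_start_dates}
    return len(weekdays) <= 1
```

For example, `"2021-01-31"` and `"2021-02-07"` are both Sundays, so they are consistent, whereas adding `"2021-02-08"` (a Monday) would not be.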
Planograms
Planogram files for use with space aware projects. These are uploaded as a .zip of all the planograms. Each planogram uses the file format of .psa. Upload these for space aware planogram drawing in the app. Uploading unlocks features like: Days of Supply and Number of Facings.
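Before uploading, it can help to confirm that the archive only contains .psa files. The sketch below uses Python’s standard `zipfile` module; the `non_psa_members` helper is a hypothetical name, and treating the extension as case-insensitive is an assumption.

```python
import io
import zipfile


def non_psa_members(archive) -> list:
    """List the files inside a .zip archive that do not end in .psa."""
    with zipfile.ZipFile(archive) as bundle:
        return [
            name
            for name in bundle.namelist()
            if not name.endswith("/") and not name.lower().endswith(".psa")
        ]


# Build a small in-memory archive to demonstrate the check.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as bundle:
    bundle.writestr("HIVERYMART_HQ_FOODS.psa", "placeholder planogram contents")
    bundle.writestr("notes.txt", "not a planogram")
buffer.seek(0)

unexpected = non_psa_members(buffer)  # ["notes.txt"]
```

The same function also accepts a path to a .zip file on disk.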
Planogram-Store Mapping
Indicates which Planograms are executed at which stores. Use this in space aware projects. This is required along with the planogram files.
| Name | Description | Example | Default | Constraints |
|---|---|---|---|---|
| planogram_id | Unique planogram identifier. Usually the name of the file. | "HIVERYMART_HQ_FOODS.psa" | | Cannot be empty. This needs a value. Value needs to be a String. |
| store_number | Unique store identifier. | 1 | 1 | Cannot be empty. The type of data is an Integer |
Product Images
Images of each product included in this project to display on a planogram as either assorted or unassorted. The product images are uploaded in a zip folder just like planograms. Use these with your space aware project in order to visualize the planograms. Note that these are optional, and any missing images will default to a backup indicator. In alignment with the Product Image Standards from GS1, multiple product images across orientations can be numerically ordered as follows:
- Front 1
- Left 2
- Top 3
- Back 7
- Right 8
- Bottom 9
File Naming Convention: <upc>.<orientation>.<file type>
Example: 123.1.png, which corresponds to the product UPC 123 and the front-facing orientation.
Notes:
- UPC should match the UPC in the product master.
- Orientation is a numerical indicator, e.g.: 1 -> front, 2 -> left
- File type can be .jpg or .png
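The naming convention above can be checked with a small parser. This is a sketch only: `parse_image_name` is a hypothetical helper, and the orientation codes are taken from the list in this section.

```python
# Orientation codes as listed in this section.
ORIENTATIONS = {1: "front", 2: "left", 3: "top", 7: "back", 8: "right", 9: "bottom"}
ALLOWED_FILE_TYPES = {"jpg", "png"}


def parse_image_name(filename: str) -> tuple:
    """Split '<upc>.<orientation>.<file type>' into (upc, orientation name, file type)."""
    parts = filename.split(".")
    if len(parts) != 3:
        raise ValueError(f"Expected <upc>.<orientation>.<file type>, got {filename!r}")
    upc, orientation_text, file_type = parts
    if not orientation_text.isdigit() or int(orientation_text) not in ORIENTATIONS:
        raise ValueError(f"Unknown orientation {orientation_text!r} in {filename!r}")
    if file_type.lower() not in ALLOWED_FILE_TYPES:
        raise ValueError(f"File type must be .jpg or .png, got {file_type!r}")
    return upc, ORIENTATIONS[int(orientation_text)], file_type.lower()


parsed = parse_image_name("123.1.png")  # ("123", "front", "png")
```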
Innovation Items
New items that warrant an investigation into their performance. Use this for items that don’t have sales data. This enables us to predict the item’s performance at each store.
| Name | Description | Example | Default | Constraints |
|---|---|---|---|---|
| product_key | Either the UPC or ID of the innovation item, whichever you have chosen in the products-master file. Note, all innovation items must be included in the products-master table. | "036000241457" | | Cannot be empty. This needs a value. Value needs to be a String |
| unit_price | The price for one unit of the innovation item. | 5.99 | 7.77 | Double |
| update_type | The type of projection the innovation item will receive. ‘sister’ denotes a pairing with one or more like items, while ‘constant’ denotes a flat unit per store per week projection across all stores. | "sister" | | Cannot be empty. This needs a value of either "sister" or "constant". Value needs to be a String |
| values | If update_type is "sister", then provide a weight and a sister item in the format {weight}:{sister item UPC/ID}, or the format {weight}:{sister item UPC/ID};{weight}:{sister item UPC/ID} if it has multiple sisters. The estimated movement of the innovation item will be the sum of the weighted movement divided by the number of sisters. If update_type is "constant", then provide the units per store per week projection. | "0.65:036000241457" | | Cannot be empty. This needs a value. Value needs to be a String |
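To make the `values` format and the sister calculation concrete, here is a hedged Python sketch. The function names and the sample movement figures are made up for illustration; the formula follows the description in the table (sum of weighted movement divided by the number of sisters), and the real model may apply it differently.

```python
def parse_sisters(values: str) -> list:
    """Parse '{weight}:{key};{weight}:{key}' into a list of (weight, sister key) pairs."""
    pairs = []
    for entry in values.split(";"):
        weight_text, sister_key = entry.split(":", 1)
        pairs.append((float(weight_text), sister_key))
    return pairs


def estimate_movement(values: str, sister_movement: dict) -> float:
    """Sum of the weighted sister movement, divided by the number of sisters."""
    pairs = parse_sisters(values)
    weighted_total = sum(weight * sister_movement[key] for weight, key in pairs)
    return weighted_total / len(pairs)


# Hypothetical weekly units for two sister items at one store.
sample_movement = {"111": 10.0, "222": 20.0}
estimate = estimate_movement("0.5:111;1.5:222", sample_movement)  # (5 + 30) / 2 = 17.5
```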
Sister Stores
Use this file for new stores with no data to simulate item performance based on data from similar stores.
| Name | Description | Example | Default | Constraints |
|---|---|---|---|---|
| store_number | Unique store identifier. | 1 | | Cannot be empty. The type of data is an Integer |
| sister_store_number | Unique store identifier of the sister store. | 1 | | Cannot be empty. The type of data is an Integer |
Availability
This file defines the products available by store and is not a requirement if all stores can carry all products. Every product that is available at a particular store must have an entry, otherwise it will not be assorted at that store.
| Name | Description | Example | Default | Constraints |
|---|---|---|---|---|
| store_number | Unique store identifier. | 1 | | Cannot be empty. The type of data is an Integer |
| product_key | Either the UPC or ID, whichever you have chosen in the products-master file. | 036000241457 | | Cannot be empty. The type of data is an Integer |
| supplier | Optional. Use this if you want to add a supplier at the store/item level. | "CLIF BAR & COMPANY" | ‘WAREHOUSE’ | String |
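The availability rule can be pictured as a lookup of (store, product) pairs. The sketch below is illustrative only: the function names are hypothetical, and keeping product keys as text is a choice made for this example.

```python
from typing import Iterable, Optional, Set, Tuple


def build_availability(rows: Iterable[dict]) -> Set[Tuple[int, str]]:
    """Collect the (store_number, product_key) pairs listed in the Availability file."""
    return {(int(row["store_number"]), str(row["product_key"])) for row in rows}


def can_assort(
    store_number: int,
    product_key: str,
    availability: Optional[Set[Tuple[int, str]]] = None,
) -> bool:
    """No file means every store can carry every product; otherwise an entry is needed."""
    if availability is None:
        return True
    return (store_number, product_key) in availability


listed = build_availability([{"store_number": 1, "product_key": "036000241457"}])
```

With this file, store 1 can assort the listed product, while a store without an entry for it cannot.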
Consumer Decision Tree
A graphical depiction of the process and factors a customer considers when deciding to buy a product within a certain category. It informs Hivery’s demand transfer capabilities and outlines the prioritization of attributes important to the customer, like brand, price, size, flavor, quality, or features.
Upload an image of your CDT and we will translate it and use it for informing our models. This can be in any format, but the more detailed the better. Our system then takes your uploaded Products file and matches each product to the CDT based on those products’ attributes.
Using your CDT we power our Demand Transfer model, which in turn powers features such as: Product Recommendations, Enhanced Assortments, and Assortment Strategy.