Classifications

Classifications provide a way to divide your users into segments that are known to behave differently. Common segmentation keys are Country (DK/US/IT etc.), Type (B2B/B2C etc.), or Industry (VVS/Painting/Roofing etc.).

The popularity of products and content entities is completely isolated between segments. This means that algorithms like PopularProducts, PopularContent, PopularBrand etc. source their data strictly from the user's own segment (unless otherwise configured via Classification Matchings).

Popularity is also used indirectly in other scenarios, such as in search ranking (when sorting by relevance or popularity) or recommendation fill-strategies.

Classifications may also be used to target specific user groups with merchandising rules.

Setting up Classifications

Classifications are automatically created whenever we detect a new value or value-pair: the classifications provided on the User-object on any tracking request dictate in which data-silo that piece of tracked behavior gets stored. Once segments are created, Relewise will automatically infer how behavior differs between segments, based on the behavior tracked for users belonging to each segment.

Segments are identified by a simple string-key, or if necessary, a combination of string-keys.

CRUD Semantics

Classifications cannot be manually created, deleted, or modified.

They are always automatically created upon first use, and if Relewise stops seeing data for a certain segment, that silo of data will just be left to become obsolete over time.

Segment data cannot be manually moved or copied from one silo to another. This means you should avoid renaming the keys and values used for classifications, as doing so will likely cost you behavioral data already collected. For more info on migrating classifications, read the Migrating Classifications section below.

Essentially, Relewise separates the behavioral data collected for each segment into its own silo. This means that behavioral data assigned to one segment will never affect the behavioral data assigned to another segment. The upside to this is that it allows us to know what products are popular for a particular group of users, compared to another comparable group of users. The downside is that the data is being spread more thinly, meaning that you require comparatively more behavioral data to ensure that each segment is statistically valid.

The classification provided on the User-object on any search or recommendation request will dictate the data-silo from which the data gets read to produce the result.
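
The write and read paths described above can be pictured with a small in-memory model. This is only an illustrative sketch: `SiloStore`, `segment_key`, `track_view` and `popular_products` are hypothetical names invented for this example and are not part of the Relewise API. It assumes a segment is identified by the full set of classification key/value pairs on the user.

```python
from collections import Counter, defaultdict

# Hypothetical toy model, not the Relewise API.
# A segment is identified by the sorted (key, value) classification pairs.
def segment_key(classifications: dict[str, str]) -> tuple:
    return tuple(sorted(classifications.items()))

class SiloStore:
    def __init__(self) -> None:
        # One popularity counter per segment (data-silo).
        self._silos: dict[tuple, Counter] = defaultdict(Counter)

    def track_view(self, classifications: dict[str, str], product_id: str) -> None:
        # Writing: the classifications on the user pick the silo.
        self._silos[segment_key(classifications)][product_id] += 1

    def popular_products(self, classifications: dict[str, str], take: int = 3) -> list[str]:
        # Reading: only the silo matching the user's classifications is consulted.
        silo = self._silos.get(segment_key(classifications), Counter())
        return [product_id for product_id, _ in silo.most_common(take)]

store = SiloStore()
store.track_view({"Country": "DK"}, "p1")
store.track_view({"Country": "DK"}, "p1")
store.track_view({"Country": "DK"}, "p2")
store.track_view({"Country": "US"}, "p3")

dk_popular = store.popular_products({"Country": "DK"})  # sees only DK behavior
us_popular = store.popular_products({"Country": "US"})  # sees only US behavior
```

In this model, behavior tracked for DK users never influences what US users see, which mirrors the isolation between silos.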

Number of Segments

The number of segments (data-silos) is determined by how many different classification values we see for a single classification key or, when using multiple classification keys, by the total number of unique combinations seen.

Example

Key: Country

Seen values: DK, US and UK.

There will then be 3 segments/silos.

Example

Key: Country

Key: Type

Seen value-pairs: DK/B2C, DK/B2B, US/B2C, US/B2B, UK/B2C, UK/B2B

There will then be 6 different segments/silos. We see the same countries as in example 1, but each of them now conceptually has 2 sub-segments: B2C and B2B.
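
The two examples can be reproduced by counting distinct classification combinations. The snippet below is a minimal sketch with made-up observations; `count_segments` is a hypothetical helper, not a Relewise function.

```python
# Hypothetical observations: one dict of classifications per tracked request.
seen = [
    {"Country": "DK", "Type": "B2C"},
    {"Country": "DK", "Type": "B2B"},
    {"Country": "US", "Type": "B2C"},
    {"Country": "US", "Type": "B2B"},
    {"Country": "UK", "Type": "B2C"},
    {"Country": "UK", "Type": "B2B"},
    {"Country": "DK", "Type": "B2C"},  # a repeat does not create a new segment
]

def count_segments(observed: list[dict[str, str]]) -> int:
    # Each unique combination of key/value pairs counts as one segment.
    return len({tuple(sorted(c.items())) for c in observed})

# Example 2: Country and Type together.
segment_count = count_segments(seen)

# Example 1: the same traffic if only the Country key had been sent.
country_only_count = count_segments([{"Country": c["Country"]} for c in seen])
```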

Because combinations multiply, if you operate in 10 different countries with the Country key, and at the same time have an Industry key with 10 different industries (and assuming that every industry is represented in every country), you would have a total of 10 x 10 = 100 segments. Adding a third classification key can easily grow the number of segments into the thousands, which makes it almost impossible to maintain statistically valid data for each segment.
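
To see how quickly the data thins out, the worst-case segment count can be computed as the product of the number of values per key. The figures below (12 values for a third key, 600,000 tracked events per month) are purely illustrative assumptions, and `max_segments` is a hypothetical helper.

```python
from math import prod

def max_segments(values_per_key: dict[str, int]) -> int:
    # Upper bound: every value of every key combined with every other.
    return prod(values_per_key.values())

two_keys = max_segments({"Country": 10, "Industry": 10})
three_keys = max_segments({"Country": 10, "Industry": 10, "Type": 12})

# Illustrative volume only: average events available per segment per month.
monthly_events = 600_000
avg_per_segment_two_keys = monthly_events / two_keys
avg_per_segment_three_keys = monthly_events / three_keys
```

With the same traffic, the third key cuts the average data per segment from 6,000 to 500 events in this example, and real traffic is rarely spread evenly, so the smallest segments fare worse still.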

As such, it is always best to minimize the number of segments, so that each segment receives enough data to be statistically valid.

Migrating Classifications

Classifications "live" on the user object and move with the user. A classification segment is thus the collection of all tracked users, both known and unknown, that are registered to it.

If you have 1 classification key, e.g. Country, and then add a second key, e.g. Industry, six months later, and thus start sending 2 classifications instead of 1 on all requests, then all data collected in that 6-month interim will essentially become obsolete. This is because a segment is defined by its combination of key-values, not by individual keys and values (unless configured otherwise via Classification Matchings).

However, for known users returning to the site who previously had just the one classification key, the previously observed behavior for each user will be migrated to the new combined classification segment. For anonymous users, past behavior for obsolete classifications will never be migrated.
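
The difference between known and anonymous users can be sketched with a toy model. Everything here is an assumption made for illustration: `TrackedUser`, `segment_of` and `apply_new_classifications` are hypothetical names, and the real migration happens inside Relewise and may work differently in detail.

```python
from dataclasses import dataclass, field

# Toy model only; these names are hypothetical and not part of any Relewise SDK.
def segment_of(classifications: dict[str, str]) -> tuple:
    return tuple(sorted(classifications.items()))

@dataclass
class TrackedUser:
    is_known: bool   # True for identified users, False for anonymous ones
    segment: tuple
    events: list[str] = field(default_factory=list)

def apply_new_classifications(user: TrackedUser, classifications: dict[str, str]) -> None:
    """Move the user to the segment implied by the classifications on the request."""
    new_segment = segment_of(classifications)
    if new_segment == user.segment:
        return
    if not user.is_known:
        # Anonymous history under the obsolete segment is left behind.
        user.events = []
    # Known users keep their history, which now counts toward the new segment.
    user.segment = new_segment

known = TrackedUser(True, segment_of({"Country": "DK"}), ["view:p1", "buy:p1"])
anonymous = TrackedUser(False, segment_of({"Country": "DK"}), ["view:p2"])

combined = {"Country": "DK", "Industry": "Roofing"}
apply_new_classifications(known, combined)
apply_new_classifications(anonymous, combined)
```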

Classifications: Best Practices

There is no definitive right or wrong number of segments. The right number depends on how many visitors and how much representative behavioral tracking you have across all segments. It is generally best to keep the number of classification key/value combinations as low as possible, while still separating users into segments known to differ significantly in behavior.

Practically, strive to have 1 or 2 classification keys, with a total number of values/value-pairs of less than 50. Having only 1, 2, 3 or 5 total values/value-pairs is perfectly fine. This will ensure your data doesn’t get spread too thin in most scenarios.
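
These rules of thumb can be turned into a quick sanity check against the classification combinations you plan to send. The helper below is a hypothetical sketch; the thresholds simply restate the guideline of at most 2 keys and fewer than 50 combinations.

```python
def check_classification_plan(combinations, max_keys=2, max_combinations=50):
    """Return warnings for a list of classification dicts (hypothetical helper)."""
    keys = {key for combo in combinations for key in combo}
    distinct = {tuple(sorted(combo.items())) for combo in combinations}
    warnings = []
    if len(keys) > max_keys:
        warnings.append(f"{len(keys)} classification keys in use; aim for at most {max_keys}")
    if len(distinct) >= max_combinations:
        warnings.append(
            f"{len(distinct)} value combinations seen; aim for fewer than {max_combinations}"
        )
    return warnings

# 3 countries x 2 types = 6 combinations across 2 keys: within the guideline.
small_plan = [{"Country": c, "Type": t} for c in ("DK", "US", "UK") for t in ("B2C", "B2B")]

# 10 x 10 x 2 = 200 combinations across 3 keys: both guidelines exceeded.
large_plan = [
    {"Country": f"C{i}", "Industry": f"I{j}", "Type": t}
    for i in range(10) for j in range(10) for t in ("B2C", "B2B")
]

small_warnings = check_classification_plan(small_plan)
large_warnings = check_classification_plan(large_plan)
```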

Having too few segments, on the other hand, can also leave value on the table. E.g., if you service ten markets, but do not segment between them, then the popularity of products and/or content will be an average for all countries, rather than specific to the individual country. If you suspect that user behavior of viewing and buying products differs between countries, then the best course of action is to add Country as a classification.

Maintain Consistency

It is important to always use the exact same classification keys and values for the same user regardless of the type of request. This is to ensure that a user contributes to the same pool of data as the user will consume data from.

If classifications are not provided on a User on a request, Relewise will remember what classification was last used by that same user and use that for the request processing. However, best practice is to always pass the classification as part of the user object whenever it is known, which should in most cases be all the time.

UserUpdate

Generally, classifications should never be maintained manually via UserUpdate requests. In almost all cases they can be managed purely via the regular tracking/searching/recommendation requests, with classifications provided on the user object on the request.

Classification Matchings

For most situations, the default setup of classifications is sufficient to get the desired benefit from its use. For certain advanced edge cases, however, it may be useful to alter the behavior to better tailor the use of classifications to match your requirements. With Classification Matchings, Relewise can handle situations where you have multiple classification keys, but only want to evaluate certain value-pairs, while ignoring others, or scenarios where you want to favor the weight of a certain key in a value-pair.

Keep in mind...

Classification Matchings are an advanced strategy that requires precise fine-tuning to work well. For the vast majority of scenarios, it is not necessary to engage with this sub-feature, as the default classification strategies will be more than enough to succeed.

If you believe Classification Matchings are necessary for your implementation, or if you want to know more, reach out to us for a sparring session. These are some of the scenarios that could warrant their use:

You have introduced a new classification key called Industry, but do not want to use it yet for segmenting your data.
You have a classification key called Incentive that defines whether the user is a new user, a high spender, or a frequent shopper. This is not a clear distinguisher of behavior, but you still want to use it a bit to tilt the results towards their respective incentive.
You have a classification called Country but still want to use a bit of the behavior of all countries irrespective of whether they match, except when the country is Sweden: you have identified that Swedish behavior is so different from all other countries that Swedish users should never use behavior from other countries.
