You can use the Recommend node to find new items that a customer might be interested in.
The Recommend node requires a data set containing at least three fields:
- A field that uniquely identifies users
- A field that uniquely identifies products
- A numeric ranking field that represents how a given user has rated a product
With these three fields, the Recommend node can compare the rankings that different users have given to the same products to predict how users might rank products that they have not yet experienced. For example:
User | Product | User's ranking of product |
---|---|---|
User 1 | A | 7 |
User 1 | B | 6 |
User 1 | C | 9 |
User 2 | A | 6 |
User 2 | B | 7 |
User 2 | C | ? |
In this example, the Recommend node could predict how User 2 might rank Product C. This prediction would be based on how User 1 ranked Product C, given that User 1 and User 2 ranked Products A and B quite similarly. In cases where predicted rankings are high, the new product could then be recommended to the user.
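This topic does not describe the algorithm that the Recommend node uses internally, so the following Python sketch only illustrates the general idea and is not the node's implementation. It assumes a simple mean-offset approach with a single similar user: the predicted ranking is User 2's average on the shared products, adjusted by how far User 1's ranking of Product C sits from User 1's own average on those products. The function name `predict_from_neighbor` is hypothetical.

```python
# Illustrative only: a mean-offset prediction from one similar user.
# This is NOT the Recommend node's actual algorithm.

rankings = {
    "User 1": {"A": 7, "B": 6, "C": 9},
    "User 2": {"A": 6, "B": 7},  # Product C has not been ranked yet
}


def predict_from_neighbor(rankings, target, neighbor, product):
    """Predict the target user's ranking of a product from one neighbor."""
    # Products that both users have ranked.
    shared = sorted(set(rankings[target]) & set(rankings[neighbor]))
    if not shared:
        raise ValueError("users have no products in common")
    target_mean = sum(rankings[target][p] for p in shared) / len(shared)
    neighbor_mean = sum(rankings[neighbor][p] for p in shared) / len(shared)
    # How far the neighbor's ranking of the product is from their average.
    offset = rankings[neighbor][product] - neighbor_mean
    return target_mean + offset


prediction = predict_from_neighbor(rankings, "User 2", "User 1", "C")
print(prediction)  # 9.0
```

Because both users average 6.5 on Products A and B, and User 1 ranked Product C 2.5 points above that average, the sketch predicts 9.0 for User 2.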
Recommendation training
To perform recommendation training, you will need a labeled data set, that is, one where users have ranked products.
Product recommendation example: Training
This example uses a sample from a product rankings data set. The object of training is to predict how users might rank products that they have not used.
user | product | ranking |
---|---|---|
001 | a | 99 |
001 | b | 42 |
001 | c | 73 |
002 | a | 33 |
002 | b | 63 |
- For training, specify ranking in the Rating Field property.
- Specify user in the User Field property.
- Specify product in the Product Field property.
- Specify a Prediction Field with an appropriate name, for example predictedRank.
- Run the analysis.
The Recommend node outputs a data set containing the user and product input fields, along with the Prediction Field in place of the ranking field:
user | product | predictedRank |
---|---|---|
001 | a | 99 |
001 | b | 35 |
001 | c | 68 |
002 | a | 40 |
002 | b | 51 |
Recommendation evaluation
To evaluate the accuracy of an analytic model, and by extension the accuracy of scoring that is performed using that model, you can use the Recommend node's Evaluate operation.
To evaluate a child training model, you need to use a validation data set as an input to a Recommend node. See Generating training and validation data sets.
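How the product generates validation data is covered in the topic referenced above. Purely to illustrate the concept, the following Python sketch holds back a share of the labeled records for evaluation. The function name `split_labeled_records`, the 20% validation share, and the fixed seed are assumptions made for this example, not product settings.

```python
import random


def split_labeled_records(records, validation_share=0.2, seed=42):
    """Shuffle labeled records and split them into (training, validation)."""
    shuffled = list(records)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    # Always hold back at least one record for validation.
    n_validation = max(1, round(len(shuffled) * validation_share))
    return shuffled[n_validation:], shuffled[:n_validation]


# (user, product, ranking) rows from the training example in this topic.
labeled = [
    ("001", "a", 99), ("001", "b", 42), ("001", "c", 73),
    ("002", "a", 33), ("002", "b", 63),
]
training, validation = split_labeled_records(labeled)
print(len(training), len(validation))  # 4 1
```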
Product recommendation example: Evaluating the child training model
To evaluate the child model created during the product ranking data set training:
- Provide a validation data set as input to the Recommend node.
- Select Evaluate in the Operation property.
The evaluation produces an RMSE:
ModelDisplayName | ChildModelDisplayName | Rank | RMSE |
---|---|---|---|
Recommendation Model | Child Model 1 | 1 | 15.82 |
Recommendation re-training and re-evaluating
After training and evaluating your first child model, you can choose to train another one in order to obtain a better RMSE and more accurate scoring results.
To re-train for recommendation, you will need new data. Each time you re-train using the same analytic model, another child model is produced. Once a new child model is produced, you can evaluate it using the data store output and child model that were produced by your new training attempt. If the new child model has a lower RMSE, you can then use it for scoring.
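Choosing between child models comes down to comparing their RMSE values. As a minimal sketch in Python, assuming the evaluation results have been collected into a list of rows: the first row comes from the evaluation example in this topic, while the second child model and its RMSE are made-up values for illustration only.

```python
# Evaluation results per child model. The first row is from the evaluation
# example; the second row is hypothetical and exists only for this sketch.
evaluations = [
    {"child_model": "Child Model 1", "rmse": 15.82},
    {"child_model": "Child Model 2", "rmse": 12.40},  # hypothetical value
]

# The child model with the lowest RMSE is the candidate for scoring.
best = min(evaluations, key=lambda row: row["rmse"])
print(best["child_model"])  # Child Model 2
```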
Recommendation scoring
Prerequisite: You have selected a child model within your analytic model to use for scoring. See Creating analytic models.
Once you have selected a child model to use for scoring, you can create another analysis that uses a Recommend node to score an unlabeled data set, that is, to predict values for each record. There are three types of scoring with the Recommend node. In the Score Type property you can choose from:
- Ratings - Given a user field and a product field, predict a rating field that represents how that user might rate that product.
- Users - Given a product field, find ratings that were given to the product.
- Products - Given a user field, find products to recommend to the user.
Product recommendation example: Scoring
This example completes the product recommendation data set examples in this topic. You have an unlabeled data set containing a user field and a product field. The values in these fields are the same as the set of values used in training:
user | product |
---|---|
1 | A |
1 | B |
1 | C |
2 | A |
2 | B |
2 | C |
- Select a child model to use for scoring, for example Child Model 1.
- Provide the unlabeled data set as input to a Recommend node.
- Select Score in the Operation property.
- Select Ratings in the Score Type property.
- Specify user in the User Field property.
- Specify product in the Product Field property.
- Specify a name for the Prediction Field to hold the rating values, for example predictedRank.
- Select a data type for the prediction field in the Prediction Field Type property, for example Integer.
The following results are produced:
user | product | predictedRank |
---|---|---|
1 | A | 91 |
1 | B | 30 |
1 | C | 76 |
2 | A | 46 |
2 | B | 50 |
2 | C | 60 |
You can then output this data set to a new data store and use it in other data stages, such as a dashboard. Note that in this example, the scoring model was able to generate a prediction about how User 2 would rank product C, based on how User 1 ranked products A, B, and C.
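One way a downstream stage might turn these scored ratings into recommendations is to keep only the products whose predicted ranking is high enough. The Python sketch below works on the scoring results from this example; the function name `products_to_recommend` and the threshold of 60 are assumptions chosen for illustration, not product behavior.

```python
from collections import defaultdict

# (user, product, predictedRank) rows from the scoring results above.
scored = [
    (1, "A", 91), (1, "B", 30), (1, "C", 76),
    (2, "A", 46), (2, "B", 50), (2, "C", 60),
]


def products_to_recommend(scored_rows, threshold):
    """Group products by user, highest predicted ranking first."""
    recommendations = defaultdict(list)
    # Visit rows from highest to lowest predicted ranking.
    for user, product, predicted in sorted(scored_rows, key=lambda r: -r[2]):
        if predicted >= threshold:
            recommendations[user].append(product)
    return dict(recommendations)


result = products_to_recommend(scored, threshold=60)
print(result)  # {1: ['A', 'C'], 2: ['C']}
```

With this threshold, Product C is the only product suggested for User 2, which matches the prediction highlighted in the example.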
Regression or Recommendation: Root Mean Square Error (RMSE)
The Root Mean Square Error (RMSE) is a measure used to evaluate Regression or Recommendation models. RMSE is the square root of the mean of the squared errors, where each error is the difference between a predicted value and the corresponding labeled value.
In general, the lower the RMSE, the better the performance of a model. What typifies a "low" RMSE depends on the range of values in the model's label field.
Large errors between predicted values and labeled values have a disproportionate effect on the RMSE, because each error is squared before the mean is taken.
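As a minimal Python sketch of the standard RMSE calculation (square each error, average the squares, then take the square root), the following uses the ranking and predictedRank values from the training example tables in this topic. Those values are reused only for illustration; the result is not expected to match the RMSE of 15.82 in the evaluation example, which is computed on a validation data set. The function name `rmse` is an assumption for this sketch.

```python
import math


def rmse(labeled, predicted):
    """Root mean square error between labeled and predicted values."""
    if len(labeled) != len(predicted) or not labeled:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - a) ** 2 for a, p in zip(labeled, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


labeled = [99, 42, 73, 33, 63]    # ranking values from the training input
predicted = [99, 35, 68, 40, 51]  # predictedRank values from the output

value = rmse(labeled, predicted)
print(round(value, 2))  # 7.31

# The errors are 0, -7, -5, 7 and -12. The largest error alone contributes
# 144 of the 267 total squared error, which is why large errors dominate.
```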
Properties
Display Name
Specify a name for the node.
The default value is Recommend.
Operation
Select an operation type. Choose from:
- Train
- Score
- Evaluate
Analytic Model
Select an analytic model. You can only choose from Recommendation type models.
Rating Field
Select an input field to use for ranking. This must be a numeric field where users have rated items.
User Field
Select an input field that contains the user information.
Product Field
Select an input field that contains the product information.
Prediction Field
Enter a name for a prediction field which will be included in the output of the node.