1 Introduction
- They fail to capture users' personalized behavioral preferences, especially negative feedback signals. On a real-world e-commerce platform, before a user buys an item (the target behavior), they may exhibit multiple auxiliary behaviors, such as clicking, adding to cart, and collecting. These auxiliary behaviors have complex relationships with the target behavior. Moreover, the interactions between behaviors vary considerably across users and reflect each user's personalized tastes. For example, when shopping on an e-commerce platform, some users add preferred items to the cart before buying them, while others buy directly. We therefore need to carefully model these personalized preferences based on the interactions between behaviors. The negative signals associated with auxiliary behaviors (i.e., cases where the user does not perform an auxiliary behavior) are also informative but have largely been ignored by previous methods. An example of personalized behavior preferences and negative feedback is shown in Fig. 1.
- Explicit interactions between auxiliary and target behaviors are not fully explored. Since different users have different behavior preferences, we need to fully explore the explicit relationships among multiple behaviors. To learn a user's preferences, the explicit contribution of each auxiliary behavior to the target behavior should be examined. However, existing methods [14, 18, 21] typically build a separate user–item bipartite graph for each behavior, an approach that fails to jointly model the explicit contributions of the auxiliary behaviors. In [16] and [15], behavior chains are adopted to explore behavior dependencies, but this approach cannot handle complex sequences of user behavior. In contrast, explicit behavior interactions (i.e., statistical explicit semantic information) directly reflect the probability that a user will perform the target behavior after performing an auxiliary behavior. Explicit statistical behavior information is therefore of vital importance for user modeling and has potential research value for multi-behavior RSs.
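The statistical notion above, the chance that an auxiliary behavior on an item is followed by the target behavior, can be estimated directly from interaction logs. A minimal sketch follows; the function name, the data layout, and the use of `buy` as the target behavior are illustrative assumptions, not the paper's exact procedure:

```python
from collections import defaultdict

def explicit_interaction_stats(interactions):
    """Estimate, per user, the empirical probability that an item receiving
    an auxiliary behavior (e.g. click) also receives the target behavior
    (buy). `interactions` maps (user, item) -> set of behavior names.
    Illustrative sketch only."""
    aux_counts = defaultdict(lambda: defaultdict(int))  # auxiliary occurrences
    co_counts = defaultdict(lambda: defaultdict(int))   # co-occurrences with buy
    for (user, item), behaviors in interactions.items():
        for b in behaviors:
            if b == "buy":
                continue
            aux_counts[user][b] += 1
            if "buy" in behaviors:
                co_counts[user][b] += 1
    return {
        user: {b: co_counts[user][b] / n for b, n in per_user.items()}
        for user, per_user in aux_counts.items()
    }
```

For a user who clicked two items but bought only one of them, the estimate P(buy | click) would be 0.5, which is exactly the kind of per-user explicit semantic signal the section describes.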
- We propose a new model called MB-EBIH for multi-behavior recommendation, which can capture personalized user preferences with both positive and negative feedback from auxiliary behaviors. Moreover, it explicitly models the relations between the auxiliary and target behaviors and learns the explicit interactions between multiple behaviors.
- To the best of our knowledge, we are the first to model the explicit behavior interactions in multi-behavior RSs and to explore the relationships between different behaviors to perform better personalized user modeling.
- We conduct comprehensive experiments on four real-world datasets to evaluate the effectiveness of MB-EBIH and the generalization of the explicit behavior interactions. The results show that our model significantly improves recommendation performance over the baseline models, and demonstrate the effectiveness of capturing explicit behavior interactions in multi-behavior RSs.
2 Related Work
3 Preliminaries
3.1 Problem Formulation
3.2 Task Formulation
4 Method
4.1 Overview
4.2 Shared Embedding Module
4.3 Explicit Behavior Interaction Extraction Module
4.3.1 Heterogeneous Behavior Informative Graph
4.3.2 Details of the Explicit Behavior Interaction Extraction Module
4.3.3 Extracting Explicit Behavior Interaction Information
4.3.4 Inferring Explicit Behavior Interaction
4.4 Explicit Behavior Interaction Fusion Module
4.5 Complexity Analysis
4.5.1 Time Complexity
4.5.2 Space Complexity
4.6 Discussion
5 Experiments
5.1 Experimental Settings
5.1.1 Dataset
- Beibei: This dataset was collected from Beibei, the largest e-commerce platform for baby products in China. It contains 21,716 users and 7,977 items with three types of user–item behaviors: clicking, adding to cart (carting for short), and buying.
- Tmall: This dataset was collected from Tmall, one of the largest e-commerce platforms in China. It contains 41,738 users and 11,953 items with four types of behavior: clicking, carting, buying, and collecting.
- IJCAI15: This dataset was released for the IJCAI Contest 2015, which focused on predicting repeat buyers. To ensure that the training data were not too sparse, we filtered out users who bought fewer than 15 times and items that were bought fewer than 20 times, leaving 55,038 users and 28,728 items with the same four behaviors as in the Tmall dataset.
- QK-article [41]: This dataset was collected from Tencent's news article recommendation platform. Similar to IJCAI15, we filtered out users and items with fewer than 5 interactions, leaving 40,343 users and 19,218 items with four types of behavior: clicking, following, sharing, and liking.
Dataset | #Users | #Items | #Interactions | Behavior types
---|---|---|---|---
Beibei | 21,716 | 7,977 | \(3.3\times 10^{6}\) | \(\left\{ \textrm{Click, Cart, Buy} \right\}\)
Tmall | 41,738 | 11,953 | \(2.3\times 10^{6}\) | \(\left\{ \textrm{Click, Cart, Collect, Buy} \right\}\)
IJCAI15 | 55,038 | 28,728 | \(7.5\times 10^{6}\) | \(\left\{ \textrm{Click, Cart, Collect, Buy} \right\}\)
QK-article | 40,343 | 19,218 | \(2.4\times 10^{6}\) | \(\left\{ \textrm{Click, Share, Follow, Like} \right\}\)
- Recall@K quantifies the proportion of relevant items in the test set that appear in the top-K recommendation list, measuring the system's ability to retrieve relevant items. A higher Recall@K indicates better coverage of relevant items.
- NDCG@K evaluates the ranking quality of the recommended items by assigning higher scores to relevant items that appear higher in the top-K list. It accounts for both the relevance and the position of each item, rewarding lists that place relevant items near the top. A higher NDCG@K value indicates a better-ranked list of relevant items.
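Under binary relevance, the two metrics above can be computed as follows. This is a minimal sketch; the function names are mine, and edge cases such as an empty relevant set are ignored:

```python
import math

def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant items that appear in the top-k list."""
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant)

def ndcg_at_k(ranked, relevant, k):
    """Binary-relevance NDCG@k: the DCG of the top-k list divided by the
    ideal DCG obtained when all relevant items are ranked first."""
    dcg = sum(1.0 / math.log2(i + 2)
              for i, item in enumerate(ranked[:k]) if item in relevant)
    ideal = sum(1.0 / math.log2(i + 2)
                for i in range(min(len(relevant), k)))
    return dcg / ideal
```

For example, with `ranked = ["a", "b", "c", "d"]` and `relevant = {"a", "c"}`, Recall@2 is 0.5 because only one of the two relevant items appears in the top two positions.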
5.1.2 Baselines
- MF-BPR [3]: This method has demonstrated strong performance in the top-N recommendation task and is frequently employed as a benchmark for assessing the effectiveness of new models. The BPR approach has been extensively utilized as an optimization strategy and is based on the assumption that positive items should receive higher scores than negative items.
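The BPR assumption can be written as a pairwise loss over (user, positive item, negative item) triples. Below is a minimal sketch with dot-product scoring, as in matrix factorization; the function name and plain-list embeddings are illustrative:

```python
import math

def bpr_loss(user, pos_item, neg_item):
    """BPR pairwise loss for one training triple:
    -log sigmoid(score(u, pos) - score(u, neg)),
    where each score is the inner product of user and item embeddings."""
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    diff = dot(user, pos_item) - dot(user, neg_item)
    return -math.log(1.0 / (1.0 + math.exp(-diff)))
```

Minimizing this loss widens the score gap between the observed (positive) item and the sampled negative, which is exactly the ranking assumption stated above.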
- NGCF [7]: This is a state-of-the-art graph neural network model specially designed to combine graph neural networks with an RS.
- LightGCN [8]: This state-of-the-art GCN-based recommendation model leverages high-order neighbors in the user–item bipartite graph to deliver accurate recommendations.
- NMTR [17]: This is a state-of-the-art method that extends NCF to multi-behavior tasks via multitask learning. For each type of behavior, it constructs a data-dependent interaction function and links the model predictions for the behaviors in a cascading fashion.
- MBGCN [14]: This is a state-of-the-art multi-behavior recommendation model based on GCN. It effectively considers the varying contributions of multiple behaviors to the target behavior based on a unified graph. It learns the behavior contributions and leverages an item–item graph to capture the behavior semantics.
- MATN [18]: This model incorporates attention networks and memory units to distinguish and capture the relationships between users and items.
- MB-GMN [29]: This model utilizes a graph meta network to capture personalized signals from multiple behaviors and to effectively model the diverse dependencies between them.
- GNMR [40]: This GNN-based approach explores multi-behavior dependencies through recursive embedding propagation on a unified graph. It employs a relation aggregation network to effectively model the heterogeneity of interactions within the graph.
- CRGCN [15]: This model utilizes a cascading GCN structure to effectively model multi-behavior data. It employs a residual design to deliver the learned behavioral features from one behavior to the next.
- MB-CGCN [16]: This is a recently proposed model that adopts cascading GCN blocks to explicitly leverage multiple behaviors for embedding learning. In this model, a LightGCN learns the features of the previous behavior and transfers them to the subsequent behavior through a feature transformation operation. The embeddings obtained from all behaviors are then aggregated to create the final prediction.
5.1.3 Hyper-parameter Settings
5.2 Overall Performance
Dataset | Metric | MF-BPR | NGCF | LightGCN | NMTR | MBGCN | MATN | MB-GMN | GNMR | CRGCN | MB-CGCN | MB-EBIH
---|---|---|---|---|---|---|---|---|---|---|---|---
Beibei | Recall@10 | 0.0868 | 0.0901 | 0.0925 | 0.0937 | 0.1249 | 0.0942 | 0.1040 | 0.1105 | 0.1215 | 0.1539 | 0.1911
 | NDCG@10 | 0.0414 | 0.0521 | 0.0538 | 0.0571 | 0.0745 | 0.0579 | 0.0586 | 0.0635 | 0.0986 | 0.1095 | 0.1235
 | Recall@20 | 0.1244 | 0.1463 | 0.1517 | 0.1559 | 0.1925 | 0.1604 | 0.1608 | 0.1635 | 0.2173 | 0.2362 | 0.2807
 | NDCG@20 | 0.0641 | 0.0652 | 0.0698 | 0.0796 | 0.0937 | 0.0790 | 0.0740 | 0.0794 | 0.1330 | 0.1547 | 0.1674
 | Recall@40 | 0.1965 | 0.2311 | 0.2328 | 0.2644 | 0.2938 | 0.2610 | 0.2863 | 0.2872 | 0.2778 | 0.3388 | 0.3694
 | NDCG@40 | 0.0817 | 0.0853 | 0.0887 | 0.0995 | 0.1157 | 0.1071 | 0.1078 | 0.1096 | 0.1585 | 0.1910 | 0.1991
Tmall | Recall@10 | 0.0263 | 0.0391 | 0.0417 | 0.0470 | 0.0692 | 0.0562 | 0.0893 | 0.0601 | 0.0901 | 0.0925 | 0.0985
 | NDCG@10 | 0.0151 | 0.0225 | 0.0291 | 0.0293 | 0.0424 | 0.0367 | 0.0462 | 0.0388 | 0.0568 | 0.0594 | 0.0644
 | Recall@20 | 0.0355 | 0.0491 | 0.0553 | 0.0672 | 0.0947 | 0.0790 | 0.1072 | 0.0838 | 0.1136 | 0.1231 | 0.1280
 | NDCG@20 | 0.0176 | 0.0356 | 0.0333 | 0.0360 | 0.0493 | 0.0408 | 0.0592 | 0.0452 | 0.0631 | 0.0696 | 0.0732
 | Recall@40 | 0.0481 | 0.0716 | 0.0719 | 0.0982 | 0.1275 | 0.1042 | 0.1345 | 0.1190 | 0.1543 | 0.1594 | 0.1664
 | NDCG@40 | 0.0200 | 0.0350 | 0.0361 | 0.0412 | 0.0558 | 0.0517 | 0.0674 | 0.0529 | 0.0749 | 0.0778 | 0.0803
IJCAI15 | Recall@10 | 0.0217 | 0.0281 | 0.0286 | 0.0314 | 0.0468 | 0.0350 | 0.0558 | 0.0417 | 0.0562 | 0.0701 | 0.0718
 | NDCG@10 | 0.0116 | 0.0153 | 0.0180 | 0.0211 | 0.0304 | 0.0217 | 0.0376 | 0.0272 | 0.0351 | 0.0428 | 0.0500
 | Recall@20 | 0.0311 | 0.0351 | 0.0387 | 0.0461 | 0.0661 | 0.0412 | 0.0750 | 0.0600 | 0.0884 | 0.0891 | 0.0939
 | NDCG@20 | 0.0191 | 0.0198 | 0.0211 | 0.0263 | 0.0358 | 0.0242 | 0.0432 | 0.0393 | 0.0412 | 0.0504 | 0.0563
 | Recall@40 | 0.0489 | 0.0501 | 0.0519 | 0.0707 | 0.0906 | 0.0714 | 0.0960 | 0.0779 | 0.0924 | 0.1135 | 0.1221
 | NDCG@40 | 0.0225 | 0.0231 | 0.0240 | 0.0309 | 0.0415 | 0.0316 | 0.0518 | 0.0327 | 0.0510 | 0.0598 | 0.0629
QK-article | Recall@10 | 0.0615 | 0.0641 | 0.0692 | 0.0704 | 0.0711 | 0.0708 | 0.0946 | 0.0860 | 0.1173 | 0.1359 | 0.1509
 | NDCG@10 | 0.0333 | 0.0349 | 0.0357 | 0.0366 | 0.0384 | 0.0380 | 0.0495 | 0.0488 | 0.0691 | 0.0796 | 0.0892
 | Recall@20 | 0.1021 | 0.1063 | 0.1064 | 0.1107 | 0.1178 | 0.1122 | 0.1402 | 0.1366 | 0.1767 | 0.2005 | 0.2205
 | NDCG@20 | 0.0445 | 0.0465 | 0.0493 | 0.0502 | 0.0510 | 0.0506 | 0.0622 | 0.0581 | 0.0748 | 0.0934 | 0.1084
 | Recall@40 | 0.1673 | 0.1745 | 0.1780 | 0.1821 | 0.1844 | 0.1838 | 0.2202 | 0.2084 | 0.2534 | 0.2846 | 0.3096
 | NDCG@40 | 0.0589 | 0.0613 | 0.0625 | 0.0650 | 0.0662 | 0.0658 | 0.0834 | 0.0790 | 0.0907 | 0.1121 | 0.1282
- Comparison of model performance. Table 2 shows that MB-EBIH outperforms all the baseline models in terms of both Recall@K and NDCG@K (\(K=\left\{ 10, 20, 40 \right\}\)). Compared to the single-behavior models, we introduce multiple behaviors to reflect user preferences more comprehensively. Compared with the NN-based models, we employ a GCN to obtain higher-order neighbor information. Unlike existing GCN-based models, our model explicitly exploits interactions between behaviors through a heterogeneous behavior informative graph. By introducing explicit behavior interaction information, it can more accurately capture users' personalized preferences under different behaviors.
- Importance of GNN in the RS. Examining the performance of the single-behavior models in Table 2, we see that the two GCN-based models, NGCF and LightGCN, outperform traditional MF, showing that the ability of a GCN to explore higher-order neighbor information enables the model to learn more effective embeddings of users and items.
- Importance of multiple behaviors in the RS. The results in Table 2 show that the single-behavior models MF-BPR, LightGCN and NGCF perform worse than the multi-behavior models, which demonstrates the necessity of considering multi-behavior data in the RS. By exploring the dependencies between multiple behaviors, multi-behavior recommendation models can capture user preferences from multiple perspectives.
5.3 Modeling User Personalized Behavioral Preferences
5.4 Effect of the Graph Attention Mechanism
Method | Beibei Recall | Beibei NDCG | Tmall Recall | Tmall NDCG | IJCAI15 Recall | IJCAI15 NDCG | QK-article Recall | QK-article NDCG
---|---|---|---|---|---|---|---|---
GraphSage | 0.1458 | 0.0867 | 0.0722 | 0.0419 | 0.0619 | 0.0415 | 0.1329 | 0.0765
k-GNNs | 0.1838 | 0.1218 | 0.0782 | 0.0469 | 0.0662 | 0.0440 | 0.1292 | 0.0743
LEConv | 0.1866 | 0.1221 | 0.0796 | 0.0482 | 0.0643 | 0.0419 | 0.1387 | 0.0805
GAT | 0.1911 | 0.1235 | 0.0985 | 0.0644 | 0.0718 | 0.0500 | 0.1509 | 0.0892
5.5 Ablation Study
5.5.1 Effects of Negative Feedback Signals
5.5.2 Effects of Different Explicit Behavior Interactions
Variants | Beibei Recall | Beibei NDCG | Tmall Recall | Tmall NDCG | IJCAI15 Recall | IJCAI15 NDCG | QK-article Recall | QK-article NDCG
---|---|---|---|---|---|---|---|---
Base model | 0.1911 | 0.1235 | 0.0985 | 0.0644 | 0.0718 | 0.0500 | 0.1509 | 0.0892
w/o. click | 0.1899 | 0.1218 | 0.0703 | 0.0432 | 0.0479 | 0.0310 | 0.0757 | 0.0412
w/o. cart(follow) | 0.1229 | 0.0706 | 0.0951 | 0.0593 | 0.0706 | 0.0485 | 0.1447 | 0.0843
w/o. collect(share) | / | / | 0.0899 | 0.0550 | 0.0712 | 0.0492 | 0.1488 | 0.0862
w/o. cart,click | 0.1170 | 0.0672 | 0.0674 | 0.0419 | 0.0503 | 0.0329 | / | /
w/o. cart,collect | / | / | 0.0880 | 0.0537 | 0.0596 | 0.0386 | / | /
w/o. collect,click | / | / | 0.0677 | 0.0411 | 0.0467 | 0.0295 | / | /
w/o. follow,click | / | / | / | / | / | / | 0.0747 | 0.0406
w/o. follow,share | / | / | / | / | / | / | 0.1436 | 0.0826
w/o. share,click | / | / | / | / | / | / | 0.0707 | 0.0378
w/o. collect,click,cart | / | / | 0.0631 | 0.0396 | 0.0413 | 0.0257 | / | /
w/o. share,click,follow | / | / | / | / | / | / | 0.0699 | 0.0376
Behavior orders | Beibei Count | Beibei Percentage (%) | QK-article Count | QK-article Percentage (%)
---|---|---|---|---
click, cart, buy | 289,080 | 98.71 | / | /
0, 0, buy | 3,411 | 1.16 | / | /
click, 0, buy | 369 | 0.13 | / | /
click, 0, 0, like | / | / | 533,667 | 90.65
click, 0, share, like | / | / | 46,678 | 7.93
click, follow, 0, like | / | / | 6,219 | 1.06
click, follow, share, like | / | / | 2,117 | 0.36
Behavior orders | Tmall Count | Tmall Percentage (%) | IJCAI15 Count | IJCAI15 Percentage (%)
---|---|---|---|---
click, 0, 0, buy | 211,445 | 73.63 | 481,876 | 73.96
0, 0, 0, buy | 46,438 | 16.17 | 102,334 | 15.71
click, 0, collect, buy | 24,092 | 8.39 | 55,799 | 8.56
0, 0, collect, buy | 4,838 | 1.68 | 10,631 | 1.63
click, cart, 0, buy | 291 | 0.101 | 787 | 0.121
click, cart, collect, buy | 29 | 0.01 | 75 | 0.012
0, cart, 0, buy | 23 | 0.008 | 53 | 0.0081
0, cart, collect, buy | 2 | 0.0007 | 6 | 0.0009