

The post-COVID era has seen an increase in the retail sales recorded by e-commerce companies, with their share of total sales rising from 5.6% to about 11%, largely due to the use of predictive analysis techniques. Leading retail businesses such as Amazon have employed these techniques to increase both customer satisfaction and sales.

Machine learning algorithms are used to build product recommendation engines that suggest products to customers based on their previous behavior, browsing history, and purchase history. In the case of Amazon, this helped increase sales by about 35%. Another use case of predictive analytics in retail is determining and suggesting retail prices for goods and services based on customers’ previous history and activities.

This work introduces a machine learning-based tool that uses customer behavior (predictive analysis) in the e-commerce industry to predict which products will be attractive to customers (product recommendation), and that can also be used to drive tailored advertising of products to customers. The tool provides a simple-to-use artificial intelligence service that e-commerce companies can use to achieve effective product recommendations.
We demonstrated the effectiveness of the developed solution through our retail shopping solutions. Our work emphasizes the contribution of AI in predictive analytics applications to retail businesses.

We have proposed an algorithm that relies on information such as user-profile details, visited pages, and click data to determine a user’s interest. In addition to this information, we have also used click patterns of users for purchased products to improve the accuracy of prediction.

Literature Review

This section briefly presents the state-of-the-art in the recommendation system paradigm, especially with predictive analysis.
The literature identifies the user behavior used in predictive analysis primarily as click information, user ratings of purchased items, comments left by users, sharing activity, cart information, and similar signals. In addition, the methods described in [1-2] have been used extensively in developing product recommendation systems.
Recommendation systems based on predictive analysis have relied on changes in user behavior because some users do not leave explicit information, such as ratings or comments, for the products they interacted with. The authors of [3] extracted ratings from a user's purchase history for a specific product. The acquired data relate to products that a single user purchased several times. Using the user-item matrix collaborative filtering technique, they produced a system that recommends the items users were actually interested in.
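To make the user-item matrix idea concrete, here is a minimal sketch of item-based collaborative filtering in which repeat purchases stand in for explicit ratings. It is an illustration under stated assumptions, not the method of [3]: the purchase log, the function names, and the choice of cosine similarity are all hypothetical.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical purchase log: one (user_id, product_id) row per purchase.
purchases = [
    ("u1", "tea"), ("u1", "tea"), ("u1", "mug"),
    ("u2", "tea"), ("u2", "mug"), ("u2", "kettle"),
    ("u3", "kettle"), ("u3", "mug"),
]

def build_user_item_matrix(rows):
    """Count purchases per (user, product); the count acts as an implicit rating."""
    matrix = defaultdict(lambda: defaultdict(int))
    for user, item in rows:
        matrix[user][item] += 1
    return matrix

def item_vectors(matrix):
    """Transpose the user-item matrix into item -> {user: count}."""
    items = defaultdict(dict)
    for user, counts in matrix.items():
        for item, count in counts.items():
            items[item][user] = count
    return items

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def recommend(user, matrix, top_n=2):
    """Rank products the user has not bought by similarity to what they did buy."""
    vectors = item_vectors(matrix)
    owned = dict(matrix[user])
    scores = {}
    for candidate, candidate_vec in vectors.items():
        if candidate in owned:
            continue
        scores[candidate] = sum(
            count * cosine(vectors[item], candidate_vec)
            for item, count in owned.items()
        )
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

matrix = build_user_item_matrix(purchases)
print(recommend("u3", matrix))  # the only product u3 has not bought is "tea"
```

Weighting each similarity by the purchase count means that products a user buys repeatedly influence the recommendation more than one-off purchases.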


Data Collection
Data collection is an essential part of our methodology. To find the products nearest to user preferences, our first data collection step was to establish the types of data and information required. We then used a publicly available dataset collected from the internet, consisting of user profile information and click history. It contains 294,864 records for 51,386 unique users, and its main feature is the user click history.

Data mining techniques were applied to clean the dataset and make it usable for the proposed model. The processing included data cleaning, extraction of related information, and feature discovery. From this process, we derived the variables relevant to developing the model for our use case: user IP, user click information, access date, access time, access page, product name, product type, and product ID. In addition, we removed all duplicate records from the dataset and deleted all records that do not have a specific purchased product.
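The two filtering rules described above (drop duplicate records, drop records without a purchased product) can be sketched as follows. This is only an illustration: the field names and sample records are assumptions, since the actual dataset schema is not given here.

```python
# Hypothetical raw click records; the field names are assumptions.
raw_records = [
    {"user_ip": "10.0.0.1", "access_date": "2021-03-01",
     "access_page": "/p/42", "product_id": "42"},
    {"user_ip": "10.0.0.1", "access_date": "2021-03-01",
     "access_page": "/p/42", "product_id": "42"},   # exact duplicate
    {"user_ip": "10.0.0.2", "access_date": "2021-03-02",
     "access_page": "/home", "product_id": None},   # no purchased product
    {"user_ip": "10.0.0.3", "access_date": "2021-03-02",
     "access_page": "/p/7", "product_id": "7"},
]

def clean(records):
    """Keep the first copy of each record that names a purchased product."""
    seen = set()
    cleaned = []
    for record in records:
        # Rule 1: discard records without a specific purchased product.
        if not record.get("product_id"):
            continue
        # Rule 2: discard exact duplicates (same value in every field).
        key = tuple(sorted(record.items()))
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(record)
    return cleaned

cleaned = clean(raw_records)
print(len(raw_records), "->", len(cleaned))
```

Here a duplicate means a record that matches another in every field. A real pipeline might instead deduplicate on a subset of fields, such as user IP, product ID, and access time.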

Data Visualization
To properly understand the dataset, we used data visualization to analyze online product shopping against different criteria, including time series analysis, user interest-based shopping analysis, and user access page analysis.
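As a rough sketch of the aggregation behind the daily and monthly time series views, the snippet below counts records per day and per month. The sample dates and function names are hypothetical, and the plotting step itself is omitted.

```python
from collections import Counter
from datetime import date

# Hypothetical access dates of purchase records, in ISO format.
access_dates = ["2021-03-01", "2021-03-01", "2021-03-15", "2021-04-02"]

def count_by_day(dates):
    """Number of records per calendar day."""
    return Counter(date.fromisoformat(d) for d in dates)

def count_by_month(dates):
    """Number of records per month, keyed as 'YYYY-MM'."""
    return Counter(date.fromisoformat(d).strftime("%Y-%m") for d in dates)

daily = count_by_day(access_dates)
monthly = count_by_month(access_dates)

for month in sorted(monthly):
    print(month, monthly[month])
```

The resulting counters can be passed to any plotting library to produce daily and monthly charts.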


Figure 1: time series analysis - daily online product shopping

Figure 2: time series analysis - monthly online product shopping

Figure 3: user interest analysis

Figure 4: access page analysis

Proposed Model

Our AI model employs deep neural networks to implement a collaborative filtering-based algorithm that carries out the predictive analysis needed for product recommendation. The proposed method combines deep neural networks (DNN), a machine learning architecture, with the word2vec mechanism to explore the click-based features of users. Alternatives to a DNN as the classifier include XGBoost, Random Forest, and Support Vector Machine (SVM).


Figure 5: Proposed architecture

Specifically, we employed the AutoRec architecture [4], which combines a denoising autoencoder network with a collaborative filtering model. It can be used to learn lower-dimensional feature representations at the bottleneck layer, or to fill the blanks of the interaction matrix directly in the reconstruction layer. It supports both item-based and user-based variants. It takes in the interaction vector of each item or user, maps it to the hidden layer, and then attempts to reconstruct that vector in the output layer, thereby predicting the ranking of products recommended to the user.

Figure 6: Autoencoder architecture
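The autoencoder idea can be illustrated with a small, dependency-free sketch: one hidden (bottleneck) layer, a linear reconstruction layer, and a squared-error loss computed only over observed entries. This is an assumption-laden toy, not our production model. The class name `TinyAutoRec`, the sample data, and the hyperparameters are hypothetical, and both regularization and input noise are omitted for brevity.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class TinyAutoRec:
    """One-hidden-layer autoencoder trained only on observed entries (None = missing)."""

    def __init__(self, n_inputs, n_hidden, seed=0):
        rng = random.Random(seed)
        self.V = [[rng.uniform(-0.1, 0.1) for _ in range(n_inputs)] for _ in range(n_hidden)]
        self.mu = [0.0] * n_hidden
        self.W = [[rng.uniform(-0.1, 0.1) for _ in range(n_hidden)] for _ in range(n_inputs)]
        self.b = [0.0] * n_inputs

    def encode(self, r):
        # Bottleneck layer: missing inputs are skipped, i.e. treated as zero.
        return [sigmoid(sum(v * x for v, x in zip(row, r) if x is not None) + m)
                for row, m in zip(self.V, self.mu)]

    def decode(self, h):
        # Reconstruction layer: linear output, one score per item.
        return [sum(w * x for w, x in zip(row, h)) + b for row, b in zip(self.W, self.b)]

    def predict(self, r):
        return self.decode(self.encode(r))

    def train_step(self, r, lr=0.05):
        """One SGD step on 0.5 * squared error over the observed entries of r."""
        h = self.encode(r)
        out = self.decode(h)
        err = [0.0 if x is None else o - x for o, x in zip(out, r)]
        # Backpropagate through the linear output and the sigmoid hidden layer.
        grad_h = [sum(e * self.W[i][k] for i, e in enumerate(err)) * h[k] * (1.0 - h[k])
                  for k in range(len(h))]
        for i, e in enumerate(err):
            for k in range(len(h)):
                self.W[i][k] -= lr * e * h[k]
            self.b[i] -= lr * e
        for k, g in enumerate(grad_h):
            for j, x in enumerate(r):
                if x is not None:
                    self.V[k][j] -= lr * g * x
            self.mu[k] -= lr * g

def total_loss(model, data):
    """Sum of squared reconstruction errors over observed entries."""
    return sum((o - x) ** 2 for r in data
               for o, x in zip(model.predict(r), r) if x is not None)

def rank_unseen(model, r):
    """Rank the items missing from r by their reconstructed score."""
    scores = model.predict(r)
    unseen = [j for j, x in enumerate(r) if x is None]
    return sorted(unseen, key=lambda j: scores[j], reverse=True)

# Hypothetical user rows: normalized interaction scores per item, None = unknown.
data = [
    [1.0, 0.8, None, 0.2],
    [0.8, None, 0.2, 0.2],
    [0.2, 0.2, 1.0, None],
    [None, 0.2, 0.8, 1.0],
]

model = TinyAutoRec(n_inputs=4, n_hidden=3)
initial_loss = total_loss(model, data)
for _ in range(300):
    for row in data:
        model.train_step(row)
final_loss = total_loss(model, data)
print(initial_loss, final_loss)
print(rank_unseen(model, data[0]))
```

Each row here is one user's vector over items, which corresponds to the user-based variant. Feeding item columns instead gives the item-based variant. Masking the loss to observed entries is what lets the reconstruction layer fill in the blanks rather than simply learning to output zeros.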

The word2vec technique uses a two-layer neural network to represent items as vectors in a vector space. We used this technique to find the nearest-neighbor products based on users' purchase records and click sequences.
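The nearest-neighbor lookup can be sketched without training a neural network. The example below is a simplified stand-in for word2vec, not word2vec itself: it represents each product by its co-occurrence counts within a sliding window over click sequences and compares products by cosine similarity. The session data, window size, and function names are hypothetical.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical click sequences: ordered product ids per session.
sessions = [
    ["phone", "case", "charger"],
    ["phone", "charger", "earbuds"],
    ["laptop", "mouse", "laptop_bag"],
    ["laptop", "laptop_bag"],
]

def cooccurrence_vectors(sequences, window=2):
    """Map each product to counts of the products clicked within `window` steps of it."""
    vectors = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        for i, item in enumerate(seq):
            lo, hi = max(0, i - window), min(len(seq), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[item][seq[j]] += 1
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def nearest_products(item, vectors, top_n=2):
    """Return the products whose click contexts are most similar to `item`'s."""
    target = dict(vectors[item])
    scores = {other: cosine(target, vec)
              for other, vec in vectors.items() if other != item}
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

vectors = cooccurrence_vectors(sessions)
print(nearest_products("laptop", vectors))
```

As with word2vec embeddings, two products come out as neighbors when they appear in similar click contexts, even if they are rarely clicked directly one after the other. A trained word2vec model would replace the sparse count vectors with dense learned ones.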


The results of the experiments are evaluated with several metrics, namely Mean Absolute Error, Mean Squared Error, and Root Mean Squared Error. The model is also evaluated extensively on real data from a variety of industries, including goods and services, travel, automotive, and retail.
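For reference, the three metrics can be computed as in the sketch below. The sample true and predicted values are made up for illustration and are not results from our experiments.

```python
from math import sqrt

def _check(y_true, y_pred):
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")

def mean_absolute_error(y_true, y_pred):
    """MAE: average absolute difference between true and predicted values."""
    _check(y_true, y_pred)
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mean_squared_error(y_true, y_pred):
    """MSE: average squared difference; penalizes large errors more heavily."""
    _check(y_true, y_pred)
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def root_mean_squared_error(y_true, y_pred):
    """RMSE: square root of MSE, in the same units as the ratings."""
    return sqrt(mean_squared_error(y_true, y_pred))

# Made-up example values.
y_true = [3.0, 5.0, 2.0, 4.0]
y_pred = [2.5, 5.0, 3.0, 4.5]

print("MAE :", mean_absolute_error(y_true, y_pred))
print("MSE :", mean_squared_error(y_true, y_pred))
print("RMSE:", root_mean_squared_error(y_true, y_pred))
```

MAE weights all errors equally, while MSE and RMSE penalize large errors more heavily, so reporting them together shows whether a model's error is dominated by a few bad predictions.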

The output of the model is a ranking of the products that are closest to the user's purchased items and click information. We compared the predicted values with the true values to determine the effectiveness of the model. The close agreement between the two shows that the model is effective at recommending products from user behavior.
It is important to note that this model is not restricted to recommending products; it can be applied to any task that is based on user behavior.


[1] Zhang, Z.P., Kudo, Y., Murai, T., Ren, Y.G. (2019). Enhancing recommendation accuracy of item-based collaborative filtering via item-variance weighting. Appl. Sci., vol. 9, p. 1928.
[2] Lu, J., Wu, D., Mao, M., Wang, W., Zhang, G. (2015). Recommender system application developments: A survey. Decis. Support Syst., vol. 74, pp. 12-32.
[3] Zheng, L., Lu, C.T., et al. (2019). MARS: Memory attention-aware recommender system. In Proceedings of the 2019 IEEE International Conference on Data Science and Advanced Analytics (DSAA), Washington, DC, USA, pp. 11-20.
[4] Sedhain, S., Menon, A.K., Sanner, S., Xie, L. (2015). AutoRec: Autoencoders meet collaborative filtering. In WWW, pp. 111-112.