{"id":178,"date":"2021-02-08T13:36:37","date_gmt":"2021-02-08T13:36:37","guid":{"rendered":"http:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/?p=178"},"modified":"2021-04-30T12:06:30","modified_gmt":"2021-04-30T12:06:30","slug":"contextual-bandit-problem-starting-from-an-example","status":"publish","type":"post","link":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/2021\/02\/08\/contextual-bandit-problem-starting-from-an-example\/","title":{"rendered":"Why does Amazon always guess our preference? – explaining contextual bandit problem without mathematics"},"content":{"rendered":"\n

This blog gives you an idea of the rationale behind recommendation systems. How does the contextual bandit problem work in such a system? I hope this blog gives you an answer.<\/span><\/p>\n\n\n\n

Last semester, we were given a list of topics to discuss as a team. The fourth topic was the bandit problem!<\/p>\n\n\n\n

\n
\n
\"\"<\/figure>\n<\/div>\n\n\n\n
\n

This is a two-armed bandit machine. Each time, you have to choose one arm to pull in order to earn money. How would you do that? Which arm would you pull? You would probably try each arm several times and learn from the results. Then you would have some rules to guide which arm you pull next.<\/p>\n\n\n\n

This is the bandit problem, which is about how to make good decisions. With a two-armed bandit machine, the decision is which arm to pull to earn more money. In a recommendation system, the decision is which news items, products or videos to show in order to achieve a higher click-through rate!<\/p>\n<\/div>\n<\/div>\n\n\n\n

Amazon’s secret – recommendation system<\/span><\/h1>\n\n\n\n
\n\n\n\n
\n
\n
\"\"<\/figure>\n<\/div>\n\n\n\n
\n

When you open Amazon, you may notice that it automatically recommends products for you. And when you use TikTok, it probably recommends the videos that attract you most. That is a recommendation system.<\/p>\n<\/div>\n<\/div>\n\n\n\n

Judging by Amazon\u2019s success, the recommendation system works. The company reported a 29% sales increase to $12.83 billion during its second fiscal quarter, up from $9.9 billion during the same time last year. A lot of that growth arguably has to do with the way Amazon has integrated recommendations into nearly every part of the purchasing process.<\/p><\/blockquote>\n\n\n\n

Amazon benefits from its recommendation system by recommending personalised products to different customers. You may have noticed that as soon as you open Amazon, it shows recommendations that you are actually interested in. Similarly, Yahoo! recommends news you are interested in, and TikTok always seems to know your taste in videos. Although these companies may use different algorithms, such personalised recommendations can be produced by contextual bandit algorithms. A good recommendation system can seem to know you better than you know yourself!<\/strong> Now, let’s look at what the contextual bandit problem is through an example.<\/p>\n\n\n\n

Looking at the contextual bandit problem through an example<\/span><\/h1>\n\n\n\n
\n\n\n\n
\n\n\n\n

Suppose we have a website called ‘click me’ that posts interesting news, and we make a profit from clicks on web advertising. A number of companies have asked us to put their advertisements on our website. To maximise our profit, we want to personalise these advertisements so that viewers are more likely to click on them. In other words, we want to show specific advertisements to specific viewers. But how? This is the contextual bandit problem.<\/p>\n\n\n\n

Collecting the contextual information<\/h2>\n\n\n\n

If we want to guess a person’s preferences, we first need to know more about that person. Similarly, as a company, we want to know more about our viewers. This information is called the context in the bandit problem. The context may contain:<\/p>\n\n\n\n