{"id":178,"date":"2021-02-08T13:36:37","date_gmt":"2021-02-08T13:36:37","guid":{"rendered":"http:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/?p=178"},"modified":"2021-04-30T12:06:30","modified_gmt":"2021-04-30T12:06:30","slug":"contextual-bandit-problem-starting-from-an-example","status":"publish","type":"post","link":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/2021\/02\/08\/contextual-bandit-problem-starting-from-an-example\/","title":{"rendered":"Why does Amazon always guess our preferences? &#8211; explaining the contextual bandit problem without mathematics"},"content":{"rendered":"\n<p class=\"wp-block-paragraph\"><span class=\"has-inline-color has-secondary-color\">This blog will give you an idea of the rationale behind recommendation systems. How does the contextual bandit problem work in such a system? I hope this blog will give you an answer.<\/span><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Last semester, we were given a list of topics to discuss as a team. The fourth topic was the bandit problem! 
<\/p>\n\n\n\n<div class=\"wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:33.33%\">\n<figure class=\"wp-block-image size-large is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/wp-content\/uploads\/sites\/18\/2021\/04\/1_pcEsW85jbSIzsEONxn1XRQ.jpeg\" alt=\"\" class=\"wp-image-286\" width=\"251\" height=\"291\" srcset=\"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/wp-content\/uploads\/sites\/18\/2021\/04\/1_pcEsW85jbSIzsEONxn1XRQ.jpeg 345w, https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/wp-content\/uploads\/sites\/18\/2021\/04\/1_pcEsW85jbSIzsEONxn1XRQ-259x300.jpeg 259w\" sizes=\"auto, (max-width: 251px) 100vw, 251px\" \/><\/figure>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:66.66%\">\n<p class=\"wp-block-paragraph\">This is a two-armed bandit machine. Each time, you have to choose one arm to pull to earn money. How will you do that? Which arm will you choose to pull? You will probably try several times and summarise your experience. Then you may have some rules to guide you in choosing which arm to pull.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">This is the bandit problem, which is about how to make good decisions. With a two-armed bandit machine, the decision is which arm to pull to earn more money. 
When it comes to a recommendation system, the decision is which news, products or videos to show in order to earn a higher click-through rate!<\/p>\n<\/div>\n<\/div>\n\n\n\n<h1 class=\"wp-block-heading\"><span style=\"color:#00983f\" class=\"has-inline-color\">Amazon&#8217;s secret &#8211; the recommendation system<\/span><\/h1>\n\n\n\n<hr class=\"wp-block-separator has-text-color has-background is-style-wide\" style=\"background-color:#249009;color:#249009\" \/>\n\n\n\n<div class=\"wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\">\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"350\" src=\"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/wp-content\/uploads\/sites\/18\/2021\/02\/recommendation-1-1024x350.png\" alt=\"\" class=\"wp-image-180\" srcset=\"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/wp-content\/uploads\/sites\/18\/2021\/02\/recommendation-1-1024x350.png 1024w, https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/wp-content\/uploads\/sites\/18\/2021\/02\/recommendation-1-300x103.png 300w, https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/wp-content\/uploads\/sites\/18\/2021\/02\/recommendation-1-768x263.png 768w, https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/wp-content\/uploads\/sites\/18\/2021\/02\/recommendation-1-1536x525.png 1536w, https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/wp-content\/uploads\/sites\/18\/2021\/02\/recommendation-1.png 1558w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\">\n<p class=\"wp-block-paragraph\">When you open Amazon, you may notice that it automatically recommends products for you. 
And when you use TikTok, it will probably recommend the videos that attract you most. That is a recommendation system.<\/p>\n<\/div>\n<\/div>\n\n\n\n<blockquote class=\"wp-block-quote is-style-default is-layout-flow wp-block-quote-is-layout-flow\"><p>Judging by Amazon\u2019s success, the recommendation system works. The company reported a 29% sales increase to $12.83 billion during its second fiscal quarter, up from $9.9 billion during the same time last year. A lot of that growth arguably has to do with the way Amazon has integrated recommendations into nearly every part of the purchasing process.<\/p><\/blockquote>\n\n\n\n<p class=\"wp-block-paragraph\">Amazon benefits from its recommendation system by recommending personalised products to different customers. You may have noticed that once you open Amazon, it shows recommendations that you are actually interested in. Similarly, you may notice that Yahoo! recommends news you are interested in, and TikTok always knows your taste in videos. Although each company may use a different algorithm, such personalised recommendations could be produced by contextual bandit algorithms.<strong> A good recommendation system will always know you better than you know yourself!!<\/strong> Now, let&#8217;s look at what the contextual bandit problem is through an example.<\/p>\n\n\n\n<h1 class=\"wp-block-heading\"><span style=\"color:#019925\" class=\"has-inline-color\">Looking at the contextual bandit problem through an example<\/span><\/h1>\n\n\n\n<hr class=\"wp-block-separator has-text-color has-background is-style-wide\" style=\"background-color:#249009;color:#249009\" \/>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<p class=\"wp-block-paragraph\">Suppose we have a website called &#8216;click me&#8217; that posts interesting news, and we make a profit from the click-through rate on web advertising. A list of companies has asked us to put their advertisements on our website. 
To maximise our profit, we want to personalise these advertisements so that our viewers are attracted to click on them. In other words, we want to show specific advertisements to specific viewers. But how? This is the contextual bandit problem.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Collecting the contextual information<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">If we want to guess a person&#8217;s preferences, we first want to know more about this person. Similarly, our company wants to know more about our viewers. This information is called the context in the bandit problem. The context may contain:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>Personal information: gender, region, age, etc.<\/li><li>Recent browsing and click-through records: even including how many seconds you spend viewing an advertisement<\/li><li>Preferred categories of news: for example, a viewer may like news about Justin Bieber, or they may focus on sales information.<\/li><li>etc&#8230;<\/li><\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">Trying and learning how to guess<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Okey-dokey. Now we have lots of information about our viewers. What&#8217;s the next step? If you want to guess a person&#8217;s favourite movie, you might show them some movies and observe their reactions. For example, if you show them &#8216;Titanic&#8217; and they say they really love it, they probably like romantic movies, so you will show them more romantic movies. If you show them &#8216;The Lion King&#8217; and they say they do not like it, you will show them fewer cartoon movies. (Just an example, I love The Lion King!!!!) <\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/i.imgflip.com\/15at2x.jpg\" alt=\"Uh... is it Alien? 
- Imgflip\" width=\"198\" height=\"173\" \/><\/figure><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">Similarly, we have a list of advertisements from a list of companies. Which advertisement should we choose to show to a certain type of viewer?<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"170\" src=\"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/wp-content\/uploads\/sites\/18\/2021\/02\/contextualbanditdiag-1024x170.png\" alt=\"\" class=\"wp-image-181\" srcset=\"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/wp-content\/uploads\/sites\/18\/2021\/02\/contextualbanditdiag-1024x170.png 1024w, https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/wp-content\/uploads\/sites\/18\/2021\/02\/contextualbanditdiag-300x50.png 300w, https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/wp-content\/uploads\/sites\/18\/2021\/02\/contextualbanditdiag-768x127.png 768w, https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/wp-content\/uploads\/sites\/18\/2021\/02\/contextualbanditdiag-1536x255.png 1536w, https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/wp-content\/uploads\/sites\/18\/2021\/02\/contextualbanditdiag-1600x265.png 1600w, https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/wp-content\/uploads\/sites\/18\/2021\/02\/contextualbanditdiag.png 1762w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">Each time, our system will show them a type of advertisement (that is, choose an action) and watch their reaction. If it guesses correctly, the machine gains a &#8216;reward&#8217; (that is, the viewer clicks the advertisement), and such rewards turn into experience about this type of viewer. If it guesses incorrectly, the machine incurs &#8216;regret&#8217; for not guessing the viewer&#8217;s preference, and it tries again and again. 
After a long time, our machine can guess viewers&#8217; preferences much more accurately! <\/p>\n\n\n\n<p class=\"wp-block-paragraph\">For example, consider viewers under 6 years old. When the machine shows an ad about toys and a child clicks that ad, the machine learns that children are more likely to click ads about toys. Next time, our machine is more likely to show such viewers an advertisement about toys on our website.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">After a while and with a huge amount of data, this engine has accumulated enough information about viewers&#8217; preferences and has a high probability of guessing them correctly &#8211; just like the process of learning (gain experience from success and try again after failure!)<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Now our company runs very well and can show the right advertisements to the right viewers! With a high click-through rate, we make lots of profit!!<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/pyxis.nymag.com\/v1\/imgs\/8f8\/e12\/51b54d13d65d8ee3773ce32da03e1fa220-dogecoin.rsquare.w1200.jpg\" alt=\"Why Dogecoin Is Forcing People to Take It Seriously\" width=\"242\" height=\"242\" \/><\/figure><\/div>\n\n\n\n<h1 class=\"wp-block-heading\"><span style=\"color:#089b4c\" class=\"has-inline-color\">Extended reading<\/span><\/h1>\n\n\n\n<p class=\"wp-block-paragraph\">This blog gives only a general idea of the multi-armed bandit problem. For more explanation, including the maths, please visit:<\/p>\n\n\n\n<figure class=\"wp-block-embed is-type-wp-embed is-provider-maddie-smith wp-block-embed-maddie-smith\"><div class=\"wp-block-embed__wrapper\">\n<blockquote class=\"wp-embedded-content\" data-secret=\"zPc0baVG9d\"><a href=\"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/maddie-smith\/2021\/02\/02\/learn-from-your-mistakes-multi-armed-bandits\/\">Learn From Your Mistakes &#8211; Multi-armed Bandits<\/a><\/blockquote><iframe 
loading=\"lazy\" class=\"wp-embedded-content\" sandbox=\"allow-scripts\" security=\"restricted\" style=\"position: absolute; clip: rect(1px, 1px, 1px, 1px);\" title=\"&#8220;Learn From Your Mistakes &#8211; Multi-armed Bandits&#8221; &#8212; Maddie Smith\" src=\"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/maddie-smith\/2021\/02\/02\/learn-from-your-mistakes-multi-armed-bandits\/embed\/#?secret=zPc0baVG9d\" data-secret=\"zPc0baVG9d\" width=\"600\" height=\"338\" frameborder=\"0\" marginwidth=\"0\" marginheight=\"0\" scrolling=\"no\"><\/iframe>\n<\/div><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">For more in-depth references on contextual bandits and reinforcement learning, please visit:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><a href=\"https:\/\/towardsdatascience.com\/contextual-bandits-and-reinforcement-learning-6bdfeaece72a\">https:\/\/towardsdatascience.com\/contextual-bandits-and-reinforcement-learning-6bdfeaece72a<\/a><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><a href=\"https:\/\/www.cs.ubc.ca\/labs\/lci\/mlrg\/slides\/2019_summer_5_contextual_bandits.pdf\">https:\/\/www.cs.ubc.ca\/labs\/lci\/mlrg\/slides\/2019_summer_5_contextual_bandits.pdf<\/a><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">And this video is really good to watch if you are just beginning to learn the topic:<\/p>\n\n\n\n<figure class=\"wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio\"><div class=\"wp-block-embed__wrapper\">\n<iframe loading=\"lazy\" title=\"Contextual Bandit: from Theory to Applications. 
- Vernade - Workshop 3 - CEB T1 2019\" width=\"688\" height=\"387\" src=\"https:\/\/www.youtube.com\/embed\/Mu8uAVrD08w?feature=oembed\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture\" allowfullscreen><\/iframe>\n<\/div><\/figure>\n","protected":false},"excerpt":{"rendered":"<p>This blog will give you an idea of the rationale behind the recommendation system. How contextual bandit problem works in&hellip;<\/p>\n","protected":false},"author":25,"featured_media":290,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"footnotes":""},"categories":[4],"tags":[],"class_list":["post-178","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-blogs"],"_links":{"self":[{"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/wp-json\/wp\/v2\/posts\/178","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/wp-json\/wp\/v2\/users\/25"}],"replies":[{"embeddable":true,"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/wp-json\/wp\/v2\/comments?post=178"}],"version-history":[{"count":11,"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/wp-json\/wp\/v2\/posts\/178\/revisions"}],"predecessor-version":[{"id":375,"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/wp-json\/wp\/v2\/posts\/178\/revisions\/375"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/wp-json\/wp\/v2\/media\/290"}],"wp:attachment":[{"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/wp-json\/wp\/v2\/media?parent=178"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/wp-json\/wp\/v2\/categories?post=178"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/wp-json\/wp\/v2\/tags?post=178"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}