{"id":345,"date":"2022-03-21T09:36:00","date_gmt":"2022-03-21T09:36:00","guid":{"rendered":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/connie-trojan\/?p=345"},"modified":"2022-05-06T14:12:32","modified_gmt":"2022-05-06T14:12:32","slug":"particle-filtering-and-covid-19-part-2-the-bootstrap-filter","status":"publish","type":"post","link":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/connie-trojan\/2022\/03\/21\/particle-filtering-and-covid-19-part-2-the-bootstrap-filter\/","title":{"rendered":"Particle Filtering and COVID-19 (Part 2 &#8211; The Bootstrap Filter)"},"content":{"rendered":"\n<p class=\"wp-block-paragraph\"><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">This is the second part of a series on using particle filtering in epidemiology. In <a href=\"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/connie-trojan\/2022\/03\/14\/particle-filtering-and-covid-19-part-1-the-filtering-problem\/\">part 1<\/a> we saw an example of how to model observed case numbers as a partially observed random process and formulated the problem of inferring the true case numbers as a filtering problem. Unfortunately, exact inference is rarely possible for realistic models and was very computationally expensive even in our relatively simple example! This post will introduce the bootstrap particle filter, an approximate method that is much more computationally efficient and can be used in cases where it is not possible to compute the true filtering distribution.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Sequential Importance Sampling<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\"><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">A common way of dealing with intractable integrals or complicated probability distributions is by using Monte Carlo methods \u2013 in our case, we can simulate from the probability distribution <span class=\"wp-katex-eq\" data-display=\"false\">p(x_t \\,|\\, y_{1:t-1})<\/span> for <span class=\"wp-katex-eq\" data-display=\"false\">x_t<\/span> since we know the dynamics of the disease (we can do this sequentially by simulating <span class=\"wp-katex-eq\" data-display=\"false\">x_t<\/span> from time 0 up to <span class=\"wp-katex-eq\" data-display=\"false\">t<\/span> according to the binomial updates we defined earlier). We can also compute <span class=\"wp-katex-eq\" data-display=\"false\"> p(y_t \\,|\\, x_t)<\/span> directly since we know the distribution of the observation process. These are the ingredients required to use the Monte Carlo technique of importance sampling to approximate the filtering distribution \u2013 if we simulate a large number of samples (known as \u201cparticles\u201d) for <span class=\"wp-katex-eq\" data-display=\"false\">x_t<\/span> we can weight them proportionally to <span class=\"wp-katex-eq\" data-display=\"false\"> p(y_t \\,|\\, x_t)<\/span> and estimate properties of the distribution (for example, the mean) by treating our weighted particles as if they were a sample from the filtering distribution.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Unfortunately, it is quite inefficient to use importance sampling out of the box in this setting \u2013 most of our simulations will not be close to what really happened in the epidemic and will therefore have very small weights. We need to find a lot of plausible possibilities for the true sequence of case numbers in order to get a good estimate for the filtering distribution, otherwise we will just end up being very overconfident about the one particle we simulated that is somewhat close to the truth. This problem gets worse at each timestep as the number of possibilities for the path taken by the hidden states increases \u2013 In fact, the number of particles required to produce good estimates increases exponentially in <span class=\"wp-katex-eq\" data-display=\"false\">t<\/span>. In our epidemiology example (see below for a comparison of the estimated filtering distribution with the true distribution) importance sampling did well initially but the shrinking credible intervals after <span class=\"wp-katex-eq\" data-display=\"false\">t = 10<\/span> indicate that the sample weights are becoming increasingly concentrated on just a few particles.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"569\" src=\"http:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/connie-trojan\/wp-content\/uploads\/sites\/40\/2022\/03\/sisvsexact2-1024x569.png\" alt=\"sisvsexact2\" class=\"wp-image-322\" srcset=\"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/connie-trojan\/wp-content\/uploads\/sites\/40\/2022\/03\/sisvsexact2-1024x569.png 1024w, https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/connie-trojan\/wp-content\/uploads\/sites\/40\/2022\/03\/sisvsexact2-300x167.png 300w, https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/connie-trojan\/wp-content\/uploads\/sites\/40\/2022\/03\/sisvsexact2-768x427.png 768w, https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/connie-trojan\/wp-content\/uploads\/sites\/40\/2022\/03\/sisvsexact2-1536x853.png 1536w, https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/connie-trojan\/wp-content\/uploads\/sites\/40\/2022\/03\/sisvsexact2-2048x1138.png 2048w, https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/connie-trojan\/wp-content\/uploads\/sites\/40\/2022\/03\/sisvsexact2-1600x889.png 1600w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">One issue with this approach is that we keep and use all of the particles, no matter how unlikely they are to be close to the truth. For example, in our epidemic we know that the true number of cases must be at least as big as the number of cases we observed, so any simulated particle that does not satisfy this at each time point will have a weight of 0 since it cannot possibly reflect the true number of cases.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">The Bootstrap Filter<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\"><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The bootstrap filter is a way of sequentially generating particles that are more concentrated in areas of high density of <span class=\"wp-katex-eq\" data-display=\"false\">p(y_t \\, | \\, x_t)<\/span>. Instead of continuing to propagate all of our particles forward and sequentially updating their weights, the bootstrap filter uses the weights to resample the particles, creating a new generation of particles that can all be given equal weight. This allows particles with negligible weight to be eliminated, while we simulate several descendants of the particles with higher weights. At each timestep, we sample the next generation of particles independently and with replacement, with each particle having a probability proportional to its weight of being selected.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">We\u2019ll illustrate this on our running example via an animation \u2013 the gif below shows the propagation of 4 particles, with the true number of cases indicated by the dashed line. The dots representing each current particle are scaled according to their weights, with x\u2019s indicating removed particles. The resulting estimate of the filtering distribution is shown at the end.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" src=\"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/connie-trojan\/wp-content\/uploads\/sites\/40\/2022\/03\/particleprop2.gif\" alt=\"\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">Using 100 particles (see the figure below), we obtain a pretty reasonable approximation for the exact filter mean and quantiles, and for over 1000 particles the two are virtually indistinguishable. The only difference is that the bootstrap filter algorithm takes only seconds to run (even with a huge number of particles), while the exact distribution took several hours to compute!<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"569\" src=\"http:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/connie-trojan\/wp-content\/uploads\/sites\/40\/2022\/03\/sirvsexact2-1024x569.png\" alt=\"sirvsexact2\" class=\"wp-image-324\" srcset=\"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/connie-trojan\/wp-content\/uploads\/sites\/40\/2022\/03\/sirvsexact2-1024x569.png 1024w, https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/connie-trojan\/wp-content\/uploads\/sites\/40\/2022\/03\/sirvsexact2-300x167.png 300w, https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/connie-trojan\/wp-content\/uploads\/sites\/40\/2022\/03\/sirvsexact2-768x427.png 768w, https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/connie-trojan\/wp-content\/uploads\/sites\/40\/2022\/03\/sirvsexact2-1536x853.png 1536w, https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/connie-trojan\/wp-content\/uploads\/sites\/40\/2022\/03\/sirvsexact2-2048x1138.png 2048w, https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/connie-trojan\/wp-content\/uploads\/sites\/40\/2022\/03\/sirvsexact2-1600x889.png 1600w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">In addition, the weights within the particle filtering algorithm can be used to estimate the data likelihood. If we do not know the true values of <span class=\"wp-katex-eq\" data-display=\"false\">p<\/span> and <span class=\"wp-katex-eq\" data-display=\"false\">p_{obs}<\/span> this can be used to estimate which values are likely given our data as we can run the particle filter for different possibilities for the parameters. For example, we can roughly locate the maximum likelihood estimators in our example by estimating the data likelihood on a grid of possible parameter values (see below for an approximate contour plot of the likelihood, with the rough location of the maximum indicated).<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"569\" src=\"http:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/connie-trojan\/wp-content\/uploads\/sites\/40\/2022\/03\/contour2-1024x569.png\" alt=\"contour2\" class=\"wp-image-325\" srcset=\"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/connie-trojan\/wp-content\/uploads\/sites\/40\/2022\/03\/contour2-1024x569.png 1024w, https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/connie-trojan\/wp-content\/uploads\/sites\/40\/2022\/03\/contour2-300x167.png 300w, https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/connie-trojan\/wp-content\/uploads\/sites\/40\/2022\/03\/contour2-768x427.png 768w, https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/connie-trojan\/wp-content\/uploads\/sites\/40\/2022\/03\/contour2-1536x853.png 1536w, https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/connie-trojan\/wp-content\/uploads\/sites\/40\/2022\/03\/contour2-2048x1138.png 2048w, https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/connie-trojan\/wp-content\/uploads\/sites\/40\/2022\/03\/contour2-1600x889.png 1600w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">Optimising for estimates of <span class=\"wp-katex-eq\" data-display=\"false\">p<\/span> and <span class=\"wp-katex-eq\" data-display=\"false\">p_{obs}<\/span> is easy enough to do in our simple example with only two parameters, though much harder for more complicated models. Also, just estimating the parameters isn\u2019t enough if we also want to perform inference on the hidden states, since simply substituting these into a particle filter wouldn\u2019t reflect the fact that we are uncertain about what the true parameter values are \u2013 our uncertainty about the parameters results in a higher degree of uncertainty about the hidden states. A neat way of performing inference on the parameters and hidden states jointly is by using particle Markov chain Monte Carlo.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h3 class=\"wp-block-heading\">Further Reading<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\"><\/p>\n\n\n\n<ol class=\"wp-block-list\"><li><a href=\"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/ziyang-yang\/2021\/04\/24\/how-could-particle-filter-track-thanos-explaining-particle-filter-without-mathematics\/\">How could particle filter track Thanos? \u2013 explaining particle filter without mathematics<\/a> &#8211; Ziyang Yang<\/li><li><a href=\"https:\/\/www.stats.ox.ac.uk\/~doucet\/doucet_johansen_tutorialPF2011.pdf\">A Tutorial on Particle Filtering and Smoothing: Fifteen years later<\/a> &#8211; Arnaud Doucet and Adam M. Johansen<\/li><\/ol>\n","protected":false},"excerpt":{"rendered":"<p>This is the second part of a series on using particle filtering in epidemiology. This post will introduce the bootstrap particle filter, a computationally efficient method for inferring the true case numbers in the filtering problem.<\/p>\n","protected":false},"author":43,"featured_media":423,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"slim_seo":{"title":"Particle Filtering and COVID-19 (Part 2 - The Bootstrap Filter) - Connie Trojan","description":"This is the second part of a series on using particle filtering in epidemiology. This post will introduce the bootstrap particle filter, a computationally effic"},"footnotes":""},"categories":[1],"tags":[],"class_list":["post-345","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/connie-trojan\/wp-json\/wp\/v2\/posts\/345","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/connie-trojan\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/connie-trojan\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/connie-trojan\/wp-json\/wp\/v2\/users\/43"}],"replies":[{"embeddable":true,"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/connie-trojan\/wp-json\/wp\/v2\/comments?post=345"}],"version-history":[{"count":10,"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/connie-trojan\/wp-json\/wp\/v2\/posts\/345\/revisions"}],"predecessor-version":[{"id":483,"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/connie-trojan\/wp-json\/wp\/v2\/posts\/345\/revisions\/483"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/connie-trojan\/wp-json\/wp\/v2\/media\/423"}],"wp:attachment":[{"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/connie-trojan\/wp-json\/wp\/v2\/media?parent=345"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/connie-trojan\/wp-json\/wp\/v2\/categories?post=345"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/connie-trojan\/wp-json\/wp\/v2\/tags?post=345"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}