# Bayes’ theorem, and making probability intuitive

The goal is for you to come away from this

video understanding one of the most important formulas in probability, Bayes’ theorem. This formula is central to scientific discovery,

it’s a core tool in machine learning and AI, and it’s even been used for treasure

hunting, when in the 80’s a small team led by Tommy Thompson used Bayesian search tactics

to help uncover a ship that had sunk a century and a half earlier carrying what, in today’s

terms, amounts to $700,000,000 worth of gold. But there’s more than one level of understanding. There’s knowing what each part means, so

you can plug in numbers. There’s understanding why it’s true; later

I’ll show you a diagram that’s helpful for rediscovering the formula on the fly as

needed. Then there’s being able to recognize when

you need to use it. With the goal of gaining a deeper understanding,

you and I will tackle these in reverse order. So before dissecting the formula, or explaining

the visual that makes it obvious, I’d like to tell you about a man named Steve. Listen

carefully. Steve is very shy and withdrawn, invariably

helpful but with very little interest in people or in the world of reality. A meek and tidy

soul, he has a need for order and structure, and a passion for detail. Which of the following do you find more likely:

“Steve is a librarian”, or “Steve is a farmer”? Some of you may recognize this as an example

from a study conducted by the psychologists Daniel Kahneman and Amos Tversky, whose Nobel-prize-winning

work was popularized in books like “Thinking Fast and Slow”, “The Undoing Project”,

and several others. They researched human judgments, with a frequent focus on when these

judgments irrationally contradict what the laws of probability suggest they should be. The example with Steve, the maybe-librarian-maybe-farmer,

illustrates one specific type of irrationality. Or maybe I should say “alleged” irrationality;

some people debate the conclusion, but more on all that in a moment. According to Kahneman and Tversky, after people

are given this description of Steve as a “meek and tidy soul”, most say he is more likely

to be a librarian than a farmer. After all, these traits line up better with the stereotypical

view of a librarian than that of a farmer. And according to Kahneman and Tversky, this

is irrational. The point is not whether people hold correct

or biased views about the personalities of librarians or farmers, it’s that almost

no one thinks to incorporate information about the ratio of farmers to librarians into their

judgments. In their paper, Kahneman and Tversky said that in the US that ratio is about 20

to 1. The numbers I can find for today put it much higher than that, but let’s just

run with the 20 to 1 ratio since it’s a bit easier to illustrate, and proves the point

just as well. To be clear, anyone who is asked this question

doesn’t have perfect information on the actual statistics of farmers, librarians,

and their personality traits. But the question is whether people even think to consider this

ratio, enough to make a rough estimate. Rationality is not about knowing facts, it’s about recognizing

which facts are relevant. If you do think to make this estimate, there’s

a simple way to reason about the question – which, spoiler alert, involves all the

essential reasoning behind Bayes’ theorem. You might start by picturing a representative

sample of farmers and librarians, say, 200 farmers and 10 librarians. Then when you hear

the meek and tidy soul description, let’s say your gut instinct is that 40% of librarians

would fit that description and that 10% of farmers would. That would mean that from your

sample, you’d expect that about 4 librarians fit it, and that 20 farmers do. The probability

that a random person who fits this description is a librarian is 4/24, or 16.7%. So even if you think a librarian is 4 times

as likely as a farmer to fit this description, that’s not enough to overcome the fact that

there are way more farmers. The upshot, and this is the key mantra underlying Bayes’

theorem, is that new evidence should not completely determine your beliefs in a vacuum; it should

update prior beliefs. If this line of reasoning makes sense to you,

the way seeing evidence restricts the space of possibilities, then congratulations! You

understand the heart of Bayes’ theorem. Maybe the numbers you’d estimate would be

different, but what matters is how you fit the numbers together to update a belief based

on evidence. Here, see if you can take a minute to generalize what we just did and write it

down as a formula. The general situation where Bayes’ theorem

is relevant is when you have some hypothesis, say that Steve is a librarian, and you see

some evidence, say this verbal description of Steve as a “meek and tidy soul”, and

you want to know the probability that the hypothesis holds given that the evidence is

true. In the standard notation, this vertical bar means “given that”. As in, we’re

restricting our view only to the possibilities where the evidence holds. The first relevant number is the probability

that the hypothesis holds before considering the new evidence. In our example, that was

the 1/21, which came from considering the ratio of farmers to librarians in the general

population. This is known as the prior. After that, we needed to consider the proportion

of librarians that fit this description; the probability we would see the evidence given

that the hypothesis is true. Again, when you see this vertical bar, it means we’re talking

about a proportion of a limited part of the total space of possibilities, in this case,

limited to the left side where the hypothesis holds. In the context of Bayes’ theorem,

this value also has a special name, it’s the “likelihood”. Similarly, we need to know how much of the

other side of our space includes the evidence; the probability of seeing the evidence given

that our hypothesis isn’t true. This little elbow symbol is commonly used to mean “not”

in probability. Now remember what our final answer was. The

probability that our librarian hypothesis is true given the evidence is the total number

of librarians fitting the evidence, 4, divided by the total number of people fitting the

evidence, 24. Where does that 4 come from? Well it’s the

total number of people, times the prior probability of being a librarian, giving us the 10 total

librarians, times the probability that one of those fits the evidence. That same number

shows up again in the denominator, but we need to add in the total number of people

times the proportion who are not librarians, times the proportion of those who fit the

evidence, which in our example gave 20. The total number of people in our example,

210, gets canceled out – which of course it should, that was just an arbitrary choice

we made for illustration – leaving us finally with the more abstract representation purely

in terms of probabilities. This, my friends, is Bayes’ theorem. You often see this big denominator written

more simply as P(E), the total probability of seeing the evidence. In practice, to calculate

it, you almost always have to break it down into the case where the hypothesis is true,

and the one where it isn’t. Piling on one final bit of jargon, this final

answer is called the “posterior”; it’s your belief about the hypothesis after seeing

the evidence. Writing it all out abstractly might seem more

complicated than just thinking through the example directly with a representative sample;

and yeah, it is! Keep in mind, though, the value of a formula like this is that it lets

you quantify and systematize the idea of changing beliefs. Scientists use this formula when

analyzing the extent to which new data validates or invalidates their models; programmers use

it in building artificial intelligence, where you sometimes want to explicitly and numerically

model a machine’s belief. And honestly just for how you view yourself, your own opinions

and what it takes for your mind to change, Bayes’ theorem can reframe how you think

about thought itself. Putting a formula to it is also all the more important as the examples

get more intricate. However you end up writing it, I’d actually

encourage you not to memorize the formula, but to draw out this diagram as needed. This is sort of the distilled version of thinking

with a representative sample where we think with areas instead of counts, which is more

flexible and easier to sketch on the fly. Rather than bringing to mind some specific

number of examples, think of the space of all possibilities as a 1×1 square. Any event

occupies some subset of this space, and the probability of that event can be thought about

as the area of that subset. For example, I like to think of the hypothesis as filling

the left part of this square, with a width of P(H). I recognize I’m being a bit repetitive,

but when you see evidence, the space of possibilities gets restricted. Crucially, that restriction

may not happen evenly between the left and the right. So the new probability for the

hypothesis is the proportion it occupies in this restricted subspace. If you happen to think a farmer is just as

likely to fit the evidence as a librarian, then the proportion doesn’t change, which

should make sense. Irrelevant evidence doesn’t change your belief. But when these likelihoods

are very different, your belief changes a lot. This is actually a good time to step back

and consider a few broader takeaways about how to make probability more intuitive, beyond

Bayes’ theorem. First off, there’s the trick of thinking about a representative sample

with a specific number of examples, like our 210 librarians and farmers. There’s actually

another Kahneman and Tversky result to this effect, which is interesting enough to interject

here. They did an experiment similar to the one

with Steve, but where people were given the following description of a fictitious woman

named Linda: Linda is 31 years old, single, outspoken,

and very bright. She majored in philosophy. As a student, she was deeply concerned with

issues of discrimination and social justice, and also participated in anti-nuclear demonstrations. They were then asked what is more likely:

That Linda is a bank teller, or that Linda is a bank teller and is active in the feminist

movement. 85% of participants said the latter is more likely, even though the set of bank

tellers active in the feminist movement is a subset of the set of bank tellers! But, what’s fascinating is that there’s

a simple way to rephrase the question that dropped this error from 85% to 0. Instead,

if participants are told there are 100 people who fit this description, and asked

to estimate how many of those 100 are bank tellers, and how many are bank tellers who

are active in the feminist movement, no one makes the error. Everyone correctly assigns

a higher number to the first option than to the second. Somehow a phrase like “40 out of 100”

kicks our intuition into gear more effectively than “40%”, much less “0.4”, or abstractly

referencing the idea of something being more or less likely. That said, representative samples don’t

easily capture the continuous nature of probability, so turning to area is a nice alternative,

not just because of the continuity, but also because it’s way easier to sketch out while

you’re puzzling over some problem. You see, people often think of probability

as being the study of uncertainty. While that is, of course, how it’s applied in science,

the actual math of probability is really just the math of proportions, where turning to

geometry is exceedingly helpful. I mean, if you look at Bayes’ theorem as

a statement about proportions – proportions of people, of areas, whatever – once you

digest what it’s saying, it’s actually kind of obvious. Both sides tell you to look

at all the cases where the evidence is true, and consider the proportion where the hypothesis

is also true. That’s it. That’s all it’s saying. What’s noteworthy is that such a straightforward

fact about proportions can become hugely significant for science, AI, and any situation where you

want to quantify belief. You’ll get a better glimpse of this as we get into more examples. But before any more examples, we have some

unfinished business with Steve. Some psychologists debate Kahneman and Tversky’s conclusion,

that the rational thing to do is to bring to mind the ratio of farmers to librarians.

They complain that the context is ambiguous. Who is Steve, exactly? Should you expect he’s

a randomly sampled American? Or would you be better to assume he’s a friend of these

two psychologists interrogating you? Or perhaps someone you’re personally likely

to know? This assumption determines the prior. I, for one, run into many more librarians

in a given month than farmers. And needless to say, the probability of a librarian or

a farmer fitting this description is highly open to interpretation. But for our purposes, understanding the math,

notice how any questions worth debating can be pictured in the context of the diagram.
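As a rough numerical companion to the diagram, here is a minimal Python sketch of how the answer moves when the inputs move. The function name `posterior` and the alternative numbers (a 50/50 prior, a stronger stereotype) are assumptions made up for illustration; only the 1/21 prior and the 40% and 10% likelihoods come from the example above.

```python
def posterior(prior, likelihood, likelihood_given_not):
    """P(H | E) for a yes/no hypothesis H, given P(H), P(E | H), P(E | not H)."""
    evidence = prior * likelihood + (1 - prior) * likelihood_given_not  # P(E)
    return prior * likelihood / evidence

# Numbers from the Steve example: prior 1/21, likelihoods 40% and 10%.
baseline = posterior(1 / 21, 0.4, 0.1)        # about 0.167

# Hypothetical change of context: librarians and farmers equally common.
equal_prior = posterior(0.5, 0.4, 0.1)        # about 0.8

# Hypothetical change of stereotype: the description fits librarians far better.
strong_stereotype = posterior(1 / 21, 0.8, 0.05)

# Evidence that fits both groups equally well leaves the prior untouched.
irrelevant = posterior(1 / 21, 0.3, 0.3)      # about 1/21

for name, value in [
    ("baseline", baseline),
    ("equal prior", equal_prior),
    ("strong stereotype", strong_stereotype),
    ("irrelevant evidence", irrelevant),
]:
    print(f"{name:>20}: {value:.3f}")
```

Changing only the first argument corresponds to arguing about who Steve is; changing the other two corresponds to arguing about how well the description fits each group.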

Questions of context shift around the prior, and questions of personalities and stereotypes

shift the relevant likelihoods. All that said, whether or not you buy this

particular experiment, the ultimate point that evidence should not determine beliefs, but

update them, is worth tattooing in your mind. I’m in no position to say whether this does

or doesn’t run against natural human intuition, we’ll leave that to the psychologists. What’s

more interesting to me is how we can reprogram our intuitions to authentically reflect the

implications of math, and bringing to mind the right image can often do just that. This is just one way to visualize Bayes’

theorem, and I’d like to share with you another way that can be generalized to cases

where you have more possibilities than a simple yes or no for a hypothesis, maybe even a

continuous range of hypotheses. For example, say you want to update your belief about the

mass of the earth based on new measurements you take. We’ll also take a glimpse at the

kind of constructs programmers build on top of this formula as you get more sophisticated.

All this with the goal of finding that deeper understanding, and all of this in the next

video.
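For readers who want to check the arithmetic from the Steve example, the following is a minimal Python sketch, not anything shown in the video itself. It computes the answer twice: once by counting in the representative sample of 10 librarians and 200 farmers, and once from the formula in terms of probabilities alone. The function and variable names are assumptions chosen for illustration, and exact fractions are used so the two routes can be compared without rounding.

```python
from fractions import Fraction


def posterior(prior, likelihood, likelihood_given_not):
    """Bayes' theorem for a yes/no hypothesis H and evidence E.

    prior                = P(H)
    likelihood           = P(E | H)
    likelihood_given_not = P(E | not H)
    Returns P(H | E).
    """
    # P(E), split into the case where H holds and the case where it does not.
    evidence = prior * likelihood + (1 - prior) * likelihood_given_not
    return prior * likelihood / evidence


# Route 1: count people in the representative sample.
librarians, farmers = 10, 200
librarians_fitting = librarians * Fraction(40, 100)  # 4 librarians fit
farmers_fitting = farmers * Fraction(10, 100)        # 20 farmers fit
by_counting = librarians_fitting / (librarians_fitting + farmers_fitting)

# Route 2: the formula, with no sample size involved.
by_formula = posterior(
    prior=Fraction(1, 21),                 # 10 librarians out of 210 people
    likelihood=Fraction(40, 100),          # P(description | librarian)
    likelihood_given_not=Fraction(10, 100) # P(description | farmer)
)

print(by_counting, by_formula, float(by_formula))
```

Both routes give 4/24, which reduces to 1/6, or roughly 16.7%. The sample size of 210 never appears in the second route, which matches the point that it cancels out.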

Also check out the footnote on the "quick" way to see Bayes' theorem, together with some discussion of independence: https://youtu.be/U_85TaXbeIo

If you rearrange Bayes theorem you get P(E)P(H|E) = P(H)P(E|H) both sides of which are just P(E.H) i.e. the probability of E and H together.

Edit: Duh, this is one of the points of the short follow-up video

Wow, embarrassed to admit it, but I had to pause the video to ponder the activist bank teller example. I absolutely would have gotten that question wrong.

So, the information from the question improves Steve's probability of actually getting the riddle correct from 1/21 to 1/6 (?) and he's not supposed to double down on optimism? Did anyone tell Steve he wasn't playing hide and seek? In other words, I think people are picking up on where the evidence is headed, as a fox hunts for mice in the snow (https://youtu.be/D2SoGHFM18I).

I'm still struggling with probabilities. I know it is useful in several instances, but it's just not clicking with me. I'll watch this video again tomorrow and see if I get new insight. I'm looking forward to the next one as well.

I got Bayes in my 💓.

You are the man

The Linda question seemed like it would be better if I choose the 1st option. Since both options include a fact, that she is a bank teller, it would make more sense to choose the 1st than the 2nd because the likelihood of her being a bank teller obviously exceeded a niche, very specific category. That was my take on it.

Visualising division and multiplication geometrically or using vectors in Euclidean space is very important. Pythagoras started like this and sadly our school curriculum doesn’t have that presentation.

I love your videos. These videos got me to eventually change my major because they're so good

I think there's an additional piece of information when considering the Steve problem, which is that being told about the personality traits makes you think that his personality is what's most relevant to the correct answer. Basically it's like when you're given a formula or a hint on a test question, the fact that you're given the formula at all automatically makes you think that you have to use it somewhere in your solution (even if it's totally irrelevant and the examiner is just being a dick)

You're back!

I've updated my belief that this channel is awesome.

Casually teaches Bayes’ Theorem in 4 and a half minutes.

I don't think it conflicts with intuition. I think the questions are SPECIFICALLY designed almost on purpose to omit assumptions and information usually present in other examples.

I remember learning Bayes theorem in school. I got it, but I could never successfully use it in a problem that was even slightly abstract. Instead, I drew this exact diagram except with rectangles overlapping to represent subsets. I'm so glad everyone here gets a chance to learn this, it's so helpful!

This tattooed more in my brain than other explanations of bayes theorem

In the case of Linda, people take the mistake, because they try to be in her shoes, while when you change the question from Linda into 100 individuals, people wouldn't.

hard to follow

Microtheorems in pieces

Hi @3blue1brown, i used to interpret that the Bayes theorem is just a formula between marginal probability

P(A and B) = P(A|B) * P(B)

Which can be modeled easily.

Why do people try to complicate things up?

Just because people don't understand what's being told in the video, they can NOT dislike it.

Excellent video. Thanks

"Very little interest with the world of reality" – how does that fit with any farmer ? Shy and withdrawn – sure, Meek and tidy – sure but farmers are very much concerned with the world of reality…

I think what happens is that we focus on the likelihood of the description describing a situation, and forget about the actual person it describes.

So what is the likelihood that this is the description of a set of librarians, rather than what is the likelihood a person described by this being a librarian.

So when choosing between the set of librarians or the set of farmers we fall down on the set of librarians.

That also explains why we switch perspectives when actual numbers are introduced.

a way to make the video clearer: keep the example (steve)'s version of H and E on the corner of the screen as you change visuals

Great continue with probability then Machine learning: SVM and Convolutional Neural networks then LSTM, RNN and GAN

3:52 420 lmaaaaaao blaaaaaazZz33##3

👍 understood one of the most important thing in life

Finally a video I can understand

In this example, how would anyone know the probability that 'meek and tidy soul'-s are librarians? In a real example, how is this 'evidence' calculated? Also: The hypothesis is, to me, the evidence. It's the raw facts saying the ratio of librarians to farmers. How can this be considered a hypothesis? It's literally facts, while the 'evidence' of meek and tidy souls are much more speculative. Am I overthinking/misunderstanding something?

And why not just use a practical real example?

I've seen some people on the internet who are vehemently against falsificationism and champion Bayes Theorem in its place, which as a falsificationist has always bothered me, but the way you emphasize here that Bayes theorem is all about updating beliefs about a preexisting hypothesis and not about coming up with truths out of a vacuum makes it seem to me that it and falsificationism go hand in hand a lot more than those people on the internet want to make it. Falsificationism is also all about updating beliefs about a preexisting hypothesis rather than pullings truths out of a vacuum, and it seems to me following your explanation here that Bayes theorem is all about figuring out which side of a bifurcation of possibilities — the ones where the hypothesis is true, and the ones where it's false — is less likely relative to the other, which is entirely in keeping with the main thrust of falsificationism, which is that there is never a way to concretely establish that one particular possibility is the only correct one, only to rule out alternatives to it. Bayes theorem seems to me, following your explanation here, that it "partially rules out" one or the other of two mutually contradictory alternatives, narrowing in on one side of the space of possibilities as the region in which the truth (probably) lies.

Yay just got an A- in probability :,)

Oh man, I remember getting just such a study questionnaire, where I get a long series of questions like: "What's more likely X, or X and Y" and I thought long and hard about every single question… I felt so stupid when my class mates explained how easy it was.

In my mind I was building up these two characters of people of which you would say that they fit X, or people of which you would always say both X and Y, and then comparing these stereotypes to the description…

So, what about that gold?

Outstanding alliteration at 12:20.

„Rationality is not about knowing fact, it’s about recognizing which facts are relevant.“

Sadly, this concept gets abused on a daily basis by certain individuals.

Sorry, but these examples are dodgy. Their whole "trick" relies on being phrased in an imprecise way and leaving out context. In the farmer example, an honest, CLEAR way to ask the question would be: "Within the ENTIRE US population, is Steve more likely to be a librarian or a farmer?" But the question LEAVES OUT this context (US pop) entirely. People assume it refers to generic human archetypes, e.g. characters in a book.

In books, characters are NOT represented proportionally to reality, but by "types": one farmer, one librarian, one soldier etc. Thus people interpret the question as "Between two archetypal characters, a farmer and librarian, who is Steve more likely to be?". Same with the bank teller example: nowhere is the whole US population mentioned as a context. People interpret it to mean "if in a book, a character is a bank teller, and she DOES HAVE a history of activism, is she more likely to be a feminist"? An honest way to ask the question would be: "Within the entire US population, is Linda more likely to be a bank teller or a feminist bank teller"

3:51 420 lmao

not starting from the beginning of probability stuff? I hope its a series 🙂

11:30 I think why people chose the 2nd answer is because based on the description, we assign the answer to "activist". Whether she's a bank teller or not is not relevant .

Which is why if you draw the venn diagram, the 2nd choice makes more sense, because at least it's an intersection between "activist" (which we assume is the higher likelihood) & bank teller.

I simply love how you are complicating otherwise easy to understand concepts.

One of the most amusing applications of Bayes theorem: https://www.youtube.com/watch?v=HHIz-gR4xHo

So | is just a &&

Can this be done in the 3D dimension?

Why can't I give these videos more than one thumbs up? I swear all of this channel needs to be permanently stored in the library of congress for the good of all humanity

Now I have a video I can point to that can explain what was going on in my Masters research. I could never adequately explain this topic. This video nails it!

"Rationality is not about knowing facts, it's about recognizing which facts are relevant."

As I see you are doing a mad job at explaining concepts, you could consider doing something from math logic.

Even Ted-ed had a video on the logical fallacy as mentioned in one of your examples. https://youtu.be/Ghbkv0MKV-w

Wow. I thought there would be more librarians

There are loads of comments here already objecting to the wording of the Linda question. Please note that this video only gives a brief summary of the results, and the actual study went to great pains to eliminate the possibility that participants were merely mistaking the statement "Linda is a bank teller" to mean "Linda is a bank teller (and not a feminist)."

First off, the original question didn't give only the two options mentioned by Grant here. The original question asked participants to rank EIGHT choices in order of their relative probability, including not only the two Grant mentioned in the video, but also "Linda is a feminist," "Linda is a teacher in an elementary school," "Linda works in a bookstore and takes yoga classes," "Linda is a psychiatric social worker," "Linda is a member of the League of Women Voters," and "Linda is an insurance salesperson."

Among these EIGHT choices, participants consistently ranked "Linda is a bank teller and a feminist" higher than "Linda is a bank teller." But the authors weren't convinced that participants might be misreading things, so they gave participants a subset of FIVE of the eight statements I listed here, eliminating one of the two options "Linda is a bank teller and a feminist" vs. "Linda is a bank teller." Still, even when participants saw only ONE of the two options mentioned by Grant in the video, they consistently ranked "Linda is a bank teller and a feminist" higher in relative probability than "Linda is a bank teller" alone compared to the other options. (And they did other variants specifically to be sure this effect was real. They also tested it on grad students who had taken probability and stats courses and would be familiar with how to compute the probabilities of all of these statements — and they still saw people ranking the less likely statement higher in probability.)

This study has been repeated in various situations over the years. It's a very well-documented result, known as the "conjunctive fallacy" or similar to the idea of "misleading vividness." To see how it plays out in potentially nefarious ways, consider that it's a common strategy used by trial lawyers. Replace "bank teller" (the less probable thing) with "murderer." Lawyers frequently tell a detailed story that incorporates facts that seem likely about a defendant, thereby making it easier for a jury to believe that the defendant is ALSO a murderer — because everything else checks out with what they know about the person. Similarly, the description here makes people think it's likely that "Linda is a feminist" and they will rank statements including that supposition higher, regardless of whether you tack on irrelevant and significantly less probable details.

Wow this was great

This is a great Christmas gift. I learnt the formula at school and see many people hyping the Bayes T, but this is one of the more intuitive explanations out there. Thank you

Senpai, my head hurts 😰

I guess that, at least in the Bank Teller question, people chose option 1 because they thought it was about Linda being JUST a bank teller or her being a bank teller and being active in the feminist movement, so it made a bit more sense to for them to choose option 1. Obviously if they used that formula and the visual representation more of them would have chosen option 2 even thinking this way, even if 1/100 bank tellers that are women are active in the feminist movement and 2/100 are not in the feminist movement but fit that description, option 1 was the best choice, but if they intended the question this way it’s not THAT incredible that people chose option 2. Also it would imply that people didn’t change their mind at all between the answer they gave to the first question and the second one (the one in which everyone chose the right option) because that one asks how many bank tellers out of the ones that fit that description are just bank tellers (ex. 8) and how many are bank tellers and active in the feminist movement (Ex. 5); if 5/8 are active in the feminist movement and not just bank tellers and the first question asked how many of them out of all of them who fit the description (8) are active in the feminist movement (5), it made sense to chose option 2 (5/8).

That was fascinating, on both a mathematical and psychological level. Well done, great video!

Steve has 100 apples and gives his friend 20 apples, how many apples does he have left?

95, the average person can only carry 5 apples at best, so the friend can only accept 5 of them.

Rationality is not about knowing facts, it’s about recognising which facts are relevant.

So simple yet so profound !

"Programmers sometimes…": no joke I literally did this last week.

This was so useful, I became a patreon just because of this video

Oh boy only a video or two away now from 3Blue1Brown having to calm down all the LessWrong people about Bayes' Theorem and Roko's Basilisk

There is no Nobel Prize of economics. It's the "Nobel Memorial Prize", which is separate from the actual Nobel Prizes and has often been criticized for using the name.

Well if you speak to normal people (non-mathematicians), if you say Linda is a bank teller then you mean that she is only a bank teller and did not take part in the feminist movement. In that case 1) and 2) form a partition of the state space.

Question: Does Bayes theorem relate to conditional expectation?

I had no idea I could be so excited about a 15 minute video on a single tiny formula. good stuff!

I really struggle with the terms 'Evidence', 'Hypothesis' and 'Belief' in this topic.

Maybe it's because I don't see them as mathematical terms …. how do i mathematically distinct between those?

Maybe it's just because English isn't my language …..

Great content.

Actually with the librarian farmer question I went with the librarian solely on the statement "with very little interest in people or in the *world of reality*". It is most likely a librarian since a farmer still has interest in the world of reality, that being farming.

finally!!!

i had a teacher once who would give us true or false questions on exams and ask questions like the one about linda and when we told him that one answer included the other he told us to pick the one thats "most true". maybe thats why people picked that answer. somehow education taught us to do that

Thank you patrons!

❤❤❤❤

Can anyone point towards good probability course visually as understanding as this one. Like at 7:30 why multiplication.

It would be good to have probability video series but that’s asking a lot. Great work @3Blue1brown

This is really useful! I'm looking forward to the video on Bayes' theorem in machine learning!

Really really cool!

"all this will be in the next video"

*next video*: lol, jk

As always, a brilliant video. Although I have to say that picturing Steve as someone wearing a tie introduces a bias for the viewer that makes the answer librarian more "obvious". I don't know if this is a deliberate design choice of yours, but it definitely falls into a similar category as the criticism about the original question. Nonetheless, I now have a more intuitive understanding of Bayes' theorem and won't need to look it up to be sure I am applying it correctly. Thank you very much 🙂

Probability was my fav math course in high school. This was great to actually watch. Thanks for continuing my interest in probability 🙂

Amazing description

Just had an AHA moment. Well done Sir!

Amazing video!!!! Bayes’ theorem is such a powerful idea and you’ve explained it so masterfully. I can’t wait for the next video!

A superb video on the essence of Bayesian thinking. I think that Bayesian thinking should be part of the math curriculum of high schools. I have advocated the teaching of Bayesian probability at Dutch high schools. The book Basic Probability, What Every Math Student Should Know, World Scientific Press, 2019 has grown out of these efforts: https://www.amazon.com/Basic-Probability-Every-Student-Should/dp/9811203768/

The Bayesian Trap

11:50 seems like mostly an issue of language rather than a misunderstanding of probabilities. When asked "is it more likely that she's a bank teller or that she's a bank teller active in the feminist movement", most people will infer that the first possibility implies "she's a bank teller [and not active in the feminist movement]". Saying that (b) is a subset of (a) only makes sense if you purposefully misunderstand how people communicate in daily life. Kind of like trying to prove a result by using an inclusive "or" in a context where most people would read it as an exclusive "or". Unless people are primed to think of these statements like a mathematician beforehand, the result is kind of pointless IMO.

Instantly after I heard you explain the study, I understood that the idea of "irrationality" given by it is nonsense

garbage example

At 1:30, why do I get the sense that Farmer Steve's crop is designed to be smoked?

Thanks for this amazing topic

YouTube. Please put this in suggested links on all politics clips.

I think the notation P(Librarian given description) is misleading. What is P(description)? The probability of that specific string? No, this is not what you meant, since what you compute is the ratio of the number of people fitting that description to the number of farmers and librarians. Communicating about probabilities is damn difficult. And just to avoid negativity, I love your videos!
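
One way to read the notation questioned in the comment above: P(description) is the fraction of the whole population under consideration (here, librarians plus farmers) that fits the description, not the probability of a particular string. Below is a minimal Python sketch of that counting view. The figures (10 librarians, 200 farmers, 40% and 10% of each fitting the description) are illustrative assumptions, and `posterior_from_counts` is a hypothetical helper name.

```python
from fractions import Fraction


def posterior_from_counts(n_h, n_not_h, frac_fit_h, frac_fit_not_h):
    """Return P(H | E) computed by counting.

    n_h, n_not_h       : how many people are in the H and not-H groups
    frac_fit_h         : fraction of the H group fitting the evidence, P(E | H)
    frac_fit_not_h     : fraction of the not-H group fitting it, P(E | not H)
    """
    fit_h = n_h * frac_fit_h              # people in H who fit the description
    fit_not_h = n_not_h * frac_fit_not_h  # people outside H who fit it
    # Restrict attention to everyone who fits, then ask what share is in H.
    return fit_h / (fit_h + fit_not_h)


# Assumed, illustrative numbers: 10 librarians, 200 farmers,
# 40% of librarians and 10% of farmers fit the description.
p_librarian = posterior_from_counts(10, 200, Fraction(2, 5), Fraction(1, 10))
print(p_librarian)  # 1/6
```

In this reading, the denominator `fit_h + fit_not_h` divided by the total head count plays the role of P(description).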

Two 3b1b videos at the same time, this is the best Christmas present 🎁

But you seem to be more likely to meet and know a librarian than a farmer. Thus the fact that you are making this remark makes it more probable that Steve is a librarian.

I find it even easier to understand when using Venn diagrams. That way E and H are somewhat equivalent, since they are both just two random events and it doesn't really have to be a hypothesis and evidence. But I do use your type of diagram when dealing with independent events, since then the two rectangles that you take to normalize the posterior have the same height. So effectively the two axes represent the two events independently.

Thanks and happy holidays 3b1b..!:)

Thanks for that unit square. I am definitely gonna find that useful

Bayes' theorem maybe ?? (with the apostrophe after "Bayes")

The Linda case is not weird at all, it's just pure logic. For a group of people to be both bankers and feminists implies that their number is equal to or smaller than the number of bankers, because they are a more specific subset of bankers (i.e. bankers who also happen to be feminists). This also holds when compared to the number of feminists: there are more feminists than feminists who also happen to be bankers. It has nothing to do with wording the information with percentages or saying "a number out of a number".
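
The subset argument in the comment above can be written as a one-line inequality. Here A stands for "Linda is a bank teller" and B for "Linda is active in the feminist movement"; these symbols are labels chosen for illustration, not notation taken from the video.

```latex
P(A \cap B) \;=\; P(A)\,P(B \mid A) \;\le\; P(A),
\qquad \text{since } 0 \le P(B \mid A) \le 1 .
```

Swapping the roles of A and B gives the matching bound against the number of feminists, P(A ∩ B) ≤ P(B).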

In the case of Steve I did not for a second think that he was a librarian because he is "meek", but because he's not interested in reality. I find it very unlikely for someone uninterested in the physicality of the world to get involved in a labor intensive job such as farming, where reality is constantly kicking you in the teeth.

I have no idea why anyone would even answer a riddle like the one posed in the title, i.e. guessing someone's profession based on their character. It simply sounds like nonsense. Bolting probability theory onto it is like building castles on sand. At the edge of the sea.

I'd say that only about 10% of people are capable of consistently thinking this way, but given you are someone who watches these kinds of videos, the posterior probability that you are good at Bayesian reasoning is about 88%, with a 95% equal-tailed credible set of 84%-92%

While I love this video and the concept makes sense, the emotional intelligence part of my brain really doesn't like the human examples being used to illustrate it. My brain looks at someone like Linda and thinks that someone like that is virtually guaranteed to be involved in the feminist movement, no matter what job she does. The same as a person like Steve would almost never be a farmer, that profession is so utterly unfit for someone with that temperament that, again, virtually no one like him would ever be a farmer. Thus the probability that he is a librarian is so close to 100% that it renders the example moot in my mind.

I can absolutely see how this would be useful in social science and as a way to challenge assumptions about populations, but I very much feel like explaining first using the human examples is not the best way to make it intuitive.