*UPDATE 11am response to commentator: is there an association between inability to understand Bayes’ theorem with ethnic prejudice?*

*UPDATE 3:30PM explaining risk of false positives to congressmen and commentators*

Congressman Chairman: Muslims! Terrorists! Muslims! Terrorists!

Witness: Let A be the event of terrorism, and B be the event of Muslimism. Then P(A|B)≠P(B|A)

Congressman: What are you talking about?

Witness: You seem to be confusing the probability that a Muslim person will be a terrorist with the probability that a terrorist person will be a Muslim

Congressman: And you seem to be confusing everyone in this hearing, smartass.

Congressman: What did you just call me?

Witness: it’s simple, the probability that a Muslim will be a terrorist will be 13,000 times lower than the probability that a terrorist will be a Muslim. That is, the ratio of the probability of being a terrorist to the probability of being a Muslim is about 1 over 13,000 (P(A)/P(B)).

Congressman: so even the math department has been taken over by politically correct academic radicals who hate America?

Witness: even if you think that the Probability of a Terrorist being a Muslim is 95.3%, the probability of a Muslim being a Terrorist is only 0.007%. That is less than the probability of a left-handed octogenarian Olympic discus-thrower being struck by lightning.

Congressman: or maybe even less than the probability that anyone is listening to you?

Witness: maybe this picture will help.

Congressman: I’m calling your state legislature right now to fire your radical butt.
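The witness's arithmetic can be checked in a few lines. This is a minimal sketch, assuming only the two figures quoted in the dialogue (95.3% and the 1-in-13,000 ratio); the function and variable names are illustrative and not part of the original post.

```python
# Illustrative only: both inputs are the figures quoted in the dialogue.
p_muslim_given_terrorist = 0.953   # P(B|A), the witness's "even if you think" figure
prior_ratio = 1 / 13_000           # P(A)/P(B), as stated by the witness


def invert_conditional(p_b_given_a: float, p_a_over_p_b: float) -> float:
    """Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)."""
    return p_b_given_a * p_a_over_p_b


p_terrorist_given_muslim = invert_conditional(p_muslim_given_terrorist, prior_ratio)
print(f"P(A|B) = {p_terrorist_given_muslim:.2e}")
print(f"as a percentage: {100 * p_terrorist_given_muslim:.4f}%")
```

With these inputs the result is roughly 0.007 percent, which matches the figure cited in the postscript.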

POSTSCRIPT: response to commentator:

Mr. McKinney, perhaps your prejudices led you to mis-read the piece. 13,000 was how much larger one conditional probability was than another, which is helpful for understanding Bayes’ Theorem but not for policy. The policy-relevant probability is that of a Muslim being a terrorist, which based on a Rand report was calculated here as 0.007 percent.

If you still don’t get this, then why don’t you also start targeting white males, since 80% of serial killers fit that description, and these serial killers kill about 100 people a year.

Regards, Bill Easterly

POSTSCRIPT 2 3:30PM

To the Congressman and Mr. McKinney (again):

One other probability you may want to consider is that Al-Qaeda’s recruiting will become more successful by a δ >= 0.0007 percent after you have persecuted the 99.9993 percent of Muslims who are innocent.

Very fun. Where did you get these numbers and proportion?

Heh heh… similarly:

From Tina Fey’s article in today’s New Yorker. http://www.newyorker.com/reporting/2011/03/14/110314fa_fact_fey#ixzz1GYbQ2JqF

Bad use of statistics. A probability of 1 in 13,000 means that over a sufficient number of cases, say 13,000, you’ll find that 1 is a terrorist. But the ratio says nothing about a single case. I could easily find two Muslim terrorists in a row. Or I could test just 5 cases and find a terrorist. Probabilities only work out over large numbers of cases. They cannot predict the outcome of a single case.

A small number of Muslims have advertised for decades that they are terrorists intent on committing terrorist acts and they will try to blend in with the normal Muslim population in order to make it difficult to identify them.

So where do we start looking for terrorists? Among the Amish?

I feel sorry for the majority of good Muslims who oppose terrorism that a few idiots are hiding behind them in order to commit murder. If we had more stories about Muslims turning in other Muslims for planning terror acts, more people would see the differences. Instead, we get nothing but whining from Muslims because they are inconvenienced in the process of discovering the murderers among them.

@Roger: pretty much every article about the King hearings has mentioned that of the 120 foiled terrorist plots planned by Muslims in the US since 9/11, 48 were foiled because they were turned in by other Muslims. The stories are there, but unfortunately not everyone pays attention.

Here’s one, just FYI: http://www.usatoday.com/news/opinion/editorials/2011-03-10-editorial10_ST_N.htm.

Evidence once again that there ought to be an entrance exam into congress, with a minimum threshold of math (my bias would include Bayes’ Theorem). Perhaps the statistical approach was a little bit of a stretch, but that’s no excuse for the congressman to have been so completely lost in the conversation.

Spare me.

Wafa, that is information that needs to be more advertised! I wasn’t aware of it.

But back to the issue of statistics, consider how marketing uses them to determine who might buy. They are modeling rare events, just as the models of terrorism do and they have a high number of false positives. Yet marketers find them very useful. The results are much better than random.

The same thing goes for models of money laundering that banks use, or fraud models used by mortgage and insurance companies. But the results will be far better than random testing even with the false positives.

Applied to terrorism, what is the single most valuable variable in predicting whether someone is a terrorist or not? Their religion! Are the odds high that a selection based solely on religion will have a false positive? Of course! Just as the marketing and fraud models have high false positives. Are they better than random selection? Far, far better.

And what are the consequences of false positives vs false negatives? The consequences of false negatives are far more disastrous than false positives, so the models should be skewed to reducing false negatives as much as possible, which will create more false positives.

Anyone who understands the difficulties of predicting rare events realizes how silly the probability exercise above is.

PS, we can either work with flawed models or use random investigations of everyone. Which do you think will be more successful?

Thanks for the correction of my misunderstanding of the probabilities. I simply read too fast.

Why not target white males? Because they are serial killers and not terrorists. I’m confident that the “profilers” at the FBI do target white males for serial killers. But they aren’t terrorists.

To take a different perspective, suppose you were going to put together a logistic regression model to predict terrorism. What variables would you include? Are you saying that country of origin and religion will not improve the model significantly?

Before we dig too deep, let’s not forget that the proper comparison is not the probability of a terrorist vs non-terrorist Muslims, but rather the comparison of probabilities between all ethnic/religious groups on terrorist tendencies. Get your microscopes out folks because those numbers will make .007% look quite large. In my opinion, check them all. If Muslims don’t like it, then they need to start restoring order to their own house and openly condemning the behavior of those under its roof. By saying nothing, they condone the behavior and bring aspersion onto themselves.

McKinney: I think you’ve hit on a winning strategy there, considering it is so incredibly difficult for Muslim terrorists to disguise themselves as non-Muslims.

Jon, considering that Muslim terrorists haven’t bothered to disguise themselves so far, your point is irrelevant. As for the ease of disguise, I think you’re right; it is difficult. That would involve a change of name and background documents that the terrorists haven’t been able to accomplish, yet. I’m sure they would like to do it if they could.

Professor,

I understand that you are smart and would make your graph quickly transparent to make your point. Would you show us the graph with the y-axis being “terrorists” and the x-axis being “Muslims” instead of using double negatives on both axes? I appreciate your precision and effort to be policy relevant.

yours Dum

“One other probability you may want to consider is that Al-Qaeda’s recruiting will become more successful by a δ >= 0.0007 percent after you have persecuted the 99.9993 percent of Muslims who are innocent.”

So you’re privy to al Qaeda’s marketing plan? How many of al Qaeda’s recruits suffered some kind of persecution by Americans, considering the vast majority are from Egypt, Saudi Arabia and Yemen?

Why don’t the testimonies of al Qaeda members matter to anyone? They tell us why they became terrorists and none of them mention persecution by Americans or Christians, poverty or any of the standard leftist ideas as to why young men become terrorists. Why is it the left thinks they understand Muslim terrorists better than Muslim terrorists understand themselves?

“Al-Qaeda’s recruiting will become more successful by a δ >= 0.0007 percent after you have persecuted the 99.9993 percent of Muslims who are innocent.”

I’ll take this as Mr. Easterly’s indirect answer to my question above: which variables would you include in a logistic regression predicting the probability that an individual is a terrorist.

So Mr. Easterly thinks that mistreatment at the hands of the TSA or some other federal agency is the best predictor for a model of terrorists. I would guess that such a model would predict that a lot of terrorists exist among the Hispanic population, then.

Of course, inconveniencing people is not the same thing as persecuting them, just as making people uncomfortable is not torture.

American blacks, Indians, Japanese (during WWII) and others have suffered real persecution in the US. How many of them are terrorists?

PS, Christians in Egypt, Pakistan and other Muslim countries are among the most persecuted people on the planet. How many of them are terrorists?

Correction – independent probabilities follow P(A|B)=P(A) and P(B|A)=P(B). If this is the case, Bayes’ law still holds, but the reverse is not true. The image refers to independent probabilities, and the story doesn’t.

Original comment didn’t make it – it was: Interesting article. However, the picture is not useful as an illustration of Bayesian probability. It represents independent probabilities, (…). The more general (and more complicated) Bayesian case could look quite a bit different.

“If you still don’t get this, then why don’t you also start targeting white males, since 80% of serial killers fit that description, and these serial killers kill about 100 people a year.”

And one 9/11 killed 3000 people. That equals 30 years of the serial killers’ victims.

McKinney: What I was trying to suggest is that Easterly’s (already quite convincing, imho) argument still assumes that we have either 100% accurate knowledge of who is/isn’t Muslim, or some quick and easy ‘Muslim test’ with 100% accuracy. We don’t. If we can only realistically determine with, say, 99.5% accuracy that someone is Muslim, and we use this test on the entire population in order to correctly identify the 0.8% or so who are Muslim in order to THEN determine whether or not they are terrorists, you have a whole other Bayesian problem to deal with (false positives/false negatives regarding who is actually Muslim). This can easily add a couple more 0’s to that 0.0007%, especially if you use realistic rates of false positives/negatives (consider a man with a name like, say, Keith Ellison, who was born in Detroit, Michigan; if he were not a famous member of Congress how exactly would you know he was Muslim?).

Anyway, I was going to go through a long mathematical example here, but really I think the main problem with terrorist-hunting on the basis of ethnicity, from a Bayesian perspective, is actually the problem of false positives, not the relative success rates of different random-investigation strategies. If you are hunting for something extremely rare, even a very low rate of false positives will result, via Bayesian probability, in you wasting 98% or 99% or 99.9% of your resources chasing false positives.
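The two false-positive problems described above can be put into rough numbers. The following is a minimal sketch, not anyone's actual screening model. It reads "99.5% accuracy" as 99.5% sensitivity and 99.5% specificity (an assumption), takes the 0.8% base rate from the comment, and invents a hypothetical terrorist test with 99% sensitivity and a 1% false-positive rate, applied at the base rate of 0.953/13,000 implied by the post. The function name is illustrative.

```python
def positive_predictive_value(prevalence: float,
                              sensitivity: float,
                              false_positive_rate: float) -> float:
    """Share of positive test results that are true positives (Bayes' theorem)."""
    true_pos = sensitivity * prevalence
    false_pos = false_positive_rate * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Stage 1: a "Muslim test" with 99.5% accuracy on a population that is 0.8% Muslim.
# Assumption: "accuracy" means sensitivity = specificity = 0.995.
share_flagged_who_are_muslim = positive_predictive_value(0.008, 0.995, 0.005)

# Stage 2: a hypothetical terrorist test (99% sensitivity, 1% false positives)
# applied at the post's implied base rate. These test rates are invented.
base_rate = 0.953 / 13_000
share_of_alarms_that_are_real = positive_predictive_value(base_rate, 0.99, 0.01)
share_of_effort_wasted = 1.0 - share_of_alarms_that_are_real

print(f"flagged as Muslim and actually Muslim: {share_flagged_who_are_muslim:.1%}")
print(f"alarms that are false positives:       {share_of_effort_wasted:.1%}")
```

Under these assumptions only about 62% of the people flagged as Muslim actually are, and more than 99% of the second test's alarms are false.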

Sure, you would waste even more if you relied on purely random sampling of the overall population. And if you wanted to catch specifically underwear bombers, you might have even more luck profiling Nigerian passengers, since we all know that 100% of underwear bombers are Nigerian. Or look at people whose last names start with ‘A’: 73% of the 9/11 hijackers’ last names did, but only 2.8% of the top 1000 US surnames begin with ‘A.’ There are all sorts of great techniques available to the investigator with no understanding of statistics!

Fortunately investigating people at random is not the only strategy at our disposal for identifying terrorists (in fact, in any other context I think most sensible people would suggest that investigating people at random has no place at all in a democratic society of laws)! Devoting those resources to following up on actual, credible tips, intelligence, and evidence, without worrying overly much about the religion or ethnicity of the tipster or target, as well as investigating *already identified* terrorists and groups we haven’t caught/shut down yet, seems like a more productive use of taxpayer dollars to me.

Andi: We didn’t have ‘a 9/11’ in the 20 years before 2001 or in the 10 years after. This argument only works mathematically if 9/11s happen more than once in 30 years. In any case, I think the point was that this is an obviously ridiculous strategy, not that we should be racially profiling for white male serial killers *rather than* Muslim terrorists (you could also do both, if you wanted to be extra stupid).

See here also for a nice worked example: http://en.wikipedia.org/wiki/Base_rate_fallacy

Jon, TSA uses random testing. It will pull out 3-yr old children and 90-year old grandmothers at random and give them the executive treatment because it refuses to use any common sense.

“Devoting those resources to following up on actual, credible tips, intelligence, and evidence, without worrying overly much about the religion or ethnicity of the tipster or target”

Muslims in the Middle East take advantage of that strategy by offering “tips” on their enemies to get them arrested. It takes a long time and a lot of money to verify tips, intel and other evidence. And of course that is a necessary strategy.

But why not use a statistical model, too? Why do people find statistical methods so vital to aid and other areas of life but refuse to use them to protect lives? Law enforcement agencies around the country have reduced crime significantly by employing statistical models that help them predict where and when crimes will happen. Are they 100% accurate models? Far from it, but they are useful and have been a great addition to other methods.

No one is arguing that a model consisting solely of religion and ethnicity should be used. That would be incredibly stupid. But do those variables have no place at all in a model of terrorism?

“UPDATE 11am response to commentator: is there an association between inability to understand Bayes’ theorem with ethnic prejudice?”

No. Bayes’ theorem is a red herring. It’s totally irrelevant to the discussion.

“UPDATE 3:30PM explaining risk of false positives to congressmen and commentators”

You already made your point: false positives will create armies of new terrorists. But you refused to respond to my point that the evidence shows no correlation whatsoever between persecution and terrorism.

It’s amazing how quickly people who promote evidence based aid will abandon evidence for ideology when the evidence fails to support them.

@ McKinney — You raise some really valid points and questions. Personally, I found this article to be rather childish and you have provided some well thought out criticisms.

Still, I’d like you to know that I find issue with the following comment:

“Why don’t the testimonies of al Qaeda members matter to anyone? They tell us why they became terrorists and none of them mention persecution by Americans or Christians, poverty or any of the standard leftist ideas as to why young men become terrorists. Why is it the left thinks they understand Muslim terrorists better than Muslim terrorists understand themselves?”

Granted there’s a lot of poorly thought through gibes on the internet from leftist opinion holders, but they’re not necessarily the standard view of “leftists” (it may just seem that way because they’re the loudest). Erroneous logic and exaggerated statements are pretty much standard to all sides of the political spectrum.

I appreciate your enthusiasm at setting this particular conversation straight but please don’t undermine your point with uncalled-for generalizations. My community and I tend to lean left on quite a few issues but few of us can make any claims about how much we understand Muslim terrorists.

Roger McKinney,

This seems like an important issue to you, but I have no idea what you’re recommending.

Differential screening rates of Muslims and non-Muslims at airports? Differential “intensive” screenings at airports? Differential random investigations of various kinds across all areas of society? Surely not that we should only screen Muslims (whatever kind of screening you favor)?

The comments above should have convinced you that the difference in probability by religion of being a terrorist is small, even if you personally don’t think it’s negligible.

Low-probability, random screening imposes huge costs in return for few benefits, whether or not it is aided by differential targeting. Presumably you do care about these costs: you don’t think we should “screen” everybody (which we could no longer call “screening”), or you wouldn’t care about using statistical models at all. What level of costs would be too costly for you? If you care about the costs at all, you have to agree that there are tons of screening models that would be way too costly, whether in terms of budgetary costs, monetary costs incurred by the screened populations, or non-monetary costs like the injustice of being singled out for your religion, or the erosion of the First Amendment’s religious protection.

I don’t really know what you’re recommending; surely you see these points? I mean, the “difficulty of predicting rare events” is exactly the point of the silly probability exercise you’ve taken issue with.

Best,

Joe

…presumption of innocence, unreasonable search and seizure…

Jane, I apologize to people like yourself. But don’t you think you are the exception?

Joe, TSA already employs random testing, which is stupid and based on the assumption that every US citizen has an equal probability of being a terrorist.

A good model of predicting who might be a terrorist would start with the fact that a person is a Muslim. From there, you would add variables such as his nation of origin because some countries are hotbeds for radical Islam and some aren’t, Indonesia for example. Age is a factor as well.

Then refine the model with attributes that terror experts know about that the rest of us don’t.

This isn’t rocket surgery! Police departments have developed similar models for the past 30 years. They aren’t perfect and don’t eliminate crime 100%. But they are good tools and very useful.

Roger,

Then the alternative you’re interested in is differential screening by TSA in airports.

A) TSA should reduce the amount of screening it does on 90-yr old grandmothers and 3-yr old children.

B) TSA should intensify the amount of screening it does of “high-risk” populations.

C) TSA should do both.

If you want to give 3-yr old children and 90-yr old grandmothers a free pass, I won’t argue with that. Let’s arbitrarily focus on adult males ages 16-55. Does anyone in this population get a free pass?

Point 1: In this context, any model’s predictive power and accuracy will be low to the point of uselessness.

Whether a given profile has a 0.000001 probability of being a terrorist, or even two or three or ten times that probability, I think is irrelevant for law enforcement purposes.

Point 2: The ability to test whether or not someone is a terrorist further reduces the value of any screening (differential or fully random).

Point 3: Combined with my belief that there is no improvement in safety due to differential screening in this context, I think it’s unfair to single out groups of people to receive more intensive screening.

Now, I can see the counter-argument that on an individual basis, in the particular context of airport screening, maybe this is not a huge cost to impose on people. But if we really believe that, then we should be perfectly happy to just screen everybody, right?

Basically, I think we should always be concerned and skeptical about asking the government to discriminate between groups of people according to criteria it chooses, both because it is unjust and because it may have unintended consequences. Yes, sometimes it might still be the best choice available. In this case, it also strikes me as being basically useless.

Rocket surgery or not.

Joe, I didn’t want to limit the discussion to just airport security, but doing so I would say that such a model would not be worthless. If you had ever done any modeling of rare events, as I have, you would know it’s not worthless. As I wrote, police departments have used similar models for decades; marketing companies use them; the FBI uses them in money laundering and bank fraud cases; they’re definitely not useless.

As I mentioned, other variables would be necessary with which I’m not familiar, but Homeland Security is. TSA, for example, would need to look at the type of ticket a person buys (one-way or round trip) and whether they have luggage.

Israeli airport security is very good and they use this type of profiling without harassing travelers.

But so what if they do hassle someone who is innocent? If they’re decent people there will be no negative consequences. After all, who gets more hassle in this country than Mexican immigrants? They don’t turn to terrorism.

PS, the fear of reprisals from innocent Muslims turning terrorist is the wrong approach because it surrenders to radical Muslims.

What you and Easterly are saying is that we should live in terror of offending any Muslims because they will kill us if we do. We should be more afraid of offending Muslims than any other group of people because they are murderers.

I don’t believe that of the Muslim people. The vast majority are good law-abiding people who don’t mind a little inconvenience for the sake of better security from terrorists.

If I were Muslim, I would be highly offended at the suggestion that Muslims turn to terror when inconvenienced by the government doing its job to protect citizens.

Thanks for your reply, Roger. I still don’t really know what you’re recommending. I understand you want to put all possible variables into a statistical model, but I don’t really understand what you want to do with the results even in a particular context like airport screening. When do you wave someone through, when do you pull someone out of a line, and then what do you do to them?

But I’ll ask a different question: given your experience modeling rare events, can you give me an idea of what kinds of numbers you’d consider useful and likely from a model like this? Since you mentioned logistic regression, at a rough guess I would have in mind predicted probabilities that range between 10^-6 and 10^-5 for everyone whose name isn’t specifically on some watch list. Would you consider that useful? Or am I underestimating the accuracy of the model?
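To give a feel for that range on the scale a logistic regression works in, here is a minimal sketch. It assumes nothing about any real model: the two endpoints are the rough guesses from the question above, and the helper names are illustrative.

```python
import math


def logit(p: float) -> float:
    """Log-odds of a probability, the scale on which logistic regression is linear."""
    return math.log(p / (1.0 - p))


def inverse_logit(x: float) -> float:
    """Map log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


low, high = 1e-6, 1e-5           # rough guesses from the comment, not model output
gap_in_log_odds = logit(high) - logit(low)
odds_ratio = math.exp(gap_in_log_odds)

print(f"log-odds at low end:  {logit(low):.2f}")
print(f"log-odds at high end: {logit(high):.2f}")
print(f"odds ratio between the two ends: {odds_ratio:.2f}")
```

Under these assumptions, moving a person from the bottom of that range to the top takes evidence that multiplies their odds by about ten, and the predicted probability is still only one in 100,000.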

Hi all,

While I agree with the general thrust of the article, the quoted 0.0007% is not the only relevant number here. We should also consider

P(A|B) / P(A| not B),

the ratio between the probability that a Muslim is a terrorist and the probability that a non-Muslim is a terrorist. Using Bayes’ theorem and numbers quoted above, we find that this is on the order of 1000.

I won’t comment on how one should develop policy from these facts.

Thanks Eric. I agree with your numbers.

But I think the absolute difference in probability is the most relevant fact. The absolute difference P(Terrorist | Muslim) – P(Terrorist | Not Muslim) is at most 0.00007, or 7×10^-5.

In most contexts I’d still say that’s a useless differential for policy purposes. If, in addition, you account for the imprecision of whatever test you’re actually using to determine whether or not someone is a terrorist, this differential would become even smaller.
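Both the relative and the absolute comparison can be reproduced from figures quoted in the thread. This is a minimal sketch under stated assumptions: the 95.3% figure and the 1-in-13,000 ratio come from the post, and the 0.8% Muslim share of the population comes from an earlier comment. The variable names are illustrative.

```python
# All three inputs are figures quoted in the post or the comments above.
p_b_given_a = 0.953        # P(Muslim | terrorist), from the post
p_b = 0.008                # P(Muslim), the "0.8% or so" from an earlier comment
prior_ratio = 1 / 13_000   # P(A)/P(B), from the post

p_a = prior_ratio * p_b                                  # implied P(terrorist)
p_a_given_b = p_b_given_a * p_a / p_b                    # Bayes' theorem
p_a_given_not_b = (1.0 - p_b_given_a) * p_a / (1.0 - p_b)

relative_risk = p_a_given_b / p_a_given_not_b
absolute_difference = p_a_given_b - p_a_given_not_b

print(f"P(A|B) / P(A|not B) = {relative_risk:.0f}")
print(f"P(A|B) - P(A|not B) = {absolute_difference:.2e}")
```

With these inputs the ratio comes out in the low thousands and the absolute difference at roughly 7×10^-5, consistent with both comments.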

