
What Medical Tests Should Teach Us about the NSA Surveillance Program


The National Security Agency (NSA) is collecting massive amounts of information about people in the United States and throughout the world. From details about every phone call to collections of people’s activities online, the US government is creating a monumental amount of data on each individual person in existence. The balance between privacy and security is always difficult, and the ethics of the NSA’s practices will be debated for the near future. However, as a physician I worry about something just as difficult. Excellent reasons exist why I as a physician do not order every test on every patient. I know that amassing too much data can be harmful.

Patients request excessive amounts of testing on a routine basis. Maybe a friend was recently diagnosed with an illness, or the press is reporting that a celebrity just died from a rare disease. Commercials on television have reminded all of us to “discuss with your doctor” and thus add labels to our list of medical problems. It would seem that pages upon pages of lab results and fancy digital pictures should rule out anything dangerous and reassure us all. Although many patients will be disappointed, the wrong answer is to fire off a huge workup. The incorrect answer is to collect needless data.

Understanding the dangers of over-testing requires a little background knowledge. Allow me to use a hypothetical example. A very rare disease called TT has been discovered, and it affects about 0.01% of the population. Suppose we test 100,000 people, about ten of whom actually have TT. If the test is 90% accurate, for every nine people who are correctly diagnosed, one person will be missed. More importantly, roughly 10,000 healthy people will be incorrectly told they might have the disease! Remember that this is one single test. If we perform multiple tests, the incorrect results multiply.
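The arithmetic behind this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a screened group of 100,000 people (the size at which the figures above work out), and "90% accurate" read as both catching 90% of true cases and wrongly flagging 10% of healthy people. The function and parameter names are hypothetical.

```python
def screening_counts(population, prevalence, sensitivity, false_positive_rate):
    """Return expected (true_positives, false_negatives, false_positives)."""
    sick = population * prevalence              # people who really have the disease
    healthy = population - sick                 # everyone else
    true_positives = sick * sensitivity         # sick people the test catches
    false_negatives = sick - true_positives     # sick people the test misses
    false_positives = healthy * false_positive_rate  # healthy people wrongly flagged
    return true_positives, false_negatives, false_positives


def positive_predictive_value(true_positives, false_positives):
    """Fraction of all positive results that belong to truly sick people."""
    return true_positives / (true_positives + false_positives)


# Assumed inputs: 100,000 people screened, 0.01% prevalence,
# 90% sensitivity, 10% false-positive rate.
tp, fn, fp = screening_counts(100_000, 0.0001, 0.9, 0.1)
ppv = positive_predictive_value(tp, fp)

print(f"correctly diagnosed: {tp:.0f}")
print(f"missed:              {fn:.0f}")
print(f"falsely flagged:     {fp:.0f}")
print(f"chance a positive result is real: {ppv:.4%}")
```

Under these assumptions, only about one positive result in a thousand belongs to someone who really has TT.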

In the world of medicine, the false-positive effects of over-testing cause great expense through specialist referrals and additional testing. The workup of these abnormal tests harms patients directly through unneeded biopsies, radiation required for radiographs, and medication side effects. What does this have to do with the NSA?

As a physician, I know that collecting too much information can often be more dangerous than not collecting enough. Let us assume that the number of patients with our fictional TT disease is actually the number of terrorists in the United States. If NSA’s lab test of searching phone metadata is 90% accurate, 10,000 people will be incorrectly labelled for every nine people correctly caught. The real results are even worse, considering nobody believes that 0.01% of the US population are terrorists or that any data mining technique is really 90% accurate.

Remember that the number of people incorrectly labelled increases with each test. So when the NSA searches Internet patterns, that is another huge group of people incorrectly “diagnosed” as terrorists. Searching travel patterns yields another. The terrorist database list grows and grows.
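How the list grows with each additional test can be sketched as follows. The sketch assumes the tests err independently, that each one wrongly flags 10% of innocent people, and that anyone flagged by any single test is added to the list. None of these assumptions come from real NSA figures, and the function name is hypothetical.

```python
def expected_flagged_by_any(innocent_people, false_positive_rate, num_tests):
    """Expected number of innocent people flagged by at least one test.

    Assumes each test errs independently with the same false-positive rate.
    """
    # Probability that one innocent person passes every test cleanly.
    clean_on_all = (1 - false_positive_rate) ** num_tests
    return innocent_people * (1 - clean_on_all)


# Assumed inputs: 99,990 innocent people out of 100,000 screened,
# each test with a 10% false-positive rate.
for tests in range(1, 4):
    flagged = expected_flagged_by_any(99_990, 0.1, tests)
    print(f"{tests} test(s): about {flagged:,.0f} innocent people flagged")
```

If a person instead had to fail every test before being listed, the count would shrink with each test, so the result depends heavily on how the flags are combined.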

We have proof that this terrorism overdiagnosis is occurring. The press frequently reports how the terrorist “no-fly” list has erred by preventing an innocent child or a government official from travelling. For those affected, the cost is great: ruined travel plans, wasted time, and the effort of trying to clear their reputations. Is it worth harming 10,000 people or 100,000 people to prevent one terrorist from flying?

As a physician I know to avoid ordering too many labs and tests. The collection of excessive data causes physical and monetary harm to my patients. We all need to realize that the over-testing and overdiagnosis of terrorism will eventually hurt us all.

 

About David Kirk

David Kirk is one of the original founders of tech-recipes and is currently serving as editor-in-chief. Not only has he been crafting tutorials for over ten years, but in his other life he also enjoys taking care of critically ill patients as an ICU physician.

The Conversation


  • Cellar

    Now consider that the many and varied “terrorism tests” have a positive rate of… er… somewhere well below half. In many cases you’re lucky to reach 10%, if that far. Note that nobody bothers to really measure this at all.

    This means that the security theatre isn’t just obviously security theatre, it’s mathematically security theatre as well. That is, all that harassment is a complete tossup, it’s pure thuggery, anything it finds is entirely random, it cannot possibly be viewed as effective for the stated purpose. And thus all that money wasted is much, much worse than useless.

  • VennData

    So one time a test has a failure rate. Then what happens when you redo it? Assuming you do do it on every single person in the US? Huh?

    The NSA takes people who have a suspicion of being connected to a criminal/terrorist/whatever (i.e., they have already passed the first test) and then follows up with more tests. If they do not find that this person is involved, then they can stop. Why would they waste resources?

    Furthermore, this is utter hyperbole: “The National Security Agency (NSA) is collecting massive amounts of information about people in the United States and throughout the world. From details about every phone call to collections of people’s activities online–the US government is creating a monumental amount of data on each individual person in existence.”

    Do you really have a background in science?

  • guest

    I think your data model guys are doing it wrong. If a disease is rare (0.1% affected) then a model with a 90% accuracy is significantly worse than simply predicting false all the time (99.9% accuracy!).

    They should probably be using something like an F1 score to see how their model is doing. The only case where more data is not necessarily better is when your model has high bias to begin with and won’t get much better no matter how much data you throw at it.

    That’s not to say you should be running labs for everything all the time. Proper data modeling would tell you what combination of labs and results are most effective at identifying a particular disease correctly.

    If the NSA is related to this in any way, it’s that they could probably teach most data analysts a thing or two about data science :)

    • h1

      > the 90% accuracy also includes “negative” tests so that 10% of the ones that do not have the condition are misdiagnosed as having the condition.

    • DoingMoreWithLes

      > Been there. They aren’t on the leading edge of anything and haven’t been for years. This overreach is because they can’t do their primary SIGINT mission.

  • Steve

    But if you feed back the data into medical research, the accuracy of the tests will improve if researchers have access to more data.

  • laughingskeptic

    David Kirk misses by a mile in comparing communications data and medical data. Most importantly, he fails to recognize the transient nature of comms data. If it is not collected, it is gone. Just imagine if the police learned of a kidnapping 3 days after the fact and wanted to know where the victim’s phone went. They can get this info for a limited time directly from the phone company. Now imagine you are working in intelligence and it has taken you 2 years to identify an enemy agent. The only way you will be able to know what this person was doing 2 years earlier is if you were collecting that information all along. The phone company will long ago have deleted that data because they no longer needed it.

    • DoingMoreWithLes

      > Phone companies are required to retain data for 5 years. And once the Utah center is up and running it will be retained forever.

  • Kent

    Unfortunately, you conflated miss rate with false positive rate. The difference between the two is how we judge the goodness of the test. But your overall result is accurate. The lower the density of positives in the population, the more false positives will dominate the test results. Keep working it — this is a difficult issue to explain to non-tech types.

  • DoingMoreWithLes

    There are millions of laws on the books, so many that nobody could possibly follow them all. So all we need to do is find out what you did wrong. Remember, ignorance of the law is no excuse. Now all you have to do is get on the wrong list and we will find something to bag you with. I know, you’ve got nothing to hide, so I guess you are ok with all those for-profit mercenaries known as “contractors” having all your personal data.

  • Bobby Anon

    What you are missing is that the surveillance is not intended to catch ‘terrorists’.
    Wholesale surveillance of a domestic population is used for one thing: internal repression.

  • Wesley Parish

    He’s right, you know. Most scientists don’t give a fig for off-topic information. If I’m studying galactic formation, details about motor-neuron disease … are something best left for motor-neuron disease specialists to deal with. If I’m studying coral-reef formation, then geological details such as land-slips had best be relevant to coral-reef formation or they’ll get ignored.

    If, on the other hand, I’m examining data on “terrorists”, however defined, then obviously Mr Random Citizen’s discussion of which sushi restaurant to go for lunch today, is so obviously vital that the NSA just can’t stop itself listening. “Who knows, if we raid the sushi restaurant we might be able to intimidate them into giving us free sushi!”

    I mean, the TSA found itself unable to stop harassing attractive female passengers … obviously because some girl with a nice figure’s so obviously more of a suspicious character than a dowdy middle-aged matron … “Best check if that rack’s real or is she packing some serious heat up there …”

  • SirBoss

    I know all too well that the good doctor is right. But the best example is literally today, 7/15/2013. At this time we have a civil threat of riots and literally many threats of terrorism regarding the George Zimmerman trial. Yet at this time the police and NSA are incapable of even being aware of this. I could say more but it would just be piling on.