Book Rounds: Statistical Smarts


Book Rounds, Professional Skills Development / Tuesday, July 2nd, 2019

How to Lie with Statistics

Darrell Huff

A humorous, insightful and easy read outlining the foibles of statistics and our interpretation of them. Highly recommended as a refresher for those who feel like they’ve lost touch with statistics, or those who would appreciate a reasonable reference to offer owners who wish to be well-informed. The author uses many real-life examples of statistics gone wrong, which improve understanding.

The title is rather tongue in cheek, as the author points out that our infatuation with precision washes right over our common sense just as often as the misuse or misinterpretation of statistics contributes to problems. The author sums up his purpose with “The crooks already know these tricks; honest men must learn them in self-defense.”

The common manipulations he suggests we be wary of are as follows: 

Sample Bias: Even the most honest of researchers can be blindsided by unintentional sample bias. Be especially wary of self-selecting or self-reporting samples. Conclusions drawn from them apply only to the population actually sampled, and such samples rarely provide a “true” representative sample. Sample size can also bias the results: small samples rarely reflect the full distribution of values in the population. Check the number and source of data points.
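
The point about small samples can be seen in a quick simulation. This is only a sketch under made-up assumptions: the population (20,000 values around 100) and the sample sizes of 5 and 500 are invented for illustration and do not come from the book.

```python
import random
from statistics import fmean, pstdev

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population: 20,000 measurements centred on 100.
population = [rng.gauss(100, 15) for _ in range(20_000)]

def spread_of_sample_means(sample_size, n_samples=1000):
    """Draw many samples of one size and report how much their means vary."""
    means = [fmean(rng.sample(population, sample_size)) for _ in range(n_samples)]
    return pstdev(means)

small = spread_of_sample_means(5)
large = spread_of_sample_means(500)

print(f"spread of means, samples of 5:   {small:.2f}")
print(f"spread of means, samples of 500: {large:.2f}")
```

The means of the tiny samples bounce around far more than the means of the large ones, so any single small sample can land a long way from the truth. Note that this only covers sample size: a large sample drawn from the wrong group is still biased.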

Average: Average can actually be a very non-specific term, and may be used to refer to the mean, median or mode. Frequently, the mean is used. This may be helpful in some cases, but given it is the sum of all values divided by the number of values, it may not give a clear picture. A small number of samples may lie at the extreme of values and skew the mean towards that end of the range. The median is the value at which half of the values lie above, and half below. This often gives a clearer picture of the typical value. The mode is the value of most frequent occurrence, so it can still skew the picture, but towards frequency rather than range. Understanding which term is being used for average gives you a bit more information to understand the bias. The author warns that “an unqualified ‘average’ may be … meaningless”.
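
To see how far apart the three “averages” can sit, here is a small sketch using Python’s built-in statistics module. The salary figures are invented for illustration: nine modest earners and one executive.

```python
from statistics import mean, median, mode

# Hypothetical monthly salaries: nine modest earners and one executive.
salaries = [3000, 3000, 3000, 3200, 3400, 3500, 3800, 4000, 4100, 50000]

avg_mean = mean(salaries)      # sum of all values divided by how many there are
avg_median = median(salaries)  # middle value: half lie above, half below
avg_mode = mode(salaries)      # most frequently occurring value

print(avg_mean)    # 8100
print(avg_median)  # 3450.0
print(avg_mode)    # 3000
```

All three numbers could honestly be reported as the “average” salary, yet the mean is more than double what nine of the ten people actually earn.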



Data presentation: The data may be presented in a variety of fashions that warp our understanding. Graphs or visual representations can be manipulated in ways that invite us to subconsciously draw conclusions that may not be correct. A graph should be labeled on both axes. Check these labels to make sure they make sense. Do you know what the numbers are supposed to be, or is there just a set of numbers strung up the side of the graph? The data may be honestly and fairly transformed (say, placed on a logarithmic scale), yet the transformation still changes how large the differences between groups appear. Alternatively, the scale can be manipulated to omit chunks of numbers, which again toys with the visual perception of the data. Always be suspicious of visual depictions that aren’t labeled.

https://imgs.xkcd.com/comics/normal_distribution.png

Correlation vs Causation: A very common error and tendency is to identify correlation and presume causation. Correlation only indicates that two factors tend to vary together. It is entirely possible (and frequently probable) that this is merely a coincidence. Sensationalism and natural human tendencies will presume that this relationship indicates that one caused the other. Even if causation is present, the statistics indicating correlation don’t clarify which factor is the cause and which is the effect. In other words, a chicken or the egg conundrum should be considered if you believe that the reported correlation indicates causation. Finally, correlation may indicate a real relationship, but the relationship may be the consequence of an entirely different factor. For example, every time your child gets a cold, they have a runny nose and a cough. That doesn’t mean the cough causes the runny nose, nor that the runny nose causes the cough. Rather, the cold virus causes both events to occur!
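
The cold example can be mimicked with a toy simulation. Everything here is an assumption made for illustration: the symptom scores, the 30% chance of a cold and the hand-written pearson helper are not from the book. Neither symptom influences the other in this model; the cold drives both.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical model: a cold raises both symptom scores by 5;
# apart from that, each score is independent random noise.
has_cold = [rng.random() < 0.3 for _ in range(10_000)]
cough = [5 * cold + rng.gauss(0, 1) for cold in has_cold]
runny_nose = [5 * cold + rng.gauss(0, 1) for cold in has_cold]

overall = pearson(cough, runny_nose)

# Look only at children who have a cold, so the common cause is held fixed.
sick = [(c, r) for c, r, cold in zip(cough, runny_nose, has_cold) if cold]
within_sick = pearson([c for c, _ in sick], [r for _, r in sick])

print(f"all children:         {overall:.2f}")
print(f"children with a cold: {within_sick:.2f}")
```

Across all children the two symptoms are strongly correlated, but among children who already have a cold the correlation all but disappears. The link between cough and runny nose came entirely from the shared cause.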

p-value: This is a critical evaluator of the data. It gives the probability of seeing a difference at least as large as the one observed if chance alone were at work (that is, if there were no true difference). A generally accepted threshold for significance is p<0.05. While generally accepted, this threshold is somewhat arbitrary. A researcher may choose to set significance at p<0.01, partly to impress upon others their integrity and stringent evaluation methods, and partly to show that findings like theirs would be particularly unlikely to arise by random chance. There is nothing to stop a researcher from setting their significance threshold at 0.1 or even 0.5, other than perhaps peer ridicule. Be suspicious of data presented or implied to be significant which is not accompanied by a p-value.
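
One way to get a feel for what a p-value measures is a permutation test, which is a swapped-in technique and not something the book describes. The sketch below shuffles the group labels many times and counts how often chance alone produces a difference at least as large as the real one. The function name and the two sets of measurements are hypothetical.

```python
import random
from statistics import fmean

def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Estimate how often shuffled labels give a gap in means
    at least as large as the gap actually observed."""
    rng = random.Random(seed)
    observed = abs(fmean(group_a) - fmean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # pretend the group labels mean nothing
        diff = abs(fmean(pooled[:n_a]) - fmean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    # Add one to top and bottom so the estimate is never exactly zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)

# Hypothetical measurements from two groups.
treated = [5.1, 4.9, 5.6, 5.8, 6.0, 5.4, 5.2, 5.9]
control = [4.2, 4.4, 4.0, 4.8, 4.5, 4.1, 4.6, 4.3]

p_clear = permutation_p_value(treated, control)
p_same = permutation_p_value(treated, list(treated))

print(f"treated vs control: p = {p_clear:.4f}")
print(f"treated vs itself:  p = {p_same:.4f}")
```

When the groups barely overlap, almost no shuffle matches the real gap, so the p-value is tiny. When a group is compared with a copy of itself, every shuffle does at least as well, so the p-value is as large as it can be.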

https://www.facebook.com/sassyeconometrics/posts/i-hope-p-value-jokes-are-still-funny/1953677048179439/

Numbers: Be cautious of a false sense of security when numbers are thrown at you. We tend to assume greater validity when precise numbers are given, but that can be a false assumption. Numeric differences may be insignificant when associated with subjective material. Take intelligence tests, for example: can you really classify someone as smarter than another individual based on a two point difference in scores? Percentages can also manipulate our understanding of data, and can play some pretty underhanded tricks on our brains. Beware of data presented as, or compared between, percentages. A 1% increase in salary for a company’s employees could be approximately a hundred dollars a year (for an employee who grosses a thousand a month) or ten thousand (for an employee who grosses one million per year). The CEO and the employee are going to have very different feelings about that 1% increase.
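
The salary arithmetic above is easy to check. The helper name raise_amount is made up for this sketch; the salaries are the ones from the example.

```python
def raise_amount(annual_salary, percent):
    """Dollar value of a percentage raise on a yearly salary."""
    return annual_salary * percent / 100

employee_raise = raise_amount(12 * 1_000, 1)  # a thousand a month
ceo_raise = raise_amount(1_000_000, 1)        # one million per year

print(employee_raise)  # 120.0
print(ceo_raise)       # 10000.0
```

Same percentage, but one raise is more than eighty times the size of the other.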

Indirect conclusions: Drawing a conclusion based on inference from the information is a very natural human tendency, but can be very risky, and very false. These indirect conclusions may be made for you (often by non-professionals interpreting data, such as news reporters or media sources), or be designed to prey on your humanness, by allowing you to unwittingly do the dirty work. For instance, a hand soap may claim to reduce bacteria by 99% (with proven studies). You think, “That’s fantastic. Got to be better than the hand soap that makes no claims about effectiveness! I’ll buy this (for 80 cents more)!” And yet, no one studied (or reported) if that reduction in bacteria actually changes the incidence of disease. Your super computer of a brain subconsciously drew that inferred conclusion. That extra 80 cents may not have accomplished anything other than relieving you of some spare change. Advertisers love this dirty little trick, and it can be employed even by people we think we should trust.

https://www.forbes.com/sites/erikaandersen/2012/03/23/true-fact-the-lack-of-pirates-is-causing-global-warming/#7fcf22013a67

As the author points out, “despite its mathematical base, statistics is as much an art as it is a science.” Often, there are multiple statistical methods that may be appropriate, and the statistician must subjectively select which they feel reflects the data best. Nevertheless, it is prudent to ask yourself for every statistic you face: “Does it make sense?” Do not be hoodwinked by the sciency feel of numbers into abandoning your common sense! Those with unscrupulous biases are hoping you’ll do just that. Now we know their dirty, lying tricks, though, and are prepared! Go forth, and be skeptical!
