Everybody Lies
Blending the informed analysis of The Signal and the Noise with the instructive iconoclasm of Think Like a Freak, a fascinating, illuminating, and witty look at what the vast amounts of information now instantly available to us reveal about ourselves and our world, provided we ask the right questions.
By the end of an average day in the early twenty-first century, human beings searching the internet will amass eight trillion gigabytes of data. This staggering amount of information, unprecedented in history, can tell us a great deal about who we are: the fears, desires, and behaviors that drive us, and the conscious and unconscious decisions we make. From the profound to the mundane, we can gain astonishing knowledge about the human psyche that, less than twenty years ago, seemed unfathomable.
Everybody Lies offers fascinating, surprising, and sometimes laugh-out-loud insights into everything from economics to ethics to sports to race to sex, gender, and more, all drawn from the world of big data. What percentage of white voters didn’t vote for Barack Obama because he’s black? Does where you go to school affect how successful you are in life? Do parents secretly favor boy children over girls? Do violent films affect the crime rate? Can you beat the stock market? How regularly do we lie about our sex lives and who’s more self-conscious about sex, men or women?
Investigating these questions and a host of others, Seth Stephens-Davidowitz offers revelations that can help us understand ourselves and our lives better. Drawing on studies and experiments on how we really live and think, he demonstrates in fascinating and often funny ways the extent to which all the world is indeed a lab.
With conclusions ranging from strange-but-true to thought-provoking to disturbing, he explores the power of this digital truth serum and its deeper potential, revealing biases deeply embedded within us, information we can use to change our culture, and the questions we’re afraid to ask that might be essential to our health, both emotional and physical. All of us are touched by big data every day, and its influence is multiplying. Everybody Lies challenges us to think differently about how we see it and the world.

Everybody Lies Details

Title: Everybody Lies
Author: Seth Stephens-Davidowitz
Language: English
Release: May 9th, 2017
Publisher: Dey Street Books
ISBN-13: 9780062390851
Genre: Nonfiction, Science, Psychology, Economics, Business, Technology, Sociology, Audiobook, Social Science, Politics

Everybody Lies Review

  • Will Byrnes
    January 1, 1970
    …people’s search for information is, in itself, information. When and where they search for facts, quotes, jokes, places, persons, things, or help, it turns out, can tell us a lot more about what they really think, really desire, really fear, and really do than anyone might have guessed. This is especially true since people sometimes don’t so much query Google as confide in it: “I hate my boss.” “I am drunk.” “My dad hit me.”
    There’s lies, damned lies and then there are statistics. One must wonder. Do the lies get bigger as the datasets grow? Seth Stephens-Davidowitz posits that the availability of vast sums of new data not only allows researchers to make better predictions, but offers them never-before-available tools that can offer insight that direct questioning never could. We have seen steps up of this type before. Malcolm Gladwell has made a career of such, with Blink, Outliers, and The Tipping Point. Freakonomics is the one I would expect most folks would know. Nate Silver put his data expertise into The Signal and the Noise. All these looks at data and how we interpret it rely on the analyst, regardless, pretty much, of the data. While the same might be true of Stephens-Davidowitz’s approach, he focuses on the availability of materials that have not been there in the past. The smarts that must be applied to get the most interesting results can now be applied to new oceans of data.
    It is more possible than it has ever been to draw inferences and actually test them out. In addition to the volume of data that is now available, there is the sort. The author looks at Google and FB data for evidence of underlying realities. Surveys can sometimes offer inaccurate outcomes when the people being queried do not provide honest answers. Are you a racist? Yes/No. But one can look at what people enter into Google to get a sense of possible racism by geographic area. The everyday act of typing a word or phrase into a compact rectangular white box leaves a small trace of truth that, when multiplied by millions, eventually reveals profound realities. Looking for queries on jokes involving the N-Word, for example, turns out to yield a telling portrait of anti-black sentiment, which also correlates with lower black life expectancy (and with pro-Trump vote totals). We are treated to looks into a variety of research subjects, from picking the ponies, to seeing what really interests/concerns people sexually, looking for patterns of child abuse, selecting the best wine, using the texts of a vast number of books and movie scripts to come up with six simple plot structures.
    I thought the most interesting piece was on the use of associations, and provoking curiosity, rather than relying on overt statements to influence how people feel about a different group of people. Another was on using a data comparison of one’s (anonymous) medical information to others who share many characteristics to improve medical diagnoses.
    There are some areas in which it was not entirely persuasive that the methodology in question was tracking what was claimed. SS-D sees in searches of Pornhub, for example, what people really want and really do, not what they say they want and say they do. Really? I expect that what people check out on-line does not necessarily track with what might be of interest in real life.
    It would be like someone with an interest in mysteries being thought to have homicidal tendencies after searching for a variety of homicide related titles. Should a writer doing research into a dark subject like child pornography, human trafficking or cannibalism expect the heavy knock of the police on his/her door? Where is the line between an academic or titillation search and one made for planning?
    SS-D makes a point about there being a significant difference between searches that offer projections for groups or areas, and their inapplicability for predicting individual behavior, although that will not necessarily remain the case. In baseball, for example, the explosion of available information may very well be applied to specific players to diagnose and even correct flaws in technique, or recognize patterns that might expose underlying medical issues, or predict their arrival. The Big Data related here is much more macro, looking at group proclivities. Useful for spotting trends, measuring public sentiment, but in more detail than has been heretofore possible.
    And of course there is the impact of dark players. Those with the resources and motivation could manipulate the Big Data produced by Google and Facebook. Such players would not necessarily be limited to Russian cyber-spies and pranksters, but corporate and ideological players as well, like Robert Mercer. There could have been a bit more in here on those concerns.
    The book offers plenty of anecdotal bits that could have been lifted from any of the other data books noted at the top of this review. What one needs, ultimately, is smart, insightful analysis. Having all the data in the world (that means you, NSA) is merely a burden unless there is someone insightful enough to figure out the right questions to ask, and how to ask them.
    SS-D notes several Google (Trends, Ngrams, Correlate) services that might be familiar to folks doing actual research, but which were news to me.
    It might be useful to check out some of these, maybe even come up with meaningful queries to shed light on pressing, or even completely frivolous questions.
    Not all problems can be solved, or even examined, by the addition of ever more data. Sometimes, many times, the information that is available is perfectly sufficient to the task, but other factors prevent the joining together of its various pieces to create a meaningful whole. The now classic example is from 9/11, when an absence of coordination between the CIA and FBI resulted in suicide bombers who could have been foiled succeeding in their mission. Politics and the culture of nations and organizations figure into how data is used.
    So if everybody lies, is Seth Stephens-Davidowitz telling us the truth? I am sure there is a query one could construct that would look at diverse data sources, pull them all together and give us a fuller picture, but for now, we will have to make do with reading his book and articles, checking out his videos, applying the analytical tools already incorporated into our brains, and seeing if there is enough information there with which to come to a well-grounded conclusion. And that’s no lie.
    Review posted – May 5, 2017
    Publication date – May 9, 2017
    =============================
    EXTRA STUFF
    Links to the author’s personal, Twitter, and FB pages
    VIDEOS – SS-D speaking
    ----- Stanford Seminar - Insights with New Data: Using Google Search Data
    ----- Google Sex with Seth Stephens-Davidowitz - Arts & Ideas at the JCCSF
    ----- Big Data and the Social Sciences - The Julis-Rabinowitz Center for Public Policy and Finance
    The June 2017 National Geographic cover story has particular relevance to the treatment of actual truth in today's political environment. It is illuminating, if not exactly uplifting.
    - Why We Lie: The Science Behind Our Deceptive Ways - By Yudhijit Bhattacharjee
    July 12, 2017 - Washington Post - one of the very serious applications of big data - The investigation goes digital: Did someone point Russia to specific online targets? - by Philip Bump
    July 15, 2017 - One of the ways big data gets compromised is via automated dishonesty - Please Prove You’re Not a Robot by Tim Wu - Thanks to Henry B for letting us know about the article
  • Jessica
    January 1, 1970
    This book tries too hard to be Freakonomics. The first two parts are full of random examples of interesting but mostly pointless things that can be learned via Google search trends. However, a whole lot of assumptions are made off these bits of data that don't seem to have much basis in factual scientific methods of research. Unprofessional jokes are thrown in randomly. If you need a footnote to explain why a joke was not homophobic, maybe you should have just skipped the joke. And any book of less than 300 pages of text should not need to use the same example three times, especially when it's about how the author can't believe women are concerned about the smell of their vagina.
    The last section of the book explains the limitations big data holds and is really the most grounded section, the rest being almost hagiography. It would have done a lot to work the third section into the examples of the first two sections. It would have balanced out the praise and also would have done much to explain the flaws present in some of the examples included.
    Some cool facts buried in a lot of murky oddness.
    Disclaimer: I was given this book in a Goodreads giveaway.
  • Lori
    January 1, 1970
    When sociologists ask people if they waste food, people give the only correct answer. It's wrong to waste food. When sociologists survey the contents of the same people's garbage, they get a more accurate answer.
    Just imagine how much more information is available trolling through internet searches.
  • David
    January 1, 1970
    This is an engaging book about how big data can be used to improve our understanding of human behavior, thinking, emotions, and preference. The basic idea is that if you ask people about their behavior or their preferences in surveys, even anonymous surveys, they will often lie. People do not like to admit to low-brow preferences; racists do not want to admit to their prejudices, most people who watch pornography do not want to admit to it, and even voting is often misrepresented; some people who voted for Trump would not admit to it. But, by analyzing immense datasets from Google, public archives, social media, and the like, Seth Stephens-Davidowitz has been able to unearth a lot of fascinating answers to puzzling questions. For example, he is able to predict, through Google searches for various symptoms, who is likely to have early stages of pancreatic cancer. He can predict epidemic breakouts of some contagious diseases well before they are announced by the CDC (Centers for Disease Control and Prevention). He shows that the single factor that best correlates with voting for Trump is racism.
    Then there are the fun factoids, about the sorts of things that people search for most often on Google. Most commonly, the search "Is my son ..." is followed by "gifted", while the search "Is my daughter ..." is followed by "overweight". That tells us something about stereotypes in the way people think about their children.
    Interestingly, the release of a new violent movie in a city is correlated with a decrease in violent crime in that city. Perhaps the reason is that violent people who are watching the movie are not out on the streets, committing crimes.
    And here we get to the main problem with this sort of analysis. Undoubtedly, the research and analysis of big datasets is done correctly. However, once a surprising result is found, understanding the motivations behind the online activity is often subjective and open to interpretation. While this book is very careful about its underlying assumptions, it is a slippery road to getting the correct interpretations and explanations.
    This is an easy, well-paced book that should appeal to anybody who enjoys books like Freakonomics: A Rogue Economist Explores the Hidden Side of Everything.
  • Trish
    January 1, 1970
    Maybe everyone does lie. But they don’t lie all the time. Stephens-Davidowitz makes the good point that asking people directly doesn’t always, in fact may not often, yield true answers. People have their own reasons for answering pollsters untruthfully, but it is clear that this is a documented fact. People sometimes lie to pollsters.
    Stephens-Davidowitz was told by mentors and advisors not to consider Google searches worthwhile data, but the more he looked at it, the more he was convinced that Google searches contained the best data for determining what people are concerned about. He has uncovered some interesting trends that are not apparent through direct questioning because people are sometimes ashamed of their fears, feelings, prejudices, and predilections.
    I didn’t really like this book. Part of the reason is that I listened to it, and Stephens-Davidowitz gives charts, graphs, and data points that obviously cannot be represented in the audio version. These usually help me to grasp things easily and maybe bypass pages of material that is not as interesting to me. It wasn’t that his material was hard, it was that I oftentimes did not like what he was talking about. He had a tendency to focus on deviant behavior, e.g., sexual predators, abuse, porn, etc. One might make the argument that these behaviors are important to understand and therefore worth looking at. Possibly.
    However, if ‘everybody lies,’ one might make the argument that we do not have to look at deviance to find untruthfulness.
    What we discover is that to test Stephens-Davidowitz’s thesis that ‘everybody lies,’ we have to spend quite a lot of time with statistics and creating studies, or as he is wont to do, studying big data. Big data probably irons out discrepancies in the reasons for our Google searches, e.g., that it is not me that is interested in the herpes virus, it is my brother, because in the end it doesn’t matter why we did the search; what matters is that we did the search. Besides, maybe I’m lying about my brother having the virus, but my interest in the topic is not a lie.
    Stephens-Davidowitz has made a career so far out of the study of big data, showing us ways to slice and dice it so that it is useful to our view of the world. Only thing is, I am not as interested in what big data tells us as he is. He trained as an economist, and towards the end of the book he hit a couple of areas I did find more interesting, like the notion of regression discontinuity, a term used to describe a statistical tool created to measure the outcomes of people very close to some arbitrary cut-off.** S-D talks about using this tool on federal inmates, discovering that criminals treated more harshly committed more crimes upon their release. But S-D also studied students on either side of the admissions cut-off for the prestigious Stuyvesant High School: those who attended Stuyvesant did not perform significantly differently in later life from students who did not. Apparently Stephens-Davidowitz went into data science because of Freakonomics, the bestselling book by Steven D. Levitt and Stephen J. Dubner. He believes that many of the next generation of scientists in every field will be data scientists. I did finish the audiobook, which bears on another study he notes in the last pages: apparently few readers finish ‘treatises’ by economists.
    He believes this is his big contribution to our knowledge base, and there is no doubt his contrariness did highlight ways big data can be used effectively.
    If I may be so bold, I might be able to suggest a reason why many female readers may not be as interested in the material presented, or in Stephens-Davidowitz himself (he was/is apparently looking for a girlfriend). Stay away from the deviant sex stuff, Seth. It may interest you but I can guarantee that fewer women are going to find that appealing or reassuring conversation or reading material.
    An interesting corollary to this economist’s data view is the question of whether the truth matters, which is how I came to pick up this book. Recently on PBS’ The Third Rail with Ozy, Carlos Watson asked whether the truth matters. At first blush the answer seems obvious, and two sides debated this question. One side said of course truth matters…but most of us know one man’s truth to be another man’s lie. The other side said ‘everybody lies.’ It got me to thinking…I do think the two ways of coming to the notion of lying dovetail at some point, and one has to conclude that truth may not matter as much as we think. What matters is what we believe to be true.
    Finally, it appears Stephens-Davidowitz agrees to some degree with Cathy O'Neil, author of Weapons of Math Destruction, in that he agrees you best not let algorithms run without human tweaking and interference. The best outcomes are delivered when humans apply their particular observations and knowledge and expertise along with big data.
    ** S-D describes it this way: “Any time there is a precise number that divides people into two different groups, a discontinuity, economists can compare, or regress, the outcomes of people very very close to the cut off.”
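The regression discontinuity idea described in the review above (comparing outcomes of people just either side of an arbitrary cut-off) can be illustrated with a small simulation. This is only a sketch on made-up data, not the book's or any study's actual analysis: the function names `simulate`, `ols_intercept_at`, and `rd_estimate`, the cut-off of 70, and the bandwidth of 5 points are all hypothetical choices for illustration.

```python
import random
import statistics


def ols_intercept_at(xs, ys, x0):
    """Fit y = a + b*(x - x0) by least squares and return a, the fitted value at x0."""
    centered = [x - x0 for x in xs]
    mean_x = statistics.fmean(centered)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in centered)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(centered, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x


def rd_estimate(scores, outcomes, cutoff, bandwidth):
    """Estimate the jump in outcome at the cutoff from points within +/- bandwidth."""
    pairs = list(zip(scores, outcomes))
    below = [(s, y) for s, y in pairs if cutoff - bandwidth <= s < cutoff]
    above = [(s, y) for s, y in pairs if cutoff <= s <= cutoff + bandwidth]
    # Fit a separate line on each side and compare the two fitted values at the cutoff.
    fit_below = ols_intercept_at([s for s, _ in below], [y for _, y in below], cutoff)
    fit_above = ols_intercept_at([s for s, _ in above], [y for _, y in above], cutoff)
    return fit_above - fit_below


def simulate(true_effect, n=20000, cutoff=70.0, seed=0):
    """Synthetic data (an assumption): outcome rises with score, plus a jump for the admitted."""
    rng = random.Random(seed)
    scores, outcomes = [], []
    for _ in range(n):
        score = rng.uniform(40, 100)
        jump = true_effect if score >= cutoff else 0.0
        scores.append(score)
        outcomes.append(0.5 * score + jump + rng.gauss(0, 5))
    return scores, outcomes, cutoff


if __name__ == "__main__":
    for effect in (0.0, 5.0):
        scores, outcomes, cutoff = simulate(effect)
        print(effect, round(rd_estimate(scores, outcomes, cutoff, bandwidth=5.0), 2))
```

When the simulated admission effect is zero, the estimated jump should come out close to zero even though higher scorers do better overall, which is the shape of the Stuyvesant finding the review mentions. Fitting a line on each side, rather than comparing raw averages, keeps the underlying upward trend from being mistaken for a jump.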
  • Richard Derus
    January 1, 1970
    I have nothing unique to add to the conversation about this book. I think those most in need of reading it won't, and that's frustrating.
    If you've ever seen a number adduced to explain a trend, read this book. If you've ever asserted that a certain percentage of something was something/something else, read this book. If you've ever seen a politician quote a study and your innate bullshit filter clogged up, read this book.
    Really simple, high-level terms: READ. THIS. BOOK.
  • Atila Iamarino
    January 1, 1970
    I hit the bullseye with this read! Seth Stephens-Davidowitz presents an analysis of how people behave, along the same lines as The Signal and the Noise: Why So Many Predictions Fail - But Some Don't and Dataclisma: Quem somos quando achamos que ninguém está vendo. But while Signal and the Noise deals with trends in data and Dataclisma deals with people's behavior on OkCupid!, Everybody Lies is about how people behave in general.
    The author uses a range of data in quite innovative ways, such as search trends on Google (where he works), searches on PornHub, Facebook, and other big data sources, to do what he calls "real sociology," or evidence-based sociology. The data he shows on prejudice (searches for prejudiced topics), insecurity about self-image, insecurities about one's children, and the like paint a much rawer and uglier picture of society than the one we paint with posts on Facebook and Instagram. Other data reveal information that is interesting at the very least: the difference that graduating from Harvard can make (none; what matters seems to be who graduates), where to raise children, how to improve the chances of success on a date...
    The book strongly resembles a newer and, in my opinion, more curious version of the innovative approach of Freakonomics.
    If you are not interested in the revolution that the recording and availability of data is causing in the world, or in the damage that companies and governments can do with the control they have over information, you will at least enjoy the book for the curious and morbid facts it draws from the data. Knowing, for example, that the number of men who search for how to perform oral sex well on women is the same as the number who search for how to perform oral sex on themselves says a lot about how people think. A book for all tastes.
  • Jim
    January 1, 1970
    "I am now convinced that Google searches are the most important data set ever collected on the human psyche," writes the author early on & he shows why. (Google Trends is available to all here: https://trends.google.com/trends/) He also checked other big data sets including Wikipedia, Facebook, Pornhub, & even Stormfront, the largest racist site. What he found was really interesting & it will help harden the soft, social sciences. It's a new frontier. He points out problems with traditional reporting. In the section about child abuse & abortions, Google searches suggest that child abuse does increase during economic downturns while gov't figures incorrectly show little change. Closing abortion clinics doesn't stop abortions, it simply leads to more self-induced abortions. Both happen off the books, but there is now convincing supporting data to show us what we need to address & make more informed decisions with resources.
    Big data has an advantage over every other type of survey because few realize it is being collected, so we don't lie to make ourselves look better. It's also anonymous & aggregate, so caution needs to be used when forming conclusions. For instance, based on Pornhub searches, the author concludes that about 5% of men are gay because they searched for gay porn. That seemed a reasonable conclusion until he pointed out that 15% of women search for rape porn. Does that mean they want to be raped?
    The author says of course not & makes a big deal out of the difference between fantasy & reality. That makes me question his first conclusion, although it seems about right.
    Gut reactions are often wrong & he provides several examples where they're wrong due to cognitive biases. He also points out "The Curse of Dimensionality". Given large enough sets of data, there will be correlations just through chance. For instance, there are graphs that show how closely autism diagnoses track with organic food sales or Jenny McCarthy's popularity. Separating these out is a whole other problem.
    Big Data only gives us trends that we need to examine. We can't use it on the individual level. While 1000 people searched for how to kill their girlfriend, only 1 girl was killed in his example. That's horrific & might have been stopped if someone had looked at his search history, but do we give up everyone's privacy for a 1 in 1000 chance that we might prevent a murder? Some might be willing, but I'm not, so we also have new questions to address.
    The audio book was well narrated & I didn't miss the graphs too much. They're provided in the extra material, but weren't handy when I was listening & the book took that into account for the most part. Highly recommended in either format.
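The point in the review above, that large enough sets of data will contain correlations just through chance, can be seen in a small simulation. This is a rough sketch under stated assumptions: every series here is pure random noise, and the names `pearson` and `best_chance_correlation` as well as the choice of 20 data points per series are hypothetical, not taken from the book.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def best_chance_correlation(n_candidates, n_points=20, seed=0):
    """Strongest |correlation| between one noise series and n_candidates unrelated noise series."""
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_points)]
    best = 0.0
    for _ in range(n_candidates):
        candidate = [rng.gauss(0, 1) for _ in range(n_points)]
        best = max(best, abs(pearson(target, candidate)))
    return best


if __name__ == "__main__":
    for n in (5, 50, 500, 5000):
        print(n, round(best_chance_correlation(n), 2))
```

None of the candidate series has any real relationship to the target, yet the best match found gets stronger as more candidates are searched. A striking correlation pulled out of thousands of possible variables therefore needs independent confirmation before it is treated as meaningful.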
  • aPriL does feral sometimes
    January 1, 1970
    I was annoyed by the author’s writing style in ‘Everybody Lies’. I have no doubts author Seth Stephens-Davidowitz was trying to write to a large general audience, including that assumed class of American non-science reader who hates math and binge watches ‘Keeping Up with the Kardashians’. Good for him, and maybe you, right? But I became more and more annoyed as I read. Ah, well. It is an interesting and informative read, in spite of trying too hard to be fun, imho.
    What is the book about? I am glad to report it has genuine information about the science of statistics and ‘big data’ collecting, and how the erroneous selection of study parameters or assumptions about what is relevant data to study affects conclusions (as far as I know - I am a dunce at scientific math, despite that I passed a statistics class). The author used what seemed to me genuinely interesting new methods to formulate statistical studies, primarily using Google’s forensic tools, along with other sources. I was shocked by what people type into Google Search (which Google compiles into anonymous data). For example, President Obama’s race appears to have truly ignited racists into coming out of their closets.
    Comparing survey interviews with people who state they are racist (a low percentage) with the percentage of those who Googled “n***** jokes” state by state turns out to show some truly hidden pockets of unexpected racism - and the total percentage of racist searches on Google was WAY higher than the racism that typical surveys show. In addition, those places that adore Trump also searched most for “n***** jokes”. Correlation? Idk, no one does know for the record, but I think yes.
    Also of interest to me (please don’t bust my balls because of my prurient interests - and maybe there is a pun in this sentence, hehheh - read on) men really truly do Google a lot about penis sizes. Come on, fellas, give it a rest! (Yes, I am trying to be snarky since the too much ‘at rest’ position is part of what men appear to be most anxious about!) Men prowl porn sites in humongous numbers - shocking, right? - which is good for statisticians looking for Truth about sexuality for their inputs into their mathematical equations. Based on Google porn searches, the author estimates 5% of the population is gay. (Btw, conservatives mostly use the word ‘homosexual’ while liberals use the phrase ‘same-sex’, statistically, in Google searches.)
    Not to neglect what Google says about what the ladies’ biggest sexual worry is, all I can say is, Oh. My. God. Vagina odor. Really? Really!!
    All statisticians should take note - interrogative surveys often show different results from those statistics revealed in Google searches about the percentages of who is thinking/feeling what where and when, especially in those morally-weighted or personally embarrassing areas of society. Of course, interpretation is always fraught with possible erroneous judgements whatever the source of sampling. I have always trusted those insurance actuarial tables FAR more than political or media spins or even university data studies - so now I am adding Google statistics to my ‘trusted info’ list.
    Of course, gentle reader, I know any compilations of data can be erroneously or purposely manipulated or massaged. ‘Garbage in, garbage out’ still applies...which is the case ‘Everybody Lies’ makes as well. The book seemed on top of the science, as far as I know. I am not a science-brain, but an amateur wannabe. My one irritation with this book is all about the manner in which the information is explained. Gentle reader, my complaint is subjective as hell. Honestly, I can’t put my finger on it, though. The writer seemed to be trying to fill out his actual 200-page book to 300 pages by having personal emotional filler similar to the gaspy asides many shows use to increase the viewers’ emotional high about what is being discussed. Are you familiar with those TV shows that, after each commercial break, recap the entire show in the preceding minutes before the commercial break in a breathless montage manner? And they often have a shocked-gasp teaser of what will be shown before the commercial break? Anyway, I felt there was a lot of that style of emotional manipulation (and extending of the material) going on in this book, somehow. I simply did not appreciate the personal ‘fun’ filler so much. Maybe there wasn’t enough snark. I prefer snarky humor, if there is humor. Bite me. Maybe a more tightly edited book would have worked better for me to enjoy reading it. Anyway, I realize I am floundering about here. None of this may be true at all for you.
    Ultimately, this is a book worth reading for the general reader (for the record, I definitely have a lit/history brain, so yes, I am a general science reader!), both for the explanatory information about how statistical studies are done (the only math-involved college class which engaged me) and for what people are really feeling and thinking (if Google searches are to be believed, and I think they are). Included are extensive Notes and Index sections.
  • Caroline
    January 1, 1970
    I wish I could give this book more than five stars. Anyone who has a sneaking feeling that Americans aren't who they SAY they are will find confirmation here. It's also easy to read, no academic language here. I was already riveted by the introduction. His premise is that we all lie to each other, pollsters, and ourselves, but not to that white box where you type internet searches. Both before and after the election everyone went nuts trying to figure out why Trump was doing so much better than polls would indicate, looking for factors that would explain it. There was only one. "[Nate] Silver found that the single factor that best correlated with Donald Trump's support in the Republican primaries was that measure I had discovered four years earlier. Areas that supported Trump in the largest numbers were those that made the most Google searches for 'n-----'." (He uses the real word, which deepens the revulsion you feel at what he's discovered.) Despite Obama's two easy election victories and the narrative that we were post-racism, the Google search data tells another story about reactions to those victories. Immediately after the San Bernardino shootings, what happened online? A ton of people searched for "kill Muslims." And there is a lot more, about sex and child abuse and sexism. Did you know that the most common term used to complete the sentence "Is my son..." is 'gifted' or some variant thereof, and the most common term used to complete the sentence "Is my daughter..." is 'overweight'? America is not post-anything except maybe post-good intentions. What use does he think this can be? Well, he did have some good suggestions, and none of them are based on finding out who any individual is who's done a search. For example, if searches for "kill Muslims" spike in a certain city, a few extra police could be deployed to watch over the local mosque until the spike subsides. He spends a moment talking about how big data is not meant to be, and should not be, used to try to figure out who specifically is going to commit crimes. By the time I was done with this book I was a bit discouraged at who Americans seem to be, but it's better to know. I hope that this kind of study continues, so we can attempt to realistically work with our society instead of pretending it's something it's not.
  • linhtalinhtinh
    January 1, 1970
    A pretty short book with some interesting remarks, but not yet charming enough for me. The author definitely has his quirky and funny moments, when he presents himself, his family, and especially his views more. Yet the book's ideas and findings aren't exactly groundbreaking. Questions of this type have been posed in Freakonomics: A Rogue Economist Explores the Hidden Side of Everything. The usefulness of big data has been discussed by books such as Dataclysm: Who We Are (the discussion on sex and gender actually resembles Dataclysm a lot). I was looking for something more nuanced, a long and rigorous thematic study of human tendencies, with data as an extremely useful tool but not the main focus. Instead, it's more like a collection of observations. Each time Stephens-Davidowitz has an idea, he looks for an answer in the available data, then moves on. The questions are somewhat related to humans' private behaviors that traditionally we can't observe. The tool seems to be a bit more at the center here, but he doesn't discuss the cons and the ethical implications of big data deeply enough, except for a short section at the end of the book. Now, that's totally ok, for a casual and light, yet still useful read. More importantly, we have to consider that this type of research and the topic of big data are still relatively new. It will take decades and decades more to build a literature huge enough to draw really meaningful and profound conclusions. The time simply hasn't arrived yet for the book of my taste, but this one, as the author states, will hopefully raise interest among young people and young social scientists, steering them towards potentially fruitful topics and research methodologies. That's why it's a 3 star.
  • Matt Ward
    January 1, 1970
    This book could have used a good editor. It tries to be a Gladwell-type of book without fully succeeding. Issue 1 is that the anecdotal stories are not fleshed out enough to really draw you in like Gladwell does. This causes much of the book to come across as a list of facts, and it gets pretty old by the midway point. The other issue is a growing trend among people writing data books. They want to write in a colloquial style to make it seem informal and easy to read. They don't want to scare off people with talk of algorithms and things like that. Unfortunately, using tons of sentence fragments and colloquial phrases only makes a book like this harder to read. It's precision and clarity that make books easy to understand. Introducing ambiguity in order to sound like a friendly conversation is exactly the wrong approach. Overall, there are a bunch of interesting facts in here. I think Seth gets a bunch wrong, though, in not understanding fully why certain search terms are used.
  • Lubinka Dimitrova
    January 1, 1970
    I sought out the book after reading an interview with the author, and it was totally worth it. The book is quite enlightening, and to be honest, deeply frightening. Internet data can work miracles for the benefit of humanity, but it can bring to life many unimaginable, Big-Brother-type nightmares (current US presidents not excluded, just sayin...). Still, it's good to know.
  • Emma Deplores Goodreads Censorship
    January 1, 1970
    3.5 stars. This is an engaging and informative book about the huge amount of data available online and what it tells us about society. I read it alongside Dataclysm and found Everybody Lies to be by far the better of the two, presenting a wealth of information in a cohesive fashion and making fewer unfounded assumptions. The author was a data scientist at Google, and draws in large part on the searches people make on the site, along with information from sites including Facebook and Pornhub. There’s a lot of interesting stuff in the data, from the rate of racist searches in the rust belt predicting the rise of Donald Trump, to common body anxieties and whether they actually matter to the opposite sex, to an estimate of how many men are gay and whether that varies by geography (it appears not), to rates of self-induced abortions. This is a great book to read if you love unusual factoids, whether on sexual proclivities or how sports fans are made. The author also writes in a compelling way about the uses of Big Data itself, and while he waxes evangelical about it (evidently preferring to spend all his time immersed in statistically significant data, he finds novels and biographies too “small and unrepresentative" and therefore uninteresting), there are certainly a lot of possibilities there. In health, for instance, compiling early searches about symptoms with later searches for how to handle a diagnosis can help doctors detect pancreatic cancer at an earlier stage, while epidemics can be tracked through symptom searches. The author is also interested in how applying data can revolutionize a field, discussing at length the data that predicted the success of the racehorse American Pharoah. (By "at length" I mean 9 pages; this is a book that moves through a broad range of topics quickly.) Overall, the writing is engaging and the book hangs together well, being informative while mostly resisting the urge to speculate. But the author does make a couple of assumptions worth pointing out. One is that people’s Google searches are made in earnest and for personal reasons. Certainly, you might search for “depression symptoms” out of concern that you or someone you know is depressed. But you also might want to be prepared in advance to identify warning signs, or might have encountered something in the media that sparked your interest, or you might be a student writing a paper on the topic. On the other hand, if you’re intimately familiar with depression already, you’re unlikely to google the symptoms. None of this means the author’s finding a 40% difference in rates of depression symptom searches between Chicago and Hawaii isn’t relevant, but data that’s both over- and under-inclusive serves better as a starting point for research than a definitive conclusion. It's certainly not proof that better geography is twice as effective as antidepressants, as the author suggests. The other assumption is that everybody lies: the book insists on it, based largely on the fact that typically rosy social media posts fail to reflect all those unhappy or hateful searches. Selectively sharing information doesn’t necessarily seem to me to be lying, but the author appears invested in proving the book’s title. For instance, he discusses a particular type of tax fraud: in areas where few tax professionals or people eligible for the scheme live, 2% of people who could benefit from this lie tell it, while in areas with high concentrations of both, the rate of cheating is around 30%. The author concludes that “the key isn’t determining who is honest and who is dishonest. It is determining who knows how to cheat and who doesn’t.” This bleak view of the world fails to account for the 70% who don’t cheat even in areas with high levels of knowledge; finding that significant numbers of people cheat if they know how is a far cry from finding that everyone does. So, like the author of Dataclysm, Stephens-Davidowitz is probably a better statistician than sociologist. But if you’re interested in Big Data, or in getting a peek at the thoughts and anxieties people ask Google about because they’re not comfortable sharing with others, this is the book I recommend. You’ll certainly get a lot of interesting tidbits from it, along with perhaps new inhibitions about typing things into Google!
  • Wen
    January 1, 1970
    The title steered me a bit off-course at first - I thought it was one of those self-help psychology books that I tend to avoid. I eventually decided to give it a shot, mostly because Steven Pinker, an author I highly respect, wrote the foreword. So glad I did. To the author Mr. Davidowitz: I did finish the book, as I did the first two books you mentioned below -- moot point for the third book as it’s not even on my to-read list ;-) “more than 90 percent of readers finished Donna Tartt’s novel The Goldfinch. In contrast, only about 7 percent made it through Nobel Prize economist Daniel Kahneman’s magnum opus, Thinking, Fast and Slow. Fewer than 3 percent… made it to the end of economist Thomas Piketty’s much discussed and praised Capital in the 21st Century.” As the subtitle suggested, the book was a primer on data science, a still budding field that nonetheless serves as the very foundation for hot markets du jour, like artificial intelligence and machine learning. First of all, as informative as the book was, I’d say the book targeted the general reading public. It mostly steered clear of mathematical, statistical and programming jargon. The writing style was light-hearted; it certainly did not remind me of a serious (boring??) college textbook. That said, I assume those readers who love numbers and prefer talking in percentage terms would enjoy the book more. In short, data science in this book was telling stories through data, big data, new data, i.e. the gigantic data sets we now have access to thanks mostly to keywords we put into internet search engines every day. And today even our personal computers might be capable of processing and analyzing such data sets, given increasingly cheaper memory chips and more powerful CPUs, GPUs or other processors. While Davidowitz admitted our guts could do a decent job drawing conclusions and making predictions naturally, he pointed out we need big data to “sharpen the picture”. For instance, it is common sense that harsh winter weather could lead to depression (the D-word was frequently brought up by tour guides during my recent vacation in Northern Europe), but how much of a temperature drop could affect people’s mood materially - 10 degrees, or 50?? Would other factors, say “economic conditions, education levels, and church attendance”, muddy the picture? And how about our guts getting it totally wrong? The book gave an example of a study that concluded, totally against our intuition, that couples maintaining separate sets of friends tend to stay in the relationship longer. So Davidowitz spent the bulk of the book, in Part two, illustrating “the powers of the big data” that we have access to today: 1) new types of data that are beyond survey data or tabular data, think tweets and pictures; 2) honest data, the data generated at the subconscious level, such as doing a Google search (instead of answering a survey), when people are not as inclined to lie; 3) data granular enough that we could zoom in on small subsets for our particular study; 4) data so large and comprehensive that it would allow us to undertake rapid, controlled experiments. Very soon into the book, I spotted its similarity with a very popular title published more than a decade ago that I also loved, Freakonomics: A Rogue Economist Explores the Hidden Side of Everything by Steven Levitt and Stephen Dubner. It was not only because Levitt was frequently mentioned in this book. • The two books shared a similar carefree and witty writing style, and were of similar length; • both tried to employ data to debunk myths largely derived from our intuition; • as an engaging exercise for the reader, both gave lists of factors and asked readers to guess which ones had significant impact on the outcome, then gave the answers; • both devoted a large portion to explaining the distinction between “causation” and “correlation”, and describing A/B testing, the randomized, controlled study; • the two books even tackled a similar subject, albeit from different angles: whether school choice determines the student’s later success. Essentially, both books encouraged readers to think out of the box and ask the right questions. Of course this book was more up-to-date; it listed data sources - Google Trends, Ngrams, and Correlate - along with unstructured data types that are more relevant to today’s digitally-connected readers. Not surprisingly, at the end of the book Davidowitz revealed that it was the very “rogue economist” Levitt and his Freakonomics that inspired him to pursue his current career. In fact, Davidowitz explored the same data set, birth certificate data in California which included black residents’ first names (and whether each was a common white name or a distinctive black name). While Levitt established the connection between a black person’s first name and his socioeconomic background, Davidowitz built on the study, and used a black person’s first name as a proxy for his socioeconomic background, to study the linkage between this factor and the chance of the person making the NBA. To me this exposed one of the best known pitfalls of data analysis, so-called “garbage in, garbage out”. What if someone came out to prove Levitt wrong in his linkage? Wouldn't that be like an earthquake to Davidowitz’ subsequent study? It would be great if this topic could be covered in more depth in the sequel of this book. To be fair, Davidowitz covered several limitations of big data in the book, particularly from the moral/ethics standpoint, e.g. price gouging, discrimination, and privacy. There were other parts in the book that I found less convincing. "... in the prediction business, you just need to know that something works, not why." As an example, the author cited that Walmart discovered its customers preferred stockpiling Strawberry Pop-Tarts before hurricanes. So the store should just stock the pastries on its shelves simply based on the data, without confirming the causality first? I'd also be cautious about reading too much into the various first-date signals in the book. So a guy would be more interested in me if I act more like a narcissist? hmmm... Well, as an intro the book only scratched the surface of data science. And yet, it was an enjoyable, fast-paced, thought-provoking read. I decided to change my rating to 4, as I think both Freakonomics and Thinking, Fast and Slow mentioned above are relatively better reads.
  • Cheryl
    January 1, 1970
    Believe the hype. This is not a perfect book, but it's fun, enlightening, ground-breaking, and important. Too many people don't know the potential power of the new methodologies of data analytics, and too few ppl who think they do know that power know the limitations. SethSD does, and he shares a lot of what he knows with us. This is good science for arm-chair science consumers like me, and a good read for those who just like to dabble in non-fiction. It's both concise and rich. Documented with notes, an index, and the author's own website which he promises has lots more hard info. It may turn out to be a four-star book as more on the topic get published. But right now I urge everyone to read it. Next, I do hope to read Seth's next book, and more on the subject. Yes, Seth, I did read right to the end, and still I'm glad you didn't keep struggling to say anything for the ages in your conclusion... imo, you ended it perfectly. On a personal note, one of the key points from the intro. and one of the key points from the conclusion are amazingly relevant. Here's the thing. Our youngest is looking for a school to transfer up to, at the same time we're looking for our first post-retirement community. We're hoping to find a college & town all three of us would like, and a particular field of study for our kid. In the beginning of this book are two maps, one that reveals Trump supporters, and one that reveals pockets of closet racists (as exposed by their Google searches)... which is obviously relevant data for us as we choose what part of the country to move to. And at the end of the book, Seth tells my geeky son what studies to focus on: "I hope there is some young person reading this right now who is a bit confused on what she wants to do with her life. If you have a bit of statistical skill, an abundance of creativity, and curiosity, enter the data analytics business." (Well, my young person has been listening to me read bits from the book, but otherwise that could have been directly tailored for him.) Read the book. Don't be fooled by my long review; I'm only sharing a bit of what I learned from it. Other book darts: "[P]laces with the highest racist search rates included upstate New York, western Pennsylvania, eastern Ohio, industrial Michigan and rural Illinois, along with West Virginia... The true divide... was not South versus North; it was East versus West. You don't get this sort of thing much west of the Mississippi. And racism was not limited to Republicans...." The 4 powers of Big Data can be summarized: "Offering up new types of data..." "Providing honest data..." "Allowing us to zoom in on small subsets of people..." "Allowing us to do many causal experiments...." Now we get to an example of what is not perfect about the book. First, context: Seth is a careful scientist; he knows about sampling errors, biases, correlation not equaling causation, etc. However, sometimes he forgets about alternative explanations and interpretations. That is to say, when the book shows us data, it's fine, but sometimes when Seth interprets the data, he gets trapped by a fallacy. Eg, he says, "[O]f the minority of women who visit PornHub, there is a (25%) subset who search... for rape imagery... sometimes people have fantasies they wish they didn't have and which they may never mention to others." Maybe... or maybe they're victims trying to process, or maybe they're wannabe authors doing research, or they're men lying to present as female.... It looks to me like Seth didn't want to think too hard about this one.... Big data allows researchers to zoom in on subsets of demographic groups, and geographical regions.... "But another huge--and still growing--advantage of data from the internet is that is easy to collect data from around the world.... And data scientists get an opportunity to tiptoe into anthropology." Big data could really help in the field of healthcare. When I'm done here I'm going to check out the site PatientsLikeMe.com. "Heywood hopes that you can find people of your age and gender, with your history, reporting symptoms similar to yours--and see what has worked for them." I also want to consider reading Irresistible: The Rise of Addictive Technology and the Business of Keeping Us Hooked and Super Crunchers: Why Thinking-By-Numbers Is the New Way to Be Smart.
  • Greg
    January 1, 1970
    UPDATE: In summary, the author bounces back and forth between real data/numbers and pure speculation. It's fascinating, really, as that's got to be the entire point: to show us how to tell what's real and what's fiction as we are bombarded by information. ORIGINAL REVIEW: Yes, "Everybody Lies" including, obviously, the author because if Seth Stephens-Davidowitz never lies, I'm sure the subtitle would have been "Except Me Within This Book". So, from our data thus far, we know the author lies, and maybe even within this book. The author's first major error comes from a hilarious statement about how gay men like Judy Garland (one can only suppose the author's sample is his gay uncles and their friends, in which case he should have used Edith Piaf instead, or maybe Bette Midler, as both Bette and Piaf were indeed placed into stardom by gay communities, while Garland was a star for everyone who went to the movies in 1939 and saw her in "Wizard of Oz"). Oh, I'm digressing, sorry. I liked some of this book: specifically the parts where real numbers/data are used: for example, in 1950 a survey revealed 20% of a certain sample said they had a library card, but the official sample count indicated only 13% actually had one. (Why, oh why, would any adult in the USA NOT have a library card? This boggles my mind.) But get this ridiculous utilization of words: "...the overwhelming majority of black Americans think they suffer from prejudice....On the other hand, very few white Americans will admit to being racist." Good grief. EVERYONE has prejudices (that's how we get through this chaotic world, as we are prejudicial for, say, driving instead of flying because we like road trips and like to stop at places we have never visited and meet people we would otherwise never have a chance to speak with - and I'm talking about me, as long airport lines are no fun) but racism is a different issue entirely, as racism has nothing to do with my prejudicial decision to drive. A section devoted to "omitted-variable bias" certainly belongs in another science book this year entitled "We Have No Idea". Hilariously, the author concludes with a lot of questions, including this howler: "Where do sexual preferences come from?" The answer is simply one of genetics (in combination with epigenetics, which may turn "on" or "off" these genes) but that's old news. Hence, can we conclude that all economists and statisticians don't read current books and journals regarding genetics? Hardly, but one would assume an editor somewhere would catch this issue. In summary: the book works when the author uses real, solid numbers (the number of likes on Facebook vs that same person's internet searches - and let's keep in mind Google doesn't release names, but does release the number of loving wives who praise their husbands on Facebook and then the number of wives who google "Is My Husband Gay"). Now that I think about it, why such a fixation on gay matters? A better title for this book would have been, "Everybody Lies About Their Sex Life" cause we know absolutely that is true.
  • Annie
    January 1, 1970
    This is a pretty fun use of "big data" - the mindbogglingly massive data set produced every day from the Internet - to analyze human behavior in ways we never have been able to. Some favourite revelations below. -------- Voting -------- Nearly everyone predicted Clinton would win the 2016 election. But Stephens-Davidowitz wouldn’t have, looking at Google data. Googling “Clinton” or “Trump” doesn’t really say much (you might google them whether you hate or love them) but if you google “Clinton Trump debate” or “Trump Clinton polls” that reveals a lot: you tend to list the candidate you support first in those searches, so geographic areas with more “Clinton Trump” searches than “Trump Clinton” will probably end up going for Clinton. Other things, like searching for how or where to vote, which predict voter turnout, can be analyzed (areas with high population of black Americans did not have many of these, and since black Americans as a group supported Clinton over Trump, that would hurt Clinton). It also reveals that unexpected areas are the most racist. Rather than southern US, it’s Appalachia that searches the most racist terms - particularly eastern Ohio, western Pennsylvania, West Virginia, etc. and some neighbouring Great Lakes states (Indiana, Michigan, Illinois) - some of which are major swing states that went for Trump this election. The single greatest predictor of whether a region would support Trump? Not employment, religion, gun ownership, immigration. No, the single greatest predictor, with the highest correlation to Trump support, was number of Google searches in that area for the n-word. Jesus fucking Christ. -------- Sports -------- If you’re over 7 feet tall, you have a 1-in-5 chance of making it to the NBA. That’s right: of all the 7ft+ tall men in America, 20% are/were NBA players. Makes you wonder how much sports “natural talent” is just height + daily practice. -------- Relationships -------- Having friends in common is a predictor that a relationship will not last. Read: space is a good thing. -------- Intelligence -------- Your kid is most likely to achieve fame and success if they grow up in certain areas - particularly, places with lots of universities. Big cities (Boston, NYC) and college towns (Ithaca, NY) alike. Want them to be even more successful? Live somewhere with a high number of immigrants. Yes, even if you aren't an immigrant yourself, being somewhere with many immigrants will contribute to your kid's success. Facebook likes for Mozart, thunderstorms, and curly fries (???) are correlated to high intelligence. Correlated to low intelligence are Facebook likes for Harley Davidson motorcycles, Lady Antebellum, and the "I love being a mom" page. ~~~~~~~ Book Riot's Read Harder Challenge ~~~~~~~ #14: A book of social science
  • Elena
    January 1, 1970
    The author is a bit too bragging, exaggerating, and name dropping for my taste. Still, I do not regret spending the time with the book (but would regret paying money if it were not a library borrow). Memorabilia. Predicting rate of unemployment with the frequency of porn site searches (amount of time on their hands). Predicting success of dating (listen, then listen some more, then, when you think you are done listening, listen some more). Doppelganger (DOPP-el-gang-er) searches on the Internet (by medical history, interests, etc.). Regression discontinuity (sample is taken from the section around a sharp numerical divide). Natural experiments. Presidents association and the afterlife of the economy. Future of students who made it into prestigious schools and who did not. Recidivism of prisoners who were treated harsher (because they just made it into the more dangerous classification) and vice versa.
  • Amos
    January 1, 1970
    No practicing analyst or social scientist will find anything of value in this book. It verges on being dangerously deceptive, filled with logical fallacies and half-baked reasoning for its conclusions. The book claims to be finding truth in an uncertain world, but actually is just adding to the noise.
  • Anton
    January 1, 1970
    Delightful, very engaging read on modern takes on data analysis. Fans of Levitt and Pinker, I am sure, will enjoy it. Hardly any 'cons' to flag up... but it is a bit on the short side and overwhelmingly US-focused. Still very clever and thought-provoking. Overall: definitely worth your time.
  • Ahmed Hussein Shaheen
    January 1, 1970
    A great book; I enjoyed every word of it. It is amazing how much we can learn about sex, penis size, homosexuality, racism, and many other interesting topics just by looking at the searches people make. I can’t wait to read his next book, tentatively titled Everybody (Still) Lies. "More than 40 percent of complaints about a partner’s penis size say that it’s too big."
  • Charlene
    January 1, 1970
    There are so many things to love about this book. Not the least of which is that it focused largely on how big data would act like a truth serum and replace terrible self-report findings when trying to answer the myriad questions that arise in all areas of life. I say bravo to that! However, just because you identify a problem with one measurement method (self-report), it does not necessarily mean you have found the fix. Does big data sound extremely promising? Hell yes. In fact, I think when we learn how to escape the many pitfalls of trying to collect big data to represent an accurate picture (those are some humongous pitfalls that have not been adequately addressed in this book), big data might be our best bet at understanding human nature. Some aspects of big data already seem solid and trustworthy. For example, people indicate they will watch a certain type of movie on Netflix (perhaps a documentary, an intellectual film, or the like), but they watch a mindless comedy instead, and often. No matter how many of one type of movie a person puts in their Netflix queue, their actual watching habits are far more reliable at predicting what types of movies they like to watch than what they have chosen, themselves, to put in their own queue. There are plenty of other examples scattered throughout the book that are similarly compelling.
You will find excellent data mining that sheds light on things from the election of Trump to the things people really worry about but won't admit. Data helped Nate Silver figure out how Trump won the 2016 primary. It turned out that Trump won in areas that had googled the word "nigger" most often in the year leading up to the primary. Controlling for other factors, this is an extremely disturbing finding. There were other really interesting examples of what people search for in different areas of the USA and different areas of the world that give us an extremely candid picture of what they might be thinking. I say might because it is entirely possible that people who are concerned about the things their neighbors say or do might also look those things up. The effect might be smaller than it seems. Nevertheless, all of these searches are a valuable window at least into what people are thinking about. And speaking about what people are thinking about, it seems that the main concern for men is how big their penis is. For women it is whether their vagina smells bad. It seems to be a downright preoccupation. As much as I loved this book, and I have to say there were many parts that I really, really loved, I was sorely disappointed at times. The author is certainly more critical than most in his thinking, and takes pains to convince the reader as such. However, if you are going to write a book about how to think critically, you should probably be in the top 1% of critical thinkers. Chapter Five tells me that probably is not the case. He cited experts' concerns that violence in movies causes violent behaviors and then cited a study that had serious fundamental flaws. The study sought (I use the term sought lightly) to find out if violent movies caused violent actions. It turned out that on weekends when violent movies were shown, violence went down, not up. On weekends with non-violent movies, violence went up. Mind-boggling! Or so the author claims. Not really.
It's easy to imagine many, many reasons why this might be. The authors of the study concluded (using absolute conjecture) that it keeps violent people occupied. I will buy that... in the short term. But that doesn’t prove whether violent movies cause or do not cause violent acts. Violent acts can sometimes take a long time to cultivate. You cannot measure them by a weekend. This study was not a lone lapse in critical thinking, and that bothered me because this book is about critical thinking. Overall though, I have to say the great parts outweighed the bothersome parts. I would recommend.
  • Ram
    January 1, 1970
    For a social scientist such as Stephens-Davidowitz, big data has four central virtues. First, it’s a “digital truth serum”: it supplies honest data on matters people lie about in surveys, for instance racist attitudes, but above all (to quote Mick Jagger) “sex and sex and sex and sex”. Second, it offers the means to run large-scale randomised controlled experiments – which are usually extremely laborious and expensive – at almost no cost, and in this way uncover causal linkages in addition to mere correlations. Third, the sheer quantity of data allows us to zoom in precisely on small subsets of people in a way that was previously impossible. Finally, it provides new types of data. I was fascinated by this book.
The data collected by Google, Wikipedia, PornHub and even the IRS can give you so much social information, and we are still at the very beginning of this type of study. A few examples are not really surprising, but what is impressive is that you can find them in the data: searches for racist jokes rise about 30% on Martin Luther King Day in the US; in the recent Republican primaries, regions that supported Donald Trump in the largest numbers made the most Google searches for “nigger”; and analysis of searches related to flu predicts the spread of a flu epidemic much more accurately and earlier than any information taken from surveys or hospitals. The main interesting fact I found about this is that when we search, we do it completely anonymously and we are completely candid and honest. The resulting information, if correctly analyzed, is the most honest large-scale information that can be found. I did find some flaws in the logic, and some conclusions that I found a bit... too radical, but in general it was an interesting read and... yes, a bit shocking.
  • Steve Sarner
    January 1, 1970
    It’s no lie! Big Data shows the majority of my Goodreads reviews begin with bad Dad Jokes. LOL. This book is The National Enquirer meets Big Data Science. It features all the stuff that stops people in their tracks in the grocery checkout line and grabs their attention: Sex, crime, weird sex, abuse, freaks, drugs and even weirder sex. It’s sometimes on the edge of gratuitous but still an interesting, easy and well-written read. The best part of this book? It validates something that has been suspected for a very long time: people say one thing and do another. And there are a lot of interesting and plausible insights across a broad range of subjects – even outside of sex. That said, there were a lot of elements I am not totally bought into and some things that seem just plain incorrect. While search queries are very interesting and revealing, some of the correlations seem a bit outlandish. Regardless of whether you like the subject matter or not, it seems like a good beginner's look into what “Big Data” really is and how it has become a “thing.” A lot of this stuff was never accessible in the past. Although some of my true Big Data scientist friends might cringe at this thought as well as this book.
  • Lolly K Dandeneau
    January 1, 1970
    via my blog: https://bookstalkerblog.wordpress.com/ “In 2014, there were about 6,000 searches for the exact phrase “how to kill your girlfriend” and 400 murders of girlfriends.” As a chapter tells us, ALL THE WORLD’S A LAB. The data collected and shared by Seth Stephens-Davidowitz is downright disturbing at times. That there are dark sexual proclivities isn’t shocking so much as what they are, based on research. Also, who knew that your neighbor winning the lottery can have a strange impact on your own life? How odd human nature, what bizarre subjects human beings become, and subjects of research, it seems, we all are. What the heck do Google searches reveal about us? A lot, actually. I spent a few chapters of this book with my mouth hanging open, catching flies. Ethical questions certainly arise from much of the research: just where is the ‘internet’ taking us all? Just who is watching, and why? Well, read on, my fellow test subjects. Do we think in strange ways? Naturally. I struggle with the methods of collecting data and yet, it’s true that while it can be used for nefarious purposes, just like anything else, there can be great benefits too. How can we know what is real? How can anyone trust searches as solid fact? Data makes some of us cross-eyed with boredom, but here Seth Stephens-Davidowitz presents it in a manner most people can understand and also be humored and at times shocked by. I will never think about strawberry pop tarts without thinking about hurricanes.
A strange comment, but that’s what this book is all about: the bizarre data we provide, whether we realize it or not. Are we really just a bunch of liars? Do we all just masquerade online? Is the world so twisted? Just how much can you really measure to determine the future of what’s hot, what will sell, what stocks will rise and fall? How did one man predict the success of the horse American Pharoah? Who gives corporations the right to use collected data, and should they? How do interests and fun tests measure IQ on Facebook? Just what is our doppelganger and why does it matter? And hilariously, how many of the readers finish books? What about this one? Well, I did. I particularly enjoyed the chapter “Was Freud Right?” I wonder, were he alive today, how many of his theories would stand up to actual research. The Banana dream data is food for thought and yes, I’m trying to be punny here; I wonder what that means about me, according to research. The information isn’t overbearing, and most of it is fascinating. Statistically, you may well finish this book too.
Publication Date: May 9, 2017
Dey Street Books
  • Dan
    January 1, 1970
    I recommend this highly with a couple of caveats. The central insight of this book is that you can get a better idea of what people actually think, despite what they say to others (or even to themselves) by looking at Google and Pornhub searches (among other anonymized big data sets). Things that people won’t admit to other people (thoughts of suicide, to whom they are attracted, homicidal thoughts, racist thoughts, dissatisfaction with a marriage, regrets over having children, etc.) are often clearly revealed to Google through searches. This far I’m definitely willing to go, and would recommend the book on that basis. (Unsurprisingly, Facebook is a far less reliable source for honest data sets than Google or Pornhub.) Big data can tell us a lot, but there are limits. In one sense it’s a bigger tree from which to cherry pick data that might support your view. While Stephens-Davidowitz is very open when results didn’t match his expectations, there is a strong ideological bent in which data he chooses to explore. I wish he’d been upfront about that at the beginning. I don’t object to the ends to which he’s working (I agree with most of them), but I wish he’d put in a disclaimer (like Stephen Jay Gould did at the beginning of Mismeasure of Man).
He did put in a chapter at the end about what Big Data can’t do, but since it’s at the end, I think it’s less effective (all the more so since one of his findings is that people are less likely to read the final chapters of a book than the early ones). His conclusion is an argument that Big Data is the best shot for the social sciences to officially move into the realm of “real” science, even claiming that if Karl Popper were alive he’d likely be convinced. I’m not quite so skeptical of the social sciences as Popper was, and certainly no science, even “hard” science, is completely free from the biases of the scientist. But there is definitely more room for ideology to skew results in the social sciences. Stephens-Davidowitz makes a good case that big data (in combination with smaller data) can significantly strengthen the reliability of results. He convinced me that it does, but not to the extent that he believes it does. On the whole, this is an excellent book. Read it with a bit of skepticism, and it will give you a lot to think about. 3.5, but I'm rounding up.
  • Tadas Talaikis
    January 1, 1970
    Interesting data, but sometimes with unclarified assumptions. For example, there is no way to know why exactly some search term is used. Some (?, I don't know how many) data scientists believe their algos based on big data can reveal something about the real world. Most often they cannot, but this illusion is one of the reasons why they have their jobs. I see all this A.I./ML/DL nonsense every day. Movie suggestions, Facebook feed, Google suggestions, all are sh*t, have no way to guess my true intentions (I don't know how true that is on a larger scale) and only irritate when I am trying to get to the real target. Maybe they are right that those "irritations" are the drugs that keep people busy on their sh*tty sites. So, everything mentioned should be taken with a grain of salt. As with the mentioned applications for trading, signals here are very weak (in the range of a few percent from random) and most often the increased complexity of big data doesn't create any meaningful improvement over simple things (like pattern recognition). OK, those assumptions are revealed a bit towards the end of the book, but not enough. A more down-to-earth example: I remember doing a lot of searches like "conservative psychology", "conservative idiots", "conservative fear", etc. for research purposes, and found a lot of interesting research. And after a few months, doing "liberal psychology", "liberal idiots", "Kennedy the Nazi", etc. Big data doesn't reflect my true intentions and meaning and what I think (e.g.
who am I), because it doesn't take into account individual search sequences, which mean only that I am interested in understanding people, i.e. in being able to "read the mind" and discern the true meaning of said words. But here's the thing: it's not so two-dimensional that some sentence distributions can be compared to other sentences, because people should take into account much more three-dimensional (intuitive) data to be able to grasp *some* of the truth. Don't be so over-positive about all this "progress".
  • Hakan Jackson
    January 1, 1970
    I never really thought of big data that much as a social science tool. After reading this book I'm starting to think big data can do for sociology what MRI has been able to do for psychology. I'm excited to see what the future holds. I can definitely pick up the influence of Freakonomics, Malcolm Gladwell, and Steven Pinker in this book. If you like any of those three, definitely pick up this book.
  • Jade
    January 1, 1970
    Critical analysis of Big Data takes a fine mind that knows how to look at correlations. The author is educated and practiced at it. Not only that but he is adept at choosing to present compelling findings on subjects that I'm sure he knows readers are interested in, because he has that skillset! Do you want to know how to figure out which racehorses are champions? What is the best family configuration for the top NBA stars? What do people who watch porn search for most? The author chooses fun and titillating topics, but he also makes it clear how much more those with his skillset can do with it. The author's writing style is comfortable to read, which is no easy feat when you're talking about statistics and other difficult concepts. The dude obviously is a brainiac but he makes sure you're not intimidated by that fact. If you snooze and don't read Everybody Lies, you will get left in the dust for what's on the horizon with Big Data.