What 'Big Data' means to Psychological and Brain Sciences

As in many other areas of contemporary life, Big Data has become a powerful tool in psychological and brain sciences, where it is opening up novel forms of investigation into questions both old and new. At PBS, Big Data is especially pervasive. As professor Mary Murphy suggests, more than at most schools, it is making its way into all the sub-fields in PBS, due to the frequent collaboration that is so characteristic of the department. Murphy herself, a social psychologist, has been comparing notes (and methods) with computational neuroscientist Franco Pestilli, and has been generally inspired by workshops and conversations with cognitive scientists and others in the department.

What exactly is Big Data, you might ask? Often used to define the age in which we live, the term is aptly vague. As PBS professor Michael Jones points out in his book, in many ways it is "a moving target," partly on account of the perpetual changes in the volume, velocity and variety of data (the three Vs, as they are called) and in the techniques used to manage and interpret it. Yet, amidst the change, one observation holds steady: New uses of data are transformative, and are enabling PBS researchers to reimagine their work in powerful and provocative ways.

Jones graphically depicts the change Big Data has brought about in his field: "As cognitive scientists, we are used to developing causal models to explain behavior in carefully controlled laboratory experiments, but rarely explore how those theoretical mechanisms would fare in the natural environment outside of the lab. Now we turn models developed from lab experiments loose in the real world, feeding them large quantities of data, to see if they behave like humans."

Abridged list of what Big Data enables PBS researchers to do:

Visualize brain networks  

Test-drive language-learning theories  

Reconstruct the building blocks of infant cognition

Foresee the development of precision neuroscience

Grasp the variations in neuropsychological disorders across populations

Build the cyber-structure for sharing neuroscientific data and resources 

Map out social networks 

Minimize the threat of cyberwarfare and identity theft

Pave the way for research reproducibility  

Chip away at stereotype threat in our schools  

Mine the internet for clues to human psychology – in digital apps and sensors, social media and video games, search engines and digital libraries

The promise of Big Data has also led several other faculty members down enormously productive paths, among them Mary Murphy and Franco Pestilli.

As Murphy explains, "one of the driving factors in the move to big data is that we can't answer our questions with single study samples any more. Once you nail down that a phenomenon is occurring, you want to look at heterogeneity in effects, boundary conditions and moderators."

A series of studies aimed at student retention and faculty mindset has led Murphy and her colleagues to develop their own research-practice startup, with full-time staff to manage and model the quantities of data they produce.

In one of these projects, she and her colleagues designed and implemented an intervention to help incoming college freshmen cope with feelings of belonging uncertainty and identity threat that contribute to higher drop-out rates and lower academic performance. The intervention had been shown in several single studies to reduce the sense, felt most acutely by black, Latino and first-generation students, that one does not belong at college. Murphy and her colleagues now want to see if and how the intervention works on a larger scale, across a much more varied sample consisting of different types of universities and student bodies. For the past two years they have administered the intervention at 22 universities and are starting to analyze their results.

"People often think that if an effect is real, it should emerge at any time, in any context. But that’s not how human psychology works, especially in education, where the differences on the ground, in the student body, student support and financial resources – all these different structures and contexts can change that. At Indiana University or other big public selective universities, belonging and identity threat operate in one way; at a broad access university or an Ivy League college, where the resources on the ground and the student bodies are very different, the effects will likely be different."

The goal, she adds, is to "anticipate heterogeneity. That’s the huge contribution of big science. We can really start to look at heterogeneity using theory to guide us."

Pestilli makes a similar point: "Though it is useful to test questions on a smaller population, it is harder to see subtle variations that exist on a larger scale. You need a bird’s eye view to understand variation."

The questions he has in mind are quite different from Murphy’s. For Pestilli, Big Data sets the stage for a new precision neuroscience that would make it possible to better understand individual differences, to determine who might be at risk for various neurological and mental health issues.

"If we can pool data across populations," he suggests, "we can predict major risk factors fast enough to anticipate a problem – suicide for instance, or the onset of Alzheimer's – before it's too late. These are the hopes and goals behind the movement for Big Data."

Pestilli is hopeful that Big Data will be used to solve some of the most widespread and debilitating mental health issues, and is fully engaged in building the framework and culture needed to support Big Data research in neuroscience. On this front he is leading two major NSF grants that bring together engineers, statisticians and those in psychological and brain sciences across the Midwest to establish the infrastructure and computational resources needed for open data sharing.

"We are informavores," observes Jones, drawing on a word coined in 1983 that has gained currency in recent decades. Just as we spend our days as omnivores foraging for, producing and consuming certain kinds of food, as informavores we also forage for, produce and consume information. Lots of it.

"The human brain," says Jones, "became optimized by evolution to make us informavores." And in the face of the constellations of data that extend into all aspects of our lives, dotting the landscape at every turn, and making up a virtual space nearly as vast as the universe, it is hard to contest his point.

Big Data is also a two-way mirror, one that reflects both who we are and how we study, inside and outside of PBS.