Detail from Mondrian vs. Rothko.
The emergence of social media creates a radically new opportunity to study cultural processes and dynamics. For the first time, we can follow the imaginations of hundreds of millions of people—the images and videos they create and comment on, the conversations they are engaged in, the opinions, ideas, and feelings they express.
Until now, the study of the social and the cultural (individual beings, individual artifacts, and larger groups of people/artifacts) relied on two types of data: “shallow data” about many (statistics, sociology) and “deep data” about a few (psychology, psychoanalysis, anthropology, art history—methods such as “thick description” and “close reading”). However, the rise of social media, along with the computational tools that can process massive amounts of data, makes possible a fundamentally new approach for the study of human beings and society. We no longer have to choose between data size and data depth. Rather than having to generalize from small samples or rely on our intuition, we can study exact cultural patterns formed by millions of cultural texts. In other words, the detailed knowledge and insights, which before could only be reached about a few texts, can now be obtained about massive collections of these texts.
Google Logo Space—587 versions of the original Google logo, which appeared on google.com pages between 1998 and summer 2009.
In 2007, Bruno Latour summarized these developments as follows: “The precise forces that mould our subjectivities and the precise characters that furnish our imaginations are all open to inquiries by the social sciences. It is as if the inner workings of private worlds have been pried open because their inputs and outputs have become thoroughly traceable.” (Bruno Latour, “Beware, your imagination leaves digital traces”, Times Higher Education Literary Supplement, April 6, 2007.)
But how do you “read” through billions of Twitter posts, blogs, Flickr photos, or YouTube videos in practice? That is, how do you read for patterns?
Today people use a variety of software tools to select the content of interest to them from this massive and constantly expanding universe of cultural texts and conversations. These tools include search engines, RSS feeds, and recommendation systems. But while these tools can help you to find what to read, they do not show the larger patterns across this universe.
Computer scientists and media companies use a different set of tools and techniques that allow for the detailed study of such patterns. They employ statistical data analysis, data mining, information visualization, and visual analytics. They also have access to the substantial computational resources needed to analyze massive data sets. For example, many companies use “sentiment analysis” to study the feelings that people express about their products in blog posts. Recent publications in computer science have investigated how information spreads on Twitter (data: 100 million tweets), what qualities are shared by the most favored photos on Flickr (data: 2.2 million photos), and what geotagged Flickr photos tell us about people’s attention (data: 35 million photos).
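To make the idea of “reading for patterns” concrete, here is a minimal sketch of lexicon-based sentiment counting over a handful of posts. It is a toy illustration only, not the method any company or the studies mentioned above actually use: the word lists, the function names (`score`, `label`, `summarize`), and the sample posts are all assumptions invented for this example.

```python
from collections import Counter

# Hypothetical toy lexicons; real systems rely on far larger, validated resources.
POSITIVE = {"love", "great", "good", "excellent", "happy"}
NEGATIVE = {"hate", "bad", "awful", "terrible", "broken"}

def score(text):
    """Return the number of positive words minus the number of negative words."""
    words = [w.strip(".,!?;:\"'()").lower() for w in text.split()]
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

def label(text):
    """Map a post to 'positive', 'negative', or 'neutral' by the sign of its score."""
    s = score(text)
    return "positive" if s > 0 else "negative" if s < 0 else "neutral"

def summarize(posts):
    """Count how many posts fall under each label across the whole collection."""
    return Counter(label(p) for p in posts)

# Invented sample posts standing in for a much larger collection.
posts = [
    "I love this camera, great pictures!",
    "Terrible battery. I hate it.",
    "It arrived on Tuesday.",
]
summary = summarize(posts)
print(summary)
```

The point of the sketch is the shift in scale: no single post is read closely, yet the aggregate counts describe the collection as a whole, and the same loop works unchanged on three posts or three million.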
Mapping Time—The covers of every issue of Time magazine published from 1923 to summer 2009.
What if everybody had access to such techniques? At present, this requires knowledge of advanced topics in computer science and statistics. However, with the right tools, anybody should be able to at least explore large image collections and notice interesting patterns. At the Software Studies Initiative, we have been developing such software tools and testing them on sets of different types of cultural images, ranging from all 4,535 covers of Time magazine (1923-2009) to one million manga pages. Currently we are using these tools to study video remixes on YouTube, millions of images from deviantart.com, and spatial patterns in Second Life. We are also documenting the tools and releasing them as open source.
To download them, visit:
Lev Manovich is the author of Software Takes Command (released under CC license, 2008), Soft Cinema: Navigating the Database (The MIT Press, 2005), and The Language of New Media (The MIT Press, 2001), which has been described as “the most suggestive and broad ranging media history since Marshall McLuhan.” Manovich is a Professor in the Visual Arts Department, University of California, San Diego, Director of the Software Studies Initiative at the California Institute for Telecommunications and Information Technology (CALIT2), and a Professor at the European Graduate School (EGS). He has been working with computer media as an artist, computer animator, designer, and educator since 1984. In 2007 he founded the Software Studies Initiative, the first digital humanities lab focusing on exploring massive visual data sets.
Images: culturevis Flickr