Data journalism is a journalism specialty reflecting the increased role of numerical data in the production and distribution of information in the digital era. It reflects the increased interaction between content producers (journalists) and several other fields such as design, computer science and statistics. From the point of view of journalists, it represents "an overlapping set of competencies drawn from disparate fields".
The term data journalism has been widely used to unite several concepts and link them to journalism. Some see these concepts as levels or stages leading from the simpler to the more complex uses of new technologies in the journalistic process.
Designers are not always part of the process. According to author and data journalism trainer Henk van Ess, "Datajournalism can be based on any data that has to be processed first with tools before a relevant story is possible. It doesn't include visualisation per se".
One of the earliest examples of using computers in journalism dates back to a 1952 endeavor by CBS to use a mainframe computer to predict the outcome of the presidential election, but it was not until 1967 that the use of computers for data analysis began to be more widely adopted.
Working for the Detroit Free Press at the time, Philip Meyer used a mainframe to improve reporting on the riots spreading throughout the city. With a new precedent set for data analysis in journalism, Meyer collaborated with Donald Barlett and James Steele to examine patterns in conviction sentencing in Philadelphia during the 1970s. Meyer later wrote a book titled Precision Journalism that advocated the use of these techniques for integrating data analysis into journalism.
Toward the end of the 1980s, significant events began to occur that helped to formally organize the field of computer-assisted reporting (CAR). Investigative reporter Bill Dedman of The Atlanta Journal-Constitution won a Pulitzer Prize in 1989 for The Color of Money, his 1988 series of stories using CAR techniques to analyze racial discrimination by banks and other mortgage lenders in middle-income black neighborhoods. The National Institute for Computer Assisted Reporting (NICAR) was formed at the Missouri School of Journalism in collaboration with the Investigative Reporters and Editors (IRE). The first conference dedicated to CAR was organized by NICAR in conjunction with James Brown at Indiana University and held in 1990. The NICAR conference has been held annually since and is now the single largest gathering of data journalists.
Although the term data journalism has been used informally by practitioners of computer-assisted reporting for decades, its first recorded use by a major news organization was by The Guardian, which launched its Datablog in March 2009. Although the paternity of the term is disputed, it has been widely used since WikiLeaks' Afghan War documents leak in July 2010.
The Guardian's coverage of the war logs took advantage of free data visualization tools such as Google Fusion Tables, the use of which is another common aspect of data journalism. Facts are Sacred, by The Guardian Datablog editor Simon Rogers, describes data journalism like this:
"Comment is free," wrote Guardian editor CP Scott in 1921, "but facts are sacred". Ninety years later, publishing those sacred facts has become a new type of journalism in itself: data journalism. And it is rapidly becoming part of the establishment.