PageOneX is an open-source software tool designed to aid the coding, analysis, and visualization of front-page newspaper coverage of major stories and media events. Newsrooms spend considerable time and effort deciding which stories make it to the front page. Communication scholars have long used column-inches of print newspaper coverage as an important indicator of mass media attention.
In the past, this approach involved obtaining copies of newspapers, measurement by hand (with a physical ruler), and manual input of measurements into a spreadsheet or database, followed by calculation and analysis. Some of these steps can now be automated, while others can be simplified; some can be easily shared by distributed teams of investigators working with a common dataset hosted online.
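As a rough illustration of the calculation step described above, the share of a front page devoted to a story can be computed from the rectangles highlighted on the page image. This is a minimal sketch, not the PageOneX implementation: the `Rect` class and the `coverage_share` function are hypothetical names, measurements are assumed to be in pixels, and the highlighted areas are assumed not to overlap.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Rect:
    """A highlighted area on a front-page image, in pixels (hypothetical type)."""
    x: float
    y: float
    width: float
    height: float

    @property
    def area(self) -> float:
        return self.width * self.height


def coverage_share(page_width: float, page_height: float,
                   highlights: Iterable[Rect]) -> float:
    """Return the fraction of the page area covered by the highlights.

    Assumes the highlighted rectangles do not overlap; the result is
    capped at 1.0 in case they do.
    """
    page_area = page_width * page_height
    if page_area <= 0:
        raise ValueError("page dimensions must be positive")
    covered = sum(rect.area for rect in highlights)
    return min(covered / page_area, 1.0)


# Example: two highlighted stories on a 1000 x 1500 pixel front page.
story_areas = [Rect(0, 0, 500, 300), Rect(500, 0, 250, 600)]
share = coverage_share(1000, 1500, story_areas)
print(f"{share:.0%} of the front page")
```

Expressing coverage as a fraction of the page, rather than in absolute column-inches, makes newspapers with different page sizes comparable.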
The project has gone through different phases.
Initially, this type of data visualization was made through a ‘manual’ process: images of newspaper front pages were downloaded from the web and reorganized in a vector graphics program to draw rectangles on top of them to highlight certain stories.
The first version of an automated tool was a script written in Processing that downloaded newspaper front pages and generated an array of images ordered by date.
The second version is the tool you are using now, written in Ruby on Rails. It is a web platform that provides a ready-to-use front page analysis tool for anyone with a connection to the Internet. The platform automates the process of newspaper selection, download, thread coding, and data visualization. The alpha version was developed by Pablo Rey Mazón with Ahmd Refat in Summer 2012, thanks to the Google Summer of Code (GSoC) program, with the Berkman Center as host institution. You can test the alpha version at PageOneX.com.
In Winter-Spring 2013, at the Center for Civic Media at the MIT Media Lab, Pablo Rey Mazón worked with Edward L Platt and Rahul Bhargava on the beta version, the first stable version, which was released in April 2013. It has major improvements in speed and a cleaner data model. During Summer 2013 we fixed bugs and made the tool work better and faster.
During 2014 we made further improvements, such as letting you embed a PageOneX visualization in your own site and switch topics on and off in the data visualization. We need your feedback!
In Summer 2014 PageOneX won the 2014 APSA-ITP (American Political Science Association, Information Technology and Politics) award for Best Software.
The project has many collaborators. The code was initially written by Ahmd Refat and is now developed at the Center for Civic Media by Edward L Platt, Rahul Bhargava and Pablo Rey Mazón; Rafael Porres developed part of the original scraping code. Sasha Costanza-Chock gives advice and support from the Center for Civic Media; Alfonso Sánchez Uzábal provides technical support, and Montera34 helped with the server and domains. Thanks to Jeff Warren for his advice and Rogelio López for testing.
Join the project
The code of PageOneX is open source and available at https://github.com/numeroteca/pageonex.