Dashing off 140-character quips on Twitter may take only seconds, but for researchers at the George Washington University, analyzing thousands of Twitter messages can be a time-consuming slog.
Enter Social Feed Manager.
The application, developed by Daniel Chudnov, director of scholarly technology, and a team of librarians at the Estelle and Melvin Gelman Library, collects and stores millions of tweets per day.
A $130,405 grant from the National Historical Publications and Records Commission (NHPRC) will help Mr. Chudnov and an advisory board of librarians, archivists and university faculty tackle web archiving. The three-year project will develop the application for future researchers in partnership with the University of North Texas, the University of California, Riverside, and Stanford University.
“We are able to empower researchers to collect data that may be out of scope for them to collect for themselves and digest into a format that they understand,” Mr. Chudnov said.
The Social Feed Manager simplifies a process called “querying the API,” a means of using code to ask a program, such as Twitter, to show you specific information. The application searches Twitter by hashtag, keyword, username or other filters and gathers up to 3,200 tweets at a time that can be exported to an accessible and sortable Microsoft Excel spreadsheet.
GW students, faculty and researchers can access the application on the library website using their Twitter login. Partners at Stanford, UC Riverside and North Texas also will run the software.
To date, Mr. Chudnov and his colleagues have collected 5.5 million tweets from nearly 2,400 individual Twitter accounts.
“Researchers can go from a research idea, to data, to applying the methods that they have in their disciplines—faster,” he said.
Following the Research
Three years ago, Kim Gross, associate director of the School of Media and Public Affairs, was mired in data from nearly 3,000 tweets for a study in partnership with the Pew Research Center’s Journalism Project.
“We had this totally clunky method to try to analyze Twitter data,” Dr. Gross said. “A team of students pulled tweets by hand and copied them into a Word document. It was a nightmare.”
Following painstaking work by 36 SMPA senior seminar students, Dr. Gross and her colleagues produced “How Media Outlets Use Twitter,” a report that analyzed the Twitter participation of 13 major news organizations.
Mr. Chudnov was intrigued by Dr. Gross’s research, but knew she needed another method.
“It was really good work,” Mr. Chudnov said. “So I wrote her and said, ‘I know a little about Twitter, let me create an application to do the work for you’—she didn’t seem to believe it was possible,” he joked.
Using skills he learned as a technical manager of the Library of Congress’ Twitter Archive project, Mr. Chudnov developed the coding script that would become Social Feed Manager.
The application received a series of improvements through a 2013 Institute of Museum and Library Services grant. Mr. Chudnov and his team (Electronic Resources Content Manager Laura Wrubel, Senior Software Developer Daniel Kerchner and School of Engineering and Applied Science graduate student Ankushi Sharma) hope to make the application even faster and easier to use with the support of the NHPRC funding.
“When Dan first explained the scope of the application, I thought, ‘Oh god, you can save me,’” Dr. Gross laughed.
Dr. Gross continued to study how journalists use Twitter in 2012 and 2013. She found that the Social Feed Manager solved many of her research problems. Previously, she had to drop “prolific tweeters” from the study because the research team couldn’t pull all of their tweets before Twitter stopped storing them, but with Social Feed Manager, she was able to handle larger amounts of data.
“The program unwraps all sorts of things that are in the metadata of tweets, like the number of retweets, the number of replies and when it was sent,” Dr. Gross said. “It has been so helpful to have this information become available in this much smarter way.”
Danny Hayes, associate professor of political science, started using the Social Feed Manager for his research on congressional candidates after learning about the application during a brown bag lunch session last year.
Dr. Hayes and his co-author Jennifer Lawless, professor of government at American University, have been using the app since this summer to gather Twitter data from candidates in the primary and from congressional candidates in the general election. They plan to examine whether the public response to congressional candidates changes based on the candidate’s gender.
He said that the application has been a huge benefit for his research as a political scientist because Twitter is the “21st-century press release” for politicians.
“Twitter is a great tool, but you are just awash in data, and you have to find a way to collect all of it,” Dr. Hayes said. “Social Feed Manager is great for researchers like me who want the data but don’t know how to get it.”
Building a Social Media Archive
In addition to helping researchers, library staff has been collecting Twitter data on student organizations and university departments to enhance the library’s digital archive. Former GW University Archivist Bergis Jules started the initiative before leaving the university last summer.
“The student groups used to have physical materials,” Mr. Chudnov said. “But these days their stuff is on Twitter, Tumblr and Facebook. This app is helping us preserve university history.”
He hopes to expand the program to other platforms such as Tumblr, Flickr, Facebook, YouTube and even those that have yet to be created.
“These applications are not necessarily stable. Remember MySpace?” he asked. “It’s been replaced by other things and many people deleted their profiles, but it’s a great example of how we lose data when we don’t collect from these platforms while they’re around.”
Future archives will rely heavily on our digital footprint, which is why libraries are embracing the digital sphere and web archives, Mr. Chudnov said.
“We have access to this stream of how people talk to each other, and we know there will be interest in this data in the future,” Mr. Chudnov said. “This application aligns with the mission of libraries because it follows the research.
“Modern libraries have such great tools now and techniques for dealing with data at scale. Who knows what researchers, historians and students will be able to do with this data in the future?” he added.