This article analyzes the degradation of data accuracy in large-scale longitudinal data sets. Recent research points to a number of issues with large-scale data, including problems of reliability, accuracy, and quality over time. At the same time, large-scale data is increasingly being used in the social sciences. As scholars work to produce theoretically grounded research utilizing "small-scale" methods, it is important for researchers to better understand the critical issues associated with the analysis of large-scale data. To illustrate these issues, a case study analysis of archival Internet data is presented, focusing on the degradation of data accuracy over time. Suggestions for future studies are given.
|Original language||English (US)|
|Title of host publication||Proceedings of the 2015 ACM Web Science Conference|
|Publisher||Association for Computing Machinery, Inc|
|State||Published - Jun 28 2015|
|Event||7th ACM Web Science Conference, WebSci 2015 - Oxford, United Kingdom|
Duration: Jun 28 2015 → Jul 1 2015
Bibliographical note
Funding Information:
The authors wish to acknowledge support from the National Science Foundation (Grant #1244727), the NetSCI Network Science Research Lab at Rutgers University, as well as the Internet Archive (archive.org) in making this data available.
Copyright is held by the owner/author(s).