9- Data Citation in the Humanities: What's the Problem?
Pages 59-70


From page 59...
... Data in the digital humanities: Humanities scholars started using machine-readable data in 1948, when Father Roberto Busa began work on the Index Thomisticus. This index was a full-text concordance of every word of every work published by St.
From page 60...
... Human culture did not end when humans built computers, however, so digital artifacts and born-digital objects are also objects of study for humanities disciplines. Scholars are studying digital art forms, hypertexts, interactive games, databases, and digital records of any kind.
From page 61...
... The closest thing found in the sample to this ideal practice were papers which mention published resources, which are explicitly described, sometimes with a URL pointing to the item, but with no reference to the resource in the references. Sometimes, the references include instead a reference to a related paper, which may indicate both a desire to cite the work and a discomfort with citing resources which do not take traditional scholarly forms, or perhaps uncertainty about how to cite data resources directly.
From page 62...
... If, on the contrary, that familiar division of labor reflects only a way of organizing the management of paper and other physical objects, then the digital world may well converge on a different and incommensurable set of roles. In many ways, the challenge of locating reliable metadata among them, digital objects seem to be in an incunabular phase; like the earliest printed books, digital objects lack established conventions for identifying the object or those responsible for it.
From page 63...
... As a senior figure in the field wrote to me: I think you will still find plenty of people saying "we ran a stylometric analysis on a corpus which has these properties, but we cannot let you see the actual corpus because we did not obtain the copyright." Anti-scientism: Citing data resources may seem foreign to the culture of humanistic scholarship, an eruption into the humanities of natural-scientific practices and perhaps a symptom of science envy, to be discouraged as naïve and unhelpful. Citation chains: Print has (reasonably)
From page 64...
... It is not unreasonable for scholars to be skeptical of the use of URLs to cite data of any long-term significance, even if they are interested in citing the data resources they use.
From page 65...
... At ICPSR, we do not assess data quality ourselves; it is the community that will determine whether the sample is adequate and scientifically sound. It is important, therefore, to have that descriptive information about how the survey was conducted.
From page 66...
... In the United States, we now have the National Science Foundation asking researchers applying for grants to provide data management plans, and metadata are a big component of those plans. We are hoping that this will be a positive influence on what eventually gets deposited into the data centers.
From page 67...
... If a paper is published and others decide to make judgments about its merits and publish something themselves about the quality of the data or its content, they can do that. As a data center, ICPSR does write what we consider to be comprehensive metadata references, and we track publications based on our data.


This material may be derived from roughly machine-read images, and so is provided only to facilitate research.