Descriptive Metadata, Iconclass, and Digitized Emblem Literature
Timothy W. Cole; Myung-Ja Han; Jordan Vannoy

In this paper, we describe features of a digital library designed to support the ways emblem scholars discover and use digitized emblem resources. Digitized Renaissance and Baroque emblem literature and associated scholarship pose challenges regarding resource and metadata granularity, the use of interdisciplinary controlled vocabularies, and requirements to present digitized sources in a complex setting of associated sources, derivatives, and contemporary context. We focus on metadata design, issues of resource granularity and identification, and the use of Linked Data Web services for Iconclass, a multilingual classification system for cultural heritage art and images. Results of this work, undertaken as a collaboration between emblem scholars and librarians, demonstrate the importance of librarian-scholar collaboration and illustrate the ways digital libraries need to move beyond merely disseminating digitized book surrogates. Work to date has laid a foundation for broader and more interactive use of digitized emblem content on an increasingly Linked Data scholarly Web.


Generating Ground Truth for Music Mood Classification Using Mechanical Turk
Jin Ha Lee; Xiao Hu

Nominated for Vannevar Bush Best Paper

Mood is an important access point in music digital libraries and online music repositories, but generating ground truth for evaluating music mood classification algorithms is a challenging problem: because music mood is subjective, collecting enough human judgments is time-consuming and costly. In this study, we explore the viability of crowdsourcing music mood classification judgments using Amazon Mechanical Turk (MTurk). Specifically, we compare the mood classification judgments collected for the annual Music Information Retrieval Evaluation eXchange (MIREX) with judgments collected using MTurk. Our data show that the overall distribution of mood clusters and the agreement rates from MIREX and MTurk were comparable. However, Turkers tended to agree less with the pre-labeled mood clusters than MIREX evaluators did. The system evaluation results generated using both sets of data were mostly the same, differing only in the detection of one statistically significant pair using Friedman’s test. We conclude that MTurk can potentially serve as a viable alternative for ground truth collection, with some reservations regarding particular mood clusters.