Similar Researcher Search in Academic Environments
Sujatha Das Gollapalli; Prasenjit Mitra; C. Lee Giles

ABSTRACT
Entity search is an emerging IR and NLP task that involves the retrieval of entities of a specific type in response to a query. We address the “similar researcher search” or “researcher recommendation” problem, an instance of “similar entity search” for the academic domain. In response to a ‘researcher name’ query, the goal of a researcher recommender system is to output a list of researchers whose expertise is similar to that of the queried researcher. We propose models for computing similarity between researchers based on expertise profiles extracted from their publications and academic homepages. We provide results of our models for the recommendation task on two publicly available datasets. To the best of our knowledge, we are the first to address content-based researcher recommendation in an academic setting and demonstrate it for Computer Science via our system, ScholarSearch.

An Analysis of the Named Entity Recognition Problem in Digital Library Metadata
Nuno Freire; Jose Borbinha; Pável Calado

ABSTRACT
Information resources in digital libraries are usually described, along with their context, by structured data records, commonly referred to as metadata. Those records often contain unstructured information in natural language text, since they typically follow a data model that defines generic semantics for its data elements, or that includes data elements modeled to contain free text. The information contained in these data elements, although machine readable, resides in unstructured natural language texts that are difficult for computers to process. This paper addresses a particular task of information extraction, typically called named entity recognition, which deals with the references to entities made by names occurring in the texts. It presents the results of a study of how the named entity recognition problem manifests itself in digital library metadata. In particular, we present the main differences between performing named entity recognition in natural language and in the text within metadata. The paper concludes with a novel approach for named entity recognition in metadata.