SAPIENT Automation

Semantic Automation of Papers Enrichment Tool

Project Information

Start date: 1 April 2009.

End date: 30 September 2010.

Project Minute Summary

SAPIENT Automation aims to help researchers process scientific papers faster and extract the information they are interested in. The project will achieve this by automating the recognition of core scientific concepts in papers, such as Motivation, Method, Result and Conclusion, and using them to generate automatic use-based summaries.


SAPIENT Automation will build on the outcomes of the JISC-funded project ART to assess the added benefit of annotating core scientific concepts (CISP meta-data) in research papers. The CISP meta-data are ontology-motivated labels, defined within the ART project, covering 11 concepts ("Motivation", "Background", "Hypothesis", "Goal", "Object", "Method", "Experiment", "Observation", "Result", "Conclusion", "Model"). The ART project produced a corpus of 225 papers (1 million words) from physical chemistry and biochemistry annotated with CISP meta-data, as well as a web annotation tool, SAPIENT, which allowed 16 experts to annotate the papers manually. SAPIENT Automation will automate annotation in the SAPIENT tool by employing machine learning, using the ART corpus as training data. The automatically generated concept annotations will be used to produce digital summaries of the papers.
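
To make the idea concrete, the pipeline described above (learn concept labels from annotated sentences, label new sentences, then keep only the concepts a reader cares about) can be sketched roughly as follows. This is only an illustration under stated assumptions: the project description does not say which machine learning method is used, so a simple Naive Bayes classifier stands in for it; the training sentences are invented toy examples, not sentences from the ART corpus; and the names NaiveBayesSentenceLabeller, TOY_TRAINING_DATA and summarise are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict

# The 11 CISP concept labels listed in the project description.
CISP_LABELS = [
    "Motivation", "Background", "Hypothesis", "Goal", "Object", "Method",
    "Experiment", "Observation", "Result", "Conclusion", "Model",
]


def tokenize(sentence):
    """Lower-case the sentence and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", sentence.lower())


class NaiveBayesSentenceLabeller:
    """Hypothetical stand-in classifier: multinomial Naive Bayes with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.label_counts = Counter()            # label -> number of sentences
        self.vocab = set()

    def train(self, labelled_sentences):
        for sentence, label in labelled_sentences:
            if label not in CISP_LABELS:
                raise ValueError(f"unknown CISP label: {label}")
            tokens = tokenize(sentence)
            self.label_counts[label] += 1
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)

    def predict(self, sentence):
        tokens = tokenize(sentence)
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(count / total)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for tok in tokens:
                score += math.log((self.word_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def summarise(sentences, labeller, wanted_concepts):
    """Keep only the sentences whose predicted concept the reader asked for."""
    return [s for s in sentences if labeller.predict(s) in wanted_concepts]


# Invented toy data (assumption): real training would use the annotated ART corpus.
TOY_TRAINING_DATA = [
    ("We used mass spectrometry to measure the samples", "Method"),
    ("The samples were prepared using a standard protocol", "Method"),
    ("The reaction rate increased with temperature", "Result"),
    ("The yield increased by a factor of two", "Result"),
    ("We conclude that the catalyst is effective", "Conclusion"),
    ("These findings suggest that the model is accurate", "Conclusion"),
]

labeller = NaiveBayesSentenceLabeller()
labeller.train(TOY_TRAINING_DATA)

paper = [
    "The samples were measured using spectrometry",
    "We conclude that the enzyme is stable",
]
conclusions_only = summarise(paper, labeller, {"Conclusion"})
```

A reader interested only in conclusions would pass {"Conclusion"} as above, while one interested in experimental detail might pass {"Method", "Experiment"}; this is one possible reading of what a use-based summary could look like.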

