CSO Classifier

Classifying research papers according to their research topics is an important task: it improves their retrievability, assists in the creation of smart analytics, and supports a variety of approaches for analysing and making sense of the research environment. On this page, we present the CSO Classifier, a new unsupervised approach for automatically classifying research papers according to the Computer Science Ontology (CSO), a comprehensive ontology of research areas in the field of Computer Science.

Try out our CSO Classifier Demo

The CSO Classifier takes as input the metadata associated with a research paper (title, abstract, keywords) and returns a selection of research concepts drawn from the ontology. It consists of two main components: (i) the syntactic module and (ii) the semantic module. Figure 1 depicts its architecture. The syntactic module parses the input document and identifies the CSO concepts that are explicitly referred to in it. The semantic module uses part-of-speech tagging to identify promising terms and then exploits word embeddings to infer semantically related topics. Finally, the CSO Classifier combines the results of these two modules and enhances them by including relevant super-areas.

Figure 1: Framework of CSO Classifier
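
The two-module pipeline described above can be illustrated with a small, self-contained sketch. This is not the classifier's actual implementation: the ontology fragment, the RELATED lookup table (a stand-in for word-embedding similarity), and all function names below are hypothetical and exist only to show how the syntactic results, the semantic results, and the super-area enhancement might fit together.

```python
import re

# Hypothetical fragment of an ontology: topic -> direct super-areas.
SUPER_AREAS = {
    "neural networks": ["machine learning"],
    "machine learning": ["artificial intelligence"],
    "ontology": ["semantic web"],
    "semantic web": ["world wide web"],
}
TOPICS = set(SUPER_AREAS) | {p for ps in SUPER_AREAS.values() for p in ps}

# Hypothetical stand-in for embedding similarity: term -> related topics.
# The real semantic module uses part-of-speech tagging and word embeddings.
RELATED = {
    "deep learning": ["neural networks"],
    "rdf": ["semantic web"],
}


def ngrams(text, max_n=3):
    """Yield all word n-grams (n = 1..max_n) of the lower-cased text."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])


def syntactic_module(text):
    """Topics whose label appears explicitly in the text."""
    return {g for g in ngrams(text) if g in TOPICS}


def semantic_module(text):
    """Topics related to terms in the text (table lookup, not embeddings)."""
    return {t for g in ngrams(text) for t in RELATED.get(g, [])}


def enhance(topics):
    """Direct super-areas of the given topics that are not already present."""
    return {p for t in topics for p in SUPER_AREAS.get(t, [])} - topics


def classify(paper):
    """Combine both modules, then add super-areas of the combined result."""
    text = " ".join(paper.get(k, "") for k in ("title", "abstract", "keywords"))
    syntactic = syntactic_module(text)
    semantic = semantic_module(text)
    union = syntactic | semantic
    return {
        "syntactic": sorted(syntactic),
        "semantic": sorted(semantic),
        "union": sorted(union),
        "enhanced": sorted(enhance(union)),
    }


paper = {
    "title": "Deep learning for ontology matching",
    "abstract": "We use RDF data.",
    "keywords": "machine learning",
}
result = classify(paper)
```

In this toy run, "ontology" and "machine learning" are found by exact matching, "neural networks" and "semantic web" are inferred from the related-terms table, and the enhancement step contributes the super-areas that were not already in the combined set.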

We developed the classifier in Python 3 and we release it under the Apache 2.0 Licence.

Relevant papers

If you want to know more about this research initiative, please refer to the following papers:

  • Salatino, A.A., Thanapalasingam, T., Mannocci, A., Osborne, F. and Motta, E. 2018. Classifying Research Papers with the Computer Science Ontology. ISWC-P&D-Industry-BlueSky 2018 (2018). Read from ORO
  • Salatino, A.A., Osborne, F., Thanapalasingam, T. and Motta, E. 2019. The CSO Classifier: Ontology-Driven Detection of Research Topics in Scholarly Articles. TPDL 2019: 23rd International Conference on Theory and Practice of Digital Libraries. Springer. Read from ORO

Download and Install

The CSO Classifier is an ongoing project. You can follow its development through our GitHub repository https://github.com/angelosalatino/cso-classifier, or you can download the latest release from Zenodo.


You can also install the CSO Classifier using the package manager for Python:

  1. Ensure you have Python 3.6 or above installed. Download the latest version.
  2. Use pip to install the classifier: pip install cso-classifier
  3. Download the English model for spaCy: python -m spacy download en_core_web_sm
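
After following the steps above, a quick check like the one below can confirm that the three prerequisites are in place. It is only a sketch: the helper name check_environment is hypothetical, and the check assumes that the pip package is importable as cso_classifier and that the spaCy model is installed as an importable package named en_core_web_sm.

```python
import importlib.util
import sys


def check_environment():
    """Report which installation prerequisites appear to be satisfied."""
    return {
        # Step 1: Python 3.6 or above.
        "python_3_6_or_above": sys.version_info >= (3, 6),
        # Step 2: the classifier package (assumed import name: cso_classifier).
        "cso_classifier_installed":
            importlib.util.find_spec("cso_classifier") is not None,
        # Step 3: the English spaCy model (assumed package name: en_core_web_sm).
        "spacy_model_installed":
            importlib.util.find_spec("en_core_web_sm") is not None,
    }


status = check_environment()
for name, ok in status.items():
    print(f"{name}: {'OK' if ok else 'MISSING'}")
```

Any line reported as MISSING points to the installation step that still needs to be run.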

Demo CSO Classifier

The CSO Classifier is also available as a demo on the CSO Portal.

You can run the classifier on your own papers or on the sample papers provided within the demo.

Go to CSO Classifier Demo