ResearchFlow: Understanding the Knowledge Flow between Academia and Industry

ABSTRACT:

Understanding, monitoring, and predicting the flow of knowledge between academia and industry is of critical importance for a variety of stakeholders, including governments, funding bodies, researchers, investors, and companies. To this purpose, we introduce ResearchFlow, an approach that integrates semantic technologies and machine learning to quantify the diachronic behaviour of research topics across academia and industry. ResearchFlow exploits the novel Academia/Industry DynAmics (AIDA) Knowledge Graph in order to characterize each topic according to the frequency in time of the related i) publications from academia, ii) publications from industry, iii) patents from academia, and iv) patents from industry. This representation is then used to produce several analytics regarding the academia/industry knowledge flow and to forecast the impact of research topics on industry. We applied ResearchFlow to a dataset of 3.5M papers and 2M patents in Computer Science and highlighted several interesting patterns. We found that 89.8% of the topics first emerge in academic publications, which typically precede industrial publications by about 5.6 years and industrial patents by about 6.6 years. However, this does not mean that academia always dictates the research agenda. In fact, our analysis also shows that industrial trends tend to influence academia more than academic trends affect industry. We evaluated ResearchFlow on the task of forecasting the impact of research topics on the industrial sector and found that its granular characterization of topics significantly improves performance with respect to alternative solutions.


On this page you can download all the data regarding the evaluation of ResearchFlow, described in the paper:
Salatino, A.A., Osborne, F., Motta, E.: ResearchFlow: Understanding the Knowledge Flow between Academia and Industry. Submitted to EKAW 2020.

ResearchFlow integrates data from publications, patents, and organizations in order to characterise each topic according to the frequency in time of its i) publications from academia, ii) publications from industry, iii) patents from academia, and iv) patents from industry. This representation is then used to produce several analytics regarding the academia/industry knowledge flow and to forecast the impact of research topics on industry.

We evaluated ResearchFlow on the task of forecasting the impact of a topic and found that its granular characterisation of topic evolution significantly improves performance with respect to alternative solutions.

The dataset that allowed us to develop ResearchFlow and perform this analysis is also available. You can download the file from the button below. The archive contains a folder of JSON files. Each file represents a topic and contains a JSON dictionary with four main keys: ‘papers-education’, ‘papers-company’, ‘patents-education’, ‘patents-company’. Each key is associated with a list of 29 values: the number of documents for that signal in each year from 1990 to 2018 (both included).
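As an illustration of the file layout described above, the following Python sketch reads one topic file and indexes each of the four signals by year. It is not part of ResearchFlow: the function names (load_topic, index_by_year, first_active_year) are hypothetical, and it assumes that the 29 values are numeric counts listed in chronological order, so that the first value refers to 1990 and the last to 2018.

```python
import json

# The four keys documented for every topic file.
SIGNALS = (
    "papers-education",
    "papers-company",
    "patents-education",
    "patents-company",
)

# 1990 to 2018, both included: 29 years.
YEARS = list(range(1990, 2019))


def index_by_year(topic):
    """Map each signal to a {year: count} dictionary.

    Assumption: values are in chronological order (index 0 is 1990).
    """
    series = {}
    for signal in SIGNALS:
        values = topic[signal]
        if len(values) != len(YEARS):
            raise ValueError(
                f"{signal}: expected {len(YEARS)} values, got {len(values)}"
            )
        series[signal] = dict(zip(YEARS, values))
    return series


def load_topic(path):
    """Read one topic file (path is supplied by the caller) and index it."""
    with open(path, encoding="utf-8") as handle:
        return index_by_year(json.load(handle))


def first_active_year(series, signal):
    """Return the first year with at least one document, or None."""
    for year in YEARS:
        if series[signal][year] > 0:
            return year
    return None
```

For example, calling load_topic on a file extracted from the archive and then first_active_year(series, "papers-education") gives the first year in which the topic has at least one academic publication in the dataset, which can be compared with the same value for the other three signals.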

For any questions about ResearchFlow or the evaluation, please contact angelo.salatino@open.ac.uk or francesco.osborne@open.ac.uk.