AI-powered analysis tools can smartly sift through millions of documents – rapidly collating insights and identifying crucial building blocks for your project.
On this page you will find a range of tools to help you harness the potential of text and data mining (TDM).
- Articles
- Tokenised full text
- Natural language processing tools
Free TDM resources you can request from us
A full text XML article sample
If you would like to test TDM with our journals, we can provide you with a full-text XML sample of machine-ready articles to download and use with your TDM software.
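As a rough illustration of a first step with a full-text XML file, the sketch below pulls the plain text out of an article using Python's standard library. The miniature article and its element names are invented for this example; the real sample may use a different schema, so treat this as a starting point only.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature article, made up for illustration.
# The element names in the real XML sample may differ.
sample_xml = """<article>
  <title>Example title</title>
  <body>
    <p>First paragraph.</p>
    <p>Second <i>paragraph</i>.</p>
  </body>
</article>"""

root = ET.fromstring(sample_xml)

# itertext() yields every text node in document order, whatever the
# tag names are, so this step does not depend on the schema.
raw_text = "".join(root.itertext())

# Collapse the indentation and line breaks into single spaces.
plain_text = " ".join(raw_text.split())

print(plain_text)
```

For a file on disk, `ET.parse(path).getroot()` would replace the `ET.fromstring` call.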
Tokenised full text articles
TDM starts by breaking text into units (tokens) that are fed into the mining software. One year of articles from our open access journal Chemical Science is available as space-separated tokenised full text. The algorithm we use, Chemtok, is chemically aware, so it keeps chemical names such as 1,1,1-trichloroethane together as single tokens.
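To see why chemical awareness matters, the sketch below compares a naive tokeniser that splits on punctuation with the way space-separated tokenised text is read back in. It does not call Chemtok; the pre-tokenised line is a hand-written stand-in for what such text might look like, not real Chemtok output.

```python
import re

sentence = "The solvent 1,1,1-trichloroethane was removed under vacuum."

# Naive approach: split on every non-alphanumeric character.
# This breaks the chemical name apart at its commas and hyphen.
naive_tokens = [t for t in re.split(r"[^A-Za-z0-9]+", sentence) if t]

# Space-separated tokenised text needs only str.split() to recover tokens.
# Assumption: this line is written by hand to mimic pre-tokenised text.
pretokenised = "The solvent 1,1,1-trichloroethane was removed under vacuum ."
tokens = pretokenised.split()

print(naive_tokens)
print(tokens)
```

The naive list contains three separate `1` tokens and a bare `trichloroethane`, while the pre-tokenised line keeps the full name as one token.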
Open source TDM resources to use now
Chemlistem: open source algorithm
While working on text and data mining, our data science team developed Chemlistem, a named entity recognition (NER) tool that uses deep learning.
Chemtok Python library
This is the code we used to produce the tokenised full text articles above, available for anyone to apply to their own work.
Chemical ontologies
Chemistry-relevant terms organised for use with TDM or for structuring results in a graph database.
You can find ontologies for chemical methods, named reactions, and molecular processes. They contain definitions of the terms of interest, relations between them and synonyms for each of the terms.
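One way to picture how definitions, relations and synonyms fit together is the minimal sketch below. The `Term` class, the `EX:` identifiers and the placeholder definitions are all hypothetical and are not taken from the ontologies themselves.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Term:
    """Hypothetical in-memory form of one ontology term."""
    term_id: str
    name: str
    definition: str
    synonyms: List[str] = field(default_factory=list)
    # Each relation is (relation name, target term_id).
    relations: List[Tuple[str, str]] = field(default_factory=list)


# Made-up identifiers and placeholder definitions, for illustration only.
terms = [
    Term("EX:0001", "chromatography", "Placeholder definition."),
    Term(
        "EX:0002",
        "liquid chromatography",
        "Placeholder definition.",
        synonyms=["LC"],
        relations=[("is_a", "EX:0001")],
    ),
]

# Synonym lookup: map every name and synonym to its term id, so that
# different spellings found during mining resolve to the same term.
lookup = {}
for term in terms:
    for label in [term.name, *term.synonyms]:
        lookup[label.lower()] = term.term_id

# Relations flattened to (source, relation, target) triples, a shape
# that maps naturally onto edges in a graph database.
edges = [(t.term_id, rel, target) for t in terms for rel, target in t.relations]

print(lookup["lc"])
print(edges)
```

Keeping synonyms and relations on the term itself means the same records can drive both text matching and graph loading.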
Request your free TDM resources
If you’re an organisation interested in text and data mining, complete the form to choose the downloads you would like to receive.
A new frontier for science exploration and discovery
Recent advances in AI, robotics, data analysis, modelling and simulation have allowed scientists to augment their research. These advances speed up discovery, cut the time some laboratory tasks take from weeks or months to just hours, and reveal patterns and possibilities that humans alone would not see.
Get in touch
- Send us an email