TagTog is an AI startup that aims to make NLP modelling easier with a text analytics, visualization and annotation system that lets subject matter experts bring in domain-specific insights. It can annotate plain text, PDFs, source code or web URLs manually, with semi-supervised learning, or automatically. It was launched in October 2017. Its founders, Jorge Campos Prieto and Dr Juan Miguel Cejuela, started it during their PhD research on text mining applied to biomedicine at the University of Munich. Dr Cejuela and some colleagues have also presented a paper based on TagTog. TagTog is based in Munich (Germany) and Gdansk (Poland).
TagTog helps generate high-quality text datasets for training NLP algorithms, with moderation and customization. The platform uses ML-assisted models that learn from pre-annotated data to annotate new data quickly and surface the relevant information in the text. Manual annotation services are also provided, following the customer's guidelines. TagTog specializes in text classification and annotation, entity extraction, entity normalisation, concept search (discovering patterns in unstructured text, identifying problems and finding solutions), big texts, annotated corpora, semantic search, text mining, business intelligence and CRM data enrichment. Its automatic annotation review helps save time and cost. It has an active open-source community on GitHub.
Features
Generate training data for ML methods and create labelled datasets:
- Manage a team that annotates text manually, or import pre-annotated data.
- Leverage machine-learning models with constant feedback to work at scale and in a semi-supervised manner.
- Find out if data is biased.
- Multiple document formats are supported, not only plain text: annotate PDFs or import text from HTML, TXT, CSV, Markdown and source code files.
- Supports Unicode and multiple languages (English, Spanish, Hindi, Bengali, French, Chinese, Japanese, Arabic, Swedish, Dutch, etc.).
- Dictionary annotations use ML to learn from pre-annotated data and automatically generate similar annotations.
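To make the idea of dictionary annotations concrete, here is a minimal sketch of dictionary lookup in Python. It uses plain case-insensitive string matching, not a learned model, and it is not TagTog's implementation: the `Annotation` type, the `annotate_with_dictionary` function and the sample dictionary are all hypothetical names chosen for illustration.

```python
import re
from typing import Dict, List, NamedTuple


class Annotation(NamedTuple):
    start: int   # character offset where the match begins
    end: int     # character offset just past the match
    text: str    # matched surface form, as it appears in the text
    label: str   # entity type taken from the dictionary


def annotate_with_dictionary(text: str, dictionary: Dict[str, str]) -> List[Annotation]:
    """Return one annotation per whole-word, case-insensitive dictionary match."""
    annotations = []
    for term, label in dictionary.items():
        # re.escape keeps terms with punctuation from being read as regex syntax
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        for match in pattern.finditer(text):
            annotations.append(Annotation(match.start(), match.end(), match.group(), label))
    # Tuples sort by start offset first, so results follow reading order
    return sorted(annotations)


if __name__ == "__main__":
    sample_dictionary = {"BRCA1": "gene", "aspirin": "drug"}  # hypothetical entries
    for annotation in annotate_with_dictionary("Aspirin does not target BRCA1.", sample_dictionary):
        print(annotation)
```

Character offsets are kept alongside each label so that the matches could later be reviewed or corrected by a human annotator.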
Text corpus with ontology
Team collaboration and quality management: invite team members to annotate text and build an annotated corpus. Assign instructions and roles to each user at any time. Distribute tasks among users automatically, based on your quality requirements, and follow them on dashboards.
Track quality and compare performance: the interface evaluates the different annotators based on inter-annotator agreement (IAA).
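The source does not say which IAA statistic TagTog computes. As an illustration only, the sketch below computes Cohen's kappa, a common agreement measure for two annotators that corrects raw agreement for the agreement expected by chance. The function name `cohen_kappa` and the sample labels are assumptions made for this example.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(labels_a: Sequence[Hashable], labels_b: Sequence[Hashable]) -> float:
    """Cohen's kappa for two annotators who labelled the same items in the same order."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both annotators must label the same non-empty set of items")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both annotators
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: probability both pick the same label independently,
    # estimated from each annotator's own label frequencies
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    annotator_1 = ["pos", "pos", "neg", "neg"]  # hypothetical labels
    annotator_2 = ["pos", "neg", "neg", "neg"]
    print(cohen_kappa(annotator_1, annotator_2))
```

A value of 1 means perfect agreement and a value near 0 means agreement no better than chance, which is why kappa is often preferred over the raw percentage of matching labels.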
Metrics
Machine learning with people in the loop: feed the system with an ML model and have a team of SMEs give feedback on its predictions as it is continuously retrained. This improves the quality of the training data and the accuracy of the model.
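One common way to organize such a loop is to accept confident predictions automatically and send the rest to human reviewers. The sketch below shows that routing step under stated assumptions: the `route_predictions` function, the `predict` callable and the 0.8 threshold are hypothetical and do not describe how TagTog itself distributes work.

```python
from typing import Callable, List, Sequence, Tuple

Prediction = Tuple[str, float]  # (label, confidence between 0 and 1)


def route_predictions(
    documents: Sequence[str],
    predict: Callable[[str], Prediction],
    threshold: float = 0.8,
) -> Tuple[List[Tuple[str, str]], List[Tuple[str, str]]]:
    """Split documents into auto-accepted and needs-review lists by model confidence."""
    accepted, needs_review = [], []
    for document in documents:
        label, confidence = predict(document)
        if confidence >= threshold:
            accepted.append((document, label))      # trusted without review
        else:
            needs_review.append((document, label))  # queued for an SME to confirm or fix
    return accepted, needs_review


if __name__ == "__main__":
    def toy_model(document: str) -> Prediction:
        # Stand-in for a real classifier: confident only when it sees "great"
        return ("positive", 0.9) if "great" in document else ("negative", 0.4)

    accepted, needs_review = route_predictions(["a great film", "hard to say"], toy_model)
    print("accepted:", accepted)
    print("needs review:", needs_review)
```

Corrections made by reviewers on the second list would then be added to the training data before the model is retrained.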
Chatbot Training with overlapping entities
Host on a secure cloud or on-premises: on the cloud, there is nothing to install and no servers to run. On-premises, it runs as a Docker container with SSO integration and requires no Internet access. In both cases, only a browser is needed.
Export annotations using the API or the web interface, in any of the numerous formats available.
API
In Python
    import requests

    # Documents endpoint; the path and the "owner" parameter match the JavaScript example
    tagtog_url = "https://www.tagtog.net/-api/documents/v1"

    # HTTP Basic authentication with your account credentials
    auth = requests.auth.HTTPBasicAuth(username="your-Username", password="your-Pswd")
    params = {"project": "ProjectName", "owner": "your-Username", "format": "formatted", "output": "null"}

    # Plain text to send for annotation
    payload = {"text": "The filmstars are George Cooney, Jennifer Aniston and Angelina Jolie"}

    response = requests.post(tagtog_url, params=params, auth=auth, data=payload)
    print(response.text)
In Javascript
    // Upload the file chosen in an <input type="file"> element
    var input = document.querySelector('input[type="file"]');
    var form_data = new FormData();
    form_data.append("file", input.files[0]);

    fetch('https://www.tagtog.net/-api/documents/v1?owner=your_Username&project=your_Project_Name&output=ann.json', {
      method: 'POST',
      // HTTP Basic authentication: base64 of "username:password"
      headers: {'Authorization': "Basic " + btoa('your_Username' + ":" + 'your_Password')},
      body: form_data
    })
      .then(response => response.text())
      .then(text => { console.log(text); })
      .catch(function(error) { console.log('Error: ', error); });
Companies
TagTog has worked with Fortune 500 companies and other organizations, including FlyBase, AWS, Lancaster University, Wolters Kluwer, Wevo, the University of North Carolina at Chapel Hill, the University of Copenhagen, the University of Luxembourg and the Center for Open Science.