Guide To Labelbox – The Customizable Data Annotator Tool

Today we will be discussing a rapid data annotation tool called Labelbox, which has been a market leader for over two years and is relied on for many industry use cases.

Data annotation has evolved in recent years and performs better thanks to advanced computer vision and deep learning techniques. Earlier tools focused only on bounding boxes (the rectangle enclosing an object), but current annotation techniques allow custom shapes to be drawn around any kind of object. Many of these annotation tools are offered as end-to-end ML platforms, covering everything from data collection to production services.
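To make the difference between a bounding box and a custom shape concrete, here is a minimal sketch in plain Python. It does not use Labelbox; the function names and the sample triangle are made up for illustration. It computes the bounding box of a polygon and compares the two areas.

```python
def bounding_box(polygon):
    """Return (x_min, y_min, x_max, y_max) for a list of (x, y) points."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return (min(xs), min(ys), max(xs), max(ys))


def polygon_area(polygon):
    """Area of a simple polygon using the shoelace formula."""
    total = 0
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]  # wrap around to the first point
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


# Illustrative polygon annotation: a triangle drawn around an object.
triangle = [(2, 1), (6, 3), (4, 7)]

x_min, y_min, x_max, y_max = bounding_box(triangle)
box_area = (x_max - x_min) * (y_max - y_min)

print("bounding box:", (x_min, y_min, x_max, y_max))
print("box area:", box_area, "polygon area:", polygon_area(triangle))
```

In this example the box covers more than twice the area of the polygon, which is why a polygon or mask can describe an irregular object more tightly than a rectangle.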

Many AI-based companies are adopting these annotation tools for efficient workflow management and for iterative learning while training models. Customised annotations can be applied to all kinds of use cases.



Labelbox was released in September 2018 by founders Dan Rasmuson, Brian Rieger and Manu Sharma. It lets users manage their data with AI-enabled labelling tools that automate parts of the labelling process and support training models for active learning, and it offers API support. Teams can invite members and collaborate on workflows, and can import and export annotations in different formats. Complex ontologies help produce high-quality labels with minimal errors. Cloud services on Azure, GCP, SageMaker and many others are supported.

Labelbox allows customization of the tools to support your specific use case, including custom attributes, instances and much more.

  • Bounding boxes, points & lines, polygons
  • Instance segmentation toolkit (pen & superpixels)

Superpixel – splits the instance into groups of pixels so that parts of the object can be analysed.

Draw over objects – lets you draw around the edges of an object.

Brush – works like a normal paint brush with an adjustable radius.


  • Supports complex ontologies with nested classifications
  • Named Entity Recognition and text classification
  • Support for tiled imagery (slippy maps), which is used for geospatial data
  • Custom labels using labelbox-api.js
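The idea of an ontology with nested classifications, mentioned in the list above, can be sketched as a small data structure. This is an illustration in plain Python only: the class names, field names and the "vehicle" example are hypothetical and are not the Labelbox schema or SDK.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Classification:
    """A question asked about an annotation, optionally with nested follow-ups."""
    name: str
    options: List[str] = field(default_factory=list)
    children: List["Classification"] = field(default_factory=list)


@dataclass
class Tool:
    """A drawing tool (for example a bounding box) plus its classifications."""
    name: str
    kind: str
    classifications: List[Classification] = field(default_factory=list)


def count_classifications(items):
    """Count classifications at every nesting level."""
    total = 0
    for item in items:
        total += 1 + count_classifications(item.children)
    return total


# Hypothetical ontology entry: a vehicle box with a nested follow-up question.
vehicle = Tool(
    name="vehicle",
    kind="bounding_box",
    classifications=[
        Classification(
            name="type",
            options=["car", "truck"],
            children=[Classification(name="is_occluded", options=["yes", "no"])],
        )
    ],
)

print(count_classifications(vehicle.classifications))
```

Nesting lets the labelling interface ask follow-up questions only where they apply, which keeps the label structure consistent across annotators.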

Real-Time usage 

Python SDK

Latest version at the time of writing: 2.4.9

pip install labelbox

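Project setup (client initialisation and data connection):

from labelbox import Client

# With no arguments, Client() reads the API key from the LABELBOX_API_KEY environment variable
client = Client()
# Create a project, then a dataset attached to that project
project = client.create_project(name="<project_name>")
dataset = client.create_dataset(name="<dataset_name>", projects=project)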

GraphQL API

The Labelbox GraphQL API is query-based and thus more flexible than REST APIs: the client asks for exactly the fields it needs in a single request. It offers a strongly typed schema, a hierarchical structure, specificity and strong tooling.
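As a rough sketch of what "query-based" means in practice, the snippet below builds (but does not send) an HTTP POST request carrying a GraphQL query, using only the Python standard library. The endpoint URL, the bearer-token header and the query's field names are assumptions made for illustration; check the Labelbox API documentation for the actual schema before relying on them.

```python
import json
import urllib.request

# Assumed endpoint; confirm the current URL in the Labelbox documentation.
GRAPHQL_ENDPOINT = "https://api.labelbox.com/graphql"


def build_graphql_request(query, api_key, variables=None):
    """Build (without sending) a POST request that carries a GraphQL query."""
    payload = {"query": query, "variables": variables or {}}
    return urllib.request.Request(
        GRAPHQL_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# Hypothetical query: the field names are illustrative, not the real schema.
EXAMPLE_QUERY = """
query GetProject($id: ID!) {
  project(where: {id: $id}) {
    name
  }
}
"""

request = build_graphql_request(EXAMPLE_QUERY, "<api_key>", {"id": "<project_id>"})
# To actually send it: urllib.request.urlopen(request)
```

The query names only the fields it wants back (here just `name`), so the response contains nothing else. A REST endpoint, by contrast, usually returns a fixed shape.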

Solutions And Services:

  • Document data extraction
  • Safety monitoring
  • Manufacturing – Preventative maintenance, Defect detection, Waste management, Robotics automation
  • Health/medical – Digital pathology, Ultrasonography
  • Insurance – Property inspection
  • Drone/Aerial – Solar inspection
  • Consumer – Content moderation, Sports analytics, Thermal sensing, Generative design, Cashierless checkout
  • Agriculture – Crop weed detection, Livestock monitoring
  • Transportation – Driver safety

Use Cases

  • MIT students are using Labelbox with neural networks to automate tasks in serotonin research.
  • Graduate students in Stanford's CS230 deep learning course used it in a research project on landing urban air vehicles using satellite imagery.
  • One of the winning teams in the RoboSub competition used it while building autonomous underwater robots.
  • Researchers at the Institute of Industrial Science, University of Tokyo are using model-assisted labelling to speed up annotation.
  • Labelbox supports automation at American Family Insurance.

Companies Using LabelBox:

Labelbox is used by more than 150 companies to manage their workflows and collaborations.

  • Cape Analytics uses active learning and APIs to get AI into production faster.
  • Pathware uses it in pathology products that deliver AI-enabled analysis.
  • Arturo uses it for the insurance industry.
  • Omdena uses it for labelling tasks in deep learning for tree identification.
  • SomaDetect uses it for dairy farming.
  • Lytx, a market leader in telematics, works on saving lives on the road through video surveillance.
  • Genius Sports – AI transformation in Sports
  • Conde Nast – The parent company for 20 media companies
  • NtConcepts – faster training and deployment of AI systems
  • Xarvio uses it for the agriculture industry to optimise crop production. 

Jayita Bhattacharyya
Machine learning and data science enthusiast. Eager to learn new technology advances. A self-taught techie who loves to do cool stuff using technology for fun and worthwhile.
