Researchers at Tsinghua University recently released an autoML framework and toolkit for machine learning on graphs, known as AutoGL. AutoGL version 0.1.1 is claimed to be the first-ever autoML toolkit for graph datasets and tasks.
AutoML, or automated machine learning, has gained considerable traction in recent years, partly because it helps bridge the talent gap in the machine learning industry. Graphs, meanwhile, are a ubiquitous data structure that researchers across many fields rely on. Because the new toolkit supports fully automated machine learning on graph data, it should help relieve machine learning developers of mundane tasks.
Tech Behind AutoGL
AutoGL, or Auto Graph Learning, is an automated machine learning (AutoML) toolkit designed for graph datasets and tasks. According to its developers, the toolkit automatically handles all the stages involved in graph learning problems, including dataset download and management, data preprocessing and feature engineering, model selection and training, hyper-parameter tuning, and ensembling.
The toolkit is aimed mainly at machine learning researchers and developers who want to run AutoML on graph datasets and tasks quickly. Its datasets module maintains datasets for graph-based machine learning; it is built on the Dataset class in PyTorch Geometric, with some added support so that it can cooperate with the auto solver framework.
Building on the concept of automated machine learning, auto graph learning aims to solve tasks on graph-structured data automatically. According to the developers, unlike conventional learning frameworks, it does not need humans inside the experiment loop.
To achieve this, the auto graph learning framework includes the following:
- A dataset component that maintains the graph datasets provided by users.
- AutoGL solvers that handle the various graph-based ML tasks. A solver object is built to specify the target task, and inside the solver there are four sub-modules that help complete it: auto feature engineer, auto model, hyperparameter optimisation, and auto ensemble.
- The sub-modules automatically preprocess and enhance the data, choose and optimise deep models, and ensemble them in the best way possible.
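The division of labour between a solver and its four sub-modules can be pictured with a small, library-free sketch. Everything in it (the `ToySolver` class, its stage functions and the toy data) is hypothetical and merely stands in for AutoGL's real components; it illustrates the order in which the stages run, not the toolkit's actual API.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Row = List[float]


def add_sum_feature(rows: Sequence[Row]) -> List[Row]:
    """Feature engineering stand-in: append the row sum as an extra attribute."""
    return [list(r) + [sum(r)] for r in rows]


def make_threshold_model(threshold: float) -> Callable[[Row], int]:
    """Model stand-in: predict class 1 when the last feature exceeds a threshold."""
    return lambda row: int(row[-1] > threshold)


def accuracy(model, rows, labels) -> float:
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(labels)


@dataclass
class ToySolver:
    """Hypothetical solver that runs the four stages in a fixed order."""
    feature_engineer: Callable[[Sequence[Row]], List[Row]]
    model_factory: Callable[[float], Callable[[Row], int]]
    search_space: Sequence[float]  # candidate hyperparameter values
    ensemble_size: int = 3

    def fit(self, rows, labels):
        feats = self.feature_engineer(rows)        # 1. feature engineering
        scored = []
        for value in self.search_space:            # 2 and 3. build and tune models
            model = self.model_factory(value)
            scored.append((accuracy(model, feats, labels), value, model))
        scored.sort(key=lambda t: t[0], reverse=True)
        self.members = [m for _, _, m in scored[: self.ensemble_size]]  # 4. ensemble
        self.leaderboard = [(value, score) for score, value, _ in scored]
        return self

    def predict(self, rows):
        feats = self.feature_engineer(rows)
        votes = [[m(r) for m in self.members] for r in feats]
        return [int(sum(v) * 2 > len(v)) for v in votes]  # majority vote


rows = [[0.1, 0.2], [0.4, 0.1], [0.9, 0.8], [0.7, 0.9]]
labels = [0, 0, 1, 1]
solver = ToySolver(add_sum_feature, make_threshold_model, [0.2, 1.0, 2.0]).fit(rows, labels)
print(solver.leaderboard[0], solver.predict(rows))
```

The point of the design is that the user only states the task and the search space; the loop over candidates and the final vote happen without manual intervention.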
In this toolkit, each task is solved by a corresponding learner, which typically does the following:
- It preprocesses and feature-engineers the given datasets. This is done by the auto feature engineer module, which can automatically add useful attributes to the datasets and remove useless ones.
- It automatically trains and tunes popular ML models specified by the users. This is done by the auto model and hyperparameter optimisation modules.
- It finds the best way to ensemble the models found and trained in the previous step. This is done by the auto ensemble module.
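Two of these steps are easy to illustrate without the library. The sketch below is a toy under stated assumptions, not AutoGL's implementation: `add_degree_feature`, `drop_constant_features` and `soft_vote` are hypothetical helpers. The first two mimic a feature engineer that adds a structural attribute (node degree) and removes attributes carrying no information; the last averages class probabilities from several models, which is one common way to form an ensemble.

```python
from typing import List, Sequence, Tuple


def add_degree_feature(features: Sequence[Sequence[float]],
                       edges: Sequence[Tuple[int, int]]) -> List[List[float]]:
    """Append each node's degree (number of incident edges) as a new attribute."""
    degree = [0] * len(features)
    for u, v in edges:  # undirected edges given as (u, v) pairs
        degree[u] += 1
        degree[v] += 1
    return [list(row) + [float(d)] for row, d in zip(features, degree)]


def drop_constant_features(features: Sequence[Sequence[float]]) -> List[List[float]]:
    """Remove columns whose value is identical for every node."""
    n_cols = len(features[0])
    keep = [c for c in range(n_cols) if len({row[c] for row in features}) > 1]
    return [[row[c] for c in keep] for row in features]


def soft_vote(prob_lists: Sequence[Sequence[Sequence[float]]]) -> List[int]:
    """Average per-class probabilities across models, then pick the top class."""
    n_models = len(prob_lists)
    preds = []
    for per_node in zip(*prob_lists):  # one tuple of model outputs per node
        mean = [sum(p) / n_models for p in zip(*per_node)]
        preds.append(max(range(len(mean)), key=mean.__getitem__))
    return preds


# Three nodes on a path 0 - 1 - 2; the first attribute is constant and thus useless.
features = [[1.0, 0.5], [1.0, 0.7], [1.0, 0.1]]
edges = [(0, 1), (1, 2)]
engineered = drop_constant_features(add_degree_feature(features, edges))
print(engineered)

# Class probabilities for the three nodes from two hypothetical trained models.
model_a = [[0.9, 0.1], [0.4, 0.6], [0.2, 0.8]]
model_b = [[0.6, 0.4], [0.7, 0.3], [0.1, 0.9]]
print(soft_vote([model_a, model_b]))
```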
The toolkit supports various popular benchmarks, such as Cora, Citeseer, Reddit, and Amazon Photo. It also supports the popular OGB benchmarks for node classification and graph classification tasks.
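At present, the toolkit supports a number of algorithms for feature engineering, models, hyperparameter optimisation and ensembling; the full list is given in the project's documentation.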
Benefits of AutoGL
Some of the benefits of using this toolkit are mentioned below:
- The toolkit is claimed to considerably reduce human labour and bias in the machine learning loop.
- With this toolkit, developers can now quickly conduct autoML on the graph datasets and tasks.
- AutoGL serves as a platform for users to implement and test their own autoML or graph-based machine learning models.
How To Install AutoGL
Before installing the tool, you first need to install AutoGL's requirements: Python >= 3.6.0 and PyTorch >= 1.5.1.
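A quick way to confirm that an environment meets these minimums is a short check script. The sketch below is illustrative only: `parse_version` and `check_requirements` are hypothetical helper names, and the version parsing is deliberately simple (it keeps the leading digits of the first three components and ignores local suffixes such as `+cu101`).

```python
import re
import sys

MIN_PYTHON = (3, 6, 0)
MIN_TORCH = (1, 5, 1)


def parse_version(text):
    """Turn a string such as '1.5.1+cu101' into a tuple such as (1, 5, 1)."""
    core = str(text).split("+")[0]
    parts = []
    for piece in core.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    parts += [0] * (3 - len(parts))  # pad short versions such as '1.10'
    return tuple(parts)


def check_requirements():
    """Return (python_ok, torch_ok) for the minimum versions quoted above."""
    python_ok = tuple(sys.version_info[:3]) >= MIN_PYTHON
    try:
        import torch  # only checked if PyTorch is already installed
        torch_ok = parse_version(torch.__version__) >= MIN_TORCH
    except ImportError:
        torch_ok = False
    return python_ok, torch_ok


python_ok, torch_ok = check_requirements()
print("Python OK:", python_ok, "| PyTorch OK:", torch_ok)
```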
Install from pip
pip install auto-graph-learning
Install from source
git clone https://github.com/THUMNLab/AutoGL
cd AutoGL
python setup.py install
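After either installation route, it can be useful to confirm that the package is visible to the interpreter. The snippet below is a minimal sketch: `is_installed` is a hypothetical helper, and it assumes the package's import name is `autogl`.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


# "autogl" is assumed here to be the import name of the installed package.
print("autogl importable:", is_installed("autogl"))
```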