fast.ai has announced fastchan, a new conda mini-distribution centred on the PyTorch ecosystem. Developers can use fastchan to install and update libraries faster, more easily, and more reliably.
fastchan lets Anaconda users install Python software such as fastai, RAPIDS, OpenCV, Hugging Face Transformers, and timm from a single channel with one command: conda install -c fastchan, followed by the names of the packages to install. Upgrades work the same way.
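As a rough illustration of the command described above, the following Python sketch assembles the conda command line without running it. The helper name conda_command and its parameters are made up for this example; only the conda install and conda update subcommands and the -c channel flag come from conda itself.

```python
import shlex


def conda_command(packages, channel="fastchan", action="install"):
    """Build (but do not run) a conda install/update command for one channel.

    `conda_command` is a hypothetical helper written for this example.
    """
    if action not in ("install", "update"):
        raise ValueError("action must be 'install' or 'update'")
    if not packages:
        raise ValueError("at least one package name is required")
    # -c adds the named channel to the channels conda searches.
    return ["conda", action, "-c", channel, *packages]


if __name__ == "__main__":
    # Show the shell command that would install fastai and timm from fastchan.
    print(shlex.join(conda_command(["fastai", "timm"])))
    # Show the matching upgrade command.
    print(shlex.join(conda_command(["fastai"], action="update")))
```

To actually run one of these commands, the list can be passed to subprocess.run(cmd, check=True) on a machine where conda is installed.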
fast.ai, co-founded by Jeremy Howard and Rachel Thomas in 2016, is a non-profit research group focused on deep learning and AI, with a stated objective of democratising deep learning. One of its most popular creations is fastai, an open-source deep learning library. fastai provides components that deliver state-of-the-art results in standard deep learning domains, and lets practitioners experiment and mix and match those components to discover new approaches. In short, it aims to make deep learning hassle-free. The library leverages the dynamism of the underlying Python language and the flexibility of the PyTorch library.
The distribution problem
Anaconda is a free and open-source distribution of programming languages such as Python and R. It ships with a Python interpreter and a range of machine learning and data science packages. Its package manager, conda, can also create separate, self-contained environments with isolated software installations. This means a developer can create a quick throwaway environment to test new software without breaking the base environment, or maintain separate environments for projects that require different versions of Python or of particular libraries.
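The throwaway-environment workflow described above can be sketched in Python as a pair of command builders. The helper names, the environment name and the Python version are illustrative assumptions; the conda create and conda remove --all subcommands are standard conda.

```python
import shlex


def create_env_command(name, python_version):
    """Command that creates an isolated environment with a given Python.

    Hypothetical helper for illustration; it only builds the argument list.
    """
    return ["conda", "create", "--yes", "--name", name, f"python={python_version}"]


def remove_env_command(name):
    """Command that deletes the environment and everything installed in it."""
    return ["conda", "remove", "--yes", "--name", name, "--all"]


if __name__ == "__main__":
    # "scratch" and "3.9" are example values, not recommendations.
    print(shlex.join(create_env_command("scratch", "3.9")))
    print(shlex.join(remove_env_command("scratch")))
```

Because each environment has its own name and its own package set, removing one leaves the base environment and other projects untouched.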
Anaconda regularly releases new versions of its main installer along with other packages. Between installer releases, it also tests new versions of packages and adds them to the defaults channel. Many of the packages in the defaults channel are sourced from conda-forge, a repository to which users upload recipes for building software. Anaconda takes a subset of these packages, along with other software, performs additional integration testing on them, and makes them available in its distribution.
While this works in many situations, many Python libraries are not in conda-forge or in any other channel. This is especially true of libraries that use the GPU, because conda-forge does not have the facilities for building and testing GPU-enabled software.
fastchan
To overcome these challenges, fast.ai has created a new channel and distribution called fastchan. It contains all the dependencies needed to install fastai, PyTorch, RAPIDS, and related libraries. To avoid packaging software from scratch, the team used the official PyTorch build of PyTorch, along with the official NVIDIA builds of RAPIDS and the CUDA toolkit. Libraries and dependencies that are available only on conda-forge are copied into the fastchan channel using the copy command of the anaconda command-line client.
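The copy step mentioned above might look roughly like the following sketch. It assumes the anaconda-client command line, where anaconda copy takes a spec of the form owner/package/version and a --to-owner option; the exact options should be checked against anaconda copy --help. The helper name, package name and version below are placeholders, not packages the fast.ai team is known to have copied.

```python
import shlex


def anaconda_copy_command(source_owner, package, version, to_owner="fastchan"):
    """Build (but do not run) an `anaconda copy` command.

    Assumption: the spec format owner/package/version and the --to-owner
    option match the installed anaconda-client; verify before use.
    """
    spec = f"{source_owner}/{package}/{version}"
    return ["anaconda", "copy", "--to-owner", to_owner, spec]


if __name__ == "__main__":
    # "example-pkg" and "1.0.0" are placeholder values.
    print(shlex.join(anaconda_copy_command("conda-forge", "example-pkg", "1.0.0")))
```

Running such a command for real requires being logged in with upload rights to the target channel.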
The fast.ai team included in fastchan only those dependencies that are not already available in the defaults channel. They also converted software that is available only as pip packages on PyPI into conda packages, using setuptools-conda and a program called build.py, which the fast.ai team wrote from scratch.
With fastchan, developers can rely just on the defaults and fastchan channels to install nearly every package and library, especially when working with software in the PyTorch and Hugging Face ecosystems.
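One way to express the two-channel setup described above is a conda configuration file. The sketch below generates the text of a minimal .condarc in Python; the helper name is made up for this example, and listing fastchan ahead of defaults is an assumption about the desired priority (conda gives channels earlier in the list higher priority), not something stated by the fast.ai team.

```python
def condarc_text(channels):
    """Render a minimal .condarc that lists channels in priority order.

    Hypothetical helper; it returns text and does not touch any files.
    """
    if not channels:
        raise ValueError("at least one channel is required")
    lines = ["channels:"]
    lines += [f"  - {channel}" for channel in channels]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Assumed ordering: fastchan first, then defaults.
    print(condarc_text(["fastchan", "defaults"]), end="")
```

The resulting text could be saved as .condarc in the user's home directory, after which plain conda install commands would search both channels without needing the -c flag.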
What next
“I hope that fastchan will be a useful starting point for folks thinking about Python packaging and deployment,” said Jeremy Howard.
The team hopes to add more features soon, including integration tests that run on both CPU and GPU to ensure that code using several libraries together gives the expected results. Howard has also invited developers to add their own integration tests. “Integration tests are crucial to ensure that no-one adds or changes a package which causes breakage on dependent packages (or at least to ensure that broken downstream packages are clearly marked as such),” he said.
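To give a feel for what a very small integration check could look like, the sketch below imports several libraries in the same process and reports which ones fail. It is only a smoke test written for illustration, not the fast.ai team's test suite, and the function names are hypothetical. A fuller test would also run a small computation across the libraries, on both CPU and GPU.

```python
import importlib


def import_together(module_names):
    """Import every module in one process; map each name to None or an error."""
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            results[name] = None
        except Exception as exc:  # ImportError, or any failure raised at import time
            results[name] = f"{type(exc).__name__}: {exc}"
    return results


def broken_modules(results):
    """Return the sorted names of modules that failed to import."""
    return sorted(name for name, error in results.items() if error is not None)


if __name__ == "__main__":
    # Example module list; these must be installed for the check to pass.
    outcome = import_together(["torch", "fastai", "timm"])
    for name in broken_modules(outcome):
        print(f"{name} failed: {outcome[name]}")
```

Importing everything in one process matters because two packages can each install cleanly yet conflict once they are loaded side by side.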
According to Howard, while fastchan was created primarily for the use of fast.ai, he hopes that in future, key players like PyTorch, NVIDIA, Anaconda, and conda-forge will solve the distribution problem together and make fastchan obsolete.