Google Releases An ML Model That Can Choreograph Dance

Google's full-attention cross-modal Transformer (FACT) model can mimic and understand dance motions, and can even enhance a person’s ability to choreograph dance.

Aligning dance moves to music beats is a fundamental human behaviour, yet as an art form it requires constant practice and professional training. Expressive choreography also calls for equipping the dancer with a rich repertoire of dance moves.

The researchers explained that while this process is challenging for people, it is even more difficult for a machine learning model. The model must generate continuous motion with high kinematic complexity while capturing the non-linear relationship between the movements and the accompanying music.


Entering the domain, Shan Yang, Software Engineer, and Angjoo Kanazawa, Research Scientist from Google Research, have proposed a full-attention cross-modal Transformer (FACT) model that can mimic and understand dance motions and can even enhance a person’s ability to choreograph dance.
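The article does not describe FACT's internals, so the following is only a minimal sketch of what "full attention" across two modalities generally means: motion and audio tokens are joined into one sequence, and every token attends to every other token with no mask. The toy embeddings, the identity query/key/value projections, and the function names are all assumptions made for illustration. This is not the researchers' implementation.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def full_attention(tokens):
    """Single-head, unmasked self-attention over a list of token vectors.

    Assumption: queries, keys and values are the tokens themselves
    (identity projections); a real Transformer learns these projections.
    """
    dim = len(tokens[0])
    scale = math.sqrt(dim)
    outputs, weights = [], []
    for query in tokens:
        # Score the query against every token, including the other modality.
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in tokens]
        row = softmax(scores)
        weights.append(row)
        # Weighted sum of all token vectors.
        outputs.append(
            [sum(row[j] * tokens[j][d] for j in range(len(tokens))) for d in range(dim)]
        )
    return outputs, weights

# Hypothetical toy embeddings: 2 motion frames and 3 audio frames, 4 dims each.
motion_tokens = [[0.1, 0.3, -0.2, 0.5], [0.0, 0.4, -0.1, 0.6]]
audio_tokens = [[0.9, -0.3, 0.2, 0.1], [0.8, -0.2, 0.1, 0.0], [0.7, -0.1, 0.3, 0.2]]

# Cross-modal step: concatenate both modalities into one joint sequence.
joint_tokens = motion_tokens + audio_tokens
fused, attention = full_attention(joint_tokens)

# Each motion token now carries a non-zero weight on every audio token.
print([round(w, 3) for w in attention[0]])
```

Because nothing is masked, the first motion token's attention row has a positive weight for all five positions, which is the property the "full-attention" label refers to in this sketch.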

In addition to the model, the team released a large-scale, multi-modal 3D dance motion dataset called AIST++. It comprises 5.2 hours of 3D dance motion in 1,408 sequences covering ten dance genres, each with multi-view videos and known camera poses. The proposed FACT model outperforms previous state-of-the-art approaches, both qualitatively and quantitatively.
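To put the dataset figures in perspective, a quick back-of-the-envelope calculation gives the average length of a sequence. It uses only the two numbers quoted above; the variable names are illustrative, and the result is a plain average that says nothing about how individual sequence lengths vary.

```python
# Figures quoted in the article for AIST++.
total_hours = 5.2
num_sequences = 1408

# Convert to seconds and divide evenly across sequences (simple average).
total_seconds = total_hours * 3600
avg_seconds_per_sequence = total_seconds / num_sequences

print(f"Average sequence length: {avg_seconds_per_sequence:.1f} s")
```

The average works out to roughly 13 seconds of motion per sequence.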

Last year, researchers from ShanghaiTech University introduced Impersonator++, a GAN-based framework that performs human image synthesis using a 3D body mesh recovery module. According to the researchers, the framework tackles human motion imitation, appearance transfer, and novel view synthesis.

The ten dance genres in the AIST++ dataset are split into Old School (Break, Pop, Lock and Waack) and New School (Middle Hip-Hop, LA-style Hip-Hop, House, Krump, Street Jazz and Ballet Jazz). The genres come from the original AIST Dance Database, on which AIST++ is built. Although AIST contains multi-view videos of dancers, its cameras are not calibrated.

“We present a model that can not only learn the audio-motion correspondence but also can generate high-quality 3D motion sequences conditioned on music. Because generating 3D movement from music is a nascent area of study, we hope our work will pave the way for future cross-modal audio to 3D motion generation,” the researchers wrote in the blog post.

Get the code here.

Get the AIST++ dataset here.

Read the entire paper here.

Kumar Gandharv
Kumar Gandharv, PGD in English Journalism (IIMC, Delhi), is setting out on a journey as a tech journalist at AIM. A keen observer of national and IR-related news.
