From generating deepfake faces to synthetic videos, artificial neural networks have reached new heights and opened up new directions in emerging technologies. When modelling a deep ANN, it is important to use the right metrics with the right dataset to build a robust model.

Generative adversarial networks (GANs) are among the most popular methods for generating images, and the Fréchet Inception Distance (FID) is the most popular metric used to evaluate them. FID remedies the pitfalls of earlier image-quality metrics and is designed specifically for images.

The Fréchet Inception Distance works by taking a large number of images from both the target distribution and the generative model, then using the Inception object-recognition network to embed each image into a lower-dimensional space that captures the important features. Computing the Fréchet distance between these two sets of embeddings gives a quantitative measure of how similar the two distributions actually are.

According to the researchers at the tech giant, access to robust evaluation metrics for generative models is crucial for measuring (and making) progress in audio and video understanding, but until now no such metrics existed. Recently, researchers at Google proposed two new metrics built on the principles of FID: the Fréchet Video Distance (FVD) and the Fréchet Audio Distance (FAD), which measure the quality of synthesised video and audio respectively.

Figure: The key component for both metrics is a pre-trained model that converts the video or audio clip into an N-dimensional embedding. (Source)

Fréchet Video Distance (FVD)

Fréchet Video Distance (FVD) is a new metric for generative models of video that correlates well with qualitative human judgment of generated videos.
FVD can be used in situations such as unconditional video generation with generative adversarial networks. Several features make this metric better than the existing ones:

- FVD is sensitive to both temporal and frame-level perturbations.
- It coincides well with qualitative human judgment of generated videos.
- It is accurate in evaluating videos that have been modified to include static noise and temporal noise.

FVD thereby avoids the drawbacks of frame-level evaluation, which is common among existing video metrics.

Image: Examples of videos of a robot arm, judged by the new FVD metric. (Source)

Fréchet Audio Distance (FAD)

Fréchet Audio Distance (FAD) is a reference-free evaluation metric for music enhancement algorithms, designed to measure how a given audio clip compares to clean, studio-recorded music. It compares statistics computed on a set of reconstructed music clips to background statistics computed on a large set of studio-recorded music.

FAD differs from existing metrics for generated audio quality, which either require a time-aligned ground-truth signal, such as the source-to-distortion ratio (SDR), or target only a specific domain, such as speech quality. FAD, on the other hand, is reference-free and can be used on any type of audio.

To evaluate these two metrics, the researchers at Google performed a large-scale human study to determine how well the new metrics align with qualitative human judgment of generated audio and video.

Outlook

For a few years now, Google has achieved several breakthroughs in intelligent platforms. The tech giant has also released the source code for both Fréchet Video Distance and Fréchet Audio Distance on GitHub.
According to the researchers at the tech giant, the two metrics, FVD and FAD, will help keep this progress measurable and will improve models for audio and video generation.
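For readers curious about the shared machinery: each of these metrics fits a multivariate Gaussian to the embeddings of real clips and another to the embeddings of generated clips, then computes the Fréchet distance between the two Gaussians. The NumPy sketch below is illustrative only: the function names are our own, random vectors stand in for the embeddings, and the official implementations obtain those embeddings from pre-trained networks (Inception for FID; analogous video and audio models for FVD and FAD).

```python
import numpy as np

def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Fréchet distance between Gaussians N(mu1, sigma1) and N(mu2, sigma2):
    ||mu1 - mu2||^2 + Tr(sigma1 + sigma2 - 2 * (sigma1 @ sigma2)^(1/2))."""
    # Matrix square root of sigma2 via eigendecomposition (sigma2 is symmetric PSD).
    vals, vecs = np.linalg.eigh(sigma2)
    sqrt_s2 = vecs @ np.diag(np.sqrt(np.clip(vals, 0.0, None))) @ vecs.T
    # Tr((sigma1 @ sigma2)^(1/2)) == Tr((sqrt_s2 @ sigma1 @ sqrt_s2)^(1/2)),
    # and the inner matrix is symmetric PSD, so its eigenvalues suffice.
    inner_vals = np.clip(np.linalg.eigvalsh(sqrt_s2 @ sigma1 @ sqrt_s2), 0.0, None)
    diff = mu1 - mu2
    return float(diff @ diff + np.trace(sigma1) + np.trace(sigma2)
                 - 2.0 * np.sum(np.sqrt(inner_vals)))

def embedding_stats(embeddings):
    """Mean vector and covariance matrix of a (num_clips, dim) embedding array."""
    return embeddings.mean(axis=0), np.cov(embeddings, rowvar=False)

# Toy usage: random vectors stand in for embeddings of real vs. generated clips.
rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, size=(2000, 16))
fake = rng.normal(0.3, 1.0, size=(2000, 16))
score = frechet_distance(*embedding_stats(real), *embedding_stats(fake))
```

A lower score means the two embedding distributions, and hence the two sets of clips, are more similar; identical distributions give a distance of zero.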