Data science expert Mark Tenenholtz admits that 90% of the models he used were ineffective

The best models for audio datasets are ResNet and EffNet. Tenenholtz justified the usage of image models for audio datasets.

Kaggle Master and senior data scientist Mark Tenenholtz tweeted that, despite spending thousands of hours on ML models, 90 per cent of the models he used were ineffective. In the same thread, he listed the best baseline models for different types of datasets. He underlined the importance of a good baseline, calling it a valuable asset for solving issues with ML models.

For tabular data, Tenenholtz said that XGBoost, LightGBM and random forest (RF) models are among the most commonly used. Ensemble tree-based models can outperform neural networks on this kind of data, and XGBoost is the most popular choice among Kagglers.

For time series data, he said that XGBoost, LightGBM and RF are again the best choices, despite not being built for time series. Tenenholtz explained that even though these models are designed for tabular data, one can set a prediction horizon that is well matched to the lag between the inputs and the output, which gives the user more effective control over the problem. The dataset can then be treated as tabular data.
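The idea of treating a time series as tabular data can be sketched in a few lines of Python. This is a minimal illustration and not Tenenholtz's own code: the helper name `make_lagged_table`, its parameters `n_lags` and `horizon`, and the sample values are all assumptions made for the example. Each row holds the most recent observations, and the target is the value a fixed number of steps ahead.

```python
def make_lagged_table(series, n_lags, horizon):
    """Turn a 1-D series into (rows, targets) suitable for a tabular model.

    Each row holds the n_lags most recent values; the target is the value
    `horizon` steps after the last value in the row.
    (Hypothetical helper, written only for this illustration.)
    """
    if n_lags < 1 or horizon < 1:
        raise ValueError("n_lags and horizon must be positive")
    rows, targets = [], []
    # t is the index of the last observed value used as a feature
    for t in range(n_lags - 1, len(series) - horizon):
        rows.append(list(series[t - n_lags + 1 : t + 1]))
        targets.append(series[t + horizon])
    return rows, targets


# Made-up sample values, purely for demonstration
series = [10, 12, 13, 15, 18, 21, 25]
X, y = make_lagged_table(series, n_lags=3, horizon=2)

for features, target in zip(X, y):
    print(features, "->", target)
```

The resulting `X` and `y` have the same shape as any tabular dataset, so they could be passed to a tree-based model such as XGBoost, LightGBM or a random forest.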


For image datasets, ResNet and EfficientNet-B0 (EffNet-B0) are small, fast models that are effective for nearly any type of image data. A major advantage of these models is that they can be scaled up for greater accuracy.

For text datasets, DistilRoBERTa is the best model, offering a combination of speed and accuracy. Its accuracy increases when it is scaled up.

The best models for audio datasets are ResNet and EffNet. Tenenholtz justified the use of image models for audio datasets: he said that he had approached audio problems by converting the audio to a spectrogram and feeding it to an image model.
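To show why an image model can work on audio, here is a rough sketch of the spectrogram step using only the Python standard library. It is an illustration under stated assumptions, not the pipeline from the thread: the function name `spectrogram`, the frame size, the hop length and the synthetic 8 Hz tone are all invented for the example, and the image model itself is left out. The output is a two-dimensional grid (time frames by frequency bins), which is the same shape of data an image model consumes.

```python
import cmath
import math


def spectrogram(signal, frame_size, hop):
    """Magnitude spectrogram: a list of frames, each a list of frequency bins.

    Hypothetical helper using a naive DFT; fine for tiny signals only.
    """
    # Hann window reduces spectral leakage at the frame edges
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_size)
              for n in range(frame_size)]
    n_bins = frame_size // 2 + 1  # keep non-negative frequencies only
    frames = []
    for start in range(0, len(signal) - frame_size + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_size)]
        mags = []
        for k in range(n_bins):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                      for n in range(frame_size))
            mags.append(abs(acc))
        frames.append(mags)
    return frames


# Synthetic input (assumed values): an 8 Hz sine sampled at 64 Hz
sample_rate = 64
tone_hz = 8
signal = [math.sin(2 * math.pi * tone_hz * t / sample_rate) for t in range(256)]

spec = spectrogram(signal, frame_size=32, hop=16)
print(len(spec), "time frames x", len(spec[0]), "frequency bins")
```

With a frame size of 32 at 64 Hz, each bin spans 2 Hz, so the tone's energy concentrates in bin 4 of every frame. On real audio, a library routine would normally replace the naive DFT used here.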

Poulomi Chatterjee
Poulomi is a Technology Journalist with Analytics India Magazine. Her fascination with tech and eagerness to dive into new areas led her to the dynamic world of AI and data analytics.
