The tech stack behind Bangalore-based startup Synth’s summarisation tool

"Synth is compatible with everything: Twitter, Google Meet, Teams, Zoom, or any other video conferencing client that will come up in the next ten years."

While learning and working have shifted online, the technologies to digest that information in an equivalent manner are still niche. Bangalore-based start-up Synth is making video conferencing, team calls and upskilling through videos easier with its transcribing and summarising tool. Its personal AI assistant, or ‘the second brain’, captures all the relevant information consumed through text, audio and video and summarises it in one click. The company was founded by Suneel Matham, Urvin Soneta and Vaibhav Saxena. Analytics India Magazine spoke with Urvin and Vaibhav to learn more about their services.

AIM: What is the pain point that Synth was trying to solve?

Urvin: We (the co-founders) were part of the Plaksha Tech Leaders Fellowship. During the batch, we had lots of audio information that we would go through daily. During this time, we realised that whatever information we were consuming was super valuable to us, not only for the duration of the fellowship but also in the future.

So, the three of us built a habit of externalising this knowledge to refer to later, very easily—we wanted to have our second brain. We tried out the existing tools (but) most of them did not work well. Or they would work only for one context of meetings and not work on the other. We tested out different applications that worked with it, but none of us actually felt comfortable with them. And we felt the entire burden of capturing information, organising information and retrieving the information was on the user rather than the application. And since we were studying AI, we thought, why not try to work towards moving this burden from the users back to software. So this idea was sown during the fellowship, but then we had capstone projects (for which the trio worked in different places). When we came back, it was COVID-19 time, and we were doing all our courses online. There was this trend of everything moving online and being recorded.

We were in a randomised team in our product management class at Plaksha. We picked this problem statement and started working towards it. COVID-19 nudged us into trying out entrepreneurship in a low stakes environment in some ways. The entrepreneurship support program (exposed us to) mentorship by great folks from the start-up industry and the VC industry, in the beginning, to get us started. 

AIM: How does Synth automate monotonous tasks?

Vaibhav: We do three very simple things. Synth captures all the audio information. It does not matter if the meeting is on Teams, or you’re listening to a podcast or upskilling yourself by watching YouTube videos; Synth captures any audio information. Then, we summarise it for you.

How quickly can you understand a piece of information even after two or three months? And how quickly can you search for it? For instance, you watched a video on deep learning, and there was some great idea in it. You probably wrote it down somewhere but cannot find it because you never captured it properly. So Synth wants to solve those things by capturing, summarising and retrieving your audio information as text in one single place. So you don’t have to go through the hassle of finding your meeting notes or something of that sort.

AIM: Please explain the tech stack used behind the AI assistant.

Vaibhav: Synth is a desktop application. The content is all in JavaScript to build it in a much faster manner. The backend is where most of the magic essentially happens; things are all powered by language models or transformers. For us, language models form the basis of everything, be it transcription, summarisation or retrieval. Many proprietary things go within those models; we use our data, and we train and fine-tune these models further with the data we get when users use the application.
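
To make the capture, summarise and retrieve flow described above more concrete, here is a minimal sketch in Python. It is not Synth's implementation: Synth's models are proprietary transformer language models, and the interview gives no details about them. This sketch swaps in a simple word-frequency score for summarisation and keyword counting for retrieval, and it assumes the audio has already been transcribed to text. The names `NoteStore`, `summarise` and `tokenize` are hypothetical and exist only for illustration.

```python
import re
from collections import Counter
from dataclasses import dataclass, field

# Small illustrative stopword list (an assumption, not a standard list).
STOPWORDS = {"a", "an", "and", "the", "is", "was", "we", "to", "of", "in", "it", "for", "on"}


def tokenize(text):
    """Lowercase the text and return its words, minus stopwords."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def summarise(transcript, max_sentences=2):
    """Naive extractive summary: keep the sentences whose words are most frequent.

    A stand-in for a language model; it only picks existing sentences.
    """
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", transcript) if s.strip()]
    freq = Counter(tokenize(transcript))
    # Rank sentence indices by the summed frequency of their words.
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: sum(freq[w] for w in tokenize(sentences[i])),
        reverse=True,
    )
    # Keep the top sentences, restored to their original order.
    keep = sorted(ranked[:max_sentences])
    return " ".join(sentences[i] for i in keep)


@dataclass
class NoteStore:
    """Hypothetical single place that holds transcripts and their summaries."""

    notes: dict = field(default_factory=dict)

    def capture(self, title, transcript):
        """Store a transcript together with its generated summary."""
        self.notes[title] = {"transcript": transcript, "summary": summarise(transcript)}

    def search(self, query):
        """Return titles whose transcripts mention the query words, best match first."""
        terms = set(tokenize(query))
        scored = []
        for title, note in self.notes.items():
            words = Counter(tokenize(note["transcript"]))
            score = sum(words[t] for t in terms)
            if score:
                scored.append((score, title))
        return [title for _, title in sorted(scored, reverse=True)]


store = NoteStore()
store.capture(
    "standup",
    "We shipped the billing fix. The billing dashboard needs tests. Lunch was fine.",
)
store.capture(
    "podcast",
    "Transformers changed language modelling. Attention is the core idea.",
)
print(store.search("billing"))
print(store.notes["standup"]["summary"])
```

A real system of the kind described in the interview would replace `summarise` and `search` with model-based components; the sketch only shows how the three steps fit together around one store.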

AIM: Tell us more about the summary.

Vaibhav: The summary is usually less than 500 words and gets generated independently. It’s intelligent enough to understand the context and what the person wants. And that’s where the proprietary technology or the AI algorithms that we’re using come into the picture. So, it understands the whole context in the meeting or the video and presents you with its summary.

AIM: Synth can be used for various use-cases. Please explain some through case studies.

Urvin: One use case is people use it for meetings. The second is people use it for upskilling and learning. And the last is people want to summarise the audiobooks and podcasts that they listen to. 

From the perspective of the meeting, founders and product managers use it, especially for the user calls and external calls that they have; partner calls or customer calls. Product managers use it for product discussions since going back to the exact thing that a certain person said helps in such cases. Additionally, summarising the information helps people share it easily within their teams. (This is also true for when) someone who did not attend a meeting asks for a brief. (With Synth, colleagues can) just share these notes to relive the conversation and only the highlighted parts through the summary. It is time-saving. 

To illustrate the ease of learning and upskilling, there’s a data scientist in the US who wants to create content in data science. But given how quickly things keep moving in the industry, he regularly watches Two Minute Papers on YouTube and uses Synth to consolidate his research, speed up the process and capture these important aspects. In addition, he has highlighted portions of important notes through the one-click shortcut, which helps you highlight during the call.

AIM: Both Vaibhav and Urvin have pursued their bachelor’s in Arts before shifting to AI. What does the Arts-Computer Science background bring to the company?

Vaibhav: One of the important aspects of company building is communication and understanding other people’s mindsets. At Plaksha, we understood we would be working with varied people, and if we could not empathise with their problems, the start-up would be very weak. This intensifies while collaborating with designers, different kinds of engineers, and front and back end folks. (Here’s) how at least our education has helped us evolve.

Urvin: Our different backgrounds bring a multidisciplinary aspect, and, in the larger context, the problem statement we are working with is also very multidisciplinary. We can say we are helping augment human intelligence. In terms of that, we want to ensure the product that we are building does technically work super well and blends in with the human on an everyday level. The importance of human-computer interaction comes into this as well. My background in liberal arts and the kind of courses I have done in computer science, economics, humanities, and HCI were important aspects. It helps us understand the user on a human level but also give them a product which is technically super strong.

AIM: What audio and video software is compatible with Synth?

Vaibhav: Synth is compatible with everything: Twitter, Google Meet, Teams, Zoom, or any other video conferencing client that will come up in the next ten years, along with platforms like Coursera or Udemy. The biggest thing we wanted to solve was why we had to move to a different tool altogether. Since Synth works with almost everything, you don’t have to install or download a new app for another meeting.

AIM: Tell us about your expansion plan.

Urvin: We have been working on a private beta for the past few months, with individual users and individual use cases. Currently, we are adding collaborative features to our application, and we’ll start onboarding larger teams to the private beta over the next few months. Along with individuals, teams will also find significant value in Synth because it can become an institutional memory for teams and companies. The plan is to keep getting maximum feedback from the private beta and onboard users on a rolling basis from our waitlist. We plan to launch it publicly after a few months. In terms of further expansion, especially for the audiobooks and podcasts use cases, we plan on launching the app on mobile and tablet as well to acquire more users.

Avi Gopani
Avi Gopani is a technology journalist at Analytics India Magazine who seeks to analyse industry trends and developments from an interdisciplinary perspective. Her articles chronicle cultural, political and social stories that are curated with a focus on the evolving technologies of artificial intelligence and data analytics.
