Quora’s ML System is a Class Act

Several machine learning algorithms running behind the scenes have helped Quora remain one of the most popular websites more than a decade after its launch.

Have you ever noticed that Google search results, more often than not, include links from Quora? Founded in 2009 as a question-and-answer website by Adam D’Angelo and Charlie Cheever, Quora was made available to the public in 2010. The website allows users to ask and answer questions, and to upvote or comment on answers given by other users. As of 2020, it registered 300 million unique visitors and counted itself among the top 20 websites. The most searched topics were technology, movies, health, food, and science.

Ranking questions and answers

Every Quora user seeking information on a particular topic does so by entering a question, or an ‘information need’. Machine learning algorithms run a question understanding process that extracts the exact information being sought from the question. The next step is identifying ‘quality questions’ through question quality classification, which distinguishes high-quality questions from low-quality ones.

At this stage, the algorithms also distinguish between several different question types. Once the questions are classified, the next step is question-topic labelling, where the model determines the topic, or bucket, under which the question is to be listed. Here the analysis relies on data describing the actions that ‘Quorans’ take on the platform. To make the analysis easier, Quora relies on a schematic relationship between users, questions, and topics. Unlike most topic modelling applications, which deal with long document text and a small topic ontology, Quora’s algorithms work with short question text and ‘more than a million potential topics’ with which to tag the question.

When it comes to answers, Quora ranks them with a proprietary algorithm. It is modelled along the lines of Google’s ‘PageRank’, which counts the number and quality of links to a page to determine how important that page is. The underlying assumption is that important websites are more likely to receive links from other websites. Likewise, Quora ranks answers based on how helpful they are. Helpfulness is judged on factors such as upvotes and downvotes on the answer, previous answers written by the author, whether the author is a subject matter expert, and the type and quality of the content, among others.
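
For readers unfamiliar with the PageRank idea mentioned above, here is a minimal sketch of the textbook power-iteration version. It is not Quora’s answer-ranking algorithm, which is proprietary; the toy link graph, the damping factor of 0.85 and the iteration count are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook power-iteration PageRank.

    links maps each page to the list of pages it links to.
    Every linked page must itself appear as a key.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            # A page with no outgoing links spreads its rank over all pages.
            targets = outlinks if outlinks else pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: three pages link to "c", and "c" links to "a".
toy_web = {
    "a": ["c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}
ranks = pagerank(toy_web)
most_important = max(ranks, key=ranks.get)
```

In this toy graph, "c" collects the most incoming links and ends up with the highest score, while "a" benefits from being linked to by the important page "c".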

Quora looks at two specific instances of machine-learned ranking: search ranking and personalised ranking. In search ranking, the questions that match the query are returned first; those documents are then ordered by the predicted probability of a click. In personalised ranking, Quora attempts to select and rank the most ‘interesting’ answers for a user based on the usage pattern gauged from their profile.
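
The two-step search flow described above can be sketched roughly as follows. This is an illustration under stated assumptions, not Quora’s implementation: the term-matching rule, the feature names (`past_ctr`, `answer_count`), the weights and the use of a logistic model for click probability are all hypothetical.

```python
import math


def matches(query, question):
    """Hypothetical matching rule: every query term appears in the question."""
    return set(query.lower().split()) <= set(question.lower().split())


def click_probability(features, weights, bias=0.0):
    """Logistic model: sigmoid of a weighted sum of the document's features."""
    z = bias + sum(weights[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


def search(query, documents, weights):
    # Step 1: keep only the questions that match the query.
    candidates = [doc for doc in documents if matches(query, doc["question"])]
    # Step 2: order the matches by predicted probability of a click.
    return sorted(
        candidates,
        key=lambda doc: click_probability(doc["features"], weights),
        reverse=True,
    )


# Made-up documents and weights, for illustration only.
documents = [
    {"question": "What is machine learning",
     "features": {"past_ctr": 0.10, "answer_count": 3}},
    {"question": "How does machine learning work",
     "features": {"past_ctr": 0.30, "answer_count": 12}},
    {"question": "How do I learn to cook",
     "features": {"past_ctr": 0.50, "answer_count": 8}},
]
weights = {"past_ctr": 4.0, "answer_count": 0.05}

results = search("machine learning", documents, weights)
result_titles = [doc["question"] for doc in results]
```

The cooking question never reaches the ranking step, however high its click rate, because it does not match the query.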

For the feed, Quora uses a combination of the ‘interestingness’ of both answers and questions. User actions are aggregated over different temporal windows and fed to the ranking algorithm. Quora keeps experimenting with the personalised feed model.
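
Aggregating actions over temporal windows could look something like the sketch below. The window lengths (one hour, one day, seven days), the action names and the event format are assumptions made for illustration; the source does not say which windows or actions Quora uses.

```python
from collections import Counter

HOUR = 3600
DAY = 24 * HOUR

# Hypothetical windows, in seconds.
WINDOWS = {"1h": HOUR, "1d": DAY, "7d": 7 * DAY}


def windowed_counts(events, now, windows=WINDOWS):
    """Count each action type inside each time window ending at `now`.

    events is a list of (timestamp_in_seconds, action_name) pairs.
    Returns a Counter keyed by (action_name, window_name).
    """
    counts = Counter()
    for timestamp, action in events:
        age = now - timestamp
        for name, length in windows.items():
            if 0 <= age <= length:
                counts[(action, name)] += 1
    return counts


# Made-up activity log for one answer.
now = 1_000_000
events = [
    (now - 600, "upvote"),       # 10 minutes ago
    (now - 2 * HOUR, "upvote"),  # 2 hours ago
    (now - 3 * DAY, "view"),     # 3 days ago
    (now - 10 * DAY, "upvote"),  # outside every window
]
features = windowed_counts(events, now)
```

Each (action, window) count then becomes one input feature for the ranking model, so the model can tell an answer that is popular right now from one that was popular last week.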

Another important consideration for feed ranking is that it must respond to factors such as user actions, impressions, and trending events. The challenge is that the collection of questions and answers keeps growing, and it may not be possible to rank all of it in real time for each user. To optimise the user experience, Quora implements a multi-stage ranking algorithm in which candidates are selected and ranked before the final ranking is performed.
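
A multi-stage ranker of this kind can be sketched as a cheap first pass that cuts the pool down to a shortlist, followed by a costlier personalised pass over the shortlist only. Both scoring functions, the field names and the sizes below are hypothetical stand-ins, not Quora’s actual models.

```python
import heapq


def cheap_score(candidate):
    """Stage one: an inexpensive, non-personalised signal."""
    return candidate["recent_upvotes"]


def expensive_score(candidate, user):
    """Stage two: stand-in for a costlier, personalised model."""
    boost = 2.0 if candidate["topic"] in user["followed_topics"] else 1.0
    return candidate["recent_upvotes"] * boost


def multi_stage_rank(candidates, user, shortlist_size, final_size):
    # Stage one: keep only the top few candidates by the cheap score.
    shortlist = heapq.nlargest(shortlist_size, candidates, key=cheap_score)
    # Stage two: run the expensive scorer on the shortlist alone.
    ranked = sorted(
        shortlist,
        key=lambda c: expensive_score(c, user),
        reverse=True,
    )
    return ranked[:final_size]


# Made-up candidates and user profile.
candidates = [
    {"id": 1, "topic": "cooking", "recent_upvotes": 50},
    {"id": 2, "topic": "ml", "recent_upvotes": 40},
    {"id": 3, "topic": "ml", "recent_upvotes": 5},
    {"id": 4, "topic": "travel", "recent_upvotes": 30},
]
user = {"followed_topics": {"ml"}}

feed = multi_stage_rank(candidates, user, shortlist_size=3, final_size=2)
feed_ids = [item["id"] for item in feed]
```

The trade-off is visible in the toy data: candidate 3 matches the user’s interests but is dropped in stage one, which is the price paid for not running the expensive scorer on everything.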

Maintaining quality

One of the main considerations in maintaining a quality experience on Quora is filtering out duplicate content. To this end, the ML team at Quora detects different questions that have the same intent and merges them into a single canonical question. One of the techniques used is a random forest model with features such as the cosine similarity of the average word2vec embeddings of tokens, common words, part-of-speech tags of the words, and common topics labelled on the questions. Apart from that, Quora also uses different machine learning systems, alone and in combination, to tackle spam content. Further, machine learning algorithms, along with human moderators, help in identifying offensive, abusive, and hurtful content on the platform.
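
Two of the duplicate-detection features named above, the cosine similarity of averaged word embeddings and the number of common words, can be computed as in this sketch. The two-dimensional vectors are hand-made toy values, not real word2vec embeddings, and the random forest that would consume these features is left out.

```python
import math

# Hand-made toy vectors standing in for real word2vec embeddings.
TOY_EMBEDDINGS = {
    "learn": [1.0, 0.1],
    "study": [0.9, 0.2],
    "python": [0.1, 1.0],
    "cook": [-1.0, 0.2],
    "pasta": [-0.8, -0.5],
}


def average_embedding(tokens, embeddings):
    """Mean of the vectors of all tokens that have an embedding."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def pair_features(question_1, question_2, embeddings=TOY_EMBEDDINGS):
    """Features for one question pair, ready to feed to a classifier."""
    tokens_1 = question_1.lower().split()
    tokens_2 = question_2.lower().split()
    emb_1 = average_embedding(tokens_1, embeddings)
    emb_2 = average_embedding(tokens_2, embeddings)
    cosine = cosine_similarity(emb_1, emb_2) if emb_1 and emb_2 else 0.0
    return {
        "embedding_cosine": cosine,
        "common_words": len(set(tokens_1) & set(tokens_2)),
    }


likely_duplicate = pair_features("how to learn python", "how to study python")
unrelated = pair_features("how to learn python", "how to cook pasta")
```

With these toy vectors the near-duplicate pair scores a cosine close to 1, while the unrelated pair scores below zero even though the two questions share the words "how" and "to", which is why the word-overlap feature alone would not be enough.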

Until 2016, the platform was ad-free. According to Nikhil Dandekar, former Engineering Manager at Quora, the platform uses ad click-through rate (CTR) prediction to make sure that the ads shown are relevant to users and deliver value for money for advertisers.

Overall, the top machine learning algorithms used at Quora include, but are not limited to, Logistic Regression, Elastic Nets, Gradient Boosted Decision Trees, Random Forests, Neural Networks, LambdaMART, Matrix Factorization, Vector models and several other NLP techniques.

Shraddha Goled

I am a technology journalist with AIM. I write stories focused on the AI landscape in India and around the world, with a special interest in analysing its long-term impact on individuals and societies. Reach out to me at shraddha.goled@analyticsindiamag.com.