How Graph Processing Gets A Makeover With Hadoop

Graph analytics has been in use for decades to measure the strength and direction of relationships between objects in a graph. Its common techniques include clustering, cutting, partitioning, searching, shortest path, widest path and page ranking, among others. Graph analytics makes it straightforward to store, manage and query connected data. It can track anomalous behaviour across a cross-channel network and analyse linked entities, which helps narrow a large data set down to the relationships that matter. Some of its main uses on social media and other sites are:

  1. Finding bot accounts on social media and removing them before they can commit fraud on the site.
  2. Tracing sock puppets on social networking sites. Many people create several accounts under the same name and post the same content; these fake accounts can be traced and deleted.
  3. Detecting circular payments, in which a person creates fake intermediaries and transfers money back to themselves.
  4. Identifying money laundering and financial fraud. Techniques such as pattern recognition, classification, machine learning and statistical analysis can be used.
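
The circular payment case in the list above comes down to finding a cycle in a directed graph of transfers. The following is a minimal sketch of that idea using a depth-first search. It is not taken from any fraud-detection product: the function name `find_payment_cycle` and the account names are hypothetical, and real systems would also weigh amounts and timestamps.

```python
from collections import defaultdict


def find_payment_cycle(transfers):
    """Return one cycle of accounts as a list, or None if there is none.

    `transfers` is an iterable of (sender, receiver) pairs (hypothetical format).
    """
    # Build an adjacency list: sender -> list of receivers.
    graph = defaultdict(list)
    for sender, receiver in transfers:
        graph[sender].append(receiver)

    UNVISITED, IN_PROGRESS, DONE = 0, 1, 2
    state = defaultdict(int)  # every account starts as UNVISITED
    path = []                 # accounts on the current DFS path

    def visit(account):
        state[account] = IN_PROGRESS
        path.append(account)
        for nxt in graph[account]:
            if state[nxt] == IN_PROGRESS:
                # Reached an account already on the path: money came back.
                return path[path.index(nxt):] + [nxt]
            if state[nxt] == UNVISITED:
                cycle = visit(nxt)
                if cycle:
                    return cycle
        path.pop()
        state[account] = DONE
        return None

    # Copy the keys first, because visit() may add new keys to `graph`.
    for account in list(graph):
        if state[account] == UNVISITED:
            cycle = visit(account)
            if cycle:
                return cycle
    return None


transfers = [
    ("alice", "shell_1"),
    ("shell_1", "shell_2"),
    ("shell_2", "alice"),
    ("bob", "carol"),
]
print(find_payment_cycle(transfers))
```

The recursive search is fine for a small illustration; a graph with millions of accounts would need an iterative or distributed variant.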

How Graph Analytics Works With Hadoop

Apache Hadoop was challenged by Google when it introduced its own framework, Dataflow, a cloud-based system for real-time data analysis. According to reports, Hadoop lacks abstraction, as well as encryption at the storage and network levels. Graph analytics techniques could help Hadoop analyse data systematically.


One example of graph storage and processing is the Neo4j database system. It is an open-source graph database which, like Hadoop, is written in Java. Its advantages include a flexible data model, real-time insights that are not available on Hadoop, and easy retrieval of data.

Hadoop has several limitations, which is why Apache Spark and Flink came onto the market. These include verbose code, trouble handling small files, no real-time data processing, a lack of security and slow processing speed. These flaws make Hadoop a poor fit for enterprise data processing. To overcome them, Spark processes data in memory, which increases processing speed. Graph analytics can run on such a platform and store data in a format that is convenient for the user. Graph clustering increases intra-cluster similarity, with applications ranging from machine learning and image processing to tracing weak spots in the data. It can also be used for traffic analysis, social network analysis and more.

Tech Giants Supporting The Alliance

Several tech giants support the use of graph analytics on Hadoop. Facebook uses an iterative graph processing system called Apache Giraph, which performs graph processing on big data and brings graph analytics to Hadoop. Another example is Aurelius, which introduced Titan. Titan is a scalable graph database optimized for storing and querying graphs with billions of vertices and edges distributed across a multi-machine cluster. It also provides elastic and linear scalability for a growing data and user base. Titan integrates with Apache Spark, Apache Giraph and Apache Hadoop, and through these big data platforms it supports global graph data analytics, reporting and ETL.
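
To give a feel for what "iterative graph processing" means, here is a small single-machine sketch of page ranking computed in rounds. In each round every vertex sends a share of its rank along its outgoing edges, and the received shares become the next round's rank. This is only an illustration of the general idea: it does not use Giraph's actual Java API, and the function name `pagerank_supersteps`, the damping value of 0.85 and the sample graph are assumptions made for the example.

```python
def pagerank_supersteps(out_edges, supersteps=20, damping=0.85):
    """Compute PageRank in synchronous rounds (illustrative sketch only).

    `out_edges` maps each vertex to the list of vertices it links to.
    """
    # Collect every vertex, including ones that only appear as targets.
    vertices = set(out_edges)
    for targets in out_edges.values():
        vertices.update(targets)
    n = len(vertices)
    rank = {v: 1.0 / n for v in vertices}

    for _ in range(supersteps):
        inbox = {v: 0.0 for v in vertices}  # rank received this round
        dangling = 0.0                      # rank held by vertices with no out-edges
        for v in vertices:
            targets = out_edges.get(v, [])
            if targets:
                share = rank[v] / len(targets)
                for t in targets:
                    inbox[t] += share
            else:
                dangling += rank[v]
        # Every vertex updates at once, using only what it received.
        rank = {
            v: (1 - damping) / n + damping * (inbox[v] + dangling / n)
            for v in vertices
        }
    return rank


links = {"a": ["c"], "b": ["c"], "c": []}
ranks = pagerank_supersteps(links)
print(max(ranks, key=ranks.get))  # "c" receives links from both other vertices
```

Because each vertex only needs the messages addressed to it, the work in a round can be split across many machines, which is what makes this style of computation a natural fit for a cluster.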

Jignasa Sinha
Jignasa pursued her bachelor’s degree in Biotechnology and is currently a trainee journalist at IIJNM. Her mind is usually preoccupied with art, music, food and travel.
