Introducing Graphs to Power Your Infrastructure, Drive Sustainability, and Speed up Innovation

Graph expert Jim Webber looks at how graph database technology could be a boon in supporting data centre managers. Jim is Chief Scientist at graph database company Neo4j.

  • Wednesday, 8th March 2023, by Phil Alsop

A graph database is designed as a more agile and flexible alternative to traditional databases that rely on tables or documents. Data is stored as nodes and the relationships between them, much as you would sketch out ideas on a whiteboard, so users can easily process complex sets of connections and gain insight into what's happening.

Despite its apparent simplicity, a graph database is powerful and capable of handling large amounts of data. Many enterprises are leveraging this technology to fuel their intelligent applications and gain an edge in the market. For instance, graph databases are being used to detect fraud, track products through supply chains, and conduct sophisticated impact analyses on networks.

Monitoring, automation and sustainability

Graph databases are rapidly gaining popularity for their ability to manage dependencies and automate microservice monitoring across different platforms in the data centre. They also offer a cost-effective way to improve scalability and optimize performance. Furthermore, graph databases can contribute to sustainability initiatives: their design takes up less server space, resulting in lower energy consumption than traditional databases. This helps organizations reduce their carbon footprint and operate more sustainably.

A prime example of the challenges that can arise with data modeling can be seen in Adobe Behance, an online platform designed to showcase creative skills. Behance's activity feed suffered from a bloated architecture that required 48 Cassandra instances, 50 terabytes of data, and very high-powered machines. Much of this was caused by the fanout strategy used for the activity feed. For every user, there had to be a corresponding row in Cassandra, much as there would be in a relational database. It's worth noting that this issue was not related to Cassandra itself, but was rather a data modeling problem that could have been addressed with a more efficient data architecture.
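To make the storage cost of fanout concrete, here is a minimal sketch. It assumes that "fanout" refers to the common fan-out-on-write pattern, in which each activity is copied into the feed of every follower; the article does not spell out Behance's actual schema, so the user names, the `follows` map and the `publish_fanout` helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy data: follower -> set of users they follow.
follows = {
    "ana": {"carl"},
    "ben": {"carl"},
    "dee": {"carl", "ana"},
}

def followers_of(user):
    """Return everyone who follows `user`."""
    return {f for f, followed in follows.items() if user in followed}

# Fan-out on write: one stored row per (follower, activity) pair.
feeds = defaultdict(list)

def publish_fanout(author, activity):
    """Copy the activity into the feed of every follower of `author`."""
    for follower in followers_of(author):
        feeds[follower].append((author, activity))

publish_fanout("carl", "published project A")
publish_fanout("ana", "appreciated project B")

# Two activities occurred, but four rows are stored,
# because carl has three followers and ana has one.
stored_rows = sum(len(rows) for rows in feeds.values())
print(stored_rows)
```

In this pattern, stored rows grow with the number of activities multiplied by the number of followers, which is one plausible way a feed could reach tens of terabytes.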

To address the complexity, storage, and infrastructure costs associated with their activity feed, the team at Adobe Behance realized that they needed a more flexible solution. Initially, they attempted to run Cassandra on fewer servers, but this failed to meet the platform's user requirements.

Ultimately, Adobe Behance decided to switch to a more adaptable solution, graph database technology, to power the activity feed and support five new features. This solution contained approximately ten times as much data as the previous Cassandra architecture, but the native graph’s flexibility and scalability allowed it to be deployed in production at scale. This decision enabled Adobe Behance to reduce complexity, lower costs, and improve performance, all while maintaining the features that users required. The full production version runs on just three servers for the same workload, with a huge drop in data storage and substantially improved functionality. In effect, this is a factor of 40 reduction in hardware, which has a direct positive impact on the carbon footprint.

“We cut our data store from 50 terabytes of data to 50 gigabytes, producing lots of cost savings on infrastructure,” explains David Fox, systems engineer at Adobe. “Our user-facing features became even more compelling because we had this new data we could easily surface and easily manage.”

Before switching to a graph platform, Adobe Behance struggled to populate a new user's activity feed on the fly when they signed up. With graph technology, this process became much more streamlined: Adobe Behance was able to connect users directly to the data already stored in the database, improving overall performance.

The flexibility that comes with graph technology has allowed Adobe to keep innovating after the initial implementation. For example, Adobe wanted to let users filter projects to show only those from people they follow. This was impossible with Cassandra because it could not query the data to produce the required view. With graph technology, Adobe Behance was able to deliver this feature to users in just one week.
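The "projects from people I follow" filter is a two-step traversal when the data is held as a graph. The sketch below illustrates the idea with plain Python dictionaries standing in for the graph; it is not Behance's implementation or a Neo4j API, and the `follows` and `created` maps and the `projects_from_followed` helper are hypothetical.

```python
# Hypothetical in-memory graph: FOLLOWS and CREATED relationships as adjacency maps.
follows = {"dee": {"carl", "ana"}, "ben": {"carl"}}
created = {"carl": ["project A"], "ana": ["project B"], "eve": ["project C"]}

def projects_from_followed(user):
    """Traverse user -> FOLLOWS -> person -> CREATED -> project."""
    results = []
    # Sort the followed people so the output order is deterministic.
    for person in sorted(follows.get(user, set())):
        for project in created.get(person, []):
            results.append((person, project))
    return results

print(projects_from_followed("dee"))
# [('ana', 'project B'), ('carl', 'project A')]
```

Each project is stored once and the view is computed at read time by following relationships, so a new filter of this kind needs a new traversal rather than a new copy of the data.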

Graphs: managing infrastructure while promoting experimentation and flexibility

The move to a graph platform has reduced infrastructure costs for Adobe, Fox adds. It has also made operational management and administration easier for the team, and has allowed Adobe to develop and iterate quickly.

“It has encouraged innovation because product development realises that we have this data that is already available in Neo4j and that we don’t have to run a long process to migrate or figure out how to get it end-usable,” explains Fox.

The project has shown Adobe that rigid data structures create a mindset and a culture of not innovating. “What happened with us is we were just trying to keep our system running and keep our Cassandra cluster functional, and we lost the ability for product to say this is something we want to build on top of, this is something we want to give users,” concludes Fox.

This and other examples show that graphs continue to prove themselves as powerful tools for managing your infrastructure while tackling the IT carbon footprint problem at the same time. Finally, graphs allow developers and operations teams to focus on innovation rather than struggling with tasks that are difficult or impossible with traditional database architectures.