Integrating StreamNative's Ursa Engine with PuppyGraph for Real-Time Graph Analysis
Weimo Liu

TL;DR

StreamNative's Ursa Engine, integrated with PuppyGraph, addresses the challenge of real-time graph analytics on streaming data. The solution eliminates the need for a dedicated graph database by enabling direct queries on streaming data using graph query languages. This integration provides cost-effective, real-time graph analytics with seamless data processing and visualization.

Opening

Imagine analyzing streaming data in real time with graph queries, without a traditional graph database. At the Pulsar Virtual Summit EMEA 2024, StreamNative presented an approach to graph analytics that integrates its Ursa Engine with PuppyGraph. The integration aims to make graph analytics more accessible and efficient for data streaming practitioners.

What You'll Learn (Key Takeaways)

  • Zero-ETL Graph Queries – By leveraging StreamNative's Ursa Engine with PuppyGraph, users can execute graph queries directly on streaming data stored in formats like Iceberg, eliminating the need for costly and complex ETL processes.
  • Cost Efficiency and Scalability – The integration allows for cost-effective graph analytics by utilizing object storage and column-based storage formats, enabling scalable solutions that handle large data volumes efficiently.
  • Real-Time Security Insights – This setup supports real-time graph queries that can uncover security vulnerabilities, streamline cybersecurity operations, and provide actionable insights without the overhead of multiple data copies.
  • Compatibility with Existing Tools – With support for popular graph query languages like Gremlin and openCypher, users can easily integrate existing tools and visualization packages with PuppyGraph for a seamless analytics experience.

Q&A Highlights

Q: How does the integration handle data freshness and potential latency issues?
A: Iceberg may introduce some latency due to metadata management, but the design deliberately trades a little freshness for lower cost. Ursa produces files every 30 seconds to balance latency against file-management overhead.
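
To make the 30-second figure concrete, the arithmetic can be sketched as below. This is a back-of-the-envelope illustration, not Ursa's actual behavior: it assumes a record becomes queryable once the file containing it is committed, and the function names, the `commit_overhead_s` parameter, and the `writers` parameter are hypothetical.

```python
SECONDS_PER_DAY = 86_400


def staleness_bounds(flush_interval_s, commit_overhead_s=0.0):
    """Return (best_case, worst_case) seconds before a record is queryable.

    Assumption: a record is visible as soon as its file is committed.
    A record arriving just before a flush waits only for the commit;
    one arriving just after a flush waits a full interval plus the commit.
    """
    best = commit_overhead_s
    worst = flush_interval_s + commit_overhead_s
    return best, worst


def files_per_day(flush_interval_s, writers=1):
    """Number of files produced per day for a given flush interval."""
    return int(SECONDS_PER_DAY / flush_interval_s) * writers


print(staleness_bounds(30))  # (0.0, 30.0)
print(files_per_day(30))     # 2880
print(files_per_day(5))      # 17280
```

Under these assumptions, shortening the interval from 30 to 5 seconds lowers worst-case staleness but multiplies the file count by six, which is the file-management side of the trade-off described in the answer.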

Q: Are Bloom filters used to optimize which Parquet files to read?
A: Yes, Bloom filters and other indexing methods are employed to enhance performance by determining which files and columns to access, optimizing query execution.
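
The idea of using a Bloom filter to decide which files to open can be sketched as follows. This is a textbook Bloom filter written for illustration, not Parquet's on-disk filter format and not PuppyGraph's implementation; the file names, keys, and helper functions are hypothetical.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, value):
        # Derive num_hashes bit positions by salting SHA-256 with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{value}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value):
        for pos in self._positions(value):
            self.bits |= 1 << pos

    def might_contain(self, value):
        return all((self.bits >> pos) & 1 for pos in self._positions(value))


def build_index(files):
    """Build one filter per file from the key values stored in that file."""
    index = {}
    for name, values in files.items():
        bloom = BloomFilter()
        for value in values:
            bloom.add(value)
        index[name] = bloom
    return index


def files_to_read(index, key):
    """Return only the files whose filter says the key might be present."""
    return [name for name, bloom in index.items() if bloom.might_contain(key)]


# Hypothetical data files and the host IDs each one contains.
files = {
    "part-000.parquet": ["host-1", "host-2", "host-3"],
    "part-001.parquet": ["host-40", "host-41", "host-42"],
    "part-002.parquet": ["host-90", "host-91"],
}
index = build_index(files)
print(files_to_read(index, "host-42"))
```

The file that really holds the key is always returned; other files are skipped unless the filter gives a false positive, in which case the reader opens them and finds nothing. That asymmetry is what makes the filter safe to use for pruning.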

Q: How does PuppyGraph manage joins and traversals without relying on Iceberg's joins?
A: PuppyGraph avoids traditional join operations by using node and edge actions for graph traversal, implementing a cost-based optimizer to enhance query performance without the overhead of SQL-style joins.
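
The contrast between hop-by-hop traversal and SQL-style joins can be illustrated with a small sketch. It is not PuppyGraph's engine, and it leaves out the cost-based optimizer entirely; it only shows how rows from an edge table can be walked through adjacency lookups instead of one join per hop. The edge data and function names are hypothetical.

```python
from collections import defaultdict, deque


def build_adjacency(edge_rows):
    """Turn (source, target) rows, as read from an edge table, into adjacency lists."""
    adjacency = defaultdict(list)
    for src, dst in edge_rows:
        adjacency[src].append(dst)
    return adjacency


def k_hop(adjacency, start, max_hops):
    """Breadth-first traversal: every node reachable within max_hops of start."""
    seen = {start}
    frontier = deque([(start, 0)])
    reached = []
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in adjacency.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                reached.append(neighbor)
                frontier.append((neighbor, depth + 1))
    return reached


# Hypothetical access-control edges for a security-style query.
edges = [
    ("user:alice", "role:admin"),
    ("role:admin", "host:db1"),
    ("host:db1", "bucket:logs"),
    ("user:bob", "role:viewer"),
]
adjacency = build_adjacency(edges)
print(k_hop(adjacency, "user:alice", 2))  # ['role:admin', 'host:db1']
print(k_hop(adjacency, "user:alice", 3))  # ['role:admin', 'host:db1', 'bucket:logs']
```

In a relational plan, each extra hop would typically add another self-join on the edge table; here each hop is a lookup per frontier node, so extending the query from two hops to three only changes `max_hops`.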

Q: What visualization tools are supported by PuppyGraph?
A: PuppyGraph is compatible with Gremlin and Cypher clients, allowing the use of most graph visualization tools. Additionally, an open-source UI developed for high-demand scenarios is available.

Q: Is there a benchmark report available for performance evaluation?
A: While public benchmarks are not available to avoid competitive disputes, potential users can contact PuppyGraph for private benchmarks or to set up a proof of concept.

Weimo Liu
CEO, PuppyGraph

Dr. Weimo Liu is the CEO of PuppyGraph. He was previously a software engineer on Google's F1 team and a research scientist at TigerGraph, where he specialized in advancing query languages and engines. Dr. Liu earned his PhD from GWU and his BS from Fudan University. He serves as a program committee member and reviewer for venues such as SIGSPATIAL, TKDE, and KDD, and has published in VLDB and ICDE.
