Unlocking Network Insights: Horizon Advisory streamlines complex data analysis using Graph RAG on AWS
100+ hours of staff time saved per month
1m+ data points now processable
1k+ prominent individuals and institutions covered
Overview
Horizon Advisory, a strategic consultancy analyzing complex relationships within China’s technological and industrial ecosystems, faced challenges in efficiently extracting insights from their extensive datasets. To overcome these obstacles, they partnered with Lambert Labs to develop a sophisticated data processing and visualization pipeline on AWS. This solution leveraged AWS services like Amazon Neptune, AWS Lambda, and Amazon SageMaker, combined with custom application code and interactive visualization tools, to streamline data ingestion, simplify complex queries, and enhance the exploration of intricate network relationships. The resulting solution empowered Horizon Advisory with a powerful tool for research and analysis, demonstrating the potential for scalable data-driven insights.
Using RAG on AWS allowed us to solve an interesting challenge for Horizon.
George Lambert, Founder & CEO, Lambert Labs
Opportunity / Customer Challenge
Horizon Advisory, specializing in analyzing complex relationships within China’s technological and industrial ecosystems, faced a significant challenge in efficiently processing and extracting insights from their vast datasets. Their research relied on intricate networks of connections between individuals (academics, scientists, industry experts) and institutions (universities, research labs, companies), stored across thousands of Excel spreadsheets containing millions of data points, as evidenced by the numerous CSV and XLSX files in their data repository. This network analysis is crucial for understanding the dynamics of innovation, technology transfer, and talent flow within China, but traditional data processing methods proved inadequate.
The initial approach, involving manual mapping and querying, was extremely labor-intensive and time-consuming, hindering the speed and scalability of their analysis. While they explored graph databases on AWS, particularly Neptune, the complexity of crafting Cypher queries to accurately represent these intricate relationships posed a major challenge. The JSON output of these queries, while comprehensive, was difficult to interpret and visualize, making it challenging for analysts to quickly grasp key insights. This difficulty was compounded by the need to search using Chinese names, as illustrated by their search interface, and the need to translate these searches into complex Cypher queries.
Furthermore, the ability to visually explore these connections was crucial, as demonstrated by their graph visualization interface. However, analysts required more than just a static representation. They needed interactive capabilities, such as flexible filtering based on node attributes, control over path length to explore multi-hop connections, and the ability to limit the number of displayed nodes to manage the complexity of large datasets. Nodes also needed to be clearly labeled so that they were easy to identify within the visualization. While existing tools offered basic graph visualization, they lacked the interactivity and customization needed to support Horizon Advisory’s complex analytical workflows.
Solution
To address Horizon Advisory’s complex data analysis needs, Lambert Labs designed and implemented a sophisticated data processing and visualization pipeline on AWS. The core of the solution involved ingesting data from numerous Excel spreadsheets into Neptune, a graph database service, to represent the intricate relationships between individuals and institutions. To automate and streamline this data ingestion process, Lambda functions cleaned and formatted the data as it was uploaded to Amazon S3. SageMaker notebooks then transformed the data further and prepared it for ingestion into Neptune, ensuring that relationships were represented accurately. Throughout the ETL pipeline, Amazon Simple Notification Service (SNS) provided real-time progress updates, keeping Horizon Advisory’s team informed.
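As an illustration of the cleaning and formatting step, the sketch below turns raw spreadsheet rows into vertex and edge CSV text with the `~id`, `~label`, `~from`, and `~to` headers used by Neptune's Gremlin bulk load format. It is a minimal sketch, not the pipeline Lambert Labs built: the input column names (`person`, `institution`, `relation`), the ID scheme, and the default relationship label are all assumptions made for the example.

```python
import csv
import hashlib
import io
import unicodedata


def clean(value):
    """Normalize Unicode (e.g. full-width characters) and collapse whitespace."""
    return " ".join(unicodedata.normalize("NFKC", str(value or "")).split())


def make_id(label, name):
    """Derive a stable ID so the same entity in two spreadsheets maps to one node."""
    digest = hashlib.sha1(name.encode("utf-8")).hexdigest()[:12]
    return f"{label}-{digest}"


def _to_csv(header, records):
    """Serialize a header and records to CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(records)
    return buf.getvalue()


def rows_to_bulk_csv(rows):
    """Convert raw rows into (vertices_csv, edges_csv) strings.

    Each row is a dict with the hypothetical keys 'person', 'institution',
    and 'relation'. Duplicate entities and relationships are merged.
    """
    vertices, edges = {}, {}
    for row in rows:
        person = clean(row.get("person"))
        institution = clean(row.get("institution"))
        # Assumed default label when the spreadsheet gives no relation.
        relation = clean(row.get("relation")).upper().replace(" ", "_") or "AFFILIATED_WITH"
        if not person or not institution:
            continue  # skip incomplete rows rather than guess
        pid = make_id("Person", person)
        iid = make_id("Institution", institution)
        vertices[pid] = (pid, "Person", person)
        vertices[iid] = (iid, "Institution", institution)
        eid = f"{pid}-{relation}-{iid}"
        edges[eid] = (eid, pid, iid, relation)
    return (
        _to_csv(["~id", "~label", "name:String"], vertices.values()),
        _to_csv(["~id", "~from", "~to", "~label"], edges.values()),
    )
```

Hashing the cleaned name gives each entity the same ID wherever it appears, which is one simple way to merge records spread across many spreadsheets. Real data would likely need stronger disambiguation, since different people can share a name.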
To tackle the challenge of writing complex Cypher queries, Lambert Labs developed a custom graph Retrieval-Augmented Generation (RAG) application leveraging OpenAI’s API. This application, integrated within the SageMaker notebooks, simplified the query process by translating natural language searches into Cypher queries, enabling analysts to efficiently retrieve information from the graph database. This was vital because the data contained many Chinese names and had to be searchable in Chinese.
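The translation step can be sketched roughly as follows. This is an illustrative outline under stated assumptions, not the application itself: the graph schema, the prompt wording, and the function names are hypothetical, and the language model call is passed in as a plain `generate` callable so that the example makes no claims about any particular API client.

```python
import re

# Hypothetical schema; the real node labels and relationship types are not
# given in this case study.
SCHEMA = """\
(:Person {name})
(:Institution {name})
(:Person)-[:AFFILIATED_WITH]->(:Institution)
"""

FENCE = "`" * 3  # Markdown code fence marker a model may wrap around its answer

WRITE_KEYWORDS = re.compile(
    r"\b(CREATE|MERGE|DELETE|DETACH|SET|REMOVE|DROP)\b", re.IGNORECASE
)


def build_prompt(question, schema=SCHEMA):
    """Ground the model in the graph schema before asking for a query."""
    return (
        "Translate the question into a single read-only Cypher query.\n"
        "Use only the labels and relationship types in this schema:\n"
        f"{schema}\n"
        "Keep names exactly as written, including Chinese characters.\n"
        f"Question: {question}\n"
        "Cypher:"
    )


def extract_query(completion):
    """Drop code fence lines so only the query text remains."""
    lines = [
        line for line in completion.strip().splitlines()
        if not line.strip().startswith(FENCE)
    ]
    return "\n".join(lines).strip()


def is_read_only(query):
    """Conservative check: reject any query containing a write keyword."""
    return WRITE_KEYWORDS.search(query) is None


def question_to_cypher(question, generate):
    """Turn a natural language question into a vetted Cypher query.

    `generate` is any callable that maps a prompt string to model output.
    """
    query = extract_query(generate(build_prompt(question)))
    if not is_read_only(query):
        raise ValueError("refusing to run a query that modifies the graph")
    return query
```

Supplying the schema in the prompt is what ties the generated query to the actual graph, and the read-only check is a cheap safeguard before anything is sent to the database. The keyword check is deliberately strict, so it would also reject a harmless query whose string literal happens to contain a word such as `SET`.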
Visualizing the complex relationships within the data was made possible through Pyviz, an interactive visualization tool also integrated into the SageMaker notebooks. This tool transformed the raw JSON output from Neptune into dynamic graphs, allowing analysts to explore connections and relationships with ease. Features like flexible filtering, path length control, and node labeling, as seen in the visualization interface, provided analysts with the necessary tools to navigate and understand the data. The frontend also utilized ipywidgets, enhancing interactivity and user experience.
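The filtering, path length, and node limit controls described above amount to selecting a bounded subgraph before drawing it. The sketch below shows one way to do that in plain Python. The data shapes, function name, and parameters are assumptions for illustration and are not taken from the delivered notebooks.

```python
from collections import deque


def subgraph(nodes, edges, start, max_hops=2, max_nodes=50, keep=lambda attrs: True):
    """Select the nodes reachable from `start` within `max_hops`.

    nodes: dict mapping node id -> attribute dict
    edges: list of (source id, target id, relationship type) tuples
    keep:  predicate on a node's attributes; nodes that fail it are hidden
           and not traversed through. The start node is always kept.
    Returns (selected node ids in discovery order, edges between them).
    """
    neighbours = {}
    for src, dst, _ in edges:
        # Treat edges as undirected for exploration purposes.
        neighbours.setdefault(src, []).append(dst)
        neighbours.setdefault(dst, []).append(src)

    selected = [start]
    seen = {start}
    queue = deque([(start, 0)])
    while queue and len(selected) < max_nodes:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # path length limit reached on this branch
        for nxt in neighbours.get(node, []):
            if nxt in seen or not keep(nodes[nxt]):
                continue
            if len(selected) >= max_nodes:
                break  # node limit reached
            seen.add(nxt)
            selected.append(nxt)
            queue.append((nxt, hops + 1))

    chosen = set(selected)
    kept_edges = [edge for edge in edges if edge[0] in chosen and edge[1] in chosen]
    return selected, kept_edges
```

Breadth-first traversal means that when the node limit is hit, the nodes closest to the starting point are the ones retained. The returned node and edge lists can then be handed to whichever graph drawing library is in use.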
By leveraging AWS services like Lambda, Neptune, SageMaker, and SNS, combined with custom application code and visualization tools, Lambert Labs delivered a comprehensive solution that empowered Horizon Advisory to efficiently query and analyze their complex data, moving beyond the limitations of manual analysis and basic graph database interfaces.
Outcome
The successful completion of the solution delivered significant strategic advantages to Horizon Advisory. The solution served as a powerful demonstrator, enabling Horizon to effectively present their innovative approach to potential investors and significantly bolstering their fundraising efforts. Moreover, the project provided a clear and achievable roadmap for scaling the solution to a full production environment, laying the groundwork for future growth and expansion.
Internally, Horizon Advisory’s research team experienced a substantial boost in productivity. The new system drastically reduced the time spent manually sifting through data, allowing researchers to focus on high-value analysis and report generation. The ability to quickly and accurately retrieve intricate connections between individuals and institutions, coupled with the interactive graph visualizations, transformed their research process. Researchers could now easily explore complex networks and extract key insights directly from the data. The filtering capabilities, which allowed focused analysis based on node attributes and relationship types, gave them a powerful tool for navigating and understanding the data.
The solution provided a tangible demonstration of the power of AWS combined with custom application code to solve complex data analysis challenges. The system not only met the immediate needs of Horizon’s researchers but also laid the foundation for a scalable and efficient data analysis platform, ultimately enhancing their ability to deliver timely and insightful research to their clients.
Using RAG on AWS enabled us to leverage our technical expertise to solve an interesting challenge for Horizon. (George Lambert, Founder & CEO, Lambert Labs)
About Horizon Advisory
Horizon Advisory is a strategic consultancy that provides in-depth analysis of technology, industry, and geopolitics, with a focus on China. They deliver critical insights to governments, financial institutions, and technology companies, enabling informed decision-making in a complex global landscape.