Querying big graph data
From November 2015, an ERC-funded project led by Professor Wenfei Fan will investigate the challenges of querying big graph data with limited resources.
Professor Fan has been awarded a European Research Council (ERC) Advanced Grant of £1,608,536 to co-ordinate the work.
An explanation of the project, published by CORDIS, sums up the challenge as follows:
“When we search for a product, can we find, using a single query, top choices ranked by Google and at the same time, recommended by our friends connected on Facebook? Is such a query tractable on the social graph of Facebook, which has over 1.31 billion nodes and 170 billion links? Is it feasible to evaluate such a query if we have bounded resources such as time and computing facilities?”
Tackling fundamental problems
Professor Fan says:
“Real-life graphs are typically big, with billions of vertices and trillions of edges. Graph computations are expensive, even intractable. A linear scan of 1PB of data takes days, and it takes years to scan 1EB of data.
“To add to the challenges, we often have constrained resources, such as limited time and computing facilities.
“Solving the problem demands a departure from traditional query evaluation techniques and even from classical computational complexity theory.
“A number of questions arise. What query classes are tractable in the context of big data? Does parallelization always help when querying big data? When exact answers are beyond reach in big data, can we find approximate answers that have accuracy guarantees for unpredictable queries? What query engines do we need for querying big data within constrained resources?
“This project aims to tackle these challenges, from fundamental problems to practical techniques.”
The project will develop GRACE, a system to query big GRAphs within bounded resourCEs, to support emerging applications such as social media marketing, social search and knowledge bases.
This will involve:
- developing a graph pattern query language that allows users to unify Web search (via keywords) and social search (via graph patterns), and express graph pattern association rules for social media marketing
- revising the conventional complexity theory to characterise the tractability of queries on big data, and formalise parallel scalability with the increase of processors
- developing algorithmic foundations and resource-constrained techniques for querying big graphs, by ‘making big data small.'
The GRACE project will run from November 2015 to September 2020.
Professor Fan is Chair of Web Data Management, School of Informatics, University of Edinburgh.