Project plan
Large Scale Graph Analytics
Introduction:
Graph analytics is one of the most important developments in big data.
Applications such as fraud detection, employee retention, and determining
product affinities by exploiting community buying patterns have proved useful
to industry. Large-scale graph analytics has unique processing requirements
that call for specialized platforms, but such platforms do not work well with
the large ecosystem of SQL business applications. This is where Spark comes
into the picture: it lets graph processing be combined with other analytics
techniques.
Recent programming frameworks achieve high performance on large-scale graph
analytics by supporting the processing of graphs that exceed main memory
capacity on small computing systems. We therefore strive to improve the
performance of software frameworks for out-of-core graph analytics. The plan
is to implement this improvement in the best available open-source framework
and to compare its performance against two state-of-the-art out-of-core graph
frameworks.
Tools selection:
Dataset:
1) Crawl web graph: stored in WARC (Web ARChive) format. The dataset contains
the crawl date, the headers used, and other metadata.
2) LDBC social network graph generator: software to synthesise directed graphs
whose properties resemble those of social networks.
Steps:
1) Obtain the data set
2) Clean the data
3) Initialize the Spark context
4) Create the GraphFrame
5) Run the LPA (label propagation) algorithm
6) Visualize the results
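
To make the cleaning, graph-building, and LPA steps concrete, here is a minimal plain-Python sketch run on a tiny hand-made edge list. It is not the Spark or GraphFrames code the plan calls for, only an illustration of what those steps do. Several things here are assumptions made for the example: the function names (clean_edges, build_adjacency, label_propagation), the toy edges, treating edges as undirected, updating all labels synchronously, and breaking ties by picking the smallest label. A real run would presumably use GraphFrames' own label propagation routine on the cleaned dataset instead.

```python
from collections import Counter, defaultdict


def clean_edges(raw_edges):
    """Step 2 (clean data): drop self-loops and duplicate undirected edges."""
    seen = set()
    for src, dst in raw_edges:
        if src == dst:
            continue  # a self-loop carries no community information
        key = (min(src, dst), max(src, dst))  # same key for (a, b) and (b, a)
        if key not in seen:
            seen.add(key)
            yield key


def build_adjacency(edges):
    """Step 4 stand-in: an in-memory adjacency map instead of a GraphFrame."""
    adj = defaultdict(set)
    for src, dst in edges:
        adj[src].add(dst)
        adj[dst].add(src)
    return adj


def label_propagation(adj, max_iter=10):
    """Step 5: synchronous label propagation (assumed variant).

    Every vertex starts with its own id as label. In each round, each vertex
    adopts the most frequent label among its neighbours; ties are broken by
    choosing the smallest label so the result is deterministic.
    """
    labels = {v: v for v in adj}
    for _ in range(max_iter):
        new_labels = {}
        for v, neighbours in adj.items():
            counts = Counter(labels[n] for n in neighbours)
            best = max(counts.values())
            new_labels[v] = min(l for l, c in counts.items() if c == best)
        if new_labels == labels:
            break  # no vertex changed its label, so stop early
        labels = new_labels
    return labels


# Toy input: two triangles, plus one duplicate edge and one self-loop to clean.
raw = [(0, 1), (1, 2), (2, 0), (1, 0), (2, 2), (3, 4), (4, 5), (5, 3)]
adj = build_adjacency(clean_edges(raw))
labels = label_propagation(adj)

# Step 6 stand-in: group vertices by community label and print them.
communities = defaultdict(list)
for vertex, label in sorted(labels.items()):
    communities[label].append(vertex)
print(dict(communities))
```

On this toy input each triangle ends up as its own community. Synchronous updates can oscillate on some graphs (bipartite ones, for example), which is why the sketch caps the number of rounds with max_iter.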
More work:
- Issues Encountered in Distributed Graph Processing
- Graph Languages with Benefits
- Efficient Large Scale Data Analytics
- Graph Analytics for Social Networks