GRADES2014 · GRADES2015 · GRADES2016 · SIGMOD/PODS 2013
GRADES: Graph Data-management Experiences & Systems
Keynote by prof. dr. Carlos Guestrin

GRAPHS AT SCALE WITH GRAPHLAB

Today, machine learning (ML) methods play a central role in industry and science. The growth of the Web and improvements in sensor data collection technology have been rapidly increasing the magnitude and complexity of the ML tasks we must solve. This growth is driving the need for scalable, parallel ML algorithms that can handle "Big Data."

In addition to scale, Big Data problems have brought to light another significant gap in large-scale systems: Standard SQL tables tend to be too rigid for this data, while unstructured representations lead to significant scaling problems. Graphs can provide a flexible intermediate point, allowing both flexibility and scalability.

In this talk, I will focus on:

  1. Examining common algorithmic patterns in distributed ML methods for tackling Big Graphs.
  2. Qualifying the challenges of implementing these algorithms in real distributed systems.
  3. Describing computational frameworks for implementing these algorithms at scale.

We will focus mainly on the GraphLab framework, which naturally expresses asynchronous, dynamic graph computations that are key for state-of-the-art ML algorithms. When these algorithms are expressed in our higher-level abstraction, GraphLab will effectively address many of the underlying parallelism challenges, including data distribution, optimized communication, and guaranteeing sequential consistency, a property that is surprisingly important for many ML algorithms. On a variety of large-scale tasks, GraphLab provides 20-100x performance improvements over Hadoop. In recent months, GraphLab has received many tens of thousands of downloads, and is being actively used by a number of startups, companies, research labs and universities.

About prof. dr. Carlos Guestrin

Carlos Guestrin is the Amazon Professor of Machine Learning at the Computer Science and Engineering Department of the University of Washington. He is also a co-founder and CEO of GraphLab Inc., focusing on large-scale machine learning and graph analytics. His previous positions include Associate Professor at Carnegie Mellon University and senior researcher at the Intel Research Lab in Berkeley. Carlos received his PhD and Master's degrees from Stanford University, and a Mechatronics Engineer degree from the University of Sao Paulo, Brazil.

Carlos' work has been recognized by awards at a number of conferences and two journals: KDD 2007 and 2010, IPSN 2005 and 2006, VLDB 2004, NIPS 2003 and 2007, UAI 2005, ICML 2005, AISTATS 2010, JAIR in 2007 and 2012, and JWRPM in 2009. He is also a recipient of the ONR Young Investigator Award, NSF Career Award, Alfred P. Sloan Fellowship, IBM Faculty Fellowship, the Siebel Scholarship and the Stanford Centennial Teaching Assistant Award. Carlos was named one of the 2008 "Brilliant 10" by Popular Science Magazine, and received the IJCAI Computers and Thought Award and the Presidential Early Career Award for Scientists and Engineers (PECASE). He is a former member of the Information Sciences and Technology (ISAT) advisory group for DARPA.

GRADES 2013 key data:
Workshop:     Sunday June 23
Submission:   April 7 (extended)
Notification: April 29 (shifted)
Camera-ready: May 19

Place: Times Square, NY;
Millennium Broadway Hotel.