CloudDP 2014

Fourth International Workshop on Cloud Data and Platforms
Co-located with EuroSys 2014 in Amsterdam, the Netherlands
April 13, 2014

Call for Papers

Deadline extended to February 10, 2014

Processing of very large data sets requires a unique combination of data management and distributed systems engineering knowledge. The data management challenges include, among others, the development of new approaches and algorithms that can reduce the complexity of the data processing and allow incremental and continuous result production. Simultaneously, the sheer volume and velocity of the data require systems that can automatically and adaptively scale up and out to accommodate big data processing algorithms.

The focus of this workshop is on new cloud-based data management and processing systems which span tens of thousands of machines in order to support processing of contemporary, very large data sets. Such systems require new architectures, programming models and designs that go beyond approaches used in fixed-size compute clusters. Their goal is to support users who interactively explore and analyze large and quickly changing data sets in real time. The right platforms and techniques can simplify and accelerate the design, implementation, and execution of new "big data" applications.

In the past, data processing in the cloud has been dominated by batch processing paradigms such as MapReduce, but increasingly users seek to consume their results in near real-time. In order to efficiently support these new types of applications, it is necessary to overcome the challenges of adaptive, near real-time processing of data in cloud environments. Ultimately, adaptive low-latency data processing across a large number of machines brings a new set of problems related to systems, distributed systems, networking, fault-tolerance, and data management research.

The specific topics of interest for the workshop are:
Fault tolerance
Vertical and horizontal scalability
Elasticity and adaptive scheduling
Resource allocation and provisioning
Predictability in cloud environments
Multi-tenancy and virtualization
Elastic storage and networking
Adaptive data management
New processing paradigms
Programming models


Important Dates

Paper submission deadline (extended): February 10, 2014 (originally January 31, 2014)
Notification of acceptance: March 5, 2014 (originally February 24, 2014)
Camera-ready deadline: March 28, 2014
Workshop: April 13, 2014


Submissions

Paper submission is now closed. Authors of accepted papers should follow these guidelines when preparing the camera-ready version. The camera-ready paper must be exactly 6 pages in double-column format, including everything (figures, tables, references, appendices, etc.). Papers should use a 10pt font, specified with \documentclass[10pt,twocolumn]{sigplanconf}, and must be formatted according to the ACM SIGPLAN style, for which templates are available for both LaTeX and Word.
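As a rough illustration of the formatting requirement above, a minimal LaTeX skeleton might look like the following. This is only a sketch: it assumes the sigplanconf.cls file from the ACM SIGPLAN template is available next to the paper source, and the title, author details, and section names are hypothetical placeholders. The official template remains the authoritative reference.

```latex
% Minimal camera-ready skeleton (a sketch, not the official template).
% Assumes sigplanconf.cls from the ACM SIGPLAN template is on the TeX path.
\documentclass[10pt,twocolumn]{sigplanconf}

\begin{document}

% Placeholder title and author details (hypothetical).
\title{Paper Title}
\authorinfo{Author Name}
           {Author Affiliation}
           {author@example.org}

\maketitle

\begin{abstract}
Abstract text goes here.
\end{abstract}

\section{Introduction}
Body text goes here. Figures, tables, references, and appendices
all count toward the 6-page total.

\end{document}
```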

Authors of accepted papers can submit their camera-ready version by uploading a new version of their paper on EasyChair: https://www.easychair.org/conferences/?conf=clouddp14

Accepted papers will be published as part of the ACM Digital Library.


Program and Keynotes

Program

  • 08:00-09:00 Registration

  • 09:00-10:00 Keynote: Querying Distributed Data Streams. Minos Garofalakis (Technical University of Crete) -- see details below

  • 10:00-10:30 Research session I
    • Exploiting Cloud Heterogeneity for Optimized Cost/Performance MapReduce Processing. Zhuoyao Zhang (University of Pennsylvania), Ludmila Cherkasova (Hewlett-Packard Labs) and Boon Thau Loo (University of Pennsylvania)

  • 10:30-11:00 Coffee Break

  • 11:00-12:30 Research session II
    • Excalibur: An Autonomic Cloud Architecture for Executing Parallel Applications. Alessandro Ferreira Leite (Université Paris-Sud/University of Brasilia), Claude Tadonki (MINES ParisTech/CRI), Christine Eisenbeis (INRIA Saclay/Université Paris-Sud), Taina Raiol (University of Brasilia), Maria Emilia M. T. Walter (University of Brasilia) and Alba Cristina Magalhaes Alves De Melo (University of Brasilia)

    • Compressing Large Scale Urban Trajectory Data. Kuien Liu (Chinese Academy of Sciences), Yaguang Li (Chinese Academy of Sciences), Jian Dai (Chinese Academy of Sciences), Shuo Shang (China University of Petroleum) and Kai Zheng (The University of Queensland)

    • freeCycles - Efficient Data Distribution for Volunteer Computing. Rodrigo Bruno (INESC ID / Instituto Superior Técnico) and Paulo Ferreira (INESC ID / Instituto Superior Técnico)

  • 12:30-14:00 Lunch

  • 14:00-15:00 Keynote: The Present and Future of Big Data Systems and Their Management. Ludmila Cherkasova (Hewlett-Packard Labs) -- see details below

  • 15:00-15:30 Research session III
    • Scale-up Graph Processing in the Cloud: Challenges and Solutions. Jasmina Malicevic (EPFL), Amitabha Roy (EPFL) and Willy Zwaenepoel (EPFL)

  • 15:30-16:30 Coffee Break and Joint Poster Session for all EuroSys Workshops

  • 16:30-17:00 Research session IV
    • FileMap: Map-Reduce Program Execution on Loosely-Coupled Distributed Systems. Michael Fisk (Los Alamos National Laboratory) and Curtis Hash (Los Alamos National Laboratory)

  • 17:00-17:15 Closing remarks

  • 17:30-19:00 Reception

Keynotes

  • Morning keynote: Querying Distributed Data Streams. Minos Garofalakis (Technical University of Crete)
    • Abstract: Effective Big Data analytics pose several difficult challenges for modern data management architectures. One key such challenge arises from the naturally streaming nature of big data, which mandates efficient algorithms for querying and analyzing massive, continuous data streams (that is, data that is seen only once and in a fixed order) with limited memory and CPU-time resources. Such streams arise naturally in emerging distributed architectures, such as micro-cloud federations, where the resources of several, dispersed corporate cloud platforms are pulled together to enable the analysis of massive data sets. In addition to memory- and time-efficiency concerns, the inherently distributed nature of such applications also raises important communication-efficiency issues, making it critical to carefully optimize the use of the underlying network infrastructure. In this talk, we introduce the distributed data streaming model, and discuss recent work on tracking complex queries over massive distributed streams, as well as new research directions in this space.
    • Biography: Minos Garofalakis received the Diploma degree in Computer Engineering and Informatics from the University of Patras, Greece in 1992, and the M.Sc. and Ph.D. degrees in Computer Science from the University of Wisconsin-Madison in 1994 and 1998, respectively. He worked as a Member of Technical Staff at Bell Labs, Lucent Technologies in Murray Hill, NJ (1998-2005), as a Senior Researcher at Intel Research Berkeley in Berkeley, CA (2005-2007), and as a Principal Research Scientist at Yahoo! Research in Santa Clara, CA (2007-2008). In parallel, he also held an Adjunct Associate Professor position at the EECS Department of the University of California, Berkeley (2006-2008). As of October 2008, he is a Professor of Computer Science at the School of Electronic and Computer Engineering of the Technical University of Crete, and the Director of the Software Technology and Network Applications Laboratory (SoftNet). Prof. Garofalakis’ research focuses on Big Data analytics, spanning areas such as database systems, data streams, data synopses and approximate query processing, probabilistic databases, and data mining. His work has resulted in over 120 published scientific papers in these areas, and 35 US Patent filings (27 patents issued) for companies such as Lucent, Yahoo!, and AT&T. Google Scholar lists over 8900 citations to his work and an h-index of 50. Prof. Garofalakis is an ACM Distinguished Scientist (2011), and a recipient of the IEEE ICDE Best Paper Award (2009), the Bell Labs President’s Gold Award (2004), and the Bell Labs Teamwork Award (2003). Check his homepage for more information.

  • Afternoon keynote: The Present and Future of Big Data Systems and Their Management. Ludmila Cherkasova (Hewlett-Packard Labs)
    • Abstract: A meaningful analysis of large datasets (Big Data) has become a significant computing challenge. New frameworks and systems have been proposed for Big Data processing. They target a variety of data (everything from business transactions to sensor data to tweets) and aim to offer new useful insights via advanced real-time analytics and/or batch-driven data analysis. The common theme of these underlying systems is that they represent a scale-out approach on commodity machines. Using a MapReduce framework, I will present and analyze challenges in performance management of such systems. I will also talk about the community and enterprise efforts to design unified and/or integrated data processing frameworks that aim to simplify application development and enhance data analytics. Finally, I will discuss hardware and resource usage patterns imposed by modern and emerging scale-out applications and their possible impact on future system design.
    • Biography: Dr. Ludmila Cherkasova is a principal scientist in the Systems Research Lab at Hewlett-Packard Labs, Palo Alto, USA. Her current research interests include the analysis, design, and management of concurrent and distributed systems (such as emerging systems for Big Data processing, internet and media applications, virtualized environments, and next generation data centers). She has authored over 100 refereed publications and more than 70 patent applications. She is an ACM Distinguished Scientist and has been recognized with a Certificate of Appreciation from the IEEE Computer Society. She has earned six Best Paper awards. Her most recent work is on the design of performance analysis and optimization techniques for MapReduce environments.


EuroSys Workshops Joint Posters Session

This year, EuroSys workshops are grouping together for a common poster session to take place during the afternoon coffee break. Authors of all accepted papers at CloudDP and other participating workshops are encouraged to prepare a poster, optionally accompanied by a demo, to publicize their work to the attendees of other workshops. More details can be found on the dedicated Web page.


Workshop Venue

The CloudDP 2014 workshop is co-located with the EuroSys 2014 conference and will be held in the main building of the VU University Amsterdam. Please refer to the EuroSys 2014 local information pages for details about the venue and accommodation. The exact location of the workshop will be announced in early March 2014.


Organization

Workshop Organizers and Program Chairs:

Zbigniew Jerzak (SAP AG, Germany)
Etienne Rivière (University of Neuchâtel, Switzerland)
Luis Veiga (Técnico Lisboa - ULisboa / INESC-ID Lisboa, Portugal)

Technical Program Committee:

Roberto Baldoni (University of Rome La Sapienza, Italy)
Paulo Ferreira (INESC ID, Portugal)
Christof Fetzer (TU Dresden, Germany)
Pascal Felber (University of Neuchâtel, Switzerland)
Andrew Pavlo (Carnegie Mellon, USA)
Guillaume Pierre (Université de Rennes 1, France)
Peter Pietzuch (Imperial College London, UK)
Marco Serafini (Qatar Computing Research Institute, Qatar)
Marc Shapiro (INRIA and LIP6, France)

Contact:

Please do not hesitate to contact the organizers if you have any questions.