Current Challenges in Web Crawling
Abstract. Web crawling, the process of collecting web pages in an automated manner, is the primary and ubiquitous operation used by a large number of web systems and agents, ranging from a simple program for website backup to a major web search engine. Due to the astronomical amount of data already published on the Web and the ongoing exponential growth of web content, any party that wants to take advantage of massive-scale web data faces a high barrier to entry. In this tutorial, we will introduce the audience to five topics: architecture and implementation of a high-performance web crawler, collaborative web crawling, crawling the deep Web, crawling multimedia content, and future directions in web crawling research.
Denis Shestakov is a postdoctoral researcher at the Department of Media Technology, Aalto University, Finland. He spent one year as a visiting researcher at INRIA Rennes, France. Denis obtained his doctoral degree at the University of Turku, Finland in 2008. In his doctoral work, Denis addressed the limitations of web crawlers, specifically the poor coverage of information available in online databases (a.k.a. the deep Web). His current research interests lie in the area of distributed algorithms for big data processing, with particular applications in web crawling and large-scale multimedia retrieval. Denis maintains an open group on research works in the area of web crawling. Contact him at email@example.com or visit his homepage.
Enterprise Application Integration - the Cloud Perspective
Jörg Lässig and Markus Ullrich
Abstract. So far, asynchronous messaging has proven to be the best strategy for enterprise application integration (EAI) success. However, building and deploying messaging solutions causes several problems for developers, and new technologies and computing paradigms such as cloud computing demand new solutions. There are more than sixty enterprise integration patterns that are designed to effectively develop messaging solutions for enterprises. The tutorial introduces the visual notation framework to describe large-scale integration solutions across different systems and technologies. This includes examples covering a variety of integration styles and techniques. In a case study, we illustrate the application of the patterns in practice and review existing and emerging standards. We also try to shed light on the future of EAI. In particular, cloud integration is an upcoming trend that is discussed in the tutorial, addressing the advantages and limitations of this and other modern EAI strategies and architectures. Looking at open-source solutions for enterprise service buses and messaging systems, we also provide practical advice on designing code that connects an application to a messaging system. This helps the practitioner design EAI or cloud integration solutions by applying the introduced knowledge.
Jörg Lässig has been a Full Professor in the field of Enterprise Application Development at the Department of Electrical Engineering and Computer Science at the University of Applied Sciences Zittau/Görlitz since 2011. He holds degrees in Computer Science and Computational Physics and received a Ph.D. in Computer Science from Chemnitz University of Technology in 2009 for his research on efficient algorithms and models for the generation and control of cooperation networks. Afterwards, he participated in various research projects at the International Computer Science Institute at Berkeley, California and at the Università della Svizzera italiana in Lugano, Switzerland. He is currently focusing on various topics in the context of sustainable information technologies and applications, including sustainability in enterprise IT, green information systems, logistics and supply, and business intelligence.
Markus Ullrich is currently a research associate at the University of Applied Sciences Zittau/Görlitz, where he received his B.S. and M.S. in Computer Science in 2010 and 2012, respectively. From 2009 to 2012, he worked as a software developer at Decision Optimization GmbH, where he developed and tested data mining algorithms for predictive maintenance. His current research is focused on cloud computing, cloud integration, distributed systems and privacy-preserving data mining.
An Introduction to Human Computation and Games With A Purpose
Alessandro Bozzon and Luca Galli
Abstract. Crowdsourcing and human computation are novel disciplines that enable the design of computation processes that include humans as actors for task execution. In such a context, Games With a Purpose are an effective means to channel, in a constructive manner and through computer games, the human brainpower required to perform tasks that computers are unable to perform. This tutorial introduces the core research questions in human computation, with a specific focus on the techniques required to manage structured and unstructured data. The second half of the tutorial delves into the field of game design for serious tasks, with an emphasis on games for human computation purposes. Our goal is to provide participants with a wide yet complete overview of the research landscape; we aim at giving practitioners a solid understanding of the best practices in designing and running human computation tasks, while providing academics with solid references and, possibly, promising ideas for their future research activities.
Alessandro Bozzon is an Assistant Professor at the Delft University of Technology, with the Web Information Systems group. His research interests lie in the fields of data and information management on the Web, with a specific focus on Semantic Web technologies, human and social computation, and data integration. His current research aims at defining a foundational theory for hybrid human and automatic information management systems, by studying the theoretical models and the technical means to achieve this integration.
Luca Galli is a PhD student at Politecnico di Milano. His research interests involve data mining and text mining, human and social computation, game design and video game development technologies (innovative middleware architectures, game engine architecture, multi-platform deployment). His current research aims at integrating traditional game paradigms and gamification techniques in the design and implementation of human-enhanced applications. He is actively involved in the European CuBRIK project, where he investigates GWAPs and engaging techniques that can be used to drive the users’ entertainment in order to solve media refinement tasks.
Responsive Design and Development: Methods, Technologies and Current Issues
Michael Nebeling and Moira C. Norrie
Abstract. Responsive design is a major trend in web development to cater for the diversity of devices used for web browsing. However, applying responsive design to existing web sites often involves major reengineering due to the underlying fluid grid concept. Moreover, applications of responsive design are currently limited to desktop-to-mobile adaptation. This tutorial introduces the main ideas behind responsive design with a focus on the methods and technologies. Based on previous research, we highlight several limitations of the original approach and show how the concepts and methods can be extended to adapt to many different viewing conditions including large-screen settings and touch devices.
Michael Nebeling is a Post-doctoral Researcher and Lecturer at ETH Zurich. His research and teaching interests are at the intersection of Web Engineering and HCI, including context-aware and adaptive systems, multi-device and gesture-based interaction, end-user development and crowdsourcing. His PhD thesis, Lightweight Informed Adaptation: Methods and Tools for Responsive Design and Development of Very Flexible, Highly Adaptive Web Interfaces, has made several contributions to ICWE and has won best paper awards and nominations at CHI 2011 and EICS 2012.
Moira Norrie has been a Professor at ETH Zurich since 1996 when she established a research group on Global Information Systems. She heads the Institute for Information Systems which is part of the Department of Computer Science. Her main areas of research are information systems engineering, information interaction, web engineering and personal information management.