Publications by Type: Book Chapter

2013

S. Idreos, “Big Data Exploration,” in Big Data Computing, Taylor and Francis, 2013.

Abstract

We are now entering the era of data deluge, where the amount of data outgrows the capabilities of query processing technology. Many emerging applications, from social networks to scientific experiments, are representative examples of this deluge, where the rate at which data is produced exceeds any past experience. For example, scientific fields such as astronomy are soon expected to collect multiple terabytes of data on a daily basis, while web-based businesses such as social networks or web log analysis are already confronted with a growing stream of large data inputs. Therefore, there is a clear need for efficient big data query processing to enable the evolution of businesses and sciences to the new era of data deluge. In this chapter, we focus on a new direction of query processing for big data where data exploration becomes a first-class citizen. Data exploration is necessary when new big chunks of data arrive rapidly and we want to react quickly, i.e., with little time to spare for tuning and set-up. In particular, our discussion focuses on database systems technology, which for several decades has been the predominant data processing tool. We introduce the concept of data exploration and discuss a series of early techniques from the database community towards building database systems that are tailored for big data exploration, i.e., adaptive indexing, adaptive loading and sampling-based query processing. These directions focus on reconsidering fundamental assumptions and on designing next-generation database architectures for the big data era.

2006

Z. Kaoudi, I. Miliaraki, M. Magiridou, E. Liarou, S. Idreos, and M. Koubarakis, “Semantic Grid Resource Discovery using DHTs in Atlas,” in Knowledge and Data Management in Grids, Springer, 2006.

Abstract

We study the problem of resource discovery in the Semantic Grid. We show how to solve this problem by utilizing Atlas, a P2P system for the distributed storage and retrieval of RDF(S) data. Atlas is currently under development in the OntoGrid project, funded by FP6. Atlas is built on top of the distributed hash table Bamboo and supports pull and push querying scenarios. It inherits all the nice features of Bamboo (openness, scalability, fault-tolerance, resistance to high churn rates) and extends Bamboo's protocols for storing and querying RDF(S) data. Atlas is currently being used to realize the metadata service of S-OGSA in a fully distributed and scalable way. In this paper, we concentrate on the main features of Atlas and demonstrate its use for Semantic Grid resource discovery in an OntoGrid use case scenario.

P.-A. Chirita, S. Idreos, M. Koubarakis, and W. Nejdl, “Designing Semantic Publish/Subscribe Networks Using Super-Peers,” in Semantic Web and Peer-to-Peer, 2006, pp. 159-179.

Abstract

Publish/subscribe systems are an alternative to query-based systems in cases where the same information is asked for over and over, and where clients want to get updated answers for the same query over a period of time. Recent publish/subscribe systems such as P2P-DIET have introduced this paradigm in the P2P context. In this chapter we build on the experience gained with P2P-DIET and the Edutella super-peer infrastructure and present a semantic publish/subscribe system supporting metadata and a query language based on RDF. We formally define the basic concepts of our system and present detailed protocols for its operation.

Stratos Idreos

Gordon McKay Professor of Computer Science
