D1.1. Requirement Description
Abstract: This document provides an overview of the work done during the first 22-month period of the BigStorage project (from 01-01-2015 until 30-09-2016) with respect to WP1. During this period, the ESRs organized themselves into four working groups, and these groups analysed the requirements of the four main use cases of the project: Human Brain Project, Square Kilometre Array, Climate modelling, and Smart Cities. In this deliverable we briefly describe each of these projects and list the requirements each of them has in the areas covered by the ETN: storage, I/O, analysis, etc. For each requirement we describe the requirement itself, its potential evolution, and sources of information where the reader can gain deeper insight into it. Finally, for each project and requirement, we list which ESRs are working on solutions that may fully or partially cover the listed requirements.
Document: Deliverable 1.1 (pdf)

D1.2. Benchmarks defined & tested

Abstract: This deliverable presents a set of benchmarks that represent the four use cases: Human Brain Project, Climate modelling, Square Kilometre Array, and Smart Cities. These benchmarks are defined in enough detail for ESRs to use them to test their progress. For each initiative and benchmark, we detail why it is important (including the requirements it fulfils) and how to use and parametrize the benchmark, and we give some initial results that can help in understanding the progress of ESRs in the near future. In order to model the different use cases, we have defined (or adopted) 1 benchmark for the Human Brain Project, 8 benchmarks for the Square Kilometre Array, 5 benchmarks for Climate modelling, and 9 benchmarks for Smart Cities. With these 23 benchmarks, we cover most of the relevant requirements defined in D1.1.

Document: Deliverable 1.2 (pdf)

D2.1. Intermediate Report WP2 (Data Science)
Abstract: This report presents an overview of the major systems that have been studied by the Early Stage Researchers (ESRs). It covers the following topics: (i) MapReduce, the de facto state-of-the-art programming model for Big Data processing, and its reference implementations (Google MapReduce and Hadoop); (ii) performance-optimized MapReduce processing based on in-memory storage (Spark, Flink); (iii) more generic models for Big Data processing, supporting SQL-like queries, data streams, more generic workflows, and general graph processing; and (iv) systems leveraging machine learning for Big Data processing. For each of these topics, the report includes a brief analysis of the state of the art and a preliminary presentation of the specific problems to be addressed in the project.
Document: Deliverable 2.1 (pdf)

D3.1. Intermediate Report WP3 (HPC-Cloud convergence)
Abstract: Although the storage mechanisms available in HPC and Cloud environments differ significantly in their design as well as in the services they deliver (files vs. key/value systems, hierarchical vs. flat namespaces, semantics), understanding the specific techniques used in both areas is key to proposing new storage systems that offer the best of both worlds. The activities conducted in WP3 have focused on understanding the specific techniques used in the HPC and Cloud areas.
Document: Deliverable 3.1 (pdf)

D4.1. Intermediate Report WP4 (Storage Solutions)
Abstract: WP4 covers problems related to device technology and how these affect storage architecture and the storage stack. WP4 categorizes issues and tasks as: (i) T4.1 Storage acceleration: how modern storage systems can take advantage of heterogeneity and locality to improve the performance of applications; (ii) T4.2 Storage convergence: how storage systems can support different types of applications, especially where big data is involved, converging different uses of storage over the same platforms; and (iii) T4.3 Storage isolation: how we can infer system behavior in mixed environments, where there is heterogeneity both in storage technologies and in the applications that use storage. This report presents an overview of each of these topics, including a brief analysis of the state of the art in each case and a preliminary plan of the specific problems to be addressed by each Early Stage Researcher (ESR).
Document: Deliverable 4.1 (pdf)