SERVICES

The Data Cycle Hub, Access and support to Big Data/AI stack environment

TYPE
Technology provision
REGION
Valencia
LANG
English, Spanish

Access to, and support for, a ready-to-use Big Data and AI experimentation environment based on open source technologies such as Spark and Jupyter, for developing and deploying big data processing and analytics applications. The service provides access to this infrastructure together with experimentation support.

SERVICE DESCRIPTION

This service provides access to a data processing experimentation environment based on open source technologies. Specifically, it deploys a system consisting of a Jupyter instance and one or more virtual machines hosting a Spark service. The user specifies the number of machines and the computing resources each one requires.

Features:

  • The Spark cluster is provisioned fully configured, with its master and workers already interconnected. The user defines both the number of nodes and the resources assigned to each one.
  • HDFS is installed on all Spark nodes, making datasets accessible during an experiment.
  • Access is password-secured.
  • Sample notebooks are provided with basic instructions that illustrate HDFS interaction and the initialization of a Spark session and context.
  • Deployment is automated with Ansible playbooks, which create a dedicated security group and network interface for the cluster.
  • Additional Python libraries can be installed.
  • Cluster monitoring is included.
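
As a rough illustration of what the sample notebooks cover, the following minimal sketch initializes a Spark session and context and reads a dataset from HDFS. It is not the content of the provided notebooks. It assumes a Spark standalone master on its default port 7077 and that the `pyspark` package is available in the Jupyter environment; the host names, the HDFS port, the dataset path and the helper names (`spark_master_url`, `hdfs_uri`, `create_session`, `read_csv_from_hdfs`) are all hypothetical and must be adapted to the actual deployment.

```python
# Hypothetical values: replace them with those of your own deployment.
SPARK_MASTER_HOST = "spark-master.example.org"
HDFS_NAMENODE_HOST = "spark-master.example.org"
HDFS_NAMENODE_PORT = 9000  # assumption; the real port depends on the HDFS configuration


def spark_master_url(host, port=7077):
    """Build the URL of a Spark standalone master (7077 is its default port)."""
    return f"spark://{host}:{port}"


def hdfs_uri(host, port, path):
    """Build a fully qualified HDFS URI for a file or directory."""
    if not path.startswith("/"):
        path = "/" + path
    return f"hdfs://{host}:{port}{path}"


def create_session(app_name, master_url):
    """Create (or reuse) a Spark session connected to the given master."""
    # Imported here so the URL helpers above work even without pyspark installed.
    from pyspark.sql import SparkSession

    return SparkSession.builder.appName(app_name).master(master_url).getOrCreate()


def read_csv_from_hdfs(spark, uri):
    """Load a CSV file stored in HDFS into a DataFrame, using the first row as header."""
    return spark.read.csv(uri, header=True, inferSchema=True)
```

In a notebook cell, one would typically call `create_session("demo", spark_master_url(SPARK_MASTER_HOST))`, obtain the context through the session's `sparkContext` attribute, load data with `read_csv_from_hdfs`, and call `stop()` on the session when finished.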

The service includes technical support for the configuration and deployment of the experimentation infrastructure, as well as issue reporting.

SPECIAL ACCESS CONDITIONS

Access is provided via SSH.
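
One common way to reach a remote Jupyter instance over SSH is local port forwarding. The sketch below only builds the corresponding `ssh` command line. Whether this service exposes Jupyter through a tunnel is an assumption, not something stated in this description; the user name, host name and the helper `ssh_tunnel_command` are hypothetical, and 8888 is simply Jupyter's default port.

```python
import shlex


def ssh_tunnel_command(user, host, local_port=8888, remote_port=8888):
    """Return the argument list for an SSH local port forward.

    -N  do not run a remote command, only forward ports
    -L  forward local_port on this machine to remote_port on the remote host
    """
    return [
        "ssh",
        "-N",
        "-L",
        f"{local_port}:localhost:{remote_port}",
        f"{user}@{host}",
    ]


# Hypothetical user and host; replace with the credentials issued at registration.
command = ssh_tunnel_command("alice", "jupyter.example.org")
print(shlex.join(command))
```

With such a tunnel open, the notebook server would be reachable in a local browser at `http://localhost:8888`, provided the deployment actually listens on that port.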

PREREQUISITES

The user must be registered.

CASE EXAMPLES

A user has a large dataset to analyze but has neither the required computing resources nor the know-how to deploy a Spark service. This service provides a ready-to-use Big Data analysis stack (with Spark); the user can then upload the data and perform the analysis using Jupyter notebooks.

SERVICE OFFERED BY

MEMBER
The Data Cycle Hub
TYPE
DIH
COUNTRY
Spain
