Cineca, established in 1969, is a non-profit consortium of 67 Italian universities, 9 national research institutes and the Ministry of Education, University and Research (MIUR). Cineca is the national facility for supercomputing applications and one of the largest of its kind in Europe. The mission of the HPC department is to support scientific activity and innovation by providing high-performance computing and data services for the Italian and European research community. In 2019 the HPC department of Cineca supported 2317 research projects and directly participated in 35 EU research projects, 20 research agreements and 13 industry projects.
Emilia-Romagna is today one of the most developed regions in Europe and a leading region in Southern Europe in terms of competitiveness, GDP, and activity and unemployment rates.
Manufacturing plays the leading role in the regional economy. It is concentrated in some powerful clusters that apparently belong to traditional sectors but are able to activate medium and medium-high technology activities and high innovation capabilities. The most relevant group of industries is linked to mechanical engineering and automotive: sports cars and motorcycles, agricultural machines, shipbuilding and offshore, industrial automation and robotics, equipment for various industrial sectors (food processing and packaging, wood processing, ceramics, etc.), sensor technology and precision farming, and medical equipment. Other powerful clusters are agri-food, construction materials and technologies, biomedical industries and fashion. Tourism and entertainment industries are very important in the coastal area. Cultural and gastronomic tourism has been increasing in recent years.
Other distinctive aspects of the region are: the huge (and unique in Italy) presence of the co-operative economy, especially in agri-food, construction, logistics, retail and the social economy, but also with successful manufacturing cases; a widespread craft and micro-enterprise tradition; and an increasing presence of multinationals through mergers and acquisitions of existing firms and, recently, through greenfield investments.
Emilia-Romagna is characterized by a rich and dynamic research system made up of 4 public universities, prominent research infrastructures and facilities, and the territorial branches of national research centers, all well integrated into a local industrial landscape composed mainly of SMEs.
In this region, highly specialized competences in different scientific and industrial domains, and in the use of infrastructures, actively collaborate with each other. This results in increased funding opportunities, a more comprehensive and qualified supply of training programs, and an effective local innovation ecosystem that benefits from the cross-fertilization of different domains of expertise. An example is the Competence Center BI-REX, coordinated by the University of Bologna, which is one of the high-specialization centers for Industry 4.0 selected by the Italian government in June 2018 in implementation of the National Plan for Industry 4.0.
Under the mandate of the national Ministry of Education, University and Research (MIUR), Cineca represents Italy in PRACE, the pan-European ESFRI e-infrastructure for HPC. Cineca is one of the founding members of the ETP for High Performance Computing, an industry-led forum that defines research priorities and action plans in a number of technological areas. Moreover, Cineca is the Italian representative in the pan-European EUDAT Collaborative Data e-Infrastructure, a core partner of the Human Brain Project and a member of the BDVA (Big Data Value Association) initiative.
Cineca is involved in 35 ongoing EU projects. Those in the Machine Learning and AI domain include AIDA, HiFiTurb, i-Media-Cities, Oprecomp, Highlander, EOSC-Hub, FF4EuroHPC, ICARUS (call ICT-14-2016-2017 - Big Data PPP: cross-sectorial and cross-lingual data integration), AI4EU (ICT-26-2018-2020 - Artificial Intelligence) and two projects under call ICT-11-2018-2019 (HPC and Big Data enabled Large-scale Test-beds and Applications): CYBELE and IoTwins.
At the national level, the ecosystem is developed in close collaboration with two important innovation enablers, the Associazione Big Data (https://associazionebigdata.it/) and the BI-REX Competence Centre (https://bi-rex.it/).
The Big Data Association brings together all the stakeholders involved in big data production and management in Italy to map and analyse their potential impact on scientific, social and business domains. The aim is to interconnect and jointly exploit the knowledge, capacities, and research and innovation potential of this community, to leverage the effects of the actions and investments made so far and to maximize their impact locally as well as at national, EU and international levels. The association facilitates the sharing of results, cooperation between public and private entities, joint actions and the development of skills in the big data and AI domains.
BI-REX is one of the six Italian Competence Centres for Industry 4.0 established in 2018, bringing together 57 public and private actors to assist companies, and especially SMEs, in the adoption of enabling Industry 4.0 technologies, with a special focus on Big Data, Additive Manufacturing and new related business models. In addition to providing a physical industrial laboratory to carry out tests and trials, BI-REX has already funded 50 innovation projects. Cineca sits on the BI-REX steering committee, participates in innovation projects, acts as peer reviewer of ongoing projects, and provides training and services.
At the regional level, Cineca is a partner in the Services Innovation Cluster (https://innovate.clust-er.it/) of the Emilia-Romagna region, a private association of companies, research centers and training institutions that share skills, ideas and resources to support competitiveness and innovation in the service sector. Associations such as CLUST-ER strengthen collaboration among local stakeholders and allow SMEs to access innovative technologies made available by research institutions and to collaborate with other companies in regional (POR-FESR), national and European projects for the development of new products.
The datacentre of the European Centre for Medium-Range Weather Forecasts (ECMWF), which is going to be set up in Bologna in 2021, will bring further prominent computing and analytics capabilities, with the potential to make fuller use of the available expertise, infrastructures and facilities and to generate new local economic and societal opportunities.
Infrastructures and services
At present Cineca hosts three main HPC systems, namely MARCONI, MARCONI100 and GALILEO, integrated into a common working environment to enable easy access and portability of data across the different platforms.
MARCONI is a Tier-0 system consisting of 5900 nodes, each containing 2 x Intel Xeon E5-2697 v4 processors clocked at 2.30 GHz. The nodes are connected via an Intel Omni-Path network. The total performance is 20 PFlops.
GALILEO is a Tier-1 system consisting of 1022 nodes, each with 2 x 18-core Intel Xeon E5-2697 v4 (Broadwell) processors at 2.30 GHz. The nodes are connected via an InfiniBand network, and the system has a peak performance of 2.3 PFlops. A small part of GALILEO (~60 nodes) is equipped with NVIDIA K80 and V100 GPUs.
MARCONI100 is based on the IBM Power9 architecture with NVIDIA Volta GPUs. Specifically, each node hosts 2 x 16-core IBM POWER9 AC922 processors at 3.1 GHz, 256 GB of RAM and 4 x NVIDIA Volta V100 GPUs (NVLink 2.0, 16 GB). The system has 980 nodes, for a total of 31360 cores. Internal network: Mellanox InfiniBand EDR DragonFly+.
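The core count quoted for MARCONI100 follows directly from the per-node figures. The short Python check below reproduces it and, using the same figures, derives the total number of GPUs. The constant and function names are illustrative only and do not come from any Cineca tool.

```python
# Per-node figures for MARCONI100 as quoted in the text.
NODES = 980
SOCKETS_PER_NODE = 2
CORES_PER_SOCKET = 16
GPUS_PER_NODE = 4


def system_totals(nodes: int = NODES) -> dict:
    """Aggregate the per-node figures over the whole system."""
    return {
        "cores": nodes * SOCKETS_PER_NODE * CORES_PER_SOCKET,
        "gpus": nodes * GPUS_PER_NODE,
    }


totals = system_totals()
print("cores:", totals["cores"])  # 31360, matching the figure in the text
print("GPUs: ", totals["gpus"])   # 3920
```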
Leonardo, the new pre-exascale supercomputer for research and innovation, is in its implementation phase. The Leonardo system has been approved by the EuroHPC Joint Undertaking. It will operate at a minimum of 250 petaflops (0.25 exaflops). The speed of exascale computers is measured in exaflops; one exaflops is the ability to perform one quintillion (i.e. a million trillion) calculations per second.
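The relation between the two units used for Leonardo can be made explicit. One petaflops is 10^15 floating-point operations per second and one exaflops is 10^18, so the conversion is a division by 1000. The helper below is a minimal sketch with hypothetical names.

```python
PETA = 10**15  # operations per second in one petaflops
EXA = 10**18   # operations per second in one exaflops (one quintillion)


def petaflops_to_exaflops(petaflops: float) -> float:
    """Convert a performance figure from petaflops to exaflops."""
    return petaflops * PETA / EXA


leonardo_min_petaflops = 250  # minimum figure quoted in the text
print(petaflops_to_exaflops(leonardo_min_petaflops))  # 0.25
```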
The HPC cloud infrastructure provides 80 state-of-the-art server nodes for high-performance computation, each equipped with 2 Broadwell 18-core processors (E5-2697 v4) and 250 GB of DDR4 RAM, interconnected via a 25 Gb/s Mellanox Ethernet network. The infrastructure is completed by a dedicated Ceph storage of 200 TB capacity in high availability (RAID6). The cloud infrastructure is tightly connected to a 6 PB GSS storage that is visible to all the other infrastructures. This setup enables the use of all available HPC systems, addressing HPC workloads in conjunction with cloud resources.
The following data storage facilities are available:
- Scratch: each system has its own local scratch area (pointed to by the $CINECA_SCRATCH environment variable)
- Work: working storage is mounted on the three systems (pointed to by the $WORK environment variable)
- DRES: a shared storage area is mounted on the login nodes of all machines (pointed to by the $DRES environment variable)
- Tape: a tape library (12 PB, expandable to 16 PB) is connected to the DRES storage area as a multi-level archive (via LTFS)
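As an illustration of how a user script might locate these areas, the sketch below reads the three environment variables named above. It is not a Cineca-provided utility: the function name and the dictionary keys are hypothetical, and a variable that is not set on the current machine (for example $DRES on a compute node) is reported as missing rather than guessed.

```python
import os
from pathlib import Path

# Environment variable names as quoted in the list above.
STORAGE_VARS = {"scratch": "CINECA_SCRATCH", "work": "WORK", "dres": "DRES"}


def storage_paths(environ=None):
    """Map each storage area to its path, or to None if the variable is unset."""
    env = os.environ if environ is None else environ
    return {
        area: Path(env[var]) if env.get(var) else None
        for area, var in STORAGE_VARS.items()
    }


for area, path in storage_paths().items():
    print(area, "->", path if path is not None else "(not set)")
```

Passing an explicit mapping instead of relying on `os.environ` makes the helper easy to test outside the cluster.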
Software libraries: Cineca offers a variety of third-party applications and community codes installed on its HPC systems.
The main service provided by Cineca is access to computational (and storage) resources.
Other services include consultations, training and collaboration on projects (PoC).
Regarding the infrastructure, Cineca offers a flexible platform to accommodate different kinds of users with different requirements and usage models (from GPU-intensive workloads on the HPC infrastructure in batch mode, to more interactive containerized usage of the resources, to platform as a service on cloud computing) and different options for data storage (from the data lake, i.e. unstructured storage, to databases with metadata management, to repositories). The support service helps users identify the most suitable resources, tools and services, depending on their level of experience and requirements.
Cineca's consultancy service is vendor-independent and aims at solving a specific business (or industrial) problem. In the initial, experimental phase, it may be configured as a Proof-of-Concept developed to demonstrate the solution. The two main classes of business problems, which involve different Cineca staff expertise, are:
- how can a company benefit from data-driven applications (data value demonstration)
- how can a data-driven application be scaled (performance improvement)
In the first case, a Data Science consultancy is provided that can cover any phase of a data value extraction process, with the aim to facilitate the application of Data Science methods and best practices and their uptake by the companies.
Depending on the level of expertise of the customer, Data Science consulting may be limited to the suggestion of the most suitable tools, or to the implementation of a specific phase in the data analytics process (e.g. Machine Learning), up to the implementation of the whole process of data value extraction (business understanding, data acquisition and exploration, quality assessment, data transformation, visualization and analysis, model building, model evaluation, deployment).
This consulting service is cross-domain and covers heterogeneous data sources, including textual, audio and video data. In particular, customers are supported in:
- Collection, preparation, annotation, metadata management, repository, curation, linking, sharing, security, access control, long-term preservation, post-processing
- Pre-processing: data acquisition and integration, data anonymization, data exploration, outlier detection, missing data estimation, feature extraction, dimensionality reduction, structuring and cleaning data, data transformation, semantic metadata generation
- Machine Learning (supervised and unsupervised learning): clustering techniques (partitioning methods, hierarchical clustering, fuzzy clustering, density-based clustering, model-based clustering), association rules, sequential patterns, recommenders, graph analysis, predictive modelling (decision trees, random forests, neural networks, DNN, KNN, SVM, naive Bayes, regression, regression trees, symbolic regression), econometric models, indicators, Natural Language Processing, Named Entity Recognition, Information Extraction, Sentiment Analysis, text mining, semantic annotation, speaker segmentation, Automatic Speech Recognition, keyframe extraction, image recognition
- Remote visualization, computer vision and visual computing, computer graphics, 3D modelling and rendering, immersive device programming, render farm service, Virtual Reality, Augmented Reality
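Two of the pre-processing steps listed above, missing data estimation and outlier detection, can be illustrated with a minimal sketch. The techniques used here (mean imputation and a z-score rule) are simple textbook choices picked for illustration, not a description of the methods Cineca applies. The function names, the sample readings and the threshold are hypothetical.

```python
from statistics import mean, pstdev


def impute_missing(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def zscore_outliers(values, threshold=3.0):
    """Return the indices whose distance from the mean exceeds threshold * std."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values identical: nothing can be flagged
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Hypothetical sensor readings with one gap and one suspicious spike.
readings = [10.0, 11.0, None, 9.0, 10.0, 10.0, 11.0, 9.0, 50.0]
filled = impute_missing(readings)
print(filled[2])                                # 15.0
print(zscore_outliers(filled, threshold=2.5))   # [8]
```

Note that imputing before outlier detection lets the spike pull the fill value upwards; in practice the order of the two steps, and more robust estimators such as the median, are design choices to be evaluated on the data at hand.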
Technical consulting is provided when an application (or part of it) has already been developed and needs to be adapted to run on HPC resources to improve performance or scale up. It is provided by HPC technology experts in three main areas:
- Algorithm optimisation and parallelization, to make a customer application run faster
- Parallel AI applications, to improve the efficiency of model training on multiple compute units
- Containerization, the configuration and deployment of workflows for Big Data in containers
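Training a model on multiple compute units, the second area above, is commonly approached with data parallelism: each unit computes a gradient on its own shard of the data, and the gradients are averaged before the shared parameters are updated. The sketch below simulates this in pure Python with the "workers" run one after another; on a real cluster each shard would sit on a separate GPU or node and the averaging would be a collective operation such as all-reduce. The model (a single weight fitted to y = 3x), the data and all names are hypothetical.

```python
def shard_gradient(w, shard):
    """Gradient of the mean squared error of y = w * x on one data shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)


def data_parallel_step(w, shards, lr):
    """One gradient-descent step with per-shard gradients averaged."""
    # Each simulated worker handles its own shard.
    grads = [shard_gradient(w, shard) for shard in shards]
    # Averaging stands in for all-reduce; with equal-size shards it
    # equals the gradient computed on the full batch.
    avg_grad = sum(grads) / len(grads)
    return w - lr * avg_grad


# Synthetic data generated from y = 3x, split across two simulated workers.
data = [(float(x), 3.0 * x) for x in range(1, 9)]
shards = [data[:4], data[4:]]

w = 0.0
for _ in range(200):
    w = data_parallel_step(w, shards, lr=0.01)
print(round(w, 6))  # 3.0
```

Because the shards have equal size, one data-parallel step gives the same update as a single-worker step on the whole dataset; the benefit on real hardware is that the per-shard gradients are computed concurrently.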
The training service aims at making the company autonomous in the development of data-driven applications and in the usage of HPC resources.