TeraLab offers a secure, neutral, and trusted data infrastructure for processing and sharing data. The service includes setting up highly customized virtual machines (CPU, RAM, storage, etc.), installing the necessary software and tools, and providing services such as access control, monitoring, data storage and backup, and account management. These virtual machines, or workspaces, come in three levels of autonomy (administrated, autonomous, and turnkey), depending on the project needs, the client’s level of expertise, and the type of data (e.g. sensitive or confidential).
The environment that TeraLab provisions for a given project is called a workspace: a dedicated network of virtual servers. The workspace is highly customizable. The main parameters are:
The number of virtual servers (VMs, or virtual machines)
For each VM:
- The CPU, RAM, and hot storage sizes (hot storage is fast storage used during execution). Upon request, Ceph distributed storage can be provided in any of its three modes (block, file, or object).
- The operating system (OS): Linux is required (Ubuntu, Debian, CentOS, and Red Hat are supported).
- A set of software, as specified by the data scientist end user.
- At least one project account (the “ProjAdmin” account).
- When needed, the sizing of GPU resources: number of boards, number of cores, RAM size
Cold storage size: the size is specified for each storage group
The firewall rules and the proxy/reverse proxy configurations
The workspace type, specified during the preparation phase of the project: administrated, autonomous, or turnkey.
- Autonomous workspace: this offer targets projects whose team members are skilled in system administration. The principle is to delegate a significant portion of the administration activity to these project members, called ProjAdmins.
- Turnkey workspace: this offer targets projects with insufficient system administration skills. Operating and securing the workspace are fully handled by the TeraLab team. Users legally commit to strictly follow the procedure when extracting and exporting data from the workspace.
- Administrated workspace: this offer targets projects requiring a high level of assurance that project data will stay within the workspace. Beyond the legal commitment made by the users, technical means are implemented to prevent data extraction. This offer provides the core framework for the “Personal Health Data” use cases.
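To make the parameter list above concrete, here is a minimal Python sketch of how a workspace request could be modelled and checked before provisioning. It is an illustration only, not TeraLab's actual tooling or API: every class and field name (`WorkspaceSpec`, `VmSpec`, `GpuSpec`, `hot_storage_gb`, and so on) is hypothetical, sizes are assumed to be in gigabytes, and the firewall rules are kept as free-form strings.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional


class WorkspaceType(Enum):
    """The three workspace types named in the text."""
    ADMINISTRATED = "administrated"
    AUTONOMOUS = "autonomous"
    TURNKEY = "turnkey"


# Linux distributions listed as supported in the text (identifiers are assumed).
SUPPORTED_OS = {"ubuntu", "debian", "centos", "redhat"}


@dataclass
class GpuSpec:
    """Optional GPU sizing: boards, cores, and RAM (GB is an assumed unit)."""
    boards: int
    cores: int
    ram_gb: int


@dataclass
class VmSpec:
    """Per-VM parameters; all field names are hypothetical."""
    name: str
    cpu_cores: int
    ram_gb: int
    hot_storage_gb: int
    os_name: str
    software: List[str] = field(default_factory=list)
    accounts: List[str] = field(default_factory=lambda: ["ProjAdmin"])
    gpu: Optional[GpuSpec] = None

    def validate(self) -> None:
        # The OS must be one of the supported Linux distributions.
        if self.os_name.lower() not in SUPPORTED_OS:
            raise ValueError(f"{self.name}: unsupported OS {self.os_name!r}")
        # Each VM needs at least the ProjAdmin project account.
        if "ProjAdmin" not in self.accounts:
            raise ValueError(f"{self.name}: missing ProjAdmin account")


@dataclass
class WorkspaceSpec:
    """Workspace-level parameters; all field names are hypothetical."""
    workspace_type: WorkspaceType
    vms: List[VmSpec]
    # Cold storage size per storage group: group name -> size in GB.
    cold_storage_gb: Dict[str, int] = field(default_factory=dict)
    firewall_rules: List[str] = field(default_factory=list)

    def validate(self) -> None:
        if not self.vms:
            raise ValueError("a workspace needs at least one VM")
        for vm in self.vms:
            vm.validate()


# Example request: a turnkey workspace with a single Ubuntu VM.
spec = WorkspaceSpec(
    workspace_type=WorkspaceType.TURNKEY,
    vms=[
        VmSpec(
            name="analysis-01",
            cpu_cores=8,
            ram_gb=32,
            hot_storage_gb=200,
            os_name="Ubuntu",
            software=["git", "jupyterlab", "scikit-learn"],
        )
    ],
    cold_storage_gb={"raw-data": 1000},
    firewall_rules=["allow ssh from project VPN"],
)
spec.validate()  # raises ValueError if the request breaks a stated constraint
```

Keeping validation next to the data means a request that names an unsupported OS, or a VM without a ProjAdmin account, is rejected before any resource is allocated.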
Whenever possible, TeraLab favors the most popular applications in the field as well as free/libre software. This deliberate policy allows users to become familiar with their environment quickly and to draw on many tutorials and support resources. Examples include SSH access to virtual servers, git, JupyterLab or Jupyter Notebook, scikit-learn/Python, and Hadoop/Spark.
This policy does not exclude the installation of non-free software; in that case, the user is asked to provide the commercial licenses.
SPECIAL ACCESS CONDITIONS
Setting up a workspace to share and process industrial data from different sources (the client, its providers, and open data) in order to optimise predictive maintenance models.