Data storage is a major challenge in large-scale computing. Growth in data volumes and computing power requires a matching increase in the performance and reliability of data storage systems. The amount of data currently stored and processed at CNAF is in the tens of petabytes and is expected to double in the next 5 years.
The data storage infrastructure at CNAF is based on industry standards, both at the physical interconnection level, which uses a Storage Area Network based on the Fibre Channel protocol, and at the logical data access level, which uses high-performance clustered parallel file systems (typically one per major experiment).
This approach has made it possible to build a high-performance data access system that is fully redundant at the hardware level.
Magnetic tapes are used as the main long-term storage medium. To access the files, special robotic arms find the right tapes and load them into the tape drives. More than 50 million files of experimental data are stored in the data centre.
Both disk and tape space are organized as a Hierarchical Mass Storage system managed by GEMSS, the Grid Enabled Mass Storage System, an in-house integration of the IBM General Parallel File System (GPFS) with the IBM Tivoli Storage Manager (TSM). An Oracle clustered database infrastructure is deployed for storing and retrieving relational data.
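To illustrate the general idea of a hierarchical arrangement of disk and tape, the following is a minimal toy sketch in Python. It is not GEMSS code and does not reflect the actual GPFS or TSM policies: the class names (HsmFile, ToyHsm), the least-recently-used migration rule and the capacity figures are all assumptions made for illustration. It only shows the two basic movements in such a system: migrating files from disk to tape when disk space runs short, and recalling a file from tape when it is read.

```python
from dataclasses import dataclass, field


@dataclass
class HsmFile:
    """Hypothetical record for one file in the toy hierarchy."""
    name: str
    size: int               # size in arbitrary units
    last_access: int        # logical timestamp of the last read or write
    on_disk: bool = True    # data currently resident on the disk tier
    on_tape: bool = False   # a copy exists on the tape tier


@dataclass
class ToyHsm:
    """Toy two-tier store; the LRU migration rule is an assumption."""
    disk_capacity: int
    files: dict = field(default_factory=dict)
    clock: int = 0

    def disk_used(self) -> int:
        return sum(f.size for f in self.files.values() if f.on_disk)

    def write(self, name: str, size: int) -> None:
        self.clock += 1
        self.files[name] = HsmFile(name, size, self.clock)
        self._free_space(keep=name)

    def read(self, name: str) -> str:
        self.clock += 1
        f = self.files[name]
        recalled = not f.on_disk
        if recalled:
            f.on_disk = True  # recall: bring the data back from tape
        f.last_access = self.clock
        self._free_space(keep=name)
        return "recalled from tape" if recalled else "served from disk"

    def _free_space(self, keep: str) -> None:
        # Migrate the least recently used files to tape until the disk
        # tier fits within its capacity; the file named `keep` stays.
        candidates = sorted(
            (f for f in self.files.values() if f.on_disk and f.name != keep),
            key=lambda f: f.last_access,
        )
        for f in candidates:
            if self.disk_used() <= self.disk_capacity:
                break
            f.on_tape = True
            f.on_disk = False


if __name__ == "__main__":
    hsm = ToyHsm(disk_capacity=100)
    hsm.write("run1.dat", 60)
    hsm.write("run2.dat", 60)    # disk is over capacity, run1.dat migrates
    print(hsm.read("run1.dat"))  # run1.dat is no longer on disk
    print(hsm.read("run1.dat"))  # now resident on disk again
```

In a real deployment the migration and recall decisions are driven by the file system and storage manager policies rather than by a simple rule like the one assumed here.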
Disk and tape storage services, together with the data transfer services, are operated by the Data Storage group of the Tier-1 data centre unit.