GLUE Schema for the Grid Storage Service

version 1.1

FINAL

Last revision: 12 March 2003


1. Terms and concepts

Definition 1. Storage Space: portion of a logical storage space identified by:

  1. an association to a directory of the underlying file system (e.g. /permanent/CMS)
    (users can manage all of the assigned quota as they like, e.g. by creating subdirectories)
  2. a set of policies (MaxFileSize, MinFileSize, MaxData, MaxNumFiles, MaxPinDuration, Quota)
  3. an association to a VO plus a set of VO-specific authorization policies (e.g. to give some users priority over others)

When policies assigned to parent directories overlap, the "nearest" one in the hierarchy applies (e.g. / is assigned a MaxFileSize of 1 GB; /permanent is assigned a MaxFileSize of 1.5 GB; /permanent/CMS has no assignment; for a file to be stored in /permanent/CMS, the maximum file size is 1.5 GB).
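The "nearest one" rule can be illustrated with a minimal Python sketch. The policy table, the function name effective_policy and the use of decimal gigabytes are assumptions made for illustration only; the schema does not prescribe how policies are stored or looked up.

```python
from pathlib import PurePosixPath

GB = 10**9  # assumption: decimal gigabytes; the schema does not say which

# Hypothetical per-directory policy table mirroring the example in the text.
POLICIES = {
    "/": {"MaxFileSize": 1 * GB},
    "/permanent": {"MaxFileSize": 3 * GB // 2},  # 1.5 GB
    "/permanent/CMS": {},  # no MaxFileSize assigned here
}


def effective_policy(directory, name, policies=POLICIES):
    """Return policy `name` from the nearest directory that assigns it."""
    path = PurePosixPath(directory)
    # Check the directory itself first, then each parent up to the root.
    for candidate in (path, *path.parents):
        value = policies.get(str(candidate), {}).get(name)
        if value is not None:
            return value
    return None  # no directory in the hierarchy assigns this policy


print(effective_policy("/permanent/CMS", "MaxFileSize"))  # 1500000000
```

Here /permanent/CMS assigns nothing, so the lookup falls through to /permanent, which is nearer than /.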

Definition 2. Permanent file type: a file stored in a Storage Space that can be removed only by the owner or by the system administrator [1]

Definition 3. Volatile file type: a file stored in a Storage Space that can be removed by the Storage Service when space is needed. A volatile file is pinned in the cache for a certain “lifetime” period. The length of the “lifetime” is the choice of the Storage Service Administrator or the Storage Service’s policy. Usually, a file is expected to be “released” or “unpinned” by the client before its lifetime expires. Provisions can be made for extending the pinning of a file, but we felt that honoring pinning extension requests should be an implementation choice as well. [1]

Definition 4. Durable file type: a file stored in a Storage Space that is intended to be removed as soon as possible, but should not be deleted by the Storage Service. It has a “lifetime” associated with it (perhaps longer than that of a volatile file), but when its lifetime expires a system administrator is alerted. Like a permanent file, it can be removed only by the owner or the administrator. Thus, the concept of a “durable” file has the features of both volatile and permanent files.

The need for a “durable” file status was inspired by the scenario in which files generated by some compute resource need to be stored temporarily in a shared space before they are archived. Normally, the files are stored in the shared space as “durable”, and then scheduled to be archived on some other archival storage system. After the files are archived, they are released either automatically by the archiving Storage Service or by the client. If the client neglects to release them, an administrator is alerted when the lifetime expires. [1]

Definition 5. Storage Service:

  1. a grid service, identified by a URI, that manages disk and tape resources in terms of Storage Spaces.
  2. each Storage Space is associated with a Virtual Organization and a set of VO-specific policies (syntax and semantics of these to be defined).
  3. all hardware details are masked.
  4. the Storage Service performs file transfers into or out of its Storage Spaces using a specified set of third-party data movement services (e.g. GridFTP).
  5. files are managed according to the lifetime policy specified for the Storage Space where they are kept; a specific date-and-time lifetime can be specified for each file, and it is applied subject to the following compatibility rules:

     File type    Date&Time <= current time    Date&Time > current time
     Permanent    File can be removed          File is kept
     Durable      File can be removed          File is kept
     Volatile     File can be removed          File can be removed

This allows better management of the space: space can be freed "automatically" when needed, but the assigned policies cannot be broken (e.g. by keeping a file longer than allowed).
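The compatibility rules for per-file lifetimes can be expressed as a small Python sketch. The function name can_be_removed and the use of timezone-aware datetimes are assumptions; the sketch encodes only the table above and does not model who is authorized to remove a file (see Definitions 2 to 4).

```python
from datetime import datetime, timezone

FILE_TYPES = ("Permanent", "Durable", "Volatile")


def can_be_removed(file_type, lifetime, now=None):
    """Apply the compatibility table to one file.

    `lifetime` is the per-file date and time; `now` defaults to the
    current UTC time (hypothetical parameter, useful for testing).
    """
    if file_type not in FILE_TYPES:
        raise ValueError(f"unknown file type: {file_type!r}")
    now = now or datetime.now(timezone.utc)
    if file_type == "Volatile":
        return True  # removable in both columns of the table
    # Permanent and Durable files are kept until their date and time is reached.
    return lifetime <= now


now = datetime(2003, 3, 12, tzinfo=timezone.utc)
expired = datetime(2003, 1, 1, tzinfo=timezone.utc)
pending = datetime(2003, 6, 1, tzinfo=timezone.utc)
print(can_be_removed("Durable", expired, now))   # True
print(can_be_removed("Durable", pending, now))   # False
print(can_be_removed("Volatile", pending, now))  # True
```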

Examples of Storage Service "back-ends": SRM, CASTOR, JASMine, Enstore

2. Attributes description

StorageService:

UniqueID               unique ID of the storage service (URI)
Name                   a name for the service, does not need to be unique
InformationServiceURL  URL of the local information service providing info about this entity
Port                   port number that the service listens on

 

State

CurrentIOLoad  system load (e.g. number of files in the queue), normalized to the interval [0,1]

 

Policy

MaxFileSize     maximum size for any single file (bytes)
MinFileSize     minimum size for any single file (bytes)
MaxData         maximum amount of data that may be stored by one job (bytes)
MaxNumFiles     maximum number of files that may be stored by one job
MaxPinDuration  maximum number of seconds a file may be pinned
Quota           assigned space (bytes); 0 means infinite (e.g. can be used in tape systems)
FileLifeTime    lifetime policy to be applied to the contained files (Permanent, Durable or Volatile)
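As an illustration of how the Policy attributes might be consumed, the following Python sketch models them as a record and checks a candidate file against the size limits and the quota. The class layout, the helper name admits and its used_bytes argument are hypothetical; only the attribute meanings (units, and Quota = 0 meaning infinite) come from the schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    max_file_size: int     # MaxFileSize, bytes
    min_file_size: int     # MinFileSize, bytes
    max_data: int          # MaxData, bytes per job
    max_num_files: int     # MaxNumFiles, files per job
    max_pin_duration: int  # MaxPinDuration, seconds
    quota: int             # Quota, bytes; 0 means infinite
    file_life_time: str    # FileLifeTime: Permanent, Durable or Volatile


def admits(policy, file_size, used_bytes):
    """Hypothetical check: does one file of `file_size` bytes fit the policy?"""
    if not policy.min_file_size <= file_size <= policy.max_file_size:
        return False
    if policy.quota == 0:
        return True  # infinite quota, e.g. a tape system
    return used_bytes + file_size <= policy.quota


tape = Policy(max_file_size=2 * 10**9, min_file_size=1, max_data=10**10,
              max_num_files=100, max_pin_duration=3600, quota=0,
              file_life_time="Permanent")
print(admits(tape, 10**9, used_bytes=10**15))  # True
```

The per-job limits (MaxData, MaxNumFiles) are carried in the record but not checked here, since the schema does not define how jobs are accounted.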

 

AccessProtocol:

  Protocol-specific information

Type               NFS, AFS, GridFTP, RFIO, etc.
Port               control port number for this protocol
Version            the protocol version
AccessTime
SupportedSecurity  the security features the protocol can deal with

 

Storage Space:

State

UsedSpace       used space (kilobytes)
AvailableSpace  available space (kilobytes), might be shared

 

AccessControlBase

Rule[*]  list of access control rules (syntax to be defined; can be used for per-user or per-VO policies)

The StorageLibrary is used to describe hardware resources.

StorageLibrary:

the machine providing the storage service

Name                   the name of the storage library
InformationServiceURL  URL of the local information service providing info about this entity
UniqueID               the unique ID of the storage library

The file system class can be specialized as REMOTE (for a remote directory mounted locally) or LOCAL (for a local directory); each local file system can contain directories. Each directory can be associated with a Storage Space.

FileSystem

Root            path name or other information defining the root of the file system
Name            the name for the file system
Type            the file system type (e.g. NFS, AFS)
ReadOnly        is the file system read-only?
Size            total space assigned for this file type (MB)
AvailableSpace  total available space for this file type (MB)
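The relationship between file systems, directories and Storage Spaces can be sketched as a small Python data model. This is only one possible reading of the text: the class names, the fs_type field name, and the use of a plain string to identify a Storage Space are assumptions, not part of the schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Directory:
    path: str
    storage_space: Optional[str] = None  # hypothetical Storage Space identifier


@dataclass
class FileSystem:
    root: str                # Root
    name: str                # Name
    fs_type: str             # Type, e.g. NFS, AFS
    read_only: bool          # ReadOnly
    size_mb: int             # Size (MB)
    available_space_mb: int  # AvailableSpace (MB)


@dataclass
class RemoteFileSystem(FileSystem):
    """REMOTE specialization: a remote directory mounted locally."""


@dataclass
class LocalFileSystem(FileSystem):
    """LOCAL specialization: may contain directories."""
    directories: List[Directory] = field(default_factory=list)


def directories_of(fs, storage_space):
    """Return the paths in `fs` associated with the given Storage Space."""
    return [d.path for d in fs.directories if d.storage_space == storage_space]


fs = LocalFileSystem("/", "disk0", "ext3", False, 1000, 250,
                     [Directory("/permanent/CMS", "cms-space"),
                      Directory("/tmp")])
print(directories_of(fs, "cms-space"))  # ['/permanent/CMS']
```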

 

File

Name          name for the file
Size          file size in bytes
CreationDate  file creation date and time
LastModified  last modified date and time
LastAccessed  last access date and time
Latency       time taken to access the file, in seconds
LifeTime      date and time after which the file can be removed
Owner         name of the owner of the file

 

Performance

MaximumIOCapacity  maximum possible bandwidth out of the service onto the network (may be limited by the network, disk or tape bandwidth, i.e. by the hardware used)

 

Architecture:

Type  storage hardware type, e.g. disk, disk array, tape system

Open issue: where do attributes such as the number of files in the HPSS tape queue belong?

Open issue: how to describe an SE Service that is basically a gateway to multiple other SE Services at a given site?

UML Storage Service Class Diagram

GLOSSARY

DRM Disk cache system
TRM Tape archiving system
HRM Hierarchical Storage System
SRM Storage Resource Manager
   
   

REFERENCES

[1] SRM Joint Functional Design. Summary of Recommendations. January 2002
[2] The Storage Resource Manager Interface Specification. Version 2.0
[3] CERN Advanced STORage Manager (CASTOR)
[4] Jefferson Lab Asynchronous Storage Manager (JASMine)
[5] Enstore
[6] HPSS