During CHEP2003 the following people discussed the current status of the Glue
Schema for the Storage Element in light of the development of the SRM project.
The discussion took place in several rounds with different participants. The
full list of participants is:
Sergio Andreozzi (CNAF, DataTAG WP4)
Brian Tierney (LBL, Berkeley)
Arie Shoshani (LBL, Berkeley)
Don Petravick (FermiLAB)
Michael Ernst (FermiLAB)
Olof Barring (CERN)
John Gordon (RAL)
Andy Kowalski (JLAB)
This is the list of the discussed issues and proposed solutions:
- STORAGE SPACE
- new attribute OWNER (unique id of the VO, group of users or single user
who requested and was granted a storage space); the suggestion is to publish
the info directly on a coarse-grained basis (e.g. per-VO or per-group) and to
gather the fine-grained info using an API. This mostly applies when the
Storage Service is an SRM.
- ACCESS PROTOCOL
- for scalability reasons, the head nodes of a storage system are replicated
so that it can manage more parallel connections (FermiLab case study presented
by Don Petravick); this raises the need to publish not only the data
access protocol port and type, but also the related hostname (head node).
Since an SRM might balance the load over the available connections, we need to
separate the two cases:
- in case of SRM, only the type, version and supported security of the access
protocol should be advertised; the SRM will select the one to be used
by a given user dynamically, depending on the load of the system
- in case of no SRM, the complete list of available protocol, port, version,
security and hostname must be advertised
- in this case we need to add a HostUniqueID attribute to the access protocol
class
- ACCESSTIME: do we need this attribute? How do we compute this?
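The two ACCESS PROTOCOL cases above can be sketched roughly as follows. This
is only an illustration in Python, not part of the Glue Schema: the class
AccessProtocol, its field names and the function advertised_attributes are
hypothetical, and only HostUniqueID is an attribute name proposed in these
notes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AccessProtocol:
    """Hypothetical record describing one access protocol endpoint."""
    type: str                              # protocol type
    version: str
    security: str                          # supported security
    port: Optional[int] = None
    host_unique_id: Optional[str] = None   # proposed HostUniqueID (head node)


def advertised_attributes(protocol: AccessProtocol, behind_srm: bool) -> dict:
    """Return the attributes that would be published for one protocol."""
    published = {
        "Type": protocol.type,
        "Version": protocol.version,
        "Security": protocol.security,
    }
    if not behind_srm:
        # Without an SRM nothing selects the endpoint for the user,
        # so port and head node must be published as well.
        if protocol.port is None or protocol.host_unique_id is None:
            raise ValueError("port and HostUniqueID are required without an SRM")
        published["Port"] = protocol.port
        published["HostUniqueID"] = protocol.host_unique_id
    return published
```

With an SRM in front, only Type, Version and Security are returned; without
one, a missing port or head node is treated as an error, since no component
would be able to choose the endpoint on the user's behalf.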
- STORAGE SPACE->POLICY
- the QUOTA assigned to a storage space can be of two types (A. Shoshani):
- GUARANTEED: the user will always be able to use up to the assigned quota
(the user might be charged for this)
- BEST EFFORT: the user might or might not find all the assigned space,
depending on the global load of the system
- need to split the QUOTA attribute into GUARANTEEDQUOTA and BESTEFFORTQUOTA
- the storage space can have a duration: need for a DURATION attribute (A.
Shoshani)
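A minimal sketch of how the split quota could be used, in Python. Everything
here beyond the attribute names GUARANTEEDQUOTA, BESTEFFORTQUOTA and DURATION
is an assumption made for illustration: the class and function names, the
units, and in particular the choice to count the best-effort quota on top of
the guaranteed one. The notes do not define how the two quotas combine.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StorageSpacePolicy:
    """Hypothetical policy record; units are left unspecified."""
    guaranteed_quota: int             # GUARANTEEDQUOTA
    best_effort_quota: int            # BESTEFFORTQUOTA (assumed to be extra space)
    duration: Optional[int] = None    # DURATION; None means "not set" (assumption)


def classify_request(policy: StorageSpacePolicy, used: int, size: int,
                     system_free: int) -> str:
    """Classify a request for `size` more space against the two quotas."""
    if used < 0 or size < 0:
        raise ValueError("used and size must be non-negative")
    total = used + size
    if total <= policy.guaranteed_quota:
        # Always honoured, whatever the load of the system.
        return "guaranteed"
    within_best_effort = total <= policy.guaranteed_quota + policy.best_effort_quota
    if within_best_effort and size <= system_free:
        # Honoured only because the system currently has room.
        return "best effort"
    return "rejected"
```

The same request can therefore succeed or fail at different times in the
best-effort range, which is the behaviour described for BEST EFFORT above.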
- STORAGE SPACE->STATE
- when the StorageService is just file system access, we need a USEDSPACE
attribute for the space in order to understand how much of it the user owning
the space is using; if all users are mapped to the same local account,
available space and quota are the same for everyone;
(already present in version 1.1)
- STORAGE SERVICE:
- we need to add the following attributes:
- TYPE: (e.g. SRM, file system,...)
- VERSION
- SupportedFileLifeTime (Volatile, Durable, Permanent)
- Don Petravick suggested adding a State attribute for the system that can
be either:
- a three state attribute (idle, moderately busy, busy)
- an integer in the range of [0,100]
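The two proposed forms of the State attribute are related: an integer value
can always be reduced to the three-state form. The Python sketch below shows
one way to do that. The interpretation of the integer as a load level, the
names ServiceState and state_from_load, and the two thresholds are all
assumptions chosen for illustration; the notes do not specify them.

```python
from enum import Enum


class ServiceState(Enum):
    IDLE = "idle"
    MODERATELY_BUSY = "moderately busy"
    BUSY = "busy"


def state_from_load(load: int, moderate_from: int = 34,
                    busy_from: int = 67) -> ServiceState:
    """Map an integer in [0, 100] onto the three-state attribute.

    The default thresholds are arbitrary and only split the range
    into three roughly equal parts.
    """
    if not 0 <= load <= 100:
        raise ValueError("load must be in the range [0, 100]")
    if load >= busy_from:
        return ServiceState.BUSY
    if load >= moderate_from:
        return ServiceState.MODERATELY_BUSY
    return ServiceState.IDLE
```

Publishing the integer keeps more information and leaves the thresholds to
the consumer, while publishing the three-state value fixes them at the
source.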
- STORAGE LIBRARY:
- the name is not appropriate, since it usually suggests an MSS; in this
context, the related entity is the front-end machine offering the service
- proposal: StorageAccessNode (other ideas?)
- ARCHITECTURE attribute, what are the possible values?
- disk, tape
- Castor, ...
- the FILE class
- a lively discussion took place regarding this element; I try to summarize
the main outcome: the Glue Schema models entities whose instance
descriptions are accessible through the Grid Information Service (GIS);
the presence of the File class is misleading with respect to this general
view. Nobody wants to publish the metadata related to file instances in the
GIS, but this is info that should be available through an API; this leads
to the distinction between:
- materialized objects: the ones that are published by the GIS
- virtual objects: the ones that should be modelled, but are accessible
through an API
- we need to find a way to separate these two situations in order to
avoid confusion:
- SOLUTION A: move the File class to a different schema
- SOLUTION B: use different colors for those classes whose object instances
will be published by the GIS and those classes whose object instances
will be accessed through an API; in this case the API should be advertised
by the GIS
- StorageService->CurrentIOLoad
- (F. Carminati) "CurrentIOLoad is very ill-defined. By the name
I would guess it is MB/s. By the way, which queue?"
Last revision: 02/04/2003