The GRID will be
populated by a large number of Computing Services (CE) and Storage
Services (SE), each identified by a unique ID. A service request (job
execution) submitted to a resource broker may require an input file for the
job. The file is given a logical name (LFN), and its replicas can be spread all
over the GRID across several storage services.
Once the broker gets the list of all physical replicas and their locations, it
must bind one of them to a suitable computing service for the
specified job. One possible strategy is to get a list of all CEs that are
accessible by the service requestor (authorization) and that match the job
requirements. Once this list is created and ordered by some rank parameter,
we would like to select the best CE (top of the list) that has access to the best
replica.
Selecting the best replica of a file for a given CE is not at all a trivial issue. The choice could be made on:
These parameters cannot currently be gathered from the MDS. Therefore, the temporary solution for providing a matching criterion is to explicitly publish a CE-SE binding, which the broker uses to perform the needed matching.
For each CE, there is a two-level binding advertisement. At the first level (Group Level), a list of SE unique IDs is provided. At the second level (Single Level, one entry for each SE), specific CE-SE attributes can be provided. At the moment, the only attribute published at the Single Level is the mount point on the CE. It applies when the SE is locally accessible by jobs running on that CE.
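The matching strategy and the two-level binding described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the broker's actual implementation: the function name select_ce, the dictionary layout, and the example IDs are hypothetical, and the ranked CE list and the set of SEs holding a replica are assumed to be already available. Since the parameters for ranking replicas cannot be gathered yet, the sketch simply takes the first bound SE that holds a replica.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical in-memory form of the two-level binding:
#   Group Level:  the keys of the inner dict are the SE unique IDs bound to a CE
#   Single Level: the value is the CE mount point for that SE, or None if the
#                 SE is not locally accessible from the CE
Bindings = Dict[str, Dict[str, Optional[str]]]


def select_ce(ranked_ces: List[str],
              replica_ses: Set[str],
              bindings: Bindings) -> Optional[Tuple[str, str, Optional[str]]]:
    """Return (CE, SE, mount point) for the best-ranked CE that is bound to
    an SE holding a replica, or None if no CE qualifies."""
    for ce in ranked_ces:  # the list is assumed to be ordered best-first
        for se, mount_point in bindings.get(ce, {}).items():
            if se in replica_ses:
                return ce, se, mount_point
    return None


# Illustrative data only; these IDs are made up.
example_bindings: Bindings = {
    "ce1.example.org:2119/jobmanager-pbs-short": {"se9.example.org": None},
    "ce2.example.org:2119/jobmanager-pbs-long": {
        "se1.example.org": "/shared",
        "se2.example.org": None,
    },
}
example_ranked = [
    "ce1.example.org:2119/jobmanager-pbs-short",
    "ce2.example.org:2119/jobmanager-pbs-long",
]
example_choice = select_ce(example_ranked, {"se1.example.org"}, example_bindings)
print(example_choice)
```

In this made-up data the top-ranked CE is skipped because none of its bound SEs holds a replica, so the second CE is selected together with its mount point.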
For a practical understanding of this solution, consider the following LDIF example. In this example, the CE whose unique ID is grid006f.cnaf.infn.it:2119/jobmanager-pbs-workq is bound to the SEs edt004.cnaf.infn.it and grid025.pd.infn.it. The first one is accessible through an NFS-mounted directory on the CE called /shared:
dn: GlueCESEBindGroupCEUniqueID=grid006f.cnaf.infn.it:2119/jobmanager-pbs-workq, mds-vo-name=local, o=grid
objectclass: GlueGeneralTop
objectclass: GlueSchemaVersion
GlueSchemaVersionMajor: 0
GlueSchemaVersionMinor: 1
objectclass: GlueCESEBindGroup
GlueCESEBindGroupCEUniqueID: grid006f.cnaf.infn.it:2119/jobmanager-pbs-workq
GlueCESEBindGroupSEUniqueID: edt004.cnaf.infn.it
GlueCESEBindGroupSEUniqueID: grid025.pd.infn.it
dn: GlueCESEBindSEUniqueID=edt004.cnaf.infn.it, GlueCESEBindGroupCEUniqueID=grid006f.cnaf.infn.it:2119/jobmanager-pbs-workq, mds-vo-name=local, o=grid
objectclass: GlueGeneralTop
objectclass: GlueSchemaVersion
GlueSchemaVersionMajor: 0
GlueSchemaVersionMinor: 1
objectclass: GlueCESEBind
GlueCESEBindCEUniqueID: grid006f.cnaf.infn.it:2119/jobmanager-pbs-workq
GlueCESEBindSEUniqueID: edt004.cnaf.infn.it
GlueCESEBindCEAccesspoint: /shared
dn: GlueCESEBindSEUniqueID=grid025.pd.infn.it, GlueCESEBindGroupCEUniqueID=grid006f.cnaf.infn.it:2119/jobmanager-pbs-workq, mds-vo-name=local, o=grid
objectclass: GlueGeneralTop
objectclass: GlueSchemaVersion
GlueSchemaVersionMajor: 0
GlueSchemaVersionMinor: 1
objectclass: GlueCESEBind
GlueCESEBindCEUniqueID: grid006f.cnaf.infn.it:2119/jobmanager-pbs-workq
GlueCESEBindSEUniqueID: grid025.pd.infn.it
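As an illustration of how a client might consume these entries, here is a minimal Python sketch that reads the binding attributes from LDIF text into a dictionary. It is an assumption-laden sketch rather than a real LDIF parser: it ignores line folding, base64 values, and comments, the function names parse_entries and build_bindings are hypothetical, and the embedded LDIF is an abbreviated copy of the example above.

```python
from collections import defaultdict

CE_ID = "grid006f.cnaf.infn.it:2119/jobmanager-pbs-workq"

# Abbreviated copy of the example entries (schema version lines omitted).
LDIF_TEXT = """\
dn: GlueCESEBindGroupCEUniqueID=%(ce)s, mds-vo-name=local, o=grid
objectclass: GlueCESEBindGroup
GlueCESEBindGroupCEUniqueID: %(ce)s
GlueCESEBindGroupSEUniqueID: edt004.cnaf.infn.it
GlueCESEBindGroupSEUniqueID: grid025.pd.infn.it
dn: GlueCESEBindSEUniqueID=edt004.cnaf.infn.it, GlueCESEBindGroupCEUniqueID=%(ce)s, mds-vo-name=local, o=grid
objectclass: GlueCESEBind
GlueCESEBindCEUniqueID: %(ce)s
GlueCESEBindSEUniqueID: edt004.cnaf.infn.it
GlueCESEBindCEAccesspoint: /shared
dn: GlueCESEBindSEUniqueID=grid025.pd.infn.it, GlueCESEBindGroupCEUniqueID=%(ce)s, mds-vo-name=local, o=grid
objectclass: GlueCESEBind
GlueCESEBindCEUniqueID: %(ce)s
GlueCESEBindSEUniqueID: grid025.pd.infn.it
""" % {"ce": CE_ID}


def parse_entries(text):
    """Split LDIF text into entries, each mapping a lowercased attribute
    name to its list of values. A new entry starts at every dn: line."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        name, _, value = line.partition(":")  # split on the first colon only
        name, value = name.strip().lower(), value.strip()
        if name == "dn":
            entries.append(defaultdict(list))
        if entries:  # skip anything that appears before the first dn: line
            entries[-1][name].append(value)
    return entries


def build_bindings(entries):
    """Return {CE unique ID: {SE unique ID: mount point or None}}."""
    bindings = {}
    for entry in entries:
        classes = {c.lower() for c in entry.get("objectclass", [])}
        if "gluecesebindgroup" in classes:  # Group Level: list of bound SEs
            ce = entry["gluecesebindgroupceuniqueid"][0]
            for se in entry["gluecesebindgroupseuniqueid"]:
                bindings.setdefault(ce, {}).setdefault(se, None)
        elif "gluecesebind" in classes:  # Single Level: per-SE attributes
            ce = entry["gluecesebindceuniqueid"][0]
            se = entry["gluecesebindseuniqueid"][0]
            mount = entry.get("gluecesebindceaccesspoint", [None])[0]
            bindings.setdefault(ce, {})[se] = mount
    return bindings


parsed_bindings = build_bindings(parse_entries(LDIF_TEXT))
print(parsed_bindings)
```

The resulting dictionary keeps both levels together: the keys list the SEs bound to the CE, and the values carry the mount point where one is published.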
Who is in charge of publishing the CE-SE binding in the MDS?
This has not been decided yet. The temporary solution will probably be the CE GRIS. Later, we can consider an entity whose task is to monitor the GRID (e.g. per VO) and dynamically advertise the best CE-SE binding for each CE.
Queries by example:
I want to know the Unique IDs of all storage services bound to the CE whose Unique ID is grid006f.cnaf.infn.it:2119/jobmanager-pbs-workq
ldapsearch -h my.giis.hostname -p 2135 -b "mds-vo-name=myVO, o=grid" -x -LLL "(&(objectclass=GlueCESEBindGroup)(GlueCESEBindGroupCEUniqueID=grid006f.cnaf.infn.it:2119/jobmanager-pbs-workq))" GlueCESEBindGroupSEUniqueID
I want to know the Unique IDs of all CEs bound to the SE grid025.pd.infn.it
ldapsearch -h my.giis.hostname -p 2135 -b "mds-vo-name=myVO, o=grid" -x -LLL "(&(objectclass=GlueCESEBindGroup)(GlueCESEBindGroupSEUniqueID=grid025.pd.infn.it))" GlueCESEBindGroupCEUniqueID
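The two queries above differ only in the attribute that is matched and the attribute that is returned, so a client could assemble them programmatically. The following Python sketch builds the filter string and the ldapsearch argument list using only the options shown above. The helper names escape_filter_value, bind_group_filter, and ldapsearch_argv are hypothetical, and the escaping follows the usual LDAP filter convention of replacing special characters with a backslash and two hex digits. The sketch only constructs the command and does not run it.

```python
# Characters that must be escaped inside an LDAP filter value.
_SPECIAL = set("\\*()\0")


def escape_filter_value(value):
    """Escape filter metacharacters as a backslash plus two hex digits."""
    return "".join(
        "\\%02x" % ord(ch) if ch in _SPECIAL else ch for ch in value
    )


def bind_group_filter(attribute, value):
    """Build a filter that selects GlueCESEBindGroup entries by one attribute."""
    return "(&(objectclass=GlueCESEBindGroup)(%s=%s))" % (
        attribute, escape_filter_value(value))


def ldapsearch_argv(host, base, search_filter, wanted, port=2135):
    """Return the argument list for ldapsearch, using the options shown above."""
    return ["ldapsearch", "-h", host, "-p", str(port), "-b", base,
            "-x", "-LLL", search_filter, wanted]


# All CEs bound to a given SE (the host and base DN are the placeholders used above).
ce_query = ldapsearch_argv(
    "my.giis.hostname",
    "mds-vo-name=myVO, o=grid",
    bind_group_filter("GlueCESEBindGroupSEUniqueID", "grid025.pd.infn.it"),
    "GlueCESEBindGroupCEUniqueID",
)
print(ce_query)
```

Passing the list to a process launcher without a shell avoids quoting problems with the parentheses and ampersand in the filter.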
References
[1] Glue CESEBind schema.
http://cvs.infn.it/cgi-bin/cvsweb.cgi/datatag-glue/glue-schemas/Glue-CESEBind.schema
Let us explain this by example. The CE grid006f.cnaf.infn.it:2119/jobmanager-pbs-workq can submit jobs to a set of worker nodes identified by the subcluster grid006f.cnaf.infn.it. All of them have an NFS-mounted directory locally named /shared/permanent. The remote directory is /permanent, and it is provided by a Storage Library whose Unique ID is edt004.cnaf.infn.it. This Storage Library also provides a Storage Service whose Unique ID is edt004.cnaf.infn.it:7777.
Here is a picture:
The current choice in the Glue modelling is to separate services from the underlying systems, so the schema can describe this situation directly.
The SE service edt004.cnaf.infn.it:7777 is provided by the Storage Library edt004.cnaf.infn.it:
dn: GlueSEUniqueID=edt004.cnaf.infn.it, mds-vo-name=local, o=grid
objectclass: GlueSETop
objectclass: GlueSchemaVersion
GlueSchemaVersionMajor: 0
GlueSchemaVersionMinor: 1
objectclass: GlueSE
GlueSEUniqueID: edt004.cnaf.infn.it
GlueSEName: edt004.cnaf.infn.it
GlueSEPort: 7777
GlueSEHostingSL: edt004.cnaf.infn.it
The Storage Library edt004.cnaf.infn.it has a local directory called /permanent that can be NFS mounted:
dn: GlueSLLocalFileSystemName=/permanent, GlueSLUniqueID=edt004.cnaf.infn.it, mds-vo-name=local, o=grid
objectclass: GlueSLTop
objectclass: GlueSchemaVersion
GlueSchemaVersionMajor: 0
GlueSchemaVersionMinor: 1
objectclass: GlueSLLocalFileSystem
GlueSLLocalFileSystemName: /permanent
GlueSLLocalFileSystemRoot: /permanent
GlueSLLocalFileSystemSize: 3000000
GlueSLLocalFileSystemAvailableSpace: 300000
GlueSLLocalFileSystemType: ext3
The subcluster grid006f.cnaf.infn.it has the remote NFS directory edt004.cnaf.infn.it:/permanent mounted as /shared/permanent:
dn: GlueHostRemoteFileSystemName=/shared/permanent, GlueSubClusterUniqueID=grid006f.cnaf.infn.it, GlueClusterUniqueID=grid006f.cnaf.infn.it, mds-vo-name=local, o=grid
objectclass: GlueClusterTop
objectclass: GlueSchemaVersion
GlueSchemaVersionMajor: 1
GlueSchemaVersionMinor: 0
objectclass: GlueHostRemoteFileSystem
GlueHostRemoteFileSystemName: /shared/permanent
GlueHostRemoteFileSystemRoot: /shared/permanent
GlueHostRemoteFileSystemType: NFS
GlueHostRemoteFileSystemSize: 3000000
GlueHostRemoteFileSystemAvailableSpace: 300000
GlueHostRemoteFileSystemServer: edt004.cnaf.infn.it:/permanent
The CE grid006f.cnaf.infn.it:2119/jobmanager-pbs-workq can submit jobs to the subcluster grid006f.cnaf.infn.it:
dn: GlueCEUniqueID=grid006f.cnaf.infn.it:2119/jobmanager-pbs-workq, mds-vo-name=local, o=grid
objectclass: GlueCETop
objectclass: GlueSchemaVersion
GlueSchemaVersionMajor: 1
GlueSchemaVersionMinor: 0
objectclass: GlueCE
GlueCEUniqueID: grid006f.cnaf.infn.it:2119/jobmanager-pbs-workq
GlueCEName: long
GlueCEHostingCluster: grid006f.cnaf.infn.it
……………..
……………..
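The entries above can be chained to find out which Storage Services are locally reachable from a CE: the CE names its hosting cluster, the cluster's remote file system names the exporting server, and the SE names its hosting Storage Library. The Python sketch below follows that chain over flattened records. It is a hedged illustration only: the function name locally_mounted_ses and the record layout are hypothetical, and the "cluster" key is an assumed shortcut for the GlueClusterUniqueID component of the remote file system's DN.

```python
# Flattened records carrying only the attribute values needed here,
# copied from the LDIF entries above.
CE = {
    "GlueCEUniqueID": "grid006f.cnaf.infn.it:2119/jobmanager-pbs-workq",
    "GlueCEHostingCluster": "grid006f.cnaf.infn.it",
}
SES = [
    {"GlueSEUniqueID": "edt004.cnaf.infn.it",
     "GlueSEHostingSL": "edt004.cnaf.infn.it"},
]
REMOTE_FILE_SYSTEMS = [
    {"cluster": "grid006f.cnaf.infn.it",  # taken from the entry's DN
     "GlueHostRemoteFileSystemName": "/shared/permanent",
     "GlueHostRemoteFileSystemServer": "edt004.cnaf.infn.it:/permanent"},
]


def locally_mounted_ses(ce, remote_file_systems, ses):
    """Return (SE unique ID, local mount point, exported directory) for every
    SE whose hosting Storage Library exports a directory that is mounted on
    the cluster behind the given CE."""
    result = []
    for fs in remote_file_systems:
        if fs["cluster"] != ce["GlueCEHostingCluster"]:
            continue
        # The server value has the form host:/exported/directory
        host, _, exported = fs["GlueHostRemoteFileSystemServer"].partition(":")
        for se in ses:
            if se["GlueSEHostingSL"] == host:
                result.append((se["GlueSEUniqueID"],
                               fs["GlueHostRemoteFileSystemName"],
                               exported))
    return result


mounted = locally_mounted_ses(CE, REMOTE_FILE_SYSTEMS, SES)
print(mounted)
```

The local mount point found this way is the kind of value that a CE-SE binding could publish as the CE access point.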
Maintained by Sergio Andreozzi - Last update: Tue 29 October, 2002