Monitoring FTS

Within the framework of the JINR-CERN cooperation, a full-featured monitoring system for the FTS service has been developed. After a detailed analysis of the file transfer service database, a data model for the monitoring system was designed that provides a convenient basis for generating various reports.

The core of the model consists of information and system tables, together with tables that hold processed and raw data. The data in these tables either come directly from the file transfer service database or are set by the user. There are also tables for the separate modules of the system: the “warning system” and the “expert system”.

System data model:

During the design, four basic groups of users were identified.
Users:
* Managers of virtual organizations,
* Administrators of grid sites,
* Top management,
* FTS service administrators.

Each of these groups is interested in different data, collected over different time intervals and presented in different forms.

Managers of virtual organizations are interested in the general parameters of the data transfer service and in specific information about grid sites for a given period of time. Administrators of grid sites need information on the settings and current state of data channels, and on errors attributable to their site and hosts; the most recent data and information for the last few days are in greatest demand.

Administrators of grid sites are interested both in error categories, to identify problems, and in specific error descriptions, to resolve them.

Top management needs summary reports covering long time intervals.

Finally, FTS service administrators require up-to-date information on errors, load, and the operation of sites and virtual organizations, as well as on the degree of correlation between different errors.

The system interface consists of several modules. Users can start working with the system directly from the module they are interested in, or from the main page, which presents general reports that show the state of the service and possible sources of problems.

The system provides a wide range of reports, ratings, and statistical calculations, and can determine the correlation coefficient for a pair of errors. Almost all reports of the data transfer service monitoring system are cross-referenced, which is convenient for drilling down into results.
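As an illustration of the correlation coefficient for a pair of errors, the sketch below computes a Pearson coefficient over two series of error counts. It is only a minimal example under stated assumptions: the source does not say which coefficient the system uses, and the function name `pearson`, the error names, and the hourly counts are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric series.

    Returns None when either series is constant, because the
    coefficient is undefined in that case.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return cov / sqrt(var_x * var_y)


# Hypothetical counts of two error types in five consecutive hours.
timeouts = [2, 4, 6, 8, 10]
auth_failures = [1, 2, 3, 4, 5]
r = pearson(timeouts, auth_failures)
```

A value of `r` close to 1 or -1 would suggest that the two errors tend to rise and fall together (or in opposition), which may point to a common cause.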

The system implements a failure alert mechanism that lets the service administrator define a custom set of rules (triggers); when a trigger fires, the specified actions are executed (for example, messages are sent via the web interface, email, SMS, and so on).
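The trigger idea can be sketched as a list of rules, each pairing a condition on current metrics with an action. This is a hypothetical illustration, not the system's actual implementation: the `Trigger` class, the metric names `failure_rate` and `transfers`, the thresholds, and the in-memory `outbox` (standing in for email or SMS delivery) are all assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Trigger:
    name: str
    condition: Callable[[Dict[str, float]], bool]  # True when the rule fires
    action: Callable[[str], None]                  # e.g. send a message


def evaluate(triggers: List[Trigger], metrics: Dict[str, float]) -> List[str]:
    """Run the action of every trigger whose condition holds; return their names."""
    fired = []
    for trigger in triggers:
        if trigger.condition(metrics):
            trigger.action(f"{trigger.name}: {metrics}")
            fired.append(trigger.name)
    return fired


# The outbox list stands in for a real notification channel.
outbox: List[str] = []
triggers = [
    Trigger("high_failure_rate", lambda m: m["failure_rate"] > 0.5, outbox.append),
    Trigger("no_transfers", lambda m: m["transfers"] == 0, outbox.append),
]
fired = evaluate(triggers, {"failure_rate": 0.75, "transfers": 120})
```

Keeping the condition and the action as separate callables means a new rule or a new delivery channel can be added without changing the evaluation loop.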

The following information is available about FTS data channels, with details by grid site and virtual organization:

* Number of file transfers, with the absolute and relative numbers of successful and unsuccessful transfers;
* The identified causes of errors (the first few in the chain) and their share of the total number of errors;
* The average size of the transferred files;
* Average transfer time;
* Average data rate in the channel;
* The amount of data transmitted and received.
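The per-channel figures listed above could be derived from raw transfer records roughly as follows. This is a minimal sketch under stated assumptions: the `Transfer` record, its field names, and the channel name `CERN-JINR` are hypothetical and do not reflect the actual FTS database schema, and averages are taken over successful transfers only.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Transfer:
    channel: str
    size_bytes: int
    duration_s: float
    error: Optional[str] = None  # None means the transfer succeeded


def channel_summary(transfers: List[Transfer]) -> dict:
    """Aggregate the transfer records of one channel into summary statistics."""
    total = len(transfers)
    if total == 0:
        raise ValueError("no transfers to summarize")
    ok = [t for t in transfers if t.error is None]
    failed = total - len(ok)
    errors = Counter(t.error for t in transfers if t.error is not None)
    bytes_ok = sum(t.size_bytes for t in ok)
    time_ok = sum(t.duration_s for t in ok)
    return {
        "transfers": total,
        "successful": len(ok),
        "failed": failed,
        "success_ratio": len(ok) / total,
        # Share of each error cause in the total number of errors.
        "error_shares": {name: count / failed for name, count in errors.items()},
        "avg_size_bytes": bytes_ok / len(ok) if ok else 0.0,
        "avg_duration_s": time_ok / len(ok) if ok else 0.0,
        "avg_rate_bytes_per_s": bytes_ok / time_ok if time_ok > 0 else 0.0,
        "bytes_transferred": bytes_ok,
    }


records = [
    Transfer("CERN-JINR", 1000, 10.0),
    Transfer("CERN-JINR", 3000, 10.0),
    Transfer("CERN-JINR", 500, 5.0, error="TIMEOUT"),
    Transfer("CERN-JINR", 500, 5.0, error="TIMEOUT"),
]
summary = channel_summary(records)
```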

A detailed description of the FTS monitoring system can be found in the article by A. Uzhinskiy and V. Korenkov, “System of monitoring of service of data transmission (FTS) of EGEE/WLCG project”.