Basic XRootD

From DUNE
Revision as of 01:29, 7 August 2016 by MaximPotekhin

Disclaimer

The information below is not authoritative documentation. It was obtained by experimenting with a small xrootd installation built from scratch, in an attempt to work around some of the obscurity in the official xrootd docs. There are probably more systematic and proper ways to do things.

Running a minicluster

For basic "basement lab" experimentation it is convenient to use a few computers that are not currently needed for other purposes. One should really use ssh to control several machines from a single host, but if security is not a concern because the network is strictly local, and the inet daemon is installed, telnet can serve as a quick solution. For example, on Ubuntu the daemon can be started as follows:

sudo /etc/init.d/xinetd start

You may need to add a few applications to your desktop. This is done as follows:

sudo cp /usr/share/applications/firefox.desktop  ~/Desktop/
sudo chmod +x ~/Desktop/firefox.desktop


Building XRootD

XRootD is packaged for installation for a few flavors of Linux. If your OS is not supported in this manner, building from source is a workable option. Follow the instructions on the official XRootD site.

Starting a simple instance of xrootd service

There is more than one way to start the xrootd service (see documentation). The most primitive way is to start the requisite daemon processes from the command line. A few details are given below.

Starting the xrootd daemon by itself is enough to serve data from a single node.

xrootd -c configFile.cfg /path/to/data &

In this case configFile.cfg contains the necessary configuration. Without it present, some simple defaults will be assumed but one cannot do anything remotely meaningful. The path which is to be exported may be defined in the configuration file as well, in which case it's not necessary to put it in the command line.

The "-b" option starts the process in the background, and the "-l" option specifies the path to the log file (otherwise log output goes to stderr). Examples:

cmsd -b -l /path/to/log/cmsd.log -c client.cfg
xrootd -b -l /path/to/log/xrootd.log -c client.cfg

The "cmsd" is the clustering daemon, which is explained in the Clustering section below.


If the "path to data" is not explicitly defined, xrootd will default to /tmp, which might work for initial testing but isn't practical otherwise. Whether xrootd is running as expected can be tested by using the xrdcp client from any machine from which the server is accessible, e.g.

xrdcp myFile.txt root://serverIP//path/to/data
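The test above can be extended into a round trip: upload a file, fetch it back, and compare the two copies. The following is only a sketch. The function names xrd_url and xrd_roundtrip are made up for illustration, and it assumes the server permits writes to the target path.

```shell
#!/bin/sh
# Build a root:// URL from a host and an absolute path.
# The path keeps its leading slash, which yields the double slash after the host.
xrd_url() {
    printf 'root://%s/%s\n' "$1" "$2"
}

# Hypothetical round-trip check: upload a file, fetch it back, compare.
# Usage: xrd_roundtrip <server> <local-file> <remote-file-path>
xrd_roundtrip() {
    server=$1
    local_file=$2
    remote_path=$3
    back=$(mktemp) || return 1
    xrdcp -f "$local_file" "$(xrd_url "$server" "$remote_path")" &&
        xrdcp -f "$(xrd_url "$server" "$remote_path")" "$back" &&
        cmp -s "$local_file" "$back"
    status=$?
    rm -f "$back"
    return $status
}
```

For example, "xrd_roundtrip serverIP myFile.txt /path/to/data/myFile.txt" returns 0 only if both copies succeed and the fetched file is identical to the original.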

Clustering

In a clustered environment, you also need to start the cluster manager daemon, e.g.

xrootd -c configFile.cfg /path/to/data &
cmsd -c configFile.cfg /path/to/data &

Alternatively,

cmsd -b -l /path/to/log/cmsd.log -c client.cfg
xrootd -b -l /path/to/log/xrootd.log -c client.cfg

...in which case the log files are explicitly defined on the command line (as opposed to the default stderr) and the processes are run as daemons.
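The two commands differ only in the daemon name, so they can be wrapped in a small helper. This is a sketch under assumptions: the function name start_node and the DRY_RUN switch are invented here, and both daemons are assumed to share one configuration file, as in the example above.

```shell
#!/bin/sh
# Hypothetical helper: start both daemons in the background with log files,
# or, with DRY_RUN=1, only print the commands that would be run.
# Usage: start_node <config-file> <log-dir>
start_node() {
    cfg=$1
    logdir=$2
    for daemon in cmsd xrootd; do
        # Same options as in the example: -b (background), -l (log file), -c (config)
        set -- "$daemon" -b -l "$logdir/$daemon.log" -c "$cfg"
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "$*"
        else
            "$@" || return 1
        fi
    done
}
```

Setting DRY_RUN=1 and calling "start_node client.cfg /path/to/log" prints the two command lines without starting anything, which is handy for checking paths first.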

The data in the cluster is exposed through the manager node, whose address is to be used in queries. Example:

xrdcp -f xroot://managerIP//my/path/foo local_foo

The file "foo" will be located and, if it exists, copied to "local_foo" on the machine running the xrdcp client. Caveat: if multiple files exist in the system under the same path, which one gets fetched is arbitrary.

Configuration File

An example of a working configuration file suitable for a server node (not for the manager node):

all.role server
all.export /path/to/data
all.manager 192.168.0.191:3121
xrd.port 1094
acc.authdb /path/to/data/auth_file

In the example above the IP address of the manager is arbitrary; it needs to be set to the actual address of your manager node.
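When several server nodes are set up, it may help to generate this file instead of editing it by hand. A minimal sketch follows. The function name server_cfg is hypothetical, and the ports (3121 and 1094) and the auth_file location are simply copied from the example above.

```shell
#!/bin/sh
# Hypothetical helper: print a server-node configuration for a given
# manager address and data path, mirroring the example on this page.
# Usage: server_cfg <manager-host> <data-path> > server.cfg
server_cfg() {
    manager=$1
    data=$2
    cat <<EOF
all.role server
all.export $data
all.manager $manager:3121
xrd.port 1094
acc.authdb $data/auth_file
EOF
}
```

For example, "server_cfg 192.168.0.191 /path/to/data > server.cfg" reproduces the file shown above.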

authdb

The "authdb" setting is important: things mostly won't work without proper authorization (quite primitive in this case, as it relies on a file with permissions). If all users are given access to all data, the content of the file can be as simple as

u * /path/to/data lr
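In this line "u *" addresses all users, and the privilege letters "l" and "r" stand for lookup and read. A small sketch that writes such a file is given below; the function name write_authdb is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: create an authdb file that grants every user
# lookup (l) and read (r) access to a single path.
# Usage: write_authdb <auth-file> <data-path>
write_authdb() {
    printf 'u * %s lr\n' "$2" > "$1"
}
```

For example, "write_authdb /path/to/data/auth_file /path/to/data" creates the file referenced by the acc.authdb line in the configuration example.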

Redirector

The redirector coordinates the function of the cluster. For example, it finds the data based on the path given by clients such as xrdcp, without the client having to know which node contains this bit of data. A crude (but working) example of the redirector configuration:

all.manager managerIP:3121
all.role manager
xrd.port 3121
all.export /path/to/data
acc.authdb /path/to/data/auth_file

Note the port number. This is not the data port but the service port to be used for communication inside the cluster (e.g. for metadata).
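In the example above, xrd.port repeats the port given in all.manager. The sketch below checks a configuration file for that property. It reflects only the layout used on this page, not a rule taken from the XRootD documentation, and the function name ports_match is hypothetical.

```shell
#!/bin/sh
# Hypothetical check: does xrd.port equal the port given in all.manager?
# Exit status 0 when they match, 1 otherwise (including a missing directive).
# Usage: ports_match <config-file>
ports_match() {
    awk '
        # all.manager host:port -> keep the part after the last colon
        $1 == "all.manager" { n = split($2, hp, ":"); mgr = hp[n] }
        $1 == "xrd.port"    { port = $2 }
        END { exit !(mgr != "" && mgr == port) }
    ' "$1"
}
```

For example, "ports_match redir.cfg && echo ok" prints "ok" for the configuration shown above.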

Of course the redirector itself can also carry data, in which case the configuration of its xrootd server might look like this:

all.manager managerIP:3121
all.role manager
xrd.port 1094
all.export /path/to/data
acc.authdb /path/to/data/auth_file

A crude way to start a node in this role might look like this:

xrootd -c server.cfg /path/to/data &
cmsd -c redir.cfg /path/to/data &

xrdfs

File Info

Filesystem functionality. Example:

xrdfs managerIP ls -l /my/path
xrdfs managerIP ls -u /my/path

In the above, the first command behaves like "ls -l" in a Linux shell, and the second prints the URLs of the files.

The following command locates the path, i.e. returns the address(es) of the server(s) which physically hold(s) the path - can be multiple machines:

xrdfs managerIP locate /my/path

Adding the "-r" option will force the server to refresh, i.e. to do a fresh query. Otherwise, a cached result will be used if it exists.

The "stat" command provides information similar to the Linux "stat" utility:

xrdfs managerIP stat /my/path
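The stat and locate queries can be combined to inspect several paths in one go. The following is a sketch only: the function name inspect_paths and the XRDFS override variable are invented for illustration, while the xrdfs subcommands are the ones shown above.

```shell
#!/bin/sh
# XRDFS can be overridden (e.g. XRDFS=echo) to print the calls instead of running them.
XRDFS=${XRDFS:-xrdfs}

# Hypothetical helper: for each path, show its metadata and then
# its location, forcing a fresh lookup with -r.
# Usage: inspect_paths <manager> <path>...
inspect_paths() {
    manager=$1
    shift
    for p in "$@"; do
        "$XRDFS" "$manager" stat "$p"
        "$XRDFS" "$manager" locate -r "$p"
    done
}
```

For example, "inspect_paths managerIP /my/path /my/other/path" queries the manager for both paths in turn.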

The "rm" command does what the name suggests, with the usual caveat that if the same path is present on a few machines, the result will be arbitrary: only one of the files will be deleted at a time.

Host Info

xrdfs hostIP query config role

Checksum

XRootD hosts can report checksums for files, with a few checksum algorithms available. To enable this on a host, a line needs to be added to the configuration file, for example:

xrootd.chksum md5

As usual, it is only necessary to query the redirector with the xrdfs client in order to get this information:

xrdfs managerIP query checksum /my/path/to/file
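The reported checksum can be compared with that of a local copy. The sketch below assumes the response has the form "md5 <hex digest>"; this should be verified against the output of your xrootd version. The function name md5_matches is made up for illustration, and md5sum is assumed to be available on the client machine.

```shell
#!/bin/sh
# Hypothetical helper: compare a local file's MD5 with a checksum response.
# Assumption: the response looks like "md5 <hex digest>".
# Usage: md5_matches <local-file> "<response>"
md5_matches() {
    # md5sum prints "<digest>  <file>"; keep only the digest
    local_sum=$(md5sum "$1" | cut -d ' ' -f 1)
    # Take the second field of the line whose first field is "md5"
    remote_sum=$(printf '%s\n' "$2" | awk '$1 == "md5" { print $2 }')
    [ -n "$remote_sum" ] && [ "$local_sum" = "$remote_sum" ]
}
```

For example, md5_matches local_foo "$(xrdfs managerIP query checksum /my/path/foo)" returns 0 only if the two digests agree.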