.. highlight:: bash

.. _inst-guide:

#####################################
Installation and Administration Guide
#####################################

*******
Preface
*******

Overview
========

This guide provides information about how to use the rasdaman array database
system, in particular: installation and system administration.

For storage of multi-dimensional array data, rasdaman can be configured to use
some conventional database system (such as PostgreSQL) or use its own storage
manager. For the purpose of this documentation, we will call the conventional
database system to which rasdaman is interfaced the *base DBMS*, understanding
that this base DBMS is in charge of all alphanumeric data maintained as
relational tables or object-oriented semantic nets.

This guide is specific for *rasdaman enterprise*.

Audience
========

The information in this manual is intended primarily for database and system
administrators.

Rasdaman Documentation Set
==========================

This manual should be read in conjunction with the complete rasdaman
documentation set which this guide is part of. The documentation set in its
completeness covers all important information needed to work with the rasdaman
system, such as programming and query access to databases, guidance to
utilities such as *raswct*, and release notes.

.. _sec-download-and-install:

***************
Getting Started
***************

**Hardware & Software Requirements**

It is recommended to have at least 8 GB main memory. Disk space depends on the
size of the databases, as well as the requirements of the base DBMS chosen for
rasdaman. The footprint of the rasdaman installation itself is around 400 MB.

The rasdaman code was originally developed on SUN/Solaris and HP-UX, and was
later ported to IBM AIX, SGI IRIX, and DEC Unix, although that was back in the
last millennium. Today, rasdaman is continuously tested on the platforms listed
below:

- Debian 8
- Ubuntu 16.04, 18.04
- CentOS 7

**License Key**

In order to run a rasdaman server you have to obtain a license from rasdaman
GmbH. This license key encodes, among others, the number of cores and the
server's interface name (such as "eth0") and corresponding MAC address. The
following commands are usually used to obtain this information:

::

    # interface name + MAC address
    $ ifconfig
    # alternatively
    $ ip link

    # number of CPUs
    $ cat /proc/cpuinfo
    # alternatively
    $ nproc

After communicating these details to rasdaman GmbH in the course of a license
purchase, a license key file will be provided which has to be stored on the
machine where the rasdaman server runs.

**Support**

For support in installing rasdaman and any other question you may contact
rasdaman GmbH at `www.rasdaman.com <http://www.rasdaman.com>`__.

.. _sec-system-install-packages:

Official Packages
=================

This page describes installation of rasdaman enterprise RPM or Debian packages.

With your purchase, you have received a login to the rasdaman download area,
http://download.rasdaman.com. Open this site in a browser window, log in, and
follow the instructions there (for completeness also provided below).

Rasdaman community and enterprise cannot run in parallel on the same machine.
If you plan to have both installations on the same machine, make sure they
reside in different directories and are not active at the same time. Rasdaman
databases created with rasdaman community are upwards compatible with rasdaman
enterprise.

During generation of these packages, some configuration decisions have been
made. Most importantly, the rasdaman engine in the packages uses embedded
SQLite for managing its array metadata. Notice, though, that the geo service
component, petascope, currently still relies on a PostgreSQL database; this is
planned to be changed in the near future.

.. _sec-system-install-pkgs-deb:

DEB packages
------------

Currently the following Debian-based distributions are supported:

- Ubuntu 16.04 / 18.04

Installation
^^^^^^^^^^^^

1. Import the rasdaman repository public key to the apt keychain (make sure to
   replace USERNAME / PASSWORD with your download credentials):

   ::

       wget -O - http://USERNAME:PASSWORD@82.165.134.122/Download/rasdaman.gpg | \
         sudo apt-key add -

2. Add the rasdaman repository to apt (make sure to replace USERNAME / PASSWORD
   with your download credentials):

   - **stable:** these are only updated on stable releases of rasdaman.

     ::

         # For ubuntu 16.04
         $ echo "deb [arch=amd64] http://USERNAME:PASSWORD@82.165.134.122/Download/deb xenial stable" \
           | sudo tee /etc/apt/sources.list.d/rasdaman.list

         # For ubuntu 18.04
         $ echo "deb [arch=amd64] http://USERNAME:PASSWORD@82.165.134.122/Download/deb bionic stable" \
           | sudo tee /etc/apt/sources.list.d/rasdaman.list

   Copy the rasdaman license key to ``/opt/rasdaman/etc``, e.g:

   ::

       $ sudo mkdir -p /opt/rasdaman/etc
       $ sudo cp rmankey /opt/rasdaman/etc

   .. note::
       This has to be done before installing rasdaman.

3. rasdaman can be installed now:

   ::

       $ sudo apt-get update
       $ sudo apt-get install rasdaman
       # to make rasql available on the PATH
       $ source /etc/profile.d/rasdaman.sh

4. **NOTE**: if during the install you get a prompt like the one below, type
   **N** (default option) to keep your old ``petascope.properties`` in
   ``/opt/rasdaman/etc``; the installer will automatically invoke the
   ``/opt/rasdaman/bin/update_properties.sh`` script to merge it with the new
   ``petascope.properties`` version from the package.

   ::

       Configuration file `/etc/opt/rasdaman/petascope.properties'
        ==> Modified (by you or by a script) since installation.
        ==> Package distributor has shipped an updated version.
          What would you like to do about it ?
          Your options are:
           Y or I  : install the package maintainer's version
           N or O  : keep your currently-installed version
             D     : show the differences between the versions
             Z     : start a shell to examine the situation
        The default action is to keep your current version.
       *** petascope.properties (Y/I/N/O/D/Z) [default=N] ?

   If you are automating the installation (in a script for example), you can
   bypass this prompt with an apt-get option as follows:

   ::

       apt-get -o Dpkg::Options::="--force-confdef" install -y rasdaman

5. Check that everything is fine:

   ::

       $ rasql -q 'select c from RAS_COLLECTIONNAMES as c' --out string

   Typical output:

   ::

       rasql: rasdaman query tool v1.0, rasdaman v9.0.0 -- generated on 02.07.2015 08:44:56.
       opening database RASBASE at localhost:7001...ok
       Executing retrieval query...ok
       Query result collection has 0 element(s):
       rasql done.

6. Check that petascope is initialized properly, typically at this URL:

   ::

       http://localhost:8080/rasdaman/ows

7. You will find the rasdaman installation under ``/opt/rasdaman/``;
   ``rasdaman.war`` and ``def.war`` are installed in
   ``/var/lib/tomcat7/webapps`` (or tomcat8).

8. If SELinux is running then some extra configuration is likely needed to get
   petascope to run properly. See :ref:`here <selinux-configuration>` for more
   details.

Updating
^^^^^^^^

The packages are updated whenever a new rasdaman version is released. To update
your installation:

::

    $ sudo apt-get update
    $ sudo apt-get install rasdaman
    $ sudo migrate_petascopedb.sh

.. _sec-system-install-pkgs-rpm:

RPM packages
------------

Currently the following RPM-based distributions are supported:

- CentOS 7
- CentOS 7, linked against GDAL 2.x

Installation
^^^^^^^^^^^^

1. Add the rasdaman repository to yum (make sure to replace USERNAME / PASSWORD
   with your download credentials):

   - If GDAL 1.x is installed from the standard packages:

     ::

         $ sudo curl "http://USERNAME:PASSWORD@82.165.134.122/Download/rpm/stable/CentOS/7/x86_64/rasdaman.repo" \
                -o /etc/yum.repos.d/rasdaman.repo

   - If GDAL 2.x has been manually compiled:

     ::

         $ sudo curl "http://USERNAME:PASSWORD@82.165.134.122/Download/rpm/stable/CentOS/7_gdal2/x86_64/rasdaman.repo" \
                -o /etc/yum.repos.d/rasdaman.repo

   Edit the downloaded file, replacing "USERNAME" and "PASSWORD" with the user
   name and password, respectively, as obtained from rasdaman GmbH or one of
   its authorized retailers.

   Copy the rasdaman license key to ``/opt/rasdaman/etc``, e.g:

   ::

       sudo mkdir -p /opt/rasdaman/etc
       sudo cp rmankey /opt/rasdaman/etc

2. The rasdaman packages should be available now via yum:

   ::

       $ sudo yum clean all
       $ sudo yum update
       $ sudo yum search rasdaman

   Output:

   ::

       rasdaman.x86_64 : Rasdaman extends standard relational database systems with the ability
                       to store and retrieve multi-dimensional raster data

3. Add the EPEL repository to yum (see the official EPEL page):

   ::

       sudo yum install epel-release

4. Install the rasdaman package:

   ::

       $ sudo yum install rasdaman
       # to make rasql available on the PATH
       $ source /etc/profile.d/rasdaman.sh

   .. note::
       If PostgreSQL has been newly installed (as opposed to having it
       installed before executing the commands on this page) then it is
       registered as a dependency of the rasdaman package.

   .. note::
       If petascope has *problems* connecting to rasdaman, check the
       corresponding FAQ entry for some advice.

5. Check that everything is fine:

   ::

       $ rasql -q 'select c from RAS_COLLECTIONNAMES as c' --out string

   Typical output:

   ::

       rasql: rasdaman query tool v1.0, rasdaman v9.0.0 -- generated on 02.07.2015 08:44:56.
       opening database RASBASE at localhost:7001...ok
       Executing retrieval query...ok
       Query result collection has 0 element(s):
       rasql done.

6. Check that petascope is initialized properly, typically at this URL:

   ::

       http://localhost:8080/rasdaman/ows

7. You will find the rasdaman installation under ``/opt/rasdaman/``;
   ``rasdaman.war`` and ``def.war`` are installed in
   ``/var/lib/tomcat/webapps``.

8. If SELinux is running then some extra configuration is likely needed to get
   petascope to run properly. See :ref:`here <selinux-configuration>` for more
   details.

Updating
^^^^^^^^

The packages are updated whenever a new version of rasdaman is released. To
download and install an update perform these steps:

::

    $ sudo service rasdaman stop
    $ sudo service tomcat stop
    $ sudo yum clean all
    $ sudo yum update rasdaman
    $ sudo migrate_petascopedb.sh

Administration
--------------

Once all of the above actions are completed, the rasdaman installation (or
update) is done. This section provides additional background information for
administrators.

A ``rasdaman`` service script allows starting and stopping rasdaman, e.g.

::

    $ service rasdaman start
    $ service rasdaman stop
    $ service rasdaman status

Similarly, the ``tomcat``/``tomcat7`` and ``postgresql`` services can be started
and stopped.

See also the dedicated sections on
:ref:`configuration and log files <sec-system-install-conf>` and
administration.

.. _sec-system-initialize-rasdaman:

Initialize rasdaman
-------------------

Server Configuration (Optional)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Rasdaman is a multi-server multi-user system. The server processes available
must be configured initially, which is done in the file
``$RMANHOME/etc/rasmgr.conf``. As distributed, this configuration defines ten
server processes with names such as ``N1``. If this is fine then you can just
leave it as it is. If you want to change this by modifying server startup
parameters or increasing the number of server processes available then see
:ref:`sec-rascontrol-invocation` for details on how to do this.

Demo Database
^^^^^^^^^^^^^

The rasdaman distribution contains a demo database which serves as a first test
of successful installation.
Inserting demo data into the fresh database is done through

::

    $ rasdaman_insertdemo.sh localhost 7001 \
          $RMANHOME/share/rasdaman/examples/images rasadmin rasadmin

Note that repeated invocations are not harmful - each of the sample collections
will simply receive additional objects made of the same images.

After successful completion, you can check whether the three rasdaman
collections containing the example images have been created through:

::

    $ rasql -q "select r from RAS_COLLECTIONNAMES as r" \
            --out string

This command shows a list of all collections existing in the database. There
should be ``mr``, ``mr2``, and ``rgb``.

Congratulations! At this point, if everything completed successfully, rasdaman
is up and running and prepared for data definition, data import and retrieval,
and any other suitable task.

Initialize geo service support
------------------------------

petascope
^^^^^^^^^

*Petascope* is the geo Web service frontend of rasdaman. It adds geo semantics
on top of arrays, thereby enabling regular and irregular grids based on the OGC
coverage standards.

Petascope is installed automatically in
``/var/lib/tomcat/webapps/rasdaman.war``. To implement the geo semantics,
petascope uses a relational database for the geo-related metadata.

The package post-install script will automatically set up PostgreSQL for use by
petascope. The steps approximately performed by the script are listed below:

1. If postgres has not been initialized yet:

   ::

       $ sudo service postgresql initdb

   If the output is 'Data directory is not empty!' then this step is skipped.

2. Password-based (``md5``) access for ``petauser`` is enabled in PostgreSQL by
   adding the configuration below before the ident lines to
   ``/etc/postgresql/9.4/main/pg_hba.conf`` on Debian 8, or
   ``/var/lib/pgsql/data/pg_hba.conf`` on CentOS 7:

   ::

       host    all   petauser   localhost       md5
       host    all   petauser   127.0.0.1/32    md5
       host    all   petauser   ::1/128         md5

3. Reload PostgreSQL so that the new configuration will take effect:

   ::

       $ sudo service postgresql reload

4. Add a petascope user, for example ``petauser``, to PostgreSQL:

   ::

       $ sudo -u postgres createuser -s petauser -P
       > enter password

   In ``$RMANHOME/etc/petascope.properties`` set the
   ``spring.datasource.username``/``spring.datasource.password`` and
   ``metadata_user``/``metadata_pass`` options according to this user and
   password. The post-install script generates the password randomly.

5. Copy ``/opt/rasdaman/share/rasdaman/war/rasdaman.war`` to the Tomcat webapps
   directory (``/var/lib/tomcat/webapps`` on CentOS 7) and restart Tomcat.

Following successful deployment, petascope accepts OGC W*S requests at URL
``http://localhost:8080/rasdaman/ows``.

.. _selinux-configuration:

**SELinux configuration**

If ``SELinux`` is enabled (result of ``getenforce`` is ``enforcing``) then the
``tomcat`` user which is running petascope needs permissions to:

- Load the ``gdal-java`` native library (via JNI)
- Read / write files in ``/tmp/rasdaman_*``
- Make HTTP requests to rasdaman and get back results on ports ``7001-7010``
  (these are the defaults, specified in ``$RMANHOME/etc/rasmgr.conf``).

Before proceeding, a SELinux utility package needs to be installed on CentOS 7:

::

    $ sudo yum install policycoreutils-python

There are two ways to configure SELinux in order to enable petascope:

1. Change from ``enforcing`` to ``permissive`` for Tomcat:

   ::

       $ semanage permissive -a tomcat_t

2. Create specific rules for the ``tomcat`` user and register them with
   ``SELinux``.

   - Create a rule config file ``tomcat_config.te`` with this content:

     ::

         module tomcat_config 1.0;

         require {
                 type tomcat_t;
                 type tomcat_var_lib_t;
                 type usr_t;
                 type tomcat_exec_t;
                 type unconfined_service_t;
                 type afs_pt_port_t;
                 type tomcat_tmp_t;
                 type tmpfs_t;
                 type afs3_callback_port_t;
                 class tcp_socket name_connect;
                 class file { append create execute read relabelfrom rename write };
                 class shm { associate getattr read unix_read unix_write write };
         }

         # ============= tomcat_t ==============

         allow tomcat_t afs3_callback_port_t:tcp_socket name_connect;
         allow tomcat_t tmpfs_t:file { read write };
         allow tomcat_t tomcat_tmp_t:file { execute relabelfrom };
         allow tomcat_t tomcat_var_lib_t:file execute;
         allow tomcat_t unconfined_service_t:shm { associate getattr read unix_read unix_write write };

   - Create a shell script ``deployse.sh`` to generate a binary package from
     this config file:

     ::

         #!/bin/bash
         set -e

         MODULE=${1}

         # this will create a .mod file
         checkmodule -M -m -o ${MODULE}.mod ${MODULE}.te

         # this will create a compiled semodule
         semodule_package -m ${MODULE}.mod -o ${MODULE}.pp

         # this will install the module
         semodule -i ${MODULE}.pp

   - Run the script to load the binary package module into ``SELinux``:

     ::

         $ sudo ./deployse.sh tomcat_config

Restart Tomcat with ``sudo service tomcat restart``; it should now be possible
to import data into petascope via WCSTImport and to retrieve data from rasdaman
via WCS / WMS / WCPS.

secore
^^^^^^

SECORE (Semantic Coordinate Reference System Resolver) is a service that maps
CRS URLs to CRS definitions. This component, which is part of the standard
rasdaman distribution, is used by the Open Geospatial Consortium (OGC) for
operating their official CRS resolver.

Petascope uses SECORE for resolving CRS definitions of the coverages it holds,
and it is best if SECORE is deployed locally as ``def.war``, alongside the
petascope ``rasdaman.war`` application.
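
As a rough smoke test of the two web applications described above, the sketch below requests one URL from petascope and one from a locally deployed SECORE and prints the HTTP status code of each. The petascope endpoint ``/rasdaman/ows`` comes from this guide; the WCS ``GetCapabilities`` query parameters, the SECORE path ``/def/crs/EPSG/0/4326``, the ``BASE_URL`` variable, and the helper names ``endpoint_urls`` and ``http_status`` are assumptions made for illustration only.

```bash
#!/bin/bash
# Minimal sketch, not part of the rasdaman distribution.

# Print the two URLs to probe for a given Tomcat base URL.
# Assumption: SECORE is deployed as def.war next to rasdaman.war.
endpoint_urls() {
    local base="${1:-http://localhost:8080}"
    echo "$base/rasdaman/ows?service=WCS&version=2.0.1&request=GetCapabilities"
    echo "$base/def/crs/EPSG/0/4326"
}

# Print the HTTP status code returned for a URL; a failed connection
# must not abort the script, hence the "|| true".
http_status() {
    curl -s -o /dev/null -m 10 -w '%{http_code}' "$1" || true
}

# Probe each endpoint and report "<status> <url>".
endpoint_urls "${BASE_URL:-http://localhost:8080}" | while read -r url; do
    echo "$(http_status "$url") $url"
done
```

A status of ``200`` for both lines suggests that Tomcat has deployed both applications; any other value is a hint to look at the Tomcat log.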
For further details on SECORE management, security and troubleshooting see the
SECORE administration and developer guide pages.

SSL/TLS Configuration
^^^^^^^^^^^^^^^^^^^^^

Transport Layer Security (``TLS``) and its predecessor, Secure Sockets Layer
(``SSL``), are technologies which allow web browsers and web servers to
communicate over a secured connection. To configure it for the ``petascope``
and ``secore`` web applications in ``Tomcat``, check the official guide.

.. _sec-rrasdaman-install:

.. TODO

.. _sec-system-install-conf:

***********************************
Directories and Configuration Files
***********************************

Overall directory structure
===========================

As common with rasdaman, we refer to the installation location as ``$RMANHOME``
below; the default is ``/opt/rasdaman``. The table below lists the directories
found in ``$RMANHOME`` after a fresh installation.

+-------------+--------------------------------------------------------------+
|**Directory**|**Description**                                               |
+=============+==============================================================+
|``bin``      |rasdaman executables, e.g. rasql, start_rasdaman.sh, ...      |
+-------------+--------------------------------------------------------------+
|``data``     |Path where the server stores array tiles as files; this       |
|             |directory can get big, so it is recommended to make           |
|             |it a link to a sufficiently large disk partition.             |
+-------------+--------------------------------------------------------------+
|``etc``      |Configuration files, e.g. rasmgr.conf                         |
+-------------+--------------------------------------------------------------+
|``include``  |C++ API development headers.                                  |
+-------------+--------------------------------------------------------------+
|``lib``      |C++ and Java API libraries.                                   |
+-------------+--------------------------------------------------------------+
|``log``      |``rasmgr`` and ``rasserver`` log files.                       |
+-------------+--------------------------------------------------------------+
|``share``    |Various artefacts like documentation, python/javascript       |
|             |clients, example data, migration scripts, etc.                |
+-------------+--------------------------------------------------------------+

Executables
===========

Rasdaman executables are found in ``$RMANHOME/bin``; the table below lists the
various binaries and scripts. More detailed information on these components is
provided in the :ref:`sec-rasdaman-architecture` Section.

+---------------------------+----------------------------------------------------------------+
|**Executables**            |**Description**                                                 |
+===========================+================================================================+
|``rasserver``              |Client queries are evaluated by a ``rasserver`` worker process. |
+---------------------------+----------------------------------------------------------------+
|``rasmgr``                 |A manager process that controls ``rasserver`` processes and     |
|                           |client/server pairing.                                          |
+---------------------------+----------------------------------------------------------------+
|``rascontrol``             |A command-line frontend for ``rasmgr``.                         |
+---------------------------+----------------------------------------------------------------+
|``directql``               |A rasserver that can execute queries directly, bypassing the    |
|                           |client/server protocol; useful for debugging.                   |
+---------------------------+----------------------------------------------------------------+
|``rasql``                  |A command-line client for sending queries to a ``rasserver``    |
|                           |(as assigned by the ``rasmgr``).                                |
+---------------------------+----------------------------------------------------------------+
|``start_rasdaman.sh``      |Start ``rasmgr`` and the worker ``rasservers`` as               |
|                           |configured in ``$RMANHOME/etc/rasmgr.conf``, embedded           |
|                           |petascope configured in ``$RMANHOME/etc/petascope.properties``  |
|                           |and embedded secore configured in                               |
|                           |``$RMANHOME/etc/secore.properties`` by default.                 |
|                           |Since v9.8, to start a specific service the                     |
|                           |``--service (core | secore | petascope)`` option can be used    |
|                           |(``core`` refers to ``rasmgr`` + ``rasservers`` only).          |
+---------------------------+----------------------------------------------------------------+
|``stop_rasdaman.sh``       |Shut down rasdaman, embedded petascope and embedded secore      |
|                           |by default.                                                     |
|                           |Since v9.8, to stop a specific service the                      |
|                           |``--service (core | secore | petascope)`` option can be used    |
|                           |(``core`` refers to ``rasmgr`` + ``rasservers`` only).          |
+---------------------------+----------------------------------------------------------------+
|``rasdl``                  |Tool for RASBASE creation/deletion and type management          |
|                           |(deprecated).                                                   |
+---------------------------+----------------------------------------------------------------+
|``create_db.sh``           |Initialize the rasdaman metadata database (RASBASE);            |
|                           |convenience wrapper around ``rasdl``.                           |
+---------------------------+----------------------------------------------------------------+
|``update_db.sh``           |Applies migration scripts to RASBASE.                           |
+---------------------------+----------------------------------------------------------------+
|``rasdaman_insertdemo.sh`` |Insert three demo collections into rasdaman (used in the        |
|                           |rasdaman Query Language Guide).                                 |
+---------------------------+----------------------------------------------------------------+
|``petascope_insertdemo.sh``|Insert geo-referenced demo coverage in petascope.               |
+---------------------------+----------------------------------------------------------------+
|``migrate_petascopedb.sh`` |Applies database migrations on petascopedb.                     |
+---------------------------+----------------------------------------------------------------+
|``wcst_import.sh``         |Tool for convenient and flexible ingestion of                   |
|                           |geo-referenced data into petascope.                             |
+---------------------------+----------------------------------------------------------------+
|``rasfed.jar``             |Federation daemon.                                              |
+---------------------------+----------------------------------------------------------------+

Configuration files
===================

Configurations are automatically loaded upon rasdaman start. After any
modification a restart has to be performed for the change to take effect.

Server rasdaman configuration files can be found in ``$RMANHOME/etc``.

+--------------------------+------------------------------------------------------------+
|``rasmgr.conf``           |allows fine-tuning the rasdaman servers, e.g. number of     |
|                          |servers, names, database connection, etc.                   |
+--------------------------+------------------------------------------------------------+
|``petascope.properties``  |set petascope properties, e.g. database/rasdaman connection |
|                          |details, CRS resolver URLs, various feature options         |
+--------------------------+------------------------------------------------------------+
|``wms_service.properties``|petascope properties specifically for the WMS service       |
+--------------------------+------------------------------------------------------------+
|``secore.properties``     |secore configuration                                        |
+--------------------------+------------------------------------------------------------+
|``rasfed.properties``     |federation daemon configuration                             |
+--------------------------+------------------------------------------------------------+

Specifically, log output is controlled via these configuration files:

+-------------------+----------------------------------------------+
|``log-rasmgr.conf``|log output of rasmgr                          |
+-------------------+----------------------------------------------+
|``log-server.conf``|log output of the rasservers                  |
+-------------------+----------------------------------------------+
|``log-client.conf``|log output of client applications, e.g., rasql|
+-------------------+----------------------------------------------+

rasdaman uses the Easylogging++ library for logging in its C++ components. Log
properties can be configured as documented on the Easylogging++ GitHub page.

The enterprise license file ``rmankey`` is also found here.
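
To check quickly that the files named in the tables above are in place, a small helper along the following lines can be used. It is only a sketch: the function name ``missing_conf_files`` is hypothetical, the file list is copied from the tables above, and the fallback to ``/opt/rasdaman`` assumes the default ``$RMANHOME`` mentioned in this guide.

```bash
#!/bin/bash
# Minimal sketch, not part of the rasdaman distribution.

# Print the name of every expected configuration file that is missing
# from the given directory (one name per line, nothing if all exist).
missing_conf_files() {
    local dir="$1" f
    for f in rasmgr.conf petascope.properties wms_service.properties \
             secore.properties rasfed.properties \
             log-rasmgr.conf log-server.conf log-client.conf rmankey; do
        [ -f "$dir/$f" ] || echo "$f"
    done
}

# Check the configuration directory of the current installation.
missing_conf_files "${RMANHOME:-/opt/rasdaman}/etc"
```

An empty output means that all listed files were found; each printed name points to a file that is worth checking before starting rasdaman.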
Further potentially relevant configuration files are

+------------+--------------------------------------------------------+
|postgresql  |``/var/lib/pgsql/data/{postgresql.conf,pg_hba.conf}`` or|
|            |``/etc/postgresql/9.X/{postgresql.conf,pg_hba.conf}``   |
+------------+--------------------------------------------------------+
|tomcat      |``/etc/tomcat/``, ``/etc/default/tomcat``               |
+------------+--------------------------------------------------------+

.. _sec-log-files:

Log files
=========

**rasdaman**

*rasdaman* server logs are placed in ``$RMANHOME/log/``. The server components
feed the following files where ``uid`` represents a unique identifier of the
process, and ``pid`` is a Linux process identifier:

``rasserver.<uid>.<pid>.log``
    ``rasserver`` worker logs: at any time there are several rasservers running
    (depending on the settings in ``rasmgr.conf``) and each has a unique log
    file.

``rasmgr.<pid>.log``
    ``rasmgr`` log: there is only one ``rasmgr`` process running at any time.

``rasfed.log``
    ``rasfed`` log: there is only one ``rasfed`` process running at any time;
    on rasdaman restart the output from the new process is appended to the same
    log file.

.. note::
   ``ls -ltr`` is a useful command to see the most recently modified log files
   at the bottom.

**petascope & secore**

*petascope* log messages can typically be found in
``/var/log/tomcatN/catalina.out``, where N can be 7 or 8 depending on your
OS/Tomcat version. It is highly recommended, however, to set a specific log
file in the log4j configuration section in ``petascope.properties`` (e.g.
``log4j.appender.rollingFile.File=/var/log/tomcatN/petascope.log``). Note that
this location needs to be writable by the Tomcat user. The same can be set for
SECORE in ``secore.properties``.

.. _sec-rasdaman-architecture:

*********************
rasdaman Architecture
*********************

The parallel server architecture of rasdaman offers a scalable, distributed
environment to efficiently process even very large numbers of concurrent client
requests. Yet, server administration is easy to accomplish, with only a few
things to do to have a smoothly running, highly performant installation.
Moreover, the system is implemented with a high-availability technique where
most server management operations can be done with the server up and running,
limiting the need for a server shutdown to the absolute minimum.

In this Section the general rasdaman server architecture is outlined. It is
recommended to study this section so as to understand the server administration
terminology used in the next Section.

Executables Overview
====================

The following executables are provided in the ``bin/`` directory, among others:

* ``rasmgr`` is the central rasdaman request dispatcher;

* ``rasserver`` is the rasdaman server engine; it should not (and actually
  cannot) be invoked in a standalone manner;

* ``rascontrol`` allows interactive control of the rasdaman server by
  communicating with ``rasmgr``;

* ``rasdl`` is the command-line based schema maintenance tool; this is
  (currently) not a client application, but connects directly to the relational
  database manager. It is mostly deprecated, as its functionality is supported
  in rasql.

* ``rasql`` is the command-line based query tool.

* ``rasfed`` is the federation daemon, which enables efficient query
  distribution in federated rasdaman networks.

The ``rasdl`` and ``rasql`` tools are explained in detail in the *rasdaman
Query Language Guide*.

.. _sec-server-mgr-server:

Server Manager and Server
=========================

Overview and Terminology
------------------------

The rasdaman server configuration consists of one dispatcher process per
computer, ``rasmgr`` (we will refer to it as *manager* in the sequel), and
server processes, ``rasserver`` (referred to as *servers*), of which at a given
time none, one, or several can be running. All server processes are under
control of the manager. Server manager and rasdaman server(s) all run on the
same physical hardware, the *rasdaman host*.

The servers resolve requests, thereby generating calls to the relational
database system which in turn accesses its database files. For the purpose of
this manual, the relational server together with the database it maintains are
collectively called the *database*. The machine the relational database server
runs on is referred to as *database host* (:numref:`figure2`).

.. _figure2:

.. figure:: media/inst-guide/image3.png
   :align: center
   :width: 450px

   Overall server hierarchy, introducing the terminology for rasdaman hardware
   and software environment

Server Structure in General
---------------------------

The manager accepts client requests and assigns server instances to them,
taking them from the pool of server processes it maintains. In distributed
installations, it keeps contact with the managers on other machines to further
dispatch client requests across all the rasdaman servers available. Whenever
needed, the administrator can launch further server instances, or shut them
down again.

Upon system configuration definition (see :ref:`sec-rascontrol-invocation`), a
unique name is assigned to each server identifying it to the manager.

Each rasdaman server is assigned to a relational database server, laid down in
the manager configuration file. Databases can be registered and associated to
particular rasdaman servers at any time.

rasdaman hosts and database hosts are identified by their respective
host name in common domain address form, e.g., ``martini.rasdaman.com`` or ``199.198.197.50``. ``Rascontrol`` is the interactive front-end to ``rasmgr`` and, as such, the main utility for user and system management. It provides the necessary functions to manage the whole system configuration, to add and remove user, to change their rights, and to obtain information about system activity. The rasdaman server, i.e., ``rasserver``, is controlled by the manager which starts and stops server instances. Hence, the ``rasserver`` executable should not (and actually cannot) be invoked directly. Dynamic Server Assignment ------------------------- The process of client/server communication and server scheduling is done as follows (see numbers in :numref:`figure-internal-server-mgmt`). 1. The client starts every ``OPENDB`` and ``BEGIN TRANSACTION`` request with an HTTP call to the manager, providing the required service type (RPC, HTTP, etc.) and the database name, together with user name and password. 2. The manager's answer is the server ID of a free server, or an error message in case no server is available or access is denied for the given login. 3. Client-Server communication to perform the database requests. 4. Upon ``CLOSEDB`` and ``ABORT/COMMIT TRANSACTION`` the server informs the manager that it is available again. This is also done upon a client timeout. These negotiation steps are performed between client library and server, hence transparent to the application. The rasdaman server system is started by invoking the server manager ``rasmgr`` (see :ref:`sec-running-manager`). If it finds a configuration file, them autopmatically all servers indicated will be started; alternatively, server configuration can be done directly through ``rascontrol`` (see :ref:`sec-rascontrol-invocation`). .. _figure-internal-server-mgmt: .. 
figure:: media/inst-guide/image4.png :align: center :width: 450px Internal server management System Start-up --------------- Invocation of the ``rasmgr`` executable must be done under the operating system login under which the rasdaman installation has been done, usually (and recommended) ``rasdaman``. The service script ``/etc/init.d/rasdaman`` (when rasdaman is installed from the packages) automatically takes care of this. Authentication -------------- On every machine hosting rasdaman servers a separate manager has to run. The manager maintains an authorization file, ``$RMANHOME/etc/rasmgr.auth``. It should not be changed by the administrator, as it is generated, maintained, and overwritten by the manager. .. _figure4: .. figure:: media/inst-guide/image5.png :align: center :width: 500px rasdaman federation rasdaman Manager Defaults ------------------------- The manager's default name is the host name (the one reported by the UNIX command ``hostname``), but it can be changed (see the ``change`` command). By default, it listens to port 7001 for incoming requests and uses port 7001 for outgoing requests. Port Number Recommendations --------------------------- To keep track of the ports used, it is recommended to use the following schema (there is, however, no restriction preventing you from choosing another schema - just keep an overview\...): - use port number 7001 for the server manager; - use port numbers 7002 to 7999 for rasdaman servers. Parallel rasdaman [RE] ====================== rasdaman enterprise offers *intra-query* parallelization, that is: splitting of complex queries into sub-queries executed in parallel on a configurable set of compute nodes. Distribution is determined by criteria like data location in the networks and "breakpoints" in the query where minimal data transport occurs. In particular for complex queries and big data such methods are known to boost performance dramatically.
Federation daemon ----------------- A separate background process per host, ``rasfed``, collects metadata about rasdaman instances in the known network. To this end, ``rasfed`` periodically contacts all known nodes to gather information used for dispatching and optimization. Nodes known are those specified as ``inpeer`` and/or ``outpeer`` in ``rasmgr.conf``. The following adjustments can be made by editing ``rasfed.properties``: - ``rasdamanDatabaseConnectivityString`` sets the connectivity string to the database administered by rasdaman. - Default: ``jdbc:sqlite:/opt/rasdaman/data/RASBASE`` - ``rasdamanDatabaseUser`` sets the username for the above database. - Default: ``rasdaman`` - Need to change: NO for sqlite, YES for other backend DBMS - ``rasdamanDatabasePassword`` sets the password for the above username. - Default: ``rasdaman`` - Need to change: NO for sqlite, YES for other backend DBMS - ``rasdamanUser`` sets the rasdaman username. - Default: ``rasguest`` - Need to change: YES when changed in rasdaman - ``rasdamanPassword`` sets the password for the above username. - Default: ``rasguest`` - Need to change: YES when changed in rasdaman - ``listeningPort`` sets the port on which rasfed listens. Default: ``7000`` - ``hostname`` sets the machine hostname; usually this will be correctly guessed during installation. Default: output of ``hostname`` command - ``maxNumberOfRestartTries`` sets the number of restarts rasfed should perform in case of an error, until it gives up. Default: ``5`` - ``noContactMaxTime`` sets the maximal period (in milliseconds) during which a peer server doesn't send any messages, before being considered inactive. Default: ``130000`` - ``pathToRasdamanBinaries`` sets the path to rasdaman bin directory. Default: ``/opt/rasdaman/bin`` - ``penaltyTimeForBeingInactive`` sets the period (in milliseconds) for which an inactive server is ignored. After that, a new attempt to contact the inactive server is made.
Default: ``50000`` - ``statusMessageUpdateTimeInterval`` sets the polling interval in milliseconds. Default: ``10000`` - ``restartServiceDelay`` sets the number of milliseconds that need to pass between consecutive restarts, in case of failure. Default: ``1000`` - ``petascopeEndPoint`` is the endpoint URL of petascope on this peer node. Default: ``http://localhost:8080/rasdaman/ows`` Need to change: YES In summary, customization is typically required for ``petascopeEndPoint`` only, as well as the ``rasdamanUser`` and ``rasdamanPassword`` when they change in rasdaman. Disabling federation access --------------------------- Disabling network communication between nodes can be preset in ``rasmgr.conf`` or set interactively via ``rascontrol`` as follows. No queries to outpeers ^^^^^^^^^^^^^^^^^^^^^^ Do one of the following in ``rasmgr.conf`` or via ``rascontrol``: - remove hosts from the outpeer clause - delete outpeer lines - put outpeer lines in comments (i.e., prefix with "#") No queries from inpeers ^^^^^^^^^^^^^^^^^^^^^^^ Do one of the following: - remove hosts from the inpeer clause - delete inpeer lines - put inpeer lines in comments (i.e., prefix with "#") Restart server ^^^^^^^^^^^^^^ If ``rasmgr.conf`` has been edited, rasdaman needs to be restarted on this node to make the changes effective. Changes done through ``rascontrol`` become effective immediately, and no restart is required. .. _sec-storage-backend: Storage backend =============== rasdaman can store array data in two different ways: 1. Arrays in a file system directory, array metadata in SQLite; this is the default. 2. Everything in PostgreSQL: arrays in BLOBs, array metadata in tables. .. note:: rasdaman enterprise additionally supports access to pre-existing archives of any structure. The array storage variant is fixed in the packages to ``sqlite``, i.e. the default recommended option.
Storing arrays in a file system directory ----------------------------------------- In this storage variant, a particular directory gets designated to hold rasdaman arrays (maintained by rasdaman) and their metadata (maintained by an SQLite instance embedded in rasdaman). The recommended directory location is ``$RMANHOME/data/``; administrators may configure this to be a symbolic link to some other location, possibly another filesystem than where ``$RMANHOME`` resides (so as to keep programs and data separate). Alternatively, the path can be changed in the ``-connect`` option in ``rasmgr.conf``. The data directory will contain the named database. Currently only one database is supported, but this may change in future. Default database name, assumed by all tools, is ``RASBASE``. While it can be changed this is not recommended as all tools will need to receive an extra parameter indicating the changed name. The database name needs to be communicated to rasdaman in the ``$RMANHOME/etc/rasmgr.conf`` configuration file. Specifically, the connect string should be an absolute path to the ``RASBASE`` database (note that variables are not recognized in the script, therefore ``$RMANHOME`` has to be spelt out). Assuming the default values described above and a rasdaman installation directory of ``$RMANHOME=/opt/rasdaman``, the corresponding configuration line might look like this: :: define dbh rasdaman_host -connect /opt/rasdaman/data/RASBASE .. caution:: For a customized data directory location it is recommended to use a symbolic link, rather than modify installation defaults. .. _sec-query-caching: Query Result Caching [RE] ========================= Overview -------- Query results can be cached in a shared memory area of the server's main memory. Cache contents is shared among all rasserver processes running on the same computer. Cache coherence (i.e., automatic adaptation of the cache contents after database updates) is ensured. 
A cached result is used by a subsequent query if a subexpression in this query matches with the cached result; in this case, the cached result replaces the query expression, thereby speeding up processing of the query. Results do not have to match exactly; if a larger array is cached than a query needs then the subset needed will be extracted and reused, which still provides the query with a performance gain. Measurements have shown speedups of several orders of magnitude in presence of cache contents reuse. Key cache parameters are configurable by the administrator. By default, the cache is disabled; it needs to be activated through the cache control commands described below. Cache Reuse ----------- A query can use a cached item if it contains an occurrence of the expression that has produced the cached element, and if this expression has been applied to the same array object the query wants to access. Note that the decision considers what base data item (i.e., array) has been used – in other words, an expression can benefit from the cache only if it addresses the same array. Scope ^^^^^ The unit of caching is a single result item, either an array, or a tile, or a scalar. As rasql queries are set-oriented one query may access several arrays, and may deliver more than one item. In the cache, each such item constitutes a separate, independent entry. Subsequent queries check the cache for useful elements on the level of single elements. Therefore, even if only some array results can be reused in a query addressing a set of arrays then the query still can benefit from the matches found. For avoiding doubts, no complete queries (``select ... from .. where``) can be cached, but array access and operations up to complete ``select`` and ``where`` clauses, including data format encoding. Several situations are possible, they are explained below in turn. 
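The subset reuse mentioned in the overview (a cached array that is larger than what a query needs) can be pictured with a small sketch. This is an illustration only, not the rasdaman implementation: the function names ``contains`` and ``local_slices`` are hypothetical, and domains are assumed to be lists of inclusive ``(lo, hi)`` bounds per axis.

```python
def contains(cached, requested):
    """True if each requested interval lies inside the cached interval of the same axis."""
    return len(cached) == len(requested) and all(
        c_lo <= r_lo and r_hi <= c_hi
        for (c_lo, c_hi), (r_lo, r_hi) in zip(cached, requested)
    )


def local_slices(cached, requested):
    """Translate the requested domain into slices relative to the cached array origin."""
    if not contains(cached, requested):
        raise ValueError("requested domain is not covered by the cached domain")
    # Bounds are inclusive, so the slice stop is hi + 1 (shifted by the cached origin).
    return tuple(
        slice(r_lo - c_lo, r_hi - c_lo + 1)
        for (c_lo, _), (r_lo, r_hi) in zip(cached, requested)
    )


cached_domain = [(0, 99), (0, 99)]       # domain of the cached result (assumed)
requested_domain = [(10, 19), (50, 59)]  # domain the new query needs (assumed)
print(local_slices(cached_domain, requested_domain))
```

If the requested domain is fully covered, the needed cells can be cut out of the cached item; otherwise only the overlapping part could be reused and the rest would have to be computed.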
Reuse of Full Query Result ^^^^^^^^^^^^^^^^^^^^^^^^^^ Cached results can be used in several situations: - **Exact match:** An expression result is already in the cache. The result will be used, no further evaluation of the corresponding expression is necessary. - **Subexpression match:** An expression in the incoming query contains a subexpression which has been evaluated earlier and, hence, is in the cache. The cached subexpression result will be pasted into the expression, thereby reducing the computational effort needed. - **Partial match:** An expression in the incoming query contains a matching subexpression, but with only partial overlap in the domain of the cached array. The cached result will be used as much as possible, the non-cached cells of the expression will be computed. Cache Rule Concepts ------------------- Caching can be controlled through so-called *cache rules*. These define patterns of (sub-) expressions to be cached whenever they occur. Cache rules consist of regular expressions over the query language functions, so-called *query rules*, together with variable bindings, here called *argument rules*. For example, the following cache rule defines that the result of every application of ``log()`` to the array identified by OID 1234 should be kept in cache: :: Rule N: log(x) x=1234 Below the concepts are explained in turn. Query Rule ^^^^^^^^^^ A query rule is a string representation of some query subexpression occurring in a ``select`` or ``where`` clause. It consists of concrete function invocations and wildcards. Concrete function invocations are written as they would be written in rasql. These function invocations may contain variables or subexpressions; in case of variables, the concrete binding is done in the Argument Rule described below. Operator wildcards allow expressing any position of an operator in some nested expression.
The underscore symbol (``_``) is used as a wildcard symbol; it matches any number of nested invocations of any function supported by rasql. The query rule syntax is given by the following rule set: - (empty string) This matches any query expression, i.e.: all expression results (including all subexpressions) will be cached. Such a rule should be defined very consciously as it will cause a massive cache utilization. - ``_`` Same as above: matches any expression. - ``f(_)`` where ``f`` is a function symbol defined in rasql. This matches exactly a (sub-) expression log(x), such as in :: select log(x) from x In the following query, ``log(x)`` will be cached / can be reused from cache: :: select abs(log(x)) from x Arity of the function (its number of parameters) is ignored, i.e., a single ``_`` represents any para­meter list. The function symbol can represent a scalar as well as an induced operation. For example, ``log(_)`` matches ``log()`` invocations with any argument, be it a concrete variable or a subexpression itself; - it matches: ``log(x)`` and ``log(x+abs(y))`` - it does **not** match: ``log(x)+log(y)`` because the topmost oper­ation is an addition. See below for binding variables to concrete array OIDs. - ``( _ op _ )`` where ``op`` is an infix function symbol defined in rasql. This symbol can represent a scalar as well as an induced operation. For example, - it matches: ``(x+(y*z))`` and ``(log(x)+log(y))`` - it does **not** match: ``(x*(y+z))`` - ``( case _ end )`` This matches all case expressions, for example: :: case when x > 0 then 1 when x < 0 then -1 else 0 end - ``( marray _ )`` This matches all marray expressions. ``marray _ values f()`` This matches all marray expressions containing an invocation of ``f()``. - ``( condense _ )`` This matches all condense expressions. - All of the above expressions can be nested. Anytime a rule matches the corresponding expression result gets cached. 
In case of nested functions this may mean that several rules match; in this case, each match will get cached, even if they are part of a more encompassing rule. For example, consider the following rules: :: Rule 1: log(_) Rule 2: ( _ + _ ) In this setup, in an expression ``log(x+y)`` the results of both ``x+y`` and ``log(x+y)`` get cached. Argument Rules ^^^^^^^^^^^^^^ An argument rule is attached to a query rule with the purpose of specifying further which expression results should be cached. An argument rule consists of a list of variable/OID pairs where the variables must occur in the query rule: :: var1 = oid1, var2 = oid2, ..., varN = oidN Example: ``log(x) x=1234`` This rule will only fire when a ``log()`` operator is applied to the array with OID 1234. Rule Evaluation ^^^^^^^^^^^^^^^ During evaluation of a rule, a matching of the Query Rule is done based on the concrete settings of the Argument Rule (if any). Results of expressions found this way will be put into the cache. For example, assume that collection ``C`` has two arrays with object identifiers 123 and 456 and collection ``D`` has two arrays with object identifiers 78 and 90. The query ``select C+D from C, D`` will yield four results, as it will be executed for each pair of objects from ``C`` and ``D``. Using an argument rule one can restrict which of these four results will be put into the cache: - If there is no argument rule then the overall rule matches all four results of the query; - If there is one argument rule ``C=123`` then the overall rule matches those two results of the query where the object addressed is involved; - If there are two argument rules ``C=123, D=78`` then the overall rule matches only one of the query results, that is: the combination in which both array objects addressed are participating; - If there are two argument rules ``C=123, D=123`` then the overall rule does not match any of the results, as the object identified by 123 will never occur in a ``D`` position.
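The four cases above can be reproduced with a minimal sketch. It only models the selection logic described in this section and makes no claim about how rasdaman evaluates rules internally; the function ``matching_results`` and the dictionary representation of collections and argument rules are assumptions made for illustration.

```python
from itertools import product


def matching_results(collections, argument_rule):
    """Return the object combinations selected by an argument rule.

    collections:   dict mapping a variable name to the OIDs in that collection
    argument_rule: dict mapping a variable name to the required OID (may be empty)
    """
    names = list(collections)
    matches = []
    # The query is evaluated once per combination of objects, as in "select C+D from C, D".
    for combo in product(*(collections[name] for name in names)):
        binding = dict(zip(names, combo))
        # A combination matches if every variable/OID pair of the rule is satisfied.
        if all(binding.get(var) == oid for var, oid in argument_rule.items()):
            matches.append(combo)
    return matches


collections = {"C": [123, 456], "D": [78, 90]}
for rule in ({}, {"C": 123}, {"C": 123, "D": 78}, {"C": 123, "D": 123}):
    print(rule, "->", matching_results(collections, rule))
```

An empty rule selects all four combinations, while ``D=123`` can never be satisfied because OID 123 only occurs in collection ``C``.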
Cache Management ---------------- For cache maintenance, the ``rascontrol`` syntax is extended with additional statements. As usual, these can be put into the ``rasmgr.conf`` configuration file or issued through an interactive ``rascontrol`` shell. Periodically ``rasmgr`` performs cache maintenance which involves checking correctness of all cache data, removing invalid records, and releasing memory if more space is needed. Upon termination of ``rasmgr``, the whole cache is released. Without any cache control command (see :ref:`cache-control`) the cache remains disabled. How to Find Cache Rules ----------------------- The following method helps to find suitable cache rules. The rascontrol commands used here are documented in :ref:`cache-control`. - add a match-all-rule: :: define cache rule -query "_" - execute the query under consideration (e.g., using the rasql command-line tool) - inspect the cache to see how the cache component interprets the query: :: list cache - if the query as such should be cached, define a rule by copy-pasting the query expression string listed. This will cache exactly that expression. If a more general pattern is to be defined, replace too specific parts of the query string listed by an underscore and add the resulting expression as a new cache rule. Remove the match-all-rule, rerun the query and check whether cache performance is as expected. ***************** Access Interfaces ***************** Rasdaman services can be invoked in several ways: through command line, through Web services, and through C++ and Java APIs. Command Line Tools ================== Queries can be submitted to the command line tool ``rasql``. Complete control over the server is provided through several utilities, in particular ``rasmgr``; see :ref:`sec-rascontrol-invocation` for details. All tools can communicate with local and remote rasdaman servers (current exception: ``rasdl``). Web Services ============ Several Web services are available with rasdaman.
They are implemented as servlets, hence independent from the array engine and only available if started in a servlet container such as Tomcat or Jetty. They can be accessed under the common context path ``/rasdaman``. The corresponding war files by default are located in the system Tomcat webapps directory ``/var/lib/tomcat/webapps``. rasql Queries ------------- Submission of rasql queries is possible through path ``/rasdaman/rasql``. This requires deployment of war file ``rasdaman.war``. Invocation syntax ^^^^^^^^^^^^^^^^^ The request has three mandatory parameters: +--------------+------------------------------------------------------------+ | ``username`` | rasdaman login name under which the query will be executed | +--------------+------------------------------------------------------------+ | ``password`` | password corresponding to the login | +--------------+------------------------------------------------------------+ | ``query`` | rasql query string, properly encoded for URI embedding | +--------------+------------------------------------------------------------+ Example ^^^^^^^ :: www.acme.com/rasdaman/rasql ? username=rasguest & password=rasguest & query=select%20encode(mr,"png")%20from%20mr Geo Web Services ---------------- A series of geo Web services is available at the following endpoints: * Web service for directly submitting rasql queries, and receiving results: ``/rasdaman/rasql`` * Geo Web Services based on the interface standards of the Open Geospatial Consortium (OGC Web Services, OWS): - ``/rasdaman/ows/wms`` OGC Web Map Service (WMS) - ``/rasdaman/ows/wcs`` OGC Web Coverage Service (WCS) suite - ``/rasdaman/ows/wcps`` OGC WCPS (deprecated, now with WCS) - ``/rasdaman/ows/wps`` OGC Web Processing Service (WPS) This requires deployment of war file ``rasdaman.war``. * A Coordinate Reference System (CRS) Resolver service, SECORE (which is identical to the one deployed by OGC), is available under path ``/def``.
This path reflects the OGC resolver architecture where ``www.opengis.net/def/crs`` is the branch for CRSs served by SECORE. This requires deployment of war file ``def.war``. The diagram below illustrates the OGC service architecture of rasdaman: .. code-block:: text clients read: read: +-----------------+ | | GetCapabilities select ... | +-----------+ | DescribeCoverage | |3rd party | | | +-----------+ | GetCoverage | | ProcessCoverage | +-----------+ | GetMap | |ws client | | +---------+ +---------+ | +-----------+ | +-------------> |petascope| +--------> |rasserver| | | +---------+ +---------+ | +-----------+ | write: write: | |wcst_import| | | +-----------+ | InsertCoverage create type/coll | | UpdateCoverage insert,update,delete +-----------------+ DeleteCoverage drop type/coll Rasdaman Web Admin Tools [RE] ----------------------------- The rasdaman Web administration interface contains several browser-based tools for server administration available at endpoint ``/rasdaman/admin``, e.g. :: https://yourserver:8080/rasdaman/admin When visiting this endpoint, a login form will require entering a valid username and password. .. note:: Currently, one admin account is supported, by specifying the ``petascope_admin_user`` and ``petascope_admin_password`` settings in ``petascope.properties``. On successful login, the admin dashboard is shown with the following components: * :ref:`Web rascontrol <web-rascontrol>`: exposes partial functionality of the command-line ``rascontrol`` tool; in particular, it allows stopping and starting individual rasdaman servers, and checking their status in real time. * :ref:`Statistic collection <statistic-collection>`: a reporting tool that allows monitoring incoming requests to petascope, with flexible aggregation and filtering capabilities. .. _web-rascontrol: Web rascontrol ^^^^^^^^^^^^^^ This is a web application which provides part of the rascontrol functionality.
As such it is a convenience interface which is not essential for operating rasdaman; it is just as well possible to manage rasdaman exclusively by way of the command-line ``rascontrol`` and the rasdaman service script. :numref:`fig-rascontrol-web` shows a sample screenshot of the tool. .. _fig-rascontrol-web: .. figure:: media/inst-guide/rascontrol_web.png :align: center :width: 100% Web rascontrol screenshot Presently the following actions, or commands, are possible (right-most column): - Start this server. - Stop this server. This will only be performed if the server is idle at that moment; a busy server process with an open transaction will not react. - Kill this server. This will kill the server immediately, irrespective of its state. Any open transaction will be lost. Any error messages will be displayed in the top message line. .. note:: Currently it is not possible to start or stop the *whole* rasdaman system via this tool -- technically, rasmgr needs to be started and stopped via command line. .. _statistic-collection: Request statistic interface ^^^^^^^^^^^^^^^^^^^^^^^^^^^ This is a reporting tool which allows filtering and aggregating statistics information about incoming requests to petascope services (WCS, WMS, WCPS, rasql). :numref:`fig-statistic-web` shows a sample screenshot of the tool. .. _fig-statistic-web: .. figure:: media/inst-guide/stats_web.png :align: center :width: 100% Statistic collection is disabled by default in ``petascope.properties`` by setting ``stats_time_resolution`` to **empty**. One can enable this feature by specifying a valid time resolution (one of ``day`` / ``hour`` / ``minute`` / ``second``), which determines the smallest interval for which request statistics are aggregated and stored.
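To make the role of the time resolution more tangible, the sketch below buckets a few made-up requests by interval and sums them per country, service, and coverage name. It is a model for illustration, not petascope code: the record layout, the ``aggregate`` function, and the sample values are all assumptions.

```python
from collections import defaultdict
from datetime import datetime

# Fields reset to zero when truncating a timestamp to the given resolution.
TRUNCATE = {
    "day": dict(hour=0, minute=0, second=0, microsecond=0),
    "hour": dict(minute=0, second=0, microsecond=0),
    "minute": dict(second=0, microsecond=0),
    "second": dict(microsecond=0),
}


def aggregate(requests, resolution):
    """Sum evaluation time, response size and request counts per interval and key."""
    totals = defaultdict(lambda: {"time_ms": 0, "bytes": 0, "ok": 0, "failed": 0})
    for ts, country, service, coverage, time_ms, size, ok in requests:
        bucket = ts.replace(**TRUNCATE[resolution])  # start of the interval
        entry = totals[(bucket, country, service, coverage)]
        entry["time_ms"] += time_ms
        entry["bytes"] += size
        entry["ok" if ok else "failed"] += 1
    return dict(totals)


# Hypothetical sample requests: (timestamp, country, service, coverage, ms, bytes, success)
sample = [
    (datetime(2024, 1, 1, 12, 0, 5), "Germany", "WCS", "cov1", 120, 2048, True),
    (datetime(2024, 1, 1, 12, 0, 40), "Germany", "WCS", "cov1", 80, 1024, False),
    (datetime(2024, 1, 1, 12, 1, 2), "Germany", "WCS", "cov1", 50, 512, True),
]
for key, value in aggregate(sample, "minute").items():
    print(key, value)
```

With resolution ``minute`` the first two requests fall into the same interval and are summed, while the third starts a new record; with ``hour`` all three would collapse into one.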
When enabled, the following information is collected and stored in petascopedb for each time interval: - country from which the request originated; for this purpose GeoLite2 is used, a database file which allows resolving a country name from an external IP address. Creative Commons License from MaxMind. The following rules apply in special cases: - if the request is made from localhost, then the country will be set to ``"Localhost"``. - if the country cannot be resolved from the request IP, it will be set to ``"Unknown"``. - service (WCS, WCPS, WMS, rasql) - coverage name if applicable: WCS ``DescribeCoverage`` / ``GetCoverage`` and WMS ``GetMap``; otherwise, the following rules apply: - WCPS query referencing one or more coverages: only the first coverage name in the query is considered. - WCS ``GetCapabilities`` and WMS ``GetCapabilities``: the coverage name is set to ``"GetCapabilities"``. - rasql queries: the coverage name is set to ``"ows"``. - time in milliseconds to evaluate all requests - the total size in bytes of all responses - the number of successful and failed requests For example, if time resolution is set to ``minute``, then within one minute (between 0 and 59 seconds), petascope will sum evaluation time, response size, and number of successful and failed requests for each unique triple (country, service, coverage name). By the end of each time interval, the collected data will be flushed to the database and cleared for the next interval. APIs ==== Programmatic access is available through self-programmed code using the C++ and Java interfaces; see the C++ and Java Guide for details. .. _sec-server-administration: ********************* Server Administration ********************* This section explains how to start up and shut down servers, as well as how to monitor and influence server state. It is recommended to first study the previous section so as to understand server administration terminology used here.
General Procedure ================= ``rasmgr`` vs. ``rascontrol`` ----------------------------- It is important to distinguish between the manager, ``rasmgr``, and its control front-end, ``rascontrol``. The manager runs as a background process, supervising activity of local (and possibly remote) rasdaman servers. Interaction between user (i.e., administrator) and the manager takes place through the interactive control front end. In the sequel, it is first described how to launch the manager ``rasmgr``, then ``rascontrol`` commands are detailed. Important Security Note ----------------------- To remain compatible with older rasdaman versions, clients use login "rasguest" / password "rasguest" by default (i.e., when no user and password are explicitly set by the application). In the distribution configuration, this user is defined to have read-only access to the databases - in plain words, * According to the default configuration, * users can access, * but not manipulate databases * without authentication. Therefore, the administrator is strongly urged to adapt authentication settings to the local security policy before switching databases online. See :ref:`sec-users-rights` to learn more about user management mechanisms. .. _sec-running-manager: Running the Manager =================== Manager Startup --------------- Starting up the rasdaman system is done by invoking the rasdaman manager, ``rasmgr``, from a shell under the ``rasdaman`` operating system login. Usually the manager will be sent to the background: :: rasmgr & Starting ``rasmgr`` is the only direct action to be done on it. Any further administration is performed using ``rascontrol``. Note that, unless a server configuration has been defined already, no rasdaman server is available just by starting the manager.
Invocation Synopsis ------------------- Manager invocation synopsis: :: $ rasmgr [--help] [--hostname h] [--port p] where --help print this help --hostname h host on which the manager process is running is accessible under name / IP address *h* (default: output of Unix command hostname) --port p manager will listen to port number *p* (default: 7001) Examples -------- To start a manager which will listen at port 7001: :: $ rasmgr --port 7001 .. _sec-rascontrol-invocation: ``rascontrol`` Invocation ========================= The manager front end, rascontrol, is a command-line interface used for rasdaman admin­istrat­ion. It allows to define the whole rasdaman system configuration, including start up and shut down of server instances and user logins and rights. To secure access to the server administration facilities, rascontrol performs a login process requesting login name and password similar to the Unix rlogin command. User name must be one of the users defined in the rasdaman authentication list (see :ref:`sec-users-rights`). 
``rascontrol`` Synopsis ----------------------- :: $ rascontrol [-h|--help] [--host *h*] [--port *n*] [--prompt *n*] [--quiet] [--login|--interactive|--execute *cmd*|--testlogin] where --host h name of the host where the manager runs (default: localhost) -h, --help this help --port n port number at which the manager listens to requests (default: 7001) --prompt n change rascontrol prompt as follows: - ``0`` - prompt '``>``' - ``1`` - prompt '``rasc>``' - ``2`` - prompt '``user:host>``' (default: 2) --quiet quiet, don't print header (default for ``--login`` and ``--testlogin``) --login print login and password, obtained from interactive input, to ``stdout``, then exit (see *Script Use* below) --interactive read login and password from environment variable ``RASLOGIN`` instead of requesting it interactively --execute cmd execute single ``*cmd*`` and exit (batch mode); all text following ``-x`` until end of line is passed as ``command``; this option implicitly assumes ``-e`` --testlogin just do a login and nothing else to check whether the login/password combination provided in the ``RASLOGIN`` variable is valid Interactive Use --------------- In interactive use, ``rascontrol`` will be invoked with the host parameter only. Following successful authentication, ``rascontrol`` accepts command line input from ``stdin``. Here is an example session (``mypasswd`` will not be echoed on screen): :: $ rascontrol Login name: *mylogin* Password: *mypasswd* mylogin:localhost> define dbh h1 -connect / mylogin:localhost> define db d1 -dbh h1 mylogin:localhost> define srv s1 -host localhost -type h -dbh h1 mylogin:localhost> up srv s1 mylogin:localhost> save mylogin:localhost> exit $ Script Use ---------- Alternatively to interactive login, user and password information can be taken from the environment variable ``RASLOGIN``. This variant is suit­able for batch scripting in conjunction with the ``-x`` option. 
The following example shows how first the ``RASLOGIN`` is set appropriately: :: $ export RASLOGIN=`rascontrol --login` \...and then a sample Unix shell script which starts all rasdaman servers defined in the system configuration, performing implicit login from the environment variable contents which has been obtained from the previous command and pasted into the shell script: :: #!/bin/bash export RASLOGIN=rasadmin:mytotallyencryptedpassword rascontrol -x up srv -all Comments in Scripts ------------------- To enhance legibility of scripts, ``rascontrol`` accepts comments in the usual shell syntax: Lines beginning with a hash sign '#' will be ignored, whatever they may contain. An example is usage in shell *here documents* (type ``man sh`` in your favourite shell for further information on this feature): :: $ rascontrol <``0``: restart rasdaman server after c requests for *c*\ =``0``: run rasdaman server indefinitely (default: *c*\ =``1000``) ``-xp options`` pass option string *options* to server upon start (default: no options, i.e., empty string) Option ``-xp`` must be the last option. Everything following "-xp" until end of line is considered to be "\ *options*\ " and will be passed, at start­up time, to the server; see :ref:`sec-server-control-options` below for the list of options available. 
Change Server Settings ---------------------- :: change srv s [-name n] -type t [-port p] [-dbh d] [-autorestart on|off] [-countdown c] [-xp options] ``s`` change settings for server *s* ``-name n`` change server name to *n* ``-port p`` change port number to *p* ``-dbh d`` new database host where the relational database server runs to which the rasdaman server connects ``-autorestart a`` for *a*\ =on: automatically restart rasdaman server after unanticipated termination for *a*\ =off: don't restart ``-countdown c`` for *c*>0: restart rasdaman server after c requests for *c*\ =0: run rasdaman server indefinitely ``-xp options`` pass option string *options* to server upon start Option ``-xp`` must be the last option. Everything following "-xp" until end of line is considered to be "\ *options*\ " and will be passed, at startup time, to the server; see Section :ref:`sec-server-control-options` below for the list of options available. Restrictions: - The server host cannot be changed. - The server name cannot be changed while the server is up. - The new settings will be used only next time the server starts. Remove rasdaman Server Definitions ---------------------------------- :: remove srv s ``s`` server name whose entry is to be deleted Remove server *s* from the definition table. It is not possible to remove a server definition while the corresponding server is up and running. Status Information ------------------ ``list srv [ s | -host h | -all ] [-p]`` ``s`` give information about server *s* ``-host h`` give information about all servers running on host *h* ``-all`` list information about all servers on all hosts (default) ``-p`` additionally list configuration information This command prints status information of the currently defined server(s); if *s* is provided, then only server *s* is listed.
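Status commands such as ``list srv -all`` lend themselves to scripted use through the batch option ``-x`` together with the ``RASLOGIN`` environment variable. The following is a minimal sketch under stated assumptions: the helper names ``rascontrol_argv`` and ``run_rascontrol`` are hypothetical, ``rascontrol`` is assumed to be on the ``PATH``, and the login string is assumed to have been obtained beforehand with ``rascontrol --login``.

```python
import os
import subprocess


def rascontrol_argv(command, host="localhost", port=7001):
    """Build the argument list for a single batch-mode rascontrol call.

    -x goes last because everything after it is passed on as the command.
    """
    return ["rascontrol", "--host", host, "--port", str(port), "-x", *command.split()]


def run_rascontrol(command, login, host="localhost", port=7001):
    """Run one rascontrol command with the login taken from RASLOGIN (assumed format)."""
    env = dict(os.environ, RASLOGIN=login)
    return subprocess.run(
        rascontrol_argv(command, host, port),
        env=env,
        capture_output=True,
        text=True,
        check=False,  # let the caller inspect returncode and stderr
    )


print(rascontrol_argv("list srv -all -p"))
```

A call like ``run_rascontrol("list srv -all", login)`` would then return the status listing in ``stdout``; it is not executed here because it needs a running manager.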
Database Hosts ============== Define Database Hosts --------------------- :: define dbh h [-connect c] ``h`` a unique symbolic database host name, usually the host machine name ``-connect c`` the connection string used to connect ``rasserver`` to the database server ``-user u`` the user name (optional) used to connect ``rasserver`` to the base DBMS server; for PostgreSQL, using this parameter automatically implies trust authentication. ``-passwd p`` the password (optional) used to connect ``rasserver`` to the base DBMS server; for PostgreSQL, using this parameter automatically implies trust authentication. Change Database Host Settings ----------------------------- :: change dbh h [-name n] [-connect c] ``h`` database host whose entry is to be changed ``-name n`` change symbolic database host name to *n* ``-connect c`` change connect string to *c* ``-user u`` the user name used to connect ``rasserver`` to the base DBMS server; using this optional parameter automatically implies ident-based authentication. ``-passwd p`` the password used to connect ``rasserver`` to the base DBMS server; using this optional parameter automatically implies ident-based authentication. The connection parameters can be changed at any time; however, the servers pick up the new values only when they are restarted. Remove Database Host Definitions -------------------------------- ``remove dbh h`` ``h`` database host name whose entry is to be deleted Remove database host *h* from the definition table. It is not possible to remove a database host definition while this database host has active servers connected to it. Status Information ------------------ ``list dbh`` List all relational database hosts currently defined. Databases ========= Databases represent the physical database itself, together with the relational database server accessing them.
It is possible to have multiple database definitions in the rasdaman server environment which are distinguished by the database host; the interpretation, then, is that the same contents (be it the same physical database or a mirrored copy) is available through relational servers running on the different hosts mentioned. In other words, when a client opens a database, the server manager can freely choose any of the database hosts on which the database indicated is defined. The pair (database, database host) must be unique. Define Databases ---------------- ``define db d -dbh db`` ``d`` define database with name *d* ``-dbh db`` set database host name to *db* Change Database Settings ------------------------ ``change db d -name n`` ``d`` database whose name is to be changed ``-name n`` change to new database name *n* Remove Database Definitions --------------------------- ``remove db d -dbh db`` ``d`` name of database to be removed ``-dbh db`` host name of database to be removed Remove definition of database *d* from the definition table. The database itself remains unchanged; it is not physically deleted. It is not possible to remove a database definition while the corresponding database has open transactions. Status Information ------------------ ``list db [ d | -dbh h | -all ]`` ``d`` give information about servers connected to database *d* ``-dbh h`` give information about all servers connected to database *d* via database host *h* ``-all`` list information about all servers connected to any known database (default) List relational database(s) defined. .. _cache-control: Query Cache Control [RE] ======================== For administrating the cache (cf. :ref:`sec-query-caching`), the rascontrol command language is extended as described below. Quick information can be retrieved with ``help cache`` in rascontrol.
Cache size ---------- Initial definition of a cache (such as in rasmgr.conf) is accomplished as follows: :: define cache -size S where ``S`` is an integer number with an optional modifier suffix of ``k`` (for kilo­bytes) ``M`` (for Megabytes) or ``G`` (for Gigabytes) or ``T`` (for Terabytes), for example: 500M. A cache in use can be resized through :: change cache -size S If this means an increase over the current cache size, more shared memory is allocated. If it means reducing the current cache then cache records get evicted according to the cache policy until the new size is reached. The cache is disabled by setting it to size 0 (zero): :: change cache -size 0 Cache rules ----------- Add a new cache rule on query expression ``Q`` and all variable bindings: :: define cache rule -query "Q" ( -arg var = val ) *Example:* The following rule establishes that the results of all query expressions be cached which are obtained from adding some logarithm result to object ``x`` (concretely identified by OID 1234). :: define cache rule -query "(x+log(_))" -arg x=1234 Cache eviction -------------- Remove a particular cache rule, identified by its position number as given by a list command: :: remove cache rule -pos p where ``p`` is a positive integer indicating the position number of the rule as printed by the list command. The sequence of rules may change dynamically, therefore it is strongly re­commended to determine the current position of a cache rule immediately before its deletion (and not rely on some earlier listing). All stored cache records not matching any remaining cache rule will be evicted from the cache. Cache inspection ---------------- Print current memory usage and all cached records: :: list cache Print all cache rules in use. 
Cache rules are numbered sequentially in no particular order: :: list cache rule Adding or deleting a rule may change the sequence completely, therefore it is strongly recommended to determine the current position of a cache rule immediately before its deletion. Cache Start-up and Shutdown --------------------------- **Cache Start** ``up cache`` Start the shared query cache: without this command caching will be turned off. This is automatically performed by ``service rasdaman start`` and ``start_rasdaman.sh``. **Cache Shutdown** ``down cache [ -force ]`` ``-force`` stop immediately without waiting for transaction end Stop the shared query cache. This is automatically performed by ``service rasdaman stop`` and ``stop_rasdaman.sh``. Server Start-up and Shutdown ============================ **Server Start** ``up srv [ s | -host h | -all ]`` ``s`` start only server *s* ``-host h`` start all servers on host *h*; this requires that a manager has been started on this host previously. ``-all`` start all servers defined; note that only those servers can be started on whose host a manager is currently running. Look up the named server(s) in the definition list, and start the specified one(s) using the previously defined individual startup parameters. At least one of the options *s*, ``-host`` *h*, and ``-all`` must be present. **Server Shutdown** ``down srv [ s | -host h | -all ] [-force] [-kill]`` ``s`` name of the server to be stopped ``-host h`` terminate all servers on host *h* ``-all`` terminate all servers ``-force`` *send SIGTERM* immediately, don't wait for transaction end ``-kill`` *send SIGKILL* immediately, don't wait for transaction end This command shuts down the indicated server(s). At least one of the options *s*, ``-host`` *h*, and ``-all`` must be present. Without ``-force`` and ``-kill``, the server is marked for shut down and will actually be terminated by sending ``SIGTERM`` after completing the current transaction.
With ``-force`` or ``-kill``, the server is terminated instantaneously; this should be handled with extreme caution, as experience shows that relational database systems react differently to such a situation: usually a running transaction is aborted (which is the desired behavior), but sometimes the running transaction is committed (most likely leaving the database in an inconsistent state). See a Unix manual for the difference between ``SIGTERM`` and ``SIGKILL`` signals. The manager on host *h* is not terminated. .. _sec-users-rights: Users and Their Rights ====================== Similarly to operating systems, rasdaman knows named users with access rights associated with them. Each rasdaman client must log in to the system under a specific login name using its specific password; this holds for database clients as well as for database administration. With each login name, a set of rights is associated which determines the set of actions admitted to the user under this login. To this end, the rasdaman administrator manages user login names (user names) equipped with a password and rights to access the databases. Attention: There is no way to retrieve a lost password! The set of known logins as well as the associated rights all are under administrator control; the ``define`` and ``remove`` commands serve to add or delete user logins, the ``change user`` command allows rights to be assigned individually to a login. In the rasdaman system's initial state after installation, user ``rasadmin`` is defined owning all possible rights (see below). A further user ``rasguest`` is defined which owns read-only access ("R") rights. For both users, the password initially is identical with the user name. It is highly recommended to change this immediately (see :ref:`sec-change-user-attributes`).
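The recommendation to replace both default passwords right after installation can be sketched as a small shell helper. This is only an illustration: ``password_change_commands`` is a hypothetical name, it merely prints ``change user``, ``save`` and ``exit`` commands, and it assumes the new passwords contain no blanks.

```shell
#!/bin/sh
# password_change_commands ADMIN_PW GUEST_PW  (hypothetical helper)
# Prints the rascontrol commands that replace the two default
# passwords and persist the change. Nothing is executed here.
# Assumption: the passwords contain no blanks or shell metacharacters.
password_change_commands() {
    printf 'change user rasadmin -passwd %s\n' "$1"
    printf 'change user rasguest -passwd %s\n' "$2"
    printf 'save\n'   # write the changed authorization data to disk
    printf 'exit\n'   # leave rascontrol
}
```

The printed commands can be reviewed first and then typed into an interactive ``rascontrol`` session. Feeding them to ``rascontrol`` on standard input, in the manner of a here document, is an assumption that should be verified on the installation at hand.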
Define New User --------------- ``define user u [-passwd p] [-rights r]`` ``u`` login name, must be unique (i.e., not yet existing) ``-passwd p`` set login password to *p* (default: user name) ``-rights r`` rights associated with this login (default: R, i.e., read-only) The user's password can be changed at any time (see :ref:`sec-change-user-attributes`). Remove User ----------- ``remove user u`` ``u`` login name to be removed The user is removed from the login list and henceforth cannot log in to the rasdaman system any more. User Rights ----------- User rights are indicated by upper case letters. They are divided into two categories: *system rights* and *database rights*. System rights apply to the whole system configuration of a server machine, whereas database rights can be specified individually for a database. The following system rights are defined: ``C`` user may change the system configuration ``A`` access control: the user may perform user management ``S`` start/stop right: the user may start and stop the system, in particular: rasdaman servers ``I`` info retrieval: the user may retrieve server status information The following database rights are defined: ``R`` user is allowed to read data (select\...from\...where) from rasdaman databases ``W`` user is granted write access (update, insert, delete) to rasdaman databases Notation of Rights ------------------ In the ``change user`` command used for user rights administration, a user's rights set is described by a *rights string*. It is built from letters denoting the rights to be granted. To revoke a right, leave out the corresponding character. To grant no rights at all, use ``-`` (minus sign). No blanks or other characters are allowed in a rights string.
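The notation rules can be checked mechanically before a rights string is handed to ``change user``. The following is a minimal sketch; ``is_valid_rights`` is a hypothetical helper, not part of rasdaman. It assumes that an empty string is not a valid rights string, and it does not check for repeated letters, since the rules above do not say how those are treated.

```shell
#!/bin/sh
# is_valid_rights STRING  (hypothetical helper, not a rasdaman tool)
# Succeeds if STRING is either "-" (no rights at all) or consists
# solely of the right letters C, A, S, I, R, W.
is_valid_rights() {
    case "$1" in
        -)           return 0 ;;  # single minus sign: grant no rights
        "")          return 1 ;;  # assumption: empty string is rejected
        *[!CASIRW]*) return 1 ;;  # blank or any other character: invalid
        *)           return 0 ;;  # only letters from CASIRW remain
    esac
}
```

A script might call it as ``is_valid_rights "$r" || echo "bad rights string: $r"`` before issuing the ``change user`` command.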
Examples of valid rights strings are: - grant all rights: ``CASIRW`` - grant read access only: ``R`` - grant no rights at all: ``-`` These are examples for *invalid* rights strings: - Blanks between rights: ``CA SIR W`` - Invalid characters I: ``AXYZS`` - Invalid characters II: ``A_+S`` .. _sec-change-user-attributes: Change User Attributes ---------------------- ``change user u [-name n | -passwd p | -rights r]`` Options: ``u`` user login to be updated ``-name n`` change user name to *n* ``-passwd p`` change password to *p* ``-rights r`` change rights of user *u* according to rights string *r* Change name of user, login password, or user rights. Status Information ------------------ ``list user [-rights]`` ``-rights`` additionally list rights assigned to each user List all user names currently defined, optionally with their rights. .. _sec-server-control-options: Server Control Options ====================== The following options can be passed to the server when it is started by the server manager using the ``up srv`` command. Option settings are defined for a particular server using the ``rascontrol`` command ``change srv -xp`` which passes the rest of the line after ``-xp`` on to the server upon starting it (see :ref:`sec-rasdaman-servers`). --enablefs store new tiles in operating system files; only relevant when PostgreSQL backend is used. --log logfile print log to *logfile.* If *logfile* is stdout, then log output will be printed to standard output. (default: ``$RMANHOME/log/rasserver``.\ *serverid.serverpid*.log) --timeout t client time out in seconds for sign-of-life signal. If no t indicated: 300 sec; if set to 0, no sign-of-life check is done. Activated only if ``--mgmntint`` is also set. 
--transbuffer b set maximum size of transfer buffer to *b* bytes (default: 4 MB = 4,194,304 bytes) --cachelimit c upper limit of cache area in bytes (default: 0) --enable-tilelocking perform tile-level locking on insert / update / delete (default: whole database is locked) Distributed Query Processing ============================ Rasdaman can form a federation network for query answering. In such a setup, ``rasmgrs`` facing congestion (i.e., all ``rasserver`` worker processes busy) will try to acquire a free server from some other ``rasmgr``'s holding in the federation. Session-based server assignment ------------------------------- As always in rasdaman, acquisition and release of server processes is done on session level: when a client opens a new connection, it gets a server assigned; when it closes the connection, this server is released and put back into the pool of available processes. Hence, for optimal load balance clients should strive to have short-running sessions and not keep open connections unduly for a long time. Federation network ------------------ The federation network is defined in a decentralized way: each ``rasmgr`` knows peers from which it accepts requests, and to which it can send re­quests. To this end, each ``rasmgr`` maintains an ``inpeer`` and ``outpeer`` list: - The ``inpeer`` list contains those hosts from which this node's ``rasmgr`` will accept requests. - The ``outpeer`` list contains those hosts which this node's ``rasmgr`` will ask for server processes on local session overflow. By manipulating these two lists administrators can exercise fine-grain security policy in a rasdaman federation network. Note that the federation connectivity graph is not necessarily symmetric: a ``ras­mgr`` may send requests to some other ``rasmgr``, but not accept re­quests, and vice versa, depending on the individual configuration. Each host individually respects these statements, there is no global ras­da­man federation configuration. 
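The asymmetry of the connectivity graph follows from the two lists: a request from one node to another goes through only if the target is in the sender's ``outpeer`` list and the sender is in the target's ``inpeer`` list. The sketch below models just this rule in plain shell. The host names ``node-a`` and ``node-b`` and the helper functions are made up for illustration; nothing here talks to a real ``rasmgr``.

```shell
#!/bin/sh
# in_list HOST "LIST": succeed if HOST occurs in the space-separated LIST.
in_list() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# can_request FROM TO "FROM_OUTPEERS" "TO_INPEERS"
# FROM may ask TO for a server process only if TO is among FROM's
# outpeers and FROM is among TO's inpeers.
can_request() {
    in_list "$2" "$3" && in_list "$1" "$4"
}

# Hypothetical two-node federation: node-a may borrow from node-b,
# but node-b neither asks node-a nor is accepted by it.
A_OUTPEERS="node-b"
A_INPEERS=""
B_OUTPEERS=""
B_INPEERS="node-a"

if can_request node-a node-b "$A_OUTPEERS" "$B_INPEERS"; then
    echo "node-a -> node-b: allowed"
fi
if ! can_request node-b node-a "$B_OUTPEERS" "$A_INPEERS"; then
    echo "node-b -> node-a: refused"
fi
```

Writing the lists down per node in this way can help when reviewing a federation setup, because each direction of every link has to be granted separately.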
Federation node addressing -------------------------- Addressing is based on hostnames, where a hostname in the sequel is one of - a domain name, resolvable by this ``rasmgr``'s host - an IP address All ``inpeer`` and ``outpeer`` statements accumulate so that host identifiers can be added and removed incrementally. Security -------- A ``rasmgr`` request for a server process on another host is treated by the incoming host in the same way as any such incoming client request. The requesting ``rasmgr`` authenticates via the login and password which the originating client used for authenticating against rasdaman in the first place. This implies that a client approaching such a federation must be known in all federation nodes. See :ref:`sec-users-rights` for details on users and the various permissions they can have on a database. If neither any ``inpeer`` nor any ``outpeer`` is defined (either interactively through ``ras­control`` or by way of settings in ``rasmgr.conf``) then this ras­da­man instance will act completely standalone and will neither send nor accept peer requests. Define peers ------------ :: define inpeer hostname ``hostname`` host from which requests for rasdaman server process assignment will be accepted by this rasmgr :: define outpeer hostname [-port portnumber] ``hostname`` host from which this rasmgr may request a rasdaman server process ``portnumber`` port number at which the rasfed on that host is listening (default: 7000) List peers ---------- :: list inpeer list outpeer These commands list all currently defined inpeers and outpeers, respectively. Remove peers ------------ :: remove inpeer hostname remove outpeer hostname These commands remove hostname *hostname* listed from the list of peers. Examples -------- :: define inpeer www.acme.com define inpeer 192.168.28.10 Caveat: fluctuating IPs ----------------------- In cloud environments, IP addresses are maintained dynamically and can change for a given host between reboots. 
Hence, when growing a rasdaman federation by launching new VMs, care must be taken that the in- and outpeers receive the proper current IP address. Restrictions ------------ In the current version, the queries are distributed only if the receiving rasmgr has no locally assigned rasservers. This limitation will be removed in the next release. Miscellaneous ============= Help ---- :: help Display top level help page :: help [command] command help Display information specific to *command* (both syntax variants are equivalent) Version Information ------------------- :: list version ``version`` display rasdaman server version. Save Changes to Disk -------------------- :: save The ``save`` operation writes the current configuration and authorization values to disk. All changes done during the session thus become permanent. ``rascontrol`` Termination -------------------------- :: exit Terminates ``rascontrol``. ******** Security ******** There are several security measures available, which should be considered seriously. Among them are the access right mechanisms found in Tomcat, web server, rasdaman, and PostgreSQL. We highly recommend making use of these. For Tomcat, Web server, and PostgreSQL we refer to the pertaining documentation. For rasdaman, we recommend changing the default user passwords in rasdaman (rasguest/rasguest for read-only access, rasadmin/rasadmin for read-write and administrator access) to not run into the Oracle "Scott/tiger" trap. Even better, add separate, private users. For all these actions, the ``rascontrol`` utility is your friend. Along the same line we recommend configuring petascope access to rasdaman using a read-only login which is different from the default one provided in the ``petascope.properties`` file. The servlet is safe against SQL injection attacks - we are not aware of any means for the user to send custom queries to the PostgreSQL server or the rasdaman server.
XSRF and XSS represent no danger to the service because there is no user generated content available. The rasdaman service doesn't use cookies. ***************************** Example Database and Programs ***************************** Example Database ================ A demonstration database is provided as part of the delivery package which contains the collections and images described in the *Query Language Guide*. To populate this database, first install the system as described here in the *Installation Guide*, and then invoke ``rasdaman_insertdemo.sh`` in the ``bin`` directory. This script makes use of the example images sitting in the ``examples`` directory. It is recommended to populate this demo database first, as it occupies only marginal disk space: successful generation of this database shows overall successful rasdaman installation. Before the test programs can be used, the demo database has to be created and schema information has to be imported. The following command line creates the database *RASBASE*: :: $ rasdl --basename RASBASE --createdatabase The following imports schema information: :: $ rasdl --basename RASBASE --read examples/rasdl/basictypes.dl --insert Finally, the following line establishes the demo database (using a script from the ``bin`` directory): :: $ rasdaman_insertdemo.sh base It is not important whether the rasdaman server is running during ``rasdl`` execution; however, the server is required for the ``rasdaman_insertdemo.sh`` script, as this is a client application. Example Programs ================ Several example programs are provided in the ``c++`` and ``java`` subdirectories of ``$RMANHOME/share/rasdaman/examples``. Each directory contains a Makefile plus several ``.cc`` and ``.java`` sources, resp. Makefile -------- The ``Makefile`` serves to compile and link the sample C++ / Java source files delivered. It is a good source for hints on the how-tos of compiler and linker flags etc. .. 
note:: All programs, once compiled and linked, print a usage synopsis when invoked without parameter. ``query.cc`` ------------ Sends a hardwired query to a running rasdaman system: .. code-block:: rasql select a[0:4,0:4] from mr as a where some_cells( a[8:9,8:9] >= 0 ) In addition, it demonstrates how to work with the result set returned from rasdaman. The query can easily be changed, or even made a parameter to the program. ``Query.java`` -------------- Sends the following hardwired query if one is not provided as a parameter: .. code-block:: rasql select avg_cells( a ) from mr ``AvgCell.java`` ---------------- This program computes the average cell value from all images of a given collection on client side. Note that it requires grayscale images. A good candidate collection is ``mr`` from the demo database. *************** Troubleshooting *************** General ======= The first step in troubleshooting problems should be to look into the :ref:`server logs `. Start with checking the ``rasmgr`` and ``rasserver`` logs for any errors. If this does not provide any clues, check the ``petascope.log`` or ``catalina.out``. Manually stop rasdaman ---------------------- If stopping rasdaman fails, it may be necessary to manually stop it: :: # check the rasdaman processes still running on the system ps aux | grep ras # force kill any rasmgr process; <pid> is the number in the 2nd column # of the output from the previous command kill -9 <pid> # then try to kill rasserver processes pkill rasserver # if this fails, force kill rasservers pkill -9 rasserver Checking the server logs could provide further information on why stopping rasdaman failed in the first place.
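Reading the process id off the second column by eye can be replaced by a small filter. The sketch below is a hypothetical helper, not a rasdaman tool. It assumes the column layout of ``ps aux`` on Linux, where the command path is the 11th field, and it only prints ids, leaving the decision to kill to the administrator.

```shell
#!/bin/sh
# rasmgr_pids  (hypothetical helper)
# Reads `ps aux` output on standard input and prints the PID (2nd
# column) of every process whose executable is named rasmgr.
# Assumption: Linux `ps aux` layout with the command path in field 11.
rasmgr_pids() {
    awk '{
        n = split($11, parts, "/")   # separate directory from file name
        if (parts[n] == "rasmgr")    # exact match, so rasserver is skipped
            print $2
    }'
}
```

Typical use would be ``ps aux | rasmgr_pids`` to review the ids first; only if they look right would they be passed to ``kill -9``.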