2. Installation and Administration Guide

2.1. Preface

2.1.1. Overview

This guide provides information about how to use the rasdaman array database system, in particular: installation and system administration.

For storage of multi-dimensional array data, rasdaman can be configured to use some conventional database system (such as PostgreSQL) or use its own storage manager. For the purpose of this documentation, we will call the conventional database system to which rasdaman is interfaced the base DBMS, understanding that this base DBMS is in charge of all alphanumeric data maintained as relational tables or object-oriented semantic nets.

This guide is specific to rasdaman enterprise.

2.1.2. Audience

The information in this manual is intended primarily for database and system administrators.

2.1.3. Rasdaman Documentation Set

This manual should be read in conjunction with the complete rasdaman documentation set which this guide is part of. The documentation set in its completeness covers all important information needed to work with the rasdaman system, such as programming and query access to databases, guidance to utilities such as raswct, release notes, and additional information on the rasdaman wiki.

2.2. Installation

This page describes installation of rasdaman enterprise Debian or RPM packages. With your purchase, you have received a login to the rasdaman download area, which is needed to set up installation and updating of rasdaman packages.

Hardware & Software Requirements

It is recommended to have at least 8 GB main memory. Disk space depends on the size of the databases, as well as the requirements of the base DBMS of rasdaman chosen. The footprint of the rasdaman installation itself is around 400 MB.

Rasdaman is continuously tested on the platforms listed below. The rasdaman code was originally developed on SUN/Solaris and HP-UX, and was later ported to IBM AIX, SGI IRIX, and DEC Unix, but that was way back in the last millennium.

  • Ubuntu 20.04, 22.04, 24.04

  • Debian 12

The rasdaman engine in the packages uses embedded SQLite for managing its array metadata. The geo service component, petascope, currently relies on a PostgreSQL database by default, but can be reconfigured with an embedded H2 database instead if desired.

Licence Key

In order to run a rasdaman server you have to obtain a licence from rasdaman GmbH. This licence key encodes, among other things, the number of cores and the server’s interface name (such as “eth0”) and corresponding MAC address. The following commands are usually used to obtain this information:

# interface name + MAC address
$ ip link
# alternatively
$ ifconfig

# number of CPUs
$ nproc
# alternatively
$ cat /proc/cpuinfo

After communicating these ingredients to rasdaman GmbH in the course of a licence purchase, a licence key file will be provided which has to be stored on the machine where the rasdaman server runs.
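To gather the interface names and MAC addresses in one go, a small helper can read them from the Linux sysfs tree. This is a minimal sketch: the function name list_interfaces is hypothetical, and which interface the licence should be bound to still has to be decided by the administrator.

```shell
# Print "<interface> <MAC>" for each network interface found under a sysfs
# directory (default: /sys/class/net), skipping the loopback device.
# list_interfaces is a hypothetical helper name, not a rasdaman tool.
list_interfaces() {
    sysnet="${1:-/sys/class/net}"
    for dev in "$sysnet"/*; do
        # every interface directory exposes its MAC in the "address" file
        [ -r "$dev/address" ] || continue
        name=$(basename "$dev")
        [ "$name" = "lo" ] && continue
        printf '%s %s\n' "$name" "$(cat "$dev/address")"
    done
}

# Example usage on the machine to be licensed:
#   list_interfaces    # interface names and MAC addresses
#   nproc              # number of available cores
```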

Alternatively, for cloud deployments where the MAC address is not fixed, the licence can be verified through a verification service instead. This requires that the machine has internet access, so that it can communicate with the verification service over port 80.

Compatibility

Rasdaman community and enterprise cannot run in parallel on the same machine. If you plan to have both installations on the same machine, make sure they reside in different directories and are not active at the same time. Rasdaman databases created with rasdaman community are upwards compatible with rasdaman enterprise.

Support

For support in installing rasdaman and any other question you may contact rasdaman GmbH at www.rasdaman.com.

2.2.1. Debian-based systems

Currently the following Debian-based distributions are supported:

  • Ubuntu 20.04 / 22.04 / 24.04

  • Debian 12

2.2.1.1. Installation

  1. Copy the rasdaman licence key to /opt/rasdaman/etc, e.g.:

    $ sudo mkdir -p /opt/rasdaman/etc
    $ sudo cp rmankey /opt/rasdaman/etc
    

    Note

    This has to be done before installing rasdaman.

  2. Import the rasdaman repository public key to the apt keyring:

    $ repo="download.rasdaman.com/Download"
    # set username and password variables to your download credentials
    $ username="USERNAME"
    # a leading space before the next command keeps it out of the shell history
    # (in shells configured for this, e.g. bash with HISTCONTROL=ignorespace)
    $  password="PASSWORD"
    
    # Import the rasdaman repository public key (note the commands are different for
    # Ubuntu and Debian)
    
    # Ubuntu:
    $ wget -O - "https://$username:$password@$repo/rasdaman.gpg" | \
      sudo apt-key add -
    
    # Debian:
    $ curl -fsSL "https://$username:$password@$repo/rasdaman.gpg" | \
      sudo gpg --dearmor -o /etc/apt/keyrings/rasdaman.gpg
    

    Note

    You may need to update the ca-certificates package to allow SSL-based applications (e.g. apt-get update or curl) to check for the authenticity of SSL connections:

    $ sudo apt-get install ca-certificates
    
  3. Add the rasdaman packages repository to apt:

    • stable: these packages are only updated on stable releases of rasdaman, and hence recommended for operational production installations.

      $ . /etc/os-release  # provides $VERSION_CODENAME
      $ echo "deb [arch=amd64] https://$repo/deb $VERSION_CODENAME stable" \
        | sudo tee /etc/apt/sources.list.d/rasdaman.list
      
    • testing: updated more frequently with beta releases, so it is aimed at feature testing in less-critical installations.

      $ . /etc/os-release  # provides $VERSION_CODENAME
      $ echo "deb [arch=amd64] https://$repo/deb $VERSION_CODENAME testing" \
        | sudo tee /etc/apt/sources.list.d/rasdaman.list
      
  4. Add the login credentials for the rasdaman packages repository:

    # note: $username and $password were defined in step 2.
    $ echo "machine download.rasdaman.com login $username password $password" \
      | sudo tee /etc/apt/auth.conf.d/rasdaman.conf
    # make sure that only the root user can read/write this file
    $ sudo chmod 600 /etc/apt/auth.conf.d/rasdaman.conf
    
  5. rasdaman can be installed now:

    $ sudo apt-get update
    
    # check CPU SIMD capabilities
    $ grep flags /proc/cpuinfo | head -n1 | grep -o -E '(sse|avx)[^ ]*'
    
    # install one of rasdaman-avx512, rasdaman-avx2, rasdaman-avx, rasdaman
    # in that order, depending on what SIMD extensions are supported by your CPU;
    # e.g. if you see avx512* in the output, then install rasdaman-avx512, if
    # you don't see avx512 but see avx2 then install rasdaman-avx2, etc.
    $ sudo apt-get install rasdaman-<simd>
    

    If during the install you get a prompt like the one below, type N (default option):

    Configuration file `/etc/opt/rasdaman/petascope.properties'
     ==> Modified (by you or by a script) since installation.
     ==> Package distributor has shipped an updated version.
       What would you like to do about it ?  Your options are:
        Y or I  : install the package maintainer's version
        N or O  : keep your currently-installed version
          D     : show the differences between the versions
          Z     : start a shell to examine the situation
     The default action is to keep your current version.
    *** petascope.properties (Y/I/N/O/D/Z) [default=N] ?
    

    If you are automating the installation (in a script for example), you can bypass this prompt with an apt-get option as follows:

    $ sudo apt-get -o Dpkg::Options::="--force-confdef" install -y rasdaman
    

    You will find the rasdaman installation under /opt/rasdaman/. Finally, to make rasql available on the PATH for your system user:

    $ source /etc/profile.d/rasdaman.sh
    
  6. Check that the rasdaman server can answer queries:

    $ rasql -q 'list collections on localhost' --out string
    

    Typical output:

    rasql: rasdaman query tool v1.0, rasdaman v10.0.0 -- generated on 26.02.2020 08:44:56.
    opening database RASBASE at localhost:7001...ok
    Executing retrieval query...ok
    Query result collection has 0 element(s):
    rasql done.
    
  7. Check that petascope is initialized properly, typically at this URL:

    http://localhost:8080/rasdaman/ows
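The SIMD check in step 5 can be scripted when the installation is automated. The following is a minimal sketch: pick_rasdaman_package is a hypothetical helper name, and the mapping simply follows the order given in that step (avx512, then avx2, then avx, then the plain package).

```shell
# Print the rasdaman package name matching the CPU flags passed as arguments.
# pick_rasdaman_package is a hypothetical helper, not part of rasdaman.
pick_rasdaman_package() {
    flags=" $* "   # pad with spaces so whole flag names can be matched
    case "$flags" in
        *" avx512"*) echo "rasdaman-avx512" ;;   # any avx512* flag
        *" avx2 "*)  echo "rasdaman-avx2" ;;
        *" avx "*)   echo "rasdaman-avx" ;;
        *)           echo "rasdaman" ;;
    esac
}

# Example usage on the target machine:
#   flags=$(grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2)
#   sudo apt-get install "$(pick_rasdaman_package $flags)"
```

Passing the flags as arguments rather than reading /proc/cpuinfo inside the function keeps the selection logic easy to check on a machine other than the target.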
    

2.2.1.2. Updating

The packages are updated whenever a new rasdaman version is released. To update your installation:

$ sudo apt-get update
$ sudo service rasdaman stop
$ sudo apt-get install rasdaman

Note

You may need to update the ca-certificates package to allow SSL-based applications like wget/curl to check for the authenticity of SSL connections:

$ sudo apt-get install ca-certificates

2.2.2. RPM-based systems

Currently no RPM-based distributions are supported.

If an RPM-based OS must be used, then one way to install rasdaman is to set up the latest Ubuntu LTS in a VM or a Docker container and install rasdaman in it.

2.2.3. Customizing the package installation

When installing or updating rasdaman from the official packages, the process can be optionally customized with an installation profile (see example installer configuration).

  • To customize when installing rasdaman for the first time, it is necessary to first download the package install profile from here.

  • When updating an existing rasdaman installation, you can find the default package install profile in your installation at /opt/rasdaman/share/rasdaman/installer/profiles/package/install.toml.

Download / copy the install.toml file to some place, e.g. $HOME/rasdaman_install.toml, and make any desired changes to it before installing or updating rasdaman. Make sure that the RAS_INSTALL_PATH environment variable is set to point to the custom profile, e.g.

export RAS_INSTALL_PATH="$HOME/rasdaman_install.toml"

When you install or update rasdaman afterwards, the configuration process will take the custom profile into account instead of the default one.
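The copy-and-export steps above can be wrapped in a small helper, sketched below. The function name use_custom_profile is hypothetical. The sketch also assumes that sudo must be told to pass the variable on to the package scripts; whether this is needed depends on the local sudoers policy.

```shell
# Copy an install profile to a custom location and point RAS_INSTALL_PATH at it.
# use_custom_profile is a hypothetical helper; both paths come from the caller.
use_custom_profile() {
    src="$1"
    dest="$2"
    if [ ! -f "$src" ]; then
        echo "install profile not found: $src" >&2
        return 1
    fi
    cp "$src" "$dest" || return 1
    export RAS_INSTALL_PATH="$dest"
}

# Example usage when updating an existing installation:
#   use_custom_profile \
#     /opt/rasdaman/share/rasdaman/installer/profiles/package/install.toml \
#     "$HOME/rasdaman_install.toml"
#   "${EDITOR:-vi}" "$RAS_INSTALL_PATH"    # make the desired changes
#   # assumption: sudo would otherwise drop the variable
#   sudo --preserve-env=RAS_INSTALL_PATH apt-get install rasdaman
```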

2.3. Running rasdaman

This section provides a high-level overview of how to start and stop rasdaman and petascope, monitor them, and configure them for typical usage.

Most of the time the information presented here is sufficient for operating a rasdaman service; for deeper understanding on how it works behind the scenes, check the Server Administration section.

2.3.1. Service Control

2.3.1.1. rasdaman

A rasdaman service script allows starting and stopping rasdaman, e.g.

$ service rasdaman start
$ service rasdaman stop
$ service rasdaman force-stop
$ service rasdaman status

It can be similarly referenced with systemctl, e.g.

$ systemctl start rasdaman
$ systemctl stop rasdaman
$ systemctl status rasdaman

The service script can be customized by updating environment variables in /etc/default/rasdaman (create the file if it does not exist).

See also the dedicated pages on configuration and log files and administration.

2.3.1.2. petascope

Check this section on how to start / stop the petascope component of rasdaman.

2.3.2. Service monitoring

To help with monitoring the health of a running rasdaman service, a watch_rasdaman.sh script is provided in /opt/rasdaman/bin. It checks rasdaman by sending a test rasql query, and petascope by sending a test WCS GetCapabilities request. If a problem is detected in the response, then rasdaman and/or petascope will be restarted, unless this is prevented via the appropriate options. To support the restart actions, it should be executed as root or as a user that has sudo rights. Various information is logged to stdout, as well as to /opt/rasdaman/log/watch_rasdaman.sh.log. In case of problems, the script can be configured to send an email notification.

To see usage details and a list of all options, execute watch_rasdaman.sh --help; in short:

watch_rasdaman.sh [ --email-config <C> ] [ --petascope-endpoint <E> ]
                  [ --no-restart-rasdaman ] [ --no-restart-petascope ]
                  [ --rmanhome <path> ] [ --custom-check-script <path> ]

It is recommended to execute it regularly with a cron job, e.g. every hour:

$ sudo su # switch to root user
$ crontab -e

0 * * * * /opt/rasdaman/bin/watch_rasdaman.sh --email-config ~/.email.cfg

Note

Executing the script as root is safe; the only system-modifying actions it performs are logging information in /opt/rasdaman/log/watch_rasdaman.sh.log, potentially restarting rasdaman, and restarting tomcat if external servlet container deployment is configured in petascope.properties.

2.3.3. Configure rasdaman

Rasdaman is a multi-server multi-user system. The available server processes must be configured initially, which is done in the file $RMANHOME/etc/rasmgr.conf. As distributed, this configuration defines ten server processes with names such as N1. If this is fine then you can just leave it as it is. If you want to change it by modifying server startup parameters or increasing the number of available server processes, see rascontrol Invocation for details on how to do this.
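To see which server processes a given configuration defines, the file can be scanned with a short helper. This is a sketch under an assumption: it expects each server to be introduced by a line of the form "define srv <name> ...", which should be checked against the actual rasmgr.conf. The function name list_rasservers is hypothetical.

```shell
# List the server names defined in a rasmgr.conf-style file.
# Assumption: each server is introduced by a line "define srv <name> ...".
# list_rasservers is a hypothetical helper, not a rasdaman tool.
list_rasservers() {
    conf="${1:-${RMANHOME:-/opt/rasdaman}/etc/rasmgr.conf}"
    awk '$1 == "define" && $2 == "srv" { print $3 }' "$conf"
}

# Example usage:
#   list_rasservers            # names of all configured servers
#   list_rasservers | wc -l    # how many servers are configured
```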

2.3.4. Configure petascope

Petascope is the geo Web service frontend of rasdaman. It adds geo semantics on top of arrays, thereby enabling regular and irregular grids based on the OGC coverage standards.

To implement the geo semantics, petascope uses a relational database for the geo-related metadata. Currently, PostgreSQL and H2 / HSQLDB are supported. The package post-install script will automatically set up PostgreSQL for use by petascope. The steps approximately performed by the script are listed below.

The default setup can be changed in the petascope.properties configuration file.

2.3.4.1. PostgreSQL

PostgreSQL is automatically configured when rasdaman is installed, so doing the below is not usually necessary; we list the steps as documentation of how PostgreSQL is configured by default:

  1. If postgres has not been initialized yet:

    $ sudo service postgresql initdb
    

    If the output is ‘Data directory is not empty!’ then this step is skipped.

  2. Password-based (md5) access to PostgreSQL for the petascope user is enabled by adding the below configuration before the ident lines to /etc/postgresql/9.4/main/pg_hba.conf on Debian 8, or /var/lib/pgsql/data/pg_hba.conf on CentOS 7 (the exact path depends on the distribution and PostgreSQL version):

    host    all   petauser   localhost       md5
    host    all   petauser   127.0.0.1/32    md5
    host    all   petauser   ::1/128         md5
    
  3. Reload PostgreSQL so that the new configuration will take effect:

    $ sudo service postgresql reload
    
  4. Add a petascope user, for example petauser, to PostgreSQL:

    $ sudo -u postgres createuser -s petauser -P
    > enter password
    

    In $RMANHOME/etc/petascope.properties set the spring.datasource.username/spring.datasource.password and metadata_user/metadata_pass options according to this user / password. In the automatic package setup, the password is randomly generated.

  5. Copy /opt/rasdaman/share/rasdaman/war/rasdaman.war to the Tomcat webapps directory (/var/lib/tomcat/webapps on CentOS 7) and restart Tomcat.

    Following successful deployment, petascope accepts OGC W*S requests at URL http://localhost:8080/rasdaman/ows.
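A quick way to check that the deployed endpoint responds is to request the WCS capabilities document. The sketch below only builds the request URL. The function name capabilities_url is hypothetical, and the version value 2.0.1 is an assumption that may need adjusting to the deployment.

```shell
# Build a WCS GetCapabilities URL for a petascope endpoint.
# capabilities_url is a hypothetical helper; version 2.0.1 is an assumption.
capabilities_url() {
    endpoint="${1:-http://localhost:8080/rasdaman/ows}"
    printf '%s?service=WCS&version=2.0.1&request=GetCapabilities\n' "$endpoint"
}

# Example usage (prints the beginning of the XML response, fails on HTTP errors):
#   curl -fsS "$(capabilities_url)" | head -n 5
```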

2.3.4.2. H2 / HSQLDB

To alternatively set up H2 / HSQLDB for use by petascope instead of PostgreSQL:

  1. Create a directory that will host petascopedb and the H2 driver:

    $ mkdir /opt/rasdaman/geodb
    
  2. Make sure the user running the webserver serving petascope can read/write to the folder above. For example, for the Tomcat webserver, which runs as the tomcat user:

    $ sudo chown -R tomcat: /opt/rasdaman/geodb
    

    However, if embedded deployment is enabled in petascope.properties, then the owner should be the rasdaman user which runs rasdaman:

    $ sudo chown -R rasdaman: /opt/rasdaman/geodb
    
  3. Download the driver and place it in the created directory. For example, download an H2 driver:

    $ cd /opt/rasdaman/geodb
    $ wget https://repo1.maven.org/maven2/com/h2database/h2/1.4.200/h2-1.4.200.jar
    
  4. Configure database settings in petascope.properties file, see details.

  5. Restart the webserver running petascope (or rasdaman if embedded tomcat).

2.3.4.3. SSL/TLS configuration

Transport Layer Security (TLS) and its predecessor, Secure Sockets Layer (SSL), are technologies which allow web browsers and web servers to communicate over a secured connection. To configure it for petascope and secore web applications for Tomcat, check the official guide.

2.3.5. MQTT broker connection

The rasdaman core system and the petascope geo-services component use the mosquitto MQTT broker to synchronize with each other. This is only done on Ubuntu 22.04 or later due to availability of dependency packages.

The mosquitto package is a dependency of rasdaman, and is automatically installed when rasdaman is installed. The package installation will configure the mosquitto service to allow only users with valid credentials and only local connections on the default port 1883 by updating /etc/mosquitto/mosquitto.conf, and will create a user rasdaman with a random password in /etc/mosquitto/password_file.

A broker configuration file /opt/rasdaman/etc/broker.properties will be generated as well, which provides the connection settings needed for rasdaman and petascope to connect to mosquitto:

address=tcp://localhost:1883
username=rasdaman
password=<random_password>
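Scripts that need one of these settings, for example to test the broker connection by hand, can read them from the file. The helper below is a minimal sketch for simple key=value files like the one shown above; get_property is a hypothetical name, and the sketch does not handle line continuations or escapes.

```shell
# Print the value of KEY from a key=value properties file.
# Only the first "=" separates key and value, so values may contain "=".
# get_property is a hypothetical helper, not a rasdaman tool.
get_property() {
    file="$1"
    key="$2"
    awk -v k="$key" '
        /^[ \t]*#/ { next }                       # skip comment lines
        { i = index($0, "=") }                    # position of the first "="
        i > 0 && substr($0, 1, i - 1) == k { print substr($0, i + 1); exit }
    ' "$file"
}

# Example usage:
#   get_property /opt/rasdaman/etc/broker.properties address
```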

2.4. Installed Files and Data

2.4.1. Top-level directories

As common with rasdaman, we refer to the installation location as $RMANHOME below; the default is /opt/rasdaman. The table below lists the top-level directories found in $RMANHOME after a fresh installation.

Directory  Description
---------  -----------
bin        rasdaman executables, e.g. rasql, start_rasdaman.sh, …
data       Path where the server stores array tiles as files; this directory can get big, it is recommended to make it a link to a sufficiently large disk partition.
etc        Configuration files, e.g. rasmgr.conf
include    C++ API development headers.
lib        C++ and Java API libraries.
log        rasmgr and rasserver log files.
share      Various artefacts like documentation, python/javascript clients, example data, migration scripts, etc.

2.4.2. Executables

Rasdaman executables are found in $RMANHOME/bin; the table below lists the various binaries and scripts. More detailed information on these components is provided in the Server Architecture Section.

Executables              Description
-----------              -----------
rasserver                Client queries are evaluated by a rasserver worker process.
rasmgr                   A manager process that controls rasserver processes and client/server pairing.
rascontrol               A command-line frontend for rasmgr.
directql                 A rasserver that can execute queries directly, bypassing the client/server protocol; useful for debugging.
rasql                    A command-line client for sending queries to a rasserver (as assigned by the rasmgr).
start_rasdaman.sh        Start rasmgr and the worker rasservers as configured in $RMANHOME/etc/rasmgr.conf. More details in section 2.4.2.1.
stop_rasdaman.sh         Shut down rasdaman, embedded petascope and embedded secore if enabled. More details in section 2.4.2.2.
watch_rasdaman.sh        Helper script for monitoring an operational rasdaman service. Details in section on Service monitoring.
create_db.sh             Initialize the rasdaman metadata database (RASBASE).
update_db.sh             Applies migration scripts to RASBASE.
rasdaman_insertdemo.sh   Insert three demo collections into rasdaman (used in the rasdaman Query Language Guide).
petascope_insertdemo.sh  Insert geo-referenced demo coverage in petascope.
migrate_petascopedb.sh   Applies database migrations on petascopedb. More details in section 2.4.2.3.
wcst_import.sh           Tool for convenient and flexible import of geo-referenced data into petascope. More details here.
prepare_issue_report.sh  Helps prepare a report for an issue encountered while operating rasdaman. More details here.
rasfed.jar               Federation daemon.

2.4.2.1. start_rasdaman.sh

This script starts rasdaman. Normally rasdaman is installed from packages, and instead of executing this script directly one would execute service rasdaman start. Any options to be passed on to start_rasdaman.sh can be set in /etc/default/rasdaman in this case; see more details.

To start a specific service (rasdaman, rasfed, or embedded petascope) the --service (core | rasfed | petascope) option can be used (core refers to rasmgr + rasserver only).

Since v10.0 the rasmgr port can be specified with -p, --port. Additionally, for security and usability reasons, start_rasdaman.sh will refuse to run if executed as the root user; this can be overridden if needed with the --allow-root option.

The script will use various environment variables, if they are set before it is executed:

  • RASMGR_PORT - the port on which rasmgr will listen when started, and to which client applications will connect in order to send queries to rasdaman. This variable will be overridden by the value of option --port, if specified. If neither is specified, the port defaults to 7001.

  • RASLOGIN - rasdaman admin credentials which will be used for starting rasmgr non-interactively. See more details on the format and how this setting is used here. If not set, the script defaults to using rasadmin/rasadmin credentials; see here on how to change these defaults.

  • JAVA_OPTS - options passed on to the java command when used to start the OGC frontend of rasdaman (petascope) if it is configured for embedded deployment. If not set, it defaults to -Xmx4000m

Check -h, --help for all details.
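The precedence between the --port option, the RASMGR_PORT variable, and the built-in default can be illustrated with a few lines of shell. This is only a sketch of the documented behaviour, not the script's actual code; effective_rasmgr_port is a hypothetical name.

```shell
# Resolve the rasmgr port the way the documentation describes it:
# the --port value if given, else RASMGR_PORT, else 7001.
# effective_rasmgr_port is a hypothetical illustration, not rasdaman code.
effective_rasmgr_port() {
    cli_port="${1:-}"
    if [ -n "$cli_port" ]; then
        echo "$cli_port"
    else
        echo "${RASMGR_PORT:-7001}"
    fi
}

# Corresponding invocations of the real script:
#   start_rasdaman.sh                      # listens on 7001
#   RASMGR_PORT=7010 start_rasdaman.sh     # listens on 7010
#   RASMGR_PORT=7010 start_rasdaman.sh --port 7020   # listens on 7020
```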

2.4.2.2. stop_rasdaman.sh

This script stops rasdaman. Normally rasdaman is installed from packages, and instead of executing this script directly one would execute service rasdaman stop. Any options to be passed on to stop_rasdaman.sh can be set in /etc/default/rasdaman in this case; see more details.

The script stops rasmgr, rasservers, rasfed, and petascope (if configured for embedded deployment) in the correct order with a regular TERM signal to each process; this ensures that the services exit properly. In some cases, a process may be hanging instead of exiting on the TERM signal; since rasdaman v10.0, stop_rasdaman.sh will detect and report such cases. It is prudent to then check the relevant process logs, and if it appears that there is no reason for the process hanging one can force-stop it with stop_rasdaman.sh --force, or manually do it by sending it a KILL signal (e.g. kill -KILL <pid>).

To stop a specific service the --service (core | rasfed | petascope) option can be used. Since v10.0 the rasmgr port can be specified with -p, --port.

The script will use various environment variables, if they are set before it is executed:

  • RASMGR_PORT - the port on which rasmgr was set to listen when it was started. This variable will be overridden by the value of option --port, if specified. If neither is specified, the port defaults to 7001.

  • RASLOGIN - rasdaman admin credentials which will be used for stopping rasmgr non-interactively. See more details on the format and how this setting is used here. If not set, the script defaults to using rasadmin/rasadmin credentials; see here on how to change these defaults.

Check -h, --help for all details.

2.4.2.3. migrate_petascopedb.sh

This script is used to migrate coverages imported by wcst_import, OWS Service metadata and WMS 1.3 layers. For more details see Meta Database Connectivity and Configure petascope.

There are 2 types of migration:

  1. Migrate petascopedb v9.4 or older to a newer rasdaman version. After the migration, the old petascopedb is backed up at petascope_94_backup.

  2. Migrate petascopedb v9.5 or newer to a different database name or different database (e.g. PostgreSQL to HSQLDB).

Note

The petascope Web application must not be running (e.g. in Tomcat) while migrating to a different database (type 2 above), to protect the integrity of the existing data.

The script will use various environment variables, if they are set before it is executed:

  • JAVA_OPTS - options passed on to the java command when used to start embedded petascope to migrate. If not set, it defaults to -Xmx4000m

2.4.3. Configuration files

Configurations are automatically loaded upon rasdaman start. After any modification a restart has to be performed for the change to take effect.

Server rasdaman configuration files can be found in $RMANHOME/etc:

rasmgr.conf

allows fine-tuning the rasdaman servers, e.g. number of servers, names, database connection

petascope.properties

set petascope properties, e.g. backend/rasdaman connection details, CRS resolver URLs, features

secore.properties

secore configuration

rasfed.properties

federation daemon configuration

broker.properties

settings for connecting to a mosquitto MQTT broker (details)

Logging output of petascope and secore is configured in their respective config files, while logging output of rasdaman is controlled via the below configuration files:

log-rasmgr.conf

log output of rasmgr

log-server.conf

log output of rasserver worker processes

log-client.conf

log output of client applications, e.g., rasql

rasdaman uses the Easylogging++ library for logging in its C++ components. Log properties can be configured as documented on the EasyLogging GitHub page.

The enterprise licence file rmankey is also found in the etc directory.

External, potentially relevant configuration files are:

postgresql

/var/lib/pgsql/data/{postgresql.conf,pg_hba.conf} or

/etc/postgresql/9.X/{postgresql.conf,pg_hba.conf}

tomcat

/etc/tomcat/, /etc/default/tomcat

mosquitto

/etc/mosquitto/mosquitto.conf

2.4.4. Log files

rasdaman

rasdaman server logs are placed in $RMANHOME/log/. The server components feed the following files where uid represents a unique identifier of the process, and pid is a Linux process identifier:

rasserver.<uid>.<pid>.log

rasserver worker logs: at any time there are several rasservers running (depending on the settings in rasmgr.conf) and each has a unique log file.

rasmgr.<pid>.log

rasmgr log: there is only one rasmgr process running at any time.

rasfed.log

rasfed log: there is only one rasfed process running at any time; on rasdaman restart the output from the new process is appended to the same log file.

petascope.log

petascope log if java_server=embedded in petascope.properties.

watch_rasdaman.sh.log

Log from the watch_rasdaman.sh script is appended to this file whenever it is executed.

Note

ls -ltr is a useful command to see the most recently modified log files at the bottom when debugging recently executed queries.
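When several rasserver workers are running, it is often handy to jump straight to the newest log of a given kind. The helper below is a minimal sketch built on ls -t. The name latest_log is hypothetical, and the default patterns follow the file names listed above.

```shell
# Print the most recently modified file in DIR whose name matches PATTERN.
# latest_log is a hypothetical helper, not a rasdaman tool.
latest_log() {
    dir="${1:-${RMANHOME:-/opt/rasdaman}/log}"
    pattern="${2:-rasserver.*.log}"
    # ls -t sorts newest first; $pattern is left unquoted so the shell expands it
    ls -t "$dir"/$pattern 2>/dev/null | head -n 1
}

# Example usage:
#   tail -n 50 "$(latest_log)"                                    # newest rasserver log
#   tail -n 50 "$(latest_log /opt/rasdaman/log 'rasmgr.*.log')"   # newest rasmgr log
```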

petascope & secore

The path to the petascope.log file is set in the log4j configuration section in /opt/rasdaman/etc/petascope.properties.

  • If petascope is deployed embedded as part of rasdaman, then the path must be writable by the rasdaman user; the default on rasdaman installation is log4j.appender.rollingFile.File=/opt/rasdaman/log/petascope.log.

  • If petascope is deployed in an external servlet container, by default Tomcat 9, then the path must be writable by the tomcat9 user; default is log4j.appender.rollingFile.File=/var/log/tomcat9/petascope.log.

2.4.5. Temporary files

Rasdaman stores various data temporarily in /tmp/rasdaman_* directories, in particular:

  • /tmp/rasdaman_conversion/ - format-encoded data, such as TIFF, NetCDF, etc., is in some cases temporarily stored here before decoding into rasdaman. This also always happens when encoding query processing results into some format for export. The intermediate data is removed as soon as the encoding or decoding process is finished.

    Temporarily, however, this directory can get rather large: if you export an array result that encodes into a 1GB TIFF file, then the directory will contain 1GB of data for some time; if 10 such queries run concurrently, then it may contain up to 10GB of data. For this reason we recommend checking the size of /tmp during installation, and making sure it is large enough. It is always recommended to make /tmp a separate partition, so as to prevent system-wide problems in case the filesystem is filled up with data.

  • /tmp/rasdaman_petascope/ - contains small temporary files generated during data import with the wcst_import tool.

  • /tmp/rasdaman_transaction_locks/ - during read/write query transactions, rasdaman generates various empty lock files in this directory. As the files are empty, the size of this directory is minimal.

    While rasdaman is running this directory must not be removed, otherwise it may lead to data corruption.
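The sizing advice for /tmp can be turned into a simple pre-flight check. The sketch below compares the free space reported by df with an expected peak; has_free_space is a hypothetical name, and the peak (for instance 10 concurrent exports of 1GB each, as in the example above) has to be estimated by the administrator.

```shell
# Succeed if the filesystem holding DIR has at least NEEDED_MB megabytes free.
# has_free_space is a hypothetical helper, not a rasdaman tool.
has_free_space() {
    dir="$1"
    needed_mb="$2"
    # df -Pk prints POSIX-format output in 1024-byte blocks; column 4 is "Available"
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge $((needed_mb * 1024)) ]
}

# Example usage: 10 concurrent exports of about 1GB each
#   has_free_space /tmp $((10 * 1024)) || echo "warning: /tmp may be too small"
```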

2.4.6. Demo data & programs

2.4.6.1. Example database

A demonstration database is provided as part of the delivery package which contains the collections and images described in the Query Language Guide. To populate this database, first install the system as described here, and then invoke:

$ rasdaman_insertdemo.sh

The demo database occupies marginal disk space, and is a straightforward way to show that the rasdaman installation has been successful.

2.4.6.2. Example programs

Several example programs are provided in the c++ and java subdirectories of $RMANHOME/share/rasdaman/examples. Each directory contains a Makefile plus .cc and .java sources, resp.

2.4.6.3. Makefile

The Makefile helps to compile and link the sample C++ / Java source files delivered. It is a good source of hints on compiler and linker flags.

Note

All programs, once compiled and linked, print a usage synopsis when invoked without parameters.

2.4.6.4. query.cc

Sends a hardwired query to a running rasdaman system:

In addition, it demonstrates how to work with the result set returned from rasdaman. The query can easily be changed, or made a parameter to the program.

2.4.6.5. Query.java

Sends the following hardwired query if one is not provided as a parameter:

2.4.6.6. AvgCell.java

This program computes the average cell value from all images of a given collection on client side. Note that it requires grayscale images. A good candidate collection is mr from the demo database.

2.5. Access Interfaces

Rasdaman services can be invoked in several ways: through command line, Web requests, and custom programs connecting via the C++ and Java APIs.

2.5.1. Command Line Tools

Queries can be submitted to the command line tool rasql. Complete control over the server is provided through several utilities, in particular rasmgr; see rascontrol Invocation for details. All tools can communicate with local and remote rasdaman servers.

2.5.2. Web Services

Several Web services are available with rasdaman. They are implemented as servlets, hence independent from the array engine and only available if started in a servlet container such as Tomcat or Jetty. They can be accessed under the common context path /rasdaman.

  • /rasdaman/ows exposes geo Web Services based on the interface standards of the Open Geospatial Consortium (OGC Web Services, OWS). Supported OGC standards are:

    • Web Coverage Service (WCS)

    • Web Coverage Processing Service (WCPS)

    • Web Map Service (WMS) suites

  • /rasdaman/def provides access to a Coordinate Reference System (CRS) Resolver Service, SECORE. It is identical to the one deployed by OGC, where http://www.opengis.net/def/crs is the branch for CRS served by SECORE.

  • /rasdaman/rasql provides support for submitting rasql queries and receiving results with standard HTTP requests. Requests must specify three mandatory parameters:

    username

    rasdaman login name under which the query will be executed

    password

    password corresponding to the login

    query

    rasql query string, properly encoded for URI embedding

    Example:

    http://localhost:8080/rasdaman/rasql
        ?username=rasguest
        &password=rasguest
        &query=select%20encode%28mr2%2C%22png%22%29%20from%20mr
    

Note

The rasql servlet also supports rasdaman user credentials in a basic authentication header. In this case, the username and password parameters are not required, as the credentials are extracted from the header.
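Since the query must be encoded for URI embedding, it helps to let a script do the percent-encoding. The following is a minimal sketch in plain shell that assumes ASCII input; the function names urlencode and rasql_url are hypothetical.

```shell
# Percent-encode a string for use in a URL query (ASCII input assumed).
# urlencode and rasql_url are hypothetical helpers, not rasdaman tools.
urlencode() {
    s="$1"
    out=""
    while [ -n "$s" ]; do
        rest="${s#?}"        # everything after the first character
        c="${s%"$rest"}"     # the first character itself
        case "$c" in
            [A-Za-z0-9._~-]) out="$out$c" ;;                 # unreserved: keep as-is
            *) out="$out$(printf '%%%02X' "'$c")" ;;         # others: %XX
        esac
        s="$rest"
    done
    printf '%s\n' "$out"
}

# Build a /rasdaman/rasql request URL from endpoint, user, password and query.
rasql_url() {
    printf '%s?username=%s&password=%s&query=%s\n' \
        "$1" "$(urlencode "$2")" "$(urlencode "$3")" "$(urlencode "$4")"
}

# Example usage:
#   curl -fsS "$(rasql_url http://localhost:8080/rasdaman/rasql \
#       rasguest rasguest 'select encode(mr2,"png") from mr')" -o mr2.png
```

With basic authentication the credentials can be kept out of the URL, e.g. by passing them with curl's -u option and sending only the encoded query parameter.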

[Diagram: OGC service architecture of rasdaman]

2.5.3. Rasdaman Web Admin Tools [RE]

The rasdaman Web administration interface contains several browser-based tools for server administration available at endpoint /rasdaman/admin, e.g.

http://localhost:8080/rasdaman/admin

When visiting this endpoint, a login form asks for the credentials of a valid rasdaman user who has at least one of the following privileges: PRIV_SERVER_MGMT, PRIV_OWS_STATISTICS, PRIV_USER_MGMT, PRIV_ROLE_MGMT, PRIV_TRIGGER_MGMT.

On successful login, the admin dashboard is shown with the following components:

  • Web rascontrol: exposes partial functionality of the command-line rascontrol tool; in particular, it allows stopping and starting individual rasdaman servers, and checking their status in real time.

  • Statistic collection: a reporting tool that allows monitoring incoming requests to petascope, with flexible aggregation and filtering capabilities.

  • Web access control: tools for managing local users, roles, and triggers in rasdaman.

2.5.3.1. Web rascontrol

This is a web application which provides part of the rascontrol functionality. As such it is a convenience interface which is not essential for operating rasdaman; rasdaman can just as well be managed exclusively through the command-line rascontrol and the rasdaman service script. Figure 2.1 shows a sample screenshot of the tool.

_images/rascontrol_web.png

Figure 2.1 Web rascontrol screenshot

Presently the following actions, or commands, are possible (right-most column):

  • Start this server.

  • Stop this server. This will only be performed if the server is idle at that moment; a busy server process with an open transaction will not react.

  • Kill this server. This will kill the server immediately, irrespective of its state. Any open transaction will be lost.

Any error messages will be displayed in the top message line.

The logged in user must have the following system privileges: PRIV_LIST_SERVERS to see the list of configured rasservers, and PRIV_SERVER_MGMT to be able to start/stop/kill them.

Note

Currently it is not possible to start or stop the whole rasdaman system via this tool – technically, rasmgr needs to be started and stopped via command line.

2.5.3.2. Request statistic interface

This is a reporting tool for filtering and aggregating statistics about incoming requests to petascope services (WCS, WMS, WCPS, rasql). Figure 2.2 shows a sample screenshot of the tool.

The logged in user must have the PRIV_OWS_STATISTICS system privilege to be able to see the access statistics.

_images/stats_web.png

Figure 2.2 Request statistic screenshot

Statistic collection is disabled by default: stats_time_resolution in petascope.properties is set to empty. To enable it, specify a valid time resolution (one of day / hour / minute / second), which determines the smallest interval for which request statistics are aggregated and stored. When enabled, the following information is collected and stored in petascopedb for each time interval:

  • country from which the request originated; for this purpose GeoLite2 is used, a database file which allows resolving a country name from an external IP address. GeoLite2 is provided by MaxMind under a Creative Commons license. The following rules apply in special cases:

    • if the request is made from localhost, then the country will be set to "Localhost".

    • if the country cannot be resolved from the request IP, it will be set to "Unknown".

  • service (WCS, WCPS, WMS, rasql)

  • coverage name if applicable: WCS DescribeCoverage / GetCoverage and WMS GetMap; otherwise, the following rules apply:

    • WCPS query referencing one or more coverages: only the first coverage name in the query is considered.

    • WCS GetCapabilities and WMS GetCapabilities: the coverage name is set to "GetCapabilities".

    • rasql queries: the coverage name is set to "ows".

  • requesting username. If basic header authentication is not enabled, the username is left empty.

  • time in milliseconds to evaluate all requests

  • the total size in bytes of all responses

  • the number of successful and failed requests

For example, if the time resolution is set to minute, then within one minute (between 0 and 59 seconds), petascope will sum the evaluation time, the response size, and the number of successful and failed requests for each unique triple (country, service, coverage name). At the end of each time interval, the collected data is flushed to the database and cleared for the next interval.
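The aggregation described above can be sketched as follows. This is a simplified model and not petascope's actual code: the record layout (timestamp_s, eval_time_ms, response_bytes, ok), the Totals class, and the aggregate function are assumptions made for illustration only.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Totals:
    eval_time_ms: int = 0
    response_bytes: int = 0
    succeeded: int = 0
    failed: int = 0


def aggregate(requests, resolution_s=60):
    """Sum request metrics per (interval start, country, service, coverage)."""
    buckets = defaultdict(Totals)
    for r in requests:
        # Start of the interval the request falls into, e.g. second 150 -> 120.
        interval_start = r["timestamp_s"] - r["timestamp_s"] % resolution_s
        totals = buckets[(interval_start, r["country"], r["service"], r["coverage"])]
        totals.eval_time_ms += r["eval_time_ms"]
        totals.response_bytes += r["response_bytes"]
        if r["ok"]:
            totals.succeeded += 1
        else:
            totals.failed += 1
    return dict(buckets)


# Hypothetical sample requests; two fall into the same minute and triple.
sample = [
    {"timestamp_s": 120, "country": "Localhost", "service": "WCS",
     "coverage": "test_coverage", "eval_time_ms": 40, "response_bytes": 1000, "ok": True},
    {"timestamp_s": 150, "country": "Localhost", "service": "WCS",
     "coverage": "test_coverage", "eval_time_ms": 60, "response_bytes": 500, "ok": False},
    {"timestamp_s": 185, "country": "Unknown", "service": "WMS",
     "coverage": "GetCapabilities", "eval_time_ms": 5, "response_bytes": 200, "ok": True},
]
stats = aggregate(sample)
```

With a resolution of one minute, the first two requests are summed into a single record, while the third starts a new record in the next interval.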

2.5.3.3. Web access control

The Web access control interface allows an administrator to manage users, roles, and triggers in rasdaman. The interface can be found in the Access Control section of the Web Admin Tools. For more information, see Access Control [RE].

2.5.3.3.1. User Management

The top panel contains:

  • Username textbox - identifier ([a-zA-Z_][a-zA-Z0-9_]*)

  • Password textbox - a string of printable ASCII characters (except \), of maximum length 200.

    Note

    Passwords are never shown; an admin can update the password of any user.

  • Grant/Revoke roles dropdown - a list of roles which can be granted to the user

  • Clear button - clear all textboxes and checkboxes

  • Insert button - insert a new (non-existing) user

  • Update button - update password and granted roles for an existing user

The main panel contains a table with 3 columns:

  • User Names column - list all existing users.

    Note

    System users used by petascope (highlighted in red, e.g. rasadmin and rasguest) cannot be deleted, hence there are no Delete buttons for them in the Action column.

  • Granted Role Names column - list all granted roles for corresponding users

  • Action column - contains Delete buttons for deleting the corresponding users

    _images/users_management_access_control.png

    Figure 2.3 User management web access control interface.

To see the users and their privileges, as well as perform actions such as creating / dropping users and granting them privileges, the logged in user should have the following system privileges granted: PRIV_LIST_USERS, PRIV_USER_MGMT, PRIV_LIST_ROLES, PRIV_GRANT, and PRIV_REVOKE.
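The username and password rules listed above can be expressed as a small validation sketch. The function names are hypothetical, and two points are assumptions not stated in the text: the space character is counted as printable ASCII, and an empty password is not rejected.

```python
import re

# Identifier pattern as given for the Username textbox.
IDENTIFIER = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*")
MAX_PASSWORD_LENGTH = 200


def is_valid_username(name: str) -> bool:
    """True if the whole name matches the identifier pattern."""
    return IDENTIFIER.fullmatch(name) is not None


def is_valid_password(password: str) -> bool:
    """True for at most 200 printable ASCII characters without a backslash."""
    if len(password) > MAX_PASSWORD_LENGTH:
        return False
    # Assumption: printable ASCII is the range from space (0x20) to '~' (0x7E).
    return all(" " <= ch <= "~" and ch != "\\" for ch in password)
```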

2.5.3.3.2. Role Management

The top panel contains:

  • Role name textbox - identifier ([a-zA-Z_][a-zA-Z0-9_]*); cannot start with PRIV_ as this prefix is reserved for system privileges

  • Grant/Revoke roles dropdown - a list of roles which can be granted to a role or revoked from a role

  • Grant/Revoke to users dropdown - a list of users which a role can be granted to

  • Grant/Revoke trigger exemptions dropdown - a list of triggers for which exemptions can be granted to a role

  • Clear button - clear all textboxes and checkboxes

  • Insert button - insert a new (non-existing) role

  • Update button - update an existing role with the granted roles / trigger exemptions

The main panel contains a table with 5 columns:

  • Role Names column - list all existing roles

    Note

    System roles (with prefix PRIV_ and highlighted in red, e.g. PRIV_SELECT) cannot be deleted, hence there are no Delete buttons for them in the Action column.

  • Granted Role Names column - list of all granted roles for corresponding roles

  • Granted User Names column - list of all users to which the corresponding roles have been granted

  • Granted Trigger Names column - list of all triggers for which exemptions have been granted to the corresponding roles

  • Action column - contains Delete buttons for deleting the corresponding roles

    _images/roles_management_access_control.png

    Figure 2.4 Role management web access control interface.

To see the roles and their privileges, as well as perform actions such as creating / dropping roles and granting them privileges, the logged in user should have the following system privileges granted: PRIV_LIST_USERS, PRIV_ROLE_MGMT, PRIV_LIST_ROLES, PRIV_LIST_TRIGGERS, PRIV_GRANT, and PRIV_REVOKE.

2.5.3.3.3. Trigger Management

The top panel contains:

  • Trigger name textbox - identifier ([a-zA-Z_][a-zA-Z0-9_]*)

    Note

    Trigger name is extracted automatically from the trigger query definition in View & Edit Query dialog.

  • View & Edit Query button - open a dialog containing a textarea for writing the trigger query definition.

  • Grant/Revoke to roles dropdown - a list of roles to which a trigger exemption can be granted

  • Clear button - clear all textboxes and checkboxes

  • Insert button - add a new (non-existing) trigger

  • Update button - update the query and granted roles for an existing trigger

The main panel contains a table with 3 columns:

  • Trigger Names column - list of all existing triggers.

  • Granted Role Names column - list of all roles which have been granted an exemption from the corresponding triggers

  • Action column - contains Delete buttons for deleting the corresponding triggers

    _images/triggers_management_access_control.png

    Figure 2.5 Triggers management web access control

To see the triggers, as well as perform actions such as creating / dropping triggers and granting exemptions, the logged in user should have the following system privileges granted: PRIV_LIST_TRIGGERS, PRIV_LIST_ROLES, PRIV_TRIGGER_MGMT, PRIV_GRANT, and PRIV_REVOKE.

2.5.3.4. Billing/Quota management

This interface shows the list of existing users enabled for billing tracking, and allows inserting, updating, and deleting a billing user.

The logged in user must have the PRIV_USER_MGMT system privilege to make changes here.

_images/billing_management.png

Figure 2.6 Billing/Quota interface screenshot

The top panel contains:

  • Username combo box - select a rasdaman user to be enabled for billing / quota checking

  • List of textboxes to set the quota for a specific billing user for inserting / updating. A quota value (e.g. 35.35 KB or 25MB) has this pattern: Number[space]*[unit]; if unit is omitted, then it is byte (B); valid values for unit are: B|KB|MB|GB|TB|PB.

  • No quota checkbox - if checked, all textboxes are set to unlimited quota values: 100 PB.

  • Clear button - clear all textboxes

  • Insert button - add a new (non-existing) billing user with the specific quota values

  • Update button - update the quota values for an existing billing user

The main panel contains a table which shows human-readable quota and usage values for each user enabled for billing.
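The quota pattern Number[space]*[unit] can be illustrated with a small parser. This is a sketch, not the code used by the interface: the function name parse_quota is hypothetical, units are assumed to be upper-case only, and the factor of 1024 between units is an assumption (the text does not say whether 1000 or 1024 is used), so it is exposed as a parameter.

```python
import re
from decimal import Decimal

UNITS = ["B", "KB", "MB", "GB", "TB", "PB"]
QUOTA = re.compile(r"(\d+(?:\.\d+)?)\s*(B|KB|MB|GB|TB|PB)?")


def parse_quota(text: str, base: int = 1024) -> Decimal:
    """Convert a quota string such as '35.35 KB' or '25MB' to a byte count."""
    match = QUOTA.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"invalid quota value: {text!r}")
    number = Decimal(match.group(1))
    unit = match.group(2) or "B"  # an omitted unit means bytes
    # Assumption: each unit step multiplies by `base` (1024 by default).
    return number * base ** UNITS.index(unit)


print(parse_quota("25MB"))
print(parse_quota("35.35 KB"))
```

Decimal is used instead of float so that fractional values such as 35.35 are converted without binary rounding artifacts.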

2.5.3.5. Users queries metrics view

This interface shows a list of query metrics collected in the billing database. The logged in user must have the PRIV_USER_MGMT privilege to access the functionality on this page.

_images/billing_user_metrics_view.png

Figure 2.7 Query metrics interface

The top panel contains:

  • Username combo box - filter query metrics by a rasdaman user

  • Query Type combo box - filter the queries by type (SELECT, INSERT, etc.)

  • Start Time - show only queries started after the specified value

  • End Time - show only queries that finished before the specified value

  • Rows Limit - max number of rows to view from the list of results

  • Rows Offset - index of the first row to view from the list of results

  • View button - return the list of rows matching the user-input parameters

The main panel contains a table showing details for each query, including the full query string.

2.5.3.5.1. HTTP Headers & Authentication

Rasdaman can be configured to require authentication on incoming WCS/WCPS/WMS requests. This is done via the authentication_type setting in petascope.properties, which accepts the following values:

  • basic_header enables basic header authentication, so that requests are required to provide credentials for an existing rasdaman user. If the request does not have valid credentials, an error with HTTP status code 401 Unauthorized is returned.

    Moreover, petascope checks the roles assigned to the user provided in the incoming request to determine whether the user may perform a specific task. For example, a user will not be allowed to delete a coverage unless they have the PRIV_DELETE privilege (more details here).

    This is the default value for authentication_type when rasdaman is first installed.

  • An empty string, i.e. authentication_type=, disables request authentication in petascope. All requests will be forwarded to rasdaman with the credentials specified in petascope.properties by rasdaman_user / rasdaman_pass for read-only queries, and rasdaman_admin_user / rasdaman_admin_pass for update queries which make changes in the database.
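The two modes can be summarized by the following sketch, which returns the rasdaman user a request would run under. It is a simplified reading of the description above, not petascope's implementation; the function effective_user and the sample setting values are hypothetical.

```python
def effective_user(settings: dict, request_user, is_update: bool) -> str:
    """Return the rasdaman user under which a request would be executed."""
    if settings.get("authentication_type", "") == "basic_header":
        if request_user is None:
            # Corresponds to the HTTP 401 Unauthorized response.
            raise PermissionError("401 Unauthorized: credentials required")
        return request_user
    # Authentication disabled: fall back to the users from petascope.properties.
    key = "rasdaman_admin_user" if is_update else "rasdaman_user"
    return settings[key]


# Assumed sample settings; real values come from petascope.properties.
settings = {
    "authentication_type": "",
    "rasdaman_user": "rasguest",
    "rasdaman_admin_user": "rasadmin",
}
print(effective_user(settings, None, is_update=False))
```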

2.5.4. Basic header authentication

For incoming requests, rasdaman-geo requires credentials attached in HTTP headers when authentication_type=basic_header is set in petascope.properties. The valid format of the credentials must be:

Authorization: Basic encode_in_base64(username:password)

For example, if the username is admin, the password is admin, and the client is curl, one needs to construct this request:

curl -H "Authorization: Basic YWRtaW46YWRtaW4=" \
     "http://localhost:8080/rasdaman/ows?service=WCS&version=2.0.1&request=DeleteCoverage&coverageId=test_coverage"

or simpler with the --user option:

curl --user "admin:admin" \
     "http://localhost:8080/rasdaman/ows?service=WCS&version=2.0.1&request=DeleteCoverage&coverageId=test_coverage"
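For clients that do not offer an equivalent of the --user option, the header value can be computed directly. The sketch below uses only the Python standard library; the helper name basic_auth_header is hypothetical, and admin / admin are the sample credentials from the example above.

```python
import base64


def basic_auth_header(username: str, password: str) -> dict:
    """Return the Authorization header for HTTP basic authentication."""
    # The credentials are joined with ':' and then base64-encoded.
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


headers = basic_auth_header("admin", "admin")
print(headers["Authorization"])
```

The resulting dictionary can be passed to an HTTP client, for instance as the headers argument of urllib.request.Request. Note that base64 is an encoding and not encryption, so such requests should be sent over HTTPS when they leave the local machine.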

Note

If basic authentication is not enabled in rasdaman, depending on the request, rasdaman uses one of the users configured by rasdaman_user and rasdaman_admin_user settings in petascope.properties appropriately to run a rasql query.

Note

Using wcst_import to insert/update coverages in petascope requires specifying credentials when running the wcst_import.sh script. Check wcst_import.sh -h for the optional parameters -i, --identity-file and -u, --user USER_NAME.

Note

WSClient (at http://YOUR_SERVER/rasdaman/ows) detects whether petascope requires authentication via the API endpoint http://YOUR_SERVER/rasdaman/admin/authisactive and, if so, shows a login form. One then needs to provide valid credentials to log in; WSClient keeps them in the Web browser’s local storage. After that, a basic authentication header is implicitly added to every request to petascope.

2.5.5. APIs

Programmatic access is available through self-programmed code using the C++ and Java interfaces; see the C++ and Java developer guides for details.

2.6. Server Architecture

The parallel server architecture of rasdaman offers a scalable, distributed environment to efficiently process even very large numbers of concurrent client requests. Yet, server administration is easy to accomplish, with only a few things to do to have a smoothly running, highly performant installation. Moreover, the system is designed for high availability: most server management operations can be done with the server up and running, limiting the need for a server shutdown to the absolute minimum.

In this Section the general rasdaman server architecture is outlined. It is recommended to study this section so as to understand server administration terminology used in the next Section.

2.6.1. Executables Overview

The following executables are provided in the bin/ directory, among others:

  • rasmgr is the central rasdaman request dispatcher; clients connect to rasmgr initially and are then assigned to a specific rasserver process which will evaluate queries;

  • rasserver is the rasdaman server engine; it should generally not be invoked in a standalone manner;

  • rascontrol allows interactive control of the rasdaman server by communicating with rasmgr;

  • rasfed is the federation daemon, which enables efficient query distribution in federated rasdaman networks;

  • rasql is the command-line based query tool, explained in detail in the rasdaman Query Language Guide.

2.6.2. Server Manager and Server

2.6.2.1. Overview and Terminology

The rasdaman server configuration consists of one dispatcher process per computer, rasmgr (we will refer to it as manager in the sequel), and server processes, rasserver (referred to as servers), of which at a given time none, one, or several ones can be running. All server processes are under control of the manager. Server manager and rasdaman server(s) all run on the same physical hardware, the rasdaman host.

The servers resolve requests, thereby generating calls to the relational database system which in turn accesses its database files. For the purpose of this manual, the relational server together with the database it maintains are collectively called the database. The machine the relational database server runs on is referred to as database host (Figure 2.8).

_images/image3.png

Figure 2.8 Overall server hierarchy, introducing the terminology for rasdaman hardware and software environment

2.6.2.2. Server Structure in General

The manager accepts client requests and assigns server instances to them, taking them from the pool of server processes it maintains. In distributed installations, it keeps contact to the managers on other machines to further dispatch client requests across all the rasdaman servers available. Whenever needed, the administrator can launch further server instances, or shut them down again.

Upon system configuration definition (see rascontrol Invocation), a unique name is assigned to each server identifying it to the manager.

Each rasdaman server is assigned to a relational database server, laid down in the manager configuration file. Databases can be registered and associated to particular rasdaman servers at any time.

rasdaman hosts and database hosts are identified by their respective host names in common domain address form, or by IP address, e.g., martini.rasdaman.com or 199.198.197.50.

Rascontrol is the interactive front-end to rasmgr and, as such, the main utility for user and system management. It provides the necessary functions to manage the whole system configuration, to add and remove users, to change their rights, and to obtain information about system activity.

The rasdaman server, i.e., rasserver, is controlled by the manager which starts and stops server instances. Hence, the rasserver executable should not (and actually cannot) be invoked directly.

2.6.2.3. Dynamic Server Assignment

The process of client/server communication and server scheduling is done as follows (see numbers in Figure 2.9).

  1. The client starts every OPENDB and BEGIN TRANSACTION request with an HTTP call to the manager, providing the required service type (RPC, HTTP, etc.) and the database name, together with user name and password.

  2. The manager’s answer is the server ID of a free server, or an error message in case no server is available or access is denied for the given login.

  3. Client-Server communication to perform the database requests.

  4. Upon CLOSEDB and ABORT/COMMIT TRANSACTION the server informs the manager that it is available again. This is also done upon a client timeout.

These negotiation steps are performed between client library and server, hence transparent to the application.
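The negotiation steps above can be modeled with a toy manager that hands out free servers and takes them back. This is purely illustrative and not how rasmgr is implemented: the Manager class, its method names, and the server IDs are hypothetical, and network communication and timeouts are left out.

```python
class Manager:
    """Toy model of a dispatcher keeping a pool of server processes."""

    def __init__(self, server_ids, users):
        self.free = list(server_ids)  # servers currently available
        self.busy = {}                # server ID -> client using it
        self.users = users            # user name -> password

    def request_server(self, client, user, password):
        """Steps 1 and 2: authenticate, then return the ID of a free server."""
        if self.users.get(user) != password:
            raise PermissionError("access denied for the given login")
        if not self.free:
            raise RuntimeError("no free server available")
        server_id = self.free.pop(0)
        self.busy[server_id] = client
        return server_id

    def release_server(self, server_id):
        """Step 4: the server reports that it is available again."""
        del self.busy[server_id]
        self.free.append(server_id)


manager = Manager(["N1", "N2"], {"rasguest": "rasguest"})
assigned = manager.request_server("client-a", "rasguest", "rasguest")
# Step 3, the actual database requests, would happen between client and server here.
manager.release_server(assigned)
```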

The rasdaman server system is started by invoking the server manager rasmgr (see Running the Manager). If it finds a configuration file, then all servers indicated in it will be started automatically; alternatively, server configuration can be done directly through rascontrol (see rascontrol Invocation).

_images/image4.png

Figure 2.9 Internal server management

2.6.2.4. System Start-up

Invocation of the rasmgr executable must be done under the operating system login under which the rasdaman installation has been done, usually (and recommended) rasdaman. The service script /etc/init.d/rasdaman (when rasdaman is installed from the packages) automatically takes care of this.

2.6.2.5. Authentication

On every machine hosting rasdaman servers a separate manager has to run. The manager maintains an authorization file, $RMANHOME/etc/rasmgr.auth. It should not be changed by the administrator, as it is generated, maintained, and overwritten by the manager.

_images/image5.png

Figure 2.10 rasdaman federation

2.6.2.6. rasdaman Manager Defaults

The manager’s default name is the hostname (the one reported by the UNIX command hostname), but it can be changed (see the change command). By default, it listens to port 7001 for incoming requests and uses port 7001 for outgoing requests.

To keep an overview of the ports used, it is recommended to use the following scheme (there is, however, no restriction preventing the choice of another scheme):

  • use port number 7001 for the server manager;

  • use port numbers 7002 to 7999 for rasdaman servers.

2.6.3. Storage backend

rasdaman stores array data in a file system directory, and array metadata in a standard SQL DBMS. SQLite and PostgreSQL are supported as backends for the array metadata. The default database name, assumed by all tools, is RASBASE. While it can be changed, this is not recommended, as all tools would then need an extra parameter indicating the changed name.

Note

rasdaman enterprise additionally supports access to pre-existing archives of any structure, see In-Situ File Archive Storage [RE] for more information; in this case no array data will be additionally stored.

2.6.3.1. SQLite

SQLite is the default backend, configured with this setting in /opt/rasdaman/etc/rasmgr.conf when rasdaman is first installed:

define dbh rasdaman_host -connect /opt/rasdaman/data/RASBASE

The -connect value is the absolute path to the SQLite database file on disk.

Array data is stored in the directory containing the SQLite database under a TILES subdirectory, i.e. /opt/rasdaman/data/TILES by default. To change the default location: stop rasdaman, move the whole directory to the new location, update rasmgr.conf and finally start rasdaman again.

An array data directory can also be specified independently from the SQLite RASBASE file path separated by a semicolon in the format <rasbase_path>;<data_directory>, e.g:

-connect /opt/rasdaman/data/RASBASE;/mnt/large_disk/rasdata

This may be needed when storing the large array data on a network filesystem which does not support the SQLite database file well. In addition, as the RASBASE file is typically small, it is worth placing it separately on a fast disk.

2.6.3.2. PostgreSQL

Instead of SQLite, rasdaman can be configured to use a PostgreSQL database. This may be desirable for scalability on a heavily used installation, as postgres offers better support when many users are concurrently accessing rasdaman, especially when importing and querying data simultaneously.

To switch to postgres, first the -connect string in rasmgr.conf needs to be updated to a value of the format <db_connection_string>;<data_directory>, where <db_connection_string> is a valid postgres connection string, and <data_directory> is an absolute path to a directory that will hold ingested array data.

For example to connect to database RASBASE and default data directory for the array tiles /opt/rasdaman/data:

define dbh rasdaman_host -connect dbname=RASBASE;/opt/rasdaman/data

This assumes that the system user running rasdaman can log in to postgres without a password, e.g. as a role created with createuser -s rasdaman. For the full syntax of the connection string refer to the corresponding PostgreSQL documentation. If the connection string or the data directory contains any spaces, the config value must be quoted with double quotes, e.g:

-connect "host=localhost port=5432 dbname=RASBASE connect_timeout=10;/opt/rasdaman/data"

Another example with a connection URI:

-connect "postgresql://user:secret@localhost/RASBASE;/opt/rasdaman/data"

Once rasmgr.conf has been updated with the new connect string, it is necessary to initialize the database schema by running create_db.sh (if it has not been done before):

sudo -u rasdaman /opt/rasdaman/bin/create_db.sh

Like with SQLite, the array data is stored in a TILES subdirectory of the data directory specified in the connect string, i.e. /opt/rasdaman/data/TILES.
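The structure of the -connect value for both backends can be illustrated with a small helper that separates the database part from the array data directory. This is a sketch and not rasdaman's own parsing code: the function split_connect is hypothetical, and splitting at the last semicolon is an assumption.

```python
import posixpath


def split_connect(value: str):
    """Split a -connect value into (database part, array data directory)."""
    # Assumption: the data directory follows the last semicolon.
    db_part, separator, data_dir = value.rpartition(";")
    if not separator:
        # SQLite form without explicit data directory: the array data is kept
        # in the directory that contains the RASBASE file.
        return value, posixpath.dirname(value)
    return db_part, data_dir


print(split_connect("/opt/rasdaman/data/RASBASE"))
print(split_connect("/opt/rasdaman/data/RASBASE;/mnt/large_disk/rasdata"))
print(split_connect("dbname=RASBASE;/opt/rasdaman/data"))
```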

2.6.4. Query Result Caching [RE]

2.6.4.1. Overview

Query results can be cached in a shared memory area of the server’s main memory. Cache contents are shared among all rasserver processes running on the same computer. Cache coherence (i.e., automatic adaptation of the cache contents after database updates) is ensured.

A cached result is used by a subsequent query if a subexpression in this query matches with the cached result; in this case, the cached result replaces the query expression, thereby speeding up processing of the query. Results do not have to match exactly; if a larger array is cached than a query needs then the subset needed will be extracted and reused, which still provides the query with a performance gain.

Measurements have shown speedups of several orders of magnitude in presence of cache contents reuse.

Key cache parameters are configurable by the administrator. By default, the cache is disabled; it needs to be activated through the cache control commands described below.

2.6.4.2. Cache Reuse

A query can use a cached item if it contains an occurrence of the expression that has produced the cached element, and if this expression has been applied to the same array object the query wants to access. Note that the decision considers what base data item (i.e., array) has been used – in other words, an expression can benefit from the cache only if it addresses the same array.

2.6.4.2.1. Scope

The unit of caching is a single result item, either an array, or a tile, or a scalar. As rasql queries are set-oriented one query may access several arrays, and may deliver more than one item. In the cache, each such item constitutes a separate, independent entry. Subsequent queries check the cache for useful elements on the level of single elements. Therefore, even if only some array results can be reused in a query addressing a set of arrays then the query still can benefit from the matches found.

To avoid doubt: complete queries (select ... from ... where) are not cached as a whole; what can be cached is array access and operations, up to complete select and where clauses, including data format encoding.

Several situations are possible; they are explained below in turn.

2.6.4.2.2. Reuse of Full Query Result

Cached results can be used in several situations:

  • Exact match: An expression result is already in the cache. The result will be used, no further evaluation of the corresponding expression is necessary.

  • Subexpression match: An expression in the incoming query contains a subexpression which has been evaluated earlier and, hence, is in the cache. The cached subexpression result will be pasted into the expression, thereby reducing the computational effort needed.

  • Partial match: An expression in the incoming query contains a matching subexpression, but with only partial overlap in the domain of the cached array. The cached result will be used as much as possible, the non-cached cells of the expression will be computed.
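The difference between the match types can be illustrated on a one-dimensional domain. The sketch below is a strong simplification and not rasdaman's cache logic: it assumes that the cached and the requested expression are identical apart from their domains, represents domains as inclusive (low, high) intervals, and uses the hypothetical function plan_reuse.

```python
def plan_reuse(cached, requested):
    """Return (reused interval or None, list of intervals still to compute)."""
    low = max(cached[0], requested[0])
    high = min(cached[1], requested[1])
    if low > high:
        # No overlap: nothing can be taken from the cache.
        return None, [requested]
    missing = []
    if requested[0] < low:
        missing.append((requested[0], low - 1))
    if requested[1] > high:
        missing.append((high + 1, requested[1]))
    return (low, high), missing


print(plan_reuse((0, 99), (0, 99)))    # exact match
print(plan_reuse((0, 199), (50, 99)))  # larger array cached, subset reused
print(plan_reuse((0, 99), (50, 149)))  # partial overlap
```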

2.6.4.3. Cache Rule Concepts

Caching can be controlled through so-called cache rules. These define patterns of (sub-) expressions to be cached whenever they occur. Cache rules consist of regular expressions over the query language functions, so-called query rules, together with variable bindings, here called argument rules. For example, the following cache rule defines that the result of every application of log() applied to the array identified by OID 1234 should be kept in cache:

Rule N: log(x) x=1234

Below the concepts are explained in turn.

2.6.4.3.1. Query Rule

A query rule is a string representation of some query subexpression occurring in a select or where clause. It consists of concrete function invocations and wildcards.

Concrete function invocations are written as they would be written in rasql. These function invocations may contain variables or subexpressions; in case of variables, the concrete binding is done in the Argument Rule described below.

Operator wildcards allow expressing any position of an operator in some nested expression. The underscore symbol (_) is used as a wildcard symbol; it matches any number of nested invocations of any function supported by rasql.

The query rule syntax is given by the following rule set:

  • (empty string)

    This matches any query expression, i.e.: all expression results (including all subexpressions) will be cached. Such a rule should be defined with great care as it will cause massive cache utilization.

  • _

    Same as above: matches any expression.

  • f(_)

    where f is a function symbol defined in rasql.

    For example, log(_) matches a (sub-) expression log(x), such as in

    select log(x) from x
    

    In the following query, log(x) will be cached / can be reused from cache:

    select abs(log(x)) from x
    

    Arity of the function (its number of parameters) is ignored, i.e., a single _ represents any parameter list. The function symbol can represent a scalar as well as an induced operation.

    For example, log(_) matches log() invocations with any argument, be it a concrete variable or a subexpression itself;

    • it matches: log(x) and log(x+abs(y))

    • it does not match: log(x)+log(y) because the topmost operation is an addition.

    See below for binding variables to concrete array OIDs.

  • ( _ op _ )

    where op is an infix function symbol defined in rasql. This symbol can represent a scalar as well as an induced operation.

    For example, with + as op, i.e. the rule ( _ + _ ),

    • it matches: (x+(y*z)) and (log(x)+log(y))

    • it does not match: (x*(y+z))

  • ( case _ end )

    This matches all case expressions, for example:

    case when x > 0 then 1 when x < 0 then -1 else 0 end
    
  • ( marray _ )

    This matches all marray expressions.

    marray _ values f()

    This matches all marray expressions containing an invocation of f().

  • ( condense _ )

    This matches all condense expressions.

  • All of the above expressions can be nested.

Anytime a rule matches the corresponding expression result gets cached. In case of nested functions this may mean that several rules match; in this case, each match will get cached, even if they are part of a more encompassing rule. For example, consider the following rules:

Rule 1: log(_)

Rule 2: ( _ + _ )

In this setup, in an expression log(x+y) the results of both x+y and log(x+y) get cached.
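The effect of several matching rules can be illustrated with a toy matcher. It is not the rasdaman rule engine: it borrows Python's ast module to parse simple arithmetic expressions (which happen to share their syntax with the rasql examples above), reduces a rule such as log(_) or ( _ + _ ) to the name of its topmost operation, and reports every subexpression whose topmost operation is covered by a rule. All function names are hypothetical; ast.unparse requires Python 3.9 or later.

```python
import ast

BINARY_OPS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}


def top_operation(node):
    """Name of the topmost operation of an expression node, or None for leaves."""
    if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
        return node.func.id
    if isinstance(node, ast.BinOp) and type(node.op) in BINARY_OPS:
        return BINARY_OPS[type(node.op)]
    return None


def cached_subexpressions(expression, rules):
    """List all (sub-)expressions whose topmost operation is named in rules."""
    matches = []

    def visit(node):
        if top_operation(node) in rules:
            matches.append(ast.unparse(node))
        for child in ast.iter_child_nodes(node):
            visit(child)

    visit(ast.parse(expression, mode="eval").body)
    return matches


# Rule 1: log(_) and Rule 2: ( _ + _ ), reduced to their topmost operations.
print(cached_subexpressions("log(x+y)", {"log", "+"}))
```

With both rules active, the matcher reports log(x + y) as well as x + y, mirroring the statement that each match is cached separately.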

2.6.4.3.2. Argument Rules

An argument rule is attached to a query rule with the purpose of specifying further which expression results should be cached. An argument rule consists of a list of variable/OID pairs where the variables must occur in the query rule:

var1 = oid1, var2 = oid2, ..., varN = oidN

Example: log(x) x=1234

This rule will only fire when a log() operator is applied to the array with OID 1234.

2.6.4.3.3. Rule Evaluation

During evaluation of a rule, a matching of the Query Rule is done based on the concrete settings of the Argument Rule (if any). Results of expressions found this way will be put into the cache.

For example, suppose collection C has two arrays with object identifiers 123 and 456, and collection D has two arrays with object identifiers 78 and 90. The query select C+D from C, D will yield four results, as it will be executed for each pair of objects from C and D. Using an argument rule one can restrict which of these four results will be put into the cache:

  • If there is no argument rule then the overall rule matches all four results of the query;

  • If there is one argument rule C=123 then the overall rule matches those two results of the query where the object addressed is involved;

  • If there are two argument rules C=123, D=78 then the overall rule matches only one of the query results, that is: the combination in which both array objects addressed are participating;

  • If there are two argument rules C=123, D=123 then the overall rule does not match any of the results, as the object identified by 123 will never occur in a D position.
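The four cases can be reproduced with a few lines of Python. The sketch only models the selection logic described above; the dictionary layout and the function matching_results are hypothetical and unrelated to rasdaman internals.

```python
from itertools import product

# Collections and the OIDs of their arrays, as in the example above.
collections = {"C": [123, 456], "D": [78, 90]}


def matching_results(argument_rule):
    """Return the variable bindings of all query results the rule matches."""
    names = list(collections)
    # The query is evaluated once per combination of objects from C and D.
    combinations = [dict(zip(names, oids)) for oids in product(*collections.values())]
    return [
        binding
        for binding in combinations
        if all(binding[var] == oid for var, oid in argument_rule.items())
    ]


print(len(matching_results({})))                      # no argument rule
print(len(matching_results({"C": 123})))              # C=123
print(len(matching_results({"C": 123, "D": 78})))     # C=123, D=78
print(len(matching_results({"C": 123, "D": 123})))    # C=123, D=123
```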

2.6.4.4. Cache Management

For cache maintenance, the rascontrol syntax is extended with additional statements. As usual, these can be put into the rasmgr.conf configuration file or issued through an interactive rascontrol shell.

Periodically rasmgr performs cache maintenance which involves checking correctness of all cache data, removing invalid records, and releasing memory if more space is needed. Upon termination of rasmgr, the whole cache is released.

Without any cache control command (see Query Cache Control [RE]) the cache remains disabled.

2.6.4.5. How to Find Cache Rules

The following method helps to find suitable cache rules. The rascontrol commands used here are documented in Query Cache Control [RE].

  • add a match-all-rule:

    define cache rule -query "_"
    
  • execute the query under consideration (e.g., using the rasql command-line tool)

  • inspect the cache to see how the cache component interprets the query:

    list cache
    
  • if the query as such should be cached, define a rule by copy-pasting the query expression string listed. This will cache exactly that expression.

    If a more general pattern is to be defined, replace overly specific parts of the query string listed by an underscore and add the resulting expression as a new cache rule. Remove the match-all-rule, rerun the query and check whether cache performance is as expected.

2.7. Server Administration

This section explains how to manage a rasdaman service on a lower level: how to start up and shut down individual server workers, as well as how to monitor and influence server state.

It is recommended to first study the previous section so as to understand server administration terminology used here.

2.7.1. General Procedure

2.7.1.1. rasmgr vs. rascontrol

It is important to distinguish between the manager, rasmgr, and its control front-end, rascontrol. The manager runs as a background process, supervising activity of local (and possibly remote) rasdaman servers. Interaction between user (i.e., administrator) and the manager takes place through the interactive control front end.

In the sequel, it is first described how to launch the manager rasmgr, then rascontrol commands are detailed.

2.7.1.2. Important Security Note

To remain compatible with older rasdaman versions, clients use login “rasguest” / password “rasguest” by default (i.e., when no user and password are explicitly set by the application). In the distribution configuration, this user is defined to have read-only access to the databases, so that users can access but not manipulate databases without authentication.

Therefore, the administrator is strongly urged to adapt authentication settings to the local security policy before switching databases online.

See Users and Their Rights to learn more about user management mechanisms.

2.7.2. Running the Manager

2.7.2.1. Manager Startup

Starting up the rasdaman system is done by invoking the rasdaman manager, rasmgr, from a shell under the rasdaman operating system login. Usually the manager will be sent to the background:

$ rasmgr &

Starting rasmgr is the only direct action to be done on it. Any further administration is performed using rascontrol.

Note that, unless a server configuration has been defined already, no rasdaman server is available just by starting the manager. Usually rasmgr is started from start_rasdaman.sh, rather than directly.

2.7.2.2. Invocation Synopsis

Manager invocation synopsis:

$ rasmgr [--help] [--hostname h] [--port p]

where

--help

print this help

--hostname h

host on which the manager process is running is accessible under name / IP address h (default: output of Unix command hostname)

--port p

manager will listen to port number p (default: 7001)

2.7.2.3. Examples

To start a manager which will listen at port 7001:

$ rasmgr --port 7001

2.7.3. rascontrol Invocation

The manager front end, rascontrol, is a command-line interface used for rasdaman administration. It allows defining the whole rasdaman system configuration, including start up and shut down of server instances, as well as user logins and rights.

To secure access to the server administration facilities, rascontrol performs a login process requesting login name and password similar to the Unix rlogin command. User name must be one of the users defined in the rasdaman authentication list (see Users and Their Rights).

2.7.3.1. rascontrol Synopsis

$ rascontrol [-h|--help] [--host h] [--port n] [--prompt n]
             [--quiet]
             [--login|--interactive|--execute cmd|--testlogin]

where

--host h

name of the host where the manager runs (default: localhost)

-h, --help

this help

--port n

port number at which the manager listens to requests (default: 7001)

--prompt n

change rascontrol prompt as follows:

  • 0 - prompt ‘>’

  • 1 - prompt ‘rasc>’

  • 2 - prompt ‘user:host>’

(default: 2)

--quiet

quiet, don’t print header (default for --login and --testlogin)

--login

print login and password, obtained from interactive input, to stdout, then exit (see Script Use below)

--interactive

read login and password from environment variable RASLOGIN instead of requesting it interactively

--execute cmd

execute single cmd and exit (batch mode); all text following -x until end of line is passed as command; this option implicitly assumes -e

--testlogin

just do a login and nothing else to check whether the login/password combination provided in the RASLOGIN variable is valid

2.7.3.2. Interactive Use

In interactive use, rascontrol will be invoked with the host parameter only. Following successful authentication, rascontrol accepts command line input from stdin.

During login, the password is not echoed on screen.

2.7.3.3. Script Use

As an alternative to interactive login, user and password information can be taken from the environment variable RASLOGIN. This variant is suitable for batch scripting in conjunction with the -x option.

The following example shows how first the RASLOGIN is set appropriately:

$ export RASLOGIN=`rascontrol --login`

and then a sample Unix shell script which starts all rasdaman servers defined in the system configuration, performing implicit login from the environment variable contents which has been obtained from the previous command and pasted into the shell script:

#!/bin/bash
export RASLOGIN=rasadmin:mytotallyencryptedpassword
rascontrol -x up srv -all
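The same batch pattern can be driven from Python. The following is a minimal sketch, assuming rascontrol is on the PATH and that RASLOGIN holds a value previously obtained with rascontrol --login; the helper names rascontrol_argv and run_rascontrol are hypothetical. Options such as --host and --port are placed before -x because everything after -x is passed on as the command.

```python
import os
import subprocess

def rascontrol_argv(command, host="localhost", port=7001):
    """Build the argument list for a single batch-mode rascontrol call."""
    # --host/--port must precede -x: all text after -x is taken as the command.
    return ["rascontrol", "--host", host, "--port", str(port),
            "-x", *command.split()]

def run_rascontrol(command, login, host="localhost", port=7001):
    """Run one rascontrol command with RASLOGIN set for this call only."""
    env = dict(os.environ, RASLOGIN=login)
    return subprocess.run(rascontrol_argv(command, host, port),
                          env=env, capture_output=True, text=True, check=False)

# Example (requires a running rasmgr and a valid RASLOGIN value):
#   result = run_rascontrol("up srv -all", os.environ["RASLOGIN"])
#   print(result.stdout)
print(rascontrol_argv("up srv -all"))
```

Passing the login through the child environment avoids writing it into the script itself.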

2.7.3.4. Comments in Scripts

To enhance legibility of scripts, rascontrol accepts comments in the usual shell syntax: Lines beginning with a hash sign ‘#’ will be ignored, whatever they may contain. An example is usage in shell here documents (type man sh in your favourite shell for further information on this feature):

$ rascontrol <<EOF
# this is the command submitted to rascontrol:
list srv -all
# now terminate rascontrol:
exit
# the following line terminates rascontrol input:
EOF
$

2.7.4. rascontrol Command List

2.7.4.1. Command Synopsis

help

display information (general or about specific command)

exit

exit rascontrol

list

list info about the current status of the system

up

start server(s)

down

stop rasdaman server(s) or server manager(s)

define

define a new object

remove

remove an object

change

change parameters of objects

save

make configuration changes permanent

In the remainder of this section, commands are explained in detail, sorted by the targets they affect.

2.7.5. Server Hosts

2.7.5.1. Define Server Hosts

define host h -net n -port p

h

symbolic host name

-net n

set network host name to n

-port p

port on which the rasdaman manager will listen

2.7.5.2. Change Server Host Settings

change host h [-name n] [-net x] [-port p]
            [-uselocalhost [on|off] ]

h

host name whose entry is to be updated

-name n

change host name to n

-net x

change network name to x

-port p

change port number to p

-uselocalhost [on|off]

use domain name localhost (IP address 127.0.0.1) instead of regular network host name; usually this speeds up communication a little (default: on)

Note that it is not possible to change network name or port for a host while this server is running.

uselocalhost works only for the master manager and is on by default. This means that the servers running on manager master host should

2.7.5.3. Remove Server Host Definitions

remove host h

h

host name whose entry is to be deleted

Remove host h from the definition table.

It is not possible to remove a host definition while the corresponding host has active servers.

2.7.5.4. Status Information

list host

List all hosts currently defined.

2.7.6. rasdaman Servers

2.7.6.1. Define rasdaman Servers

define srv s -host h -port p -dbh d
    [-autorestart [on|off]] [-countdown c]
    [-xp options]

s

a unique, not yet used name for the server

-host h

name of the host where the server will run

-port p

TCP/IP port on which the server will listen (recommended: 7002 - 7999)

-dbh d

database host where the relational database server to which the rasdaman server connects will run

-autorestart a

for a = on: automatically restart rasdaman server after unanticipated termination; for a = off: don’t restart (default: a = on)

-countdown c

for c > 0: restart rasdaman server after c requests; for c = 0: run rasdaman server indefinitely (default: c = 10000)

-xp options

pass option string options to server upon start (default: no options, i.e., empty string)

Option -xp must be the last option. Everything following “-xp” until end of line is considered to be options and will be passed, at startup time, to the server.
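Because -xp swallows the rest of the line, scripts that generate define srv statements need to emit it last. The sketch below shows one way to enforce that ordering; the function define_srv is a hypothetical helper, and the server and database host names in the example are placeholders.

```python
def define_srv(name, host, port, dbh, autorestart=None, countdown=None, xp=None):
    """Compose a 'define srv' statement, always placing -xp at the end."""
    parts = ["define", "srv", name, "-host", host, "-port", str(port), "-dbh", dbh]
    if autorestart is not None:
        parts += ["-autorestart", "on" if autorestart else "off"]
    if countdown is not None:
        parts += ["-countdown", str(countdown)]
    if xp:
        # Everything after -xp is handed to the server, so nothing may follow it.
        parts += ["-xp", xp]
    return " ".join(parts)

print(define_srv("N1", "A", 7002, "rasdaman_host",
                 autorestart=True, countdown=10000,
                 xp="--tilesize 4194304"))
```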

2.7.6.2. Change Server Settings

change srv s [-name n] [-port p] [-dbh d]
        [-autorestart [on|off]] [-countdown c]
        [-xp options]

s

change settings for server s

-name n

change server name to n

-port p

change port number to p

-dbh d

new database host where the relational database server runs to which the rasdaman server connects

-autorestart a

for a = on: automatically restart rasdaman server after unanticipated termination; for a = off: don’t restart

-countdown c

for c > 0: restart rasdaman server after c requests; for c = 0: run rasdaman server indefinitely

-xp options

pass option string options to server upon start

Option -xp must be the last option. Everything following “-xp” until end of line is considered to be options and will be passed, at startup time, to the server.

Restrictions:

  • The server host cannot be changed.

  • The server name cannot be changed while the server is up.

  • The new settings will be used only next time the server starts.

2.7.6.3. Remove rasdaman Server Definitions

remove srv s

s

server name whose entry is to be deleted

Remove server s from the definition table.

It is not possible to remove a server definition while the corresponding server is up and running.

2.7.6.4. Status Information

list srv [ s | -host h | -all ] [-p]

s

give information about server s

-host h

give information about all servers running on host h

-all

list information about all servers on all hosts (default)

-p

additionally list configuration information

This command prints status information of the currently defined server(s); if s is provided, then only server s is listed.

2.7.7. Database Hosts

2.7.7.1. Define Database Hosts

define dbh h [-connect c]

h

a unique symbolic database host name, usually the host machine name

-connect c

the connection string used to connect rasserver to the backend database server; see Storage backend for more details on the format of c depending on whether the backend DBMS is SQLite or PostgreSQL.

2.7.7.2. Change Database Host Settings

change dbh h [-name n] [-connect c]

h

database host whose entry is to be changed

-name n

change symbolic database host name to n

-connect c

change connect string to c; see Storage backend for more details on the format of c depending on whether the backend DBMS is SQLite or PostgreSQL.

The connection parameters can be changed at any time; however, the servers pick up the new settings only when they are restarted.

2.7.7.3. Remove Database Host Definitions

remove dbh h

h

database host name whose entry is to be deleted

Remove database host h from the definition table.

It is not possible to remove a database host definition while this database host has active servers connected to it.

2.7.7.4. Status Information

list dbh

List all relational database hosts currently defined.

2.7.8. Databases

Databases represent the physical database itself, together with the relational database server accessing them. It is possible to have multiple database definitions in the rasdaman server environment which are distinguished by the database host; the interpretation, then, is that the same contents (be it the same physical database or a mirrored copy) is available through relational servers running on the different hosts mentioned. In other words, when a client opens a database, the server manager can freely choose any of the database hosts on which the database indicated is defined.

The pair (database, database host) must be unique.

2.7.8.1. Define Databases

define db d -dbh db

d

define database with name d

-dbh db

set database host name to db

2.7.8.2. Change Database Settings

change db d -name n

d

database whose name is to be changed

-name n

change to new database name n

2.7.8.3. Remove Database Definitions

remove db d -dbh db

d

name of database to be removed

-dbh db

host name of database to be removed

Remove definition of database d from the definition table. The database itself remains unchanged; it is not physically deleted.

It is not possible to remove a database definition while the corresponding database has open transactions.

2.7.8.4. Status Information

list db [ d | -dbh h | -all ]

d

give information about servers connected to database d

-dbh h

give information about all servers connected to database d via database host h

-all

list information about all servers connected to any known database (default)

List relational database(s) defined.

2.7.9. Query Cache Control [RE]

For administering the cache (cf. Query Result Caching [RE]), the rascontrol command language is extended as described below. Quick information can be retrieved with help cache in rascontrol.

2.7.9.1. Cache size

Initial definition of a cache (such as in rasmgr.conf) is accomplished as follows:

define cache -size S

where S is an integer number with an optional modifier suffix of k (for kilobytes), M (for Megabytes), G (for Gigabytes), or T (for Terabytes); for example: 500M.

A cache in use can be resized through

change cache -size S

If this means an increase over the current cache size, more shared memory is allocated. If it means reducing the current cache then cache records get evicted according to the cache policy until the new size is reached.

The cache is disabled by setting it to size 0 (zero):

change cache -size 0
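When generating such statements from scripts, it can help to validate the size argument first. The sketch below parses the suffix notation described above. It is an illustration only: the function name parse_size is hypothetical, and the use of binary multiples (1 k = 1024 bytes) is an assumption, since the text does not state whether rasdaman interprets the suffixes as powers of 1000 or 1024.

```python
import re

UNIT = 1024  # assumption: binary multiples; the manual does not specify this
SUFFIXES = {"": 1, "k": UNIT, "M": UNIT**2, "G": UNIT**3, "T": UNIT**4}

def parse_size(text):
    """Convert a size such as '500M' into a number of bytes."""
    match = re.fullmatch(r"(\d+)([kMGT]?)", text.strip())
    if match is None:
        raise ValueError(f"not a valid size: {text!r}")
    number, suffix = match.groups()
    return int(number) * SUFFIXES[suffix]

for size in ("500M", "5G", "0"):
    print(size, "->", parse_size(size), "bytes")
```

A result of 0 corresponds to the statement that disables the cache.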

2.7.9.2. Cache rules

Add a new cache rule on query expression Q and all variable bindings:

define cache rule -query "Q" ( -arg var = val )

Example: The following rule establishes that the results of all query expressions be cached which are obtained from adding some logarithm result to object x (concretely identified by OID 1234).

define cache rule -query "(x+log(_))" -arg x=1234

2.7.9.3. Cache eviction

Remove a particular cache rule, identified by its position number as given by a list command:

remove cache rule -pos p

where p is a positive integer indicating the position number of the rule as printed by the list command.

The sequence of rules may change dynamically, therefore it is strongly recommended to determine the current position of a cache rule immediately before its deletion (and not rely on some earlier listing).

All stored cache records not matching any remaining cache rule will be evicted from the cache.

2.7.9.4. Cache inspection

Print current memory usage and all cached records:

list cache

Print all cache rules in use. Cache rules are numbered sequentially in no particular order:

list cache rule

Adding or deleting a rule may change the sequence completely, therefore it is strongly recommended to determine the current position of a cache rule immediately before its deletion.

2.7.9.5. Cache Start-up and Shutdown

Cache Start

up cache

Start the shared query cache; without this command caching will be turned off. This is automatically performed by service rasdaman start and start_rasdaman.sh.

Cache Shutdown

down cache [ -force ]

-force

stop immediately without waiting for transaction end

Stop the shared query cache. This is automatically performed by service rasdaman stop and stop_rasdaman.sh.

2.7.10. Memory usage

By default, the memory a rasdaman server can use is limited to 1 GB less than the available system memory, as reported by MemAvailable in /proc/meminfo. This limit can be changed with:

change memory -size NEWSIZE

NEWSIZE

an integer number with an optional modifier suffix of k (for kilobytes), M (for Megabytes), G (for Gigabytes) or T (for Terabytes); for example: 5G.

The current memory limit, the total memory usage, and the memory usage per rasserver process can be shown with:

list memory

It is recommended the sum of non-cache and cache memory be set to at least 4 GB less than the total amount of memory on the system, if the main programs running on it are rasdaman, Tomcat, and PostgreSQL.
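To see what the default limit would be on a given machine, the computation described above can be reproduced from /proc/meminfo. This is a sketch under assumptions: the function name default_memory_limit is hypothetical, "1 GB" is taken to mean 1024^3 bytes, and the kB values in /proc/meminfo are multiplied by 1024.

```python
def default_memory_limit(meminfo_text, reserve_bytes=1024**3):
    """Return MemAvailable minus the reserve, in bytes (never negative)."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            available_kib = int(line.split()[1])   # the value is reported in kB
            return max(available_kib * 1024 - reserve_bytes, 0)
    raise ValueError("MemAvailable not found in meminfo text")

# Sample input in the /proc/meminfo format; on Linux, read the real file instead:
#   with open("/proc/meminfo") as f: text = f.read()
sample = "MemTotal:       16384000 kB\nMemAvailable:    8388608 kB\n"
print(default_memory_limit(sample))
```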

2.7.11. Server Start-up and Shutdown

Server Start

up srv [ s | -host h | -all ]

s

start only server s

-host h

start all servers on host h; this requires that a manager has been started on this host previously.

-all

start all servers defined; note that only those servers can be started on whose host a manager is currently running.

Look up the named server(s) in the definition list, and start the specified one(s) using the previously defined individual startup parameters.

At least one of the options s, -host h, and -all must be present.

Server Shutdown

down srv [ s | -host h | -all ] [-force] [-kill]

s

name of the server to be stopped

-host h

terminate all servers on host h

-all

terminate all servers

-force

send SIGTERM immediately, don’t wait for transaction end

-kill

send SIGKILL immediately, don’t wait for transaction end

This command shuts down the indicated server(s). At least one of the options s, -host h, and -all must be present.

Without -force and -kill, the server is marked for shutdown and is terminated by sending SIGTERM after it completes the current transaction. With -force or -kill, the server is terminated immediately. These options should be handled with extreme caution, as experience shows that relational database systems react differently to such a situation: usually a running transaction is aborted (which is the desired behavior), but sometimes the running transaction is committed (most likely leaving the database in an inconsistent state). See a Unix manual for the difference between the SIGTERM and SIGKILL signals.

The manager on host h is not terminated.

2.7.12. Users and Their Rights

See Access Control [RE].

2.7.13. Server Control Options

The following options can be passed to the server when it is started by the server manager using the up srv command. Option settings are defined for a particular server using the rascontrol command change srv -xp which passes the rest of the line after -xp on to the server upon starting it (see rasdaman Servers).

--log logfile

print log to logfile. If logfile is stdout, then log output will be printed to standard output. Setting this option is not recommended. (default: $RMANHOME/log/rasserver.uuid.serverpid.log)

--transbuffer b

set the maximum size of the transfer buffer to b bytes (default: 100000000 bytes = 100 MB)

--tilesize s

default maximal size of tiles in bytes used when no tile size is specified in queries (default: 4194304 bytes)

--pctmin s

minimal size of inline tiles in bytes (default: 2048)

--pctmax s

maximal size of inline tiles in bytes (default: 4096)

--tiling name

default tiling scheme when inserting data when no tiling clause is specified, one of: NoTiling, RegularTiling, AlignedTiling (default: AlignedTiling)

--tileconf dom

default tile configuration when inserting data when no tiling clause is specified (default: [0:1023,0:1023])

--index name

default index to be used when inserting data when no tiling clause is specified, one of: auto, dir, rdir, nrp, rnrp, tc, rc (default: nrp, i.e. R+ tree)

--indexsize s

specify the node size of the index; value of 0 lets rasdaman itself determine this value (default: 0)

2.7.14. Miscellaneous

2.7.14.1. Help

help

Display top level help page

help [command]
command help

Display information specific to command

(both syntax variants are equivalent)

2.7.14.2. Version Information

list version
version

display rasdaman server version.

2.7.14.3. Save Changes to Disk

save

The save operation writes the current configuration and authorization values to disk. All changes done during the session thus become permanent.

2.7.14.4. rascontrol Termination

exit

terminates rascontrol.

2.8. Federations [RE]

rasdaman enterprise offers intra-query parallelization, that is: splitting of complex queries into sub-queries executed in parallel on a configurable set of compute nodes. Distribution is determined by criteria like data location in the networks and “breakpoints” in the query where minimal data transport occurs. In particular for complex queries and big data such methods are known to boost performance dramatically.

2.8.1. Federation network

The federation network is defined in a decentralized way: each rasdaman node (an individual rasdaman installation) knows peers from which it accepts requests, and to which it can send requests. To this end, each rasdaman node maintains an inpeer and outpeer list:

  • The inpeer list contains those hosts from which this rasdaman node will accept requests.

  • The outpeer list contains those hosts to which this rasdaman node may send requests.

By manipulating these two lists administrators can exercise a fine-grained security policy in a rasdaman federation network.

Each node individually respects these statements; there is no global rasdaman federation configuration.

2.8.1.1. Federation node addressing

Addressing is based on hostnames, where a hostname in the sequel is one of

  • a domain name, resolvable by this rasmgr’s host

  • an IP address

All inpeer and outpeer statements accumulate so that host identifiers can be added and removed incrementally.

2.8.1.2. Define peers

See details in the section on federation configuration.

2.8.1.3. Security

See Federated Access Control [RE] for details on how to configure access control across a federation network.

2.8.1.4. Fluctuating IPs

In cloud environments, IP addresses are maintained dynamically and can change for a given host between reboots. Hence, when growing a rasdaman federation by launching new VMs, care must be taken that the in- and outpeers receive the proper current IP address.

2.8.2. Federation daemon

A separate background process per node, rasfed, collects metadata about rasdaman instances in the known network. To this end, rasfed periodically contacts all known nodes to gather information used for dispatching and optimization. Nodes known are those specified as inpeer and/or outpeer in rasfed.properties.

The following adjustments can be made editing /opt/rasdaman/etc/rasfed.properties:

  • enable activates or deactivates rasdaman’s federation capabilities when running start_rasdaman.sh. Can be one of yes and no.

    • Default: no

    • Need to change: YES to enable federation capabilities

  • peerServiceUrl - URL of a central federation service that provides a list of federation peers. The peers will be added to rasfed’s peer registry in addition to any currently existing peers.

    • Default: empty

    • Need to change: NO

  • publicKeysDir - Directory containing public keys of the external peers allowed to send queries to this node.

    • Default: /opt/rasdaman/etc/keys/federation

    • Need to change: NO

  • peerFetchingInterval - set the interval, in seconds, at which rasfed contacts the peer service URL to obtain new peers.

    • Default: 1800 (30 minutes)

    • Need to change: NO

  • rasdamanDatabaseConnectivityString set the connectivity string to the database administered by rasdaman.

    • Default: jdbc:sqlite:/opt/rasdaman/data/RASBASE

    • Need to change: NO for sqlite, YES if RASBASE is stored in postgres, e.g. jdbc:postgresql:RASBASE

    • Need to change: YES when changed in rasmgr.conf

  • rasdamanDatabaseUser set the username for the above database.

    • Default: rasdaman

    • Need to change: NO for sqlite, YES for other backend DBMS

  • rasdamanDatabasePassword set the password for the above username.

    • Default: rasdaman

    • Need to change: NO for sqlite, YES for other backend DBMS

  • rasdamanUrl URL of rasdaman database serving raster data

    • Default: http://localhost:7001

    • Need to change: YES when changed in rasmgr.conf

  • rasdamanDatabase name of rasdaman database serving raster data. Recommendation: use rasdaman standard name, RASBASE.

    • Default: RASBASE

    • Need to change: YES when changed in rasmgr.conf

  • rasdamanUser A rasdaman user that has at least READ rights to the rasdaman service.

    • Default: rasguest

    • Need to change: YES when changed in rasdaman

  • rasdamanPassword set the password for the above username.

    • Default: rasguest

    • Need to change: YES when changed in rasdaman

  • listeningPort set the port on which rasfed listens.

    • Default: 7000

    • Need to change: NO

  • hostname set the hostname advertised by the current node in the federation; must coincide with what other nodes use in inpeer statements.

    • Default: output of hostname command

    • Need to change: YES when the hostname is not consistent in the federation

  • maxNumberOfRestartTries set the number of restarts rasfed should perform in case of an error, until it gives up.

    • Default: 5

    • Need to change: NO

  • noContactMaxTime set the maximal period (in milliseconds) during which a peer server doesn’t send any messages, before being considered inactive.

    • Default: 30000 (30 seconds)

    • Need to change: NO

  • pathToRasdamanBinaries set the path to rasdaman bin directory.

    • Default: /opt/rasdaman/bin

    • Need to change: NO

  • statusMessageUpdateTimeInterval set the polling interval in milliseconds.

    • Default: 10000 (10 seconds)

    • Need to change: NO

  • restartServiceDelay set the number of milliseconds that need to pass between consecutive restarts, in case of failure.

    • Default: 1000 (1 second)

    • Need to change: NO

  • healthCheckTimeout set the number of milliseconds that a node has to respond to a health check message before being considered unhealthy.

    • Default: 500 (0.5 seconds)

    • Need to change: NO

  • petascopeEndPoint is the endpoint URL of petascope on this peer node. NOTE: the endpoint must be accessible from other peer nodes. Usually it is http://hostname:8080/rasdaman/ows with the hostname as configured above.

    • Default: http://localhost:8080/rasdaman/ows

    • Need to change: YES when a new hostname for this node is set

  • readFromOutpeerTimeoutMs set the timeout in milliseconds for reading a message from a remote outpeer node; increase the value if the network is too slow.

    • Default: 10000 (10 seconds)

    • Need to change: NO

  • writeToOutpeerTimeoutMs set the timeout in milliseconds for writing a message to an outpeer node; increase the value if the network is too slow.

    • Default: 10000 (10 seconds)

    • Need to change: NO

  • define inpeer hostname - define a remote rasdaman host from which requests over data on this rasdaman node will be accepted. Additional arguments for configuring federated access control are supported; see Federated Access Control [RE].

    Example: define inpeer acme.com

  • define outpeer hostname [-port portnumber] - define a remote rasdaman host to which this rasdaman node may send subqueries for execution over data available on that host. Optionally, the port number on which rasfed on the remote host is listening may be specified if it differs from the default port of 7000. Additional arguments for configuring federated access control are supported; see Federated Access Control [RE].

    Example: define outpeer 192.168.28.10 -port 7000

In summary, customization is typically required for hostname and petascopeEndPoint, rasdamanUser and rasdamanPassword (when they change in rasdaman), as well as for defining the inpeer/outpeer nodes.

2.8.3. Enabling trust

In order to accept queries from another federation member, trust must be established first. This is done by placing the public key of the trusted peer in /opt/rasdaman/etc/keys/federation.

Similarly, the local public key must be placed on those federation members that should accept queries from the local node. The local public key can be found in /opt/rasdaman/etc/keys/local and is named hostname.pub, where hostname is the name advertised in the federation by the local node.

The keys are also used for authenticating federation requests.

2.8.4. Disabling federation access

Disabling network communication between nodes can be preset in rasfed.properties by setting enable to no and restarting rasdaman. Alternatively, communication with particular nodes can be stopped as follows.

2.8.4.1. No queries to outpeers

In rasfed.properties, for the outpeers that you want to disable, do one of the following:

  • Delete outpeer lines

  • Put outpeer lines in comments (i.e., prefix with “#”)

  • Add option -disable to the define outpeer lines

2.8.4.2. No queries from inpeers

In rasfed.properties, for the inpeers that you want to disable, do one of the following:

  • Delete inpeer lines

  • Put inpeer lines in comments (i.e., prefix with “#”)

  • Add option -disable to the define inpeer lines

2.8.4.3. Restart server

If rasfed.properties has been edited, rasdaman needs to be restarted on this node to make the changes effective.
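Before restarting, it can be useful to check which peers will remain active after editing. The sketch below scans the text of a rasfed.properties file and lists the peers that are neither deleted, commented out, nor marked with -disable. The function name active_peers is hypothetical, and placing -disable after the hostname is an assumption about the exact syntax.

```python
def active_peers(properties_text):
    """Return the inpeer and outpeer hostnames that are still enabled."""
    peers = {"inpeer": [], "outpeer": []}
    for raw in properties_text.splitlines():
        tokens = raw.split()
        # Skips blank lines, '#' comments and key=value settings.
        if len(tokens) < 3 or tokens[0] != "define" or tokens[1] not in peers:
            continue
        if "-disable" in tokens[3:]:
            continue  # peer explicitly disabled
        peers[tokens[1]].append(tokens[2])
    return peers

sample = """\
hostname=A
define inpeer B -role-map read < read
# define inpeer C
define outpeer B -role-map read > read
define outpeer D -disable
"""
print(active_peers(sample))
```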

2.8.5. Federation Configuration Example

The following configuration scheme works well for setting up a federation between machine A and machine B.

Important: on both machines ports 7000-7011, as well as 8080 have to be opened to external peer nodes (rasfed - 7000, rasmgr - 7001, rasservers - 7002 - 7011, petascope - 8080).
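Whether the required ports are reachable from a peer can be probed with a short script run on the other machine. This is a sketch only: the helper names port_is_open and closed_ports are hypothetical, and a successful TCP connection merely shows that the port is open, not that the expected service is behind it.

```python
import socket

# Ports named in the text: rasfed 7000, rasmgr 7001, rasservers 7002-7011, petascope 8080.
REQUIRED_PORTS = list(range(7000, 7012)) + [8080]

def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def closed_ports(host, ports=REQUIRED_PORTS):
    """List the required ports on host that could not be reached."""
    return [port for port in ports if not port_is_open(host, port)]

# Example, run on machine A to probe machine B:
#   print(closed_ports("B"))
print(REQUIRED_PORTS)
```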

2.8.5.1. machine A

  • IP: 1.2.3.4

  • In /etc/hosts:

    127.0.0.1 A
    5.6.7.8   B

  • In /opt/rasdaman/etc/rasfed.properties:

    hostname=A
    petascopeEndPoint=http://A:8080/rasdaman/ows
    ...

    # add the following lines at the end of the file
    define inpeer B -role-map read < read
    define outpeer B -role-map read > read

  • In /opt/rasdaman/etc/rasmgr.conf:

    # change the -host value to A in all places where it occurs, e.g.
    define srv N1 -host A -type n -port 7002 -dbh rasdaman_host

  • Establish trust by restarting rasdaman, then placing the public key file B.pub in /opt/rasdaman/etc/keys/federation. B.pub is automatically generated on machine B when rasdaman starts, under /opt/rasdaman/etc/keys/local/B.pub. After placing the public key B.pub in the corresponding directory, rasdaman needs to be restarted.

2.8.5.2. machine B

  • IP: 5.6.7.8

  • In /etc/hosts:

    127.0.0.1 B
    1.2.3.4   A

  • In /opt/rasdaman/etc/rasfed.properties:

    hostname=B
    petascopeEndPoint=http://B:8080/rasdaman/ows
    ...

    # add the following lines at the end of the file
    define inpeer A -role-map read < read
    define outpeer A -role-map read > read

  • In /opt/rasdaman/etc/rasmgr.conf:

    # change the -host value to B in all places where it occurs, e.g.
    define srv N1 -host B -type n -port 7002 -dbh rasdaman_host

  • Establish trust by restarting rasdaman, then placing the public key file A.pub in /opt/rasdaman/etc/keys/federation. A.pub is automatically generated on machine A when rasdaman starts, under /opt/rasdaman/etc/keys/local/A.pub. After placing the public key A.pub in the corresponding directory, rasdaman needs to be restarted.

Note

The -role-map parameter defines the translation between the authorization info passed in subqueries. In the example of machine A, all users from machine B having role read will be able to do federated queries. See Federated Access Control [RE] for more details.

2.9. Billing [RE]

The rasdaman server can optionally record the actual per-user resource consumption (collectively called “billing information” below) for the local installation (not across a federation). These statistics are kept in a standard relational table so that invoicing information can be extracted and aggregated with standard SQL methods.

Further, administrator-defined quotas are evaluated prior to query execution by comparing a cost estimate against the existing resource consumption records and the user’s specific resource limits; if the limits are exceeded, the query is rejected.

2.9.1. Control Billing Records Collection

Billing by default is switched off, but can be enabled through the define billingrecords statement in both rasmgr.conf and the rascontrol command line utility:

Syntax

define billingrecords [on|off] -connect /opt/rasdaman/data/RASSTATS
save # rascontrol only

When changed, rasdaman needs to be restarted in order for the change to take effect.

2.9.2. Quota Evaluation

Upon arrival of a query and with billing enabled the following happens:

  • Get per-query and accumulated use limits from billing tables;

  • Estimate current query costs;

  • If the estimated cost is greater than the per-query limit, or the accumulated cost plus the estimated cost is greater than the cumulative limit, then the query is aborted for exceeding the defined quotas;

  • Otherwise the query is executed and billing tables are updated with the actual query costs.
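The decision logic in the steps above can be sketched as follows. This is a simplified illustration under stated assumptions, not the server's implementation: it tracks a single cost value, whereas rasdaman keeps separate limits for processing, access, transfer, and result size, and the names Quota, admit_query, and run_with_billing are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Quota:
    per_query_limit: float   # maximum cost allowed for a single query
    cumulative_limit: float  # maximum accumulated cost, e.g. per month


def admit_query(estimated_cost, accumulated_cost, quota):
    """Return True if the query may run, False if it must be rejected."""
    if estimated_cost > quota.per_query_limit:
        return False
    if accumulated_cost + estimated_cost > quota.cumulative_limit:
        return False
    return True


def run_with_billing(estimated_cost, accumulated_cost, quota, execute):
    """Check the quota, run the query, and return the new accumulated cost.

    `execute` is a hypothetical callable that runs the query and returns its
    actual cost; the real server records this in the billing tables instead.
    """
    if not admit_query(estimated_cost, accumulated_cost, quota):
        raise PermissionError("query rejected: quota exceeded")
    actual_cost = execute()
    return accumulated_cost + actual_cost
```

Note that admission is based on the estimate, while the billing records are updated with the actual cost, so the two can differ.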

2.9.3. Database Schema

The SQLite database RASSTATS contains all billing and quota relevant information allowing administrators to generate reports with standard tools. The path to the database is configured in rasmgr.conf with the -connect option of define billingrecords.

The following tables are defined in this database:

  • RAS_BILLING_USERS: Contains the users for which billing records are kept along with quota limits for each user; this table has to be maintained manually by the administrator, reflecting customer agreements.

  • RAS_BILLING_METRICS: Stores the executed queries and their resource consumption statistics.

2.9.3.1. Detailed table structure

2.9.3.1.1. RAS_BILLING_USERS

▶ show

2.9.3.1.2. RAS_BILLING_METRICS

▶ show

2.9.3.1.3. Table Query Examples
  • “Start monitoring resource consumption of user X”:

    INSERT INTO RAS_BILLING_USERS VALUES ('X', ...)
    
  • “What are the per-query limits for user U?”

    SELECT limit_processing_per_query, limit_access_per_query,
           limit_transfer_per_query, limit_result_per_query
    FROM RAS_BILLING_USERS u
    WHERE u.user='U'
    
  • “What are the remaining per-month resources of user U in March 2020?” (positive values mean that something is left)

    SELECT u.limit_processing_per_month - SUM( m.cycles_spent ) as 'Remaining Processing Resources',
           u.limit_access_per_month     - SUM( m.bytes_accessed ) as 'Remaining Access Bytes',
           u.limit_transfer_per_month   - SUM( m.bytes_transferred ) as 'Remaining Transfer Bytes',
           u.limit_result_per_month     - SUM( m.bytes_delivered )  as 'Remaining Download Bytes'
    FROM RAS_BILLING_USERS u, RAS_BILLING_METRICS m
    WHERE u.user='U' AND u.user=m.user AND
          m.query_start BETWEEN '2020-03-01' AND '2020-03-31'
    
  • “Number of queries sent by user X”:

    SELECT COUNT(*) as 'Query Number'
    FROM RAS_BILLING_USERS u, RAS_BILLING_METRICS m
    WHERE u.user='X' AND u.user=m.user
    
  • “All queries sent by user X”:

    SELECT m.query
    FROM RAS_BILLING_METRICS m
             INNER JOIN RAS_BILLING_USERS u ON u.user = m.user
    WHERE u.user = 'X';
    
  • “rasdaman resource usage for user X in March 2020”:

    SELECT SUM(m.cycles_spent), SUM(m.bytes_accessed),
           SUM(m.bytes_transferred), SUM(m.bytes_delivered)
    FROM RAS_BILLING_USERS u, RAS_BILLING_METRICS m
    WHERE u.user='X' AND u.user=m.user AND
          m.query_start BETWEEN '2020-03-01' AND '2020-03-31'
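Reports like the ones above can also be scripted. The sketch below uses Python's sqlite3 module against an in-memory database with a reduced schema; the schema is an assumption that contains only the columns referenced in the query, and query_start is assumed to hold ISO-8601 timestamp text. Under that assumption, a half-open date range is safer than BETWEEN '2020-03-01' AND '2020-03-31', which would skip queries started during the last day of the month.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the RASSTATS file
conn.executescript("""
    -- Reduced, assumed schema: only the columns used by the query below.
    CREATE TABLE RAS_BILLING_USERS (
        "user" TEXT PRIMARY KEY,
        limit_processing_per_month INTEGER
    );
    CREATE TABLE RAS_BILLING_METRICS (
        "user" TEXT,
        query_start TEXT,       -- assumed ISO-8601 timestamp text
        cycles_spent INTEGER
    );
""")
conn.execute("INSERT INTO RAS_BILLING_USERS VALUES ('U', 1000)")
conn.executemany(
    "INSERT INTO RAS_BILLING_METRICS VALUES (?, ?, ?)",
    [
        ("U", "2020-03-05 10:00:00", 200),
        ("U", "2020-03-31 18:30:00", 300),  # last day of March
        ("U", "2020-04-01 00:00:00", 400),  # April, must not be counted
    ],
)


def remaining_processing(conn, user, month_start, next_month_start):
    """Return the remaining monthly processing budget, or None if unknown user."""
    row = conn.execute(
        """
        SELECT u.limit_processing_per_month - COALESCE(SUM(m.cycles_spent), 0)
        FROM RAS_BILLING_USERS u
        LEFT JOIN RAS_BILLING_METRICS m
               ON m.user = u.user
              AND m.query_start >= ? AND m.query_start < ?
        WHERE u.user = ?
        GROUP BY u.user
        """,
        (month_start, next_month_start, user),
    ).fetchone()
    return None if row is None else row[0]


print(remaining_processing(conn, "U", "2020-03-01", "2020-04-01"))  # prints 500
```

The LEFT JOIN with COALESCE returns the full limit for a user who has not yet sent any query in the given month.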
    

2.9.4. Web APIs

Petascope provides Web APIs to query the billing records collected by rasdaman when this is enabled in rasmgr.conf.

2.9.4.1. Quota

2.9.4.1.1. View quota

The endpoint for viewing quotas is at /rasdaman/admin/billing/users/list.

Table 2.1 KVP parameters for view users quota API

Key     Type     Description
------  -------  -----------
user    TEXT     Optional; if specified, the quota of this user is selected. If not specified, the quotas of all users are selected.
LIMIT   INTEGER  Optional; default is 1000. If specified, return at most this many rows.
OFFSET  INTEGER  Optional; default is 0. If specified, return the results starting from this row number.

A user can view

  • their own quota if privilege PRIV_SELF_MGMT is granted

  • quotas of other users if privilege PRIV_USER_MGMT is granted

The result is an array of JSON objects, for example:

▶ show

Examples

  • View quota of user rasguest:

    ▶ show

  • View quota of all users and select only top 2 rows:

    ▶ show

2.9.4.1.2. Update quota

The endpoint for updating a user’s quota is at /rasdaman/admin/billing/users/update. Only settings specified in the request will be updated, other settings are not changed.

Table 2.2 KVP parameters for updating user’s quota API

Key                      Type  Description
-----------------------  ----  -----------
user                     TEXT  Required; the user whose quota is updated
LIMITPROCESSINGPERQUERY  TEXT  Optional; processing limit that a single query must not exceed
LIMITACCESSPERQUERY      TEXT  Optional; disk volume limit that a single query must not exceed
LIMITRESULTPERQUERY      TEXT  Optional; result size limit that a single query must not exceed
LIMITTRANSFERPERQUERY    TEXT  Optional; federation transfer limit that a single query must not exceed
LIMITPROCESSINGPERMONTH  TEXT  Optional; processing limit that queries must not exceed in a month
LIMITACCESSPERMONTH      TEXT  Optional; disk volume limit that queries must not exceed in a month
LIMITRESULTPERMONTH      TEXT  Optional; result size limit that queries must not exceed in a month
LIMITTRANSFERPERMONTH    TEXT  Optional; federation transfer limit that queries must not exceed in a month

Only users with privilege PRIV_USER_MGMT can perform the request.

If the request succeeds, petascope returns an empty string.

Examples

  • Update quota of user rasguest:

    ▶ show
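As an illustration of how such a request could be assembled, the sketch below builds an update URL that contains only the limits to be changed. It is a hedged example: the base address http://localhost:8080, the helper name quota_update_url, and the plain-number value format are assumptions, and authentication and sending the request are deliberately left out.

```python
from urllib.parse import urlencode

UPDATE_ENDPOINT = "/rasdaman/admin/billing/users/update"


def quota_update_url(base_url, user, **limits):
    """Build the update URL, including only the limits that were given."""
    params = {"user": user}
    for key, value in limits.items():
        if value is not None:
            params[key.upper()] = str(value)
    return base_url.rstrip("/") + UPDATE_ENDPOINT + "?" + urlencode(params)


# http://localhost:8080 is an assumed petascope address.
url = quota_update_url(
    "http://localhost:8080",
    "rasguest",
    limitresultperquery=200000000,
    limitaccesspermonth=None,  # left out of the request, so it stays unchanged
)
print(url)
```

Dropping unset values matches the behavior described above: settings that are not part of the request keep their current value.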

2.9.4.2. Metrics

2.9.4.2.1. View query metrics

The endpoint for viewing metrics is at /rasdaman/admin/billing/metrics/list.

Table 2.3 KVP parameters for view billing metrics API

Key     Type     Description
------  -------  -----------
user    TEXT     Optional; if specified, the metrics of this user are selected. If not specified, the metrics of all users are selected.
LIMIT   INTEGER  Optional; default is 1000. If specified, return at most this many rows.
OFFSET  INTEGER  Optional; default is 0. If specified, return the results starting from this row number.
query   TEXT     Optional; if specified, a custom SQL SELECT query is run to retrieve more complex statistics (see Detailed table structure for the table columns). NOTE: if an SQL query is given, the other KVP parameters are ignored.

A user can view

  • their own metrics if privilege PRIV_SELF_MGMT is granted

  • metrics of other users if privilege PRIV_USER_MGMT is granted

The result is an array of JSON objects, for example:

▶ show

Examples

  • View metrics of user rasguest:

    ▶ show

  • View metrics of all users and select only top 10 rows:

    ▶ show

  • View metrics of user rasguest directly via SQL query:

    ▶ show
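When an SQL statement is passed in the query parameter, it has to be percent-encoded, since it contains spaces, quotes, and other reserved characters. The following sketch shows one way to do this with the Python standard library; the base address and the helper name metrics_query_url are assumptions, and the request itself (including authentication) is not shown.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit

METRICS_ENDPOINT = "/rasdaman/admin/billing/metrics/list"


def metrics_query_url(base_url, sql):
    """Build a metrics URL carrying a percent-encoded SQL SELECT statement."""
    encoded = urlencode({"query": sql}, quote_via=quote)
    return base_url.rstrip("/") + METRICS_ENDPOINT + "?" + encoded


sql = "SELECT COUNT(*) FROM RAS_BILLING_METRICS m WHERE m.user = 'rasguest'"
url = metrics_query_url("http://localhost:8080", sql)
print(url)

# Decoding the URL again yields the original statement.
decoded = parse_qs(urlsplit(url).query)["query"][0]
```

Using quote instead of the default quote_plus encodes spaces as %20 rather than +, which avoids ambiguity if the server does not treat + as a space.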

2.9.4.3. External Tomcat

When rasdaman.war (petascope) is deployed in an external Tomcat (/var/lib/tomcat9/webapps), additional configuration is needed for this API to work.

  1. The tomcat and rasdaman system users must be able to read/write in the directory where RASSTATS is located.

    We recommend putting RASSTATS in its own directory that is readable and writable by both, e.g. /opt/rasdaman/data/rasstats/, and setting permissions as follows:

▶ show

  2. By default on Ubuntu systems, the Tomcat service is managed by systemd, and it is restricted in what paths it can access on the system. The systemd service configuration can be found in /etc/systemd/system/multi-user.target.wants/tomcat9.service; here it is necessary to add the following in the [Service] section:

    [Service]
    ...
    ReadWritePaths=/opt/rasdaman/data/rasstats/

    Once updated, run sudo systemctl daemon-reload.

  3. Finally, restart rasdaman and tomcat9.

2.10. UDF packages [RE]

User-Defined Functions - UDFs [RE] allow extending the built-in functionality of rasdaman at runtime. For convenience, rasdaman is shipped with several pre-packaged UDF collections ready to be used in rasql and WCPS queries.

2.10.1. Intel MKL

The Intel Math Kernel Library (Intel MKL) is a library of optimized math functions for science, engineering, and financial applications. Core math functions include BLAS, LAPACK, ScaLAPACK, sparse solvers, fast Fourier transforms, and vector math.

The mkl UDF namespace contains various functions which act as a bridge to the corresponding MKL functions. In order to make these functions available in rasdaman queries, it is necessary to explicitly execute the following commands:

UDF_DIR=/opt/rasdaman/share/rasdaman/udf
$UDF_DIR/libmkl_install.sh <rasdaman_user> <rasdaman_password>

To run the UDF the following packages need to be installed with apt: libmkl-rt libomp5. When installing libmkl-rt, apt will show a dialog asking whether to “Use libmkl_rt.so as the default alternative to BLAS/LAPACK?”: select “<Yes>”, then “<Ok>” at the next screen. If you selected “<No>”, you can update the setting with sudo dpkg-reconfigure libmkl-rt.

The available LAPACK functions are listed below. All functions expect one or two 2-dimensional arrays of base type float or double. For extensive information on the functions refer to the official documentation.

2.10.1.1. LAPACK: Linear Equations

2.10.1.1.1. gesv
array mkl.gesv (array A, array B)

Computes the solution to the system of linear equations with a square coefficient matrix A and multiple right-hand sides. Returns the solution matrix X with same spatial domain as B.
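To make the input and output shapes concrete, here is a small pure-Python sketch of the computation that gesv performs: solving A X = B for a square A and several right-hand sides. It uses Gaussian elimination with partial pivoting rather than calling MKL, so it only illustrates the mathematics; the function name solve_linear_system is hypothetical.

```python
def solve_linear_system(a, b):
    """Solve A X = B by Gaussian elimination with partial pivoting.

    a: square matrix as a list of n rows; b: n rows of nrhs right-hand sides.
    Returns X with the same shape as b.
    """
    n = len(a)
    # Work on an augmented copy [A | B] so the inputs are left untouched.
    m = [list(map(float, a[i])) + list(map(float, b[i])) for i in range(n)]
    for col in range(n):
        # Partial pivoting: move the row with the largest entry into place.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            raise ValueError("matrix A is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, len(m[row])):
                m[row][k] -= factor * m[col][k]
    # Back substitution, one right-hand-side column at a time.
    nrhs = len(b[0])
    x = [[0.0] * nrhs for _ in range(n)]
    for j in range(nrhs):
        for i in reversed(range(n)):
            s = m[i][n + j] - sum(m[i][k] * x[k][j] for k in range(i + 1, n))
            x[i][j] = s / m[i][i]
    return x


a = [[2, 1], [1, 3]]
b = [[3, 5], [5, 10]]          # two right-hand sides, stored as columns
x = solve_linear_system(a, b)  # approximately [[0.8, 1.0], [1.4, 3.0]]
```

As in the description above, the result has the same shape as B: one solution column per right-hand side.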

2.10.1.1.2. posv
array mkl.posv (array A, array B, string uplo)

Computes the solution to the system of linear equations with a symmetric or Hermitian positive-definite coefficient matrix A and multiple right-hand sides. Returns the solution matrix X with same spatial domain as B.

2.10.1.1.3. sysv
array mkl.sysv (array A, array B, string uplo)

Computes the solution to the system of linear equations with a real or complex symmetric coefficient matrix A and multiple right-hand sides. Returns the solution matrix X with same spatial domain as B.

2.10.1.2. LAPACK: Linear Least Squares

Linear Least Squares (LLS) Problems: LAPACK Driver Routines

2.10.1.2.1. gels
array mkl.gels (array A, array B, string trans)

Uses QR or LQ factorization to solve an overdetermined or underdetermined linear system with a full-rank matrix. Returns a matrix containing the solution vectors (output parameter b) of size nrhs x m.

2.10.1.2.2. gelsd
array mkl.gelsd (array A, array B)

Computes the minimum-norm solution to a linear least squares problem using the singular value decomposition of A and a divide and conquer method. Returns a matrix containing the solution vectors (output parameter b) of size nrhs x n.

2.10.1.3. LAPACK: Generalized Linear Least Squares

Generalized Linear Least Squares (LLS) Problems: LAPACK Driver Routines

2.10.1.3.1. gglse
array mkl.gglse (array A, array B, array c, array d)

Solves the linear equality-constrained least squares problem using a generalized RQ factorization. Returns a 1-D array of size n containing the solution of the LSE problem (output parameter x).

2.10.1.4. LAPACK: Symmetric Eigenvalue

Symmetric Eigenvalue Problems: LAPACK Driver Routines

2.10.1.4.1. syev
array mkl.syev (array A, string jobz, string uplo)

Computes all eigenvalues and, optionally, eigenvectors of a real symmetric matrix. Returns a 1-D array of size n containing the eigenvalues (output parameter w).

2.10.1.4.2. syevd
array mkl.syevd (array A, string jobz, string uplo)

Computes all eigenvalues and, optionally, all eigenvectors of a real symmetric matrix using a divide and conquer algorithm. Returns a 1-D array of size n containing the eigenvalues (output parameter w).

2.10.1.4.3. syevx
array mkl.syevx (array A, long nselect, string jobz, string r, string uplo)

Computes selected eigenvalues and, optionally, eigenvectors of a symmetric matrix. Returns a 1-D array of size m containing the selected eigenvalues of the matrix A in ascending order (output parameter w).

2.10.1.4.4. syevr
array mkl.syevr (array A, long nselect, string jobz, string r, string uplo)

Computes selected eigenvalues and, optionally, eigenvectors of a real symmetric matrix using the Relatively Robust Representations. Returns a 1-D array of size m containing the selected eigenvalues of the matrix A in ascending order (output parameter w).

2.10.1.5. LAPACK: Nonsymmetric Eigenvalue

Nonsymmetric Eigenvalue Problems: LAPACK Driver Routines

2.10.1.5.1. geev
array mkl.geev (array A, string jobvl, string jobvr)

Computes the eigenvalues and left and right eigenvectors of a general matrix. Returns a 1-D array of size n containing the computed complex eigenvalues (output parameters wr and wi).

2.10.1.6. LAPACK: Singular Value Decomposition

Singular Value Decomposition: LAPACK Driver Routines

2.10.1.6.1. gesvd
array mkl.gesvd (array A, string jobu, string jobvt)

Computes the singular value decomposition of a general rectangular matrix. Returns a 1-D array of size n containing the singular values of A sorted so that s[i] >= s[i + 1] (output parameter s).

2.10.1.6.2. gesdd
array mkl.gesdd (array A, string jobz)

Computes the singular value decomposition of a general rectangular matrix using a divide and conquer method. Returns a 1-D array of size n containing the singular values of A sorted so that s[i] >= s[i + 1] (output parameter s).

2.10.1.6.3. gejsv
array mkl.gejsv (array A, string joba, string jobu, string jobv, string jobr, string jobt, string jobp)

Computes the singular value decomposition using a preconditioned Jacobi SVD method. Returns a 1-D array of size n containing the singular values of A (output parameter sva).

2.10.1.7. LAPACK: Cosine-Sine Decomposition

Cosine-Sine Decomposition: LAPACK Driver Routines

TODO

2.11. Security

2.11.1. General

There are several security measures available, which should be considered seriously. Among them are the access right mechanisms found in Tomcat, rasdaman, and PostgreSQL. We highly recommend making use of these.

For Tomcat and PostgreSQL refer to the pertaining documentation. The servlet is safe against SQL injection attacks - we are not aware of any means for the user to send custom queries to the PostgreSQL server or the rasdaman server. XSRF and XSS represent no danger to the service because there is no user generated content available.

For rasdaman, we recommend changing the default user passwords (rasguest/rasguest for read-only access, rasadmin/rasadmin for read-write and administrator access) to avoid the Oracle “Scott/tiger” trap. Even better, use additional separate, private users. The rasdaman service doesn’t use cookies.

2.11.2. Reset default passwords

This is a quick guide to reset the default rasdaman users, rasguest and rasadmin, which are created when the system is first installed and initialized. Pre-requisites: rasdaman must be up and running.

  1. Reset rasguest

  • Update password in rasdaman:

    $ rasql -q 'ALTER USER rasguest SET PASSWORD TO "newpassword"' \
            --user rasadmin --passwd rasadmin
    
  • Update rasdaman_pass (for read-only access from petascope) in /opt/rasdaman/etc/petascope.properties to set it to the newpassword.

  • Update rasdamanPassword in /opt/rasdaman/etc/rasfed.properties to set it to the newpassword.

  • Grant PRIV_OWS_WCS_PROCESS_COV role so that the rasguest user will be able to run WCPS queries:

    $ rasql -q 'GRANT PRIV_OWS_WCS_PROCESS_COV to rasguest' \
            --user rasadmin --passwd rasadmin
    
  2. Reset rasadmin

  • Update password in rasdaman:

    $ rasql -q 'ALTER USER rasadmin SET PASSWORD TO "newpassword"' \
            --user rasadmin --passwd rasadmin

  • Update rasdaman_admin_pass (for read-write access from petascope) in /opt/rasdaman/etc/petascope.properties to set it to the newpassword.

  • Update RASLOGIN env variable in /etc/default/rasdaman:

    # get an md5 hash of the new password
    $ echo -n "newpassword" | md5sum | awk '{ print $1; }'

    # update the previous md5 hash of RASLOGIN by editing the file
    $ nano /etc/default/rasdaman

    # login credentials for non-interactive rasdaman start/stop
    RASLOGIN=rasadmin:<new md5 password hash>

  3. Restart rasdaman:

    sudo systemctl restart rasdaman

  4. Restart tomcat (if external tomcat is configured in petascope.properties):

    # find the tomcat unit name, e.g. tomcat9.service
    tomcat_service=$(systemctl list-units --type=service --no-legend | grep -io 'tomcat[0-9]*\.service' | head -n 1)
    sudo systemctl restart "$tomcat_service"
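The RASLOGIN value used when resetting the rasadmin password can also be produced programmatically. The sketch below computes the same hex digest as the echo -n ... | md5sum pipeline in that step; the helper name raslogin_line is an assumption, and writing the line into /etc/default/rasdaman is left to the administrator.

```python
import hashlib


def raslogin_line(user, password):
    """Return the RASLOGIN line with the MD5 hex digest of the password."""
    digest = hashlib.md5(password.encode("utf-8")).hexdigest()
    return f"RASLOGIN={user}:{digest}"


print(raslogin_line("rasadmin", "newpassword"))
```

The password is hashed without a trailing newline, which corresponds to the -n option of echo in the shell variant.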
    

2.11.3. Require authentication for API access

To make sure that geo requests to the /rasdaman/ows endpoint will require authentication, it is necessary to configure the authentication_type setting in petascope.properties to a non-empty value.

Full details can be found here.

2.11.4. Allow anonymous API access

It is possible to allow unauthenticated access in addition to supporting authenticated access.

Rasdaman must be configured as in the previous section (Require authentication for API access). In addition, the credentials of a rasdaman user need to be set in the rasdaman_user and rasdaman_pass settings in petascope.properties. Any unauthenticated API request will be executed internally as this rasdaman user, which can be restricted with appropriate privileges, triggers, and quotas.

2.11.5. Whitelist access control

A trigger allows defining an access control rule for all rasdaman users. Triggers can be defined in the Web admin interface, or with the rasql command-line tool. Once a trigger is created, specific rasdaman users can be exempted (i.e. whitelisted) from the trigger rule.

Currently the following resources can be restricted with triggers.

2.11.5.1. Completely disable access to a coverage

Disabling access to a coverage can be achieved with the accessed (<coverage-id>) condition. In this case, a user will not see the restricted coverage at all in GetCapabilities, unless they have been exempted from the trigger, or have been granted the PRIV_LIST_ALL_COLLS privilege.

To restrict access to a coverage C a trigger like this can be created:

CREATE TRIGGER C_disable_access
WHEN accessed(C)
BEGIN EXCEPTION "Access forbidden to C." END

Once created, access will be immediately blocked for all rasdaman users. Each userX that needs to be allowed access must be explicitly exempted:

GRANT EXEMPTION FROM C_disable_access TO userX

If multiple users should have the same rules for access, a role can be created, exemptions can be granted to it as needed, and then the role can be granted to rasdaman users. For example:

CREATE ROLE staff
GRANT EXEMPTION FROM C1_disable_access TO staff
GRANT EXEMPTION FROM C2_disable_access TO staff
...
GRANT staff TO user1
GRANT staff TO user2
...

2.11.5.2. Partially disable access to a coverage

Similar to the previous section, but access is disabled only for a spatio-temporal subset of the coverage, rather than the whole coverage. This allows restricting access to sensitive areas within a coverage.

To achieve this, use the accessed(<coverage>, <subset>) condition. Currently the <subset> must be specified in grid coordinates. For example:

CREATE TRIGGER C_disable_partial_access
WHEN accessed(C, [1000:1500, -50:250])
BEGIN EXCEPTION "Access forbidden to this area in C." END
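To reason about which requests such a trigger would affect, it can help to check whether a requested grid subset overlaps the restricted one. The sketch below does this for inclusive grid intervals. It rests on the assumption that the trigger fires whenever the accessed region intersects the restricted subset; the function name intersects is hypothetical and the code is not part of rasdaman.

```python
def intersects(request, restricted):
    """True if two grid subsets overlap on every axis.

    Each subset is a list of inclusive (low, high) bounds, one pair per axis.
    """
    if len(request) != len(restricted):
        raise ValueError("subsets must have the same number of axes")
    return all(lo1 <= hi2 and lo2 <= hi1
               for (lo1, hi1), (lo2, hi2) in zip(request, restricted))


# Restricted area from accessed(C, [1000:1500, -50:250])
RESTRICTED = [(1000, 1500), (-50, 250)]

print(intersects([(900, 1000), (0, 10)], RESTRICTED))  # True: touches x = 1000
print(intersects([(0, 999), (0, 10)], RESTRICTED))     # False: ends before x = 1000
```

Two subsets overlap only if their intervals overlap on all axes, so a request that stays outside the restricted range on a single axis is unaffected.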

2.11.5.3. Limit data access/download per request

It is possible to limit how much data users can access from disk, or download over the network, per request.

CREATE TRIGGER max_data_access_trigger
WHEN (CONTEXT.ACCESSVOLUME > 200000000)
BEGIN EXCEPTION "Data access is restricted to 200 MB." END

CREATE TRIGGER max_data_result_trigger
WHEN (CONTEXT.RESULTVOLUME > 200000000)
BEGIN EXCEPTION "Data download is restricted to 200 MB." END

2.11.6. Blacklist access control

The quota management capabilities of rasdaman allow setting rules per user. Quotas on resource usage can be defined per request, as well as aggregated per calendar month. This is most conveniently managed in the Web admin interface.

2.12. Backup

Both software and hardware can fail, therefore it is prudent to establish regular backup procedures. The rasdaman installation comes with a utility script /opt/rasdaman/bin/backup_rasdaman that can be used to easily back up the rasdaman installation, databases, and external data. Below we list the data that should be considered if backing up rasdaman manually:

  1. The rasdaman database, which normally can be found in /opt/rasdaman/data. The SQL database itself in this directory, RASBASE, is fairly small; the TILES subdirectory may be large as it contains all the array data, but if backup disk space is not scarce it is definitely recommended to back it up as well. Incremental backups of TILES, with rsync for example, should work well without unnecessary duplicate copying, unless existing data areas are often updated. Example with rsync:

    # backup small RASBASE to /backup/rasdamandb
    rsync -avz /opt/rasdaman/data/RASBASE /backup/rasdamandb/
    
    # backup potentially large TILES dir to /backup/rasdamandb
    rsync -avz /opt/rasdaman/data/TILES /backup/rasdamandb/
    
  2. The petascopedb geo metadata database is usually small and worth backing up. By default it is stored in PostgreSQL and can be extracted into a small compressed archive as follows:

    # create backup in a gzip archive petascopedb.sql.gz
    sudo -u postgres pg_dump petascopedb | gzip > /backup/petascopedb.sql.gz
    

    If necessary, it can be restored with

    # if a petascopedb already exists it needs to be renamed, as otherwise
    # restoring over an existing petascopedb will corrupt it
    sudo -u postgres psql -c "ALTER DATABASE petascopedb RENAME TO petascopedb_existing_backup"
    
    # create an empty petascopedb
    sudo -u postgres createdb petascopedb
    
    # restore backup petascopedb.sql.gz (use cat if it's not a gzip archive)
    zcat petascopedb.sql.gz | sudo -u postgres psql -d petascopedb --quiet > /dev/null
    

    Alternatively, if the above fails for some reason, petascopedb can be backed up with this command:

    # note that /backup/petascopedb_backup will contain a large number of compressed files
    sudo -u postgres pg_dump -j 8 -Fd petascopedb -f /backup/petascopedb_backup
    

    If necessary, it can be restored with

    sudo -u postgres pg_restore -j 8 -d petascopedb /backup/petascopedb_backup
    
  3. The rasdaman configuration files in /opt/rasdaman/etc, but also consider the bin and share directories which may be useful in case of package update problems, as well as maybe log files in the log directory.

    # backup everything except the data dir, which is handled in step 1. above
    rsync -avz --exclude='data/' /opt/rasdaman /backup/
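For a scheduled backup, the rsync steps listed above could be collected in a small script. The following sketch builds the commands as argument lists and prints them in a dry run by default. The function names and the dry-run behavior are assumptions, the petascopedb dump from step 2 is not covered, and the script is not a replacement for the bundled backup_rasdaman utility.

```python
import subprocess
from pathlib import PurePosixPath


def backup_commands(backup_dir="/backup", rasdaman_dir="/opt/rasdaman"):
    """Return the rsync steps as argv lists, in the order they should run."""
    backup = PurePosixPath(backup_dir)
    root = PurePosixPath(rasdaman_dir)
    db_target = str(backup / "rasdamandb") + "/"
    return [
        # small RASBASE database
        ["rsync", "-avz", str(root / "data" / "RASBASE"), db_target],
        # potentially large TILES directory
        ["rsync", "-avz", str(root / "data" / "TILES"), db_target],
        # everything else except the data directory
        ["rsync", "-avz", "--exclude=data/", str(root), str(backup) + "/"],
    ]


def run_backup(commands, dry_run=True):
    """Print the commands, or execute them and stop at the first failure."""
    for argv in commands:
        if dry_run:
            print(" ".join(argv))
        else:
            subprocess.run(argv, check=True)


run_backup(backup_commands())  # dry run: only prints the commands
```

Passing the commands as lists avoids shell quoting issues; with dry_run=False, check=True makes a failed rsync abort the remaining steps.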
    

2.13. Migration

2.13.1. From one machine to another

Sometimes it is necessary to migrate the installation from one machine (OLD) to another (NEW). This section outlines the steps on how to do this.

  1. Make sure rasdaman is installed and functional on the NEW machine.

  2. Stop rasdaman and, if installed, the external tomcat on both the OLD and the NEW machine, e.g.:

    sudo service rasdaman stop
    sudo service tomcat9 stop
    
  3. Make a backup of the rasdaman and petascope databases on the OLD machine by following steps 1. and 2. of the backup guide, and copy the backup to the NEW machine.

  4. Restore the database backups on the NEW machine by following step 2. of the backup guide.

  5. Make a backup of the config files on the NEW machine (/opt/rasdaman/etc), then copy relevant configuration from the OLD to the NEW machine, in particular:

    • rasmgr.conf can probably be copied as is, but check if the -host setting is correct for the NEW machine;

    • most settings from petascope.properties can be copied as is, except the ones for database configuration (spring.* and metadata*);

    • if federation is enabled on the OLD machine then most settings from rasfed.properties can be copied, but carefully check the hostname setting;

    • /etc/default/rasdaman can usually be copied as is;

  6. If UDFs have been registered on the OLD machine:

    • if the NEW machine is the same OS and CPU architecture as the OLD then the UDF libraries can probably be synced by copying the /opt/rasdaman/share/rasdaman/udf directory;

    • otherwise, they will likely need to be recompiled on the NEW machine;

  7. Make sure that the /opt/rasdaman directory is owned by the rasdaman user, to avoid any permission issues caused by copying with other system users:

    sudo chown -R rasdaman: /opt/rasdaman
    
  8. Finally start rasdaman and tomcat:

    sudo service rasdaman start
    sudo service tomcat9 start
    

2.13.2. Ubuntu 18.04 to Ubuntu 20.04

  1. Make a backup of the rasdaman and petascope databases by following the backup guide. In particular:

    # postgres version
    OLDVER=10
    # alt 1: create backup in petascopedb.sql.gz; to be restored with psql
    sudo -u postgres pg_dump petascopedb | gzip > /backup/petascopedb.sql.gz
    # alt 2: custom-format backup to be restored with pg_restore
    sudo -u postgres pg_dump --format=custom --compress=5 petascopedb \
      --file=/backup/petascopedb.dump
    # backup postgres databases by direct copy as well just in case
    sudo cp -a /var/lib/postgresql/$OLDVER/main/ /backup/petascopedb_raw_$OLDVER
    # backup postgres config
    sudo cp -a /etc/postgresql/$OLDVER /backup/etc_postgresql_$OLDVER
    # backup rasdaman dir
    sudo cp -a /opt/rasdaman /backup/opt_rasdaman
    

    Disable the rasdaman repo in apt and remove rasdaman:

    REPO_FILE=/etc/apt/sources.list.d/rasdaman.list
    sudo mv $REPO_FILE $REPO_FILE.disabled
    # remove rasdaman package; this won't remove any configuration/data
    sudo service rasdaman stop
    sudo apt remove "$(dpkg -l | grep '^ii *rasdaman' | awk '{ print $2; }')"
    
  2. Upgrade to Ubuntu 20.04:

    # first remove this package as it breaks the upgrade
    sudo apt remove postgresql-10-postgis-2.4
    # then upgrade
    sudo do-release-upgrade
    
  3. Migrate data to new postgres version:

    sudo apt install postgresql-12-postgis-3
    
    OLDVER=10
    NEWVER=12
    
    # ideally one would run this command and be done, but it fails because the old
    # postgresql-10-postgis-2.4 gets removed during the upgrade and it is required
    # in order to do the pg_upgrade. Execute it in any case, as it may migrate
    # at least configuration files like pg_hba.conf
    sudo systemctl stop postgresql.service
    sudo -u postgres /usr/lib/postgresql/$NEWVER/bin/pg_upgrade \
      --old-datadir=/var/lib/postgresql/$OLDVER/main \
      --new-datadir=/var/lib/postgresql/$NEWVER/main \
      --old-bindir=/usr/lib/postgresql/$OLDVER/bin \
      --new-bindir=/usr/lib/postgresql/$NEWVER/bin \
      --old-options "-c config_file=/etc/postgresql/$OLDVER/main/postgresql.conf" \
      --new-options "-c config_file=/etc/postgresql/$NEWVER/main/postgresql.conf"
    sudo systemctl start postgresql.service
    
    # instead we have to restore the backup created in step 1. with psql/pg_restore
    sudo -u postgres -i
    # the new login shell does not inherit the variables set above
    OLDVER=10
    NEWVER=12
    
    #
    # alt 1: restore database with psql
    /usr/lib/postgresql/$NEWVER/bin/createdb -p 5433 petascopedb
    # enter the spring.datasource.password= from /opt/rasdaman/etc/petascope.properties
    /usr/lib/postgresql/$NEWVER/bin/createuser -s -p 5433 petauser -P
    zcat /backup/petascopedb.sql.gz | \
      /usr/lib/postgresql/$NEWVER/bin/psql -p 5433 -d petascopedb > /dev/null
    #
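    # alt 2: restore database with pg_restore; this needs a custom-format
    # dump (pg_dump --format=custom), here /backup/petascopedb.dump
    /usr/lib/postgresql/$NEWVER/bin/pg_restore -p 5433 --create -d postgres /backup/petascopedb.dump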
    #
    
    # swap ports in postgres config, so the new version is at 5432
    sed -i 's/port = 5432/port = 5433/' /etc/postgresql/$OLDVER/main/postgresql.conf
    sed -i 's/port = 5433/port = 5432/' /etc/postgresql/$NEWVER/main/postgresql.conf
    
    # leave the postgres shell, then restart postgres
    exit
    sudo systemctl restart postgresql.service
    
    # check version, should show 12.x
    sudo -u postgres psql -c "SELECT version();"
    
  4. Install rasdaman:

    # enable rasdaman repo with correct distribution codename
    REPO_FILE=/etc/apt/sources.list.d/rasdaman.list
    sed 's/bionic/focal/g' $REPO_FILE.disabled | sudo tee $REPO_FILE
    sudo apt update
    # install rasdaman (or rasdaman-avx, rasdaman-avx2, rasdaman-avx512
    # depending on CPU capabilities)
    sudo apt install rasdaman
    
  5. Test rasdaman installation to make sure everything is working; if UDFs are deployed they will need to be recompiled, and same with any custom C++ clients.

  6. Remove old postgres (purge removes its configuration and data as well):

    sudo apt purge postgresql-10 postgresql-client-10
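The port swap in step 3 uses a plain sed substitution, which also rewrites matching text inside comments. If that is a concern, the edit can be restricted to the active port setting. The following sketch shows such a replacement on the configuration text. The helper name set_port is an assumption, and reading and writing the actual postgresql.conf files is left out.

```python
import re


def set_port(conf_text, old_port, new_port):
    """Replace 'port = old_port' with 'port = new_port' on uncommented lines."""
    pattern = re.compile(rf"^(\s*port\s*=\s*){old_port}\b", re.MULTILINE)
    return pattern.sub(rf"\g<1>{new_port}", conf_text)


old_conf = "port = 5432\t\t# (change requires restart)\n"
new_conf = "port = 5433\t\t# (change requires restart)\n"

# Move the old cluster to 5433 and the new cluster to 5432.
print(set_port(old_conf, 5432, 5433), end="")
print(set_port(new_conf, 5433, 5432), end="")
```

The word boundary after the port number keeps values such as 54321 untouched, and lines starting with # are not modified.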
    

2.13.3. Ubuntu 20.04 to Ubuntu 22.04

  1. Make a backup of the rasdaman and petascope databases by following the backup guide. In particular:

    # postgres version
    OLDVER=12
    # alt 1: create backup in petascopedb.sql.gz; to be restored with psql
    sudo -u postgres pg_dump petascopedb | gzip > /backup/petascopedb.sql.gz
    # alt 2: custom-format backup to be restored with pg_restore
    sudo -u postgres pg_dump --format=custom --compress=5 petascopedb \
      --file=/backup/petascopedb.dump
    # backup postgres databases by direct copy as well just in case
    sudo cp -a /var/lib/postgresql/$OLDVER/main/ /backup/petascopedb_raw_$OLDVER
    # backup postgres config
    sudo cp -a /etc/postgresql/$OLDVER /backup/etc_postgresql_$OLDVER
    # backup rasdaman dir
    sudo cp -a /opt/rasdaman /backup/opt_rasdaman
    

    Disable the rasdaman repo in apt and remove rasdaman:

    REPO_FILE=/etc/apt/sources.list.d/rasdaman.list
    sudo mv $REPO_FILE $REPO_FILE.disabled
    # remove rasdaman package; this won't remove any configuration/data
    sudo service rasdaman stop
    sudo apt remove "$(dpkg -l | grep '^ii *rasdaman' | awk '{ print $2; }')"
    
  2. Upgrade to Ubuntu 22.04 with do-release-upgrade

  3. Migrate data to new postgres version:

    sudo systemctl stop postgresql.service
    sudo apt install postgresql-14-postgis-3
    
    # migrate data
    sudo -u postgres -i
    cd /tmp
    OLDVER=12
    NEWVER=14
    
    # migrate petascopedb
    /usr/lib/postgresql/$NEWVER/bin/pg_upgrade \
      --old-datadir=/var/lib/postgresql/$OLDVER/main \
      --new-datadir=/var/lib/postgresql/$NEWVER/main \
      --old-bindir=/usr/lib/postgresql/$OLDVER/bin \
      --new-bindir=/usr/lib/postgresql/$NEWVER/bin \
      --old-options "-c config_file=/etc/postgresql/$OLDVER/main/postgresql.conf" \
      --new-options "-c config_file=/etc/postgresql/$NEWVER/main/postgresql.conf"
    
    # swap ports in postgres config, so the new version is at 5432
    sed -i 's/port = 5432/port = 5433/' /etc/postgresql/$OLDVER/main/postgresql.conf
    sed -i 's/port = 5433/port = 5432/' /etc/postgresql/$NEWVER/main/postgresql.conf
    
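    # leave the postgres shell, then restart postgres
    exit
    sudo systemctl restart postgresql.service

    sudo -u postgres -i
    # the new login shell does not inherit the variables set above
    NEWVER=14
    /usr/lib/postgresql/$NEWVER/bin/vacuumdb --all --analyze-in-stages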
    
    # check version, should show 14.x
    psql -c "SELECT version();"
    
  4. Install rasdaman:

    # enable rasdaman repo with correct distribution codename
    REPO_FILE=/etc/apt/sources.list.d/rasdaman.list
    sed 's/focal/jammy/g' $REPO_FILE.disabled | sudo tee $REPO_FILE
    sudo apt update
    # install rasdaman (or rasdaman-avx, rasdaman-avx2, rasdaman-avx512
    # depending on CPU capabilities)
    sudo apt install rasdaman
    
  5. Test the rasdaman installation to make sure everything is working. If UDFs are deployed, they need to be recompiled, and so do any custom C++ clients.

  6. Remove old postgres (purge removes its configuration and data as well):

    sudo -u postgres /tmp/delete_old_cluster.sh
    sudo apt purge postgresql-12 postgresql-client-12 postgresql-12-postgis-3
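
The sed commands in step 3 only change a line that reads exactly `port = 5432` or `port = 5433`; with different spacing in postgresql.conf the port silently stays as it was. As a minimal sketch, the hypothetical helper `show_port` below (not part of rasdaman or PostgreSQL) prints the effective port setting of a configuration file, so the swap can be confirmed before postgres is restarted. The demonstration works on scratch files, not on the real configuration.

```shell
#!/usr/bin/env bash
# show_port is a hypothetical helper: print the value of the last
# uncommented "port" setting in the given postgresql.conf
show_port() {
  awk -F'=' '/^[ \t]*port[ \t]*=/ {
               v = $2; sub(/#.*/, "", v); gsub(/[ \t]/, "", v); p = v
             }
             END { print p }' "$1"
}

# scratch copies mimicking the two configuration files before the swap
tmp=$(mktemp -d)
printf 'port = 5432\t\t# (change requires restart)\n' > "$tmp/old.conf"
printf 'port = 5433\t\t# (change requires restart)\n' > "$tmp/new.conf"

# the same substitutions as in step 3
sed -i 's/port = 5432/port = 5433/' "$tmp/old.conf"
sed -i 's/port = 5433/port = 5432/' "$tmp/new.conf"

echo "old cluster port: $(show_port "$tmp/old.conf")"
echo "new cluster port: $(show_port "$tmp/new.conf")"
rm -r "$tmp"
```

On a real system the helper would be pointed at /etc/postgresql/$OLDVER/main/postgresql.conf and /etc/postgresql/$NEWVER/main/postgresql.conf; the new cluster should report 5432.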
    

2.13.4. Ubuntu 22.04 to Ubuntu 24.04

  1. Make a backup of the rasdaman and petascope databases by following the backup guide. In particular:

    # postgres version
    OLDVER=14
    # alt 1: create backup in petascopedb.sql.gz; to be restored with psql
    sudo -u postgres pg_dump petascopedb | gzip > /backup/petascopedb.sql.gz
    # alt 2: compressed plain-text backup that also recreates the database;
    # like alt 1 it is restored with psql (pg_restore cannot read text dumps)
    sudo -u postgres pg_dump --create --compress=5 petascopedb \
      --file=/backup/petascopedb.sql.gz
    # backup postgres databases by direct copy as well just in case
    sudo cp -a /var/lib/postgresql/$OLDVER/main/ /backup/petascopedb_raw_$OLDVER
    # backup postgres config
    sudo cp -a /etc/postgresql/$OLDVER /backup/etc_postgresql_$OLDVER
    # backup rasdaman dir (note the data subdir may be large)
    sudo cp -a /opt/rasdaman /backup/opt_rasdaman
    

    Disable the rasdaman repo in apt and remove rasdaman:

    REPO_FILE=/etc/apt/sources.list.d/rasdaman.list
    sudo mv $REPO_FILE $REPO_FILE.disabled
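    # stop rasdaman, then remove its package(s); this won't remove any
    # configuration/data (the substitution is left unquoted so that several
    # package names are passed as separate arguments)
    sudo service rasdaman stop
    sudo apt remove $(dpkg -l | grep '^ii *rasdaman' | awk '{ print $2; }')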
    
  2. Upgrade to Ubuntu 24.04:

    do-release-upgrade
    

    If this command fails because of the postgresql-14-postgis-3 package, remove that package first and then run do-release-upgrade again:

    sudo apt remove postgresql-14-postgis-3
    
  3. Migrate data to new postgres version:

    sudo systemctl stop postgresql.service
    sudo apt install postgresql-16-postgis-3
    
    # if postgresql-14-postgis-3 was removed in step 2, it has to be installed
    # again now from the jammy (Ubuntu 22.04) repository, otherwise the
    # petascopedb migration command later on will fail
    echo 'deb http://archive.ubuntu.com/ubuntu jammy main restricted universe multiverse' | sudo tee -a /etc/apt/sources.list
    sudo apt update
    sudo apt install -y postgresql-14-postgis-3
    
    # if external tomcat is used for petascope deployment, then install the
    # tomcat9 package from the jammy repositories (this needs the jammy line
    # added to /etc/apt/sources.list above), as petascope is incompatible
    # with the tomcat10 in the noble (Ubuntu 24.04) repository
    sudo apt install -y tomcat9
    
    # migrate data
    sudo -u postgres -i
    cd /tmp
    OLDVER=14
    NEWVER=16
    
    # migrate petascopedb
    /usr/lib/postgresql/$NEWVER/bin/pg_upgrade \
      --old-datadir=/var/lib/postgresql/$OLDVER/main \
      --new-datadir=/var/lib/postgresql/$NEWVER/main \
      --old-bindir=/usr/lib/postgresql/$OLDVER/bin \
      --new-bindir=/usr/lib/postgresql/$NEWVER/bin \
      --old-options "-c config_file=/etc/postgresql/$OLDVER/main/postgresql.conf" \
      --new-options "-c config_file=/etc/postgresql/$NEWVER/main/postgresql.conf"
    
    # swap ports in postgres config, so the new version is at 5432
    sed -i 's/port = 5432/port = 5433/' /etc/postgresql/$OLDVER/main/postgresql.conf
    sed -i 's/port = 5433/port = 5432/' /etc/postgresql/$NEWVER/main/postgresql.conf
    
    # restart postgres (run this command in a different terminal)
    sudo systemctl restart postgresql.service
    
    /usr/lib/postgresql/$NEWVER/bin/vacuumdb --all --analyze-in-stages
    
    # check version, should show 16.x
    psql -c "SELECT version();"
    
    # finally, leave the postgres shell and remove any lines with jammy
    # in /etc/apt/sources.list
    exit
    sudo sed -i '/jammy/d' /etc/apt/sources.list
    
  4. Install rasdaman:

    # enable rasdaman repo with correct distribution codename
    REPO_FILE=/etc/apt/sources.list.d/rasdaman.list
    sed 's/jammy/noble/g' $REPO_FILE.disabled | sudo tee $REPO_FILE
    sudo apt update
    
    # check CPU SIMD capabilities
    grep flags /proc/cpuinfo | head -n1 | grep -o -E '(sse|avx)[^ ]*'
    
    # install one of rasdaman-avx512, rasdaman-avx2, rasdaman-avx, rasdaman
    # in that order, depending on what SIMD extensions are supported by your CPU;
    # e.g. if you see avx512* in the output, then install rasdaman-avx512, if
    # you don't see avx512 but see avx2 then install rasdaman-avx2, etc.;
    # replace <package> below with the chosen package name
    sudo apt install -o Dpkg::Options::="--force-confdef" <package>
    
  5. Test the rasdaman installation to make sure everything is working. If UDFs are deployed, they need to be recompiled, and so do any custom C++ clients.

  6. Remove old postgres (purge removes its configuration and data as well):

    sudo -u postgres /tmp/delete_old_cluster.sh
    sudo apt purge postgresql-14 postgresql-client-14 postgresql-14-postgis-3
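
The package choice in step 4 follows a fixed order of preference, which can be sketched as a small function. The helper name `pick_rasdaman_package` is hypothetical; the package names are the ones listed in step 4, and the flags are read from /proc/cpuinfo as in the grep command there.

```shell
#!/usr/bin/env bash
# pick_rasdaman_package is a hypothetical helper: given a space-separated
# string of CPU flags, print the most specific package variant from step 4
pick_rasdaman_package() {
  local flags=" $1 "
  case "$flags" in
    *" avx512"*) echo rasdaman-avx512 ;;   # any avx512* flag
    *" avx2 "*)  echo rasdaman-avx2 ;;
    *" avx "*)   echo rasdaman-avx ;;
    *)           echo rasdaman ;;          # no AVX support detected
  esac
}

# flags of the first CPU; empty if /proc/cpuinfo has no "flags" line
flags=$(grep -m1 '^flags' /proc/cpuinfo 2>/dev/null | cut -d: -f2)
pkg=$(pick_rasdaman_package "$flags")
echo "suggested package: $pkg"
# after reviewing the suggestion, install it as in step 4:
#   sudo apt install -o Dpkg::Options::="--force-confdef" "$pkg"
```

The function only suggests a name; the install command is left commented out so that the choice can be checked first.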
    

2.14. Uninstallation

When uninstalling rasdaman, make sure that all installed files and services are fully removed from the system.

2.15. Troubleshooting

2.15.1. General

The first step in troubleshooting problems should be to look into the server logs.

Start by checking the rasmgr and rasserver logs for any errors. If they do not provide any clues, check petascope.log or catalina.out.
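
A quick way to see which log files deserve a closer look is to count error lines per file. This is a minimal sketch, assuming that the logs are `*.log` files in /opt/rasdaman/log and that problems are reported on lines containing the word "error"; both are assumptions to adjust for your installation, and `count_log_errors` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# count_log_errors is a hypothetical helper: for every *.log file in a
# directory print "<count> <file>", where count is the number of lines
# containing "error" (case-insensitive); files without matches are skipped
count_log_errors() {
  local dir=$1 f n
  for f in "$dir"/*.log; do
    [ -f "$f" ] || continue
    n=$(grep -ci 'error' "$f")
    [ "$n" -gt 0 ] && echo "$n $f"
  done
  return 0
}

# assumed log location; override with LOG_DIR=... if it differs
LOG_DIR=${LOG_DIR:-/opt/rasdaman/log}
# files with the most error lines first
count_log_errors "$LOG_DIR" | sort -rn | head -n 20
```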

Next, check the status of rasdaman with systemctl status rasdaman, and likewise for the external Tomcat if one is used. Inspect the output of ps aux | grep ras to list details about the rasdaman processes, or use top to see their CPU and memory usage.

It can be useful to double check the system memory usage with free -m, and disk space usage with df -h.
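
The disk space check can also be scripted, for example to warn when the filesystem holding the rasdaman directory is nearly full. This is a sketch only: `disk_use_percent` is a hypothetical helper, and the 90% threshold and the /opt/rasdaman default path are assumptions chosen for illustration.

```shell
#!/usr/bin/env bash
# disk_use_percent is a hypothetical helper: print the used-space
# percentage (a bare integer) of the filesystem holding the given path
disk_use_percent() {
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# path to check; falls back to / if the assumed directory does not exist
DATA_PATH=${DATA_PATH:-/opt/rasdaman}
[ -e "$DATA_PATH" ] || DATA_PATH=/

used=$(disk_use_percent "$DATA_PATH")
if [ "$used" -ge 90 ]; then
  echo "WARNING: filesystem of $DATA_PATH is ${used}% full"
else
  echo "filesystem of $DATA_PATH is ${used}% full"
fi
```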

2.15.2. Manually stop rasdaman

If stopping rasdaman fails, it may be necessary to stop it manually.

Checking the server logs could provide further information on why stopping rasdaman failed in the first place.
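
As a generic illustration of stopping leftover processes by hand, the sketch below first asks each process to terminate and only forces it after a timeout. It is not a rasdaman-specific procedure: `stop_process` is a hypothetical helper, the process names rasmgr and rasserver are taken from the log names mentioned above, and handling rasmgr before rasserver is an assumption. Forcing a process with SIGKILL may interrupt work in progress, so use it as a last resort.

```shell
#!/usr/bin/env bash
# stop_process is a hypothetical helper: send SIGTERM to a PID, wait up to
# the given number of seconds for it to exit, then fall back to SIGKILL
stop_process() {
  local pid=$1 timeout=${2:-10} i=0
  # if the signal cannot be sent, the process is already gone or not ours
  # to signal, so there is nothing more to do here
  kill -TERM "$pid" 2>/dev/null || return 0
  while [ "$i" -lt "$timeout" ]; do
    kill -0 "$pid" 2>/dev/null || return 0   # no longer running
    sleep 1
    i=$((i + 1))
  done
  kill -KILL "$pid" 2>/dev/null              # still alive: force it
  return 0
}

# assumption: handle rasmgr first, then any remaining rasserver processes
for name in rasmgr rasserver; do
  for pid in $(pgrep -x "$name" 2>/dev/null); do
    stop_process "$pid" 10
  done
done
```

Run it with sufficient privileges to signal the rasdaman processes, and verify with ps aux | grep ras afterwards.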