Repocli

Repocli is a command-line tool for basic operations on the data files in your Radboud Data Repository (RDR) collection, such as downloading, uploading, renaming and removing files. It is well suited to automating batch uploads and downloads of large datasets in the RDR. We recommend the tool only if you are familiar with command-line tools and need to work with large or many data files. It is the standard solution on DCCN's HPC Cluster for transferring data from the central storage to the RDR.

If you encounter problems or errors when transferring data, you can contact the ICT helpdesk (email: icthelpdesk@ru.nl, telephone: 0031 (0)24 3622222).

Step 1: Install Repocli

Repocli is already installed on the DCCN HPC Cluster for transferring data between the central storage and the RDR. You can therefore skip this step if you are using DCCN's HPC Cluster.

Go to this site to download the latest version of repocli: 'repocli' for Linux, 'repocli.darwin' for macOS and 'repocli.exe' for Windows.

Next, run the file from a command-line terminal. See GitHub for more detailed information.
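
On Linux or macOS, a downloaded binary usually has to be marked as executable before it can be run. The following is a minimal sketch under that assumption; `install_repocli` and the example paths are hypothetical names chosen for illustration and are not part of repocli. If the instructions on GitHub differ, follow those instead.

```shell
# install_repocli: hypothetical helper, not shipped with repocli.
# Copies a downloaded binary into a directory of your choice under the
# name "repocli" and marks it as executable.
install_repocli() {
    downloaded=$1   # e.g. ~/Downloads/repocli.darwin (assumed location)
    target_dir=$2   # e.g. ~/bin (assumed; any directory on your PATH works)
    mkdir -p "$target_dir" &&
    cp "$downloaded" "$target_dir/repocli" &&
    chmod +x "$target_dir/repocli"
}
```

For example, after `install_repocli ~/Downloads/repocli ~/bin`, and provided `~/bin` is on your PATH, the tool can be called as `repocli`, as in the commands on this page.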

Step 2: Configure Repocli

Your RDR credentials must be provided in a configuration file. To generate (or overwrite) this file, use the following command:

$ repocli config

The command asks for the following credentials:

  • repo baseurl: Either enter https://webdav.data.ru.nl to access all your collections or enter the specific WebDAV URL of a collection you are working on

  • username and password: Fill in your data access credentials. You can find these credentials at the top right of the RDR under [Your name] > Data access credentials

  • save credential [y/N]: Type 'y' (yes) to save your credentials or 'N' (no) to skip saving them. Do NOT save your credentials on a publicly accessible PC or laptop

Step 3: Open your RDR collection

The name of the folder of your collection is based on the collection identifier. This identifier can be found in the RDR underneath the title and abstract of your collection.

[Figure: location of the collection identifier in the RDR]

The collection identifier consists of the following parts:

[Figure: parts of the collection identifier]

These parts correspond to the folder structure in repocli: your organisational unit comes first, followed by the last part of the collection identifier. To navigate to your collection, type 'repocli' followed by the collection identifier, starting from the organisational unit. For example:

$ repocli rta/myproject001_dsc_457/

Alternatively, navigate using the repocli ls command.

Step 4: Work in Repocli

Use the repocli commands to manage files in your RDR folder. See here for a list of repocli commands.

Step 5: Check uploaded files

You should always check whether your files have been uploaded to the RDR. To do this, open the RDR in your browser and log in. Find your collection and select it. Under the Files tab you should now see the files you have added with repocli. To ensure that the data transfer is complete, always verify checksums before deleting files from your original storage system and before publishing or archiving your collection in the RDR.
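
One possible way to carry out such a checksum comparison is sketched below. It assumes that `sha256sum` from GNU coreutils is available (as on most Linux systems) and that you have downloaded a copy of the uploaded files from the RDR into a separate local directory. The function names `make_manifest` and `verify_manifest` are hypothetical helpers, not repocli commands, and the sketch covers only the local side of the check.

```shell
# make_manifest: hypothetical helper, not a repocli command.
# Writes one SHA-256 line per file under data_dir, using paths relative
# to data_dir. Keep the manifest file outside data_dir.
make_manifest() {
    data_dir=$1
    manifest=$2
    (cd "$data_dir" && find . -type f -exec sha256sum {} + | sort -k 2) > "$manifest"
}

# verify_manifest: hypothetical helper, not a repocli command.
# Re-computes the checksums of the files under data_dir and compares
# them with the manifest. Prints OK or FAILED per file and returns
# non-zero if any file is missing or differs.
verify_manifest() {
    data_dir=$1
    manifest=$2
    # The manifest is read on standard input, so a relative manifest
    # path is resolved before changing into data_dir.
    (cd "$data_dir" && sha256sum -c) < "$manifest"
}
```

In this sketch you would call `make_manifest` on the original directory before uploading, and `verify_manifest` on the downloaded copy afterwards. Any line reported as FAILED points to a file that should be transferred again before the originals are deleted.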

Note: Uploading to the RDR can take some time (up to several hours), especially for large files. Please be patient when transferring many files, large files, or both.