RStudio

RStudio is an integrated development environment for R, a programming language for statistical computing and graphics.

Introductory tutorials can be found here.

Install new packages

Additional packages can be installed inside the application container using the optional Dependencies parameter. The user should provide a Bash script (.sh) with the list of shell commands to be used for the installation. An example is shown below.

#!/bin/bash

sudo apt-get update
sudo apt-get install -y libssl-dev liblzma-dev libbz2-dev libicu-dev libxml2-dev libglpk-dev
sudo apt-get clean

R -e "install.packages('BiocManager', repos='http://cran.us.r-project.org'); \
    update.packages(ask = FALSE); \
    BiocManager::install(ask = FALSE); \
    BiocManager::install(c('dplyr', 'plotly'), ask = FALSE)"

R -e "library(BiocManager); BiocManager::install(c('matrixStats', 'igraph', 'parallel', \
    'lattice', 'plyr', 'gplots', 'data.table', 'clusterProfiler', 'org.Hs.eg.db', 'STRINGdb', \
    'jsonlite', 'shinydashboard', 'shinyBS', 'limma'), ask = FALSE)"

R -e "library(BiocManager); BiocManager::install(c('biomaRt'), ask = FALSE)"

Batch mode

This option can be used to submit scripts which will be executed after the job starts. The job will stop after the execution of the program.

Allowed file formats are: Bash script (*.sh) and R script (*.R). These are some working examples:

#!/bin/bash
sudo apt-get update
sudo apt-get install -y libssl-dev liblzma-dev libbz2-dev libicu-dev libxml2-dev
sudo apt-get clean

Rscript /work/script.R

# Histogram of Random Normal Numbers
install.packages("pacman")

pacman::p_load(ggplot2, tidyr, dplyr, devtools, formatR)

data = data.frame(gender = c("M", "M", "F"),
                  age = c(20, 60, 30),
                  height = c(180, 200, 150))
data

# number of observations
num_obs <- 1000

# reading arguments ('mean' and 'sd')
args <- commandArgs(trailingOnly = TRUE)

if (length(args) == 0) {
  x <- rnorm(num_obs)
} else {
  # fall back to the rnorm() defaults for any argument that is not provided
  if (is.na(args[1])) {
    mean <- 0
  } else {
    mean <- as.numeric(args[1])
  }
  if (is.na(args[2])) {
    sd <- 1
  } else {
    sd <- as.numeric(args[2])
  }
  x <- rnorm(num_obs, mean = mean, sd = sd)
}

print('Plotting histogram')

png('normal-histogram.png', pointsize = 18)
hist(x, las = 1, col = '#437899')
dev.off()
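
Because the R script above reads its optional mean and sd values with commandArgs(trailingOnly = TRUE), they can be supplied as positional arguments on the Rscript command line. The following Bash sketch assumes the script is stored as /work/script.R, as in the Bash example; the variable names SCRIPT, MEAN, and SD and the values 5 and 2 are illustrative only, and the guard simply skips the call when Rscript or the file is not available.

```shell
#!/bin/bash

# Illustrative values (assumptions): location of the R script and its two arguments.
SCRIPT="/work/script.R"
MEAN=5
SD=2

# Pass 'mean' and 'sd' as the first and second positional arguments.
if command -v Rscript >/dev/null 2>&1 && [ -f "$SCRIPT" ]; then
    Rscript "$SCRIPT" "$MEAN" "$SD"
else
    echo "Rscript or $SCRIPT not found; skipping" >&2
fi
```

If only the first argument is given, the script is meant to keep the standard deviation at its default value of 1.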

R project environment

It is possible to create isolated and portable environments for R projects using the renv package, which is integrated with RStudio.

Create a new project

An existing folder, say /work/my_project, can be converted into an R project by running the following commands from the RStudio console:

setwd('/work/my_project')
renv::init()
Sys.setenv(RENV_PATHS_CACHE = '/work/my_project/renv/cache')

where the last command sets the renv cache directory inside the project root folder for convenience. After the job is canceled, my_project will be available in the job output folder. The project folder can then be moved to a different location and shared with collaborators.

The project can be activated in a new RStudio instance, by mounting its root folder and running the command:

renv::load(project = '/work/my_project')

Note

To install new packages within an existing R project, the environment variable RENV_PATHS_CACHE must be set again, as shown above.

Autoload an existing project

Activation of an existing R project can be automated via a Bash script, which is submitted using the optional Dependencies parameter. For example:

#!/bin/bash

echo "Sys.setenv(RENV_PATHS_CACHE = '/work/my_project/renv/cache')" > ~/.Rprofile
echo "renv::load(project = '/work/my_project')" >> ~/.Rprofile

where my_project is the R project environment folder.

Parallel computing in R

Multicore processing and multi-threaded performance are enabled in the app container via the Basic Linear Algebra Subprograms (BLAS) library or the Intel Math Kernel Library (MKL). By default, the program chooses the number of threads, which usually corresponds to the number of cores available in the computing node. The number of threads can also be controlled by setting the environment variable OMP_NUM_THREADS for BLAS, or MKL_NUM_THREADS if R was built using the Intel MKL.
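
When a job is started from a Bash script, these variables can be exported before R is launched, since R inherits the environment of the calling shell. The sketch below is only an illustration: the variable names CORES and THREADS and the choice of using half of the available cores are assumptions, not recommendations from this documentation.

```shell
#!/bin/bash

# Number of cores visible to the job; fall back to getconf if nproc is unavailable.
CORES="$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN)"

# Illustrative choice (assumption): use half of the cores, but never fewer than one.
THREADS=$(( CORES / 2 ))
if [ "$THREADS" -lt 1 ]; then
    THREADS=1
fi

# Export both variables so that either a BLAS or an MKL build of R picks up the limit.
export OMP_NUM_THREADS="$THREADS"
export MKL_NUM_THREADS="$THREADS"

echo "Using $THREADS of $CORES cores for linear algebra threads"
```

Any Rscript command placed after these lines in the same script sees the exported values.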

Multi-threading is used to optimize basic linear algebra operations, such as dot products, matrix-vector multiplication, and matrix-matrix multiplication. However, it can interfere with other parallel operations, such as parallel foreach, parApply, sfClusterApply, etc. In this case, the user should make R use only one thread (core) for basic linear algebra operations by setting:

Sys.setenv(OMP_NUM_THREADS = 1)

or

Sys.setenv(MKL_NUM_THREADS = 1)

for R built against Intel MKL. These variables can also be set permanently by adding them to the file /home/ucloud/.Renviron and restarting the R session.
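
As a minimal sketch, the entries can be appended to the .Renviron file from a Bash script, for example one submitted through the Dependencies parameter. The helper set_renviron_var and the RENVIRON_FILE override are hypothetical names introduced here, and the default path assumes that the home directory is /home/ucloud as stated above. The function adds a KEY=VALUE line only if the key is not already present, so running it repeatedly does not duplicate entries.

```shell
#!/bin/bash

# Target file; RENVIRON_FILE is a hypothetical override used only in this sketch.
RENVIRON_FILE="${RENVIRON_FILE:-$HOME/.Renviron}"

# Append KEY=VALUE to the file unless a line for KEY already exists.
set_renviron_var() {
    local key="$1"
    local value="$2"
    touch "$RENVIRON_FILE"
    if ! grep -q "^${key}=" "$RENVIRON_FILE"; then
        echo "${key}=${value}" >> "$RENVIRON_FILE"
    fi
}

# Limit linear algebra to one thread for both BLAS and MKL builds of R.
set_renviron_var OMP_NUM_THREADS 1
set_renviron_var MKL_NUM_THREADS 1
```

An existing entry is left untouched, so a value that was set earlier has to be changed by editing the file directly.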

In addition, the number of BLAS threads (for implementations such as GotoBLAS, ACML, and MKL) and the number of OpenMP threads can be controlled with the RhpcBLASctl R package.

For example:

install.packages("RhpcBLASctl")
library(RhpcBLASctl)
blas_set_num_threads(1)  # replace 1 with the desired number of threads