# Updating R libraries after R version update

In [my previous post](/posts/installing-user-r-version-on-infrastructure) about R, I showed you how to install a local R version and set up your `R_LIBS` environment variable so that it points to the new R version. If you already had a working R install, you may be wondering how you can get access to all of the R libraries that you previously installed.

You have several options. First, decide whether you ever want to fall back on the previous R version if something breaks in the current one. If you are worried about this scenario, then it is useful to maintain a working copy of both the old R executable and the libraries that are compatible with that version of R. Maintaining the old install is the 'safer' option: it requires additional storage space, but you get the peace of mind that if anything happens with the new version, you can fall back to the old one since it's still intact.

Let's cover this scenario first. Let's assume your directory structure is like this:

```console
# davised:Linux @ waterman in /nfs3/CGRB/home/davised/opt/R [11:25:39]
$ pwd; tree -L 3
/nfs3/CGRB/home/davised/opt/R
.
├── 3.6.0
│   ├── bin
│   │   ├── R
│   │   └── Rscript
│   ├── lib64
│   │   └── R
│   └── share
│       ├── info
│       └── man
└── 3.6.1
    ├── bin
    │   ├── R
    │   └── Rscript
    ├── lib64
    │   └── R
    └── share
        └── man

13 directories, 4 files
```

So you've just updated from 3.6.0 to 3.6.1 and you want to get access to all of the same packages you had previously. The R packages are stored in `lib64/R/library` under each version directory.

```console
# davised:Linux @ waterman in /nfs3/CGRB/home/davised/opt/R [17:10:51]
$ ls 3.6.0/lib64/R/library -w 80
ade4          fastmap            matrixStats  Rsamtools
ape           fastmatch          methods      S4Vectors
argparser     foreach            mgcv         scales
assertthat    foreign            mime         segmented
backports     formatR            multtest     seqinr
base          futile.logger      munsell      ShortRead
BH            futile.options     mzR          snow
Biobase       gdata              ncdf4        sourcetools
BiocGenerics  GenomeInfoDb       nlme         sp
BiocManager   GenomeInfoDbData   nnet         spatial
BiocParallel  GenomicAlignments  parallel     spData
...
```

Let's copy them over to the new `3.6.1` library directory.

```console
$ cp -nr /nfs3/CGRB/home/davised/opt/R/3.6.0/lib64/R/library/* /nfs3/CGRB/home/davised/opt/R/3.6.1/lib64/R/library
```

We use the `-n` flag to disallow overwrites, so anything already in the new library (such as the base packages that ship with 3.6.1) is left alone, and the `-r` flag to recurse into the package directories.

Now we need to update any packages that have newer versions available for R 3.6.1.

```console
# davised:Linux @ waterman in /nfs3/CGRB/home/davised/opt/R [12:12:49]
$ echo $R_LIBS
/nfs3/CGRB/home/davised/opt/R/3.6.1/lib64/R/library

# davised:Linux @ waterman in /nfs3/CGRB/home/davised/opt/R [17:16:12]
$ R --version
R version 3.6.1 (2019-07-05) -- "Action of the Toes"
Copyright (C) 2019 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under the terms of the
GNU General Public License versions 2 or 3.
For more information about these matters see
https://www.gnu.org/licenses/.

$ Rscript -e 'update.packages(repos="https://ftp.osuosl.org/pub/cran", checkBuilt=TRUE, ask=FALSE)'
```

The `update.packages` command finds all of the packages that are out of date and updates them. With `checkBuilt=TRUE`, a package also counts as out of date if it was built under an earlier major.minor version of R; a difference only in the bugfix part of a major.minor.bugfix version is fine.

**NOTE** But what if you have Bioconductor packages installed?
Well, run this after you run the update above:

```console
# davised:Linux @ chrom1 in /nfs3/CGRB/home/davised/opt/R [13:24:08]
$ Rscript -e 'BiocManager::install(ask=FALSE)'
Bioconductor version 3.9 (BiocManager 1.30.10), R 3.6.1 (2019-07-05)
Old packages: 'BiocParallel', 'GenomicAlignments', 'GenomicRanges', 'IRanges',
  'mzR', 'rhdf5', 'Rhdf5lib', 'Rhtslib', 'Rsamtools', 'S4Vectors',
  'SummarizedExperiment'
...
```

And there you have it. You are on 3.6.1 with the most up-to-date packages. If you ever need to, you can set your `$PATH` and `$R_LIBS` variables to point back to 3.6.0 and recover exactly where you were before the upgrade.

But what if you don't want to go through the hassle of copying the packages over each time? You can set up a library directory that is agnostic to the version of R that you are using. For example, let's set up a new directory for our packages here:

```console
# davised:Linux @ waterman in /nfs3/CGRB/home/davised/opt/R [17:24:47]
$ echo $R_LIBS
/nfs3/CGRB/home/davised/opt/R/library
```

I set my `$R_LIBS` in my config file (either `~/.bashrc` or `~/.tcshrc` for you), and now I need to get some packages in there that I can keep updating through all future R upgrades. As I mentioned above, doing it this way means you won't be able to fall back to the old R version in the future, but you will be able to upgrade to new R versions faster, and you will save hard drive space by not keeping a duplicate R library directory for each version of R that you have installed.

So, we need to copy all of the packages that aren't bundled with R by default. Let's get that list of packages.

```console
# davised:Linux @ waterman in /nfs3/CGRB/home/davised/opt/R/library [17:32:15]
$ Rscript -e 'ip <- as.data.frame(installed.packages()); write(rownames(ip[ip$Priority %in% c("base", "recommended"),]), "base_packages.txt")'

# davised:Linux @ waterman in /nfs3/CGRB/home/davised/opt/R/library [17:32:23]
$ head base_packages.txt
base
boot
class
cluster
codetools
compiler
datasets
foreign
graphics
grDevices
```

We need to compare the list in `base_packages.txt` with the list of installed packages, and then copy over just those that aren't in the base list.

```console
# davised:Linux @ waterman in /nfs3/CGRB/home/davised/opt/R/library [17:32:31]
$ ls -1 /nfs3/CGRB/home/davised/opt/R/3.6.0/lib64/R/library > all_installed.txt

# davised:Linux @ waterman in /nfs3/CGRB/home/davised/opt/R/library [17:35:58]
$ wc -l all_installed.txt
143 all_installed.txt

# davised:Linux @ waterman in /nfs3/CGRB/home/davised/opt/R/library [17:36:04]
$ wc -l base_packages.txt
29 base_packages.txt

# davised:Linux @ waterman in /nfs3/CGRB/home/davised/opt/R/library [17:36:07]
$ echo 143 - 29 | bc
114
```

So we have 114 packages we need to copy over. Let's do it.

```console
# davised:Linux @ waterman in /nfs3/CGRB/home/davised/opt/R/library [17:38:46]
$ cat base_packages.txt| sed 's/^/^/' | sed 's/$/$/' | grep -vf - all_installed.txt | wc -l
114

# davised:Linux @ waterman in /nfs3/CGRB/home/davised/opt/R/library [17:38:58]
$ cat base_packages.txt| sed 's/^/^/' | sed 's/$/$/' | grep -vf - all_installed.txt | xargs -I'{}' cp -r /nfs3/CGRB/home/davised/opt/R/3.6.0/lib64/R/library/{} .
```

The two `sed` calls wrap each base package name in `^` and `$`, so `grep -vf -` filters out only exact matches for those names. The first command confirms the count of 114, and the second copies only the packages that aren't in the base list over to the new directory.

Now we just need to update them again. Same command as before!

```console
# davised:Linux @ waterman in /nfs3/CGRB/home/davised/opt/R/library [17:40:15]
$ Rscript -e 'update.packages(repos="https://ftp.osuosl.org/pub/cran", checkBuilt=TRUE, ask=FALSE)'
```

And if you have Bioconductor packages...

```console
$ Rscript -e 'BiocManager::install(ask=FALSE)'
```

If you take this approach, you can leave your `$R_LIBS` as-is each time you upgrade R and only run the `update.packages()` function instead of copying the entire library folder over to the new version. You'll just have to update your `$PATH` variable to include the new `/bin` directory.

Happy upgrading!
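
If you end up repeating the "copy everything except the base packages" step, it could be wrapped in a small script. The following is only a minimal sketch, not the exact commands from this post: the function name `copy_non_base` is hypothetical, it swaps the `sed` anchoring for `grep -vxFf` (fixed-string, whole-line matching), and the demo runs against throwaway directories created with `mktemp` in place of a real R library.

```shell
#!/usr/bin/env bash
# Sketch: copy every package directory from an old library into a shared one,
# skipping names listed in a base-package file. copy_non_base is a hypothetical
# helper, and all demo paths below are temporary stand-ins.
set -euo pipefail

copy_non_base() {
    local old_lib="$1" base_list="$2" new_lib="$3"
    mkdir -p "$new_lib"
    # -F: treat names as fixed strings (the dot in futile.logger is literal)
    # -x: match whole lines only, -v: invert the match, -f: patterns from file
    # "|| true" keeps the pipeline alive when nothing is left after filtering
    ls -1 "$old_lib" | { grep -vxFf "$base_list" || true; } |
    while IFS= read -r pkg; do
        # Skip packages that already exist in the target library
        [ -e "$new_lib/$pkg" ] || cp -r "$old_lib/$pkg" "$new_lib/"
    done
}

# Demo with fake package directories instead of a real R library.
demo="$(mktemp -d)"
mkdir -p "$demo/old/base" "$demo/old/boot" "$demo/old/ape" "$demo/old/futile.logger"
printf '%s\n' base boot > "$demo/base_packages.txt"
copy_non_base "$demo/old" "$demo/base_packages.txt" "$demo/shared"
ls -1 "$demo/shared"
```

In real use you would pass the old `lib64/R/library` path, your `base_packages.txt`, and the directory that `$R_LIBS` points to, then run `update.packages()` as shown above.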