[r] Check existence of directory and create if doesn't exist

I often find myself writing R scripts that generate a lot of output. I find it cleaner to put this output into it's own directory(s). What I've written below will check for the existence of a directory and move into it, or create the directory and then move into it. Is there a better way to approach this?

mainDir <- "c:/path/to/main/dir"
subDir <- "outputDirectory"

if (file.exists(subDir)){
    setwd(file.path(mainDir, subDir))
} else {
    dir.create(file.path(mainDir, subDir))
    setwd(file.path(mainDir, subDir))

}

This question is related to r

The answer is


I had an issue with R 2.15.3 whereby while trying to create a tree structure recursively on a shared network drive I would get a permission error.

To get around this oddity I manually create the structure;

mkdirs <- function(fp) {
    if(!file.exists(fp)) {
        mkdirs(dirname(fp))
        dir.create(fp)
    }
} 

mkdirs("H:/foo/bar")

Here's the simple check, and creates the dir if doesn't exists:

## Provide the dir name(i.e sub dir) that you want to create under main dir:
output_dir <- file.path(main_dir, sub_dir)

if (!dir.exists(output_dir)){
dir.create(output_dir)
} else {
    print("Dir already exists!")
}

To find out if a path is a valid directory try:

file.info(cacheDir)[1,"isdir"]

file.info does not care about a slash on the end.

file.exists on Windows will fail for a directory if it ends in a slash, and succeeds without it. So this cannot be used to determine if a path is a directory.

file.exists("R:/data/CCAM/CCAMC160b_echam5_A2-ct-uf.-5t05N.190to240E_level1000/cache/")
[1] FALSE

file.exists("R:/data/CCAM/CCAMC160b_echam5_A2-ct-uf.-5t05N.190to240E_level1000/cache")
[1] TRUE

file.info(cacheDir)["isdir"]

As of April 16, 2015, with the release of R 3.2.0 there's a new function called dir.exists(). To use this function and create the directory if it doesn't exist, you can use:

ifelse(!dir.exists(file.path(mainDir, subDir)), dir.create(file.path(mainDir, subDir)), FALSE)

This will return FALSE if the directory already exists or is uncreatable, and TRUE if it didn't exist but was succesfully created.

Note that to simply check if the directory exists you can use

dir.exists(file.path(mainDir, subDir))

The use of file.exists() to test for the existence of the directory is a problem in the original post. If subDir included the name of an existing file (rather than just a path), file.exists() would return TRUE, but the call to setwd() would fail because you can't set the working directory to point at a file.

I would recommend the use of file_test(op="-d", subDir), which will return "TRUE" if subDir is an existing directory, but FALSE if subDir is an existing file or a non-existent file or directory. Similarly, checking for a file can be accomplished with op="-f".

Additionally, as described in another comment, the working directory is part of the R environment and should be controlled by the user, not a script. Scripts should, ideally, not change the R environment. To address this problem, I might use options() to store a globally available directory where I wanted all of my output.

So, consider the following solution, where someUniqueTag is just a programmer-defined prefix for the option name, which makes it unlikely that an option with the same name already exists. (For instance, if you were developing a package called "filer", you might use filer.mainDir and filer.subDir).

The following code would be used to set options that are available for use later in other scripts (thus avoiding the use of setwd() in a script), and to create the folder if necessary:

mainDir = "c:/path/to/main/dir"
subDir = "outputDirectory"

options(someUniqueTag.mainDir = mainDir)
options(someUniqueTag.subDir = "subDir")

if (!file_test("-d", file.path(mainDir, subDir)){
  if(file_test("-f", file.path(mainDir, subDir)) {
    stop("Path can't be created because a file with that name already exists.")
  } else {
    dir.create(file.path(mainDir, subDir))
  }
}

Then, in any subsequent script that needed to manipulate a file in subDir, you might use something like:

mainDir = getOption(someUniqueTag.mainDir)
subDir = getOption(someUniqueTag.subDir)
filename = "fileToBeCreated.txt"
file.create(file.path(mainDir, subDir, filename))

This solution leaves the working directory under the control of the user.


One-liner:

if (!dir.exists(output_dir)) {dir.create(output_dir)}

Example:

dateDIR <- as.character(Sys.Date())
outputDIR <- file.path(outD, dateDIR)
if (!dir.exists(outputDIR)) {dir.create(outputDIR)}

I know this question was asked a while ago, but in case useful, the here package is really helpful for not having to reference specific file paths and making code more portable. It will automatically define your working directory as the one that your .Rproj file resides in, so the following will often suffice without having to define the file path to your working directory:

library(here)

if (!dir.exists(here(outputDir))) {dir.create(here(outputDir))}


In terms of general architecture I would recommend the following structure with regard to directory creation. This will cover most potential issues and any other issues with directory creation will be detected by the dir.create call.

mainDir <- "~"
subDir <- "outputDirectory"

if (file.exists(paste(mainDir, subDir, "/", sep = "/", collapse = "/"))) {
    cat("subDir exists in mainDir and is a directory")
} else if (file.exists(paste(mainDir, subDir, sep = "/", collapse = "/"))) {
    cat("subDir exists in mainDir but is a file")
    # you will probably want to handle this separately
} else {
    cat("subDir does not exist in mainDir - creating")
    dir.create(file.path(mainDir, subDir))
}

if (file.exists(paste(mainDir, subDir, "/", sep = "/", collapse = "/"))) {
    # By this point, the directory either existed or has been successfully created
    setwd(file.path(mainDir, subDir))
} else {
    cat("subDir does not exist")
    # Handle this error as appropriate
}

Also be aware that if ~/foo doesn't exist then a call to dir.create('~/foo/bar') will fail unless you specify recursive = TRUE.