You want to write data to a file.
Writing to a delimited text file
The easiest way to do this is to use write.csv(). By default, write.csv() includes row names, but these are usually unnecessary and may cause confusion.

    # A sample data frame
    data <- read.table(header=TRUE, text='
     subject sex size
           1   M    7
           2   F   NA
           3   F    9
           4   M   11
    ')

    # Write to a file, suppress row names
    write.csv(data, "data.csv", row.names=FALSE)

    # Same, except that instead of "NA", output blank cells
    write.csv(data, "data.csv", row.names=FALSE, na="")

    # Use tabs, suppress row names and column names
    write.table(data, "data.csv", sep="\t", row.names=FALSE, col.names=FALSE)

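To see why suppressing row names matters, the following sketch writes the same data frame both ways and compares the column names that come back. The data frame `df` and the temporary file paths are illustrative assumptions, chosen so that nothing in the working directory is overwritten.

```r
# Hypothetical sample data, mirroring the example above
df <- data.frame(subject = 1:4,
                 sex     = c("M", "F", "F", "M"),
                 size    = c(7, NA, 9, 11))

# Temporary files, so no existing file is touched
with_names    <- tempfile(fileext = ".csv")
without_names <- tempfile(fileext = ".csv")

write.csv(df, with_names)                        # default: row names included
write.csv(df, without_names, row.names = FALSE)  # row names suppressed

# Read both files back and compare the column names
cols_with    <- names(read.csv(with_names))
cols_without <- names(read.csv(without_names))

print(cols_with)     # an extra first column, which read.csv() labels "X"
print(cols_without)  # only the original columns
```

The extra column appears because the row names are written under an empty header, which read.csv() then has to name itself.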
Saving in R data format
write.csv() and write.table() are best for interoperability with other data analysis programs. They will not, however, preserve special attributes of the data structures, such as whether a column is a character type or factor, or the order of levels in factors. To preserve those attributes, the data should be written out in a special format for R.
Below are three primary ways of doing this:
The first method is to output R source code which, when run, will re-create the object. This should work for most data objects, but it may not be able to faithfully re-create some more complicated data objects.

    # Save in a text format that can be easily loaded in R
    dump("data", "data.Rdmpd")
    # Can save multiple objects:
    dump(c("data", "data1"), "data.Rdmpd")

    # To load the data again:
    source("data.Rdmpd")
    # When loaded, the original data names will automatically be used.

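As a rough illustration of what dump() produces, the sketch below dumps a small object to a temporary file, prints the generated R code, and then re-creates the object from it. The object name `scores` and the temporary file are assumptions made for this example.

```r
# Hypothetical object to save
scores <- data.frame(id = 1:3, score = c(90.5, 82, 77.25))

dump_file <- tempfile(fileext = ".R")
dump("scores", file = dump_file)

# The file contains ordinary R code that assigns to `scores`
cat(readLines(dump_file), sep = "\n")

# Keep a copy for comparison, remove the object, then re-create it
original <- scores
rm(scores)
source(dump_file)

# Check that the re-created object matches the original
restored_ok <- isTRUE(all.equal(scores, original))
print(restored_ok)
```

Because the dumped file is plain R code, it can be inspected or edited by hand before it is sourced.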
The next method is to write out individual data objects in RDS format. This format can be binary or ASCII. Binary is more compact, while ASCII works better with version control systems like Git, which can diff text files line by line.

    # Save a single object in binary RDS format
    saveRDS(data, "data.rds")
    # Or, using ASCII format
    saveRDS(data, "data.rds", ascii=TRUE)

    # To load the data again:
    data <- readRDS("data.rds")

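The point about preserved attributes can be checked with a small round trip. This sketch, which assumes a made-up data frame `df` with a factor whose levels are in a custom order, writes it as both CSV and RDS and compares the levels after reading each file back.

```r
# Hypothetical data frame with a factor in a custom level order
df <- data.frame(
  dose = factor(c("low", "high", "medium"),
                levels = c("low", "medium", "high"))
)

csv_file <- tempfile(fileext = ".csv")
rds_file <- tempfile(fileext = ".rds")

write.csv(df, csv_file, row.names = FALSE)
saveRDS(df, rds_file)

# stringsAsFactors = TRUE asks read.csv() to rebuild a factor from the text
from_csv <- read.csv(csv_file, stringsAsFactors = TRUE)
from_rds <- readRDS(rds_file)

print(levels(from_csv$dose))  # alphabetical: "high" "low" "medium"
print(levels(from_rds$dose))  # original order: "low" "medium" "high"
```

Without stringsAsFactors = TRUE, read.csv() in R 4.0 and later returns the column as plain character, so the factor is lost entirely rather than just reordered.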
It’s also possible to save multiple objects into a single file, using the RData format.

    # Saving multiple objects in binary RData format
    save(data, file="data.RData")
    # Or, using ASCII format
    save(data, file="data.RData", ascii=TRUE)
    # Can save multiple objects
    save(data, data1, file="data.RData")

    # To load the data again:
    load("data.RData")

An important difference between saveRDS() and save() is that, with the former, when you readRDS() the data, you specify the name of the object, and with the latter, when you load() the data, the original object names are automatically used. Automatically using the original object names can sometimes simplify a workflow, but it can also be a drawback if the data object is meant to be distributed to others for use in a different environment.
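One way to soften that drawback is to load an RData file into its own environment rather than the current workspace, so restored names cannot overwrite existing objects. The sketch below is one possible approach, not the only one; the objects `data` and `data1` and the temporary file are assumptions for illustration.

```r
# Hypothetical objects saved together in one RData file
data  <- data.frame(x = 1:3)
data1 <- data.frame(y = c("a", "b"))

rdata_file <- tempfile(fileext = ".RData")
save(data, data1, file = rdata_file)

# Load into a fresh environment instead of the current workspace
loaded <- new.env()
restored_names <- load(rdata_file, envir = loaded)

# load() returns the names of the objects it restored
print(restored_names)

# Objects are reached through the environment, under any name you like
my_copy <- loaded$data1
print(my_copy)
```

The character vector returned by load() is also a convenient way to find out what a file from someone else actually contains.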