| Title: | A Comprehensive Toolkit for Working with Encrypted Parquet Files |
|---|---|
| Description: | Utilities for reading, writing, and managing RCDF files, including encryption and decryption support. It offers a flexible interface for handling data stored in encrypted Parquet format, along with metadata extraction, key management, and secure operations using AES and RSA encryption. |
| Authors: | Bhas Abdulsamad [aut, cre, cph] |
| Maintainer: | Bhas Abdulsamad <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.1.6 |
| Built: | 2026-05-30 18:18:11 UTC |
| Source: | https://github.com/yng-me/rcdf |
Adds variable labels and value labels to a data frame based on a metadata
dictionary. This is particularly useful for preparing datasets for use with
packages like haven or for exporting to formats like SPSS or Stata.
add_metadata(data, metadata, ..., set_data_types = FALSE)
data |
A data frame containing the raw dataset. |
metadata |
A data frame that serves as a metadata dictionary. It must contain
at least the columns: |
... |
Additional arguments (currently unused). |
set_data_types |
Logical; if |
The function first checks the structure of the metadata using an internal helper.
Then, for each variable listed in metadata, it:
Adds a label using the label attribute
Converts values to labelled vectors using haven::labelled() if a valueset is provided
If value labels are present, the function tries to align data types between the data and the valueset (e.g., converting character codes to integers if necessary).
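The labelling steps described above can be sketched in base R. This is only an illustration of the idea, not the package's implementation: the `var_labels` lookup and the `codes` vector are hypothetical, and the value labels are stored as a plain `labels` attribute rather than through `haven::labelled()`.

```r
# Illustrative base-R sketch; NOT how add_metadata() is implemented.
data <- data.frame(sex = c(1, 2, 1), age = c(23, 45, 34))

# Hypothetical lookup of variable labels (name -> label)
var_labels <- c(sex = "Gender", age = "Age in years")

# Step 1: attach a "label" attribute to each listed variable
for (nm in names(var_labels)) {
  attr(data[[nm]], "label") <- var_labels[[nm]]
}

# Step 2: align types before attaching value labels. Here the codes
# arrive as character and are converted to match the numeric column.
codes <- c("1", "2")
value_labels <- setNames(as.numeric(codes), c("Male", "Female"))
attr(data$sex, "labels") <- value_labels

attributes(data$sex)
```

The column keeps its original values; only attributes are added, which is why the labelled result can still be used like the raw data.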
A tibble with the same data as data, but with added attributes:
Variable labels (via the label attribute)
Value labels (as a haven::labelled class, if applicable)
data <- data.frame(
  sex = c(1, 2, 1),
  age = c(23, 45, 34)
)
metadata <- data.frame(
  variable_name = c("sex", "age"),
  label = c("Gender", "Age in years"),
  type = c("categorical", "numeric"),
  valueset = I(list(
    data.frame(value = c(1, 2), label = c("Male", "Female")),
    NULL
  ))
)
labelled_data <- add_metadata(data, metadata)
str(labelled_data)
Converts an existing list or compatible object into an object of class rcdf.
as_rcdf(data)
data |
A list or object to be converted to class |
The input object with class set to rcdf.
my_list <- list(a = 1, b = 2)
rcdf_obj <- as_rcdf(my_list)
class(rcdf_obj)
Materialises a lazy rcdf_tbl_db DuckDB-backed table into a regular
R data frame, optionally applying the variable labels and value labels stored
in the table's metadata dictionary.
collect(data, ...)

## S3 method for class 'rcdf_tbl_db'
collect(data, ...)
data |
A lazy |
... |
Additional arguments passed to |
A tibble with all rows materialised. If the table carries a
metadata dictionary, variable labels and value labels are applied via
add_metadata before returning.
dir <- system.file("extdata", package = "rcdf")
rcdf_path <- file.path(dir, "mtcars.rcdf")
prv_key <- file.path(dir, "sample-private-key-pw.pem")
## Not run:
result <- read_rcdf(path = rcdf_path, decryption_key = prv_key, password = "1234", lazy = TRUE)
df <- collect(result$mtcars)
class(df) # "tbl_df"
## End(Not run)
Decrypt string using RSA
decrypt_string(x, prv_key, password = NULL)
x |
Encrypted base64-encoded string |
prv_key |
A private key object or .pem file |
password |
Password of the private key |
A decrypted character vector of length 1
dir <- system.file("extdata", package = "rcdf")
pub_key <- file.path(dir, 'sample-public-key.pem')
prv_key <- file.path(dir, 'sample-private-key.pem')
x <- encrypt_string('hello', pub_key)
decrypt_string(x, prv_key = prv_key, password = '1234')
Encrypt string using RSA
encrypt_string(x, pub_key)
x |
A character of length 1 |
pub_key |
A public key object or .pem file |
Encrypted base64-encoded string
dir <- system.file("extdata", package = "rcdf")
pub_key <- file.path(dir, 'sample-public-key.pem')
encrypt_string('hello', pub_key)
This function generates a random password of a specified length. It includes alphanumeric characters by default and can optionally include special characters.
generate_pw(length = 16, special_chr = TRUE)
length |
Integer. The length of the password to generate. Default is |
special_chr |
Logical. Whether to include special characters
(e.g., |
A character string representing the generated password.
generate_pw()
generate_pw(32)
generate_pw(12, special_chr = FALSE)
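To make the behaviour concrete, here is a minimal base-R sketch of a random password generator with the same two options. The helper name `make_password` and the particular set of special characters are assumptions for illustration; the package's own character pool and sampling method may differ. Base R's `sample()` is also not a cryptographically secure source of randomness.

```r
# Hypothetical helper; a sketch of the idea, not generate_pw() itself.
make_password <- function(length = 16, special_chr = TRUE) {
  # Alphanumeric pool (digits are coerced to character by c())
  pool <- c(letters, LETTERS, 0:9)
  if (special_chr) {
    # Assumed special characters; the package's set may differ
    pool <- c(pool, strsplit("!@#$%^&*", "")[[1]])
  }
  paste(sample(pool, length, replace = TRUE), collapse = "")
}

make_password()
make_password(32)
make_password(12, special_chr = FALSE)
```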
This function generates an RSA key pair (public and private) and saves them to specified files.
generate_rsa_keys(path, ..., password = NULL, which = "public", prefix = NULL)
path |
A character string specifying the directory path where the key files in |
... |
Additional arguments passed to the |
password |
A character string specifying the password for the private key. If |
which |
A character string specifying which key to return. Can be either |
prefix |
A character string used as a prefix for the key file names. Defaults to |
A character string representing the file path of the generated key (either public or private, based on the which argument).
# Generate both public and private RSA keys and save them to the temp directory
path_to <- tempdir()
generate_rsa_keys(path = path_to, password = "securepassword")
Get metadata attribute from RCDF data
get_attr(rcdf, attr)
rcdf |
RCDF data |
attr |
Valid metadata key. |
RCDF attribute/s or NULL
## Not run:
# Assuming `df` is a valid RCDF object
get_attr(df, "area_names")

# To get nested attributes
get_attr(df, "meta.source_note")
## End(Not run)
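The dotted key in the example above (`"meta.source_note"`) suggests a path into nested metadata. A minimal base-R sketch of that kind of lookup follows; the helper `get_nested` and the sample `meta` list are hypothetical and only illustrate the idea, not how `get_attr()` is implemented.

```r
# Hypothetical helper: walk a nested list along a dot-separated path.
get_nested <- function(x, path) {
  for (key in strsplit(path, ".", fixed = TRUE)[[1]]) {
    # Stop with NULL as soon as a level is missing or is not a list
    if (!is.list(x) || is.null(x[[key]])) return(NULL)
    x <- x[[key]]
  }
  x
}

# Made-up metadata for illustration only
meta <- list(
  area_names = c("A", "B"),
  meta = list(source_note = "Survey 2024")
)

get_nested(meta, "area_names")
get_nested(meta, "meta.source_note")
get_nested(meta, "meta.missing")
```

Returning `NULL` for a missing key mirrors the documented return value ("RCDF attribute/s or NULL").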
Extracts and returns the complete metadata JSON stored inside an
.rcdf archive without decrypting or loading the underlying data.
Useful for inspecting provenance, checksums, encryption parameters, and
embedded data dictionaries without a decryption key.
get_attrs(path)
path |
Character string. Path to a valid |
A named list corresponding to the metadata.json stored
inside the archive. Common keys include log_id, created_at,
version, checksum, dictionary, and key.
get_rcdf_metadata for retrieving a single metadata key,
get_attr for reading metadata attributes attached to an
in-memory RCDF object.
dir <- system.file("extdata", package = "rcdf")
rcdf_path <- file.path(dir, "mtcars.rcdf")
meta <- get_attrs(rcdf_path)
meta$version
meta$created_at
Retrieves a specific metadata value from a .rcdf file.
get_rcdf_metadata(path, name = NULL, key)
path |
Character string. The file path to the |
name |
Character string. The metadata key to extract from the file. |
key |
|
The value associated with the specified metadata key, or NULL if the key does not exist.
## Not run:
# Assuming "example.rcdf" is a valid RCDF file in the working directory:
get_rcdf_metadata("example.rcdf", "log_id")
## End(Not run)
Merge multiple RCDF files
merge_rcdf(
  rcdf_files,
  decryption_keys,
  passwords,
  merged_file_path,
  pub_key = NULL
)
rcdf_files |
A character vector of RCDF file paths |
decryption_keys |
Decryption keys associated with each RCDF file. Must match the length of the vector passed in the |
passwords |
Password of the associated decryption keys. Must match the length of |
merged_file_path |
File path or name of the merged RCDF file. |
pub_key |
Public key to encrypt the merged file. If |
NULL (void)
## Not run:
dir <- system.file("extdata", package = "rcdf")
rcdf_path <- file.path(dir, 'mtcars.rcdf')
private_key <- file.path(dir, 'sample-private-key-pw.pem')
pw <- '1234'
temp_dir <- tempdir()
merge_rcdf(
  rcdf_files = rcdf_path,
  decryption_keys = private_key,
  passwords = pw,
  merged_file_path = file.path(temp_dir, "merged.rcdf"),
  pub_key = file.path(dir, 'sample-public-key-pw.pem')
)
unlink(file.path(temp_dir, "merged.rcdf"), force = TRUE)
## End(Not run)
Initializes and returns an empty rcdf object. This is a convenient constructor
for creating a new rcdf-class list structure.
rcdf_list(...)
... |
Optional elements to include in the list. These will be passed to
the internal list constructor and included in the resulting |
A list object of class rcdf.
rcdf <- rcdf_list()
class(rcdf)
Based on https://github.com/gaborcsardi/dotenv
read_dot_env(path = ".env")
path |
A string specifying the path to the |
Reads a .env file containing environment variables in the format KEY=VALUE, and returns them as a named list.
Lines starting with # are considered comments and ignored.
A named list of environment variables. Each element is a key-value pair extracted from the file. If no variables are found, NULL is returned.
## Not run:
# Assuming an `.env` file with the following content:
# DB_HOST=localhost
# DB_USER=root
# DB_PASS="secret"
env_vars <- read_dot_env(".env")
print(env_vars)
# Should output something like:
# $DB_HOST
# [1] "localhost"

# If no path is given, it defaults to `.env` in the current directory.
env_vars <- read_dot_env()
## End(Not run)
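For readers who want to see the KEY=VALUE parsing rules spelled out, the following base-R sketch parses lines in that format, skips comments, and returns a named list (or NULL when nothing is found). The helper `parse_env_lines` is hypothetical, and its handling of whitespace and surrounding quotes is an assumption; the package's parser may treat edge cases differently.

```r
# Hypothetical helper; a sketch of .env parsing, not read_dot_env() itself.
parse_env_lines <- function(lines) {
  lines <- trimws(lines)
  # Drop blank lines, comment lines, and lines without "="
  lines <- lines[nzchar(lines) & !startsWith(lines, "#")]
  lines <- lines[grepl("=", lines, fixed = TRUE)]
  if (length(lines) == 0) return(NULL)

  # Split on the first "=" only, so values may themselves contain "="
  keys <- trimws(sub("=.*$", "", lines))
  values <- trimws(sub("^[^=]*=", "", lines))
  # Assumption: strip one leading and one trailing quote character
  values <- gsub("^[\"']|[\"']$", "", values)

  as.list(setNames(values, keys))
}

env_file <- tempfile(fileext = ".env")
writeLines(
  c("# database settings", "DB_HOST=localhost", "DB_USER=root", "DB_PASS=\"secret\""),
  env_file
)
env_vars <- parse_env_lines(readLines(env_file))
env_vars$DB_HOST
unlink(env_file)
```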
read_env(path = ".env")
path |
A string specifying the path to the |
Reads a .env file containing environment variables in the format KEY=VALUE, and returns them as a named list.
Lines starting with # are considered comments and ignored.
A named list of environment variables. Each element is a key-value pair extracted from the file. If no variables are found, NULL is returned.
## Not run:
# Assuming an `.env` file with the following content:
# DB_HOST=localhost
# DB_USER=root
# DB_PASS="secret"
env_vars <- read_env(".env")
print(env_vars)
# Should output something like:
# $DB_HOST
# [1] "localhost"

# If no path is given, it defaults to `.env` in the current directory.
env_vars <- read_env()
## End(Not run)
This function reads a Parquet file, optionally decrypting it using the provided decryption key. If no decryption key is provided, it reads the file normally without decryption. It supports reading Parquet files as Arrow tables or regular data frames, depending on the as_arrow_table argument.
read_parquet(
  path,
  ...,
  decryption_key = NULL,
  as_arrow_table = FALSE,
  metadata = NULL
)
path |
The file path to the Parquet file. |
... |
Additional arguments passed to |
decryption_key |
A list containing |
as_arrow_table |
Logical. If |
metadata |
Optional metadata (e.g., a data dictionary) to be applied to the resulting data. |
An Arrow table or a data frame, depending on the value of as_arrow_table.
## Not run:
# Using sample Parquet files from `mtcars` dataset
dir <- system.file("extdata", package = "rcdf")

# Not encrypted
read_parquet(file.path(dir, "mtcars.parquet"))

# Encrypted
read_parquet(
  file.path(dir, "mtcars-encrypted.parquet"),
  decryption_key = 'rppqM5CuEqotys4wQq/g7xh6wpIjRozcAIbI9sagwKE='
)
## End(Not run)
This function reads a Parquet file through a DuckDB connection, decrypting it with the provided decryption key, and returns the result as a lazy table rather than loading it into memory.
read_parquet_tbl(conn, file, decryption_key, table_name = NULL, columns = NULL)
conn |
A DuckDB connection. |
file |
The file path to the Parquet file. |
decryption_key |
A list containing |
table_name |
Database table name. If |
columns |
A character vector matching the column names available in the Parquet file. |
Lazy table from DuckDB connection
## Not run:
# Using sample Parquet files from `mtcars` dataset
dir <- system.file("extdata", package = "rcdf")

# `conn` is a required argument; open a DuckDB connection first
conn <- DBI::dbConnect(duckdb::duckdb())

# Encrypted
read_parquet_tbl(
  conn = conn,
  file = file.path(dir, "mtcars-encrypted.parquet"),
  decryption_key = 'rppqM5CuEqotys4wQq/g7xh6wpIjRozcAIbI9sagwKE='
)

DBI::dbDisconnect(conn, shutdown = TRUE)
## End(Not run)
This function reads an RCDF file, decrypts its contents using the specified decryption key, and loads it into R as an RCDF object.
read_rcdf(
  path,
  ...,
  decryption_key,
  password = NULL,
  metadata = list(),
  ignore_duplicates = TRUE,
  recursive = FALSE,
  return_meta = FALSE,
  lazy = FALSE,
  n_threads = NULL
)
path |
A string specifying the path to the RCDF archive (zip file). If a directory is provided, all |
... |
Additional parameters passed to other functions, if needed (not yet implemented). |
decryption_key |
The key used to decrypt the RCDF. This can be an RSA or AES key, depending on how the RCDF was encrypted. |
password |
A password used for RSA decryption (optional). |
metadata |
An optional list of metadata object containing data dictionaries, value sets, and primary key constraints for data integrity measure (a |
ignore_duplicates |
A |
recursive |
Logical. If |
return_meta |
Logical. If |
lazy |
Logical. If |
n_threads |
Integer or |
An RCDF object, which is a list of Parquet files (one for each record) along with attached metadata.
dir <- system.file("extdata", package = "rcdf")
rcdf_path <- file.path(dir, 'mtcars.rcdf')
private_key <- file.path(dir, 'sample-private-key-pw.pem')
pw <- '1234'
## Not run:
rcdf_data <- read_rcdf(
  path = rcdf_path,
  decryption_key = private_key,
  password = pw
)
rcdf_data
## End(Not run)
Writes a data frame to a Parquet file. When encryption_key is
supplied the file is encrypted with AES using DuckDB's native Parquet
encryption support. Without a key the file is written by
arrow::write_parquet() and supports any compression codec
understood by the Arrow library.
write_parquet(
  data,
  path,
  ...,
  encryption_key = NULL,
  conn = NULL,
  compression = "zstd"
)
data |
A data frame or tibble to write. |
path |
Character string. Destination file path for the Parquet file. |
... |
Additional arguments passed to |
encryption_key |
A raw AES key as a hex string (32, 48, or 64 hex
characters) or a base-64–encoded 256-bit key string. When |
conn |
An optional existing DuckDB |
compression |
Compression codec to use for unencrypted Parquet files.
Any codec supported by |
NULL invisibly. The Parquet file is written to path.
## Not run:
library(rcdf)
data <- mtcars
key <- "rppqM5CuEqotys4wQq/g7xh6wpIjRozcAIbI9sagwKE="
temp_dir <- tempdir()

# Encrypted write
write_parquet(
  data = data,
  path = file.path(temp_dir, "mtcars.parquet"),
  encryption_key = key
)

# Unencrypted write with gzip compression
write_parquet(
  data = data,
  path = file.path(temp_dir, "mtcars-gz.parquet"),
  compression = "gzip"
)
## End(Not run)
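The two accepted key shapes described for `encryption_key` (a hex string of 32, 48, or 64 characters, or a base-64 string encoding 256 bits) can be told apart with a simple format check. The sketch below is a heuristic written for illustration; `classify_key` is a hypothetical helper, and it is an assumption that a padded 44-character base-64 string is how the 256-bit form appears. It does not reproduce any validation the package itself performs.

```r
# Hypothetical helper: classify a key string by its textual format only.
classify_key <- function(key) {
  n <- nchar(key)
  # Hex keys: 32, 48, or 64 hex digits (4 bits per digit)
  if (grepl("^[0-9A-Fa-f]+$", key) && n %in% c(32, 48, 64)) {
    return(paste0("hex (", n * 4, "-bit)"))
  }
  # 32 bytes in padded base-64 occupy 44 characters ending in one "="
  if (n == 44 && grepl("^[A-Za-z0-9+/]{43}=$", key)) {
    return("base64 (256-bit)")
  }
  "unrecognised"
}

classify_key("rppqM5CuEqotys4wQq/g7xh6wpIjRozcAIbI9sagwKE=")
classify_key(strrep("ab", 16))
classify_key("short")
```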
This function writes data to an RCDF (Reusable Data Container Format) archive. It encrypts the data using AES, generates metadata, and then creates a zip archive containing both the encrypted Parquet files and metadata. The function supports the inclusion of metadata such as system information and encryption keys.
write_rcdf(
  data,
  path,
  pub_key,
  ...,
  metadata = list(),
  ignore_duplicates = TRUE
)
data |
A list of data frames or tables to be written to RCDF format. Each element of the list represents a record. |
path |
The path where the RCDF file will be written. The file will be saved with a |
pub_key |
The public RSA key used to encrypt the AES encryption keys. |
... |
Additional arguments passed to helper functions if needed. |
metadata |
A list of metadata to be included in the RCDF file. |
ignore_duplicates |
A |
NULL. The function writes the data to a .rcdf file at the specified path.
## Not run:
# Example usage of writing an RCDF file
rcdf_data <- rcdf_list()
rcdf_data$mtcars <- mtcars
dir <- system.file("extdata", package = "rcdf")
temp_dir <- tempdir()
write_rcdf(
  data = rcdf_data,
  path = file.path(temp_dir, "mtcars.rcdf"),
  pub_key = file.path(dir, 'sample-public-key.pem')
)
write_rcdf(
  data = rcdf_data,
  path = file.path(temp_dir, "mtcars-pw.rcdf"),
  pub_key = file.path(dir, 'sample-public-key-pw.pem')
)
unlink(file.path(temp_dir, "mtcars.rcdf"), force = TRUE)
unlink(file.path(temp_dir, "mtcars-pw.rcdf"), force = TRUE)
## End(Not run)
Exports RCDF-formatted data to one or more supported open data formats. The function automatically dispatches to the appropriate writer function based on the formats provided.
write_rcdf_as(data, path, formats, ...)
data |
A named list or RCDF object. Each element should be a table or tibble-like object (typically a |
path |
The target directory where output files should be saved. |
formats |
A character vector of file formats to export to. Supported formats include: |
... |
Additional arguments passed to the respective writer functions. |
Invisibly returns NULL. Files are written to disk.
write_rcdf_csv write_rcdf_tsv write_rcdf_json write_rcdf_xlsx write_rcdf_dta write_rcdf_sav write_rcdf_sqlite
## Not run:
dir <- system.file("extdata", package = "rcdf")
rcdf_path <- file.path(dir, 'mtcars.rcdf')
private_key <- file.path(dir, 'sample-private-key-pw.pem')
rcdf_data <- read_rcdf(path = rcdf_path, decryption_key = private_key, password = '1234')
temp_dir <- tempdir()
write_rcdf_as(data = rcdf_data, path = temp_dir, formats = c("csv", "xlsx"))
unlink(temp_dir, force = TRUE)
## End(Not run)
Writes each table in the RCDF object as a separate .csv file.
write_rcdf_csv(data, path, ..., parent_dir = NULL)
data |
A valid RCDF object. |
path |
The base output directory. |
... |
Additional arguments passed to |
parent_dir |
Optional subdirectory under |
Invisibly returns NULL. Files are written to disk.
dir <- system.file("extdata", package = "rcdf")
rcdf_path <- file.path(dir, 'mtcars.rcdf')
private_key <- file.path(dir, 'sample-private-key-pw.pem')
rcdf_data <- read_rcdf(path = rcdf_path, decryption_key = private_key, password = '1234')
temp_dir <- tempdir()
write_rcdf_csv(data = rcdf_data, path = temp_dir)
unlink(temp_dir, force = TRUE)
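The "one file per table" pattern shared by the per-format writers can be sketched with base R alone. This is an illustration under stated assumptions: a plain named list stands in for the RCDF object, file names are taken from the list names, and `utils::write.csv()` is used as the writer. The package's own naming scheme and writer function may differ.

```r
# A plain named list standing in for an RCDF object (assumption)
tables <- list(mtcars = head(mtcars), iris = head(iris))

# Dedicated output directory, so cleanup never touches tempdir() itself
out_dir <- file.path(tempdir(), "csv_export")
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)

# Write each table to <name>.csv
for (nm in names(tables)) {
  utils::write.csv(
    tables[[nm]],
    file.path(out_dir, paste0(nm, ".csv")),
    row.names = FALSE
  )
}

written <- list.files(out_dir)
written

# unlink() only removes a directory when recursive = TRUE
unlink(out_dir, recursive = TRUE)
```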
Writes each table in the RCDF object to a .dta file for use in Stata.
write_rcdf_dta(data, path, ..., parent_dir = NULL)
data |
A valid RCDF object. |
path |
Output directory for files. |
... |
Additional arguments passed to |
parent_dir |
Optional subdirectory under |
Invisibly returns NULL. Files are written to disk.
dir <- system.file("extdata", package = "rcdf")
rcdf_path <- file.path(dir, 'mtcars.rcdf')
private_key <- file.path(dir, 'sample-private-key-pw.pem')
rcdf_data <- read_rcdf(path = rcdf_path, decryption_key = private_key, password = '1234')
temp_dir <- tempdir()
write_rcdf_dta(data = rcdf_data, path = temp_dir)
unlink(temp_dir, force = TRUE)
Writes each table in the RCDF object as a separate .json file.
write_rcdf_json(data, path, ..., parent_dir = NULL)
data |
A valid RCDF object. |
path |
The output directory for files. |
... |
Additional arguments passed to |
parent_dir |
Optional subdirectory under |
Invisibly returns NULL. Files are written to disk.
dir <- system.file("extdata", package = "rcdf")
rcdf_path <- file.path(dir, 'mtcars.rcdf')
private_key <- file.path(dir, 'sample-private-key-pw.pem')
rcdf_data <- read_rcdf(path = rcdf_path, decryption_key = private_key, password = '1234')
temp_dir <- tempdir()
write_rcdf_json(data = rcdf_data, path = temp_dir)
unlink(temp_dir, force = TRUE)
This function writes an RCDF object (a list of data frames) to multiple Parquet files. Each data frame in the list is written to its corresponding Parquet file in the specified path.
write_rcdf_parquet(
  data,
  path,
  ...,
  parent_dir = NULL,
  primary_key = NULL,
  ignore_duplicates = TRUE
)
data |
A list where each element is a data frame or tibble that will be written to a Parquet file. |
path |
The directory path where the Parquet files will be written. |
... |
Additional arguments passed to |
parent_dir |
An optional parent directory to be included in the path where the files will be written. |
primary_key |
A |
ignore_duplicates |
A |
A character vector of file paths to the written Parquet files.
dir <- system.file("extdata", package = "rcdf")
rcdf_path <- file.path(dir, 'mtcars.rcdf')
private_key <- file.path(dir, 'sample-private-key-pw.pem')
rcdf_data <- read_rcdf(path = rcdf_path, decryption_key = private_key, password = '1234')
temp_dir <- tempdir()
write_rcdf_parquet(data = rcdf_data, path = temp_dir)
unlink(temp_dir, force = TRUE)
Writes each table in the RCDF object to a .sav file using the haven package for compatibility with SPSS.
write_rcdf_sav(data, path, ..., parent_dir = NULL)
data |
A valid RCDF object. |
path |
Output directory for files. |
... |
Additional arguments passed to |
parent_dir |
Optional subdirectory under |
Invisibly returns NULL. Files are written to disk.
dir <- system.file("extdata", package = "rcdf")
rcdf_path <- file.path(dir, 'mtcars.rcdf')
private_key <- file.path(dir, 'sample-private-key-pw.pem')
rcdf_data <- read_rcdf(path = rcdf_path, decryption_key = private_key, password = '1234')
temp_dir <- tempdir()
write_rcdf_sav(data = rcdf_data, path = temp_dir)
unlink(temp_dir, force = TRUE)
Writes all tables in the RCDF object to a single SQLite database file.
write_rcdf_sqlite(data, path, db_name = "cbms_data", ..., parent_dir = NULL)
data |
A valid RCDF object. |
path |
Output directory for the database file. |
db_name |
Name of the SQLite database file (without extension). |
... |
Additional arguments passed to |
parent_dir |
Optional subdirectory under |
Invisibly returns NULL. A .db file is written to disk.
dir <- system.file("extdata", package = "rcdf")
rcdf_path <- file.path(dir, 'mtcars.rcdf')
private_key <- file.path(dir, 'sample-private-key-pw.pem')
rcdf_data <- read_rcdf(path = rcdf_path, decryption_key = private_key, password = '1234')
temp_dir <- tempdir()
write_rcdf_sqlite(data = rcdf_data, path = temp_dir)
unlink(temp_dir, force = TRUE)
Writes each table in the RCDF object as a separate tab-separated .txt file.
write_rcdf_tsv(data, path, ..., parent_dir = NULL)
data |
A valid RCDF object. |
path |
The base output directory. |
... |
Additional arguments passed to |
parent_dir |
Optional subdirectory under |
Invisibly returns NULL. Files are written to disk.
dir <- system.file("extdata", package = "rcdf")
rcdf_path <- file.path(dir, 'mtcars.rcdf')
private_key <- file.path(dir, 'sample-private-key-pw.pem')
rcdf_data <- read_rcdf(path = rcdf_path, decryption_key = private_key, password = '1234')
temp_dir <- tempdir()
write_rcdf_tsv(data = rcdf_data, path = temp_dir)
unlink(temp_dir, force = TRUE)
Writes each table in the RCDF object as a separate .xlsx file using the openxlsx package.
write_rcdf_xlsx(
  data,
  path,
  ...,
  parent_dir = NULL,
  as_single_file = FALSE,
  file_name = NULL
)
data |
A valid RCDF object. |
path |
The output directory. |
... |
Additional arguments passed to |
parent_dir |
Optional subdirectory under |
as_single_file |
Whether to export all records (items in the RCDF list) in a single file where each item will be written per sheet in the workbook. |
file_name |
File name to assign when |
Invisibly returns NULL. Files are written to disk.
dir <- system.file("extdata", package = "rcdf")
rcdf_path <- file.path(dir, 'mtcars.rcdf')
private_key <- file.path(dir, 'sample-private-key-pw.pem')
rcdf_data <- read_rcdf(path = rcdf_path, decryption_key = private_key, password = '1234')
temp_dir <- tempdir()
write_rcdf_xlsx(data = rcdf_data, path = temp_dir)
unlink(temp_dir, force = TRUE)