pkgdown/extra.css Skip to contents

datasusr can cache DATASUS downloads in a local directory so that repeated calls do not hit the DATASUS FTP again. This is especially useful when developing analysis pipelines interactively.

How caching works

When you call datasus_download() with use_cache = TRUE (the default), files are stored in a structured subdirectory tree under the cache folder. On subsequent calls for the same files, the cached versions are reused without any network access.

library(datasusr)

downloads <- datasus_fetch(
  source    = "SIHSUS",
  file_type = c("RD", "SP"),
  year      = 2024,
  month     = 1,
  uf        = c("PE", "PB")
)

Configuring the cache directory

By default, downloads are placed in a session-scoped subdirectory of tempdir() (which R cleans up automatically when the session ends), so the package never writes outside the user-controlled tempdir unless you opt in.

The cache location is resolved in the following order:

  1. The cache_dir function argument
  2. The DATASUSR_CACHE_DIR environment variable
  3. The datasusr.cache_dir R option
  4. The session default (file.path(tempdir(), "datasusr-cache"))

To enable a persistent cache that survives across sessions, point one of the above to a directory of your choice — for example tools::R_user_dir("datasusr", "cache") — and the cache becomes truly persistent.

To set it globally, add a line to your .Renviron:

DATASUSR_CACHE_DIR=/path/to/my/cache

Or in R:

options(datasusr.cache_dir = "/path/to/my/cache")

Inspecting the cache

# Quick summary
datasus_cache_info(verbose = TRUE)

# Detailed listing of all cached files
datasus_cache_list()

Forcing a re-download

Pass refresh = TRUE to datasus_download() (or datasus_fetch()) to re-download files even when they exist in the cache:

datasus_download(files, refresh = TRUE)

Pruning and clearing the cache

Over time the cache can grow large. Two functions help manage its size:

# Remove files older than 90 days
datasus_cache_prune(older_than_days = 90)

# Keep the total cache under 5 GB
datasus_cache_prune(max_size_bytes = 5 * 1024^3)

# Remove everything
datasus_cache_clear()

When pruning by size, the least-recently-accessed files are removed first.