# Use the development build only when explicitly enabled via the DEV_VIGNETTES env var.
dev_mode <- (Sys.getenv("DEV_VIGNETTES", "false") == "true")

if (dev_mode && requireNamespace("pkgload", quietly = TRUE)) {
  pkgload::load_all(
    export_all = FALSE,
    helpers = FALSE,
    attach_testthat = FALSE
  )
} else {
  # fall back to the installed package (the path CRAN, CI, and pkgdown take)
  library(stamp)
}

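This article explains how stamp detects changes and records versions: saving with hashes, inspecting sidecars, detecting changes without writing, loading specific versions, locating snapshots on disk, and verifying integrity on load.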

The core idea: stamp computes stable, reproducible hashes for objects and (optionally) the user-supplied code. When you call st_save(), stamp compares the new hashes with stored metadata (sidecars or committed snapshots) and decides whether to write a new version. This allows cheap “skip-on-equal” behavior for expensive workflows.
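To make the skip-on-equal idea concrete, here is a minimal base-R sketch. It is a toy model, not stamp's implementation: `toy_hash()`, `toy_save()`, and the in-memory `store` are hypothetical names, and an MD5 over `serialize()` output merely stands in for stamp's stable hashing.

```r
# Toy model of "skip-on-equal". ASSUMPTION: illustrative only; stamp's real
# hashing and metadata storage are more involved than this.
toy_hash <- function(obj) {
  tmp <- tempfile()
  on.exit(unlink(tmp))
  writeBin(serialize(obj, connection = NULL), tmp)
  unname(tools::md5sum(tmp))
}

store <- new.env()  # stands in for the stored metadata (sidecar)

toy_save <- function(obj, key) {
  new_hash <- toy_hash(obj)
  old_hash <- store[[key]]            # NULL when nothing was saved yet
  if (!is.null(old_hash) && identical(old_hash, new_hash)) {
    return("skipped")                 # same content: no new version
  }
  store[[key]] <- new_hash            # record the hash of what was written
  "saved"
}

x  <- data.frame(a = c(1, 2, 3))
r1 <- toy_save(x, "demo")                         # first write
r2 <- toy_save(x, "demo")                         # unchanged content
r3 <- toy_save(transform(x, a = a + 1), "demo")   # changed content
c(r1, r2, r3)
```

The first call always writes because no hash is stored yet, which mirrors the behavior described for st_save() on a fresh path.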

st_opts_reset()
st_opts(
  versioning = "content", # skip write when content unchanged
  code_hash = TRUE, # store code hash when 'code=' is provided to st_save()
  store_file_hash = TRUE, # compute & store file hash after write
  verify_on_load = TRUE, # verify content on load (warn on mismatch)
  meta_format = "both" # write JSON + QS2 sidecars
)
##  stamp options updated
##   versioning = "content", code_hash = "TRUE", store_file_hash = "TRUE",
##   verify_on_load = "TRUE", meta_format = "both"

Save with hashes (and skip if content identical)

root <- tempdir()
st_init(root)
##  stamp initialized
##   root: /tmp/RtmpF4xZDt
##   state: /tmp/RtmpF4xZDt/.stamp
p <- fs::path(root, "demo.qs")
x <- data.frame(a = 1:3)

# First write: creates artifact + sidecars + catalog entry
st_save(x, p, code = function(z) z)
##  Saved [qs] → /tmp/RtmpF4xZDt/demo.qs @ version
## 17789658836fb198
# Second write, same content & same code: skipped (no new version)
st_save(x, p, code = function(z) z)
##  Skip save (reason: no_change_policy) for
## /tmp/RtmpF4xZDt/demo.qs
nrow(st_versions(p)) # should be 1
## [1] 1

In the snippet above, stamp serializes x deterministically and computes a content hash. With versioning = "content", no new version is created if both the content hash and (when provided) the code hash match the stored metadata.

Note: the first write will always create the artifact and its sidecar(s). If you see the first write skipped, check that the path you passed to st_save() exactly matches subsequent calls.

If you change content (or change the code), a new version is recorded:

x2 <- transform(x, a = a + 1L)
st_save(x2, p, code = function(z) z)
##  Saved [qs] → /tmp/RtmpF4xZDt/demo.qs @ version
## 1946b8e377d9c8d0
nrow(st_versions(p)) # now 2
## [1] 2
st_latest(p) # latest version id (string)
## [1] "1946b8e377d9c8d0"

Policy: By design, changing the code= you pass to st_save() creates a new version even if x is identical. This makes code provenance explicit.
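Why a code change alone is detectable can be illustrated with a toy code hash. The sketch below hashes the deparsed text of a function; `toy_code_hash()` is a hypothetical helper, and st_hash_code() may normalize code differently.

```r
# Toy code hash: MD5 of the deparsed function text.
# ASSUMPTION: illustrative only; this is not how st_hash_code() is implemented.
toy_code_hash <- function(code) {
  tmp <- tempfile()
  on.exit(unlink(tmp))
  writeLines(deparse(code), tmp)
  unname(tools::md5sum(tmp))
}

same_a  <- toy_code_hash(function(z) z)
same_b  <- toy_code_hash(function(z) z)       # identical text, separate object
changed <- toy_code_hash(function(z) z + 1L)  # different body

identical(same_a, same_b)   # same text gives the same hash
identical(same_a, changed)  # different text gives a different hash
```

Under this model, two saves of an identical object still differ in their code hash whenever the function text differs, which is the condition that triggers a new version.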

A short practical pattern:

  • Pass your transformation code to st_save(..., code = <function or expression>) so stamp can record the code hash.
  • Use st_changed() or st_should_save() to cheaply decide whether to run expensive computations before calling st_save().

Inspect sidecars & metadata

meta <- st_read_sidecar(p)
meta[c(
  "format",
  "created_at",
  "size_bytes",
  "content_hash",
  "code_hash",
  "file_hash"
)]
## $format
## [1] "qs"
## 
## $created_at
## [1] "2025-12-22T11:00:57.492275Z"
## 
## $size_bytes
## [1] 137
## 
## $content_hash
## [1] "7e25cdd35cd37239"
## 
## $code_hash
## [1] "488e8fa49c740261"
## 
## $file_hash
## [1] "28d069cce24a86f3"

Explanation:

  • content_hash is the stable hash of the R object written, computed via st_hash_obj(x).

  • code_hash is recorded when you provide code= to st_save(), computed via st_hash_code(code).

  • file_hash is optional: it is computed after the file write (if store_file_hash = TRUE) and can be used to detect external file tampering.

Change detection (without writing)

Use these before doing expensive work, to decide whether to recompute.

x_same <- x2
x_new <- transform(x2, a = a + 10L)

st_changed(p, x = x_same, code = function(z) z)
## $changed
## [1] FALSE
## 
## $reason
## [1] "no_change"
## 
## $details
## $details$content_changed
## [1] FALSE
## 
## $details$code_changed
## [1] FALSE
## 
## $details$file_changed
## [1] FALSE
st_changed_reason(p, x = x_same, code = function(z) z) # "no_change"
## [1] "no_change"
st_changed(p, x = x_new, code = function(z) z)
## $changed
## [1] TRUE
## 
## $reason
## [1] "content"
## 
## $details
## $details$content_changed
## [1] TRUE
## 
## $details$code_changed
## [1] FALSE
## 
## $details$file_changed
## [1] FALSE
st_changed_reason(p, x = x_new, code = function(z) z) # "content"
## [1] "content"
st_should_save(p, x = x_same, code = function(z) z) # recommends skip
## $save
## [1] FALSE
## 
## $reason
## [1] "no_change_policy"
st_should_save(p, x = x_new, code = function(z) z) # recommends save
## $save
## [1] TRUE
## 
## $reason
## [1] "content"

st_changed() and st_changed_reason() perform no file writes, which makes them suitable as guards inside functions that should redo expensive work only when necessary.

Example pattern inside your pipeline function:

if (st_should_save(p, x = out, code = my_transform)$save) {
  st_save(out, p, code = my_transform)
} else {
  message("Skipping write; content and code unchanged")
}
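The details element returned by st_changed() can also be inspected programmatically, for example to log which component triggered a rewrite. The sketch below runs on a hand-built list shaped like the output printed earlier, so it does not call stamp; `changed_parts` is a hypothetical variable name.

```r
# A list shaped like the st_changed() result shown earlier (values copied by hand).
res <- list(
  changed = TRUE,
  reason  = "content",
  details = list(
    content_changed = TRUE,
    code_changed    = FALSE,
    file_changed    = FALSE
  )
)

# Keep only the detail flags that are TRUE and report their names.
changed_parts <- names(Filter(isTRUE, res$details))
if (res$changed) {
  message("Changed: ", paste(changed_parts, collapse = ", "))
}
```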

Loading specific versions

vids <- st_versions(p)
head(vids)
##          version_id      artifact_id     content_hash        code_hash
##              <char>           <char>           <char>           <char>
## 1: 1946b8e377d9c8d0 fed38ae263aeddbc 7e25cdd35cd37239 488e8fa49c740261
## 2: 17789658836fb198 fed38ae263aeddbc 1811ba4b2bd2a26a 488e8fa49c740261
##    size_bytes                  created_at sidecar_format
##         <num>                      <char>         <char>
## 1:        137 2025-12-22T11:00:57.492275Z           both
## 2:        137 2025-12-22T11:00:57.325872Z           both
vid_latest <- st_latest(p)
obj_latest <- st_load_version(p, vid_latest)
##  Loaded ← /tmp/RtmpF4xZDt/demo.qs @
## 1946b8e377d9c8d0 [qs]
# Load an older version by id
if (nrow(vids) > 1L) {
  vid_old <- vids$version_id[[nrow(vids)]]
  obj_old <- st_load_version(p, vid_old)
}
##  Loaded ← /tmp/RtmpF4xZDt/demo.qs @
## 17789658836fb198 [qs]

st_versions() returns a table of version metadata. Each row includes the version_id, created_at, and a snapshot of sidecar fields available at commit time. Use st_load_version() to restore the artifact as it was at that version.
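The snippet above picks the older version with vids$version_id[[nrow(vids)]], which relies on the table being ordered newest first. A more defensive sketch sorts by created_at explicitly. It uses a hand-built data frame with the two rows printed above rather than calling stamp, and assumes the timestamps are uniformly formatted ISO 8601 UTC strings, so that lexical order equals chronological order.

```r
# Two rows copied from the st_versions() output above.
vids <- data.frame(
  version_id = c("1946b8e377d9c8d0", "17789658836fb198"),
  created_at = c("2025-12-22T11:00:57.492275Z", "2025-12-22T11:00:57.325872Z"),
  stringsAsFactors = FALSE
)

# ASSUMPTION: uniform ISO 8601 strings sort chronologically as plain text.
ord <- order(vids$created_at, method = "radix")  # radix sorts in the C locale
vid_oldest <- vids$version_id[ord[1]]
vid_newest <- vids$version_id[ord[length(ord)]]
c(oldest = vid_oldest, newest = vid_newest)
```

The resulting ids can then be passed to st_load_version(p, ...) as in the example above.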

Where are versions stored?

Snapshots live under .stamp/versions/<relative-path>/<version_id>/.

p <- fs::path(root, "demo.qs")
x <- data.frame(a = 1:5)

# Write once to create a version snapshot
st_save(x, p, code = function(z) z)
##  Saved [qs] → /tmp/RtmpF4xZDt/demo.qs @ version
## ff201adba0a15f08
# Now list the versions tree
vroot <- stamp:::.st_versions_root()
fs::dir_tree(vroot, recurse = TRUE, all = TRUE)
## /tmp/RtmpF4xZDt/.stamp/versions
## └── demo.qs
##     ├── 17789658836fb198
##     │   ├── artifact
##     │   ├── sidecar.json
##     │   └── sidecar.qs2
##     ├── 1946b8e377d9c8d0
##     │   ├── artifact
##     │   ├── sidecar.json
##     │   └── sidecar.qs2
##     └── ff201adba0a15f08
##         ├── artifact
##         ├── sidecar.json
##         └── sidecar.qs2

Each snapshot dir contains:

  • artifact — a copy of the saved file
  • sidecar.json and/or sidecar.qs2 — depending on meta_format

Additionally, each snapshot may include a parents.json file capturing committed lineage between artifacts; it is created when stamp records explicit parents during a commit. Sidecar metadata (in stmeta/) is the primary local source used to decide whether to write, while snapshots are the long-term committed record.
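Given the layout described above, the directory of a single snapshot can be assembled from its parts. `snapshot_dir()` is a hypothetical helper, not part of stamp, and it assumes the state directory is .stamp directly under the root, as in the st_init() output shown earlier.

```r
# Build the expected snapshot directory from the documented layout:
# <root>/.stamp/versions/<relative-path>/<version_id>/
# ASSUMPTION: the state directory is ".stamp" directly under the root.
snapshot_dir <- function(root, rel_path, version_id) {
  file.path(root, ".stamp", "versions", rel_path, version_id)
}

snap <- snapshot_dir("/tmp/RtmpF4xZDt", "demo.qs", "ff201adba0a15f08")
snap

# Check whether the artifact copy is present (FALSE unless that session's files exist).
file.exists(file.path(snap, "artifact"))
```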

Integrity checks on load (optional)

If verify_on_load = TRUE and a content_hash exists in the sidecar, st_load() recomputes the object’s hash and warns if it differs (indicating the file changed outside stamp).

invisible(st_load(p)) # triggers optional verify; warns on mismatch
## Warning: Loaded object hash mismatch for /tmp/RtmpF4xZDt/demo.qs (content hash differs
## from sidecar).
## Warning: No primary key recorded for /tmp/RtmpF4xZDt/demo.qs.
##  You can add one with `st_add_pk()`.
##  Loaded [qs] ← /tmp/RtmpF4xZDt/demo.qs

The comparison uses st_hash_obj() on the loaded object against the content_hash recorded in the sidecar or snapshot. A mismatch usually means the file was modified outside of stamp, and re-saving is recommended.
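The verify-on-load check can be modeled in a few lines of base R. This is a toy sketch under stated assumptions, not stamp's code: `toy_hash()` and `toy_load()` are hypothetical, the hash is an MD5 of the deparsed object, and an RDS file stands in for the artifact.

```r
# Toy verify-on-load. ASSUMPTION: illustrative only; stamp uses st_hash_obj()
# and its own file formats rather than deparse() + MD5 + RDS.
toy_hash <- function(obj) {
  tmp <- tempfile()
  on.exit(unlink(tmp))
  writeLines(deparse(obj), tmp)
  unname(tools::md5sum(tmp))
}

toy_load <- function(path, sidecar) {
  obj <- readRDS(path)
  if (!identical(toy_hash(obj), sidecar$content_hash)) {
    warning("Loaded object hash mismatch for ", path, call. = FALSE)
  }
  obj
}

path <- tempfile(fileext = ".rds")
x <- data.frame(a = c(1, 2, 3))
saveRDS(x, path)
sidecar <- list(content_hash = toy_hash(x))  # recorded at save time

ok <- toy_load(path, sidecar)                # hashes match

# Simulate a change made outside the tool, then capture the warning text.
saveRDS(data.frame(a = c(9, 9, 9)), path)
msg <- tryCatch(
  {
    toy_load(path, sidecar)
    NA_character_
  },
  warning = function(w) conditionMessage(w)
)
msg
```

Re-saving through the same mechanism would refresh the recorded hash, which corresponds to the repair step recommended above.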

Troubleshooting

Q: The first st_save() was skipped and st_versions(p) has 0 rows. A: The first write should never be skipped. Ensure you’re using a current version of stamp, in which st_should_save() returns save = TRUE when the artifact or its sidecar is missing.

Q: st_changed_reason() says "missing_meta". A: The artifact exists but the sidecar was removed or is unreadable. Call st_save(x, p, code = ...) once; it will re-materialize metadata and record a version.

Q: Changing only code= didn’t create a new version. A: By design, a code change does create a new version. Confirm that st_opts("code_hash", .get = TRUE) is TRUE and that you pass code= consistently (e.g., as a function literal rather than, in rare cases, different objects that point to identical functions).

Q: CSV round-trips aren’t byte-identical. A: data.table::fread/fwrite may coerce types (e.g., integers vs doubles). Compare with relaxed checks or coerce types before comparison.

Q: I see a warning on load about hash mismatch. A: With verify_on_load = TRUE, stamp recomputes the object hash and warns if it differs from the sidecar’s content_hash. This indicates the file was modified outside stamp or the sidecar is stale. Re-save to repair.

Q: qs2 isn’t installed. A: qs2 is preferred. If unavailable, stamp falls back to qs for read/write under the "qs2" handler. Install qs2 for best performance.

Q: Sidecars not appearing. A: Check st_opts("meta_format", .get = TRUE) — set to "json", "qs2", or "both". Sidecars are written to stmeta/ next to the artifact.

Q: Versions aren’t where I expect. A: Version snapshots live under .stamp/versions/<relative-path>/<version_id>/. Use the code snippet above to explore the tree.
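The CSV round-trip answer above can be reproduced without stamp. The sketch below uses base R's write.csv() and read.csv() as a stand-in for data.table::fwrite/fread (an assumption made to keep the example dependency-free); the type change it shows is the kind of coercion that answer refers to.

```r
# Whole-number doubles come back as integers after a CSV round trip.
df  <- data.frame(a = c(1, 2, 3))   # column of doubles
csv <- tempfile(fileext = ".csv")
write.csv(df, csv, row.names = FALSE)
back <- read.csv(csv)

class(df$a)     # "numeric"
class(back$a)   # "integer"

strict  <- identical(df$a, back$a)          # FALSE: storage types differ
relaxed <- isTRUE(all.equal(df$a, back$a))  # TRUE: values agree numerically

# Coerce before comparing when a strict check is required.
back$a <- as.numeric(back$a)
strict_after <- identical(df$a, back$a)     # TRUE
```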

Tips & conventions

  • Keep versioning = "content" for reproducible artifacts; use "timestamp" if you want a new version on every save; "off" to skip versioning entirely.
  • Use st_changed() / st_should_save() to gate expensive computation inside your own functions.
  • Sidecars: prefer meta_format = "json" for readability, "qs2" for compactness, or "both" for redundancy.

Further reading / next steps:

  • See the lineage-rebuilds vignette for how committed parents and sidecar parents interact during st_rebuild().
  • Consider recording code= for critical data transformations so provenance is preserved even when object content is identical.