
Calculates and displays the hash of the in-memory data for each first-level element of x. This function is essentially a wrapper around digest::digest(). It also records the time at which the stamp was computed.

Usage

stamp_get(
  x,
  algo = c(getOption("stamp.digest.algo"), "md5", "sha1", "crc32", "sha256", "sha512",
    "xxhash32", "xxhash64", "murmur32", "spookyhash", "blake3"),
  serialize = TRUE,
  file = FALSE,
  length = Inf,
  skip = "auto",
  ascii = FALSE,
  raw = FALSE,
  seed = 0,
  errormode = c("stop", "warn", "silent")
)

Arguments

x

An arbitrary R object which will then be passed to the base::serialize function

algo

character: the algorithm to use. Defaults to the value of the option "stamp.digest.algo". Currently available choices are md5 (the default in digest::digest() itself), sha1, crc32, sha256, sha512, xxhash32, xxhash64, murmur32, spookyhash and blake3.

serialize

A logical variable indicating whether the object should be serialized using serialize (in ASCII form). Setting this to FALSE makes it possible to compare the digest output of given character strings to known control output. It also allows the use of raw vectors, such as the output of non-ASCII serialization.
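As a minimal sketch of the "known control output" use case, the snippet below calls digest::digest() directly (the function this page says stamp_get() wraps) and compares the result to the published MD5 test vector for the string "abc". Whether stamp_get() forwards serialize unchanged is an assumption here, so the example deliberately uses digest() itself.

```r
library(digest)

# serialize = FALSE hashes the characters of the string itself,
# not the serialized R object.
md5_abc <- digest("abc", algo = "md5", serialize = FALSE)
print(md5_abc)

# MD5 test vector for "abc" from RFC 1321.
stopifnot(identical(md5_abc, "900150983cd24fb0d6963f7d28e17f72"))

# With the default serialize = TRUE, the serialized object is hashed instead,
# so the result differs from the control value.
md5_obj <- digest("abc", algo = "md5")
stopifnot(!identical(md5_obj, md5_abc))
```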

file

Either a logical variable indicating whether x is a file name, or a file name to use when x is not specified.

length

Number of characters to process. By default, when length is set to Inf, the whole string or file is processed.

skip

Number of input bytes to skip before calculating the digest. Negative values are invalid and currently treated as zero. The special value "auto" causes the serialization header to be skipped if serialize is set to TRUE. The serialization header contains the R version number, so skipping it allows hashes to be compared across platforms and some R versions.

ascii

This flag is passed to the serialize function if serialize is set to TRUE, determining whether the hash is computed on the ASCII or binary representation.

raw

A logical variable with a default value of FALSE, implying digest returns digest output as ASCII hex values. Set to TRUE to return digest output in raw (binary) form. Note that this option is supported by most but not all of the implemented hashing algorithms.

seed

An integer to seed the random number generator. This is only used in the xxhash32, xxhash64 and murmur32 functions and can be used to generate additional hashes for the same input if desired.
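The effect of seed can be illustrated with a short sketch that calls digest::digest() directly. It hashes the same input twice with xxhash32 under two different seeds. The input value and the seeds are arbitrary choices for illustration.

```r
library(digest)

# Same input, same algorithm, different seeds.
h_seed0 <- digest("abc", algo = "xxhash32", seed = 0)
h_seed1 <- digest("abc", algo = "xxhash32", seed = 1)

print(c(seed0 = h_seed0, seed1 = h_seed1))

# Repeating a call with the same seed reproduces the same hash.
stopifnot(identical(h_seed0, digest("abc", algo = "xxhash32", seed = 0)))
```

For the algorithms not listed above, the seed is documented as unused, so changing it should not be relied on to alter their output.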

errormode

A character value denoting a choice for the behaviour in the case of error: ‘stop’ aborts (and is the default value), ‘warn’ emits a warning and returns NULL and ‘silent’ suppresses the error and returns an empty string.

Value

The digest function returns a character string of a fixed length containing the requested digest of the supplied R object. This string is of length 32 for MD5; of length 40 for SHA-1; of length 8 for CRC32; of length 8 for xxhash32; of length 16 for xxhash64; and of length 8 for murmur32.
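The fixed lengths can be checked with a small sketch that calls digest::digest() directly. The 64-character length for sha256 is not stated above; it follows from the 256-bit output being printed as hexadecimal.

```r
library(digest)

algos <- c("md5", "sha1", "sha256")

# Number of hex characters in each digest of the same input.
lens <- vapply(algos, function(a) nchar(digest("abc", algo = a)), integer(1))
print(lens)
# md5 gives 32 characters, sha1 gives 40, sha256 gives 64
```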

Details

Cryptographic hash functions are well researched and documented. The MD5 algorithm by Ron Rivest is specified in RFC 1321. The SHA-1 algorithm is specified in FIPS-180-1, SHA-2 is described in FIPS-180-2.

For md5, sha-1 and sha-256, this R implementation relies on standalone implementations in C by Christophe Devine. For crc32, code from the zlib library by Jean-loup Gailly and Mark Adler is used.

For sha-512, a standalone implementation from Aaron Gifford is used.

For xxhash32 and xxhash64, the reference implementation by Yann Collet is used.

For murmur32, the progressive implementation by Shane Day is used.

For spookyhash, the original source code by Bob Jenkins is used. The R implementation that integrates R's serialization directly with the algorithm allowing for memory-efficient incremental calculation of the hash is by Gabe Becker.

For blake3, the C implementation by Samuel Neves and Jack O'Connor is used.

Please note that this package is not meant to be used for cryptographic purposes for which more comprehensive (and widely tested) libraries such as OpenSSL should be used. Also, it is known that crc32 is not collision-proof. For sha-1, recent results indicate certain cryptographic weaknesses as well. For more details, see for example https://www.schneier.com/blog/archives/2005/02/cryptanalysis_o.html.

See also

Other stamp functions: stamp_confirm(), stamp_read(), stamp_save(), stamp_set(), stamp_time(), stamp_x_attr()

Examples

stamp_get("abc")
#> $stamps
#> $stamps[[1]]
#> [1] "42ae699e08957e40b19ab3976419a232"
#> 
#> 
#> $time
#> $time$tz
#> [1] "UTC"
#> 
#> $time$tformat
#> [1] "%Y%m%d%H%M%S"
#> 
#> $time$usetz
#> [1] FALSE
#> 
#> $time$st_time
#> [1] "20230202222548"
#> 
#> 
#> $algo
#> [1] "spookyhash"
#>
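To make the structure of the output above more concrete, here is a rough sketch of the idea described at the top of this page: hash each first-level element of x with digest::digest() and record when the hashes were computed. The function stamp_sketch() is hypothetical and is not part of the package. The use of lapply() and the simplified time field are assumptions; the real stamp_get() may differ in both.

```r
library(digest)

# Hypothetical helper for illustration only; not the package's implementation.
stamp_sketch <- function(x, algo = "md5") {
  # Hash every first-level element of x (assumption: plain lapply over x).
  stamps <- lapply(x, digest::digest, algo = algo)
  list(
    stamps = stamps,
    # Simplified timestamp in the "%Y%m%d%H%M%S" format shown in the output above.
    time   = format(Sys.time(), "%Y%m%d%H%M%S", tz = "UTC"),
    algo   = algo
  )
}

res <- stamp_sketch(list(a = 1:3, b = "abc"))
str(res)
```

Hashing element by element, rather than hashing x as a whole, makes it possible to tell which element changed when two stamps are compared.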