This function returns the closest fuzzy match for each string in a character vector, drawn from a pool of potential matches. The distance between each string and each candidate in the pool is calculated using a specified method, and the candidate with the shortest distance to each string is returned.
Usage
get_fuzzy_match(
old,
new,
method = c("osa", "lv", "dl", "lcs", "qgram", "cosine", "jaccard", "jw"),
nthread = parallel::detectCores() - 1
)
Arguments
- old
(character) vector of strings to fuzzy-match
- new
(character) vector of strings to use as possible matches
- method
method used to calculate distances between strings (see Details)
- nthread
number of parallel threads (default: all detected cores minus one)
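The default for nthread is computed as parallel::detectCores() - 1. As a hedged illustration only (not part of this function), the sketch below shows one way a caller might compute a safe thread count to pass explicitly, since parallel::detectCores() can return NA on some platforms and returns 1 on a single-core machine. The variable name safe_nthread is hypothetical.

```r
# Hypothetical helper value: never fewer than one thread, even if
# detectCores() returns NA or 1.
cores <- parallel::detectCores()
safe_nthread <- max(1L, cores - 1L, na.rm = TRUE)

safe_nthread
```

The resulting value could then be supplied as the nthread argument.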
Details
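This function uses stringdist::stringdistmatrix() to calculate distances
between each requested string in old and the candidates in new. The method
is one of the following: osa (optimal string alignment, the default), lv
(Levenshtein), dl (full Damerau-Levenshtein), lcs (longest common
substring), qgram (q-gram), cosine (cosine distance between q-gram
profiles), jaccard (Jaccard distance between q-gram profiles), or jw (Jaro
or Jaro-Winkler). See the stringdist::stringdistmatrix() documentation for
the definition of each method.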
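The approach described above can be sketched directly with stringdist. This is a minimal illustration of the nearest-candidate idea, not the actual body of get_fuzzy_match(): the example vectors and the variable names d, best, and matches are assumptions, and the real function may handle ties, missing values, or its return format differently.

```r
library(stringdist)

# Illustrative inputs (assumed for this sketch)
old <- c("appel", "banan", "chery")
new <- c("apple", "banana", "cherry", "grape")

# Distance matrix: one row per string in `old`, one column per candidate in `new`
d <- stringdist::stringdistmatrix(old, new, method = "osa", nthread = 1)

# For each row, the column index of the smallest distance
# (which.min() returns the first index when several candidates tie)
best <- apply(d, 1, which.min)

# Closest candidate for each string in `old`
matches <- new[best]
matches
```

Because which.min() keeps only the first minimum, the order of new decides which candidate wins when two are equally distant in this sketch.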