Supported Algorithms

OpenCR supports a number of matching algorithms. Each can be run either by the OpenCR Service on its own or by ElasticSearch with the analysis-phonetic plugin, except where the table below says otherwise.

| Algorithm | OpenCR Service | ElasticSearch |
|---|---|---|
| Exact | Yes | Yes |
| Metaphone | Yes | Yes |
| Double-metaphone | Yes | Yes |
| Levenshtein | Yes | Yes |
| Damerau-Levenshtein | Yes | Yes |
| Jaro-Winkler | Yes | No |
| Soundex | Yes | Yes |
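As a rough illustration of the ElasticSearch side, the sketch below builds index settings that use the `phonetic` token filter from the analysis-phonetic plugin. The filter type, the `encoder` values, and the `replace` option come from the plugin itself. The filter name `name_phonetic`, the analyzer name `name_phonetic_analyzer`, and the helper function are hypothetical, and this is not necessarily how OpenCR configures its own index.

```python
import json

# Encoder names understood by the analysis-phonetic "phonetic" token filter
# that correspond to the phonetic rows of the table above.
PHONETIC_ENCODERS = {
    "Metaphone": "metaphone",
    "Double-metaphone": "double_metaphone",
    "Soundex": "soundex",
}


def phonetic_index_settings(encoder: str) -> dict:
    """Build index settings with one phonetic filter and one analyzer.

    The filter and analyzer names are illustrative assumptions only.
    """
    if encoder not in PHONETIC_ENCODERS.values():
        raise ValueError(f"unsupported encoder: {encoder}")
    return {
        "settings": {
            "analysis": {
                "filter": {
                    "name_phonetic": {
                        "type": "phonetic",
                        "encoder": encoder,
                        # Keep the original token next to its phonetic code.
                        "replace": False,
                    }
                },
                "analyzer": {
                    "name_phonetic_analyzer": {
                        "tokenizer": "standard",
                        "filter": ["lowercase", "name_phonetic"],
                    }
                },
            }
        }
    }


if __name__ == "__main__":
    # Print the JSON body that could be sent when creating an index.
    print(json.dumps(phonetic_index_settings("double_metaphone"), indent=2))
```

The resulting dictionary could be sent as the body of an index-creation request. A field mapped to the analyzer would then match names that share a phonetic code.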

For more advanced string similarity matching, the similarity-scoring plugin for ElasticSearch provides additional algorithms. It is based on the open-source java-string-similarity library (https://github.com/tdebatty/java-string-similarity).

The plugin accepts the matchers listed below. For more information, see the similarity-scoring repository.

| Matcher Parameter for Query | Algorithm | Type | Normalized? |
|---|---|---|---|
| cosine-similarity | Cosine | similarity | yes |
| dice-similarity | Sorensen-Dice | similarity | yes |
| jaccard-similarity | Jaccard | similarity | yes |
| jaro-winkler-similarity | Jaro-Winkler | similarity | yes |
| normalized-lcs-similarity | Normalized Longest Common Subsequence | similarity | yes |
| normalized-levenshtein-similarity | Normalized Levenshtein | similarity | yes |
| cosine-distance | Cosine | distance | yes |
| damerau-levenshtein | Damerau-Levenshtein | distance | no |
| dice-distance | Sorensen-Dice | distance | yes |
| jaccard-distance | Jaccard | distance | yes |
| jaro-winkler-distance | Jaro-Winkler | distance | yes |
| levenshtein | Levenshtein | distance | no |
| longest-common-subsequence | Longest Common Subsequence | distance | no |
| metric-lcs | Metric Longest Common Subsequence | distance | yes |
| ngram | N-Gram | distance | yes |
| normalized-lcs-distance | Normalized Longest Common Subsequence | distance | yes |
| normalized-levenshtein-distance | Normalized Levenshtein | distance | yes |
| optimal-string-alignment | Optimal String Alignment | distance | no |
| qgram | Q-Gram | distance | no |
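To make the "Normalized?" column concrete, the sketch below contrasts plain Levenshtein distance (an unbounded edit count) with a normalized distance and similarity that both fall in the range 0 to 1. It assumes the common definition in which the edit count is divided by the length of the longer string. That is my understanding of how the underlying library defines it, but the plugin's exact behaviour should be checked against its repository. The function names are illustrative and are not part of the plugin.

```python
def levenshtein(a: str, b: str) -> int:
    """Count single-character insertions, deletions and substitutions."""
    if a == b:
        return 0
    if not a:
        return len(b)
    if not b:
        return len(a)
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def normalized_levenshtein_distance(a: str, b: str) -> float:
    """Edit count divided by the longer length (assumed definition)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    return levenshtein(a, b) / longest


def normalized_levenshtein_similarity(a: str, b: str) -> float:
    """Complement of the normalized distance; 1.0 means identical."""
    return 1.0 - normalized_levenshtein_distance(a, b)


if __name__ == "__main__":
    print(levenshtein("Jon", "John"))
    print(normalized_levenshtein_distance("Jon", "John"))
    print(normalized_levenshtein_similarity("Jon", "John"))
```

A practical consequence is that a threshold for a non-normalized matcher such as `levenshtein` is an edit count, while a threshold for a normalized matcher is a fraction that does not depend on string length.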

Last update: October 23, 2020