TheFuzz.Similarity.Tversky (TheFuzz v0.6.0)

This module contains functions to calculate the Tversky index between two given strings.

Summary

Functions

compare(a, b)

Calculates the Tversky index between two given strings with a default n-gram size of 1, alpha of 1, and beta of 1.

compare(a, b, arg3)

Calculates the Tversky index between two given strings with the specified options passed as a map.

Functions

compare(a, b)

Calculates the Tversky index between two given strings with a default n-gram size of 1, alpha of 1, and beta of 1.

With alpha and beta both set to 1, the index is equivalent to the Tanimoto coefficient.
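For reference, the standard definition of the Tversky index and its reduction to the Tanimoto coefficient can be written as below. Here A and B stand for the collections of n-grams of the first and second string. Treating them as multisets (repeated n-grams counted separately) and attaching alpha to the first string are assumptions inferred from the example outputs on this page, not something the module documentation states.

```latex
S(A, B) = \frac{|A \cap B|}{|A \cap B| + \alpha\,|A \setminus B| + \beta\,|B \setminus A|}

% With \alpha = \beta = 1 the denominator is |A \cup B|, which gives the Tanimoto coefficient:
S(A, B)\big|_{\alpha = \beta = 1} = \frac{|A \cap B|}{|A \cup B|}

% Worked case: "contact" vs "context" with n-gram size 1.
% Shared: c, o, n, t, t (5); only in "contact": a, c (2); only in "context": e, x (2).
S = \frac{5}{5 + 1 \cdot 2 + 1 \cdot 2} = \frac{5}{9} \approx 0.5556
```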

Examples

iex> TheFuzz.Similarity.Tversky.compare("contact", "context")
0.5555555555555556
iex> TheFuzz.Similarity.Tversky.compare("ht", "hththt")
0.3333333333333333

compare(a, b, arg3)

Calculates the Tversky index between two given strings with the specified options passed as a map.

Options

  • n_gram_size: positive integer; the size of the n-grams into which the strings are tokenized
  • alpha: weight of the prototype sequence
  • beta: weight of the variant sequence

Note: Any option that is not specified in the options map is set to the default value of 1.

Examples

iex> TheFuzz.Similarity.Tversky.compare("contact", "context", %{n_gram_size: 4, alpha: 2, beta: 0.8})
0.10638297872340426
iex> TheFuzz.Similarity.Tversky.compare("contact", "context", %{n_gram_size: 2, alpha: 0.5, beta: 0.5})
0.5
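
To make the effect of the options concrete, here is a minimal sketch of how such a calculation could be written by hand. TverskySketch and its private ngrams/2 helper are hypothetical names used only for illustration; this is not TheFuzz's implementation. The multiset counting of n-grams and the pairing of alpha with the first string are assumptions chosen to be consistent with the examples above. Edge cases such as empty strings or strings shorter than n_gram_size are not handled.

```elixir
defmodule TverskySketch do
  # Hypothetical illustration only; not part of TheFuzz.

  @defaults %{n_gram_size: 1, alpha: 1, beta: 1}

  def compare(a, b, opts \\ %{}) do
    # Any option missing from opts falls back to its default of 1.
    %{n_gram_size: n, alpha: alpha, beta: beta} = Map.merge(@defaults, opts)

    a_grams = ngrams(a, n)
    b_grams = ngrams(b, n)

    # List subtraction removes one occurrence per match, so repeated
    # n-grams are counted as in a multiset.
    only_a = a_grams -- b_grams
    only_b = b_grams -- a_grams
    shared = length(a_grams) - length(only_a)

    # Assumption: alpha weights n-grams found only in the first string,
    # beta weights n-grams found only in the second.
    shared / (shared + alpha * length(only_a) + beta * length(only_b))
  end

  # Splits a string into overlapping n-grams of graphemes,
  # e.g. ngrams("text", 2) gives ["te", "ex", "xt"].
  defp ngrams(string, n) do
    string
    |> String.graphemes()
    |> Enum.chunk_every(n, 1, :discard)
    |> Enum.map(&Enum.join/1)
  end
end
```

Under these assumptions, TverskySketch.compare("contact", "context") evaluates to 5/9, and TverskySketch.compare("contact", "context", %{n_gram_size: 2, alpha: 0.5, beta: 0.5}) evaluates to 0.5, in line with the examples above.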