TheFuzz v0.5.0 TheFuzz.Similarity.Tversky

This module contains functions to calculate the Tversky index between two strings.

Summary

Functions

compare/2

Calculates the Tversky index between two strings using the defaults: an n-gram size of 1, an alpha of 1, and a beta of 1.

compare/3

Calculates the Tversky index between two strings using the options passed as a map.

Functions

compare/2

Calculates the Tversky index between two strings using the defaults: an n-gram size of 1, an alpha of 1, and a beta of 1.

With these defaults the result is equivalent to the Tanimoto coefficient.
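For reference, the standard definition of the Tversky index is given here, with X the n-grams of the first (prototype) string and Y the n-grams of the second (variant) string. Counting repeated n-grams with multiplicity is an assumption inferred from the results documented on this page, not something the page states.

```latex
S(X, Y) = \frac{|X \cap Y|}{|X \cap Y| + \alpha\,|X \setminus Y| + \beta\,|Y \setminus X|}

% With \alpha = \beta = 1 the denominator is |X \cup Y|, which gives the Tanimoto coefficient:
S(X, Y) = \frac{|X \cap Y|}{|X \cup Y|}

% Worked example for "contact" vs "context" with n = 1, counting repeats:
% shared: c, o, n, t, t (5); only in "contact": a, c (2); only in "context": e, x (2)
S = \frac{5}{5 + 1 \cdot 2 + 1 \cdot 2} = \frac{5}{9} \approx 0.5556
```

Under that assumption the worked value agrees with the first example below.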

Examples

iex> TheFuzz.Similarity.Tversky.compare("contact", "context")
0.5555555555555556
iex> TheFuzz.Similarity.Tversky.compare("ht", "hththt")
0.3333333333333333

compare/3

Calculates the Tversky index between two strings using the options passed as a map.

Options

  • n_gram_size: positive integer; the size of the n-grams used to tokenize the strings
  • alpha: weight of the prototype sequence (the first string)
  • beta: weight of the variant sequence (the second string)

Note: Any option that is not specified in the options map is set to its default value of 1.

Examples

iex> TheFuzz.Similarity.Tversky.compare("contact", "context", %{n_gram_size: 4, alpha: 2, beta: 0.8})
0.10638297872340426
iex> TheFuzz.Similarity.Tversky.compare("contact", "context", %{n_gram_size: 2, alpha: 0.5, beta: 0.5})
0.5
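The following is a minimal sketch of how the three options could feed into the calculation. It is not TheFuzz's implementation: the module `TverskySketch` and its functions `ngrams/2` and `index/3` are hypothetical names introduced for illustration, and counting repeated n-grams with multiplicity is an assumption inferred from the documented results. The sketch does not handle empty strings or strings shorter than the n-gram size.

```elixir
defmodule TverskySketch do
  # Hypothetical module for illustration only; not part of TheFuzz.

  # Splits a string into overlapping n-grams,
  # e.g. "abc" with n = 2 gives ["ab", "bc"].
  def ngrams(string, n) do
    string
    |> String.graphemes()
    |> Enum.chunk_every(n, 1, :discard)
    |> Enum.map(&Enum.join/1)
  end

  # Any option missing from the map falls back to 1.
  def index(a, b, opts \\ %{}) do
    n = Map.get(opts, :n_gram_size, 1)
    alpha = Map.get(opts, :alpha, 1)
    beta = Map.get(opts, :beta, 1)

    x = ngrams(a, n)
    y = ngrams(b, n)

    # `--` removes one matching occurrence per element, so repeated
    # n-grams are counted with multiplicity (an assumption, see above).
    only_x = x -- y
    only_y = y -- x
    common = length(x) - length(only_x)

    # alpha weights n-grams found only in the first string,
    # beta weights n-grams found only in the second.
    common / (common + alpha * length(only_x) + beta * length(only_y))
  end
end
```

Under these assumptions, `TverskySketch.index("contact", "context")` evaluates to 5 / 9, and passing `%{n_gram_size: 2, alpha: 0.5, beta: 0.5}` gives 0.5 (three shared bigrams, three unique to each string), in line with the examples above.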