Daitch–Mokotoff Soundex

From Wikipedia, the free encyclopedia
Revision as of 19:30, 11 November 2008

Daitch-Mokotoff Soundex (D-M Soundex) is a phonetic algorithm invented in 1985 by genealogist Gary Mokotoff, and later improved by Randy Daitch, both of the Jewish Genealogical Society. It is a refinement of the Russell and American Soundex algorithms designed to allow matching of Slavic and Yiddish surnames with similar pronunciation but differences in spelling.

Daitch-Mokotoff Soundex is sometimes referred to as "Jewish Soundex" or "Eastern European Soundex", although the authors discourage the use of these nicknames for the algorithm.

Improvements

Improvements over the older Soundex algorithms include:

  • Coded names are six digits long, resulting in greater search precision (traditional Soundex uses four characters)
  • Coded names can be stored as numeric values, which can save space in some applications (regular Soundex encodes values as alphanumeric text)
  • Several rules in the algorithm encode multi-character n-grams as single digits (American and Russell Soundex do not handle multi-character n-grams)
  • Multiple possible encodings can be returned for a single name (traditional Soundex returns only one encoding, even if the spelling of a name could potentially have multiple pronunciations)
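
Two of the points above, numeric storage and multiple encodings per name, can be illustrated with a minimal Python sketch. The helper names pack_code, unpack_code, and codes_match are hypothetical, and the codes are copied from the Examples table rather than computed. Treating two names as a match when they share at least one code is an assumption made for illustration, not a rule stated in this article.

```python
# Illustrative helpers only; these names are not part of any D-M Soundex library.

def pack_code(code: str) -> int:
    """Store a six-digit D-M code as an integer."""
    if len(code) != 6 or not (code.isascii() and code.isdigit()):
        raise ValueError(f"expected six ASCII digits, got {code!r}")
    return int(code)


def unpack_code(value: int) -> str:
    """Restore the six-digit text form, re-adding any leading zeros."""
    return f"{value:06d}"


def codes_match(codes_a, codes_b) -> bool:
    """Assumed matching rule: two names match if they share any code."""
    return not set(codes_a).isdisjoint(codes_b)


# Codes copied from the Examples table (not computed here).
AUERBACH = ["097500", "097400"]
UHRBACH = ["097500", "097400"]
PETERS = ["739400", "734000"]
PETERSON = ["739460", "734600"]

if __name__ == "__main__":
    print(pack_code("097500"), unpack_code(pack_code("097500")))
    print(codes_match(AUERBACH, UHRBACH))
    print(codes_match(PETERS, PETERSON))
```

Because a code such as 097500 begins with a zero, the integer form drops that digit, so the value has to be zero-padded back to six digits when it is read.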

Examples

Some examples:

Surname          American Soundex  D-M Soundex
Peters           P362              739400, 734000
Peterson         P362              739460, 734600
Moskowitz        M232              645740
Moskovitz        M213              645740
Auerbach         A612              097500, 097400
Uhrbach          U612              097500, 097400
Jackson          J250              154600, 454600, 145460, 445460
Jackson-Jackson  J252              154664, 454664, 145466, 445466, 154646, 454646, 145464, 445464
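
The American Soundex column can be reproduced with a short Python sketch of the commonly described American Soundex rules. This is not the D-M algorithm, whose rule table is not given in this article. The function name american_soundex is hypothetical, and the sketch assumes the variant in which H and W do not separate consonants with the same code, while vowels do.

```python
# Letter groups of American Soundex; vowels, H, W and Y carry no digit.
_GROUPS = {
    "1": "BFPV",
    "2": "CGJKQSXZ",
    "3": "DT",
    "4": "L",
    "5": "MN",
    "6": "R",
}
_CODES = {ch: digit for digit, letters in _GROUPS.items() for ch in letters}


def american_soundex(name: str) -> str:
    """Return the four-character American Soundex code (a sketch)."""
    # Keep ASCII letters only, so hyphens and spaces are ignored.
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    result = letters[0]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not break a run of same-coded consonants
        code = _CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code is kept
    return (result + "000")[:4]  # pad with zeros, then truncate to four


if __name__ == "__main__":
    for surname in ["Peters", "Moskowitz", "Moskovitz", "Jackson-Jackson"]:
        print(surname, american_soundex(surname))
```

Moskowitz and Moskovitz receive different American Soundex codes (M232 and M213) because W carries no digit while V is coded 1, whereas the table shows a single shared D-M code for both spellings.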

Beider-Morse Phonetic Name Matching Algorithm

To reduce the large number of false positives generated by D-M Soundex, Stephen Morse and Alexander Beider created the Beider-Morse Phonetic Name Matching algorithm.[1] The algorithm cuts down on false positives at the expense of some false negatives. A number of sites offer B-M Soundex in addition to D-M Soundex.[2]

See also

  • Where Once We Walked

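Notes

  1. http://stevemorse.org/phoneticinfo.htm
  2. Nu? What's New? Volume 9, Number 22. http://www.avotaynu.com/nu/V09N22.htm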