delphi3000.com - the free delphi knowledge platform
delphi3000.com - the free delphi knowledge platform
500 Users Online NOW
Have a look at your member-status

connecting people's knowledge


  - Recent ArticlesRSS feed for Recent Articles on delphi3000.com
  - List of All Articles
  - Top Viewed Articles
  - Articles (+Attachem.)
  - Articles Of Interest
  - Categories
  - Top Uploader
  - Search
  - Index

  - My Home
  - Submit an Article
  - My Articles
  - My Personal Data
  - My Bookmarks
  - Activities
  - Login/Logout

  - Sign Up
  - Why Sign Up
  - Newsletter

  - Press
  - Advertise

  - Contact
  - Feedback





Community
Borland
ClubeDelphi
Dr. Bob
UK-BUG
Delphi Meetings
Planeta Delphi



Loremo - the 1.5 liter car coming in 2009




Startblatt.de






Share this article with friendsShare this article with friends
Rate this articleRate this article - to keep the quality of delphi3000.com !
Comment this article or read through previous comments (2)


Fuzzy Matching StringsFormat this article printer-friendly!Bookmark function is only available for registered users!
Product:
Delphi all versions
Category:
Algorithm
Skill Level:
Scoring:
Last Update:
09/30/2002
Search Keys:
delphi delphi3000 article borland vcl code-snippet Fuzzy Strings Match Index
Times Scored:
4
Visits:
2508
Uploader: duncan parsons
Company: DSP
Reference: N/A
 
Question/Problem/Abstract:
How to get an idea of how closely 2 strings match
Answer:



unit FuzzyMatch;

{This unit provides a basic 'fuzzy match' index on how alike two strings are
     The result is of type 'single': near 0 - poor match
                                     near 1 - close match
     The intention is that HowAlike(s1,s2)=HowAlike(s2,s1)
     The Function is not case sensitive}

interface

uses sysutils;

function HowAlike(s1,s2:string):single;

implementation

function instr(start:integer;ToSearch,ToFind:string):integer;
begin
     //This is a quick implementation of the VB InStr, since Pos just doesn't do what is needed!!
     //NB - case sensitive!!
     if start>1 then Delete(ToSearch,1,start-1);
     result:=pos(ToFind,ToSearch);
     if (result>0) and (start>1) then inc(result,start);
end;

function HowAlike(s1,s2:string):single;
var l1,l2,pass,position,size,foundpos,maxscore:integer;
    score,scored,string1pos,string2pos,bestmatchpos:single;
    swapstring,searchblock:string;
begin
     s1:=Uppercase(trim(s1));
     s2:=Uppercase(trim(s2));

     score:=0;
     maxscore:=0;
     scored:=0;

     //deal with zero length strings...
     if (s1='') and (s2='') then
        begin
             result:=1;
             exit;
        end
       else
        if (s1='') or (s2='') then
           begin
                result:=0;
                exit;
           end;

     //why perform any mathematics is the result is clear?
     if s1=s2 then
        begin
             result:=1;
             exit;
        end;

     //make two passes,
     //     with s1 and s2 each way round to ensure
     //     consistent results
     for pass:=1 to 2 do
         begin
              l1:=length(s1);
              l2:=length(s2);
              for size:=l1 downto 1 do
                  begin
                       for position:=1 to (l1-size+1) do
                           begin
                                //try to find implied block in the other string
                                //Big blocks score much better than small blocks
                                searchblock:=copy(s1,position,size);
                                foundpos:=pos(searchblock,s2);

                                if size=l1 then
                                   string1pos:=0.5
                                  else
                                   string1pos:=(position-1)/(l1-size);

                                if foundpos>0 then
                                   begin
                                        //the string is in somewhere in there
                                        //    - find the 'closest' one.
                                        bestmatchpos:=-100; //won't find anything that far away!

                                        repeat
                                              if size=l2 then
                                                 string2pos:=0.5
                                                else
                                                 string2pos:=(foundpos-1)/(l2-size);

                                              //If this closer than the previous best?
                                              if abs(string2pos-string1pos) < abs(bestmatchpos-string1pos) then
                                                 bestmatchpos:=string2pos;

                                              foundpos:=instr(foundpos+1,s2,searchblock);
                                        until foundpos=0; //loop while foundpos>0..

                                        //The closest position is now known: Score it!
                                        //Score as follows: (1-distance of best match)
                                        score:=score+(1-abs(string1pos-bestmatchpos));
                                   end;

                                //Keep track if the maximum possible score
                                //BE CAREFUL IF CHANGING THIS FUNCTION!!!

                                //maxscore:=maxscore+1;
                                inc(maxscore);
                           end; //for position..
                  end; //for size..

              if pass=1 then
                 begin
                      //swap the strings around
                      swapstring:=s1;
                      s1:=s2;
                      s2:=swapstring;
                 end;

              //Each pass is weighted equally

              scored:=scored+(0.5*(score/maxscore));
              score:=0;
              maxscore:=0;
         end; //for pass..

     //HowAlike=score/maxscore
     result:=scored;
end;

end.





Please rate this article!
Skill level:
BeginnerExpert

Useful:
No!Very!

Overall rating:
PoorExcellent



Comments to this article
Write a new comment
Transcription error?
    Mikael Horseman (Sep 29 2002 6:25AM)

Very interesting - I'm keen to incorporate a number of alternative pattern matching routines in my current program, and reach a "conclusion" about the similarity of two strings by means of a "voting" system (compare the results of, perhaps, 3 routines and reach a decision based on majority outcome).
However I'm having some problems with this unit at the lines:

"...

  //If this closer than the previous best?
if abs(string2pos - string1pos)                                                
bestmatchpos := string2pos;

..."

The comparison seems to have gone a bit wrong, and I'm not sure what the missing element should be. ( > ?).
Respond

RE: Transcription error?
duncan parsons (Sep 30 2002 12:26AM)

You're right!

And to be honest, I thought I had corrected it, it should be:

                                              //If this closer than the previous best?
                                              if abs(string2pos-string1pos)                                                 bestmatchpos:=string2pos;

I will alter it in the text too.

Hope your system goes well, sounds very interesting, do mail m a copy when you have finished!

Regards
Duncan Parsons
Respond














 
Sign up to consume product discounts for Bronze memberships !

read more


  Visit our Sponsor

 

  Community Ad of
M. Shkolnik
 
   














 







     
  Copyright © 2000 - 2007 delphi3000.com - All rights reserved. Terms of use. || Privacy
delphi3000.com is a service by bluestep.com IT-Services GmbH (Vienna)