BrT

Musings about technology and other neat stuff


Eliminating duplicate contacts in Evolution (HowTo)

July 9th, 2006 · 27 Comments

This had been bothering me for a while. Mostly because of artifacts from synchronizing with my PalmOS device, a large number of duplicates had accumulated in my Evolution address book. Today I put together this short script to eliminate them. It’s crude, but it sort of works: it keeps the first record it finds for each full name (the vCard FN field) and deletes the rest.

Standard disclaimer applies: use at your own risk, backup your data, floss your teeth, etc. etc.

Because WordPress tends to mangle quotes and similar characters, you may want to download the script from here instead of copying and pasting the code below:
evo_eliminate_duplicate_contacts.pl

#!/usr/bin/perl -w

use DB_File;

$addrdb=$ARGV[0]
    || "$ENV{HOME}/.evolution/addressbook/local/system/addressbook.db";
print "* Examining $addrdb\n";

tie %h, 'DB_File', $addrdb, O_RDWR, 0777, $DB_HASH
  or die "Error opening file: $!\n";

# Keep track of names we've seen
%names=();

for $k (keys %h) {
    $card=$h{$k};
    if ($card =~ /^FN:(.*)$/m) {
        $name=$1;
        $name=~s/\r//g;
        chomp $name;
        if (exists($names{$name})) {
            print "* Previously found $name, removing\n";
            delete $h{$k};
        }
        $names{$name}++;
    }
}

print "* Done. Duplicate statistics:\n";
print join("\n", map { "$_: $names{$_} times" }
           sort { $names{$b} <=> $names{$a} }
           grep { $names{$_} > 1 } keys %names
          )."\n";
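
Because the script deletes a record as soon as it sees a repeated name, it can help to preview the duplicate groups first. The following is a minimal sketch and not part of the original script: group_by_name is a hypothetical helper that takes any hash of key => vCard text (the tied %h above would do, ideally opened with O_RDONLY instead of O_RDWR) and returns the keys grouped by FN value without modifying anything. The sample records are invented.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): group record keys by
# the vCard FN (full name) field, using the same regex as the script above.
# Takes a reference to a hash of key => vCard text; changes nothing.
sub group_by_name {
    my ($cards) = @_;
    my %groups;
    for my $key (sort keys %$cards) {
        next unless $cards->{$key} =~ /^FN:(.*)$/m;
        my $name = $1;
        $name =~ s/\r//g;    # strip the CR left over from CRLF line endings
        push @{ $groups{$name} }, $key;
    }
    return \%groups;
}

# Invented sample data standing in for the tied database hash.
my %sample = (
    'pas-id-1' => "BEGIN:VCARD\r\nFN:Ada Lovelace\r\nEND:VCARD\r\n",
    'pas-id-2' => "BEGIN:VCARD\r\nFN:Ada Lovelace\r\nTEL:555-0100\r\nEND:VCARD\r\n",
    'pas-id-3' => "BEGIN:VCARD\r\nFN:Alan Turing\r\nEND:VCARD\r\n",
);

# Report only the names that occur more than once.
my $groups = group_by_name(\%sample);
for my $name (sort keys %$groups) {
    my @keys = @{ $groups->{$name} };
    next if @keys < 2;
    print "$name: ", scalar(@keys), " records (@keys)\n";
}
```

Nothing is written to the database here, so the report can be checked by hand before running the real script.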


27 responses so far ↓

  • 1  adam lark // Jul 29, 2006 at 1:54 pm

    Sorry, I don’t understand how to use this. I saved the file as “evolution_clean_script” in my home directory and ran the following:

    adam@adams-desktop:~$ sudo perl evolution_clean_script

    I get the following output
    * Examining /home/adam/.evolution/addressbook/local/system/addressbook.db
    Error opening file: No such file or directory

    but I can see the file?

    adam@adams-desktop:~$ ls /home/adam/.evolution/addressbook/local/system/
    addressbook.db
    addressbook.db.summary
    beagle-cufGXc9lGkWLA5Ap1WyKpw.changes.db
    pilot-map-1000.xml
    pilot-sync-evolution-addressbook-1000.changes.db

  • 2  Hitchhiker // Jul 29, 2006 at 9:09 pm

    adam – I’m not sure what could be causing the problem. But you do not need to run it using sudo, since the file that it modifies belongs to you, so you have permissions on it already. Could you please try without sudo, and see if the problem persists?

  • 3  Larry // Sep 19, 2006 at 5:16 am

    I have the same problem, whether I run sudo perl evolution_clean_script or perl evolution_clean_script:
    * Examining /home/drfox/.evolution/addressbook/local/system/addressbook.db
    Error opening file: No such file or directory

    Can you offer any other suggestions?

    Larry

  • 4  Hitchhiker // Sep 19, 2006 at 10:33 am

    Larry – your addressbook.db must be in a different location; you should find it and pass it as an argument to the script. Which version of Evolution are you using? $HOME/.evolution is only used by recent versions; older ones used $HOME/evolution (without the dot).

  • 5  Hitchhiker // Sep 19, 2006 at 10:36 am

    adam, Larry – I just noticed a stupid mistake in the script: it was not using the $addrdb variable, no matter how it was defined. This was the cause of your problems – please try the new version.

  • 6  Larry // Sep 19, 2006 at 10:34 pm

    This worked great! Thanks, Hitchhiker!

  • 7  Andy // Dec 8, 2006 at 1:01 am

    Thanks a lot for this – found it through a search.
    Evolution had gone crazy and created 100 copies of several of my contacts!
    Anyway, this program cleared up all the mess – brilliant!

  • 8  jan // Dec 24, 2006 at 2:02 pm

    Thanks a lot!
    Never even knew Perl could read/write the evolution db, this is great!

  • 9  Alex // Apr 4, 2007 at 7:31 am

    I tried this script and got: “Unrecognized character \xE2 at ./evo line 33.” I don’t know what it is. I don’t know any programming language, but I have a lot of duplicates in my Evolution contacts.

  • 10  Hitchhiker // Apr 4, 2007 at 9:56 am

    Alex: the script has only 32 lines. Maybe you inserted some weird character when copying/pasting into the script?

  • 11  Alex // Apr 4, 2007 at 6:12 pm

    Sorry maybe I am stupid. I just copied and pasted.

    Line 33: ).ā€nā€;

    Unrecognized character \xE2 at ./evo line 33.

  • 12  Alex // Apr 4, 2007 at 11:46 pm

    OK, it’s working now. The problem is your HTML code: ).”n”; for line 33. I copied it from the source, changed &#8221; into regular quotes, and it works now. Thanks for the script.
    Alex.

  • 13  Hitchhiker // Apr 5, 2007 at 12:20 am

    Alex – I think WordPress was messing with the quotes. I have added a link to the script for download – I’m glad you got it working.

  • 14  Alex // Apr 5, 2007 at 12:35 am

    Did you try to contact Evolution people? Maybe they will want to include your script into their next release? Your script is really great. It removed 578 duplicates from my contacts. I don’t know why duplicates appeared. Maybe Evolution is buggy?

  • 15  Bruno // Jun 25, 2007 at 5:20 pm

    Worked great for me. Removed a whole load of duplicates after synchronizing my Palm.

    Thanks.

    Bruno

  • 16  Matthew Stevens // Nov 8, 2007 at 3:26 am

    Any such script out there for calendar entries? TIA

  • 17  Hitchhiker // Nov 9, 2007 at 12:43 am

    Matthew: I just looked, and the calendars are not stored in database files (which would have made adapting the script really easy) but as ical files (.ics), so a completely different technique would have to be used to remove duplicates. Unfortunately I am no longer using Evolution myself, and don’t have the time to look into it.

  • 18  Ch0c0bn // Mar 5, 2008 at 8:46 am

    My address book had gone wild after using SyncEvolution with ScheduleWorld.

    Your script worked really fine for me. thank you.

  • 19  Ismael Olea // Mar 7, 2008 at 2:04 pm

    Your script is very nice, but it removes not only duplicates but also complementary contacts for the same person in different roles.

    Anyhow, a cleaning tool for Evo would be really useful.

  • 20  Addressed Out // Apr 16, 2008 at 8:25 pm

    Thanks a lot – your script worked well for me.

    I just had a question –
    your script eliminates duplicates, but does it merge duplicate contacts? (E.g. if I’d added a work telephone No. to one entry and a Mob. No. to the other, does this info get saved?) Or does the script just delete the second version? Many thanks.

  • 21  Hitchhiker // Apr 16, 2008 at 9:06 pm

    Everyone: thanks for all the comments! I am happy that the script has been useful for some people.

    @Addressed Out & Ismael: it’s a very dumb script, it simply keeps the first record for each person, and it does the checking based on the person’s name. So it will not merge duplicate records, and it will blindly remove everything beyond the first one it finds for each person. If the first record it finds is empty, that’s the one it’ll keep. Furthermore, there’s no way to know which record it will find first, given the nature of the database file in which the records are stored.

    The script could be improved to do merging, or at least to check which record has more information before deleting them. Unfortunately I am not using Evolution any more, so that will have to be a task for someone else :-)
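
The improvement suggested above, checking which record has more information before deleting, could be sketched roughly as follows. This is a hypothetical illustration, not the author's code: card_score and keys_to_delete are made-up names, the scoring rule (count the property lines that carry a non-empty value) is an assumption, and the sample records are invented.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical scoring rule (an assumption, not the author's): count the
# vCard property lines that carry a non-empty value, ignoring the
# BEGIN/END/VERSION wrapper lines.
sub card_score {
    my ($card) = @_;
    my $score = 0;
    for my $line (split /\r?\n/, $card) {
        next if $line =~ /^(?:BEGIN|END|VERSION):/i;
        # A value counts if it holds anything besides whitespace and ';'
        $score++ if $line =~ /^[^:]+:.*[^\s;]/;
    }
    return $score;
}

# Given key => vCard text, return the keys that would be deleted: for each
# FN value, every record except the highest-scoring one. On a tie, the key
# that sorts first is kept.
sub keys_to_delete {
    my ($cards) = @_;
    my (%best, @doomed);
    for my $key (sort keys %$cards) {
        next unless $cards->{$key} =~ /^FN:(.*)$/m;
        (my $name = $1) =~ s/\r//g;
        my $score = card_score($cards->{$key});
        if (!exists $best{$name}) {
            $best{$name} = [$key, $score];
        } elsif ($score > $best{$name}[1]) {
            push @doomed, $best{$name}[0];    # previous best loses
            $best{$name} = [$key, $score];
        } else {
            push @doomed, $key;
        }
    }
    my @sorted = sort @doomed;
    return @sorted;
}

# Invented sample data: id-1 has an empty TEL, id-2 has a real one.
my %sample = (
    'id-1' => "BEGIN:VCARD\r\nFN:Ada Lovelace\r\nTEL:\r\nEND:VCARD\r\n",
    'id-2' => "BEGIN:VCARD\r\nFN:Ada Lovelace\r\nTEL:555-0100\r\nEND:VCARD\r\n",
    'id-3' => "BEGIN:VCARD\r\nFN:Alan Turing\r\nEND:VCARD\r\n",
);
print "would delete: $_\n" for keys_to_delete(\%sample);
```

With the tied %h from the original script, the same list could drive the deletion (delete $h{$_} for keys_to_delete(\%h);). It still does not merge records, and contacts that share a name but represent different roles would still be collapsed into one.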

  • 22  Edmond Tong // Aug 21, 2008 at 5:22 pm

    Wow you’ve saved me hours with that script!! Thanks a bunch!

  • 23  Mike // Sep 28, 2008 at 3:31 am

    There is an easier way! Create a new address book. Name it whatever you want. Take the two known* duplicate entries and drag them there. It will give you the duplicate merge dialog. Merge them. Drag them back. Done!

    *this being the tricky part…

  • 24  Shane Rice // Feb 16, 2009 at 2:03 am

    Works great for me! I had a bunch of duplicates after syncing with Palm. This eliminated them. Thanks!!! 2000+ contacts down to 1000+ contacts!

  • 25  Rohan Kapoor // Aug 24, 2009 at 6:39 am

    Thanks a lot for this! You saved me hours of work when SynCE messed up and I ended up with 8 copies of each of 400 contacts!

  • 26  Brion Kidder // Aug 24, 2009 at 9:31 pm

    When I run the script from Terminal I get the error message “command not found”

    evo_eliminate_duplicate_contacts.pl

  • 27  Brion Kidder // Aug 24, 2009 at 9:35 pm

    OIC. You have to add the command perl in front of the script. Did not know that.

    brion@brion-laptop:~$ perl evo_eliminate_duplicate_contacts.pl


Comments for this post will be closed on 24 August 2010.