Last night, one of my clients sent me what scientists refer to as "a metric fuckton" of data containing thousands of accounts from his debt collection business. Apparently he'd outsourced his skip tracing to some shady-ass place that gave him a bunch of wrong numbers, and now he had a ton of duplicates that needed cleaning up. No problem, I just pulled all the data into Excel, spent a few minutes formatting it so it would work, and went to town...and I noticed something funny.
Removing duplicates on "Address 1", "Address 2" and "Amount Owed" flagged around 1,200 duplicate entries in the table of ~10,000 accounts. Even for a shady skip trace company, that seemed a little high. Well, wouldn't you know it - when I added the "Name" fields to the check, that number dropped to ~500. In other words, roughly 700 of those entries share an address and an amount owed but carry different names.
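For anyone who'd rather run the same comparison outside Excel, here's a rough Python sketch of the idea. The sample rows are made up, `count_duplicates` is just a helper name I picked, and I trim and lowercase values before comparing, which is my own choice and may not match exactly what Excel does.

```python
def count_duplicates(rows, key_columns):
    """Count rows that repeat an earlier row on the given columns.

    The first row with a given key is kept; every later row with the
    same key counts as a duplicate.
    """
    seen = set()
    duplicates = 0
    for row in rows:
        # Normalization (strip + lowercase) is an assumption, not Excel's exact rule.
        key = tuple(row[col].strip().lower() for col in key_columns)
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)
    return duplicates


# Made-up sample data, only here to show the two counts diverging.
accounts = [
    {"Name": "Pat Doe", "Address 1": "12 Elm St", "Address 2": "Apt 3", "Amount Owed": "150.00"},
    {"Name": "Pat Doe", "Address 1": "12 Elm St", "Address 2": "Apt 3", "Amount Owed": "150.00"},
    {"Name": "Sam Roe", "Address 1": "12 Elm St", "Address 2": "Apt 3", "Amount Owed": "150.00"},
    {"Name": "Lee Poe", "Address 1": "9 Oak Ave", "Address 2": "", "Amount Owed": "75.50"},
]

address_cols = ["Address 1", "Address 2", "Amount Owed"]
without_name = count_duplicates(accounts, address_cols)
with_name = count_duplicates(accounts, ["Name"] + address_cols)
print(without_name, with_name)
```

The gap between the two counts is the number of rows that match on address and amount but not on name, which is the part I found interesting.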
Of course, this could just mean he's got a bunch of people listed at the wrong address - that is, something's screwed up in his database somewhere. But I thought the other implications were pretty interesting. Are people using fake names? Does it just mean that there's a positive correlation between bad checks and having lots of roommates? I didn't know where else to put this, so I thought it would go here, but it suddenly feels like the wrong forum. Can anyone confirm or deny?
