A quick and dirty way to extract undeliverable email addresses from a maildir-style mail server

Anyone who writes any app that does anything with email inevitably has to do something to stem the tide of undeliverables. Sending mail continually to dead mailboxes is a sure way to reduce the deliverability of the legit email you do want to send.

First, funnel the undeliverables into their own folder with a message filter in Postbox or Mozilla Thunderbird. The screenshot below seems to be a relatively complete set of subject lines you can go by, since there doesn't seem to be one particular string that every mail server uses. Create this rule on the mailbox that is your FROM address. At Posterous, we use help@posterous.com as a catch-all -- yes, it's more work to look at it, but it makes it that much easier to talk to your customers and fix any problems that arise. High-touch service goes a long way for a web startup.
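For reference, the kind of subject matching such a rule performs can be sketched in Ruby. The patterns here are assumptions based on subject lines that common mail servers are known to send, not the list from the screenshot, and bounce_subject? is a hypothetical helper name.

```ruby
# Hypothetical helper: guesses whether a subject line looks like a bounce.
# The patterns are assumptions drawn from common mail server defaults;
# extend them with whatever actually shows up in your own inbox.
BOUNCE_SUBJECTS = [
  /undelivered mail returned to sender/i,  # Postfix
  /mail delivery failed/i,                 # Exim
  /failure notice/i,                       # qmail
  /delivery status notification/i,         # Gmail
  /undeliverable/i,                        # Exchange / Outlook
  /returned mail/i                         # sendmail
]

def bounce_subject?(subject)
  BOUNCE_SUBJECTS.any? { |pattern| subject =~ pattern }
end

puts bounce_subject?("Undelivered Mail Returned to Sender")
puts bounce_subject?("Re: question about my account")
```

A list like this will never be exhaustive, which is why it's worth eyeballing the catch-all box now and then for bounces the rule missed.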

Then run the script below and you'll get back a clean, newline-delimited list of email addresses that you can feed to your application to invalidate them. Set mail_dir to the cur directory of the folder holding the bounces on your mail server, assuming it uses maildir-formatted mail storage. It should run fine on basically any Linux box with Ruby and egrep. =)




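#!/usr/bin/ruby

# set this to the cur directory of your bounce folder on linux
mail_dir = '/home/vmail/help@posterous.com/.Undelivered/cur'

# Bounces typically quote the dead address as "<user@host.tld>:".
# -h hides filenames; -C 1 keeps one line of context around each hit.
# The backslash is doubled (\\.) because the backtick string would otherwise
# eat it before egrep ever sees it.
output = `egrep -i -h -C 1 '<[A-Z0-9._%-]+@[A-Z0-9.-]+\\.[A-Z]{2,4}>:' #{mail_dir}/*`
emails = []
puts output  # you can double check that everything's kosher here...
output.split(/\n/).each do |l|
  emails.push($1) if l.match(/\b([A-Z0-9._%-]+@[A-Z0-9.-]+\.[A-Z]{2,4})\b/i)
end
emails = emails.uniq  # bounces for the same address pile up, so count each once

puts "------------------------------"
puts emails.join("\n")

puts "------------------------------"
puts "#{emails.size} emails undeliverable"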
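
To see what the parsing step does without touching a mail server, the same address pattern can be applied to a plain string. This is a minimal sketch, assuming a Postfix-style bounce body; extract_addresses and the sample text are made up for illustration.

```ruby
# Same address pattern the script uses, applied to a string instead of
# egrep output. extract_addresses is a hypothetical name.
ADDRESS = /\b([A-Z0-9._%-]+@[A-Z0-9.-]+\.[A-Z]{2,4})\b/i

def extract_addresses(text)
  # scan returns one [capture] array per match; flatten and de-duplicate
  text.scan(ADDRESS).flatten.uniq
end

# Made-up excerpt in the style of a Postfix bounce
sample = <<~BOUNCE
  <nobody@example.com>: host mx.example.com said: 550 5.1.1 User unknown
  <gone@example.org>: host mx.example.org said: 550 mailbox unavailable
  <nobody@example.com>: host mx.example.com said: 550 5.1.1 User unknown
BOUNCE

puts extract_addresses(sample)
```

Unlike the script, scan picks up every address on a line rather than just the first, so it may return a few more addresses on real bounce text. Note also that the {2,4} limit skips addresses whose top-level domain is longer than four letters.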

