Instructions
- Submit a single full path web page address (absolute URL). Don't forget to include its http or https scheme.
- If a server blocks a crawl, visit the target URL and copy/paste its front-end visible text or its back-end source code into our tool textarea. Then submit the form. For instance, try this technique by visiting http://app.leg.wa.gov/memberemail/ or another legislature members contact page, and then copy/paste and submit its visible text. You can also extract emails in this way from offline files residing on your machine.
- To access the source code of an email document in Outlook, right-click it and navigate to properties > Details > Message Source. Once accessed, paste it in our tool textarea and submit it. You can do the same with the visible text of the email document instead of its source code.
- The tool supports plain text and source code, up to the first 100,000 characters. These can come from different file formats like htm, html, asp, aspx, php, txt, js, css, mso, etc.
- To export the data, click the results textarea, then copy/paste its contents as you usually would.
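For offline files, the same kind of extraction can be approximated with a short script. The following is only a minimal sketch in Python, not the tool's actual implementation: the pattern `EMAIL_RE` and the function `extract_emails` are hypothetical names, the regular expression is deliberately simple, and the 100,000-character cutoff simply mirrors the limit stated above.

```python
import re

# Assumption: a deliberately simple pattern, not a full RFC 5322 validator.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")

MAX_CHARS = 100_000  # mirrors the character limit described in the instructions


def extract_emails(text, max_chars=MAX_CHARS):
    """Return unique addresses from the first max_chars characters,
    lowercased and in order of first appearance."""
    seen = set()
    found = []
    for match in EMAIL_RE.finditer(text[:max_chars]):
        address = match.group(0).lower()
        if address not in seen:
            seen.add(address)
            found.append(address)
    return found


# Works on visible text and on source code alike.
sample = '<a href="mailto:Jane.Doe@example.org">Jane</a>, bob@example.com, jane.doe@example.org'
print(extract_emails(sample))  # ['jane.doe@example.org', 'bob@example.com']
```

Lowercasing before deduplication treats `Jane.Doe@example.org` and `jane.doe@example.org` as one address; drop that step if you need the original casing.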
Who can use it?
- Data miners, teachers, students, or anyone interested in extracting email addresses from online or offline documents.
Suggested Exercises
- To start, visit a public mailing list archive like the one at the Robotstxt.org site which, among others, lists two email-rich URLs. Crawl the first one with our tool. You should be able to grab several hundred email addresses. Since the second one is a huge mailing list, it may freeze your browser. In that case, you may want to save it to your local machine, break it into several files, and crawl them one at a time.
- Another alternative is to visit public contact pages like
- your legislature members contact web page.
- a health department directory web page.
From there you can then narrow down your selections and crawl a desired URL with our tool.
- You can also submit the URL or source code of a web page that you know is more likely to list contact information; for example, a university or company staff directory page, a conference speakers/vendors page, or a search engine results page.
- Submit the query [ mailto: ] to a search engine. Then run the source code of the search results page through our tool. Do this with several search engines and compare results.
- Repeat the previous exercise, this time with a query of the form [ @xyz.com ], where xyz is hotmail, gmail, yahoo, or msn. Repeat the analysis with a query of the form [ mailto:*@xyz.com ]. Notice the wildcard asterisk (*). Compare results.
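The first exercise suggests breaking a huge mailing list into several files. One possible way to do that is sketched below in Python. The function name `split_into_chunks` is hypothetical, and the default chunk size is an assumption chosen to match the tool's 100,000-character limit. The sketch breaks at line ends where it can, so that addresses are less likely to be cut in half.

```python
def split_into_chunks(text, max_chars=100_000):
    """Split text into pieces of at most max_chars characters,
    breaking at line ends where possible."""
    chunks = []
    current = []  # lines collected for the chunk being built
    size = 0      # total characters in current
    for line in text.splitlines(keepends=True):
        # A single line longer than the limit is cut into fixed-size slices.
        # Caveat: such a cut could fall in the middle of an address.
        while len(line) > max_chars:
            if current:
                chunks.append("".join(current))
                current, size = [], 0
            chunks.append(line[:max_chars])
            line = line[max_chars:]
        # Start a new chunk when this line would overflow the current one.
        if current and size + len(line) > max_chars:
            chunks.append("".join(current))
            current, size = [], 0
        current.append(line)
        size += len(line)
    if current:
        chunks.append("".join(current))
    return chunks


# Small demonstration with an artificially low limit.
demo = "a@x.org\n" * 10          # ten 8-character lines
pieces = split_into_chunks(demo, max_chars=20)
print(len(pieces))               # 5
```

Each piece can then be saved to its own file, or pasted into the tool textarea, and processed one at a time.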
Feedback
Contact us for any suggestion or question regarding this tool.