Announcing CanaryPW

Hello! If you’re reading this, then you’ve been made aware of the beta release of CanaryPW!

What is CanaryPW, you might ask? Well, it is a search engine designed to look through text that has been publicly posted on services like Pastebin. A series of tools scans the text, pulls out the interesting bits, and enters them into a database. The interesting bits for now include e-mail addresses, phone numbers, IP addresses, and websites.
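To give a rough idea of what "pulling out the interesting bits" can look like, here is a minimal sketch in Python. It is not the code CanaryPW runs. The regular expressions, the `PATTERNS` table, and the `extract_items` function are all assumptions made up for illustration, and the phone pattern only covers simple North American style numbers.

```python
import re

# Illustrative patterns only (assumptions); CanaryPW's real rules are not published.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    # Four dot-separated octets, each in the range 0-255.
    "ip": re.compile(
        r"\b(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)\b"
    ),
    "url": re.compile(r"https?://[^\s<>\"']+"),
    # Simple 3-3-4 digit numbers such as 555-123-4567 or (555) 123-4567.
    "phone": re.compile(r"(?<!\d)\(?\d{3}\)?[\s.-]\d{3}[\s.-]\d{4}(?!\d)"),
}


def extract_items(text):
    """Return a dict mapping each item type to a sorted list of unique matches."""
    return {kind: sorted(set(pattern.findall(text)))
            for kind, pattern in PATTERNS.items()}


sample = "Mail admin@example.com, call 555-123-4567, host 10.0.0.1, see http://example.com/login now"
for kind, items in extract_items(sample).items():
    print(kind, items)
```

Real pastes are much messier than this sample, so patterns like these would need tuning to keep false positives down.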

The idea behind this is to mirror the data posted on these sites so anyone can perform quick analysis. It also lets individuals and organisations determine whether any proprietary information has been inadvertently made available, so they can mitigate the exposure.

Overview

This is the default CanaryPW screen. It’s a basic search with nothing super fancy in it.

And here are the results. Again, nothing fancy here either.

And then we get to the interesting stuff. Here’s the text, and below it is all of the extracted data plus links to other posted texts that contain similar items. Basically, you can search for something via CanaryPW and then follow links to the related results to see what’s up. Eventually you’ll be able to sort results by which ones have interesting data and which ones are related to another entry.
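For the curious, one simple way to link texts that share items is an inverted index: map every extracted item to the texts it appears in, then look up which other texts share items with the one being viewed. The sketch below only illustrates that idea and is not how CanaryPW is actually built. The names `build_index` and `related`, and the paste ids in the sample data, are hypothetical.

```python
from collections import defaultdict


def build_index(pastes):
    """Map each extracted item to the set of paste ids that contain it."""
    index = defaultdict(set)
    for paste_id, items in pastes.items():
        for item in items:
            index[item].add(paste_id)
    return index


def related(paste_id, pastes, index):
    """Return ids of other pastes sharing at least one item, most shared items first."""
    counts = defaultdict(int)
    for item in pastes[paste_id]:
        for other in index[item]:
            if other != paste_id:
                counts[other] += 1
    # Sort by number of shared items (descending), then by id for a stable order.
    return sorted(counts, key=lambda pid: (-counts[pid], pid))


# Hypothetical sample data: paste id -> items already extracted from that paste.
pastes = {
    "a": {"1.2.3.4", "x@example.com"},
    "b": {"1.2.3.4"},
    "c": {"1.2.3.4", "x@example.com"},
    "d": {"unrelated.example.org"},
}
index = build_index(pastes)
print(related("a", pastes, index))
```

Counting shared items also gives a crude ranking, which is one possible basis for sorting by relatedness later on.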

This is an idea I borrowed from DuckDuckGo: ‘bangs’ have been incorporated into the search. It’s as simple as typing “!ip 127.0.0.1” in the box to search for all texts containing that string. Other ‘bangs’ include !http, !host, !email, and !phone. More will be added as I further fine-tune the text analysis.
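Parsing a bang is not complicated. Here is a minimal sketch of the idea in Python, not the site's actual code. The bang names are the ones listed above, but the `BANGS` table, the field names it maps to, and the `parse_query` function are assumptions for illustration. An unknown bang simply falls back to a plain search.

```python
# Bang names come from the post; the field names they map to are hypothetical.
BANGS = {
    "ip": "ip",
    "http": "url",
    "host": "hostname",
    "email": "email",
    "phone": "phone",
}


def parse_query(query):
    """Split a search string into (field, term); field is None for a plain search."""
    query = query.strip()
    if query.startswith("!"):
        bang, _, term = query[1:].partition(" ")
        field = BANGS.get(bang.lower())
        term = term.strip()
        if field is not None and term:
            return field, term
    # No bang, an unknown bang, or a bang with nothing after it: plain search.
    return None, query


print(parse_query("!ip 127.0.0.1"))
print(parse_query("leaked passwords"))
```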

History

I’ve been playing around with an idea for a service for some time, and my friend David suggested that I build some sort of search engine for stuff posted on Pastebin, Pastie, and elsewhere. Sites like these have been used for all sorts of reasons, including posting leaked credentials from video game services, message boards, and news sites. The idea here is to provide quick and easy access to this information so that anyone affected can react quickly to mitigate the problem.

The project itself started in late March and a test version of the site was launched in early June. And here we are today with a formal launch of the service with much, much more to come.

Closing

I would like to thank those who have assisted so far with the project. I will be releasing more details as time progresses. There are some bugs that still need to be worked on, namely on the database sorting side of things and in rendering the site in Internet Explorer.

If you have any questions or comments, feel free to respond below or send an e-mail to support@canary.pw.

[edit]

One person said I didn’t link to the page! It’s on the right but it’s also available here:

https://canary.pw

  1. This sounds like a great tool! Could you provide a link to it somewhere in this post?

    • Jeremy McRedacted
    • July 18th, 2013

    quick question, what sort of backend do you use for doc storage and indexing? Lucene? Solr? Lucy? also, have you thought about opening an api? web scraping is easy, but it’s harder on the server that’s being scraped than rest/json/etc.

    cool work, thanks!

    • I don’t want to elaborate on the specifics of the back-end at this time but there is a plan to provide a JSON-based API in the future.
