A couple of days ago malc0de released a Perl script that searches Pastebin and alerts the user in near real time. The script wasn't beautiful, but it got the job done, so being the Python fan I am, I ported it and threw it up on GitHub. At the time I didn't blog about it because I wanted to clean the code up a bit, but now I think the current version is good enough to post about.
Pastebin and related sites have gained quite a bit of attention from those looking to dump the results of their nefarious activities. Lenny put together a nice blog post covering all of this, so I will leave it to him to inform you. The basic idea behind pastycake is to be a lightweight solution for finding keywords within pastes and getting that data as quickly as possible.
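To make the keyword idea concrete, here is a minimal sketch of matching a list of keywords against the body of a paste. This is not the actual pastycake code; the function name `find_keywords` and the sample text are made up for illustration, and it assumes a simple case-insensitive substring match is good enough.

```python
import re


def find_keywords(paste_text, keywords):
    """Return the keywords that appear in paste_text (case-insensitive).

    Hypothetical helper for illustration only; keywords are treated as
    literal strings, not regular expressions.
    """
    hits = []
    for kw in keywords:
        # re.escape keeps characters like '.' from acting as wildcards.
        if re.search(re.escape(kw), paste_text, re.IGNORECASE):
            hits.append(kw)
    return hits


sample = "Dumped creds for admin@example.com, Password: hunter2"
print(find_keywords(sample, ["password", "ssn", "example.com"]))
```

Matches come back in the same order as the keyword list, which makes it easy to report which terms triggered an alert.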
But Solutions Already Exist
That's true, and some of them are great, but none of them do exactly what I want, nor are they written in a language I prefer. Ultimately, I would like to see a tool (mine or someone else's) that can plug in new sites as they come online, support a number of backends, allow retroactive searching and support a number of output methods.
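One way that kind of plug-in design could look is sketched below. None of these class or function names come from pastycake itself; `Source`, `Backend`, `StaticSource`, `MemoryBackend` and `run_once` are hypothetical, and the "site" here is just an in-memory list so the example runs without touching the network.

```python
class Source:
    """Hypothetical base class: one subclass per paste site."""

    def new_pastes(self):
        """Yield (url, text) tuples for pastes to check."""
        raise NotImplementedError


class Backend:
    """Hypothetical base class: one subclass per storage or output method."""

    def save(self, url, keywords):
        raise NotImplementedError


class StaticSource(Source):
    """Stand-in for a real site; serves pastes from a list."""

    def __init__(self, pastes):
        self._pastes = list(pastes)

    def new_pastes(self):
        for item in self._pastes:
            yield item


class MemoryBackend(Backend):
    """Stand-in for a real database; keeps matches in a list."""

    def __init__(self):
        self.rows = []

    def save(self, url, keywords):
        self.rows.append((url, tuple(keywords)))


def run_once(sources, backends, keywords):
    """Check every paste from every source and hand matches to every backend."""
    for source in sources:
        for url, text in source.new_pastes():
            lowered = text.lower()
            matched = [kw for kw in keywords if kw.lower() in lowered]
            if matched:
                for backend in backends:
                    backend.save(url, matched)


source = StaticSource([
    ("http://example.invalid/a", "leaked PASSWORD list"),
    ("http://example.invalid/b", "just a recipe"),
])
backend = MemoryBackend()
run_once([source], [backend], ["password"])
print(backend.rows)
```

With this shape, supporting a new site or a new database would mean writing one more subclass rather than touching the main loop.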
Show Me the Code
If you want to participate in this project, fork it over at GitHub. I am not certain who coh is, but they have really spruced up my original code and made it modular. As pull requests come in, I will comment on them and merge them. Until then I plan on working on the features in between my major projects.
Putting it to Work
Right now PastyCake can track results in a text file or a SQLite database. I will soon be adding MongoDB and MySQL into the mix as well. For now I prefer the SQLite database, as I can easily move the project around from place to place.
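For a rough idea of what tracking results in SQLite can look like, here is a small sketch using Python's built-in `sqlite3` module. The table layout and the helper names `open_db` and `record_match` are my own assumptions for illustration and are not necessarily what PastyCake uses internally.

```python
import sqlite3


def open_db(path):
    """Open (or create) the database and make sure the table exists.

    The schema here is an assumption for illustration only.
    """
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS matches ("
        " url TEXT PRIMARY KEY,"
        " keywords TEXT NOT NULL,"
        " found_at TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    return conn


def record_match(conn, url, keywords):
    """Store a match; return True if the URL had not been seen before."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO matches (url, keywords) VALUES (?, ?)",
        (url, ",".join(keywords)),
    )
    conn.commit()
    # rowcount is 1 for a new row and 0 when the URL was already stored.
    return cur.rowcount == 1


conn = open_db(":memory:")  # use a file path such as "urls.db" to persist
print(record_match(conn, "http://example.invalid/a", ["password"]))
print(record_match(conn, "http://example.invalid/a", ["password"]))
```

Using the URL as the primary key means a paste that matches twice is only stored (and alerted on) once, and the whole thing lives in a single file that is easy to copy between machines.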
To run PastyCake in harvest mode and send alerts to an email account:
python gather.py -k kwords -o urls.db -a "email@example.com" harvest
Successful matches will be stored in the database, emailed to your account and sent to stdout: