May 26th, 5:15pm 2 comments

This Document Requires a Password...

In the past few days I have come across a couple of malware samples used in targeted attacks that asked me for a password. This tactic is generally delivered through a spearphishing message where the document is attached and the password is contained within the body. For those handling the incident with the full attack chain, this is no problem, but what do you do when you don't have it?

Screen_shot_2012-05-26_at_4

There are a couple of freeware PDF password crackers, but I have found them to be flaky at best. These attackers aren't exactly creating PDF documents that fit the standard specification. In the cases I have seen, there is data all over the place and the free crackers can't look past it. As for Office documents, most "freeware" I have seen is more likely to carry malicious code than it is to get anything done.

I am not one to push a product, but I have found Elcomsoft to be extremely valuable in the cases of password protected files. For a bundle of their password recovery products you are looking at about $1200 at least depending on the edition you choose, but compared to most products, I would call that cheap.

I took the documents that had me stumped and threw them over to their respective client for cracking. The Word document cracked in less than a minute with a simple 4 letter password, no digits and all lowercase. Even with such a weak password, I still would have never guessed it given I had no context. 
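To put that password in perspective, four lowercase letters give a keyspace of only 26^4 = 456,976 candidates. The sketch below is purely illustrative and is not how the Elcomsoft tools work; `check` is a hypothetical callback standing in for whatever routine tests a candidate password against the document.

```python
import itertools
import string

def candidates(length=4, alphabet=string.ascii_lowercase):
    """Yield every possible password of the given length over the alphabet."""
    for combo in itertools.product(alphabet, repeat=length):
        yield "".join(combo)

def brute_force(check, length=4):
    """Return the first candidate accepted by check(), or None.

    check is a hypothetical callback that tries a password against the document.
    """
    for guess in candidates(length):
        if check(guess):
            return guess
    return None

# Size of the search space for four lowercase letters.
keyspace = len(string.ascii_lowercase) ** 4
```

Even at a few thousand guesses per second that space is exhausted in minutes, which fits the sub-minute recovery time.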

Screen_shot_2012-05-26_at_4

As for the PDF document, I was fortunately able to guess the password due to its simple nature, but even so, I put it into the recovery program and got an answer back in a few seconds. 

Screen_shot_2012-05-26_at_5

The tactic of putting a password on a document may seem dumb or trivial, but it ends up being a real pain when you are left with nothing more than a malicious file and you are expected to somehow analyze it. Having not used the Elcomsoft tool for that long, I can't say it is perfect, but at the end of the day no solution is. The software got the job done and saved me a lot of time. I entered the passwords for each document and was happy to see a crash right before my machine was exploited. :)

Posted by Brandon Dixon
May 26th, 1:35pm 0 comments

Quick Update on ~I32SUN.EXE

After my initial excitement died down, I sat down and took a look at the ~I32SUN.exe file and was disappointed to find it looked just like CMD.exe. Hoping for something modified or different, I threw both files into BinDiff, only to see a 100% match on all functions. The only difference I could find between the two files was the version information, and even that was slight.

So what was the deal with it landing on the system? I have yet to figure out the decoding on the RAT itself, but my suspicions are that the attackers requested CMD.exe through their web interface. In doing so, this spawned their process to run on my system and it was reversed back over the HTTP tunnel. I am hoping it may be possible to see any commands executed or data exfiltrated in decoding the content. More to come. 

Filed under malware research updates
Posted by Brandon Dixon
May 22nd, 6:53pm 3 comments

Observing the Enemy : CVE-2012-0754 PDF Interactions

Earlier today I was tipped off that CVE-2012-0754 had made its way into a PDF document and got ahold of a sample to reverse. This sample was obtained from the public PDF X-RAY repository by searching for "MyComputer". Below I will quickly outline my analysis of the document and then jump over to some of the cooler aspects. 

The document itself consisted of two versions yet both appeared to be the same exploit code. What caught me right away was the metadata contained within the document:

Screen_shot_2012-05-22_at_6
If you track targeted documents, you will notice "MyComputer" as an interesting author as it has been used in other attacks. We will come back to this in a moment, but first the actual make-up of the malicious PDF file. 
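A minimal sketch of how one might check a sample for that author value, assuming the Info dictionary is stored uncompressed and the value is a plain literal string. It will miss hex-encoded strings, escaped parentheses, object streams and XMP metadata, and it is not how PDF X-RAY performs its search; the function names are made up for illustration.

```python
import re

# Matches a literal-string /Author entry such as: /Author (MyComputer)
AUTHOR_RE = re.compile(rb"/Author\s*\(([^)]*)\)")

def find_authors(pdf_bytes):
    """Return every literal /Author value found in the raw PDF bytes."""
    return [m.decode("latin-1") for m in AUTHOR_RE.findall(pdf_bytes)]

def has_author(pdf_bytes, name="MyComputer"):
    """True if any /Author entry equals the given name."""
    return name in find_authors(pdf_bytes)
```

A document saved in several versions can carry more than one Info dictionary, which is why the helper returns a list rather than a single value.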

Objects 29-35 appear to define the SWF action used for the staging in the exploit. Both the SWF file and MP4 file have the name of "nero". Object 35 itself contains the SWF file (3a901db9dbcc2c6abfc916be7880400e) to make everything work. 

The SWF file itself keeps with the nero trend and labels the class as NeroShow(). Buried in the SWF is a declaration of ShowMP4 where the nero.mp4 file is referenced. It is unclear how connections are established to the C2, but it appears this could be handled within flash.net::NetConnection. 

Screen_shot_2012-05-22_at_3
Object 36-39 defines the MP4 file information. The MP4 is located within object 39 and appears to be very small. I am not familiar with the actual corruption bug, but I suspect this is a bare-bones trigger for the exploit to work.

Screen_shot_2012-05-22_at_6

As I mentioned earlier, the metadata from this PDF file was something seen before in targeted attacks. I ran a search in PDF X-RAY and ended up with three hits on the documents not including the one I was analyzing:

The first two documents have been linked with their write-ups or links on contagio. Both of those documents though used an older exploit. Specifically, CVE-2010-3654, another flash exploit. The "5aea..." document however looked almost identical to the one I had researched. Searching my database records, I saw that the document was uploaded back on April 24th, 2012. I try and stay on top of the PDF submissions, but in this case I failed and it was a big fail. 

Figuring I had missed some good intelligence, I thought it would still be worthwhile to run the document as I had not seen it elsewhere or heard of it. I prepped my generic VM image, updated to Adobe 9.4.0 and ran the document. 

Screen_shot_2012-05-22_at_6
The dropped document appeared to be some sort of write-up and not that interesting. It also didn't reveal a clear target. Spawned after the crash was an "explorer.exe" process. I have found that many targeted attacks do a bit more after they have run for a while, so I let this one run for a few minutes. 

Screen_shot_2012-05-22_at_6

During the run I managed to see a new connection and spawned child process from "explorer.exe". This time the process was "~ISUN32.EXE" and after a minute, "ipconfig.exe" ran from that process. I can't be certain, but it appears that I had a connection from our attacker friends and they were running commands on the system. I continued to let the processes run until I felt enough time had passed and our friends were gone. 

The explorer process first resolves to IP addresses at Microsoft (likely a connection check) and then makes a connection to report02.proxydns01.ddns.us (64.71.163.90). 

Screen_shot_2012-05-22_at_6

During my run, a large amount of information was exchanged between my infected host and the attackers' systems. 

Screen_shot_2012-05-22_at_6

It is presently unclear what traffic and information were passed back and forth, but I am hoping the binary analysis will lead to a clear picture. Googling around for the malware brings up references to Symantec's "Barkiofork" family of malware. I managed to find a report here and here that looks close to the malware that was dropped from the PDF. 

What I found more interesting however was the interaction from the remote C2 server. It appears that this C2 didn't gain as much attention as some of the others and had remained active online. The quick interaction could have been scripted, but after many attempts from many different IP spaces, I was never able to reproduce the activity from the first run (refreshed VMs every time). 

In the next few days I will write another blog on the more interesting child process, ~ISUN32.EXE describing its functionality and purpose. 

Posted by Brandon Dixon
April 29th, 10:07pm 8 comments

Data Mining + Malware = Improved Analysis

Over the past few weeks I have been talking with different analysts, programmers and RE folks about the future of malware analysis and how we combat changes in attacks. Ripping apart binaries and developing signatures based on TTPs doesn’t scale (it goes without saying that signatures do a great job, but this knowledge is gained from knowing something about an attack) as more and more new threats emerge, so we need to start thinking about something new.

In the next few postings I will take some time to focus on data mining. This field of study can be applied to any discipline, but you often come up short when Googling it coupled with malware. My background is not so much in the math world, so I will steer away from the inner workings of the algorithms, but will highlight the strengths and weaknesses identified. To avoid overkill, I will also leave it to you, the reader, to figure out the data mining process. 

Summary

Using thousands of malicious, known good, and targeted documents, a classifier was trained and tested using multiple algorithms with a high success rate. Each PDF from the dataset would be transformed from its native format to a flat vector of unweighted features that would then be fed to a learner. After training the learner, several tests were run using known techniques to evaluate how successful the classifier was. 

The top two algorithms (decision tree and k-nearest neighbor) were implemented within a new method (classify) that is part of the PDF X-RAY API. Users can now get classification results back for each of the respective algorithms by simply submitting their questionable file. 

Dataset Files

For any good classifier to work well, one must have a sizable selection of data that can be used to train it. For the PDF example, the following datasets were used:

  • Malicious - 15K
  • Non-malicious - 6K
  • Targeted - 320

Just based on the supplied sample-set, it is easy to see that targeted may have issues later on just because there is such an underwhelming amount of data. 

PDF to Vector

Having worked with the PDF format for quite some time now, I felt this would be a good example to start with. To train the classifier, one must first transform the data from the native PDF format to a vector of features. Features are typically extracted using various techniques, but in this case, I relied on my experience with the documents to pick the features myself. 

In total, I ended up picking 35 features to represent what a PDF “looked like”. My first test runs included 23 features, but I later found that by adding more, I was able to get more stable results. These 35 features include items like known named dictionaries used in malicious attacks, filters, structural attributes such as object counts, size, etc. 
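As a rough sketch of the transformation, the snippet below counts a handful of PDF names in the raw bytes and appends two structural attributes. The feature list is hypothetical (the actual 35 features are not listed here), and scanning raw bytes ignores anything hidden inside compressed streams.

```python
import re

# Hypothetical feature names; the actual 35 features are not reproduced here.
NAME_FEATURES = [b"/JavaScript", b"/JS", b"/OpenAction", b"/AA",
                 b"/EmbeddedFile", b"/RichMedia", b"/FlateDecode",
                 b"/ASCIIHexDecode"]

def count_name(data, name):
    """Count occurrences of a PDF name that are not a prefix of a longer name."""
    return len(re.findall(re.escape(name) + rb"(?![A-Za-z0-9])", data))

def pdf_to_vector(pdf_bytes):
    """Turn raw PDF bytes into a flat list of unweighted numeric features."""
    vector = [count_name(pdf_bytes, name) for name in NAME_FEATURES]
    # Structural attributes: number of "N G obj" headers and total size in bytes.
    vector.append(len(re.findall(rb"\d+\s+\d+\s+obj\b", pdf_bytes)))
    vector.append(len(pdf_bytes))
    return vector
```

Because every document yields a list of the same length and order, the vectors can be handed to any learner without further conversion.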

Screen_shot_2012-04-29_at_7

Not all features defined need to be used, but having them all means the process doesn’t need to re-run. Testing of the feature relevance can actually be done using Python code. 

Screen_shot_2012-04-29_at_8

The above shows each feature ranked based on relevance in respect to the overall dataset. On the left are the original 35 features and to the right are the 24 features that meet our defined threshold. Features can be adjusted on the fly so that we never really need to edit the dataset used in learning. This ability also provides us with a way to compare feature groupings per algorithm. 
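The ranking measure is not named above, so the following sketch assumes information gain over discrete feature values as one common choice; the threshold value and function names are likewise illustrative, not the ones used for the results shown.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(values, labels):
    """Entropy reduction from splitting the labels by a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def select_features(columns, labels, threshold=0.1):
    """Rank features by information gain and keep those at or above threshold.

    columns maps a feature name to its list of discrete values.
    """
    scores = {name: information_gain(vals, labels)
              for name, vals in columns.items()}
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [(name, score) for name, score in ranked if score >= threshold]
```

Since selection only filters column names, the underlying dataset stays untouched and different feature groupings can be tried per algorithm.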

Algorithms for Learning

With the data formatted, the algorithms for the classifier must be chosen and compared against each other. For my classifier I picked Bayesian, decision tree, KNN and SVM. Each of these has different uses and varying results, but because the data is represented in a standard vector, all of them can easily be tested and applied.

Once each learner has consumed the data, testing is done using a 10-fold cross-validation run. This method splits the data into K equal subsets; in each fold, one subset is held out for testing while the remaining K-1 subsets are used for training, so every subset is tested exactly once. Through each fold of this process, results can be shown for each classifier with the averaged results displayed at the end. 
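The cross-validation loop can be sketched in plain Python as follows. In each fold one subset is held out for testing and the rest are used for training. `train_fn` and `predict_fn` are hypothetical callbacks standing in for any learner, and the sketch assumes the data was shuffled beforehand and that there are at least k samples.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size

def cross_validate(samples, labels, train_fn, predict_fn, k=10):
    """Return the mean accuracy across k folds.

    train_fn(X, y) -> model and predict_fn(model, x) -> label are
    hypothetical callbacks; any learner with that shape will do.
    """
    accuracies = []
    for train_idx, test_idx in k_fold_indices(len(samples), k):
        model = train_fn([samples[i] for i in train_idx],
                         [labels[i] for i in train_idx])
        correct = sum(predict_fn(model, samples[i]) == labels[i]
                      for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / len(accuracies)
```

With classes as unbalanced as the dataset described earlier, a stratified split that keeps the class ratio in every fold would likely give steadier numbers for the targeted class.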

Screen_shot_2012-04-29_at_9

For the PDF data it appears the most useful algorithm to apply would be the KNN (K nearest neighbor). Like any of the algorithms though, this choice may not prove to be the best until more testing is done. It should be noted that as more features were introduced, bayesian improved while SVM got worse. It is currently unclear why this occurred. 
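For readers unfamiliar with KNN, here is a bare-bones version using Euclidean distance and a majority vote. It is only a sketch of the general technique, not the classifier behind the PDF X-RAY API, and the choice of k=3 is arbitrary.

```python
import math
from collections import Counter

def knn_classify(train_vectors, train_labels, query, k=3):
    """Label query by majority vote among its k nearest training vectors."""
    distances = sorted(
        (math.dist(vec, query), label)
        for vec, label in zip(train_vectors, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]
```

One practical caveat: with raw, unweighted features, large-valued attributes such as file size dominate the distance, so scaling the features first is usually worthwhile.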

Running Tests 

Once I trained the classifier I was able to run it against several downloaded and local samples to test its accuracy. I was pleasantly surprised with how good a job the tool did, but of course was more intrigued by the documents that were misclassified. 

Screen_shot_2012-04-29_at_8
In most cases the problem dataset appeared to be targeted vs. malicious/non-malicious documents. This was to be expected though given that a lot of targeted documents are nothing more than a good document with malicious code injected into them. It is possible that this problem could be solved with more fine-grained feature selection to account for the subtle differences between a targeted document and the others. It is also possible that weights could be introduced into certain features to better control the end decision. 

Working Implementation

Unlike most of my other tools, I am not quite ready to release all the code I have put together for this project, but wanted to provide the public with a way to take advantage of it. Earlier today I added a new API call to PDF X-RAY that lets users submit files to the server and get back a classification response. 

Screen_shot_2012-04-29_at_9

To submit to the API, users can use the same code present on the API page with a slight tweak to the API. 

Screen_shot_2012-04-29_at_8

Conclusions

More testing needs to be done, but so far introducing the concept of data mining into malware analysis appears to be a great direction. Now that the classifier has been built, this significantly reduces the amount of time spent looking at files to determine whether or not they are malicious. Furthermore, with a bit of tuning, this tool could soon aid in the detection of targeted attacks. In some of the future postings, other techniques like clustering will be demonstrated to begin grouping similar files of different types.

As a final thought and note - is this sort of solution perfect? No, but I have yet to see any tool in this field that's perfect. Anti-virus companies make millions off their products that only work sometimes whereas these concepts and methods are free and based off solid math foundations. I suspect this sort of technology will find its way into malware analysis more and more as time progresses. 

Posted by Brandon Dixon
April 23rd, 8:29pm 0 comments

Still Alive

Blog post coming shortly. Been busy with fun classes, programming and new opportunities. 

Posted by Brandon Dixon
March 27th, 12:56pm 0 comments

Building Chrome Extension Skimmers

Maybe I am trendy, but every time I am working on something new, along comes a company making a post about it. It spreads all over and leaves me wondering if my million dollar idea is now just a meaningless stream of ASCII. It’s times like these that I find it best just to dump the code and let others use it.

There has been talk about malicious chrome extensions lately as if they ever went away. A couple years ago I took the liberty of creating a Wachovia chrome extension that allowed you to view your bank account balance without ever leaving your page, except it didn’t do that at all.

Wachovia-extension

What was great about this extension was that the “evil” code was exposed with no obfuscation and yet it survived for many months with many installs and many reviews. Did I steal the credentials? Of course not, but I made it clear to the users that they could have been stolen. Yet despite all this, I received emails asking for help, asking why it didn’t work, or reporting that their password combinations seemed to fail. 

I thought I would play in the new Google store again to see if anything changed. My goal here was to take some of the newer doomsday encoding techniques and work them into the extensions. I also wanted to test some of the basic workflow details of existing extensions. 

Encoding the Guts

The Facebook stealing extension that recently hit the press appeared to do its dirty work in the clear. In other words, anyone reviewing these extensions should have wondered what was up. Stepping over that detail for a moment, I find it better to reduce the likelihood of getting caught. 

Screen_shot_2012-03-27_at_12

The above image shows our payload, which isn’t much help without decoding it. How do you decode it though? Well, the key to this JavaScript lies in the site you are visiting. If the site is the one we want to inject into, then it decodes itself and injects; otherwise it just evals garbage. 

Magic Hooks

With the encoder built, we can feed it any JavaScript, but what to feed and how? If we are an extension that has permissions to trigger on every page then it is safe to say we have control over all of the DOM. Keeping in mind the payload only fires when the site is correct, we really only have control of the target DOM. 

Screen_shot_2012-03-27_at_12

Not to bash on Wachovia (now Wells Fargo), but their site is simple HTML and outlines the login clear as day. Using the jQuery (for ease of use) above, it is easy to bind a hook to the login function, extract the user fields and fire off an AJAX request POST to a third-party site. 

Framing the Extension

Extensions are easy to make requiring a manifest, a couple icons and some JavaScript/HTML code. The core of any useful extension comes down to the extension type and URLs it can access. 

Extensions come in three flavors:

  • content scripts - run in the background and fire based on a URL matcher
  • popups - always there, but require a user click to function
  • both - does both

Google allows you to use wildcards when specifying your content script matcher, but it would not happily take “*” or any combination without a URL. I knew other extensions could live on every page, so I downloaded the Google Dictionary extension and discovered this little gem tucked away in the manifest - “\u003Call_urls\u003E”, which is the JSON-escaped form of “<all_urls>”. Using this allows all URLs to match for the content script. 
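From a reviewer's point of view, this is easy to check for. The sketch below parses a manifest and reports content scripts whose match patterns cover every site; parsing the JSON also turns the escaped “\u003Call_urls\u003E” back into “<all_urls>”. The set of broad patterns is an assumption and not exhaustive.

```python
import json

# Match patterns that cover every site; an illustrative, non-exhaustive set.
BROAD_PATTERNS = {"<all_urls>", "*://*/*", "http://*/*", "https://*/*"}

def broad_content_scripts(manifest_text):
    """Return (script files, pattern) pairs for content scripts matching every site."""
    manifest = json.loads(manifest_text)
    findings = []
    for script in manifest.get("content_scripts", []):
        for pattern in script.get("matches", []):
            if pattern in BROAD_PATTERNS:
                findings.append((script.get("js", []), pattern))
    return findings
```

A hit is not proof of anything malicious, since plenty of legitimate extensions need to run everywhere, but it tells you which script files deserve a closer read.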

What exactly is our content script you ask? Remember that encoding stuff above? Well, that needs to go into a file. I prefer to take the output and throw it somewhere in a minified version of the jQuery source, so you get something that looks like this.

Screen_shot_2012-03-27_at_12
Not only is it hard to see where our encoded code exists, but the use of jQuery can also be practical as you can use it within your pop-up component of the extension. Making use of both components makes the extension less suspicious. 

Developer X is Hacked

Assume for a moment the developer of AdBlocker is hacked. That extension has hundreds of thousands of users, any of whom could be skimmed at any moment. What would happen if you modified the source, included a skimmer and re-uploaded? It doesn’t appear like anything would happen. In fact, I have seen extensions completely remove all functionality they originally had yet still have no issues staying inside the Chrome store. This makes me worry about how often these extensions are checked and removed, if at all.

Conclusions

Chrome extensions haven’t changed and their security is no different than it was years ago. The recent warnings that these things could be bad and that you should think twice before installing have held true ever since the store was created. I would bet skimmers like the one documented above exist now with little disturbance and high success. 

I documented this process not to cause some mass skimmer craze, but to really bring awareness to the issue. Articles often miss the technical details of these topics and how easy it is to do it yourself. I will release the code used to generate the extension guts after my talk at InfoSec Southwest. There is a short script that takes in the code you wish to encode and the key and outputs the end blob.

Posted by Brandon Dixon
March 20th, 11:57pm 1 comment

Clean Tibet Quick Script and Action Call

NOTE - If flagged as abuse, the user flagged will be blocked on your Twitter account. This script is not perfect and could flag someone who isn't spamming.

Krebs put out an article today covering the buzz about Tibet hashtags being flooded on Twitter, and it got me thinking. The spam accounts seem to have been created within the past few days, often have a small number of followers and sometimes contain little or nothing within the tweet. Technically someone could build a quick web interface that sorts out the garbage, but that only solves half the issue as the spammers just keep plowing away.

I am not a designer and looked at this from another angle. I put together a quick Python script (bare-bones) that uses the Twitter streaming API to parse through tweets, match against those characteristics and send an abuse message off to Twitter with the offending account. Simply replace the placeholders for username and password with your own and you are ready to go. Lastly, feel free to fork or clean up the script. I recognize this may flag real accounts, but chances are it won't. 
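The matching step might look something like the sketch below. It makes no API calls; it assumes each tweet has already been parsed into a dict with `text`, `user.created_at` and `user.followers_count` fields in the style of Twitter's JSON, and the thresholds and the two-of-three rule are illustrative guesses rather than the values in the actual script.

```python
from datetime import datetime, timedelta, timezone

def looks_like_spam(tweet, now=None, max_age_days=3,
                    max_followers=5, min_text_len=20):
    """Check a tweet dict against the three characteristics described above.

    Returns True when at least two of the three heuristics fire.
    All thresholds are illustrative guesses.
    """
    now = now or datetime.now(timezone.utc)
    user = tweet["user"]
    # Assumed timestamp style: "Tue Mar 20 10:00:00 +0000 2012"
    created = datetime.strptime(user["created_at"], "%a %b %d %H:%M:%S %z %Y")
    signals = [
        now - created <= timedelta(days=max_age_days),      # brand-new account
        user["followers_count"] <= max_followers,           # few followers
        len(tweet.get("text", "").strip()) < min_text_len,  # little or no text
    ]
    return sum(signals) >= 2
```

Requiring two signals instead of one is an attempt to reduce false reports against genuinely new users, which matters given that a report also blocks the account for you.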

Filed under api script tibet twitter
Posted by Brandon Dixon
March 6th, 11:33am 4 comments

Adobe's SWF Tools - CVE-2012-0754

Yesterday Mila posted a Doc file exploiting the recent flash bug (CVE-2012-0754). Having not looked at it yet, I thought this would be a good way to test the new SWF tools Adobe released last night. I downloaded the files from the Contagio site here and the new Adobe tool here.

First Impressions

I like the fact that the tool runs on Mac or Windows. When reversing SWFs I typically find myself on the command line trying to convert the file into some mangled ActionScript, so it's nice to have a GUI to navigate the file. Aside from SWFREtools, I know of no other GUI to reverse SWF files. What I like best about this tool is that the engineers who built it are located in the same area as those who created/support Flash.

Using the Tool

Once you have loaded your SWF file, using the tool is simple. You are presented with statistics on the file at first glance which in this case shows the tag "DoABC2" which could end up being worthwhile. 

Screen_shot_2012-03-06_at_11

From here I went to the SWF Disassembler to take a look at that tag. Using the built-in find, I searched for any HTTP strings and located the remote call out to "http://208.115.230.76/test.mp4". From here I was able to see that URL pushed on to the stack and the reference to the jit_egg shortly thereafter. 

Screen_shot_2012-03-06_at_11

Lastly, I took a look at the strings tab and realized all this combing through ActionScript could have been avoided as all the valuable information was presented to me in a nice, exportable list. 
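When the GUI is not at hand, a similar strings pass can be approximated in a few lines. This sketch handles only uncompressed (FWS) and zlib-compressed (CWS) files and simply greps the body for printable URLs; it does not parse tags or the ActionScript constant pool, so a match may carry a stray trailing byte.

```python
import re
import zlib

# A run of printable ASCII starting with http:// or https://
URL_RE = re.compile(rb"https?://[\x21-\x7e]+")

def swf_body(data):
    """Return the SWF body, inflating it when the file is zlib-compressed (CWS)."""
    signature, body = data[:3], data[8:]  # 3-byte signature, version, 4-byte length
    if signature == b"CWS":
        return zlib.decompress(body)
    if signature == b"FWS":
        return body
    raise ValueError("not an FWS/CWS SWF file")

def extract_urls(data):
    """List every printable HTTP(S) URL embedded in the SWF body."""
    return [m.decode("ascii") for m in URL_RE.findall(swf_body(data))]
```

This is good enough for a first triage, but the Adobe tool remains the better option once you need to see how a string is actually used.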

Screen_shot_2012-03-06_at_11

Conclusions

Adobe has put together a nice tool and I really enjoy it. I am not certain how it will be for really digging into the actual vulnerability being exploited, but for a high-level overview with some technical details, this is great. I also find it useful that you can run and emulate the SWF files all within one place. I plan to add this to my arsenal and continue to use it. 

Posted by Brandon Dixon
February 27th, 3:11pm 4 comments

Revisiting Targeted Attacks - Insights From the Past to Present

This post is filled with a lot of information pertaining to targeted attacks and introduces several branch-paths one could take to gather further information. It is worth detailing how this research came to be so that one can follow the paths taken and choices made.

Overview Summary

I find that there isn’t a large focus on the vehicle or dropper used in attacks. More often than not it is ignored or analyzed only after a write-up on the payload has been released (speaking in terms of public analysis). By separating out these details, one loses context and additional information that could further narrow the attacker’s motives, goals and behavior. This post will cover a mini research project based on small details shared between what later appeared to be a set of targeted attacks lasting several months across multiple years.

Here is a direct link to the data referenced and shared (note - some columns (k) are hidden - mainly the MD5 file listing from dynamic analysis for every file):

https://docs.google.com/spreadsheet/ccc?key=0AnF2ITyrL6JJdHRjMzdycmhZaGRPcl9LR1p6dHRnQlE

Screen_shot_2012-02-27_at_3

The Catalyst

Last year I analyzed numerous PDF attacks, some targeted, some not, but noted an interesting detail about a targeted attack used sometime around June 2011 using CVE-2011-0611. This file contained a named dictionary “PDFWP” that potentially tied the file to the WordPress plug-in used to generate PDFs. I did some quick research, but ultimately shrugged this off until December 2011 when I noticed it again in another targeted attack using CVE-2011-2462. 

Knowing that PDFWP could be a potential weaponizer, I did a search in the PDF X-RAY database and got back about 115 malicious PDF files. Unfortunately, I did not have too much context around these files other than I knew they were bad and in most cases, knew which objects were responsible for the exploitation. 

Testing PDFWP

Before kicking off the major portion of research, I decided to focus in on the plug-in that may have been used or modified to create the files. I installed a default instance of WordPress (older version to avoid build changes) and PDFWP. Generating the PDF was a manual procedure in which I selected the article and then hit generate.

Screen_shot_2012-02-27_at_9

Due to some faulty packages and formatting, I couldn’t get a PDF to generate with a lot of content. I did however get the “tex” file used to generate the PDF and modified it so that only the bare minimum was present. After modifying the TEX and rerunning the file, I got a PDF.

https://www.pdfxray.com/interact/34d91b8a3b3f804a7a26da1ecfc2c45b/

Searching the document revealed no trace of “PDFWP” which I found to be odd. I searched the code, but struggled to find anything that would produce the result I was seeing. Given this, I am not certain this plug-in would have been used to generate the documents I analyzed. Despite this, I still felt the corpus was small and wanted to continue on. 

Vehicle Details

I have yet to find an automated tool capable of accurately identifying and extracting the malicious components from a document, so I was left with the PDF X-RAY API and doing things by hand. As a first pass, I used PDF X-RAY to identify the suspicious objects and then output their contents to files that I could later compare. Doing so quickly revealed JavaScript and SWF objects that were shared among a number of different files.

Screen_shot_2012-02-27_at_2

Not only were the names of the JavaScript functions sometimes identical, but also the entire decoded stream. I was able to identify this by extracting the contents of the malicious object, hashing it and then comparing it to all the other samples in the set. I used the matching of function names and streams as my first connection when associating samples with each other. It is worth noting that while some streams did not match completely, they still retained similar structures. More work could be spent on this particular subject given the small sample set of JavaScript and SWF files used.
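The hash-and-compare step can be sketched as below, assuming the decoded stream of each suspicious object has already been extracted (for example via the PDF X-RAY API). The function and variable names are made up for illustration.

```python
import hashlib
from collections import defaultdict

def group_by_stream_hash(streams):
    """Group sample names by the MD5 of their decoded malicious stream.

    streams maps a sample name to the decoded bytes of its suspicious object.
    Only hashes shared by two or more samples are returned.
    """
    groups = defaultdict(list)
    for sample, content in streams.items():
        groups[hashlib.md5(content).hexdigest()].append(sample)
    return {digest: sorted(names)
            for digest, names in groups.items() if len(names) > 1}
```

Exact hashing only catches byte-identical streams; the near matches mentioned above would need a fuzzy comparison instead.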

Post-exploitation Findings

Just because a file shares JavaScript portions, malicious SWF files or exploit methods doesn’t always mean they match in payload or command and control. In fact, one would hope that a smart adversary would change each one of these details every time an attack were done. Fortunately for us, they don’t.

I contemplated which sandbox to run these files in and ultimately just decided to use my own as I had multiple Adobe versions already snapshotted and wanted to ensure I caught details for files that needed to run longer. Most of the files produced a successful crash (some need more work due to not being able to identify the exploit) and at the very least tried to resolve some addresses or send data back out. 

To capture and record all of the dropped files I used CaptureBat and a couple of scripts to take all the output, MD5 it and give me a nice listing. Having run a lot of these files by hand, it was clear to see some of the same patterns emerging, but I wanted to prove this in code. Just because the files had the same name did not mean they were the same file, though it was yet another connection point. To handle this I wrote a quick Python script to loop through my results and spit back out the relationships.
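The original script is not reproduced here, but the relationship pass can be approximated as follows, assuming the results have been reduced to a mapping from each sample to the set of MD5s of the files it dropped. The names are hypothetical.

```python
from itertools import combinations

def shared_drops(results):
    """Map each pair of samples to the dropped-file MD5s they have in common.

    results maps a sample name to the set of MD5s of the files it dropped.
    """
    links = {}
    for (a, drops_a), (b, drops_b) in combinations(sorted(results.items()), 2):
        common = drops_a & drops_b
        if common:
            links[(a, b)] = common
    return links
```

Comparing on content hashes rather than file names avoids linking samples that merely reuse a common name for different payloads.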

Screen_shot_2012-02-27_at_2

Above is a small portion of the matches found. In some cases the PDFs dropped the same payload, in other cases they dropped the same clean PDF, but a lot of them seemed to be connected. Due to these samples being older, many of them no longer had a valid C2 that they could report to. While this was not favorable, it did reveal that most of the samples issued three SYN packets before giving up. This could later be used to help identify the end payload and the builder. 

Screen_shot_2012-02-27_at_2
Screen_shot_2012-02-27_at_3

C2s were extracted from the PCAPs using my PCAP Tools scripts and manual verification. It was only a few cases that I found what appeared to be a working C2 or an attempt to post data back out to a server. Those cases appeared to be similar and could likely be used to identify the payload on the system. Sharing between each file was represented by color in the spreadsheet so it could easily be seen when old attacks were being reused. 

Timeframes and Outside Open Intelligence

Now that I knew the files were connected in a few ways, I felt like it would be good to do some searching for information about the hash (when it was first seen, write-ups, etc.). For a large portion of my files, most had some presence back to Contagio. This provided me with original emails, dates in which these files were sent, who they were sent to and in some cases confirmed what I had seen dropped. 

I did not factor in any of the email content, but doing so could have identified even more trending. Upon sorting the files, it was clear to see a pattern emerge. For the entire year of 2010, there was almost solid coverage in terms of targeted attacks being sent out. Files with matching payloads, clean PDFs and C2s could easily be seen. In some cases the attackers changed nothing at all after months of waiting.

Thoughts and Observations

Exploits

Of the files identified the most popular exploits appeared to be 2009-4324, 2010-0188, 2010-1297 and 2011-0611. The files appeared to cluster together when viewing them from the JavaScript function name perspective. Given the known timeframes, it did not appear that the attackers attempted to keep up with the latest and greatest exploit. 

Shared Details

Many of the files shared some aspect with each other. Most notably was the sharing of JavaScript code to carry out the actual exploitation or heap spray. As stated before, in some cases the same exact code would be used even after several months had passed. When it came to sharing details such as payload EXEs and clean PDF documents, they appeared in clusters and across a much shorter time period.

Command and Control

Screen_shot_2012-02-27_at_3

In some cases the command and control addresses overlapped, but this was often not the case. However, several net blocks did appear more often than others and in a few cases, domains were re-used. Not much was done past observing the call backs given the lengthy period of time that has passed. Those with historical data may be able to make more sense of these details and identify more connections.

Overall Theme

Many of the targeted files sent had a policy/government theme to them. This was determined based on the clean files dropped and some of the email information collected. Many of the files detailed discussions over policy change, threats in regards to war readiness, human rights and general news content ripped from the web.

Conclusions

This post doesn’t provide the detail necessary to link these attacks to a single group or entity. It does, however, provide a high-level view of operations for some or many of the actors involved in infiltrating systems with the intent of remaining there for quite some time.

I am releasing my spreadsheet of data in hopes that some others interested in this will look into it more. This data has not been shared elsewhere to avoid any conflicts, so it is now open to the public and free to use. I plan on spending some more time looking at the final payload dropped to see if any commonalities exist between them and will provide updates if anything interesting is found.

This project identified a lot of issues with how this sort of work is done in the public space. You are often left to your own devices and what others have preserved. In this case, the dates and times of these files were valuable for identifying clusters or payload reuse, but the data was still limited. Unfortunately, there really isn’t a single source that one can go to and get this information. You often walk away with a small portion of the attack and thus can’t make a full determination.

Having a large amount of potentially related data also raises the question of how it is stored and queried. This project ideally should have gone into some sort of database, but I didn’t feel the planning involved with that was worth the effort. It should be noted that a schema to deal with all this data is very difficult to develop and maintain. For that reason alone I sway towards choosing a NoSQL database yet again, but more thought needs to be put into this problem as a whole before attempting to solve it.
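To give a feel for why a schema-less approach is tempting here, below is a minimal in-memory sketch, not a real database and not something used in this project. Each sample is a free-form dictionary, and an inverted index on (field, value) pairs answers the question "what else shares something with this file?". The class name `SampleStore`, the field names and all values are hypothetical, and values are assumed to be simple strings.

```python
from collections import defaultdict

class SampleStore:
    """Toy document store: free-form records plus an index of shared values."""

    def __init__(self):
        self.docs = {}
        self.index = defaultdict(set)  # (field, value) -> ids of docs holding it

    def add(self, doc_id, doc):
        # No fixed schema: each document may carry whatever fields are known.
        self.docs[doc_id] = doc
        for field, value in doc.items():
            self.index[(field, value)].add(doc_id)

    def related(self, doc_id):
        """Map every other document to the fields it shares with doc_id."""
        shared = defaultdict(list)
        for field, value in self.docs[doc_id].items():
            for other in self.index[(field, value)]:
                if other != doc_id:
                    shared[other].append(field)
        return {other: sorted(fields) for other, fields in shared.items()}

# Hypothetical entries; the identifiers and values are invented.
store = SampleStore()
store.add("a", {"c2": "192.0.2.10", "exploit": "CVE-2010-0188"})
store.add("b", {"c2": "192.0.2.10", "clean_pdf_md5": "abc"})
store.add("c", {"exploit": "CVE-2010-0188", "c2": "192.0.2.10"})

links = store.related("a")
```

New attributes can be added to later records without touching earlier ones, which is the part that is painful with a rigid table layout.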

Attribution is difficult, but finding commonalities and shared resources between attacks like this makes grouping much easier. Trends are impossible to glean without data and it seems that the best analysis and connection making is done when a bountiful amount of data is present. Those interested in joining this fight and making a difference should begin thinking of open ways of storing all this data and easily identifying relationships. Questions, comments, feedback and emails are welcome.

Posted by Brandon Dixon
February 14th, 3:41pm 2 comments

Cat Facts Payback

It’s that time of the year again when someone gets slick and gains the attention of the media because of some lame joke. What is it this time? Well, it is “cat facts”, which has been heavily documented elsewhere, but basically operates as a little trick your friends may play by spamming your phone with messages you don’t want.

I am blogging about this because my brother started getting spammed with “cat facts” Friday night by some pranksters. Fortunately for him, back at Defcon 17 I did a talk titled “SMS No Longer Your BFF” where I covered Internet to mobile spamming through the use of shortmail and XMPP. A lot has changed since my talk, but one thing still remains: default settings.

XMPP is a bit of a pain to script when it comes to registering rogue accounts, so I went with shortmail. As a little recap, shortmail is essentially an email that can be sent to your phone via SMS. Using the destination number and carrier, you can send these messages at will and in bulk. Some carriers support truncating of the messages, so you could easily send one message that would expand out to several SMS. If you want more data on this, go revisit my talk or shoot me an email.

Back to the spammers. Because the number was unknown to us, we weren’t quite sure of the carrier, so we just decided to throw them all into the script hoping one would hit. 

As you can see here, this is a hack. We use the local sendmail SMTP server on the virtual machine to bounce emails from whoever we want to the number provided...forever. After a couple hundred texts had been sent out, we stopped hearing from the cat fact folks and went back to the drinking of beers.

The next day my brother had a text in his inbox that read something along the lines of “what did you do?!”. Needless to say, our spammers turned out to be his friends and the little hack of a script did its job of pushing out several hundred messages to the poor friend’s phone. Moral of the story: don’t cat facts tech people.

Posted by Brandon Dixon