In my last post I gave a simple example of an XMPP bot in action. The bot demonstrated how easy it is to plug a Python XMPP bot into an existing XMPP setup. This post goes a little deeper and shows how command and control could be performed using your own XMPP server along with public XMPP interfaces.
For a working demo I have a local XMPP server with a functioning BOSH connection manager, a hosted Google Talk transport and two local accounts. One account is “tester”, who administers commands to the main bot, “gatekeeper”. Gatekeeper has one command, dumper, which takes in a base64 encoded string separated by a “:”, decodes it and stores it in a backend database. Gatekeeper has been configured to bond directly with a Gmail address that is exposed on the internet for communications. Another Gmail account has been set up to act as the caller: it sends the encoded data back to the main gatekeeper bot and then disconnects.
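To make the dumper command concrete, here is a minimal sketch of what its handler could look like. This is not the bot's actual code: the function names, the `dumps` table, the use of SQLite as the backend and the reading of the payload as one or more base64 fields joined by “:” are all assumptions made for illustration.

```python
import base64
import sqlite3


def open_store(path=":memory:"):
    """Open the (assumed) SQLite backend and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS dumps (id INTEGER PRIMARY KEY, data TEXT)"
    )
    return conn


def handle_dumper(payload, conn):
    """Decode each ':'-separated base64 field and store it as one row."""
    decoded = []
    for field in payload.strip().split(":"):
        if not field:
            continue  # ignore empty fields left by stray separators
        try:
            # validate=True rejects characters outside the base64 alphabet
            raw = base64.b64decode(field, validate=True)
        except ValueError:  # binascii.Error is a subclass of ValueError
            return "error: invalid base64 field"
        decoded.append(raw.decode("utf-8", errors="replace"))
    if not decoded:
        return "error: empty payload"
    conn.executemany(
        "INSERT INTO dumps (data) VALUES (?)", [(d,) for d in decoded]
    )
    conn.commit()
    return "ack: stored %d field(s)" % len(decoded)
```

Because “:” is not part of the base64 alphabet, splitting on it before decoding is safe. The returned string is what the bot would send back as its ack.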
In the video (view in HD) below the following occurs:
- Openfire interface is shown with both users offline
- Tester logs into an HTTP interface that can chat with the gatekeeper bot
- The bot is started
- Both users are confirmed to be online through the Openfire interface
- Tester sends a help command to the bot to see the commands available
- Gmail interface for gatekeeper is shown
- Message is sent from the “caller” address to the gatekeeper gmail
- The bot sends back the response of valid commands through Gmail to the “caller” Gmail
- Base64 encoded string is sent from the “caller” to the gatekeeper
- Gatekeeper acks the information
- Connection to the database shows the record with decoded data
Let's speak through the web!
XMPP works great over the standard 5222/5223 ports, but that makes our traffic stick out on the network and it is likely to get inspected. What better way to look normal than to speak through HTTP(S)? To facilitate this, XMPP uses BOSH, which can pass data over both unsecured and secure connections. How does this factor into our setup? Well, Openfire has a BOSH connection manager built in that supports both. Located under the server settings are a couple of options that control how BOSH clients can form a connection and what terms they need to follow. A simple enable is all that is needed to turn the service on and expose the connection interface over port 7070.
With BOSH enabled and exposed, we can make a connection from an HTTP(S) website to our server using long-polling HTTP requests. Using a modern browser, jQuery and the Strophe library, we can authenticate to our XMPP server and chat as if we were using a fat client. Google Talk is a great example of this concept: you can install the desktop client and connect to the 5222/5223 service, or you can chat within the browser. Aside from looking like normal traffic, how else does this help? Assuming you use secure communications all over (HTTPS on the web and on the BOSH manager), all your chat traffic remains encrypted, making it difficult to understand what is actually being sent back and forth. By communicating over HTTP(S) you also have the ability to integrate administration of the bots and callback data into one centralized panel instead of multiple locations.
Of course some people might be wary of making a direct connection with Google or Yahoo, but there is no need to worry as there are plenty of public XMPP servers hosted online. These servers host a number of different transports using their own public instant messaging domains, a lot of which are in foreign countries and likely ignore the day-to-day usage of the actual services. Assuming you make a connection to one of these services, how would you speak back to your bot? Simply take your bot name and throw the proper domain suffix on it. In my example I bonded the local “gatekeeper” bot with a Gmail address and was then able to communicate directly with the bot through the Gmail interface.
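For a feel of what actually travels over HTTP, here is a rough sketch that builds the first request body a BOSH client sends to open a session. The function name and default values are my own; the attribute names follow the BOSH specifications (XEP-0124 and XEP-0206) as I understand them, so treat this as an illustration and not a full client.

```python
import random
from xml.sax.saxutils import quoteattr

BOSH_NS = "http://jabber.org/protocol/httpbind"
XBOSH_NS = "urn:xmpp:xbosh"


def build_session_request(domain, rid=None, wait=60, hold=1):
    """Return the XML <body/> wrapper that asks for a new BOSH session."""
    if rid is None:
        # The request ID starts at a random value and increments per request.
        rid = random.randint(1, 2 ** 32)
    attrs = [
        ("xmlns", BOSH_NS),
        ("xmlns:xmpp", XBOSH_NS),
        ("to", domain),            # XMPP domain we want to talk to
        ("rid", str(rid)),
        ("wait", str(wait)),       # longest time (seconds) the server may hold a request
        ("hold", str(hold)),       # number of requests the server may keep open
        ("ver", "1.6"),
        ("xml:lang", "en"),
        ("xmpp:version", "1.0"),
        ("content", "text/xml; charset=utf-8"),
    ]
    return "<body " + " ".join(
        "%s=%s" % (name, quoteattr(value)) for name, value in attrs
    ) + "/>"
```

This body would be sent as an ordinary HTTP POST to the connection manager (on a default Openfire install I would expect something like `http://yourserver:7070/http-bind/`, but check your own configuration). To anything watching the wire it is just a web request carrying a small XML document.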
Here is a decent list of open transports that can be polled programmatically:
Can we scale as connections come in?
Depending on the XMPP server, scaling can be pretty easy. Servers like Openfire allow for clustering and detached BOSH connection managers with the ability to load balance over several systems. Other servers like ejabberd allow for similar setups, with the addition of hot swapping code thanks to the use of Erlang. A well built server should be able to handle thousands of connections without any trouble. Given the setup though, connections are likely going to be made for short periods of time with small amounts of traffic sent through.
Unlike some other instant messaging services, XMPP also has the ability to queue up messages sent while a user is offline. More often than not a bot is not going to remain connected all the time, though in theory it could. Instead, users could take advantage of this storage and send commands to the bots while they are offline; the next time the bots log in, they would receive their instructions. This could scale further by using the peer-to-peer support available within XMPP. Bots could elect a single “shot caller” who would fetch instructions and then distribute them locally across the peer group. If P2P wasn’t desirable, a single bot could still be used to update a local copy of the configuration on the HTTP(S) site, which all bots would then read to find out what they needed to do.
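The offline queueing behaviour can be pictured with a small toy model. This is not how Openfire or any real server implements it; the class and method names are made up, and it only mimics the store-then-deliver-on-login idea described above.

```python
from collections import defaultdict, deque


class OfflineStore:
    """Toy model of a server holding messages for users who are offline."""

    def __init__(self):
        self.online = set()
        self.pending = defaultdict(deque)   # jid -> messages waiting for login
        self.delivered = defaultdict(list)  # jid -> messages already handed over

    def send(self, to_jid, body):
        """Deliver right away if the user is online, otherwise queue it."""
        if to_jid in self.online:
            self.delivered[to_jid].append(body)
        else:
            self.pending[to_jid].append(body)

    def login(self, jid):
        """Mark the user online and flush queued messages in arrival order."""
        self.online.add(jid)
        flushed = list(self.pending.pop(jid, deque()))
        self.delivered[jid].extend(flushed)
        return flushed

    def logout(self, jid):
        self.online.discard(jid)
```

Two messages sent to `bot@example.local` before it connects come back from `login()` in the order they were sent, and anything sent after that is delivered immediately.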