Wednesday, July 22, 2009

Sweeping Networks with Clojure

One of the great things about Clojure is the Java toolbox that it brings to the table. Socket programming has never been one of Lisp's strong points, and although it has always been possible to write network applications in Lisp, platform-specific details and portability are generally a concern. Luckily, the JVM provides one of the most tested networking stacks on the planet, so we can cast these historical concerns to the wind.

Enter my situation this evening. I was recently given a new workstation at my place of employment. Our DHCP server delegates IPs by MAC address, so I was assigned a new IP, which I haven't committed to memory. I glanced at the output from ifconfig this afternoon before leaving and briefly noticed that my workstation fell somewhere in the dot one-hundred range. After getting home and noticing a bug in the application I've been working on, I needed to access my workstation to recompile.

Not knowing the IP, my only option was to hop on the VPN and sweep the network for machines running ssh. There are obviously tools available to do this rather easily (nmap for example), but what fun is that?

I'm generally a fan of keeping functions as pure as possible and avoiding side effects wherever I can. That being said, socket programming, like all forms of IO, is inherently laden with side effects, so in this case I've opted to employ certain facilities I might otherwise avoid.

Kicking things off with a bit of ceremony, we'll import the required objects.
(import '(java.io IOException)
        '(java.net InetSocketAddress Socket
                   SocketTimeoutException UnknownHostException))
With that out of the way, we'll create a function to see if a given host / port combination is connectable. To avoid blocking indefinitely, we'll let the connection attempt time out (thanks to nikkomega from reddit for helping me improve this function).
(defn host-up? [hostname timeout port]
  (let [sock-addr (InetSocketAddress. hostname port)]
    (try
      ;; with-open closes the socket whether or not connect succeeds
      (with-open [sock (Socket.)]
        (.connect sock sock-addr timeout)
        hostname)
      ;; SocketTimeoutException and UnknownHostException are both
      ;; subclasses of IOException, so this single clause covers them
      (catch IOException e false))))
As you can see, the use of with-open ensures that the socket is closed regardless of the outcome. A successful connection returns the hostname, while any exception caught along the way results in a return value of false. We'll use this later to filter through the relevant results. To avoid ruining the flexibility of the host-up? function, we'll add a second function to test specifically for ssh servers running on port 22.
(defn ssh-host-up? [hostname]
  (host-up? hostname 5000 22))
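As a quick sanity check that needs no real network, host-up? can be pointed at a throwaway listener on the local machine. This is only a sketch: the listener, the loopback address and the 1000 ms timeout are my own choices for illustration, and host-up? is repeated (with a single catch clause) so the snippet runs on its own.

```clojure
(import '(java.io IOException)
        '(java.net InetSocketAddress ServerSocket Socket))

;; Copy of host-up? so this snippet is self-contained.
(defn host-up? [hostname timeout port]
  (let [sock-addr (InetSocketAddress. hostname port)]
    (try
      (with-open [sock (Socket.)]
        (.connect sock sock-addr timeout)
        hostname)
      (catch IOException e false))))

;; Port 0 asks the OS for any free port; we read back which one it picked.
(def listener (ServerSocket. 0))
(def port (.getLocalPort listener))

;; While the listener is open, the connect succeeds and the hostname comes back.
(def while-listening (host-up? "127.0.0.1" 1000 port))

;; Once it is closed, the connect is refused and the result is false.
(.close listener)
(def after-close (host-up? "127.0.0.1" 1000 port))
```

The truthy-or-false return value is what lets the function double as a predicate for filter later on.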
The timeout in ssh-host-up? is hardcoded at 5000 milliseconds, which is probably much longer than needed. Performance will suffer in a single-threaded application, but we'll address this later. With the hard work out of the way, we'll simply apply the functions to the desired data.
(def network "192.168.1.")
; scan 192.168.1.1 - 192.168.1.254
(def ip-list (for [x (range 1 255)] (str network x)))
(doseq [host (filter ssh-host-up? ip-list)]
  (println (str host " is up")))
After running this, I was able to retrieve the desired results and locate my machine; however, it took over twelve minutes to sweep the entire network. This is due to the long timeout and the fact that we're testing each host in a serial fashion. Seeing as we're using Clojure, a few small changes should improve the situation dramatically.

Before multi-threading the program:
real    12m19.390s
user    0m1.684s
sys     0m0.364s
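A little arithmetic puts that figure in context. The sketch below uses only numbers already given in this post (254 addresses, a 5000 ms timeout, and the 12m19.390s wall-clock time); the variable names are mine.

```clojure
(def host-count (count (range 1 255)))   ; 254 addresses, .1 through .254
(def timeout-ms 5000)

;; Upper bound for the serial sweep: every host runs into the full timeout.
(def worst-case-secs (/ (* host-count timeout-ms) 1000))   ; 1270 s, about 21 minutes

;; Observed wall-clock time from the `real` line: 12m19.390s.
(def observed-secs (+ (* 12 60) 19.390))

;; Average time spent per address, roughly 2.9 seconds.
(def avg-secs-per-host (/ observed-secs host-count))
```

The observed time sits well under the worst case, which is consistent with a fair number of addresses answering or refusing quickly rather than timing out, although the timings alone can't say how many.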
There are various ways to add concurrency to a Clojure app, but agents provide a send-off function specifically designed for blocking tasks. Since we're sitting around waiting for most of these hosts to time out, agents are a logical choice in this case. Because the first part of our program was written in a generic fashion, all we need to change is the application of the functions.
(def network "192.168.1.")
; scan 192.168.1.1 - 192.168.1.254
(def ip-list (for [x (range 1 255)] (str network x)))
(def agents (for [ip ip-list] (agent ip)))

;; bind to `a` rather than `agent` to avoid shadowing clojure.core/agent
(doseq [a agents]
  (send-off a ssh-host-up?))

(apply await agents)

(doseq [host (filter deref agents)]
  (println (str @host " is up")))

(shutdown-agents)
Running the modified code reduces the runtime from twelve minutes to six seconds. Who can argue with that?
real    0m6.731s
user    0m1.996s
sys     0m0.268s
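Agents aren't the only way to get this effect. As one alternative, here is a minimal sketch using futures instead; it is not the code I ran above. To keep it runnable without a network, fake-check is a hypothetical stand-in for ssh-host-up? that just sleeps for 200 ms and "finds" addresses ending in 0, and sweep and the 10.0.0. addresses are likewise made up for the example.

```clojure
;; Hypothetical stand-in for ssh-host-up?: sleeps to imitate a blocking
;; connect, then reports only addresses ending in "0" as up.
(defn fake-check [host]
  (Thread/sleep 200)
  (if (.endsWith ^String host "0") host false))

(defn sweep
  "Runs check on every host in its own future and returns the truthy results."
  [check hosts]
  ;; mapv is eager, so every future is started before we block on any of them
  (let [pending (mapv #(future (check %)) hosts)]
    (filterv identity (map deref pending))))

(def hosts (for [x (range 1 31)] (str "10.0.0." x)))
(def found (sweep fake-check hosts))
;; found => ["10.0.0.10" "10.0.0.20" "10.0.0.30"]
```

Passing ssh-host-up? and ip-list to sweep should give the same kind of result as the agent version. As with agents, a script that uses futures should call (shutdown-agents) at the end so the JVM exits promptly.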

5 comments:

  1. Hi. Nice blog, Travis! I really enjoyed the post with the mad scientist and the monkey.

    I was at work earlier and couldn't post here (flash maybe?) but I posted on the reddit thread my attempt to convert your code into a more general ping sweep.

    Only problem is, I couldn't make all the agents work concurrently at once -- I had to spawn them 1.5 seconds from one another. I can post the code here for completeness if you'd like.

    Anyway, thanks for the head-start on a Clojure utility I've been wanting to make recently.

  2. Post away, I'm no expert, but I'll be happy to take a look.

  3. Very nice...
    I had a similar problem at my workplace where I had to fetch data from a web service. I used pmap for that, but no matter how many threads I told it to use, I could only see around 4 to 5 running at a time.
    I couldn't figure out the reason for that...

  4. Great example, very clear to follow. Funny that on my machine it takes 6.255secs. Please keep posting your findings with clojure, fun to read. Cheers Patrick

  5. @rsdr, pmap uses a fixed-size thread pool proportionate to the number of available processors.
