<?xml version="1.0" encoding="UTF-8"?>
<!-- generator="wordpress/2.0.4" -->
<rss version="2.0" 
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	>

<channel>
	<title>Thought Palace</title>
	<link>http://mooseyard.com/Jens</link>
	<description>little boxes made of words</description>
	<pubDate>Sat, 17 May 2008 00:42:28 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.0.4</generator>
	<language>en</language>
			<item>
		<title>BLIP update</title>
		<link>http://mooseyard.com/Jens/2008/05/blip-update/</link>
		<comments>http://mooseyard.com/Jens/2008/05/blip-update/#comments</comments>
		<pubDate>Sat, 17 May 2008 00:42:26 +0000</pubDate>
		<dc:creator>Jens Alfke</dc:creator>
		
	<category>Computers</category>
		<guid isPermaLink="false">http://mooseyard.com/Jens/2008/05/blip-update/</guid>
		<description><![CDATA[	I&#8217;ve got my new BLIP protocol all implemented now. After my previous post on Monday:

	
		On Tuesday I implemented message metadata.
		On Wednesday I got SSL working (configuring the &#8220;server&#8221; side to verify the &#8220;client&#8217;s&#8221; cert was difficult.)
		On Thursday I put Cloudy up on blocks, pried out Vortex and my Obj-C wrapper library, and replaced them with [...]]]></description>
			<content:encoded><![CDATA[	<p>I&#8217;ve got my <a href="http://mooseyard.com/Jens/2008/05/the-fine-line-between-clever-and-stupid/" title="">new <span class="caps">BLIP</span> protocol</a> all implemented now. After my previous post on Monday:</p>

	<ul>
		<li>On Tuesday I implemented message metadata.</li>
		<li>On Wednesday I got <span class="caps">SSL</span> working (configuring the &#8220;server&#8221; side to verify the &#8220;client&#8217;s&#8221; cert was difficult.)</li>
		<li>On Thursday I put Cloudy up on blocks, pried out Vortex and my Obj-C wrapper library, and replaced them with <span class="caps">BLIP</span>.</li>
		<li>And on Friday (today) I debugged.</li>
	</ul>

	<p>Cloudy&#8217;s back up and running, and all its features work. So, that makes one week of effort to implement the networking layer from scratch (I started sketching and coding on Saturday). Really makes me regret spending several times that on the previous library&#8212;writing an Obj-C <span class="caps">API</span>, fixing bugs, adding features. Still, I&#8217;m sure all that experience helped me implement <span class="caps">BLIP</span> so quickly.</p>

	<p>(The code for <span class="caps">BLIP</span> is quite separable from Cloudy; but it does use some lower-level utilities and network classes I wrote, so it&#8217;s not quite standalone. And it needs documentation before it&#8217;ll make sense to anyone. How soon I release it depends on how much interest there is, so let me know if it&#8217;s something you&#8217;re interested in using.)</p>
 ]]></content:encoded>
			<wfw:commentRSS>http://mooseyard.com/Jens/2008/05/blip-update/feed/</wfw:commentRSS>
		</item>
		<item>
		<title>The Fine Line Between Clever And Stupid</title>
		<link>http://mooseyard.com/Jens/2008/05/the-fine-line-between-clever-and-stupid/</link>
		<comments>http://mooseyard.com/Jens/2008/05/the-fine-line-between-clever-and-stupid/#comments</comments>
		<pubDate>Tue, 13 May 2008 05:05:17 +0000</pubDate>
		<dc:creator>Jens Alfke</dc:creator>
		
	<category>Computers</category>
		<guid isPermaLink="false">http://mooseyard.com/Jens/2008/05/the-fine-line-between-clever-and-stupid/</guid>
		<description><![CDATA[It's that old "make vs. buy" trade-off, or "write vs. reuse" in this case: do you go with an existing library, even if it's problematic, or do you write your own implementation from scratch?
What am I talking about? The networking code in Cloudy.]]></description>
			<content:encoded><![CDATA[	<p>&#8230;and which side of that line am I on? Not in general; just in respect to my latest decision in Cloudy. It&#8217;s the old &#8220;make vs. buy&#8221; trade-off, or &#8220;write vs. reuse&#8221; in this case: do you go with an existing library, even if it&#8217;s problematic, or do you write your own implementation from scratch?</p>

	<p>What am I talking about? The networking code in Cloudy. From the very beginning I wanted to use <a href="http://beepcore.org" title=""><span class="caps">BEEP</span></a>, a generic and flexible protocol for sending request/response messages over a socket. It has good support for parallelism, nice abstractions like multiple channels and feature negotiation, and supports <span class="caps">SSL</span>.</p>

	<p>The <span class="caps">BEEP</span> implementation I&#8217;m using is <a href="http://vortex.aspl.es" title="">Vortex</a>. It has the benefits of existing (there aren&#8217;t a lot of <span class="caps">BEEP</span> implementations around), being written in native code, and being within my capabilities to get running on Mac <span class="caps">OS X</span>. Unfortunately it&#8217;s also got a very complex and unintuitive <span class="caps">C API</span>, spawns lots of threads and calls my code from them (so I have to deal with thread-safety), and isn&#8217;t quite finished yet. So over the months I&#8217;ve put a lot of work into writing an Objective-C <span class="caps">API</span>, figuring out how to get that working reliably, and diving into the Vortex code to fix bugs and add new features.</p>

	<p>At some point&#8212;I think it was last Thursday&#8212;I crossed a line and started to wonder whether putting more effort into Vortex was throwing good money after bad. I wanted to use some of <span class="caps">BEEP</span>&#8217;s <span class="caps">MIME</span> features, but found that Vortex only implements a tiny subset, and that even that implementation doesn&#8217;t work right. Going down the path of fixing and improving, I found that I would have to break compatibility with any earlier versions of Vortex (which all send wrongly-formatted data), and that every patch I added was exposing more bugs, whether existing ones or newly-added ones of my own. And all this meant hacking on complicated code in plain C, with <a href="https://dolphin.aspl.es/svn/publico/af-arch/trunk/libvortex/src/vortex_sequencer.c" title="">very long functions, and liberal usage of &#8216;goto&#8217; statements</a>.</p>

	<p>Unfortunately, there are not many <span class="caps">BEEP</span> implementations to choose from (it&#8217;s hard to compete against <span class="caps">HTTP</span>, the hammer everyone reaches for.) There&#8217;s a private <span class="caps">BEEP</span>.framework inside <span class="caps">OS X</span>, but I&#8217;d have to be a fool to use it (for legal reasons if not technical.) I found that SubEthaEdit has one, and I&#8217;ve asked the developers about it, but I&#8217;m sure there will be financial terms.</p>

	<h3>So yeah, I&#8217;m writing my own.</h3>

	<p>No, I am not writing my own <span class="caps">BEEP</span> implementation! Stupid sensationalistic headlines. I thought about it for a few hours on Friday, but <span class="caps">BEEP</span> is actually pretty complicated and I knew it would end up taking longer and being harder than I would like.</p>

	<p>But then I started to think about what <span class="caps">BEEP</span> features I need, and which ones I don&#8217;t, and realized I could make do with a subset. Things I need:</p>

	<ul>
		<li>Multiple messages and replies over a single socket</li>
		<li>Parallelism (don&#8217;t have to strictly alternate messages and replies)</li>
		<li>Message metadata (some type of <span class="caps">MIME</span>-like headers)</li>
		<li><span class="caps">SSL</span> support</li>
	</ul>

	<p>But I don&#8217;t need:</p>

	<ul>
		<li>Multiple channels (that&#8217;s a biggie! Quite complicated.)</li>
		<li>Well-defined schema (&#8220;profiles&#8221;) with negotiation over which to use</li>
		<li><span class="caps">FIFO</span> message delivery</li>
		<li>Negotiation of whether to use <span class="caps">SSL</span></li>
		<li><span class="caps">SASL</span> and other authentication schemes</li>
		<li>Interoperability with any other clients</li>
	</ul>

	<p>On Saturday morning, a bit of pen-and-paper doodling convinced me I could create a simpler protocol that used some of the ideas of <span class="caps">BEEP</span>, and would give me just the features I needed. And since I could build on top of the Foundation and CFNetwork classes, in Objective-C, I could do it with a whole lot less code than Vortex.</p>

	<p>So I started coding&#8230;</p>

	<h3><span class="caps">BLIP</span>.</h3>

	<p>It&#8217;s almost ready now, Monday night. I don&#8217;t have message metadata implemented yet, and flow-control needs work, but the rest is running pretty reliably. I&#8217;m calling it <span class="caps">BLIP</span>, for &#8220;BEEP-LIke Protocol&#8221; (or maybe &#8220;BEEP-Lite Imitation Protocol&#8221;, like some kind of suspect canned food product you might find in a dingy supermarket.)</p>

	<p>What does <span class="caps">BLIP</span> do? It lets you open a <span class="caps">TCP</span> socket and then use it to send messages back and forth. Each message is a data blob of arbitrary length, with an optional set of key/value properties [once I implement that]. Each message can have a reply associated with it, so you can send a message and then wait for the reply. Either peer can send messages, at any time (as opposed to most protocols that only let the &#8220;client&#8221; who opened the connection send messages, and only the &#8220;server&#8221; reply.)</p>

	<p>Multiple messages can be in flight at once, in either direction: sending a 10MB file doesn&#8217;t block the transfer of other data. Messages can request high priority to get more bandwidth, or they can request gzip compression.</p>
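
	<p>[To make the &#8220;multiple messages in flight&#8221; idea concrete, here is a minimal Python sketch of interleaving. It is not the actual BLIP wire format, which this post doesn&#8217;t describe: the tiny frame size, the <code>Frame</code> tuple, and the round-robin policy are all assumptions made for illustration.]</p>

```python
from collections import deque
from typing import Iterator, NamedTuple

FRAME_SIZE = 4  # hypothetical and tiny; a real protocol would use kilobytes


class Frame(NamedTuple):
    msg_id: int      # which message this chunk belongs to
    payload: bytes   # one slice of the message body
    final: bool      # True on the last frame of a message


def interleave(messages: dict[int, bytes]) -> Iterator[Frame]:
    """Round-robin over pending messages, emitting one frame from each in turn."""
    queue = deque((msg_id, body, 0) for msg_id, body in messages.items())
    while queue:
        msg_id, body, offset = queue.popleft()
        end = offset + FRAME_SIZE
        final = end >= len(body)
        yield Frame(msg_id, body[offset:end], final)
        if not final:
            # Unfinished messages go to the back of the line.
            queue.append((msg_id, body, end))


def reassemble(frames) -> dict[int, bytes]:
    """Receiver side: glue each message's frames back together by ID."""
    partial: dict[int, bytearray] = {}
    done: dict[int, bytes] = {}
    for frame in frames:
        partial.setdefault(frame.msg_id, bytearray()).extend(frame.payload)
        if frame.final:
            done[frame.msg_id] = bytes(partial.pop(frame.msg_id))
    return done
```

	<p>With this scheme a short message queued behind a long one completes after the long message&#8217;s first frame, instead of waiting for the whole transfer to finish.</p>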

	<p>And it&#8217;s only 1500 lines of code, so far! Vortex is about 34,000. Even my Obj-C wrapper for Vortex is bigger than <span class="caps">BLIP</span> is. [All those figures ignore comments and blank lines.]</p>

	<p>I&#8217;m sure <span class="caps">BLIP</span> will end up doubling in size before it&#8217;s really ready; and I&#8217;m also sure it&#8217;ll take a lot more time than I&#8217;ve spent on it. But both of those figures are cheap compared to what I&#8217;ve invested in something that ultimately wasn&#8217;t working for me. So I think in this case re-inventing the wheel is justified.</p>

	<p>&#8230;Meaning that I&#8217;ve answered the implied question of this post&#8217;s title, coming down on the side of &#8220;clever&#8221;. Whew! (What else did you expect? It&#8217;s <em>my</em> blog, after all.)</p>

	<p>[PS: Yes, if <span class="caps">BLIP</span> does work out, I&#8217;ll certainly open-source it. I wouldn&#8217;t call it a replacement for <span class="caps">BEEP</span>, as it&#8217;s a lot more limited, but I think it&#8217;ll be useful.]</p>
 ]]></content:encoded>
			<wfw:commentRSS>http://mooseyard.com/Jens/2008/05/the-fine-line-between-clever-and-stupid/feed/</wfw:commentRSS>
		</item>
		<item>
		<title>Stickies makes its music-video debut!</title>
		<link>http://mooseyard.com/Jens/2008/05/stickies-makes-its-music-video-debut/</link>
		<comments>http://mooseyard.com/Jens/2008/05/stickies-makes-its-music-video-debut/#comments</comments>
		<pubDate>Sun, 11 May 2008 16:07:18 +0000</pubDate>
		<dc:creator>Jens Alfke</dc:creator>
		
	<category>Me</category>
	<category>Humor</category>
	<category>Computers</category>
		<guid isPermaLink="false">http://mooseyard.com/Jens/2008/05/stickies-makes-its-music-video-debut/</guid>
		<description><![CDATA[Stickies and I hadn't spoken in a while, but this morning I just heard it's made its acting debut in a music video! That was unexpected, to say the least, but it's an exciting career move, and I had to congratulate it; it does a great job.]]></description>
			<content:encoded><![CDATA[	<p>Stickies and I hadn&#8217;t spoken in a while, but it called me this morning to announce it&#8217;s made its acting debut in a music video! That was unexpected, to say the least, but it&#8217;s an exciting career move, and I had to congratulate it; it does a great job:</p>

	<p><object width="425" height="355"><param name="movie" value="http://www.youtube.com/v/6kxDxLAjkO8&#038;rel=0&#038;color1=0x234900&#038;color2=0x4e9e00&#038;hl=en"></param><param name="wmode" value="transparent"></param><embed src="http://www.youtube.com/v/6kxDxLAjkO8&#038;rel=0&#038;color1=0x234900&#038;color2=0x4e9e00&#038;hl=en" type="application/x-shockwave-flash" wmode="transparent" width="425" height="355"></embed></object></p>

	<p>Stickies makes its entrance at 0:53, if you want to skip directly to it, but really the entire video (and song) are excellent. I just wish they&#8217;d used Stickies in the opening scenes instead of Word&#8212;face it, Word is over the hill, especially that old Office 2004 version. (Did you see the bags under the Office Assistant&#8217;s eyes? Stickies told me they dragged it straight out of the Betty Ford Center to shoot those scenes, and it couldn&#8217;t remember any of its lines even though they were right up on the screen next to it in giant print. It&#8217;s sad, really. At least it hasn&#8217;t OD&#8217;d yet like that pathetic paperclip.)</p>

	<p>This seems to be a fan-made video, by the way; but I think it&#8217;s better than <a href="http://www.youtube.com/watch?v=xDlEXQaMBpk" title="">the official one</a>. Now the question is: will Apple use this in a commercial? I think they should!</p>

	<p>[via <a href="http://www.37signals.com/svn/posts/1020-how-to-make-a-music-video" title="">37signals</a>]</p>
 ]]></content:encoded>
			<wfw:commentRSS>http://mooseyard.com/Jens/2008/05/stickies-makes-its-music-video-debut/feed/</wfw:commentRSS>
		</item>
		<item>
		<title>Coroutines, pt. 2</title>
		<link>http://mooseyard.com/Jens/2008/05/coroutines-pt-2/</link>
		<comments>http://mooseyard.com/Jens/2008/05/coroutines-pt-2/#comments</comments>
		<pubDate>Sat, 03 May 2008 16:21:31 +0000</pubDate>
		<dc:creator>Jens Alfke</dc:creator>
		
	<category>Computers</category>
		<guid isPermaLink="false">http://mooseyard.com/Jens/2008/05/coroutines-pt-2/</guid>
		<description><![CDATA[It strikes me that ucontext is basically no lighter-weight than a pthread, in terms of address-space usage and context switch speed. Is that true? Or is there additional overhead to pthreads besides the stack + registers?

If so, then it might be simpler just to use pthreads, since the API is already in place, and existing system facilities (like ObjC and C++ exceptions, and Cocoa autorelease pools) already know how to work with them. But the cooperative scheduling of coroutines is a bonus in some ways, as it makes the flow of control more deterministic and reduces the need for complex locking and synchronization.

So my second question is whether there's a clean way to implement cooperative scheduling of pthreads?]]></description>
			<content:encoded><![CDATA[	<p><em>[I just posted these questions to Apple&#8217;s darwin-userlevel mailing list, but they&#8217;re also worth cc:ing here as a follow-up to my last post.]</em></p>

	<p>I&#8217;ve been experimenting this week with coroutines. Typically these are implemented as a type of cooperative thread, since each coroutine needs a separate stack and register set. I adapted Steve Dekorte&#8217;s libCoroutine, which basically just uses ucontext, with malloc&#8217;ed stacks.</p>

	<p>It strikes me that ucontext is basically no lighter-weight than a pthread, in terms of address-space usage and context switch speed. Is that true? Or is there additional overhead to pthreads besides the stack + registers?</p>

	<p>If so, then it might be simpler just to use pthreads, since the <span class="caps">API</span> is already in place, and existing system facilities (like ObjC and C++ exceptions, and Cocoa autorelease pools) already know how to work with them. But the cooperative scheduling of coroutines is a bonus in some ways, as it makes the flow of control more deterministic and reduces the need for complex locking and synchronization.</p>

	<p>So my second question is whether there&#8217;s a clean way to implement cooperative scheduling of pthreads<sup>1</sup>, i.e. to have a set of threads that transfer control within themselves only via some sort of &#8220;yield&#8221; call, not by pre-emption? I&#8217;ve seen this implemented, for tutorial purposes, in Java using a shared lock. However, a coroutine transfers control to an explicitly-named other coroutine; it&#8217;s not at the whim of the thread scheduler. How would one implement that in pthreads? I&#8217;m guessing it would involve a lock per thread; but I&#8217;m not familiar with the details of the pthreads primitives or <span class="caps">API</span>, so advice would be appreciated.</p>

	<p>&#8212;Jens</p>

	<p><sup>1</sup> Yes, I&#8217;m aware that cooperative scheduling means you don&#8217;t get all the performance benefits of multicore CPUs. I&#8217;m not concerned about that because, frankly, my application code uses minimal <span class="caps">CPU</span> time; it&#8217;s always waiting for sockets, CoreAudio background threads, or user input. So the benefits of threading my code aren&#8217;t worth the complexity and error-prone-ness of thread-safety.</p>
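
	<p>[For illustration, here is one way the &#8220;lock per thread&#8221; guess could look. This is a sketch in Python, with <code>threading</code> standing in for pthreads and a counting semaphore standing in for the per-thread lock; the <code>CoThread</code> class and its method names are hypothetical, and this is not a claim about how it should be done with the real pthreads primitives.]</p>

```python
import threading


class CoThread:
    """A thread that runs only after some other CoThread transfers to it."""

    def __init__(self, body=None):
        self._wake = threading.Semaphore(0)   # the per-thread "lock"
        if body is not None:
            # body=None means "wrap the calling thread" (used for main below).
            threading.Thread(target=self._run, args=(body,), daemon=True).start()

    def _run(self, body):
        self._wake.acquire()      # stay parked until someone transfers to us
        body(self)

    def resume(self):
        """Wake this thread without parking the caller."""
        self._wake.release()

    def transfer(self, other):
        """Hand control to an explicitly named thread, then park."""
        other.resume()
        self._wake.acquire()


log = []
main = CoThread()                 # represents the calling thread


def ping(me):
    for i in range(2):
        log.append(f"ping {i}")
        me.transfer(pong_co)
    main.resume()                 # finished: wake main and let this thread end


def pong(me):
    while True:
        log.append("pong")
        me.transfer(ping_co)


ping_co = CoThread(ping)
pong_co = CoThread(pong)
main.transfer(ping_co)            # blocks until ping wakes main again
print(log)
```

	<p>Because each semaphore counts its wake-ups, a transfer that arrives before the target has parked is not lost. The only moment two threads are runnable at once is inside <code>transfer</code>, where the caller does nothing but wait.</p>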
 ]]></content:encoded>
			<wfw:commentRSS>http://mooseyard.com/Jens/2008/05/coroutines-pt-2/feed/</wfw:commentRSS>
		</item>
		<item>
		<title>Coroutines in Objective-C</title>
		<link>http://mooseyard.com/Jens/2008/04/coroutines-in-objective-c/</link>
		<comments>http://mooseyard.com/Jens/2008/04/coroutines-in-objective-c/#comments</comments>
		<pubDate>Wed, 30 Apr 2008 22:03:14 +0000</pubDate>
		<dc:creator>Jens Alfke</dc:creator>
		
	<category>Languages</category>
	<category>Computers</category>
		<guid isPermaLink="false">http://mooseyard.com/Jens/2008/04/coroutines-in-objective-c/</guid>
		<description><![CDATA[I've started using NSOperation in a few places in Cloudy, which means I'm backsliding into using threads and locking and so forth. It definitely makes writing network code easier than Cocoa's asynchronous API, but I really don't want to get into a morass of threads.

What I'd really like to use are Actors (http://revactor.org/philosophy). In a nutshell, an Actor is an object that has its own [cooperative] thread and message queue. Actors interact by message-passing instead of shared state. The idea is to eliminate the need for standard synchronization primitives like semaphores and locks, and get rid of the race conditions and deadlocks that plague multi-threaded programs.]]></description>
			<content:encoded><![CDATA[	<p>I&#8217;ve started using NSOperation in a few places in Cloudy, which means I&#8217;m backsliding into using threads and locking and so forth. It definitely makes writing network code easier than Cocoa&#8217;s asynchronous <span class="caps">API</span>, but I really don&#8217;t want to get into a morass of threads.</p>

	<p>What I&#8217;d really like to use are <a href="http://revactor.org/philosophy" title=""><strong>Actors</strong></a>. In a nutshell, an Actor is an object that has its own [cooperative] thread and message queue. Actors interact by message-passing instead of shared state. The idea is to eliminate the need for standard synchronization primitives like semaphores and locks, and get rid of the race conditions and deadlocks that plague multi-threaded programs.</p>

	<p>The page I linked to above is from <a href="http://revactor.org/" title="">Revactor</a>, a new Actor implementation for Ruby 1.9; Actors are also built into languages like Erlang and Io.</p>
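
	<p>[As a rough picture of the Actor idea described above, here is a small Python sketch in which a generator plays the role of each actor&#8217;s cooperative thread. The <code>Actor</code> class, its <code>step</code> method, and the <code>run</code> loop are made-up names for illustration; this is not Revactor&#8217;s API or anything from Cloudy.]</p>

```python
from collections import deque


class Actor:
    """An object with its own mailbox and its own cooperative 'thread'."""

    def __init__(self, behavior):
        self.mailbox = deque()
        self._gen = behavior(self)   # the generator is the actor's coroutine
        next(self._gen)              # run it up to its first `yield`

    def send(self, message):
        """Other actors interact with this one only by queueing messages."""
        self.mailbox.append(message)

    def step(self):
        """Deliver one queued message, if any; report whether we did."""
        if not self.mailbox:
            return False
        self._gen.send(self.mailbox.popleft())
        return True


def run(actors):
    """Keep delivering messages until every mailbox is empty."""
    while any([actor.step() for actor in actors]):
        pass


received = []


def collector(me):
    while True:
        received.append((yield))     # wait for mail, then record it


def doubler(me):
    while True:
        n = yield
        out.send(n * 2)              # no shared state, just a message


out = Actor(collector)
dbl = Actor(doubler)
for n in (1, 2, 3):
    dbl.send(n)
run([out, dbl])
print(received)
```

	<p>Since an actor only gives up control at a <code>yield</code>, nothing here needs a lock.</p>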

	<p>The only hard part about implementing Actors in Obj-C appears to be the underlying dependency on <a href="http://en.wikipedia.org/wiki/Coroutine" title=""><strong>coroutines</strong></a>. Steve Dekorte [author of Io] has a <a href="http://www.dekorte.com/projects/opensource/libCoroutine/" title="">C coroutine implementation</a>; when I discovered that last year, I then found an <a href="http://www.dpompa.com/?p=83" title="">Objective-C wrapper for it</a>, but that relies on a HigherOrderMessaging library which is incompatible with <span class="caps">OS X</span> 10.5 and hasn&#8217;t yet been updated.</p>

	<h3>So I&#8217;ve gone <span class="caps">DIY</span>.</h3>

	<p>I started with Steve Dekorte&#8217;s coroutine library, but unfortunately it doesn&#8217;t build on 10.5 &#8230; and when I got it to build, I then ran into a bug in the system header &lt;ucontext.h&gt; &#8230; and by the time I&#8217;d worked around <em>that</em>, I knew the code well enough that I pretty much rewrote it. Doing this made it much shorter and clearer, since Steve&#8217;s code is cross-platform and supports three different implementations, while all I care about is <span class="caps">OS X</span> 10.5. I got it down to 150 lines of C.</p>

	<p>Then I wrote a pretty simple Objective-C wrapper around it, under 200 lines. (<a href="http://mooseyard.com/hg/hgwebdir.cgi/Actors/file/tip/Coroutines/MYCoroutine.h" title="">Here&#8217;s the header</a>.)</p>

	<h3>Come &#8216;n&#8217; get it&#8230;</h3>

	<p>The work in progress is <a href="http://mooseyard.com/hg/hgwebdir.cgi/Actors/" title="">in my Mercurial repository</a> if you want to check it out. It&#8217;s grandly titled &#8220;Actors&#8221;, but all that&#8217;s in it so far is the coroutine implementation.</p>

	<p>Needless to say, this is currently <strong>for amusement purposes only</strong>; do not use it in real code. In particular, I strongly suspect that there are going to be problems getting Objective-C exceptions and autorelease pools to work correctly in coroutines. I&#8217;ll try to explore that next.</p>
 ]]></content:encoded>
			<wfw:commentRSS>http://mooseyard.com/Jens/2008/04/coroutines-in-objective-c/feed/</wfw:commentRSS>
		</item>
		<item>
		<title>Cloudy Verification</title>
		<link>http://mooseyard.com/Jens/2008/04/cloudy-verification/</link>
		<comments>http://mooseyard.com/Jens/2008/04/cloudy-verification/#comments</comments>
		<pubDate>Sat, 26 Apr 2008 22:04:54 +0000</pubDate>
		<dc:creator>Jens Alfke</dc:creator>
		
	<category>Social Software</category>
	<category>Computers</category>
		<guid isPermaLink="false">http://mooseyard.com/Jens/2008/04/cloudy-verification/</guid>
		<description><![CDATA[The first time you connect to someone, how do you establish that the digital identifier you're communicating with is the human being you think it is? This is surprisingly difficult to do, because it's prone to what cryptographers call the "man-in-the-middle attack".

First, consider the most obvious attack: simple spoofing.

Let's suppose there's an instant-messaging UI, and while working at home you receive a message from someone with an unknown key, whose nickname is "AliceLiddell", which happens to be the name of a co-worker.]]></description>
			<content:encoded><![CDATA[	<p><i>Continuing from <a href="http://mooseyard.com/Jens/2008/04/cloudy-networking/" title="">the previous Cloudy post</a> &#8230; </i></p>

	<p><img src="http://mooseyard.com/Jed/Cloudy/Mars%204.jpg" alt="" border="0" /></p>

	<p>The <em>first</em> time you connect to someone, how do you establish that the digital identifier you&#8217;re communicating with is the human being you think it is? This is surprisingly difficult to do, because it&#8217;s prone to what cryptographers call the &#8220;man-in-the-middle attack&#8221;.</p>

	<p>(Those of you already wearing tinfoil hats can skip past the general explanation, down to &#8220;What Cloudy Does&#8221;.)</p>

	<h2>1. A Quick Overview Of Verification Attacks.</h2>

	<p>First, consider the most obvious attack: simple spoofing.</p>

	<h3>Spoofing.</h3>

	<p>Let&#8217;s suppose there&#8217;s an instant-messaging UI, and while working at home you receive a message from someone with an unknown key, whose nickname is &#8220;AliceLiddell&#8221;, which happens to be the name of a co-worker.</p>

	<blockquote>&#8220;AliceLiddell&#8221;: yo, this is alice<br />
You: hi alice, what&#8217;s up?<br />
<em>You add this identity to your friends-list.</em><br />
Alice: i need the admin password to the web server to fix a template<br />
You: oh ok, it&#8217;s wend4743kt<br />
Alice: kthxbye</blockquote>


	<p>Fifteen minutes later your company&#8217;s website is pwned by the hacker who posed as Alice. All he had to do was create a new identity with her name as the nickname, and pretend to be her.</p>

	<p>How do we get around this? You might think that asking questions before accepting someone&#8217;s claimed identity would help, and it does help with spoofing, but there are nastier attacks.</p>

	<h3>Man-In-The-Middle</h3>

	<blockquote>&#8220;AliceLiddell&#8221;: yo, this is alice<br />
You: You haven&#8217;t contacted me before &#8230; how do I know you?<br />
&#8220;AliceLiddell&#8221;: i&#8217;m down the hall next door to brad. i need to ask you a question but you&#8217;re not in the office today.<br />
You: yeah, i&#8217;m working from home. sorry to be paranoid, but what&#8217;s the poster on your wall say?<br />
&#8220;AliceLiddell&#8221;: it used to say &#8220;hang in there baby&#8221; but i took it down when lolcats started getting too popular =)<br />
<em>You add this identity to your friends-list.</em><br />
You: cool &#8230; hi alice, what&#8217;s up?<br />
&#8230;</blockquote>

	<p>Having established that this is really Alice, you go on to give her the password &#8230; and fifteen minutes later your company&#8217;s website gets pwned anyway. What went wrong? Well, it really <em>was</em> Alice you were talking with; but the hacker was able to listen in and read the password. Wasn&#8217;t the industrial-strength 2048-bit <span class="caps">RSA</span> encryption supposed to prevent this?</p>

	<p>The problem is that you and Alice were <em>talking</em> with each other; but you weren&#8217;t directly <em>connected</em> to each other. Instead each of you was connected to the hacker, who was relaying your messages back and forth. In this scenario what probably happened was that Alice tried to look you up by your name, found the hacker&#8217;s fake account instead, and the hacker&#8217;s computer then quickly created an identity with the same nickname as Alice, connected to the real you using that identity, and started forwarding your messages to each other while recording them itself.</p>

	<p>What&#8217;s even worse: That identity you added to your friend-list as Alice? It&#8217;s really the hacker&#8217;s identity. From now on the hacker can talk directly to you and you&#8217;ll probably assume it&#8217;s Alice.</p>

	<h3>How Do We Solve This?</h3>

	<p>The man-in-the-middle attack is resistant to nearly any kind of <em>in-band</em> verification. You can ask Alice any personal questions you want, but it won&#8217;t reveal that you&#8217;re not connected directly to Alice. You can ask Alice to type in her public key, but the hacker can edit her reply and substitute the key he&#8217;s using to connect to you.</p>

	<p>About the only practical way to solve this, unfortunately, is to use an <em>out-of-band</em> channel. You need to talk with the real Alice and compare notes, before you can trust that her digital identity belongs to her. All you have to do, really, is get her <em>real</em> public key and compare it to the key you&#8217;re communicating with. (And she has to do likewise, of course.)</p>

	<p>The canonical way to do this is to meet Alice in person and swap public keys. (PGP users call this a &#8220;key-signing party&#8221;.) Or you and Alice can read your keys to each other over the phone (or Skype, or an iChat video conference.) Sending the keys over IM is somewhat less reliable, but reliable enough for many purposes, since forging centralized IMs is a fairly involved task.</p>

	<p>Of course, we don&#8217;t want to read 512 hexadecimal digits to each other! One optimization is to compare secure hashes of the keys (as <span class="caps">PGP</span> does), but that&#8217;s still 40 digits. And those &#8220;B&#8221;s and &#8220;D&#8221;s are so easy to mix up over the phone.</p>
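
	<p>[The digit counts above are easy to check. In this Python sketch, 256 random bytes stand in for a 2048-bit public key (a real encoded key carries a bit more than the raw modulus), and the hash is <span class="caps">SHA</span>-1.]</p>

```python
import hashlib
import os

key = os.urandom(2048 // 8)                  # stand-in for a 2048-bit key
fingerprint = hashlib.sha1(key).hexdigest()  # 160-bit hash as hex text

print(len(key.hex()))      # 512 hex digits: 2048 bits / 4 bits per digit
print(len(fingerprint))    # 40 hex digits:  160 bits / 4 bits per digit
```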

	<h2>2. What Cloudy Does.</h2>

	<p><div style="text-align:center;"><img src="http://mooseyard.com/Jens/wp-content/uploads/2008/04//CloudyAlert.png" alt="CloudyAlert.png" border="0" width="500" height="306" /></div></p>

	<p>Cloudy&#8217;s verification scheme is blatantly stolen from the one used in Bryan Ford <em>et al</em>&#8217;s <a href="http://pdos.csail.mit.edu/papers/uia:osdi06/#SECTION00021000000000000000" title="">Unmanaged Internet Architecture</a>. Instead of making you read a number as a string of digits, Cloudy converts it into a three-word phrase by mapping consecutive chunks of bits into words in an English dictionary, moreover a dictionary that&#8217;s been <a href="http://tothink.com/mnemonic/wordlist.html" title="">specially constructed</a> of words that are easy to recognize and hard to mix up.</p>

	<p>And instead of making you listen to the words and type them in, Cloudy (like <span class="caps">UIA</span>) presents a short list of phrases with radio buttons, for you to pick from. One of them is the one that will be correct if the connection is genuine, the others are chosen at random, and there&#8217;s a catch-all &#8220;None of the above&#8221; at the end. If the user didn&#8217;t select the expected phrase, something&#8217;s wrong.</p>

	<p><div style="text-align:center;"><img src="http://mooseyard.com/Jens/wp-content/uploads/2008/04//CloudyVerification.png" alt="CloudyVerification.png" border="0" width="495" height="364" /></div></p>

	<p>(An aside: the phrase only encodes 32 bits, which is far less than even the <span class="caps">SHA</span>-1 hash. Just hashing the key down to 32 bits would not be secure enough; instead Cloudy creates a one-time 32-bit key by combining the public key with a randomly-chosen integer that&#8217;s sent to the other peer at the time of verification.)</p>
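
	<p>[Here is a hedged Python sketch of how such a phrase could be derived and presented. Everything specific in it is an assumption rather than what Cloudy or <span class="caps">UIA</span> actually does: the use of <span class="caps">SHA</span>-256 to mix the key with the random value, the 11 + 11 + 10 bit split, the placeholder 2048-entry word list, and the number of decoy phrases.]</p>

```python
import hashlib
import secrets

# Placeholder vocabulary; a real list would hold 2048 distinct, easy-to-say words.
WORDS = [f"word{i:04d}" for i in range(2048)]


def one_time_code(public_key: bytes, nonce: bytes) -> int:
    """Mix the key with a fresh random value and keep 32 bits of the digest."""
    digest = hashlib.sha256(nonce + public_key).digest()
    return int.from_bytes(digest[:4], "big")


def phrase(code: int) -> str:
    """Split 32 bits into 11 + 11 + 10 bit chunks, one word per chunk."""
    chunks = (code >> 21, (code >> 10) % 2048, code % 1024)
    return " ".join(WORDS[c] for c in chunks)


def choices(correct: str, decoys: int = 4) -> list[str]:
    """Hide the expected phrase among random ones, then add a catch-all."""
    options = {correct}
    while len(options) != decoys + 1:
        options.add(phrase(secrets.randbits(32)))
    shuffled = list(options)
    secrets.SystemRandom().shuffle(shuffled)
    return shuffled + ["None of the above"]


nonce = secrets.token_bytes(8)            # sent to the other peer
expected = phrase(one_time_code(b"k" * 256, nonce))
for option in choices(expected):
    print("( )", option)
```

	<p>Both peers compute the phrase from the same key and random value, so over a genuine connection the phrase one person reads aloud is the one the other person finds in the list.</p>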

	<p>Ford points out another benefit of this interface: &#8220;its multiple-choice design prevents users from just clicking &#8216;OK&#8217; without actually comparing the keys&#8221;, which defeats the user&#8217;s damnable tendency to just <a href="http://www.macworld.com/article/132910/2008/04/pubsubagent.html" title="">dismiss all security-related alerts</a>.</p>

	<p>Once this is done and you&#8217;ve chosen the right verification phrase, Cloudy adds the other person&#8217;s public key/identity to your &#8220;contact list&#8221; in its persistent storage. You can then decide to associate that key with an entry in your Address Book. Cloudy also mints a &#8220;relationship&#8221; certificate attesting that you have verified the other person&#8217;s identity; you can choose to annotate the relationship with <a href="http://gmpg.org/xfn/11" title=""><span class="caps">XFN</span></a> tags like &#8220;friend&#8221; or &#8220;co-worker&#8221;. These certs can be passed to other friends to transitively extend trust.</p>

	<p>How well does this user interface work? Cloudy hasn&#8217;t seen much real-world use yet, but I&#8217;ve gone through the initial setup with a half-dozen people, and the verification (once I debugged it!) is quite easy to follow and takes only ten seconds or so.</p>

	<h2>3. Is This Too Paranoid?</h2>

	<p>One of the unpleasant side effects of learning too much about computer security is that you start to become paranoid. You <a href="http://www.arrod.co.uk/essays/matrix.php" title="">swallow the red pill</a> of the Internet and discover how much we take for granted, how much trust we implicitly place in things that are not trustworthy: domain names, centralized databases, passwords, emails. In severe cases, you start to self-identify as a cypherpunk and refuse to connect to any server through fewer than three anonymizing proxies. It&#8217;s a bit like <a href="http://en.wikipedia.org/wiki/Medical_students_disease" title="">Medical Student Syndrome</a>.</p>

	<p>On the other hand, I think a lot of this paranoia is justified. I remember the old days, when &#8220;spam&#8221; was just a Monty Python sketch and you could trust the &#8220;From:&#8221; line of an email. Nowadays most of the emails we get have forged senders, and even a message that <em>sounds</em> like it came from a friend might have been sent by <a href="http://www.djchuang.com/2008/blubet-sent-you-a-special-gift-too/" title="">some shady social-networking site he foolishly uploaded his address book to</a>. Not too many people worry about domain names yet, but <span class="caps">DNS</span> is <a href="http://en.wikipedia.org/wiki/DNS_cache_poisoning" title="">not hard to mess with</a>, either by <a href="http://www.pcworld.com/article/id,140465-pg,1/article.html" title="">hackers</a> or by <a href="http://blog.wired.com/27bstroke6/2008/04/isps-error-page.html" title="">profit-motivated ISPs</a>.</p>

	<p>Eventually, anything that can be subverted, will. And since peer-to-peer software can&#8217;t use the standard brute-force obstacles (centralized authority, locked-down servers) to delay attacks, it has to rely on <em>actually being secure</em>. And that means public keys, encryption, webs of trust. As many have pointed out, if you make security an optional add-on to a product, hardly anyone will use it. (How many people do you know who sign or encrypt their email?) It needs to be built in by default. And the more our privacy is invaded by advertisers, ISPs, search engines, phishers, monopolistic content owners and the like, the more that <a href="http://www.shirky.com/writings/riaa_encryption.html" title="">drives the adoption of actually-secure software</a> by end-users.</p>

	<p>Having to go out-of-band and swap three-word verification codes with your buddies is an inconvenience. But you only have to do it once with any particular person; after that, Cloudy remembers their key. And I will probably, in the future, put in some form of transitive trust: if I haven&#8217;t verified you, but I verified Jean-Claude and he&#8217;s verified you [and signed a cert to that effect] then I&#8217;ll decide to trust you too.</p>
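	<p>As a rough sketch of what that one-hop rule could look like (purely hypothetical: the names and the dictionary-of-sets structure are assumptions, and a real implementation would check signed relationship certs, not a plain table):</p>

```python
# Hypothetical sketch of one-hop transitive trust; names and structure are
# assumptions, not Cloudy's actual data model.

def is_trusted(me, other, verified):
    """Return True if `me` verified `other` directly, or verified someone
    who in turn verified `other` (a single hop of transitive trust).

    `verified` maps each identity to the set of identities it has vouched for.
    """
    direct = verified.get(me, set())
    if other in direct:
        return True
    return any(other in verified.get(friend, set()) for friend in direct)


verified = {
    "me": {"jean-claude"},
    "jean-claude": {"you"},
    "you": {"stranger"},
}

print(is_trusted("me", "jean-claude", verified))  # direct verification
print(is_trusted("me", "you", verified))          # one hop via jean-claude
print(is_trusted("me", "stranger", verified))     # two hops away
```

	<p>Stopping at one hop is a deliberate choice in this sketch: every extra hop means trusting the judgment of someone you have never verified yourself.</p>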

	<p><strong>Next: Cloudy Gossip.</strong></p>
 ]]></content:encoded>
			<wfw:commentRSS>http://mooseyard.com/Jens/2008/04/cloudy-verification/feed/</wfw:commentRSS>
		</item>
		<item>
		<title>Discussing the SDK-that-dare-not-speak-its-name</title>
		<link>http://mooseyard.com/Jens/2008/04/discussing-the-sdk-that-dare-not-speak-its-name/</link>
		<comments>http://mooseyard.com/Jens/2008/04/discussing-the-sdk-that-dare-not-speak-its-name/#comments</comments>
		<pubDate>Thu, 24 Apr 2008 19:12:38 +0000</pubDate>
		<dc:creator>Jens Alfke</dc:creator>
		
	<category>Uncategorized</category>
		<guid isPermaLink="false">http://mooseyard.com/Jens/2008/04/discussing-the-sdk-that-dare-not-speak-its-name/</guid>
		<description><![CDATA[	WHEREAS the iP&#8226;&#8226;&#8226;e SDK is technically under NDA (even though anyone in the world can sign up and download it); and

	WHEREAS most members of Apple developer mailing lists are aware that we are not supposed to discuss anything about the iP&#8226;&#8226;&#8226;e SDK on those lists; and

	WHEREAS I and almost everyone else have good-naturedly gone along [...]]]></description>
			<content:encoded><![CDATA[	<p><span class="caps">WHEREAS</span> the iP&#8226;&#8226;&#8226;e <span class="caps">SDK</span> is technically under <span class="caps">NDA </span>(even though anyone in the world can sign up and download it); and</p>

	<p><span class="caps">WHEREAS</span> most members of Apple developer mailing lists are aware that we are not supposed to discuss anything about the iP&#8226;&#8226;&#8226;e <span class="caps">SDK</span> on those lists; and</p>

	<p><span class="caps">WHEREAS I</span> and almost everyone else have good-naturedly gone along with this annoying ban; but</p>

	<p><span class="caps">WHEREAS</span> more than a month has gone by, and there is still no forum for such discussion, not even one lousy mailing list, indicating that the secrecy-obsessed individuals at Apple who decreed this policy have not seen fit to do anything to help solve the problems it created for others; and</p>

	<p><span class="caps">WHEREAS</span> it is pretty plainly in Apple&#8217;s best interests for iP&#8226;&#8226;&#8226;e developers to pool their expertise so as to improve their products; then</p>

	<p><span class="caps">THEREFORE</span> be it resolved that I, for one, no longer find it important to humor the wishes of said bureaucrats.</p>

	<p>*****</p>

	<p>I don&#8217;t know that I&#8217;m about to start any new threads asking questions, as I&#8217;ve put my development for said platform on hold until such time as I am granted the ability to actually install software on it; but I&#8217;m not going to hold my tongue if questions that I happen to know the answer to come up on mailing lists.</p>

	<p>If worst comes to worst we can create a mailing list somewhere else and just not publicize it. (Of course, if that&#8217;s already been done, then can someone please email me the info?)</p>
 ]]></content:encoded>
			<wfw:commentRSS>http://mooseyard.com/Jens/2008/04/discussing-the-sdk-that-dare-not-speak-its-name/feed/</wfw:commentRSS>
		</item>
		<item>
		<title>Why They&#8217;re Doing This</title>
		<link>http://mooseyard.com/Jens/2008/04/why-theyre-doing-this/</link>
		<comments>http://mooseyard.com/Jens/2008/04/why-theyre-doing-this/#comments</comments>
		<pubDate>Sun, 20 Apr 2008 00:49:43 +0000</pubDate>
		<dc:creator>Jens Alfke</dc:creator>
		
	<category>Social Software</category>
	<category>Web</category>
		<guid isPermaLink="false">http://mooseyard.com/Jens/2008/04/why-theyre-doing-this/</guid>
		<description><![CDATA[I don't want to make a habit of replying on my blog to posts on other blogs, because (a) it's dorky in an autistic way, and (b) it only encourages the annoying practice of blogs that don't allow comments.

But I've seen a couple of references now to "Dean Allen's complaint about sites that offer multiple RSS feed formats":http://textism.com/2008/04/19/please.stop.doing.this, none offering comments, and since it directly relates to my past job monkeying with feeds I feel like I should answer.

There are two reasons why a web page would advertise multiple feeds.]]></description>
			<content:encoded><![CDATA[	<p>I don&#8217;t want to make a habit of replying on my blog to posts on other blogs, because (a) it&#8217;s dorky in an autistic way, and (b) it only encourages the annoying practice of blogs that stick their fingers in their ears.</p>

	<p>But I&#8217;ve seen a couple of references now to <a href="http://textism.com/2008/04/19/please.stop.doing.this" title="">Dean Allen&#8217;s complaint about sites that offer multiple <span class="caps">RSS</span> feed formats</a>, but no place to post a comment about it; and since it directly relates to my past job monkeying with feeds, I feel like I should answer.</p>

	<p>There are two reasons why a web page would link to multiple feeds.</p>

	<ol>
		<li>To support feed-readers that don&#8217;t understand every format. The <span class="caps">XML</span>-syndication-format field has a totally ludicrous history of incompatible versionitis, and the only format that&#8217;s actually sanely designed (Atom) is new enough that for a while some major clients, such as Bloglines, didn&#8217;t support it. So it&#8217;s reasonable during such a transition period to generate both formats.</li>
			<li>Because there might be feeds with different content. Some sites offer headlines-only feeds and full-content feeds. Some blogs offer a feed of all the comments on one post, as well as the usual feed of all the posts. Some wikis offer a feed of revisions of a single page.</li>
	</ol>

	<p>The problem for a client like Safari is that there isn&#8217;t a clear way to tell the difference between those two cases. When a site offers multiple feeds (by including multiple <span class="caps">LINK</span> tags in its <span class="caps">HTML</span>) do they have the same human-readable content or not? Answering that would require, at a minimum, downloading and parsing both feeds and trying to match the contents of the entries.</p>

	<p>The first version of Safari <span class="caps">RSS </span>(in Mac <span class="caps">OS X 10</span>.4) went for simplicity, by indicating in its UI only that there <em>was</em> a feed available, not how many. If the user pressed the &#8220;RSS&#8221; button, it would pick one of the feeds to switch to. The heuristic was to give priority both to order (picking the first feed listed) and to format (preferring Atom to <span class="caps">RSS</span>, because the format is much better-defined and less prone to nasty ambiguities and parsing problems.)</p>
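	<p>A heuristic of that general shape might be sketched as follows. This is illustrative only: how order and format were actually weighed against each other isn&#8217;t spelled out here, so the tie-breaking rule, the function name and the example URLs are all assumptions.</p>

```python
# Illustrative only: the exact rule Safari used is not spelled out above,
# so the tie-breaking here is an assumption.

ATOM = "application/atom+xml"
RSS = "application/rss+xml"


def pick_feed(links):
    """Pick one feed from (mime_type, href) pairs given in page order.

    Prefers the first Atom feed; otherwise falls back to the first feed listed.
    """
    if not links:
        return None
    for mime_type, href in links:
        if mime_type == ATOM:
            return href
    return links[0][1]


links = [
    (RSS, "http://example.com/feed/rss2"),
    (ATOM, "http://example.com/feed/atom"),
    (ATOM, "http://example.com/comments/atom"),
]
print(pick_feed(links))
```

	<p>Note that nothing in this sketch can tell the articles feed from the comments feed, which is exactly the limitation people ran into.</p>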

	<p>But some people complained that, on sites that offered multiple feeds with different content (like articles vs. comments) you couldn&#8217;t use the button to pick which one you wanted. So in Mac <span class="caps">OS X 10</span>.5 we decided to make the button into a pop-up if there were multiple feeds listed.</p>

	<p>The problem now is that you more commonly end up being offered a choice between two formats with identical content, which is what Dean and others complain about, because most blog engines are configured to offer both Atom and <span class="caps">RSS </span>(and sometimes multiple flavors of each.)</p>

	<p>Here&#8217;s where I <em>could</em> talk about the design problems of the feed auto-discovery mechanism, and describe better ways it could have been designed to support multiple feed formats [hint: <span class="caps">HTTP 1</span>.1 already supports content-type negotiation] or even improvements that could be made to reduce this annoyance in the future. But you know what? I don&#8217;t care about Trendy <span class="caps">XML</span>-Based Syndication Formats anymore. OK, all right, I clearly care enough to write a multi-paragraph explanation and post it to my blog. But not any more than that.</p>
 ]]></content:encoded>
			<wfw:commentRSS>http://mooseyard.com/Jens/2008/04/why-theyre-doing-this/feed/</wfw:commentRSS>
		</item>
		<item>
		<title>Cloudy Networking</title>
		<link>http://mooseyard.com/Jens/2008/04/cloudy-networking/</link>
		<comments>http://mooseyard.com/Jens/2008/04/cloudy-networking/#comments</comments>
		<pubDate>Fri, 18 Apr 2008 00:37:15 +0000</pubDate>
		<dc:creator>Jens Alfke</dc:creator>
		
	<category>Social Software</category>
	<category>Computers</category>
		<guid isPermaLink="false">http://mooseyard.com/Jens/2008/04/cloudy-networking/</guid>
		<description><![CDATA[Next I need to talk about networking; having an identity and minting certificates isn't very interesting until you can connect to someone else.

When one Cloudy peer wants to communicate with another one, it opens a TCP socket to its IP address —

[Hang on, there are two issues I suddenly glossed over in that last phrase. First, how did this peer find out the other's IP address? These are just random computers, not servers, so they don't have their own domain names or even stable addresses.]]></description>
			<content:encoded><![CDATA[	<p>Next I need to talk about networking; having an identity and minting certificates isn&#8217;t very interesting until you can connect to someone else.</p>

	<p><img src="http://mooseyard.com/Jed/Cloudy/Too%20Much%20Electricity.jpg" width="600" /></p>

	<h2>Point-to-Point Communications.</h2>

	<p>When one Cloudy peer wants to communicate with another one, it opens a <span class="caps">TCP</span> socket to its IP address &#8212;</p>

	<p>[Hang on, there are two issues I suddenly glossed over in that last phrase. First, how did this peer find out the other&#8217;s IP address? These are just random computers, not servers, so they don&#8217;t have their own domain names or even stable addresses. This is indeed a problem with any unstructured peer-to-peer network, but the solution involves things I won&#8217;t get to until the next installment, in an unfortunate but necessary violation of layering.]</p>

	<p>[Oh, and issue #2 is that most home computers are now behind <a href="http://en.wikipedia.org/wiki/Network_address_translation" title="">Network Address Translators</a> (usually some kind of WiFi base station or broadband router), which means they don&#8217;t have their own real IP addresses and can&#8217;t receive incoming connections. Fortunately, most NATs now <a href="http://en.wikipedia.org/wiki/UPNP" title="">support</a> <a href="http://en.wikipedia.org/wiki/NAT_Port_Mapping_Protocol" title="">protocols</a> that allow clients to open listening ports to the outside world, and doubly fortunately, Mac <span class="caps">OS X 10</span>.5 includes <a href="http://developer.apple.com/documentation/Networking/Reference/DNSServiceDiscovery_CRef/dns_sd/CompositePage.html#//apple_ref/c/func/DNSServiceNATPortMappingCreate" title="">an <span class="caps">API</span> for making such connections</a>. Cloudy opens such a port whenever it finds itself behind a <span class="caps">NAT</span>.]</p>

	<p>&#8212; and runs a protocol called <span class="caps">BEEP</span> over the socket.</p>

	<h3><span class="caps">BEEP</span>.</h3>

	<p><a href="http://beepcore.org" title=""><span class="caps">BEEP</span></a> is a sort of generic application protocol that multiplexes a <span class="caps">TCP</span> socket into multiple virtual channels, each of which can send and receive binary messages. It&#8217;s very handy for designing your own protocols, since it lets you focus on the high-level tasks of defining how your messages are encoded and when to send them and what to send in response.</p>
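	<p>To make the multiplexing idea concrete, here is a toy sketch. It is <em>not</em> the real <span class="caps">BEEP</span> wire format: the binary header layout and the function names are made up for illustration. It only shows how messages tagged with a channel number can share one byte stream and be sorted back out on the other end.</p>

```python
import struct
from collections import defaultdict

# Toy framing, NOT the real BEEP wire format: each frame is a 2-byte channel
# number and a 4-byte payload length (big-endian), followed by the payload.
HEADER = struct.Struct("!HI")


def encode_frame(channel, payload):
    """Prefix a payload with its channel number and length."""
    return HEADER.pack(channel, len(payload)) + payload


def demultiplex(stream):
    """Split one byte stream back into per-channel lists of messages."""
    channels = defaultdict(list)
    offset = 0
    while offset < len(stream):
        channel, length = HEADER.unpack_from(stream, offset)
        offset += HEADER.size
        channels[channel].append(stream[offset:offset + length])
        offset += length
    return dict(channels)


# Messages from two channels interleaved on a single "socket".
stream = (
    encode_frame(1, b"hello")
    + encode_frame(2, b"\x00\x01binary")
    + encode_frame(1, b"world")
)
print(demultiplex(stream))
```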

	<p>I&#8217;m using an open-source (LGPL) implementation of <span class="caps">BEEP</span> called <a href="http://vortex.aspl.es/" title="">Vortex</a>. Its <span class="caps">API</span> is in C, but I&#8217;ve written a Foundation-level Objective-C wrapper around it. (I&#8217;ll probably open-source that code sometime.)</p>

	<p>One nice feature of <span class="caps">BEEP</span> and Vortex is that they handle <span class="caps">SSL</span> for you. The <span class="caps">BEEP</span> protocol lets the two peers negotiate what type of <span class="caps">SSL</span> they support, before switching over to it, almost transparently to the application code. Since the first thing that happens in <span class="caps">SSL</span> setup is exchanging certificates, each instance of Cloudy immediately learns the identity of the peer it&#8217;s connecting to. (In normal <span class="caps">HTTP</span>-over-SSL, only the server has a certificate and the browser remains anonymous; but <span class="caps">SSL</span> supports bidirectional authentication and Cloudy uses it.) Unlike most client-server protocols, Cloudy has no need for a login: each peer has seen the other&#8217;s public key, and a successful <span class="caps">SSL</span> handshake proves that the other peer holds the matching private key, and hence owns that identity.</p>

	<p>So now the two peers are connected, they&#8217;ve identified and authenticated each other, and their communication channel is encrypted. They can now open <span class="caps">BEEP</span> channels and send each other messages across them. The primary types of messages Cloudy sends are signed objects (certificates); I&#8217;ll get into those later.</p>

	<h2>Local Area Discovery (Bonjour).</h2>

	<p>As you&#8217;d expect, Cloudy also uses <a href="http://en.wikipedia.org/wiki/Bonjour_%28software%29" title="">Bonjour</a>. This is somewhat orthogonal to <span class="caps">BEEP </span>&#8212;Bonjour&#8217;s a <em>discovery</em> protocol, so its main purpose is to let peers on the same <span class="caps">LAN</span> find out each other&#8217;s names and addresses. But Bonjour does support a thing called a <a href="http://files.dns-sd.org/draft-cheshire-dnsext-dns-sd.txt" title=""><span class="caps">TXT </span>Record</a>, which is a small chunk of arbitrary metadata that a service can associate with itself. For example, iChat stores your availability and status message in its Bonjour <span class="caps">TXT</span> record, which is how its Bonjour buddy list can show that information for everyone on your network.</p>
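	<p>On the wire, a <span class="caps">TXT</span> record is a sequence of short &#8220;key=value&#8221; strings, each preceded by a single length byte. A minimal sketch of encoding and decoding one follows; the keys and values are invented for illustration and are not the ones iChat or Cloudy actually use.</p>

```python
def encode_txt_record(entries):
    """Encode key/value pairs in the DNS-SD TXT record wire format:
    each "key=value" string is preceded by a single length byte."""
    record = b""
    for key, value in entries.items():
        item = key.encode("ascii") + b"=" + value.encode("utf-8")
        if len(item) > 255:
            raise ValueError("each TXT entry is limited to 255 bytes")
        record += bytes([len(item)]) + item
    return record


def decode_txt_record(record):
    """Walk the length-prefixed strings and split each at the first '='."""
    entries = {}
    offset = 0
    while offset < len(record):
        length = record[offset]
        item = record[offset + 1:offset + 1 + length]
        key, _, value = item.partition(b"=")
        entries[key.decode("ascii")] = value.decode("utf-8")
        offset += 1 + length
    return entries


# The keys below are made up for illustration, not iChat's or Cloudy's.
record = encode_txt_record({"status": "away", "msg": "Out to lunch"})
print(record)
print(decode_txt_record(record))
```

	<p>The one-byte length prefix is why each individual entry tops out at 255 bytes, so larger payloads have to be split across entries or kept small.</p>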

	<p>Remember the &#8220;CallingCard&#8221; I used as an example of a signed object in the last post? Well, that&#8217;s what Cloudy puts in its <span class="caps">TXT</span> record. The CallingCard contains your availability and status, but what&#8217;s really important is that it contains your public key, which is your identity.</p>

	<p>So Bonjour solves, at least on a <span class="caps">LAN</span>, the discoverability problem I pointed out at the start of this post. At this point, if the peer you want to send messages to is on the same network, Cloudy can easily find it via Bonjour, open a <span class="caps">BEEP</span> socket, and authenticate over <span class="caps">SSL</span>.</p>

	<p>What&#8217;s more, Cloudy&#8217;s view of who&#8217;s on the network is actually <em>trustworthy</em>. The CallingCard is <em>signed</em> with your private key, so anyone holding your public key can verify that you created it. iChat&#8217;s Bonjour IM has always been insecure in that there&#8217;s no way to tell whether anyone else is who they say they are: all you know about someone is their name, which they can easily change to anything they want by editing their address book. In Cloudy, on the other hand, once you&#8217;ve communicated with someone once, your app remembers their public key, and it can identify in the future whether a peer appearing on the network is that person or not. (To make this clear in the UI, the name of anyone you haven&#8217;t previously vouched for is shown &#8220;in quotes&#8221;.)</p>
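	<p>The &#8220;remember the key, quote the strangers&#8221; behavior could be sketched roughly like this. The in-memory dictionary and function names are hypothetical, and the sketch assumes the signature on the CallingCard has already been checked, so that the key really belongs to the peer presenting it.</p>

```python
import hashlib

# Hypothetical contact store: maps the SHA-1 digest of a verified public key
# to the name the local user knows that person by.
known_contacts = {}


def key_id(public_key):
    """Short identifier for a key: the hex SHA-1 digest of its bytes."""
    return hashlib.sha1(public_key).hexdigest()


def remember(public_key, name):
    known_contacts[key_id(public_key)] = name


def display_name(public_key, claimed_name):
    """Known keys get the stored name; unknown keys get their claimed
    name wrapped in quotes, since anyone can claim any name."""
    known = known_contacts.get(key_id(public_key))
    if known is not None:
        return known
    return '"' + claimed_name + '"'


alice_key = b"stand-in bytes for Alice's real public key"
impostor_key = b"stand-in bytes for some other key"

remember(alice_key, "Alice")
print(display_name(alice_key, "Alice"))     # stored name, no quotes
print(display_name(impostor_key, "Alice"))  # same claimed name, but quoted
```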

	<p>&#8212; Oops, I just skipped over a tricky problem again. The <em>first</em> time you connect to someone, how do you establish that the digital identifier you&#8217;re communicating with corresponds to the human being you think it is? This is surprisingly difficult to do, because it&#8217;s vulnerable to what cryptographers call the <a href="http://en.wikipedia.org/wiki/Man_in_the_middle_attack" title="">man-in-the-middle attack</a>. It&#8217;s worth a post by itself&#8230;</p>

	<p><b>Next: <a href="http://mooseyard.com/Jens/2008/04/cloudy-verification/" title="">Verifying Identities</a>.</b></p>
 ]]></content:encoded>
			<wfw:commentRSS>http://mooseyard.com/Jens/2008/04/cloudy-networking/feed/</wfw:commentRSS>
		</item>
		<item>
		<title>Cloudy Identity</title>
		<link>http://mooseyard.com/Jens/2008/04/cloudy-identity/</link>
		<comments>http://mooseyard.com/Jens/2008/04/cloudy-identity/#comments</comments>
		<pubDate>Wed, 16 Apr 2008 06:04:21 +0000</pubDate>
		<dc:creator>Jens Alfke</dc:creator>
		
	<category>Ideas</category>
	<category>Social Software</category>
	<category>Computers</category>
		<guid isPermaLink="false">http://mooseyard.com/Jens/2008/04/cloudy-identity/</guid>
		<description><![CDATA[At the root of Cloudy is the means for creating and establishing identity. A lot of peer-to-peer systems treat the peers mostly as interchangeable anonymous nodes, often deliberately so, but Cloudy is a _social_ system. Your Cloudy identity is simply a public key, currently 2048-bit RSA, generated the first time you launch the program. (The matching private key is stored securely in the Mac OS Keychain.) From then on, that public key uniquely identifies you.]]></description>
			<content:encoded><![CDATA[	<p><i>Continuing from <a href="http://mooseyard.com/Jens/2008/04/cloudy-as-buzzwords/" title="">the previous Cloudy post</a> &#8230; </i></p>

	<p><img src="http://mooseyard.com/Jed/Cloudy/WhatAStrangePlace.jpg" alt="" border="0" /></p>

	<p>At the root of Cloudy is the means for creating and establishing identity. A lot of peer-to-peer systems treat the peers mostly as interchangeable anonymous nodes, often deliberately so, but Cloudy is a <em>social</em> system.</p>

	<h2>Quick Crypto Recap.</h2>

	<p>The identity and security layers of Cloudy are tightly intertwined, because identity <em>without</em> security is useless. And security is accomplished entirely through cryptography, because the centralized alternatives like locking all of your servers up in a closet don&#8217;t apply. Cloudy doesn&#8217;t do anything new cryptographically (wisely so), but for the benefit of those who aren&#8217;t familiar with it, here&#8217;s a superficial overview of the off-the-shelf tools I&#8217;m using:</p>

	<h3>Cryptographic Hashes, or, Digests.</h3>

	<p>Like any hash algorithm, a cryptographic hash converts a block of data of arbitrary length into a short fixed-length output; the same input always produces the same output; and even the slightest change to the input should produce an entirely different output. Unlike a regular hash, two different inputs should <em>never</em> result in the same hash output. (That&#8217;s &#8220;never&#8221; in the practical sense: collisions are mathematically inevitable, but it should take impractically long, ideally millions of years, to find one.) And it should be infeasible to identify anything about the original data given only the hash.</p>

	<p>Cryptographic hashes are rather weird and neat. I&#8217;ve previously called them &#8220;the Dewey Decimal numbers for the Universal Library&#8221;. They also remind me of the scene in the old TV cartoon of &#8220;The Cat In The Hat&#8221;, where the Cat and kids are running around labeling every object in the house with cryptic identifiers like &#8220;QW-X12&#8221;. Digests are a bit longer than that (SHA-1 outputs 160 bits, i.e. 20 bytes) but it&#8217;s still very handy to have a compact label to identify any conceivable chunk of data.</p>
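	<p>These properties are easy to see for yourself. The following sketch uses Python&#8217;s standard <code>hashlib</code> (an illustration choice, not what Cloudy uses) to hash two inputs that differ by a single letter:</p>

```python
import hashlib

a = hashlib.sha1(b"The quick brown fox jumps over the lazy dog").digest()
b = hashlib.sha1(b"The quick brown fox jumps over the lazy cog").digest()

print(len(a))   # digest length in bytes
print(a.hex())
print(b.hex())

# Count how many of the 160 output bits differ between the two digests.
differing_bits = bin(int.from_bytes(a, "big") ^ int.from_bytes(b, "big")).count("1")
print(differing_bits)
```

	<p>Changing one letter of the input flips a large fraction of the output bits, which is what makes the digest useless for guessing anything about the original data.</p>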

	<h3>Public/Private Key Pairs, or, Asymmetric Keys</h3>

	<p>A regular cipher uses what&#8217;s called a <em>symmetric key</em>. The sender and receiver choose a single key that they both have to know, but keep secret. The sender inserts the key into the encryption algorithm and feeds the message in, and out comes the encrypted form. The receiver then inserts the same key into the decryption algorithm, feeds the encrypted data into it, and out comes the original message. The point is that they use a single key, and the one who generates the key has to somehow convey it secretly to the other party before they can use it, leading to an obvious chicken-and-egg problem.</p>

	<p>Asymmetric encryption algorithms, the best known of which is <span class="caps">RSA</span>, use <em>two</em> keys. The keys are generated together in a matched pair. When one key is used by the sender to encrypt data, it takes the <em>other</em> key of the pair to decrypt it.</p>

	<p>The genius of this is that you only have to keep one of the keys secret. The other one can be given away freely and is called the &#8220;public key&#8221;. If someone else has a copy of your public key, s/he can use your public key to encrypt a message to you. Remember, it doesn&#8217;t matter how people get your key; you can read it to them over the phone, email it, print it on a billboard. The encrypted message still can&#8217;t be read by anyone else, because only you have the matching <em>private</em> key to decrypt it. In other words, it becomes possible to send secure messages without having to share a secret key in advance.</p>
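	<p>Here is a toy illustration of the idea using &#8220;textbook&#8221; <span class="caps">RSA</span>. The primes are small, well-known Mersenne primes and there is no padding, so this is nowhere near secure; it only demonstrates that what one key of the pair locks, the other unlocks. Real code should use a vetted crypto library.</p>

```python
import math

# Toy textbook RSA for illustration only: no padding, and primes this small
# (and this well known) offer no security. Real code should use a vetted library.
p = 2**127 - 1          # two known Mersenne primes
q = 2**107 - 1
n = p * q               # public modulus
e = 65537               # public exponent
phi = (p - 1) * (q - 1)
assert math.gcd(e, phi) == 1
d = pow(e, -1, phi)     # private exponent (needs Python 3.8+)

public_key = (e, n)
private_key = (d, n)


def encrypt(message, key):
    """Turn the message into an integer and raise it to the key's exponent."""
    exponent, modulus = key
    m = int.from_bytes(message, "big")
    assert m < modulus, "message too long for this toy key"
    return pow(m, exponent, modulus)


def decrypt(ciphertext, key):
    """Apply the other key's exponent and convert back to bytes."""
    exponent, modulus = key
    m = pow(ciphertext, exponent, modulus)
    return m.to_bytes((m.bit_length() + 7) // 8, "big")


ciphertext = encrypt(b"meet me at noon", public_key)
print(decrypt(ciphertext, private_key))
```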

	<h3>Digital Signatures.</h3>

	<p>There&#8217;s another use for key-pairs. You can use your private key to generate a <em>signature</em> of a message, a small block of data that can be attached to it. Anyone who has your public key can then use it to <em>verify</em> the signature, i.e. prove that you generated the signature of that message with your private key. No one else could have generated the signature. In other words, as the name &#8220;signature&#8221; implies, this is a way to unforgeably mark a document to show your authorship or approval.</p>

	<p>(A digital signature is really just a cryptographic hash of the document, which is then encrypted with the private key. To verify someone&#8217;s signature you just decrypt it using their public key, then compute your own hash of the document and compare the two results.)</p>
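	<p>The hash-then-transform recipe in that parenthetical can be sketched in a few lines. Again this is toy textbook <span class="caps">RSA</span> with tiny well-known primes and no padding scheme, so it illustrates the mechanics only; the function names are made up.</p>

```python
import hashlib

# Toy textbook RSA: tiny well-known primes and no padding, so not secure.
p, q = 2**127 - 1, 2**107 - 1
n, e = p * q, 65537
d = pow(e, -1, (p - 1) * (q - 1))   # private exponent (needs Python 3.8+)


def digest_as_int(document):
    """SHA-1 digest of the document as an integer (always below n here)."""
    return int.from_bytes(hashlib.sha1(document).digest(), "big")


def sign(document, private_exponent):
    """Hash the document, then transform the hash with the private key."""
    return pow(digest_as_int(document), private_exponent, n)


def verify(document, signature, public_exponent):
    """Undo the transformation with the public key and compare hashes."""
    return pow(signature, public_exponent, n) == digest_as_int(document)


doc = b"I owe Jean-Claude five dollars."
sig = sign(doc, d)
print(verify(doc, sig, e))
print(verify(b"I owe Jean-Claude five hundred dollars.", sig, e))
```

	<p>The second check fails because the tampered document hashes to a different value than the one locked inside the signature.</p>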

	<h3>Digital Certificates.</h3>

	<p>A <em>certificate</em> is, in general, just a signed document that attests to something. Usually it&#8217;s vouching for the identity of the owner of another key; something like &#8220;the owner of the public key 3FD8B640 is Joe-Bob Briggs, of Dallas TX, joebob@example.com. Signed, Verisign.com.&#8221; This is the standard way that trust gets spread around in a distributed system.</p>

	<h2>Back To Cloudy: Generating An Identity.</h2>

	<p>Your Cloudy identity is simply a public key, currently 2048-bit <span class="caps">RSA</span>, generated the first time you launch the program. (The matching private key is stored securely in the Mac <span class="caps">OS </span>Keychain.) From then on, that public key uniquely identifies you. It&#8217;s <em>unique</em> because it&#8217;s randomly picked from a space so large that the possibility of collisions is for all practical purposes zero. It <em>identifies</em> you because it can be used by others to verify anything you signed with the matching private key.</p>

	<p>[2048 bits (256 bytes) is somewhat bulky to use as an identifier; so a public key can be run through an <span class="caps">SHA</span>-1 hash and converted into a 160-bit (20-byte) digest form that&#8217;s, for all intents and purposes, equally unique. (2<sup>160</sup> is about 10<sup>48</sup>, within a couple of orders of magnitude of the number of atoms in the Earth.) The digest, however, has no cryptographic value to a recipient who doesn&#8217;t already have the public key, so it&#8217;s not a <em>secure</em> identifier by itself.]</p>
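	<p>A quick sketch of the size difference and the arithmetic, using random bytes as a stand-in for a real key (which would come from the keychain, not from <code>os.urandom</code>):</p>

```python
import hashlib
import os

# Stand-in for the 256 bytes of a 2048-bit public key; a real key would come
# from the keychain, not from os.urandom.
public_key_bytes = os.urandom(256)

key_digest = hashlib.sha1(public_key_bytes).digest()
print(len(public_key_bytes), "bytes of key")
print(len(key_digest), "bytes of digest:", key_digest.hex())

# Size of the 160-bit identifier space, in scientific notation.
print(f"{2**160:.3e}")
```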

	<p>The first thing the new key is used for is to mint an <span class="caps">SSL</span> certificate, which will be used for identification when you communicate with other peers over <span class="caps">SSL</span> sockets. It&#8217;s a &#8220;self-signed&#8221; certificate because it doesn&#8217;t contain a signature from any trusted higher authority (there aren&#8217;t any). But that&#8217;s OK: when Cloudy peers connect, they only need to make sure of the identities they&#8217;re contacting, which are literally just the public keys in the certificates.</p>

	<h2>Cloudy Certificates.</h2>

	<p><span class="caps">SSL</span> unfortunately uses an awkward and complex certificate format called X.509, which is one of two evolutionary relics left over from a long-dead overly-ambitious network architecture called X.500. (The other one is the <span class="caps">LDAP</span> directory protocol; the fact that the L stands for &#8220;Lightweight&#8221; gives you an idea of how comparatively elephantine X.500 must have been.) Most cryptographic experts seem to hate X.509: Ferguson and Schneier&#8217;s <a href="http://www.amazon.com/Practical-Cryptography-Niels-Ferguson/dp/0471223573/ref=pd_bbs_sr_1?ie=UTF8&#038;s=books&#038;qid=1208239300&#038;sr=8-1" title="">Practical Cryptography</a> flatly recommends avoiding it if possible, and Peter Gutmann&#8217;s <a href="http://www.cs.auckland.ac.nz/~pgut001/pubs/pkitutorial.pdf" title="">overview</a> is a masterful takedown, with sarcasm worthy of Doug Piranha.</p>

	<p>After spending a week painfully figuring out how to generate a goddamn trivial self-signed cert, even with the help of state-of-the-art system APIs, I could understand what the experts meant. I didn&#8217;t want to use X.509 anymore. And it wasn&#8217;t flexible enough anyway, since it was designed around the idea of hierarchical authorities. Unfortunately I didn&#8217;t have a choice for <span class="caps">SSL</span>, but I went with an alternate approach for all the other certs Cloudy peers use when talking to each other.</p>

	<p>I really liked the approach taken by <a href="http://people.csail.mit.edu/rivest/sdsi11.html" title=""><span class="caps">SDSI</span></a>, a distributed identity system from about 10 years ago that never took off. It defined a simple textual syntax for certificates. <span class="caps">SDSI</span> used <span class="caps">LISP</span>-like S-expressions as the syntax, but the details aren&#8217;t important&#8212;I took the abstract concepts and went with something I found more readable. I tried <a href="http://json.org" title=""><span class="caps">JSON</span></a> first, but found it too limiting, so I ended up using <a href="http://yaml.org" title=""><span class="caps">YAML</span></a>.</p>

	<p>[YAML is a data serialization syntax; it&#8217;s language-agnostic, but most popular in the Ruby community. Its main advantages over <span class="caps">JSON </span>(or <span class="caps">OS X</span> property lists) are a richer set of data types, custom typing for collections (i.e. you can say &#8220;this array is a Rectangle&#8221; or &#8220;this dictionary is a Person&#8221;), and the ability to represent arbitrary object graphs, not just trees. You can think of it as being like a pretty syntax for Cocoa object archives or Java object serialization.]</p>

	<p>All I had to do (aside from writing a good Cocoa wrapper <span class="caps">API</span> for <span class="caps">YAML</span>) was define a schema for representing things like keys and signatures in <span class="caps">YAML</span>. Then I could use those to define my own signed certificate objects.</p>

	<h3><span class="caps">A YAML </span>Certificate Example.</h3>

<code><pre>
--- !cloudy/CallingCard
host: 76.191.199.123
prof: 229474364
port: 60507
stat: 4
signature: !cloudy/Signature
  signed: !binary |-
    oVCuVVlXPEdRPR+gy1k/UNOXtwvcN7LNpK6xTcA/hmlKh6uIT56E19LxWzA7POxm
    nhc351NVdoKC9XaUVsaZYDOnp2wWEWLUtdYYA8I++NZZIVlCHOjHCHr7mcfNcceD
    v+15RE9vguQ/PO1yaOU4DlviYt75y7xKMRs5REbZss6E/mr+0r1KE+f73dpHCVoD
    SW0azTD43pug2Pyh2Kar0GHXQcS4Iq/Y2nRFv7wyLUUmyVA7XI665a8QjMCiec2w
    0PqQ32FwGBYkH/iR/cfmaKjuwjAbW/qo7NoTH6WSFQy2ua/PVQs9B+dyjnZ5Z30E
    rnl9UTCVwjUmCc8J4hoaTQ==
  digest: !cloudy/Digest pFCzUK7yuO0dWtm0oATB7ag6vj0=
  date: 2008-04-15 21:55:46.830 -07:00
  expires: 21600
  signer: !cloudy/Person
    nickname: snej
    publicKey: !cloudy/PublicKey
      algorithm: 42
      format: 1
      bits: 2048
      data: !binary |-
        MIIBCgKCAQEApP6/D5aZm7nYfGwSMD3xQCCWw+XeU1NmZE7N/7eHvQlCUHMS8Aac
        Wh+s/PlPd1o7k+YePhoHnc1vR9uAfWm8iowiUU0RluUNxY0dRkTauRqeYM6//s+5
        ZXuh27pDDq2BgQYPL6EOp2UtWSQ/ojQjqX2/sGMkZ3k+uYiu1ZGQS2s0xTHPkgtu
        VI+Kg2TBY/28zAG4H/seUHNAP+frlpX+fizSC2oYNdREpEcVcVacHMQGwrj3mAr7
        g/LpJTnWgZhiJYvp7c4MkAYfHOIbKIXeXrF8oOz0EwgwSp0ZWkezuIYa4BMAns52
        WYK3LooQ+GttPIdVhSzzhLlY3psLeOf6nQIDAQAB
</pre></code>

	<p>This represents a &#8220;CallingCard&#8221; object, which a peer broadcasts in order to tell other peers that it&#8217;s online, and where it is on the network and what its current state is. (One of the places a CallingCard appears is in the <span class="caps">TXT</span> record of Cloudy&#8217;s Bonjour service.)</p>

	<p>The syntax is more complex than <span class="caps">JSON</span>, but still pretty easy to read:</p>

	<ul>
		<li>The first line says this is a dictionary structure whose higher-level type is &#8220;cloudy/CallingCard&#8221; (this gets mapped to the CallingCard class in my code.)</li>
		<li>The next four lines describe four key/value pairs in the dictionary. In a CallingCard these represent the IP address, timestamp of the user&#8217;s current &#8220;profile&#8221;, port number, and online status (4 = &#8220;Available&#8221;). The keys are four letters long just to save some room and because I get nostalgic for OSTypes sometimes.</li>
		<li>Line 6 assigns the &#8220;signature&#8221; key to a nested dictionary of type &#8220;cloudy/Signature&#8221;. This dictionary is the digital signature of the enclosing object.</li>
		<li>The &#8220;signed&#8221; attribute of the signature is the raw <span class="caps">RSA</span> signature data.</li>
		<li>The &#8220;digest&#8221; attribute is the <span class="caps">SHA</span>-1 digest of the object being signed, in this case the enclosing CallingCard, <em>ignoring the &#8220;signature&#8221; attribute</em>.</li>
		<li>The &#8220;date&#8221; attribute is the timestamp of the moment the signature was generated.</li>
		<li>The &#8220;expires&#8221; attribute is the lifetime of the signature, in seconds, starting from the &#8220;date&#8221;. After this interval the signature expires, and the signed object loses its validity and will generally be deleted, or at least not passed on to other peers anymore.</li>
		<li>The &#8220;signer&#8221; attribute is a cloudy/Person object, the identity who generated the signature.</li>
		<li>&#8220;nickname&#8221; is a brief human-readable name for this identity. It doesn&#8217;t really mean anything; it&#8217;s just useful as a default name to display (like an <span class="caps">AIM</span> handle in a buddy list) if the local user hasn&#8217;t set up a customized name.</li>
		<li>&#8220;publicKey&#8221; is the identity&#8217;s public key, the actual unique identifier.</li>
		<li>&#8220;algorithm&#8221; identifies the type of key (RSA), &#8220;format&#8221; identifies the format of the key data (PEM, I think), &#8220;bits&#8221; is the number of bits in the key, and &#8220;data&#8221; is the key data itself.</li>
	</ul>
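
	<p>The &#8220;date&#8221;/&#8220;expires&#8221; rule above can be sketched in a few lines of Python. This is only an illustration, not Cloudy&#8217;s code: the function name is hypothetical, and it assumes the date has already been parsed into a timezone-aware datetime and that the signature counts as expired at exactly the end of the interval.</p>

```python
from datetime import datetime, timedelta, timezone


def signature_expired(date, expires, now=None):
    """Return True once `expires` seconds have passed since `date`.

    `date` is the signature's timestamp, `expires` its lifetime in seconds.
    (Hypothetical helper; treating the exact boundary as expired is an assumption.)
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return now >= date + timedelta(seconds=expires)


signed_at = datetime(2008, 4, 1, 12, 0, tzinfo=timezone.utc)
one_day = 86400  # lifetime in seconds

print(signature_expired(signed_at, one_day, signed_at + timedelta(hours=1)))  # False
print(signature_expired(signed_at, one_day, signed_at + timedelta(days=2)))   # True
```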

	<p>This does look a bit verbose when written out, but of course usually you&#8217;d never see this unless you were debugging something. (And it compresses well, by about 50% using gzip.) One space-saving feature that doesn&#8217;t show up here is that, if the same object appears more than once, it&#8217;s only written out once; after that it appears as a short reference back to the definition. So if I <span class="caps">YAML</span>-encode an array of signed objects (which is very common), my cloudy/Person data only appears once.</p>

	<h3>A Nasty Detail: Canonical Form</h3>

	<p>I glossed over an important detail: to sign an object you have to compute a digest of it, and to compute a digest you have to be able to express the object as raw data. Clearly the raw data in this case is the <span class="caps">YAML</span> encoding of the object, right?</p>

	<p>The catch is that in <span class="caps">YAML</span>, as with any human-readable syntax, there are many different ways to write out the same object. I can change the indentation, I can change the line breaks, I can list dictionary attributes in a different order. Any of those changes causes the resulting digest to look completely different.</p>

	<p>This is a real problem because, if I read a signed object from <span class="caps">YAML</span> into a native object and then write it back out to <span class="caps">YAML</span>, it&#8217;s likely to come out slightly differently. For example, if I write it as part of an array, then as a nested element its lines will be indented. Also, the way dictionaries are stored in hash tables means that their keys come out in unpredictable orders when iterated. But if that happens, the digest changes, which invalidates the signature.</p>

	<p>So any certificate syntax has to define a single standard (&#8220;canonical&#8221;) encoding of an object into binary data. In my <span class="caps">YAML</span> code I had to enable a &#8220;canonical mode&#8221; that, when turned on, causes a specific set of spacing rules to be used, dictionary and set entries to be written in alphabetical order, et cetera. This mode isn&#8217;t normally used, but it has to be turned on when computing the digest of an object, in order to sign it or to verify a signature.</p>
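
	<p>To make the idea concrete, here is a minimal Python sketch of a canonical encoding feeding a digest. It is not the canonical <span class="caps">YAML</span> mode described above: it swaps in <span class="caps">JSON</span> with sorted keys and fixed separators as a stand-in, and the function names and the four-letter keys in the sample data are made up for illustration.</p>

```python
import hashlib
import json


def canonical_bytes(obj):
    """Encode obj so that equal objects always produce identical bytes.

    Stand-in for a canonical YAML mode: sort_keys pins the dictionary
    order, and the fixed separators remove all optional whitespace.
    """
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def digest_for_signing(obj):
    """SHA-1 of the canonical form, ignoring any "signature" attribute."""
    unsigned = {k: v for k, v in obj.items() if k != "signature"}
    return hashlib.sha1(canonical_bytes(unsigned)).hexdigest()


# The same dictionary, built in two different key orders (hypothetical keys).
a = {"addr": "10.0.1.2", "port": 1234, "stat": 4}
b = {"stat": 4, "port": 1234, "addr": "10.0.1.2"}

print(canonical_bytes(a) == canonical_bytes(b))        # True
print(digest_for_signing(a) == digest_for_signing(b))  # True
```

	<p>Without the sorting and fixed spacing, the two dictionaries could serialize differently and produce unrelated digests, which is exactly the failure described above.</p>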

	<p>[Incidentally, one of the reasons that digital signatures aren&#8217;t being used much in the various trendy <span class="caps">XML</span>-based data formats, like <span class="caps">RSS</span> and Atom, is that <span class="caps">XML</span> is much more difficult to canonicalize. I don&#8217;t understand all of the details, but they looked nasty enough that I was glad to rule out using <span class="caps">XML</span>.]</p>

	<h3>Verifying A Signed Object</h3>

	<p>When you verify the signature of a block of <span class="caps">YAML</span> like the above, you have to do this:</p>

	<ol>
		<li>Parse the <span class="caps">YAML</span> into a graph of native objects.</li>
			<li>Take the root Signed object, remove the &#8220;signature&#8221; attribute, and write it back into <span class="caps">YAML</span> in &#8220;canonical mode&#8221;.</li>
			<li>Compute the digest of that canonical <span class="caps">YAML</span>.</li>
			<li>Compare the digest with the &#8220;digest&#8221; attribute of the Signature. If it doesn&#8217;t match, the object&#8217;s been tampered with (or damaged) and should be ignored.</li>
			<li>Otherwise, write the &#8220;Signature&#8221; object (minus its &#8220;signed&#8221; attribute, which can&#8217;t be part of the data it signs) back into canonical <span class="caps">YAML</span> and compute the digest.</li>
			<li>Decrypt the &#8220;signed&#8221; attribute using the public key in the &#8220;signer.publicKey&#8221; attribute.</li>
			<li>Compare the result with the digest you just computed. If it doesn&#8217;t match, the signature was forged (or damaged.)</li>
			<li>Otherwise, the signature is valid and the outer Signed object can be treated as being definitively created by the Person listed in the Signature.</li>
	</ol>
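
	<p>The verification steps can be sketched roughly as follows in Python. Treat this as a toy under loud assumptions, not Cloudy&#8217;s implementation: canonical <span class="caps">JSON</span> stands in for canonical <span class="caps">YAML</span>, the <span class="caps">RSA</span> key is a tiny textbook one with no padding (useless for real security), the digest is reduced modulo that key so it fits, and every function name and key layout here is hypothetical.</p>

```python
import hashlib
import json

# Toy textbook RSA key (p=61, q=53). Illustration only: far too small to be secure.
N, E, D = 3233, 17, 2753


def sha1_hex(obj):
    """SHA-1 of a canonical encoding (sorted-key JSON standing in for canonical YAML)."""
    data = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha1(data).hexdigest()


def digest_as_int(obj, n):
    """Reduce the digest modulo n so the toy key can handle it (an assumption)."""
    return int(sha1_hex(obj), 16) % n


def sign(obj, signer, n, d):
    """Attach a signature dictionary to obj using the private exponent d."""
    sig = {"digest": sha1_hex(obj), "signer": signer}
    sig["signed"] = pow(digest_as_int(sig, n), d, n)
    return dict(obj, signature=sig)


def verify(signed_obj):
    sig = signed_obj.get("signature")
    if sig is None:
        return False
    # Steps 2-4: re-encode the object without its signature, compare digests.
    body = {k: v for k, v in signed_obj.items() if k != "signature"}
    if sha1_hex(body) != sig["digest"]:
        return False  # tampered with or damaged
    # Steps 5-7: digest the Signature minus "signed", then check it against
    # the value recovered from "signed" with the signer's public key.
    sig_body = {k: v for k, v in sig.items() if k != "signed"}
    key = sig["signer"]["publicKey"]
    recovered = pow(sig["signed"], key["e"], key["n"])
    return recovered == digest_as_int(sig_body, key["n"])


person = {"nickname": "snej", "publicKey": {"n": N, "e": E}}
card = sign({"port": 1234, "stat": 4}, person, N, D)

print(verify(card))                   # True
print(verify(dict(card, port=9999)))  # False
```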

	<h2>Whew.</h2>

	<p>OK, so we can create secure identities, encrypt stuff, and sign arbitrary objects. Now what do we do with them? The CallingCard example above should give you some ideas, but I&#8217;ll go into more detail in the next &#8216;thrilling&#8217; installment.</p>

	<p><strong>Next: <a href="http://mooseyard.com/Jens/2008/04/cloudy-networking/" title="">Networking</a>.</strong></p>
 ]]></content:encoded>
			<wfw:commentRSS>http://mooseyard.com/Jens/2008/04/cloudy-identity/feed/</wfw:commentRSS>
		</item>
	</channel>
</rss>
