<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>Ikai Lan says</title>
	<atom:link href="http://ikaisays.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://ikaisays.com</link>
	<description>I say things!</description>
	<lastBuildDate>Mon, 29 Apr 2013 17:56:21 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='ikaisays.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://0.gravatar.com/blavatar/4820fec29ccabd5dd64221099916101b?s=96&#038;d=http%3A%2F%2Fs2.wp.com%2Fi%2Fbuttonw-com.png</url>
		<title>Ikai Lan says</title>
		<link>http://ikaisays.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://ikaisays.com/osd.xml" title="Ikai Lan says" />
	<atom:link rel='hub' href='http://ikaisays.com/?pushpress=hub'/>
		<item>
		<title>Why &#8220;The Real Reason Silicon Valley Coders Write Bad Software&#8221; is wrong</title>
		<link>http://ikaisays.com/2012/10/09/why-the-real-reason-silicon-valley-coders-write-bad-software-is-wron/</link>
		<comments>http://ikaisays.com/2012/10/09/why-the-real-reason-silicon-valley-coders-write-bad-software-is-wron/#comments</comments>
		<pubDate>Wed, 10 Oct 2012 07:14:24 +0000</pubDate>
		<dc:creator>Ikai Lan</dc:creator>
				<category><![CDATA[Software Development]]></category>

		<guid isPermaLink="false">http://ikaisays.com/?p=292</guid>
		<description><![CDATA[There was an article in The Atlantic this morning titled, “The Real Reason Silicon Valley Coders Write Bad Software” with the tagline, “If someone had taught all those engineers how to string together a proper sentence, Windows Vista would be a lot less buggy.” The author, Bernard Meisler, seems to think that the cause of “bad [...]]]></description>
				<content:encoded><![CDATA[<p>There was an article in <a href="http://www.theatlantic.com/">The Atlantic</a> this morning titled, <a href="http://www.theatlantic.com/national/archive/2012/10/the-real-reason-silicon-valley-coders-write-bad-software/263377/">“The Real Reason Silicon Valley Coders Write Bad Software”</a> with the tagline, “If someone had taught all those engineers how to string together a proper sentence, Windows Vista would be a lot less buggy.” The author, Bernard Meisler, seems to think that the cause of “bad software” is, and I quote:</p>
<blockquote><p>“But the downfall of many programmers, not just self-taught ones, is their lack of ability to sustain complex thought and their inability to communicate such thoughts. That leads to suboptimal code, foisting upon the world mediocre (at best) software like Windows Vista, Adobe Flash, or Microsoft Word.”</p></blockquote>
<p>Not only are the conclusions of the article inaccurate, but the article also paints a negative picture of software engineers that isn’t grounded in reality. For starters, there is a distinction between “bad” software and “buggy” software. Software that is bad tends to be a result of poor usability design. Buggy software, on the other hand, is a consequence of a variety of factors stemming from the complexity of modern software. The largest reduction in the number of bugs isn’t going to come from improving the skills of individual programmers, but rather from instituting quality control processes throughout the software engineering lifecycle.</p>
<h2><strong>Bad Software</strong></h2>
<p>Bad software is software that, for whatever reason, does not meet the expectations of users. Software is supposed to make our lives simpler by crunching data faster, or automating repetitive tasks. Great software is beautiful, simple, and, given some input from a user, produces correct output using a reasonable amount of resources. When we say software is bad, we mean any combination of things: it’s not easy to use. It gives us the wrong output. It uses resources poorly. It doesn’t always run. <strong>It doesn’t do the right thing.</strong> Bugs contribute to a poor user experience, but are not the sole culprit for the negative experiences that users have with software. Let’s take one of the examples Meisler has cited: Windows Vista. A quick search for “why does windows vista suck” in Google turns up these pages:</p>
<p><a href="http://www.thebuzzmedia.com/why-vista-sucks/">http://www.thebuzzmedia.com/why-vista-sucks/</a><br />
<a href="http://www.intelliadmin.com/index.php/2007/01/the-5-sins-of-vista/">http://www.intelliadmin.com/index.php/2007/01/the-5-sins-of-vista/</a><br />
<a href="http://techgage.com/article/top_8_vista_annoyances">http://techgage.com/article/top_8_vista_annoyances</a></p>
<p>Oh, bugs are on there, and they’re pretty serious. We’ll get to that. But what else makes Windows Vista suck, according to those sites? Overly aggressive security prompts. Overly complex menus (we’ll visit the idea of complexity again later, I promise). None of the menus make sense. Changed network interface. Widgets that are too small to be usable shipping with the system. Rearranged menus. Search that only works on 3 folders. Those last few things aren’t bugs; they’re usability problems. The software underneath is, for the most part, what we software engineers call <em>working as intended</em>. Some specification somewhere designed those features to work that way, and the job of the software engineers is, in many companies, to build the software to that specification, as ridiculous as the specification is. One of my coworkers points out that <a href="https://www.twitter.com/MrAlanCooper">Alan Cooper</a>, creator of Visual Basic, wrote a great book about this subject titled <a href="http://www.amazon.com/Inmates-Are-Running-Asylum-Products/dp/0672326140/ref=la_B001IGLP7M_1_2?ie=UTF8&amp;qid=1349838009&amp;sr=1-2]">“The Inmates Are Running the Asylum”</a>. Interestingly enough, his argument is that overly technical people are harmful when they <em>design</em> user interactions, and this results in panels like the Windows Vista search box with fifty different options. But, to be fair, even when user interactions are designed well, just making the software do what the user expects is hard. A simple button might be hiding lots of complex interactions underneath the hood to make the software easy to use. The hilarious and often insightful Steve Yegge talks about just that in a <a href="http://steve-yegge.blogspot.com/2009/04/have-you-ever-legalized-marijuana.html">tangentially related post about complexity in software requirements</a>. 
Software is generally called “bad” when it does not do what the <em>user</em> expects, and this is something that is really hard to get right.</p>
<h2>Buggy Software</h2>
<p>Buggy software, on the other hand, is software that does not behave as the <em>software engineer</em> expects or the <em>specification</em> dictates. This is in stark contrast to bad software, which is software that does not behave the way a <em>user</em> expects. There’s often overlap. A trivial example: let’s suppose an engineer writes a tip calculator for mobile phones that allows a user to enter a dollar amount and press a “calculate” button, which then causes the application to output 15% of the original bill amount on the screen. Let’s say a user uses the application, enters $100, and presses calculate. The amount that comes out is $1500. That’s not 15%! Or &#8211; the user presses calculate, and the application crashes. The user expects $15, but gets $1500. Or no result, because the application ends before it presents output.</p>
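<p>To make the distinction concrete, here is a minimal sketch of that tip calculator in JavaScript. The function names, the constant, and the rounding to whole cents are assumptions made up for this illustration, not code from any real app. The first version has the bug described above; the second does what the specification asks for.</p>
<pre class="brush: jscript; title: ; notranslate">
// Hypothetical tip calculator. All names here are illustrative.
var TIP_RATE = 0.15; // 15% expressed as a fraction

// Buggy: multiplies by 15 instead of 0.15, so a $100 bill yields $1500.
function calculateTipBuggy(billAmount) {
  return billAmount * 15;
}

// Intended behavior: 15% of the bill, rounded to whole cents.
function calculateTip(billAmount) {
  return Math.round(billAmount * TIP_RATE * 100) / 100;
}

var buggyTip = calculateTipBuggy(100); // 1500
var correctTip = calculateTip(100);    // 15
</pre>
<p>Both versions run without error; only a check against the specification (a test, a review, or an unhappy user) reveals that the first one is wrong.</p>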
<p>Software is buggy partially because of bad documentation, as Meisler asserts, but not primarily because of it. Software isn’t even buggy because programmers can’t express “complex thoughts”, another of Meisler’s gems; <em>all</em> of programming is the ability to “combine simple ideas into compound ideas”. Software is buggy because of problems stemming from <em>complexity</em>.</p>
<p>All software is built on top of <em>abstractions</em>. That is, someone else is responsible for abstracting away the details such that a programmer does not need to fully understand another system to be able to use it. As a programmer, I do not need to understand how my operating system communicates with the hard drive to save a file, or <a href="http://computer.howstuffworks.com/hard-disk.htm">how my hard disk manages its read/write heads</a>. I don’t need to write code that says, “move the write head” &#8211; I write code that says, “write this data to a file in this directory”. Or maybe I just say, “just save this data” and never worry about files or directories. Abstractions in software are kind of like the organizational structure of a large company. The CEO of a large car manufacturer talks to the executive board about the general direction of the company. The executive staff then break these tasks down into more specific focus area goals for their directors, who then break these tasks into divisional goals for the managers, who then break these tasks into team goals and tasks for the individual employees that actually build, design, test, market, and sell the damn things. To make this more convoluted, it’s not all top-to-bottom communication, either. There are plenty of cross-team interactions, and interactions between layers of management that cross the chain of command.</p>
<p>To say that poor documentation is the primary source of bugs is laughable. Documentation exists to try to make sense of the complexity, but there is no way documentation can be comprehensive in any reasonably complex software with layers of abstraction, because, as Joel Spolsky, founder of Fog Creek Software, says, <a href="http://www.joelonsoftware.com/articles/LeakyAbstractions.html">abstractions leak</a>. Programmers cannot know all the different ways the abstractions they are depending on will fail, and thus, they cannot possibly report or handle all the different ways the abstraction <em>they</em> are working on will fail. More importantly: programmers cannot know how every possible <em>combination</em> of abstractions they are depending on will produce subtly incorrect results that become more and more warped further up the abstraction stack. It’s like the <a href="http://en.wikipedia.org/wiki/Butterfly_effect">butterfly effect</a>. By the time a bug surfaces, a programmer needs to chase it all the way down the rabbit hole, often into code he does not understand. Documentation helps, but no programmer reads documentation all the way down to the bottom of the stack before he writes code. It’s not commercially feasible for programmers to do this and retain all the information <em>a priori</em>. Non-trivial software is <em>complex as hell</em> underneath the hood, and it doesn’t help that even seemingly simple software often has to turn water into wine just to try to do what a user expects.</p>
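<p>A tiny, well-known instance of a leaking abstraction: JavaScript numbers hide the fact that they are binary floating point, until they don’t. The snippet below is only an illustration. The addCents helper is a made-up name, and keeping money in integer cents is one common workaround I’m assuming here, not the only one.</p>
<pre class="brush: jscript; title: ; notranslate">
// The decimal-number abstraction leaks: 0.1 and 0.2 have no exact
// binary representation, so their sum is not exactly 0.3.
var sum = 0.1 + 0.2;
var sumIsExact = (sum === 0.3); // false

// Assumed workaround for illustration: do money math in integer cents.
function addCents(aCents, bCents) {
  return aCents + bCents;
}
var totalCents = addCents(10, 20); // 30, exactly
</pre>
<p>Nothing in the line that adds the two numbers looks wrong, which is the point: the surprise comes from a layer the programmer was told not to worry about.</p>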
<h2>Software engineers and critical thinking</h2>
<p>I don’t deny the importance of writing or critical thinking skills. They are crucial. I wouldn’t be surprised if the same ability to reason through complex thoughts allows people to write well as well as program well. But to assert that writing skills lead to reasoning skills? This is a case of placing the cart before the horse. Meisler is dismissive of the intellectual dexterity needed to write programs:</p>
<blockquote><p>“Most programmers are self-taught and meet the minimum requirement for writing code &#8212; the ability to count to eight”</p></blockquote>
<p>It’s not true. Programming often involves visualizing very abstract data structures, multivariate inputs/outputs, dealing with non-deterministic behavior, and simulating the concurrent interactions between several moving parts in your mind. When I am programming, I am holding in my mental buffer the state of several objects and context switching several times a second to try to understand how a small change I make in one place will ripple outwards. I do this hundreds of times a session. It’s a near trance-like state that takes me some time to get into before I am working at full speed, and it’s why programming is so damned hard. It’s why I can’t be interrupted and need a contiguous block of time to be fully effective on what Paul Graham calls the <a href="http://www.paulgraham.com/makersschedule.html">maker’s schedule</a>. I&#8217;m not the only one who feels this way &#8211; many other programmers report experiencing <a href="http://psygrammer.com/2011/02/10/the-flow-programming-in-ecstasy/">the mental state that psychologists refer to as &#8220;flow&#8221;</a> when they are performing at their best.</p>
<h2>How to reduce the incidences of bugs</h2>
<p>Philippe Beaudoin, a coworker, <a href="https://plus.google.com/107011265359512082824/posts">writes</a>:</p>
<blockquote><p>I like to express the inherent complexity of deep software stacks with an analogy, saying that software today is more like biology than mathematics. Debugging a piece of software is more like an episode of House than a clip from A Beautiful Mind. Building great software is about having both good bug prevention processes (code reviews, tests, documentation, etc.) as well as good bug correction processes (monitoring practices, debugging tools).</p>
<p>Trying to find a single underlying cause to buggy software is as preposterous as saying there is a single medical practice that would solve all of earth&#8217;s health problems.</p></blockquote>
<p>Well said.</p>
<p>I’m disappointed in the linkbait title, oversimplification, and broad sweeping generalizations of Bernard Meisler’s article. I’m disappointed that this is how software engineering is being represented to a mainstream, non-techie audience. It’s ironic that the article touts writing skills, but is poorly structured in arguing a point. It seems to conclude that writing skills are the reason code is buggy. No wait &#8211; critical thinking. Ah! Nope, surprise, writing skills, and a Steve Jobs quote that is used in a <a href="http://folklore.org/StoryView.py?project=Macintosh&amp;story=Saving_Lives.txt">misleading way and taken out of context</a> mixed in for good measure. He argues for the logic of language, but as many of us who also write for fun and profit know, human language is fraught with ambiguity and there’s a lot less similarity between prose and computer programming languages than Meisler would have the mainstream audience believe. I’m sorry, Herr Meisler, but if your article were a computer program, it simply wouldn’t compile.</p>
<p>&#8211; Ikai</p>
<h5><em>Written with special thanks to <a href="https://plus.google.com/107988469357342173268" target="_top">Philippe Beaudoin</a>, <a href="http://www.linkedin.com/in/marvingouw">Marvin Gouw</a>, <a href="http://www.linkedin.com/in/alejandrocrosa">Alejandro Crosa</a>, and <a href="http://www.linkedin.com/pub/tansy-woan/2a/48b/a88">Tansy Woan</a>.</em></h5>
]]></content:encoded>
			<wfw:commentRss>http://ikaisays.com/2012/10/09/why-the-real-reason-silicon-valley-coders-write-bad-software-is-wron/feed/</wfw:commentRss>
		<slash:comments>12</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/70564585574a96fa34fbab2a1a6981b0?s=96&#38;d=http%3A%2F%2F1.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96&#38;r=G" medium="image">
			<media:title type="html">Ikai Lan</media:title>
		</media:content>
	</item>
		<item>
		<title>Clearing up some things about LinkedIn mobile&#8217;s move from Rails to node.js</title>
		<link>http://ikaisays.com/2012/10/04/clearing-up-some-things-about-linkedin-mobiles-move-from-rails-to-node-js/</link>
		<comments>http://ikaisays.com/2012/10/04/clearing-up-some-things-about-linkedin-mobiles-move-from-rails-to-node-js/#comments</comments>
		<pubDate>Fri, 05 Oct 2012 02:34:00 +0000</pubDate>
		<dc:creator>Ikai Lan</dc:creator>
				<category><![CDATA[Software Development]]></category>

		<guid isPermaLink="false">http://ikaisays.com/?p=232</guid>
		<description><![CDATA[There&#8217;s an article on highscalability that&#8217;s talking about the move from Rails to node.js (for completeness: its sister discussion on Hacker News). It&#8217;s not the first time this information has been posted. I&#8217;ve kind of ignored it for now (because I didn&#8217;t want to be this guy), but it&#8217;s come up enough times and no one [...]]]></description>
				<content:encoded><![CDATA[<p>There&#8217;s an article on highscalability that&#8217;s talking about the <a href="http://highscalability.com/blog/2012/10/4/linkedin-moved-from-rails-to-node-27-servers-cut-and-up-to-2.html">move from Rails to node.js</a> (for completeness: its sister <a href="http://news.ycombinator.com/item?id=4613870">discussion on Hacker News</a>). It&#8217;s not the first time this information has been posted. I&#8217;ve kind of ignored it for now (because I <a href="http://xkcd.com/386/">didn&#8217;t want to be this guy</a>), but it&#8217;s come up enough times and no one has spoken up, so I suppose it&#8217;s up to me to clear a few things up.</p>
<p>I was on the team at LinkedIn that was responsible for the mobile server, and while I wasn&#8217;t the primary contributor to that stack, I built and contributed several things, such as the unfortunate <a href="https://developer.palm.com/webChannel/index.php?packageid=com.linkedin.mobile">LinkedIn WebOS app</a> which made use of the mobile server (and a few features) and much of the <a href="http://blog.linkedin.com/2008/08/19/jdbc-connection-pooling-for-rails-on-glassfish/">initial research behind productionizing JRuby</a> for web applications (I did much more stuff that wasn&#8217;t published). I left LinkedIn in 2009, so I apologize if any new information has surfaced. My hunch is that even if I&#8217;m off, I&#8217;m not off by that much.</p>
<p>Basically: the article is leaving out several facts. We can all learn something from the mobile server and software engineering if we know the full story behind the whole thing.</p>
<p>In 2008, I joined a software engineering team at LinkedIn that was focused on building things outside the standard Java stack. You see, back then, to develop code for linkedin.com, you needed a Mac Pro with 6 GB of RAM just to run your code. And those requirements kept growing. If my calculations are correct, the standard setup for engineers now is a machine with 20 or more gigabytes of RAM just to RUN the software. In addition, each team could only release once every 6 weeks (this has been fixed in the last few years). It was deemed that we needed to build out a platform off the then-fledgling API and start creating teams to get off the 6 week release cycle so we could iterate quickly on new features. The team I was on, LED, was created for this purpose.</p>
<p>Our first project was a <a href="http://newin.linkedin.com/">rotating globe that showed off new members joining LinkedIn</a>. It used to run Poly9, but when they got shut down, it looks like someone migrated it to use Google Earth. The second major project was m.linkedin.com, a mobile web client for LinkedIn that would be one of the major clients of our fledgling API server, codenamed PAL. Given that we were building out an API for third parties, we figured that we could eat our own dogfood and build out LinkedIn for mobile phones with browsers. This is 2008, mind you. The <em>iPhone</em> just came out, and it was a very Blackberry world.</p>
<p>The stack we chose was Ruby on Rails 1.2, and the deployment technology was <a href="http://blog.codahale.com/2006/06/19/time-for-a-grown-up-server-rails-mongrel-apache-capistrano-and-you/">Mongrel</a>. Remember, this is 2008. Mongrel was cutting edge Ruby technology. <a href="http://www.modrails.com/">Phusion Passenger</a> wasn&#8217;t released yet (more on this later), and Mongrel was light-years ahead of FastCGI. The problem with Mongrel? It&#8217;s single-threaded. It was deemed that the cost of shipping fast was more important than CPU efficiency, a choice I agreed with. We were one of the first products at LinkedIn to do i18n (well, we only did translations) via <a href="https://github.com/mutoh/gettext">gettext</a>. We deployed using <a href="https://github.com/capistrano/capistrano">Capistrano</a>, and were the first ones to use <a href="http://wiki.nginx.org/Main">nginx</a>. We did a lot of other cool stuff, like experiment with <a href="http://redis.io/">Redis</a>, learn a lot about <a href="http://memcached.org/">memcached</a> in <a href="https://github.com/ikai/jruby-memcache-client">production</a> (nowadays this is a given, but there was a lot of memcached vs EHCache talk back then). Etc, etc. But I&#8217;m not trying to talk about how much fun I had on that team. Well, not primarily.</p>
<p>I&#8217;m here to clear up facts about the post about moving to node.js. And to do that, I&#8217;m going to go back to my story.</p>
<p>The iPhone SDK had shipped around that time. We didn&#8217;t have an app ready for launch, but we wanted to build one, so our team did, and we inadvertently became the mobile team. So suddenly, we decided that this array of Rails servers that made API calls to PAL (which was, back then, using a pre-OAuth token exchange technology that was strikingly similar) would also be the primary API server for the iPhone client and any other rich mobile client we&#8217;d end up building, this thing that was basically using Rails rhtml templates. We upgraded to Rails 2.x+ so we could have the respond_to directive for different outputs. Why didn&#8217;t we connect the iPhone client directly to PAL? I don&#8217;t remember. Oh, and we also decided to use OAuth for authenticating the iPhone client. Three legged OAuth, so we also turned those Rails servers into OAuth providers. Why did we use 3-legged OAuth? Simple: we had no idea what we were doing. I&#8217;LL ADMIT IT.</p>
<p>Did I mention that we hosted outside the main data centers? This is what <a href="http://www.youtube.com/watch?v=G9bbfL3WjfA">Joyent talks about when they say they supplied LinkedIn with hosting</a>. They never hosted linkedin.com proper on Joyent, but we had a long provisioning process for getting servers in the primary data center, and there were these insane rules about no scripting languages in production, so we decided it was easier to adopt an outside provider when we needed more capacity.</p>
<p>Here&#8217;s what you were accessing if you were using the LinkedIn iPhone client:</p>
<p>iPhone -&gt; m.linkedin.com (running on Rails) -&gt; LinkedIn&#8217;s API (which, for all intents and purposes, only had one client, us)</p>
<p><strong>That&#8217;s a cross data center request, guys</strong>. Running on single-threaded Rails servers (every request blocked the <span style="text-decoration:underline;">entire</span> process), running Mongrel, leaking memory like a sieve (this was mostly the fault of gettext). The Rails server did some stuff, like translations, and transformation of XML to JSON, and we tested out some new mobile-only features on it, but beyond that it didn&#8217;t do a lot. It was a little more than a proxy. A proxy with a maximum concurrency factor dependent on how many single-threaded Mongrel servers we were running. The Mongrel(s), as we affectionately referred to them, often bloated up to 300 MB of RAM each, so we couldn&#8217;t run many of them.</p>
<p>At this time, I was busy productionizing JRuby. JRuby, you see, was taking full advantage of Rails&#8217; ability to serve concurrent requests using JVM concurrency. In addition, JRuby outperformed MRI in almost every real benchmark I threw at it &#8211; there were maybe 1 or 2 specific benchmarks where it didn&#8217;t. I knew that if we ported the mobile server to JRuby, we could have gotten more performance and gotten way more concurrency. We would have kept the same ability to deploy fast with the option to in-line into many of the Java libraries LinkedIn was using.</p>
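<p>A back-of-the-envelope sketch of that concurrency ceiling, in JavaScript. The 300 MB per Mongrel figure is the one mentioned above; the 4096 MB of available memory and the 250 ms spent blocked per request are hypothetical numbers I picked to make the arithmetic easy, not measurements from the actual servers.</p>
<pre class="brush: jscript; title: ; notranslate">
// How many single-threaded workers fit in the available memory.
function workersThatFit(availableMb, mbPerWorker) {
  return Math.floor(availableMb / mbPerWorker);
}

// Each worker handles one request at a time, so it completes at most
// 1 / secondsBlockedPerRequest requests per second.
function maxRequestsPerSecond(workerCount, secondsBlockedPerRequest) {
  return workerCount / secondsBlockedPerRequest;
}

var workers = workersThatFit(4096, 300);           // 13
var ceiling = maxRequestsPerSecond(workers, 0.25); // 52
</pre>
<p>Under those assumed numbers the whole box tops out at 52 requests per second no matter how idle the CPU is, because every worker spends its time waiting on the cross data center call.</p>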
<p>But we didn&#8217;t. Instead, the engineering manager at the time ruled in favor of Phusion Passenger, which, to be fair, was an easier port than JRuby. We had come to depend on various native extensions, gettext being the key one, and we didn&#8217;t have time to port the translations to something that was JRuby friendly. I was furious, of course, because I had been evangelizing JRuby as the best Ruby production environment and no one was listening, but that&#8217;s a different story for a different time. Well, maybe some people listened; those <a href="http://blog.jruby.org/2012/01/jruby-at-square/">Square guys</a> come to mind.</p>
<p>This was about the time I left LinkedIn. As far as I know, they didn&#8217;t build a ton more features. Someone told me that one of my old teammates suddenly became fascinated with node.js, and pretty much singlehandedly decided to rewrite the mobile server using node. Node was definitely a better fit for what we were doing, since we were constantly blocking on a cross data center call, and non-blocking servers for I/O have been shown to be highly advantageous from a performance perspective. Not to mention: <strong>we never intended for the original Ruby on Rails server to be used as a proxy for several years.</strong></p>
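<p>To illustrate why a non-blocking model suits a proxy like this, here is a small sketch in present-day JavaScript. Everything in it is hypothetical: callUpstream fakes the cross data center call with a timer, and the two handler functions are names I made up to contrast the two styles. It is not the code from either version of the mobile server.</p>
<pre class="brush: jscript; title: ; notranslate">
// Hypothetical stand-in for the cross data center API call:
// resolves with the request id after delayMs milliseconds.
function callUpstream(requestId, delayMs) {
  return new Promise(function (resolve) {
    setTimeout(function () { resolve(requestId); }, delayMs);
  });
}

// One request at a time, the way a blocking single-threaded worker behaves.
function handleSequentially(requestIds, delayMs) {
  return requestIds.reduce(function (chain, id) {
    return chain.then(function (finished) {
      return callUpstream(id, delayMs).then(function (result) {
        return finished.concat([result]);
      });
    });
  }, Promise.resolve([]));
}

// Start every upstream call right away and wait for all of them together.
function handleConcurrently(requestIds, delayMs) {
  return Promise.all(requestIds.map(function (id) {
    return callUpstream(id, delayMs);
  }));
}
</pre>
<p>With three requests and a 100 ms upstream delay, the sequential version should take roughly 300 ms and the concurrent one roughly 100 ms, since the waiting overlaps. The process is still single-threaded in both cases; the difference is only that it is not sitting idle while the upstream call is in flight.</p>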
<p>So, knowing all the facts, what are all the takeaways?</p>
<ul>
<li>Is v8 faster than MRI? MRI is generally slower than YARV (Ruby 1.9), and, at least in <a href="http://shootout.alioth.debian.org/u32/benchmark.php?test=all&amp;lang=v8&amp;lang2=yarv">these benchmarks, I don&#8217;t think there is any question that v8 is freakin&#8217; fast</a>. If node.js blocked on I/O, however, this fact would have been almost completely irrelevant.</li>
<li>The rewrite factor. How many of us have been on a software engineering project where the end result looked nothing like what we planned to build in the first place? And, knowing fully the requirements, we know that, if given time and the opportunity to rebuild it from scratch, it would have been way better? Not to mention: I grew a lot at LinkedIn as a software engineer, so the same me several years later would have done a far better job than the same me in 2008. Experience does matter.</li>
<li>I see that one of the advantages of the mobile server being in node.js is people could &#8220;leverage&#8221; (<a href="https://www.google.com/search?q=site:press.linkedin.com+leverage">LinkedIn loves that word</a>) their Javascript skills. Well, LinkedIn <a href="http://www.linkedin.com/search/fpsearch?type=people&amp;keywords=java+linkedin&amp;pplSearchOrigin=GLHD&amp;pageKey=member-home#facets=keywords%3Djava%2520linkedin%26search%3DSearch%2520Search%26companyId%3D%26facetsOrder%3DCC%252CN%252CG%252CI%252CPC%252CED%252CL%252CFG%252CTE%252CFA%252CSE%252CP%252CCS%252CF%252CDR%26inNetworkSearch%3Dfalse%26pplSearchOrigin%3DFCTD%26keepFacets%3Dtrue%26facet_CC%3D1337%26facet_G%3Dus%253A84%26openFacets%3DCC%252CN%252CG">had/has hundreds of Java engineers</a>! If that was a concern, we would have spent more time exploring <a href="https://netty.io/">Netty</a>. Lies, damn lies, and benchmarks, I always say, but I think it&#8217;s safe for us to say that Netty (this is vertx, which sits on top of Netty) is <a href="http://vertxproject.wordpress.com/2012/05/09/vert-x-vs-node-js-simple-http-benchmarks/">at least as fast as node.js</a> for web serving.</li>
<li>Firefighting? That was probably a combination of several things: the fact that we were running MRI and leaked memory, or the fact that the ops team was 30% of a single guy.</li>
</ul>
<p>What I&#8217;m saying here is use your brain. Don&#8217;t read the High Scalability post and assume that you must build your next technology using node.js. It was definitely a better fit than Ruby on Rails for what the mobile server ended up doing, but it is not a performance panacea. You&#8217;re comparing a lower level server to a full stack web framework.</p>
<p>That&#8217;s all for tonight, folks, and thank you internet for finally goading me out of hiding again.</p>
<p>- Ikai</p>
]]></content:encoded>
			<wfw:commentRss>http://ikaisays.com/2012/10/04/clearing-up-some-things-about-linkedin-mobiles-move-from-rails-to-node-js/feed/</wfw:commentRss>
		<slash:comments>13</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/70564585574a96fa34fbab2a1a6981b0?s=96&#38;d=http%3A%2F%2F1.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96&#38;r=G" medium="image">
			<media:title type="html">Ikai Lan</media:title>
		</media:content>
	</item>
		<item>
		<title>Apps Script quick tips: building a stock price spreadsheet</title>
		<link>http://ikaisays.com/2012/07/27/apps-script-quick-tips-building-a-stock-price-spreadsheet-2/</link>
		<comments>http://ikaisays.com/2012/07/27/apps-script-quick-tips-building-a-stock-price-spreadsheet-2/#comments</comments>
		<pubDate>Fri, 27 Jul 2012 22:05:41 +0000</pubDate>
		<dc:creator>Ikai Lan</dc:creator>
				<category><![CDATA[Software Development]]></category>

		<guid isPermaLink="false">http://ikaisays.com/?p=219</guid>
		<description><![CDATA[I’ve been using iGoogle less and less over the past few years. A few weeks ago, the team announced that iGoogle would be shutting down in November 2013. It’s not a huge loss to me, though I do check iGoogle several times a day. Why? Stock prices! I’ve been using the Stock Market gadget for [...]]]></description>
				<content:encoded><![CDATA[<p>I’ve been using iGoogle less and less over the past few years. A few weeks ago, the <a href="http://googleblog.blogspot.com/2012/07/spring-cleaning-in-summer.html">team announced that iGoogle would be shutting down in November 2013</a>. It’s not a huge loss to me, though I do check iGoogle several times a day. Why? Stock prices! I’ve been using the Stock Market gadget for years.</p>
<p>As it turns out, the functionality I want is very easy to replicate using Google Spreadsheets and <a href="https://script.google.com">Google Apps Script</a>. I’m thoroughly convinced that the fastest way to wire up different Google services for custom functionality is this product. Google Apps Script provides <a href="https://developers.google.com/apps-script/service_finance">services to access Google Finance APIs</a>.</p>
<p>Knowing this, it’s incredibly easy to wire up a spreadsheet that has access to live finance data. The spreadsheet I use looks something like this:</p>
<p><img class=" wp-image" src="http://ikailansays.files.wordpress.com/2012/07/screen-shot-2012-07-27-at-5-49-25-pm.png?w=635&#038;h=98" alt="Image" width="635" height="98" /></p>
<p>We can pull this off in a few very easy steps.</p>
<p><strong>Step 1: Create a spreadsheet</strong></p>
<p>I made a spreadsheet with the following column names:<br />
<strong><br />
Symbol Price Change Change % Details</strong></p>
<p><img class=" wp-image" src="http://ikailansays.files.wordpress.com/2012/07/screen-shot-2012-07-27-at-5-50-11-pm.png?w=592&#038;h=14" alt="Image" width="592" height="14" /></p>
<p>My intended use is to populate the Symbol column and have the other columns fill in automatically. The nice thing about writing scripts that integrate with spreadsheets is that we have a built-in UI for making edits, sorting, filtering and searching. By using spreadsheets as our data entry and manipulation UI, our functionality is already more advanced than that of the Stock Market gadget and many other online portfolio-at-a-glance services.</p>
<p><strong>Step 2: Create the script</strong></p>
<p>What we’re going to do is write a few functions in the Script Editor. Spreadsheet cells accept not only the standard built-in functions that do simple things like SUM and AVERAGE, but also custom functions that retrieve data from other Google services.</p>
<p>Click Tools -&gt; Script Editor.</p>
<p><img class="size-full wp-image" src="http://ikailansays.files.wordpress.com/2012/07/screen-shot-2012-07-27-at-5-50-37-pm.png?w=272" alt="Image" width="272" height="326" /></p>
<p>This will open up a new tab in your browser where you can write code. The default name of this file is Code.gs. Replace whatever is in the buffer with this:</p>
<pre class="brush: jscript; title: ; notranslate">
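// Each function below can be called from a spreadsheet cell, e.g. =getStockPrice(A2).

// Latest price for the given ticker symbol.
function getStockPrice(symbol) {
  return FinanceApp.getStockInfo(symbol)[&quot;price&quot;];
}

// Price change expressed as a percentage.
function getStockPriceChangePct(symbol) {
  return FinanceApp.getStockInfo(symbol)[&quot;changepct&quot;];
}

// Price change.
function getStockPriceChange(symbol) {
  return FinanceApp.getStockInfo(symbol)[&quot;change&quot;];
}

// Link to the Google Finance page for the symbol.
function getGoogleFinanceLink(symbol) {
  return &quot;http://www.google.com/finance?q=&quot; + symbol;
}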
</pre>
<p>Your Script Editor should look like this:</p>
<p><img class="size-full wp-image" src="http://ikailansays.files.wordpress.com/2012/07/screen-shot-2012-07-27-at-5-51-31-pm.png?w=627" alt="Image" width="627" height="436" /></p>
<p><a href="https://developers.google.com/apps-script/class_financeapp#getStockInfo">FinanceApp.getStockInfo()</a> returns a <a href="http://developers.google.com/apps-script/class_financeresult">FinanceResult</a> instance with a LOT of data. I only care about the basics: price, price change, and price change percentage. The functions I’ve defined reflect this.</p>
<p><strong>Step 3: Add the functions into the cells</strong></p>
<p>Now let’s go back to the spreadsheet tab. I’ve populated a few basic symbols under the Symbol column: GOOG (Google) and AAPL (Apple), two of my favorite companies. In cell B2, enter this value:</p>
<pre>=getStockPrice(A2)</pre>
<p>Hit enter. If everything is working correctly, this will now populate with the latest price of whatever stock symbol is in A2. Let’s add the rest of the functions. In C2, enter:</p>
<pre>=getStockPriceChange(A2)</pre>
<p>In D2, enter:</p>
<pre>=getStockPriceChangePct(A2)</pre>
<p>I like to have a link back to Google Finance if I ever want to do more research on a company, so in E2, add:</p>
<pre>=getGoogleFinanceLink(A2)</pre>
<p>This next part is hard to explain but shouldn’t be difficult for anyone who has used a spreadsheet program before. Highlight cells B2 through E2. You can hold down shift and click to select them. Now hover your mouse over the bottom right corner of E2 and drag down a few rows. This copies the functions into the subsequent rows, replacing A2 with A3, A4, A5, &#8230; depending on what row you happen to be in. You can test this out by adding additional stock symbols. The live stock data will appear.</p>
<p><strong>Step 4: Color Coding</strong></p>
<p>I like to see color coding depending on whether a stock price has risen or fallen. Hold down shift and click on the C column header, then the D column header, at the top of the sheet:</p>
<p><img class="size-full wp-image" src="http://ikailansays.files.wordpress.com/2012/07/screen-shot-2012-07-27-at-5-52-20-pm.png?w=492" alt="Image" width="492" height="481" /></p>
<p>Click the arrow to the right. This should drop down a menu. Click on “Conditional Formatting”:</p>
<p><img class="size-full wp-image" src="http://ikailansays.files.wordpress.com/2012/07/screen-shot-2012-07-27-at-5-52-48-pm.png?w=575" alt="Image" width="575" height="305" /></p>
<p>You’ll want to add two rules: a greater than rule and a less than rule. When the Change and Change % columns are greater than 0, change the background to green. When they are less than 0, change the background to red. Click “Save Rules”.</p>
<p>You’re done!</p>
<p><strong>Summary</strong></p>
<p>I’ve only scratched the surface of what can be done with Apps Script. We haven’t even gotten into a lot of the other cool things we can do. Using <a href="https://developers.google.com/apps-script/understanding_events">Clock Events</a>, we can check every few minutes for changes and email ourselves using the <a href="https://developers.google.com/apps-script/class_gmailapp">GmailApp library</a> if a stock price change is greater than some threshold. We can <a href="https://developers.google.com/apps-script/service_charts">generate charts</a> based on historic data. And so on, and so forth. For more examples of things that can be done with Google Apps Script, check out the <a href="https://developers.google.com/apps-script/articles">tutorials section</a>.</p>
<p>Have a great weekend!</p>
<p>- Ikai</p>
<br />  <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=ikaisays.com&#038;blog=5969280&#038;post=219&#038;subd=ikailansays&#038;ref=&#038;feed=1" width="1" height="1" />]]></content:encoded>
			<wfw:commentRss>http://ikaisays.com/2012/07/27/apps-script-quick-tips-building-a-stock-price-spreadsheet-2/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/70564585574a96fa34fbab2a1a6981b0?s=96&#38;d=http%3A%2F%2F1.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96&#38;r=G" medium="image">
			<media:title type="html">Ikai Lan</media:title>
		</media:content>

		<media:content url="http://ikailansays.files.wordpress.com/2012/07/screen-shot-2012-07-27-at-5-49-25-pm.png?w=907" medium="image">
			<media:title type="html">Image</media:title>
		</media:content>

		<media:content url="http://ikailansays.files.wordpress.com/2012/07/screen-shot-2012-07-27-at-5-50-11-pm.png?w=846" medium="image">
			<media:title type="html">Image</media:title>
		</media:content>

		<media:content url="http://ikailansays.files.wordpress.com/2012/07/screen-shot-2012-07-27-at-5-50-37-pm.png?w=272" medium="image">
			<media:title type="html">Image</media:title>
		</media:content>

		<media:content url="http://ikailansays.files.wordpress.com/2012/07/screen-shot-2012-07-27-at-5-51-31-pm.png?w=627" medium="image">
			<media:title type="html">Image</media:title>
		</media:content>

		<media:content url="http://ikailansays.files.wordpress.com/2012/07/screen-shot-2012-07-27-at-5-52-20-pm.png?w=492" medium="image">
			<media:title type="html">Image</media:title>
		</media:content>

		<media:content url="http://ikailansays.files.wordpress.com/2012/07/screen-shot-2012-07-27-at-5-52-48-pm.png?w=575" medium="image">
			<media:title type="html">Image</media:title>
		</media:content>
	</item>
		<item>
		<title>Getting started with jOOQ: A Tutorial</title>
		<link>http://ikaisays.com/2011/11/01/getting-started-with-jooq-a-tutorial/</link>
		<comments>http://ikaisays.com/2011/11/01/getting-started-with-jooq-a-tutorial/#comments</comments>
		<pubDate>Wed, 02 Nov 2011 02:54:50 +0000</pubDate>
		<dc:creator>Ikai Lan</dc:creator>
				<category><![CDATA[Java]]></category>
		<category><![CDATA[Software Development]]></category>

		<guid isPermaLink="false">http://ikaisays.com/?p=205</guid>
		<description><![CDATA[Introduction I accidentally stumbled onto jOOQ a few days ago while doing a lot of research on Hibernate. Funny how things work, isn’t it? For those of you that aren’t familiar with it, jOOQ is a different approach to the over-ORMing of Java persistence. Rather than try to map database tables to Java classes and [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=ikaisays.com&#038;blog=5969280&#038;post=205&#038;subd=ikailansays&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<h1>Introduction</h1>
<p>I accidentally stumbled onto jOOQ a few days ago while doing a lot of research on Hibernate. Funny how things work, isn’t it? For those of you who aren’t familiar with it, jOOQ is a different approach to the over-ORMing of Java persistence. Rather than try to map database tables to Java classes and abstract away the SQL underneath, jOOQ assumes you <em>want</em> low level control over the SQL queries you execute, and provides a mostly typesafe interface for executing queries. I don&#8217;t have anything against simple ORMs, but it&#8217;s good to have the right tool for the right job. From the <a href="http://www.jooq.org">jOOQ homepage</a>:</p>
<p>
Instead of this SQL query:
</p>
<pre class="brush: sql; title: ; notranslate">
SELECT * FROM BOOK
   WHERE PUBLISHED_IN = 2011
ORDER BY TITLE
</pre>
<p>
You would execute this Java code:
</p>
<pre class="brush: java; title: ; notranslate">
create.selectFrom(BOOK)
      .where(PUBLISHED_IN.equal(2011))
      .orderBy(TITLE)
</pre>
<p>
Why a Java interface? Type safety, for one. Programmatically using jOOQ’s DSL has some advantages over writing SQL queries by hand, such as IDE support and compile time checking of <em>some</em> things.
</p>
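<p>To make the "compile time checking" point concrete, here is a toy sketch of how a typed DSL can catch mistakes before the program runs. It is not jOOQ's implementation: <code>ToyDsl</code>, its <code>Field</code> class and <code>selectFrom</code> are names invented for this illustration, and it only builds a SQL string.</p>

```java
public class ToyDsl {

    /** A column that remembers its Java type, so comparisons can be type-checked. */
    static final class Field<T> {
        final String name;

        Field(String name) {
            this.name = name;
        }

        /** Only accepts a value of the column's own type T. */
        String equal(T value) {
            String literal = (value instanceof Number) ? value.toString() : "'" + value + "'";
            return name + " = " + literal;
        }
    }

    // Hypothetical columns of the BOOK table from the example above.
    static final Field<Integer> PUBLISHED_IN = new Field<Integer>("PUBLISHED_IN");
    static final Field<String> TITLE = new Field<String>("TITLE");

    /** Assembles a SELECT statement from a table name, a condition and a sort column. */
    static String selectFrom(String table, String condition, Field<?> orderBy) {
        return "SELECT * FROM " + table + " WHERE " + condition + " ORDER BY " + orderBy.name;
    }

    public static void main(String[] args) {
        System.out.println(selectFrom("BOOK", PUBLISHED_IN.equal(2011), TITLE));
        // PUBLISHED_IN.equal("2011") would be rejected by the compiler:
        // a String is not an Integer.
    }
}
```

<p>With hand-written SQL, a misspelled column or a mismatched type only shows up when the query runs. In a sketch like this one, the same mistakes fail the build instead.</p>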
<p>
The idea interested me and I dug in. Unfortunately, the jOOQ site’s documentation, while fairly comprehensive, DOES NOT PROVIDE AN END TO END “GETTING STARTED” PAGE!!! This means that if you want to learn jOOQ, you’ll have to jump to the chapter about Meta model code generation, then jump to the DSL, then jump to the jOOQ classes section. It’s a bit of a mess for new users. Google search also didn’t turn up many useful results, so I figured I’d whip up a quick “Getting started” guide. We’re going to go over the following steps:</p>
<p>
Preparation: Download jOOQ and your SQL driver<br />
Step 1: Create a SQL database and a table<br />
Step 2: Generate classes<br />
Step 3: Write a main class and establish MySQL connection<br />
Step 4: Write a query using jOOQ’s DSL<br />
Step 5: Iterate over results<br />
Step 6: Profit!
</p>
<p>
Ready? Let’s get started.
</p>
<h1>Getting our hands dirty</h1>
<h2>Preparation: Download jOOQ and your SQL driver</h2>
<p>
If you haven’t already downloaded them, download jOOQ:
</p>
<p><a href="http://sourceforge.net/projects/jooq/files">http://sourceforge.net/projects/jooq/files/</a></p>
<p>
For this example, we’ll be using MySQL. If you haven’t already downloaded MySQL Connector/J, download it here:</p>
<p><a href="http://dev.mysql.com/downloads/connector/j/">http://dev.mysql.com/downloads/connector/j/</a></p>
<p>
Stash these somewhere you can get to them later.
</p>
<h2>Step 1: Create a SQL database and a table</h2>
<p>
We’re going to create a database called “guestbook” and a corresponding “posts” table. Connect to MySQL via your command line client and type the following:</p>
<pre class="brush: sql; title: ; notranslate">
CREATE DATABASE guestbook;
USE guestbook;

CREATE TABLE `posts` (
  `id` bigint(20) NOT NULL,
  `body` varchar(255) DEFAULT NULL,
  `timestamp` datetime DEFAULT NULL,
  `title` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`id`)
);
</pre>
<p>(I copied and pasted the create table statement from a “show create table” command)</p>
<h2>Step 2: Generate classes</h2>
<p>
In this step, we’re going to use jOOQ’s command line tools to generate classes that map to the Posts table we just created. The official <a href="http://www.jooq.org/manual/META/Configuration/">docs are here</a>.
</p>
<p>I’m going to augment the command line steps a bit. The easiest way to generate a schema is to copy the jOOQ jar files (there should be 3) and the MySQL Connector jar file to a temporary directory. Create a properties file. I’ve created a file called guestbook.properties that looks like this:</p>
<pre class="brush: plain; title: ; notranslate">
#Configure the database connection here
jdbc.Driver=com.mysql.jdbc.Driver
jdbc.URL=jdbc:mysql://localhost:3306/guestbook
jdbc.Schema=guestbook
jdbc.User=ikai
jdbc.Password=

#The default code generator. You can override this one, to generate your own code style
#Defaults to org.jooq.util.DefaultGenerator
generator=org.jooq.util.DefaultGenerator

#The database type. The format here is:
#generator.database=org.jooq.util.[database].[database]Database
generator.database=org.jooq.util.mysql.MySQLDatabase

#All elements that are generated from your schema (several Java regular expressions, separated by comma)
#Watch out for case-sensitivity. Depending on your database, this might be important!
generator.database.includes=.*

#All elements that are excluded from your schema (several Java regular expressions, separated by comma). Excludes match before includes
generator.database.excludes=

#Primary key / foreign key relations should be generated and used. 
#This will be a prerequisite for various advanced features
#Defaults to false
generator.generate.relations=true

#Generate deprecated code for backwards compatibility 
#Defaults to true
generator.generate.deprecated=false

#The destination package of your generated classes (within the destination directory)
generator.target.package=test.generated

#The destination directory of your generated classes
generator.target.directory=/Users/ikai/workspace/MySQLTest/src
</pre>
<p>
One thing that wasn’t clear from jOOQ’s docs is the value of <code>jdbc.Schema</code>: it should be your database name. Since our database name is “guestbook”, that’s what we put. Replace the username with whatever user has the appropriate privileges: in my local dev database, my user has what is effectively root access to everything without a password. You’ll want to look at the other values and replace as necessary. Here are the two interesting properties:</p>
<p>
<code>generator.target.package</code> &#8211; set this to the parent package you want to create for the generated classes. My setting of test.generated will cause <code>test.generated.tables.Posts</code> and <code>test.generated.tables.records.PostsRecord</code> to be created.</p>
<p>
<code>generator.target.directory</code> &#8211; the directory to output to. Worst case scenario you can just copy the files to the package.
</p>
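<p>Since a wrong <code>jdbc.Schema</code> is an easy mistake to make, a small sanity check before running the generator can help. The sketch below is my own helper, not part of jOOQ: it parses properties text with <code>java.util.Properties</code> and compares <code>jdbc.Schema</code> with the last path segment of <code>jdbc.URL</code>. It assumes a plain MySQL-style URL with no query parameters after the database name.</p>

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

public class CheckGeneratorProperties {

    /** Extracts the database name from a URL such as jdbc:mysql://host:3306/name. */
    static String databaseOf(String jdbcUrl) {
        return jdbcUrl.substring(jdbcUrl.lastIndexOf('/') + 1);
    }

    /** Returns true when jdbc.Schema names the same database as jdbc.URL. */
    static boolean schemaMatchesUrl(String propertiesText) throws IOException {
        Properties props = new Properties();
        props.load(new StringReader(propertiesText));
        String schema = props.getProperty("jdbc.Schema");
        String url = props.getProperty("jdbc.URL");
        return schema != null && url != null && schema.equals(databaseOf(url));
    }

    public static void main(String[] args) throws IOException {
        String text = "jdbc.URL=jdbc:mysql://localhost:3306/guestbook\n"
                    + "jdbc.Schema=guestbook\n";
        System.out.println(schemaMatchesUrl(text));
    }
}
```

<p>In practice you would read the text from your properties file instead of a string literal.</p>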
<p>
Once you have the JAR files and guestbook.properties in your temp directory, type this:
</p>
<pre class="brush: bash; title: ; notranslate">
java -classpath jooq-1.6.8.jar:jooq-meta-1.6.8.jar:jooq-codegen-1.6.8.jar:mysql-connector-java-5.1.18-bin.jar:. org.jooq.util.GenerationTool /jooq.properties
</pre>
<p>
<strong>Note the prefix slash before jooq.properties.</strong> Even though the file is in our working directory, we need to prepend a slash.</p>
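<p>Why the slash? A plausible explanation, which I am assuming and not quoting from jOOQ's source, is that the generator loads the file as a classpath resource through <code>Class.getResourceAsStream</code> and does not open it as a regular file. That would also explain why <code>.</code> is on the classpath in the command above. Java resolves such names by a documented rule: a leading slash means "from the classpath root", and any other name is looked up relative to the package of the class doing the loading. The sketch below re-implements only that naming rule so the difference is visible. <code>ResourceNames</code> and <code>resolve</code> are names made up for this illustration.</p>

```java
public class ResourceNames {

    /**
     * Mirrors the name resolution documented for Class.getResource:
     * a leading '/' is stripped and the rest is used as-is; otherwise the
     * package of the class (dots turned into slashes) is prepended.
     */
    static String resolve(Class<?> loader, String name) {
        if (name.startsWith("/")) {
            return name.substring(1);
        }
        String className = loader.getName();
        int lastDot = className.lastIndexOf('.');
        if (lastDot < 0) {
            return name; // class in the default package
        }
        String packagePath = className.substring(0, lastDot).replace('.', '/');
        return packagePath + "/" + name;
    }

    public static void main(String[] args) {
        // java.lang.String stands in for whatever class does the loading.
        System.out.println(resolve(String.class, "/jooq.properties"));
        System.out.println(resolve(String.class, "jooq.properties"));
    }
}
```

<p>If that assumption holds, a name without the slash would be searched for inside the generator's own package and not in the working directory, so the file would never be found.</p>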
<p>
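Replace the filenames with your own: the jar versions, and the properties file too (if you named yours guestbook.properties as above, pass /guestbook.properties instead). In this example, I’m using jOOQ 1.6.8. If everything has worked, you should see this in your console output:</p>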
<pre class="brush: plain; title: ; notranslate">
Nov 1, 2011 7:25:06 PM org.jooq.impl.JooqLogger info
INFO: Initialising properties  : /jooq.properties
Nov 1, 2011 7:25:07 PM org.jooq.impl.JooqLogger info
INFO: Database parameters      
Nov 1, 2011 7:25:07 PM org.jooq.impl.JooqLogger info
INFO: ----------------------------------------------------------
Nov 1, 2011 7:25:07 PM org.jooq.impl.JooqLogger info
INFO:   dialect                : MYSQL
Nov 1, 2011 7:25:07 PM org.jooq.impl.JooqLogger info
INFO:   schema                 : guestbook
Nov 1, 2011 7:25:07 PM org.jooq.impl.JooqLogger info
INFO:   target dir             : /Users/ikai/Documents/workspace/MySQLTest/src
Nov 1, 2011 7:25:07 PM org.jooq.impl.JooqLogger info
INFO:   target package         : test.generated
Nov 1, 2011 7:25:07 PM org.jooq.impl.JooqLogger info
INFO: ----------------------------------------------------------
Nov 1, 2011 7:25:07 PM org.jooq.impl.JooqLogger info
INFO: Emptying                 : /Users/ikai/workspace/MySQLTest/src/test/generated
Nov 1, 2011 7:25:07 PM org.jooq.impl.JooqLogger info
INFO: Generating classes in    : /Users/ikai/workspace/MySQLTest/src/test/generated
Nov 1, 2011 7:25:07 PM org.jooq.impl.JooqLogger info
INFO: Generating schema        : Guestbook.java
Nov 1, 2011 7:25:07 PM org.jooq.impl.JooqLogger info
INFO: Generating factory       : GuestbookFactory.java
Nov 1, 2011 7:25:07 PM org.jooq.impl.JooqLogger info
INFO: Schema generated         : Total: 122.18ms
Nov 1, 2011 7:25:07 PM org.jooq.impl.JooqLogger info
INFO: Sequences fetched        : 0 (0 included, 0 excluded)
Nov 1, 2011 7:25:07 PM org.jooq.impl.JooqLogger info
INFO: Masterdata tables fetched: 0 (0 included, 0 excluded)
Nov 1, 2011 7:25:07 PM org.jooq.impl.JooqLogger info
INFO: Tables fetched           : 5 (5 included, 0 excluded)
Nov 1, 2011 7:25:07 PM org.jooq.impl.JooqLogger info
INFO: Generating tables        : /Users/ikai/workspace/MySQLTest/src/test/generated/tables
Nov 1, 2011 7:25:07 PM org.jooq.impl.JooqLogger info
INFO: ARRAYs fetched           : 0 (0 included, 0 excluded)
Nov 1, 2011 7:25:07 PM org.jooq.impl.JooqLogger info
INFO: Enums fetched            : 0 (0 included, 0 excluded)
Nov 1, 2011 7:25:07 PM org.jooq.impl.JooqLogger info
INFO: UDTs fetched             : 0 (0 included, 0 excluded)
Nov 1, 2011 7:25:07 PM org.jooq.impl.JooqLogger info
INFO: Generating table         : Posts.java
Nov 1, 2011 7:25:07 PM org.jooq.impl.JooqLogger info
INFO: Tables generated         : Total: 680.464ms, +558.284ms
Nov 1, 2011 7:25:07 PM org.jooq.impl.JooqLogger info
INFO: Generating Keys          : /Users/ikai/workspace/MySQLTest/src/test/generated/tables
Nov 1, 2011 7:25:08 PM org.jooq.impl.JooqLogger info
INFO: Keys generated           : Total: 718.621ms, +38.157ms
Nov 1, 2011 7:25:08 PM org.jooq.impl.JooqLogger info
INFO: Generating records       : /Users/ikai/workspace/MySQLTest/src/test/generated/tables/records
Nov 1, 2011 7:25:08 PM org.jooq.impl.JooqLogger info
INFO: Generating record        : PostsRecord.java
Nov 1, 2011 7:25:08 PM org.jooq.impl.JooqLogger info
INFO: Table records generated  : Total: 782.545ms, +63.924ms
Nov 1, 2011 7:25:08 PM org.jooq.impl.JooqLogger info
INFO: Routines fetched         : 0 (0 included, 0 excluded)
Nov 1, 2011 7:25:08 PM org.jooq.impl.JooqLogger info
INFO: Packages fetched         : 0 (0 included, 0 excluded)
Nov 1, 2011 7:25:08 PM org.jooq.impl.JooqLogger info
INFO: GENERATION FINISHED!     : Total: 791.688ms, +9.143ms
</pre>
<h2>Step 3: Write a main class and establish MySQL connection</h2>
<p>
Let’s just write a vanilla main class in the project containing the generated classes:</p>
<pre class="brush: java; title: ; notranslate">
import java.sql.Connection;
import java.sql.DriverManager;

public class Main {

	public static void main(String[] args) {
		Connection conn = null;
		String userName = &quot;ikai&quot;;
		String password = &quot;&quot;;
		String url = &quot;jdbc:mysql://localhost:3306/guestbook&quot;;
		try {
			Class.forName(&quot;com.mysql.jdbc.Driver&quot;).newInstance();
			conn = DriverManager.getConnection(url, userName, password);
			conn.close();
		} catch (Exception e) {
			// You'll want to handle exceptions properly in a real app.
			// Never silently swallow them with an empty catch (Exception e) block.
			// I've seen this in live code and it is horrendous.
			e.printStackTrace();
		} 

	}
}
</pre>
<p>
This is pretty standard code for establishing a MySQL connection.
</p>
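<p>One caveat about the snippet above: if something throws after the connection is opened, <code>conn.close()</code> is skipped. On Java 7 or newer, try-with-resources closes the resource on every path, and <code>java.sql.Connection</code> is <code>AutoCloseable</code>, so it qualifies. The sketch below shows the mechanism with a stand-in class so it runs without a database. <code>FakeConnection</code> and <code>closedAfterUse</code> are hypothetical names used only for this demonstration.</p>

```java
public class CloseDemo {

    /** Stand-in for java.sql.Connection so the sketch runs without a database. */
    static final class FakeConnection implements AutoCloseable {
        boolean closed = false;

        @Override
        public void close() {
            closed = true;
        }
    }

    /** Opens a resource, optionally fails while using it, and reports whether it was closed. */
    static boolean closedAfterUse(boolean failWhileUsing) {
        FakeConnection seen = new FakeConnection();
        try (FakeConnection conn = seen) {
            if (failWhileUsing) {
                throw new IllegalStateException("query failed");
            }
        } catch (IllegalStateException e) {
            // By the time we get here, close() has already run.
        }
        return seen.closed;
    }

    public static void main(String[] args) {
        System.out.println(closedAfterUse(false));
        System.out.println(closedAfterUse(true));
    }
}
```

<p>With a real database, the same shape would be <code>try (Connection conn = DriverManager.getConnection(url, userName, password)) { ... }</code>, and the explicit <code>conn.close()</code> call goes away.</p>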
<h2>Step 4: Write a query using jOOQ’s DSL</h2>
<p>
Let’s add a simple query:
</p>
<pre class="brush: java; title: ; notranslate">
			GuestbookFactory create = new GuestbookFactory(conn);
			Result result = create.select().from(Posts.POSTS).fetch();
</pre>
<p>
We need to first get an instance of <code>GuestbookFactory</code> so we can write a simple <code>SELECT</code> query. We pass an instance of the MySQL connection to <code>GuestbookFactory</code>. Note that the factory doesn’t close the connection. We’ll have to do that ourselves.</p>
<p>
We then use jOOQ’s DSL to return an instance of <code>Result</code>. We’ll be using this result in the next step.</p>
<h2>Step 5: Iterate over results</h2>
<p>After the line where we retrieve the results, let’s iterate over the results and print out the data:</p>
<pre class="brush: java; title: ; notranslate">
			for (Record r : result) {
				Long id = r.getValueAsLong(Posts.ID);
				String title = r.getValueAsString(Posts.TITLE);
				String description = r.getValueAsString(Posts.BODY);
				
				System.out.println(&quot;ID: &quot; + id + &quot; title: &quot; + title + &quot; description: &quot; + description);
			}
</pre>
<p>
The full program should now look like this:
</p>
<pre class="brush: java; title: ; notranslate">
package test;

import java.sql.Connection;
import java.sql.DriverManager;

import org.jooq.Record;
import org.jooq.Result;

import test.generated.GuestbookFactory;
import test.generated.tables.Posts;

public class Main {

	/**
	 * @param args
	 */
	public static void main(String[] args) {
		Connection conn = null;
		String userName = &quot;ikai&quot;;
		String password = &quot;&quot;;
		String url = &quot;jdbc:mysql://localhost:3306/guestbook&quot;;
		try {
			Class.forName(&quot;com.mysql.jdbc.Driver&quot;).newInstance();
			conn = DriverManager.getConnection(url, userName, password);

			GuestbookFactory create = new GuestbookFactory(conn);
		
			Result result = create.select().from(Posts.POSTS).fetch();
		
			for (Record r : result) {
				Long id = r.getValueAsLong(Posts.ID);
				String title = r.getValueAsString(Posts.TITLE);
				String description = r.getValueAsString(Posts.BODY);
			
				System.out.println(&quot;ID: &quot; + id + &quot; title: &quot; + title + &quot; description: &quot; + description);
			}
			conn.close();
		} catch (Exception e) {
			// You'll want to handle exceptions properly in a real app.
			// Never silently swallow them with an empty catch (Exception e) block.
			// I've seen this in live code and it is horrendous.
			e.printStackTrace();
		} 
	
	}
}
</pre>
<h2>Step 6: Profit!</h2>
<p>Get a job and go to work like the rest of us.</p>
<h2>Conclusion</h2>
<p>
I haven’t explored the more advanced bits of jOOQ, but, at least judging from the docs, it looks like there’s a lot of meat there. I’m hoping this guide makes it easier for new users to dive in.</p>
<p>
- ikai<br />
<em>Currently listening: Sweat &#8211; Snoop Dogg vs David Guetta</em></p>
<br />  <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=ikaisays.com&#038;blog=5969280&#038;post=205&#038;subd=ikailansays&#038;ref=&#038;feed=1" width="1" height="1" />]]></content:encoded>
			<wfw:commentRss>http://ikaisays.com/2011/11/01/getting-started-with-jooq-a-tutorial/feed/</wfw:commentRss>
		<slash:comments>10</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/70564585574a96fa34fbab2a1a6981b0?s=96&#38;d=http%3A%2F%2F1.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96&#38;r=G" medium="image">
			<media:title type="html">Ikai Lan</media:title>
		</media:content>
	</item>
		<item>
		<title>On Hackathons, Process, Email and the Tragedy of the Commons</title>
		<link>http://ikaisays.com/2011/07/16/on-hackathons-process-email-and-the-tragedy-of-the-commons/</link>
		<comments>http://ikaisays.com/2011/07/16/on-hackathons-process-email-and-the-tragedy-of-the-commons/#comments</comments>
		<pubDate>Sat, 16 Jul 2011 22:24:08 +0000</pubDate>
		<dc:creator>Ikai Lan</dc:creator>
				<category><![CDATA[Software Development]]></category>

		<guid isPermaLink="false">http://ikaisays.com/?p=194</guid>
		<description><![CDATA[Hackathons I love hackathons. I love going to them, and I love running them. Most recently, I participated in a 48-hour hackathon in Kuala Lumpur, Malaysia. It’s one of the best parts of my job. I get to run (and sometimes participate) in both external hackathons, as well as hackathons that are internal to Google. [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=ikaisays.com&#038;blog=5969280&#038;post=194&#038;subd=ikailansays&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<h2>Hackathons</h2>
<p>I love hackathons. I love going to them, and I love running them. Most recently, I participated in a <a href="http://blog.gtugkl.org/2011/06/start-your-engines-google-app-engine.html">48-hour hackathon in Kuala Lumpur, Malaysia</a>. It’s one of the best parts of my job. I get to run (and sometimes participate in) both external hackathons and hackathons that are internal to Google.</p>
<p><a href="https://plus.google.com/photos/115174157490147606284/albums/5622564218351918977"><img class="alignnone" src="https://lh5.googleusercontent.com/-kukyTPU08O4/ThB5959tjVI/AAAAAAAAEG8/-ZJFH4s43wg/s800/IMG_6499.JPG" alt="" width="480" height="320" /></a></p>
<p>In early June I held an internal Hackathon at Google to teach employees how to best use the product I work on: Google App Engine. I consider the event a success: we had hundreds of RSVPs and a completely booked room. It was so successful, in fact, that I’m planning on holding at least one of these events a quarter. The breakdown was primarily newer employees, which didn’t surprise me given the <a href="http://googleblog.blogspot.com/2011/01/help-wanted-google-hiring-in-2011.html">amount of hiring we’ve been doing</a>.</p>
<p>A primary driver for the sheer volume of RSVPs was the fact that we advertised the event on a mailing list that went out to pretty much <em>all</em> of engineering. All. Of. It. At an engineering company with headcount in the <a href="http://investor.google.com/financial/tables.html">tens of thousands</a>, hundreds of RSVPs was not only <em>likely</em>, it was pretty much a mathematical certainty. Looking back, we would probably not have received that response if we hadn’t sent out such a wide blast.</p>
<p>As a result of what I consider to be a fairly successful event (and I don’t mean to take all the credit here, at about the same time as my event, there was another very successful internal hackathon), various teams have suggested hackathons for their product APIs. There are events on the calendar.</p>
<p>Therein, of course, lies our problem. The problem of noise.</p>
<p>What should we do? Email all of engineering for every event? Create a new list/site/page announcing new events? Let’s break down the tradeoffs for each choice:</p>
<h3>1. Email all of engineering</h3>
<p><strong>Pros:</strong> goes to everyone</p>
<p><strong>Cons (and this is the bigger point of this post):</strong> the majority of events will be irrelevant to any given reader, so the signal-to-noise ratio on the list drops significantly and people start filtering out these announcements</p>
<h3>2. Create a new distribution channel for events</h3>
<p><strong>Pros:</strong> Opt in</p>
<p><strong>Cons:</strong> You don’t get the distribution you’d get with #1, since only a minority of people will opt in. Also &#8211; has the same SNR problems.</p>
<p>Now, a hybrid solution would be to do both. High profile, important events go to all of engineering, and smaller events go to the special distribution channel. The issue here is that <em>everyone’s</em> event is high profile. So again, we don’t have a great solution. Not to mention: people can only attend so many hackathons and still be able to do all the stuff they’re supposed to be doing. See, that’s one of the great things about Google engineering. If you’re consistently delivering, there isn’t a manager in the company that will tell you not to attend a hackathon or internal event where you can only get better at what you do. The issue, of course, is that the more hackathons take place, the more likely it is that you are taking a resource away from another team for a non-trivial amount of time. From a hackathon organizer’s perspective, a hackathon is almost always beneficial as long as some non-zero number of participants show up: they learn about your API and provide feedback, and you learn a bit about how to improve the documentation or SDK. You almost can’t afford <em>not</em> to throw a hackathon.</p>
<p>This is the classic example of the <a href="http://en.wikipedia.org/wiki/Tragedy_of_the_commons">tragedy of the commons</a>. By running an event, you consume space. You consume employee time. You generate noise on all the distribution channels. And when everyone does it, suddenly, as a whole, everyone is worse off, though you yourself may individually gain.</p>
<h2>Email</h2>
<p>Another key example of the tragedy of the commons is a company’s email marketing. I worked at a consumer internet company that broke teams up by product. To drive usage metrics for an individual product, the product managers would run email campaigns to the site’s millions of users. The result was that the individual product would receive more usage, and everyone would give themselves a pat on the back. What was actually happening was that users (myself included) were becoming extremely irritated at the company for the voluminous amounts of email being sent all the time. Sure, you could go to the site settings and disable email, but new products would automatically opt you in to receiving notifications, and you would have to log back into the site to find the settings and disable those notifications as well. Some users, like myself, have created Gmail filters that send all email from this company’s domain to a “Stupid Mail” label. I can understand the individual product managers’ reasoning. You don’t want to be the one team that doesn’t deliver metrics, so you email spam. And when everyone email spams, it’s to the detriment of the company overall. An employee posted to an internal group asking if it was an example of the tragedy of the commons &#8211; I don’t know if his advice was ever heeded, but based on the complaints I see on Twitter about email, my guess is no.</p>
<h2>Process</h2>
<p>I view team processes the same way, and this sometimes leads to some very heated discussions with people I work with. It’s not that I don’t believe making your 1 step process a 5 step process makes your life easier or the company better organized; it’s that <em>everybody</em> wants to turn their processes from one step, lightweight, free form processes into full on, form driven, strict-requirements-based, signed-in-triplicate steps for doing things. I fight heavy processes when I can because I don’t believe enough people do so. Why? The tragedy of the commons. An extra 20 minutes here, an extra 20 minutes there, and suddenly, I am spending most of my day tangled in process instead of getting things done.</p>
<p>There are no easy solutions to this, of course. Some process is necessary, though from the outset, it isn’t always obvious which parts. How do you know, for instance, if a process is unnecessary? A good example is a managerial approval step in a process. Let’s say I need approval to do something. How do I evaluate if managerial approval is working?</p>
<ul>
<li>What is the cost of doing it wrong? What was the bad outcome?</li>
<li>Before the process was instituted, how many incidents were there in which that approval would have prevented a bad outcome?</li>
<li>Is the manager rubber stamping requests?</li>
</ul>
<p>What absolutely needs to be done are constant evaluations of process. Don’t create a process and sit on it. Make it better. What can you take away, and still have it work? Think about your last trip to the DMV. How many steps could have been eliminated?</p>
<h2>Awareness of the bigger picture</h2>
<blockquote><p>Perfection is achieved, not when there is nothing more to add, but when there is nothing left to take away.</p></blockquote>
<p><em>Antoine de Saint-Exupery</em><br />
French writer (1900 &#8211; 1944)</p>
<p>I suppose that’s the solution to fighting the tragedy of the commons. A constant awareness of the bigger picture and a real desire to make things better. An understanding that many things in this world <em>are</em> a zero sum game. I’ll issue a caution, of course: you can probably only champion a few things. Champion fixing everything, and people stop listening to you, you lose focus, and you end up fixing very little. What do we call this effect? No, I won’t bother. Hopefully you actually read this and already know.</p>
<p>- Ikai</p>
]]></content:encoded>
			<wfw:commentRss>http://ikaisays.com/2011/07/16/on-hackathons-process-email-and-the-tragedy-of-the-commons/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/70564585574a96fa34fbab2a1a6981b0?s=96&#38;d=http%3A%2F%2F1.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96&#38;r=G" medium="image">
			<media:title type="html">Ikai Lan</media:title>
		</media:content>

		<media:content url="https://lh5.googleusercontent.com/-kukyTPU08O4/ThB5959tjVI/AAAAAAAAEG8/-ZJFH4s43wg/s800/IMG_6499.JPG" medium="image" />
	</item>
		<item>
		<title>Setting up an OAuth provider on Google App Engine</title>
		<link>http://ikaisays.com/2011/05/26/setting-up-an-oauth-provider-on-google-app-engine/</link>
		<comments>http://ikaisays.com/2011/05/26/setting-up-an-oauth-provider-on-google-app-engine/#comments</comments>
		<pubDate>Fri, 27 May 2011 01:23:24 +0000</pubDate>
		<dc:creator>Ikai Lan</dc:creator>
				<category><![CDATA[Software Development]]></category>

		<guid isPermaLink="false">http://ikaisays.com/?p=185</guid>
		<description><![CDATA[App Engine provides an API for easily creating an OAuth provider. In this blog post, I’ll describe the following steps: Create and deploy an App Engine application that implements the OAuth API. Add a new domain to your Google Account and verify it. Connect an OAuth client to make requests against your application. I’ll avoid [...]]]></description>
				<content:encoded><![CDATA[<p>App Engine provides an API for easily creating an OAuth provider. In this blog post, I’ll describe the following steps:</p>
<ol>
<li>Create and deploy an App Engine application that implements the OAuth API</li>
<li>Add a new domain to your Google Account. Verify this domain.</li>
<li>Connect an OAuth client to make requests against your application</li>
</ol>
<p>
I’ll avoid a deep explanation of OAuth for now. You can find everything you need to know about OAuth in the <a href="http://hueniverse.com/oauth/">Beginner’s guide to OAuth</a>.</p>
<h2>Get the code</h2>
<p>The code that goes along with this blog post is available here:</p>
<p><a href="https://github.com/ikai/appengine-oauth-java-server-python-client-sample">https://github.com/ikai/appengine-oauth-java-server-python-client-sample</a></p>
<p>
The two most important files are:
</p>
<ul>
<li>python/oauth_client.py</li>
<li>src/com/ikai/oauthprovider/ProtectedServlet.java</li>
</ul>
<h2>Step 1: Create and deploy an App Engine application that uses the OAuth API</h2>
<p>
Create a new App Engine Java application. I’ve created a servlet called <code>ProtectedServlet</code>:
</p>
<pre class="brush: java; title: ; notranslate">
package com.ikai.oauthprovider;

import com.google.appengine.api.oauth.OAuthRequestException;
import com.google.appengine.api.oauth.OAuthService;
import com.google.appengine.api.oauth.OAuthServiceFactory;
import com.google.appengine.api.users.User;

import java.io.IOException;

import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

@SuppressWarnings(&quot;serial&quot;)
public class ProtectedServlet extends HttpServlet {
    
    @Override
    public void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        try {
            // getCurrentUser() verifies the OAuth signature on the request
            // and throws OAuthRequestException if verification fails.
            OAuthService oauth = OAuthServiceFactory.getOAuthService();
            User user = oauth.getCurrentUser();
            resp.getWriter().println(&quot;Authenticated: &quot; + user.getEmail());
        } catch (OAuthRequestException e) {
            resp.getWriter().println(&quot;Not authenticated: &quot; + e.getMessage());
        }
    }
    
}
</pre>
<p>
This servlet is incredibly simple. We retrieve an instance of <code>OAuthService</code> via <code>OAuthServiceFactory</code> and attempt to fetch the current user. Note that the <code>User</code> instance is the same kind of instance as a <code>User</code> returned by <code>UserService</code>. That’s because a User is still expected to sign in via a Google Account.</p>
<p>
The method <code>getCurrentUser()</code> takes care of all of the OAuth signature verification. If something goes wrong &#8211; say, the request is not signed, or the signature is invalid, or the client’s timestamp is outside of the acceptable skew, or the nonce is repeated &#8211; <code>OAuthService</code> throws <code>OAuthRequestException</code>.</p>
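<p>To make the signature check less abstract, here is a rough Python 3 sketch of how an OAuth 1.0 HMAC-SHA1 signature is computed and compared, following RFC 5849. It is not App Engine’s implementation, which is internal to the service. The helper names (<code>percent_encode</code>, <code>signature_base_string</code>, <code>sign</code>, <code>verify</code>) and the sample values are made up for illustration, URL normalization is simplified, and the timestamp and nonce checks are left out.</p>

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def percent_encode(value):
    # RFC 5849 section 3.6: everything except unreserved characters is escaped.
    return quote(str(value), safe="-._~")


def signature_base_string(method, url, params):
    # Encode each name and value, sort the pairs, and join them as name=value.
    pairs = sorted((percent_encode(k), percent_encode(v)) for k, v in params.items())
    normalized = "&".join("%s=%s" % pair for pair in pairs)
    # The base string is METHOD&encoded-url&encoded-parameters.
    return "&".join([method.upper(), percent_encode(url), percent_encode(normalized)])


def sign(base_string, consumer_secret, token_secret=""):
    # The HMAC key is the two secrets joined by "&"; the token secret is
    # empty until a token has been issued.
    key = "%s&%s" % (percent_encode(consumer_secret), percent_encode(token_secret))
    digest = hmac.new(key.encode("utf-8"), base_string.encode("utf-8"), hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


def verify(received_signature, base_string, consumer_secret, token_secret=""):
    # The provider recomputes the signature and compares in constant time.
    expected = sign(base_string, consumer_secret, token_secret)
    return hmac.compare_digest(expected, received_signature)


if __name__ == "__main__":
    # Hypothetical request values, for illustration only.
    params = {
        "oauth_consumer_key": "example.appspot.com",
        "oauth_nonce": "abc123",
        "oauth_signature_method": "HMAC-SHA1",
        "oauth_timestamp": "1306459404",
        "oauth_version": "1.0",
    }
    base = signature_base_string("GET", "http://example.appspot.com/resource", params)
    signature = sign(base, "not-a-real-secret")
    print(verify(signature, base, "not-a-real-secret"))
```

<p>Because both sides derive the signature from the same request data and shared secret, changing the request or using the wrong secret makes verification fail, which is the kind of failure that surfaces as <code>OAuthRequestException</code> in the servlet.</p>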
<p>
We can run this code locally, but it won’t exercise real OAuth verification: when run locally, <code>oauth.getCurrentUser()</code> always returns a test user. We’ll need to deploy it to App Engine before it’ll do verification. After deploying, we can test the servlet. I have the servlet mapped to /resource. When we browse to this URL, we see:
</p>
<blockquote><p>Not authenticated: Unknown</p></blockquote>
<p>
That’s okay. We expect to see this because we’re sending a vanilla GET to this API.
</p>
<h2>Step 2: Add a new domain to your Google Account and verify it</h2>
<p>
OAuth clients require a <em>consumer key</em> and <em>consumer secret</em>. We need to generate these. Browse to the “Manage Domains” page:
</p>
<p><a href="https://www.google.com/accounts/ManageDomains">https://www.google.com/accounts/ManageDomains</a></p>
<p>
It should look like this:<br />
<img src="https://lh6.googleusercontent.com/Pz3EWvydHPLEbCiNnQkzxTHgAZ7gO8nwUuDEVAFPY--yqme8pIIAEtX85MUsSTDtsfP4BspJmn1sjvTxPnrrz90OIDajGqdcdp-salXGps2xusAyW0E" alt="" width="556px;" height="405px;" /><br />
Add the base URL of our App Engine app into the text box in the “Add a New Domain” section and click “Add domain”. For instance, I entered: <a href="http://ikai-oauth.appspot.com" rel="nofollow">http://ikai-oauth.appspot.com</a>.
</p>
<p>
We’ll be taken to a new page where we need to verify ownership of the application:<br />
<img src="https://lh6.googleusercontent.com/Kd3HNmK84BNkeIS-tyoPr_Z0VJ6Q6mJLTniJJbb9l0OhLJ6Sj1Cuhye6ettg5Ntfz1D1uQh2X9OZTs-FYj88PME2BruH3_OadHOqcAf0d4zwDzT_CUk" alt="" width="612px;" height="362px;" /><br />
Download the HTML verification file and place it into our war directory. Deploy this new version of the application to App Engine. Once we have confirmed that the page is serving, click “Verify” to complete the verification process.
</p>
<p>
When we have verified our domain, we will be asked to accept the Terms of Service and enter a few settings. Only the authsub setting is required; we can enter anything we want here because we will not be using <a href="http://code.google.com/apis/accounts/docs/AuthSub.html">authsub</a>. We will then be presented with an OAuth consumer key and OAuth consumer secret. The OAuth consumer key is simply the domain, whereas the consumer secret is an autogenerated shared secret that clients will be using.
</p>
<p>
Now that we have these values, we can move on to step 3.
</p>
<h2>Step 3: Connect an OAuth client to make requests against your application</h2>
<p>
As of the time of this writing, App Engine only supports OAuth 1.0.
</p>
<p>
Below is a basic script that will do the 3-legged OAuth dance, cache access tokens locally and make API calls. To run this script, we will need to install the <a href="https://github.com/simplegeo/python-oauth2">python-oauth2 library</a>. If we have git installed, the commands to install the library on a *nix-like system are:
</p>
<pre class="brush: bash; title: ; notranslate">
git clone https://github.com/simplegeo/python-oauth2.git
cd python-oauth2
sudo python setup.py install
</pre>
<p>
This installs the <code>oauth2</code> library into our Python install so we can import it when we need it.
</p>
<p>
Now we can run the script to make authenticated calls against our app. Note that we’ll want to replace the placeholder <code>app_id</code> and <code>consumer_secret</code> values with our own application ID and consumer secret:
</p>
<pre class="brush: python; title: ; notranslate">
import oauth2 as oauth
import urlparse
import os
import pickle

app_id = &quot;your_app_id_here&quot;
url = &quot;http://%s.appspot.com/resource&quot; % app_id

consumer_key = '%s.appspot.com' % app_id
consumer_secret = 'your_consumer_secret_here'

access_token_file = &quot;token.dat&quot;

request_token_url   = &quot;https://%s.appspot.com/_ah/OAuthGetRequestToken&quot; % app_id
authorize_url       = &quot;https://%s.appspot.com/_ah/OAuthAuthorizeToken&quot; % app_id
access_token_url    = &quot;https://%s.appspot.com/_ah/OAuthGetAccessToken&quot; % app_id

consumer = oauth.Consumer(consumer_key, consumer_secret)

if not os.path.exists(access_token_file):

    client = oauth.Client(consumer)

    # Step 1: Get a request token. This is a temporary token that is used for 
    # having the user authorize an access token and to sign the request to obtain 
    # said access token.

    resp, content = client.request(request_token_url, &quot;GET&quot;)
    if resp['status'] != '200':
        raise Exception(&quot;Invalid response %s.&quot; % resp['status'])

    request_token = dict(urlparse.parse_qsl(content))

    print &quot;Request Token:&quot;
    print &quot;    - oauth_token        = %s&quot; % request_token['oauth_token']
    print &quot;    - oauth_token_secret = %s&quot; % request_token['oauth_token_secret']
    print 


    # Step 2: Send the user to the provider to authorize the request token.
    print &quot;Go to the following link in your browser:&quot;
    print &quot;%s?oauth_token=%s&quot; % (authorize_url, request_token['oauth_token'])
    print 

    # After the user has granted access to you, the consumer, the provider will
    # redirect you to whatever URL you have told them to redirect to. You can 
    # usually define this in the oauth_callback argument as well. This script
    # has no callback, so it waits for the user to confirm in the terminal.
    accepted = 'n'
    while accepted.lower() == 'n':
        accepted = raw_input('Have you authorized me? (y/n) ')


    # Step 3: Once the consumer has redirected the user back to the oauth_callback
    # URL you can request the access token the user has approved. You use the 
    # request token to sign this request. After this is done you throw away the
    # request token and use the access token returned. You should store this 
    # access token somewhere safe, like a database, for future use.
    token = oauth.Token(request_token['oauth_token'],
                request_token['oauth_token_secret'])
    client = oauth.Client(consumer, token)

    resp, content = client.request(access_token_url, &quot;POST&quot;)
    if resp['status'] != '200':
        raise Exception(&quot;Invalid response %s.&quot; % resp['status'])
    access_token = dict(urlparse.parse_qsl(content))

    print &quot;Access Token:&quot;
    print &quot;    - oauth_token        = %s&quot; % access_token['oauth_token']
    print &quot;    - oauth_token_secret = %s&quot; % access_token['oauth_token_secret']
    print
    print &quot;You may now access protected resources using the access tokens above.&quot; 
    print

    token = oauth.Token(access_token['oauth_token'],
                access_token['oauth_token_secret'])

    # Pickle data is binary, so open the cache file in binary mode.
    with open(access_token_file, &quot;wb&quot;) as f:
        pickle.dump(token, f)

else:
    with open(access_token_file, &quot;rb&quot;) as f:
        token = pickle.load(f)


client = oauth.Client(consumer, token)
resp, content = client.request(url, &quot;GET&quot;)
print &quot;Response Status Code: %s&quot; % resp['status']
print &quot;Response body: %s&quot; % content
</pre>
<p>
(The basis for this script was shamelessly stolen from Joe Stump’s <a href="https://github.com/simplegeo/python-oauth2">sample oauth2 code for his Python library on GitHub</a>.)
</p>
<p>
Once we run the script using:
</p>
<pre class="brush: bash; title: ; notranslate">
python oauth_client.py
</pre>
<p>
we should see:
</p>
<pre class="brush: bash; title: ; notranslate">
Request Token:
- oauth_token        = SOME_OAUTH_REQUEST_TOKEN_VALUE
- oauth_token_secret = SOME_OAUTH_REQUEST_SECRET_VALUE

Go to the following link in your browser:

https://YOUR-APP-ID.appspot.com/_ah/OAuthAuthorizeToken?oauth_token=SOME_OAUTH_REQUEST_TOKEN_VALUE

Have you authorized me? (y/n)
</pre>
<p>
The OAuth token and token secret values are issued by the provider in response to a request that the script signs with the consumer key/secret pair. With these values, known as the request token, we generate an authorization URL for an end user to bless our client so it can make OAuth requests on behalf of the user that grants authorization.
</p>
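<p>As a small illustration of that hand-off, the sketch below (Python 3, standard library only) parses a form-encoded token response and builds the authorization URL from it. The response body and host name are invented sample values, and <code>parse_token_response</code> and <code>build_authorize_url</code> are hypothetical helper names, not part of python-oauth2.</p>

```python
from urllib.parse import parse_qsl, urlencode


def parse_token_response(body):
    # Token endpoints answer with a form-encoded body such as
    # "oauth_token=...&oauth_token_secret=...".
    return dict(parse_qsl(body))


def build_authorize_url(authorize_url, request_token):
    # urlencode() escapes characters such as "/" that may appear in a token.
    query = urlencode({"oauth_token": request_token["oauth_token"]})
    return "%s?%s" % (authorize_url, query)


if __name__ == "__main__":
    # Invented sample response; a real one comes from the request token URL.
    sample_body = "oauth_token=4%2Fabc&oauth_token_secret=xyz"
    request_token = parse_token_response(sample_body)
    print(build_authorize_url(
        "https://example.appspot.com/_ah/OAuthAuthorizeToken", request_token))
```

<p>Only the token itself goes into the URL; the token secret stays with the client and is used for signing the later access token request.</p>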
<p>
At this point, the script pauses for input. As part of the OAuth dance, we need to browse to the URL provided and authorize the script. Copy/paste this URL into a browser window and click “Grant Access”:
</p>
<p><img src="https://lh3.googleusercontent.com/dPrfUcf3GOa1wCspjJO529vjuSlDLIckXNs_avFRt2tf95ZL0RML3_RcoQdHH61gQ8ILHVLTcUNoGg5mdy8ms8l33t7YnyD_OWI5oiCjGazGH_2CMwI" alt="" width="487px;" height="239px;" /></p>
<p>
Once we see a page that says:
</p>
<blockquote><p>You have successfully granted ikai-oauth.appspot.com access to your Google Account. You can revoke access at any time under &#8216;My Account&#8217;.
</p></blockquote>
<p>
we can switch back to the terminal window and enter “y”. The client now exchanges the request token for an access token. The access token is what we need to make API calls. The script outputs this:
</p>
<pre class="brush: plain; title: ; notranslate">
Access Token:
- oauth_token        = SOME_OAUTH_ACCESS_TOKEN
- oauth_token_secret = SOME_OAUTH_ACCESS_TOKEN_SECRET

You may now access protected resources using the access tokens above.

Response Status Code: 200
Response body: Authenticated: the-account-you-logged-in-with@gmail.com
</pre>
<p>
The Python script caches the access token in a file called token.dat, so the next time we run oauth_client.py, we skip the authorization dance and can directly make API calls:
</p>
<pre class="brush: plain; title: ; notranslate">
$ python oauth_client.py
Response Status Code: 200
Response body: Authenticated: the-account-you-logged-in-with@gmail.com
</pre>
<p>
That’s all there is to it!
</p>
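<p>The caching behavior boils down to a load-or-fetch pattern. Here is a minimal Python 3 sketch of it, assuming the token is any picklable object; <code>load_or_fetch_token</code> and the <code>fetch_token</code> callable are hypothetical names standing in for the authorization dance in the script.</p>

```python
import os
import pickle


def load_or_fetch_token(path, fetch_token):
    # Reuse the cached token if a previous run saved one.
    if os.path.exists(path):
        with open(path, "rb") as f:
            return pickle.load(f)
    # Otherwise run the slow, interactive authorization step once and cache it.
    token = fetch_token()
    with open(path, "wb") as f:
        pickle.dump(token, f)
    return token
```

<p>The cache file holds a secret, so it is worth keeping out of version control. Deleting it forces the script through the authorization dance again.</p>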
<h2>Final notes and general tips</h2>
<p>
Setting up an OAuth provider using App Engine’s API is incredibly simple once we know all the steps. Setting up the provider is just a matter of a few lines of code, and the steps to set up the client are pretty straightforward. The most difficult part is setting up the consumer key and secret, but even that isn’t so bad once we know where the management interface is.</p>
<p>
When possible, use OAuth instead of <a href="http://code.google.com/apis/accounts/docs/AuthForInstalledApps.html">ClientLogin</a>. This goes for web applications, mobile applications, desktop apps, and even command line scripts. OAuth allows users to revoke your access token and trains users not to arbitrarily give out their Google Account password to any interface that asks for it. For building clients, it also gives you a way to do client authentication without having to cache credentials &#8211; using ClientLogin too often results in <a href="http://code.google.com/apis/gdata/javadoc/com/google/gdata/client/GoogleService.CaptchaRequiredException.html">CaptchaRequiredException</a> being thrown, anyway.
</p>
<p>
- Ikai
</p>
<h2>References</h2>
<p>
GitHub sample code:<br />
<a href="https://github.com/ikai/appengine-oauth-java-server-python-client-sample">https://github.com/ikai/appengine-oauth-java-server-python-client-sample</a>
</p>
<p>
App Engine/Java OAuth docs: <a href="http://code.google.com/appengine/docs/java/oauth/overview.html">http://code.google.com/appengine/docs/java/oauth/overview.html</a>
</p>
<p>
Domain management &#8211; get your consumer key/secret here: <a href="https://www.google.com/accounts/ManageDomains">https://www.google.com/accounts/ManageDomains</a>
</p>
<p>
Python OAuth client code: <a href="https://github.com/simplegeo/python-oauth2">https://github.com/simplegeo/python-oauth2</a></p>
]]></content:encoded>
			<wfw:commentRss>http://ikaisays.com/2011/05/26/setting-up-an-oauth-provider-on-google-app-engine/feed/</wfw:commentRss>
		<slash:comments>25</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/70564585574a96fa34fbab2a1a6981b0?s=96&#38;d=http%3A%2F%2F1.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96&#38;r=G" medium="image">
			<media:title type="html">Ikai Lan</media:title>
		</media:content>

		<media:content url="https://lh6.googleusercontent.com/Pz3EWvydHPLEbCiNnQkzxTHgAZ7gO8nwUuDEVAFPY--yqme8pIIAEtX85MUsSTDtsfP4BspJmn1sjvTxPnrrz90OIDajGqdcdp-salXGps2xusAyW0E" medium="image" />

		<media:content url="https://lh6.googleusercontent.com/Kd3HNmK84BNkeIS-tyoPr_Z0VJ6Q6mJLTniJJbb9l0OhLJ6Sj1Cuhye6ettg5Ntfz1D1uQh2X9OZTs-FYj88PME2BruH3_OadHOqcAf0d4zwDzT_CUk" medium="image" />

		<media:content url="https://lh3.googleusercontent.com/dPrfUcf3GOa1wCspjJO529vjuSlDLIckXNs_avFRt2tf95ZL0RML3_RcoQdHH61gQ8ILHVLTcUNoGg5mdy8ms8l33t7YnyD_OWI5oiCjGazGH_2CMwI" medium="image" />
	</item>
		<item>
		<title>Unit Testing in Tipfy, an App Engine framework in Python</title>
		<link>http://ikaisays.com/2011/02/19/unit-testing-in-tipfy-an-app-engine-framework-in-python/</link>
		<comments>http://ikaisays.com/2011/02/19/unit-testing-in-tipfy-an-app-engine-framework-in-python/#comments</comments>
		<pubDate>Sat, 19 Feb 2011 10:12:33 +0000</pubDate>
		<dc:creator>Ikai Lan</dc:creator>
				<category><![CDATA[App Engine]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[Software Development]]></category>

		<guid isPermaLink="false">http://ikaisays.com/?p=171</guid>
		<description><![CDATA[I&#8217;ve been playing around with the Tipfy framework for App Engine. Tipfy is a framework built on top of App Engine&#8217;s APIs that provides many features on top of what is currently possible. I won&#8217;t go too much into their virtues here. One thing that&#8217;s bothered me is the dearth of a testing guide. More [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=ikaisays.com&#038;blog=5969280&#038;post=171&#038;subd=ikailansays&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>I&#8217;ve been playing around with the <a href="http://www.tipfy.org/">Tipfy</a> framework for <a href="http://code.google.com/appengine/">App Engine</a>. Tipfy is a framework built on top of App Engine&#8217;s APIs that provides many features beyond what those APIs offer on their own. I won&#8217;t go too much into its virtues here.</p>
<p>One thing that&#8217;s bothered me is the lack of a testing guide. More disturbing still is that one of the top search results for unit testing is a groups post of a <a href="http://groups.google.com/group/tipfy/msg/b3ed5a4c69d9c537">developer bragging that he doesn&#8217;t write tests</a> (let&#8217;s hope no one ever has to work with you). Digging around, it&#8217;s clear that Rodrigo Moraes, the creator of Tipfy, emphasizes testing in his own work, as evidenced by the <a href="https://bitbucket.org/moraes/tipfy/src/5c71c3927850/tests/">testing package in the Tipfy source repository</a>. I&#8217;ve decided to write this quick guide to save other developers the detective work I&#8217;ve had to do to get unit tests running.</p>
<h2>Shortcut</h2>
<p>So &#8211; if you don&#8217;t want to read, you can just skip ahead and read this <a href="http://pastie.org/1580946">code sample which shows an example of how to write tests</a> for the demo &#8220;Hello, World&#8221; application that comes as part of the Tipfy download.</p>
<h2>Getting Started</h2>
<p>We&#8217;re going to need a few different tools to run tests. Note that we don&#8217;t <em>need</em> need them, I just find that using these tools will make our life a lot easier:</p>
<ul>
<li>Nose &#8211; Nose is a popular Python test discovery and execution tool. Nose will dig through your source directory and run your tests</li>
<li>Nose GAE plugin &#8211; this is the plugin that makes nose play nice with the local App Engine SDK</li>
</ul>
<p>If you don&#8217;t already have these tools installed, go ahead and install them with easy_install:</p>
<pre class="brush: bash; title: ; notranslate">
sudo easy_install nose
sudo easy_install nosegae</pre>
<p>We&#8217;ll also need to make sure tipfy is on our PYTHONPATH. Look for tipfy under YOUR_TIPFY_INSTALL/app/distlib. Here&#8217;s what I see as of the writing of this post:</p>
<pre class="brush: bash; title: ; notranslate">
distlib ikai$ ls
README.txt	babel		jinja2		tipfy		werkzeug</pre>
<p>Add this to your PYTHONPATH by adding a line to ~/.bash_profile (or equivalent on your system):</p>
<pre class="brush: bash; title: ; notranslate">export PYTHONPATH=&quot;/path/to/root/of/tipfy/libraries&quot;</pre>
<p>If needed, run:</p>
<pre class="brush: bash; title: ; notranslate">source ~/.bash_profile</pre>
<p>Alright, you&#8217;re ready to roll. Run a test from the root of your application directory. It&#8217;s probably easiest to do this from the directory app.yaml resides in:</p>
<pre class="brush: bash; title: ; notranslate">nosetests -d --with-gae --without-sandbox -v</pre>
<p>Note that this assumes your App Engine SDK lives at /usr/local/google_appengine. If it doesn&#8217;t, either symlink it or pass the --gae-lib-root flag.</p>
<p>You only really need the --with-gae and --without-sandbox flags, but I like the other flags. Type nosetests --help for a full description of the options available.</p>
<p>Now let&#8217;s write some tests.</p>
<h2>Writing tests</h2>
<p>Let&#8217;s create a new file for tests. Tipfy has a concept of apps within a project (think Django apps), so for this example, to keep things organized, I&#8217;ll create a file called tests.py in each app directory (we&#8217;ll have to remember to create a setting in app.yaml to not upload this file, but this isn&#8217;t crucial). Each tests.py is responsible for testing the app it&#8217;s colocated with. It&#8217;d be equally valid to create a test directory.</p>
<p>Here&#8217;s our tests.py:</p>
<pre class="brush: python; title: ; notranslate">
import unittest

from tipfy import RequestHandler, Tipfy
import urls

class TestHandler(unittest.TestCase):
    
    def setUp(self):
        self.app = Tipfy(rules=urls.get_rules(None))        
        self.client = self.app.get_test_client()

    
    def test_hello_world_handler(self):        
        response = self.client.get('/', follow_redirects=True)
        self.assertEquals(response.data, &quot;Hello BLAH&quot;)
            
    def test_pretty_hello_world_handler(self):                
        response = self.client.get('/pretty')
        self.assertTrue(&quot;Hello, World!&quot; in response.data)
</pre>
<p>Let&#8217;s talk through what we&#8217;re doing here step by step:</p>
<pre class="brush: python; title: ; notranslate">
    def setUp(self):
        self.app = Tipfy(rules=urls.get_rules(None))
        self.client = self.app.get_test_client()
</pre>
<p>If you&#8217;re used to Python testing, this shouldn&#8217;t look too surprising to you. The setUp function is run before each test. We&#8217;re doing two things here:</p>
<ol>
<li>Initialize an instance of the app. We&#8217;ve imported the urls module from this app, so we can call get_rules() on it to get our URL mappings. We&#8217;re passing None to this because it expects an app, but as luck would have it, the &#8220;Hello World&#8221; demo doesn&#8217;t actually use this parameter.</li>
<li>Initialize an instance of the test client. This is what we&#8217;ll be using to make requests.</li>
</ol>
<p>Now let&#8217;s talk about the tests:</p>
<pre class="brush: python; title: ; notranslate">
    def test_hello_world_handler(self):
        response = self.client.get('/', follow_redirects=True)
        self.assertEquals(response.data, &quot;Hello BLAH&quot;)

    def test_pretty_hello_world_handler(self):
        response = self.client.get('/pretty')
        self.assertTrue(&quot;Hello, World!&quot; in response.data)
</pre>
<p>In test_hello_world_handler(), we use self.client.get() to make a call to the &#8220;/&#8221; URL. Note that we&#8217;ve passed a follow_redirects argument; we don&#8217;t actually need this. This is just something I copied over from <a href="http://www.tipfy.org/paste/view/685812">Rodrigo&#8217;s original testing example</a>. We test to ensure that the response body equals the expected output.</p>
<p>In our second test, we test the &#8220;pretty&#8221; version of this handler. We look for a substring in the response body, but really it&#8217;s up to us how we want to do this. In general, we don&#8217;t want to look for an exact match of the output, since this makes our test extremely brittle and we&#8217;ll end up either not maintaining or deleting this test.</p>
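<p>To see why exact matches are brittle, consider this small Python 3 sketch. The two render functions are hypothetical stand-ins for a handler before and after a harmless markup change; they are not part of the Tipfy demo.</p>

```python
def render_v1(name):
    # Hypothetical first version of a handler's output.
    return "<h1>Hello, %s!</h1>" % name


def render_v2(name):
    # The same greeting after a harmless markup change.
    return '<h1 class="greeting">Hello, %s!</h1>' % name


EXPECTED_V1 = "<h1>Hello, World!</h1>"


def exact_match(output):
    # Brittle: any change to the surrounding markup breaks this check.
    return output == EXPECTED_V1


def contains_greeting(output):
    # Looser: only the part we actually care about has to be present.
    return "Hello, World!" in output
```

<p>Both checks pass against the first version, but only the substring check still passes after the markup change.</p>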
<p>Advanced users will likely have all the handlers extend a BaseHandler RequestHandler class and call self.render(). We can point the render method to a Mock method, then try to capture the context parameters that were passed. (This is a bit out of scope for this post, but I may follow up with some quick samples of how to do mocking &#8211; I like <a href="http://twitter.com/voidspace">Michael Foord&#8217;s</a> <a href="http://www.voidspace.org.uk/python/mock/">Mock</a> library.)</p>
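<p>As a rough idea of what capturing the render context could look like, here is a Python 3 sketch using <code>unittest.mock</code>, the standard-library descendant of the Mock library. The <code>BaseHandler</code> and <code>HelloHandler</code> classes below are hypothetical and only imitate the shape of such handlers; they are not Tipfy&#8217;s API.</p>

```python
from unittest import mock


class BaseHandler(object):
    """Hypothetical base class standing in for a real handler; not Tipfy's API."""

    def render(self, template, **context):
        raise NotImplementedError("real template rendering is out of scope")


class HelloHandler(BaseHandler):
    def get(self):
        return self.render("hello.html", greeting="Hello, World!")


def capture_render_call():
    handler = HelloHandler()
    # Swap render() for a mock so no template is needed, then record the call.
    with mock.patch.object(HelloHandler, "render", return_value="rendered") as fake_render:
        result = handler.get()
    args, kwargs = fake_render.call_args
    return result, args, kwargs
```

<p>A test can then assert on the template name and the context values directly, without caring about the generated markup.</p>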
<h2>Writing tests with the datastore</h2>
<p>Let&#8217;s do something a bit more interesting. Let&#8217;s run some tests with the datastore. We&#8217;ll also demonstrate some other ways of testing Tipfy. Let&#8217;s consider the following, updated code snippet:</p>
<pre class="brush: python; title: ; notranslate">

# Install nose and nosegae:
#   sudo easy_install nose
#   sudo easy_install nosegae
#
# run via:
#  nosetests --with-gae --without-sandbox -v

import unittest

from tipfy import RequestHandler, Rule, Tipfy
# Need this import for testing
from google.appengine.api import apiproxy_stub_map, datastore_file_stub
from google.appengine.ext import db
import urls

class Comment(db.Model):
    body = db.StringProperty()

class TestHandler(unittest.TestCase):

    def setUp(self):
        &quot;&quot;&quot;
            We use this to clear the datastore. Thanks to Gaetestbed for
            his example here:

https://github.com/jgeewax/gaetestbed/blob/master/gaetestbed/datastore.py

        &quot;&quot;&quot;
        datastore_stub = apiproxy_stub_map.apiproxy._APIProxyStubMap__stub_map['datastore_v3']
        datastore_stub.Clear()

        # We're importing rules from the sample app
        # The sample app doesn't require an app
        self.app = Tipfy(rules=urls.get_rules(None))
        self.client = self.app.get_test_client()

    def test_hello_world_handler(self):
        response = self.client.get('/', follow_redirects=True)
        self.assertEquals(response.data, &quot;Hello BLAH&quot;)

    def test_pretty_hello_world_handler(self):
        response = self.client.get('/pretty')
        self.assertTrue(&quot;Hello, World!&quot; in response.data)

    def test_save_comment(self):
        class DatastorePostHandler(RequestHandler):
            def post(self):
                body = self.request.form.get(&quot;body&quot;)
                comment = Comment()
                comment.body = body
                comment.save()
                return &quot;OK&quot;

        rules = [
            Rule('/ds', endpoint='ds', handler=DatastorePostHandler),
        ]

        app = Tipfy(rules=rules)
        client = app.get_test_client()
        response = client.post('/ds')
        self.assertEquals(response.data, &quot;OK&quot;)
        comments = Comment.all().fetch(100)
        self.assertEquals(1, len(comments))
</pre>
<p>Revisiting the setUp() method, we see that we have a new line of code:</p>
<pre class="brush: python; title: ; notranslate">
datastore_stub = apiproxy_stub_map.apiproxy._APIProxyStubMap__stub_map['datastore_v3']
datastore_stub.Clear()
</pre>
<p>Between test invocations, the datastore stub is NOT cleared. Calling Clear() in setUp() resets it, since the last thing we want is to have state persist between tests. Sharing state across tests is a very bad practice I occasionally see in &#8220;clever&#8221; attempts to save lines of code. Don&#8217;t do it. It causes flaky tests and will give you hours of pain. Reset your state and rebuild it each time.</p>
<p>test_save_comment() defines a handler and a set of rules for our Tipfy instance. We probably won&#8217;t be doing this for non-trivial applications, since the whole point is to test some handler code we wrote, but it serves our purpose for this example. We want to test for a side effect &#8211; in this case, that a comment was saved. In a more complete test, we would not only test for the number of comments, but we&#8217;d also test that the body was saved. Notice the difference in our call to client.post() &#8211; this invokes an HTTP POST instead of an HTTP GET.</p>
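<p>The same ideas can be shown without the App Engine SDK. The Python 3 sketch below uses a module-level list as a hypothetical stand-in for the datastore stub: it survives between tests unless setUp() clears it, and each test checks both the number of comments and the saved body. None of these names come from Tipfy or App Engine.</p>

```python
import unittest

# Module-level state survives between tests, much like the datastore stub does.
_FAKE_STORE = []


def save_comment(body):
    _FAKE_STORE.append({"body": body})


class CommentTest(unittest.TestCase):
    def setUp(self):
        # Reset shared state so each test starts from an empty store.
        del _FAKE_STORE[:]

    def test_save_one(self):
        save_comment("first")
        self.assertEqual(len(_FAKE_STORE), 1)
        self.assertEqual(_FAKE_STORE[0]["body"], "first")

    def test_save_another(self):
        # Would see two comments if setUp() did not clear the store.
        save_comment("second")
        self.assertEqual(len(_FAKE_STORE), 1)
        self.assertEqual(_FAKE_STORE[0]["body"], "second")


def run_comment_tests():
    # Run the tests programmatically instead of through nose.
    suite = unittest.TestLoader().loadTestsFromTestCase(CommentTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

<p>Removing the line in setUp() makes whichever test runs second fail, which is exactly the kind of order-dependent flakiness that clearing the stub avoids.</p>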
<p>When we run nosetests with the command above, we get:</p>
<pre class="brush: plain; title: ; notranslate">

$ nosetests -d --with-gae --without-sandbox -v
test_hello_world_handler (apps.hello_world.tests.TestHandler) ... ok
test_pretty_hello_world_handler (apps.hello_world.tests.TestHandler) ... ok
test_save_comment (apps.hello_world.tests.TestHandler) ... ok

----------------------------------------------------------------------
Ran 3 tests in 0.206s

OK
</pre>
<p>And life is good again.</p>
<h2>Final notes on testing</h2>
<p>I&#8217;m not one of those people who believe that 100% test coverage, or even 80% test coverage, is needed for a project to be well covered. That much coverage often involves lots and lots of test code, and the payoff is relatively minor, especially for trivial code paths.</p>
<p>I also see a lot of developers completely isolate each layer of the stack. In the datastore example above, these developers would have completely mocked out the datastore layer. I don&#8217;t find this to be a useful practice by default, as you end up testing your mocks and not the code. There are cases where this practice is useful, but in most cases, you will have more confidence in your code if you take the time to define a correct set of fixtures. Where you&#8217;ll 100% want mocks are places where you have complex or external services that can be flaky, or when you need to replicate failure conditions that are difficult to programmatically cause in your code.</p>
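<p>For the failure-condition case, a mock can raise on demand. This is a minimal Python 3 sketch using <code>unittest.mock</code>; <code>fetch_profile</code>, the URL, and the fallback value are all hypothetical and exist only to show the technique.</p>

```python
from unittest import mock


def fetch_profile(http_get, user_id):
    # http_get is any callable that takes a URL and returns a response body.
    try:
        return http_get("https://api.example.com/users/%s" % user_id)
    except IOError:
        # Fall back to a placeholder when the remote service is unavailable.
        return "unavailable"


def simulate_outage():
    # side_effect makes the mock raise, reproducing a failure on demand.
    flaky_get = mock.Mock(side_effect=IOError("connection reset"))
    return fetch_profile(flaky_get, 42), flaky_get.call_count
```

<p>The error path gets exercised without having to take a real service down, which is where mocks earn their keep.</p>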
<p>Don&#8217;t think of testing as a replacement for QA because it&#8217;s not. In web testing, think of it as a replacement for opening a browser and clicking. When you discover a bug, you write a test for it and try to fix it, because in most cases setting up the error state will be much easier programmatically than manually. You&#8217;re always going to have to do browser testing at some point, but it&#8217;s time consuming, especially if you need your data in a specific state. You could go the <a href="http://seleniumhq.org/">Selenium</a> route for full coverage, but in my experience (people are going to disagree with me on this &#8211; get ready for comment/Twitter trolling), Selenium tests, while providing a high level of confidence, are also extremely brittle and become a maintenance nightmare if you have too many of them. You&#8217;ll want to write as many tests as you can outside the browser environment and save Selenium for the minority of your user flows that are critical. For client-side functionality, write JavaScript unit tests instead of Selenium tests. I&#8217;ve used <a href="http://www.jsunit.net/">JsUnit</a> before and have heard good things about <a href="http://pivotal.github.com/jasmine/">Jasmine</a>, but I&#8217;ve never used it myself.</p>
<p>And my last tip? Do what works for your team. But do write tests, because it&#8217;s one of those practices that will pay off over time if you write AND maintain them well.</p>
<p>- Ikai</p>
]]></content:encoded>
			<wfw:commentRss>http://ikaisays.com/2011/02/19/unit-testing-in-tipfy-an-app-engine-framework-in-python/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/70564585574a96fa34fbab2a1a6981b0?s=96&#38;d=http%3A%2F%2F1.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96&#38;r=G" medium="image">
			<media:title type="html">Ikai Lan</media:title>
		</media:content>
	</item>
		<item>
		<title>App Engine datastore tip: monotonically increasing values are bad</title>
		<link>http://ikaisays.com/2011/01/25/app-engine-datastore-tip-monotonically-increasing-values-are-bad/</link>
		<comments>http://ikaisays.com/2011/01/25/app-engine-datastore-tip-monotonically-increasing-values-are-bad/#comments</comments>
		<pubDate>Wed, 26 Jan 2011 02:26:24 +0000</pubDate>
		<dc:creator>Ikai Lan</dc:creator>
				<category><![CDATA[App Engine]]></category>
		<category><![CDATA[Tips and Tricks]]></category>

		<guid isPermaLink="false">http://ikaisays.com/?p=159</guid>
		<description><![CDATA[When saving entities to App Engine&#8217;s datastore at a high write rate, avoid monotonically increasing values such as timestamps. Generally speaking, you don&#8217;t have to worry about this sort of thing until your application hits 100s of queries per second. Once you&#8217;re in that ballpark, you may want to examine potential hotspots in your application [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=ikaisays.com&#038;blog=5969280&#038;post=159&#038;subd=ikailansays&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>When saving entities to App Engine&#8217;s datastore at a high write rate, avoid monotonically increasing values such as timestamps. Generally speaking, you don&#8217;t have to worry about this sort of thing until your application hits 100s of queries per second. Once you&#8217;re in that ballpark, you may want to examine potential hotspots in your application that can increase datastore latency.</p>
<p>To explain why this is, let&#8217;s examine what happens to the underlying Bigtable of an application with a high write rate. When a Bigtable <em>tablet</em>, a contiguous unit of storage, experiences a high write rate, the tablet will have to &#8220;split&#8221; into more than one tablet. This &#8220;split&#8221; allows new writes to shard. Here&#8217;s a visual approximation of what happens:</p>
<p><a href="http://ikailansays.files.wordpress.com/2011/01/hd-tablet-splitting.jpg"><img class="aligncenter size-full wp-image-161" title="hd - tablet splitting" src="http://ikailansays.files.wordpress.com/2011/01/hd-tablet-splitting.jpg?w=700&#038;h=906" alt="" width="700" height="906" /></a></p>
<p>There&#8217;s a moment of pain &#8211; this is one of the causes of datastore timeouts in high write applications, as discussed in <a href="http://blog.notdot.net">Nick Johnson</a>&#8217;s article, &#8220;<a href="http://code.google.com/appengine/articles/handling_datastore_errors.html">Handling Datastore Errors</a>&#8221;.</p>
<p>Remember that for indexed values, we must write corresponding index rows. When values are randomly or even semi-randomly distributed, like, say, user email addresses, tablet splits function well. This is because the work to write multiple values is distributed amongst several Bigtable tablets:</p>
<p><a href="http://ikailansays.files.wordpress.com/2011/01/tablet-splitting-randomly-distributed-keys.jpg"><img class="aligncenter size-full wp-image-164" title="tablet splitting - randomly distributed keys" src="http://ikailansays.files.wordpress.com/2011/01/tablet-splitting-randomly-distributed-keys.jpg?w=700&#038;h=540" alt="" width="700" height="540" /></a></p>
<p>The problems appear when we start saving monotonically increasing values like timestamps, or insert dictionary words in alphabetical order:</p>
<p><a href="http://ikailansays.files.wordpress.com/2011/01/tablet-splitting-monotonically-increasing-keys.jpg"><img class="aligncenter size-full wp-image-165" title="tablet splitting - monotonically increasing keys" src="http://ikailansays.files.wordpress.com/2011/01/tablet-splitting-monotonically-increasing-keys.jpg?w=700&#038;h=540" alt="" width="700" height="540" /></a></p>
<p>The new writes aren&#8217;t evenly distributed, and whichever tablet they end up going to becomes a new hot tablet in need of a split.</p>
<p>As a developer, what can you do to avoid this situation?</p>
<ul>
<li>Avoid indexes unless you need to query against the values. No index = no hot tablet on increasing value</li>
<li>Lower your write rate, or figure out how to better distribute values. A pure random distribution is best, but even a distribution that isn&#8217;t random will be better than a predictable, monotonically increasing value</li>
<li>Prefix a shard identifier to your value. This is problematic if you plan on doing queries, as you will need to prefix and unprefix the values, then join the results in memory &#8211; but it will reduce the error rate of your writes</li>
</ul>
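<p>A minimal sketch of the shard prefix approach, in Java. The class name, the shard count of 16 and the <em>shard:value</em> string format are assumptions made for illustration, not part of any App Engine API:</p>
<pre class="brush: java; title: ; notranslate">
import java.util.Random;

public class ShardedValue {
    // Hypothetical shard count; tune it to your write rate
    static final int NUM_SHARDS = 16;
    static final Random RANDOM = new Random();

    // Prepend a two-digit shard id so that consecutive timestamps land in
    // different parts of the index key space
    static String prefix(int shard, long timestamp) {
        return String.format("%02d:%d", shard, timestamp);
    }

    // Pick the shard at random for each write
    static String prefixWithRandomShard(long timestamp) {
        return prefix(RANDOM.nextInt(NUM_SHARDS), timestamp);
    }

    // Strip the shard id to recover the original value
    static long unprefix(String stored) {
        return Long.parseLong(stored.substring(stored.indexOf(':') + 1));
    }
}
</pre>
<p>To query a range of values you would run one query per shard prefix and merge the results in memory, which is the cost mentioned in the last tip.</p>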
<p>The tips are applicable whether you are on Master-Slave or <a href="http://googleappengine.blogspot.com/2011/01/announcing-high-replication-datastore.html">High Replication datastore</a>. And one more tip: don&#8217;t prematurely optimize for this case, since chances are, you won&#8217;t run into it. You can be spending that time working on features.</p>
<p>- Ikai</p>
<p>P.S. Yes, I drew those doodles. No, I do not have any formal art training (how could you tell?!)</p>
]]></content:encoded>
			<wfw:commentRss>http://ikaisays.com/2011/01/25/app-engine-datastore-tip-monotonically-increasing-values-are-bad/feed/</wfw:commentRss>
		<slash:comments>20</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/70564585574a96fa34fbab2a1a6981b0?s=96&#38;d=http%3A%2F%2F1.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96&#38;r=G" medium="image">
			<media:title type="html">Ikai Lan</media:title>
		</media:content>

		<media:content url="http://ikailansays.files.wordpress.com/2011/01/hd-tablet-splitting.jpg" medium="image">
			<media:title type="html">hd - tablet splitting</media:title>
		</media:content>

		<media:content url="http://ikailansays.files.wordpress.com/2011/01/tablet-splitting-randomly-distributed-keys.jpg" medium="image">
			<media:title type="html">tablet splitting - randomly distributed keys</media:title>
		</media:content>

		<media:content url="http://ikailansays.files.wordpress.com/2011/01/tablet-splitting-monotonically-increasing-keys.jpg" medium="image">
			<media:title type="html">tablet splitting - monotonically increasing keys</media:title>
		</media:content>
	</item>
		<item>
		<title>GWT, Blobstore, the new high performance image serving API, and cute dogs on office chairs</title>
		<link>http://ikaisays.com/2010/09/08/gwt-blobstore-the-new-high-performance-image-serving-api-and-cute-dogs-on-office-chairs/</link>
		<comments>http://ikaisays.com/2010/09/08/gwt-blobstore-the-new-high-performance-image-serving-api-and-cute-dogs-on-office-chairs/#comments</comments>
		<pubDate>Thu, 09 Sep 2010 01:00:02 +0000</pubDate>
		<dc:creator>Ikai Lan</dc:creator>
				<category><![CDATA[App Engine]]></category>
		<category><![CDATA[Java]]></category>

		<guid isPermaLink="false">http://ikaisays.com/?p=142</guid>
		<description><![CDATA[I’ve been working on an image sharing application using GWT and App Engine to familiarize myself with the newer aspects of GWT. The project and code are here: http://ikai-photoshare.appspot.com http://github.com/ikai/gwt-gae-image-gallery (Please excuse spaghetti code in client side GWT code, much of it was me feeling my way around GWT. I’ve come to appreciate GWT quite [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=ikaisays.com&#038;blog=5969280&#038;post=142&#038;subd=ikailansays&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>I’ve been working on an image sharing application using GWT and App Engine to familiarize myself with the newer aspects of GWT. The project and code are here:</p>
<p><a href="http://ikai-photoshare.appspot.com">http://ikai-photoshare.appspot.com</a><br />
<a href="http://github.com/ikai/gwt-gae-image-gallery">http://github.com/ikai/gwt-gae-image-gallery</a></p>
<p>(Please excuse spaghetti code in client side GWT code, much of it was me feeling my way around GWT. I’ve come to appreciate GWT quite a bit in spite of the fact that I’m pretty familiar with client side development; I’ll write about this in a future post).</p>
<p>The <a href="http://googleappengine.blogspot.com/2010/08/multi-tenancy-support-high-performance_17.html">1.3.6 release</a> of the App Engine SDK shipped with a high performance image serving API. What this means is that a developer can take a blob key pointing to image data stored in the <a href="http://code.google.com/appengine/docs/java/blobstore/">blobstore</a> and call getServingUrl() to create a special URL for serving the image. What are the benefits to using this API?</p>
<ul>
<li>You don’t have to write your own <a href="http://code.google.com/appengine/articles/java/serving_dynamic_images.html">handler for uploaded images</a></li>
<li>You don’t have to consume storage quota for saving resized or cropped images, as you can perform transforms on the image simply by appending URL parameters. You only need to store the final URL that is generated by getServingUrl().</li>
<li>You aren’t charged for datastore CPU for fetching the image (you will still be billed for bandwidth)</li>
<li>Images are, in general, served from edge server locations which can be geographically located closer to the user</li>
</ul>
<p>There are a few drawbacks, however, to using the API:</p>
<ul>
<li>There aren’t any great schemes for access control of the images, and if someone has the URL for a thumbnail, they can easily remove the parameters to see a larger image</li>
<li>Billing must be enabled &#8211; you will only be charged for usage, however, so you don’t have to spend a cent to use the API. You just have to have billing active.</li>
<li>Deleting an image blob doesn’t delete the image being served from the URL right away &#8211; that image will still be available for some time</li>
<li>Images must be uploaded to the blobstore, not the datastore as a blob, so it’s important to understand how the blobstore API works</li>
<li>The URLs of the created images are really, really ugly. If you need pretty URLs, it’s probably a better pattern to create a URL mapping to an HTML page that just displays the image in an IMG tag</li>
</ul>
<h2>Blobstore crash course</h2>
<p>Let’s go through a quick refresher on the blobstore before we begin. Here’s the standard flow for a blobstore upload:</p>
<ol>
<li> Create a new blobstore session and generate an upload URL for a form to POST to. This is done using the <a href="http://code.google.com/appengine/docs/java/javadoc/com/google/appengine/api/blobstore/BlobstoreService.html#createUploadUrl(java.lang.String)">createUploadUrl() method of BlobstoreService</a>. Pass a callback URL to this method. This URL is where the user will be forwarded after the upload has completed.</li>
<li>Present an upload form to the user. The action is the URL generated in step 1. Each URL must be unique: you cannot use the same URL for multiple sessions, as this will cause an error.</li>
<li>After the user has uploaded the file, they are forwarded to the callback URL in your App Engine application specified in step 1. The key of the uploaded blob, a String blob key, is passed as a URL parameter. Save this key and pass the user to their final destination.</li>
</ol>
<p>Got it? Now we can talk about image serving.</p>
<h2>Using the image serving URL</h2>
<p>Once we have a blob key (step 3 of a Blobstore upload), we can do interesting things with it. First, we’ll need to create an instance of the ImagesService:</p>
<pre class="brush: java; title: ; notranslate">
ImagesService imagesService = ImagesServiceFactory.getImagesService();
</pre>
<p>Once we have an instance, we pass the blob key to getServingUrl and get back a URL:</p>
<pre class="brush: java; title: ; notranslate">
String imageUrl = imagesService.getServingUrl(blobKey);
</pre>
<p>This can sometimes take several hundred milliseconds to a few seconds to generate, so it’s almost always a good idea to run this on write as opposed to first read. Subsequent calls should be faster, but they may not be as fast as reading this value from a datastore entity property or memcache. Since this value doesn’t change, it’s a good idea to store it. On the local dev server, this URL looks something like this:</p>
<pre>/_ah/img/eq871HJL_bYxhWQbTeYYoA</pre>
<p>In production, however, this will return a URL that looks like this:</p>
<pre>http://lh5.ggpht.com/2PQk0vDo8Bn8oiPba2gtGlDfd1ciD0H0MLrixcT12FCDQEm2oyMW9ErJX_-ZzOHBWbYBKzevK0BY6cxdZ3cxf_37</pre>
<p>(Cute dogs below)<br />
<img src="http://lh5.ggpht.com/2PQk0vDo8Bn8oiPba2gtGlDfd1ciD0H0MLrixcT12FCDQEm2oyMW9ErJX_-ZzOHBWbYBKzevK0BY6cxdZ3cxf_37=s500" />
</p>
<p>You’ve already saved yourself the trouble of writing a handler. What’s really nice about this URL is that you can perform operations on it just by appending parameters. Let’s say we wanted to resize our image to be no larger than 144&#215;144, yet retain its aspect ratio. We’d simply append “=s144” to the end of the URL:</p>
<pre>http://lh5.ggpht.com/2PQk0vDo8Bn8oiPba2gtGlDfd1ciD0H0MLrixcT12FCDQEm2oyMW9ErJX_-ZzOHBWbYBKzevK0BY6cxdZ3cxf_37=s144</pre>
<p>(Looks like this)<br />
<img src="http://lh5.ggpht.com/2PQk0vDo8Bn8oiPba2gtGlDfd1ciD0H0MLrixcT12FCDQEm2oyMW9ErJX_-ZzOHBWbYBKzevK0BY6cxdZ3cxf_37=s144" alt="" /></p>
<p>We can also crop the image by appending a “-c” to the size parameter:</p>
<pre>http://lh5.ggpht.com/2PQk0vDo8Bn8oiPba2gtGlDfd1ciD0H0MLrixcT12FCDQEm2oyMW9ErJX_-ZzOHBWbYBKzevK0BY6cxdZ3cxf_37=s144-c</pre>
<p>(Looks like this &#8211; compare with above)</p>
<p><img src="http://lh5.ggpht.com/2PQk0vDo8Bn8oiPba2gtGlDfd1ciD0H0MLrixcT12FCDQEm2oyMW9ErJX_-ZzOHBWbYBKzevK0BY6cxdZ3cxf_37=s144-c" alt="" /></p>
<p>Note that we can also generate these URLs programmatically using the <a href="http://code.google.com/appengine/docs/java/javadoc/com/google/appengine/api/images/ImagesService.html#getServingUrl(com.google.appengine.api.blobstore.BlobKey, int, boolean)">overloaded version of getServingUrl</a> that also accepts a size and crop parameter.</p>
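<p>If you only stored the base URL, the same suffixes can be built with plain string handling. This is a small sketch: ServingUrls and withSize are hypothetical names, and the code only reproduces the “=s” and “-c” conventions shown above:</p>
<pre class="brush: java; title: ; notranslate">
public class ServingUrls {
    // Append "=s" plus the size, and "-c" when cropping, to a serving URL
    static String withSize(String servingUrl, int size, boolean crop) {
        String suffix = "=s" + size;
        if (crop) {
            suffix = suffix + "-c";
        }
        return servingUrl + suffix;
    }
}
</pre>
<p>Calling withSize(url, 144, true) on the stored URL gives the cropped thumbnail URL from the example above, without another call to the Images service.</p>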
<h2>Adding GWT</h2>
<p>So now that we’ve got all that done, let’s get it working with GWT. It’s important that we understand how it all works, because GWT’s single-page, Javascript-generated content model must be taken into account. Let’s draw our upload widget. We’ll be using UiBinder:</p>
<p>We’ll create our Composite class as follows:</p>
<pre class="brush: java; title: ; notranslate">
public class UploadPhoto extends Composite {

    private static UploadPhotoUiBinder uiBinder = GWT.create(UploadPhotoUiBinder.class);

    UserImageServiceAsync userImageService = GWT.create(UserImageService.class);

    interface UploadPhotoUiBinder extends UiBinder&lt;Widget, UploadPhoto&gt; {}

    @UiField
    Button uploadButton;

    @UiField
    FormPanel uploadForm;

    @UiField
    FileUpload uploadField;

    public UploadPhoto() {
        initWidget(uiBinder.createAndBindUi(this));
    }

}
</pre>
<p>Here&#8217;s the corresponding XML file:</p>
<pre class="brush: xml; title: ; notranslate">
&lt;!DOCTYPE ui:UiBinder SYSTEM &quot;http://dl.google.com/gwt/DTD/xhtml.ent&quot;&gt;
&lt;ui:UiBinder xmlns:ui=&quot;urn:ui:com.google.gwt.uibinder&quot;
	xmlns:g=&quot;urn:import:com.google.gwt.user.client.ui&quot;&gt;
	&lt;g:FormPanel ui:field=&quot;uploadForm&quot;&gt;
		&lt;g:HorizontalPanel&gt;
			&lt;g:FileUpload ui:field=&quot;uploadField&quot;&gt;&lt;/g:FileUpload&gt;
			&lt;g:Button ui:field=&quot;uploadButton&quot;&gt;&lt;/g:Button&gt;
		&lt;/g:HorizontalPanel&gt;
	&lt;/g:FormPanel&gt;
&lt;/ui:UiBinder&gt; 
</pre>
<p>(We’ll add more to this later)</p>
<p>When we discussed the Blobstore, we mentioned that each upload form has a different POST location corresponding to the upload session. We’ll have to add a GWT-RPC component to generate and return a URL. Let’s do that now:</p>
<pre class="brush: java; title: ; notranslate">
// UserImageService.java
@RemoteServiceRelativePath(&quot;images&quot;)
public interface UserImageService extends RemoteService  {
    public String getBlobstoreUploadUrl();
}
</pre>
<p>Our IDE will nag us to generate the corresponding Async interface if we have a GWT plugin:</p>
<pre class="brush: java; title: ; notranslate">
// UserImageServiceAsync.java
public interface UserImageServiceAsync {
    public void getBlobstoreUploadUrl(AsyncCallback&lt;String&gt; callback);
}
</pre>
<p>We’ll need to write the code on the server side:</p>
<pre class="brush: java; title: ; notranslate">
// UserImageServiceImpl.java
@SuppressWarnings(&quot;serial&quot;)
public class UserImageServiceImpl extends RemoteServiceServlet implements UserImageService {

    @Override
    public String getBlobstoreUploadUrl() {
        BlobstoreService blobstoreService = BlobstoreServiceFactory.getBlobstoreService();
        return blobstoreService.createUploadUrl(&quot;/upload&quot;);
    }

}
</pre>
<p>This is pretty straightforward. We’ll want to invoke this service on the client side when we build the form. Let’s add this to UploadPhoto:</p>
<pre class="brush: java; title: ; notranslate">
public class UploadPhoto extends Composite {

    private static UploadPhotoUiBinder uiBinder = GWT.create(UploadPhotoUiBinder.class);
    UserImageServiceAsync userImageService = GWT.create(UserImageService.class);

    interface UploadPhotoUiBinder extends UiBinder&lt;Widget, UploadPhoto&gt; {}

    @UiField
    Button uploadButton;

    @UiField
    FormPanel uploadForm;

    @UiField
    FileUpload uploadField;

    public UploadPhoto() {
        initWidget(uiBinder.createAndBindUi(this));

        // Disable the button until we get the URL to POST to
        uploadButton.setText(&quot;Loading...&quot;);
        uploadForm.setEncoding(FormPanel.ENCODING_MULTIPART);
        uploadForm.setMethod(FormPanel.METHOD_POST);
        uploadButton.setEnabled(false);
        uploadField.setName(&quot;image&quot;);

        // Now we use our GWT-RPC service and get a URL
        startNewBlobstoreSession();

        // Once we've hit submit and it's complete, let's set the form to a new session.
        // We could also have probably done this on the onClick handler
        uploadForm.addSubmitCompleteHandler(new FormPanel.SubmitCompleteHandler() {

            @Override
            public void onSubmitComplete(SubmitCompleteEvent event) {
                uploadForm.reset();
                startNewBlobstoreSession();
            }
        });
    }

    private void startNewBlobstoreSession() {
        userImageService.getBlobstoreUploadUrl(new AsyncCallback&lt;String&gt;() {

            @Override
            public void onSuccess(String result) {
                uploadForm.setAction(result);
                uploadButton.setText(&quot;Upload&quot;);
                uploadButton.setEnabled(true);
            }

            @Override
            public void onFailure(Throwable caught) {
                // We probably want to do something here
            }
        });
    }

    @UiHandler(&quot;uploadButton&quot;)
    void onSubmit(ClickEvent e) {
        uploadForm.submit();
    }

}
</pre>
<p>This is fairly standard GWT RPC.</p>
<p>So that concludes the GWT part of it. We mentioned an upload callback. Let’s implement that now:</p>
<pre class="brush: java; title: ; notranslate">
/**
 * @author Ikai Lan
 * 
 *         This is the servlet that handles the callback after the blobstore
 *         upload has completed. After the blobstore handler completes, it POSTs
 *         to the callback URL, which must return a redirect. We redirect to the
 *         GET portion of this servlet, which sends back the image serving URL.
 *         This adds an extra request, but it gives GWT something to read from
 *         the form submission result. Note the content-type. We *need* to set
 *         this to get this to work. On the GWT side, we'll take this and show
 *         the image that was uploaded.
 * 
 */
@SuppressWarnings(&quot;serial&quot;)
public class UploadServlet extends HttpServlet {
	private static final Logger log = Logger.getLogger(UploadServlet.class
			.getName());

	private BlobstoreService blobstoreService = BlobstoreServiceFactory
			.getBlobstoreService();

	public void doPost(HttpServletRequest req, HttpServletResponse res)
			throws ServletException, IOException {

		Map&lt;String, BlobKey&gt; blobs = blobstoreService.getUploadedBlobs(req);
		BlobKey blobKey = blobs.get(&quot;image&quot;);

		if (blobKey == null) {
			// Uh ... something went really wrong here
		} else {

			ImagesService imagesService = ImagesServiceFactory
					.getImagesService();

			// Get the image serving URL
			String imageUrl = imagesService.getServingUrl(blobKey);

			// For the sake of clarity, we'll use low-level entities
			Entity uploadedImage = new Entity(&quot;UploadedImage&quot;);
			uploadedImage.setProperty(&quot;blobKey&quot;, blobKey);
			uploadedImage.setProperty(UploadedImage.CREATED_AT, new Date());

			// Highly unlikely we'll ever filter on this property
			uploadedImage.setUnindexedProperty(UploadedImage.SERVING_URL,
					imageUrl);

			DatastoreService datastore = DatastoreServiceFactory
					.getDatastoreService();
			datastore.put(uploadedImage);

			res.sendRedirect(&quot;/upload?imageUrl=&quot; + imageUrl);
		}
	}

	@Override
	protected void doGet(HttpServletRequest req, HttpServletResponse resp)
			throws ServletException, IOException {

		String imageUrl = req.getParameter(&quot;imageUrl&quot;);
		resp.setHeader(&quot;Content-Type&quot;, &quot;text/html&quot;);

		// This is a bit hacky, but it'll work. The client reads this URL
		// from the form submission result and uses it to display the
		// uploaded image
		resp.getWriter().println(imageUrl);

	}
}
</pre>
<p>We’ll probably want to display the image we just uploaded in the client. Let’s add a few lines of code to our SubmitCompleteHandler to do this:</p>
<pre class="brush: java; title: ; notranslate">
	public void onSubmitComplete(SubmitCompleteEvent event) {
		uploadForm.reset();
		startNewBlobstoreSession();

		// This is what gets the result back - the content-type *must* be
		// text/html
		String imageUrl = event.getResults();
		Image image = new Image();
		image.setUrl(imageUrl);

		final PopupPanel imagePopup = new PopupPanel(true);
		imagePopup.setWidget(image);

		// Add some effects
		imagePopup.setAnimationEnabled(true); // animate opening the image
		imagePopup.setGlassEnabled(true); // darken everything under the image
		imagePopup.setAutoHideEnabled(true); // close image when the user clicks
												// outside it
		imagePopup.center(); // center the image

	}
</pre>
<p>And we’re done!</p>
<h2>Get the code</h2>
<p>I’ve got the code for this project here:</p>
<p><a href="http://github.com/ikai/gwt-gae-image-gallery">http://github.com/ikai/gwt-gae-image-gallery</a></p>
<p>Just a warning, this is a bit different from the sample code above. I wrote this post after I wrote the code, extracting the bare minimum to make this work. The code in the repository has experimental tagging, deletion and login handling. I’m adding features to it simply to see what else can be done, so expect changes. I’m aware of a few of the bugs with the code, and I’ll get around to fixing them, but again, it’s a demo project, so keep realistic expectations. As far as I can tell, however, the code above should be runnable locally and deployable (once you have enabled billing for blobstore).</p>
<p>Happy developing!</p>
]]></content:encoded>
			<wfw:commentRss>http://ikaisays.com/2010/09/08/gwt-blobstore-the-new-high-performance-image-serving-api-and-cute-dogs-on-office-chairs/feed/</wfw:commentRss>
		<slash:comments>22</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/70564585574a96fa34fbab2a1a6981b0?s=96&#38;d=http%3A%2F%2F1.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96&#38;r=G" medium="image">
			<media:title type="html">Ikai Lan</media:title>
		</media:content>

		<media:content url="http://lh5.ggpht.com/2PQk0vDo8Bn8oiPba2gtGlDfd1ciD0H0MLrixcT12FCDQEm2oyMW9ErJX_-ZzOHBWbYBKzevK0BY6cxdZ3cxf_37=s500" medium="image" />

		<media:content url="http://lh5.ggpht.com/2PQk0vDo8Bn8oiPba2gtGlDfd1ciD0H0MLrixcT12FCDQEm2oyMW9ErJX_-ZzOHBWbYBKzevK0BY6cxdZ3cxf_37=s144" medium="image" />

		<media:content url="http://lh5.ggpht.com/2PQk0vDo8Bn8oiPba2gtGlDfd1ciD0H0MLrixcT12FCDQEm2oyMW9ErJX_-ZzOHBWbYBKzevK0BY6cxdZ3cxf_37=s144-c" medium="image" />
	</item>
		<item>
		<title>Using the App Engine Mapper for bulk data import</title>
		<link>http://ikaisays.com/2010/08/11/using-the-app-engine-mapper-for-bulk-data-import/</link>
		<comments>http://ikaisays.com/2010/08/11/using-the-app-engine-mapper-for-bulk-data-import/#comments</comments>
		<pubDate>Wed, 11 Aug 2010 18:33:00 +0000</pubDate>
		<dc:creator>Ikai Lan</dc:creator>
				<category><![CDATA[App Engine]]></category>
		<category><![CDATA[Java]]></category>

		<guid isPermaLink="false">http://ikaisays.com/?p=132</guid>
		<description><![CDATA[Since my last post describing App Engine mapreduce, a new InputReader has been added to the Java project for reading from the Blobstore. Nick Johnson wrote a great demo where indexing was done via reading code uploaded to the blobstore. This was demo’d at Google I/O. Now that the library is officially part of the [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=ikaisays.com&#038;blog=5969280&#038;post=132&#038;subd=ikailansays&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>Since <a href="http://ikaisays.com/2010/07/09/using-the-java-mapper-framework-for-app-engine/">my last post</a> describing <a href="http://code.google.com/p/appengine-mapreduce/">App Engine mapreduce</a>, a new InputReader has been added to the Java project for reading from the Blobstore. Nick Johnson <a href="http://blog.notdot.net/2010/05/Exploring-the-new-mapper-API">wrote a great demo</a> where indexing was done via reading code uploaded to the blobstore. This was <a href="http://code.google.com/events/io/2010/sessions/batch-data-processing-app-engine.html">demo’d at Google I/O</a>. Now that the library is officially part of the project, it’s become much easier for developers to build Mappers that map across some large, contiguous piece of data as opposed to Entities in the datastore. The most obvious use case is data import. A developer looking to import large amounts of data would take the following steps:</p>
<ol>
<li>Create a CSV file containing the data you want to import. The assumption here is that each line of data corresponds to a datastore entity you want to create</li>
<li>Upload the CSV file to the blobstore. You’ll need billing to be enabled for this to work.</li>
<li>Create your Mapper, push it live and run your job importing your data.</li>
</ol>
<p>This isn’t meant to be a replacement for the <a href="http://code.google.com/appengine/docs/python/tools/uploadingdata.html">bulk uploader tool</a>; merely an alternative. This method takes a good deal more programming, particularly for custom data transforms. The advantage of this method is that the work is done on the server side, whereas the bulk uploader makes use of the <a href="http://code.google.com/appengine/articles/remote_api.html">remote API</a> to get work done. Let’s get started on each of the steps.</p>
<h2>Step 1: Create a CSV file with the data you want to upload</h2>
<p>We’re going to go through an example of uploading City and State information. MaxMind.com provides a <a href="http://www.maxmind.com/app/geolitecity">free GeoIP CSV file</a>. The free version isn’t as full featured as the paid version, but it’ll do fine for our demo. <strong>Be sure that if you use this file in any kind of production application that you </strong><a href="http://geolite.maxmind.com/download/geoip/database/LICENSE.txt"><strong>read and understand the license</strong></a><strong> first!</strong> For simplicity, we’re going to parse out only cities in the United States using grep. The file should now contain lines that look like this:</p>
<pre>
605,"US","NY","Valhalla","10595",41.0877,-73.7768,501,914
606,"US","PA","Pittsburgh","15222",40.4495,-79.9880,508,412
607,"US","MO","Bridgeton","63044",38.7667,-90.4201,609,314
608,"US","CA","San Francisco","94124",37.7312,-122.3826,807,415
609,"US","NY","New York","10017",40.7528,-73.9725,501,212
610,"US","PA","Bear Lake","16402",41.9491,-79.4448,516,814
611,"US","NJ","Piscataway","08854",40.5516,-74.4637,501,732
612,"US","NY","Keuka Park","14478",42.5669,-77.1325,555,315
613,"US","VT","Brattleboro","05302",42.8496,-72.6645,506,802
</pre>
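<p>Each of these lines will eventually have to be split into fields before it can become an entity. A minimal sketch of that parsing step follows. CityLine is a hypothetical helper, and the simple split assumes that no field contains a comma, which holds for the sample lines above:</p>
<pre class="brush: java; title: ; notranslate">
public class CityLine {
    // Split one line into fields and strip the surrounding double quotes.
    // Assumes no field contains a comma; real CSV needs a proper parser.
    static String[] parse(String line) {
        String[] fields = line.split(",");
        for (int i = 0; i != fields.length; i++) {
            fields[i] = fields[i].replace("\"", "");
        }
        return fields;
    }
}
</pre>
<p>For the first sample line, parse returns nine fields, with the state at index 2 and the city at index 3.</p>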
<h2>Step 2: Create an upload handler for your CSV file and upload the CSV file</h2>
<p>We’re going to create a basic handler for uploading a CSV file and displaying the key. We’ll need to pass this key to our mapper later. There isn’t too much magic here; it’s very similar to the sample code available for the <a href="http://code.google.com/appengine/docs/java/blobstore/overview.html">basic blobstore example</a>.</p>
<p>We’ll only do a quick overview of the code we need here, since the details are out of scope for this post. We’ll need these files:</p>
<h3>upload.jsp</h3>
<pre class="brush: java; title: ; notranslate">
&lt;%@ page language=&quot;java&quot; contentType=&quot;text/html; charset=ISO-8859-1&quot;
    pageEncoding=&quot;ISO-8859-1&quot;%&gt;
&lt;!DOCTYPE html PUBLIC &quot;-//W3C//DTD HTML 4.01 Transitional//EN&quot; &quot;http://www.w3.org/TR/html4/loose.dtd&quot;&gt;

&lt;%@page import=&quot;com.google.appengine.api.blobstore.BlobstoreService&quot;%&gt;
&lt;%@page import=&quot;com.google.appengine.api.blobstore.BlobstoreServiceFactory&quot;%&gt;
&lt;html&gt;
&lt;head&gt;
&lt;meta http-equiv=&quot;Content-Type&quot; content=&quot;text/html; charset=ISO-8859-1&quot;&gt;
&lt;title&gt;Upload your CSV file here&lt;/title&gt;
&lt;/head&gt;
&lt;body&gt;
    &lt;% BlobstoreService blobstoreService = BlobstoreServiceFactory.getBlobstoreService(); %&gt;
    &lt;form action=&quot;&lt;%= blobstoreService.createUploadUrl(&quot;/upload&quot;) %&gt;&quot; method=&quot;post&quot; enctype=&quot;multipart/form-data&quot;&gt;
        &lt;input type=&quot;file&quot; name=&quot;data&quot;&gt;
        &lt;input type=&quot;submit&quot; value=&quot;Submit&quot;&gt;
    &lt;/form&gt;
&lt;/body&gt;
&lt;/html&gt;
</pre>
<h3>UploadBlobServlet.java</h3>
<pre class="brush: java; title: ; notranslate">
package com.ikai.mapperdemo.servlets;

import java.io.IOException;
import java.util.Map;

import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import com.google.appengine.api.blobstore.BlobKey;
import com.google.appengine.api.blobstore.BlobstoreService;
import com.google.appengine.api.blobstore.BlobstoreServiceFactory;

@SuppressWarnings(&quot;serial&quot;)
public class UploadBlobServlet extends HttpServlet {
	public void doPost(HttpServletRequest req, HttpServletResponse resp)
			throws IOException {

		BlobstoreService blobstoreService = BlobstoreServiceFactory.getBlobstoreService();
		Map&lt;String, BlobKey&gt; blobs = blobstoreService.getUploadedBlobs(req);
		BlobKey blobKey = blobs.get(&quot;data&quot;);

		if (blobKey == null) {
			resp.sendRedirect(&quot;/&quot;);
		} else {
			resp.sendRedirect(&quot;/upload-success?blob-key=&quot; + blobKey.getKeyString());
		}
	}

}
</pre>
<h3>SuccessfulUploadServlet.java</h3>
<pre class="brush: java; title: ; notranslate">
package com.ikai.mapperdemo.servlets;

import java.io.IOException;

import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

@SuppressWarnings(&quot;serial&quot;)
public class SuccessfulUploadServlet extends HttpServlet {
	public void doGet(HttpServletRequest req, HttpServletResponse resp)
			throws IOException {

		String blobKey = req.getParameter(&quot;blob-key&quot;);

		resp.setContentType(&quot;text/html&quot;);
		resp.getWriter().println(&quot;Successfully uploaded. Download file: &lt;br/&gt;&quot;);
		resp.getWriter().println(
				&quot;&lt;a href='/serve?blob-key=&quot; + blobKey
						+ &quot;'&gt;Click to download&lt;/a&gt;&quot;);
	}

}
</pre>
<p>Source code for this and other helper functions should be available in the <a href="http://github.com/ikai/App-Engine-Java-Mapper-API-demos">GitHub repository</a>.</p>
<h2>Step 3: Create your Mapper</h2>
<p>Now we get to the fun part. We need to create our Mapper. A prerequisite for understanding what’s coming next is reading the <a href="http://ikaisays.com/2010/07/09/using-the-java-mapper-framework-for-app-engine/">last post about Mapper I wrote</a>, so check that out before proceeding if you aren’t familiar with Mapper basics. Our Mapper class looks like this:</p>
<h3>ImportFromBlobstoreMapper.java</h3>
<pre class="brush: java; title: ; notranslate">
package com.ikai.mapperdemo.mappers;

import java.util.logging.Logger;

import org.apache.hadoop.io.NullWritable;

import com.google.appengine.api.datastore.Entity;
import com.google.appengine.tools.mapreduce.AppEngineMapper;
import com.google.appengine.tools.mapreduce.BlobstoreRecordKey;
import com.google.appengine.tools.mapreduce.DatastoreMutationPool;

/**
 * 
 * This Mapper imports from a CSV file in the Blobstore. The CSV
 * is assumed to be in the MaxMind format for cities, states, zipcodes
 * and lat/long.
 * 
 * 
 * @author Ikai Lan
 *
 */
public class ImportFromBlobstoreMapper extends
		AppEngineMapper&lt;BlobstoreRecordKey, byte[], NullWritable, NullWritable&gt; {
	private static final Logger log = Logger.getLogger(ImportFromBlobstoreMapper.class
			.getName());

	@Override
	public void map(BlobstoreRecordKey key, byte[] segment, Context context) {
		
		String line = new String(segment);
		
		log.info(&quot;At offset: &quot; + key.getOffset());
		log.info(&quot;Got value: &quot; + line);
		
		// Line format looks like this:
		// 10644,&quot;US&quot;,&quot;VA&quot;,&quot;Tazewell&quot;,&quot;24651&quot;,37.0595,-81.5220,559,276
		// We're also assuming no errant commas in this simple example
		
		String[] values = line.split(&quot;,&quot;);
		String state = values[2];
		String cityName = values[3];		
		String zipcode = values[4];
		Double latitude = Double.parseDouble(values[5]);
		Double longitude = Double.parseDouble(values[6]);		
		
		state = state.replaceAll(&quot;\&quot;&quot;, &quot;&quot;);
		cityName = cityName.replaceAll(&quot;\&quot;&quot;, &quot;&quot;);
		zipcode = zipcode.replaceAll(&quot;\&quot;&quot;, &quot;&quot;);
		
		if(!zipcode.isEmpty()) {
			Entity zip = new Entity(&quot;Zip&quot;, zipcode);
			zip.setProperty(&quot;state&quot;, state);
			zip.setProperty(&quot;city&quot;, cityName);
			zip.setProperty(&quot;latitude&quot;, latitude);
			zip.setProperty(&quot;longitude&quot;, longitude);
			
			Entity city = new Entity(&quot;City&quot;, cityName);
			city.setProperty(&quot;state&quot;, state);
			city.setUnindexedProperty(&quot;zip&quot;, zipcode);
			
			DatastoreMutationPool mutationPool = this.getAppEngineContext(context)
					.getMutationPool();
			mutationPool.put(zip);
			mutationPool.put(city);
		}

	}
}
</pre>
<p>Let’s explain the things in this Mapper that are new:</p>
<pre class="brush: java; title: ; notranslate">
public class ImportFromBlobstoreMapper extends
		AppEngineMapper&lt;BlobstoreRecordKey, byte[], NullWritable, NullWritable&gt;
</pre>
<p>Note this line. It’s different from our previous Mappers in that the type arguments are no longer Key and Entity, but BlobstoreRecordKey and byte[]. The <a href="http://code.google.com/p/appengine-mapreduce/source/browse/trunk/java/src/com/google/appengine/tools/mapreduce/BlobstoreRecordKey.java">source for BlobstoreRecordKey is here</a>. Remember that map-reduce is about taking a large body of data and breaking it into smaller pieces to operate on. BlobstoreRecordKey represents a pointer to a range of data in our Blobstore, and the byte[] is the array that actually contains that data.</p>
<pre class="brush: java; title: ; notranslate">
public void map(BlobstoreRecordKey key, byte[] segment, Context context)
</pre>
<p>Again, notice the new types. By default, we are splitting on a newline, so segment represents a single line. We can change what we split on by specifying a terminator in mapreduce.xml.</p>
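<p>To make the splitting concrete, here’s a minimal plain-Java sketch of the idea: cut a byte array at each terminator byte and hand back the starting offset plus the bytes of each record. This is only an illustration and not the library’s actual code; the RecordSplitSketch class, its Record type and the split method are hypothetical names made up for this example.</p>
<pre class="brush: java; title: ; notranslate">
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration only: hypothetical names, not the appengine-mapreduce implementation.
public class RecordSplitSketch {

	/** One record: where it starts in the input and the bytes it contains. */
	public static final class Record {
		public final long offset;
		public final byte[] segment;

		Record(long offset, byte[] segment) {
			this.offset = offset;
			this.segment = segment;
		}
	}

	/** Cuts data at each terminator byte; the terminator itself is dropped. */
	public static List&lt;Record&gt; split(byte[] data, byte terminator) {
		List&lt;Record&gt; records = new ArrayList&lt;Record&gt;();
		int start = 0;
		for (int i = 0; i &lt; data.length; i++) {
			if (data[i] == terminator) {
				records.add(new Record(start, Arrays.copyOfRange(data, start, i)));
				start = i + 1;
			}
		}
		// Keep a final record that has no trailing terminator.
		if (start &lt; data.length) {
			records.add(new Record(start, Arrays.copyOfRange(data, start, data.length)));
		}
		return records;
	}

	public static void main(String[] args) {
		byte[] blob = &quot;a,b\nc,d\n&quot;.getBytes(StandardCharsets.UTF_8);
		for (Record r : split(blob, (byte) '\n')) {
			System.out.println(r.offset + &quot;: &quot; + new String(r.segment, StandardCharsets.UTF_8));
		}
	}
}
</pre>
<p>With a newline terminator, each record here plays roughly the role of the segment passed to map(), and its offset is analogous to the value we log from key.getOffset().</p>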
<pre class="brush: java; title: ; notranslate">
		String line = new String(segment);
		
		// Line format looks like this:
		// 10644,&quot;US&quot;,&quot;VA&quot;,&quot;Tazewell&quot;,&quot;24651&quot;,37.0595,-81.5220,559,276
		// We're also assuming no errant commas in this simple example
		
		String[] values = line.split(&quot;,&quot;);
		String state = values[2];
		String cityName = values[3];		
		String zipcode = values[4];
		Double latitude = Double.parseDouble(values[5]);
		Double longitude = Double.parseDouble(values[6]);		
		
		state = state.replaceAll(&quot;\&quot;&quot;, &quot;&quot;);
		cityName = cityName.replaceAll(&quot;\&quot;&quot;, &quot;&quot;);
		zipcode = zipcode.replaceAll(&quot;\&quot;&quot;, &quot;&quot;);

</pre>
<p>This is very naive String parsing. Nothing fancy here.</p>
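<p>If your data might contain commas inside quoted fields, a slightly more careful parser isn’t much more code. Here’s a rough sketch, assuming fields are optionally wrapped in double quotes and that a doubled quote inside a quoted field stands for a literal quote. CsvLineSketch and parseLine are hypothetical names, not part of the Mapper library, and the sample row is the one from the comment above with a comma added to the city name purely for illustration.</p>
<pre class="brush: java; title: ; notranslate">
import java.util.ArrayList;
import java.util.List;

// Illustration only: a hand-rolled parser for a single CSV line.
public class CsvLineSketch {

	/**
	 * Splits one CSV line into fields. Commas inside double quotes are kept,
	 * the surrounding quotes are dropped, and a doubled quote inside a quoted
	 * field becomes a single quote character.
	 */
	public static List&lt;String&gt; parseLine(String line) {
		List&lt;String&gt; fields = new ArrayList&lt;String&gt;();
		StringBuilder current = new StringBuilder();
		boolean inQuotes = false;
		for (int i = 0; i &lt; line.length(); i++) {
			char c = line.charAt(i);
			if (inQuotes) {
				if (c == '&quot;') {
					if (i + 1 &lt; line.length() &amp;&amp; line.charAt(i + 1) == '&quot;') {
						current.append('&quot;'); // escaped quote
						i++;
					} else {
						inQuotes = false; // closing quote
					}
				} else {
					current.append(c);
				}
			} else if (c == '&quot;') {
				inQuotes = true; // opening quote
			} else if (c == ',') {
				fields.add(current.toString()); // field boundary
				current.setLength(0);
			} else {
				current.append(c);
			}
		}
		fields.add(current.toString()); // last field has no trailing comma
		return fields;
	}

	public static void main(String[] args) {
		// Hypothetical row: the sample line with a comma added to the city name
		String line = &quot;10644,\&quot;US\&quot;,\&quot;VA\&quot;,\&quot;Tazewell, East\&quot;,\&quot;24651\&quot;,37.0595,-81.5220,559,276&quot;;
		List&lt;String&gt; values = parseLine(line);
		System.out.println(values.get(3));
		System.out.println(values.get(4));
	}
}
</pre>
<p>For real-world CSV files, a dedicated CSV library is probably a safer choice than hand-rolled parsing.</p>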
<pre class="brush: java; title: ; notranslate">
		if(!zipcode.isEmpty()) {
			Entity zip = new Entity(&quot;Zip&quot;, zipcode);
			zip.setProperty(&quot;state&quot;, state);
			zip.setProperty(&quot;city&quot;, cityName);
			zip.setProperty(&quot;latitude&quot;, latitude);
			zip.setProperty(&quot;longitude&quot;, longitude);
			
			Entity city = new Entity(&quot;City&quot;, cityName);
			city.setProperty(&quot;state&quot;, state);
			city.setUnindexedProperty(&quot;zip&quot;, zipcode);
			
			DatastoreMutationPool mutationPool = this.getAppEngineContext(context)
					.getMutationPool();
			mutationPool.put(zip);
			mutationPool.put(city);
		}
</pre>
<p>Again, very straightforward if you’ve seen this before. Some zipcodes in our CSV file subset are empty, so we check for that and skip creating entities for those rows. We’re adding 2 entities to the mutation pool here &#8211; a Zip and a City &#8211; each with a natural identifier (the zipcode or the city name) as its key name. This means we can look either one up directly by key with a datastore get. Remember that fetches by key are always faster than fetches with a query, since a query requires an index scan followed by a batch get, whereas the datastore can perform a get in a single operation.</p>
<p>That’s it for our Mapper. Let’s add a configuration:</p>
<pre class="brush: xml; title: ; notranslate">
  &lt;configuration name=&quot;Import all data from the Blobstore&quot;&gt;
    &lt;property&gt;
      &lt;name&gt;mapreduce.map.class&lt;/name&gt;
      
      &lt;!--  Set this to be your Mapper class  --&gt;
      &lt;value&gt;com.ikai.mapperdemo.mappers.ImportFromBlobstoreMapper&lt;/value&gt;
    &lt;/property&gt;
        
    &lt;!--  This is a default tool that lets us iterate over blobstore data --&gt;
    &lt;property&gt;
      &lt;name&gt;mapreduce.inputformat.class&lt;/name&gt;
      &lt;value&gt;com.google.appengine.tools.mapreduce.BlobstoreInputFormat&lt;/value&gt;
    &lt;/property&gt;
    
    &lt;property&gt;
      &lt;name human=&quot;Blob Keys to Map Over&quot;&gt;mapreduce.mapper.inputformat.blobstoreinputformat.blobkeys&lt;/name&gt;
      &lt;value template=&quot;optional&quot;&gt;blobkeyhere&lt;/value&gt;      
    &lt;/property&gt;        
    
    &lt;property&gt;
      &lt;name human=&quot;Number of shards to use&quot;&gt;mapreduce.mapper.shardcount&lt;/name&gt;
      &lt;value template=&quot;optional&quot;&gt;10&lt;/value&gt;      
    &lt;/property&gt;        
    
  &lt;/configuration&gt;  
</pre>
<p>We’ve changed 2 properties here: the input format class, and a property holding the blobstore key that points to the data to iterate over.</p>
<h2>Step 4: Deploy!</h2>
<p>We can now package our application up and deploy it! Make sure that you’ve built a new JAR file that includes the new classes in appengine-mapreduce! If you have the old JAR file, it won’t include the BlobstoreInputFormat class that we need to do our work.</p>
<h2>Step 5: Using the Mapper</h2>
<p>
Let’s browse to our upload handler at /upload.jsp. The page should be pretty bare.
</p>
<p><img src="https://lh5.googleusercontent.com/WAuj1OJMMX30BKgamjZ1l8Aa8qMX1g-CMNV_9Ts8ucNZvS0b-oE-XzzIHAgvRTpXDQZU0QzFmBK1dEXNGTqOpAFJ7RHJLzJ2b0_n21jeKY5jwVOsRw" alt="" width="332px;" height="36px;" /></p>
<p>Once the upload has finished, we’ll be on a page that looks like this:</p>
<p><img src="https://lh4.googleusercontent.com/QvQ8gxWJeRG6Vjz7ZQN_5SNEmhn8tvB3kEgiIW7hXc6VUJTpaaqW2EjzHcscsRudFowpjxmQpl1A1viYghBt87lZNL5D3wEpGl80ONTgdU5Z3aiedg" alt="" width="525px;" height="208px;" /></p>
<p>Let’s copy the blob-key in the URL. It’s not the most streamlined approach, but it works. We’ll use it in the next screen when we browse to our mapper:</p>
<p><img src="https://lh5.googleusercontent.com/G25lQcAUXzY_3unnovi35BW4GNdJScOCb88iwB0yUGELOIghHc7ETjYkKXi17A6zqFPEK8hSr8gaH78HuF8vWi0-7OZa8sFXTaWSHqC0AbOPowL6xQ" alt="" width="321px;" height="162px;" /></p>
<p>We’ll copy-paste the key to replace “blobkeyhere” and hit “Run”. And now we play the waiting game &#8211; we’ll be able to check on the status of our Mapper in the UI, or check on Tasks, or look in the datastore to see if the data has been imported correctly:</p>
<p><img src="https://lh4.googleusercontent.com/a1PUz1wFvIPzC7_Zp1TW5cFuBSgpfN4QEvDex51St_lxL83XKHXLrg_u96aMmiYKqvegz_tz1TN6fAkHwX_j1fsB7SchfA79rpXblPJ1aTCsLvul6w" alt="" width="661px;" height="540px;" /></p>
<h2>Get the code</h2>
<p>The code is here on GitHub:</p>
<p><a href="http://github.com/ikai/App-Engine-Java-Mapper-API-demos">http://github.com/ikai/App-Engine-Java-Mapper-API-demos</a></p>
<p>It’s been updated with the new examples.</p>
<h2>Summary</h2>
<p>So there you have it: another way of importing data into the datastore. This isn’t a replacement for the bulk uploader, just another option. Here are some useful links for additional information:</p>
<p>App Engine Mapreduce issue tracker &#8211; <a href="http://code.google.com/p/appengine-mapreduce/issues/list">report issues here</a></p>
<p>Nick Johnson’s <a href="http://blog.notdot.net/2010/05/Exploring-the-new-mapper-API">post explaining how he built the code search example</a></p>
<p>One last tip: the best place for questions or discussion is probably the <a href="http://groups.google.com/group/google-appengine">App Engine Discussion Groups</a>, not the comments.</p>
<p>Happy hacking.</p>
]]></content:encoded>
			<wfw:commentRss>http://ikaisays.com/2010/08/11/using-the-app-engine-mapper-for-bulk-data-import/feed/</wfw:commentRss>
		<slash:comments>16</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/70564585574a96fa34fbab2a1a6981b0?s=96&#38;d=http%3A%2F%2F1.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96&#38;r=G" medium="image">
			<media:title type="html">Ikai Lan</media:title>
		</media:content>

		<media:content url="https://lh5.googleusercontent.com/WAuj1OJMMX30BKgamjZ1l8Aa8qMX1g-CMNV_9Ts8ucNZvS0b-oE-XzzIHAgvRTpXDQZU0QzFmBK1dEXNGTqOpAFJ7RHJLzJ2b0_n21jeKY5jwVOsRw" medium="image" />

		<media:content url="https://lh4.googleusercontent.com/QvQ8gxWJeRG6Vjz7ZQN_5SNEmhn8tvB3kEgiIW7hXc6VUJTpaaqW2EjzHcscsRudFowpjxmQpl1A1viYghBt87lZNL5D3wEpGl80ONTgdU5Z3aiedg" medium="image" />

		<media:content url="https://lh5.googleusercontent.com/G25lQcAUXzY_3unnovi35BW4GNdJScOCb88iwB0yUGELOIghHc7ETjYkKXi17A6zqFPEK8hSr8gaH78HuF8vWi0-7OZa8sFXTaWSHqC0AbOPowL6xQ" medium="image" />

		<media:content url="https://lh4.googleusercontent.com/a1PUz1wFvIPzC7_Zp1TW5cFuBSgpfN4QEvDex51St_lxL83XKHXLrg_u96aMmiYKqvegz_tz1TN6fAkHwX_j1fsB7SchfA79rpXblPJ1aTCsLvul6w" medium="image" />
	</item>
	</channel>
</rss>
