<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Code Commit &#187; Database</title>
	<atom:link href="http://www.codecommit.com/blog/category/database/feed" rel="self" type="application/rss+xml" />
	<link>http://www.codecommit.com/blog</link>
	<description>(permanently in beta)</description>
	<lastBuildDate>Mon, 07 Jun 2010 07:00:00 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<atom:link rel='hub' href='http://www.codecommit.com/blog/?pushpress=hub'/>
		<item>
		<title>Should ORMs Insulate Developers from SQL?</title>
		<link>http://www.codecommit.com/blog/database/should-orms-insulate-developers-from-sql</link>
		<comments>http://www.codecommit.com/blog/database/should-orms-insulate-developers-from-sql#comments</comments>
		<pubDate>Mon, 25 Feb 2008 08:00:03 +0000</pubDate>
		<dc:creator>Daniel Spiewak</dc:creator>
				<category><![CDATA[Database]]></category>

		<guid isPermaLink="false">http://www.codecommit.com/blog/database/should-orms-insulate-developers-from-sql</guid>
		<description><![CDATA[This is a question which is fundamental to any ORM design.&#160; And really from a philosophical standpoint, how should ORMs deal with SQL?&#160; Isn&#8217;t the whole point of the ORM to sit between the developer and the database as an all-encompassing, object oriented layer?
A long time ago in an office far, far away, a very [...]]]></description>
			<content:encoded><![CDATA[<p>This is a question which is fundamental to any ORM design.&nbsp; And really from a philosophical standpoint, how should ORMs deal with SQL?&nbsp; Isn&#8217;t the whole point of the ORM to sit between the developer and the database as an all-encompassing, object oriented layer?</p>
<p>A long time ago in an office far, far away, a very smart cookie named Gavin King got to work on what would become the seminal reference implementation for object-relational mapping frameworks the world over (or so Java developers would like to think).&nbsp; This project was to be bundled with JBoss, possibly the most popular enterprise application server, and would support dozens of databases out of the box.&nbsp; It was to offer heady benefits such as totally object-oriented database access, transparent multi-tier caching and a flexible transaction model.&nbsp; At its core though, Hibernate was designed to resolve a single problem: application developers hate SQL.</p>
<p>No really, it&#8217;s true!&nbsp; Bread-and-butter application developers really dislike accessing data with SQL.&nbsp; This has led to endless conflict (and bad jokes) between application developers and database administrators.&nbsp; Often the development team would write a set of boilerplate lines in Java and then copy/paste these arbitrarily throughout their code, swapping in the relevant query as supplied by the DBA.&nbsp; For obvious reasons, this would become very hard to maintain and just intensified the bad blood between developer and database.</p>
<p>If you think about it though, it&#8217;s a bit odd that this intense dislike would mutate from just hating the insanity of JDBC to hating JDBC, SQL and RDBMS in general.&nbsp; SQL is a very nice, almost mathematical language which allows phenomenally powerful queries to be expressed simply and elegantly.&nbsp; It abstracts developers from the headache of database-specific hashing APIs and algorithms which are almost filesystems in complexity.&nbsp; The language was designed to make it as easy as possible to get data out of a relational database.&nbsp; The fact that this effort backfired so utterly is a source of endless confusion to me.</p>
<p>But regardless, we were talking about ORMs.&nbsp; When it was first introduced, Hibernate held out the promise that developers would never again have to wade knee-deep through a sea of half-set SQL.&nbsp; Instead, developers would pass around POJOs (Plain Old Java Objects), modifying their values like any other Java bean and then handing these objects off to the data mapper, which would handle the details of persistence.&nbsp; Furthermore, Hibernate promised that developers would never again have to worry about which databases support which non-standard SQL extensions.&nbsp; Since developers would never have to work with SQL, anything database-specific could be handled within the persistence manager deep in the bowels of Hibernate itself.</p>
<p>This all seems lovely and wonderful, but there&#8217;s a catch: it doesn&#8217;t work so well in practice.&nbsp; Now before you stone me, I&#8217;m not talking about Hibernate specifically now, but ORMs in general.&nbsp; It turns out to be completely impossible to interact with a <em>relational</em> database solely through an object-oriented filter.&nbsp; This is easily seen with a simple example:</p>

<div class="wp_syntax"><table><tr><td class="line_numbers"></td><td class="code"><pre class="sql"><span style="color: #140dcc;">SELECT</span> * <span style="color: #140dcc;">FROM</span> people <span style="color: #140dcc;">WHERE</span> age &gt; <span style="color: #cb0710;">21</span> <span style="color: #140dcc;">GROUP</span> <span style="color: #140dcc;">BY</span> lastName</pre></td></tr></table></div>

<p>How in the world are you going to represent that in an object model?&nbsp; Sure, maybe you can provide a little abstraction for the query details, but it starts to get complex if you try to handle things like grouping non-declaratively.&nbsp; The developers working on Hibernate quickly realized this problem and came up with an innovative solution: write their own query language!&nbsp; After all, SQL is too confusing, so why not invent an entirely new query language with the &#8220;feel&#8221; of SQL (to keep the DBAs happy) but without all of the database-specific wrinkles?</p>
<p>This query language is now called &#8220;HQL&#8221;, and as the name implies, it&#8217;s really SQL, but not quite.&nbsp; Here&#8217;s how the aforementioned example would look in HQL (<b>disclaimer:</b> I&#8217;m not a Hibernate expert, so I may have gotten the syntax wrong):</p>
<pre>FROM Person WHERE age &gt; 21 GROUP BY lastName</pre>
<p>Remarkably similar, that.&nbsp; Executing this query in a Hibernate persistence manager yields an ordered list of <em>Person</em> entities pre-populated with data from the query.&nbsp; It seems to make a lot of sense, but there are a number of problems with this approach.&nbsp; First, it requires Hibernate to literally have its own compiler to translate HQL queries into database-specific SQL.&nbsp; Second, it hasn&#8217;t really solved the core problem that many developers have with SQL: it&#8217;s a declarative query language.&nbsp; As you can see, HQL is really just SQL in disguise, so it really doesn&#8217;t eliminate SQL from your database access, just dresses it in a funny hat.</p>
<p>Other ORMs have appeared over the years, taking alternative approaches to the problem of object-relational mapping, but none of them has quite eliminated the query language.&nbsp; Even DSL-based ORMs like ActiveRecord fail to remove SQL entirely:</p>

<div class="wp_syntax"><table><tr><td class="line_numbers"></td><td class="code"><pre class="ruby"><span style="color: #140dcc;">class</span> Person &lt; AR::Base; <span style="color: #140dcc;">end</span>
&nbsp;
Person.<span class="me1">find</span><span class="br0">&#40;</span><span style="color: #ca9925;">:all</span>, <span style="color: #ca9925;">:conditions</span> =&gt; <span style="color: #cb0710;">'age &gt; 21'</span>, <span style="color: #ca9925;">:group</span> =&gt; <span style="color: #cb0710;">'lastName'</span><span class="br0">&#41;</span></pre></td></tr></table></div>

<p>It&#8217;s <em>sort of</em> SQL-free, but you can still see bits and pieces of a query language around the edges.&nbsp; In fact, what ActiveRecord is actually doing here is building a proper SQL query around the SQL fragments which are passed as parameters.&nbsp; It&#8217;s a system which is ripe for SQL injection, but surprisingly leads to very few problems in real-world applications.&nbsp; This is the approach which is also taken by <a href="https://activeobjects.dev.java.net">ActiveObjects</a> for its database query API.</p>
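<p>To make the injection risk concrete, here is a minimal sketch of the &#8220;build a query around a fragment&#8221; idea.&nbsp; The <em>FragmentQuery</em> class and its <em>select</em> method are hypothetical names invented for this illustration; they are not part of ActiveRecord or ActiveObjects.&nbsp; The sketch only assumes that a fragment may carry <em>?</em> placeholders whose values travel separately from the SQL text:</p>

```java
import java.util.Arrays;

// Hypothetical illustration only: not the ActiveRecord or ActiveObjects API.
class FragmentQuery {
    final String sql;      // the SQL text that would be sent to the database
    final Object[] params; // values to bind to the ? placeholders, kept apart from the text

    FragmentQuery(String sql, Object[] params) {
        this.sql = sql;
        this.params = params;
    }

    // Wraps a caller-supplied WHERE fragment in a complete SELECT statement.
    static FragmentQuery select(String table, String whereFragment, Object... params) {
        return new FragmentQuery("SELECT * FROM " + table + " WHERE " + whereFragment, params);
    }

    public static void main(String[] args) {
        String userInput = "21 OR 1 = 1"; // pretend this arrived from a web form

        // Risky: the input is spliced into the fragment and becomes part of the SQL itself.
        FragmentQuery risky = select("people", "age > " + userInput);

        // Safer: the fragment holds a placeholder and the input stays a plain value.
        FragmentQuery safer = select("people", "age > ?", userInput);

        System.out.println(risky.sql);
        System.out.println(safer.sql + " with params " + Arrays.toString(safer.params));
    }
}
```

<p>In the first call the condition has quietly turned into one that matches every row; in the second, the SQL text is fixed no matter what the user typed.&nbsp; That is probably why fragment-based APIs cause so few problems in practice: the fragments tend to be constants in the code, with any user-supplied values bound separately.</p>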
<p>So ORMs in and of themselves seem to have failed to entirely eliminate SQL from the picture, but what about other frameworks?&nbsp; There are a few quite recent efforts which seem to have nearly succeeded in eliminating the direct use of SQL completely from application code.&nbsp; <a href="http://ambition.rubyforge.org/">Ambition</a> is perhaps the best (and most clever) example of this, though others like <a href="http://code.google.com/p/scala-rel/">scala-rel</a> are catching up fast.&nbsp; Ambition is designed from the ground up to interact naturally with ActiveRecord, so the two combined perhaps represent the first &#8220;true&#8221; ORM: one which does not require the developer at <em>any</em> point to deal with any SQL whatsoever.</p>
<p>But was it really worthwhile?&nbsp; As clever as things like Ambition are, is it really that much easier than just writing queries in SQL?&nbsp; As Nathan Hamblen so <a href="http://technically.us/code/archive/2007/11/#item-4710">eloquently said</a> (when referring to a totally different topic): </p>
<blockquote>
<p>&#8230;is the end of the ORM rainbow.&nbsp; You get there, throw yourself a party and realize that important things are broken.</p>
</blockquote>
<p>A quote taken out of context perhaps, but I think it applies to the &#8220;cult of SQL genocide&#8221; with as much validity.&nbsp; In the end, by denying yourself access to the powerful and well-understood mechanism that is SQL, you&#8217;re just crippling your own application and forcing yourself to write <em>more</em> code instead of less.</p>
<p>So what&#8217;s the &#8220;right&#8221; approach?&nbsp; Is there a happy medium between ActiveRecord+Ambition and full-blown <a href="http://www2.sqlonrails.org/">SQL on Rails</a>?&nbsp; I think so, and that is the approach I have been <em>trying </em>to implement with ActiveObjects.&nbsp; As I&#8217;m sure you know, ActiveObjects takes a lot of its inspiration from ActiveRecord, so the syntax for querying the database is very similar:</p>

<div class="wp_syntax"><table><tr><td class="line_numbers"></td><td class="code"><pre class="java5">EntityManager em = ...
<span class="me1">em</span>.<span style="color: #2e7c0f;">find</span><span class="br0">&#40;</span>Person.<span style="color: #857d1f;">class</span>, Query.<span style="color: #2e7c0f;">select</span><span class="br0">&#40;</span><span class="br0">&#41;</span>.<span style="color: #2e7c0f;">where</span><span class="br0">&#40;</span><span style="color: #cb0710;">&quot;age &gt; 21&quot;</span><span class="br0">&#41;</span>.<span style="color: #2e7c0f;">group</span><span class="br0">&#40;</span><span style="color: #cb0710;">&quot;lastName&quot;</span><span class="br0">&#41;</span><span class="br0">&#41;</span>;
&nbsp;
<span style="color: #445fba;">// ...or</span>
em.<span style="color: #2e7c0f;">find</span><span class="br0">&#40;</span>Person.<span style="color: #857d1f;">class</span>, <span style="color: #cb0710;">&quot;age &gt; 21&quot;</span><span class="br0">&#41;</span>;   <span style="color: #445fba;">// no grouping</span></pre></td></tr></table></div>

<p>You still have the full power of SQL available to you.&nbsp; You can still write complex, nested boolean conditionals and funky subqueries, but there&#8217;s no longer any need to be burdened with the <em>whole</em> of SQL&#8217;s verbosity.&nbsp; As with vanilla ActiveRecord, this code intends to be a bit of a hand-holder, shielding innocent application developers from the fierce world of RDBMS.</p>
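<p>For the curious, a fluent query object of this kind can be sketched in a few lines.&nbsp; This is a toy under stated assumptions, not the actual ActiveObjects implementation: the <em>MiniQuery</em> class, its method names and the hard-coded <em>SELECT *</em> are all invented for illustration.</p>

```java
// Hypothetical sketch of a fluent query object; not the real ActiveObjects Query class.
class MiniQuery {
    private final String table;
    private String where; // raw SQL fragment, passed through untouched
    private String group; // column name to group by

    private MiniQuery(String table) {
        this.table = table;
    }

    static MiniQuery select(String table) {
        return new MiniQuery(table);
    }

    MiniQuery where(String fragment) {
        this.where = fragment;
        return this; // returning this is what allows the calls to be chained
    }

    MiniQuery group(String column) {
        this.group = column;
        return this;
    }

    // Assembles the final statement, adding only the clauses that were supplied.
    String toSql() {
        StringBuilder sql = new StringBuilder("SELECT * FROM ").append(table);
        if (where != null) {
            sql.append(" WHERE ").append(where);
        }
        if (group != null) {
            sql.append(" GROUP BY ").append(group);
        }
        return sql.toString();
    }

    public static void main(String[] args) {
        System.out.println(MiniQuery.select("people").where("age > 21").group("lastName").toSql());
    }
}
```

<p>The framework supplies the skeleton of the statement while the developer supplies only the interesting fragments, which is the whole trade-off in miniature.</p>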
<p>Is this the right way to go?&nbsp; I&#8217;m honestly not sure.&nbsp; I&#8217;ve met a lot of developers that would give their left eye to never have to look at another SQL statement again (for developers already missing a right eye, this isn&#8217;t much of a stretch).&nbsp; On the other hand, there are purists like myself who revel in the freedom afforded by a powerful, declarative language.&nbsp; It&#8217;s hard to say which path is better, but at the end of the day, it&#8217;s really the question itself that matters.&nbsp; Giving application developers the <em>choice</em> to select whichever approach they feel is most appropriate, <em>that</em> is the solution.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.codecommit.com/blog/database/should-orms-insulate-developers-from-sql/feed</wfw:commentRss>
		<slash:comments>41</slash:comments>
		</item>
		<item>
		<title>Are GUIDs Really the Way to Go?</title>
		<link>http://www.codecommit.com/blog/database/are-guids-really-the-way-to-go</link>
		<comments>http://www.codecommit.com/blog/database/are-guids-really-the-way-to-go#comments</comments>
		<pubDate>Wed, 07 Nov 2007 07:21:32 +0000</pubDate>
		<dc:creator>Daniel Spiewak</dc:creator>
				<category><![CDATA[Database]]></category>

		<guid isPermaLink="false">http://www.codecommit.com/blog/database/are-guids-really-the-way-to-go</guid>
		<description><![CDATA[I recently read a (slightly inflammatory) posting entitled &#8220;The Gospel of the GUID&#8221;.&#160; In it, the author attempts to put forward several arguments in favor of using GUIDs for all database primary keys (as opposed to the more pedestrian sequence-generated INTEGER).&#160; I&#8217;ve heard similar arguments in the past, and I keep coming back to the [...]]]></description>
			<content:encoded><![CDATA[<p>I recently read a (slightly inflammatory) posting entitled <a href="http://weblogs.asp.net/wwright/archive/2007/11/04/the-gospel-of-the-guid-and-why-it-matters.aspx">&#8220;The Gospel of the GUID&#8221;</a>.&nbsp; In it, the author attempts to put forward several arguments in favor of using <a href="http://en.wikipedia.org/wiki/GUID">GUIDs</a> for all database primary keys (as opposed to the more pedestrian sequence-generated INTEGER).&nbsp; I&#8217;ve heard similar arguments in the past, and I keep coming back to the same conclusion: it&#8217;s all bunk.</p>
<p>To get right to the heart of my opinion, I really don&#8217;t see a compelling reason to use GUIDs for 90% of database use-cases.&nbsp; Consider for a moment some of the <em>really </em>large databases in this world (Wikipedia, Slashdot&#8217;s comments table, Digg, etc).&nbsp; I can&#8217;t think of a single web application in that league which uses GUIDs.&nbsp; If you don&#8217;t believe me, look at the code for MediaWiki, the URL structure for Digg and the idiocy involving INTEGER vs BIGINT on Slashdot.&nbsp; While <em>de facto</em> practice may not dictate optimal design, it does point to a trend that&#8217;s worthy of notice.&nbsp; More importantly, it proves that INTEGER (or rather, BIGINT) primary key fields are practicable under real-world stress.&nbsp; But what about the theoretical use-case?</p>
<p>Well to be honest, the theoretical use-case doesn&#8217;t interest me all that much.&nbsp; I mean, if you tell me that your schema can theoretically support 29 trillion rows in its users table, I&#8217;ll geek out right along with you.&nbsp; But if you try to feed that to a client, you&#8217;ll get either blank stares, or the equally common: &#8220;but there are only 5 billion people to choose from.&#8221;&nbsp; (unless you&#8217;re talking to someone in upper management)&nbsp; For an even better party, try telling your client that you can merge their data with one of their competitor&#8217;s in 45 minutes flat.&nbsp; Somehow I doubt that argument will fly.</p>
<p>Now let&#8217;s look at the cons involved.&nbsp; GUIDs are really, <em>really</em> hard to remember and type by hand.&nbsp; When you&#8217;re in a hurry, pulling a little data-mine-while-you-wait for your boss, you&#8217;re not going to want to dig around and find that string you copy/pasted into Notepad.&nbsp; Oh, and heaven forbid you actually type the GUID wrong!&nbsp; Finding a single-character deviation in a 128-bit value written out as 32 hexadecimal digits is not a trivial task, let me tell you.&nbsp; It&#8217;s at about this point that you start to think that maybe INTEGER keys would have been a better way to go.&nbsp; At least then you can throw the userID into good ol&#8217; short-term memory and bang it out ten seconds later in your SQL query window.&nbsp; Oh, in case you&#8217;ve never seen a GUID before, here are a few quick examples:</p>
<ul>
<li><strong>{3F2504E0-4F89-11D3-9A0C-0305E82C3301}</strong></li>
<li><strong>{52335ABA-2D8B-4892-A8B7-86B817AAC607}</strong></li>
<li><strong>{79E9F560-FD70-4807-BEED-50A87AA911B1}</strong></li>
</ul>
<p>I&#8217;m not even sure how to <em>read</em> that, much less remember it.</p>
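<p>If you want to poke at those values yourself, the JDK&#8217;s <em>java.util.UUID</em> class will parse them once the curly braces are stripped.&nbsp; A minimal sketch follows; the <em>GuidAnatomy</em> class and its helper methods are names made up for this example:</p>

```java
import java.util.UUID;

// Illustrative helper class; only java.util.UUID comes from the JDK.
class GuidAnatomy {
    // Counts the hexadecimal digits in the canonical text form, ignoring the dashes.
    static int hexDigits(UUID id) {
        return id.toString().replace("-", "").length();
    }

    // Each hexadecimal digit encodes four bits.
    static int bits(UUID id) {
        return hexDigits(id) * 4;
    }

    public static void main(String[] args) {
        String[] samples = {
            "3F2504E0-4F89-11D3-9A0C-0305E82C3301",
            "52335ABA-2D8B-4892-A8B7-86B817AAC607",
            "79E9F560-FD70-4807-BEED-50A87AA911B1"
        };
        for (String sample : samples) {
            UUID id = UUID.fromString(sample);
            System.out.println(id + " -> " + hexDigits(id) + " hex digits, "
                    + bits(id) + " bits, version " + id.version());
        }
    }
}
```

<p>Every one of them comes out at 32 hexadecimal digits, or 128 bits, which is rather more than anyone keeps in short-term memory.</p>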
<p>Another thing worth considering is that GUIDs are typically stored as VARCHARs within the database (unless the database offers a native type, such as SQL Server&#8217;s UNIQUEIDENTIFIER).&nbsp; This means that not all ORMs will handle them appropriately.&nbsp; Hibernate will, as will iBatis and <a href="https://activeobjects.dev.java.net">ActiveObjects</a>.&nbsp; But if you&#8217;re using something like Django or ActiveRecord, you&#8217;re out of luck (disclaimer: Django might actually support VARCHAR primary keys, I&#8217;m not sure).&nbsp; <em>On top of that</em>, some database clustering techniques don&#8217;t seem to work nicely when applied to GUID-based key schemes.&nbsp; I haven&#8217;t run into this myself, but my good friend <a href="http://www.lowellheddings.com">Lowell Heddings</a> has told me tales of strange mystics and evil tides when playing with distributed indexes and GUIDs in MS SQL Server.&nbsp; Technically speaking, while any sane database may <em>work</em> with a VARCHAR for the table&#8217;s primary key, the architects really had INTEGERs in mind when they designed the algorithms.&nbsp; This means that by using GUIDs, you&#8217;re flirting with a less-well-tested use-case.&nbsp; It&#8217;s certainly not unsupported, but you should consider yourself in the minority of users if you go that route.</p>
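<p>There is also a plain size difference to keep in mind when a GUID is stored as text.&nbsp; The sketch below compares the raw key widths using only the JDK; the <em>KeySizes</em> class is a made-up name, and the figures ignore whatever per-column overhead a particular database adds:</p>

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.UUID;

// Illustrative only: raw key widths, ignoring database-specific storage overhead.
class KeySizes {
    // Width of the GUID in its usual dashed text form, as a VARCHAR would hold it.
    static int textBytes(UUID id) {
        return id.toString().getBytes(StandardCharsets.US_ASCII).length;
    }

    // Packs the same GUID into its compact binary form: two 64-bit halves.
    static byte[] toBinary(UUID id) {
        return ByteBuffer.allocate(16)
                .putLong(id.getMostSignificantBits())
                .putLong(id.getLeastSignificantBits())
                .array();
    }

    public static void main(String[] args) {
        UUID id = UUID.fromString("79E9F560-FD70-4807-BEED-50A87AA911B1");
        System.out.println("GUID as text:   " + textBytes(id) + " bytes");
        System.out.println("GUID as binary: " + toBinary(id).length + " bytes");
        System.out.println("BIGINT key:     " + Long.BYTES + " bytes");
    }
}
```

<p>That works out to 36 bytes as text and 16 as binary, against 8 for a BIGINT, and the difference is repeated in every index and foreign key that references the column.</p>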
<p>Oh, and one little myth I&#8217;d like to debunk: there really isn&#8217;t that much more overhead in grabbing the generated value of an INTEGER primary key than in sending an in-code generated GUID value.&nbsp; Granted, some databases make this harder than others, but that&#8217;s their fault isn&#8217;t it?&nbsp; As long as you&#8217;re using a well-designed ORM, you really won&#8217;t see much of a performance increase from removing that minuscule callback.&nbsp; In fact, depending on how your GUID algorithm works, you may see a significant <em>degradation</em> in overall performance due to extra overhead in your application.</p>
<h3>Conclusion</h3>
<p>So, the obligatory thirty second overview for our attention-deprived era:</p>
<table cellspacing="0" cellpadding="2" width="661" border="1">
<tbody>
<tr>
<td valign="top" width="145"><strong>Benefits of GUIDs</strong></td>
<td valign="top" width="514">effectively unique across all your data and tables; way, way more head-room in terms of values</td>
</tr>
<tr>
<td valign="top" width="147"><strong>Problems with GUIDs</strong></td>
<td valign="top" width="514">really, really un-developer friendly; weird side-effects in databases and ORMs; unconventional approach to a &#8220;solved&#8221; problem; easy way to upset your DBA</td>
</tr>
</tbody>
</table>
<p>So it really seems the only upshot for GUIDs is their massive range.&nbsp; In trade-off, you lose usability from a raw SQL standpoint, your result sets are that much more cluttered and you risk odd bugs in your persistence backend.&nbsp; Hmm, I wonder which one I&#8217;m going to use in my next app?</p>
<p>Before you make the choice, ask yourself: &#8220;how much uniqueness do I really require?&#8221;&nbsp; Over-designing a solution is just as much a problem as under-designing one.&nbsp; Try to find the mid-point that works best for your use-case.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.codecommit.com/blog/database/are-guids-really-the-way-to-go/feed</wfw:commentRss>
		<slash:comments>31</slash:comments>
		</item>
	</channel>
</rss>
