<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	>
<channel>
	<title>Comments on: Database Systems Pioneer Starts Database Company.</title>
	<atom:link href="http://kevinclosson.wordpress.com/2007/02/15/database-systems-pioneer-starts-database-company/feed/" rel="self" type="application/rss+xml" />
	<link>http://kevinclosson.wordpress.com/2007/02/15/database-systems-pioneer-starts-database-company/</link>
	<description>Oracle-related Platform, Storage and Clustering Topics (with the occasional rant)</description>
	<pubDate>Mon, 13 Oct 2008 15:36:08 +0000</pubDate>
	<generator>http://wordpress.org/?v=MU</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Phil Bowermaster</title>
		<link>http://kevinclosson.wordpress.com/2007/02/15/database-systems-pioneer-starts-database-company/#comment-9283</link>
		<dc:creator>Phil Bowermaster</dc:creator>
		<pubDate>Wed, 02 May 2007 03:45:11 +0000</pubDate>
		<guid isPermaLink="false">http://kevinclosson.wordpress.com/2007/02/15/database-systems-pioneer-starts-database-company/#comment-9283</guid>
		<description>There's no question that any new venture with Dr. Stonebraker behind it is worth keeping an eye on.  And, yes, bringing a new database to market will clearly represent significant challenges. 

It is well established that the technology Vertica is proposing -- a grid-enabled, column-oriented relational database -- can provide a huge performance boost for data analytics.  My company, Sybase, makes one such product, the &lt;a href="http://www.sybase.com/products/datawarehousing/sybaseiq" rel="nofollow"&gt;Sybase IQ analytics server&lt;/a&gt;.  It's already available, with nearly 1,000 customers experiencing tremendous performance acceleration and significant ROI.

As an example: Nielsen Media Research implemented Sybase IQ for their audience data warehouse. Sybase IQ has provided a 10-100x improvement in response time for even the most complex queries, along with a 70% compression ratio (which is allowing them to save quite a bit on hard storage).

&lt;a href="http://www.sybase.com/detail?id=1035802" rel="nofollow"&gt;http://www.sybase.com/detail?id=1035802&lt;/a&gt;

As a whole, analytics servers -- both emerging products like SAND and enterprise-class products like Sybase IQ (which includes advanced features like encryption) -- are experiencing very high growth. Dr. Stonebraker’s company could ride this wave of success, so it may not matter that their technology isn’t really new.</description>
		<content:encoded><![CDATA[<p>There&#8217;s no question that any new venture with Dr. Stonebraker behind it is worth keeping an eye on.  And, yes, bringing a new database to market will clearly represent significant challenges. </p>
<p>It is well established that the technology Vertica is proposing &#8212; a grid-enabled, column-oriented relational database &#8212; can provide a huge performance boost for data analytics.  My company, Sybase, makes one such product, the <a href="http://www.sybase.com/products/datawarehousing/sybaseiq" rel="nofollow">Sybase IQ analytics server</a>.  It&#8217;s already available, with nearly 1,000 customers experiencing tremendous performance acceleration and significant ROI.</p>
<p>As an example: Nielsen Media Research implemented Sybase IQ for their audience data warehouse. Sybase IQ has provided a 10-100x improvement in response time for even the most complex queries, along with a 70% compression ratio (which is allowing them to save quite a bit on hard storage).</p>
<p><a href="http://www.sybase.com/detail?id=1035802" rel="nofollow">http://www.sybase.com/detail?id=1035802</a></p>
<p>As a whole, analytics servers &#8212; both emerging products like SAND and enterprise-class products like Sybase IQ (which includes advanced features like encryption) &#8212; are experiencing very high growth. Dr. Stonebraker’s company could ride this wave of success, so it may not matter that their technology isn’t really new.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Henry</title>
		<link>http://kevinclosson.wordpress.com/2007/02/15/database-systems-pioneer-starts-database-company/#comment-4370</link>
		<dc:creator>Henry</dc:creator>
		<pubDate>Wed, 28 Feb 2007 05:31:53 +0000</pubDate>
		<guid isPermaLink="false">http://kevinclosson.wordpress.com/2007/02/15/database-systems-pioneer-starts-database-company/#comment-4370</guid>
		<description>tried to post this yesterday but for some reason it didn't work...

Ah, Vertica. We actually had Michael Stonebraker in about a year and a half ago to talk about (convince us to use) Vertica. I was in on that meeting and was less than impressed with the presentation. The product seems sort of interesting, but the presentation struck me as less than honest. That meeting was a while ago so I don't remember a lot of the details, just what I put down in some sketchy notes. Also, I have not been following Vertica closely so I don't know what changes have been made in the last year and a half.

There are a number of people here who want to try it out (I would really like to do some testing wrt Oracle, time permitting. Hah. If anyone has some interesting tests in mind let me know, I’ll see what I have time for). We supposedly even have a copy to play with, though I don’t think much is happening in that direction as of yet. Of course I bet a lot of the attraction for all the MIT AI lab hackers here is having a product in beta that isn’t big bad Oracle. 

Vertica came out of c-store (http://db.csail.mit.edu/projects/cstore/vldb.pdf)
This product seems to be designed for DW and query-intensive databases.  Storage is by column, not tuple. This allows retrieval of only the attributes you want, rather than every attribute stored in the same block, as happens with tuple-based storage. The column values are also sorted and compressed. Multiple sort orders can be stored to speed up queries. 

There is a lot of redundancy in this design. It is a shared-nothing architecture, with redundancy (at least as far as the data, not the sorting) across multiple nodes. It seems availability and recoverability are combined. 

There is also a weird way of mixing in OLTP transactions. They are done in a separate work area and a tuple mover then puts them into the read store. There appears to be a time delay between committing data and having it visible from a query. Strange. 

A few additional comments from my notes:
--It seems as if the High Availability (HA) and recoverability mechanisms are intertwined. Recoverability happens by accessing a replicated site. This implies a bunch of synchronous data transfer.
--If the above is true, I would really like to see query vs. write performance. At times it sounded as if the product was designed for improving DW queries, but M. Stonebraker also claimed OLTP writes were fine. He really pushed both. Even with multiple synchronous copies? And redundant data tables?
--We are reentering the shared nothing vs shared disk database cluster debates. 
--A bit disingenuous about some of the references to "the elephants". A lot of references were either 5 years old or referring to default behaviour. Granted, given the default setup, a standard table, and simple SQL, Oracle will usually store rows in the order entered (DELETES notwithstanding), but there is no requirement in the model to always do so. Of course this doesn't mean Vertica can't do some things better, just that we need to be educated consumers. (The disingenuousness, especially from an academic, made me slightly squirmy.) 
--I would still like to know more about backups. 

Hey, just checked my email, and M. Stonebraker will be giving a talk at my work Tuesday afternoon. Anybody have any questions for him?

Well that talk was today. If there is any interest I'll post some notes/comments later. Too tired right now.</description>
		<content:encoded><![CDATA[<p>tried to post this yesterday but for some reason it didn&#8217;t work&#8230;</p>
<p>Ah, Vertica. We actually had Michael Stonebraker in about a year and a half ago to talk about (convince us to use) Vertica. I was in on that meeting and was less than impressed with the presentation. The product seems sort of interesting, but the presentation struck me as less than honest. That meeting was a while ago so I don&#8217;t remember a lot of the details, just what I put down in some sketchy notes. Also, I have not been following Vertica closely so I don&#8217;t know what changes have been made in the last year and a half.</p>
<p>There are a number of people here who want to try it out (I would really like to do some testing wrt Oracle, time permitting. Hah. If anyone has some interesting tests in mind let me know, I’ll see what I have time for). We supposedly even have a copy to play with, though I don’t think much is happening in that direction as of yet. Of course I bet a lot of the attraction for all the MIT AI lab hackers here is having a product in beta that isn’t big bad Oracle. </p>
<p>Vertica came out of c-store (http://db.csail.mit.edu/projects/cstore/vldb.pdf)<br />
This product seems to be designed for DW and query-intensive databases.  Storage is by column, not tuple. This allows retrieval of only the attributes you want, rather than every attribute stored in the same block, as happens with tuple-based storage. The column values are also sorted and compressed. Multiple sort orders can be stored to speed up queries. </p>
<p>There is a lot of redundancy in this design. It is a shared-nothing architecture, with redundancy (at least as far as the data, not the sorting) across multiple nodes. It seems availability and recoverability are combined. </p>
<p>There is also a weird way of mixing in OLTP transactions. They are done in a separate work area and a tuple mover then puts them into the read store. There appears to be a time delay between committing data and having it visible from a query. Strange. </p>
<p>A few additional comments from my notes:<br />
&#8211;It seems as if the High Availability (HA) and recoverability mechanisms are intertwined. Recoverability happens by accessing a replicated site. This implies a bunch of synchronous data transfer.<br />
&#8211;If the above is true, I would really like to see query vs. write performance. At times it sounded as if the product was designed for improving DW queries, but M. Stonebraker also claimed OLTP writes were fine. He really pushed both. Even with multiple synchronous copies? And redundant data tables?<br />
&#8211;We are reentering the shared nothing vs shared disk database cluster debates.<br />
&#8211;A bit disingenuous about some of the references to &#8220;the elephants&#8221;. A lot of references were either 5 years old or referring to default behaviour. Granted, given the default setup, a standard table, and simple SQL, Oracle will usually store rows in the order entered (DELETES notwithstanding), but there is no requirement in the model to always do so. Of course this doesn&#8217;t mean Vertica can&#8217;t do some things better, just that we need to be educated consumers. (The disingenuousness, especially from an academic, made me slightly squirmy.)<br />
&#8211;I would still like to know more about backups. </p>
<p>Hey, just checked my email, and M. Stonebraker will be giving a talk at my work Tuesday afternoon. Anybody have any questions for him?</p>
<p>Well that talk was today. If there is any interest I&#8217;ll post some notes/comments later. Too tired right now.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
