Wednesday, July 22, 2009

From Backups to Scalability in the Cloud

Where do you back up your databases today? Are you using tape drives and shipping some of the backup archives offsite for disaster recovery preparedness? How much are you paying for your database backup infrastructure?

Different companies will likely have different answers to the above questions, and chances are strategies will vary even within the same company depending on the criticality and security requirements of different data sets. But hopefully your company's backup and disaster recovery policy is not like that of the bookmarking site ma.gnolia.com, which folded unexpectedly and lost data for thousands of users due to a data corruption issue.

The backup approach that caught my attention - backup to cloud - is being used by IDUG.org (the website of the International DB2 Users Group). IDUG recently revamped their web presence (and did a pretty good job transforming the site into a social site almost overnight). The updated site utilizes DB2 Express-C, the free version of DB2, for storing data. No matter how good the database software, there is no substitute for a good data backup strategy. Rob Williams, who helped set up the updated IDUG.org site, thinks using Amazon's Simple Storage Service (S3) is a pretty good option for storing database backups. And I agree with Rob - at 15 cents a GB per month the economics certainly make sense (unless you have terabytes of data), and it certainly saves the hassle of shipping backup tapes to an offsite location. In Rob's blog on IDUG.org, he outlines how he set up DB2 backups and log files to be archived on S3 and shares his user exit script.
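To put the 15 cents a GB per month figure in perspective, here is a minimal Python sketch of the arithmetic. The function name and the `copies_retained` parameter are my own illustration, not anything from Rob's setup, and the price is the 2009 figure quoted above; transfer and request charges are ignored.

```python
# Storage price quoted in this post (2009 figure); an assumption for any other date.
PRICE_PER_GB_MONTH = 0.15


def monthly_storage_cost(backup_size_gb, copies_retained=1,
                         price_per_gb=PRICE_PER_GB_MONTH):
    """Return the monthly storage cost in dollars for the retained backup copies.

    Only storage is counted; transfer and request fees are left out.
    """
    if backup_size_gb < 0 or copies_retained < 0:
        raise ValueError("size and copy count must be non-negative")
    return round(backup_size_gb * copies_retained * price_per_gb, 2)


if __name__ == "__main__":
    # Compare a small, a medium, and a 2 TB database, keeping 7 backup copies each.
    for size_gb in (10, 100, 2048):
        cost = monthly_storage_cost(size_gb, copies_retained=7)
        print(f"{size_gb} GB x 7 copies: ${cost:,.2f} per month")
```

A single 100 GB backup works out to $15 a month at this rate, which is why the numbers only start to hurt once you get into terabytes or keep many generations of backups.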

Ironically, the now defunct ma.gnolia.com used cloud infrastructure for their operations, so it may be wise to not buy into all the cloud hype. Remember, failures in the cloud can happen just as easily as on-premise, so reliance on any single infrastructure may not be wise. If you are going to be storing database backups on the cloud, you may want to do so in addition to keeping backups on-premise or at another location rather than relying solely on a single cloud. Some cloud providers like Amazon allow you to create copies of your data and store them in different availability zones or regions so you can be insulated from outages in a single data center. Of course, depending on your needs (plus paranoia level and cost bearing capacity) you can utilize multiple cloud providers for your backups.
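The "more than one location" idea can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the destinations are plain directories (standing in for, say, a local disk and a mounted offsite or cloud-synced share), and `replicate_backup` is a hypothetical helper, not part of DB2 or any cloud toolkit. Uploading to a real cloud store would go through that provider's own tools instead.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in 1 MB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def replicate_backup(backup_file, destinations):
    """Copy one backup file into every destination directory and verify each copy.

    Returns a dict mapping each destination to True (copy verified) or False.
    A failure in one destination does not stop the others.
    """
    source = Path(backup_file)
    expected = sha256_of(source)
    results = {}
    for destination in destinations:
        target_dir = Path(destination)
        try:
            target_dir.mkdir(parents=True, exist_ok=True)
            copied = Path(shutil.copy2(source, target_dir / source.name))
            # A copy only counts if its checksum matches the original.
            results[str(target_dir)] = sha256_of(copied) == expected
        except OSError:
            results[str(target_dir)] = False
    return results
```

The checksum comparison is the important part: a backup that was copied but silently corrupted is exactly the kind of surprise you only discover when you need to restore.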

If you want to take database backups in the cloud to the next level, you could set up a database server that mirrors the data on an on-premise server. That is, a duplicate database server in the cloud ... think of the possibilities something like this could accomplish if the database server in the cloud could automatically keep in sync with changes on your on-premise server. Yep, a parallel server that could also service live workloads. So you would have built-in backup capabilities, a continuous high availability/failover standby, and a disaster recovery option. And now imagine the possibilities if you could have more than one such database server in the cloud mirroring the same database... a neat scalability solution that could be used to distribute users/queries among multiple servers containing the same data.
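To make the "distribute users/queries among multiple servers" idea a little more concrete, here is a toy Python sketch of round-robin routing. Everything in it is an assumption for illustration: the `ReplicaRouter` class and the server names are made up, and sending only SELECT statements to the mirrors while everything else goes to the on-premise server is just one simple policy, not a description of how any particular product does it.

```python
import itertools


class ReplicaRouter:
    """Pick a server for each SQL statement.

    Reads (SELECT) rotate across the mirrored servers in round-robin order;
    all other statements go to the primary. Servers are represented by plain
    names here; a real setup would hold connection objects instead.
    """

    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = list(replicas)
        # With no replicas configured, reads fall back to the primary.
        self._cycle = itertools.cycle(self.replicas or [primary])

    def route(self, sql):
        words = sql.split(None, 1)
        first_word = words[0].upper() if words else ""
        if first_word == "SELECT":
            return next(self._cycle)
        return self.primary


if __name__ == "__main__":
    router = ReplicaRouter("on-premise", ["cloud-a", "cloud-b"])
    for statement in ("SELECT * FROM users",
                      "SELECT * FROM orders",
                      "UPDATE users SET active = 1",
                      "SELECT 1 FROM sysibm.sysdummy1"):
        print(f"{router.route(statement):<12} <- {statement}")
```

Adding a third mirror is just one more name in the list, which is the appeal of this kind of setup: read capacity grows by adding servers that all hold the same data.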

If you think all this is just imaginary craziness, well, think again. Or better yet - watch the free webinar, Scalability in the Cloud: Fact or Fiction - to find out how easily it can be done.

Tuesday, July 14, 2009

Free Big Database on Free Blue Cloud

Did you know IBM has made available a Cloud service with stuff like database software and development tools, and it's FREE to use?

You're probably thinking - Free and IBM in the same sentence ... is that an oxymoron? Yeah, it's hard to picture IBM when you talk of free anything, let alone a free cloud service and free database software. Many folks would associate IBM with a company that only sells mega expensive hardware and software to mega large enterprises for mega bucks. Well, that may be the case, but yes, the part about free cloud and free database software is also true, and I'll let you in on the secret for free ;-)

The free database software part is not new (but still a bit of a secret to many) ... IBM released DB2 Express-C as a no-charge product for the community a couple of years ago. Express-C is a leaner, easy to use Linux / Windows / Mac version of DB2, the database software that is used by mega enterprises for mission critical systems and large data warehouses. DB2 Express-C, on the other hand, is for developers (including those using PHP, Ruby on Rails, Python, Django, etc.), ISVs, and SMB users who want a fully functional database albeit with a free price tag (yes, it's really free to use for as long as you want without any time-based restrictions or database size limitations, although IBM secretly hopes that at some point you will be making enough money to pay for subscription and support or upgrade to other DB2 editions with more advanced features).

The free Cloud service is called IBM Smart Business Development and Test on the IBM Cloud. Quite a mouthful, but probably not enough to distract you from asking, so what's the catch? There has to be a catch if it's free, right? Okay, okay, it's not free forever and it's only available to those with a US address. This IBM cloud service is currently in the technology preview stage, and I imagine once the preview or beta phase is over IBM will start charging for use of this cloud.

The tech preview of the cloud service (let's call it the IBM Developer Cloud for short) comes with several launch-ready pre-installed images containing development tools from Rational and other IBM middleware like DB2 Express-C running on an x86-based enterprise-class Linux distribution. Just like on Amazon EC2, you can dynamically provision instances (virtual servers) based on the pre-built images. Unlike Amazon EC2, IBM will not charge you 10 cents an hour for running these instances during the preview phase. And it only takes a few minutes to be up and running as you can see in the video below.



Useful links to get started:

Remember, the free IBM Dev Cloud tech preview is for a limited number of users only, so let's keep it a secret. Sshhhhh ...

Friday, July 10, 2009

Welcome to JuiceDB

Watch this space for musings about data management and cloud computing topics ...

The forecast for dataville is partly cloudy.
