Cloud computing terminology has been rising in popularity lately, and rising rather fast.
Here are some of my thoughts:
- I think Amazon first used the word "cloud" in their Elastic Compute Cloud (EC2) offering (I may be wrong about them being first). When I first heard it, it sounded silly but catchy, like their mturk. Technically, EC2 is a Xen infrastructure for rent with on-demand provisioning capabilities. Nothing more, nothing less.
- I have no idea what cloud computing actually is. I can guess – but I don't care. I still prefer good old Grid Computing as an industry-accepted term.
- The grid computing arena seems to go through a superficial name change every year or two: HPC -> Grid Computing -> Utility or On-Demand Computing -> Virtualization -> Cloud Computing. Not only do these changes spring up in the blogosphere, but sometimes companies change their entire messaging overnight. A case in point is DataSynapse, which changed its message from grid computing to virtualization literally overnight (entire website, PRs, white papers, etc.) about 2 years ago and now seems to be going through the same change again. Comical...
- As expected, certain companies claim they invented cloud computing way before we knew about cloud computing. In separate news, Al Gore claims he invented the Internet before... it was invented.
- The motives are always clear for companies that chase the latest FUD in naming: they are attempting to differentiate on naming and positioning while struggling to make a difference in the technology or features of the product.
I think the "Cloud" FUD will subside and something else will grab the attention. We already had "swarm computing", so maybe it will be "mist computing" or "puff computing" or even "haze infrastructure".
Quite often I am asked what the most important feature of GridGain is, or which feature I like the most. My answer usually comes down to one characteristic of GridGain that is unique and often overlooked in grid computing - developer's productivity.
The whole idea behind GridGain came from the frustration of working in Java with something like Globus or Sun Grid Engine. These things were so out of touch with modern Java development that using them seemed almost contrived. All the things we've come to expect and appreciate, like lightweight containers, IoC, AOP, convention over configuration, a light deployment process, meta-programming with annotations, simple and powerful APIs - all of them were "missing in action".
Lack of focus on developer's productivity is what largely created this painful stigma that grid computing is a complex and very expensive proposition. In the minds of many mid-level managers grid computing is still firmly associated with hiring IBM Global Services to help with this "monumental" undertaking - installing and configuring Globus grid infrastructure...
So, when we started GridGain our goal #1 was to develop a grid computing infrastructure that would be as productive to use as Spring (or Seam or Grails if you fast forward to our times). Here's what GridGain 2.0 provides when it comes to developer's productivity - your productivity as a software engineer:
- Zero deployment with p2p class loading. This is the key feature that boosts developer's productivity more than anything else. In a nutshell, when you are writing a grid application with GridGain you don't copy files, you don't run special Ant scripts, you don't FTP anything, you don't start or re-start or re-provision anything, you don't go to a GUI console to do something - you just click "Run" and the code with your latest changes just works across the grid.
- Many grid nodes per computer or even per JVM. Let me ask you a question: what other product allows you to run multiple independent grid nodes in parallel on the same computer? How about in the same VM? Can you run 10 nodes in the same VM and debug your complex distributed algorithms without leaving your favorite Java IDE? Yes - with GridGain you can.
- Convention over configuration. GridGain works out of the box without a single line of configuration that needs to be changed. And this is not a special toy setup - this is a setup that would work for many production environments and for most development setups too. Moreover, we took great care in thinking through what the default configuration means for each and every one of our SPIs, ensuring that you, the developer, spend your time coding and not fiddling with configuration unless it is absolutely necessary.
- XML and non-XML configuration. With GridGain we use Spring XML beans as the configuration medium. If you know Spring XML beans, you already know how to configure GridGain. At the same time, you don't have to know Spring at all. GridGain can equally be configured via Java code, as our entire configuration is concentrated in a single interface:
  - You can inject it via Spring, which is the default IoC framework
  - You can inject it via any other IoC container or framework
  - You can configure it in Java without any XML or IoC
- Strongly typed Java 5 interfaces. We take strong typing seriously and GridGain uses parameterized types in many places. That is a great benefit for developers, as you don't need to guess and rely on the runtime or JUnit tests to catch something that any IDE will highlight as you are typing...
- Best Javadoc you have ever seen. We take pride in our Javadoc. We know that most of the developers using GridGain look at it every time they need answers about the API, and we wanted to make our Javadoc simple and effective API documentation. It's well organized, has useful UML diagrams for all classes, is cross-linked with our wiki, and has generous coding and configuration examples.
We've been receiving requests for public SVN access for quite some time. Recently we've redesigned how all our online resources are made available on the website and put them all on one page. This is the list of all online resources for GridGain that are available to everyone:
- Wiki Documentation

  That's where you can find all developer documentation for GridGain: development guide, getting started, installation instructions, etc.
- Javadoc

  Standard Javadoc documentation for the latest release. We take great care and pride in producing our Javadoc, as most developers use Javadoc as their main source of documentation.
- JIRA

  Issue and bug tracking system that is open to the public and is used internally. A great tool to see what bugs and issues we have, what has been closed and what is being worked on right now.
- SVN Access

  Standard access to our SVN repository in read-only mode. You can see all the latest changes.
- WebSVN browser

  Default web viewer for our SVN repository. You don't have to check out the entire project to see a certain file, its history of changes or the commit log.
- Online Forums

  Our online forum is the best place to find answers to all your GridGain questions. If you can't find one, just post there and we or other members will respond promptly.
Enjoy grid computing with GridGain!
One of the fundamental differences between GridGain's implementation of MapReduce and the ones in the existing or legacy systems like Sun GridEngine, GigaSpaces, Hadoop and Globus is the cardinality or the type of the mapping operation.
In the MapReduce pattern, mapping is the process of splitting the initial task into sub-tasks and assigning them to the grid nodes. Mapping generally involves the splitting logic itself, mapping sub-tasks to the nodes including load balancing, and potential failover and collision resolution. In the conventional approach the worker nodes pull the sub-tasks for execution. In GridGain, sub-tasks are pushed to the worker nodes and this process is initially controlled by the task. The latter has a fundamental advantage that was largely missing in grid computing frameworks before GridGain:
GridGain's approach of giving the task control of sub-task distribution enables early and late load balancing algorithms. This effectively helps to adapt task execution to the non-deterministic nature of execution on the grid. Not having this capability significantly narrows the deployment options where optimal performance and scalability can be achieved.
This unique property of GridGain's MapReduce implementation has a profound effect on the ability to develop grid applications with advanced load balancing, failover and collision resolution logic. Let me describe early and late load balancing in detail by simply walking through the grid task execution sequence in GridGain, where it will become apparent:
- Someone calls Grid.execute(...), passing a grid task and its argument, to initiate grid task execution in the system.
- Method map(...) is called on the task to perform the initial mapping. This method is responsible for taking a task, splitting it into a number of sub-tasks and mapping every sub-task to one or more grid nodes. This method returns a set of {sub-task, node} pairs. This is what we call early load balancing, as it is done right during the initial mapping operation and with only the information available at execution initiation time.
- Once mapping is done, the sub-tasks travel to their respective remote nodes for execution.
- When a sub-task arrives at the destination grid node it is subject to collision (scheduling) resolution via the collision SPI. This SPI is called every time a new sub-task arrives, an existing sub-task finishes its execution, or a metrics update is received (with every heartbeat). The collision SPI looks into the queue of its sub-tasks (including a newly received one, if any) and can either cancel a sub-task, leave it waiting in the queue, transfer it to another node for execution, or start its execution locally. This is what we call late load balancing. This load balancing happens later in the process of execution, and it happens on the destination node right where the sub-task is about to get executed. The important characteristic of late load balancing is that there can be a significant time difference between mapping (early load balancing) and the actual time when execution of the sub-task commences on the remote node – and late load balancing makes it possible to account for this non-deterministic aspect of grid execution and potentially re-balance the sub-task on the grid.
For example, our job stealing collision SPI does exactly that. It monitors the number of queued sub-tasks on each node and preemptively moves waiting sub-tasks from a "busy" node to an "idle" node for execution.
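The difference between early and late load balancing can be illustrated with a small, self-contained toy model. This is only a sketch and not the GridGain API: `Node`, `mapRoundRobin` and `stealOnce` are hypothetical names, and the stealing rule (move one waiting sub-task when queue lengths differ by more than one) is an assumption made for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Toy model of early vs. late load balancing. None of these types are GridGain classes.
class LoadBalancingSketch {

    /** A grid node with a queue of sub-tasks waiting to execute. */
    static final class Node {
        final String id;
        final Deque<String> waiting = new ArrayDeque<>();

        Node(String id) {
            this.id = id;
        }
    }

    /** Early load balancing: assign sub-tasks round-robin to the nodes known at mapping time. */
    static void mapRoundRobin(List<String> subTasks, List<Node> knownNodes) {
        for (int i = 0; i < subTasks.size(); i++) {
            knownNodes.get(i % knownNodes.size()).waiting.add(subTasks.get(i));
        }
    }

    /**
     * Late load balancing: one job-stealing step. Moves one waiting sub-task from the
     * busiest node to the idlest node when their queue lengths differ by more than one.
     * Returns true if a sub-task was moved.
     */
    static boolean stealOnce(List<Node> nodes) {
        Node busiest = nodes.get(0);
        Node idlest = nodes.get(0);
        for (Node n : nodes) {
            if (n.waiting.size() > busiest.waiting.size()) busiest = n;
            if (n.waiting.size() < idlest.waiting.size()) idlest = n;
        }
        if (busiest.waiting.size() - idlest.waiting.size() <= 1) {
            return false;
        }
        idlest.waiting.add(busiest.waiting.pollLast());
        return true;
    }

    public static void main(String[] args) {
        Node a = new Node("A");
        Node b = new Node("B");

        // Early: node B has not joined yet, so every sub-task is mapped to node A.
        mapRoundRobin(Arrays.asList("job-1", "job-2", "job-3", "job-4"), Arrays.asList(a));

        // Late: B is now in the topology; stealing re-balances the waiting queues.
        List<Node> topology = new ArrayList<>(Arrays.asList(a, b));
        while (stealOnce(topology)) {
            // keep stealing until the queues are balanced
        }
        System.out.println(a.id + "=" + a.waiting + ", " + b.id + "=" + b.waiting);
    }
}
```

The point of the sketch is that the mapping decision was correct for the topology known at mapping time, and the later stealing step corrects it once conditions on the grid have changed.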
Load balancing capabilities in GridGain are among the more advanced features and not everyone will need them. For example, in a homogeneous grid with homogeneous tasks load balancing is achieved naturally. However, in many other cases, when conditions are more real-life, sophisticated load balancing capabilities are about the only way to get the most out of your grid.
Enjoy grid computing!
How do you scale your application on the grid? In most cases the answer should be Data Partitioning + Affinity Map/Reduce.
I blogged about it last week in detail, explaining why this pair of techniques is so important to scalability and performance (which are not the same). The combination of data partitioning and affinity map/reduce is becoming the essence of grid computing and of how grid computing is applied to scalability and performance problems.
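As a rough illustration of what this pair of techniques means in practice, here is a minimal toy sketch. It is not the GridGain API: `partitionFor`, `nodeFor`, the partition count and the node names are all assumptions. Data keys are hashed into a fixed number of partitions, partitions are spread over nodes, and a job that works on a given key is routed to the node that already owns that key's data.

```java
import java.util.Arrays;
import java.util.List;

// Toy model of data partitioning plus affinity routing. Not the GridGain API.
class AffinitySketch {

    /** Maps a data key to one of a fixed number of partitions. */
    static int partitionFor(Object key, int partitions) {
        return Math.floorMod(key.hashCode(), partitions);
    }

    /** Maps a key to the node that owns its partition (partitions spread round-robin over nodes). */
    static String nodeFor(Object key, int partitions, List<String> nodes) {
        return nodes.get(partitionFor(key, partitions) % nodes.size());
    }

    public static void main(String[] args) {
        List<String> nodes = Arrays.asList("node-0", "node-1", "node-2");
        int partitions = 16;
        for (int customerId : new int[] {5, 16, 42}) {
            // The job for a key is sent to the node that already holds that key's data.
            System.out.println("customer " + customerId + " -> " + nodeFor(customerId, partitions, nodes));
        }
    }
}
```

Because the same function places both the data and the job, the computation runs next to the data it needs instead of pulling that data across the network.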
I'm predicting that during 2008 we are going to see more and more crystallization of this idea in products as well as in marketing activities (as we've seen, for example, at the QCon 2007 panel and other conferences). It is also interesting to notice that the Java ecosystem is once again pushing the envelope on technology advancements (I just can't resist comparing this to certain other languages and technologies that are still debating whether to support threads...).
One idea that I've been harboring for quite some time now is to see some effort toward standardization of this concept. We already have JSR-107 "JCache" dealing with basic data grid functionality and I have blogged about it here. An effort to provide uniform APIs around data partitioning and affinity map/reduce could go a long way towards wider adoption of grid computing in general.

There was an interesting question on our forums regarding how many nodes we support and what the limitations are (http://www.gridgainsystems.com/jiveforums/thread.jspa?threadID=168). Here's a good answer that was provided, with some additions of my own...
First of all, we need to differentiate between Communication and Discovery functionality in GridGain.
The number of nodes really matters for Discovery, and I don't think there is a limit - it's a matter of proper configuration. If you use the default GridMulticastDiscoverySpi, then it's really lightweight and configuration tweaking would mostly involve setting an appropriate heartbeat interval.
However, I think users should be careful when choosing the appropriate split size (most likely you are not going to be splitting your grid task into hundreds of thousands of jobs - unless it is mobile grid computing). To choose an appropriate split size you should take into consideration that every job will be sent to a remote node and that there is communication overhead. So if your job execution time becomes comparable to the communication overhead, you probably should not split any further.
For example, let's say you have 1000 nodes in your cluster but your ideal task split size is 50. You would execute your task, splitting it into 50 jobs, but assigning them to different nodes every time (using the random load balancer shipped with GridGain 2.0). This would provide the fastest performance (optimal split size) and the best scalability (using all nodes in the grid allows for the best load distribution and failover possibilities).
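The reasoning above can be sketched in a few lines of plain Java. This is a toy model rather than GridGain code: `maxUsefulSplit`, `pickNodes`, the 60-second workload, the 120 ms overhead and the "10x the overhead" ratio are all made-up values chosen only so the arithmetic lands on a split size of 50.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Toy model of choosing a split size and spreading jobs over a large grid. Not the GridGain API.
class SplitSizeSketch {

    /**
     * Largest number of jobs for which each job still runs at least minRatio times
     * longer than the per-job communication overhead. Both durations must be positive.
     */
    static int maxUsefulSplit(long totalWorkMillis, long overheadMillis, int minRatio) {
        long jobs = totalWorkMillis / (overheadMillis * minRatio);
        return (int) Math.max(1L, jobs);
    }

    /** Picks splitSize distinct nodes at random from the full topology. */
    static List<String> pickNodes(List<String> allNodes, int splitSize, Random rnd) {
        List<String> shuffled = new ArrayList<>(allNodes);
        Collections.shuffle(shuffled, rnd);
        return new ArrayList<>(shuffled.subList(0, Math.min(splitSize, shuffled.size())));
    }

    public static void main(String[] args) {
        List<String> grid = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            grid.add("node-" + i);
        }
        // Assumed numbers: 60 s of total work, 120 ms overhead per job, jobs at least 10x the overhead.
        int splitSize = maxUsefulSplit(60_000, 120, 10);
        List<String> targets = pickNodes(grid, splitSize, new Random());
        System.out.println(splitSize + " jobs sent to " + targets.size() + " of " + grid.size() + " nodes");
    }
}
```

Each execution draws a fresh random subset, so over many task executions the load spreads across all 1000 nodes even though any single task only touches 50 of them.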
Almost everyone who comments on the idea of mobile grid computing expresses skepticism about its feasibility due to... battery performance in today's mobile devices.
So, the usual thought goes that if I let my cell phone crunch some numbers for an extended period of time, let's say an hour, it will drain the battery dry in the same hour. So, who needs it, right?
No. What everyone is missing is that mobile grid computing will require a slightly more sophisticated Map/Reduce implementation, specifically sparse load-balancing. A sparse load-balancing algorithm performs the split in such a way that any individual mobile device participates in the grid only for short periods of time, significantly reducing battery drain. This is made possible by the simple fact that you may have tens or hundreds of thousands of consumer mobile devices in the grid at any given point in time - and you can use sparse load-balancing on such a grid.
Sparse load-balancing is one of the areas in which mobile grid computing is fundamentally different from traditional server-based grid computing where utilization is the key. In mobile grid computing we are actually trying to reduce the utilization of an individual device but rely on a sparse distribution.
Another area where mobile grid computing is special is the use of a redundant split, i.e. moving the same job to more than one node for execution - again relying on the fact that there are plenty of available grid nodes but each node has limited "reliability" for finishing its job.
The combination of sparse load-balancing and redundant split on a mobile grid provides guaranteed execution with minimal impact on an individual mobile device.
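A minimal sketch of how the two ideas combine, under stated assumptions: the class and method names are hypothetical (this is not GridGain code), devices are identified by plain strings, and "sparse" is interpreted here as "each device receives at most one job per round".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Toy model of sparse load balancing with a redundant split. Not the GridGain API.
class SparseRedundantSketch {

    /**
     * Assigns every job to {@code redundancy} distinct devices and every device to at most
     * one job, so no single device does more than one short piece of work per round.
     */
    static Map<String, List<String>> assign(List<String> jobs, List<String> devices,
                                            int redundancy, Random rnd) {
        if (devices.size() < jobs.size() * redundancy) {
            throw new IllegalArgumentException("not enough devices for a sparse assignment");
        }
        List<String> shuffled = new ArrayList<>(devices);
        Collections.shuffle(shuffled, rnd);

        Map<String, List<String>> plan = new LinkedHashMap<>();
        int next = 0;
        for (String job : jobs) {
            // Each job takes the next unused devices from the shuffled pool.
            plan.put(job, new ArrayList<>(shuffled.subList(next, next + redundancy)));
            next += redundancy;
        }
        return plan;
    }

    public static void main(String[] args) {
        List<String> devices = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            devices.add("phone-" + i);
        }
        // Three jobs, each sent to two devices; four devices stay idle this round.
        System.out.println(assign(Arrays.asList("job-1", "job-2", "job-3"), devices, 2, new Random()));
    }
}
```

With a pool far larger than the number of jobs, most devices sit out any given round, which is exactly the low per-device utilization that the battery argument calls for.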
With LEGO-like customization of every aspect of grid infrastructure GridGain is a perfect technology for mobile grid computing. Topology management and discovery, load balancing and dynamic map/reduce are all fully pluggable and customizable in GridGain.

I'm going to be presenting at JBoss World 2008 in Orlando on February 14th on the topic of integration between GridGain and JBoss. If you are around - come see my presentation!
With the upcoming GridGain 2.0 release we are going to round out our integration plans with JBoss by providing what I consider truly native blend-in integration. In fact, when you use GridGain with JBoss, GridGain basically becomes a part of the JBoss container, which looks as if it just acquired new functionality. Here's the list of integration points that we implement in totality with GridGain 2.0 (most of them have been available since GridGain 1.6):
- Loader for JBoss AS
- Support for and integration with JBoss AOP
- Integration with JBoss JMX
- Integration with JBoss Logging
- Integration with JBoss HA
- Integration with JBoss Cache (including support for dynamic data partitioning, load balancing and affinity map/reduce)
- Integration with JGroups
The reason I'm really excited about this integration is that it provides the first fully open source full-stack grid computing platform for Java, combining state of the art compute and data grid functionality. And that's why I think open source Java grid computing has never been more exciting!
There is an interesting BusinessWeek article by Olga Kharif mentioning our efforts with Android. Although our work is in the very early stages and we have yet to see how successful Android will be with major operators and handset providers, the article gives a good overview of the current state of the art and provides some thoughtful prognosis. Good read overall.
I do believe that mobile grid computing will be one of the next breakthrough enabling technologies for the mobile market. Stay tuned!

I wrote a blog post about a month ago about Android and how it is making mobile grid computing a reality. The major points I talked about were the improved networking and CPU performance of modern mobile devices combined with the unified programming model delivered by Android - jointly bringing mobile grid computing into the realm of reality.
Since then I have had several discussions about it and I have several interesting follow-up observations.
First of all, the devices in a mobile grid are profoundly different from those found in a traditional grid computing environment. While rack-mount blades are completely faceless and void of any human interaction, mobile devices like phones and PDAs are almost an extension of a human being - people interact with phones constantly, pressing buttons, snapping pictures, listening to music, making calls, watching movies, etc. What I find fascinating is that if you extend this idea of a mobile device being an extension of a human's capabilities, you can think of a mobile grid consisting of human nodes... Indeed, every person with a mobile device is a node in the grid and his or her device is just a connection to the grid.
In this sense, mobile grids open up completely different types of applications to be run on the grid. Some of the obvious choices are human-based image recognition, short and massively parallel tasks, new types of security (requiring simple human interaction from a group of people), etc.
Second of all, mobile grids will require new business models or approaches. I can clearly see, for example, large operators like AT&T or Verizon offering discounted or free services in exchange for having a mobile device participate in the grid - while reselling this processing capacity to other businesses. Let's say you get 1000 free anytime minutes if you let your phone crunch some small tasks while you are not talking... not a bad deal!
Overall, I predict that mobile grid computing will become a new trend in distributed processing in the coming years.

I found this interesting blog by Jorge Lugo about how GridGain was deployed and used on EC2. Very nice read with source code and diagrams. Short and to the point. We are actually planning for much deeper integration with the EC2 infrastructure, but even at this point EC2 deployment looks pretty cool.
Enjoy!
I'm continuing to preview some of the features that the GridGain team is working on for the upcoming release. One of them is support for RESTful applications. Basically, this is how you could, for example, invoke a grid task from your RESTful web application by pointing to this URL:
http://www.myhost.com:7080/gridgain/taskid=mytaskid&arg=myarg&timeout=5000
This will return an XML document with the execution results, which can easily be parsed by JavaScript in the usual Ajax way. That's it - you basically pass a task ID and an optional argument and timeout, and that's all it takes to execute a grid task. You can develop, test and debug your grid task as usual in any preferred Java IDE with all the productivity features that GridGain provides, such as transparent deployment, our simple APIs or peer-to-peer class loading.
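For a Java client, composing such a URL could look like the sketch below. It only mirrors the URL shape shown above (which is a preview and may change); `RestUrlSketch`, `taskUrl` and `encode` are hypothetical helper names, the host and IDs are placeholders, and nothing is assumed about the format of the returned XML.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Builds a task-invocation URL in the shape shown in the post. Host, port and IDs are placeholders.
class RestUrlSketch {

    /** URL-encodes a single parameter value as UTF-8. */
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    /** Returns base + "/gridgain/taskid=...&arg=...&timeout=..." with encoded values. */
    static String taskUrl(String base, String taskId, String arg, long timeout) {
        return base + "/gridgain/taskid=" + encode(taskId)
            + "&arg=" + encode(arg)
            + "&timeout=" + timeout;
    }

    public static void main(String[] args) {
        System.out.println(taskUrl("http://www.myhost.com:7080", "mytaskid", "myarg", 5000));
    }
}
```

Encoding the task ID and argument matters as soon as they contain spaces or characters such as `&`, which would otherwise be read as parameter separators.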
Note that this is exactly the same way as you would use, for example, Google Charts. Now, I don't know of any other grid computing framework that can get your Web 2.0 grid enabled easier than that!
Stay tuned and meanwhile download the current release of GridGain at http://www.gridgain.org/downloads.html
Last week I posted a blog on a similar topic: why Ruby and RoR are not ready to be used in enterprise systems. I got quite a discussion in response (over 30 comments), which was roughly divided in half - those who support my point of view and those who don't.
I don't use Ruby in development. I live and work in Silicon Valley and have yet to see the first person who's using it or the first project that is using it (granted, I don't mingle much among those who develop websites for a living). But I stumbled upon an article about concurrency support (the lack thereof, to be precise) in Ruby and RoR, and that's what provoked my previous blog. But...
...the funny thing is that this is really small potatoes compared to... dynamic typing in Ruby. See, I fully agree with those saying that Ruby/RoR are just in their first version and glaring holes like the lack of normal threading will eventually be patched - just look at JDK 1.0 and compare it to today's Java. No doubt Ruby, as a language and system, will mature over time. What will not mature is its dynamic typing foundation...
I can pretty much say that I belong to a silent majority that views dynamic typing as a clear regression as far as general purpose languages go. All of us have used or are using languages with some form of dynamic typing from time to time, be it JavaScript, Tcl/Tk, Perl or Shell. Even back in 1995 we were using these languages, and when Java was introduced it was a sigh of relief, at least for a huge silent majority, with its static typing and one of the best strong typing foundations - checked exceptions leading the way in innovation here.
I remember feeling that we could finally move away from the silliness and inefficiency of writing 1000s of lines of backend code in Perl or C++ and move to something much more stable, robust and adequate for large projects - Java. Even in its first version Java was miles ahead of anything else for enterprise development - something its designers didn't really anticipate in the beginning. Java was and still is the best language and system for developing large and complex enterprise systems. Strong and static typing, checked exceptions, rich libraries, a huge open-source ecosystem, concise and familiar C-based syntax, and sane design for the most part (thanks to the same engineering organization that produced Solaris) - all of these characteristics contributed to Java's success initially and over time.
I've read this short article on Dynamic Typing (http://www.dzone.com/links/dynamic_typing_more_superior.html) and I think it sums it up very politely and succinctly. Please read it, and especially read the comments (some of them are truly delusional - where do these people come from?). Yes, when you write small scripts in Ruby or PHP or Perl you can get away with pretty much anything: dirty "duck typing", bizarre syntax bordering on fetishism, no intelligent refactoring (no wonder most Rubyists are working in Vi - and I love Vi, by the way), dozens of ways to do the same thing (welcome back to Perl), embedding types into variable names, spending hours resolving what should have been a compile-time error (welcome back to C++).
But when you develop something more than a script on your web page or a shell script in your environment, using languages with dynamic typing is practically dumb. And that's the common wisdom of the silent majority. And when you read the comments on Soon Hui's article you can really feel for McMurphy in the movie "One Flew Over the Cuckoo's Nest"...
I don't have much beef with Ruby and RoR as I don't encounter them beyond simple web applications. In enterprise systems, dominated by Java and .NET, they are statistically non-existent. But nonetheless I was really startled when I read about concurrency "support" in both Ruby as a language and RoR as a framework. Well, basically, there is almost none in either, at least at this point.
Now, in the year 2008, with multi-core CPUs, "green threads" being taught in CS history classes, and with grid computing and the overall drive for parallelization, it is really preposterous and arrogant to think that a language that has only rudimentary (if any at all) support for concurrency and a framework that is simply single-threaded (you need to start a separate process - basically fork) can gain much attention from the enterprise community. Let alone considering its scripting "baggage" and dynamic typing... It's like asking people to go back to CGI-type web programming.
But somehow Ruby is getting popular for simple websites that have no apparent plans for growth. I guess for some having an outsider, "look, ma, no hands!" rebellion status often means much more than sound engineering and quality results...
I've got some questions on my previous blog on the same topic, basically asking for a more specific explanation of why it is critical to expose the grid topology. I guess when you work in this field for a long time you become blind to certain things - hence I will try to provide more detailed explanations below...
Example #1: Your grid is heterogeneous, i.e. nodes are different in terms of CPU, memory, network bandwidth, IO, etc. In this situation the split should always be "weighted" by each node's characteristics, meaning that the original task cannot be split into equal jobs; rather, each job should be of an appropriate "size". Failure to do so will result in an unbalanced grid and drastically reduced performance.
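A minimal sketch of such a weighted split, assuming each node's capacity can be summarized by a single positive integer weight (for example its CPU count). The class and method names are hypothetical and this is not GridGain code; a real implementation would derive the weights from node metrics exposed by the topology.

```java
import java.util.Arrays;

// Toy model of a weighted split for a heterogeneous grid. Not the GridGain API.
class WeightedSplitSketch {

    /**
     * Splits totalUnits of work across nodes in proportion to their positive integer
     * weights. The returned shares always add up to totalUnits.
     */
    static int[] split(int totalUnits, int[] weights) {
        long weightSum = 0;
        for (int w : weights) {
            weightSum += w;
        }
        int[] shares = new int[weights.length];
        long cumulativeWeight = 0;
        int assigned = 0;
        for (int i = 0; i < weights.length; i++) {
            cumulativeWeight += weights[i];
            // Round the cumulative target so rounding errors never accumulate.
            int upTo = (int) ((totalUnits * cumulativeWeight + weightSum / 2) / weightSum);
            shares[i] = upTo - assigned;
            assigned = upTo;
        }
        return shares;
    }

    public static void main(String[] args) {
        // Assumed weights: three nodes with 1, 1 and 2 cores.
        System.out.println(Arrays.toString(split(100, new int[] {1, 1, 2}))); // prints [25, 25, 50]
    }
}
```

An equal split would hand each of the three nodes about a third of the work, leaving the two-core node idle while the single-core nodes are still busy.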
Example #2: Your task distribution is heterogeneous, i.e. not all grid nodes can or should execute the jobs from this task. In other words, when you split you pick only a specific (often dynamically changing) subset of grid nodes for your task distribution. This scenario comes up in practically any deployment: you have existing silos of blades that handle specific types of tasks, you have a sub-grid dedicated to a new application, a specific segment of your network is dedicated to on-demand computing only, etc. Without this topology awareness these cases are hard or impossible to implement.
Now, the examples I provided are really basic, and each particular real-life application would likely have its own quirks and requirements. Here's one real-life example from GridGain usage: our user wanted to do dynamic CPU scavenging, but only for desktops in a specific subnet; all other desktops in the office should be automatically made available to the grid from 9pm to 6am (basically, overnight) and not available during the rest of the day. With GridGain this is a matter of 5-10 lines of code - and it's beautifully elegant.
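The eligibility rule from that example can be written down as a small predicate. This is a sketch of the rule only, not the user's code and not the GridGain API: the subnet prefix, the "idle" CPU threshold and all names are assumptions, and matching a subnet by string prefix is a deliberate simplification.

```java
import java.time.LocalTime;

// Toy model of a topology filter for CPU scavenging. Not the GridGain API; all values are assumed.
class ScavengingSketch {
    static final String SCAVENGED_SUBNET_PREFIX = "10.0.5."; // hypothetical subnet
    static final double IDLE_CPU_THRESHOLD = 0.2;             // hypothetical "idle" cut-off
    static final LocalTime NIGHT_START = LocalTime.of(21, 0);
    static final LocalTime NIGHT_END = LocalTime.of(6, 0);

    /** True from 9pm (inclusive) to 6am (exclusive), across midnight. */
    static boolean isOvernight(LocalTime now) {
        return !now.isBefore(NIGHT_START) || now.isBefore(NIGHT_END);
    }

    /**
     * Desktops in the scavenged subnet join the grid whenever they are idle;
     * all other desktops join only overnight.
     */
    static boolean isEligible(String ip, double cpuLoad, LocalTime now) {
        if (ip.startsWith(SCAVENGED_SUBNET_PREFIX)) {
            return cpuLoad < IDLE_CPU_THRESHOLD;
        }
        return isOvernight(now);
    }

    public static void main(String[] args) {
        System.out.println(isEligible("10.0.5.17", 0.05, LocalTime.of(14, 0)));
        System.out.println(isEligible("10.0.9.3", 0.05, LocalTime.of(14, 0)));
        System.out.println(isEligible("10.0.9.3", 0.90, LocalTime.of(23, 30)));
    }
}
```

A task that filters the current topology through a predicate like this before mapping gets exactly the dynamically changing subset of nodes described above.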
