
There was an interesting question on our forums about how many nodes we support and what the limitations are (http://www.gridgainsystems.com/jiveforums/thread.jspa?threadID=168). Here's a good answer that was provided, with some additions of my own...
First of all, we need to differentiate between Communication and Discovery functionality in GridGain.
The number of nodes really matters for Discovery, and I don't think there is a limit; it's a matter of proper configuration. If you use the default GridMulticastDiscoverySpi, then it's really lightweight, and configuration tweaking would mostly involve setting an appropriate heartbeat interval.
However, I think users should be careful when choosing the split size (most likely you are not going to split your grid task into hundreds of thousands of jobs, unless it is mobile grid computing). To choose an appropriate split size, take into account that every job is sent to a remote node, which incurs communication overhead. So if your job execution time becomes comparable to the communication overhead, you probably should not split any further.
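To make that rule of thumb concrete, here is a minimal sketch in plain Java. It does not use any GridGain API. The class name, the method `chooseSplitSize`, the `minRatio` parameter, and the numbers in `main` are all made up for illustration; the only idea taken from above is that a job should stay noticeably longer than the cost of shipping it to a remote node.

```java
public final class SplitSizeSketch {

    /**
     * Hypothetical helper: returns the largest number of jobs for which each job
     * still runs at least minRatio times longer than the per-job communication
     * overhead. Always returns at least 1 (do not split at all).
     */
    static int chooseSplitSize(long totalWorkMs, long overheadMs, int minRatio) {
        if (totalWorkMs <= 0 || overheadMs <= 0 || minRatio <= 0) {
            throw new IllegalArgumentException("all arguments must be positive");
        }
        // Shortest job duration we are willing to accept.
        long minJobMs = Math.multiplyExact(overheadMs, (long) minRatio);
        // Number of jobs of that duration that fit into the total work.
        long maxJobs = totalWorkMs / minJobMs;
        return (int) Math.max(1L, Math.min(maxJobs, Integer.MAX_VALUE));
    }

    public static void main(String[] args) {
        // Illustrative numbers only: 10 s of total work, 20 ms overhead per job,
        // and each job should run at least 10 times longer than its overhead.
        System.out.println(chooseSplitSize(10_000, 20, 10));
    }
}
```

In practice you would measure both the execution time and the overhead on your own grid rather than guess them, and treat the result as a starting point for tuning.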
For example, let's say you have 1000 nodes in your cluster but your ideal task split size is 50. You would execute your task by splitting it into 50 jobs, but assigning them to different nodes every time (using the random load balancer shipped with GridGain 2.0). This would provide the fastest performance (optimal split size) and the best scalability (using all nodes in the grid allows for the best load distribution and failover possibilities).
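The effect of random assignment can be illustrated with a small simulation in plain Java. This is not the GridGain load balancer itself: the class and method names are hypothetical, nodes are represented by integer indices, and each job simply picks a node uniformly at random. It only shows that, over repeated executions, 50 jobs per task end up spread across a large part of a 1000-node grid.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public final class RandomAssignmentSketch {

    /** Picks a random node index in [0, nodeCount) for each of jobCount jobs. */
    static List<Integer> assignJobs(int jobCount, int nodeCount, Random rnd) {
        List<Integer> assignment = new ArrayList<>(jobCount);
        for (int job = 0; job < jobCount; job++) {
            assignment.add(rnd.nextInt(nodeCount));
        }
        return assignment;
    }

    public static void main(String[] args) {
        int nodeCount = 1000; // nodes in the grid
        int splitSize = 50;   // jobs per task execution
        int runs = 100;       // how many times the task is executed

        Random rnd = new Random();
        int[] jobsPerNode = new int[nodeCount];
        for (int run = 0; run < runs; run++) {
            for (int node : assignJobs(splitSize, nodeCount, rnd)) {
                jobsPerNode[node]++;
            }
        }

        // Count how many distinct nodes received at least one job.
        int usedNodes = 0;
        for (int count : jobsPerNode) {
            if (count > 0) {
                usedNodes++;
            }
        }
        System.out.println("Nodes used after " + runs + " runs: " + usedNodes + " of " + nodeCount);
    }
}
```

Each individual execution touches at most 50 nodes, so the per-task overhead stays at the chosen split size, while the load over time is distributed across the whole grid.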