Click on Javadoc link to open Javadoc documentation.
Package
org.gridgain.grid.spi.loadbalancing.affinity Javadoc![]()
Available starting with GridGain 
Description
GridAffinityLoadBalancingSpi ![]()
Architecture and Deployment
The diagram below illustrates the difference between using data grids without and with GridGain. The left side shows execution flow without GridGain, in which a remote data server is queried for data, the data is then delivered to caller (master) node, which is faster than DB access, but results into unnecessary network traffic.
On the right hand side, you can see the value that GridGain brings to the picture. The whole computation logic together with data access logic is brought to data server for local execution. Assuming that serialization of computation logic is much lighter than serializing data, the network traffic in this case is minimal. Also, your computation may access data from both, Node 2 and Node 3. In this case, GridGain will split your computation into logical jobs and route appropriate logical jobs to the corresponding data servers to ensure that all computations still remain local. Now, if one of the data server nodes crashes, your jobs will be automatically failed-over to other nodes, which allows you to fail-over logic together with data (not just data fail-over provided by data grids or distributed caches).

Coding Example
To use load balancers for your job routing, in your GridTask.map(List, Ojbect) ![]()
![]()
![]()
![]()
![]()
![]()
![]()
Here is an example of a grid task that uses affinity load balancing. Note how load balancing jobs is absolutely transparent to the user and is simply a matter of proper grid configuration.
public class MyFooBarAffinityTask extends GridTaskSplitAdapter<List<Integer>,Object> { // For this example we receive a list of cache keys and for every key // create a job that accesses it. @Override protected Collection<? extends GridJob> split(int gridSize, List<Integer> cacheKeys) throws GridException { List<MyGridAffinityJob> jobs = new ArrayList<MyGridAffinityJob>(gridSize); for (Inteter cacheKey : cacheKeys) { jobs.add(new MyGridAffinityJob(cacheKey)); } // Node assignment via load balancer // happens automatically. return jobs; } ... }
Here is the example of grid jobs created by the task above:
public class MyGridAffinityJob extends GridAffinityJobAdapter<Integer, Serializable> { public MyGridAffinityJob(Integer cacheKey) { // Pass cache key as a job argument. super(cacheKey); } public Serializable execute() throws GridException { ... // Access data by the same key returned // in 'getAffinityKey()' method. mycache.get(getAffinityKey()); ... } }
Also note that there may be cases where your underlying cache product supports multiple caches and you need to cache data with identical keys in those caches. Although, it still may be OK to return the same key from GridAffinityJob.getAffinityKey() ![]()
public class MyFooBarAffintyJob extends GridAffinityJobAdapter<String, Integer> { public MyFooBarAffinity(String cacheName, Integer cacheKey) { // Construct affinity key by concatenating cache name // and affinity key. Note that we also pass cacheKey as // argument to access from execute method. super(cacheName + '.' + cacheKey, cacheKey); } @Override pubic Serializable execute() { ... // Access data from your cache by the cache key. // The main point to note here is that the same // affinity key always corresponds to the same // cache key. Integer cacheKey = getArgument(); Object data = someCache.get(cacheKey); ... // Do computations. } }
Configuration
The following configuration parameters can be used to configure GridAffinityLoadBalancingSpi ![]()
| Setter Method | Description | Optional | Default |
|---|---|---|---|
| setVirtualNodeCount(int) |
Configuration parameter for the number of virtual nodes for Consistent Hashing algorithm. The larger the virtual node count, the more even the data affinity distribution is across nodes. The value should usually be larger than 500. The default value of 1000 is good enough for most grid deployments. Consistent Hashing generally yields between 2% and 4% standard deviation for equal data affinity distribution. If you set the value too large (larger than several thousands), it may cause performance degradation | Yes | Default value is 1000 which is good enough for most grid deployments. |
| setAffinityNodeAttributes(Map<String, ? extends Serializable>) |
All nodes that want to participate in affinity load balancing should have these attributes. This is useful when grid is segmented, for example, into clients and servers, and client nodes will never cache any data. | Yes | Default value is null which means that all nodes will be included. |
Please pay specific attention to the number of virtual nodes for Consistent Hashing algorithm (see setVirtualNodeCount(int) ![]()
You can use virtual node count to distribute load in uneven grid. Since the larger the virtual node count is, the more data will be stored on that node (which leads to more jobs sent to that node), nodes that have higher Memory or CPU capacity should have larger virtual node count value.
When configuring virtual node count, it is common to assign a certain number of virtual nodes to a single unit of capacity characteristic. For example, if you have 3 nodes in the grid: N1, N2, and N3, and nodes N1 and N2 have 2GB of memory and node N3 has 3GB of memory, then, to ease up calculations, for every 1GB of memory on a node you could assign 500 virtual nodes. As a result, nodes N1 and N2 should be assigned 1000 virtual nodes and node N3 should be assigned 1500 virtual nodes.
Examples
For an example of data affinity with JBoss Cache refer to Affinity MapReduce with JBoss Cache documentation on Wiki or under your GridGain installation.
As any GridGain SPI, GridAffinityLoadBalancingSpi ![]()
GridAffinityLoadBalancingSpi spi = new GridAffnityLoadBalancingSpi(); // Change number of virtual nodes. spi.setVirtualNodeCount(1500); GridConfigurationAdapter cfg = new GridConfigurationAdapter(); // Override default load balancing SPI. cfg.setLoadBalancingSpi(spi); // Start grid. GridFactory.start(cfg);
or from Spring configuration file
<bean id="grid.custom.cfg" class="org.gridgain.grid.GridConfigurationAdapter" singleton="true"> ... <property name="loadBalancingSpi"> <bean class="org.gridgain.grid.spi.loadbalancing.affinity.GridAffinityLoadBalancingSpi"> <property name="virtualNodeCount" value="1500"/> <!-- If your grid is segmented via node attributes, then provide all attributes a node should have in order to be considered by affinity load balancer. --> <property name="affinityNodeAttributes"> <map> <entry key="node.segment" value="foobar"/> </map> </property> </bean> </property> ... </bean>

For more information about using Spring framework for configuration click here.
