Running on Amazon EC2
Added by architect, last edited by dozer on Dec 10, 2009

GridGain 2.1 With OpenMQ 4.1 or JGroups 2.6

GridGain provides an Amazon EC2 image with GridGain 2.1.0 installed and configured to use either the OpenMQ 4.1 JMS broker or JGroups 2.6 for discovery and communication. The image is based on the Fedora 8 i386 Amazon image and works on the m1.small and c1.medium instance types. GridGain also provides an EC2 image with only OpenMQ 4.1 installed, which serves as a JMS hub.

The GridGain image receives the configuration file path or URL and the JVM system properties (-D) as EC2 instance user data (the -d parameter of the ec2-run-instances command). The user data should look like this:

config/jms/sunmq/spring-sunmq.xml | -DimqAddressList=[your OpenMQ instance internal DNS name] | -DcustomParam=value

The grid configuration file path or URL must be the first item. Every other item that starts with "-D" is passed to the JVM. Keep in mind that the GridGain start script already passes some default JVM options; options passed as EC2 user data are appended to those defaults.
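The user data format described above can be assembled with a small shell helper. This is only a sketch: the function name build_user_data and the host name openmq.internal.example are made up for illustration, and skipping items that do not start with "-D" is a choice of this sketch, not documented behavior of the image.

```shell
# Hypothetical helper (not part of GridGain): joins a configuration path
# and JVM options into one user data string separated by " | ".
# Usage: build_user_data CONFIG_PATH_OR_URL [-Dname=value ...]
build_user_data() {
    user_data=$1            # the configuration path or URL always comes first
    shift
    for opt in "$@"; do
        case $opt in
            -D*) user_data="$user_data | $opt" ;;               # JVM system property
            *)   echo "skipping non -D item: $opt" >&2 ;;       # choice of this sketch
        esac
    done
    printf '%s\n' "$user_data"
}

# Example with a placeholder host name:
build_user_data config/jms/sunmq/spring-sunmq.xml \
    -DimqAddressList=openmq.internal.example -DcustomParam=value
# prints: config/jms/sunmq/spring-sunmq.xml | -DimqAddressList=openmq.internal.example | -DcustomParam=value
```

The resulting string is what would be passed, in quotes, as the -d value of ec2-run-instances.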

Start GridGain Images With OpenMQ JMS server

Use the EC2 command-line tools, the Elasticfox plugin, or a similar tool to manage your EC2 instances.
Set up your Amazon account for EC2 first; see the Getting Started Guide on the Amazon web site.

AMI IDs utilized:

  • ami-428c682b - OpenMQ JMS server
  • ami-47a6412e - GridGain 2.1.0
  • ami-00678569 - GridGain 2.1.1

Follow these steps to run GridGain nodes (images) with the OpenMQ server:

  1. Start the OpenMQ 4.1 image:
    ec2-run-instances ami-428c682b -k gsg-keypair --instance-type m1.large
  2. Get the DNS names of the OpenMQ instance: run the ec2-describe-instances command and find the DNS names of the started instance.
  3. Run the GridGain node instances:
    ec2-run-instances ami-00678569 -k gsg-keypair \
    -d "config/jms/sunmq/spring-sunmq.xml | -DimqAddressList=[your OpenMQ instance internal DNS name]" -n [instance count]
  4. Once the GridGain node instances have started, open the following URL in your browser to view the GridGain node log:
    http://[instance public address]/log/gridgain.log
    Use the ec2-describe-instances command to get the instance addresses.
  5. Now you can start a local GridGain node (or a remote one on a custom EC2 image) that connects to the JMS server and executes your tasks.
  6. Use the ec2-terminate-instances command to stop the EC2 instances.
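To check quoting before launching anything, the command from step 3 can be printed rather than run. The sketch below only builds the command line as text; the function name print_gridgain_launch and the sample DNS name are hypothetical, while the AMI ID, key pair name, and flags are the ones used in the steps above.

```shell
# Dry run: print (do not execute) the ec2-run-instances command for
# GridGain nodes that use an OpenMQ server.
# Arguments: OpenMQ internal DNS name, instance count.
print_gridgain_launch() {
    mq_host=$1
    count=$2
    printf 'ec2-run-instances %s -k %s -d "%s" -n %s\n' \
        "ami-00678569" "gsg-keypair" \
        "config/jms/sunmq/spring-sunmq.xml | -DimqAddressList=$mq_host" "$count"
}

# Example with a placeholder internal DNS name:
print_gridgain_launch ip-10-0-0-5.ec2.internal 4
# prints: ec2-run-instances ami-00678569 -k gsg-keypair -d "config/jms/sunmq/spring-sunmq.xml | -DimqAddressList=ip-10-0-0-5.ec2.internal" -n 4
```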

Start GridGain Images With JGroups

AMI IDs utilized:

  • ami-47a6412e - GridGain 2.1.0
  • ami-00678569 - GridGain 2.1.1

Follow these steps to run GridGain nodes (images) with the JGroups Communication and Discovery SPIs:

  1. Start one instance to act as the "JGroups initial host" for all the others:
    ec2-run-instances ami-00678569 -k gsg-keypair -d "config/jgroups/tcp/spring-jgroups.xml | -Djgroups.tcpping.initial_hosts="
  2. Get that instance's internal IP address.
  3. Start all the other GridGain instances, passing the initial host to JGroups:
    ec2-run-instances ami-00678569 -k gsg-keypair \
    -d "config/jgroups/tcp/spring-jgroups.xml | -Djgroups.tcpping.initial_hosts=instance_internal_ip[7801]"
  4. Once the GridGain node instances have started, open the following URL in your browser to view the GridGain node log:
    http://[instance public address]/log/gridgain.log
    Use the ec2-describe-instances command to get the instance addresses.
  5. Use the ec2-terminate-instances command to stop the EC2 instances.
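The user data for step 3 follows the pattern host[port]. As a rough sketch, a helper like the one below could fill in the internal IP address from step 2. The function name jgroups_user_data and the address 10.0.0.12 are placeholders; the default port 7801 is the one shown in step 3.

```shell
# Hypothetical helper: build the user data string for GridGain nodes that
# join through a JGroups initial host.
# Arguments: internal IP of the initial host, optional port (default 7801).
jgroups_user_data() {
    host_ip=$1
    port=${2:-7801}
    printf '%s\n' "config/jgroups/tcp/spring-jgroups.xml | -Djgroups.tcpping.initial_hosts=${host_ip}[${port}]"
}

# Example with a placeholder internal IP address:
jgroups_user_data 10.0.0.12
# prints: config/jgroups/tcp/spring-jgroups.xml | -Djgroups.tcpping.initial_hosts=10.0.0.12[7801]
```

Pass the printed string in double quotes as the -d value, so the shell does not treat the pipe character or the square brackets specially.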

Known Problems

Because EC2 is virtualized, it is not possible to predict how many virtual instances will run on one physical machine. We found that a heavily loaded thread can sometimes keep other threads from getting CPU time for a while. This can cause trouble with discovery, because discovery uses a timeout after which a node is considered failed if the configured number of heartbeats has not been received from it. If the thread that sends heartbeats is blocked for some time, this timeout can expire.

Discovery timeout is calculated as

HeartbeatFrequency * MaximumMissedHeartbeats

For the JMS discovery SPI, the default heartbeat frequency is 5 seconds and the default maximum number of missed heartbeats is 3. So if your node receives no heartbeats for 15 seconds, it considers the other nodes failed, while in fact those nodes are available but heavily loaded.
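The arithmetic can be checked with a one-line shell function. This is an illustration only; the function name discovery_timeout is made up, and the second call assumes the heartbeat frequency stays at 5 seconds while only the missed-heartbeat count is raised.

```shell
# Discovery timeout in seconds =
#   heartbeat frequency (seconds) * maximum missed heartbeats (count).
discovery_timeout() {
    echo $(( $1 * $2 ))
}

discovery_timeout 5 3     # JMS discovery SPI defaults; prints 15
discovery_timeout 5 10    # assumed: same frequency, 10 missed heartbeats; prints 50
```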

To solve this, increase MaximumMissedHeartbeats. We prepared two JMS discovery configurations with MaximumMissedHeartbeats already increased: one allows 10 missed heartbeats, the other 20. Use them if you have problems with discovery. For example, you can start your GridGain image as

ec2-run-instances ami-47a6412e -k gsg-keypair \
-d "config/jms/sunmq/spring-sunmq-10-missed-hb.xml | -DimqAddressList=[your OpenMQ instance internal DNS name]" -n [instance count]

or

ec2-run-instances ami-47a6412e -k gsg-keypair \
-d "config/jms/sunmq/spring-sunmq-20-missed-hb.xml | -DimqAddressList=[your OpenMQ instance internal DNS name]" -n [instance count]

You can also build your own EC2 image based on ours and set any other value for the MaximumMissedHeartbeats parameter.
