Clustering¶
Cluster mode is the solution for high-performance systems. It offers Load Balancing and High Availability features.
A Platform cluster is a set of nodes that communicate via JGroups - UDP or TCP - in the back-end, with a front-end load balancer such as Apache that distributes HTTP requests to the nodes. High Availability is achieved in the data layer natively by the RDBMS or by shared file systems, such as SAN and NAS.
The following diagram illustrates a cluster of two nodes (each node uses its local JCR index storage, but you can enable shared JCR indexing, as described in this chapter).
In this chapter:
- Setting up eXo Platform cluster: how to set up an eXo Platform cluster.
- JCR index in cluster mode: configuration and explanation of the JCR index strategies (local and shared).
- Activating TCP default configuration files: how to use the TCP default configuration files.
- Configuring JGroups via exo.properties: a list of default values and variable names that you can configure via exo.properties.
- Using customized JGroups xml files: in case you have a configuration that is not externalized, or you want to migrate your JGroups xml files from previous versions, read this section to activate your xml files.
- Setting up a load balancer: how to set up load balancing using an HTTP proxy.
- FAQs of clustering: common questions and answers that are useful for administrators when clustering eXo Platform.
Setting up eXo Platform cluster¶
Install the eXo Platform package by following Installation and Startup.
If you are using the eXo Chat add-on, you should install it on all the cluster nodes.
Create a copy of the package for each cluster node. Assume that you have two nodes: node1.your-domain.com and node2.your-domain.com.
Note
For testing or troubleshooting purposes, if you are using Tomcat as the application server and you run the cluster nodes in the same environment (same operating system), you should configure different Tomcat ports for each node, as shown in the sketch below.
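For example, for node2 running on the same machine as node1, you could shift the default ports in conf/server.xml. This is only a sketch: compared to the default file, only the port attributes change, and the connectors actually present depend on your bundle.

<!-- node2 conf/server.xml (sketch): shift the default ports to avoid conflicts with node1 -->
<Server port="8006" shutdown="SHUTDOWN">
  ...
  <Connector port="8081" protocol="HTTP/1.1" connectionTimeout="20000" redirectPort="8444"/>
  <Connector port="8010" protocol="AJP/1.3" redirectPort="8444"/>
  ...
</Server>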
- Configure the RDBMS datasources in each cluster node (follow this documentation) to use one of the supported database systems: PostgreSQL, MySQL, MS SQL Server, Oracle or MariaDB.
Note
- It is not possible to use the default embedded HSQL database, as noted in Configuring eXo Platform with database.
- The different cluster nodes must use the same RDBMS datasources.
- eXo Platform comes with Elasticsearch embedded. For clustering, you MUST use a separate Elasticsearch process. Please follow the steps described here (see the sketch after this note).
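As a minimal sketch, pointing all nodes to the same external Elasticsearch server is usually done through exo.properties. The property names below are assumptions based on the standard Elasticsearch client configuration; check the linked documentation for the exact names in your version, and replace es-host with your Elasticsearch host.

# Disable the embedded Elasticsearch and use an external one (sketch)
exo.es.embedded.enabled=false
exo.es.index.server.url=http://es-host:9200
exo.es.search.server.url=http://es-host:9200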
eXo Platform uses databases and disk folders to store its data:
Datasources:
- IDM: datasource to store user/group/membership entities.
- JCR: datasource to store JCR Data.
- JPA: datasource to store entities mapped by Hibernate. Quartz tables are stored in this datasource by default.
Disk:
File storage data: stored by default under a file system folder; it can be configured to store files in the JPA datasource instead. More details here.
If the file system storage implementation is configured, the folder must be shared between all cluster nodes.
The folder location is defined by the property exo.files.storage.dir=/exo-shared-folder-example/files/ and can be modified through the exo.properties file.
JCR Binary Value Storage: stored by default under a file system folder; it can be configured to store files in the JCR datasource instead. More details here.
If the file system storage implementation is configured, the folder must be shared between all cluster nodes.
The folder location is defined by the property exo.jcr.storage.data.dir=/exo-shared-folder-example/jcrvalues/ and can be modified through the exo.properties file.
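For example, assuming both shared locations live on the same network mount (the path below is only an example), exo.properties on every node could contain:

exo.files.storage.dir=/exo-shared-folder-example/files/
exo.jcr.storage.data.dir=/exo-shared-folder-example/jcrvalues/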
Tip
Choosing file system or RDBMS storage depends on your needs and your system environment (see more details in Comparing file system and RDBMS storage).
JCR indexes: Stored under a local file system folder in each cluster node. More details here.
eXo Platform uses local JCR indexes by default, and this is the recommended mode for clustering: read and write operations take less time in local mode than in shared mode.
Other systems: such as MongoDB if the eXo Chat add-on is installed.
Configure the exo.cluster.node.name property. Use a different name for each node.
In JBoss, edit this property in the standalone-exo-cluster.xml file:
<system-properties>
    <property name="exo.cluster.node.name" value="node1"/>
</system-properties>
In Tomcat, add the property in setenv-customize.sh (.bat for Windows environments).
For Windows:
SET "CATALINA_OPTS=%CATALINA_OPTS% -Dexo.cluster.node.name=node1"
For Linux:
CATALINA_OPTS="${CATALINA_OPTS} -Dexo.cluster.node.name=node1"
eXo Platform uses the UDP protocol by default for JGroups. This protocol is not recommended for production environments; you need to configure TCP as the transport protocol instead. For that purpose, please follow this documentation.
Configure CometD Oort URL. Replace localhost in the following examples with the IP or host name of the node.
In JBoss, edit standalone-exo-cluster.xml:
<property name="exo.cometd.oort.url" value="http://localhost:8080/cometd/cometd"/>
In Tomcat, edit exo.properties:
exo.cometd.oort.url=http://localhost:8080/cometd/cometd
CometD is used to perform messaging over the web, and Oort is a CometD extension that supports clustering. This configuration is necessary to make On-site Notifications work properly.
Configure CometD group port. This step is optional.
CometD Oort nodes automatically join others in the same network and the same group, so to prevent unknown nodes from joining your group, you can specify your group with a port that is different from the default one (5577). This situation is likely to happen in a testing environment.
In JBoss, edit the standalone-exo-cluster.xml file:
<!-- Configure the same port for all nodes in your cluster -->
<property name="exo.cometd.oort.multicast.groupPort" value="5579"/>
In Tomcat, edit the exo.properties file:
# Configure the same port for all nodes in your cluster
exo.cometd.oort.multicast.groupPort=5579
This last step is applicable when multicast is available on the system where CometD is deployed. Otherwise, the static discovery mechanism should be used by adding the following properties to the exo.properties file:
exo.cometd.oort.configType=static
exo.cometd.oort.cloud=http://host2:port2/cometd/cometd,http://host3:port3/cometd/cometd
- The default value for exo.cometd.oort.configType is "multicast", and only the two values "multicast" and "static" are available.
- The parameter exo.cometd.oort.cloud must contain a comma-separated list of the CometD endpoints of all the other nodes of the cluster. In the example above, we assume that the node owning this exo.properties is host1:port1 and that the cluster is composed of three nodes: host1, host2 and host3.
Only in Tomcat, configure the following:
In setenv-customize.sh (.bat for Windows):
EXO_PROFILES="all,cluster"
In exo.properties:
gatein.jcr.config.type=cluster
gatein.jcr.index.changefilterclass=org.exoplatform.services.jcr.impl.core.query.ispn.LocalIndexChangesFilter
# Default JCR indexing is local so you need to use a different folder for each node.
# With the value below, you do not have to create the folder.
exo.jcr.index.data.dir=gatein/data/jcr/index
Start the servers. You must wait until node1 is fully started, then start node2.
In JBoss, you need to indicate the configuration file with the -c option (.bat for Windows):
./bin/standalone.sh -b 0.0.0.0 -c standalone-exo-cluster.xml
Only in JBoss, some other options that you can use in the start command:
-Dexo.cluster.node.name=a-node-name overrides the node name in the configuration file.
-Djboss.socket.binding.port-offset=101
This is useful in case you set up nodes on the same machine for testing. You will not need to configure the ports for every node; just use a different port-offset in each start command.
Note
If you run two nodes on the same machine for testing, change the default ports of node2 to avoid port conflicts.
In Tomcat, ports are configured in conf/server.xml.
In JBoss, use the -Djboss.socket.binding.port-offset option mentioned above.
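In Tomcat, no extra start option is required once the properties above are set. A minimal sketch, assuming the start script shipped with the eXo Tomcat bundle:

# Start node1 first, then node2 once node1 is fully started
./start_eXo.sh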
To configure a front-end for your nodes, follow Setting up Apache front-end.
To configure load balancing, follow Setting up a load balancer.
Note
eXo Platform only supports sticky session mode for clustering (no session replication). This must be configured in the load balancer configuration.
JCR index in cluster mode¶
Note
eXo Platform uses local JCR index by default. You can switch between local index and shared index by configuration.
Local indexing is the default because it simplifies configuration. Each strategy has its pros and cons. Here is a brief overview of their characteristics, but it is strongly recommended that you read the given links for a better understanding:
Local indexing: Each node manages its own local index storage. The “documents” (to be indexed) are replicated within nodes.
“Documents” is a Lucene term for a block of data ready for indexing. The same “documents” are replicated between nodes and each node indexes them locally, so the local indexes are kept up to date on the running nodes.
There are additional mechanisms for a new node that starts for the first time to initiate its local index, and for a node joining the cluster after downtime to update its local index.
Read this link for details.
Shared indexing: Every node has read access to a shared index and has its own in-memory index. A single “coordinator” node is responsible for pulling in-memory indexes and updating the shared index.
This allows newly added content to be searchable immediately. However, in rare cases the search results can differ between nodes for a while.
Read this link for details.
For LOCAL INDEXING, the index directory should be a local path on each node. In JBoss it is already set by default:
<property name="exo.jcr.index.data.dir" value="${exo.jcr.data.dir}/index"/>
For Tomcat, you need to set it yourself in the exo.properties file:
exo.jcr.index.data.dir=gatein/data/jcr/index
If you want to use a SHARED INDEX for every node:
Enable the profile cluster-index-shared.
In JBoss, edit $PLATFORM_JBOSS_HOME/standalone/configuration/standalone-exo-cluster.xml:
<property name="exo.profiles" value="all,cluster,cluster-index-shared"/>
In Tomcat, edit setenv-customize.sh (.bat for Windows, see Customizing environment variables):
EXO_PROFILES="all,cluster,cluster-index-shared"
Set the index directory (exo.jcr.index.data.dir) to a shared network path.
In JBoss, edit $PLATFORM_JBOSS_HOME/standalone/configuration/standalone-exo-cluster.xml:
<property name="exo.jcr.index.data.dir" value="${exo.shared.dir}/jcr/index"/>
In Tomcat, if you do not configure it, exo.jcr.index.data.dir is already set to a sub-folder of the shared directory EXO_DATA_DIR. This is done in setenv.*:
CATALINA_OPTS="$CATALINA_OPTS -Dexo.jcr.index.data.dir=\"${EXO_DATA_DIR}/jcr/index\""
You can override it in exo.properties:
exo.jcr.index.data.dir=/path/of/a/shared/folder/for/all/nodes
Activating TCP default configuration files¶
The default protocol for JGroups is UDP. However, TCP is still pre-configured in platform-extension-config.jar!/conf/platform/jgroups and you can simply activate it.
The files contain externalized variable names and default values for TCP. In case you want to use TCP instead of UDP, it is recommended that you activate those files and, if needed, change the default settings via exo.properties. See Configuration overview for the exo.properties file.
To activate the TCP default configuration files, enable the profile cluster-jgroups-tcp:
In JBoss, edit standalone-exo-cluster.xml:
<system-properties>
    ...
    <property name="exo.profiles" value="all,cluster,cluster-jgroups-tcp"/>
    ...
</system-properties>
In Tomcat, edit setenv-customize.sh (.bat for Windows, see Customizing environment variables):
EXO_PROFILES="all,cluster,cluster-jgroups-tcp"
When switching from UDP to TCP, you need to add some properties in exo.properties:
# Assume node1 is 192.168.1.100 and node2 is 192.168.1.101. Here is configuration for node1:
exo.jcr.cluster.jgroups.tcp.bind_addr=192.168.1.100
exo.jcr.cluster.jgroups.tcpping.initial_hosts=192.168.1.100[7800],192.168.1.101[7800]
exo.service.cluster.jgroups.tcp.bind_addr=192.168.1.100
exo.service.cluster.jgroups.tcpping.initial_hosts=192.168.1.100[7900],192.168.1.101[7900]
Configuring JGroups via exo.properties¶
JGroups configuration can be externalized for both JCR and the Service layer. In this section you will find a list of default values and externalized variables that you can configure via exo.properties. See Configuration overview for the exo.properties file.
It is recommended that you configure JGroups via exo.properties. Only when the variables are not enough, or when you want to re-use JGroups xml files migrated from previous versions, should you customize the JGroups xml files as described in the next section.
UDP configuration for JCR¶
JGroups name | Default value | eXo variable |
---|---|---|
UDP | | |
singleton_name | exo-transport-udp | exo.jcr.cluster.jgroups.udp.singleton_name |
bind_addr | 127.0.0.1 | exo.jcr.cluster.jgroups.udp.bind_addr |
bind_port | 16600 | exo.jcr.cluster.jgroups.udp.bind_port |
mcast_addr | 228.10.10.10 | exo.jcr.cluster.jgroups.udp.mcast_addr |
mcast_port | 17600 | exo.jcr.cluster.jgroups.udp.mcast_port |
tos | 8 | exo.jcr.cluster.jgroups.udp.tos |
ucast_recv_buf_size | 20M | exo.jcr.cluster.jgroups.udp.ucast_recv_buf_size |
ucast_send_buf_size | 640K | exo.jcr.cluster.jgroups.udp.ucast_send_buf_size |
mcast_recv_buf_size | 25M | exo.jcr.cluster.jgroups.udp.mcast_recv_buf_size |
mcast_send_buf_size | 640K | exo.jcr.cluster.jgroups.udp.mcast_send_buf_size |
max_bundle_size | 64000 | exo.jcr.cluster.jgroups.udp.max_bundle_size |
max_bundle_timeout | 30 | exo.jcr.cluster.jgroups.udp.max_bundle_timeout |
ip_ttl | 2 | exo.jcr.cluster.jgroups.udp.ip_ttl |
enable_diagnostics | true | exo.jcr.cluster.jgroups.udp.enable_diagnostics |
diagnostics_addr | 224.0.75.75 | exo.jcr.cluster.jgroups.udp.diagnostics_addr |
diagnostics_port | 7500 | exo.jcr.cluster.jgroups.udp.diagnostics_port |
thread_naming_pattern | cl | exo.jcr.cluster.jgroups.udp.thread_naming_pattern |
use_concurrent_stack | true | exo.jcr.cluster.jgroups.udp.use_concurrent_stack |
thread_pool.enabled | true | exo.jcr.cluster.jgroups.udp.thread_pool.enabled |
thread_pool.min_threads | 10 | exo.jcr.cluster.jgroups.udp.thread_pool.min_threads |
thread_pool.max_threads | 1000 | exo.jcr.cluster.jgroups.udp.thread_pool.max_threads |
thread_pool.keep_alive_time | 5000 | exo.jcr.cluster.jgroups.udp.thread_pool.keep_alive_time |
thread_pool.queue_enabled | true | exo.jcr.cluster.jgroups.udp.thread_pool.queue_enabled |
thread_pool.queue_max_size | 1000 | exo.jcr.cluster.jgroups.udp.thread_pool.queue_max_size |
thread_pool.rejection_policy | discard | exo.jcr.cluster.jgroups.udp.thread_pool.rejection_policy |
oob_thread_pool.enabled | true | exo.jcr.cluster.jgroups.udp.oob_thread_pool.enabled |
oob_thread_pool.min_threads | 5 | exo.jcr.cluster.jgroups.udp.oob_thread_pool.min_threads |
oob_thread_pool.max_threads | 1000 | exo.jcr.cluster.jgroups.udp.oob_thread_pool.max_threads |
oob_thread_pool.keep_alive_time | 5000 | exo.jcr.cluster.jgroups.udp.oob_thread_pool.keep_alive_time |
oob_thread_pool.queue_enabled | false | exo.jcr.cluster.jgroups.udp.oob_thread_pool.queue_enabled |
oob_thread_pool.queue_max_size | 1000 | exo.jcr.cluster.jgroups.udp.oob_thread_pool.queue_max_size |
oob_thread_pool.rejection_policy | Run | exo.jcr.cluster.jgroups.udp.oob_thread_pool.rejection_policy |
PING | | |
timeout | 2000 | exo.jcr.cluster.jgroups.ping.timeout |
num_initial_members | 1 | exo.jcr.cluster.jgroups.ping.num_initial_members |
MERGE2 | | |
max_interval | 30000 | exo.jcr.cluster.jgroups.merge2.max_interval |
min_interval | 10000 | exo.jcr.cluster.jgroups.merge2.min_interval |
FD | | |
timeout | 10000 | exo.jcr.cluster.jgroups.fd.timeout |
max_tries | 5 | exo.jcr.cluster.jgroups.fd.max_tries |
shun | true | exo.jcr.cluster.jgroups.fd.shun |
VERIFY_SUSPECT | | |
timeout | 1500 | exo.jcr.cluster.jgroups.verify_suspect.timeout |
pbcast.NAKACK | | |
use_stats_for_retransmission | false | exo.jcr.cluster.jgroups.pbcast.nakack.use_stats_for_retransmission |
exponential_backoff | 150 | exo.jcr.cluster.jgroups.pbcast.nakack.exponential_backoff |
use_mcast_xmit | true | exo.jcr.cluster.jgroups.pbcast.nakack.use_mcast_xmit |
gc_lag | 0 | exo.jcr.cluster.jgroups.pbcast.nakack.gc_lag |
retransmit_timeout | 50,300,600,1200 | exo.jcr.cluster.jgroups.pbcast.nakack.retransmit_timeout |
discard_delivered_msgs | true | exo.jcr.cluster.jgroups.pbcast.nakack.discard_delivered_msgs |
UNICAST | | |
timeout | 300,600,1200 | exo.jcr.cluster.jgroups.unicast.timeout |
pbcast.STABLE | | |
stability_delay | 1000 | exo.jcr.cluster.jgroups.pbcast.stable.stability_delay |
desired_avg_gossip | 50000 | exo.jcr.cluster.jgroups.pbcast.stable.desired_avg_gossip |
max_bytes | 1000000 | exo.jcr.cluster.jgroups.pbcast.stable.max_bytes |
VIEW_SYNC | | |
avg_send_interval | 60000 | exo.jcr.cluster.jgroups.view_sync.avg_send_interval |
pbcast.GMS | | |
print_local_addr | true | exo.jcr.cluster.jgroups.pbcast.gms.print_local_addr |
join_timeout | 3000 | exo.jcr.cluster.jgroups.pbcast.gms.join_timeout |
shun | false | exo.jcr.cluster.jgroups.pbcast.gms.shun |
view_bundling | true | exo.jcr.cluster.jgroups.pbcast.gms.view_bundling |
FC | | |
max_credits | 500000 | exo.jcr.cluster.jgroups.fc.max_credits |
min_threshold | 0.20 | exo.jcr.cluster.jgroups.fc.min_threshold |
FRAG2 | | |
frag_size | 60000 | exo.jcr.cluster.jgroups.frag2.frag_size |
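For example, to change the multicast address and port used by the JCR UDP stack, every node could set the following in exo.properties (the values here are only examples):

exo.jcr.cluster.jgroups.udp.mcast_addr=228.10.10.20
exo.jcr.cluster.jgroups.udp.mcast_port=17610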
TCP configuration for JCR¶
See how to activate TCP default configuration in Activating TCP default configuration files.
JGroups name | Default value | eXo variable |
---|---|---|
TCP | | |
singleton_name | exo-transport-tcp | exo.jcr.cluster.jgroups.tcp.singleton_name |
bind_addr | 127.0.0.1 | exo.jcr.cluster.jgroups.tcp.bind_addr |
start_port | 7800 | exo.jcr.cluster.jgroups.tcp.start_port |
loopback | true | exo.jcr.cluster.jgroups.tcp.loopback |
recv_buf_size | 20000000 | exo.jcr.cluster.jgroups.tcp.recv_buf_size |
send_buf_size | 640000 | exo.jcr.cluster.jgroups.tcp.send_buf_size |
discard_incompatible_packets | true | exo.jcr.cluster.jgroups.tcp.discard_incompatible_packets |
max_bundle_size | 64000 | exo.jcr.cluster.jgroups.tcp.max_bundle_size |
max_bundle_timeout | 30 | exo.jcr.cluster.jgroups.tcp.max_bundle_timeout |
use_incoming_packet_handler | true | exo.jcr.cluster.jgroups.tcp.use_incoming_packet_handler |
enable_bundling | true | exo.jcr.cluster.jgroups.tcp.enable_bundling |
use_send_queues | true | exo.jcr.cluster.jgroups.tcp.use_send_queues |
sock_conn_timeout | 300 | exo.jcr.cluster.jgroups.tcp.sock_conn_timeout |
skip_suspected_members | true | exo.jcr.cluster.jgroups.tcp.skip_suspected_members |
use_concurrent_stack | true | exo.jcr.cluster.jgroups.tcp.use_concurrent_stack |
thread_pool.enabled | true | exo.jcr.cluster.jgroups.tcp.thread_pool.enabled |
thread_pool.min_threads | 10 | exo.jcr.cluster.jgroups.tcp.thread_pool.min_threads |
thread_pool.max_threads | 100 | exo.jcr.cluster.jgroups.tcp.thread_pool.max_threads |
thread_pool.keep_alive_time | 60000 | exo.jcr.cluster.jgroups.tcp.thread_pool.keep_alive_time |
thread_pool.queue_enabled | true | exo.jcr.cluster.jgroups.tcp.thread_pool.queue_enabled |
thread_pool.queue_max_size | 1000 | exo.jcr.cluster.jgroups.tcp.thread_pool.queue_max_size |
thread_pool.rejection_policy | Discard | exo.jcr.cluster.jgroups.tcp.thread_pool.rejection_policy |
oob_thread_pool.enabled | true | exo.jcr.cluster.jgroups.tcp.oob_thread_pool.enabled |
oob_thread_pool.min_threads | 10 | exo.jcr.cluster.jgroups.tcp.oob_thread_pool.min_threads |
oob_thread_pool.max_threads | 100 | exo.jcr.cluster.jgroups.tcp.oob_thread_pool.max_threads |
oob_thread_pool.keep_alive_time | 60000 | exo.jcr.cluster.jgroups.tcp.oob_thread_pool.keep_alive_time |
oob_thread_pool.queue_enabled | false | exo.jcr.cluster.jgroups.tcp.oob_thread_pool.queue_enabled |
oob_thread_pool.queue_max_size | 1000 | exo.jcr.cluster.jgroups.tcp.oob_thread_pool.queue_max_size |
oob_thread_pool.rejection_policy | Discard | exo.jcr.cluster.jgroups.tcp.oob_thread_pool.rejection_policy |
TCPPING | | |
timeout | 3000 | exo.jcr.cluster.jgroups.tcpping.timeout |
initial_hosts | localhost[7800] | exo.jcr.cluster.jgroups.tcpping.initial_hosts |
port_range | 0 | exo.jcr.cluster.jgroups.tcpping.port_range |
num_initial_members | 1 | exo.jcr.cluster.jgroups.tcpping.num_initial_members |
MERGE2 | | |
max_interval | 100000 | exo.jcr.cluster.jgroups.merge2.max_interval |
min_interval | 20000 | exo.jcr.cluster.jgroups.merge2.min_interval |
FD | | |
timeout | 10000 | exo.jcr.cluster.jgroups.fd.timeout |
max_tries | 5 | exo.jcr.cluster.jgroups.fd.max_tries |
shun | true | exo.jcr.cluster.jgroups.fd.shun |
VERIFY_SUSPECT | | |
timeout | 1500 | exo.jcr.cluster.jgroups.verify_suspect.timeout |
pbcast.NAKACK | | |
use_mcast_xmit | false | exo.jcr.cluster.jgroups.pbcast.nakack.use_mcast_xmit |
gc_lag | 0 | exo.jcr.cluster.jgroups.pbcast.nakack.gc_lag |
retransmit_timeout | 300,600,1200,2400,4800 | exo.jcr.cluster.jgroups.pbcast.nakack.retransmit_timeout |
discard_delivered_msgs | true | exo.jcr.cluster.jgroups.pbcast.nakack.discard_delivered_msgs |
UNICAST | | |
timeout | 300,600,1200 | exo.jcr.cluster.jgroups.unicast.timeout |
pbcast.STABLE | | |
stability_delay | 1000 | exo.jcr.cluster.jgroups.pbcast.stable.stability_delay |
desired_avg_gossip | 50000 | exo.jcr.cluster.jgroups.pbcast.stable.desired_avg_gossip |
max_bytes | 1m | exo.jcr.cluster.jgroups.pbcast.stable.max_bytes |
VIEW_SYNC | | |
avg_send_interval | 60000 | exo.jcr.cluster.jgroups.view_sync.avg_send_interval |
pbcast.GMS | | |
print_local_addr | true | exo.jcr.cluster.jgroups.pbcast.gms.print_local_addr |
join_timeout | 3000 | exo.jcr.cluster.jgroups.pbcast.gms.join_timeout |
shun | true | exo.jcr.cluster.jgroups.pbcast.gms.shun |
view_bundling | true | exo.jcr.cluster.jgroups.pbcast.gms.view_bundling |
FC | | |
max_credits | 2000000 | exo.jcr.cluster.jgroups.fc.max_credits |
min_threshold | 0.10 | exo.jcr.cluster.jgroups.fc.min_threshold |
FRAG2 | | |
frag_size | 60000 | exo.jcr.cluster.jgroups.frag2.frag_size |
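For example, to bind the JCR TCP stack to a node's real address and start it on a non-default port, exo.properties on that node could contain (the addresses and port are examples; see also Activating TCP default configuration files):

exo.jcr.cluster.jgroups.tcp.bind_addr=192.168.1.100
exo.jcr.cluster.jgroups.tcp.start_port=7801
exo.jcr.cluster.jgroups.tcpping.initial_hosts=192.168.1.100[7801],192.168.1.101[7801]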
UDP configuration for Service layer¶
TCP configuration for Service layer caches¶
See how to activate TCP default configuration in Activating TCP default configuration files.
Using customized JGroups xml files¶
JGroups configuration, for both JCR and the Service layer, is externalized via exo.properties (see Configuration overview for this file). It is recommended that you use this file. See the previous sections for the list of default values and externalized variables.
Only when the variables are not enough, or when you want to re-use JGroups configuration files migrated from previous versions, should you follow this section to activate your xml files.
Put your xml files somewhere, typically standalone/configuration/gatein/jgroups/ in JBoss and gatein/conf/jgroups/ in Tomcat.
Edit the following properties in exo.properties:
exo.jcr.cluster.jgroups.config=${exo.conf.dir}/jgroups/jgroups-jcr.xml
exo.jcr.cluster.jgroups.config-url=file:${exo.jcr.cluster.jgroups.config}
exo.service.cluster.jgroups.config=${exo.conf.dir}/jgroups/jgroups-service.xml
Here, exo.conf.dir is standalone/configuration/gatein in JBoss and gatein/conf in Tomcat by default.
If you put your files somewhere else, note that you must use an absolute path after "file:":
exo.jcr.cluster.jgroups.config=/path/to/your/jgroups-jcr-file
exo.jcr.cluster.jgroups.config-url=file:/path/to/your/jgroups-jcr-file
exo.service.cluster.jgroups.config=/path/to/your/jgroups-service-file
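For illustration only, a custom jgroups-jcr.xml for TCP could look like the sketch below, which simply mirrors the protocol stack and default values listed in the TCP configuration table above (the IP addresses are the example ones used earlier). The exact root element, protocol names and attributes must match the JGroups version bundled with your eXo Platform version, so it is safer to start from the default files shipped in platform-extension-config.jar and adjust them.

<config>
    <TCP singleton_name="exo-transport-tcp" bind_addr="192.168.1.100" start_port="7800" loopback="true"/>
    <TCPPING timeout="3000" initial_hosts="192.168.1.100[7800],192.168.1.101[7800]" port_range="0" num_initial_members="2"/>
    <MERGE2 max_interval="100000" min_interval="20000"/>
    <FD timeout="10000" max_tries="5" shun="true"/>
    <VERIFY_SUSPECT timeout="1500"/>
    <pbcast.NAKACK use_mcast_xmit="false" gc_lag="0" retransmit_timeout="300,600,1200,2400,4800" discard_delivered_msgs="true"/>
    <UNICAST timeout="300,600,1200"/>
    <pbcast.STABLE stability_delay="1000" desired_avg_gossip="50000" max_bytes="1m"/>
    <VIEW_SYNC avg_send_interval="60000"/>
    <pbcast.GMS print_local_addr="true" join_timeout="3000" shun="true" view_bundling="true"/>
    <FC max_credits="2000000" min_threshold="0.10"/>
    <FRAG2 frag_size="60000"/>
</config>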
Setting up a load balancer¶
Setting up a basic load balancing with Apache¶
The following modules need to be activated in order to do load balancing on several cluster nodes:
- mod_proxy_balancer
- mod_slotmem_shm (mandatory for mod_proxy_balancer)
- mod_lbmethod_byrequests if you choose the by-request balancing algorithm (alternatively mod_lbmethod_bytraffic or mod_lbmethod_bybusyness)
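On Debian-based systems, these modules can typically be enabled with a2enmod. This is a sketch: the example configuration below also relies on mod_headers (for the ROUTEID cookie) and mod_proxy_wstunnel (for the WebSocket pool), and module or package names may differ on your distribution.

a2enmod proxy proxy_http proxy_wstunnel proxy_balancer slotmem_shm lbmethod_byrequests headers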
Part of an Apache configuration to enable load balancing:
# Add a http header to explicitly identify the node and be sticky
Header add Set-Cookie "ROUTEID=.%{BALANCER_WORKER_ROUTE}e; path=/" env=BALANCER_ROUTE_CHANGED
# Declare the http server pool
<Proxy "balancer://plf">
BalancerMember "http://node1:8080" route=node1 acquire=2000 retry=5 keepalive=on ping=30 connectiontimeout=2
BalancerMember "http://node2:8080" route=node2 acquire=2000 retry=5 keepalive=on ping=30 connectiontimeout=2
ProxySet stickysession=ROUTEID
</Proxy>
# Declare the pool dedicated to the websocket tunnels
<Proxy "balancer://plf_ws">
BalancerMember "ws://node1:8080" route=node1 acquire=2000 retry=0 keepalive=on ping=30 connectiontimeout=2 disablereuse=on flushpackets=on
BalancerMember "ws://node2:8080" route=node2 acquire=2000 retry=0 keepalive=on ping=30 connectiontimeout=2 disablereuse=on flushpackets=on
ProxySet stickysession=ROUTEID
</Proxy>
# Common options
ProxyRequests Off
ProxyPreserveHost On
# Declare the redirection for websocket urls, must be declared before the general ProxyPass definition
ProxyPass /cometd "balancer://plf_ws/cometd"
# Declare the redirection for the http requests
ProxyPass / "balancer://plf/"
ProxyPassReverse / "balancer://plf/"
Note
This configuration must be adapted to your specific needs before you go to production.
All the configuration details can be found on the Apache configuration page.
Improving the logs¶
Diagnosing a cluster problem can be difficult. The Apache logs can be customized to help you follow the load balancing behavior.
The BALANCER_WORKER_ROUTE variable adds to your logs the name of the node that received each request.
The BALANCER_ROUTE_CHANGED variable sets the field to 1 if the user was redirected to a different node than for his previous request. This indicates that the node was removed from the cluster pool or was not able to receive more requests. During normal processing, this field should always have the value -.
Example of a log format with cluster diagnosis enabled:
LogFormat "%h %l %u %t \"%r\" %>s %b %{BALANCER_WORKER_ROUTE}e %{BALANCER_ROUTE_CHANGED}e" common_cluster
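This format can then be referenced by a CustomLog directive; the log file path below is only an example:

CustomLog "logs/access_cluster.log" common_cluster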
Note
More log options are detailed in the Apache documentation
Setting up basic load balancing with NGINX¶
Note
The load balancing support in the free version of NGINX is limited. The sticky algorithm is limited to IP hash, and the node configuration can't be precisely tuned.
If you have an NGINX Plus license, the full load balancing documentation can be found here.
Basic NGINX load balancing configuration:
upstream plf {
ip_hash;
server node1:8080;
server node2:8080;
}
server {
listen 80;
location / {
proxy_pass http://plf;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header Host $host;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
}
# Websocket for Cometd
location /cometd/cometd {
proxy_pass http://plf;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header Host $host;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
}
}
FAQs of clustering¶
Q: How to migrate from local to the cluster mode?
A: If you intend to migrate your production system from the local (non-cluster) to the cluster mode, follow these steps:
Update the configuration to the cluster mode as explained above on your main server.
Use the same configuration on other cluster nodes.
Move the index and value storage to the shared file system.
Start the cluster.
Q: Why does startup fail with the “Port value out of range” error?
A: On Linux, startup fails if you encounter the following error:
[INFO] Caused by: java.lang.IllegalArgumentException: Port value out of range: 65536
This problem happens under specific circumstances when the JGroups networking library behind the clustering attempts to detect the IP to communicate with other nodes.
You need to verify:
- The host name resolves to a valid IP address served by one of the network devices, such as eth0 or eth1.
- The host name is NOT defined as localhost or 127.0.0.1.
Q: How to solve the “failed sending message to null” error?
A: If you encounter the following error when starting up in the cluster mode on Linux:
Dec 15, 2010 6:11:31 PM org.jgroups.protocols.TP down
SEVERE: failed sending message to null (44 bytes)
java.lang.Exception: dest=/228.10.10.10:45588 (47 bytes)
Be aware that clustering on Linux only works with IPv4. Therefore, when using a cluster under Linux, add the following property to the JVM parameters:
-Djava.net.preferIPv4Stack=true
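For example, in a Tomcat installation this option can be appended to CATALINA_OPTS in setenv-customize.sh (a sketch; in JBoss, add it to the JVM arguments of your standalone configuration instead):

CATALINA_OPTS="${CATALINA_OPTS} -Djava.net.preferIPv4Stack=true"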