I have a very simple Pacemaker configuration:
node test-1 \
        attributes standby="off"
node test-2 \
        attributes standby="off"
primitive IP1 ocf:heartbeat:IPaddr2 \
        params ip="192.168.1.2" cidr_netmask="32" nic="eth1" \
        op monitor interval="10"
primitive IP2 ocf:heartbeat:IPaddr2 \
        params ip="192.168.1.3" cidr_netmask="32" nic="eth1" \
        op monitor interval="10"
primitive p_haproxy lsb:haproxy \
        op monitor interval="10" timeout="20" \
        op start interval="0" timeout="20" \
        op stop interval="0" timeout="20"
group grIP1 IP1 \
        meta ordered="false"
group grIP2 IP2 \
        meta ordered="false"
group gr_WebServer p_haproxy
clone cl_WebServer gr_WebServer \
        meta interleave="true"
colocation c_grIP1_Ws inf: grIP1 cl_WebServer
colocation c_grIP1_grIP2 -200: grIP1 grIP2
colocation c_grIP2_Ws inf: grIP2 cl_WebServer
property $id="cib-bootstrap-options" \
        dc-version="1.1.7-6.el6" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="2" \
        stonith-enabled="false" \
        no-quorum-policy="ignore"
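
For reference, a config like this can be sanity-checked after loading it; a minimal sketch (the file name cluster.crm is hypothetical):

crm configure load update cluster.crm    # merge the edits into the CIB
crm_verify -L -V                         # check the live CIB for errors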
While both nodes are up, crm_mon shows the resources spread across the two nodes as intended:

Current DC: test-1 - partition with quorum
2 Nodes configured, 2 expected votes
4 Resources configured.
============

Online: [ test-1 test-2 ]

 Resource Group: grIP1
     IP1        (ocf::heartbeat:IPaddr2):       Started test-2
 Resource Group: grIP2
     IP2        (ocf::heartbeat:IPaddr2):       Started test-1
 Clone Set: cl_WebServer [gr_WebServer]
     Started: [ test-2 test-1 ]
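
The -200 anti-colocation between grIP1 and grIP2 is what spreads the two IPs here. If it helps, the allocation scores behind this placement can be dumped against the live cluster, e.g.:

crm_simulate -s -L    # -s shows allocation scores, -L uses the live CIB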
But after p_haproxy fails to start on test-1 (for example, when the node comes back up after an outage), everything ends up on test-2 and stays there:

Current DC: test-1 - partition with quorum
2 Nodes configured, 2 expected votes
4 Resources configured.
============

Online: [ test-1 test-2 ]

 Resource Group: grIP1
     IP1        (ocf::heartbeat:IPaddr2):       Started test-2
 Resource Group: grIP2
     IP2        (ocf::heartbeat:IPaddr2):       Started test-2
 Clone Set: cl_WebServer [gr_WebServer]
     Started: [ test-2 ]
     Stopped: [ gr_WebServer:1 ]

Failed actions:
    p_haproxy:1_start_0 (node=test-1, call=17, rc=1, status=complete): unknown error
Meanwhile the log keeps repeating:

warning: unpack_rsc_op: Processing failed op p_haproxy:1_last_failure_0 on test-1: unknown error (1)
warning: common_apply_stickiness: Forcing cl_WebServer away from test-1 after 1000000 failures (max=1000000)
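
The 1000000 in the warning is the fail-count Pacemaker recorded for the failed start (a failed start sets it to INFINITY by default). A minimal sketch for inspecting it, using the resource and node names from the output above:

crm_mon -1 -f                                    # one-shot status including fail counts
crm resource failcount p_haproxy show test-1     # query the fail-count for p_haproxy on test-1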
The only way I have found to get the resources back onto a node that has come back up is to restart corosync (pacemaker is started by corosync, ver: 0).
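
For reference, that workaround amounts to (assuming the stock EL6 init script):

service corosync restart    # pacemaker is respawned by corosync (ver: 0)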
Can anyone suggest how to move the resource back onto the node, and explain why it currently cannot be done?
Software versions: corosync 1.4.1, pacemaker 1.1.7.