[Linux-ha-jp] STONITH errors during split-brain

renay****@ybb*****
Sat, 28 Feb 2015 07:41:45 JST

----- Original Message -----
>From: Masamichi Fukuda - elf-systems <masamichi_fukud****@elf-s*****>
>To: linux****@lists***** 
>Date: 2015/2/27, Fri 21:04
>Subject: [Linux-ha-jp] STONITH errors during split-brain
>I am building and testing a two-node cluster system on Debian with Xen.
>Dom0: Debian 7.7, Xen 4.1.4-3+deb7u3
>DomU: Debian 7.8, pacemaker 1.1.7-1, heartbeat 1:3.0.5-3
>Node lbv2.beta.com (82ffc36f-1ad8-8686-7db0-35686465c624): UNCLEAN (offl
>Online: [ lbv1.beta.com ]
>Node lbv1.beta.com (38b0f200-83ea-8633-6f37-047d36cd39c6): UNCLEAN (offl
>Online: [ lbv2.beta.com ]
>lbv1 [12657]: CRIT: external_reset_req: 'stonith-helper reset' for host lbv2.beta.com failed with rc 1
>lbv2 [22225]: CRIT: external_reset_req: 'stonith-helper reset' for host lbv1.beta.com failed with rc 1
>primitive Stonith1-1 stonith:external/stonith-helper \
>    params \
>        priority="1" \
>        stonith-timeout="40" \
>        hostlist="lbv1.beta.com" \
>        dead_check_target="" \
>        standby_wait_time="10" \
>        standby_check_command="/usr/sbin/crm_resource -r varnishd -W | grep -q `hostname`" \
>    op start interval="0s" timeout="60s" on-fail="restart" \
>    op monitor interval="3600s" timeout="60s" on-fail="restart" \
>    op stop interval="0s" timeout="60s" on-fail="ignore"
>primitive Stonith2-1 stonith:external/stonith-helper \
>    params \
>        priority="1" \
>        stonith-timeout="40" \
>        hostlist="lbv2.beta.com" \
>        dead_check_target="" \
>        standby_wait_time="10" \
>        standby_check_command="/usr/sbin/crm_resource -r varnishd -W | grep -q `hostname`" \
>    op start interval="0s" timeout="60s" on-fail="restart" \
>    op monitor interval="3600s" timeout="60s" on-fail="restart" \
>    op stop interval="0s" timeout="60s" on-fail="ignore"
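Both primitives use the same standby_check_command, so it may help to spell out what that pipeline actually tests. The following is only a minimal Python sketch of the check, not the stonith-helper code itself: `crm_resource -r varnishd -W` reports where the varnishd resource is running, and `grep -q` succeeds when the local hostname appears in that output. The function names and the sample output string are assumptions made for illustration, and a plain substring test only approximates grep (which treats the hostname as a regular expression).

```python
import socket


def resource_is_local(locate_output: str, hostname: str) -> bool:
    """Approximate `grep -q HOSTNAME`: true if the hostname occurs in the text."""
    return hostname in locate_output


def check_exit_status(locate_output: str, hostname: str = "") -> int:
    """Return 0 when the resource appears to run on this host, 1 otherwise.

    This mirrors the exit status of the shell pipeline in
    standby_check_command; the helper name itself is hypothetical.
    """
    if not hostname:
        # Same role as `hostname` in the backticks of the original command.
        hostname = socket.gethostname()
    return 0 if resource_is_local(locate_output, hostname) else 1


# Assumed example of what `crm_resource -r varnishd -W` might print.
sample_output = "resource varnishd is running on: lbv1.beta.com"

status_on_lbv1 = check_exit_status(sample_output, "lbv1.beta.com")
status_on_lbv2 = check_exit_status(sample_output, "lbv2.beta.com")
```

With this sample, the check succeeds on lbv1 and fails on lbv2. One thing worth verifying on the real nodes is that `hostname` prints the same form of the name (short or fully qualified) that crm_resource reports, since the match is purely textual.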
>Feb 27 19:29:04 lbv1.beta.com stonith: [18566]: CRIT: external_reset_req: 'stonith-helper reset' for host lbv2.beta.com failed with rc 1
>Feb 27 19:29:04 lbv1.beta.com stonith-ng: [2815]: ERROR: log_operation: Operation 'reboot' [18565] (call 0 from d2acf6a5-ef8d-4249-aaab-25a8686d6647) for host 'lbv2.beta.com' with device 'Stonith2-1' returned: -2
>Feb 27 19:29:04 lbv1.beta.com stonith-ng: [2815]: ERROR: log_operation: Stonith2-1: Performing: stonith -t external/stonith-helper -T reset lbv2.
>Feb 27 19:29:04 lbv1.beta.com stonith-ng: [2815]: ERROR: log_operation: Stonith2-1: failed: lbv2.beta.com 5
>Feb 27 19:29:05 lbv1.beta.com stonith-ng: [2815]: info: call_remote_stonith: Requesting that lbv1.beta.com perform op reboot lbv2.beta.c
>Feb 27 19:29:05 lbv1.beta.com stonith-ng: [2815]: info: can_fence_host_with_device: Stonith2-1 can fence lbv2.beta.com: dynamic-list
>Feb 27 19:29:05 lbv1.beta.com stonith-ng: [2815]: info: can_fence_host_with_device: Stonith2-2 can fence lbv2.beta.com: dynamic-list
>Feb 27 19:29:05 lbv1.beta.com stonith-ng: [2815]: info: can_fence_host_with_device: Stonith2-3 can fence lbv2.beta.com: dynamic-list
>Feb 27 19:29:05 lbv1.beta.com stonith-ng: [2815]: info: stonith_fence: Found 3 matching devices for 'lbv2.beta.com'
>Feb 27 19:29:05 lbv1.beta.com stonith-ng: [2815]: info: stonith_command: Processed st_fence from lbv1.beta.com: rc=-1
>Feb 27 19:29:08 lbv1.beta.com crm_resource: [18790]: info: Invoked: /usr/sbin/crm_resource -r varnishd -W
>Feb 27 19:29:09 lbv1.beta.com stonith: [18706]: CRIT: external_reset_req: 'stonith-helper reset' for host lbv2.beta.com failed with rc 1
>Feb 27 19:29:09 lbv1.beta.com stonith-ng: [2815]: ERROR: log_operation: Operation 'reboot' [18705] (call 0 from d2acf6a5-ef8d-4249-aaab-25a8686d6647) for host 'lbv2.beta.com' with device 'Stonith2-1' returned: -2
>Feb 27 19:29:09 lbv1.beta.com stonith-ng: [2815]: ERROR: log_operation: Stonith2-1: Performing: stonith -t external/stonith-helper -T reset lbv2.
>Feb 27 19:29:09 lbv1.beta.com stonith-ng: [2815]: ERROR: log_operation: Stonith2-1: failed: lbv2.beta.com 5
>Feb 27 19:29:10 lbv1.beta.com stonith-ng: [2815]: info: call_remote_stonith: Requesting that lbv1.beta.com perform op reboot lbv2.beta.c
>Feb 27 19:29:10 lbv1.beta.com stonith-ng: [2815]: info: can_fence_host_with_device: Stonith2-1 can fence lbv2.beta.com: dynamic-list
>Feb 27 19:29:10 lbv1.beta.com stonith-ng: [2815]: info: can_fence_host_with_device: Stonith2-2 can fence lbv2.beta.com: dynamic-list
>Feb 27 19:29:10 lbv1.beta.com stonith-ng: [2815]: info: can_fence_host_with_device: Stonith2-3 can fence lbv2.beta.com: dynamic-list
>Feb 27 19:29:10 lbv1.beta.com stonith-ng: [2815]: info: stonith_fence: Found 3 matching devices for 'lbv2.beta.com'
>Feb 27 19:29:10 lbv1.beta.com stonith-ng: [2815]: info: stonith_command: Processed st_fence from lbv1.beta.com: rc=-1
>Feb 27 19:29:13 lbv1.beta.com crm_resource: [18953]: info: Invoked: /usr/sbin/crm_resource -r varnishd -W
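The quoted log repeats the same failure roughly every five seconds, so a quick tally per device can make the pattern easier to see. Below is a small Python sketch for that, assuming each wrapped log entry has first been joined back into a single line. The regular expression is written against the stonith-ng "log_operation" lines quoted above; the function and variable names are made up for illustration and are not part of Pacemaker.

```python
import re
from collections import Counter

# Matches result lines such as:
#   Operation 'reboot' [18565] (call 0 from ...) for host 'lbv2.beta.com'
#   with device 'Stonith2-1' returned: -2
RESULT_RE = re.compile(
    r"Operation '(?P<op>\w+)' \[\d+\] .*?for host '(?P<host>[^']+)' "
    r"with device '(?P<device>[^']+)' returned: (?P<rc>-?\d+)"
)


def summarize_fence_results(lines):
    """Count fencing results keyed by (operation, host, device, return code)."""
    counts = Counter()
    for line in lines:
        match = RESULT_RE.search(line)
        if match:
            key = (match["op"], match["host"], match["device"], int(match["rc"]))
            counts[key] += 1
    return counts


# Two result lines and one unrelated line, taken from the quoted log
# with the mail line-wrapping undone.
SAMPLE_LINES = [
    "Feb 27 19:29:04 lbv1.beta.com stonith-ng: [2815]: ERROR: log_operation: "
    "Operation 'reboot' [18565] (call 0 from d2acf6a5-ef8d-4249-aaab-25a8686d6647) "
    "for host 'lbv2.beta.com' with device 'Stonith2-1' returned: -2",
    "Feb 27 19:29:05 lbv1.beta.com stonith-ng: [2815]: info: stonith_fence: "
    "Found 3 matching devices for 'lbv2.beta.com'",
    "Feb 27 19:29:09 lbv1.beta.com stonith-ng: [2815]: ERROR: log_operation: "
    "Operation 'reboot' [18705] (call 0 from d2acf6a5-ef8d-4249-aaab-25a8686d6647) "
    "for host 'lbv2.beta.com' with device 'Stonith2-1' returned: -2",
]

summary = summarize_fence_results(SAMPLE_LINES)
for (op, host, device, rc), count in sorted(summary.items()):
    print(f"{op} {host} via {device}: rc={rc} x{count}")
```

In this excerpt the only device that shows up in a result line is Stonith2-1, even though stonith-ng reports three matching devices for lbv2.beta.com, so running the same tally over the full log should show whether Stonith2-2 and Stonith2-3 are ever attempted.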
>ELF Systems
>Masamichi Fukuda
>mail to: masamichi_fukud****@elf-s*****
>Linux-ha-japan mailing list
