小林 年伸
toshi****@jig-s*****
2015年 9月 29日 (火) 10:02:06 JST
開発者の皆様 jig-sawの小林と申します。 ultramonkeyl7-3.1.1、 ultramonkeyl7-3.1.2 であるきっかけで内部的にactive_countが膨大な数になり 突然Sorryサーバーに流れる事象がありましたので ご質問させていただいてよろしいでしょうか。 以下私の方で確認できた部分です。 ================================================= ①構成 ================================================= エッジルーター ↓ ultramonkeyl7-3.1.1 or ultramonkeyl7-3.1.2 OS:CentOS 6.6 64bit pacemaker, corosyncの冗長構成 ↓ httpリアルサーバー × 2 (apache) OS:CentOS 5.5 64bit ・l7vsd設定 ※ログ部分は省略させていただきます。 ------------------------------------------ # # l7vsadm setting # [l7vsadm] cmd_interval = 1 cmd_count = 10 con_interval = 1 con_count = 1 # # l7vsd setting # [l7vsd] maxfileno = 65535 # # virtualservice setting # [virtualservice] session_thread_pool_size = 32 throughput_calc_interval = 500 #num_of_core_uses = 0 # # Session Settings # [session] upstream_buffer_size = 8192 downstream_buffer_size = 8192 server_connect_timeout = 5 [replication] 使用していません [snmpagent] enable = "false" ------------------------------------------ ・l7director設定 ------------------------------------------ # - Monitor Settings checktimeout = 5 negotiatetimeout = 5 checkinterval = 10 retryinterval = 5 checkcount = 3 # - Real Server Operation quiescent = yes #fallback = 127.0.0.1:80 # - Monitor Configuration File configinterval = 10 autoreload = yes # HTTP virtual = 〓vip〓:80 real=172.20.0.251:80 masq 10 real=172.20.0.253:80 masq 10 module = sessionless --forwarded-for sorryserver = 〓sorry srver〓 scheduler = lc qosup = 125M qosdown = 125M checktype = connect service = http quiescent = no checkcount = 2 session_thread_pool_size = 50 maxconn = 1000 ------------------------------------------ ・apache設定 ※timeout値とpreforkモジュール部分のみ抜粋です ------------------------------------------ Timeout 60 KeepAlive On MaxKeepAliveRequests 100 KeepAliveTimeout 2 <IfModule prefork.c> StartServers 8 MinSpareServers 5 MaxSpareServers 20 ServerLimit 256 MaxClients 256 MaxRequestsPerChild 4000 </IfModule> ------------------------------------------ ================================================= ②事象 ================================================= 1). 上の構成でultramonkeyl7のvipにtelnetコマンドで接続 $ telnet 〓vip〓 80 Trying 〓vip〓 Connected to 〓vip〓 Escape character is '^]'. GET / HTTP/1.0 通常通り操作を行えばリアルサーバーのコンテンツが返ってきます。 以下はl7vsd_virtualserviceのデバッグログです ---------------------------------------- 2015/09/28 19:03:12 [DEBUG] l7vsd_virtualservice VSD10700031 active session thread id = 0x16003f0 〓ホスト名〓 0x7ffd78fc7700 virtualservice_tcp.cpp:457 2015/09/28 19:03:12 [DEBUG] l7vsd_virtualservice VSD10700032 pool_session.size = 49 〓ホスト名〓 0x7ffd78fc7700 virtualservice_tcp.cpp:460 2015/09/28 19:03:12 [DEBUG] l7vsd_virtualservice VSD10700033 active_session.size = 1 〓ホスト名〓 0x7ffd78fc7700 virtualservice_tcp.cpp:463 2015/09/28 19:03:12 [DEBUG] l7vsd_virtualservice VSD10700034 active_count = 0 〓ホスト名〓 0x7ffd78fc7700 virtualservice_tcp.cpp:466 2015/09/28 19:03:12 [DEBUG] l7vsd_virtualservice VSD10700007 in_function : void session_thread_control::startupstream() 〓ホスト名〓 0x7ffd78fc7700 session_thread_control.cpp:211 2015/09/28 19:03:12 [DEBUG] l7vsd_virtualservice VSD10700008 out_function : void session_thread_control::startupstream() :status = 1 〓ホスト名〓 0x7ffd78fc7700 session_thread_control.cpp:219 2015/09/28 19:03:12 [DEBUG] l7vsd_virtualservice VSD10700011 in_function : void session_thread_control::startdownstream() 〓ホスト名〓 0x7ffd78fc7700 session_thread_control.cpp:246 2015/09/28 19:03:12 [DEBUG] l7vsd_virtualservice VSD10700012 out_function : void session_thread_control::startdownstream() : status = 1 〓ホスト名〓 0x7ffd78fc7700 session_thread_control.cpp:254 2015/09/28 19:03:12 [DEBUG] l7vsd_virtualservice VSD10700035 active session thread id = 0x1610400 〓ホスト名〓 0x7ffd78fc7700 virtualservice_tcp.cpp:512 2015/09/28 19:03:12 [DEBUG] l7vsd_virtualservice VSD10700036 pool_session.size = 48 〓ホスト名〓 0x7ffd78fc7700 virtualservice_tcp.cpp:515 2015/09/28 19:03:12 [DEBUG] l7vsd_virtualservice VSD10700037 active_session.size = 1 〓ホスト名〓 0x7ffd78fc7700 virtualservice_tcp.cpp:518 2015/09/28 19:03:12 [DEBUG] l7vsd_virtualservice VSD10700038 active_count = 0 〓ホスト名〓 0x7ffd78fc7700 virtualservice_tcp.cpp:521 2015/09/28 19:03:12 [DEBUG] l7vsd_virtualservice VSD10700039 out_function: void virtualservice_tcp::handle_accept( const boost::shared_ptr<session_thread_control> , const boost::system::error_code& err ) 〓ホスト名〓 0x7ffd78fc7700 virtualservice_tcp.cpp:544 2015/09/28 19:03:12 [DEBUG] l7vsd_virtualservice VSD10700082 in_function: void virtualservice_tcp::connection_active( const boost::asio::ip::tcp::endpoint& in ): in = 172.20.0.251:80 〓ホスト名〓 0x7ffceef02700 virtualservice_tcp.cpp:1706 2015/09/28 19:03:12 [DEBUG] l7vsd_virtualservice VSD10700083 out_function: void virtualservice_tcp::connection_active() 〓ホスト名〓 0x7ffceef02700 virtualservice_tcp.cpp:1722 2015/09/28 19:03:22 [DEBUG] l7vsd_virtualservice VSD10700084 in_function: void virtualservice_tcp::connection_inactive( const boost::asio::ip::tcp::endpoint& in ): in = 172.20.0.251:80 〓ホスト名〓 0x7ffceef02700 virtualservice_tcp.cpp:1738 2015/09/28 19:03:22 [DEBUG] l7vsd_virtualservice VSD10700085 out_function: void virtualservice_tcp::connection_inactive() 〓ホスト名 〓 0x7ffceef02700 virtualservice_tcp.cpp:1774 2015/09/28 19:03:22 [DEBUG] l7vsd_virtualservice VSD10700013 in_function : void session_thread_control::stopdownstream() 〓ホスト名〓 0x7ffcee501700 session_thread_control.cpp:264 2015/09/28 19:03:22 [DEBUG] l7vsd_virtualservice VSD10700014 out_function : void session_thread_control::stopdownstream() : status = 0 〓ホスト名〓 0x7ffcee501700 session_thread_control.cpp:271 2015/09/28 19:03:22 [DEBUG] l7vsd_virtualservice VSD10700086 in_function: void virtualservice_tcp::release_session( const tcp_session* session_ptr ): session_ptr = 0x16ea650 〓ホスト名〓 0x7ffceef02700 virtualservice_tcp.cpp:1790 2015/09/28 19:03:22 [DEBUG] l7vsd_virtualservice VSD10700089 pool_session.size = 48 〓ホスト名〓 0x7ffceef02700 virtualservice_tcp.cpp:1810 2015/09/28 19:03:22 [DEBUG] l7vsd_virtualservice VSD10700090 active_session.size = 1 〓ホスト名〓 0x7ffceef02700 virtualservice_tcp.cpp:1813 2015/09/28 19:03:22 [DEBUG] l7vsd_virtualservice VSD10700091 active_count = 0 〓ホスト名〓 0x7ffceef02700 virtualservice_tcp.cpp:1816 2015/09/28 19:03:22 [DEBUG] l7vsd_virtualservice VSD10700092 pool_session.size = 49 〓ホスト名〓 0x7ffceef02700 virtualservice_tcp.cpp:1825 2015/09/28 19:03:22 [DEBUG] l7vsd_virtualservice VSD10700093 active_session.size = 0 〓ホスト名〓 0x7ffceef02700 virtualservice_tcp.cpp:1828 2015/09/28 19:03:22 [DEBUG] l7vsd_virtualservice VSD10700094 active_count = 0 〓ホスト名〓 0x7ffceef02700 virtualservice_tcp.cpp:1831 2015/09/28 19:03:22 [DEBUG] l7vsd_virtualservice VSD10700095 out_function: void virtualservice_tcp::release_session( const tcp_session* session_ptr ): session_ptr = 0x16ea650 〓ホスト名〓 0x7ffceef02700 virtualservice_tcp.cpp:1836 2015/09/28 19:03:22 [DEBUG] l7vsd_virtualservice VSD10700009 in_function : void session_thread_control::stopupstream() 〓ホスト名〓 0x7ffceef02700 session_thread_control.cpp:229 2015/09/28 19:03:22 [DEBUG] l7vsd_virtualservice VSD10700010 out_function : void session_thread_control::stopupstream() : status = 0 〓ホスト名〓 0x7ffceef02700 session_thread_control.cpp:236 2). 上の構成でultramonkeyl7のvipにtelnetコマンドで接続、後続のapacheが timeoutする60秒以上放置 $ telnet 〓vip〓 80 Trying 〓vip〓 Connected to 〓vip〓 Escape character is '^]'. リアルサーバがtimeoutすると以下のログが出力されました ---------------------------------------- 2015/09/28 19:14:10 [DEBUG] l7vsd_virtualservice VSD10700027 in_function: void virtualservice_tcp::handle_accept( const boost::shared_ptr<session_thread_control> , const boost::system::error_code& err ): err = false, err.message = Success 〓 ホスト名〓 0x7fcc5ea5d700 virtualservice_tcp.cpp:349 2015/09/28 19:14:10 [DEBUG] l7vsd_virtualservice VSD10700031 active session thread id = 0x1e0dd70 〓ホスト名〓 0x7fcc5ea5d700 virtualservice_tcp.cpp:457 2015/09/28 19:14:10 [DEBUG] l7vsd_virtualservice VSD10700032 pool_session.size = 49 〓ホスト名〓 0x7fcc5ea5d700 virtualservice_tcp.cpp:460 2015/09/28 19:14:10 [DEBUG] l7vsd_virtualservice VSD10700033 active_session.size = 1 〓ホスト名〓 0x7fcc5ea5d700 virtualservice_tcp.cpp:463 2015/09/28 19:14:10 [DEBUG] l7vsd_virtualservice VSD10700034 active_count = 0 〓ホスト名〓 0x7fcc5ea5d700 virtualservice_tcp.cpp:466 2015/09/28 19:14:10 [DEBUG] l7vsd_virtualservice VSD10700007 in_function : void session_thread_control::startupstream() 〓ホスト名〓 0x7fcc5ea5d700 session_thread_control.cpp:211 2015/09/28 19:14:10 [DEBUG] l7vsd_virtualservice VSD10700008 out_function : void session_thread_control::startupstream() :status = 1 〓ホスト名〓 0x7fcc5ea5d700 session_thread_control.cpp:219 2015/09/28 19:14:10 [DEBUG] l7vsd_virtualservice VSD10700011 in_function : void session_thread_control::startdownstream() 〓ホスト名〓 0x7fcc5ea5d700 session_thread_control.cpp:246 2015/09/28 19:14:10 [DEBUG] l7vsd_virtualservice VSD10700012 out_function : void session_thread_control::startdownstream() : status = 1 〓ホスト名〓 0x7fcc5ea5d700 session_thread_control.cpp:254 2015/09/28 19:14:10 [DEBUG] l7vsd_virtualservice VSD10700035 active session thread id = 0x1e1f890 〓ホスト名〓 0x7fcc5ea5d700 virtualservice_tcp.cpp:512 2015/09/28 19:14:10 [DEBUG] l7vsd_virtualservice VSD10700036 pool_session.size = 48 〓ホスト名〓 0x7fcc5ea5d700 virtualservice_tcp.cpp:515 2015/09/28 19:14:10 [DEBUG] l7vsd_virtualservice VSD10700037 active_session.size = 1 〓ホスト名〓 0x7fcc5ea5d700 virtualservice_tcp.cpp:518 2015/09/28 19:14:10 [DEBUG] l7vsd_virtualservice VSD10700038 active_count = 0 〓ホスト名〓 0x7fcc5ea5d700 virtualservice_tcp.cpp:521 2015/09/28 19:14:10 [DEBUG] l7vsd_virtualservice VSD10700039 out_function: void virtualservice_tcp::handle_accept( const boost::shared_ptr<session_thread_control> , const boost::system::error_code& err ) 〓ホスト名〓 0x7fcc5ea5d700 virtualservice_tcp.cpp:544 2015/09/28 19:14:10 [DEBUG] l7vsd_virtualservice VSD10700082 in_function: void virtualservice_tcp::connection_active( const boost::asio::ip::tcp::endpoint& in ): in = 172.20.0.251:80 〓ホスト名〓 0x7fccba1fc700 virtualservice_tcp.cpp:1706 2015/09/28 19:14:10 [DEBUG] l7vsd_virtualservice VSD10700083 out_function: void virtualservice_tcp::connection_active() 〓ホスト名〓 0x7fccba1fc700 virtualservice_tcp.cpp:1722 3). 2)の状態からtelnet側で操作を行うと後続がtimeoutしているので何も返っ てきません $ telnet 〓vip〓 80 Trying 〓vip〓 Connected to 〓vip〓 Escape character is '^]'. GET / HTTP/1.0 Connection closed. この時以下のログが出力されました ---------------------------------------- 2015/09/28 19:15:37 [DEBUG] l7vsd_virtualservice VSD10700084 in_function: void virtualservice_tcp::connection_inactive( const boost::asio::ip::tcp::endpoint& in ): in = 172.20.0.251:80 〓ホスト名〓 0x7fccba1fc700 virtualservice_tcp.cpp:1738 2015/09/28 19:15:37 [DEBUG] l7vsd_virtualservice VSD10700085 out_function: void virtualservice_tcp::connection_inactive() 〓ホスト名 〓 0x7fccba1fc700 virtualservice_tcp.cpp:1774 2015/09/28 19:15:37 [DEBUG] l7vsd_virtualservice VSD10700086 in_function: void virtualservice_tcp::release_session( const tcp_session* session_ptr ): session_ptr = 0x1dfc850 〓ホスト名〓 0x7fccba1fc700 virtualservice_tcp.cpp:1790 2015/09/28 19:15:37 [DEBUG] l7vsd_virtualservice VSD10700089 pool_session.size = 48 〓ホスト名〓 0x7fccba1fc700 virtualservice_tcp.cpp:1810 2015/09/28 19:15:37 [DEBUG] l7vsd_virtualservice VSD10700090 active_session.size = 1 〓ホスト名〓 0x7fccba1fc700 virtualservice_tcp.cpp:1813 2015/09/28 19:15:37 [DEBUG] l7vsd_virtualservice VSD10700091 active_count = 18446744073709551615 〓ホスト名〓 0x7fccba1fc700 virtualservice_tcp.cpp:1816 2015/09/28 19:15:37 [DEBUG] l7vsd_virtualservice VSD10700092 pool_session.size = 49 〓ホスト名〓 0x7fccba1fc700 virtualservice_tcp.cpp:1825 2015/09/28 19:15:37 [DEBUG] l7vsd_virtualservice VSD10700093 active_session.size = 0 〓ホスト名〓 0x7fccba1fc700 virtualservice_tcp.cpp:1828 2015/09/28 19:15:37 [DEBUG] l7vsd_virtualservice VSD10700094 active_count = 18446744073709551615 〓ホスト名〓 0x7fccba1fc700 virtualservice_tcp.cpp:1831 2015/09/28 19:15:37 [DEBUG] l7vsd_virtualservice VSD10700095 out_function: void virtualservice_tcp::release_session( const tcp_session* session_ptr ): session_ptr = 0x1dfc850 〓ホスト名〓 0x7fccba1fc700 virtualservice_tcp.cpp:1836 2015/09/28 19:15:37 [DEBUG] l7vsd_virtualservice VSD10700009 in_function : void session_thread_control::stopupstream() 〓ホスト名〓 0x7fccba1fc700 session_thread_control.cpp:229 2015/09/28 19:15:37 [DEBUG] l7vsd_virtualservice VSD10700010 out_function : void session_thread_control::stopupstream() : status = 0 〓ホスト名〓 0x7fccba1fc700 session_thread_control.cpp:236 4).その後もう一度1)の通常操作を行うとactive_countが膨大な数になり以後 l7vsdを再起動若しくはバーチャルサービスを消さないかぎりSorryサーバーへ流 され続けます この時以下のログが出力されました ---------------------------------------- 2015/09/28 19:16:27 [DEBUG] l7vsd_virtualservice VSD10700027 in_function: void virtualservice_tcp::handle_accept( const boost::shared_ptr<session_thread_control> , const boost::system::error_code& err ): err = false, err.message = Success 〓 ホスト名〓 0x7fcc5ea5d700 virtualservice_tcp.cpp:349 2015/09/28 19:16:27 [DEBUG] l7vsd_virtualservice VSD10700030 Connection switch Sorry mode. active_session.size = 1, active_count.get = 18446744073709551615 〓ホスト名〓 0x7fcc5ea5d700 virtualservice_tcp.cpp:448 2015/09/28 19:16:27 [DEBUG] l7vsd_virtualservice VSD10700031 active session thread id = 0x1e1f890 〓ホスト名〓 0x7fcc5ea5d700 virtualservice_tcp.cpp:457 2015/09/28 19:16:27 [DEBUG] l7vsd_virtualservice VSD10700032 pool_session.size = 49 〓ホスト名〓 0x7fcc5ea5d700 virtualservice_tcp.cpp:460 2015/09/28 19:16:27 [DEBUG] l7vsd_virtualservice VSD10700033 active_session.size = 1 〓ホスト名〓 0x7fcc5ea5d700 virtualservice_tcp.cpp:463 2015/09/28 19:16:27 [DEBUG] l7vsd_virtualservice VSD10700034 active_count = 18446744073709551615 〓ホスト名〓 0x7fcc5ea5d700 virtualservice_tcp.cpp:466 2015/09/28 19:16:27 [DEBUG] l7vsd_virtualservice VSD10700007 in_function : void session_thread_control::startupstream() 〓ホスト名〓 0x7fcc5ea5d700 session_thread_control.cpp:211 2015/09/28 19:16:27 [DEBUG] l7vsd_virtualservice VSD10700008 out_function : void session_thread_control::startupstream() :status = 1 〓ホスト名〓 0x7fcc5ea5d700 session_thread_control.cpp:219 2015/09/28 19:16:27 [DEBUG] l7vsd_virtualservice VSD10700011 in_function : void session_thread_control::startdownstream() 〓ホスト名〓 0x7fcc5ea5d700 session_thread_control.cpp:246 2015/09/28 19:16:27 [DEBUG] l7vsd_virtualservice VSD10700012 out_function : void session_thread_control::startdownstream() : status = 1 〓ホスト名〓 0x7fcc5ea5d700 session_thread_control.cpp:254 2015/09/28 19:16:27 [DEBUG] l7vsd_virtualservice VSD10700035 active session thread id = 0x1e313b0 〓ホスト名〓 0x7fcc5ea5d700 virtualservice_tcp.cpp:512 2015/09/28 19:16:27 [DEBUG] l7vsd_virtualservice VSD10700036 pool_session.size = 48 〓ホスト名〓 0x7fcc5ea5d700 virtualservice_tcp.cpp:515 2015/09/28 19:16:27 [DEBUG] l7vsd_virtualservice VSD10700037 active_session.size = 1 〓ホスト名〓 0x7fcc5ea5d700 virtualservice_tcp.cpp:518 2015/09/28 19:16:27 [DEBUG] l7vsd_virtualservice VSD10700038 active_count = 18446744073709551615 〓ホスト名〓 0x7fcc5ea5d700 virtualservice_tcp.cpp:521 2015/09/28 19:16:27 [DEBUG] l7vsd_virtualservice VSD10700039 out_function: void virtualservice_tcp::handle_accept( const boost::shared_ptr<session_thread_control> , const boost::system::error_code& err ) 〓ホスト名〓 0x7fcc5ea5d700 virtualservice_tcp.cpp:544 2015/09/28 19:16:27 [DEBUG] l7vsd_virtualservice VSD10700017 in_function : l7vs::virtualservice_element& virtualservice_base::get_element() 〓ホ スト名〓 0x7fccb8dfa700 virtualservice_base.cpp:577 2015/09/28 19:16:27 [DEBUG] l7vsd_virtualservice VSD10700018 element struct data, udpmode = TCP, tcp_accept_endpoint = 〓vip〓:80, udp_recv_endpoint = 0.0.0.0:0, realserver_vector:size = 2, protocol_module_name = sessionless, schedule_module_name = lc, protocol_args.size = 1, sorry_maxconnection = 1000, sorry_endpoint = 〓sorry サーバー〓, sorry_flag = 0, qos_upstream = 15625000, qos_downstream = 15625000, throughput_upstream = 0, throughput_downstream = 0, ssl_file_name = , http_total_count = 0, http_get_count = 0, http_post_count = 0 〓ホスト名〓 0x7fccb8dfa700 virtualservice_base.cpp:626 2015/09/28 19:16:27 [DEBUG] l7vsd_virtualservice VSD10700019 out_function : l7vs::virtualservice_element& virtualservice_base::get_element() 〓ホスト名〓 0x7fccb8dfa700 virtualservice_base.cpp:627 2015/09/28 19:16:27 [DEBUG] l7vsd_virtualservice VSD10700017 in_function : l7vs::virtualservice_element& virtualservice_base::get_element() 〓ホ スト名〓 0x7fccb8dfa700 virtualservice_base.cpp:577 2015/09/28 19:16:27 [DEBUG] l7vsd_virtualservice VSD10700018 element struct data, udpmode = TCP, tcp_accept_endpoint = 〓vip〓:80, udp_recv_endpoint = 0.0.0.0:0, realserver_vector:size = 2, protocol_module_name = sessionless, schedule_module_name = lc, protocol_args.size = 1, sorry_maxconnection = 1000, sorry_endpoint = 〓sorry サーバー〓, sorry_flag = 0, qos_upstream = 15625000, qos_downstream = 15625000, throughput_upstream = 0, throughput_downstream = 0, ssl_file_name = , http_total_count = 0, http_get_count = 0, http_post_count = 0 〓ホスト名〓 0x7fccb8dfa700 virtualservice_base.cpp:626 2015/09/28 19:16:27 [DEBUG] l7vsd_virtualservice VSD10700019 out_function : l7vs::virtualservice_element& virtualservice_base::get_element() 〓ホスト名〓 0x7fccb8dfa700 virtualservice_base.cpp:627 私の見方が違うかもしれませんが active_countをインクリメントしている部分をみるとloopしていれば ログも大量に出力されそうなのですがそうはなっておりません。 https://osdn.jp/projects/ultramonkey-l7/scm/git/ultramonkey-l7-v3/blobs/master/l7vsd/src/virtualservice_tcp.cpp virtualservice_tcp.cpp 441行目付近のこの部分がおかしくなってますでしょうか? ((0 < element.sorry_maxconnection) && ((active_count.get() >= static_cast<size_t>(element.sorry_maxconnection))) 長くなりましたが、可能であれば回避する方法をご教授いただけますでしょうか。 お手数ですが、何卒よろしくお願いいたします。 -- -------------------------------------------- ジグソー株式会社 小林 年伸 E-mail:toshi****@jig-s***** Tel:011-737-0240 Fax:011-737-0241 〒060-0808 北海道札幌市北区北8条西3丁目32番