owin – Service Fabric Stateless Identity Server 3服务在加载时失败

我们正在将基于Identity Server 3(
https://github.com/IdentityServer/IdentityServer3)的无状态服务迁移到服务结构.

该项目在正常开发负载下在本地开发盒和生产集群上运行良好,但是当以每秒约20-30个请求投入生产时,它会快速停止响应请求并且ARR中配置的运行状况检查变得不健康.

该服务由IIS ARR(应用程序请求路由)群集实现,该群集执行SSL卸载.

身份服务器日志输出以下两个错误,一个看似与入站请求相关,另一个与用于身份服务器持久性的azure存储的出站请求相关.

Microsoft.WindowsAzure.Storage.StorageException: The client could not finish the operation within specified timeout. ---> System.TimeoutException: The client could not finish the operation within specified timeout.
   --- End of inner exception stack trace ---
   at Microsoft.WindowsAzure.Storage.Core.Executor.Executor.EndExecuteAsync[T](IAsyncResult result) in c:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\ClassLibraryCommon\Core\Executor\Executor.cs:line 50
   at Microsoft.WindowsAzure.Storage.Core.Util.AsyncExtensions.<>c__DisplayClass1`1.<CreateCallback>b__0(IAsyncResult ar) in c:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\ClassLibraryCommon\Core\Util\AsyncExtensions.cs:line 66
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter`1.GetResult()
   at Collabco.Myday.Identity.IdSvr.BaseStore`1.<ExecuteQueryAsync>d__21.MoveNext() in C:\Dev\myday-identity\IdentityServer\IdSvr\BaseStore.cs:line 258
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter`1.GetResult()
   at Collabco.Myday.Identity.IdSvr.ScopeStore.<GetScopesAsync>d__3.MoveNext() in C:\Dev\myday-identity\IdentityServer\IdSvr\ScopeStore.cs:line 43
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at IdentityServer3.Core.Endpoints.DiscoveryEndpointController.<GetConfiguration>d__11.MoveNext() in c:\local\identity\server3\IdentityServer3\source\Core\Endpoints\Connect\DiscoveryEndpointController.cs:line 73
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Threading.Tasks.System.Web.Http910180.TaskHelpersExtensions.<CastToObject>d__3`1.MoveNext() in c:\local\identity\server3\IdentityServer3\source\Core\Configuration\DiscoveryOptions.cs:line 0
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Web.Http.Controllers.ApiControllerActionInvoker.<InvokeActionAsyncCore>d__0.MoveNext() in c:\local\identity\server3\IdentityServer3\source\Core\Configuration\DiscoveryOptions.cs:line 0
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Web.Http.Controllers.ActionFilterResult.<ExecuteAsync>d__2.MoveNext() in c:\local\identity\server3\IdentityServer3\source\Core\Configuration\DiscoveryOptions.cs:line 0
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Web.Http.Dispatcher.HttpControllerDispatcher.<SendAsync>d__1.MoveNext() in c:\local\identity\server3\IdentityServer3\source\Core\Configuration\DiscoveryOptions.cs:line 0
Request Information
RequestID:
RequestDate:
StatusMessage:

我们看到的另一个例外是:

System.ObjectDisposedException: Cannot access a disposed object.
Object name: 'System.Net.HttpListenerRequest'.
   at System.Net.HttpListenerRequest.CheckDisposed()
   at System.Net.HttpListenerRequest.get_LocalEndPoint()
   at System.Net.HttpListenerRequest.get_IsLocal()
   at Microsoft.Owin.Host.HttpListener.RequestProcessing.OwinHttpListenerContext.GetServerIsLocal()
   at Microsoft.Owin.Host.HttpListener.RequestProcessing.CallEnvironment.get_ServerIsLocal()
   at Microsoft.Owin.Host.HttpListener.RequestProcessing.CallEnvironment.PropertiesTryGetValue(String key, Object& value)
   at Microsoft.Owin.Host.HttpListener.RequestProcessing.CallEnvironment.TryGetValue(String key, Object& value)
   at Microsoft.Owin.OwinContext.Get[T](String key) in c:\local\identity\server3\IdentityServer3\source\Core\Configuration\DiscoveryOptions.cs:line 0
   at System.Web.Http.Owin.OwinHttpRequestContext.get_IsLocal() in c:\local\identity\server3\IdentityServer3\source\Core\Configuration\DiscoveryOptions.cs:line 0
   at System.Web.Http.Owin.OwinHttpRequestContext.get_IncludeErrorDetail() in c:\local\identity\server3\IdentityServer3\source\Core\Configuration\DiscoveryOptions.cs:line 0
   at System.Net.Http.System.Web.Http910180.HttpRequestMessageExtensions.CreateErrorResponse(HttpRequestMessage request, HttpStatusCode statusCode, Func`2 errorCreator) in c:\local\identity\server3\IdentityServer3\source\Core\Configuration\DiscoveryOptions.cs:line 0
   at System.Web.Http.ExceptionHandling.DefaultExceptionHandler.Handle(ExceptionHandlerContext context) in c:\local\identity\server3\IdentityServer3\source\Core\Configuration\DiscoveryOptions.cs:line 0
   at System.Web.Http.ExceptionHandling.DefaultExceptionHandler.HandleAsync(ExceptionHandlerContext context, CancellationToken cancellationToken) in c:\local\identity\server3\IdentityServer3\source\Core\Configuration\DiscoveryOptions.cs:line 0
   at System.Web.Http.ExceptionHandling.LastChanceExceptionHandler.HandleAsync(ExceptionHandlerContext context, CancellationToken cancellationToken) in c:\local\identity\server3\IdentityServer3\source\Core\Configuration\DiscoveryOptions.cs:line 0
   at System.Web.Http.ExceptionHandling.ExceptionHandlerExtensions.<HandleAsyncCore>d__0.MoveNext() in c:\local\identity\server3\IdentityServer3\source\Core\Configuration\DiscoveryOptions.cs:line 0
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Web.Http.Dispatcher.HttpControllerDispatcher.<SendAsync>d__1.MoveNext() in c:\local\identity\server3\IdentityServer3\source\Core\Configuration\DiscoveryOptions.cs:line 0
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Web.Http.Owin.PassiveAuthenticationMessageHandler.<SendAsync>d__0.MoveNext() in c:\local\identity\server3\IdentityServer3\source\Core\Configuration\DiscoveryOptions.cs:line 0
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Web.Http.HttpServer.<SendAsync>d__0.MoveNext() in c:\local\identity\server3\IdentityServer3\source\Core\Configuration\DiscoveryOptions.cs:line 0

在卸下负载后大约5-10分钟后没有回收任何东西,服务恢复了生命.服务结构在不稳定期间也检测不到故障.

任何想法?

最佳答案 解决方法是:

>确保捕获并处理和记录超时异常,在某些情况下它们不是.
>调整重试策略以使重试更接近(100毫秒延迟)并且限制为不超过10秒.还实现了重试记录.实际上我到目前为止还没有重试.
>将表服务端点的连接限制增加到1000.

最重要的因素似乎是数字3虽然通过实施1注意到了一些改进,我先做了一次,这显然是非常明显的.

为此,我在我的存储库类(身份服务器术语中的基本存储类)中使用静态构造函数具有以下代码,该代码依赖于StorageAccount实例.除最后一行之外的所有行都已存在.

        var tableServicePoint = System.Net.ServicePointManager.FindServicePoint(storageAccount.TableEndpoint);
        tableServicePoint.UseNagleAlgorithm = false;
        tableServicePoint.Expect100Continue = false;
        tableServicePoint.ConnectionLimit = 1000;

以下文章对此有所帮助:https://azure.microsoft.com/en-gb/documentation/articles/storage-performance-checklist/

总之,我的结论是默认连接限制(2或10个不同的文档冲突)导致请求表存储排队并最终超时并最终导致服务崩溃.

要确认在使用azure Web应用程序时没有必要设置连接限制,那么在使用影响azure存储的连接时,服务结构/ owin自主项目的工作方式会有所不同.

点赞