Skype for Business client displays the message: “Your chat room access may be limited due to an outage.”

Problem

You’ve noticed that the Skype for Business client displays the following message:

Your chat room access may be limited due to an outage.

Persistent Chat rooms are no longer accessible, but all other functionality appears to be working.

Reviewing the Lync Server event log on the front-end server shows the following error:

Log Name: Lync Server

Source: LS Protocol Stack

Event ID: 14428

Level: Error

User: N/A

TLS outgoing connection failures.


Over the past 359 minutes, Skype for Business Server has experienced TLS outgoing connection failures 15 time(s). The error code of the last failure is 0x800B0101(CERT_E_EXPIRED) while trying to connect to the server “contbmlyncpc.contoso.com” at address [10.34.30.79:5041], and the display name in the peer certificate is “contbmlyncpc.contoso.com”.
Cause: Most often a problem with the peer certificate or perhaps the host name (DNS) record used to reach the peer server. Target principal name is incorrect means that the peer certificate does not contain the name that the local server used to connect. Certificate root not trusted error means that the peer certificate was issued by a remote CA that is not trusted by the local machine.
Resolution:
Check that the address and port matches the FQDN used to connect, and that the peer certificate contains this FQDN somewhere in its subject or SAN fields. If the FQDN refers to a DNS load balanced pool then check that all addresses returned by DNS refer to a server in the same pool. For untrusted root errors, ensure that the remote CA certificate chain is installed locally. If you have already installed the remote CA certificate chain, then try rebooting the local machine.
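
If several of these events accumulate, it can help to pull the key fields (error code, peer server, address and port) out of the message text before deciding where to look. The following is a minimal Python sketch of that idea, not a Microsoft tool. The function name `parse_tls_failure` and the regular expression are assumptions based only on the wording of the event shown above; other builds or locales may word the message differently.

```python
import re
from typing import Optional

# Assumption: the message wording matches the event 14428 text shown above.
# \u201c and \u201d are the curly quotes that surround the server name.
EVENT_14428 = re.compile(
    r'error code of the last failure is\s+(?P<code>0x[0-9A-Fa-f]{8})'
    r'\s*\((?P<name>[A-Z_0-9]+)\)'
    r'.*?connect to the server\s+["\u201c](?P<server>[^"\u201d]+)["\u201d]'
    r'\s+at address\s+\[(?P<ip>[^\]:]+):(?P<port>\d+)\]',
    re.DOTALL,
)


def parse_tls_failure(message: str) -> Optional[dict]:
    """Extract error code, peer server, IP and port from the event text.

    Returns None when the message does not look like the event above.
    """
    match = EVENT_14428.search(message)
    if match is None:
        return None
    fields = dict(match.groupdict())
    fields["port"] = int(fields["port"])
    # CERT_E_EXPIRED points at the peer certificate's validity period,
    # not at DNS or an untrusted root.
    fields["expired"] = fields["name"] == "CERT_E_EXPIRED"
    return fields
```

For the event above, the result identifies the peer as contbmlyncpc.contoso.com on port 5041 and sets the `expired` flag, which narrows the generic "Cause" text down to the peer certificate's validity period.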

You log on to the Persistent Chat server and confirm that the certificate has expired and, as a result, none is assigned to the service.

You continue by requesting a new certificate and assigning it to the Persistent Chat service.

You attempt to start the Skype for Business Server Persistent Chat service but notice that it starts and then quickly stops.

Reviewing the Lync Server event log on the Persistent Chat server reveals the following error:

Log Name: Lync Server

Source: LS Persistent Chat Server

Event ID: 53503

Level: Error

Skype for Business Server 2015, Persistent Chat could not start due to the following exception:

at

System.IdentityModel.Tokens.SecurityTokenException: Certificate verification failed.

Server stack trace:

at Microsoft.Rtc.Internal.Chat.Server.ServerCommon.PeerTransport.CustomX509CertificateValidator.Validate(X509Certificate2 certificate)

at System.IdentityModel.Selectors.X509SecurityTokenAuthenticator.ValidateTokenCore(SecurityToken token)

at System.IdentityModel.Selectors.SecurityTokenAuthenticator.ValidateToken(SecurityToken token)

at System.ServiceModel.Channels.SslStreamSecurityUpgradeInitiator.ValidateRemoteCertificate(Object sender, X509Certificate certificate, X509Chain chain, SslPolicyErrors sslPolicyErrors)

at System.Net.Security.SecureChannel.VerifyRemoteCertificate(RemoteCertValidationCallback remoteCertValidationCallback, ProtocolToken& alertToken)

at System.Net.Security.SslState.CompleteHandshake(ProtocolToken& alertToken)

at System.Net.Security.SslState.CheckCompletionBeforeNextReceive(ProtocolToken message, AsyncProtocolRequest asyncRequest)

at System.Net.Security.SslState.ProcessReceivedBlob(Byte[] buffer, Int32 count, AsyncProtocolRequest asyncRequest)

at System.Net.Security.SslState.StartReceiveBlob(Byte[] buffer, AsyncProtocolRequest asyncRequest)

at System.Net.Security.SslState.CheckCompletionBeforeNextReceive(ProtocolToken message, AsyncProtocolRequest asyncRequest)

at System.Net.Security.SslState.ProcessReceivedBlob(Byte[] buffer, Int32 count, AsyncProtocolRequest asyncRequest)

at System.Net.Security.SslState.StartReceiveBlob(Byte[] buffer, AsyncProtocolRequest asyncRequest)

at System.Net.Security.SslState.CheckCompletionBeforeNextReceive(ProtocolToken message, AsyncProtocolRequest asyncRequest)

at System.Net.Security.SslState.ProcessReceivedBlob(Byte[] buffer, Int32 count, AsyncProtocolRequest asyncRequest)

at System.Net.Security.SslState.StartReceiveBlob(Byte[] buffer, AsyncProtocolRequest asyncRequest)

at System.Net.Security.SslState.CheckCompletionBeforeNextReceive(ProtocolToken message, AsyncProtocolRequest asyncRequest)

at System.Net.Security.SslState.ForceAuthentication(Boolean receiveFirst, Byte[] buffer, AsyncProtocolRequest asyncRequest)

at System.Net.Security.SslState.ProcessAuthentication(LazyAsyncResult lazyResult)

at System.ServiceModel.Channels.SslStreamSecurityUpgradeInitiator.OnInitiateUpgrade(Stream stream, SecurityMessageProperty& remoteSecurity)

at System.ServiceModel.Channels.StreamSecurityUpgradeInitiatorBase.InitiateUpgrade(Stream stream)

at System.ServiceModel.Channels.ConnectionUpgradeHelper.InitiateUpgrade(StreamUpgradeInitiator upgradeInitiator, IConnection& connection, ClientFramingDecoder decoder, IDefaultCommunicationTimeouts defaultTimeouts, TimeoutHelper& timeoutHelper)

at System.ServiceModel.Channels.ClientFramingDuplexSessionChannel.SendPreamble(IConnection connection, ArraySegment`1 preamble, TimeoutHelper& timeoutHelper)

at System.ServiceModel.Channels.ClientFramingDuplexSessionChannel.DuplexConnectionPoolHelper.AcceptPooledConnection(IConnection connection, TimeoutHelper& timeoutHelper)

at System.ServiceModel.Channels.ConnectionPoolHelper.EstablishConnection(TimeSpan timeout)

at System.ServiceModel.Channels.ClientFramingDuplexSessionChannel.OnOpen(TimeSpan timeout)

at System.ServiceModel.Channels.CommunicationObject.Open(TimeSpan timeout)

at System.ServiceModel.Channels.ServiceChannel.OnOpen(TimeSpan timeout)

at System.ServiceModel.Channels.CommunicationObject.Open(TimeSpan timeout)

at System.ServiceModel.Channels.ServiceChannel.CallOpenOnce.System.ServiceModel.Channels.ServiceChannel.ICallOnce.Call(ServiceChannel channel, TimeSpan timeout)

at System.ServiceModel.Channels.ServiceChannel.CallOnceManager.CallOnce(TimeSpan timeout, CallOnceManager cascade)

at System.ServiceModel.Channels.ServiceChannel.EnsureOpened(TimeSpan timeout)

at System.ServiceModel.Channels.ServiceChannel.Call(String action, Boolean oneway, ProxyOperationRuntime operation, Object[] ins, Object[] outs, TimeSpan timeout)

at System.ServiceModel.Channels.ServiceChannelProxy.InvokeService(IMethodCallMessage methodCall, ProxyOperationRuntime operation)

at System.ServiceModel.Channels.ServiceChannelProxy.Invoke(IMessage message)

Exception rethrown at [0]:

at System.Runtime.Remoting.Proxies.RealProxy.HandleReturnMessage(IMessage reqMsg, IMessage retMsg)

at System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke(MessageData& msgData, Int32 type)

at Microsoft.Rtc.Internal.Chat.Server.ServerCommon.PeerTransport.IPublisher.IsAlive()

at Microsoft.Rtc.Internal.Chat.Server.ServerCommon.PeerTransport.PeerWrapper.ExecuteWithRetry(Action action)

at Microsoft.Rtc.Internal.Chat.Server.ServerCommon.PeerTransport.WCFService.CreatePeerWrapper(Int32 peerId, Uri peerServiceUri)

at Microsoft.Rtc.Internal.Chat.Server.ServerCommon.PeerTransport.WCFService.GetPeerWrapper(Int32 peerId, PeerWrapper& peerWrapper)

at Microsoft.Rtc.Internal.Chat.Server.ServerCommon.PeerTransport.WCFService.SubscribeToPeerImpl(Int32 peerId)

at Microsoft.Rtc.Internal.Chat.Server.ServerCommon.PeerTransport.WCFService.SubcribeToPeers()

at Microsoft.Rtc.Internal.Chat.Server.ServerCommon.PeerTransport.PeerTransport.Connect(IWCFService service)

at Microsoft.Rtc.Internal.Chat.Server.ServerCommon.PeerTransport.PeerServerManager.Connect(IPeerFinder peerFinder, ReceiveConduitMessageCallback callback)

at Microsoft.Rtc.Internal.Chat.Server.Channel.Server.ChannelServer.OnStart()

at Microsoft.Rtc.Internal.Chat.Server.ServerCommon.ServerBase.Start()

at Microsoft.Rtc.Internal.Chat.Server.ServerCommon.MgcServiceBase.startServer()

at Microsoft.Rtc.Internal.Chat.Server.ServerCommon.MgcServiceBase.createAndStartServer().

Solution

A common reason for the Persistent Chat service to start and then stop with this exception is that a second Persistent Chat server in the environment also has an expired service certificate. The stack trace above is consistent with this: the failure occurs in the peer transport (SubcribeToPeers) while the remote certificate is being validated. The environment in this example had a second Persistent Chat server for disaster recovery purposes, so issuing a valid certificate on that server and then restarting the services corrected the issue.
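
Since the fix depends on every Persistent Chat server holding a valid certificate, it is worth checking all of them together. Below is a minimal Python sketch of that check, assuming you have already collected each server's certificate "notAfter" time as a string in the format used by Python's `ssl` module. The function name `expired_servers`, the second server name and the dates are hypothetical and for illustration only.

```python
import ssl
import time
from typing import Dict, List, Optional


def expired_servers(
    not_after_by_server: Dict[str, str],
    now: Optional[float] = None,
) -> List[str]:
    """Return the servers whose certificate notAfter time has passed.

    Times use the "Mon DD HH:MM:SS YYYY GMT" format understood by
    ssl.cert_time_to_seconds. `now` is seconds since the epoch and
    defaults to the current time.
    """
    if now is None:
        now = time.time()
    return sorted(
        server
        for server, not_after in not_after_by_server.items()
        if ssl.cert_time_to_seconds(not_after) <= now
    )


if __name__ == "__main__":
    # Hypothetical inventory: the primary server from this post plus an
    # assumed disaster recovery peer. Dates are made up for illustration.
    inventory = {
        "contbmlyncpc.contoso.com": "Jan  5 09:34:43 2030 GMT",
        "contdrlyncpc.contoso.com": "Jan  5 09:34:43 2018 GMT",
    }
    for server in expired_servers(inventory):
        print(f"{server}: certificate expired, renew before restarting services")
```

Any server the function returns needs a renewed certificate before the Persistent Chat services on its peers can be expected to stay running.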