CacheEntry Issue

Hello Team,

We are using API Key Auth provider and we have only one api key. We are caching sessions in SQL Server using ORMLite. Over a period of time there are about 15,00 entries in CacheEntry table with a table size of approx 900 MB. But the CacheEntry table has 1 row for the API Key. Sometimes the API authentication starts to fail with timeout errors in SQL Server and the CPU is pegged at 100% for SQL Server. It becomes non-responsive but strangely all other auth session look ups continue to work in sub second.

We have to truncate CacheEntry table to get out of this situation and then suddenly SQL Server comes back to normal. We are on Azure SQL just to clarify. Let me know if you need more information.

Thanks,
Amit.

This sounds like the size (i.e. 900mb) of your cache is causing the slow down.

The CacheEntry rows are only removed when trying to access a cache entry that has expired, so if you’re never accessing old caches they wont be removed.

You can run this periodic query to clear the CacheEntry table of expired items:

using (var db = HostContext.TryResolve<IDbConnectionFactory>())
{
    db.Delete<CacheEntry>(q => DateTime.UtcNow > q.ExpiryDate);
}

Or you can purge all cache entries with:

HostContext.TryResolve<ICacheClient>().FlushAll();

Thanks Demis, we’ll do that. The only thing that is worrying is that when we query the table directly from Mgmt Studio at the time of incident the queries return the value immediately. It’s the SS call to validate and authenticate that times out and throws SQL Server in the high CPU situation.

Thanks,
Amit.

Are you using parameters? Sounds like parameter sniffing issues to me. Query will run instantly in SSMS and then timeout in .Net.

Thanks for your input Brandon. We are not issuing any queries to that table. It’s Servicestack internal table and Caching Provider uses that to store sessions. How do we find out the issue that you are suggesting is happening? Is there a way to ascertain that?

Thanks,
Amit.

Do you have any full Exception StackTraces available? is the Exception always for the same API/query or any API that queries the CacheEntry table?

Here is the exception that we get in the API side.


Message: Execution Timeout Expired. The timeout period elapsed prior to completion of the operation or the server is not responding.
The statement has been terminated.

StackTrace:

   at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
   at System.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
   at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)
   at System.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady)
   at System.Data.SqlClient.SqlCommand.RunExecuteNonQueryTds(String methodName, Boolean async, Int32 timeout, Boolean asyncWrite)
   at System.Data.SqlClient.SqlCommand.InternalExecuteNonQuery(TaskCompletionSource`1 completion, String methodName, Boolean sendToPipe, Int32 timeout, Boolean& usedCache, Boolean asyncWrite, Boolean inRetry)
   at System.Data.SqlClient.SqlCommand.ExecuteNonQuery()
   at ServiceStack.OrmLite.OrmLiteCommand.ExecuteNonQuery()
   at ServiceStack.OrmLite.OrmLiteWriteCommandExtensions.ExecuteSql(IDbCommand dbCmd, String sql, IEnumerable`1 sqlParams)
   at ServiceStack.OrmLite.WriteExpressionCommandExtensions.Delete[T](IDbCommand dbCmd, SqlExpression`1 where)
   at ServiceStack.OrmLite.WriteExpressionCommandExtensions.Delete[T](IDbCommand dbCmd, Expression`1 where)
   at ServiceStack.OrmLite.OrmLiteWriteExpressionsApi.<>c__DisplayClass16_0`1.<Delete>b__0(IDbCommand dbCmd)
   at ServiceStack.OrmLite.OrmLiteExecFilter.Exec[T](IDbConnection dbConn, Func`2 filter)
   at ServiceStack.OrmLite.OrmLiteReadExpressionsApi.Exec[T](IDbConnection dbConn, Func`2 filter)
   at ServiceStack.OrmLite.OrmLiteWriteExpressionsApi.Delete[T](IDbConnection dbConn, Expression`1 where)
   at ServiceStack.Caching.OrmLiteCacheClient`1.Verify(IDbConnection db, TCacheEntry entry)
   at ServiceStack.Caching.OrmLiteCacheClient`1.<>c__DisplayClass9_0`1.<Get>b__0(IDbConnection db)
   at ServiceStack.Caching.OrmLiteCacheClient`1.Exec[T](Func`2 action)
   at ServiceStack.Caching.OrmLiteCacheClient`1.Get[T](String key)
   at ServiceStack.Auth.ApiKeyAuthProvider.PreAuthenticateWithApiKey(IRequest req, IResponse res, ApiKey apiKey)
   at ServiceStack.Auth.ApiKeyAuthProvider.PreAuthenticate(IRequest req, IResponse res)
   at ServiceStack.AuthenticateAttribute.PreAuthenticateAsync(IRequest req, IEnumerable`1 authProviders)
   at ServiceStack.ServiceExtensions.IsAuthenticated(IRequest req)
   at AutomotiveLynk.Api.CustomAuthenticateAttribute.Execute(IRequest req, IResponse res, Object requestDto) in E:\Alynk\AlynkApi\AutomotiveLynk.Api.ServiceModel\CustomAuthenticateAttribute.cs:line 23
   at ServiceStack.RequestFilterAttribute.RequestFilter(IRequest req, IResponse res, Object requestDto)
   at ServiceStack.ServiceStackHost.<ApplyRequestFiltersSingleAsync>d__339.MoveNext()
   
   
Errors:
		[0]	{System.Data.SqlClient.SqlError: Execution Timeout Expired.  The timeout period elapsed prior to completion of the operation or the server is not responding.}	object {System.Data.SqlClient.SqlError}
		[1]	{System.Data.SqlClient.SqlError: The statement has been terminated.}	object {System.Data.SqlClient.SqlError}

It’s interesting that it’s the Delete operation that timed out:

The issue might be that ExpiryDate is not an indexed field although 15k rows isn’t a lot for SQL Server to hang over, I still think it’s related to the amount of data in the CacheEntry table but potentially adding an index on ExpiryDate may help reduce Timeout Exceptions:

CREATE INDEX idx_cacheentry_expirydate ON "CacheEntry" ("ExpiryDate"); 

I’ve also added the [Index] on CacheEntry.ExpiryDate column since it’s being used by the OrmLite CacheClient to delete expired items.

I checked with the team and they had created this index already and it was not helpful.

Also if you can avoid this on the critical path to cache lookup/ whatever the case may be so that the delete request is handled some other way and the auth is not blocked, that would be great. Also we have found that delete works better in sql server in small batch sizes.

This is literally just a simple > date comparison query on an indexed column, I can’t see how it can be the issue if you’ve only got 15k rows. I’d get your team to profile the queries run against the server and analyze the queries that are taking the longest, it’s more likely to be related to storing too much data in the CacheEntry table then the date comparison.

I’ve changed it so a single expired entry lookup will only delete the entry by Primary Key in this commit.

This change is available from v5.8.1 that’s now available on MyGet.

Although you may need to look into a different caching strategy, e.g. switching to a fast distributed cache like Redis or minimize the size of your caches by storing references to the to your largest cached payloads instead of storing it in SQL Server.