Migrating a project from .NET Framework to .NET Core 3.1

Virab · March 23, 2020, 11:33am

Hi Demis,

I want to migrate an existing service that runs on Windows to Kubernetes.
The current service runs on .NET Framework 4.7. Thanks to ServiceStack's good architecture, it was very easy to convert the service to .NET Core 3.1.

But the results were not what I expected.
Requests have become several times slower, and sometimes the service restarts. It seems to me that the problem is that the number of threads in .NET Core is limited and we need to use an asynchronous programming approach.

Configuration of the service:

.UseKestrel(options => {
  options.Limits.MaxRequestBodySize = null;
  options.Limits.MaxConcurrentConnections = null;
  options.Limits.MaxConcurrentUpgradedConnections = null;
  options.AllowSynchronousIO = false;
})

To guard against memory leaks in Kubernetes, the project file contains:
<PropertyGroup>
  <ServerGarbageCollection>false</ServerGarbageCollection>
</PropertyGroup>

Our files used to be stored on a hard disk drive; as part of the migration to .NET Core we moved them to S3.
The problem may lie there.

public async Task Any(GetFile request) {
  var objectResponse = await S3Storage.Instance.ProxyObjectAsync(bucketName, filePath);
  await ServiceExtensions.ResponseFromS3Async(Response, objectResponse);
}

public class S3Storage {
  private const string AccessKey = "key";
  private const string SecretKey = "secret";
  private const string ServiceUrl = "url";
  private readonly AmazonS3Client _client;

  private S3Storage() {
    AWSCredentials credentials = new BasicAWSCredentials(AccessKey, SecretKey);
    var config = new AmazonS3Config {
      ServiceURL = ServiceUrl
    };

    _client = new AmazonS3Client(credentials, config);
  }

  public static S3Storage Instance { get; } = new S3Storage();

  public async Task<GetObjectResponse> ProxyObjectAsync(string bucketName, string filePath) {
    return await _client.GetObjectAsync(new GetObjectRequest {
      BucketName = bucketName,
      Key = filePath
    });
  }
}

public static class ServiceExtensions {
  public static async Task ResponseFromS3Async(IResponse res, GetObjectResponse s3Response, bool asAttachment = false) {
    var statusCode = (int) s3Response.HttpStatusCode;

    res.StatusCode = statusCode == 0 ? (int) HttpStatusCode.OK : statusCode;
    res.ContentType = s3Response.Headers.ContentType;

    res.AddHeader(
      HttpHeaders.ContentDisposition,
      $"{(asAttachment ? "attachment;" : "")}{HttpExt.GetDispositionFileName(Path.GetFileName(s3Response.Key))}"
    );

    if (s3Response.ContentLength >= 0)
      res.SetContentLength(s3Response.ContentLength);

    await s3Response.ResponseStream.CopyToAsync(res.OutputStream);
    res.EndHttpHandlerRequest(true);
  }
}

The problem is that while ‘GetFile’ requests are being served, even the simplest Select() query (the table contains 100 rows of two columns) takes 1000 milliseconds, but when there are no such requests the same Select() takes 30 milliseconds.

Are there any known factors that I have not considered? I understand that this is a specific case and may not be entirely related to ServiceStack, but it is a case that other people may encounter when migrating. Thank you in advance for your time!

mythz · March 23, 2020, 2:03pm

Am I reading this correctly, you went from serving a file from a local disk to using a remote S3 service and you think the issue is with the platform it’s running on and not the difference between accessing local vs remote I/O? Were you also previously running your .NET Framework workload in a VM? Or are you comparing bare metal vs running within a Kube pod? Are you going through any extra middleware, e.g. gateways, load-balancers, proxies?

The first thing when trying to track down performance issues is to identify the problem. If you believe it’s the framework you should test your hypothesis by running the exact same workload on .NET Framework and .NET Core on the same spec’d machine (and VM if possible), this should tell you which platform is slower running your code. There’s no way around it, you’re going to have to profile and add logging to find out where most of the time is being spent, compare it with an empty response to see the base-line performance and how much overhead your architecture is adding, then try serving the same file in memory to see how much the S3 call is slowing things down, etc.
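
As a starting point for the logging suggested above, here is a minimal sketch of a timing wrapper. It assumes plain `Stopwatch` measurements are good enough for a first pass; `Timing`, `MeasureAsync` and the `log` callback are hypothetical names for illustration, not part of ServiceStack or the AWS SDK.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

public static class Timing
{
    // Runs one async step and reports how long it took, even if it throws.
    // "label" and "log" are illustrative; plug in whatever logger is in use.
    public static async Task<T> MeasureAsync<T>(
        string label, Func<Task<T>> step, Action<string, TimeSpan> log)
    {
        var sw = Stopwatch.StartNew();
        try
        {
            return await step();
        }
        finally
        {
            log(label, sw.Elapsed);
        }
    }
}
```

In the GetFile handler this could wrap the S3 call, e.g. `await Timing.MeasureAsync("s3-get", () => S3Storage.Instance.ProxyObjectAsync(bucketName, filePath), (l, t) => Console.WriteLine($"{l}: {t.TotalMilliseconds} ms"))`, and the same wrapper around an empty response or an in-memory file gives the baseline to compare against.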

Virab · March 23, 2020, 3:05pm

Yes, of course. Two services are running now, one on Windows and one in k8s. They sit behind an nginx proxy, which switches traffic to one of the services; both services save files to and read files from S3.
There is no load balancer or proxy between the servers and S3.

The workload has always run on VMs in the cloud.

As I wrote earlier, there are no additional layers between the services and S3.

Interestingly, if traffic is switched to the new service, responses begin to slow down over time, and this continues until memory grows to the level at which k8s restarts the server.

You can also see that, because of the slow requests, the size of the nginx proxy's file buffer increases, which does not happen when data is sent to the service that runs on Windows.

The picture shows a diagram of the service relationship

As I wrote earlier, in both cases the services run on virtual machines, and the speed of the same service differs depending on the platform. Of course S3 slows down the response, but for some reason Windows responds normally and Linux does not.

I tried different profilers to find the problem, but unfortunately none of them shows anything wrong. The profiler I am using now works at the code level, and all responses take < 1 second when data is read and < 15 seconds when data is saved to S3.

The load is 150-200 requests per second, so I suspect the problem is that threads are blocked…

Maybe there are alternative profiling options that can help me determine exactly where the problem is? Thanks!

mythz · March 23, 2020, 3:20pm

You can try Microsoft Application Insights for Kubernetes.

Are the CPU & Memory resources for the .NET Framework & Linux VM similar? Does the issue only happen from S3, e.g. does it happen when serving the file in memory?

Can you try running .NET Core 3.1 Linux outside of K8 to see if you get the same issues?
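
One rough way to check the thread-blocking suspicion raised above is to log thread pool counters periodically and watch whether the queue keeps growing under load. This is only a sketch: `ThreadPoolProbe` and `Snapshot` are hypothetical names, and `ThreadPool.ThreadCount` and `ThreadPool.PendingWorkItemCount` are assumed to be available, which holds on .NET Core 3.0 and later.

```csharp
using System;
using System.Threading;

public static class ThreadPoolProbe
{
    // Returns a one-line summary of the current thread pool state.
    public static string Snapshot()
    {
        ThreadPool.GetMinThreads(out int minWorker, out int minIo);
        ThreadPool.GetMaxThreads(out int maxWorker, out int maxIo);
        ThreadPool.GetAvailableThreads(out int freeWorker, out int freeIo);

        // A steadily rising "queued" value suggests work is waiting for threads.
        return $"threads={ThreadPool.ThreadCount} " +
               $"queued={ThreadPool.PendingWorkItemCount} " +
               $"workerBusy={maxWorker - freeWorker} (min {minWorker}, max {maxWorker}) " +
               $"ioBusy={maxIo - freeIo} (min {minIo}, max {maxIo})";
    }
}
```

Calling `ThreadPoolProbe.Snapshot()` from a timer every few seconds and writing the result to the service log gives a series that can be compared between the Windows and Linux deployments.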

Virab · March 23, 2020, 4:04pm

I have already tried it: no anomalies were noticeable, no errors are shown, and 95% of requests respond in < 800 milliseconds.

CPU consumption is at the same level. Memory consumption is different: on .NET Core memory grows constantly until k8s restarts the pod.

The idea is good, I will try and tell you about the result!

mythz · March 23, 2020, 4:08pm

I mean are the CPU & Memory Resources allocated to each VM similar, e.g. if there isn’t enough memory in the VM it could start forcing paging to disk.

Virab · March 23, 2020, 4:22pm

No, there are no limits; it even has more resources than the Windows one.

Virab · March 23, 2020, 8:31pm

I started the service on Windows: there are no errors, and memory and CPU consumption are very low. The response speed has not increased, but now it works stably.
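
One detail in the GetFile handler posted at the top of the thread may be worth ruling out: the `GetObjectResponse` returned by the AWS SDK is `IDisposable` and is never disposed there, so each request may keep its response stream and underlying connection alive until the object is finalized. Whether this explains the memory growth is not confirmed anywhere in the thread. The sketch below shows the pattern with plain BCL types; `StreamProxy` and `CopyAndDisposeAsync` are hypothetical names.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class StreamProxy
{
    // Copies "source" into "destination" and always disposes "owner",
    // whether the copy succeeds or throws part-way through.
    public static async Task CopyAndDisposeAsync(
        IDisposable owner, Stream source, Stream destination)
    {
        using (owner)
        {
            await source.CopyToAsync(destination);
        }
    }
}
```

In `ResponseFromS3Async` this would replace the bare copy, e.g. `await StreamProxy.CopyAndDisposeAsync(s3Response, s3Response.ResponseStream, res.OutputStream);`. Wrapping the body of the method in `using (s3Response) { ... }` has the same effect.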