VirtualFiles/AWS S3 directory listing

I’ve been using S3VirtualFiles and it has been working great.

However, I’ve now started doing directory listings with it and it doesn’t seem to be behaving correctly. I can list all the files that I’ve got stored (GetAllMatchingFiles), but I don’t seem to be able to list ‘folders’.

Looking at the RESTFiles code, it appears you would do the following to list directories from a location:

var dir1 = VirtualFiles.GetDirectory("files").Directories;
foreach (var item in dir1)
{
// list
}

Whenever I call this, I don’t get any directories, even though there should be some in this location.

S3 is a flat structure of files; it doesn’t conceptually store files in a hierarchical tree of folders. The VirtualFiles APIs try to simulate one, but there may be some behavioral differences due to the underlying flat file structure.
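To make the "flat keys, simulated folders" point concrete, here is a minimal sketch in plain C# with no AWS calls. It only mimics how a prefix plus a delimiter groups flat keys into the folder-like entries that S3 reports as CommonPrefixes. The FlatKeyListing class, its CommonPrefixes helper and the sample keys are made up for illustration; they are not part of ServiceStack or the AWS SDK.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only: it is not a ServiceStack or AWS SDK type.
public static class FlatKeyListing
{
    // Returns the distinct "folder" prefixes directly under `prefix`,
    // i.e. everything up to and including the next delimiter after the prefix.
    public static List<string> CommonPrefixes(IEnumerable<string> keys, string prefix, char delimiter = '/')
    {
        var result = new SortedSet<string>(StringComparer.Ordinal);
        foreach (var key in keys)
        {
            // Only keys under the requested prefix are considered.
            if (!key.StartsWith(prefix, StringComparison.Ordinal)) continue;

            // A delimiter after the prefix means the key lives inside a "subfolder".
            var next = key.IndexOf(delimiter, prefix.Length);
            if (next >= 0)
                result.Add(key.Substring(0, next + 1));
        }
        return result.ToList();
    }

    public static void Main()
    {
        // A bucket is just a flat list of keys like these (sample data).
        var keys = new[]
        {
            "files/readme.txt",
            "files/docs/a.txt",
            "files/docs/b.txt",
            "files/images/logo.png",
            "other/x.txt",
        };

        // Lists files/docs/ and files/images/; files/readme.txt is a file, not a folder.
        foreach (var dir in CommonPrefixes(keys, "files/"))
            Console.WriteLine(dir);
    }
}
```

Nothing in the bucket actually stores a folder called files/docs/; it only exists because at least one key happens to start with that prefix, which is why folder listings can behave differently from a real file system.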

Here are the S3 Virtual Files tests. Can you provide a small repro that highlights the issue?

I think you’re right, I don’t think the RESTFiles example code is helping when I’m translating to S3.

I’ve done some more research and apparently CommonPrefixes is the way to go to list folders. The methods provided in VirtualFiles don’t seem to list the folders.

Right, that looks like what the APIs have to use under the hood, so using APIs like GetDirectory().GetAllMatchingFiles("wildcard*") is a good way to traverse files.

It may be worth noting that the directory functions, when utilising S3, aren’t as usable as a drop-in for, say, local storage. I’m trying to do a drop-in replacement in my code, so I’ll need to bully some code!

Is there an easy way to grab the underlying Amazon client from the VirtualFiles statement:

VirtualFiles = new S3VirtualFiles(s3Client, HostContext.AppSettings.Get<string>("storage:s3BucketName"));

It’s accessible from the public AmazonS3 property, so basically just cast then access, e.g.:

var s3Client = ((S3VirtualFiles)VirtualFiles).AmazonS3;

Awesome, thank you Sir!

Spent a bit of time trying to work around this issue. I ended up creating a clone of the S3 Virtual File System and fiddled with the GetImmediateDirectories function. It seems to work and, in my limited use case, doesn’t seem to break anything!

public IEnumerable<MyS3VirtualDirectory> GetImmediateDirectories(string fromDirPath)
{
    var delimiter = char.ToString(MemoryVirtualFiles.DirSep);

    // An empty path or a lone separator means the bucket root.
    var isTopLevel = fromDirPath == "" || fromDirPath == delimiter;

    // Add a trailing delimiter so the prefix only matches keys inside this "folder".
    if (!fromDirPath.EndsWith(delimiter))
    {
        fromDirPath += delimiter;
    }

    // At the root no Prefix is set; the Delimiter is what makes S3 return CommonPrefixes.
    var listObjectsRequest = isTopLevel
        ? new ListObjectsRequest
        {
            BucketName = BucketName,
            Delimiter = delimiter
        }
        : new ListObjectsRequest
        {
            BucketName = BucketName,
            Prefix = fromDirPath,
            Delimiter = delimiter
        };

    var objects = AmazonS3.ListObjects(listObjectsRequest);
    var dirPaths = objects.CommonPrefixes;

    var parentDir = GetParentDirectory(fromDirPath);
    return dirPaths.Map(x => new MyS3VirtualDirectory(this, x.TrimEnd(MemoryVirtualFiles.DirSep), parentDir));
}

Just thought I’d share the love as it were. Thanks for all your help as always.


Thanks for sharing. If you want to see it in S3VirtualFiles (and all tests still pass), feel free to send a PR to https://github.com/ServiceStack/ServiceStack.Aws