
I'm designing a file upload API that needs to work with large files, and I want to avoid passing around byte arrays. The final storage for the file will be a third-party service such as Azure Blob Storage or Rackspace Cloud Files.

I have the following project structure, which follows DDD:

  1. Web API (.NET Core - accepts the uploaded file)
  2. Business Service (calls Azure to save the file and saves a record to the database)
  3. Domain (domain models for EF)
  4. Persistence (EF Core repositories - saves the database changes)

I would like each layer to have methods that can start passing the uploaded file stream through as soon as the upload begins, but I'm unsure whether this is possible.

Previously we've used byte[] to pass the files through the layers, but for large files this requires a lot of memory and has caused us issues.

Is it possible to optimize the upload of files through an n-tier application so that you don't have to copy around large byte arrays, and if so, how can it be done?

To clarify, the code structure would be something like the following (repository code has been excluded):

namespace Project.Controllers
{
    [Produces("application/json")]
    [Route("api/{versionNumber}/")]
    public class DocumentController : Controller
    {
        private readonly IAddDocumentCommand addDocumentCommand;

        public DocumentController(IAddDocumentCommand addDocumentCommand)
        {
            this.addDocumentCommand = addDocumentCommand;
        }

        [HttpPost("application/{applicationId}/documents", Name = "PostDocument")]
        public IActionResult UploadDocument([FromRoute] string applicationId)
        {
            var addDocumentRequest = new AddDocumentRequest();
            addDocumentRequest.ApplicationId = applicationId;
            addDocumentRequest.FileStream = this.Request.Body;

            var result = new UploadDocumentResponse { DocumentId = this.addDocumentCommand.Execute(addDocumentRequest).DocumentId };

            return this.Ok(result);
        }
    }
}

namespace Project.BusinessProcess
{
    public interface IAddDocumentCommand
    {
        AddDocumentResponse Execute(AddDocumentRequest request);
    }

    public class AddDocumentRequest
    {
        public string ApplicationId { get; set; }
        public Stream FileStream { get; set; }
    }

    public class AddDocumentResponse
    {
        public Guid DocumentId { get; set; }
    }

    public class AddDocumentCommand : IAddDocumentCommand
    {
        private readonly IDocumentRepository documentRepository;
        private readonly IMessageBus bus;

        public AddDocumentCommand(IDocumentRepository documentRepository, IMessageBus bus)
        {
            this.documentRepository = documentRepository;
            this.bus = bus;
        }

        public AddDocumentResponse Execute(AddDocumentRequest request)
        {
            // We need the file to be streamed off somewhere else: a file share, Azure, Rackspace etc.
            // We need to save a record to the db that the file has been saved successfully.
            // We need to trigger the background workers to process the uploaded file.

            var fileUri = AzureStorageProvider.Save(request.FileStream);
            var documentId = documentRepository.Add(new Document { FileUri = fileUri });
            bus.AddMessage(new DocumentProcessingRequest { documentId = documentId, fileUri = fileUri });

            return new AddDocumentResponse { DocumentId = documentId };
        }
    }
}
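
For reference, a minimal sketch of what `AzureStorageProvider.Save` might look like so that the request body is streamed straight to blob storage rather than buffered. This assumes the classic WindowsAzure.Storage SDK (whose `UploadFromStream` the question mentions); the class name, connection string, and container name are placeholders:

```csharp
using System;
using System.IO;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Blob;

public static class AzureStorageProvider
{
    public static Uri Save(Stream fileStream)
    {
        // Placeholder connection string and container name.
        var account = CloudStorageAccount.Parse("<connection-string>");
        var container = account.CreateCloudBlobClient()
                               .GetContainerReference("documents");
        var blob = container.GetBlockBlobReference(Guid.NewGuid().ToString());

        // UploadFromStream reads the source stream in chunks, so the whole
        // file is never held in this process's memory at once.
        blob.UploadFromStream(fileStream);
        return blob.Uri;
    }
}
```

Because the controller hands `Request.Body` down unread, this call is the first thing that consumes the bytes, so the upload to blob storage can begin while the client is still sending.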
  • And what processing exactly is required for this file? Does this processing involve reading that file completely? Commented Dec 14, 2017 at 12:50
  • @Evk no, I just need to pass it through to whichever file storage provider we use. For example, Rackspace Cloud Files has the method 'cloudFiles.CreateObject(containerName, stream, fileUri);' and Azure Blob Storage has the method 'blockBlob.UploadFromStream(fileStream);' Commented Dec 14, 2017 at 13:01
  • But you said " but the file needs some processing first in our business layer"? Commented Dec 14, 2017 at 13:12
  • After edit it's still not clear for me what a problem is. You pass that Stream between layers in your example, just like you want to. It doesn't work as you expect or what? Commented Dec 14, 2017 at 13:18
  • Essentially you need to stream the file, although I suppose this does limit what you can do to "process" it. There's some information here learn.microsoft.com/en-us/aspnet/core/mvc/models/file-uploads Commented Dec 14, 2017 at 13:28

1 Answer


Some notes:

  1. Passing around a byte array or stream doesn't copy the data - the issue is having the data on your server at all. If your web server needs to process the data in its complete form, you aren't going to be able to avoid the memory usage of holding it.

  2. If you don't need to process the data on your web server at all, but just need to put it in blob storage, you should return a URI for the upload that points directly to blob storage, so the client uploads there itself (this might be helpful: Using CORS with Azure)

  3. If you need to process the data, but you're okay doing that a bit-at-a-time, something like this answer is what you need Using streams in ASP.Net web API
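
To illustrate the bit-at-a-time option from point 3, here is a hedged sketch of processing a request stream in fixed-size chunks, so memory stays bounded regardless of file size. The class and method names are illustrative, not from the question's code:

```csharp
using System.IO;

public static class StreamProcessor
{
    // Copies the input to the destination one buffer at a time.
    // At most ~80 KB is resident in memory, however large the file is.
    public static void Process(Stream input, Stream destination)
    {
        var buffer = new byte[81920];
        int read;
        while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
        {
            // Inspect or transform buffer[0..read) here before passing it on.
            destination.Write(buffer, 0, read);
        }
    }
}
```

The same pattern works for `Request.Body` in the controller: the business layer receives the `Stream`, reads a chunk, does its per-chunk work, and writes the chunk onward to the storage provider.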
