
I am trying to convert a zip file into a text (XML) file using the following methods. It works fine for smaller files but does not seem to work for files larger than 50 MB.

class Program
{
    public static void Main(string[] args)
    {
        try
        {

            string importFilePath = @"D:\CorpTax\Tasks\966442\CS Publish error\CSUPD20180604L.zip";

            int maxLengthInMb = 20;
            byte[] payLoad = File.ReadAllBytes(importFilePath);
            int payLoadInMb = (payLoad.Length / 1024) / 1024;
            bool splitIntoMultipleFiles = (payLoadInMb / maxLengthInMb) > 1;
            int payLoadLength = splitIntoMultipleFiles ? maxLengthInMb * 1024 * 1024 : payLoad.Length;

            if (splitIntoMultipleFiles)
            {
                foreach (byte[] splitPayLoad in payLoad.Slices(payLoadLength))
                {
                    ToXml(payLoad);
                }
            }              
        }
        catch (Exception ex)
        {
            throw new Exception(ex.Message);
        }
    }

    public static string ToXml(byte[] payLoad)
    {
        using (XmlStringWriter xmlStringWriter = new XmlStringWriter())
        {
            xmlStringWriter.WriteStartDocument();
            xmlStringWriter.Writer.WriteStartElement("Payload");

            xmlStringWriter.Writer.WriteRaw(Convert.ToBase64String(payLoad));
            xmlStringWriter.Writer.WriteEndElement();
            xmlStringWriter.WriteEndDocument();
            return xmlStringWriter.ToString();
        }
    }
}

I have a .zip file of about 120 MB, and I get a System.OutOfMemoryException when calling Convert.ToBase64String().

So I split the byte array into 20 MB chunks, hoping that it would not fail. It works for the first three iterations of the loop, i.e. it converts 60 MB of the data, and in the fourth iteration I get the same exception. Sometimes I also get the exception at the line return xmlStringWriter.ToString().

To split the byte[] I used the following extension methods:

public static class ArrayExtensions
{
    public static T[] CopySlice<T>(this T[] source, int index, int length, bool padToLength = false)
    {
        int n = length;
        T[] slice = null;

        if (source.Length < index + length)
        {
            n = source.Length - index;
            if (padToLength)
            {
                slice = new T[length];
            }
        }

        if (slice == null) slice = new T[n];
        Array.Copy(source, index, slice, 0, n);
        return slice;
    }
    public static IEnumerable<T[]> Slices<T>(this T[] source, int count, bool padToLength = false)
    {
        for (var i = 0; i < source.Length; i += count)
        {
            yield return source.CopySlice(i, count, padToLength);
        }
    }
}

I got the above code from the question "Splitting a byte[] into multiple byte[] arrays in C#".

The funny part is that the program runs fine as a console application, but when I put the same code into a Windows application it throws System.OutOfMemoryException.

  • You should use a Stream instead of byte[]. Commented Jul 26, 2018 at 21:45
  • Project > Properties > Build tab, untick the "Prefer 32-bit" checkbox. You don't prefer it. Commented Jul 26, 2018 at 21:45
  • Notice that attempting to calculate the Base64 of slices won't give the same result. Not even close: dotnetfiddle.net/IUtlzH Commented Jul 26, 2018 at 21:51
  • In general, it's a bad idea to try to manipulate such huge amounts of data in memory all at once; string builders, xml builders, and so on, are not designed for this scenario. My advice would be to find or implement a streaming builder that dumps directly out to disk rather than building up such huge structures in memory. Commented Jul 26, 2018 at 21:57
  • That's an antipattern too. At this point not putting the zip file in the xml seems better. Commented Jul 26, 2018 at 22:18
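
The Stream and "streaming builder" suggestions in the comments above can be sketched roughly as follows. This is a minimal sketch, not the asker's code: the names `PayloadXmlWriter`, `WritePayloadXml`, `inputPath` and `outputPath` are made up for illustration, and the buffer size is arbitrary. It relies on `XmlWriter.WriteBase64`, which, as far as I know, may be called repeatedly for one element and keeps the encoding continuous across calls.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical helper, named for illustration only.
public static class PayloadXmlWriter
{
    // Streams the file at inputPath into <Payload>...</Payload> at outputPath,
    // Base64-encoding it chunk by chunk so only one small buffer is held in memory.
    public static void WritePayloadXml(string inputPath, string outputPath)
    {
        byte[] buffer = new byte[64 * 1024]; // arbitrary size, kept small on purpose

        using (FileStream input = File.OpenRead(inputPath))
        using (XmlWriter writer = XmlWriter.Create(outputPath))
        {
            writer.WriteStartDocument();
            writer.WriteStartElement("Payload");

            int read;
            while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
            {
                // Successive WriteBase64 calls continue the same Base64 run,
                // so the chunk size here does not have to be a multiple of 3.
                writer.WriteBase64(buffer, 0, read);
            }

            writer.WriteEndElement();
            writer.WriteEndDocument();
        }
    }
}
```

Calling `PayloadXmlWriter.WritePayloadXml(importFilePath, importFilePath + ".xml")` would write the document straight to disk, so neither the whole zip nor the whole Base64 string has to exist in memory at once.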

1 Answer

Preferably, you want to do something like this:

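            // Chunk size is a multiple of 3, so every chunk except the last encodes without '='
            // padding and the concatenated chunks equal the Base64 of the whole file.
            byte[] Packet = new byte[3 * 1024];
            string b64str = "";
            // file: path of the file to encode
            using (FileStream fs = new FileStream(file, FileMode.Open, FileAccess.Read))
            {
                int i = Packet.Length;
                while (i == Packet.Length)
                {
                    // Fill the buffer; a single Read call may return fewer bytes than requested.
                    i = 0;
                    int n;
                    while (i < Packet.Length && (n = fs.Read(Packet, i, Packet.Length - i)) > 0)
                        i += n;
                    // b64str holds only the current chunk: write it to the XML output here,
                    // before the next iteration overwrites it.
                    b64str = Convert.ToBase64String(Packet, 0, i);
                }
            }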

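Create your XML data from each b64str chunk as it is produced. Also, it is typically unwise to allocate a 20 MB array in one go; arrays that large are allocated on the large object heap (not the stack), where they can be hard to place once memory is fragmented.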
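
One detail worth checking when encoding chunk by chunk: Base64 turns every 3 bytes into 4 characters, so a chunk whose length is not a multiple of 3 ends in `=` padding, and that padding would land in the middle of the concatenated output. The sketch below illustrates this; `Base64ChunkDemo` and `EncodeInChunks` are hypothetical names, and the `StringBuilder` is only acceptable here because the demo data is tiny.

```csharp
using System;
using System.Text;

// Hypothetical demo class, named for illustration only.
public static class Base64ChunkDemo
{
    // Encodes data in chunks of chunkSize bytes and concatenates the results.
    public static string EncodeInChunks(byte[] data, int chunkSize)
    {
        var sb = new StringBuilder();
        for (int offset = 0; offset < data.Length; offset += chunkSize)
        {
            int count = Math.Min(chunkSize, data.Length - offset);
            sb.Append(Convert.ToBase64String(data, offset, count));
        }
        return sb.ToString();
    }
}
```

For a 10-byte array, `EncodeInChunks(data, 3)` matches `Convert.ToBase64String(data)`, while `EncodeInChunks(data, 4)` does not, because each 4-byte chunk is padded on its own.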


1 Comment

@user6520378 No problem. Also study this pattern: it is the most common way to read files in C#. Most people read part of the stream, do something with it, and repeat.
