I've already read several SO questions about this but still haven't been able to get it to work. I'm using public static string[] files = Directory.GetFiles(CurrentDirectory, "*.wav", SearchOption.AllDirectories); to get an array of file paths that are then passed off to a FileStream. The FileStream operations were taking too long with a single thread handling all of the files, so I decided to split the array and pass the smaller arrays to different threads.

I got the splitting code from another SO question and used it to pass the split array, but it only worked when the first array contained just one file. I know what the problem was:

var thing = from index in Enumerable.Range(0, files.Length) 
          group files[index] by index/600;
foreach(var set in thing)
    string.Join(";", set.ToArray());

(This isn't exactly how I used it; I've messed with it so much I can't remember.) The problem is that everything was treated as one massive file path. I have a foreach loop that gets each file from the smaller array, but it treated every file in the array as a single path and threw a PathTooLongException whenever the search returned more than one file. My function takes an array and then uses foreach (string file in smallerArray) to write to each one.

What I need to do is break the files array into 4 smaller arrays and start the new threads, like new Thread(() => { DoWork(newArray); }).Start(); but nothing I've tried has worked.
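For reference, here is a minimal sketch of the split-and-start-threads approach described above, keeping each group as a real string[] rather than joining it into one string. The names ChunkedWork, SplitIntoParts, and RunOnThreads are hypothetical, and the doWork delegate is only a stand-in for the DoWork function mentioned in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;

public static class ChunkedWork
{
    // Splits the array into at most `parts` arrays of near-equal size.
    public static List<string[]> SplitIntoParts(string[] files, int parts)
    {
        if (files == null) throw new ArgumentNullException("files");
        if (parts <= 0) throw new ArgumentOutOfRangeException("parts");

        // Ceiling division, so any remainder ends up in a shorter last chunk.
        int chunkSize = Math.Max(1, (files.Length + parts - 1) / parts);
        return files
            .Select((file, index) => new { file, index })
            .GroupBy(x => x.index / chunkSize, x => x.file)
            .Select(g => g.ToArray()) // keep an array of paths, not one joined string
            .ToList();
    }

    // Starts one thread per chunk and waits for all of them to finish.
    // doWork is an assumed stand-in for the question's DoWork(string[]).
    public static void RunOnThreads(string[] files, int parts, Action<string[]> doWork)
    {
        var threads = new List<Thread>();
        foreach (string[] chunk in SplitIntoParts(files, parts))
        {
            string[] local = chunk; // fresh variable per iteration for the lambda
            var thread = new Thread(() => doWork(local));
            threads.Add(thread);
            thread.Start();
        }
        foreach (Thread thread in threads)
            thread.Join();
    }
}
```

With 10 paths and parts = 4, this yields chunks of 3, 3, 3, and 1 paths. Whether four threads actually speed things up depends on where the files live, so treat the number as something to measure rather than a recommendation.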

  • Have you considered using Parallel LINQ instead of writing this yourself? Commented Aug 1, 2012 at 18:32
  • @TimB: Just what I was posting as an answer :) Commented Aug 1, 2012 at 18:33
  • Probably better to use Directory.EnumerateFiles() as the source. Commented Aug 1, 2012 at 19:54
  • @Henk, thanks, that helps speed up the initial search. Commented Aug 1, 2012 at 20:14
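
The two suggestions in these comments can be combined roughly as follows. This is only a sketch: WavScan and ProcessAll are hypothetical names, and processFile stands in for whatever per-file work the question's function does. It streams paths from Directory.EnumerateFiles and hands them to Parallel LINQ instead of building the whole array first.

```csharp
using System;
using System.IO;
using System.Linq;

public static class WavScan
{
    // processFile is an assumed per-file callback, not part of the original post.
    public static void ProcessAll(string root, Action<string> processFile)
    {
        // EnumerateFiles yields paths lazily, so processing can begin
        // before the directory search has finished.
        Directory.EnumerateFiles(root, "*.wav", SearchOption.AllDirectories)
            .AsParallel()
            .ForAll(processFile);
    }
}
```

Note that ForAll gives no ordering guarantee, so processFile has to be safe to call from several threads at once.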

2 Answers

So I decided I'd split the array and pass those smaller arrays to different threads.

Sounds like you're doing it the hard way :) Let the framework handle it for you with Parallel.ForEach:

Parallel.ForEach(files, file => 
{
    // Do stuff with one file
});

3 Comments

I'd forgotten about this. I have several thousand files, this would spawn a thread per file, right? Is there going to be a performance impact on running that many threads at the same time? Or will the framework limit the number of concurrently running threads when using Parallel.ForEach?
@0_______0: Sorry for not replying before. No, it doesn't spawn a thread per file. It creates a task per file, which isn't the same thing. It's like adding tasks into the thread pool. You can explicitly limit the degree of parallelism though - in particular, if these are files on the same disk, it's unclear that any parallelism will actually help. (It will depend on the disk type too.)
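
As a rough illustration of explicitly limiting the degree of parallelism, the sketch below passes a ParallelOptions instance to Parallel.ForEach. LimitedParallel, ProcessFiles, maxParallel, and processFile are hypothetical names introduced for the example, and a sensible value for the limit is an assumption that should be measured against the actual disk.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class LimitedParallel
{
    // processFile is an assumed per-file callback; maxParallel caps how many
    // files are processed concurrently.
    public static void ProcessFiles(IEnumerable<string> files, int maxParallel, Action<string> processFile)
    {
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallel };
        Parallel.ForEach(files, options, processFile);
    }
}
```

For files on a single spinning disk, a small limit (or even 1) may turn out to be as fast as a large one, which is the caveat raised in the comment above.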

Here is an example of an extension method that splits a sequence into blocks:

    // Extension method: declare it inside a static class.
    // Requires using System, System.Collections.Generic, and System.Linq.
    public static IEnumerable<IEnumerable<T>> Split<T>(this IEnumerable<T> source, int blockSize)
    {
        if (source == null)
            throw new ArgumentNullException("source");
        if (blockSize <= 0)
            throw new ArgumentException(string.Format("blockSize = {0}", blockSize), "blockSize");
        // Materialize once so the source is not re-enumerated for every block.
        List<T> items = source.ToList();
        var result = new List<IEnumerable<T>>();
        for (int blockStartIndex = 0; blockStartIndex < items.Count; blockStartIndex += blockSize)
        {
            // The last block may hold fewer than blockSize items.
            int count = Math.Min(blockSize, items.Count - blockStartIndex);
            result.Add(items.GetRange(blockStartIndex, count));
        }
        return result;
    }

Here is a test:

    [Test]
    public void TestSplit()
    {
        var list = new[] {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
        IEnumerable<IEnumerable<int>> splitted = list.Split(10);
        Assert.That(splitted.Count(), Is.EqualTo(1));
        Assert.That(splitted.First().Count(), Is.EqualTo(10));

        splitted = list.Split(11);
        Assert.That(splitted.Count(), Is.EqualTo(1));
        Assert.That(splitted.First().Count(), Is.EqualTo(10));

        splitted = list.Split(9);
        Assert.That(splitted.Count(), Is.EqualTo(2));
        Assert.That(splitted.First().Count(), Is.EqualTo(9));
        Assert.That(splitted.ElementAt(1).Count(), Is.EqualTo(1));

        splitted = list.Split(3);
        Assert.That(splitted.Count(), Is.EqualTo(4));
        Assert.That(splitted.First().Count(), Is.EqualTo(3));
        Assert.That(splitted.ElementAt(1).Count(), Is.EqualTo(3));
        Assert.That(splitted.ElementAt(2).Count(), Is.EqualTo(3));
        Assert.That(splitted.ElementAt(3).Count(), Is.EqualTo(1));
    }
