I have a string array full of GUIDs. I am trying to replace certain GUIDs with different GUIDs. My approach is below:

var newArray = this.to.Select(s => s.Replace("e77f75b7-2373-dc11-8f13-0019bb2ca0a0", "1fe8f3f6-fe17-e811-80d8-00155d5ce473")
    .Replace("fbd0c892-2373-dc11-8f13-0019bb2ca0a0", "1fe8f3f6-fe17-e811-80d8-00155d5ce473")
    .Replace("76cd4297-1e31-dc11-95d8-0019bb2ca0a0", "eb892fb0-fe17-e811-80d8-00155d5ce473")
    .Replace("cd42bb68-2073-dc11-8f13-0019bb2ca0a0", "dc6077e2-fe17-e811-80d8-00155d5ce473")
    .Replace("96b97150-cd45-e111-a3d5-00155d10010f", "1fe8f3f6-fe17-e811-80d8-00155d5ce473")
    ).ToArray();

I am doing this for a few fields, and it is leading to an OutOfMemoryException. Is it because the Replace() method creates a new array every time? Is there a more efficient way to do this with an array of strings? This method runs for tens of thousands of records, so I think this is the issue. When I comment these lines out, I do not get the exception.

EDIT: The data in the 'to' variable is a short string in each case, but this runs for thousands of records. So 'to' might look like this for one record:

"systemuser|76cd4297-1e31-dc11-95d8-0019bb2ca0a0;contact|96b97150-cd45-e111-a3d5-00155d10010f"

Any of the GUIDs I want to replace might appear in it, so even though a given record might contain only one of them, I need to run the full set of Replace() calls just in case.

Any pointers would be great! Thanks.

  • How big is the string? Commented Mar 7, 2018 at 16:19
  • How many elements are there in the input array? Can you modify the input array in place instead of creating a new one? Commented Mar 7, 2018 at 16:20
  • @Liam My point is that the OP has a large array of small strings. Commented Mar 7, 2018 at 16:22
  • Have you tried a simple foreach loop modifying the array elements in-place instead of creating another large array? Commented Mar 7, 2018 at 16:24
  • How positive are you that this is the code leading to your OutOfMemoryException? I'm able to run your code with a million copies of your sample input, in a 32-bit .NET environment without any problems. Commented Mar 7, 2018 at 16:47
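
A minimal sketch of the in-place loop suggested in the comments above, assuming the records live in a plain string[]. The class and method names (InPlaceReplaceSketch, ReplaceInPlace) are hypothetical; the GUID pairs are the ones from the question. It uses an indexed for loop because a foreach iteration variable is read-only, so assigning to it would not compile. This avoids allocating a second array, although each Replace() call that finds a match still creates a new string.

```csharp
using System;
using System.Collections.Generic;

public static class InPlaceReplaceSketch
{
    // Old GUID -> new GUID, copied from the question.
    private static readonly Dictionary<string, string> Pairs = new Dictionary<string, string>
    {
        { "e77f75b7-2373-dc11-8f13-0019bb2ca0a0", "1fe8f3f6-fe17-e811-80d8-00155d5ce473" },
        { "fbd0c892-2373-dc11-8f13-0019bb2ca0a0", "1fe8f3f6-fe17-e811-80d8-00155d5ce473" },
        { "76cd4297-1e31-dc11-95d8-0019bb2ca0a0", "eb892fb0-fe17-e811-80d8-00155d5ce473" },
        { "cd42bb68-2073-dc11-8f13-0019bb2ca0a0", "dc6077e2-fe17-e811-80d8-00155d5ce473" },
        { "96b97150-cd45-e111-a3d5-00155d10010f", "1fe8f3f6-fe17-e811-80d8-00155d5ce473" },
    };

    // Overwrites each element of the array instead of building a new array.
    public static void ReplaceInPlace(string[] records)
    {
        for (int i = 0; i < records.Length; i++)
        {
            string current = records[i];
            foreach (KeyValuePair<string, string> pair in Pairs)
                current = current.Replace(pair.Key, pair.Value);
            records[i] = current;
        }
    }
}
```

Usage would be a single call such as InPlaceReplaceSketch.ReplaceInPlace(this.to), after which the original array holds the updated strings.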

3 Answers

3

I would use a replacement dictionary. It's easier to maintain and (I think) easier to understand, so it's easier all the way:

Boilerplate and create demo data / replace dict:

using System;
using System.Collections.Generic;
using System.Linq;

internal class Program
{
    static void Main(string[] args)
    {
        // C# 7 local function
        string[] CreateDemoData(Dictionary<string, string> replDict)
        {
            // C# 7 local function
            string FilText(string s) => $"Some text| that also incudes; {s} and more.";

            return Enumerable
                .Range(1, 5)
                .Select(i => FilText(Guid.NewGuid().ToString()))
                .Concat(replDict.Keys.Select(k => FilText(k)))
                .OrderBy(t => Guid.NewGuid().GetHashCode())
                .ToArray();
        }

        // replacement dict
        var d = new Dictionary<string, string>
        {
            ["e77f75b7-2373-dc11-8f13-0019bb2ca0a0"] = "e77f75b7-replaced",
            ["fbd0c892-2373-dc11-8f13-0019bb2ca0a0"] = "fbd0c892-replaced",
            ["76cd4297-1e31-dc11-95d8-0019bb2ca0a0"] = "76cd4297-replaced",
            ["cd42bb68-2073-dc11-8f13-0019bb2ca0a0"] = "cd42bb68-replaced",
            ["96b97150-cd45-e111-a3d5-00155d10010f"] = "96b97150-replaced",
        };

        var arr = CreateDemoData(d);

Code that creates the actual replaced array:

        // C# 7 local function
        string Replace(string a, Dictionary<string, string> dic)
        {
            foreach (var key in dic.Keys.Where(k => a.Contains(k)))
                a = a.Replace(key, dic[key]);

            return a;
        }

        // Replace every known GUID in each element; elements without one stay unmodified.
        var b = arr.Select(a => Replace(a, d));
        // Even if you really have that much data (20k GUIDs of ~50 bytes each
        // is not much, IMHO), the same approach works for in-place replacement:
        // loop over the array by index and assign the result back to each element.

Output code:

        Console.WriteLine("\nBefore:");
        foreach (var s in arr)
            Console.WriteLine(s);

        Console.WriteLine("\nAfter:");
        foreach (var s in b)
            Console.WriteLine(s);

        Console.ReadLine(); 
    }
}

Output:

Before:
Some text| that also incudes; a5ceefd8-1388-47cd-b69e-55b6ddbbc133 and more.
Some text| that also incudes; 76cd4297-1e31-dc11-95d8-0019bb2ca0a0 and more.
Some text| that also incudes; 3311a8c5-015e-4260-af80-86b20b277234 and more.
Some text| that also incudes; ed10c79c-dad6-4c88-865c-4d7624945d66 and more.
Some text| that also incudes; 96b97150-cd45-e111-a3d5-00155d10010f and more.
Some text| that also incudes; 0226d9b1-c5f0-41fb-9294-bc9297e8afd9 and more.
Some text| that also incudes; e77f75b7-2373-dc11-8f13-0019bb2ca0a0 and more.
Some text| that also incudes; a04d1e34-e7bc-4bbc-ae0e-12ec846a353c and more.
Some text| that also incudes; cd42bb68-2073-dc11-8f13-0019bb2ca0a0 and more.
Some text| that also incudes; fbd0c892-2373-dc11-8f13-0019bb2ca0a0 and more.

Output:

After:
Some text| that also incudes; a5ceefd8-1388-47cd-b69e-55b6ddbbc133 and more.
Some text| that also incudes; 76cd4297-replaced and more.
Some text| that also incudes; 3311a8c5-015e-4260-af80-86b20b277234 and more.
Some text| that also incudes; ed10c79c-dad6-4c88-865c-4d7624945d66 and more.
Some text| that also incudes; 96b97150-replaced and more.
Some text| that also incudes; 0226d9b1-c5f0-41fb-9294-bc9297e8afd9 and more.
Some text| that also incudes; e77f75b7-replaced and more.
Some text| that also incudes; a04d1e34-e7bc-4bbc-ae0e-12ec846a353c and more.
Some text| that also incudes; cd42bb68-replaced and more.
Some text| that also incudes; fbd0c892-replaced and more.

5 Comments

To apply this solution, you need to do extra parsing, according to the OP's row format.
Thanks @AlexandruClonțea - that was edited in after the fact. Will see how to adapt this.
Upvoting. I wonder if Regex.Replace("a|b|c|d|e", "newVal") would need fewer iterations. I know Regex.Replace is slower for the same number of steps... I just wonder if performance would be better because of the fewer overall "perceived" iterations. Also, I wonder why the OP is getting an OOM.
@AlexandruClonțea I only replace once for each existing key, courtesy of the LINQ Where before it. Regex.Replace with OR'ed conditions would probably not work, as each GUID is replaced by something different, so you would have to use multiple single replaces as well. The LINQ check for whether a key is in the string could still be applied, but then you are testing string.Replace() vs. Regex.Replace. I like regex, but sometimes it's easier to just use the built-in methods. Regex.Replace would also have to deal with two GUIDs in the same line needing replacement, so you need another loop, which makes it less easy.
It just seemed to me from the OP's sample that the target GUID is the same. I imagine it might represent something like group membership business-wide... (i.e. "I want this to be the new ... something for these" is the actual business need, the "transaction"). That's why I said maybe just one would work.
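
On the regex question raised in these comments: one way a single alternation pattern can still map each GUID to a different replacement is the Regex.Replace overload that takes a MatchEvaluator, which looks up every match in the dictionary. The following is only a sketch under that assumption; the class and member names (RegexReplaceSketch, Map, Pattern, ReplaceAll) are hypothetical, the GUID pairs come from the question, and no claim is made here that it is faster than chained string.Replace() calls.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class RegexReplaceSketch
{
    // Old GUID -> new GUID, copied from the question.
    private static readonly Dictionary<string, string> Map = new Dictionary<string, string>
    {
        { "e77f75b7-2373-dc11-8f13-0019bb2ca0a0", "1fe8f3f6-fe17-e811-80d8-00155d5ce473" },
        { "fbd0c892-2373-dc11-8f13-0019bb2ca0a0", "1fe8f3f6-fe17-e811-80d8-00155d5ce473" },
        { "76cd4297-1e31-dc11-95d8-0019bb2ca0a0", "eb892fb0-fe17-e811-80d8-00155d5ce473" },
        { "cd42bb68-2073-dc11-8f13-0019bb2ca0a0", "dc6077e2-fe17-e811-80d8-00155d5ce473" },
        { "96b97150-cd45-e111-a3d5-00155d10010f", "1fe8f3f6-fe17-e811-80d8-00155d5ce473" },
    };

    // One pattern of the form "guid1|guid2|...", built once from the keys.
    // Declared after Map because static initializers run in textual order.
    private static readonly Regex Pattern = new Regex(
        string.Join("|", Map.Keys.Select(k => Regex.Escape(k))),
        RegexOptions.Compiled);

    // A single pass over the input; each match is looked up in the dictionary.
    public static string ReplaceAll(string input)
    {
        return Pattern.Replace(input, m => Map[m.Value]);
    }
}
```

Because the evaluator runs once per match, a line containing two different GUIDs is handled in the same pass without an extra loop.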
0

I'd do this in a single sweep: use a regex to extract the fields, apply a dictionary of replacements, then reconstitute the string:

IDictionary<string, string> replacements = new Dictionary<string, string>
{
    {"76cd4297-1e31-dc11-95d8-0019bb2ca0a0","something else"},
    //etc
};
var newData = data
    //.AsParallel() //for speed
    .Select(d => Regex.Match(d, @"^(?<f1>[^\|]*)\|(?<f2>[^;]*);(?<f3>[^\|]*)\|(?<f4>.*)$"))
    .Where(m => m.Success)
    .Select(m => new
    {
        field1 = m.Groups["f1"].Value,
        field2 = m.Groups["f2"].Value,
        field3 = m.Groups["f3"].Value,
        field4 = m.Groups["f4"].Value
    })
    .Select(x => new
    {
        x.field1,
        field2 = replacements.TryGetValue(x.field2, out string r2) ? r2 : x.field2,
        x.field3,
        field4 = replacements.TryGetValue(x.field4, out string r4) ? r4 : x.field4
    })
    .Select(x => $"{x.field1}|{x.field2};{x.field3}|{x.field4}")
    .ToArray();
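
Note that the regex above expects exactly two "type|id" pairs per record, and the Where(m => m.Success) step drops any record that does not match. A possible variant, sketched here with Split instead of a regex, handles any number of pairs and keeps records it cannot parse. The names FieldReplaceSketch and ReplaceIds are hypothetical, and the "type|id;type|id" layout is assumed from the sample record in the question.

```csharp
using System;
using System.Collections.Generic;

public static class FieldReplaceSketch
{
    // Splits a record into "type|id" pairs, swaps each id that has an entry
    // in the dictionary, and joins the pairs back together with ';'.
    public static string ReplaceIds(string record, IDictionary<string, string> replacements)
    {
        string[] pairs = record.Split(';');
        for (int i = 0; i < pairs.Length; i++)
        {
            int bar = pairs[i].IndexOf('|');
            if (bar < 0)
                continue; // not a "type|id" pair, leave it untouched

            string id = pairs[i].Substring(bar + 1);
            string replacement;
            if (replacements.TryGetValue(id, out replacement))
                pairs[i] = pairs[i].Substring(0, bar + 1) + replacement;
        }
        return string.Join(";", pairs);
    }
}
```

It could be applied with data.Select(d => FieldReplaceSketch.ReplaceIds(d, replacements)).ToArray(), reusing the replacements dictionary defined above.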


-1

Have you tested with StringBuilder?

StringBuilder sb = new StringBuilder(string.Join(",", this.to));

string tempStr = sb
    .Replace("e77f75b7-2373-dc11-8f13-0019bb2ca0a0", "1fe8f3f6-fe17-e811-80d8-00155d5ce473")
    .Replace("fbd0c892-2373-dc11-8f13-0019bb2ca0a0", "1fe8f3f6-fe17-e811-80d8-00155d5ce473")
    .Replace("76cd4297-1e31-dc11-95d8-0019bb2ca0a0", "eb892fb0-fe17-e811-80d8-00155d5ce473")
    .Replace("cd42bb68-2073-dc11-8f13-0019bb2ca0a0", "dc6077e2-fe17-e811-80d8-00155d5ce473")
    .Replace("96b97150-cd45-e111-a3d5-00155d10010f", "1fe8f3f6-fe17-e811-80d8-00155d5ce473")
    .ToString();

var newArray = tempStr.Split(',');

2 Comments

This puts all the to[] records in one big string. The Replace() calls will be much slower.
@HenkHolterman But I think it is still faster than multiple string.Replace() calls.
