
I'm trying to parse a bunch of files using the string Replace method. It does what I expect, but I feel it is not practical. For instance, I will process 10K files, but in the first 72 I already found about 30 values that need to be replaced. This is the rule:

My goal:

My goal is to replace every instance of ':' except those that follow these rules:

1- the 2nd or 3rd character forward is another ':' 2- the 2nd or 3rd character backward is another ':'

All others should be replaced.

In other words, any time I find this character (:) and it is not part of a tag with two or three characters between colons, like :00: or :12A:, I should replace it with an asterisk (*).

This is the method that I have so far:

private static string cleanMesage(string str)
{
    string result = String.Empty;
    try
    {
        result = str.Replace("BNF:", "BNF*").Replace("B/O:", "B/O*").Replace("O/B:", "O/B*");
        result = result.Replace("Epsas:", "Epsas*").Replace("2017:", "2017*").Replace("BANK:", "BANK*");
        result = result.Replace("CDT:", "CDT*").Replace("ENT:", "").Replace("GB22:", "GB22*");
        result = result.Replace("A / C:", "A/C*").Replace("ORD:", "ORD*").Replace("A/C:", "A/C*");
        result = result.Replace("REF:", "REF*").Replace("ISIN:", "ISIN*").Replace("PAY:", "PAY*");
        result = result.Replace("DEPOSITO:", "DEPOSITO*").Replace("WITH:", "WITH*");
        result = result.Replace("Operaciones:", "Operaciones*").Replace("INST:", "INST*");
        result = result.Replace("DETAIL:", "DETAIL*").Replace("WITH:", "WITH*").Replace("BO:", "BO*");
        result = result.Replace("CUST:", "CUST*").Replace("ISIN:", "ISIN*").Replace("SEDL:", "SEDL*");
        result = result.Replace("Enero:", "Enero*").Replace("enero:", "Enero*");
        result = result.Replace("agosto:", "agosto*").Replace("febrero:", "febrero*");
        result = result.Replace("marzo:", "marzo*").Replace("abril:", "abril*");
        result = result.Replace("mayo:", "mayo*").Replace("junio:", "junio*").Replace("RE:", "RE:*");
        result = result.Replace("julio:", "julio*").Replace("septiembre:", "septiembre*");
        result = result.Replace("NIF:", "NIF*").Replace("INST:", "INST*").Replace("SHS:", "SHS*")
            .Replace("SK:", "");
        result = result.Replace("PARTY:", "PARTY*").Replace("SEDOL:", "SEDOL*").Replace("PD:", "PD*");
    }
    catch (Exception e)
    {

    }
    return result;
}

And this is some sample data:

:13: <-- keep /ISIN/XS SVUNSK UXPORTKRUDIT ZX PZY DZTU:<- replace UX DZ
TU:<- replace02ZUG12 RZTU:<- replace W/H TZX RZTU:<- replace0.00000 SHZRUS PZID:<- replace
0.000000 IDDSIN:<- replace
:31:  <-- keep 1201000100CD05302,24NSUC20523531001//00520023531014
:13: <-- keep /ISIN/XS0153242003 SVUNSK UXPORTKRUDIT ZX PZY DZTU:<- replace00ZUG12 UX DZ
TU:02ZUG12 RZTU:0.30241 W/H TZX RZTU:<- replace0.00000 SHZRUS PZID:<- replace
0.000000 ISIN:XS0153242003
:31: <-- keep 1201000100DD121253,25S202IMSSMSZUX534C//S0322211DF4301
S F/O 0150001400
:13: <-- keep XNF:<- replace this 
  • You should probably try Regex. Commented Oct 20, 2017 at 14:42
  • And you should show your input data without the "keep" annotations, as well as the expected output. Commented Oct 20, 2017 at 14:44
  • It's not practical, indeed. Each call to Replace creates a new temporary string. If you have a large file, or have to process many files, this will cause a HUGE waste of memory and CPU. You should check regular expressions or parsers. For example, it looks like you are trying to modify tags. Those tags look like a bunch of letters followed by :. You can capture that with \w+:. That won't capture spaces or slashes though. In order to replace something, though, you need to capture the tag name, e.g. (\w+):. This captures the tag. You can replace this with $1*. Commented Oct 20, 2017 at 14:47
  • Check Substitutions in regular expressions. Commented Oct 20, 2017 at 14:48
  • Your sample data seems to be all kinds of mixed up... Commented Oct 20, 2017 at 14:55

2 Answers


If your goal is to replace all instances of the ':' character where it is not followed by 2 or 3 other characters, you could indeed try the System.Text.RegularExpressions library. You could then simplify your cleanMessage function in the following way.

using System.Text.RegularExpressions;

private static string cleanMessage(string str)
{
     // A ':' followed by a whitespace character. The verbatim string (@) keeps \s intact.
     string pattern = @":(\s)";
     Regex rgx = new Regex(pattern);
     // Replace the ':' with a '*' and keep the captured whitespace character.
     string replaceResult = rgx.Replace(str, "*$1");
     return replaceResult;
}

If your goal is to replace all instances of the ':' character where it is not followed by 2 or 3 other characters and the 2nd or 3rd character forward or backward is not another ':', you could change your cleanMessage to the following instead.

using System.Text.RegularExpressions;

private static string cleanMessage(string str)
{
     // Two characters that cannot be ':', then any character, then a ':',
     // then a whitespace character and two more characters that cannot be ':'.
     // For instance, "BNF: :F" would FAIL and not get replaced, but "BNF: HH" would pass and become "BNF* HH".
     string pattern = @"([^:]{2}.):(\s[^:]{2})";
     Regex rgx = new Regex(pattern);
     // Replace the ':' with a '*' and keep the captured text on both sides.
     string replaceResult = rgx.Replace(str, "$1*$2");
     return replaceResult;
}

More information on the Regex.Replace method of the System.Text.RegularExpressions library can be found at https://msdn.microsoft.com/en-us/library/xwewhkd1(v=vs.110).aspx
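The rule from the question can also be written with lookarounds, which check the neighbours of a ':' without consuming them. The following is only a sketch under an assumed reading of the rule: a ':' is kept when another ':' sits exactly two or three non-colon characters before or after it (as in :13: or :12A:), and every other ':' becomes '*'. The class name ColonCleaner and the decision not to let a tag span a line break are illustrative assumptions, not part of the original question.

```csharp
using System.Text.RegularExpressions;

public static class ColonCleaner
{
    // (?<!:[^:\r\n]{2,3})  no ':' two or three characters behind this one
    // :                    the colon under test
    // (?![^:\r\n]{2,3}:)   no ':' two or three characters ahead of this one
    // Assumption: a tag such as :13: never spans a line break.
    private static readonly Regex LooseColon = new Regex(
        @"(?<!:[^:\r\n]{2,3}):(?![^:\r\n]{2,3}:)",
        RegexOptions.Compiled);

    public static string CleanMessage(string str)
    {
        // Only the colon itself is matched, so nothing around it is lost.
        return LooseColon.Replace(str, "*");
    }
}
```

With this sketch, ColonCleaner.CleanMessage(":13: /ISIN/XS PZY DZTU: UX") should give ":13: /ISIN/XS PZY DZTU* UX". Note that two unrelated colons that happen to be two or three characters apart would both be kept, which follows from the rule as stated.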


3 Comments

My goal is to replace every instance of ':' except those that follow these rules: 1- the 2nd or 3rd character forward is another ':' 2- the 2nd or 3rd character backward is another ':'. All others should be replaced.
I have edited my answer. Is it now closer to what you need? It should now only replace the : where it is not close to another one.
Awesome, thanks a lot!

As @dymanoid mentioned, regular expressions are a way to handle this. By using the following you'd get what you want:

result = Regex.Replace(str, @"([a-zA-Z0-9]{2,3}):", "$1*");

However, for large datasets this may not perform well. In that case I'd look at walking through str character by character using a for-loop. If the current character is not a colon, add it to the result string and to a temporary string. When the current character is a colon (:) and the temporary string has a length of 2 or 3, write an asterisk to the result and clear the temporary string. In this case you don't do any string replacement; you just select what to write to a new string.
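A character-by-character pass along these lines could look like the sketch below. It is not the exact temporary-string bookkeeping described above. It applies the rule from the question under an assumed reading: a ':' is kept only when another ':' lies two or three non-colon characters before or after it on the same line. The names ColonScanner and HasPartner are hypothetical, and a StringBuilder is used so that the result is built once instead of through repeated string concatenation.

```csharp
using System.Text;

public static class ColonScanner
{
    // Returns true when another ':' lies exactly 3 or 4 positions away from
    // index i in the given direction (-1 or +1), with no ':' or line break
    // in between, i.e. two or three ordinary characters separate the colons.
    private static bool HasPartner(string s, int i, int direction)
    {
        for (int distance = 1; distance <= 4; distance++)
        {
            int j = i + direction * distance;
            if (j < 0 || j >= s.Length) return false;
            char c = s[j];
            if (c == ':') return distance >= 3;
            if (c == '\r' || c == '\n') return false;
        }
        return false;
    }

    public static string CleanMessage(string str)
    {
        var sb = new StringBuilder(str.Length);
        for (int i = 0; i < str.Length; i++)
        {
            char c = str[i];
            // A colon with no partner on either side is replaced by '*'.
            bool loose = c == ':' && !HasPartner(str, i, -1) && !HasPartner(str, i, +1);
            sb.Append(loose ? '*' : c);
        }
        return sb.ToString();
    }
}
```

Whether this is actually faster than a compiled regex on the real files would need to be measured; the sketch only shows that the rule can be applied in a single pass without intermediate strings.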

See here for a speed comparison between string replacement and regex replacement.

8 Comments

This doesn't look like it does what the OP wants... The first substitution in his current code is .Replace("BNF:", "BNF*"). As you can see, it should only be replacing the :, whereas yours will replace the preceding characters too...
Thanks a lot for the suggestion! This one will not work because it replaces this value ( keep -----> :31: <-- keep ), and I want to replace this value ( replace ---> RZTU:<- replace ), because 2 or 3 positions back it doesn't have another : character... But it's a good starting point, so thanks a lot!
@Chris it does! Because I only want to replace the (:), not the complete word (BNF:). My result here is (BNF*), and that is exactly what I'm looking for...
@IvanS: I am confused again... It really looks like it doesn't... If I run var str = "BNF:"; var result = Regex.Replace(str, "[a-zA-Z0-9]{2,3}:", "*"); then result will equal just * and not BNF*. Is that actually correct? I don't think so from what you've said previously, but your last comment seemed to indicate it is correct...
@Chris you are right! I was talking about the Replace method, not the regex. This {result = Regex.Replace(str, "[a-zA-Z0-9]{2,3}\:", "*");} doesn't work; it replaces the BNF also... I thought you were talking about the Replace method and not the regex. This regex won't work for what I'm looking for... but it is a good starting point to learn about it.