
Sorry about the ambiguous title; I couldn't think of a better way to articulate the question.

I have a CSV file with hundreds of lines, containing thousands of LDAP distinguished names. Some sample lines could look like:

CN=John Doe,OU=Miami,DC=contoso,DC=com; CN=Spamela Anderson,OU=Los Angeles,DC=contoso,DC=com; CN=Cosmo Kramer,OU=Subfolder,OU=Subfolder,OU=ParentFolder,DC=FABRIKAM,DC=com; CN=Bob Barker,DC=contoso,DC=com
CN=Luke Skywalker,OU=Tattoine,DC=contoso,DC=com; CN=Brad Pitt,OU=Hollywood,DC=contoso,DC=com; CN=Mickey Mouse,OU=Users,DC=contoso,DC=com
CN=Ted Nugent,OU=Houston,DC=FABRIKAM,DC=com; CN=Carl Sagan,DC=Uranus,DC=contoso,DC=com

I'd like to remove any distinguished name that is in the FABRIKAM.COM domain (dc=fabrikam,dc=com). In the sample, I'd like to strip out:

;CN=Cosmo Kramer,OU=Subfolder,OU=Subfolder,OU=ParentFolder,DC=FABRIKAM,DC=com

I've tried to use:

CN=(.*)?,DC=fabrikam,DC=com

But this finds the first occurrence of "CN=" from the beginning of the line until an occurrence of "DC=fabrikam,dc=com" (which would also include John Doe and Spamela Anderson, in my sample).

Is there a way to start the match at the nearest "CN=" to the left of "DC=fabrikam,DC=com", using that as the boundary?

(I use either Notepad++ or Programmer's Notepad)

1 Answer

If you can assume that ; never appears in the values and is only used for delimiting different records, then you can use this:

CN=[^;]*,DC=fabrikam,DC=com

Note that the regex above may match across multiple lines, because [^;] also matches line-break characters.

This is a quick fix, if the file uses \n to separate the lines:

CN=[^;\n]*,DC=fabrikam,DC=com
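To see the difference between the two patterns outside an editor, here is a minimal Python sketch. The two-line sample is a shortened, hypothetical variant of the data in the question, and `re.IGNORECASE` is an assumption added because the data spells the domain `FABRIKAM` while the pattern uses `fabrikam` (editors such as Notepad++ typically have a separate "match case" option for this).

```python
import re

# Hypothetical sample: the second line begins with a FABRIKAM entry,
# which is the case where the first pattern spills across the line break.
text = (
    "CN=John Doe,OU=Miami,DC=contoso,DC=com; CN=Bob Barker,DC=contoso,DC=com\n"
    "CN=Ted Nugent,OU=Houston,DC=FABRIKAM,DC=com; CN=Carl Sagan,DC=Uranus,DC=contoso,DC=com"
)

# [^;] matches any character except ';', including '\n'.
loose = re.compile(r"CN=[^;]*,DC=fabrikam,DC=com", re.IGNORECASE)

# [^;\n] additionally refuses to cross a '\n' line break.
strict = re.compile(r"CN=[^;\n]*,DC=fabrikam,DC=com", re.IGNORECASE)

loose_matches = loose.findall(text)
strict_matches = strict.findall(text)

print(loose_matches)   # one match that starts at "CN=Bob Barker" on the previous line
print(strict_matches)  # one match that contains only the Ted Nugent entry
```

The first pattern starts at the last "CN=" of the previous line because no ";" separates it from the FABRIKAM entry; the second stops at the line break and so matches only the intended entry.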

4 Comments

Thanks, it works! One weirdness is that if the fabrikam DN occurs at the beginning of a line, the search also grabs the last occurrence of "CN=" from the previous line. Not a big deal as I can hopefully manually edit those. But for my own knowledge, is there a way to prevent the search from crossing newlines or carriage returns?
@BHall: That is one case where I expect this to fail. Can you edit your question to include more test data?
Original question updated, but I see that you've already fixed it! Thanks for your help, @nhahtdh!
@BHall: Note that the fix may not work if your file uses something other than \n to separate the lines. Lines normally end in \n (Linux uses \n and Windows uses \r\n, which still contains \n), but there are other Unicode characters that are also used for separating lines.
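As a rough illustration of that caveat, the sketch below widens the excluded set to a few other line-break characters. The particular characters chosen (\r, \x85, \u2028, \u2029) are an assumption for illustration, not an exhaustive list, and the escape syntax shown is Python's `re` module; the regex engine in an editor may spell these differently.

```python
import re

# Hypothetical input where the two "lines" are separated by U+2028
# (LINE SEPARATOR) instead of '\n'.
text = (
    "CN=Bob Barker,DC=contoso,DC=com\u2028"
    "CN=Ted Nugent,OU=Houston,DC=FABRIKAM,DC=com"
)

# Excludes only ';' and '\n', so it still runs across U+2028.
newline_only = re.compile(r"CN=[^;\n]*,DC=fabrikam,DC=com", re.IGNORECASE)

# Also excludes '\r', NEL (U+0085), U+2028 and U+2029 (an illustrative set).
broader = re.compile(
    r"CN=[^;\r\n\x85\u2028\u2029]*,DC=fabrikam,DC=com", re.IGNORECASE
)

newline_only_matches = newline_only.findall(text)
broader_matches = broader.findall(text)

print(newline_only_matches)  # match begins at "CN=Bob Barker"
print(broader_matches)       # match contains only the Ted Nugent entry
```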
