I'm trying to format phone numbers in a large CSV directory. I will need to re-format this periodically as it changes so this is not a one-off solution. I have used Notepad++'s regex replace feature successfully in the past and would like to use this tool if possible. However, I'm open to better/faster methods including scripting like PowerShell, which I am familiar with.

Sample of number formats in the database:
XXX-XXXX
XXXXXXX
XXXXXXXXXX
1XXXXXXXXXX
(XXX) XXX-XXXX
1(XXX) XXX-XXXX
(1XXX) XXX-XXXX
XXX-XXX-XXXX

That last one is what I want all phone numbers to look like in the final output. For the ones lacking an area code, I would add a default value. For the ones with a leading country code, I would need to strip it.

Here are some of the regex searches I've used:
FIND: 1-(\d{3})-(\d{3})-(\d{4})
REPLACE: \1-\2-\3
This works!

FIND: 1\((\d{3})\)\s(\d{3})-(\d{4})
REPLACE: \1-\2-\3
This works!

FIND: (\d{11})
REPLACE: ???
This finds the correct string, but I don't know how to format the output.

FIND: (\d{3})-(\d{4})
REPLACE: XXX-\1-\2 (here the XXX is my standard area code that I will add)
This finds the correct substring in XXX-XXXX, but it also matches inside XXX-XXX-XXXX and inside ZIP codes with +4 appended (XXXXX-XXXX). I need to match only a standalone XXX-XXXX, with nothing preceding it, and only in phone number fields. Because this is a CSV file, the character before each field is a comma.

My problem is twofold. 1) I don't know how to break up a found string into the parts I need for the replace. I need to convert blocks of digits (7, 10 and 11 digits) and format them to fit the pattern XXX-XXX-XXXX. 2) I don't know how to select just the string I'm searching for (i.e. only XXX-XXXX)

2 Answers

Provided you have a sample list of numbers like

Current             Expected
---------------------------------
123-1234            XXX-123-1234
1234567             XXX-123-4567
1234567890          123-456-7890
10123456789         012-345-6789
(123) 456-1234      123-456-1234
1(123) 123-1234     123-123-1234
1-123-123-1234      123-123-1234
(1999) 999-1234     999-999-1234
123-123-1234        123-123-1234

You may use

Find What: ^(?:1-?)?(?|\(1?(\d{3})\)|(\d{3}))[-\s]?(\d{3})[-\s]?(\d{4})$|^(\d{3})[-\s]?(\d{4})$
Replace With: (?1$1-$2-$3:XXX-$4-$5)

Details:

  • ^ - start of line
  • (?:1-?)? - optional sequence of 1 and an optional -
  • (?|\(1?(\d{3})\)|(\d{3})) - a branch reset group (syntax is (?|...), all groups inside alternative branches receive same IDs) matching either:
    • \(1?(\d{3})\) - ( + an optional 1 + Group 1 capturing 3 digits + )
    • | - or
    • (\d{3}) - Group 1 (still! because of a branch reset group) capturing 3 digits
  • [-\s]? - an optional - or whitespace
  • (\d{3}) - Group 2 capturing 3 digits
  • [-\s]? - an optional - or whitespace
  • (\d{4}) - Group 3 capturing 4 digits
  • $ - end of line
  • | - OR
  • ^ - start of line
  • (\d{3}) - Group 4 capturing 3 digits
  • [-\s]? - an optional - or whitespace
  • (\d{4}) - Group 5 capturing 4 digits
  • $ - end of line

The replacement pattern:

  • (?1 - If Group 1 matched, then use
    • $1-$2-$3 - Backreference to Group 1, 2 and 3 with hyphens in between
  • : - or else
  • XXX-$4-$5 - XXX (or whatever the default area code is), and Group 4 and 5 separated with a hyphen.
  • ) - end of the if-then block.
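
If the same logic is ever needed outside Notepad++, it can be sketched in Python. This is only a rough translation under a few assumptions: Python's `re` module has no branch reset group and no conditional replacement syntax, so the pattern uses two separate area-code groups and the if/else lives in a small function. `DEFAULT_AREA_CODE` and `normalize` are names made up for this illustration, and `555` stands in for XXX.

```python
import re

# Hypothetical placeholder for the XXX default area code.
DEFAULT_AREA_CODE = "555"

# Same pattern as the Find What field, except that (?|...) is replaced by a
# plain non-capturing group, so the two area-code branches get separate groups.
PHONE_RE = re.compile(
    r"^(?:1-?)?(?:\(1?(\d{3})\)|(\d{3}))[-\s]?(\d{3})[-\s]?(\d{4})$"
    r"|^(\d{3})[-\s]?(\d{4})$"
)


def normalize(number: str) -> str:
    """Return the number as XXX-XXX-XXXX, or unchanged if it is not recognized."""
    m = PHONE_RE.match(number.strip())
    if m is None:
        return number
    paren_area, bare_area, prefix, line, short_prefix, short_line = m.groups()
    if short_prefix is not None:
        # 7-digit branch: prepend the default area code.
        return f"{DEFAULT_AREA_CODE}-{short_prefix}-{short_line}"
    # 10/11-digit branch: exactly one of the two area-code groups matched.
    return f"{paren_area or bare_area}-{prefix}-{line}"
```

Calling `normalize` on each value in the Current column of the sample table should give the Expected column, with 555 in place of XXX. A value that matches neither branch, such as a ZIP+4 code, is returned as it was.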

1 Comment

To change the format to (123) 456-7890 use this replace string (?1\($1\) $2-$3:XXX-$4-$5)
1

I'm not familiar with PowerShell, but yes, it would be a good idea to write a small script to do this for you.

For the Notepad++ approach, though, I'd try running the replace twice:

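  1. FIND: (^|,)(\d{3})[ -]?(\d{4})(?=,|$)

    REPLACE: \1XXX-\2-\3 where the XXX is your input area code. The leading comma is captured and put back, and the trailing comma is only checked with a lookahead, so the CSV delimiters survive the replace.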

  2. FIND: \(?1?\(?(\d{3})\)?[ -]?(\d{3})[ -]?(\d{4})

    REPLACE: \1-\2-\3

I don't think the order matters. Try it out in a test file first.

I'm not sure what you mean by your second question. Are the regexes selecting numbers from the wrong column in the CSV? If so, that's another reason why a script would be better.
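
To give an idea of what such a script could look like, here is a minimal Python sketch that only touches one named column, so ZIP+4 codes in other columns are never considered. It takes a different route from the regexes above: it strips everything that is not a digit and then decides by length. The column name `phone`, the default area code `555`, and the function names are assumptions for the example, not anything taken from the actual file.

```python
import csv
import io
import re

# Hypothetical default area code, standing in for XXX.
DEFAULT_AREA_CODE = "555"


def normalize_phone(value: str) -> str:
    """Format 7-, 10- and 11-digit numbers as XXX-XXX-XXXX; leave others alone."""
    digits = re.sub(r"\D", "", value)
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # drop the leading country code
    if len(digits) == 7:
        digits = DEFAULT_AREA_CODE + digits  # add the default area code
    if len(digits) != 10:
        return value  # not something we know how to format
    return f"{digits[:3]}-{digits[3:6]}-{digits[6:]}"


def normalize_csv(text: str, column: str) -> str:
    """Rewrite only the given column of a CSV document that has a header row."""
    reader = csv.DictReader(io.StringIO(text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row[column] = normalize_phone(row[column] or "")
        writer.writerow(row)
    return out.getvalue()
```

For a real file, the text would come from opening the CSV with `newline=""` as the `csv` module documentation recommends, and the result would be written back out the same way. Because the script can be rerun whenever the directory changes, it also covers the "not a one-off" requirement.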
