In Python, I'm writing code to extract the letters from an alphanumeric string. The code should extract only the letter groups and print them in the following format.

The input is given in the form IND1234AUS1234 (i.e. the groups of letters are separated by a few digits).

For the above input, the code should extract the strings IND and AUS and print IND to AUS.

The input must not be given in any other format. If it is, the code should print invalid input. (Examples of wrongly formatted inputs are 1234INDAUS, IND1234, 123IND123AUS, INDAUS1234.)

Below is the code I have tried. It extracts the letters, but I don't know how to separate them and print IND to AUS.

My program prints only INDAUS.

test_string = input()
only_alpha = ""
for char in test_string:
    if char.isalpha():
        only_alpha += char
print(only_alpha)

Please help me with the code. I don't know how to write the code that rejects the invalid inputs mentioned above.

  • You can use re.findall(r"[A-Z]+", your_string) Commented May 3, 2020 at 14:22
  • Is there any constraint on the number of letters and digits? Commented May 3, 2020 at 14:28
  • No. There are no constraints on the number of letters and digits. It's letters followed by digits, and again letters followed by digits. Commented May 3, 2020 at 15:12
  • Preferably at least 2 or 3 letters, with at least one digit after each group of letters, like IND12AUS1234. Commented May 3, 2020 at 15:17
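
The re.findall suggestion from the first comment can be sketched as follows, assuming uppercase-only letters (the name your_string comes from the comment; the sample values are illustrative). Note that findall on its own extracts the letter runs but does not reject a malformed input, so a format check is still needed:

```python
import re

your_string = "IND1234AUS1234"

# findall returns every run of uppercase letters, in order of appearance.
parts = re.findall(r"[A-Z]+", your_string)
print(" to ".join(parts))  # IND to AUS

# Caveat: an invalid input such as "INDAUS1234" still yields a result,
# so findall alone cannot be used to validate the format.
print(re.findall(r"[A-Z]+", "INDAUS1234"))  # ['INDAUS']
```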

2 Answers

You should use a regex to

  1. validate the content: [A-Z]+\d+[A-Z]+\d+ (uppercase letters followed by digits, twice)
  2. retrieve the uppercase content at the same time: ([A-Z]+)\d+([A-Z]+)\d+ (the parentheses create capturing groups, whose content you get with .groups())

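import re

def extract(value):
    # fullmatch requires the whole input (surrounding whitespace stripped)
    # to be letters, digits, letters, digits
    m = re.fullmatch(r"([A-Z]+)\d+([A-Z]+)\d+", value.strip())
    if m:
        return " to ".join(m.groups())
    return "Invalid Input"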

Testing

value = "IND1234AUS1234"
res = extract(value)
print(res)  # IND to AUS

value = "INDAUS1234"
res = extract(value)
print(res)  # Invalid Input
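
One detail worth checking is where the pattern is allowed to match: re.search accepts a match anywhere in the string, while re.fullmatch requires the entire string to fit. The sketch below restates the pattern as PATTERN (a name chosen here for illustration) so it runs on its own. It confirms that the four invalid examples from the question are rejected either way, and uses one extra, made-up input with stray leading digits to show the difference:

```python
import re

PATTERN = r"([A-Z]+)\d+([A-Z]+)\d+"

# The invalid examples from the question: none contains
# letters-digits-letters-digits, so neither call finds a match.
for bad in ["1234INDAUS", "IND1234", "123IND123AUS", "INDAUS1234"]:
    assert re.search(PATTERN, bad) is None
    assert re.fullmatch(PATTERN, bad) is None

# A made-up input with stray leading digits shows the difference.
s = "12IND1234AUS1234"
print(bool(re.search(PATTERN, s)))     # True
print(bool(re.fullmatch(PATTERN, s)))  # False
```

If inputs like the last one should count as invalid, matching the whole string is the safer choice.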

A regex that checks the length of each part would be

"([A-Z]{2,5})\d{1,5}([A-Z]{2,5})\d{1,5}"
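
As a rough sketch of how the length-checking pattern could be used: extract_bounded and BOUNDED are hypothetical names, and matching the whole string with fullmatch is an assumption made here. The bounds (2 to 5 letters, 1 to 5 digits) are the ones in the pattern above:

```python
import re

# Bounds taken from the pattern above: 2-5 letters, 1-5 digits, twice.
BOUNDED = re.compile(r"([A-Z]{2,5})\d{1,5}([A-Z]{2,5})\d{1,5}")

def extract_bounded(value):
    # Hypothetical helper: fullmatch requires the whole string to fit.
    m = BOUNDED.fullmatch(value)
    return " to ".join(m.groups()) if m else "Invalid Input"

print(extract_bounded("IND12AUS1234"))   # IND to AUS
print(extract_bounded("I1234AUS1234"))   # Invalid Input (only one letter)
print(extract_bounded("IND123456AUS1"))  # Invalid Input (six digits)
```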

2 Comments

@azro For invalid input just print Invalid input.
@AshokKumar Sorry to say it, but this is very basic code, so don't call me a genius ;) Also, you have to understand it: why it returns the right string in one case and 'invalid' in the other. I've also edited the answer to add, at the end, a regex that checks the length of each part.

This looks like a perfect way to use regex:

import re

def validate(a):
    match = re.fullmatch("([A-Z]+)[0-9]+([A-Z]+)[0-9]+", a)  # whole string must fit
    if not match:
        raise ValueError()
    else:
        return match[1] + " to " + match[2]

>>> validate("IND1234AUS1234")
'IND to AUS'
>>> validate("USD1234EUR1234")
'USD to EUR'
>>> validate("123IND123AUS")
Traceback (most recent call last):
  File "<pyshell#23>", line 1, in <module>
    validate("123IND123AUS")
  File "<pyshell#20>", line 4, in validate
    raise ValueError()
ValueError

Quick explanation: "([A-Z]+)[0-9]+([A-Z]+)[0-9]+"

  • ([A-Z]+): One or more A-Z characters, in parentheses so we can index them later
  • [0-9]+: One or more digits
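
To make "index them later" concrete, here is a small sketch of how the captured groups can be read back from a match object. It uses re.fullmatch and a sample input chosen for illustration; subscripting a match object as m[1] needs Python 3.6 or newer:

```python
import re

m = re.fullmatch(r"([A-Z]+)[0-9]+([A-Z]+)[0-9]+", "USD1234EUR1234")

print(m[0])        # USD1234EUR1234 (index 0 is the whole match)
print(m[1], m[2])  # USD EUR (groups are numbered from 1, left to right)
print(m.groups())  # ('USD', 'EUR')
```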

2 Comments

You should raise ValueError instead of SyntaxError
@Ch3steR Now changed
