I need to extract data from an unstructured string coming from an SMS.
The data I need to extract is the following:
Code: a 5-character alphanumeric string that must contain at least one digit.
Identity document: a number of 5 to 8 digits, optionally prefixed with V or E and optionally grouped with spaces. Valid formats are:
V55555555
E55555555
55555
55 555
E55 555 555
55 555 555
5 555 555
555 555
The data I need to extract could be at any position in the string. I have normalized the string by collapsing duplicate spaces into a single space and deleting anything that is not a space, digit, or letter.
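To make the normalization step concrete, here is a minimal sketch of what it might look like, assuming it is done with `preg_replace`. The function name `normalizeSms` is hypothetical and only used for illustration; the character classes follow the description above (keep letters, digits, and spaces; collapse repeated spaces).

```php
<?php
// Hypothetical helper: the name and exact patterns are assumptions
// that mirror the normalization described in the text.
function normalizeSms(string $sms): string
{
    // Delete everything that is not a letter, a digit, or a space.
    $clean = preg_replace('/[^A-Za-z0-9 ]+/', '', $sms);

    // Collapse runs of two or more spaces into a single space.
    return preg_replace('/ {2,}/', ' ', $clean);
}

echo normalizeSms('resuelvete 15C20  Pedro Perez c.i. V55.555.555,'), PHP_EOL;
// prints: resuelvete 15C20 Pedro Perez ci V55555555
```

Note that deleting the dots, rather than replacing them with spaces, turns `V55.555.555` into `V55555555`, so after this step the document number appears either as one run of digits or in space-separated groups.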
Samples
1. resuelvete 15C20 Pedro Perez c.i. V55.555.555,
2. Pedro Perez resuelvete 15c20 55 555 555,
3. 15c20 Resuelvete 555555 Pedro Perez,
4. Resuelvete 555555 Pedro Perez 15c20
For the code part I've tried this regex:
$regex = '/([a-zA-Z0-9]{5})/i';
I also tried this, which I found here (I must say I don't fully understand this regex):
$regex = '(?=.{5})(?=.*[A-Z])(?=.*[a-z])(?=.*\d)[a-zA-Z\d]';
But it's not working: it returns the first five characters of the string. In these examples I need it to return 15c20.
For the Identity document part I've tried the following:
// This does not work when the number contains spaces
$regex = "/(V|E)?(\d{5,8})/i";
// This does not work when the number has no spaces.
// It fails in case 1, returning only 7 digits instead of 8.
// It also fails in cases 3 and 4, where it does not match anything.
$regex = "/(V|E)?(\d{1,2}? ?\d{3} ?\d{3})/i";