I am having real trouble reading a large text file (around 12 MB) with PHP. I have to match one regex, then search backwards from that match for the first occurrence of another regex, and extract the string between the two matches. Here is a real example:
PROCESSO:583.00.2012.105981
No ORDEM:01.19.2012/000154
CLASSE:PROCEDIMENTO SUMÁRIO (EM GERAL)
REQUERENTE:ASSETJ ASSOCIAÇÃO DOS SERVIDORES DO TRIBUNAL DE JUSTIÇA DO ESTADO DE SÃO PAULO
ADVOGADO:273919/SP - THIAGO PUGINA
Requerido:TIM CELULAR S/A E OUTRO
VARA:19a. VARA CÍVEL
PROCESSO:583.00.2012.105970
No ORDEM:01.07.2012/000134
CLASSE:PROCEDIMENTO ORDINÁRIO (EM GERAL)
REQUERENTE:CARLOS NEUMANN
ADVOGADO:79117/SP - ROSANA CHIAVASSA
Requerido:SUL AMÉRICA SEGURO SAÚDE S/A
VARA:7a. VARA CÍVEL
The script should find this code: 273919/SP (regex: [0-9]{6}/SP), then search backwards for this code: 583.00.2012.105981 (regex: [0-9]{3}\.[0-9]{2}\.[0-9]{4}\.[0-9]{6})
and then get all the text between them.
I can't do a single preg_match with both regexes in the same pattern, because some blocks in the file contain more than one code like 273919/SP, and that would throw off the match.
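The backward search described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `extractBackwards` is a hypothetical helper name, the whole file is assumed to fit in memory (12 MB normally does), and the lawyer-code pattern is loosened to `\d{5,6}` because the sample contains the five-digit code 79117/SP. Both patterns are collected with their byte offsets, and each code is paired with the nearest process number that precedes it.

```php
<?php
/**
 * Hypothetical helper (not from the original post): for every match of
 * $endRe, find the nearest preceding match of $startRe and return the text
 * from the start of that match through the end of the $endRe match.
 */
function extractBackwards(string $text, string $startRe, string $endRe): array
{
    // PREG_OFFSET_CAPTURE yields [matchedText, byteOffset] pairs, in file order.
    preg_match_all($startRe, $text, $starts, PREG_OFFSET_CAPTURE);
    preg_match_all($endRe, $text, $ends, PREG_OFFSET_CAPTURE);

    $results = [];
    $i = 0;
    $n = count($starts[0]);
    foreach ($ends[0] as [$endText, $endPos]) {
        // Advance to the last start match located before this end match.
        while ($i + 1 < $n && $starts[0][$i + 1][1] < $endPos) {
            $i++;
        }
        if ($n === 0 || $starts[0][$i][1] > $endPos) {
            continue; // no process number precedes this code
        }
        $startPos = $starts[0][$i][1];
        $results[] = substr($text, $startPos, $endPos + strlen($endText) - $startPos);
    }
    return $results;
}

// Sample taken from the two records shown above.
$sample = <<<'TXT'
PROCESSO:583.00.2012.105981
No ORDEM:01.19.2012/000154
CLASSE:PROCEDIMENTO SUMÁRIO (EM GERAL)
ADVOGADO:273919/SP - THIAGO PUGINA
Requerido:TIM CELULAR S/A E OUTRO
PROCESSO:583.00.2012.105970
No ORDEM:01.07.2012/000134
ADVOGADO:79117/SP - ROSANA CHIAVASSA
VARA:7a. VARA CÍVEL
TXT;

// Assumption: codes may have five or six digits (79117/SP has five).
$results = extractBackwards(
    $sample,
    '~\d{3}\.\d{2}\.\d{4}\.\d{6}~',
    '~\d{5,6}/SP~'
);

foreach ($results as $chunk) {
    echo $chunk, "\n---\n";
}
```

If a block holds several codes, this sketch returns one chunk per code, all starting at the same process number, so keeping the first or the last chunk per process number is a separate decision. For the real file, `$sample` would be replaced by the result of `file_get_contents()` on the file's path.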
What can I do? Do you have any ideas?
Sorry if my regex is crappy, I am new at it and it is very difficult to learn :P
EDIT:
Here is another form in which the code appears:
583.00.2012.100905-6/000000-000 - no ordem 82/2012 - Procedimento Sumário (em geral) - JOSE APARECIDO DOS
SANTOS X SEGURADORA LIDER DOS CONSORCIOS DO SEGUROS DPVAT S/A - Fls. 79 - Demonstre o autor, por meio
de documento idôneo (declaração de bens e renda e comprovante de pagamento), a necessidade de obtenção do benefício
da justiça gratuita, a fim de ser cumprido o disposto no artigo 5o, LXXIV da CF. Após, tornem os autos conclusos. Int. - ADV
GUILHERME DIAS GONÇALVES OAB/SP 302632 - ADV TIAGO RAFAEL OLIVEIRA ALEGRE OAB/SP 302811
That is my problem. Now I have two occurrences, OAB/SP 302632 and OAB/SP 302811, and I need to take the last one and extract the text between the id 583.00.2012.100905-6/000000-000 and OAB/SP 302811.
Those numbers aren't fixed, so I can't search for the literal OAB/SP 302811; I have to match the pattern OAB\/SP\s\d{6} instead.
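For this second layout, one possible approach (a sketch, not a definitive solution) is to cut the text into records at every process id and then keep everything up to the last OAB/SP number inside each record. The function name `extractUpToLastOab` is hypothetical, the id pattern is inferred from the single example above, and the second record in the sample is made up purely to show that records stay separate.

```php
<?php
/**
 * Hypothetical helper (not from the original post): returns an array keyed
 * by process id, each value running from the id through the last OAB/SP
 * number found in that record.
 */
function extractUpToLastOab(string $text): array
{
    // Assumption: ids look like 583.00.2012.100905-6/000000-000.
    $idRe  = '\d{3}\.\d{2}\.\d{4}\.\d{6}-\d/\d{6}-\d{3}';
    $oabRe = '~OAB/SP\s\d{6}~'; // \s also covers a line break after "OAB/SP"

    // Zero-width lookahead: split right before each id, keeping the id itself.
    $records = preg_split('~(?=' . $idRe . ')~', $text, -1, PREG_SPLIT_NO_EMPTY);

    $out = [];
    foreach ($records as $record) {
        if (!preg_match('~\A' . $idRe . '~', $record, $id)) {
            continue; // text that precedes the first process id
        }
        if (!preg_match_all($oabRe, $record, $m, PREG_OFFSET_CAPTURE)) {
            continue; // record without any OAB/SP number
        }
        // The last element is the last occurrence inside this record.
        [$lastText, $lastPos] = end($m[0]);
        $out[$id[0]] = substr($record, 0, $lastPos + strlen($lastText));
    }
    return $out;
}

$sample = <<<'TXT'
583.00.2012.100905-6/000000-000 - no ordem 82/2012 - Procedimento Sumário (em geral) - JOSE APARECIDO DOS
SANTOS X SEGURADORA LIDER DOS CONSORCIOS DO SEGUROS DPVAT S/A - Fls. 79 - Demonstre o autor, por meio
de documento idôneo (declaração de bens e renda e comprovante de pagamento), a necessidade de obtenção do benefício
da justiça gratuita, a fim de ser cumprido o disposto no artigo 5o, LXXIV da CF. Após, tornem os autos conclusos. Int. - ADV
GUILHERME DIAS GONÇALVES OAB/SP 302632 - ADV TIAGO RAFAEL OLIVEIRA ALEGRE OAB/SP 302811
583.00.2012.000000-0/000000-000 - made-up second record, not from the post - ADV FULANO OAB/SP 000000
TXT;

$records = extractUpToLastOab($sample);

foreach ($records as $id => $chunk) {
    echo $id, ' => ', strlen($chunk), " bytes\n";
}
```

Splitting first keeps each search confined to one record, so a greedy match cannot run across the next process id, and picking the last match with `end()` avoids having to express "last occurrence" inside the regex itself.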
AVOCADO: and PROSECCO: keys? Or do you need to extract a single block only? Have you tried using the search strings in the natural order with .*? in between?
273919/SP do you want to match up to the first or last 273919/SP?