So I'm working on a script that matches certain patterns in some documents. It works about nine times out of ten, but sometimes it simply fails to match. I've checked the pattern in Regex101 and RegExr, and it should work, but it doesn't.
const ENDTIMEREGEX = /(Hora|Horario) de (cierre|finalización): (\d\d(:|\.)\d\d)/;
Example of the text I'm having an issue with:
- NOTIFICAR a las partes en este acto, a las víctimas por intermedio de la Fiscalía, a la OCSPP una vez que venza el plazo para recurrir y, oportunamente, COMUNICAR.
Hora de cierre: 11:57 horas
PALABRAS CLAVE: resolucion_interlocutoria suspension_del_proceso_a_prueba_reanuda_plazos
Other times I'm trying to match something that's almost exactly the same, and it just works:
Hora de cierre: 13:13 horas.
That case works, but I don't see any difference between the two, or any reason why one would match and the other wouldn't.
It should work! I've even tried matching just the exact text, and it still won't work for some reason. (And yes, I've looked for similar questions, but I haven't found any with my particular problem.)
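Since the two lines look identical on screen, one thing worth ruling out is a character that prints like a plain space but isn't one. Below is a minimal sketch of how that could be checked. `findUnusualChars` is a hypothetical helper (not part of my script), and the non-breaking space in `tricky` is only an assumed example of such a character, not something I have confirmed is in my documents.

```javascript
// Hypothetical helper: list every character outside printable ASCII
// (control characters, non-breaking spaces, accented letters, ...)
// together with its index and code point.
function findUnusualChars(text) {
  const found = [];
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    if (code < 0x20 || code > 0x7e) {
      const hex = code.toString(16).toUpperCase().padStart(4, '0');
      found.push({ index: i, code: 'U+' + hex });
    }
  }
  return found;
}

// Two strings that print identically. The second one uses a
// non-breaking space (U+00A0) after "Hora", purely as an illustration.
const plain = 'Hora de cierre: 11:57 horas';
const tricky = 'Hora\u00A0de cierre: 11:57 horas';

console.log(findUnusualChars(plain));  // []
console.log(findUnusualChars(tricky)); // [ { index: 4, code: 'U+00A0' } ]
```

Run on the real document text, this would also report legitimate characters such as line breaks and the "ó" in "finalización", so only the entries inside the "Hora de cierre" line are interesting.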
Edit:
https://regex101.com/r/BYjNgV/1
function copiarResos() {
  let ass = SpreadsheetApp.getActiveSpreadsheet();
  let maxRows = ass.getSheetByName('set_de_datos_unificado').getLastRow();
  // Data starts on row 2, so there are maxRows - 1 rows to read.
  let columnaResos = ass.getSheetByName('set_de_datos_unificado').getSheetValues(2, 2, maxRows - 1, 1);
  let columnaLinks = ass.getSheetByName('set_de_datos_unificado').getSheetValues(2, 53, maxRows - 1, 1);
  let folder = DriveApp.getFolderById('folderID'); // I change the folder ID here
  let list = []; // flat list: [name, id, name, id, ...]
  let files = folder.getFilesByType(MimeType.OPENDOCUMENT_TEXT);
  let match = '';
  const ENDTIMEREGEX = /(Hora|Horario) de (cierre|finalización): (\d\d(:|\.)\d\d)/i;
  let testText = "3. NOTIFICAR a las partes en este acto, a las víctimas por intermedio de la Fiscalía, a la OCSPP una vez que venza el plazo para recurrir y, oportunamente, COMUNICAR. \n Hora de cierre: 11:57 horas \n PALABRAS CLAVE: resolucion_interlocutoria suspension_del_proceso_a_prueba_reanuda_plazos";
  while (files.hasNext()) {
    let file = files.next();
    list.push(file.getName().toString(), file.getId().toString());
  }
  for (let i = 0; i < columnaResos.length; i++) {
    // getSheetValues returns a 2D array, so each row is a one-element array.
    let substring = String(columnaResos[i][0]);
    if (columnaLinks[i][0] == '' && substring !== '') {
      match = list.find(element => element.includes(substring));
      let matchId = list.indexOf(match);
      // indexOf returns -1 (not 0) when nothing was found.
      if (matchId !== -1) {
        let idString = list[matchId + 1];
        let texto = driveTing(idString); // load the text into a variable
        console.log(texto);
        let endTimeArr = texto.match(ENDTIMEREGEX);
        if (endTimeArr == null) {
          console.log(endTimeArr);
        } else {
          let endTime = endTimeArr[0];
          console.log(endTimeArr);
          //endTime = endTime.replace("Horario de cierre: ","").replace("Hora de finalización: ", "").replace("Hora de cierre: ", "");
          //let endTimeInsert = ass.getSheetByName('set_de_datos_unificado').getRange(i+2 ,51);
          //endTimeInsert.setValue(endTime);
        }
      }
    }
    match = '';
  }
}
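function driveTing(fileId) {
  // Returns the document's text (header plus body) as a string.
  // The file is copied as a temporary Google Doc so DocumentApp can open it.
  const id = Drive.Files.copy({title: "temp", mimeType: MimeType.GOOGLE_DOCS}, fileId).id;
  const doc = DocumentApp.openById(id);
  // getHeader() returns null when the document has no header section.
  const headerSection = doc.getHeader();
  const header = headerSection ? headerSection.getText() : '';
  const text = header + doc.getBody().getText();
  DriveApp.getFileById(id).setTrashed(true);
  return text;
}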
While not reproducible per se, this is an abridged version of the script I wrote. I get a list of files, then get the text of the ones I care about and try to match the pattern. The issue seems to be somewhere between getting the text from Google Drive and running the regex, but I'm not entirely sure. When I try testText.match(ENDTIMEREGEX), it works, and as far as I can tell it's the same string I'm getting from driveTing(fileId).
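If the text coming back from Drive really does differ from testText in its whitespace or in how the accented "ó" is encoded, a more tolerant version of the pattern could be tried. This is only a sketch under that assumption, not a confirmed fix. `TOLERANT_ENDTIMEREGEX` and `extractEndTime` are hypothetical names, and it assumes the runtime supports `String.prototype.normalize` (the V8 runtime should).

```javascript
// Same idea as ENDTIMEREGEX, but with \s+ instead of literal spaces.
// In JavaScript, \s also matches U+00A0 (non-breaking space) and other
// Unicode space characters. \u00F3 is the precomposed "ó".
const TOLERANT_ENDTIMEREGEX =
  /(Hora|Horario)\s+de\s+(cierre|finalizaci\u00F3n)\s*:\s*(\d\d[:.]\d\d)/i;

// Hypothetical helper: returns the time ("11:57", "09.30") or null.
function extractEndTime(text) {
  // NFC turns "o" + combining accent (U+0301) into the single "ó"
  // character that the pattern expects.
  const normalized = text.normalize('NFC');
  const m = normalized.match(TOLERANT_ENDTIMEREGEX);
  return m ? m[3] : null;
}

console.log(extractEndTime('Hora de cierre: 11:57 horas'));            // 11:57
console.log(extractEndTime('Hora\u00A0de cierre:\u00A011:57 horas'));  // 11:57
console.log(extractEndTime('Hora de finalizacio\u0301n: 09.30'));      // 09.30
console.log(extractEndTime('PALABRAS CLAVE: resolucion'));             // null
```

Note that `\s` does not cover every invisible character (a zero-width space, U+200B, for instance), so a failure with this pattern would still not rule out an encoding problem.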