I'm working on a script that matches certain patterns in some documents. It works about 9 times out of 10, but sometimes the match just fails. I've checked the pattern in Regex101 and RegExr and it should work, but it doesn't.

  const ENDTIMEREGEX = /(Hora|Horario) de (cierre|finalización): (\d\d(:|.)\d\d)/ ;

Example of the text I'm having an issue with:

  1. NOTIFICAR a las partes en este acto, a las víctimas por intermedio de la Fiscalía, a la OCSPP una vez que venza el plazo para recurrir y, oportunamente, COMUNICAR.

Hora de cierre: 11:57 horas

PALABRAS CLAVE: resolucion_interlocutoria suspension_del_proceso_a_prueba_reanuda_plazos

Other times I'm trying to match something that's almost exactly the same, and it just works:

Hora de cierre: 13:13 horas.

That case works, but I don't see any difference between the two, or any reason why one would match and the other wouldn't.

It should work! I've even tried matching just the exact text and it still fails. (And yes, I've looked for similar questions, but I haven't found any with my particular problem.)

Edit:

https://regex101.com/r/BYjNgV/1

function copiarResos() {
  let ass = SpreadsheetApp.getActiveSpreadsheet();
  let maxRows = ass.getSheetByName('set_de_datos_unificado').getLastRow();
  let columnaResos = ass.getSheetByName('set_de_datos_unificado').getSheetValues(2,2,maxRows,1);
  let columnaLinks = ass.getSheetByName('set_de_datos_unificado').getSheetValues(2,53,maxRows,1);
  let folder = DriveApp.getFolderById('folderID'); // I change the folder ID  here 
  let list = [];
  let files = folder.getFilesByType(MimeType.OPENDOCUMENT_TEXT);
  let match = '';
  const ENDTIMEREGEX = /(Hora|Horario) de (cierre|finalización): (\d\d(:|.)\d\d)/i ;
  let testText = "3. NOTIFICAR a las partes en este acto, a las víctimas por intermedio de la Fiscalía, a la OCSPP una vez que venza el plazo para recurrir y, oportunamente, COMUNICAR. /n Hora de cierre: 11:57 horas /n PALABRAS CLAVE: resolucion_interlocutoria  suspension_del_proceso_a_prueba_reanuda_plazos"
  while (files.hasNext()){
    file = files.next();
    list.push(file.getName().toString(), file.getId().toString());

  }
  for (let i = 0; i in columnaResos; i++){
    if(columnaLinks[i] == ''){
      let substring = columnaResos[i];
      match = list.find(element => {
        if (element.includes(substring)) {
          return true;
        } 
      });
      matchId = list.indexOf(match);
      if(matchId !== 0){
        let idString = list[matchId+1];
        let texto = driveTing(idString); //cargamos el texto en una variable
        console.log(texto)
        let endTimeArr = texto.match(ENDTIMEREGEX);
        if(endTimeArr == null){
          console.log(endTimeArr)
        }else {
          let endTime = endTimeArr[0];
          console.log(endTimeArr)
          //endTime = endTime.replace("Horario de cierre: ","").replace("Hora de finalización: ", "").replace("Hora de cierre: ", "");
          //let endTimeInsert = ass.getSheetByName('set_de_datos_unificado').getRange(i+2 ,51);
          //endTimeInsert.setValue(endTime);
        }        
      }
    }
    match = '';
  };
}
function driveTing(fileId) {
  //esta funcion nos devuelve el documento, en formato string
  const id = Drive.Files.copy({title: "temp", mimeType: MimeType.GOOGLE_DOCS}, fileId).id;
  const doc = DocumentApp.openById(id);
  const header = doc.getHeader().getText();
  const text = header + doc.getBody().getText();
  DriveApp.getFileById(id).setTrashed(true)
  return text;
}

While not reproducible per se, this is an abridged version of the script I wrote. I'm getting a list of files, then getting the text of the ones I care about and trying to match against it. The issue seems to be somewhere between getting the text from Google Drive and applying the regex, but I'm not entirely sure. When I run testText.match it works, and as far as I can tell it's the same string I'm getting from driveTing(fileId).
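One way to check whether the two strings really are the same is to list any characters outside printable ASCII near the line that fails to match. The sketch below is only a diagnostic under an assumption: that the text returned by driveTing contains some look-alike character, such as a no-break space (U+00A0), where the pattern expects a plain space. The helper name findUnusualChars and the sample string are made up for illustration; they are not part of the original script.

```javascript
// Hypothetical helper (not in the original script): reports every UTF-16
// code unit outside printable ASCII, with its index and hex value.
function findUnusualChars(text) {
  const found = [];
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    if (code < 0x20 || code > 0x7e) {
      found.push({
        index: i,
        code: 'U+' + code.toString(16).toUpperCase().padStart(4, '0'),
      });
    }
  }
  return found;
}

// Made-up sample: the space after the colon is a no-break space (U+00A0),
// which looks identical to a normal space when logged.
const sample = 'Hora de cierre:\u00A011:57 horas';
console.log(findUnusualChars(sample));
```

In the script, this could be run on a short slice of texto around texto.indexOf('cierre') instead of the whole document. Accented letters such as í and ó will also be listed, which is expected; the interesting entries are whitespace-like characters that a literal space in the regex would not match.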

  • The pattern should be working in all cases. Paste your Regex101 link here and we can go from there. Commented Jul 11, 2022 at 15:00
  • Provide your script as well in a minimal reproducible example Commented Jul 11, 2022 at 15:00
  • @TheMaster I'm having a bit of trouble producing a minimal reproducible example, since the issue apparently has something to do with the text I'm getting back from a function I wrote: if I just match against a string literal with the same text, it works. That function also accesses Google Drive. Commented Jul 11, 2022 at 15:24
  • If you want multiple matches, you have to use the global flag /g as well. See regex101.com/r/9ei08o/1 Commented Jul 11, 2022 at 15:40
  • @Thefourthbird my bad, just removed that second Hora de cierre: 13:13 horas. Commented Jul 11, 2022 at 15:42

1 Answer

Try

function lfunko() {
  let s = `NOTIFICAR a las partes en este acto, a las víctimas por intermedio de la Fiscalía, a la OCSPP una vez que venza el plazo para recurrir y, oportunamente, COMUNICAR.
Hora de cierre: 11:57 horas

PALABRAS CLAVE: resolucion_interlocutoria suspension_del_proceso_a_prueba_reanuda_plazos`;
  Logger.log(s.match(/(Hora|Horario) de (cierre|finalización): (\d{1,2}(:|.)\d{1,2})/gm));
}

Execution log
10:10:35 AM Notice  Execution started
10:10:34 AM Info    [Hora de cierre: 11:57]
10:10:36 AM Notice  Execution completed
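
If the literal string matches but the text read from the document does not, one possible cause (an assumption, not something confirmed in the question) is that the document uses a different whitespace character, or a decomposed accent in "finalización", where the pattern has a plain space or a precomposed ó. A more tolerant variant, sketched here with the hypothetical names END_TIME_TOLERANT and extractEndTime, uses \s (which also matches U+00A0 in JavaScript), normalizes the text to NFC first, and replaces (:|.) with [:.] so that only a colon or a literal dot is accepted between the digits:

```javascript
// Hypothetical tolerant pattern (not the asker's or answerer's original):
// \s matches ordinary spaces, no-break spaces and line breaks;
// [:.] accepts only ":" or a literal "." between hours and minutes.
const END_TIME_TOLERANT =
  /(Hora|Horario)\s+de\s+(cierre|finalizaci\u00f3n)\s*:\s*(\d{1,2}[:.]\d{2})/i;

// Returns just the time (third capture group), or null when nothing matches.
function extractEndTime(text) {
  // NFC folds "o" + combining acute accent into the single character "ó".
  const m = text.normalize('NFC').match(END_TIME_TOLERANT);
  return m ? m[3] : null;
}

console.log(extractEndTime('Hora de cierre: 13:13 horas.'));          // "13:13"
console.log(extractEndTime('Hora\u00A0de cierre:\u00A011:57 horas')); // "11:57"
```

Returning the capture group directly also avoids the chain of replace calls that is commented out in the question's script.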