I'm writing a script that converts a Word document into a different format. The rest of the conversion works, but I'm stuck on the list numbering and bullets, which I'd like to keep. I've tried two approaches, and neither does what I need:

  • With python-docx, I haven't been able to recover the number or bullet attached to a list item; it only tells me that the text is in a list.

  • With pywin32, I reproduce what we would do manually with the "Keep Text Only" paste option, and the problem is in the name itself: the formatting is dropped. I need the text formatting and the boxes it contains to be preserved.
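
Some background on the first attempt may help. In a .docx file the visible "1." or bullet is not stored as text: each list paragraph carries a `w:numPr` element with a level (`w:ilvl`) and a numbering id (`w:numId`), and Word computes the label from the numbering definitions. The following is a minimal sketch using only the standard library, assuming the input is the raw `word/document.xml` content; `list_info` and `SAMPLE` are hypothetical names, and the sample XML is made up for illustration. Numbering inherited from a paragraph style is not covered here.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def list_info(document_xml: str):
    """Return (numId, ilvl, text) per paragraph; numId and ilvl are None outside lists."""
    root = ET.fromstring(document_xml)
    rows = []
    for p in root.iter(f"{{{W}}}p"):
        num_id = ilvl = None
        num_pr = p.find("w:pPr/w:numPr", NS)
        if num_pr is not None:
            num_id_el = num_pr.find("w:numId", NS)
            ilvl_el = num_pr.find("w:ilvl", NS)
            if num_id_el is not None:
                num_id = int(num_id_el.get(f"{{{W}}}val"))
            # A missing w:ilvl is treated as the top level (0) in this sketch
            ilvl = int(ilvl_el.get(f"{{{W}}}val")) if ilvl_el is not None else 0
        text = "".join(t.text or "" for t in p.iter(f"{{{W}}}t"))
        rows.append((num_id, ilvl, text))
    return rows


# Made-up document body: one list paragraph and one plain paragraph
SAMPLE = (
    f'<w:document xmlns:w="{W}"><w:body>'
    '<w:p><w:pPr><w:numPr><w:ilvl w:val="0"/><w:numId w:val="1"/></w:numPr></w:pPr>'
    '<w:r><w:t>ASDASDADDADAD</w:t></w:r></w:p>'
    '<w:p><w:r><w:t>Plain paragraph</w:t></w:r></w:p>'
    '</w:body></w:document>'
)

print(list_info(SAMPLE))
```

This shows why a library that only reads paragraphs reports "this is a list item" without a label: turning `numId` and `ilvl` into "1." also requires the definitions in `word/numbering.xml` and a running counter per list.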

I need everything else to stay exactly as it is; only the automatic numbers or bullets of the list format should be converted to literal text.

For example:

INPUT (automatic list numbering)

  1. ASDASDADDADAD

  2. ADDDDDDdfsD

OUTPUT (numbers typed as plain text)

1. ASDASDADDADAD

2. ADDDDDDdfsD
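
Put as code, the goal looks roughly like this. It is a minimal sketch, assuming each paragraph has already been reduced to a (label, text) pair; in Word's COM object model the rendered label should be available as `Range.ListFormat.ListString`. The function name `flatten_list_labels` and the pair layout are hypothetical. The sketch produces plain strings only, so it illustrates the target output, not a way to keep formatting.

```python
from typing import Iterable, List, Tuple


def flatten_list_labels(paragraphs: Iterable[Tuple[str, str]]) -> List[str]:
    """Turn (label, text) pairs into lines where the label is ordinary text.

    An empty label means the paragraph is not part of a list.
    """
    lines = []
    for label, text in paragraphs:
        # Word paragraph text ends with a paragraph mark ("\r"); drop it
        text = text.rstrip("\r\n")
        lines.append(f"{label} {text}" if label else text)
    return lines


print(flatten_list_labels([("1.", "ASDASDADDADAD\r"), ("2.", "ADDDDDDdfsD\r")]))
```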

import os
from pathlib import Path
import win32com.client as win32
from win32com.client import constants


def is_list_paragraph(paragraph) -> bool:
    """
    Return True if the paragraph is part of a list (bulleted or numbered).
    """
    # WdListType has no wdListNoList member; wdListNoNumbering (0) means "not in a list".
    return paragraph.Range.ListFormat.ListType != constants.wdListNoNumbering


def main():
    origen = Path(__file__).parent / "SLIPS"
    destino = Path(__file__).parent / "destino"
    destino.mkdir(exist_ok=True)

    # Start Word (hidden, so it runs in the background)
    word = win32.gencache.EnsureDispatch("Word.Application")
    word.Visible = False

    try:
        for archivo in origen.glob("*.doc*"):
            print(f"Processing {archivo.name}…")

            src_doc = word.Documents.Open(str(archivo))
            dst_doc = word.Documents.Add()

            sel = word.Selection
            sel.HomeKey(Unit=constants.wdStory)  # move the cursor to the start

            # 1️⃣ Collect the objects in order: first all tables,
            #    then the paragraphs that are NOT inside a table.
            elementos = []

            # Tables
            for tbl in src_doc.Tables:
                elementos.append((tbl.Range.Start, "table", tbl))

            # Standalone paragraphs
            for para in src_doc.Paragraphs:
                # Skip paragraphs that belong to a table
                if para.Range.Information(constants.wdWithInTable):
                    continue
                elementos.append((para.Range.Start, "para", para))

            # Sort by position in the document
            elementos.sort(key=lambda x: x[0])

            # 2️⃣ Walk through the items and copy / paste as appropriate
            for _, tipo, obj in elementos:
                if tipo == "table":
                    obj.Range.Copy()
                    sel.EndKey(Unit=constants.wdStory)  # go to the end
                    sel.Paste()  # normal paste
                else:  # paragraph
                    obj.Range.Copy()
                    sel.EndKey(Unit=constants.wdStory)

                    if is_list_paragraph(obj):
                        # Paste as "Keep Text Only"; wdPasteText is the
                        # WdPasteDataType member (wdFormatText is a save format)
                        sel.PasteSpecial(DataType=constants.wdPasteText)
                    else:
                        sel.Paste()  # normal paste

            # 3️⃣ Save the output
            nuevo_nombre = archivo.stem + "_COPIA.docx"
            salida = destino / nuevo_nombre
            dst_doc.SaveAs(str(salida), FileFormat=constants.wdFormatXMLDocument)
            dst_doc.Close(SaveChanges=False)
            src_doc.Close(SaveChanges=False)

            print(f"   → saved to {salida}")

    finally:
        word.Quit()


if __name__ == "__main__":
    main()
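
One avenue that might avoid the text-only paste altogether: the Word object model has a `Document.ConvertNumbersToText` method, which replaces automatic list numbers with typed text in place. The following is a hedged sketch, not a verified fix. The helper name `convert_list_labels_to_text` is hypothetical, `doc` is assumed to be a Word `Document` COM object such as `src_doc` above, and how bullets come out (for example as symbol-font characters) is an assumption that has not been checked on these files.

```python
def convert_list_labels_to_text(doc) -> int:
    """Ask Word to turn automatic list labels in `doc` into typed text.

    `doc` is assumed to be a Word Document COM object. Returns the number
    of list paragraphs found before the conversion.
    """
    count = doc.ListParagraphs.Count
    if count:
        # Assumption: after this call each label is ordinary text at the
        # start of its paragraph, and character formatting is left alone.
        doc.ConvertNumbersToText()
    return count
```

If this behaves as assumed, calling `convert_list_labels_to_text(src_doc)` right after `Documents.Open` would let every paragraph go through the normal `sel.Paste()` branch, since the labels would already be text. The source document is closed with `SaveChanges=False`, so the original file would stay untouched.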

  • Please edit your question to include a minimal reproducible example showing the code you've tried (we don't need all of your code, just enough to reproduce the issue). Remember: we can't help you debug code we can't see! Commented Jun 17 at 17:51
  • Please trim your code to make it easier to find your problem. Follow these guidelines to create a minimal reproducible example. Commented Jun 18 at 0:09
