I need to export a DataTable to an Excel CSV file. But when I export the table, the exported data contains the HTML tags, or whatever other markup is inside the DataTable cells. The table is part of an open-source project called 'leantime' from GitHub. I have tried a lot, but I can't find anything related to this issue. For example, this is what a column's data looks like in the exported file: <a class="ticketModal" href="http://localhost/pmt/tickets/showTicket/20#subtasks">test task - TEST ASSIGNER | Plan Hrs:2 | Hrs Left: 4</a>

Please check the JavaScript code below and suggest code to strip the HTML tags.

DataTable.ext.buttons.csvHtml5 = {
        bom: !1,
        className: "buttons-csv buttons-html5",
        available: function() {
            return window.FileReader !== undefined && window.Blob
        },
        text: function(dt) {
            return dt.i18n("buttons.csv", "CSV")
        },
        action: function(e, dt, button, config) {
            this.processing(!0);
            var output = _exportData(dt, config).str,
                info = dt.buttons.exportInfo(config),
                charset = config.charset;
            config.customize && (output = config.customize(output, config, dt)), charset = !1 !== charset ? (charset = charset || document.characterSet || document.charset) && ";charset=" + charset : "", config.bom && (output = String.fromCharCode(65279) + output), _saveAs(new Blob([output], {
                type: "text/csv" + charset
            }), info.filename, !0), this.processing(!1)
        },
        filename: "*",
        extension: ".csv",
        exportOptions: {
            
        },
        fieldSeparator: ",",
        fieldBoundary: '"',
        escapeChar: '"',
        charset: null,
        header: !0,
        footer: !1
    }

I want the exported CSV file to contain only plain data (without HTML tags). Please give me an idea of how to change the code to keep HTML tags out of the file.

1 Answer

I am not familiar with DataTables, but you'd need to iterate over each row and cell and convert the cell's HTML to plain text.

You can use this function to strip HTML from HTML text:

    /**
     * Strip HTML tags and return plain text
     *
     * @param {String} str HTML text
     */
    const htmlToText = function (str) {
      return str
        .replace(/\s+/g, ' ')
        .replace(/<\/(p|li|ul|ol)>/gi, ' ')
        .replace(/<(ul|ol|br[ \/]*)>/gi, ' ')
        .replace(/<li>/gi, ' • ')
        .replace(/<[a-z]+[^>]*>/gi, '')
        .replace(/<\/[a-z]+>/gi, '')
        .replace(/\s{2,}/g, ' ')
        .replace(/^\s+/, '')
        .replace(/\s+$/, '');
    };

You could add support for HTML entities as well, such as:

        .replace(/&lt;/g, '<')
        .replace(/&gt;/g, '>')
        .replace(/&copy;/g, '©')
        .replace(/&reg;/g, '®')
        .replace(/&trade;/g, '™')
        // Numeric entities: decimal (&#169;) or hexadecimal (&#xA9;)
        .replace(/&(#(?:x[0-9a-f]+|\d+));/gi, function(m, c1) {
          return String.fromCharCode(c1[1].toLowerCase() === "x"
            ? parseInt(c1.substr(2), 16)
            : parseInt(c1.substr(1), 10)
          );
        })
        // Decode &amp; last so that text like &amp;lt; is not decoded twice
        .replace(/&amp;/g, '&')
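
The entity replacements above can also be kept in a separate helper and applied after the tags have been stripped. The following is only a minimal sketch under a few assumptions: `decodeEntities` is a name made up for this example (it is not part of DataTables or leantime), it handles only the entities listed above, and it decodes `&amp;` last so that text such as `&amp;lt;` is not decoded twice.

```javascript
// Hypothetical helper: decodes only the small set of entities listed above.
const decodeEntities = function (str) {
  const named = { lt: '<', gt: '>', copy: '©', reg: '®', trade: '™' };
  return String(str)
    // Named entities; names not in the table (including "amp") are left as-is here.
    .replace(/&([a-z]+);/gi, function (m, name) {
      return Object.prototype.hasOwnProperty.call(named, name) ? named[name] : m;
    })
    // Numeric entities: decimal (&#169;) or hexadecimal (&#xA9;).
    .replace(/&#(x[0-9a-f]+|\d+);/gi, function (m, code) {
      const n = code[0].toLowerCase() === 'x'
        ? parseInt(code.slice(1), 16)
        : parseInt(code, 10);
      // Leave out-of-range values untouched instead of throwing a RangeError.
      return n <= 0x10FFFF ? String.fromCodePoint(n) : m;
    })
    // &amp; goes last so that "&amp;lt;" becomes "&lt;" and not "<".
    .replace(/&amp;/g, '&');
};

console.log(decodeEntities('Tom &amp; Jerry &copy; 2020 &#8482; &#x3C;b&#x3E;'));
// prints: Tom & Jerry © 2020 ™ <b>
```

`String.fromCodePoint` is used here instead of `String.fromCharCode` so that code points above U+FFFF (for example emoji) come out as one character.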

There are some unsupported corner cases:

  • <h2 title="Use <this>">Use this</h2> -- unescaped angle brackets inside attribute values are not supported; the example should be written as <h2 title="Use &lt;this&gt;">Use this</h2>
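
To apply the function during the export itself, one option is the `exportOptions.format.body` callback of the DataTables Buttons extension, which is called for each body cell before the file is built. The sketch below is an assumption about how this could be wired up, not code taken from leantime: `#allTickets` is a hypothetical table selector, `csvButton` is a name made up for this example, and `htmlToText` is a condensed copy of the function above so that the block runs on its own.

```javascript
// Condensed copy of the htmlToText function from above.
const htmlToText = function (str) {
  return String(str)
    .replace(/<\/(p|li|ul|ol)>/gi, ' ')
    .replace(/<(ul|ol|br[ \/]*)>/gi, ' ')
    .replace(/<li>/gi, ' • ')
    .replace(/<[a-z]+[^>]*>/gi, '')
    .replace(/<\/[a-z]+>/gi, '')
    .replace(/\s+/g, ' ')
    .trim();
};

// Button definition that extends the built-in csvHtml5 button.
// format.body receives each cell's inner HTML, the row index,
// the column index and the cell node.
const csvButton = {
  extend: 'csvHtml5',
  exportOptions: {
    format: {
      body: function (data, row, column, node) {
        return htmlToText(data);
      }
    }
  }
};

// Browser-only usage; '#allTickets' is a hypothetical selector.
// $('#allTickets').DataTable({ dom: 'Bfrtip', buttons: [csvButton] });

// Checking the callback with the cell from the question:
const sample = '<a class="ticketModal" href="http://localhost/pmt/tickets/showTicket/20#subtasks">test task - TEST ASSIGNER | Plan Hrs:2 | Hrs Left: 4</a>';
console.log(csvButton.exportOptions.format.body(sample, 0, 0, null));
// prints: test task - TEST ASSIGNER | Plan Hrs:2 | Hrs Left: 4
```

Passing the option through the button definition avoids editing the bundled `DataTable.ext.buttons.csvHtml5` code directly, so the change should survive an update of the library.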

1 Comment

O wow that's great! This works for me.
