
We have a TinyMCE editor on one of our pages that lets users paste text segments from Word. We've noticed that when pasting from Word documents, some unwanted CSS-like code is prepended to the content of the text area. It looks like this:

@font-face
{
    font-family: "Arial";
}
@font-face
{
    font-family: "Cambria Math";
}
@font-face
{
    font-family: "Cambria";
}
p.MsoNormal, li.MsoNormal, div.MsoNormal
{
    margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Arial;
}
strong { }
.MsoChpDefault
{
    font-size: 10pt;
    font-family: Cambria;
}
div.WordSection1
{
    page: WordSection1;
}

We currently have a PHP script that uses a regular expression to delete this data before it is saved. However, we would like it deleted on paste, so that the user never sees it.

I've added the following regular expression to the onPaste plugin of TinyMCE:

/@font(.*)\{(.*)\}/i

However, it doesn't delete anything. If I remove the last literal bracket \}, it removes parts of the code but not all of it. So the expression seems to be in the right place, but it is not formed correctly.

Basically, I'm looking for a valid JavaScript regular expression that will delete everything from @font to the last curly bracket }.

3 Answers

The dot (.) in a JavaScript RegExp matches every character EXCEPT line breaks. Older JavaScript engines have no s flag to make the dot match line breaks (ES2018 added one, but it is not available everywhere). The usual workaround is the character class [\s\S], which matches any whitespace character (including line breaks) plus any non-whitespace character, i.e. anything at all. The following RegExp therefore deletes everything from @font to the last curly bracket }:

yourText = yourText.replace(/@font[\s\S]*\{[\s\S]*\}/i,'');
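To see the difference between the two patterns, here is a minimal sketch. The `pasted` string is made-up sample data (Word CSS followed by one line of real text), not the asker's actual input.

```javascript
// Made-up sample: Word CSS followed by the user's real content.
const pasted = [
  '@font-face',
  '{',
  '    font-family: "Arial";',
  '}',
  'div.WordSection1',
  '{',
  '    page: WordSection1;',
  '}',
  'Hello from Word'
].join('\n');

// The dot stops at line breaks, so the pattern from the question
// never gets from "@font" to a "{" on a later line.
const dotMatches = /@font(.*)\{(.*)\}/i.test(pasted);

// [\s\S] spans line breaks; the greedy quantifiers run to the last "}".
const cleaned = pasted.replace(/@font[\s\S]*\{[\s\S]*\}/i, '').trim();

console.log(dotMatches); // false
console.log(cleaned);    // Hello from Word
```

Note that the greedy version removes everything up to the last } in the whole string, so it is only safe if the pasted content itself contains no curly brackets.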

1 Comment

Perfect, I figured it was either the line-breaks or the at-sign, just couldn't figure out which one was causing it. Thank you!

This works fine when the rule is on a single line:

"@font-face {...}".match(/@font.*?{.*?}/g);
// returns ["@font-face {...}"]

The ? matters because * is a greedy quantifier by default. Without it, a single match would run from the first @font to the last }.
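A small sketch of the greedy versus non-greedy behaviour, using a made-up single-line string with two @font-face rules and an unrelated rule between them:

```javascript
// Made-up single-line CSS: two @font-face rules around an unrelated rule.
const css =
  '@font-face { font-family: "Arial"; } ' +
  'p { margin: 0; } ' +
  '@font-face { font-family: "Cambria"; }';

// Non-greedy: each @font-face rule is matched on its own.
const lazy = css.match(/@font.*?\{.*?\}/g);

// Greedy: one match from the first @font to the last "}",
// swallowing the "p" rule in between.
const greedy = css.match(/@font.*\{.*\}/g);

console.log(lazy.length);       // 2
console.log(greedy.length);     // 1
console.log(greedy[0] === css); // true
```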


I agree with Sean Kinsey, but you also need to account for newlines: in JavaScript, . does not match line breaks unless the s flag is available and set. If the text can contain newlines or carriage returns, use [\s\S] instead of . to capture those characters as well. Here is an example that you can try out on jsbin or another JavaScript playground:

// An array of lines of the css code.
var cssCode = [];
cssCode.push('@font-face');
cssCode.push('{');
cssCode.push('    font-family: "Arial";');
cssCode.push('}');
cssCode.push('@font-face');
cssCode.push('{');
cssCode.push('    font-family: "Cambria Math";');
cssCode.push('}');

// A string with new line characters separating each line in the array.
cssCode = cssCode.join("\n");

// Show the matches.
alert(cssCode.match(/@font[\s\S]*?{[\s\S]*?}/g));

1 Comment

The non-greedy match is required so as not to accidentally wipe out extra CSS code.
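As a rough illustration of that trade-off, the sketch below applies the non-greedy pattern as a global replace. `stripFontFaceBlocks` is a hypothetical helper name and the sample text is made up. Wiring it into TinyMCE's paste handling is not shown.

```javascript
// Hypothetical helper: removes each @font-face block individually
// and leaves every other rule untouched.
function stripFontFaceBlocks(text) {
  return text.replace(/@font[\s\S]*?\{[\s\S]*?\}/gi, '');
}

// Made-up sample: a non-font rule sits between two @font-face blocks.
const sample = [
  '@font-face',
  '{',
  '    font-family: "Arial";',
  '}',
  'p.MsoNormal',
  '{',
  '    font-size: 11pt;',
  '}',
  '@font-face',
  '{',
  '    font-family: "Cambria";',
  '}'
].join('\n');

const remaining = stripFontFaceBlocks(sample).trim();
console.log(remaining);
```

With this input only the p.MsoNormal rule is left. That protects unrelated CSS, but it also means that Word rules such as p.MsoNormal from the question would need a separate pattern if they should be removed too.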
