I am reading a file I created in Notepad on Windows (the basic text editor).

When creating the file I typed the following, where [newline] marks a press of the Return key:

app.exe[newline]background.jpg[newline]

and then saved it. I put this into a directory.

My Nodekit program read this file and then did the following:

var data = fs.readFileSync(filenameTemp, "utf8");
data.replace(/\r\n/g, "\n");
data.replace(/\r/g, "\n");
var strARR = data.split("\n");

strARR[0] has length 8, but "app.exe" has length 7. When I inspect strARR[0][7] in Chrome it shows "", i.e. what looks like an empty string. Likewise strARR[1] has length 15, although "background.jpg" has length 14, and Chrome again shows the extra character as "". strARR[2] has length 0, as expected.

Where is this ghost character coming from? It's responsible for another error I am getting.

  • Can you share the result of strARR[0][7].charCodeAt(0)? (Or strARR[0].charCodeAt(7); they behave identically.) Commented Jan 27, 2014 at 22:38
  • Is there a byte-order marker? I.e., does data.indexOf('\uFEFF') >= 0 yield true? Commented Jan 27, 2014 at 22:38
  • @MikeSamuel: That wouldn't account for the second occurrence. Commented Jan 27, 2014 at 22:38
  • @Rewind: Simply output the character codes of each character in each string. Separately, use a reasonable hexdump program to output the contents of the file. This is simple debugging. Commented Jan 27, 2014 at 22:39
  • @MikeSamuel: Fair enough. Commented Jan 27, 2014 at 22:41
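
The debugging steps suggested in these comments could look roughly like the sketch below. It is only an illustration: the string literal stands in for the file contents (assuming Notepad saved the file with \r\n line endings), and charCodes is a hypothetical helper, not something from the question.

```javascript
// Assumed stand-in for the file contents as saved by Notepad.
const data = "app.exe\r\nbackground.jpg\r\n";

// Same steps as in the question: the results of replace() are discarded.
data.replace(/\r\n/g, "\n");
data.replace(/\r/g, "\n");
const strARR = data.split("\n");

// Hypothetical helper: list the UTF-16 code unit of every character.
function charCodes(str) {
  const codes = [];
  for (let i = 0; i < str.length; i++) {
    codes.push(str.charCodeAt(i));
  }
  return codes;
}

// Print each entry with its character codes so invisible characters show up.
strARR.forEach((line, i) => {
  console.log(i, JSON.stringify(line), charCodes(line));
});

// A byte-order mark would appear as \uFEFF (code 65279).
console.log("has BOM:", data.indexOf("\uFEFF") >= 0);
```

With this input, the last code printed for strARR[0] is 13, which is a carriage return (\r), and the BOM check reports false.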

1 Answer

The replace method returns a new string; it does NOT modify the existing string (JavaScript strings are immutable). Lines two and three of your code therefore don't change the value held in data. You need to assign the returned value back to your data variable, like so:

var fs = require("fs");
var data = fs.readFileSync(filenameTemp, "utf8");
// replace returns a new string, so assign the result back to data
data = data.replace(/\r\n/g, "\n");
data = data.replace(/\r/g, "\n");
var strARR = data.split("\n");

The 'ghost' character you're seeing is in fact the \r character, which you think you've removed but haven't! Notepad ends each line with \r\n, so splitting on "\n" alone leaves a trailing \r on every line.
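
To see the difference between discarding and assigning the result of replace, here is a small sketch. The string literal is an assumed stand-in for the file contents, and the variable names are illustrative only.

```javascript
// Assumed stand-in for the file contents, with Windows line endings.
const original = "app.exe\r\nbackground.jpg\r\n";

// Calling replace() without using its result changes nothing.
original.replace(/\r\n/g, "\n");
console.log(original.includes("\r")); // true: the \r characters are still there

// Capturing the returned string is what actually removes them.
const cleaned = original.replace(/\r\n/g, "\n");
console.log(cleaned.includes("\r")); // false

// Splitting the cleaned string gives entries of length 7, 14 and 0.
const lengths = cleaned.split("\n").map((s) => s.length);
console.log(lengths);
```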

5 Comments

OMG how could we all have missed that. I plead the fact it's past bedtime and I had some port. Well done.
Side note: data = data.replace(/\r\n?/g, "\n") can do both replacements in one. But data = data.replace(/[\r\n]+/g, "\n") is probably what the OP really wants. Then again, there's no reason to do the replacements at all: strARR = data.split(/[\r\n]+/) with no replacements beforehand avoids the issue entirely.
Also thanks T.J. for the correction. So what does /\r\n?/g do? What does /[\r\n]+/g do? And what does /[\r\n]+/ do? I am always a bit confused by the regex stuff.
What about data.trim()?
@Rewind: /\r\n?/g means "match \r optionally followed by \n, globally throughout the string". The ? after the \n makes it optional. /[\r\n]+/g means "match one or more occurrences of \r and/or \n in a row, globally throughout the string." It's well worth spending an hour genning up on regular expressions. You'll wonder how you got by without them. :-) This site is useful for trying them out and getting explanations of them. (Be sure to choose the right type, it defaults to PHP's variant rather than JavaScript's but it has an option for JavaScript's.)
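
The patterns discussed in these comments can be compared side by side. The sketch below uses a made-up input that mixes \r\n, \r and \n endings and contains one blank line; it is not the OP's file.

```javascript
// Made-up input: mixed line endings plus one blank line before "d".
const mixed = "a\r\nb\rc\n\nd\r\n";

// /\r\n?/g turns each \r (with an optional following \n) into one \n.
// Blank lines survive.
const normalized = mixed.replace(/\r\n?/g, "\n");
console.log(JSON.stringify(normalized)); // "a\nb\nc\n\nd\n"

// /[\r\n]+/g collapses every run of \r and/or \n into a single \n.
// Blank lines are dropped.
const collapsed = mixed.replace(/[\r\n]+/g, "\n");
console.log(JSON.stringify(collapsed)); // "a\nb\nc\nd\n"

// split() with the same pattern needs no g flag; it splits at every match.
// A trailing line ending still produces a final empty entry.
const parts = mixed.split(/[\r\n]+/);
console.log(JSON.stringify(parts)); // ["a","b","c","d",""]

// trim() only strips whitespace at the two ends, so it removes the final
// empty entry but does nothing about a \r in the middle of the data.
const trimmedParts = mixed.trim().split(/[\r\n]+/);
console.log(JSON.stringify(trimmedParts)); // ["a","b","c","d"]
```

Which variant fits depends on whether blank lines in the file are meaningful: the first keeps them, the others discard them.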
