
Questions tagged [character-encoding]

16 votes
4 answers
7k views

On this GitHub C++-related page, the writer said: "Note that the value_type of those two containers is uint8_t which is not a printable character, make sure to cast it to int before you print." Why ...
Russell McMahon's user avatar
-1 votes
1 answer
545 views

Java, by default, uses UTF-16 to represent characters in the String data type. I inherited a JavaFX project which currently has some Strings in UTF-8 and others in UTF-16. This is causing bugs (in pop-...
chilliefiber's user avatar
0 votes
1 answer
442 views

Windows: Uses CR (\r) in combination with LF (\n) for line endings, facilitating compatibility with legacy systems. Unix-like systems (including Linux and macOS): Employs LF (\n) for line endings, ...
DevelBase2's user avatar
1 vote
1 answer
672 views

I am starting to look into how to implement SHA256 in JavaScript, and found this for example. It requires UTF-8 encoding it sounds like. Another one I saw required/supported only ASCII encoding and ...
Lance Pollard's user avatar
1 vote
1 answer
85 views

I've been reading Unicode's core specification (see https://www.unicode.org/versions/latest/). I mostly understood what the text was explaining in section 2.1 Architectural Context until it started ...
lonious's user avatar
  • 121
2 votes
2 answers
571 views

Today I came across a weird case for which I have no explanation, so here I am. I have two files with identical content, but one is encoded in UTF-8 and the other one is in IBM EBCDIC. Both of them ...
rodripf's user avatar
  • 137
5 votes
1 answer
424 views

When you encode a code point to code units based on UTF-8, then if the code point fits on 7 bits, the most significant bit is set to zero so that it tells you it is a character which is stored on 1 ...
codepersonnel49's user avatar
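
As a rough illustration of the rule described in the excerpt above (a leading 0 bit marks a one-byte character), here is a minimal sketch that reads the sequence length off the first byte of a UTF-8 sequence. The function name `utf8_length_from_lead_byte` is made up for this example, and the check is deliberately simple: it only inspects the bit prefix and does not reject overlong or out-of-range lead bytes.

```python
def utf8_length_from_lead_byte(lead: int) -> int:
    """Return how many bytes a UTF-8 sequence has, judging only by its first byte.

    Hypothetical helper for illustration; it checks the bit prefix only and
    does not validate the rest of the sequence.
    """
    if (lead & 0b1000_0000) == 0:
        return 1  # 0xxxxxxx: code point fits in 7 bits
    if (lead & 0b1110_0000) == 0b1100_0000:
        return 2  # 110xxxxx
    if (lead & 0b1111_0000) == 0b1110_0000:
        return 3  # 1110xxxx
    if (lead & 0b1111_1000) == 0b1111_0000:
        return 4  # 11110xxx
    # 10xxxxxx is a continuation byte; 11111xxx is never used as a lead byte
    raise ValueError(f"not a UTF-8 lead byte: {lead:#04x}")


for ch in ("A", "é", "€"):
    encoded = ch.encode("utf-8")
    # The length implied by the first byte matches the real encoded length.
    assert utf8_length_from_lead_byte(encoded[0]) == len(encoded)
```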
1 vote
2 answers
105 views

If you go to www.htmlbasictutor.ca/character-encoding.htm you will find the following description of character encoding. Character encoding tells the browser and validator what set of characters to ...
progner's user avatar
  • 523
2 votes
2 answers
153 views

The two concepts seem equal to me, but I'm not really sure I understand encoding well enough to confirm that this is the case.
progner's user avatar
  • 523
1 vote
1 answer
363 views

I'm working on a project with huge files that contain only the set {[0-9],.}. Encoding in UTF-8 or ASCII makes huge files. I wonder if I could find a way to encode in only 4 bits (make those files 16 ...
PyThagoras's user avatar
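
One possible approach to the question above, sketched under assumptions: the set is read as the ten digits plus comma and period (12 symbols), so each symbol fits in a 4-bit nibble and two symbols share one byte. The names `ALPHABET`, `PAD`, `pack`, and `unpack` are invented for this example, and using the nibble value 15 as padding for odd-length input is an arbitrary choice, not something stated in the question.

```python
# Assumed alphabet: digits, comma, period (12 symbols, codes 0..11).
ALPHABET = "0123456789,."
PAD = 0xF  # assumed filler nibble for odd-length input; never a symbol code
ENCODE = {ch: i for i, ch in enumerate(ALPHABET)}


def pack(text: str) -> bytes:
    """Pack two symbols per byte: first symbol in the high nibble."""
    codes = [ENCODE[ch] for ch in text]  # KeyError on symbols outside ALPHABET
    if len(codes) % 2:
        codes.append(PAD)
    return bytes((hi << 4) | lo for hi, lo in zip(codes[0::2], codes[1::2]))


def unpack(data: bytes) -> str:
    """Reverse pack(), skipping padding nibbles."""
    out = []
    for byte in data:
        for code in (byte >> 4, byte & 0xF):
            if code != PAD:
                out.append(ALPHABET[code])
    return "".join(out)


packed = pack("3.14")
assert len(packed) == 2          # half the size of the 4-byte ASCII form
assert unpack(packed) == "3.14"  # round trip is lossless
```

Compared with ASCII or UTF-8, where these symbols take one byte each, this halves the size; a general-purpose compressor may do as well or better on real data.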
2 votes
1 answer
442 views

I am developing against a file spec that lists the data type for certain fields as CHAR(<length>). The spec is for a fixed width flat file. In most cases, possible values to populate the fields ...
mathewb's user avatar
  • 137
0 votes
2 answers
2k views

I am wondering why Unicode encoding is necessary in JavaScript. I am looking at utf8.js as an example. I am also looking at the UTF-8 spec, but am not really following the different pieces of data. ...
Lance Pollard's user avatar
0 votes
1 answer
1k views

In general, a character is represented in 1 byte, i.e. 8 bits. I believe this is true for all text editors and even for databases like Oracle. 1 byte can represent 2^8 = 256 characters. My question is when ...
user3198603's user avatar
  • 1,896
8 votes
1 answer
4k views

I used to think that the BOM is optional for UTF-8, but mandatory for UTF-16 and UTF-32. But then I have read the following (in this article): Let's look just at the ones that Notepad supports. ...
user9002947's user avatar
0 votes
1 answer
327 views

I can't figure out a barcode that would support ~3500 chars. The barcode should contain 40 strings with carriage return, each 76 chars long. Each string will look like this: ...
SovereignSun's user avatar
-1 votes
1 answer
2k views

OK. My question is confusing, but what I'm asking is how I could create my own file with my own encoding representing data, sort of like my own database file with my own encoding, headers, and info.
Fumerian Gaming's user avatar
-1 votes
2 answers
317 views

I have a printer and an SDK to work with it in Java. The printer works well with English letters and digits but doesn't correctly print special symbols like 'ä' or 'ê'. I suppose that I need to convert ...
BArtWell's user avatar
  • 107
9 votes
2 answers
73k views

I can type ⅓, ⅔ and ½ but can I type 3/3 and 2/2 using unicode? I know that from a mathematical point of view the fractions 2/2 = 3/3 = 1 but I am typing a list where I want to indicate that you have ...
d-b's user avatar
  • 215
2 votes
1 answer
222 views

Many programs will supply one or more of the following as file encoding formats: UTF-8, UTF-16, UTF-32 and simply Unicode. How do I know what Unicode Transformation Format Unicode is referring to? I'm ...
Govind Rai's user avatar
2 votes
1 answer
2k views

If I was going to write a parser for HTTP, would I be able to assume the encoding of the HTTP headers and status line? Until I read the charset or encoding header, how could I tell what the encoding ...
Travis Parks's user avatar
  • 2,583
0 votes
3 answers
553 views

Binary-to-text encoding represents binary data as characters. Hex dumps of binary files seem to do the same thing for reading binary files. Are they related or different things?
Tim's user avatar
  • 5,555
0 votes
1 answer
124 views

From what I understand, given a sequence of bytes without any further information, it's not generally possible to tell which encoding we are talking about. Of course we can guess (e.g. perl's ...
Dacav's user avatar
  • 175
3 votes
2 answers
2k views

It makes sense to use entity names for describing <a>, as shown in the code below. <!doctype html> <html> <head> <title> My First Webpage</title> ...
overexchange's user avatar
  • 2,315
7 votes
2 answers
10k views

Suppose a program A opens a text file A using encoding A to decode the file, and a program B opens a text file B using encoding B. When we copy some text from file B in program B to file A in ...
Tim's user avatar
  • 5,555
2 votes
2 answers
1k views

I programmed a telnet server in C, but I have a problem sending accented characters (é, è, à ...). The character encoding is different between the telnet clients (...
ipStack's user avatar
  • 121
21 votes
4 answers
4k views

According to the Wikipedia article, UTF-8 has this format: First code Last code Bytes Byte 1 Byte 2 Byte 3 Byte 4 point point Used U+0000 U+007F 1 0xxxxxxx U+0080 U+...
qbt937's user avatar
  • 321
4 votes
2 answers
638 views

There are different conventions of representing the new line character in different types of OSes. Does the newline convention have nothing to do with what encoding is used? Is the newline ...
Tim's user avatar
  • 5,555
1 vote
1 answer
377 views

I recently got a requirement to develop a chat-like application, or rather, a foundation of classes and methods that would allow certain applications to have chat-like features. The framework must be ...
ferc's user avatar
  • 113
6 votes
2 answers
32k views

I need to fit at least 300 bytes (or much much more) into a QR code and think I can do this by mapping each byte into the associated ISO/IEC 8859-1 character located here. Since each byte (1-255) ...
makerofthings7's user avatar
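
The byte-to-character mapping mentioned in the excerpt above can be checked quickly in Python, whose `latin-1` codec (ISO/IEC 8859-1) maps each of the 256 byte values to the code point with the same number. This is only a sketch of the mapping itself; whether a given QR encoder or scanner preserves every one of those characters is a separate question that this snippet does not answer.

```python
# Every possible byte value, 0x00 through 0xFF.
payload = bytes(range(256))

# latin-1 decodes byte N to the character U+00NN, so no byte is rejected.
as_text = payload.decode("latin-1")
assert len(as_text) == 256
assert all(ord(ch) == b for ch, b in zip(as_text, payload))

# Encoding back with the same codec recovers the original bytes exactly.
assert as_text.encode("latin-1") == payload
```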
76 votes
2 answers
65k views

I work in C# and MSSQL and as you'd expect I store my passwords salted and hashed. When I look at the hash stored in an nvarchar column (for example, the out-of-the-box aspnet membership provider). I've ...
Liath's user avatar
  • 3,436
5 votes
3 answers
3k views

It seems that Unicode is becoming more and more ubiquitous these days, if it isn't already, but I have to wonder if there are any domains where Unicode isn't the best implementation choice. Are there any ...
Daniel Wolfe's user avatar
2 votes
2 answers
6k views

When is it beneficial to use encodings other than UTF-8? Aside from dealing with pre-Unicode documents, that is. And more importantly, why isn't UTF-8 the default in most languages? That is, why do I ...
Electric Coffee's user avatar
2 votes
3 answers
3k views

I am working on highly customized shop software, based on an open-source one, written in PHP and the usual web technologies (CSS, HTML, JS). I did a lot of customization in the past months/years and ...
Ello's user avatar
  • 144
1 vote
1 answer
308 views

I'm working with a binary structure, the goal of which is to index the significance of specific bits for any character encoding so that we may trigger events while doing specific checks against the ...
That Realtor Programmer Guy's user avatar
4 votes
4 answers
19k views

Please can you answer a couple of questions based on the code below (excludes the try/catch blocks), which transforms input XML and XSL files into an output XSL-FO file: File xslFile = new File("...
Helen Reeves's user avatar
4 votes
2 answers
428 views

Given that we do not know the encoding of a string, what is the best way to make sure that it is transformed to, say, ASCII? In such situations we are willing to accept potential loss of data.
GreatOrdinary's user avatar
1 vote
3 answers
8k views

I'm writing some code that sets cookies and I'm wondering about the exact semantics of the Set-Cookie header. Imagine the following HTTP header line: Set-Cookie: name=value; Path=/%20 For with path ...
Philippe Marschall's user avatar
171 votes
6 answers
855k views

I have some SQL script files on Windows 7. When opened with Notepad++, in the "Encoding" menu some of them are reported to have an encoding of "UCS-2 Little Endian" and some of ...
Marcel's user avatar
  • 3,172
27 votes
7 answers
3k views

I wrote an open source library that parses structured data but intentionally left out carriage-return detection because I don't see the point. It adds additional complexity and overhead for little/no ...
Evan Plaice's user avatar
  • 5,785
585 votes
1 answer
76k views

I have recently seen a few URIs containing the query parameter "utf8=✓". My first impression (after thinking "mmm, looks cool") was that this could be used to detect a broken character encoding. So, ...
Gary's user avatar
  • 24.4k
3 votes
2 answers
7k views

I just read through the documentation on the Codecs module, but I guess my knowledge/experience of comp sci doesn't run deep enough yet for me to comprehend it. It's for dealing with encoding/...
temporary_user_name's user avatar
8 votes
2 answers
2k views

I recently implemented incoming emails for an application and boy, did I open the gates of hell? Since then every other day an email arrives that makes the app fail in a different way. One of those ...
Pablo Fernandez's user avatar
3 votes
3 answers
22k views

Well, I am reading Programming Windows with MFC, and I came across Unicode and ASCII code characters. I understood the point of using Unicode over ASCII, but what I do not get is how and why is it ...
vin's user avatar
  • 177
41 votes
3 answers
172k views

I'm learning T-SQL. From the examples I've seen, to insert text in a varchar() cell, I can write just the string to insert, but for nvarchar() cells, every example prefixes the strings with the letter N....
qinking126's user avatar
2 votes
2 answers
347 views

I'm fixing code that is using ASCIIEncoding in some places and UTF-8 encoding in other functions. Since we aren't using the UTF-8 features, all of our unit tests passed, but I want to create a ...
makerofthings7's user avatar
12 votes
3 answers
8k views

I feel that often you don't really choose what format your code is in. I mean most of my tools in the past have decided for me. Or I haven't really even thought about it. I was using TextPad on ...
Parris's user avatar
  • 241
2 votes
2 answers
891 views

http://php.net/manual/en/function.mb-convert-encoding.php Say I do: $encoded = mb_convert_encoding ($original); That looks simple enough. What I am imagining is the following: $original has a ...
user4951's user avatar
  • 739
6 votes
2 answers
4k views

With an encoding such as EBCDIC being in existence already (and being 8 bit to boot), what was the need to invent yet another encoding and a 7 bit one at that? Why was ASCII invented and what ...
Oded's user avatar
  • 53.8k
4 votes
2 answers
1k views

In previous web applications I've built, I've had issues with users entering exotic characters into forms which get stored strangely in the database, and sometimes appear different or double-encoded ...
CFL_Jeff's user avatar
  • 3,507
3 votes
3 answers
723 views

Caveat: I am a political science student and I have tried my level best to understand the technicalities; if I still sound naive please overlook that. In the Symantec report on Stuxnet, the authors ...
The Kaykay's user avatar