1,073 questions
3
votes
0
answers
128
views
Go regex word boundary marker doesn't work with non-ASCII characters [duplicate]
I would like to match a string that may contain non-ASCII characters using a regular expression in Go. After writing some tests, I discovered some surprising behavior that I'd like to check if it's ...
1
vote
1
answer
91
views
Translate UTF-8 punctuation to normal ASCII punctuation marks
I'm trying to clean up raw data that has embedded \r\n or \n in CSV lines. The line terminator is \r\n.
I'm trying to translate UTF-8 punctuation marks to normal ASCII punctuation marks.
cleaning up any ...
0
votes
0
answers
43
views
Spark file with corrupted header
I have a zipped parquet file with a corrupted header, i.e. it contains weird characters making it impossible to read the table in a standard way.
So I created a cleaning function that reads in the ...
0
votes
0
answers
72
views
Rails way to elegantly avoid Zero Width Space character problems
We had a Zero Width Space character problem with our Rails app. Somebody copied and pasted a configuration value (a URL) into a form in our Rails app, which later caused confusing error messages. It ...
2
votes
1
answer
148
views
How does C determine whether a character is lower case (islower or isupper)?
I was looking into GNU tr in bash on Debian Linux. The regex engine appears to have a [:lower:] and [:upper:] shorthand. The regex matches on "lowercase" and "uppercase" letters. ...
3
votes
0
answers
99
views
php script with accented characters or spaces in filename path not found by apache
On my Apache web server I have a few (file system) paths that contain spaces or accented characters, such as é, á, or ö. The web server returns "File not found." if (1) the path contains a ...
-1
votes
2
answers
96
views
Detecting and converting string containing ASCII [closed]
I have this string:
Miami, Florida
I would like to find a regex to help detect whether this string contains ASCII code.
I have tried these regex \\p{ASCII}, ^[\\u0000-\\u007F]*$, ^\p{ASCII}*...
1
vote
1
answer
966
views
python - How to get Unicode characters to display as boxes instead of accented letters - "x96\x88" and "x96\x80"
I have a table that is returning the characters "â\x96\x88" and "â\x96\x80"
These are displaying as "â" and "â"
However, what I need is for them to display as &...
0
votes
2
answers
141
views
Reading UTF-8 texts in PowerPoint via VBA, for export to another software [duplicate]
I want to read all the text in a PowerPoint file using VBA and write it to an external file (or export it some other way) for use in another piece of software.
I wrote this code:
Sub ReadFileText()
On Error Resume Next
...
1
vote
1
answer
109
views
How to manage non ASCII characters inside sh/bash scripts
My terminal.txt file below
shows the output of my tmpPdfFile.sh and tmpPdfFile1.sh scripts:
both scripts are unable to properly manage the file
"6._ANbertà_di_scelta.docx.pdf"
The ...
0
votes
1
answer
137
views
I am getting extra character  when i run sql file
I have a .sql file to create a Postgres function:
CREATE FUNCTION id_generator°generateid() RETURNS SETOF integer AS
$BODY$
BEGIN
RETURN QUERY SELECT max(ua_id)
FROM user_attribute;...
0
votes
0
answers
54
views
WinmergeU.exe mangles non-ASCII characters
I have a file with this text (on Windows 10):
'content' => '<h3>This is Schüler Doc1</h3>
<p>glurk</p>
<p>Straße</p>
<p>Europäisch</p>
<p&...
0
votes
0
answers
23
views
Why is gnu strings not detecting chars followed by null char?
There are unprintable characters in an AppleScript file. However, when I attempt to run strings on the file, it doesn't output all printable characters. It only outputs a portion of the file as plain ...
0
votes
0
answers
154
views
English character frequency calculation
I've written code to perform character frequency analysis on strings by taking the standard ETAOIN SRHD... frequencies of English character occurrence, and I have two questions.
Is this the most ...
0
votes
0
answers
119
views
How can I escape all unicode characters in a QString so that only ASCII is used and the result is as short as possible?
I'm searching for a safe method to escape all non-ASCII characters in a QString (and of course to un-escape them later) that will result in pure (printable) ASCII but yield the shortest possible ...
1
vote
1
answer
345
views
PHP - U+FFFD Unicode � error instead of the char
I have tested my website on localhost, but once uploaded I see that
the address is not displayed correctly. The address contains the following error:
37 Gr�ce Ave
instead of
37 Grâce Ave,
Here ...
2
votes
1
answer
89
views
Java code replaces accented letters with question marks
A very basic problem that I encountered in a project, and it recurred in this basic code:
import java.util.Scanner;
public class Special {
public static void main(String[] args) {
...
0
votes
0
answers
20
views
Capturing strings with accented characters into an AutoCompleteTextView
My concern is this: when the user begins typing characters into an AutoCompleteTextView edit field, I would like that, even if they enter the equivalent unaccented characters, the ACTV drop-down ...
1
vote
1
answer
281
views
Unicode Left/Right Three-Eighths Block Not the Same Height?
Why is it that the Unicode left three-eighths block and right three-eighths block characters have different heights and widths? I've tested this in a few fonts:
🮈▍🮈▍🮈▍
I'm trying to find full ...
0
votes
0
answers
79
views
Conflict of accentuation in C
I'm trying to create a defensive read function that doesn't allow non-alphabetic characters to be input. It is working normally using <wchar.h>. However, there is a conflict when printing the ...
0
votes
1
answer
2k
views
Rsync transfer Filename Character Encoding Problem (accents etc)
While migrating my large website, I transferred a lot of photo files between two Linux servers using rsync, going from CentOS 7 to CloudLinux, so both are Linux.
Example of file name on old server
...
1
vote
1
answer
143
views
How does one match alphabets of different languages in PostgreSQL?
How can I find a sequence of non-ASCII alphabetic characters (from other languages) in a given string in PostgreSQL? For example, ASCII letters can be matched using '[A-Za-z]'.
In SQL Server, @ch BETWEEN 'A' and 'Z' ...
0
votes
0
answers
137
views
Alternatives to `asciifolding` filter for removing Greek accents from Unicode text
I see that the asciifolding filter of OpenSearch only handles Latin accents and does not handle Greek at all (note: some accents are not rendered well in this site due to the font used):
POST /...
0
votes
0
answers
243
views
Why is my code returning an "ascii codec" error when I'm trying to use utf-8?
I'm new to Python, and am only starting to use it as part of a CTF challenge (I'm a cybersecurity student). I was given a mostly pre-built "decoder" script, and the assignment was to ...
1
vote
1
answer
454
views
I am getting encoding issue and ensure_ascii=False issue while writing to a csv file
I have a list of dictionaries that I loaded from JSON files. I stored those JSON files with my code as follows:
import json
data = {Some Json}
with open('test.json','w',encoding='ISO-...
2
votes
2
answers
84
views
How do I open in R a downloaded .csv file that contains both correct accented characters and faulty ones?
I have a .csv file that contains both correct and misread accented characters. For example, on the first line I have "Veríssimo", and on the second I have "VirgÃ-nia" (was supposed ...
3
votes
1
answer
150
views
Using `grepl` on Hex characters and escaped unicode non-ASCII characters with `stringi::stri_escape_unicode()`
Old question
I have an R package in which I have a list of university names that I want to match to the user input. The list of names contains special characters and this is generating a warning in R ...
2
votes
1
answer
566
views
Is there any free email provider with Internationalized Domain Name (IDN)?
I just added support for recipients with IDN in an application that I'm building, but I cannot properly test this feature without having a domain with non-ASCII characters.
Is there any free service ...
0
votes
1
answer
93
views
How to convert a long vector of class character containing non-ASCII unicode characters to their escaped version?
I have an R package in which I have a list of university names that I want to match to the user input. The list of names contains special characters and this is generating a warning in R CMD check:
...
0
votes
0
answers
59
views
Convert a non-ASCII hyphen to ASCII in C++
I need to convert a non-ASCII hyphen (-), a long hyphen, to ASCII. When I paste the long hyphen into the vi editor, it appears like this: (▒~@~S).
I have a requirement to identify this character and replace it with an ASCII hyphen.
0
votes
1
answer
290
views
Sorting non ascii characters in Dart
Users submitted these unusual characters. How can I sort them along with normal text in Dart?
𝔦𝔠𝔦𝔞𝔩𝔞
𝓕𝓪𝓴𝓷𝓬𝓻𝓪
𝗺𝗹𝗲𝗖𝗹𝗮𝗿𝗮
𝙖𝙨𝙡𝙖𝙜𝙖𝙈𝙮
𝖧𝗋𝗂𝖾
𝑮𝒂𝒍𝒍𝒆𝒂
𝐿𝑢𝑛𝑎
𝐃𝐡𝐢𝐚
𝕠ℝ𝕒𝕥𝕧...
0
votes
1
answer
2k
views
MySQL Workbench Unhandled exception: 'ascii' codec can't decode byte 0xc3 in position 1022: ordinal not in range(128)
I am constantly facing this issue when importing .csv files into the database. I am getting the following error:
Unhandled exception: 'ascii' codec can't decode byte 0xc3 in position 1022: ordinal not ...
1
vote
1
answer
3k
views
Excel VBA special characters in literal strings being changed
I have a macro that inserts a few literal strings into an excel file to be converted to a txt file. These literal strings have some German special characters. The macro works as expected for ...
1
vote
1
answer
283
views
Highlighting words with unicode characters (accents, etc) in Ace Editor
I need to highlight specific pseudocode keywords in Ace Editor. I found a nice and simple solution in the post below that works fine:
Want to highlight/change color of certain words in Ace ...
0
votes
0
answers
71
views
Laravel sail - accented search term queries not working
I am using Laravel Sail, and when performing back-end queries, searches for accented terms do not work; for example, "éz" becomes "ez". I checked php.ini, and it is UTF-8, so the ...
1
vote
1
answer
127
views
How do I show U+0104 with openpdf?
I am trying to create a pdf using openPDF. If I try to print certain characters like Ą (U+0104) they are not shown on the pdf which is produced.
I have set the Chunk font to Times New Roman using a ....
1
vote
1
answer
264
views
How do I write the text "ї φ" in MATLAB?
How do I write the text "ї φ" in MATLAB?
This code:
text(0.1, 0.1,'ї \varphi')
text(0.1, 0.3,{'ї $\varphi$'},'Interpreter','latex')
text(0.1, 0.5,{'\text{ї} $\varphi$'},'Interpreter','latex')
...
2
votes
1
answer
275
views
How do I change the FontName of Greek letters in MATLAB TeX text?
How do I change the font type of Greek letters in MATLAB TeX text? 'FontName' does not work.
I tried this code:
text(0.5, 0.7,'\it\Omega \phi A b','FontName','Times New Roman','FontSize',50);
text(0, 0....
0
votes
1
answer
112
views
Create an XSLT function for checking if a string contains non-ASCII characters
I want to check via an XSLT function whether a string of format an..17 contains such a character, and if one is found, replace it with an underscore (_).
My problem is that I don't know how to implement this with the specific ...
0
votes
0
answers
41
views
How to apply trim logic to multilingual characters
I am trying to use below query:
SELECT REPLACE(REPLACE(TRIM([dbo].[fn_string_shuffle] (a.text_field)),' ',' '),'&','&') AS [text_field] from dbo.test;
a.text_field has value 성 이름 ...
1
vote
1
answer
167
views
Create table in SQL Server that support multilingual collation
alter column party_name nvarchar(max)
I am trying to insert below characters:
insert into landing.lnd_test (party_name)
values ('성 이름')
but in the table I see the values are replaced with '? ??'.
...
1
vote
1
answer
80
views
Dhall: using non-ascii characters as labels in records?
{ "ハルカナホシノセカイへ": "https://www.youtube.com/watch?v=pwl1nISaCNg" }
Simply put, is it yet possible to use non-ascii e.g. CJK characters in the labels of Dhall records? Like, to ...
0
votes
1
answer
1k
views
How to Find UNICODE Value and Character
I have a table with several nvarchar(max) fields that are a mess. I am trying to locate the "non-ascii" characters that are going to cause problems during our conversion.
I have this ...
0
votes
1
answer
528
views
Trouble reading Japanese characters with full-width characters in PHP
I have a PHP program that reads a certain FILE from an INVENTORY SCANNER. The data is streamed in 1 line like this.
3701804901070125616シャルダン ステキプラスクルマ専用 ジャスミンマリアシャルダン ステキプラスクルマ ジ2131970080 ...
0
votes
0
answers
48
views
Why are French accents not represented correctly after MATLAB R2011b?
I am maintaining a piece of software developed on MATLAB R2011b which uses several languages, French among them.
I am now compiling the code on MATLAB R2022 and the non-ASCII characters are represented ...
3
votes
1
answer
146
views
Why does std::iswalpha return false for some French characters in C++?
I am using std::iswalpha to check if a non-ASCII character (a French character) is alphabetic. However, I have found that it returns false for the é character. I have set my locale to fr_FR.UTF-8 in my code....
1
vote
1
answer
881
views
Which codepage is 0x81 = ü, 0x94 = ö, 0x9A = Ü?
I've got a CSV file with a character encoding that I can't identify. From its content (German-language entries) I could find the following characters matching some 1-byte character encodings:
...
0
votes
0
answers
23
views
How to use non-ASCII characters in Powershell script [duplicate]
My PowerShell script needs to accept parameters with non-ASCII characters and post them as JSON to an API, but PowerShell is converting all non-ASCII characters. How can I use the non-ASCII characters?
...
-1
votes
1
answer
107
views
ASCII issues in SQL Server table export
I recently had an issue with an SSIS package where it failed to export the data in a SQL table. I was able to narrow down the problem to a record which contained a strange character in one field. ...
0
votes
1
answer
290
views
How to upload a file whose name contains special characters using JMeter?
I am struggling to upload files using a POST HTTP request in JMeter when the file name (not the file content) contains special characters such as "é è à". For example: "...