While inserting data into a SQLite3 database, I get the following error:

Encoding::UndefinedConversionError ("\xF0" from ASCII-8BIT to UTF-8):

In database.yml, I have set the encoding to UTF-8:

development:
  <<: *default
  database: db/development.sqlite3
  encoding: utf8

SQLite itself is also configured to accept UTF-8 (PRAGMA encoding returns UTF-8).

Still, the transaction is rolled back:

   (0.1ms)  begin transaction
   SQL (1.0ms)  INSERT INTO "chat_data_regulars" ("username",    "chat_timestamp", "name", "sent_text", "created_at", "updated_at") VALUES (?, ?, ?, ?, ?, ?)  [["username", "a5fbf8bb6fea32fbbcc566c744592136"], ["chat_timestamp", "2016-05-14 04:12:16.942722"], ["name", "Tushar Saurabh"], ["sent_text", "You gave your mentee critical feedback"], ["created_at", "2016-05-14 04:12:33.308923"], ["updated_at", "2016-05-14 04:12:33.308923"]]
   (12.6ms)  commit transaction
   (0.1ms)  begin transaction
   (0.2ms)  rollback transaction
   Completed 500 Internal Server Error in 16416ms (ActiveRecord: 14.5ms)

   Encoding::UndefinedConversionError ("\xF0" from ASCII-8BIT to UTF-8):
  • Are you trying to add a new record from a model? If so, I think you need to add #encoding: utf-8 at the very top of your model file. Commented May 14, 2016 at 5:06
  • @jhonquintero, yes, I am adding it through the model. I added #encoding: utf-8 but I am getting the same error. Commented May 14, 2016 at 5:20
  • What encoding do your HTML pages have? It would be best if they were in UTF-8 too. If they are, can you provide the full stack trace, possibly after removing the backtrace silencers in the backtrace_silencers.rb initializer? And what Ruby and Rails versions do you use? Commented May 14, 2016 at 6:45

1 Answer

You might try forcing the conversion to UTF-8 before storing the string in the database. This code converts the original string in place, replacing invalid or undefined characters with a replacement character:

string.encode!("UTF-8", invalid: :replace, undef: :replace)

See the String#encode documentation for more information.
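One possibility worth checking first: "\xF0" is the lead byte of four-byte UTF-8 sequences (emoji, for example), so the string may already hold UTF-8 bytes that are merely tagged as ASCII-8BIT. In that case encode with undef: :replace would replace every high byte, whereas relabeling the bytes keeps the text. The following is a minimal sketch under that assumption; to_utf8 is a hypothetical helper name, not something provided by Ruby or Rails.

```ruby
# Hypothetical helper: treat the bytes as UTF-8 and replace only the
# sequences that are genuinely invalid, instead of every non-ASCII byte.
def to_utf8(str)
  str.dup                           # leave the caller's string untouched
     .force_encoding(Encoding::UTF_8) # relabel the bytes, no conversion
     .scrub("?")                    # replace invalid byte sequences with "?"
end

# Bytes of "café 😀" tagged as binary, as they might arrive from a socket or file
raw = "caf\xC3\xA9 \xF0\x9F\x98\x80".b

clean = to_utf8(raw)
puts clean.encoding        # UTF-8
puts clean.valid_encoding? # true
```

If the bytes turn out not to be UTF-8 at all, scrub still leaves a valid string, just with "?" in place of the bytes it could not interpret.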

If your encodings match and you still have this issue, you can simply strip the non-ASCII characters from the strings with this gsub call:

x.map { |text| text.gsub(/[^\001-\176]+/, "") }

The regex removes every character outside the range ASCII code 1 (octal 001) through ASCII code 126 (octal 176). This scrubs the string of all non-ASCII characters, as well as NUL (ASCII 0) and DEL (ASCII 127).
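As a small illustration of the stripping approach, here is a sketch that uses the non-destructive gsub, since gsub! returns nil when nothing was replaced and would leave nil entries in a map result. The method name strip_non_ascii and the sample strings are made up for the example.

```ruby
# Hypothetical helper: keep only characters in the range \001-\176
# (ASCII 1 through 126) and drop everything else.
def strip_non_ascii(text)
  text.gsub(/[^\001-\176]+/, "")
end

texts = ["You gave your mentee critical feedback", "caf\u00E9 \u{1F600}"]
cleaned = texts.map { |text| strip_non_ascii(text) }
# cleaned[0] is unchanged; cleaned[1] is "caf " (the accented letter and emoji are dropped)

# gsub! on a string with nothing to remove returns nil, not the string
puts "abc".dup.gsub!(/[^\001-\176]+/, "").inspect # nil
```

Note that this discards data: accented letters and emoji are removed rather than converted, so it is a last resort compared with fixing the encoding.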

If you require "extended ASCII" for an international character set, such as ISO-8859 or Windows-1252, or even specific Unicode characters, you can widen the range in the character class to include those characters.
