Accents stored in Redis not being readable

Working with Redis 2.10 using redis-cli on Linux, I am facing a problem with accented characters.
If I execute the command
set "string" "à"
=> I get "\xc3\xa0"
It seems each converted accented character begins with "\xc3".
How do I get my original string back?

Try using
redis-cli --raw
It solved the problem for me.

"\xc3\xa0" is just Unicode "à" in UTF-8 encoding. Just decode the string and you're done...

"you string".encode("utf-8")
when you need get the string
"you string".decode("utf-8")

You need to specify the version of Redis and, more importantly, the client you are using.
If you are using a telnet client, the problem may be your client. Redis supports arbitrary bytes for values, and UTF-8 is not a problem at all (provided your client properly converts the entered glyphs to the associated byte sequence).
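With a proper client this works transparently; for instance, redis-py can decode replies for you (a minimal sketch, assuming a local server and the redis-py package):

import redis

# decode_responses=True makes the client decode every reply as UTF-8,
# so GET returns "à" instead of the raw bytes b'\xc3\xa0'.
r = redis.Redis(host="localhost", port=6379, decode_responses=True)
r.set("string", "à")
print(r.get("string"))  # -> à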

Related

How to import a UTF8 table in another encoding (WIN1251, SQL_ASCII) with COPY?

Prehistory: Hello, I have seen many questions about encoding in Postgres, but none of them covered my case.
I have a UTF8 table, and I am using COPY to dump that table to CSV; I need to run COPY with different encodings such as WIN1251 and SQL_ASCII.
Problem: when the table contains characters that are not supported in WIN1251/SQL_ASCII, I get the classic error
character with byte sequence 0xe7 0xb0 0xab in encoding "UTF8" has no equivalent in encoding "WIN1251"
I tried using "set client_encoding / convert / convert_to" with no success.
Main question: is there any way to do this without an error, using SQL?
There is simply no way to convert 簫 into Windows-1251, so you can forget about that.
If you set the client encoding to SQL_ASCII, you will be able to load the data into an SQL_ASCII database, but that is of little use, since the database will not recognize it as a character, just as three meaningless bytes above 127.
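You can reproduce the failure outside Postgres; a minimal Python sketch (not part of the original answer) showing that the codec itself has no mapping, and that the only way out is lossy:

text = "簫"  # U+7C2B, the character behind the bytes 0xe7 0xb0 0xab

try:
    text.encode("cp1251")  # Windows-1251 has no slot for this character
except UnicodeEncodeError as err:
    print(err)

# The only option is a lossy substitution:
print(text.encode("cp1251", errors="replace"))  # -> b'?'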

Encoding issue in Postgres: is it best to set the client encoding to UTF8 or to make the data WIN1252 compatible?

I created a table by importing a CSV file from an Excel spreadsheet. When I try to run the SELECT statement below, I get the error:
test=# SELECT * FROM dt_master;
ERROR: character with byte sequence 0xc2 0x9d in encoding "UTF8" has no equivalent in encoding "WIN1252"
I have read the solution posted in this Stack Overflow post and was able to overcome the issue by setting the encoding to UTF8, so up to that point I am still able to keep working with the data. My question, however, is whether setting the encoding to UTF8 actually solves the problem, or whether it is just a workaround that will create other problems down the road, in which case I would be better off removing the conflicting characters and making the data WIN1252 compliant.
Thank you
You have a weird character in your database (Unicode code point U+009D, a control character) that probably got there by mistake.
You have to set the client encoding to the encoding that your application expects; no other value will produce correct results, even if you get rid of the error. The error has a reason.
You have two choices:
Fix the data in the database. The character is very likely not what was intended.
Change the application to use LATIN1 or (better) UTF-8 internally and set the client encoding appropriately.
Using UTF-8 everywhere would have the advantage that you are safe from this kind of problem.
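If you choose the first option, a small script can strip the stray control characters before re-import; a sketch in Python, where the file names and the decision to simply drop the characters are assumptions:

import re

# U+009D (bytes 0xc2 0x9d in UTF-8) is a C1 control character; such
# characters usually come from Windows-1252 bytes mis-read as Unicode.
c1_controls = re.compile(r"[\x80-\x9f]")

with open("dt_master.csv", encoding="utf-8") as src, \
     open("dt_master_clean.csv", "w", encoding="utf-8") as dst:
    for line in src:
        dst.write(c1_controls.sub("", line))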

BigQuery load - control character as delimiter

We have files to load where field values are separated by the "unit separator", 0x1f
As per the doc, if not printable, it should be encoded in UTF-8.
Using the bq CLI, I tried passing the -F argument with U+001F, to no avail though: BigQuery error in load operation: Field delimiter must be a single character, found:"U+001F".
No luck either with 0x1F or \x1f, with or without quotes.
Do I have the encoding wrong, or is it a bug in bq, or in the API?
EDIT:
Turns out after playing with the explorer that it's the API that doesn't like the delimiter.
Besides the printable delimiters, you can use \t, but also the undocumented \b (backspace) and \f (form feed), apparently.
A tab could be a valid user-entered character in a free-form text field, so we need to use a control character (after conversion from the 'unit separator').
EDIT 2:
Note that \f as a delimiter does work fine through the API directly, but not via the bq CLI (Field delimiter must be a single character, found:"\f").
Actually, courtesy of GCP support, this works on Linux:
bq load --autodetect --field_delimiter=$(printf '\x1f') [DATASET].[TABLE] gs://[BUCKET]/simple.csv
On Windows, it's not as straightforward to generate a control character on the command line; it's easier if you use PowerShell.
I agree with #Felipe: this is currently a limitation of the bq CLI tool, but in my mind one that could easily be fixed in its source code with a .decode('utf-8') on the raw argument bytes, so that
--field_delimiter=\x1f
can work as-is on any platform.
Closing with the hope the bq CLI team will consider the enhancement.
You can specify bq load --field_delimiter=$'\x01'
You found a limitation of the CLI: it won't accept all the characters that the API would.
As said in EDIT 2, the solution is to go straight to the API through alternative methods.
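For example, through the Python client library the control character is accepted as-is; a sketch using the google-cloud-bigquery package, with the dataset, table and bucket placeholders from above:

from google.cloud import bigquery

client = bigquery.Client()

# The API accepts the raw control character that the bq CLI rejects.
job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    field_delimiter="\x1f",
    autodetect=True,
)

job = client.load_table_from_uri(
    "gs://[BUCKET]/simple.csv",
    "[DATASET].[TABLE]",
    job_config=job_config,
)
job.result()  # wait for the load to finish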

Openfire: Offline UTF-8 encoded messages are saved wrong

We use Openfire 3.9.3. Its MySQL database uses the utf8_persian_ci collation, and in openfire.xml we have:
...
<defaultProvider>
  <driver>com.mysql.jdbc.Driver</driver>
  <serverURL>jdbc:mysql://localhost:3306/openfire?useUnicode=true&amp;characterEncoding=UTF-8</serverURL>
  <mysql>
    <useUnicode>true</useUnicode>
  </mysql>
...
The problem is that offline messages containing Persian characters (UTF-8 encoded) are saved as strings of question marks. For example, سلام ("hello" in Persian) is stored and displayed as ????.
MySQL does not have proper Unicode support, which makes supporting data in non-Western languages difficult. However, the MySQL JDBC driver has a workaround which can be enabled by adding
?useUnicode=true&characterEncoding=UTF-8&characterSetResults=UTF-8
to the URL of the JDBC driver. You can edit the conf/openfire.xml file to add this value.
Note: if the mechanism you use to configure the JDBC URL is XML-based, you will need to use the XML character entity &amp; to separate configuration parameters, as the ampersand is a reserved character in XML.
Also be sure that your DB and tables have utf8 encoding.
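To double-check what the connection actually negotiates, you can query the server; a sketch using Python's mysql-connector package, with placeholder credentials:

import mysql.connector

# Ask for a fully Unicode-capable connection.
conn = mysql.connector.connect(
    host="localhost", user="openfire", password="secret",
    database="openfire", charset="utf8mb4",
)
cur = conn.cursor()
cur.execute("SHOW VARIABLES LIKE 'character_set%'")
for name, value in cur:
    print(name, value)  # client/connection/results should all be utf8-based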

Why does Redis not work with requirepass directive?

I want to set a password to connect to a Redis server.
The appropriate way to do that is using the requirepass directive in the configuration file.
http://redis.io/commands/auth
However, after setting the value, I get this upon restarting Redis:
Stopping redis-server: redis-server.
Starting redis-server: Segmentation fault (core dumped)
failed
Why is that?
The password length is limited to 512 characters.
In redis.h:
#define REDIS_AUTHPASS_MAX_LEN 512
In config.c:
} else if (!strcasecmp(argv[0],"requirepass") && argc == 2) {
    if (strlen(argv[1]) > REDIS_AUTHPASS_MAX_LEN) {
        err = "Password is longer than REDIS_AUTHPASS_MAX_LEN";
        goto loaderr;
    }
    server.requirepass = zstrdup(argv[1]);
}
Now, the parsing mechanism of the configuration file is quite basic. All the lines are split using the sdssplitargs function of the sds (string management) library. This function interprets specific sequences of characters, such as:
single and double quotes
\x hex digits
special characters such as \n, \r, \t, \b, \a
Here the problem is that your password contains a single double-quote character. The parsing fails because there is no matching double quote at the end of the string. In that case, the sdssplitargs function returns a NULL pointer. The core dump occurs because this pointer is not properly checked in the config.c code:
/* Split into arguments */
argv = sdssplitargs(lines[i],&argc);
sdstolower(argv[0]);
This is a bug that should be filed IMO.
A simple workaround would be to replace the double-quote character, or any other interpreted character, with a hexadecimal sequence (i.e. \x22 for the double quote).
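For example, a small helper can generate the escaped form for redis.conf; a Python sketch, following the sdssplitargs escaping rules described above:

def escape_for_redis_conf(password):
    # Quote the value and escape anything outside [A-Za-z0-9] as \xNN,
    # so sdssplitargs never sees a quote or backslash to interpret.
    safe = set("abcdefghijklmnopqrstuvwxyz"
               "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789")
    body = "".join(c if c in safe else "\\x%02x" % ord(c) for c in password)
    return '"' + body + '"'

print(escape_for_redis_conf('p@ss"word'))  # -> "p\x40ss\x22word"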
Although not documented, it seems there are limitations on the password value, particularly regarding the characters it contains, not its length.
I tried with 160 characters (just digits) and it works fine.
This
9hhNiP8MSHZjQjJAWE6PmvSpgVbifQKCNXckkA4XMCPKW6j9YA9kcKiFT6mE
works too. But this
#hEpj6kNkAeYC3}#:M(:$Y,GYFxNebdH<]8dC~NLf)dv!84Z=Tua>>"A(=A<
does not.
So, Redis does not support some or all of the "special characters".
Just nailed this one with:
php: urlencode('crazy&char\'s^pa$$wor|]');
-or-
js: encodeURIComponent("crazy&char's^pa$$wor|]");
The encoded result can then be used anywhere and sent to the Redis server over (usually) TCP.
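The same idea in Python, for completeness (a sketch; quote with safe="" percent-encodes everything outside the unreserved set):

from urllib.parse import quote

# Percent-encode the password before putting it in the configuration.
password = quote("crazy&char's^pa$$wor|]", safe="")
print(password)  # -> crazy%26char%27s%5Epa%24%24wor%7C%5D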