How to translate and what could cause characters such as å¿è€ - sql

Goal:
I only run SELECT statements against the databases I have access to.
One of the columns is supposed to store legible English sentences, but some values contain strange characters. I would like to find a way to translate those characters back into legible text.
My question is twofold:
Can I translate the following strings into a legible format, or is the stored data basically lost in translation?
How can I ensure that the data is stored correctly?
Column Collation: SQL_Latin1_General_CP1_CI_AS
Column Data Type: NVARCHAR(300)
Data Examples:
å¿è€
ÐžÐ±Ð°Ð¶Ð´Ð°Ð½Ð¸Ñ Ð·Ð°

Use the N prefix on string literals when you insert into the table:
INSERT INTO TownMessage_Tbl (elanat) VALUES (N'your text here')
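On the first part of the question: the sample values look like UTF-8 text whose bytes were decoded as Windows-1252 before being stored. That is an assumption based on the characters shown, not something the question confirms. Under that assumption, a minimal Python sketch for reversing the damage on the client side could look like the following (`repair_mojibake` is a hypothetical helper name). Some UTF-8 bytes have no Windows-1252 character, and the posted samples appear to have lost a few of them, so a value may only be partly recoverable; in that case the function returns the input unchanged.

```python
def repair_mojibake(text: str) -> str:
    """Undo 'UTF-8 bytes decoded as Windows-1252' damage where possible.

    Returns the input unchanged when the round trip does not apply,
    for example when the text was never damaged or bytes were lost.
    """
    try:
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


if __name__ == "__main__":
    # "\u00d0\u00b7\u00d0\u00b0" is the intact tail of the second sample value.
    print(repair_mojibake("\u00d0\u00b7\u00d0\u00b0"))
```

Since only SELECT access is available, this kind of repair would have to run in the application that reads the rows rather than in the database.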

Related

Cant Migrate to bigquery because bigquery column names allow only english characters

Bigquery column names (fields) can only contain English letters, numbers, and underscores.
I am using python and I want to create a script to migrate my data from Postgres to Bigquery and the Postgres tables have many non-english column names.
I will probably need to encode the column names to some format that Bigquery accepts, but I will need the ability to later decode it back to the original.
what is the best way to do this?
You can encode the column names with something like base64 and replace the +, = and / characters with some kind of placeholder.
If you don't care about field length, you can encode to base32 instead. It is about 20% longer than base64 but doesn't use '+' or '/', and '=' is used only for padding, so you can discard it without affecting the string.
Alternatively, you can build a small conversion table that maps each non-English character in your language to some combination of English characters. This only works if you have a small number of non-English characters.
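The base32 idea can be sketched in Python as follows. The function names and the `c_` prefix are assumptions made for illustration; the prefix is there because a base32 string can begin with a digit (2 to 7), and a column name should not.

```python
import base64

PREFIX = "c_"  # hypothetical prefix so the encoded name never starts with a digit


def encode_column(name: str) -> str:
    """Encode any column name using only A-Z, 2-7 and the prefix characters."""
    encoded = base64.b32encode(name.encode("utf-8")).decode("ascii")
    return PREFIX + encoded.rstrip("=")  # padding carries no information


def decode_column(encoded: str) -> str:
    """Restore the original column name produced by encode_column."""
    body = encoded[len(PREFIX):]
    padding = "=" * (-len(body) % 8)  # base32 input length must be a multiple of 8
    return base64.b32decode(body + padding).decode("utf-8")


if __name__ == "__main__":
    for original in ["name", "prénom", "имя"]:
        safe = encode_column(original)
        print(original, "->", safe, "->", decode_column(safe))
```

The encoded names are not human-readable, so keeping a mapping of original to encoded names alongside the migration script may still be worthwhile.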

Can we select the datas that have spaces between the lines in DB without the spaces?

I have a textbox for searching my table. The table is named ADDRESSBOOK and holds personnel records such as name, surname, phone numbers and so on. The phone numbers are stored like "0 123 456789". If I type "0 123 456789" into the textbox, this query runs in the background:
SELECT * FROM ADDRESSBOOK WHERE phonenumber LIKE "0 123 456789"
My problem is: how can I select the same row by typing "0123456789" into the textbox? Sorry for my English.
You can use replace():
WHERE REPLACE(phonenumber, ' ', '') LIKE REPLACE('0 123 456789', ' ', '')
If performance is an issue, you can do the following in SQL Server:
alter table t add phonenumber_nospace as (replace(phonenumber, ' ', ''));
create index idx_t_phonenumber_nospace on t(phonenumber_nospace);
Then, remove the spaces in the parameter value before constructing the query, and use:
WHERE phonenumber_nospace = @phonenumber_nospace
This assumes an equality comparison, as in your example.
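As a rough illustration of stripping spaces on both sides of the comparison, here is a Python sketch that uses the standard-library sqlite3 module as a stand-in database (SQLite also provides REPLACE). The `name` column and the `find_by_phone` helper are assumptions for the demo; the search value is passed as a bound parameter rather than concatenated into the SQL.

```python
import sqlite3


def find_by_phone(conn: sqlite3.Connection, raw_input: str) -> list:
    """Return rows whose stored number matches once spaces are ignored."""
    needle = raw_input.replace(" ", "")  # normalize the search value first
    cur = conn.execute(
        "SELECT name, phonenumber FROM ADDRESSBOOK "
        "WHERE REPLACE(phonenumber, ' ', '') = ?",
        (needle,),
    )
    return cur.fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ADDRESSBOOK (name TEXT, phonenumber TEXT)")
    conn.execute("INSERT INTO ADDRESSBOOK VALUES ('Ada', '0 123 456789')")
    # Both spellings of the number find the same row.
    print(find_by_phone(conn, "0123456789"))
    print(find_by_phone(conn, "0 123 456789"))
```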
If the phone numbers are stored in a specific format, then you can insert spaces at the matching positions in the search text and pass the result to the database query.
Take the number from the question, 0 123 456789, as an example.
If there is always a space after the first digit and another after the fourth digit, you could take the text from the textbox, insert a space at the second position and at the sixth position (the first space plus the next three digits puts the second space at position six), and pass that text to the database query.
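A minimal Python sketch of that reformatting, assuming every stored number follows the question's grouping of one digit, three digits, then the remainder (`to_stored_format` is a hypothetical helper name):

```python
def to_stored_format(user_input: str) -> str:
    """Rewrite a typed number into the '0 123 456789' layout used in the table."""
    digits = user_input.replace(" ", "")  # tolerate input that already has spaces
    # Grouping follows the question's example: 1 digit, 3 digits, the rest.
    return f"{digits[:1]} {digits[1:4]} {digits[4:]}"


if __name__ == "__main__":
    print(to_stored_format("0123456789"))
```

This only helps if the stored format really is uniform; numbers saved with a different grouping would still be missed.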
An important part of Db design is ensuring data consistency. The more consistently it's stored, the easier it is to query. That's why you should make a point of ensuring your columns use the correct data types:
Dates/time columns should use an appropriate date/time type.
Number columns should use a numeric type of the appropriate size. (None of this numeric varchar rubbish.)
String columns should be of the appropriate length (whether char or varchar).
Columns with referential relationships should never store invalid references to the referenced table.
And similarly, you need to determine the exact format you wish to use when storing telephone numbers; and ensure that any time you store a number it's done so consistently.
Some queries will be complex enough as is. As soon as you're unable to rely on a consistent format, your queries to find data need to cater for all the possible variations. They'll be less likely to leverage indexes effectively.
I have seen argument in favour of storing telephone numbers as numeric data. (It is after all a "number".) Though I'm not really convinced because this approach would be unable to represent leading zeroes (which might be desirable).
Conclusion
Whenever you insert/update a telephone number, ensure it's stored in a consistent format. (NOTE: You can be flexible about how the number appears to your users. It's only the stored value that needs to be consistent.)
Whenever you search for a telephone number, convert the search value into the compatible format before searching.
It's up to you exactly where/how you do these conversions. But you might wish to consider CHECK constraints to ensure that if you failed to convert a number appropriately at some point, that it isn't accidentally stored in the incorrect format. E.g.
CONSTRAINT CK_NoSpacesInTelno CHECK (Telephone NOT LIKE '% %')
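To see how such a constraint behaves, here is a small Python sketch using the standard-library sqlite3 module as a stand-in for the real database. The `Contacts` table and the `try_insert` helper are assumptions for the demo; the constraint text is the one shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Contacts ("
    " Telephone TEXT,"
    " CONSTRAINT CK_NoSpacesInTelno CHECK (Telephone NOT LIKE '% %')"
    ")"
)


def try_insert(number: str) -> bool:
    """Return True if the row was stored, False if the CHECK rejected it."""
    try:
        conn.execute("INSERT INTO Contacts (Telephone) VALUES (?)", (number,))
        return True
    except sqlite3.IntegrityError:
        return False


if __name__ == "__main__":
    print(try_insert("0123456789"))    # consistent format
    print(try_insert("0 123 456789"))  # contains spaces
```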

SQL NVARCHAR(MAX) returning ASCII and Weird Characters instead of Text

I have an SQL Table and I'm trying to return the values as a string.
The values should be city names like Sydney, Melbourne, Port Macquarie, etc.
But when I run a select I either get blank results or, as shown in the first picture, some strange backwards L character. The column is an NVARCHAR(MAX).
SELECT ctGlobalName FROM Crm.Cities
Then I tried using MSSQL's Edit Top 200 Rows feature and I could see the names of the cities, but also all these weird ASCII characters.
Now I didn't create the database, I'm just running queries on it. Some things I've read have suggested it is a problem with the Collation. But the table is SQL_Latin1_General_CP1_CI_AS which matches the server collation.
I'm sure there must be something I can add to my select query to return the values as an ordinary string. Is there something I can do to my select query to return the expected format without the weird characters?
An NVARCHAR datatype can store Unicode characters, which are used for languages that are not supported by the ASCII character set i.e. non-English (or related) languages such as Chinese or Indonesian. If your SQL Server or Windows doesn't have that language installed then you might see strange-looking representations of the data.
On the other hand, it could also be that the application that updates this table has just stored bad data in that column.
Either way you might need to do some string manipulation to strip out the characters you don't want.
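If the unwanted characters turn out to be control characters (an assumption; the question does not say which code points are stored), a minimal Python sketch for stripping them after the SELECT could look like this. `strip_unprintable` is a hypothetical helper name.

```python
import unicodedata


def strip_unprintable(value: str) -> str:
    """Drop characters in the Unicode 'Other' categories (control, format, unassigned, ...)."""
    return "".join(ch for ch in value if unicodedata.category(ch)[0] != "C")


if __name__ == "__main__":
    raw = "Sydney\x00\x1f"  # made-up sample with trailing control characters
    print(repr(strip_unprintable(raw)))
```

Before stripping anything, it is worth inspecting the actual code points in a few rows, since the right fix depends on what was stored.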

Store string with special characters like quotes or backslash in postgresql table

I have a string with value
'MAX DATE QUERY: SELECT iso_timestamp(MAX(time_stamp)) AS MAXTIME FROM observation WHERE offering_id = 'HOBART''
But on inserting into postgresql table i am getting error:
org.postgresql.util.PSQLException: ERROR: syntax error at or near "HOBART".
This is probably because my string contains single quotes. I don't know my string value. Every time it keeps changing and may contain special characters like \ or something since I am reading from a file and saving into postgres database.
Please give a general solution to escape such characters.
As per the SQL standard, single quotes inside a string literal are escaped by doubling them, i.e.:
insert into table (column) values ('I''m OK')
If you replace every single quote in your text with two single quotes, it will work.
With standard_conforming_strings off, or inside E'...' escape strings, a backslash escapes the following character, so a literal backslash is similarly written as two backslashes:
insert into table (column) values ('Look in C:\\Temp')
You can use dollar quoting to avoid escaping the special characters in your string.
The query mentioned above, insert into table (column) values ('I''m OK'),
becomes insert into table (column) values ($$I'm OK$$).
To keep the delimiter from clashing with the value, you can put any tag between the two dollar signs, such as
insert into table (column) values ($aesc6$I'm OK$aesc6$).
Here $aesc6$ is the delimiter tag, so even if $$ is part of the value, it is treated as part of the value and not as a delimiter.
You appear to be using Java and JDBC. Please read the JDBC tutorial, which describes how to use parameterized queries to safely insert data without risking SQL injection problems.
Please read the prepared statements section of the JDBC tutorial and these simple examples in various languages including Java.
Since you're having issues with backslashes, not just 'single quotes', I'd say you're running PostgreSQL 9.0 or older, which defaults to standard_conforming_strings = off. In newer versions backslashes are only special if you use the PostgreSQL extension E'escape strings'. (This is why you should always include your PostgreSQL version in questions.)
You might also want to examine:
Why you should use prepared statements.
The PostgreSQL documentation on the lexical structure of SQL queries.
While it is possible to explicitly quote values, doing so is error-prone, slow and inefficient. You should use parameterized queries (prepared statements) to safely insert data.
In future, please include a code snippet that you're having a problem with and details of the language you're using, the PostgreSQL version, etc.
If you really must manually escape strings, you'll need to make sure that standard_conforming_strings is on and double the single quotes, e.g. 'don''t manually escape text'; or use PostgreSQL-specific E'escape strings where you \'backslash escape\' quotes'. But really, use prepared statements, it's way easier.
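To show the shape of a parameterized insert, here is a Python sketch that uses the standard-library sqlite3 module as a stand-in database so it runs without a server. The `file_lines` table and the helper names are assumptions for the demo. With JDBC the equivalent is a PreparedStatement with `?` placeholders, and with the psycopg2 driver for PostgreSQL the placeholder is `%s`.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical one-column table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE file_lines (content TEXT)")
    return conn


def save_text(conn: sqlite3.Connection, text: str) -> None:
    """Insert text through a bound parameter; no manual escaping is involved."""
    conn.execute("INSERT INTO file_lines (content) VALUES (?)", (text,))


if __name__ == "__main__":
    conn = make_demo_db()
    for line in ["WHERE offering_id = 'HOBART'", "Look in C:\\Temp"]:
        save_text(conn, line)
    for (content,) in conn.execute("SELECT content FROM file_lines"):
        print(content)
```

Because the value travels separately from the SQL text, quotes and backslashes in it never reach the SQL parser.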
Some possible approaches are:
use prepared statements
convert all special characters to their equivalent html entities.
use base64 encoding while storing the string, and base64 decoding while reading the string from the db table.
Approach 1 (prepared statements) can be combined with approaches 2 and 3.
Approach 3 (base64 encoding) converts the text to a string drawn from a 64-character ASCII alphabet without losing any information. However, you will not be able to run a full-text search on the encoded column.
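Approach 3 can be sketched in Python as a pair of helpers (the function names are hypothetical). The encoded form contains only letters, digits, '+', '/' and '=', so it holds no quotes or backslashes.

```python
import base64


def encode_for_storage(text: str) -> str:
    """Turn arbitrary text into a base64 string before writing it to the table."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def decode_from_storage(stored: str) -> str:
    """Recover the original text after reading it back from the table."""
    return base64.b64decode(stored).decode("utf-8")


if __name__ == "__main__":
    original = "offering_id = 'HOBART' and C:\\Temp"
    stored = encode_for_storage(original)
    print(stored)
    print(decode_from_storage(stored) == original)
```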
Unicode string literals in SQL Server start with N, and a single quote inside the literal is doubled, like this:
update table set stringField = N'/;l;sldl;''mess'

SQL Server : whitespaces in rows

I have this problem where in my database I get lots of empty spaces after my text,
In my email column I have "something@mail.com" followed by lots of trailing spaces.
My email column is nchar(255),
and this is happening in all of my tables.
Can anyone explain to me why is this happening and how to fix it?
CHAR and NCHAR will automatically right-pad a string with spaces to meet the defined length. Use NVARCHAR(255) instead of NCHAR(255).
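The padding behaviour can be modelled in a few lines of Python. This is only an illustration of fixed-width storage, not SQL Server itself, and `as_nchar` is a hypothetical helper. Values that are already padded would still need their trailing spaces trimmed after the column type is changed.

```python
def as_nchar(value: str, length: int) -> str:
    """Mimic a fixed-width column: pad on the right with spaces up to length."""
    return value.ljust(length)


if __name__ == "__main__":
    stored = as_nchar("something@mail.com", 255)
    print(len(stored))                # the padded width
    print(repr(stored.rstrip(" ")))   # trimming recovers the original text
```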
you should use nvarchar
SQL Server provides both datatypes to store character information. For the most part the two datatypes are identical in how you would work with them within SQL Server or from an application. The difference is that nvarchar is used to store unicode data, which is used to store multilingual data in your database tables. Other languages have an extended set of character codes that need to be saved and this datatype allows for this extension. If your database will not be storing multilingual data you should use the varchar datatype instead. The reason for this is that nvarchar takes twice as much space as varchar, this is because of the need to store the extended character codes for other languages from