NLSSORT Oracle to Snowflake - sql

I'm trying to convert the following code from Oracle to Snowflake:
order by nlssort(name, 'NLS_SORT=BINARY')
I know NLSSORT is not a function in Snowflake, but is there anything I can use as an alternative?

It should already be pretty similar to Snowflake's default sorting - you just need to check your database character set in Oracle (select * from nls_database_parameters where parameter='NLS_CHARACTERSET') and see whether its binary order differs from ASCII/UTF-8.
Oracle's documentation:
"If the value is BINARY, then comparison is based directly on byte values in the binary encoding of the character values being compared."
Snowflake's documentation:
"All data is sorted according to the numeric byte value of each character in the ASCII table. UTF-8 encoding is supported."
So I think you should be able to just do:
order by name
It's kind of odd that somebody would write that Oracle code to begin with, since BINARY is the default sort order (collation). But if your Oracle database uses a multilingual collation (which is uncommon) for other queries, I don't think you'll be able to easily emulate that in Snowflake.
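For reference, a minimal sketch of the check and the rewrite (my_table is a placeholder name):
-- Oracle: confirm the database character set before assuming byte order matches
select value from nls_database_parameters where parameter = 'NLS_CHARACTERSET';
-- Snowflake: the default sort is already by byte value, so this should suffice
select name from my_table order by name;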

Alternative for regexp_replace for BIGINT

I'm quite new at programming with Oracle and DB2 and have a question. I need to mask a field that has BIGINT as its datatype. But when I tried to execute a query with regexp_replace, I got the error SQLCODE=-420, SQLSTATE=22018.
Is there an alternative to regexp_replace for BIGINT?
Thanks a lot!
You can 'mask' integers by replacing all digits except the first and last with zeroes, using the following code (Oracle) - for example, 117888 becomes 100008:
select
N, -- source number
-- keep the first and last digits, zero out everything in between
FLOOR(N/POWER(10, FLOOR(LOG(10, N)))) * POWER(10, FLOOR(LOG(10, N))) + MOD(N, 10) MASKED
from a;
Depending on the platform and version of Db2, you might consider using CREATE MASK if available. That would ensure the data is always masked without needing to do it in every application.
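As a rough sketch of what that looks like on Db2 for LUW (10.1 and later) - the mask and group names here are illustrative, and the ELSE value must keep the column's BIGINT type:
CREATE MASK PERSON_ID_MASK ON Person.T_PERSON
FOR COLUMN PK_PERSON RETURN
CASE WHEN VERIFY_GROUP_FOR_USER(SESSION_USER, 'DBA') = 1
THEN PK_PERSON
ELSE 0 -- placeholder shown to unauthorized users; stays BIGINT
END
ENABLE;
ALTER TABLE Person.T_PERSON ACTIVATE COLUMN ACCESS CONTROL;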
A quick search seems to indicate that Oracle also has similar support, but calls it redaction. Masking in Oracle seems to be tied into subsetting and exporting data from production to DEV/TEST.
Do you really need a solution for both RDBMSs?
And if you really want to roll your own, you need to provide some examples of the masked values you want returned.
EDIT
Here is a part of the code. PK_PERSON has BIGINT as its datatype.
update Person.T_PERSON
set PK_PERSON = REGEXP_REPLACE(PK_PERSON, '[0-9]', '*')
where PK_PERSON in ('117888')
That's not going to work: you can't set a BIGINT column to a string. That's also not how masking works - masking generally refers to a process that happens when the data is read out of the DB.
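If read-time masking is enough, the -420 error should go away once the BIGINT is cast to a character type before regexp_replace is applied - a sketch, assuming a Db2 version that has REGEXP_REPLACE (11.1+ on LUW):
SELECT REGEXP_REPLACE(CAST(PK_PERSON AS VARCHAR(20)), '[0-9]', '*') AS masked_id
FROM Person.T_PERSON
WHERE PK_PERSON = 117888;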

Interpret numeric field as a string in SQL

I have a 64-bit integer field in my Postgres database, which is populated with 64 bit integer numbers. (Non) coincidentally, those numbers are actually 8-chars strings in ASCII format, little endian. For example, a number 5208208757389214273 is a numeric representation of a string "ABCDEFGH": it is 0x4847464544434241 in hex, where 0x41 is A, 0x42 is B, 0x43 is C and so forth.
I would like to convert those numbers purely for display purposes - i.e. find a way to leave them as numbers in the database, but be able to see them as strings when querying. Is there any way to do it in SQL? If not in SQL, is there anything I can do on the server side (install extensions, stored procedures, anything at all) which would allow this? This problem is trivially solvable with any script or programming language, but I do not know how to solve it with SQL.
P.S. And just one more time for the trigger-happy, duplicate-hammer-wielding bunch - this is not a question about translating a number like 5208208757389214273 to the string "5208208757389214273" (we have a lot of answers on how to do that, but it is not what I am looking for).
Use to_hex() to get a hexadecimal representation of the number. Then use decode() to turn it into a bytea. (Unfortunately I did not find any direct way from bigint to bytea.) Cast that to text and reverse() it, because of the endianness.
reverse(decode(to_hex(5208208757389214273), 'hex')::text)
ABCDEFGH
The bytea_output must be set to 'escape' for this to work properly -- use SET bytea_output = 'escape';.
(Tested on versions 9.4 and 9.6.)
An alternative way to achieve the same result without using SET is the following:
select reverse(encode(decode(to_hex(5208208757389214273),'hex'),'escape'))
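If you need this in many queries, one option is to wrap the conversion in a small SQL function - a sketch, where the name int8_to_ascii is my own invention:
CREATE FUNCTION int8_to_ascii(val bigint) RETURNS text AS $$
-- lpad guards against hex strings with an odd number of digits
SELECT reverse(encode(decode(lpad(to_hex(val), 16, '0'), 'hex'), 'escape'));
$$ LANGUAGE sql IMMUTABLE;
SELECT int8_to_ascii(5208208757389214273); -- ABCDEFGH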

How to check for BOM in postgres text columns?

We have some encoding issues and I need to check whether a BOM is already present in a PostgreSQL text column. I used
select convert(varbinary, columnXY) from tableXY where id = 1;
for MS SQL successfully, but can't find an equivalent conversion for PostgreSQL. I found this documentation and tried decode(columnXY, 'hex'), but that did not work.
You may consider the binary representation of the TEXT column by converting it to BYTEA (edit: not by a direct cast; better to use convert_to(text,'UTF-8') instead) and searching for the BOM sequence in it as a series of bytes.
As an SQL expression:
position('\xefbbbf'::bytea IN convert_to(your_text_column,'UTF-8'))=1
0 as the result of position(...) would mean the BOM is not in the string.
1 means it's at the beginning of the string.
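Putting it together with the question's names (tableXY/columnXY), you can find the affected rows and, if desired, strip the BOM - a sketch:
SELECT id FROM tableXY
WHERE position('\xefbbbf'::bytea IN convert_to(columnXY, 'UTF-8')) = 1;
-- strip the 3-byte BOM from matching rows:
UPDATE tableXY
SET columnXY = convert_from(substring(convert_to(columnXY, 'UTF-8') from 4), 'UTF-8')
WHERE position('\xefbbbf'::bytea IN convert_to(columnXY, 'UTF-8')) = 1;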

How to create portable inserts from SQL Server?

Now it generates inserts like
INSERT [Bla] ([id], [description], [name], [version])
VALUES (CAST(1 AS Numeric(19, 0)), convert(t...
It's very SQL Server specific. I would like to create a script that everybody can use, database agnostic. I have very simple data types - varchars, numbers, dates, bits(boolean).
I think
insert into bla values (1, 'die', '2001-01-01 11:11:11')
should work in all DBMSs, right?
Some basic rules:
Get rid of the square brackets. In your case they are not needed - not even in SQL Server. (At the same time make sure you never use reserved words or special characters in column or table names).
If you do need to use special characters or reserved words (which is not something I would recommend), then use the standard double quotes (e.g. "GROUP").
But remember that names are case sensitive then: my_table is the same as MY_TABLE, but "my_table" is different from "MY_TABLE" according to the standard. Again, this might vary between DBMSs and their configuration.
The CAST operator is standard and works on most DBMS (although not all support casting in all possible combinations).
convert() is SQL Server specific and should be replaced with an appropriate CAST expression.
Try to specify values in the correct data type, never rely on implicit data conversion (so do not use '1' for a number). Although I don't think casting a 1 to a numeric() should be needed.
Usually I also recommend using ANSI literals (e.g. DATE '2011-03-14') for DATE/TIMESTAMP values, but SQL Server does not support that, so it won't help you very much here.
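Following those rules, a portable version of the insert would look something like this (column names are illustrative; the date stays an ISO-formatted string because SQL Server lacks ANSI date literals):
INSERT INTO bla (id, description, created_at)
VALUES (1, 'die', '2001-01-01 11:11:11');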
A quick glance at the Wikipedia article on SQL will tell you a bit about the standardisation of SQL across different implementations, such as MS SQL, PostgreSQL, Oracle etc.
In short, there are a number of ANSI standards, but support for them varies across products.
The general way to support multiple database servers from your software product is to accept there are differences, code for them at the database level, and make your application able to call the same database access code irrespective of database server.
There are a number of problems with number formats that will not port between DBMSs; however, this pales when you look at the problems with dates and date formats. For instance, the default DATE format used in an Oracle DB depends on the whims of whoever installed the software. You can use date conversion functions to get Oracle to accept the common date formats - but these functions are Oracle specific.
Besides how do you know the table and column names will be the same on the target DB?
If you are serious about this, really need to port data between heterogeneous DBMSs, and know a bit of Perl, then try SqlFairy, which is available from CPAN. The sheer size of the download should be enough to convince you how complex this problem can be.

operations on blob data in informix

How can we use substring, trim, and length operations on text in a blob datatype? And how can we update a column of blob datatype using a query?
Thanks,
With difficulty!
First of all, which of the four types of blob are you discussing:
BYTE
TEXT
BLOB
CLOB
These come in pairs (like Sith Lords): there is a binary version (BYTE, BLOB) and a text version (TEXT, CLOB). There's also another pairing: old (BYTE, TEXT) and newer (BLOB, CLOB). The BYTE and TEXT types were introduced with Informix OnLine 4.00 in about 1989. The BLOB and CLOB types were introduced with Informix Universal Server 9.00 in 1996, and are also known as SmartBlobs.
However, there's a very real sense in which it doesn't matter which of the types you are referring to.
There are very few operations that can be performed on BYTE and TEXT blobs. They can be fetched and stored, but for all practical purposes, that's all. I believe you can use LENGTH to determine the length of a TEXT blob. I don't believe there are any methods available to update part of BYTE or TEXT blob; it is an all-or-nothing replacement. Further, the replacement is from a host variable of the appropriate type - there are no BYTE or TEXT literals.
The situation is a bit better with SmartBlobs, but I'm not an expert on them. There are mechanisms for obtaining a LO (large object) handle and then manipulating that, but I don't think those are available server-side (from SQL or SPL). I may be willfully not understanding what's available with the SmartBlobs, but I think the operations are only available from programming APIs and not within SQL. There are no BLOB or CLOB literals either. However, you can use SQL to load from files (FILETOBLOB, FILETOCLOB) and write to files (LOTOFILE) - with the files either on the server or on the client.
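For example, the file round-trip mentioned above looks roughly like this - the table, column, and paths are assumptions, and the last argument says whether the file lives on the 'client' or the 'server':
INSERT INTO docs (id, body) VALUES (1, FILETOCLOB('/tmp/in.xml', 'server'));
SELECT LOTOFILE(body, '/tmp/out.xml', 'server') FROM docs WHERE id = 1;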
I have already answered your question about substring: substring operation on blob text in informix. With blobs you can use the substring operator, but not the SUBSTRING() or SUBSTR() functions.
You can also use LENGTH(), but not TRIM().
Example code:
CREATE TABLE _text_test (id serial, txt_vch varchar(200), txt_text text);
INSERT INTO _text_test (txt_vch, txt_text) VALUES ('1234567890', '1234567890');
-- [3,5] is the substring operator; it works on VARCHAR and TEXT alike
SELECT txt_vch, txt_text, txt_vch[3,5], txt_text[3,5], length(txt_text) FROM _text_test;
In my example I used the TEXT blob type (Jonathan showed you more blob types; you should say in your question what kind of blob you use). The last select shows usage of the substring operator and the LENGTH() function. You can replace the LENGTH() function with other functions such as TRIM() to test them in your environment. In my case the TRIM() test ends with:
ODBC Error: -880 [Informix][Informix ODBC Driver][Informix]
Trim character and trim source must be of string data type.
The last select works well with the JDBC 3.70JC1 driver, but it seems that the ODBC 3.70TC1 driver has a bug and shows the first 3 chars, 123, instead of 345. Test it yourself.
In a recent version (12.10) there is a DBMS_LOB package.
However, it doesn't work as documented: for example, there is no dbms_lob.get_length function. Instead I've found that dbms_lob_get_length works as expected.
So for CLOB fields you have the following useful operations:
dbms_lob_get_length;
dbms_lob_instr;
dbms_lob_substr (unfortunately it gets data after get_length too);
I've also found one undocumented but very, very useful function: dbms_lob_new_clob, which takes an lvarchar argument and converts it to a CLOB.
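A quick sketch of those functions in use, assuming a hypothetical table my_clob_table with a CLOB column body, and assuming the argument order mirrors Oracle's DBMS_LOB (pattern/offset/occurrence for instr; amount/offset for substr):
SELECT dbms_lob_get_length(body),
dbms_lob_instr(body, 'needle', 1, 1),
dbms_lob_substr(body, 10, 1)
FROM my_clob_table;
-- build a CLOB from a string literal:
INSERT INTO my_clob_table (body) VALUES (dbms_lob_new_clob('<note>hi</note>'));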
I know that this answer is very late. I think it can be useful for other people searching for ways to handle blobs in Informix (I found this post a few days ago when I was starting mini-research into using blobs for storing XML).