How is a binary/bytes column sorted in sql? - sql

The following query sorts a binary column in BigQuery:
with tbl as (
select B'123' as col union all select B'234'
) select * from tbl order by col;
----------------------------------
Row col f0_
1 MTIz false
2 MjM0 false
Is there a convention for how a Binary column is sorted? The above example is tested against BigQuery.

The BINARY type is similar to CHAR, except that it stores binary strings rather than nonbinary strings. That is, it stores byte strings rather than character strings.
This means they have the binary character set and collation, and comparison and sorting are based on the numeric values of the bytes in the values.

Related

Get max on comma separated values in column

How to get max on comma separated values in Original_Ids column and get max value in one column and remaining ids in different column.
|Original_Ids | Max_Id| Remaining_Ids |
|123,534,243,345| 534 | 123,234,345 |
Upadte -
If I already have Max_id and just need below equation?
Remaining_Ids = Original_Ids - Max_id
Thanks
Thanks to the excellent possibilities of array manipulation in Postgres, this could be done relatively easy by converting the string to an array and from there to a set.
Then regular queries on that set are possible. With max() the maximum can be selected and with EXCEPT ALL the maximum can be removed from the set.
A set can then be converted to an array and with array_to_string() and the array can be converted to a delimited string again.
SELECT ids original_ids,
(SELECT max(un.id::integer)
FROM unnest(string_to_array(ids,
',')) un(id)) max_id,
array_to_string(ARRAY((SELECT un.id::integer
FROM unnest(string_to_array(ids,
',')) un(id)
EXCEPT ALL
SELECT max(un.id::integer)
FROM unnest(string_to_array(ids,
',')) un(id))),
',') remaining_ids
FROM elbat;
Another option would have been regexp_split_to_table() which directly produces a set (or regexp_split_to_array() but than we'd had the possible regular expression overhead and still had to convert the array to a set).
But nevertheless you just should (almost) never use delimited lists (nor arrays). Use a table, that's (almost) always the best option.
SQL Fiddle
You can use a window function (https://www.postgresql.org/docs/current/static/tutorial-window.html) to get the max element per unnested array. After that you can reaggregate the elements and remove the calculated max value from the array.
Result:
a max_elem remaining
123,534,243,345 534 123,243,345
3,23,1 23 3,17
42 42
56,123,234,345,345 345 56,123,234
This query needs only one split/unnest as well as only one max calculation.
SELECT
a,
max_elem,
array_remove(array_agg(elements), max_elem) as remaining -- C
FROM (
SELECT
*,
MAX(elements) OVER (PARTITION BY a) as max_elem -- B
FROM (
SELECT
a,
unnest((string_to_array(a, ','))::int[]) as elements -- A
FROM arrays
)s
)s
GROUP BY a, max_elem
A: string_to_array converts the string list into an array. Because the arrays are treated as string arrays you need the cast them into integer arrays by adding ::int[]. The unnest() expands all array elements into own rows.
B: window function MAX gives the maximum value of the single arrays as max_elem
C: array_agg reaggregates the elements through the GROUP BY id. After that array_remove removes the max_elem value from the array.
If you do not like to store them as pure arrays but as string list again you could add array_to_string. But I wouldn't recommend this because your data are integer arrays and not strings. For every further calculation you would need this string cast. A even better way (as already stated by #stickybit) is not to store the elements as arrays but as unnested data. As you can see in nearly every operation should would do the unnest before.
Note:
It would be better to use an ID to adress the columns/arrays instead of the origin string as in SQL Fiddle with IDs
If you install the extension intarray this is quite easy.
First you need to create the extension (you have to be superuser to do that):
create extension intarray;
Then you can do the following:
select original_ids,
original_ids[1] as max_id,
sort(original_ids - original_ids[1]) as remaining_ids
from (
select sort_desc(string_to_array(original_ids,',')::int[]) as original_ids
from bad_design
) t
But you shouldn't be storing comma separated values to begin with

Can SQL CASE return a different number of columns in WHEN clauses?

Is there a way to SELECT a different number of columns in different WHEN clauses of the same CASE statement?
For example
SELECT
CASE x
WHEN is y THEN show me 1 column
WHEN is z THEN show me 3 columns
END
FROM i;
The restriction is that all branches of the CASE expression must resolve to the same data type. The manual:
The data types of all the result expressions must be convertible to a
single output type. See Section 10.5 for more details.
If all your output columns have a compatible data type, you could use an ARRAY to include a variable number of columns (resulting in the same array type). Like:
SELECT CASE x
WHEN 1 THEN ARRAY[y]
WHEN 2 THEN ARRAY[x,y,z]
-- no ELSE defaults to NULL
END AS my_result_array
FROM tbl;
If not, you could cast to a common element type of your choice (text would be the safe default):
SELECT CASE x
WHEN 1 THEN ARRAY[x::text]
WHEN 2 THEN ARRAY[x::text, y::text, z::text]
END AS my_result_array
FROM tbl;
Or, to make it work for heterogeneous data types, you can use a composite type (row type) and pad with NULL values. Name, number and type of output columns are fixed and have to cover all possible result combinations.
CREATE TYPE foo (a int, b text, c date);
SELECT CASE x
WHEN 1 THEN (x, NULL, NULL)::foo
WHEN 2 THEN (x, y, z)::foo
END AS my_result_array
FROM tbl;
Or you could use a document type like json, hstore or xml to contain a variable number of column ...
Note that you get one result column either way, the workarounds just contain a variable payload - which can be decomposed in the next step.
No you can't since columns selection are static. If you really want to choose columns dynamically based on some condition then consider building a dynamic query for that purpose.

SQLite sort varchar that contains letter and number

I have a table where I have field traffic_num(varchar) that contains this data
A100
A586
A4594
A125
A2
A492
now I want to sort that data ascending order. It would be really easy if traffic_num contains only number (without letter A), then I could cast varchar to integer CAST(traffic_num as SIGNED INTEGER) ASC. But what to do in this situation?
You can use substr to first order by the first alphabet character and then by the number following it.
select * from tablename
order by substr(traffic_num,1,1), cast(substr(traffic_num,2) as signed integer)
A simple way to do this is by sorting by the length and then the value:
order by length(traffic_num), traffic_num
This works when the values are integers.

determine DB2 text string length

I am trying to find out how to write an SQL statement that will grab fields where the string is not 12 characters long. I only want to grab the string if they are 10 characters.
What function can do this in DB2?
I figured it would be something like this, but I can't find anything on it.
select * from table where not length(fieldName, 12)
From similar question DB2 - find and compare the lentgh of the value in a table field - add RTRIM since LENGTH will return length of column definition. This should be correct:
select * from table where length(RTRIM(fieldName))=10
UPDATE 27.5.2019: maybe on older db2 versions the LENGTH function returned the length of column definition. On db2 10.5 I have tried the function and it returns data length, not column definition length:
select fieldname
, length(fieldName) len_only
, length(RTRIM(fieldName)) len_rtrim
from (values (cast('1234567890 ' as varchar(30)) ))
as tab(fieldName)
FIELDNAME LEN_ONLY LEN_RTRIM
------------------------------ ----------- -----------
1234567890 12 10
One can test this by using this term:
where length(fieldName)!=length(rtrim(fieldName))
This will grab records with strings (in the fieldName column) that are 10 characters long:
select * from table where length(fieldName)=10
Mostly we write below statement
select * from table where length(ltrim(rtrim(field)))=10;

I'm confused about Sqlite comparisons on a text column

I've got an Sqlite database where one of the columns is defined as "TEXT NOT NULL". Some of the values are strings and some can be cast to a DOUBLE and some can be case to INTEGER. Once I've narrowed it down to DOUBLE values, I want to do a query that gets a range of data. Suppose my column is named "Value". Can I do this?
SELECT * FROM Tbl WHERE ... AND Value >= 23 AND Value < 42
Is that going to do some kind of ASCII comparison or a numeric comparison? INTEGER or REAL? Does the BETWEEN operator work the same way?
And what happens if I do this?
SELECT MAX(Value) FROM Tbl WHERE ...
Will it do string or integer or floating-point comparisons?
It is all explained in the Datatypes In SQLite Version 3 article. For example, the answer to the first portion of questions is
An INTEGER or REAL value is less than any TEXT or BLOB value. When an INTEGER or REAL is compared to another INTEGER or REAL, a numerical comparison is performed.
This is why SELECT 9 < '1' and SELECT 9 < '11' both give 1 (true).
The expression "a BETWEEN b AND c" is treated as two separate binary comparisons "a >= b AND a <= c"
The most important point to know is that column type is merely an annotation; SQLite is dynamically typed so each value can have any type.
you cant convert text to integer or double so you wont be able to do what you want.
If the column were varchar you could have a chance by doing:
select *
from Tbl
WHERE ISNUMERIC(Value ) = 1 --condition to avoid a conversion from string to int for example
and cast(value as integer) > 1 --rest of your conditions