Does BigQuery support custom sorting? - sql

I am trying to sort data with a CASE WHEN expression in the ORDER BY clause, but it looks like BigQuery doesn't support this, even though it works fine in other SQL environments. Can somebody share their thoughts on this?

Update (2021): BigQuery now supports ORDER BY with expressions, e.g.
SELECT event, COUNT(*) AS event_count
FROM events
GROUP BY event
ORDER BY (
CASE WHEN event='generated' THEN 1
WHEN event='sent' THEN 2
WHEN event='paid' THEN 3
ELSE 4
END
)

select x
from (
select x ,
case when x = 'a' then 'z' else x end as y
from
(select 'a' as x),
(select 'b' as x),
(select 'c' as x),
(select 'd' as x)
)
order by y desc

I think the documentation is pretty clear:
ORDER BY clause
... ORDER BY field1|alias1 [DESC|ASC], field2|alias2 [DESC|ASC] ...
The ORDER BY clause sorts the results of a query in ascending or
descending order of one or more fields. Use DESC (descending) or ASC
(ascending) to specify the sort direction. ASC is the default.
You can sort by field names or by aliases from the SELECT clause. To
sort by multiple fields or aliases, enter them as a comma-separated
list. The results are sorted on the fields in the order in which they
are listed.
So, BigQuery doesn't allow expressions in the ORDER BY. However, you can include the expression in the SELECT and then refer to it by the alias. So, BigQuery does support "custom sorting", but only by expressions in the SELECT.
Interestingly, Hive has a similar limitation.
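The two approaches discussed above (a CASE expression directly in ORDER BY, and the same expression aliased in the SELECT and then sorted on) can be compared side by side. This is only a sketch: it runs against SQLite through Python's built-in sqlite3 module rather than against BigQuery, and the table contents and the alias sort_key are made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for BigQuery (assumption: demo only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (event TEXT)")
conn.executemany(
    "INSERT INTO events (event) VALUES (?)",
    [("paid",), ("sent",), ("generated",), ("sent",), ("refunded",)],
)

# Variant 1: the CASE expression sits directly in ORDER BY.
direct = conn.execute(
    """
    SELECT event, COUNT(*) AS event_count
    FROM events
    GROUP BY event
    ORDER BY CASE event
        WHEN 'generated' THEN 1
        WHEN 'sent' THEN 2
        WHEN 'paid' THEN 3
        ELSE 4
    END
    """
).fetchall()

# Variant 2: compute the sort key in an inner SELECT under a hypothetical
# alias (sort_key), then order the outer query by that alias.
via_alias = conn.execute(
    """
    SELECT event, event_count
    FROM (
        SELECT event,
               COUNT(*) AS event_count,
               CASE event
                   WHEN 'generated' THEN 1
                   WHEN 'sent' THEN 2
                   WHEN 'paid' THEN 3
                   ELSE 4
               END AS sort_key
        FROM events
        GROUP BY event
    )
    ORDER BY sort_key
    """
).fetchall()

print(direct)
print(via_alias)
```

Both variants should return the rows in the same custom order, so the alias form is a reasonable fallback wherever expressions in ORDER BY are rejected.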

Related

How to write a custom sort using SQLite

I have a single table
Create Table Part(Part TEXT, Rev TEXT, DateCode Date, Unique(Part,Rev))
Is it possible to perform a custom sort by DateCode DESC in which the records with the same Part are still grouped together? For example, the result should look like this:
PART_1, B, 2022-02-14
PART_1, A, 1999-01-11
PART_2, C, 2000-02-24
PART_2, B, 1998-11-12
PART_2, A, 1998-11-10
My instinct tells me it must be done with
ORDER BY CASE WHEN....
But my knowledge is not good enough to continue. Please help me.
You can use MAX() window function in the ORDER BY clause to get the max DateCode of each part and sort by that descending:
SELECT *
FROM Part
ORDER BY MAX(DateCode) OVER (PARTITION BY Part) DESC,
Part, -- just in case 2 different parts have the same max DateCode
DateCode DESC;
See the demo.
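The window-function ordering above can be tried out with Python's built-in sqlite3 module. This is a sketch under two assumptions: the bundled SQLite is version 3.25 or later (needed for window functions), and the extra PART_0 row is invented here, purely to show where this ordering differs from a plain ORDER BY Part.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Part (Part TEXT, Rev TEXT, DateCode DATE, UNIQUE (Part, Rev))"
)
conn.executemany(
    "INSERT INTO Part VALUES (?, ?, ?)",
    [
        ("PART_2", "A", "1998-11-10"),
        ("PART_1", "A", "1999-01-11"),
        ("PART_2", "C", "2000-02-24"),
        ("PART_1", "B", "2022-02-14"),
        ("PART_2", "B", "1998-11-12"),
        ("PART_0", "A", "1990-05-01"),  # invented row, not from the question
    ],
)

# Groups are ordered by their newest DateCode; rows inside a group by DateCode.
by_newest = conn.execute(
    """
    SELECT Part, Rev, DateCode
    FROM Part
    ORDER BY MAX(DateCode) OVER (PARTITION BY Part) DESC,
             Part,
             DateCode DESC
    """
).fetchall()

# Plain two-column sort for comparison: groups are ordered by name instead.
by_name = conn.execute(
    "SELECT Part, Rev, DateCode FROM Part ORDER BY Part, DateCode DESC"
).fetchall()

for row in by_newest:
    print(row)
```

With the question's own data the two orderings coincide, because the part names happen to sort the same way as their newest dates. The invented PART_0 row breaks that coincidence: it lands last in by_newest but first in by_name.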
To me it looks like a simple case of sorting by Part first and DateCode second:
SELECT * FROM Part ORDER BY Part, DateCode DESC
SQL Fiddle for SQLite for this case here
I think I am surely missing something ..

How to determine the order of the result from my postgres query?

I have the following query:
SELECT
time as "time",
case
when tag = 'KEB1.DB_BP.01.STATUS.SOC' THEN 'SOC'
when tag = 'KEB1.DB_BP.01.STATUS.SOH' THEN 'SOH'
end as "tag",
value as "value"
FROM metrics
WHERE
("time" BETWEEN '2021-07-02T10:39:47.266Z' AND '2021-07-09T10:39:47.266Z') AND
(container = '1234') AND
(tag = 'KEB1.DB_BP.01.STATUS.SOC' OR tag = 'KEB1.DB_BP.01.STATUS.SOH')
GROUP BY 1, 2, 3
ORDER BY time desc
LIMIT 2
This is giving me the result:
Sometimes the order of the result changes from SOH -> SOC or from SOC -> SOH. I'm trying to modify my query so that I always get SOH first and then SOC. How can I achieve this?
Two of your rows have identical times. The ORDER BY uses only time as a key, and when key values are identical, the resulting order of those rows is arbitrary and indeterminate. It can change from one execution to the next.
To prevent this, add a further column to the ORDER BY so that each row's sort key is unique. In this case that would seem to be tag (descending, so that SOH comes before SOC):
order by "time" desc, tag desc
You want to show the two most recent rows. In your example these have the same date/time but they can probably also differ. In order to find the two most recent rows you had to apply an ORDER BY clause.
You want to show the two rows in another order, however, so you must place an additional ORDER BY in your query. This is done by selecting from your query result (i.e. putting your query into a subquery):
select *
from ( <your query here> ) myquery
order by tag desc;
Try this:
order by 1 desc, 2 desc
(order by the first column descending, then by the second column descending, so that SOH comes before SOC)
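The effect of a tie-breaking sort key can be seen in a small sketch. It uses SQLite via Python's sqlite3 module instead of Postgres, and the simplified table, the column name ts, and the sample values are assumptions made for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-in for the metrics table; ts replaces the "time" column.
conn.execute("CREATE TABLE metrics (ts TEXT, tag TEXT, value REAL)")
conn.executemany(
    "INSERT INTO metrics VALUES (?, ?, ?)",
    [
        ("2021-07-09T10:00:00Z", "SOC", 81.0),
        ("2021-07-09T10:00:00Z", "SOH", 99.0),
        ("2021-07-09T09:00:00Z", "SOH", 99.0),
        ("2021-07-09T09:00:00Z", "SOC", 80.0),
    ],
)

# ts alone ties for the two newest rows; tag DESC breaks the tie so that
# SOH is always returned before SOC.
latest = conn.execute(
    """
    SELECT ts, tag, value
    FROM metrics
    ORDER BY ts DESC, tag DESC
    LIMIT 2
    """
).fetchall()

print(latest)
```

Because every (ts, tag) pair is unique in this data, the sort key fully determines the row order and the result no longer depends on how the engine happens to scan the table.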

oracle decode group by warning

I have this code
for x_eo in ( select decode(mod(card_name_id,2),0,1,1,2) e_o, count(*) nr
from rp_Deck where session_id=p_session_id_in
and position<=35 group by mod(card_name_id,2) )
I am getting a SQL Developer warning that the select list is inconsistent with the GROUP BY.
SQL Developer suggests this fix:
select decode(mod(card_name_id,2),0,1,1,2) e_o, count(*) nr
from rp_Deck where session_id=p_session_id_in
and position<=35 group by mod(card_name_id,2), card_name_id, 2, decode(mod(card_name_id,2),0,1,1,2) )
What is the difference between these two GROUP BY clauses?
Thanks !
In general, when you use GROUP BY in a statement then all the values either need to be:
constants;
within aggregation functions; or
in the GROUP BY clause.
SQL Developer does not realise that decode(value_mod_2,0,1,1,2) is effectively just adding 1 to the value and does not change the allocation of items to groups so, since it is not either a constant or an aggregation function, it expects the entire function to be in the GROUP BY clause.
Personally, I would write it as:
select mod(card_name_id,2) + 1 e_o,
count(*) nr
from rp_Deck
where session_id=p_session_id_in
and position<=35
group by mod(card_name_id,2)
(the + 1 is a constant so does not need to be in the GROUP BY clause)
The solution SQL Developer proposes is wrong as:
select decode(mod(card_name_id,2),0,1,1,2) e_o,
count(*) nr
from rp_Deck
where session_id=p_session_id_in
and position<=35
group by
mod(card_name_id,2),
card_name_id,
2,
decode(mod(card_name_id,2),0,1,1,2)
is effectively the same as just grouping by the finest grained grouping, so:
group by card_name_id;
Which is not what you want to group by. To be the same as your original query's intended output, it should propose something like:
group by
mod(card_name_id,2),
decode(mod(card_name_id,2),0,1,1,2)
or more simply just:
group by
decode(mod(card_name_id,2),0,1,1,2)
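The difference between grouping by the parity expression and grouping by the finest-grained column can be checked with a small sketch. It uses SQLite through Python's sqlite3 module, so DECODE and MOD are replaced by the equivalent `% 2 + 1` arithmetic; the table is cut down to one column and the sample ids are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced stand-in for rp_Deck: only the column used for grouping.
conn.execute("CREATE TABLE rp_deck (card_name_id INTEGER)")
conn.executemany(
    "INSERT INTO rp_deck VALUES (?)", [(i,) for i in (1, 2, 3, 4, 5, 7)]
)

# Intended grouping: one row per parity (1 = even ids, 2 = odd ids).
coarse = conn.execute(
    """
    SELECT card_name_id % 2 + 1 AS e_o, COUNT(*) AS nr
    FROM rp_deck
    GROUP BY card_name_id % 2
    ORDER BY e_o
    """
).fetchall()

# Adding card_name_id to GROUP BY splits every id into its own group,
# which is what the tool-suggested clause effectively does.
fine = conn.execute(
    """
    SELECT card_name_id % 2 + 1 AS e_o, COUNT(*) AS nr
    FROM rp_deck
    GROUP BY card_name_id % 2, card_name_id
    ORDER BY card_name_id
    """
).fetchall()

print(coarse)
print(fine)
```

The first query yields two rows with real counts, while the second yields one row per id, each with a count of 1.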

deterministic stats_mode in Oracle

In Oracle, the stats_mode function selects the mode of a set of data. Unfortunately, it is non-deterministic in picking its result in the presence of ties (e.g. stats_mode(1,2,1,2) could return 1 or 2 depending on the ordering of rows inside Oracle). In many situations this is not acceptable. Is there a function or nice technique for supplying your own deterministic ordering for the stats_mode function?
Oracle's web page on STATS_MODE explains: "If more than one mode exists, Oracle Database chooses one and returns only that one value."
As there are no additional parameters, etc., you cannot change its behaviour.
The same page, however, does also show that the following sample query can generate multiple mode values...
SELECT x FROM (SELECT x, COUNT(x) AS cnt1 FROM t GROUP BY x)
WHERE cnt1 = (SELECT MAX(cnt2) FROM (SELECT COUNT(x) AS cnt2 FROM t GROUP BY x));
By modifying such code you could once again just choose a single value, as determined by a specified ORDER...
SELECT x FROM (
  SELECT x FROM (SELECT x, MAX(y) AS y, COUNT(x) AS cnt1 FROM t GROUP BY x)
  WHERE cnt1 = (SELECT MAX(cnt2) FROM (SELECT COUNT(x) AS cnt2 FROM t GROUP BY x))
  ORDER BY y DESC
)
WHERE rownum = 1; -- rownum is assigned before ORDER BY, so filter outside the sorted subquery
A bit messy, unfortunately, though you may be able to tidy it slightly for your particular case. But I'm not aware of alternative fundamentally different approaches.
Selecting the value with the highest frequency among a set of values can also be done by counting and ordering (FETCH FIRST requires Oracle 12c or later; many other databases use LIMIT 1 instead).
select x from t group by x order by count(*) desc fetch first 1 row only;
You can also make it deterministic by ordering on the value itself as a tie-breaker.
select x from t group by x order by count(*) desc, x desc fetch first 1 row only;
I don't quite understand the complexity of Oracle's query examples; their performance is really bad. Can anyone shed some light on the difference?
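The count-and-order approach with a tie-breaker can be tried out in a small sketch. It runs on SQLite through Python's sqlite3 module (hence LIMIT 1 rather than Oracle syntax), and the sample values are invented so that 1 and 2 tie for the mode.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
# 1 and 2 both occur twice, so the mode is tied.
conn.executemany("INSERT INTO t VALUES (?)", [(1,), (2,), (1,), (2,), (3,)])

# Tie broken in favour of the largest value.
mode_largest = conn.execute(
    "SELECT x FROM t GROUP BY x ORDER BY COUNT(*) DESC, x DESC LIMIT 1"
).fetchone()[0]

# Tie broken in favour of the smallest value.
mode_smallest = conn.execute(
    "SELECT x FROM t GROUP BY x ORDER BY COUNT(*) DESC, x ASC LIMIT 1"
).fetchone()[0]

print(mode_largest, mode_smallest)
```

The second ORDER BY key is what makes the choice repeatable; any other column or expression could serve as the tie-breaker as long as it gives each candidate a distinct rank.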

How to do this query in T-SQL

I have table with 3 columns A B C.
I want to select * from this table, but ordered by a specific ordering of column A.
In other words, let's say column A contains "stack", "over", "flow".
I want to select * from this table, and order by column A in this specific ordering: "stack", "flow", "over" - which is neither ascending nor descending.
Is it possible?
You can use a CASE statement in the ORDER BY clause. For example ...
SELECT *
FROM Table
ORDER BY
CASE A
WHEN 'stack' THEN 1
WHEN 'flow' THEN 2
WHEN 'over' THEN 3
ELSE NULL
END
Check out Defining a Custom Sort Order for more details.
A couple of solutions:
Create another table with your sort order, join on Column A to the new table (which would be something like TERM as STRING, SORTORDER as INT). Because things always change, this avoids hard coding anything and is the solution I would recommend for real world use.
If you don't want the flexibility of adding new terms and orders, just use a CASE expression to transform each term into a number:
CASE A WHEN 'stack' THEN 1 WHEN 'flow' THEN 2 WHEN 'over' THEN 3 END
and use it in your ORDER BY.
If you have a lot of elements with custom ordering, you could add those elements to a table and give each one an order value. Join with that table, and each element gets its custom order value.
select
main.a,
main.b,
main.c
from dbo.tblMain main
left join tblOrder rank on rank.a = main.a
order by rank.OrderValue
If you have only 3 elements as suggested in your question, you could use a case in the order by...
select
*
from dbo.tblMain
order by case
when a='stack' then 1
when a='flow' then 2
when a='over' then 3
else 4
end
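The lookup-table approach can be sketched end to end. This runs on SQLite through Python's sqlite3 module rather than SQL Server; the extra 'other' row, the alias ord, and the fallback value 999 are assumptions added to show what happens to terms that are missing from the order table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblMain (a TEXT, b INTEGER, c INTEGER)")
conn.execute("CREATE TABLE tblOrder (a TEXT PRIMARY KEY, OrderValue INTEGER)")
conn.executemany(
    "INSERT INTO tblMain VALUES (?, ?, ?)",
    [("over", 1, 1), ("flow", 2, 2), ("stack", 3, 3), ("other", 4, 4)],
)
# The desired custom order lives in data, not in the query text.
conn.executemany(
    "INSERT INTO tblOrder VALUES (?, ?)",
    [("stack", 1), ("flow", 2), ("over", 3)],
)

rows = conn.execute(
    """
    SELECT main.a
    FROM tblMain AS main
    LEFT JOIN tblOrder AS ord ON ord.a = main.a
    -- Terms without an entry get an arbitrary large rank so they sort last.
    ORDER BY COALESCE(ord.OrderValue, 999), main.a
    """
).fetchall()
ordered = [r[0] for r in rows]

print(ordered)
```

Changing the order later then only requires updating rows in tblOrder. The COALESCE is a design choice: without it, unmatched terms carry a NULL rank, and where NULLs sort differs between database engines.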