Select all columns grouping by version - Postgres - sql

I need to query all columns in a table of all customers, the main factor being the latest version for each customer.
My table:
My Query:
SELECT DISTINCT ON(code)
code,
namefile,
versioncol,
status
FROM table_A
ORDER BY versioncol desc
Error:
ERROR: SELECT DISTINCT ON expressions must match initial ORDER BY expressions
LINE 1: SELECT DISTINCT ON(code)

Postgres' error message is trying to tell you what to do:
DISTINCT ON expressions must match initial ORDER BY expressions
Actually that's quite clear: to make your code a valid DISTINCT ON query, you just need to add code (that's the DISTINCT ON expression) as a first sorting criteria to the query (ie as initial ORDER BY).
SELECT DISTINCT ON(code) a.*
FROM table_A a
ORDER BY code, versioncol DESC

Related

What's the meaning of this sql statement (order by count(*))?

What could be the meaning of this sql statement ?
select * from tab1 order by (select count(*) from tab2) desc
The below line just returns the number of rows in tab2, which is some constant number
select count(*) from tab2
Consider the columns numbered 1 through n where n is the last column.
select * from tab1 order by 1
would order by the first column
select * from tab1 order by 2
would order by the second column and etc.
If n is larger than the number of columns then you'll run into a problemEDIT
You are using a subquery however and having
select * from tbl1 order by (select 1000)
does not cause a problem if you have <1000 columns, it seems to do nothing; the query may be missing some information
The result is to order by the column whose index is the count returned by the inner query in the ORDER BY clause. Whoever wrote this, especially without a comment, should be hanged by body parts important for reproduction.
The answer is based in Microsoft SQL functionality [edit:] where subquery in ORDER BY (subquery) expression indicates sort value.
Here's how I see it: since tab2 is not linked to tab1 in a subquery, the SQL can be reduced to:
select * from tab1 order by (SELECT <CONSTANT>) desc
therefore it's equivalent to:
select * from tab1
Quite frankly all that query is going to do is return all records from tab1 in some unknown order.
The order by clause is a bit asinine because the value returned will always be the count of all records in tab2.
I suspect it's missing a where clause on the (select count(*) from tab2) part. Something along the lines of (select count(*) from tab2 t where t.tab1id = tab1.id) Although it's hard to say without knowing the structure of those two tables.
The ORDER BY is equivalent to ORDER BY 'X'; that is, it has no effect. It does not order by the column number referenced by the count(*) in the second query -- it is not equivalent to order by 3 if the second table has three rows.
See fiddles for Oracle, MySQL, and SQL Server. If the ORDER BY was based on the count(*), the result should then be sorted by the third column. None of them are. Also, a count(*)+100 has no effect.

Using the DISTINCT keyword causes this error: not a SELECTed expression

I have a query that looks something like this:
SELECT DISTINCT share.rooms
FROM Shares share
left join share.rooms.buildingAdditions.buildings.buildingInfoses as bi
... //where clause omitted
ORDER BY share.rooms.floors.floorOrder, share.rooms.roomNumber,
share.rooms.firstEffectiveAt, share.shareNumber, share.sharePercent
Which results in the following exception:
Caused by: org.hibernate.exception.SQLGrammarException: ORA-01791: not a SELECTed expression
If I remove the DISTINCT keyword, the query runs without issue. If I remove the order by clause, the query runs without issue. Unfortunately, I can't seem to get the ordered result set without duplicates.
You are trying to order your result with columns that are not being calculated. This wouldn't be a problem if you didn't have the DISTINCT there, but since your query is basically grouping only by share.rooms column, how can it order that result set with other columns that can have multiple values for the same share.rooms one?
This post is a little old but one thing I did to get around this error is wrap the query and just apply the order by on the outside like so.
SELECT COL
FROM (
SELECT DISTINCT COL, ORDER_BY_COL
FROM TABLE
// ADD JOINS, WHERE CLAUSES, ETC.
)
ORDER BY ORDER_BY_COL;
Hope this helps :)
When using DISTINCT in a query that has an ORDER BY you must select all the columns you've used in the ORDER BY statement:
SELECT DISTINCT share.rooms.floors.floorOrder, share.rooms.roomNumber,
share.rooms.firstEffectiveAt, share.shareNumber, share.sharePercent
...
ORDER BY share.rooms.floors.floorOrder, share.rooms.roomNumber,
share.rooms.firstEffectiveAt, share.shareNumber, share.sharePercent

Using distinct on a column and doing order by on another column gives an error

I have a table:
abc_test with columns n_num, k_str.
This query doesnt work:
select distinct(n_num) from abc_test order by(k_str)
But this one works:
select n_num from abc_test order by(k_str)
How do DISTINCT and ORDER BY keywords work internally that output of both the queries is changed?
As far as i understood from your question .
distinct :- means select a distinct(all selected values should be unique).
order By :- simply means to order the selected rows as per your requirement .
The problem in your first query is
For example :
I have a table
ID name
01 a
02 b
03 c
04 d
04 a
now the query select distinct(ID) from table order by (name) is confused which record it should take for ID - 04 (since two values are there,d and a in Name column). So the problem for the DB engine is here when you say
order by (name).........
You might think about using group by instead:
select n_num
from abc_test
group by n_num
order by min(k_str)
The first query is impossible.
Lets explain this by example. we have this test:
n_num k_str
2 a
2 c
1 b
select distinct (n_num) from abc_test is
2
1
Select n_num from abc_test order by k_str is
2
1
2
What do you want to return
select distinct (n_num) from abc_test order by k_str?
it should return only 1 and 2, but how to order them?
How do extended sort key columns
The logical order of operations in SQL for your first query, is (simplified):
FROM abc_test
SELECT n_num, k_str i.e. add a so called extended sort key column
ORDER BY k_str DESC
SELECT n_num i.e. remove the extended sort key column again from the result.
Thanks to the SQL standard extended sort key column feature, it is possible to order by something that is not in the SELECT clause, because it is being temporarily added to it behind the scenes prior to ordering, and then removed again after ordering.
So, why doesn't this work with DISTINCT?
If we add the DISTINCT operation, it would need to be added between SELECT and ORDER BY:
FROM abc_test
SELECT n_num, k_str i.e. add a so called extended sort key column
DISTINCT
ORDER BY k_str DESC
SELECT n_num i.e. remove the extended sort key column again from the result.
But now, with the extended sort key column k_str, the semantics of the DISTINCT operation has been changed, so the result will no longer be the same. This is not what we want, so both the SQL standard, and all reasonable databases forbid this usage.
Workarounds
PostgreSQL has the DISTINCT ON syntax, which can be used here for precisely this job:
SELECT DISTINCT ON (k_str) n_num
FROM abc_test
ORDER BY k_str DESC
It can be emulated with standard syntax as follows, if you're not using PostgreSQL
SELECT n_num
FROM (
SELECT n_num, MIN(k_str) AS k_str
FROM abc_test
GROUP BY n_num
) t
ORDER BY k_str
Or, just simply (in this case)
SELECT n_num, MIN(k_str) AS k_str
FROM abc_test
GROUP BY n_num
ORDER BY k_str
I have blogged about SQL DISTINCT and ORDER BY more in detail here.
You are selecting the collection distinct(n_num) from the resultset from your query. So there is no actual relation with the column k_str anymore. A n_num can be from two rows each having a different value for k_str. So you can't order the collection distinct(n_num) by k_str.
According to SQL Standards, a SELECT clause may refer either to as clauses ("aliases") in the top level SELECT clause or columns of the resultset by ordinal position, and therefore nether of your queries would be compliant.
It seems Oracle, in common with other SQL implemetations, allows you to refer to columns that existed (logically) immediately prior to being projected away in the SELECT clause. I'm not sure whether such flexibility is such a good thing: IMO it is good practice to expose the sort order to the calling application by including the column/expressions etc in the SELECT clause.
As ever, you need to apply dsicpline to get meaningful results. For your first query, the definition of order is potentially entirely arbitrary.You should be grateful for the error ;)
This approach is available in SQL server 2000, you can select distinct values from a table and order by different column which is not included in Distinct.
But in SQL 2012 this will through you an error
"ORDER BY items must appear in the select list if SELECT DISTINCT is specified."
So, still if you want to use the same feature as of SQL 2000 you can use the column number for ordering(its not recommended in best practice).
select distinct(n_num) from abc_test order by 1
This will order the first column after fetching the result. If you want the ordering should be done based on different column other than distinct then you have to add that column also in select statement and use column number to order by.
select distinct(n_num), k_str from abc_test order by 2
When I got same error, I got it resolved by changing it as
SELECT n_num
FROM(
SELECT DISTINCT(n_num) AS n_num, k_str
FROM abc_test
) as tbl
ORDER BY tbl.k_str
My query doesn't match yours exactly, but it's pretty close.
select distinct a.character_01 , (select top 1 b.sort_order from LookupData b where a.character_01 = b.character_01 )
from LookupData a
where
Dataset_Name = 'Sample' and status = 200
order by 2, 1
did you try this?
SELECT DISTINCT n_num as iResult
FROM abc_test
ORDER BY iResult
you can do
select distinct top 10000 (n_num) --assuming you won't have more than 10,000 rows
from abc_test order by(k_str)

Give priority to ORDER BY over a GROUP BY in MySQL without subquery

I have the following query which does what I want, but I suspect it is possible to do this without a subquery:
SELECT *
FROM (SELECT *
FROM 'versions'
ORDER BY 'ID' DESC) AS X
GROUP BY 'program'
What I need is to group by program, but returning the results for the objects in versions with the highest value of "ID".
In my past experience, a query like this should work in MySQL, but for some reason, it's not:
SELECT *
FROM 'versions'
GROUP BY 'program'
ORDER BY MAX('ID') DESC
What I want to do is have MySQL do the ORDER BY first and then the GROUP BY, but it insists on doing the GROUP BY first followed by the ORDER BY. i.e. it is sorting the results of the grouping instead of grouping the results of the ordering.
Of course it is not possible to write
SELECT * FROM 'versions' ORDER BY 'ID' DESC GROUP BY 'program'
Thanks.
By definition, ORDER BY is processed after grouping with GROUP BY. By definition, the conceptual way any SELECT statement is processed is:
Compute the cartesian product of all tables referenced in the FROM clause
Apply the join criteria from the FROM clause to filter the results
Apply the filter criteria in the WHERE clause to further filter the results
Group the results into subsets based on the GROUP BY clause, collapsing the results to a single row for each such subset and computing the values of any aggregate functions -- SUM(), MAX(), AVG(), etc. -- for each such subset. Note that if no GROUP BY clause is specified, the results are treated as if there is a single subset and any aggregate functions apply to the entire results set, collapsing it to a single row.
Filter the now-grouped results based on the HAVING clause.
Sort the results based on the ORDER BY clause.
The only columns allowed in the results set of a SELECT with a GROUP BY clause are, of course,
The columns referenced in the GROUP BY clause
Aggregate functions (such as MAX())
literal/constants
expresssions derived from any of the above.
Only broken SQL implementations allow things like select xxx,yyy,a,b,c FROM foo GROUP BY xxx,yyy — the references to colulmsn a, b and c are meaningless/undefined, given that the individual groups have been collapsed to a single row,
This should do it and work pretty well as long as there is a composite index on (program,id). The subquery should only inspect the very first id for each program branch, and quickly retrieve the required record from the outer query.
select v.*
from
(
select program, MAX(id) id
from versions
group by program
) m
inner join versions v on m.program=v.program and m.id=v.id
SELECT v.*
FROM (
SELECT DISTINCT program
FROM versions
) vd
JOIN versions v
ON v.id =
(
SELECT vi.id
FROM versions vi
WHERE vi.program = vd.program
ORDER BY
vi.program DESC, vi.id DESC
LIMIT 1
)
Create an index on (program, id) for this to work fast.
Regarding your original query:
SELECT * FROM 'versions' GROUP BY 'program' ORDER BY MAX('ID') DESC
This query would not parse in any SQL dialect except MySQL.
It abuses MySQL's ability to return ungrouped and unaggregated expressions from a GROUP BY statement.

GROUP BY / aggregate function confusion in SQL

I need a bit of help straightening out something, I know it's a very easy easy question but it's something that is slightly confusing me in SQL.
This SQL query throws a 'not a GROUP BY expression' error in Oracle. I understand why, as I know that once I group by an attribute of a tuple, I can no longer access any other attribute.
SELECT *
FROM order_details
GROUP BY order_no
However this one does work
SELECT SUM(order_price)
FROM order_details
GROUP BY order_no
Just to concrete my understanding on this.... Assuming that there are multiple tuples in order_details for each order that is made, once I group the tuples according to order_no, I can still access the order_price attribute for each individual tuple in the group, but only using an aggregate function?
In other words, aggregate functions when used in the SELECT clause are able to drill down into the group to see the 'hidden' attributes, where simply using 'SELECT order_no' will throw an error?
In standard SQL (but not MySQL), when you use GROUP BY, you must list all the result columns that are not aggregates in the GROUP BY clause. So, if order_details has 6 columns, then you must list all 6 columns (by name - you can't use * in the GROUP BY or ORDER BY clauses) in the GROUP BY clause.
You can also do:
SELECT order_no, SUM(order_price)
FROM order_details
GROUP BY order_no;
That will work because all the non-aggregate columns are listed in the GROUP BY clause.
You could do something like:
SELECT order_no, order_price, MAX(order_item)
FROM order_details
GROUP BY order_no, order_price;
This query isn't really meaningful (or most probably isn't meaningful), but it will 'work'. It will list each separate order number and order price combination, and will give the maximum order item (number) associated with that price. If all the items in an order have distinct prices, you'll end up with groups of one row each. OTOH, if there are several items in the order at the same price (say £0.99 each), then it will group those together and return the maximum order item number at that price. (I'm assuming the table has a primary key on (order_no, order_item) where the first item in the order has order_item = 1, the second item is 2, etc.)
The order in which SQL is written is not the same order it is executed.
Normally, you would write SQL like this:
SELECT
FROM
JOIN
WHERE
GROUP BY
HAVING
ORDER BY
Under the hood, SQL is executed like this:
FROM
JOIN
WHERE
GROUP BY
HAVING
SELECT
ORDER BY
Reason why you need to put all the non-aggregate columns in SELECT to the GROUP BY is the top-down behaviour in programming. You cannot call something you have not declared yet.
Read more: https://sqlbolt.com/lesson/select_queries_order_of_execution
SELECT *
FROM order_details
GROUP BY order_no
In the above query you are selecting all the columns because of that its throwing an error not group by something like..
to avoid that you have to mention all the columns whichever in select statement all columns must be in group by clause..
SELECT *
FROM order_details
GROUP BY order_no,order_details,etc
etc it means all the columns from order_details table.
To use group by clause you have to mention all the columns from select statement in to group by clause but not the column from aggregate function.
TO do this instead of group by you can use partition by clause you can use only one port to group as a partition by.
you can also make it as partition by 1
use Common table expression(CTE) to avoid this issue.
multiple CTes also come handy, pasting a case where I have used...maybe helpful
with ranked_cte1 as
( select r.mov_id,DENSE_RANK() over ( order by r.rev_stars desc )as rankked from ratings r ),
ranked_cte2 as ( select * from movie where mov_id=(select mov_id from ranked_cte1 where rankked=7 ) ) select * from ranked_cte2
select * from movie where mov_id=902