&& operator changes the order of the result [duplicate] - sql

This question already has answers here:
Does order by in view guarantee order of select?
(3 answers)
Closed 6 years ago.
Why does the && operator of PostgreSQL 9.4, which checks whether two arrays overlap, change the order of the result?
I have a query
Select * FROM "View_Student_Plan" WHERE "ClassID" && ARRAY[53]:: bigint[]
My view is sorted by admission date.
The order is fine if I use
Select * FROM "View_Student_Plan"
But when I add the rest of the query, the order of the result changes.
I have used other conditions in the WHERE clause, such as Student_Name like 'P%', and they do not affect the order of the result. So why does "ClassID" && ARRAY[53]:: bigint[] change it?

In any RDBMS, the order of the output is not guaranteed unless you ask for one, and it can differ from one execution to the next. It depends on the indexes used, the plan the optimizer picks, the default setup, etc.
The only way to force a specific order (e.g. by admission date) is to use an ORDER BY clause.
EDIT: Regarding your comment: an ORDER BY inside a VIEW is not reliable. The clause has to be at the end of the outermost query; otherwise it can be ignored.
E.G.
SELECT * FROM YourView
ORDER BY....--THIS WILL WORK
When it's inside the view, it's like:
SELECT *
FROM (SELECT * FROM ....
ORDER BY ..)
So the optimizer is free to ignore this.
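A minimal sketch of the "ORDER BY on the outermost query" advice, run against an in-memory SQLite database from Python so it can be tried anywhere. The table, view and column names are made up for the illustration, and since SQLite has no array overlap operator, a plain equality filter stands in for the && condition:

```python
import sqlite3


def students_in_class(conn, class_id):
    """Return student names for one class, ordered by admission date."""
    rows = conn.execute(
        "SELECT name FROM view_student_plan "
        "WHERE class_id = ? "
        "ORDER BY admission_date",  # the ordering lives in the outermost query
        (class_id,),
    ).fetchall()
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (name TEXT, class_id INTEGER, admission_date TEXT);
    INSERT INTO student VALUES
        ('Priya', 53, '2015-03-01'),
        ('Arun',  53, '2014-07-15'),
        ('Meera', 54, '2014-01-10');
    -- The view deliberately has no ORDER BY of its own.
    CREATE VIEW view_student_plan AS SELECT * FROM student;
""")

print(students_in_class(conn, 53))  # ['Arun', 'Priya']
```

Whatever the WHERE condition does to the plan, the outer ORDER BY is what fixes the order of the rows.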

Related

Why can we not use the Where Clause on a Grouped statement? [duplicate]

This question already has answers here:
SQL - WHERE Condition on SUM()
(4 answers)
Closed 5 years ago.
For a simplified example of my issue, why could I not do this?
select id_number, sum(revenue)
from table A
where sum(revenue)>1000
group by id_number
(In case this causes any confusion, why can I not only return the results that have over 1000 in revenue?)
Disclaimer, I'm somewhat new to SQL but couldn't find any documentation regarding this.
Thanks,
This is by design in SQL. WHERE filters the rows of the source table, and WHERE, GROUP BY and HAVING are applied in the order they are written, so WHERE runs before the rows are grouped and summed. To filter on the SUM, you have to filter the already grouped result, which is what the HAVING clause is for. Use
select id_number, sum(revenue)
from table A
group by id_number
having sum(revenue) > 1000
The simple answer is that the WHERE clause is evaluated before the aggregation. Therefore, you are trying to filter on something that doesn't exist yet. However, you can solve that problem by making it exist first. Compute the totals in a subquery (here a common table expression), then select from that:
WITH RevenueTotals AS (SELECT id_number, sum(revenue) AS Rev_Total
FROM table A
GROUP BY id_number)
SELECT id_number, Rev_Total
FROM RevenueTotals
WHERE Rev_Total > 1000
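To try both points quickly, here is a small sketch using Python's built-in sqlite3 module. The sales table and its rows are invented for the example; the exact error text differs between database engines, so the sketch only checks that the WHERE version is rejected:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (id_number INTEGER, revenue INTEGER);
    INSERT INTO sales VALUES (1, 600), (1, 700), (2, 400), (2, 300), (3, 1500);
""")


def where_on_aggregate_error(conn):
    """Run the WHERE SUM(...) query and return the error message, if any."""
    try:
        conn.execute(
            "SELECT id_number, SUM(revenue) FROM sales "
            "WHERE SUM(revenue) > 1000 GROUP BY id_number"
        )
    except sqlite3.Error as exc:
        return str(exc)
    return None


def totals_over(conn, threshold):
    """Filter on the aggregate with HAVING, after the rows are grouped."""
    return conn.execute(
        "SELECT id_number, SUM(revenue) FROM sales "
        "GROUP BY id_number HAVING SUM(revenue) > ? "
        "ORDER BY id_number",
        (threshold,),
    ).fetchall()


print(where_on_aggregate_error(conn) is not None)  # True: the query is rejected
print(totals_over(conn, 1000))                     # [(1, 1300), (3, 1500)]
```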

Postgres SQL - Column does not exist [duplicate]

This question already has answers here:
Using an Alias column in the where clause in Postgresql
(6 answers)
Closed 6 years ago.
SELECT nmemail as order_email,
dtorder,
vlOrder,
cohorts.cohortdate
FROM factorderline
JOIN (SELECT nmemail as cohort_email, Min(dtorder) AS cohortDate FROM factorderline GROUP BY cohort_email limit 5) cohorts
ON order_email= cohort_email limit 5;
ERROR: column "order_email" does not exist
What is the problem with this query?
The problem is most likely that the definition of the column alias hasn't been parsed at the time the join is evaluated; use the actual column name instead:
SELECT nmemail as order_email,
dtorder,
vlOrder,
cohorts.cohortdate
FROM factorderline
JOIN (
SELECT nmemail as cohort_email, Min(dtorder) AS cohortDate
FROM factorderline
GROUP BY cohort_email limit 5
) cohorts ON nmemail = cohort_email
limit 5;
Also, when using limit, you really should use an order by clause.
From the docs:
When using LIMIT, it is important to use an ORDER BY clause that
constrains the result rows into a unique order. Otherwise you will get
an unpredictable subset of the query's rows.
The problem is that output column names can't be used in joins.
From the documentation:
An output column's name can be used to refer to the column's value in ORDER BY and GROUP BY clauses, but not in the WHERE or HAVING clauses; there you must write out the expression instead.
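For reference, a sketch of the corrected shape: join on the real column and add an ORDER BY so that LIMIT picks a predictable set of rows. It runs on SQLite through Python's sqlite3 module with invented sample rows. Note that SQLite is more lenient than PostgreSQL about where aliases may be referenced, so this only illustrates the working form and does not reproduce the original error:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE factorderline (nmemail TEXT, dtorder TEXT, vlorder REAL);
    INSERT INTO factorderline VALUES
        ('a@example.com', '2016-01-05', 10.0),
        ('a@example.com', '2016-02-01', 20.0),
        ('b@example.com', '2016-01-20', 15.0);
""")

QUERY = """
    SELECT o.nmemail AS order_email, o.dtorder, o.vlorder, cohorts.cohortdate
    FROM factorderline AS o
    JOIN (
        SELECT nmemail AS cohort_email, MIN(dtorder) AS cohortdate
        FROM factorderline
        GROUP BY nmemail
    ) AS cohorts
      ON o.nmemail = cohorts.cohort_email  -- real column, not the output alias
    ORDER BY o.nmemail, o.dtorder          -- gives LIMIT a defined order
    LIMIT 5
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```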

Why do partitions require nested selects?

I have a page to show 10 messages by each user (don't ask me why)
I have the following code:
SELECT *, row_number() over(partition by user_id) as row_num
FROM "posts"
WHERE row_num <= 10
It doesn't work.
When I do this:
SELECT *
FROM (
SELECT *, row_number() over(partition by user_id) as row_num FROM "posts") as T
WHERE row_num <= 10
It does work.
Why do I need a nested query to see the row_num column? Btw, in the first query I can actually see the column in the results, but I can't use WHERE on it.
It seems to be the same "rule" as for any query: column aliases aren't visible to the WHERE clause.
This will also fail:
SELECT id AS newid
FROM test
WHERE newid=1; -- must use "id" in WHERE clause
SQL Query like:
SELECT *
FROM table
WHERE <condition>
is executed in the following order:
3.SELECT *
1.FROM table
2.WHERE <condition>
so, as Joachim Isaksson says, columns in the SELECT clause are not visible in the WHERE clause, because of the processing order.
In your second query, the row_num column is produced in the FROM clause first, so it is visible in the WHERE clause.
Here is a simple list of the steps in the order they execute.
There is a good reason for this rule in standard SQL.
Consider the statement:
SELECT *, row_number() over (partition by user_id) as row_num
FROM "posts" p
WHERE row_num <= 10 and p.type = 'xxx';
When does the p.type = 'xxx' get evaluated relative to the row number? In other words, would this return the first ten rows of "xxx"? Or would it return the "xxx"s in the first ten rows?
The designers of the SQL language recognized that this is a hard problem to resolve. Allowing window functions only in the select clause resolves the issue.
You can check this topic and this one on dba.stackexchange.com about the order in which SQL executes a SELECT statement. I think it applies not only to PostgreSQL, but to all RDBMSs.
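A runnable sketch of the nested form, using Python's sqlite3 module (this needs an SQLite build with window function support, 3.25 or later). The posts table, its columns and the ORDER BY id DESC inside the window are assumptions made for the example; the original query has no ordering inside the partition, so which rows get numbers 1 to 10 is not defined there:

```python
import sqlite3


def latest_posts_per_user(conn, n):
    """Return up to n posts per user, newest first within each user."""
    return conn.execute(
        """
        SELECT user_id, body
        FROM (
            SELECT user_id, body,
                   ROW_NUMBER() OVER (
                       PARTITION BY user_id ORDER BY id DESC
                   ) AS row_num
            FROM posts
        ) AS t
        WHERE row_num <= ?          -- row_num is a plain column of t here
        ORDER BY user_id, row_num
        """,
        (n,),
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id INTEGER, body TEXT);
    INSERT INTO posts (user_id, body) VALUES
        (1, 'a1'), (1, 'a2'), (1, 'a3'), (2, 'b1');
""")

print(latest_posts_per_user(conn, 2))  # [(1, 'a3'), (1, 'a2'), (2, 'b1')]
```

The inner query computes the row numbers over all rows first; the outer WHERE then filters on the finished column, which removes the ambiguity described above.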

Exclude specific column from result in SQL Server [duplicate]

This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
SQL exclude a column using SELECT * [except columnA] FROM tableA?
I have following query and I want to exclude the column RowNum from the result, how can I do it ?
SELECT *
FROM
(SELECT
ROW_NUMBER() OVER ( ORDER BY [Report].[dbo].[Reflow].ReflowID ) AS RowNum, *
FROM
[Report].[dbo].[Reflow]
WHERE
[Report].[dbo].[Reflow].ReflowProcessID = 2) AS RowConstrainedResult
WHERE
RowNum >= 100 AND RowNum < 120
ORDER BY
RowNum
Thanks.
It's considered bad practice to not specify column names in your query.
You could push the data into a #temp table, then ALTER the columns in that #temp to DROP a COLUMN, then SELECT * FROM #temp.
This would be inefficient, but it will get you the result you are asking for. By default though, it's best to get into the habit of specifying all the columns you require. If someone ALTERs your initial table, even using the #temp method above, you'll end up with different columns.
Do not use * but give the field list you are interested in. That simple. Using a "*" is bad practice anyway as the order is not defined.
Because you want to order the results by RowNum, you cannot exclude this column from your results while still using SELECT *. You can save the result of your query in a temp table and then run another query on the temp table that names the columns you want to show (instead of select *). That way the results show all columns except RowNum and are still ordered by RowNum.
This should work. I don't know the names of your columns, so I used generic names. Try not to use *; it's considered bad practice and makes it difficult for people to read your code.
SELECT [column1],
[column2],
[etcetc]
FROM ( SELECT ROW_NUMBER() OVER(ORDER BY RowConstrainedResult.RowNum) [RN],
*
FROM ( SELECT ROW_NUMBER() OVER ( ORDER BY [Report].[dbo].[Reflow].ReflowID ) AS RowNum, *
FROM [Report].[dbo].[Reflow]
WHERE [Report].[dbo].[Reflow].ReflowProcessID = 2
) AS RowConstrainedResult
WHERE RowNum >= 100
AND RowNum < 120
) AS Renumbered -- closes the outer derived table; the alias name is illustrative
ORDER BY [RN]
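As a small runnable sketch of the "name the columns you want" approach: the outer query lists the columns explicitly and still filters and sorts on the row number from the derived table. This uses Python's sqlite3 module (SQLite 3.25 or later for ROW_NUMBER), not SQL Server, and the reflow table, its columns and the row range are invented for the example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE reflow (reflow_id INTEGER, reflow_process_id INTEGER, label TEXT);
    INSERT INTO reflow VALUES
        (10, 2, 'a'), (20, 1, 'x'), (30, 2, 'b'), (40, 2, 'c'), (50, 2, 'd');
""")

cursor = conn.execute("""
    SELECT reflow_id, label                -- row_num is simply not listed
    FROM (
        SELECT ROW_NUMBER() OVER (ORDER BY reflow_id) AS row_num, *
        FROM reflow
        WHERE reflow_process_id = 2
    ) AS row_constrained_result
    WHERE row_num >= 2 AND row_num < 4
    ORDER BY row_num
""")

column_names = [col[0] for col in cursor.description]
rows = cursor.fetchall()

print(column_names)  # ['reflow_id', 'label']
print(rows)          # [(30, 'b'), (40, 'c')]
```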

Is COUNT(fld) faster than COUNT(*)? [duplicate]

This question already has answers here:
Closed 12 years ago.
Possible Duplicate:
COUNT(id) vs. COUNT(*) in MySQL
Short but simple: in MySQL, would a SELECT COUNT(fld) AS count FROM tbl be faster than SELECT COUNT(*) AS count FROM tbl as I understand * is the "all" selector in MySQL.
Does COUNT(*) select all rows to compute a count, and therefore make a query like SELECT COUNT(id) less expensive? Or does it not really matter?
No, count(*) is faster than count(fld) (in the cases where there is a difference at all).
The count(fld) has to consider the data in the field, as it counts all non-null values.
The count(*) only counts the number of records, so it doesn't need access to the data.
SELECT COUNT(*) AS count FROM tbl
Assuming there's no WHERE clause, the above query doesn't even count the rows on a MyISAM table; it reads the stored row count directly. (InnoDB does not keep an exact row count, so there the rows still have to be counted.) Specifying a field instead of * forces MySQL to actually count the rows, so on MyISAM it's much faster to use * when there's no WHERE clause.
* is the “all” selector in MySQL
That's true when you SELECT columns, where the * is a shortcut for the whole column list.
SELECT * becomes SELECT foo, bar.
But COUNT(*) is not expanded to COUNT(foo,bar) which is nonsensical in SQL. COUNT is an aggregate function which normally needs one value per selected row.
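The semantic difference between the two forms is easy to check. A minimal sketch with Python's sqlite3 module (the tbl table and its rows are invented; the NULL handling shown is standard SQL behaviour, but nothing here says anything about speed in MySQL):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl (id INTEGER PRIMARY KEY, fld TEXT);
    INSERT INTO tbl (fld) VALUES ('x'), (NULL), ('y'), (NULL);
""")

# COUNT(*) counts rows; COUNT(fld) counts rows where fld IS NOT NULL.
count_all, count_fld = conn.execute(
    "SELECT COUNT(*), COUNT(fld) FROM tbl"
).fetchone()

print(count_all, count_fld)  # 4 2
```

So the two are only interchangeable when the column cannot be NULL.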