What is the Postgres "ALL" operator?

I was reading a query that had the all keyword within a function call:
select count(all 97);
┌───────────┐
│ count(97) │
╞═══════════╡
│ 1 │
└───────────┘
Elapsed: 11 ms
What does all (outside a subselect) do in postgres? I was having a hard time finding it in the documentation.

ALL is a "set quantifier", just like DISTINCT, for aggregate functions. It's defined in section 6.5 of the SQL-92 standard.
It means that all values need to be considered -- as in a multiset -- and not only distinct values -- as in a set. It's the default behavior if no quantifier is specified.
Excerpt from SQL-92:
6.5 <set function specification>
...
<general set function> ::=
<set function type>
<left paren> [ <set quantifier> ] <value expression> <right paren>
<set function type> ::= AVG | MAX | MIN | SUM | COUNT
<set quantifier> ::= DISTINCT | ALL
Syntax Rules
1) If <set quantifier> is not specified, then ALL is implicit.
...
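To see the quantifier in action, here is a minimal sketch that runs the aggregates through Python's built-in sqlite3 module rather than Postgres. The table readings and its column v are made up for illustration, and it assumes SQLite accepts the ALL quantifier in aggregate calls the same way.

```python
import sqlite3

# Hypothetical table: four rows, but only three distinct values.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (v INTEGER)")
conn.executemany("INSERT INTO readings VALUES (?)", [(97,), (97,), (98,), (99,)])

# No quantifier and ALL treat the input as a multiset (every row counts);
# DISTINCT treats it as a set (duplicates are collapsed first).
row = conn.execute(
    "SELECT count(v), count(ALL v), count(DISTINCT v), "
    "sum(ALL v), sum(DISTINCT v) FROM readings"
).fetchone()

print(row)
```

The first two counts should match, since ALL is what you get when no quantifier is written.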

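It seems to be just an explicit way to say "default behavior", as opposed to doing COUNT(DISTINCT ...). From the docs:
aggregate_name (expression [ , ... ] [ order_by_clause ] ) [ FILTER ( WHERE filter_clause ) ]
aggregate_name (ALL expression [ , ... ] [ order_by_clause ] ) [ FILTER ( WHERE filter_clause ) ]
aggregate_name (DISTINCT expression [ , ... ] [ order_by_clause ] ) [ FILTER ( WHERE filter_clause ) ]
aggregate_name ( * ) [ FILTER ( WHERE filter_clause ) ]
aggregate_name ( [ expression [ , ... ] ] ) WITHIN GROUP ( order_by_clause ) [ FILTER ( WHERE filter_clause ) ]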
The first form of aggregate expression invokes the aggregate once for each input row. The second form is the same as the first, since ALL is the default. The third form invokes the aggregate once for each distinct value of the expression (or distinct set of values, for multiple expressions) found in the input rows. The fourth form invokes the aggregate once for each input row; since no particular input value is specified, it is generally only useful for the count(*) aggregate function. The last form is used with ordered-set aggregate functions, which are described below.
https://www.postgresql.org/docs/current/sql-expressions.html#SYNTAX-AGGREGATES

Related

Databases that support SELECT * EXCEPT/REPLACE?

BigQuery supports the following notation for SQL:
select_list:
{ select_all | select_expression } [, ...]
select_all:
[ expression. ]*
[ EXCEPT ( column_name [, ...] ) ]
[ REPLACE ( expression [ AS ] column_name [, ...] ) ]
Meaning something like the following can be done:
SELECT * EXCEPT (id, socialSecurity)
And some other small things.
Do any databases support this? I find the EXCEPT clause useful, and although I know how to use REPLACE, I've never found a practical use case for it. Are there any practical uses of it (i.e., aside from made-up examples in the docs)?

Unable to use nested window functions in Looker

which is similar to MYSQL, and I'm having trouble understanding why I'm unable to use SUM and POWER together in a window function. Specifically, the SUM(POWER("DELTA"... line throws the following error:
SQL compilation error: Window function [AVG(CAST(VALUE AS NUMBER(38,3))) OVER (PARTITION BY ID)] may not be nested inside another window function.
Removing this line or moving it to the second select statement fixes the error. I think this is a more fundamental SQL misunderstanding I have. Any thoughts would be much appreciated!
WITH UTILS AS (
SELECT
ID,
VALUE AS "TEMP_CELSIUS",
AVG(VALUE) OVER(PARTITION BY ID) AS "TEMP_AVG",
VAR_POP(VALUE) OVER(PARTITION BY ID) As "TEMP_VAR",
STDDEV_POP(VALUE) OVER(PARTITION BY ID) As "TEMP_STD",
COUNT(VALUE) OVER(PARTITION BY ID) As "DEVICE_N",
(VALUE-"TEMP_AVG") AS "DELTA",
SUM(POWER("DELTA", 3)) OVER(PARTITION BY ID) AS "SKEW2"
FROM
TABLE1
)
SELECT
"SKEW2"
FROM
UTILS
Just to be clear, your problem wasn't POWER exactly, but that POWER referred to TEMP_AVG, a derived column involving a window function.
In standard SQL, we also can't refer to one derived column in a subsequent item in the same select list.
This is a problem:
SELECT a + b AS c
, c + d AS e
FROM t1
;
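As a quick way to try this out, here is a small sketch using Python's built-in sqlite3 module. SQLite is only a stand-in here (other engines may behave differently), and the sample row is made up. The first query references the alias c in the same select list, the second repeats the expression instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (a INTEGER, b INTEGER, d INTEGER)")
conn.execute("INSERT INTO t1 VALUES (1, 2, 10)")

# Referencing the alias c within the same select list is rejected.
try:
    conn.execute("SELECT a + b AS c, c + d AS e FROM t1").fetchall()
    alias_error = None
except sqlite3.OperationalError as exc:
    alias_error = str(exc)

# Repeating the underlying expression only uses columns of t1, so it works.
fixed = conn.execute("SELECT a + b AS c, (a + b) + d AS e FROM t1").fetchall()

print(alias_error)
print(fixed)
```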
Here are some fragments of the standard SQL specification that refer specifically to window functions and some supporting items.
You asked about "fundamental SQL". While your database might support non-standard behavior, it might be helpful to understand the behavior provided by standard SQL, at least from a foundational perspective, as a starting point.
This is just a small part of that detail from a slightly older version of the specification (the foundation document dated 2011-12). Much of this hasn't changed.
In Section: 7.12 <query specification> we have:
<query specification> ::= SELECT [ <set quantifier> ] <select list> <table expression>
You can see that your select list is followed by a table expression.
Your question is basically about what is allowed in the <select list>.
But to understand the select list behavior, we need to peek for a moment at the term called <table expression>:
<table expression> ::=
<from clause>
[ <where clause> ]
[ <group by clause> ]
[ <having clause> ]
[ <window clause> ]
In the corresponding Syntax Rules for a <query specification>, we find the following:
2) Let T be the result of the <table expression> simply contained in QSPEC.
Basically, T is the result of your FROM clause, which optionally includes WHERE, GROUP BY, etc.
Now let's jump to the expression you asked about, the window function.
The Syntax Rule which applies:
11) Each column reference contained in a <window function> shall unambiguously reference a column of T.
In your case, you were referring to a derived column in the current select list, not to a column of T.
That's the problem. The standard limits you to just columns of T.
Be aware, I'm picking out very tiny bits from the huge specification. But I think this points out the basic detail.
One common approach is to simply move the window function to a subsequent query expression, so that it refers to a T which contains the first derived column.
WITH cte1 AS (
SELECT ...
, AVG(...) OVER ... AS temp_avg
FROM t1
)
SELECT ...
, SUM(temp_avg) OVER ...
FROM cte1
;
In some cases you can simply repeat the expression behind the derived column instead of introducing a separate CTE or derived table. At least in standard SQL, that doesn't work here, because it would nest one window function inside another.
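For a runnable version of the staged approach, here is a sketch using Python's built-in sqlite3 module (window functions need SQLite 3.25 or later). The sample rows and the names cte1, cte2, temp_avg, delta and skew2 are illustrative, not taken from the asker's real data, and the cube is written as a product because POWER is not available in every SQLite build.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER, value REAL)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [(1, 1.0), (1, 2.0), (1, 6.0), (2, 5.0), (2, 5.0)],
)

query = """
WITH cte1 AS (
    -- Step 1: the first window function, computed over columns of table1.
    SELECT id, value,
           AVG(value) OVER (PARTITION BY id) AS temp_avg
    FROM table1
),
cte2 AS (
    -- Step 2: temp_avg is now an ordinary column of cte1.
    SELECT id, value - temp_avg AS delta
    FROM cte1
)
-- Step 3: the second window function only references columns of cte2.
SELECT DISTINCT id,
       SUM(delta * delta * delta) OVER (PARTITION BY id) AS skew2
FROM cte2
ORDER BY id
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Each window function only ever sees columns of the query stage before it, which is exactly what the syntax rule quoted above asks for.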

What is this TSQL V() method/syntax/function?

I have just come across this syntax for defining an inline view...
SELECT myAlias, myAlias1 FROM ( SELECT myCol, myCol1 FROM myTable ) V( myAlias, myAlias1)
I can see what the V is doing, but what is this called and where is it documented? And why would I ever want to do this when I can just define the aliases inside the inline view?
Googling seems not to be working because V is not a word!
V is just another alias - it's the alias for the whole subquery, not for an individual column.
See the derived table line from the syntax for FROM:
| derived_table [ [ AS ] table_alias ] [ ( column_alias [ ,...n ] ) ]
V is the table_alias.
On "when I can just define the aliases inside the inline view":
Yes, you often can. But sometimes you're building a complex query with lots of nesting in the individual column expressions, and it's easier to place all of the names (the table_alias and column_aliases) that will be exposed to the remaining parts of the query in one place.

What does 'AS' mean in SQL?

Below is the synopsis of SELECT from the PostgreSQL documentation. It seems to me that sometimes we write <expression> AS <name> and sometimes it's <name> AS <expression>. In ordinary English, I tend to think <expression> AS <name> is much more common (e.g. "Address her as Doctor Smith, please."), and I'm having trouble understanding how to think about <name> AS <expression>.
How can we distinguish between where to use <name> AS <expression> and <expression> as <name>?
What are minimal obvious examples of each?
Are there parallels of each kind in ordinary language, which would make it intuitively obvious when to use what?
[ WITH [ RECURSIVE ] with_query [, ...] ]
SELECT [ ALL | DISTINCT [ ON ( expression [, ...] ) ] ]
* | expression [ [ AS ] output_name ] [, ...]
[ FROM from_item [, ...] ]
[ WHERE condition ]
[ GROUP BY expression [, ...] ]
[ HAVING condition [, ...] ]
[ WINDOW window_name AS ( window_definition ) [, ...] ]
[ { UNION | INTERSECT | EXCEPT } [ ALL | DISTINCT ] select ]
[ ORDER BY expression [ ASC | DESC | USING operator ] [ NULLS { FIRST | LAST } ] [, ...] ]
[ LIMIT { count | ALL } ]
[ OFFSET start [ ROW | ROWS ] ]
[ FETCH { FIRST | NEXT } [ count ] { ROW | ROWS } ONLY ]
[ FOR { UPDATE | SHARE } [ OF table_name [, ...] ] [ NOWAIT ] [...] ]
where from_item can be one of:
[ ONLY ] table_name [ * ] [ [ AS ] alias [ ( column_alias [, ...] ) ] ]
( select ) [ AS ] alias [ ( column_alias [, ...] ) ]
with_query_name [ [ AS ] alias [ ( column_alias [, ...] ) ] ]
function_name ( [ argument [, ...] ] ) [ AS ] alias [ ( column_alias [, ...] | column_definition [, ...] ) ]
function_name ( [ argument [, ...] ] ) AS ( column_definition [, ...] )
from_item [ NATURAL ] join_type from_item [ ON join_condition | USING ( join_column [, ...] ) ]
and with_query is:
with_query_name [ ( column_name [, ...] ) ] AS ( select | values | insert | update | delete )
TABLE [ ONLY ] table_name [ * ]
I like the question.
Here is how I see it and how I explain it to people, hope it helps:
Let's start with <expression> as <name>. The simplest analogy from real life is an abbreviation. It was created to make the code cleaner, easier to read and simply shorter. Let's imagine a scenario: we have data on all students from the Massachusetts Institute of Technology as well as Massachusetts Department of Motor Vehicles in our database and we want to find students that have speeding tickets and how much they have paid.
SELECT
government.SocialSecurityAdministration.FirstName,
government.SocialSecurityAdministration.LastName,
education.MasachusettsInstituteOfTechnology.Faculty,
government.DepartmentOfMotorVehiclesTickets.TicketTotal,
government.SocialSecurityAdministration.SocialSecurityNumber
FROM
education.MasachusettsInstituteOfTechnology
INNER JOIN government.DepartmentOfMotorVehiclesTickets ON education.MasachusettsInstituteOfTechnology.SocialSecurityNumber = government.DepartmentOfMotorVehiclesTickets.SocialSecurityNumber
INNER JOIN government.SocialSecurityAdministration ON government.DepartmentOfMotorVehiclesTickets.SocialSecurityNumber = government.SocialSecurityAdministration.SocialSecurityNumber
Looks ugly, doesn't it? In real life, we've abbreviated the Massachusetts Institute of Technology to be MIT and the Department of Motor Vehicles to be DMV. I'm not aware of the official abbreviation for Social Security Administration (but we can come up with one) though we say SSN when we mean Social Security Number. Let's implement this idea:
SELECT
ssnAdm.FirstName,
ssnAdm.LastName,
mit.Faculty,
dmv.TicketTotal,
ssnAdm.SocialSecurityNumber AS ssn
FROM
education.MasachusettsInstituteOfTechnology AS mit
INNER JOIN government.DepartmentOfMotorVehiclesTickets AS dmv ON mit.SocialSecurityNumber = dmv.SocialSecurityNumber
INNER JOIN government.SocialSecurityAdministration AS ssnAdm ON dmv.SocialSecurityNumber = ssnAdm.SocialSecurityNumber
Looks better now, doesn't it?
Now to the <name> as <expression> portion of it. This is done to simplify the code and, in some cases, for performance, but let's focus on simplification for now. Using the same example as above, you might be asked the following: "For every MIT student that has received a ticket I need to know the last 4 digits of their SSN, their last name, the amount of money in their bank account and their last VISA transaction amount". Yes, you work for the CIA.
Let's write it:
SELECT
RIGHT(ts.ssn, 4) as LastFourDigitsSsn,
ts.LastName,
bad.TotalAmount,
ISNULL(visa.TransactionAmt,'Student uses MasterCard') AS VisaTransaction
FROM
(SELECT
ssnAdm.FirstName,
ssnAdm.LastName,
mit.Faculty,
dmv.TicketTotal,
ssnAdm.SocialSecurityNumber AS ssn
FROM
education.MasachusettsInstituteOfTechnology AS mit
INNER JOIN government.DepartmentOfMotorVehiclesTickets AS dmv ON mit.SocialSecurityNumber = dmv.SocialSecurityNumber
INNER JOIN government.SocialSecurityAdministration AS ssnAdm ON dmv.SocialSecurityNumber = ssnAdm.SocialSecurityNumber
) AS ts
INNER JOIN business.BankAccountsData AS bad ON ts.ssn = bad.SocialSecurityNumber
OUTER APPLY (SELECT TOP 1 TransactionAmt FROM business.VisaProcessingData vpd WHERE vpd.BankAccountID = bad.ID ORDER BY TransactionDateTime DESC) as visa
Well, looks ugly again. But what if we simplify it a bit and express certain things outside of the actual statement? That's when <name> as <expression> comes in. Let's do it:
WITH MitTicketedStudents AS (
SELECT
ssnAdm.LastName,
ssnAdm.SocialSecurityNumber as ssn,
RIGHT(ssnAdm.SocialSecurityNumber, 4) as LastFourDigitsSsn
FROM
education.MasachusettsInstituteOfTechnology AS mit
INNER JOIN government.DepartmentOfMotorVehiclesTickets AS dmv ON mit.SocialSecurityNumber = dmv.SocialSecurityNumber
INNER JOIN government.SocialSecurityAdministration AS ssnAdm ON dmv.SocialSecurityNumber = ssnAdm.SocialSecurityNumber
),
LatestVisaTransactions AS (
SELECT DISTINCT
BankAccountID,
FIRST_VALUE(TransactionAmt) OVER (PARTITION BY BankAccountId ORDER BY TransactionDateTime DESC) as TransactionAmt
FROM
business.VisaProcessingData
)
-- And let's use our expressions now
SELECT
mts.LastFourDigitsSsn,
mts.LastName,
bad.TotalAmount,
ISNULL(lvt.TransactionAmt,'Student uses MasterCard') AS VisaTransaction
FROM
MitTicketedStudents mts
INNER JOIN business.BankAccountsData AS bad ON mts.ssn = bad.SocialSecurityNumber
LEFT OUTER JOIN LatestVisaTransactions lvt ON bad.ID = lvt.BankAccountID;
Looks better, doesn't it?
Conclusion: use <name> as <expression> when you want to pull a piece of the query out and give it a name; use <expression> as <name> when you want to give something an alias to simplify the code.
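Since the question also asked for minimal examples of each form, here is a small sketch run through Python's built-in sqlite3 module. The names answer, nums, n and total are made up for illustration. The first query labels an expression, the second defines a named query up front and then selects from it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# <expression> AS <name>: label the value of an expression in the output.
cur = conn.execute("SELECT 6 * 7 AS answer")
alias_name = cur.description[0][0]
alias_value = cur.fetchone()[0]

# <name> AS <expression>: define a named query first, then use the name.
total = conn.execute(
    "WITH nums AS (SELECT 1 AS n UNION ALL SELECT 2 UNION ALL SELECT 3) "
    "SELECT sum(n) AS total FROM nums"
).fetchone()[0]

print(alias_name, alias_value, total)
```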
What matters is where it appears.
mytable:
mycolumn myothercolumn
----------------------
1 a
2 b
SELECT myrow.mycolumn * 2 AS mylabel
FROM (SELECT * FROM mytable) AS myrow
WHERE myrow.mycolumn > 1
mylabel
-------
4
In SELECT, we refer to the value of an expression AS some output column name ("column alias"). In FROM, we refer to (a typical row of) the value of a table expression AS some name ("table alias", "correlation name").
(It turns out that because of details of the grammar typos are less problematic if we use AS in SELECT clauses but don't use AS in FROM clauses.)
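One plausible reading of the SELECT half of that remark (my assumption, the point isn't spelled out above): because AS is optional, a forgotten comma between two column names silently turns the second name into an alias for the first. If you always write AS in the select list, a bare pair of names stands out as a mistake. A small sketch with Python's built-in sqlite3 module, reusing mytable from above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (mycolumn INTEGER, myothercolumn TEXT)")
conn.executemany("INSERT INTO mytable VALUES (?, ?)", [(1, "a"), (2, "b")])

# The comma between the two column names was accidentally left out,
# so myothercolumn is parsed as an alias for mycolumn: no error is raised.
cur = conn.execute("SELECT mycolumn myothercolumn FROM mytable ORDER BY mycolumn")
names = [d[0] for d in cur.description]
rows = cur.fetchall()

print(names)
print(rows)
```

The query returns a single column carrying the wrong label, rather than the two columns that were intended.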
There are other uses of AS. The context also determines what they mean, and they also correspond to using a name to refer to something.
In technical contexts it turns out not to be helpful to try to make sense of what something means based on the everyday meanings of technical terms, including making sense of what a thing is based on its name. The SQL language designers [sic] didn't choose to always have either <expression> AS <name> or <name> AS <expression>. That is just how it is. That is just how you write stuff to get your program to do stuff to stuff. (Accepted but more modern principles of computer language design do suggest more regular notations.)

Using an equality check between columns in a SELECT clause

I am using Microsoft SQL Server 2012 and I would like to run this seemingly simple query:
SELECT
FirstEvent.id AS firstEventID,
SecondEvent.id AS secondEventID,
DATEDIFF(second, FirstEvent.WndFGEnd, SecondEvent.WndFGStart) AS gap,
FirstEvent.TitleID = SecondEvent.TitleID AS titlesSameCheck
FROM VibeFGEvents AS FirstEvent
RIGHT OUTER JOIN VibeFGEvents AS SecondEvent
ON
FirstEvent.intervalMode = SecondEvent.intervalMode
AND FirstEvent.id = SecondEvent.id - 1
AND FirstEvent.logID = SecondEvent.logID
However FirstEvent.TitleID = SecondEvent.TitleID AS titlesSameCheck in the SELECT clause is incorrect syntax. But the SELECT Clause (Transact-SQL) documentation includes this syntax:
SELECT [ ALL | DISTINCT ]
[ TOP ( expression ) [ PERCENT ] [ WITH TIES ] ]
<select_list>
<select_list> ::=
{
*
| { table_name | view_name | table_alias }.*
| {
[ { table_name | view_name | table_alias }. ]
{ column_name | $IDENTITY | $ROWGUID }
| udt_column_name [ { . | :: } { { property_name | field_name }
| method_name ( argument [ ,...n] ) } ]
| expression
[ [ AS ] column_alias ]
}
| column_alias = expression
} [ ,...n ]
I think that means an expression is valid in the select clause and indeed the examples given include things like 1 + 2. Looking at the documentation for expressions:
{ constant | scalar_function | [ table_name. ] column | variable
| ( expression ) | ( scalar_subquery )
| { unary_operator } expression
| expression { binary_operator } expression
| ranking_windowed_function | aggregate_windowed_function
}
boolean equality checks are valid expressions and indeed the example expression given in the = (Equals) (Transact-SQL) documentation includes one:
SELECT DepartmentID, Name
FROM HumanResources.Department
WHERE GroupName = 'Manufacturing'
albeit in the WHERE clause not the SELECT clause. It looks like I cannot use = the equality operator to compare expressions in my SELECT clause as they are being wrongly interpreted as assignment.
How do I include a Boolean equality column comparison equivalent to FirstEvent.TitleID = SecondEvent.TitleID AS titlesSameCheck in my SELECT clause?
Like this:
case when FirstEvent.TitleID = SecondEvent.TitleID then 1 else 0 end as titlesSameCheck
You cannot use the Boolean type directly except in conditional statements (case, where, having, etc.)
The best way to solve your problem is to do something like:
select case when x = y then 'true' else 'false' end
The bit type is probably the closest to boolean.
select CAST(case when x = y then 1 else 0 end as bit)
Of course, use whichever two values best represent what you are after.
As the two existing answers state, boolean values can't be returned as a column value. This is documented in the Comparison Operators section:
Unlike other SQL Server data types, a Boolean data type cannot be
specified as the data type of a table column or variable, and cannot
be returned in a result set.
Given that restriction, using CASE to transform the value to something that can be displayed is your best alternative.
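Here is a small runnable sketch of the CASE approach using Python's built-in sqlite3 module. SQLite is only a stand-in (unlike SQL Server, it would also accept a bare x = y in the select list), and the table pairs with columns x and y is made up. The third row shows what happens when one side is NULL, as it can be after an outer join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pairs (x INTEGER, y INTEGER)")
conn.executemany("INSERT INTO pairs VALUES (?, ?)", [(1, 1), (1, 2), (None, 2)])

# CASE turns the comparison into an ordinary 1/0 value that can be returned.
# A NULL on either side makes x = y unknown, so the row falls through to ELSE.
rows = conn.execute(
    "SELECT x, y, CASE WHEN x = y THEN 1 ELSE 0 END AS same_check "
    "FROM pairs ORDER BY rowid"
).fetchall()

print(rows)
```

If unmatched rows should be told apart from genuine mismatches, a separate WHEN branch for NULL would be needed.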