How to explicitly access a new column in SQL from GROUP BY? - sql

I'm using SQLite.
I reduced my problem to this query:
WITH list_5([value]) AS (VALUES(1),(2),(3))
SELECT (t1.[value] * 2) AS [value], SUM(t1.[value]) AS [sum]
FROM [list_5] t1
GROUP BY value / 2;
I want SQL to group by the new value column, instead of the old one.
I can explicitly refer to the old value (t1.value), but how do I refer to the new one?
No matter what I do, group by uses the old value of value, instead of the column that's multiplied by 2.
The obvious solution would be to change the column names to unique names. But that's exactly what I'm trying to avoid.
Is there a way to do this?

You could use positional notation:
SELECT (t1.[value] * 2) AS [value], SUM(t1.[value]) AS [sum]
FROM [list_5] t1
GROUP BY 1;
Or give it a name that does not conflict with a table column:
SELECT (t1.[value] * 2) AS computed, SUM(t1.[value]) AS [sum]
FROM [list_5] t1
GROUP BY computed;

Repeat the same expression in the GROUP BY clause instead of using the alias:
WITH list_5([value]) AS (VALUES(1),(2),(3))
SELECT (t1.[value] * 2) AS [value], SUM(t1.[value]) AS [sum]
FROM [list_5] t1
GROUP BY (t1.[value] * 2);
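To check the two approaches that keep the [value] alias, here is a minimal sketch that runs the question's query through Python's built-in sqlite3 module. The helper name run and the added ORDER BY 1 are my own, only there to make the output deterministic; this is an illustration, not the only way to write it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The inline table from the question.
CTE = "WITH list_5([value]) AS (VALUES (1), (2), (3)) "


def run(group_by):
    """Run the question's SELECT with the given GROUP BY term (hypothetical helper)."""
    sql = (
        CTE
        + "SELECT (t1.[value] * 2) AS [value], SUM(t1.[value]) AS [sum] "
        + "FROM list_5 t1 GROUP BY " + group_by + " ORDER BY 1"
    )
    return conn.execute(sql).fetchall()


by_position = run("1")                   # group by the first result column
by_expression = run("(t1.[value] * 2)")  # group by the repeated expression

print(by_position)    # [(2, 1), (4, 2), (6, 3)]
print(by_expression)  # [(2, 1), (4, 2), (6, 3)]
```

Both forms group on the doubled value, so each input row ends up in its own group.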

Related

How to use two functions - MAX(smthng) and then COUNT(MAX(smthng))

I don't understand why I can't use this in my code :
SELECT MAX(SMTHNG), COUNT(MAX(SMTHNG))
FROM SomeTable;
Searched for an answer but didn't find it in documentation about these aggregate functions.
Also I get an SQL-compiler error "Invalid column name "SMTHNG"".
You want to know what the maximum SMTHNG in the table is with:
SELECT MAX(SMTHNG) FROM SomeTable;
This is an aggregation without GROUP BY and hence results in one single row containing the maximum SMTHNG.
Now you also want to know how often this SMTHNG occurs and you add COUNT(MAX(SMTHNG)). This, however, does not work, because you can not aggregate an aggregate directly.
This doesn't work either:
SELECT ANY_VALUE(max_smthng), COUNT(*)
FROM (SELECT MAX(smthng) AS max_smthng FROM sometable) t;
because the sub query only contains one row, so it's too late to count.
So, either use a sub query and select from the table again:
SELECT ANY_VALUE(smthng), COUNT(*)
FROM sometable
WHERE smthng = (SELECT MAX(smthng) FROM sometable);
Or count per SMTHNG before looking for the maximum. Here is how to get the counts:
SELECT smthng, COUNT(*)
FROM sometable
GROUP BY smthng;
And the easiest way to get the maximum from this result is:
SELECT TOP(1) smthng, COUNT(*)
FROM sometable
GROUP BY smthng
ORDER BY smthng DESC;
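As a quick check of "how often does the maximum occur", here is a small sketch in SQLite via Python's sqlite3. The sample rows are made up, MAX stands in for ANY_VALUE, and LIMIT 1 stands in for TOP(1), since SQLite has neither; treat it as an illustration under those assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sometable (smthng INTEGER)")
# Hypothetical sample data: the maximum (3) occurs three times.
conn.executemany("INSERT INTO sometable VALUES (?)", [(1,), (3,), (2,), (3,), (3,)])

# Variant 1: keep only the rows holding the maximum, then count them.
via_subquery = conn.execute(
    """
    SELECT MAX(smthng), COUNT(*)
    FROM sometable
    WHERE smthng = (SELECT MAX(smthng) FROM sometable)
    """
).fetchone()

# Variant 2: count per value first, then keep the row with the largest value.
via_group_by = conn.execute(
    """
    SELECT smthng, COUNT(*)
    FROM sometable
    GROUP BY smthng
    ORDER BY smthng DESC
    LIMIT 1
    """
).fetchone()

print(via_subquery)  # (3, 3)
print(via_group_by)  # (3, 3)
```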
First of all, please read my comment.
Depending on what you're trying to achieve, the statement has to be changed.
If you want to count the highest values in SMTHNG field, you may try this:
SELECT T1.SMTHNG, COUNT(T1.SMTHNG)
FROM SomeTable T1 INNER JOIN
(
SELECT MAX(SMTHNG) AS A
FROM SomeTable
) T2 ON T1.SMTHNG = T2.A
GROUP BY T1.SMTHNG;
Use a CTE as shown below, or a subquery:
with cte as
(
select count(*) as cnt ,col from table_name
group by col
) select max(cnt) from cte
You cannot nest one aggregate function directly inside another on the same column.

count all the distinct records in a table

I need to count all the distinct records in a table name with a single query and also without using any sub-query.
My code is
select count ( distinct *) from table_name
It gives an error:
Incorrect syntax near '*'.
I am using Microsoft SQL Server
Try this -
SELECT COUNT(*)
FROM
(SELECT DISTINCT * FROM [table_name]) A
I'm afraid that if you don't want to use a subquery, the only way to achieve that is replacing * with a concatenation of the columns in your table
select count(distinct concat(column1, column2, ..., columnN))
from table_name
To avoid undesired behaviours (like the concatenation of 1 and 31 becoming equal to the concatenation of 13 and 1) you could add a reasonable separator
select count(distinct concat(column1, '$%&£', column2, '$%&£', ..., '$%&£', columnN))
from table_name
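The separator issue is easy to reproduce. Below is a small sketch in SQLite via Python's sqlite3, so it uses the || operator instead of CONCAT; the two-column table, its rows, and the '|' separator are assumptions for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (column1 TEXT, column2 TEXT)")
# Two distinct records (one of them duplicated) that collide when concatenated.
conn.executemany(
    "INSERT INTO table_name VALUES (?, ?)",
    [("1", "31"), ("13", "1"), ("1", "31")],
)


def scalar(sql):
    """Return the single value produced by a one-row, one-column query."""
    return conn.execute(sql).fetchone()[0]


# '1' || '31' and '13' || '1' both give '131', so the count is too low.
no_separator = scalar(
    "SELECT COUNT(DISTINCT column1 || column2) FROM table_name"
)
# A separator that never appears in the data keeps the records apart.
with_separator = scalar(
    "SELECT COUNT(DISTINCT column1 || '|' || column2) FROM table_name"
)
# Reference answer using the subquery form.
via_subquery = scalar(
    "SELECT COUNT(*) FROM (SELECT DISTINCT * FROM table_name)"
)

print(no_separator, with_separator, via_subquery)  # 1 2 2
```

The separator only helps if it cannot occur inside the column values themselves.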
You can use CTE.
;WITH CTE AS
(
SELECT DISTINCT * FROM TableName
)
SELECT COUNT(*)
FROM CTE
Hope this query gives you what you required.
As others mentioned, you cannot use DISTINCT with *. Also it is good practice to use a column name instead of the *, like a unique key / primary key of the table.
SELECT COUNT( DISTINCT id )
FROM table
select distinct Name , count(Name) from TableName
group by Name
having count(Name)=1
select @@rowcount
I had the same issue involving a query that had multiple joins to tables and I could not simply do count(distinct *) or count(distinct alias.*).
My solution was to create a string made up of the key columns I cared about and count them.
SELECT Count(DISTINCT person.first || '~' || person.last)
from person;
If you want to use the DISTINCT keyword, you need to specify the column name on which you want to get distinct records.
Example:
SELECT count(DISTINCT column_name) FROM table_name

Summing Values From Calculated Column From Distinct Rows

I have data in the following format.
Column 1 and Value come from the database. I use a LEFT() function to extract Column 2. Where I need help is summing the values for each newly calculated Column 2 and listing the sums in a new column.
Any help is appreciated. Thanks.
Basically, you seem to want an aggregation with a function for the aggregation key:
select left(column1, 1), sum(value)
from t
group by left(column1, 1);
SELECT *
,CalculatedColumn2 = LEFT(Column1,1)
,Value
,CalculatedSum = SUM(Value) OVER (PARTITION BY LEFT(Column1,1))
FROM
Table
While Gordon's answer gets you the SUM per group, if you want it on every row you can use a partitioned window function such as SUM() OVER.
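To see the difference between the grouped sum and the per-row window sum, here is a sketch in SQLite via Python's sqlite3. The question's sample data was not preserved, so the rows below are invented, and substr(column1, 1, 1) stands in for LEFT(column1, 1), which SQLite does not have. Window functions need SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (column1 TEXT, value INTEGER)")
# Hypothetical rows; the first character of column1 is the grouping key.
conn.executemany("INSERT INTO t VALUES (?, ?)", [("A1", 10), ("A2", 5), ("B1", 7)])

# One row per key: aggregate with the expression as the GROUP BY key.
grouped = conn.execute(
    """
    SELECT substr(column1, 1, 1) AS column2, SUM(value)
    FROM t
    GROUP BY substr(column1, 1, 1)
    ORDER BY column2
    """
).fetchall()

# One row per input row: the same sum repeated via a window function.
windowed = conn.execute(
    """
    SELECT column1, value,
           SUM(value) OVER (PARTITION BY substr(column1, 1, 1)) AS calculated_sum
    FROM t
    ORDER BY column1
    """
).fetchall()

print(grouped)   # [('A', 15), ('B', 7)]
print(windowed)  # [('A1', 10, 15), ('A2', 5, 15), ('B1', 7, 7)]
```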
You can achieve it using window functions.
Try the following query
SELECT column1
, SUBSTRING(column1,1,1) as [calculated column 2]
,value
, SUM(value) OVER(PARTITION BY SUBSTRING(column1,1,1) )
FROM table1
And you can also use LEFT(Column1,1) instead of SUBSTRING(column1,1,1) (String Functions)
Get the first character from column1 using LEFT function and use that result set as a sub-query and find the sum of value column group by the new column.
Query
SELECT t.[column2], SUM(t.[value]) as [value] FROM(
SELECT [column1], LEFT([column1], 1) AS [column2], [value]
FROM [your_table_name]
)t
GROUP BY t.[column2];
Add a persisted computed column that holds left(column1, 1), so you don't have to repeat the expression.
alter table tbl add newcomputedcol as (left(column1, 1)) persisted
Then use a grouped subquery or the OVER clause:
select
*,
sum(value) over(partition by newcomputedcol) SumPerNewComputedCol
from tbl
The new persisted column can be indexed, and the query can be wrapped in a view.

Is there a way to access the "previous row" value in a SELECT statement?

I need to calculate the difference of a column between two lines of a table. Is there any way I can do this directly in SQL? I'm using Microsoft SQL Server 2008.
I'm looking for something like this:
SELECT value - (previous.value) FROM table
Imagine that the "previous" variable references the most recently selected row. Of course, with a select like that I will end up with n-1 rows selected from a table with n rows; that's not a problem, it's actually exactly what I need.
Is that possible in some way?
Use the lag function:
SELECT value - lag(value) OVER (ORDER BY Id) FROM table
Sequences used for Ids can skip values, so Id-1 does not always work.
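Here is a small sketch of that point in SQLite via Python's sqlite3 (LAG needs SQLite 3.25 or newer). The readings table and its rows are invented, with deliberate gaps in id, to compare LAG against an id - 1 join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (id INTEGER PRIMARY KEY, value INTEGER)")
# Hypothetical data; ids 3, 4 and 6-8 are missing on purpose.
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [(1, 10), (2, 15), (5, 12), (9, 20)],
)

# LAG looks at the previous row in ORDER BY order, whatever its id is.
with_lag = conn.execute(
    """
    SELECT id, value - LAG(value) OVER (ORDER BY id) AS diff
    FROM readings
    ORDER BY id
    """
).fetchall()

# Joining on id - 1 only pairs rows whose ids happen to be adjacent.
with_id_minus_one = conn.execute(
    """
    SELECT cur.id, cur.value - prev.value AS diff
    FROM readings cur
    JOIN readings prev ON prev.id = cur.id - 1
    ORDER BY cur.id
    """
).fetchall()

print(with_lag)           # [(1, None), (2, 5), (5, -3), (9, 8)]
print(with_id_minus_one)  # [(2, 5)]
```

The first row has no predecessor, so LAG yields NULL there; filter it out if you want exactly n-1 rows.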
SQL has no built in notion of order, so you need to order by some column for this to be meaningful. Something like this:
select t1.value - t2.value from table t1, table t2
where t1.primaryKey = t2.primaryKey - 1
If you know how to order things but not how to get the previous value given the current one (EG, you want to order alphabetically) then I don't know of a way to do that in standard SQL, but most SQL implementations will have extensions to do it.
Here is a way for SQL server that works if you can order rows such that each one is distinct:
select rank() OVER (ORDER BY id) as 'Rank', value into temp1 from t
select t1.value - t2.value from temp1 t1, temp1 t2
where t1.Rank = t2.Rank - 1
drop table temp1
If you need to break ties, you can add as many columns as necessary to the ORDER BY.
WITH CTE AS (
SELECT
rownum = ROW_NUMBER() OVER (ORDER BY columns_to_order_by),
value
FROM table
)
SELECT
cur.value - prev.value
FROM CTE cur
INNER JOIN CTE prev on prev.rownum = cur.rownum - 1
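The same ROW_NUMBER self-join can be tried out in SQLite via Python's sqlite3 (3.25 or newer). This is only a sketch: the readings table, its rows, and ordering by id are assumptions standing in for columns_to_order_by.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (id INTEGER PRIMARY KEY, value INTEGER)")
# Hypothetical data with gaps in id; ROW_NUMBER produces a gap-free sequence.
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [(1, 10), (2, 15), (5, 12), (9, 20)],
)

differences = conn.execute(
    """
    WITH cte AS (
        SELECT ROW_NUMBER() OVER (ORDER BY id) AS rownum, value
        FROM readings
    )
    SELECT cur.value - prev.value
    FROM cte cur
    INNER JOIN cte prev ON prev.rownum = cur.rownum - 1
    ORDER BY cur.rownum
    """
).fetchall()

print(differences)  # [(5,), (-3,), (8,)]
```

The inner join drops the first row, which gives the n-1 rows the question asks for.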
Oracle, PostgreSQL, SQL Server and many more RDBMS engines have analytic functions called LAG and LEAD that do this very thing.
In SQL Server prior to 2012 you'd need to do the following:
SELECT value - (
SELECT TOP 1 value
FROM mytable m2
WHERE m2.col1 < m1.col1 OR (m2.col1 = m1.col1 AND m2.pk < m1.pk)
ORDER BY
col1, pk
)
FROM mytable m1
ORDER BY
col1, pk
, where COL1 is the column you are ordering by.
Having an index on (COL1, PK) will greatly improve this query.
LEFT JOIN the table to itself, with the join condition worked out so the row matched in the joined version of the table is one row previous, for your particular definition of "previous".
Update: At first I was thinking you would want to keep all rows, with NULLs where there is no previous row. Reading it again, you just want those rows culled, so you should use an inner join rather than a left join.
Update:
Newer versions of Sql Server also have the LAG and LEAD Windowing functions that can be used for this, too.
select t2.col from (
select col,MAX(ID) id from
(
select ROW_NUMBER() over(PARTITION by col order by col) id ,col from testtab t1) as t1
group by col) as t2
The selected answer will only work if there are no gaps in the sequence. However if you are using an autogenerated id, there are likely to be gaps in the sequence due to inserts that were rolled back.
This method should work if you have gaps
declare @temp table (value int, primaryKey int, tempid int identity)
insert into @temp (value, primaryKey)
select value, primaryKey from mytable order by primaryKey
select t1.value - t2.value from @temp t1
join @temp t2
on t1.tempid = t2.tempid - 1
Another way to refer to the previous row in an SQL query is to use a recursive common table expression (CTE):
CREATE TABLE t (counter INTEGER);
INSERT INTO t VALUES (1),(2),(3),(4),(5);
WITH cte(counter, previous, difference) AS (
-- Anchor query
SELECT MIN(counter), 0, MIN(counter)
FROM t
UNION ALL
-- Recursive query
SELECT t.counter, cte.counter, t.counter - cte.counter
FROM t JOIN cte ON cte.counter = t.counter - 1
)
SELECT counter, previous, difference
FROM cte
ORDER BY counter;
Result:
counter | previous | difference
1       | 0        | 1
2       | 1        | 1
3       | 2        | 1
4       | 3        | 1
5       | 4        | 1
The anchor query generates the first row of the common table expression cte where it sets cte.counter to column t.counter in the first row of table t, cte.previous to 0, and cte.difference to the first row of t.counter.
The recursive query joins each row of the common table expression cte to the next row of table t, that is, the row whose counter is one greater. In the recursive query, the output column counter takes t.counter from that row of t, previous takes cte.counter from the preceding row of cte, and difference is t.counter - cte.counter, the difference between these two columns.
Note that a recursive CTE is more flexible than the LAG and LEAD functions because a row can refer to any arbitrary result of a previous row. (A recursive function or process is one where the input of the process is the output of the previous iteration of that process, except the first input which is a constant.)
I tested this query at SQLite Online.
You can use the following function to get the current row value and the previous row value:
SELECT value,
min(value) over (order by id rows between 1 preceding and 1
preceding) as value_prev
FROM table
Then you can just select value - value_prev from that select and get your answer
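A sketch of that two-step idea in SQLite via Python's sqlite3 (3.25 or newer). The readings table and its rows are made up, and the NULL filter is my addition to drop the first row, which has no predecessor.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (id INTEGER PRIMARY KEY, value INTEGER)")
# Hypothetical sample data.
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [(1, 10), (2, 15), (5, 12), (9, 20)],
)

# The frame contains exactly one row, the preceding one, so MIN just returns
# that row's value (or NULL when the frame is empty, i.e. on the first row).
differences = conn.execute(
    """
    SELECT value - value_prev
    FROM (
        SELECT id, value,
               MIN(value) OVER (
                   ORDER BY id
                   ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING
               ) AS value_prev
        FROM readings
    )
    WHERE value_prev IS NOT NULL
    ORDER BY id
    """
).fetchall()

print(differences)  # [(5,), (-3,), (8,)]
```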

MSSQL Select statement with incremental integer column... not from a table

I need, if possible, a T-SQL query that, while returning the values from an arbitrary table, also returns an incremental integer column with value = 1 for the first row, 2 for the second, and so on.
This column does not actually reside in any table, and must be strictly incremental, because the ORDER BY clause could sort the rows of the table and I want the incremental column to stay in sequence always.
The solution must run on SQL Server 2000
For SQL 2005 and up
SELECT ROW_NUMBER() OVER( ORDER BY SomeColumn ) AS 'rownumber',*
FROM YourTable
for 2000 you need to do something like this
SELECT IDENTITY(INT, 1,1) AS Rank ,VALUE
INTO #Ranks FROM YourTable WHERE 1=0
INSERT INTO #Ranks
SELECT SomeColumn FROM YourTable
ORDER BY SomeColumn
SELECT * FROM #Ranks
ORDER BY Rank
see also here Row Number
You can start with a custom number and increment from there, for example you want to add a cheque number for each payment you can do:
declare @StartChequeNumber int;
select @StartChequeNumber = 3446;
SELECT
((ROW_NUMBER() OVER(ORDER BY AnyColumn)) + @StartChequeNumber ) AS 'ChequeNumber'
,* FROM YourTable
will give the correct cheque number for each row.
Try ROW_NUMBER()
http://msdn.microsoft.com/en-us/library/ms186734.aspx
Example:
SELECT
col1,
col2,
ROW_NUMBER() OVER (ORDER BY col1) AS rownum
FROM tbl
It is ugly and performs badly, but technically this works on any table with at least one unique field AND works in SQL 2000.
SELECT (SELECT COUNT(*) FROM myTable T1 WHERE T1.UniqueField<=T2.UniqueField) as RowNum, T2.OtherField
FROM myTable T2
ORDER By T2.UniqueField
Note: If you use this approach and add a WHERE clause to the outer SELECT, you have to add it to the inner SELECT as well if you want the numbers to be continuous.
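The note about the WHERE clause can be seen in a small sketch, run here in SQLite via Python's sqlite3 rather than SQL Server 2000. The table contents and the filter UniqueField > 10 are invented for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (UniqueField INTEGER PRIMARY KEY, OtherField TEXT)")
# Hypothetical rows.
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?)",
    [(10, "a"), (20, "b"), (35, "c")],
)

# Filter only in the outer SELECT: the count still sees the filtered-out row,
# so numbering starts at 2.
outer_only = conn.execute(
    """
    SELECT (SELECT COUNT(*) FROM myTable T1
            WHERE T1.UniqueField <= T2.UniqueField) AS RowNum,
           T2.OtherField
    FROM myTable T2
    WHERE T2.UniqueField > 10
    ORDER BY T2.UniqueField
    """
).fetchall()

# Same filter repeated in the inner SELECT: numbering is continuous from 1.
both = conn.execute(
    """
    SELECT (SELECT COUNT(*) FROM myTable T1
            WHERE T1.UniqueField <= T2.UniqueField
              AND T1.UniqueField > 10) AS RowNum,
           T2.OtherField
    FROM myTable T2
    WHERE T2.UniqueField > 10
    ORDER BY T2.UniqueField
    """
).fetchall()

print(outer_only)  # [(2, 'b'), (3, 'c')]
print(both)        # [(1, 'b'), (2, 'c')]
```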