SQL carry forward non-null value within groups - sql

The following thread (SQL QUERY replace NULL value in a row with a value from the previous known value) was very helpful for carrying forward non-null values, but I can't figure out how to add a grouping column.
For example, I would like to do the following:
Example Data
Here is how I would have liked to code it:
UPDATE t1
SET
@n = COALESCE(number, @n),
number = COALESCE(number, @n)
GROUP BY grp
I know this isn't possible, and have seen several solutions that rely on inner joins, but those examples focus on aggregation rather than carrying forward values. For example: SQL Server Update Group by.
My attempt to make this work is to do something like the following:
UPDATE t1
SET
@n = COALESCE(number, @n),
number = COALESCE(number, @n)
FROM t1
INNER JOIN (
-- Lost on what to put in the inner join...
SELECT grp, COUNT(*) FROM t1 GROUP BY grp
) t2
on t1.grp = t2.grp

I think you can do what you want with a correlated subquery:
UPDATE t1
SET number = (SELECT TOP (1) tt1.number
FROM t1 tt1
WHERE tt1.grp = t1.grp AND tt1.? <= t1.? AND tt1.number IS NOT NULL
ORDER BY tt1.? DESC
)
FROM t1
WHERE t1.number IS NULL;
The ? is for the column that specifies "forward" in your expression "carry forward".
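As a rough, runnable illustration of the correlated-subquery idea, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for SQL Server. The ordering column `seq` and the sample rows are assumptions made for the example (the question does not name the ordering column), and SQLite's `LIMIT 1` takes the place of `TOP (1)`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: "seq" plays the role of the "?" ordering column.
conn.executescript("""
    CREATE TABLE t1 (grp TEXT, seq INTEGER, number INTEGER);
    INSERT INTO t1 VALUES
        ('a', 1, 10), ('a', 2, NULL), ('a', 3, NULL), ('a', 4, 40),
        ('b', 1, NULL), ('b', 2, 7), ('b', 3, NULL);
""")

# For each NULL row, take the latest earlier non-NULL value in the same group.
conn.execute("""
    UPDATE t1
    SET number = (SELECT tt1.number
                  FROM t1 AS tt1
                  WHERE tt1.grp = t1.grp
                    AND tt1.seq < t1.seq
                    AND tt1.number IS NOT NULL
                  ORDER BY tt1.seq DESC
                  LIMIT 1)
    WHERE number IS NULL
""")

filled = conn.execute(
    "SELECT grp, seq, number FROM t1 ORDER BY grp, seq"
).fetchall()
```

A row with no earlier non-NULL value in its group (the first row of group 'b' here) stays NULL, because the subquery finds nothing to carry forward.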

Related

Select top 1 column value in the group by clause?

I am joining two tables and my query contains a GROUP BY clause. There is one column in the joined table that I want to show in the result but don't want to be part of the GROUP BY, because its values differ from row to row. I just want to show any one (top) value of that column.
I tried to use Distinct and Top 1 and others but nothing works for me.
SELECT t1.Code, t2.Details, t2.FineDate ,Sum(t1.Amount) from Emp_Actions t1
INNER JOIN Emp_Fines t2 ON t1.Code = t2.Code
where (t1.Code = @MyParameter or @MyParameter = '' )
group by t1.Code,t2.Details,t2.FineDate
Please note that I am using a stored procedure, and Code can be a specific value or all values. My actual query is too big, so I made a sample to illustrate my issue. I need the top 1 FineDate; I don't want it to be part of the GROUP BY, but I do want to show it.
One way to get only one value, is using MAX(col) or MIN(col) in your SELECT statement, where col is the column that you don't want to group on. One advantage is that you get somewhat consistent values (first or last in order).
SELECT t1.Code, t2.Details, t2.FineDate, MAX(col) my_col, Sum(t1.Amount) from Emp_Actions t1
Also, there are more advanced analytical window functions (first_value for example), if you want to go that way, to have more control over which value is chosen, also depending on other column values.
https://learn.microsoft.com/en-us/sql/t-sql/functions/analytic-functions-transact-sql?view=sql-server-2017
Use MIN() or MAX():
select t1.Code, t2.Details, t2.FineDate, Sum(t1.Amount),
max(<your column here>) as column_name
from Emp_Actions t1 join
Emp_Fines t2
on t1.Code = t2.Code
where (t1.Code = @MyParameter or @MyParameter = '' )
group by t1.Code, t2.Details, t2.FineDate
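To make the MIN()/MAX() suggestion concrete, the sketch below runs a version of the query with FineDate moved out of the GROUP BY and wrapped in MAX(). It uses Python's sqlite3 module rather than SQL Server, and the sample rows and the helper name `fines_summary` are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data; only the table and column names come from the question.
conn.executescript("""
    CREATE TABLE Emp_Actions (Code TEXT, Amount INTEGER);
    CREATE TABLE Emp_Fines (Code TEXT, Details TEXT, FineDate TEXT);
    INSERT INTO Emp_Actions VALUES ('E1', 100), ('E2', 30);
    INSERT INTO Emp_Fines VALUES
        ('E1', 'Late', '2019-01-05'), ('E1', 'Late', '2019-03-10'),
        ('E2', 'Absent', '2019-02-01');
""")

def fines_summary(code):
    """Return one row per (Code, Details); an empty code means 'all codes'."""
    return conn.execute("""
        SELECT t1.Code, t2.Details,
               MAX(t2.FineDate) AS LatestFineDate,  -- aggregated, so not grouped
               SUM(t1.Amount) AS Total
        FROM Emp_Actions t1
        INNER JOIN Emp_Fines t2 ON t1.Code = t2.Code
        WHERE (t1.Code = :code OR :code = '')
        GROUP BY t1.Code, t2.Details
        ORDER BY t1.Code
    """, {"code": code}).fetchall()

all_rows = fines_summary("")
one_row = fines_summary("E2")
```

Note that Amount is still counted once per matching fine row, exactly as in the original join, so E1's total here is 200 rather than 100.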
You can use a subquery to get the required result.
Here is some sample code:
Select
t1.Code, t2.Details, t2.FineDate,Sum(t1.Amount)
,(select top 1 a.ColA from Emp_Fines a where a.Code = t1.Code order by ColOfSort Asc/Desc) as RequiredColumn
from Emp_Actions t1
INNER JOIN Emp_Fines t2 ON t1.Code = t2.Code
where (t1.Code = @MyParameter or @MyParameter = '' )
group by t1.Code,t2.Details,t2.FineDate
Here RequiredColumn is the column that you want in the result but don't want to use in the GROUP BY.

Sum multiple columns using a subquery

I'm trying to play with Oracle's DB.
I'm trying to sum two columns from the same row and output a total on the fly.
However, I can't seem to get it to work. Here's the code I have so far.
SELECT a.name , SUM(b.sequence + b.length) as total
FROM (
SELECT a.name, a.sequence, b.length
FROM tbl1 a, tbl2 b
WHERE b.sequence = a.sequence
AND a.loc <> -1
AND a.id='10201'
ORDER BY a.location
)
The inner query works, but I can't seem to make the new query and the subquery work together.
Here's a sample table I'm using:
...[name][sequence][length]...
...['aa']['100000']['2000']...
...
...['za']['200000']['3001']...
And here's the output I'd like:
[name][ total ]
['aa']['102000']
...
['za']['203001']
Help much appreciated, thanks!
SUM() sums numbers across rows. Instead, replace it with sequence + length.
...or if there is the possibility of NULL values occurring in either the sequence or length columns, use: COALESCE(sequence, 0) + COALESCE(length, 0).
Or, if your intention was indeed to produce a total per name (i.e. aggregating the sum of all the sequences and lengths for each name), add a GROUP BY a.name after the end of the subquery.
BTW: you shouldn't be referencing the internal aliases used inside a subquery from outside of that subquery. Some DB servers allow it (and I don't have convenient access to an Oracle server right now, so I can't test it), but it's not really good practice.
I think what you are after is something like:
SELECT a.name,
SUM(B.sequence + B.length) AS total
FROM Tbl1 A
INNER JOIN Tbl2 B
ON B.sequence = A.sequence
WHERE A.loc <> -1
AND A.id = '10201'
GROUP BY a.name
ORDER BY MIN(A.location)
Your query with the subquery fails for several reasons:
You use the table alias a, but it is not defined.
You use the table alias b, but it is not defined.
You have a sum() in the select clause with unaggregated columns, but no group by.
In addition, you have an order by in the subquery which is allowed syntactically, but ignored.
Here is a better way to write the query without a subquery:
SELECT t1.name, (t1.sequence + t2.length) as total
FROM tbl1 t1 join
tbl2 t2
on t1.sequence = t2.sequence
where t1.loc <> -1 AND t1.id = '10201'
ORDER BY t1.location;
Note the use of proper join syntax, the use of aliases that make sense, and the simple calculation at this level.
Here is a version with a subquery:
select name, (sequence + length) as total
from (SELECT t1.name, t1.sequence, t2.length, t1.location
FROM tbl1 t1 join
tbl2 t2
on t1.sequence = t2.sequence
where t1.loc <> -1 AND t1.id = '10201'
) t
ORDER BY location;
Note that the order by is going at the outer level. And, I gave the subquery an alias. This is not strictly required, but typically a good idea.
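The difference between adding two columns within a row and aggregating with SUM() can be checked with a small sketch. It runs the no-subquery form against SQLite through Python's sqlite3 module (not Oracle), adds the COALESCE guard suggested earlier, and uses invented rows, including a hypothetical 'mm' row with a NULL length and an 'xx' row that the loc filter excludes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical rows modelled on the sample table in the question.
conn.executescript("""
    CREATE TABLE tbl1 (name TEXT, sequence INTEGER, loc INTEGER,
                       id TEXT, location INTEGER);
    CREATE TABLE tbl2 (sequence INTEGER, length INTEGER);
    INSERT INTO tbl1 VALUES
        ('aa', 100000,  1, '10201', 1),
        ('za', 200000,  1, '10201', 2),
        ('mm', 150000,  1, '10201', 3),
        ('xx', 300000, -1, '10201', 4);
    INSERT INTO tbl2 VALUES
        (100000, 2000), (200000, 3001), (150000, NULL), (300000, 5);
""")

# Plain addition works row by row; no SUM() and no GROUP BY are needed.
totals = conn.execute("""
    SELECT t1.name,
           COALESCE(t1.sequence, 0) + COALESCE(t2.length, 0) AS total
    FROM tbl1 t1
    JOIN tbl2 t2 ON t1.sequence = t2.sequence
    WHERE t1.loc <> -1 AND t1.id = '10201'
    ORDER BY t1.location
""").fetchall()
```

Without the COALESCE calls, the 'mm' row would produce a NULL total, since NULL plus anything is NULL.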

SQL - using a value in a nested select

Hope the title makes some kind of sense - I'd basically like to do a nested select, based on a value in the original select, like so:
SELECT MAX(iteration) AS maxiteration,
(SELECT column
FROM table
WHERE id = 223652
AND iteration = maxiteration)
FROM table
WHERE id = 223652;
I get an ORA-00904 invalid identifier error.
Would really appreciate any advice on how to return this value, thanks!
It looks like this should be rewritten with a where clause:
select iteration,
col
from tbl
where id = 223652
and iteration = (select max(iteration) from tbl where id = 223652);
You can circumvent the problem altogether by placing the subselect in an INNER JOIN of its own.
SELECT t.iteration
, t.column
FROM table t
INNER JOIN (
SELECT id, MAX(iteration) AS iteration
FROM table
WHERE id = 223652
GROUP BY id
) tm ON tm.id = t.id AND tm.iteration = t.iteration
Since you're using Oracle, I'd suggest using analytic functions for this:
SELECT * FROM (
SELECT col,
iteration,
row_number() over (partition by id order by iteration desc) rn
FROM tab
WHERE id = 223652
) WHERE rn = 1
Do it like this:
with maxiteration as
(
SELECT MAX(iteration) AS maxiteration
FROM table
WHERE id = 223652
)
select
column,
iteration
from
table
where
id = 223652
AND iteration = (SELECT maxiteration FROM maxiteration)
;
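Not 100% sure on Oracle syntax, but isn't it something like the following? (Oracle has no LIMIT clause; from 12c onward the row-limiting form is FETCH FIRST 1 ROW ONLY.)
select iteration, column from table where id = 223652 order by iteration desc fetch first 1 row only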
I would approach this problem in a slightly different way. You're basically looking for the row that has no other iterations greater than it. There are at least 3 ways I can think of to do this:
SELECT
T1.iteration AS maxiteration,
T1.column
FROM
Table T1
WHERE
T1.id = 223652 AND
NOT EXISTS
(
SELECT *
FROM Table T2
WHERE
T2.id = 223652 AND
T2.iteration > T1.iteration
)
Or...
SELECT
T1.iteration AS maxiteration,
T1.column
FROM
Table T1
LEFT OUTER JOIN Table T2 ON
T2.id = T1.id AND
T2.iteration > T1.iteration
WHERE
T1.id = 223652 AND
T2.id IS NULL
Or...
SELECT
T1.iteration AS maxiteration,
T1.column
FROM
Table T1
INNER JOIN (SELECT id, MAX(iteration) AS maxiteration FROM Table T2 GROUP BY id) SQ ON
SQ.id = T1.id AND
SQ.maxiteration = T1.iteration
WHERE
T1.id = 223652
EDIT: I didn't see the ORA error the first time reading the question and it wasn't tagged as Oracle specific. I think that there may be some differences in the syntax and use of aliases in Oracle, so you may need to tweak some of the above queries.
The Oracle error is telling you that it doesn't know what maxiteration is, because the column alias isn't available yet inside the subquery. You need to refer to it by the table alias and column name instead of the column alias I believe.
You can do something like:
select maxiteration,column from table a join (select id, max(iteration) as maxiteration from table where id=1 group by id) b using (id) where b.maxiteration=a.iteration;
This could of course return multiple rows for one maxiteration unless your table has a constraint against it.
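Two of the approaches from the answers, the scalar MAX subquery in the WHERE clause and the NOT EXISTS form, can be compared side by side in a short sketch. It uses SQLite via Python's sqlite3 module instead of Oracle, and the table name `tbl`, column name `col`, and sample rows are placeholders chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical rows; "tbl" and "col" stand in for the real table and column.
conn.executescript("""
    CREATE TABLE tbl (id INTEGER, iteration INTEGER, col TEXT);
    INSERT INTO tbl VALUES
        (223652, 1, 'first'), (223652, 2, 'second'),
        (223652, 3, 'third'), (1, 9, 'other');
""")
params = {"id": 223652}

# Approach 1: compare against a scalar subquery instead of a column alias.
via_max = conn.execute("""
    SELECT iteration, col
    FROM tbl
    WHERE id = :id
      AND iteration = (SELECT MAX(iteration) FROM tbl WHERE id = :id)
""", params).fetchall()

# Approach 2: keep the row for which no later iteration exists.
via_not_exists = conn.execute("""
    SELECT t1.iteration, t1.col
    FROM tbl t1
    WHERE t1.id = :id
      AND NOT EXISTS (SELECT 1 FROM tbl t2
                      WHERE t2.id = t1.id AND t2.iteration > t1.iteration)
""", params).fetchall()
```

Both forms avoid referencing the `maxiteration` column alias from inside a subquery, which is what triggered the invalid identifier error.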

JOIN without double values

I've got query
SELECT frst.Date,t1.Value
from
[ArchiveAnalog] frst
LEFT JOIN
(SELECT Date,Value FROM
[ArchiveAnalog] scnd
WHERE scnd.ID = 1) t1
ON t1.Date = frst.Date
order by frst.Date
but the JOIN returns the values a second time (they are not grouped with the main row).
If I add GROUP BY frst.Date, I get an error that I can't use t1.Value with it.
How can I make this JOIN without adding additional rows?
Martin, I want to show the full Date and the Value only if ID = 1; I then also want to add a t2 value column and so on, like here:
SELECT frst.Date,t1.Value,t2.Value
from
[ArchiveAnalog] frst
LEFT JOIN
(SELECT Date,Value FROM
[ArchiveAnalog] scnd
WHERE scnd.ID = 1) t1
ON t1.Date = frst.Date
LEFT JOIN
(SELECT Date,Value FROM
[ArchiveAnalog] scnd
WHERE scnd.ID = 2) t2
ON frst.Date = t2.Date
Here I get twice as many additional rows, with doubled values, so I need to group them all in some way.
You either need DISTINCT in the main query (if there's up to one row in t1 that you need), or GROUP BY. If you use GROUP BY, you need to aggregate t1.Value (e.g. sum, concatenate etc.).
Does this do what you need?
SELECT Date ,
MAX(CASE WHEN ID=1 THEN Value END) AS val1,
MAX(CASE WHEN ID=2 THEN Value END) AS val2
FROM [ArchiveAnalog]
WHERE ID IN (1,2) /*<-- Not sure if you need this line without seeing your data*/
GROUP BY Date
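The conditional-aggregation query above can be tried on a few rows to see that it yields one row per date. This sketch uses SQLite through Python's sqlite3 module rather than SQL Server, and the sample readings are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical readings: one row per (ID, Date) pair.
conn.executescript("""
    CREATE TABLE ArchiveAnalog (ID INTEGER, Date TEXT, Value REAL);
    INSERT INTO ArchiveAnalog VALUES
        (1, '2020-01-01', 1.5), (2, '2020-01-01', 2.5),
        (1, '2020-01-02', 3.0), (3, '2020-01-02', 9.0);
""")

# One output row per Date; each CASE picks out the value for a single ID.
pivoted = conn.execute("""
    SELECT Date,
           MAX(CASE WHEN ID = 1 THEN Value END) AS val1,
           MAX(CASE WHEN ID = 2 THEN Value END) AS val2
    FROM ArchiveAnalog
    WHERE ID IN (1, 2)
    GROUP BY Date
    ORDER BY Date
""").fetchall()
```

A date with no reading for a given ID simply gets NULL in that column, and no self joins are needed, so no rows are duplicated.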

How do I compare 2 rows from the same table (SQL Server)?

I need to create a background job that processes a table looking for rows matching on a particular id with different statuses. It will store the row data in a string to compare the data against a row with a matching id.
I know the syntax to get the row data, but I have never tried comparing 2 rows from the same table before. How is it done? Would I need to use variables to store the data from each? Or some other way?
(Using SQL Server 2008)
You can join a table to itself as many times as you require; this is called a self join.
An alias is assigned to each instance of the table (as in the example below) to differentiate one from another.
SELECT a.SelfJoinTableID
FROM dbo.SelfJoinTable a
INNER JOIN dbo.SelfJoinTable b
ON a.SelfJoinTableID = b.SelfJoinTableID
INNER JOIN dbo.SelfJoinTable c
ON a.SelfJoinTableID = c.SelfJoinTableID
WHERE a.Status = 'Status to filter a'
AND b.Status = 'Status to filter b'
AND c.Status = 'Status to filter c'
OK, after 2 years it's finally time to correct the syntax:
SELECT t1.value, t2.value
FROM MyTable t1
JOIN MyTable t2
ON t1.id = t2.id
WHERE t1.id = @id
AND t1.status = @status1
AND t2.status = @status2
Some people find the following alternative syntax easier to see what is going on:
select t1.value,t2.value
from MyTable t1
inner join MyTable t2 on
t1.id = t2.id
where t1.id = @id
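A self join puts the two rows side by side, so the comparison can happen in the query itself rather than in variables. The following is a minimal sketch using SQLite via Python's sqlite3 module (not SQL Server 2008); the status values, the sample rows, and the `same` output column are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical rows: each id appears once per status.
conn.executescript("""
    CREATE TABLE MyTable (id INTEGER, status TEXT, value TEXT);
    INSERT INTO MyTable VALUES
        (7, 'draft', 'alpha'), (7, 'final', 'alpha'),
        (8, 'draft', 'beta'),  (8, 'final', 'gamma');
""")

# t1 and t2 are two aliases of the same table, matched on id.
pairs = conn.execute("""
    SELECT t1.id, t1.value, t2.value, t1.value = t2.value AS same
    FROM MyTable t1
    JOIN MyTable t2 ON t1.id = t2.id
    WHERE t1.status = :status1
      AND t2.status = :status2
    ORDER BY t1.id
""", {"status1": "draft", "status2": "final"}).fetchall()
```

In SQLite the comparison yields 1 or 0; in SQL Server a CASE expression would be needed to return the same flag.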
SELECT COUNT(*) FROM (SELECT * FROM tbl WHERE id=1 UNION SELECT * FROM tbl WHERE id=2) a
If you get two rows, they are different; if you get one, they are the same (UNION removes duplicate rows).
SELECT * FROM A AS b INNER JOIN A AS c ON b.a = c.a
WHERE b.a = 'some column value'
I had a situation where I needed to compare each row of a table with the row next to it. ("Next" here is relative to my problem specification; in the example, the next row is determined by the ORDER BY clause inside the row_number() function.)
So I wrote this:
DECLARE @T TABLE (col1 nvarchar(50));
insert into @T VALUES ('A'),('B'),('C'),('D'),('E')
select I1.col1 Instance_One_Col, I2.col1 Instance_Two_Col from (
select col1,row_number() over (order by col1) as row_num
FROM @T
) AS I1
left join (
select col1,row_number() over (order by col1) as row_num
FROM @T
) AS I2 on I1.row_num = I2.row_num - 1
After that, I can compare each row to the next one as needed.
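The same row_number() pairing can be run as a quick check. This sketch uses SQLite through Python's sqlite3 module and assumes a SQLite build with window-function support (3.25 or later); a regular table `T` and a CTE named `numbered` stand in for the table variable and the two repeated subqueries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T (col1 TEXT);
    INSERT INTO T VALUES ('A'), ('B'), ('C'), ('D'), ('E');
""")

# Number the rows once, then join row n to row n + 1.
next_pairs = conn.execute("""
    WITH numbered AS (
        SELECT col1, ROW_NUMBER() OVER (ORDER BY col1) AS row_num
        FROM T
    )
    SELECT i1.col1, i2.col1
    FROM numbered i1
    LEFT JOIN numbered i2 ON i1.row_num = i2.row_num - 1
    ORDER BY i1.row_num
""").fetchall()
```

The last row has no successor, so the LEFT JOIN leaves its second column NULL.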