SQL fill gaps with hold - sql

I've encountered a problem I cannot solve with my knowledge and I haven't found any solutions I understood good enough to solve my problem.
So here is what I try to achieve.
I have a database with the following structure:
node_id, source_time, value
1 , 10:13:15 , 1
2 , 10:13:15 , 1
2 , 10:13:16 , 2
1 , 10:13:19 , 2
1 , 10:13:25 , 3
2 , 10:13:28 , 3
I want to have a sql query to get the following output
time , value1, value2
10:13:15, 1 , 1
10:13:16, 1 , 2
10:13:19, 2 , 2
10:13:25, 3 , 2
10:13:28, 3 , 3
You see, the times are all times that occur from both nodes.
But the values have to be filled in the gaps since node1 has no value for the time :16 and :28.
I got it to the point where I get the 2 columns from one table. That was not the hard part.
SELECT T1.[value], T2.[value]
FROM [db1].[t_value_history] T1, [db1].[t_value_history] T2
WHERE ( T1.node_id = 1 AND T2.node_id = 2)
But the result doesn't look like the way I want it to be.
I found something with COALESCE and another table which holds the previous value. But that looked quiet complicated for such a easy thing.
I guess there is an easy sql solution but I haven't had much time to get into the materia.
I would be happy to get any idea which function to use.
Thanks so far.
Edit: Changed the database, made a mistake on the last line.
Edit2: I am using SQL Server. Sorry for not clarifying this. Also the values are not neccessarily increasing. I just used increasing numbers in this example here.

This works in SQL Server. If you are certain that there is a value for both nodes for the minimum time then you could change the OUTER APPLY to a CROSS APPLY, which would perform better.
WITH times
AS ( SELECT DISTINCT
source_time
FROM dbo.t_value_history
)
SELECT t.source_time ,
n1.value ,
n2.value
FROM times AS t
OUTER APPLY ( SELECT TOP 1
h.value
FROM dbo.t_value_history AS h
WHERE h.node_id = 1
AND h.source_time <= t.source_time
ORDER BY h.source_time DESC
) AS n1
OUTER APPLY ( SELECT TOP 1
h.value
FROM dbo.t_value_history AS h
WHERE h.node_id = 2
AND h.source_time <= t.source_time
ORDER BY h.source_time DESC
) AS n2;

You could use conditional aggregation to get the right set of rows:
select vh.source_time,
max(case when vh.node_id = 1 then value end) as value_1,
max(case when vh.node_id = 2 then value end) as value_2
from db1.t_value_history vh
group by vh.source_time;
If you want to fill in the values, then the best solution is lag() with ignore nulls. Supported by ANSI, but not by SQL Server (which I'm guessing you are using). Your values appear to be increasing. If that is the case, you can use a cumulative max:
select vh.source_time,
max(max(case when vh.node_id = 1 then value end)) over (order by vh.source_time) as value_1,
max(max(case when vh.node_id = 2 then value end) over (order by vh.source_time) as value_2
from db1.t_value_history vh
group by vh.source_time;
In your data, value is increasing, so this works for the data in your example. If that is not the case, a more complex query is needed to fill in the gaps.

This will do it in SQL Server. It is not 'nice' though:
SELECT DISTINCT
T1.source_time,
CASE WHEN T1.node_id = 1 THEN T1.[value] ELSE ISNULL(T2.[value], T3.[value]) END,
CASE WHEN T1.node_id = 1 THEN ISNULL(T2.[value], T3.[Value]) ELSE T1.[value] END
FROM
[db1].[t_value_history] T1
LEFT OUTER JOIN [db1].[t_value_history] T2 ON T2.source_time = T1.source_time
AND T2.node_id <> T1.node_id -- This join looks for a value for the other node at the same time.
LEFT OUTER JOIN [db1].[t_value_history] T3 ON T3.source_time < T1.source_time
AND T3.node_id <> T1.node_id -- If the previous join is empty, this looks for values for the other node at previous times
LEFT OUTER JOIN [db1].[t_value_history] T4 ON T4.source_time > T3.source_time
AND T4.source_time < T1.source_time
AND T4.node_id <> T1.node_id -- This join makes sure there aren't any more recent values
WHERE
T4.node_id IS NULL

Related

I just started learning SQL and I couldn't do the query, can you help me?

There is a field in the sql query that I can't do. First of all, a new column must be added to the table below. The value of this column needs to be percent complete, so it's a percentage value. So for example, there are 7 values from Cupboard=1 shelves. Where IsCounted is here, 3 of them are counted. In other words, those with Cupboard = 1 should write the percentage value of 3/7 as the value in the new column to be created. If the IsCounted of the others is 0, it will write zero percent. How can I do this?
My Sql Code:
SELECT a.RegionName,
a.Cupboard,
a.Shelf,
(CASE WHEN ToplamSayım > 0 THEN 1 ELSE 0 END) AS IsCounted
FROM (SELECT p.RegionName,
r.Shelf,
r.Cupboard,
(SELECT COUNT(*)
FROM FAZIKI.dbo.PM_ProductCountingNew
WHERE RegionCupboardShelfTypeId = r.Id) AS ToplamSayım
FROM FAZIKI.dbo.DF_PMRegionType p
JOIN FAZIKI.dbo.DF_PMRegionCupboardShelfType r ON p.Id = r.RegionTypeId
WHERE p.WarehouseId = 45) a
ORDER BY a.RegionName;
The result is as in the picture below:
It looks like a windowed AVG should do the trick, although it's not entirely clear what the partitioning column should be.
The SELECT COUNT can be simplified to an EXISTS
SELECT a.RegionName,
a.Cupboard,
a.Shelf,
a.IsCounted,
AVG(a.IsCounted * 1.0) OVER (PARTITION BY a.RegionName, a.Cupboard) Percentage
FROM (
SELECT p.RegionName,
r.Shelf,
r.Cupboard,
CASE WHEN EXISTS (SELECT 1
FROM FAZIKI.dbo.PM_ProductCountingNew pcn
WHERE pcn.RegionCupboardShelfTypeId = r.Id
) THEN 1 ELSE 0 END AS IsCounted
FROM FAZIKI.dbo.DF_PMRegionType p
JOIN FAZIKI.dbo.DF_PMRegionCupboardShelfType r ON p.Id = r.RegionTypeId
WHERE p.WarehouseId = 45
) a
ORDER BY a.RegionName;

How to sort the query result by the number of specific column in ACCESS SQL?

For example:
This is the original result
Alpha Beta
A 1
B 2
B 3
C 4
After Order by the number of Alpha, this is the result I want
Alpha Beta
B 2
B 3
A 1
C 4
I tried to use GroupBy and OrderBy, but ACCESS always ask me to include all columns.
Why is 'B' placed before 'A' ? I don't understand this order..
Any way, doesn't seem like you need a group by, not from your data sample, but for your desired result you can use CASE EXPRESSION :
SELECT t.alpha,t.beta FROM YourTable t
ORDER BY CASE WHEN t.alpha = 'B' THEN 1 ELSE 0 END DESC,
t.aplha,
t.beta
EDIT: Use this query:
SELECT t.alpha,t.beta FROM YourTable t
INNER JOIN(SELECT s.alpha,count(*) as cnt
FROM YourTable s
GROUP BY s.alpha) t2
ON(t.aplha = t2.alpha)
ORDER BY t2.cnt,t.alpha,t.beta
The query counts number of rows for every distinct Alpha and sorts. General Sql, tweak for ACCESS if needed.
SELECT t1.alpha,t1.beta
FROM t t1
JOIN (
SELECT t2.alpha, count(t2.*) AS n FROM t t2 GROUP BY t2.alpha
) t3 ON t3.alpha = t1.alpha
ORDER BY t3.n, t1.alpha, t1.beta

SQL using CASE in SELECT with GROUP BY. Need CASE-value but get row-value

so basicially there is 1 question and 1 problem:
1. question - when I have like 100 columns in a table(and no key or uindex is set) and I want to join or subselect that table with itself, do I really have to write out every column name?
2. problem - the example below shows the 1. question and my actual SQL-statement problem
Example:
A.FIELD1,
(SELECT CASE WHEN B.FIELD2 = 1 THEN B.FIELD3 ELSE null FROM TABLE B WHERE A.* = B.*) AS CASEFIELD1
(SELECT CASE WHEN B.FIELD2 = 2 THEN B.FIELD4 ELSE null FROM TABLE B WHERE A.* = B.*) AS CASEFIELD2
FROM TABLE A
GROUP BY A.FIELD1
The story is: if I don't put the CASE into its own select statement then I have to put the actual rowname into the GROUP BY and the GROUP BY doesn't group the NULL-value from the CASE but the actual value from the row. And because of that I would have to either join or subselect with all columns, since there is no key and no uindex, or somehow find another solution.
DBServer is DB2.
So now to describing it just with words and no SQL:
I have "order items" which can be divided into "ZD" and "EK" (1 = ZD, 2 = EK) and can be grouped by "distributor". Even though "order items" can have one of two different "departements"(ZD, EK), the fields/rows for "ZD" and "EK" are always both filled. I need the grouping to consider the "departement" and only if the designated "departement" (ZD or EK) is changing, then I want a new group to be created.
SELECT
(CASE WHEN TABLE.DEPARTEMENT = 1 THEN TABLE.ZD ELSE null END) AS ZD,
(CASE WHEN TABLE.DEPARTEMENT = 2 THEN TABLE.EK ELSE null END) AS EK,
TABLE.DISTRIBUTOR,
sum(TABLE.SOMETHING) AS SOMETHING,
FROM TABLE
GROUP BY
ZD
EK
TABLE.DISTRIBUTOR
TABLE.DEPARTEMENT
This here worked in the SELECT and ZD, EK in the GROUP BY. Only problem was, even if EK was not the designated DEPARTEMENT, it still opened a new group if it changed, because he was using the real EK value and not the NULL from the CASE, as I was already explaining up top.
And here ladies and gentleman is the solution to the problem:
SELECT
(CASE WHEN TABLE.DEPARTEMENT = 1 THEN TABLE.ZD ELSE null END) AS ZD,
(CASE WHEN TABLE.DEPARTEMENT = 2 THEN TABLE.EK ELSE null END) AS EK,
TABLE.DISTRIBUTOR,
sum(TABLE.SOMETHING) AS SOMETHING,
FROM TABLE
GROUP BY
(CASE WHEN TABLE.DEPARTEMENT = 1 THEN TABLE.ZD ELSE null END),
(CASE WHEN TABLE.DEPARTEMENT = 2 THEN TABLE.EK ELSE null END),
TABLE.DISTRIBUTOR,
TABLE.DEPARTEMENT
#t-clausen.dk: Thank you!
#others: ...
Actually there is a wildcard equality test.
I am not sure why you would group by field1, that would seem impossible in your example. I tried to fit it into your question:
SELECT FIELD1,
CASE WHEN FIELD2 = 1 THEN FIELD3 END AS CASEFIELD1,
CASE WHEN FIELD2 = 2 THEN FIELD4 END AS CASEFIELD2
FROM
(
SELECT * FROM A
INTERSECT
SELECT * FROM B
) C
UNION -- results in a distinct
SELECT
A.FIELD1,
null,
null
FROM
(
SELECT * FROM A
EXCEPT
SELECT * FROM B
) C
This will fail for datatypes that are not comparable
No, there's no wildcard equality test. You'd have to list every field you want tested individually. If you don't want to test each individual field, you could use a hack such as concatenating all the fields, e.g.
WHERE (a.foo + a.bar + a.baz) = (b.foo + b.bar + b.az)
but either way, you're listing all of the fields.
I might tend to solve it something like this
WITH q as
(SELECT
Department
, (CASE WHEN DEPARTEMENT = 1 THEN ZD
WHEN DEPARTEMENT = 2 THEN EK
ELSE null
END) AS GRP
, DISTRIBUTOR
, SOMETHING
FROM mytable
)
SELECT
Department
, Grp
, Distributor
, sum(SOMETHING) AS SumTHING
FROM q
GROUP BY
DEPARTEMENT
, GRP
, DISTRIBUTOR
If you need to find all rows in TableA that match in TableB, how about INTERSECT or INTERSECT DISTINCT?
select * from A
INTERSECT DISTINCT
select * from B
However, if you only want rows from A where the entire row matches the values in a row from B, then why does your sample code take some values from A and others from B? If the row matches on all columns, then that would seem pointless. (Perhaps your question could be explained a bit more fully?)

How do I use the value from row above when a given column value is zero?

I have a table of items by date (each row is a new date). I am drawing out a value from another column D. I need it to replace 0s though. I need the following logic: when D=0 for that date, use the value in column D from the date prior.
Actually, truth be told, I need it to say, when D is 0, use the value from the latest date where D was not a 0, but the first will get me most of the way there.
Is there a way to build this logic? Maybe a CTE?
Thank you very much.
PS I'm using SSMS 2008.
EDIT: I wasn't very clear at first. The value I want to change is not the date. I want change the value in D with the latest non-zero value from D, based on date.
May be the following query might help you. It uses the OUTER APPLY to fetch the results. Screenshot #1 shows the sample data and query output against the sample data. This query can be written better but this is what I could come up with right now.
Hope that helps.
SELECT ITM.Id
, COALESCE(DAT.New_D, ITM.D) AS D
, ITM.DateValue
FROM dbo.Items ITM
OUTER APPLY (
SELECT
TOP 1 D AS New_D
FROM dbo.Items DAT
WHERE DAT.DateValue < ITM.DateValue
AND DAT.D <> 0
AND ITM.D = 0
ORDER BY DAT.DateValue DESC
) DAT
Screenshot #1:
UPDATE t
Set value = SELECT value
FROM table
WHERE date = (SELECT MAX(t1.date)
FROM table t1
WHERE t1.value != 0
AND t1.date < t.date)
FROM table t
WHERE t.value = 0
You could maybe something like this as part of an update script...
SET myTable.D = (
SELECT TOP 1 myTable2.D
FROM myTable2
WHERE myTable2.myDateField < myTable.myDateField
AND myTable2.D != 0
ORDER BY myTable2.myDateField DESC)
That's assuming that you want to actually update the data though rather than just replace the values for the purpose of a select query.
How about:
SELECT
i.ID,
i.DateValue,
D = CASE WHEN I.D <> 0 THEN I.D ELSE X.D END
FROM
Items I
OUTER APPLY (
SELECT TOP 1 S.D
FROM Items S
WHERE S.DATEVALUE < I.DATEVALUE AND S.D <> 0
ORDER BY S.DATEVALUE DESC
) X
SELECT t.id,
CASE WHEN t.D = 0 THEN t0.D
ELSE t.D END
FROM table AS t
LEFT JOIN table AS t0
ON t0.time =
(
SELECT MAX(time) FROM t0
WHERE t0.time < t.time
AND t0.D != 0
)
or if you want to avoid aggregates entirely,
SELECT t.id,
CASE WHEN t.D = 0 THEN t0.D
ELSE t.D END
FROM table AS t
LEFT JOIN table AS t0
ON t0.time < t.time
LEFT JOIN table AS tx
ON tx.time > t0.time
WHERE t0.D != 0
AND tx.D != 0
AND tx.id IS NULL -- i.e. there isn't any

SQL Summing Multiple Joins

I shortened the code quite a bit, but hopefully someone will get the idea of what i am tryign to do. Need to sum totals from two different selects, i tried putting each of them in Left Outer Joins(tried Inner Joins too). If i run wiht either Left Outer Join commented out, I get the correct data, but when i run them together, i get really screwed up counts. So, i know joins are probably not the correct approach to summing data from the same table, i can;t simple do it in a where clause there is other table involved int he code i commented out.
I guess i am trying to sum together 2 different queries.
SELECT eeoc.EEOCode AS 'Test1',
SUM(eeosum.Col_One) AS 'Col_One',
FROM EEO1Analysis eeo
LEFT OUTER JOIN (
SELECT eeor.AnalysisID, eeor.Test1,
SUM(CASE eeor.ZZZ WHEN 1 THEN (CASE eeor.AAAA WHEN 1 THEN 1 ELSE 0 END) ELSE 0 END) AS 'Col_One',
FROM EEO1Roster eeor
..........
WHERE eeor.AnalysisID = 7
GROUP BY eeor.AnalysisID, eeor.EEOCode
) AS eeosum2 ON eeosum2.AnalysisID = eeo.AnalysisID
LEFT OUTER JOIN (
SELECT eeor.AnalysisID, eeor.Test1,
SUM(CASE eeor.ZZZ WHEN 1 THEN (CASE eeor.AAAA WHEN 1 THEN 1 ELSE 0 END) ELSE 0 END) AS 'Col_One',
FROM EEO1Roster eeor
........
) AS eeosum ON eeosum.AnalysisID = eeo.AnalysisID
WHERE eeo.AnalysisID = 7
GROUP BY eeoc.Test1
You could UNION ALL the 2 queries and then do a SUM + GROUP BY i.e.
SELECT Col1, Col2, SUM(Col_One) FROM
(SELECT Col1, Col2, SUM(Col_One)
FROM Table1
WHERE <Conditionset1>
GROUP BY Col1, Col2
UNION ALL
SELECT Col1, Col2, SUM(Col_One)
FROM Table1
WHERE <Conditionset2>
GROUP BY Col1, Col2)
GROUP BY
Col1, Col2
Of course, if there is are row(s) returned by and they would be double counted.
What about
SELECT ... FROM EEO1Analysis eeo,
(SELECT ... LEFT OUTER JOIN ... GROUP BY ... ) AS data
...
?
And, if you can, I'd recommend preparing the data to separate tables, then operate on them with different analysis IDs. Could save some execution time at least.
Need to sum totals from two different selects
If you expect one row single-column result, this way is enough
SELECT
((SELECT SUM(...) FROM ... GROUP BY...) +
(SELECT SUM(...) FROM ... GROUP BY...)) as TheSumOfTwoSums