MSSQL server SQL query how to utilize "group by" - sql-server-2005

Table Data
My_table_data.
testdate plant value
2020-05-14 02:00:00.000 1 68
2020-05-14 01:00:00.000 3 115
2020-05-14 02:00:00.000 2 107
2020-05-14 03:00:00.000 2 102
2020-05-14 05:00:00.000 2 104
2020-05-14 06:00:00.000 2 111
2020-05-14 08:00:00.000 1 74
2020-05-14 08:00:00.000 2 114
2020-05-14 09:00:00.000 1 70
2020-05-14 09:00:00.000 2 114
2020-05-14 03:00:00.000 3 106
2020-05-14 03:00:00.000 3 102
2020-05-15 02:00:00.000 2 108
2020-05-14 05:00:00.000 1 74
2020-05-14 04:00:00.000 3 96
2020-05-14 04:00:00.000 3 97
2020-05-14 06:00:00.000 1 80
2020-05-14 03:00:00.000 1 77
2020-05-14 06:00:00.000 3 102
I am novice at complex SQL query phrase. How to achieve the result like below.
Conditions
1. Latest data on timestamp(Hour)
2. single row for each plant.
WITH data AS(
SELECT testdate, plant, value, (DATEPART(HOUR,testdate))AS dH FROM My_table_data
WHERE testdate >= (SELECT CONVERT(DATETIME, CONVERT(DATE, GETDATE()), 120))
)
SELECT data.testdate, data.plant, data.value from data
WHERE data.dH= (SELECT MAX(data.dH) FROM data)
GROUP BY data.plant
ORDER BY data.plant DESC;
It gives error!!
testdate Plant value
2020-05-14 09:00:00.000 1 70
2020-05-14 09:00:00.000 2 114
2020-05-14 06:00:00.000 3 102

I think this should be as simple as
with maxData
AS
(
SELECT MAX(testDate) as testDate, plantId
FROM My_table_data
GROUP BY plantId
)
select
t.testDate,
t.plantId,
t.value
from My_table_data t
INNER JOIN maxData m
on t.testDate = m.testDate
and t.plantId = m.plantId
order by plantId
WHat this does is uses a CTE to find the max date for every plantId and then joins that info back to the original table to find the value corresponding to that plantId and date.

Related

Update fields in table based on previous field records

I have a table of records for pay periods with fields WeekStart and WeekEnd populated for ever fiscal year. In my application a user should be able to update the first WeekEnd date for a given fiscal year and that should in turn update subsequent records WeekStart date by adding 1 day to the Previous WeekEnd date and for the same records's WeekEnd date, add 13 days to the new WeekStart date.
This is part of a Stored Procedure written in SQL Server 2016.
UPDATE [dbo].[staffing_BiweeklyPPCopy]
SET
[WeekStart] = DATEADD(DD, 1, LAG([WeekEnd], 1) OVER (ORDER BY [ID])),
[WeekEnd] = DATEADD(DD, 14, LAG([WeekEnd], 1) OVER (ORDER BY [ID]))
WHERE
[FiscalYear] = #fiscalyear
Original Table contents shown below...
ID WeekStart WeekEnd
163 2018-10-01 2018-10-13
164 2018-10-14 2018-10-27
165 2018-10-28 2018-11-10
166 2018-11-11 2018-11-24
167 2018-11-25 2018-12-08
168 2018-12-09 2018-12-22
169 2018-12-23 2019-01-05
170 2019-01-06 2019-01-19
171 2019-01-20 2019-02-02
172 2019-02-03 2019-02-16
173 2019-02-17 2019-03-02
174 2019-03-03 2019-03-16
175 2019-03-17 2019-03-30
176 2019-03-31 2019-04-13
177 2019-04-14 2019-04-27
178 2019-04-28 2019-05-11
179 2019-05-12 2019-05-25
180 2019-05-26 2019-06-08
181 2019-06-09 2019-06-22
182 2019-06-23 2019-07-06
183 2019-07-07 2019-07-20
184 2019-07-21 2019-08-03
185 2019-08-04 2019-08-17
186 2019-08-18 2019-08-31
187 2019-09-01 2019-09-14
188 2019-09-15 2019-09-28
189 2019-09-29 2019-09-30
For example if a user updates the weekend date for record ID 163 to '2018-10-14', the table will update as follows..
ID WeekStart WeekEnd
163 2018-10-01 2018-10-14
164 2018-10-15 2018-10-28
165 2018-10-29 2018-11-11
166 2018-11-12 2018-11-25
167 2018-11-26 2018-12-09
.
.
.
189 2019-09-30 2019-09-30
Thank you in advance.

LAG / OVER / PARTITION / ORDER BY using conditions - SQL Server 2017

I have a table that looks like this:
Date AccountID Amount
2018-01-01 123 12
2018-01-06 123 150
2018-02-14 123 11
2018-05-06 123 16
2018-05-16 123 200
2018-06-01 123 18
2018-06-15 123 17
2018-06-18 123 110
2018-06-30 123 23
2018-07-01 123 45
2018-07-12 123 116
2018-07-18 123 60
This table has multiple dates and IDs, along with multiple Amounts. For each individual row, I want grab the last Date where Amount was over a specific value for that specific AccountID. I have been trying to use the LAG( Date, 1 ) in combination with several variatons of CASE and OVER ( PARTITION BY AccountID ORDER BY Date ) statements but I've had no luck. Ultimately, this is what I would like my SELECT statement to return.
Date AccountID Amount LastOverHundred
2018-01-01 123 12 NULL
2018-01-06 123 150 2018-01-06
2018-02-14 123 11 2018-01-06
2018-05-06 123 16 2018-01-06
2018-05-16 123 200 2018-05-16
2018-06-01 123 18 2018-05-16
2018-06-15 123 17 2018-05-16
2018-06-18 123 110 2018-06-18
2018-06-30 123 23 2018-06-18
2018-07-01 123 45 2018-06-18
2018-07-12 123 116 2018-07-12
2018-07-18 123 60 2018-07-12
Any help with this would be greatly appreciated.
Use a cumulative conditional max():
select t.*,
max(case when amount > 100 then date end) over (partition by accountid order by date) as lastoverhundred
from t;

Wrong results with group by for distinct count

I have these two queries for calculating a distinct count from a table for a particular date range. In my first query I group by location, aRID ( which is a rule ) and date. In my second query I don't group by a date.
I am expecting the same distinct count in both the results but I get total count as 6147 in first result and 6359 in second result. What is wrong here? The difference is group by..
select
r.loc
,cast(r.date as DATE) as dateCol
,count(distinct r.dC) as dC_count
from table r
where r.date between '01-01-2018' and '06-02-2018'
and r.loc = 1
group by r.loc, r.aRId, cast(r.date as DATE)
select
r.loc
,count(distinct r.DC) as dC_count
from table r
and r.date between '01-01-2018' and '06-02-2018'
and r.loc = 1
group by r.loc, r.aRId
loc dateCol dC_count
1 2018-01-22 1
1 2018-03-09 2
1 2018-01-28 3
1 2018-01-05 1
1 2018-05-28 143
1 2018-02-17 1
1 2018-05-08 187
1 2018-05-31 146
1 2018-01-02 3
1 2018-02-14 1
1 2018-05-11 273
1 2018-01-14 1
1 2018-03-18 2
1 2018-02-03 1
1 2018-05-20 200
1 2018-05-14 230
1 2018-01-11 5
1 2018-01-31 1
1 2018-05-17 209
1 2018-01-20 2
1 2018-03-01 1
1 2018-01-03 3
1 2018-05-06 253
1 2018-05-26 187
1 2018-03-24 1
1 2018-02-09 1
1 2018-03-04 1
1 2018-05-03 269
1 2018-05-23 187
1 2018-05-29 133
1 2018-03-21 1
1 2018-03-27 1
1 2018-05-15 202
1 2018-03-07 1
1 2018-06-01 155
1 2018-02-21 1
1 2018-01-26 2
1 2018-02-15 2
1 2018-05-12 331
1 2018-03-10 1
1 2018-01-09 3
1 2018-02-18 1
1 2018-03-13 2
1 2018-05-09 184
1 2018-01-12 2
1 2018-03-16 1
1 2018-05-18 198
1 2018-02-07 1
1 2018-02-01 1
1 2018-01-15 3
1 2018-02-24 4
1 2018-03-19 1
1 2018-05-21 161
1 2018-02-10 1
1 2018-05-04 250
1 2018-05-30 148
1 2018-05-24 153
1 2018-01-24 1
1 2018-05-10 199
1 2018-03-08 1
1 2018-01-21 1
1 2018-05-27 151
1 2018-01-04 3
1 2018-05-07 236
1 2018-03-25 1
1 2018-03-11 2
1 2018-01-10 1
1 2018-01-30 1
1 2018-03-14 1
1 2018-02-19 1
1 2018-05-16 192
1 2018-01-13 5
1 2018-01-07 1
1 2018-03-17 3
1 2018-01-27 2
1 2018-02-22 1
1 2018-05-13 200
1 2018-02-08 2
1 2018-01-16 2
1 2018-03-03 1
1 2018-05-02 217
1 2018-05-22 163
1 2018-03-20 1
1 2018-02-05 2
1 2018-02-11 1
1 2018-01-19 2
1 2018-02-28 1
1 2018-05-05 332
1 2018-05-25 211
1 2018-03-23 1
1 2018-05-19 219
loc dC_count
1 6359
From "COUNT (Transact-SQL)"
COUNT(DISTINCT expression) evaluates expression for each row in a group, and returns the number of unique, nonnull values.
The distinct is relative to the group, not to the whole table (or selected subset). I think this might be your misconception here.
To better understand what this means, take the following simplified example:
CREATE TABLE group_test
(a varchar(1),
b varchar(1),
c varchar(1));
INSERT INTO group_test
(a,
b,
c)
VALUES ('a',
'r',
'x'),
('a',
's',
'x'),
('b',
'r',
'x'),
('b',
's',
'y');
If we GROUP BY a and select count(DISTINCT c)
SELECT a,
count(DISTINCT c) #
FROM group_test
GROUP BY a;
we get
a | #
----|----
a | 1
b | 2
As there is only c='x' for a=1, there is only a distinct count of 1 for this group but 2 for the other group as it has 'x'and 'y' in c. The sum of counts is 3 here.
Now if we GROUP BY a, b
SELECT a,
b,
count(DISTINCT c) #
FROM group_test
GROUP BY a,
b;
we get
a | b | #
----|----|----
a | r | 1
a | s | 1
b | r | 1
b | s | 1
We get 1 for every count here as each value of c is the only one in the group. And all of a sudden the sum of counts is 4.
And if we get the distinct count of c for the whole table
SELECT count(DISTINCT c) #
FROM group_test;
we get
#
----
2
which sums up to 2.
The sum of the counts is different in each case but right none the less.
The more groups there are, the higher the chance for a value to be unique within that group. So your results seem totally plausible.
db<>fiddle

MSSQL MAX returns all results?

I have tried the following query to return the highest P.Maxvalue for each ME.Name from the last day between 06:00 and 18:00:
SELECT MAX(P.MaxValue) AS Value,P.DateTime,ME.Name AS ID
FROM vManagedEntity AS ME INNER JOIN
Perf.vPerfHourly AS P ON ME.ManagedEntityRowId = P.ManagedEntityRowId INNER JOIN
vPerformanceRuleInstance AS PRI ON P.PerformanceRuleInstanceRowId = PRI.PerformanceRuleInstanceRowId INNER JOIN
vPerformanceRule AS PR ON PRI.RuleRowId = PR.RuleRowId
WHERE (ME.ManagedEntityTypeRowId = 2546) AND (pr.ObjectName = 'VMGuest-cpu') AND (pr.CounterName LIKE 'cpuUsageMHz') AND (CAST(p.DateTime as time) >= '06:00:00' AND CAST(p.DateTime as time) <='18:00:00') AND (p.DateTime > DATEADD(day, - 1, getutcdate()))
group by ME.Name,P.DateTime
ORDER by id
but it seems to return each MaxValue for each ID instead of the highest?
like:
Value DateTime ID
55 2018-02-19 12:00:00.000 bob:vm-100736
51 2018-02-19 13:00:00.000 bob:vm-100736
53 2018-02-19 14:00:00.000 bob:vm-100736
52 2018-02-19 15:00:00.000 bob:vm-100736
52 2018-02-19 16:00:00.000 bob:vm-100736
51 2018-02-19 17:00:00.000 bob:vm-100736
54 2018-02-19 18:00:00.000 bob:vm-100736
51 2018-02-20 06:00:00.000 bob:vm-100736
51 2018-02-20 07:00:00.000 bob:vm-100736
53 2018-02-20 08:00:00.000 bob:vm-100736
52 2018-02-20 09:00:00.000 bob:vm-100736
78 2018-02-19 12:00:00.000 bob:vm-101
82 2018-02-19 13:00:00.000 bob:vm-101
79 2018-02-19 14:00:00.000 bob:vm-101
78 2018-02-19 15:00:00.000 bob:vm-101
79 2018-02-19 16:00:00.000 bob:vm-101
77 2018-02-19 17:00:00.000 bob:vm-101
82 2018-02-19 18:00:00.000 bob:vm-101
82 2018-02-20 06:00:00.000 bob:vm-101
79 2018-02-20 07:00:00.000 bob:vm-101
81 2018-02-20 08:00:00.000 bob:vm-101
82 2018-02-20 09:00:00.000 bob:vm-101
155 2018-02-19 12:00:00.000 bob:vm-104432
there is one value per hour for each id hence twelve results for each id
does MAX not work in this way i want ?
Thanks
expected view like this :
Value DateTime ID
55 2018-02-19 12:00:00.000 bob:vm-100736
82 2018-02-19 13:00:00.000 bob:vm-101
etc
If you're using group by on datetime and id, you'll get all datetimes and all ids, it's that simple.
If you don't need exact time, you can group by date only:
SELECT MAX(P.MaxValue) AS Value, cast(P.DateTime as date) as dat, ME.Name AS ID
...
group by ME.Name, cast(P.DateTime as date)
Or if you do, you may use not exists clause instead of group by.

Getting a cumulative number of records within a day

I am trying to get a cumulative sum of records within a time period of a day. Below is a current sample of my data.
DT No_of_records
2017-05-01 00:00:00.000 241
2017-05-01 04:00:00.000 601
2017-05-01 08:00:00.000 207
2017-05-01 12:00:00.000 468
2017-05-01 16:00:00.000 110
2017-05-01 20:00:00.000 450
2017-05-02 00:00:00.000 151
2017-05-02 04:00:00.000 621
2017-05-02 08:00:00.000 179
2017-05-02 12:00:00.000 163
2017-05-02 16:00:00.000 579
2017-05-02 20:00:00.000 299
I am trying to sum up the number of records until the day changes in another column. My desired output is below.
DT No_of_records cumulative
2017-05-01 00:00:00.000 241 241
2017-05-01 04:00:00.000 601 842
2017-05-01 08:00:00.000 207 1049
2017-05-01 12:00:00.000 468 1517
2017-05-01 16:00:00.000 110 1627
2017-05-01 20:00:00.000 450 2077
2017-05-02 00:00:00.000 151 151
2017-05-02 04:00:00.000 621 772
2017-05-02 08:00:00.000 179 951
2017-05-02 12:00:00.000 163 1114
2017-05-02 16:00:00.000 579 1693
2017-05-02 20:00:00.000 299 1992
Do any of you have ideas on how to get the cumulative column?
If 2012+ you can use with window function sum() over
Select *
,cumulative = sum(No_of_records) over (Partition by cast(DT as date) Order by DT)
From YourTable
You can do this with a windowed SUM():
Select DT, No_of_records,
Sum(No_of_records) Over (Partition By Convert(Date, DT) Order By DT) As cumulative
From YourTable
For older version use CROSS APPLY or Correlated sub-query
SELECT DT,
No_of_records,
cs.cumulative
FROM YourTable a
CROSS apply(SELECT Sum(No_of_records)
FROM YourTable b
WHERE Cast(a.DT AS DATE) = Cast(b.DT AS DATE)
AND a.DT >= b.DT) cs (cumulative)
Rextester Demo