SQL Query - Hide lines - sql

In my environment we use Altiris to manage our assets. A daily policy sets the status to Retired for computers that stay offline for more than 45 days, and sets it back to Active if the computer appears online on the network again.
The problem is that sometimes (not for all devices), when the policy changes the status, it writes two rows to the database: one with the current status and another with the new status.
The same happens when the device goes back to Active. So when I try to count how many devices were changed to Retired or Active per month, the numbers don't make sense, because some devices have two rows with two different statuses for the same DateChanged.
For example:
ComputerName Date Changed Status
001PROJNEW-VM 13/01/2015 17:33 Active
002PROJNEW-VM 11/09/2014 11:58 Retired
002PROJNEW-VM 07/10/2014 21:10 Retired
002PROJNEW-VM 07/10/2014 21:10 Active
003PROJNEW-VM 11/09/2014 11:58 Retired
003PROJNEW-VM 13/11/2014 03:27 Retired
003PROJNEW-VM 13/11/2014 03:27 Active
004PROJNEW-VM 06/04/2015 20:00 Retired
005PROJNEW-VM 11/09/2014 11:58 Retired
005PROJNEW-VM 09/10/2014 21:09 Retired
005PROJNEW-VM 09/10/2014 21:09 Active
005PROJNEW-VM 06/04/2015 20:00 Retired
006PROJNEW-VM 26/12/2014 20:00 Retired
006PROJNEW-VM 31/12/2014 05:34 Retired
006PROJNEW-VM 31/12/2014 05:34 Active
006PROJNEW-VM 06/01/2015 20:00 Retired
007PROJNEW-VM 11/09/2014 11:58 Retired
007PROJNEW-VM 27/12/2014 05:38 Retired
007PROJNEW-VM 27/12/2014 05:38 Active
007PROJNEW-VM 12/04/2015 19:50 Retired
008PROJNEW-VM 11/09/2014 11:58 Retired
008PROJNEW-VM 29/10/2014 05:44 Retired
008PROJNEW-VM 29/10/2014 05:44 Active
008PROJNEW-VM 06/04/2015 20:00 Retired
009PROJNEW-VM 11/09/2014 11:58 Retired
009PROJNEW-VM 17/09/2014 20:33 Retired
009PROJNEW-VM 17/09/2014 20:33 Active
009PROJNEW-VM 19/02/2015 20:00 Retired
010PROJNEW-VM 11/09/2014 11:58 Retired
010PROJNEW-VM 29/10/2014 05:44 Retired
010PROJNEW-VM 29/10/2014 05:44 Active
010PROJNEW-VM 06/04/2015 20:00 Retired
011PROJNEW-VM 05/04/2015 20:00 Retired
013PROJNEW-VM 20/02/2015 20:00 Retired
014PROJNEW-VM 06/04/2015 20:00 Retired
What I need, and can't figure out how to do, is this: if a hostname has two rows with the same 'Date Changed' and two different statuses, the query result should contain only the last row for that hostname and 'Date Changed'.
The expected result, for example:
ComputerName Date Changed Status
001PROJNEW-VM 13/01/2015 17:33 Active
002PROJNEW-VM 11/09/2014 11:58 Retired
002PROJNEW-VM 07/10/2014 21:10 Active
003PROJNEW-VM 11/09/2014 11:58 Retired
003PROJNEW-VM 13/11/2014 03:27 Active
004PROJNEW-VM 06/04/2015 20:00 Retired
005PROJNEW-VM 11/09/2014 11:58 Retired
005PROJNEW-VM 09/10/2014 21:09 Active
005PROJNEW-VM 06/04/2015 20:00 Retired
006PROJNEW-VM 26/12/2014 20:00 Retired
006PROJNEW-VM 31/12/2014 05:34 Active
006PROJNEW-VM 06/01/2015 20:00 Retired
007PROJNEW-VM 11/09/2014 11:58 Retired
007PROJNEW-VM 27/12/2014 05:38 Active
007PROJNEW-VM 12/04/2015 19:50 Retired
008PROJNEW-VM 11/09/2014 11:58 Retired
008PROJNEW-VM 29/10/2014 05:44 Active
008PROJNEW-VM 06/04/2015 20:00 Retired
009PROJNEW-VM 11/09/2014 11:58 Retired
009PROJNEW-VM 17/09/2014 20:33 Active
009PROJNEW-VM 19/02/2015 20:00 Retired
010PROJNEW-VM 11/09/2014 11:58 Retired
010PROJNEW-VM 29/10/2014 05:44 Active
010PROJNEW-VM 06/04/2015 20:00 Retired
011PROJNEW-VM 05/04/2015 20:00 Retired
013PROJNEW-VM 20/02/2015 20:00 Retired
014PROJNEW-VM 06/04/2015 20:00 Retired

If the dates are the same, there is no first or last record. So, assuming you only have those 2 statuses and you want the status to always be Active when 2 rows exist for the same time, you can just do it with min (as strings, 'Active' sorts before 'Retired'):
select
    ComputerName,
    DateChanged,
    min(Status) as Status
from
    YourTable
group by
    ComputerName,
    DateChanged
If it's more complex, you can do similar things with row_number, ordering by the desired order of statuses.
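To see the min-per-group idea run end to end, here is a minimal sketch using Python's built-in sqlite3 module instead of the asker's database. The table name status_log is hypothetical, only a few rows from the question are loaded, and the dates are rewritten in ISO format (an assumption, so that text ordering matches chronological ordering).

```python
import sqlite3

# A few rows from the question; dates rewritten as ISO text (assumption).
rows = [
    ("002PROJNEW-VM", "2014-09-11 11:58", "Retired"),
    ("002PROJNEW-VM", "2014-10-07 21:10", "Retired"),
    ("002PROJNEW-VM", "2014-10-07 21:10", "Active"),
    ("004PROJNEW-VM", "2015-04-06 20:00", "Retired"),
]

conn = sqlite3.connect(":memory:")
# status_log is a hypothetical table name used only for this sketch.
conn.execute(
    "CREATE TABLE status_log (ComputerName TEXT, DateChanged TEXT, Status TEXT)"
)
conn.executemany("INSERT INTO status_log VALUES (?, ?, ?)", rows)

# One row per (ComputerName, DateChanged); MIN picks 'Active' over 'Retired'
# because 'Active' sorts first as a string.
deduped = conn.execute(
    """
    SELECT ComputerName, DateChanged, MIN(Status) AS Status
    FROM status_log
    GROUP BY ComputerName, DateChanged
    ORDER BY ComputerName, DateChanged
    """
).fetchall()

for row in deduped:
    print(row)
```

This only works because there are exactly two statuses and the wanted one happens to sort first; with more statuses an explicit ordering would be needed.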

Assuming you have, say, an id column that specifies the ordering, then you can use row_number():
select t.*
from (select t.*,
             row_number() over (partition by computername, datechanged
                                order by id desc) as seqnum
      from table t
     ) t
where seqnum = 1;
In your particular example, all the duplicates seem to be active. If that is the case, then:
select computername, datechanged,
       (case when min(status) = max(status) then min(status)
             when sum(case when status = 'Active' then 1 else 0 end) > 0
             then 'Active'
             else '***Unknown***'
        end) as status
from table t
group by computername, datechanged;
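The row_number() approach can be tried out with a small sketch like the one below. It runs on SQLite through Python's sqlite3 module (window functions need SQLite 3.25 or newer), not on the asker's database. The table name status_log and the id values are assumptions made up for the illustration; the question does not say whether such an ordering column exists.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# status_log and its id column are hypothetical; id stands in for
# whatever column records the insertion order.
conn.execute(
    "CREATE TABLE status_log ("
    "id INTEGER PRIMARY KEY, computername TEXT, datechanged TEXT, status TEXT)"
)
conn.executemany(
    "INSERT INTO status_log VALUES (?, ?, ?, ?)",
    [
        (1, "005PROJNEW-VM", "2014-09-11 11:58", "Retired"),
        (2, "005PROJNEW-VM", "2014-10-09 21:09", "Retired"),
        (3, "005PROJNEW-VM", "2014-10-09 21:09", "Active"),
        (4, "005PROJNEW-VM", "2015-04-06 20:00", "Retired"),
    ],
)

# Number the rows within each (computername, datechanged) group, newest id
# first, and keep only the first row of every group.
latest = conn.execute(
    """
    SELECT computername, datechanged, status
    FROM (
        SELECT t.*,
               ROW_NUMBER() OVER (PARTITION BY computername, datechanged
                                  ORDER BY id DESC) AS seqnum
        FROM status_log AS t
    ) AS ranked
    WHERE seqnum = 1
    ORDER BY datechanged
    """
).fetchall()

for row in latest:
    print(row)
```

Unlike the min() trick, this does not depend on how the status strings sort, only on the ordering column being trustworthy.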

Related

Getting wrong(?) average when calculating values in a time range

I am working with AWS Redshift / PostgreSQL. I have two tables that can be joined on interval_date (DATE data type) and interval_time_utc (VARCHAR data type) and/or on the status and price_source columns: Source A is equivalent to the Y status and Source B is equivalent to the N status.
I am trying to get the average price and the sum of mw_power for a given hour for each status / price_source. An hour covers the timestamps from XX:05 to XX:00, so for 15:00 the values should come from the 14:05 to 15:00 timestamps. Even for an hour in which every status has the same value, I still need the average price for both price_source values; the sum of mw_power for the missing status would be 0. I pass the date and time intervals in from my application code.
I am seeing a different average price for the 15:00 hour than I expect, so either I am bad at math or there is a bug in my query that I can't find. The 14:00 and 16:00 hour results come back as expected.
power_table
interval_date  interval_time_utc  mw_power  status
2022-05-09     13:00              92.25     N
2022-05-09     13:05              90.75     N
2022-05-09     13:10              91.25     N
2022-05-09     13:15              92.00     N
2022-05-09     13:20              92.00     N
2022-05-09     13:25              90.00     N
2022-05-09     13:30              93.00     N
2022-05-09     13:35              91.75     N
2022-05-09     13:40              90.25     N
2022-05-09     13:45              93.00     N
2022-05-09     13:50              91.00     N
2022-05-09     13:55              94.00     N
2022-05-09     14:00              91.00     N
2022-05-09     14:05              91.00     N
2022-05-09     14:10              94.00     N
2022-05-09     14:15              92.00     N
2022-05-09     14:20              91.00     N
2022-05-09     14:25              94.00     Y
2022-05-09     14:30              92.00     Y
2022-05-09     14:35              91.75     Y
2022-05-09     14:40              92.25     Y
2022-05-09     14:45              91.00     Y
2022-05-09     14:50              92.00     Y
2022-05-09     14:55              93.00     Y
2022-05-09     15:00              90.00     Y
price_table
interval_date  interval_time_utc  price   price_source
2022-05-09     13:00              54.20   Source A
2022-05-09     13:05              54.20   Source A
2022-05-09     13:10              54.20   Source A
2022-05-09     13:00              54.20   Source B
2022-05-09     13:05              54.20   Source B
2022-05-09     13:10              54.20   Source B
2022-05-09     13:15              34.11   Source A
2022-05-09     13:20              34.11   Source A
2022-05-09     13:25              34.11   Source A
2022-05-09     13:15              39.61   Source B
2022-05-09     13:20              39.61   Source B
2022-05-09     13:25              39.61   Source B
2022-05-09     13:30              2.81    Source A
2022-05-09     13:35              2.81    Source A
2022-05-09     13:40              2.81    Source A
2022-05-09     13:30              17.13   Source B
2022-05-09     13:35              17.13   Source B
2022-05-09     13:40              17.13   Source B
2022-05-09     13:45              1.58    Source A
2022-05-09     13:50              1.58    Source A
2022-05-09     13:55              1.58    Source A
2022-05-09     13:45              15.98   Source B
2022-05-09     13:50              15.98   Source B
2022-05-09     13:55              15.98   Source B
2022-05-09     14:00              4.60    Source A
2022-05-09     14:05              4.60    Source A
2022-05-09     14:10              4.60    Source A
2022-05-09     14:00              18.09   Source B
2022-05-09     14:05              18.09   Source B
2022-05-09     14:10              18.09   Source B
2022-05-09     14:15              2.46    Source A
2022-05-09     14:20              2.46    Source A
2022-05-09     14:25              2.46    Source A
2022-05-09     14:15              16.66   Source B
2022-05-09     14:20              16.66   Source B
2022-05-09     14:25              16.66   Source B
2022-05-09     14:30              3.36    Source A
2022-05-09     14:35              3.36    Source A
2022-05-09     14:40              3.36    Source A
2022-05-09     14:30              21.52   Source B
2022-05-09     14:35              21.52   Source B
2022-05-09     14:40              21.52   Source B
2022-05-09     14:45              4.55    Source A
2022-05-09     14:50              4.55    Source A
2022-05-09     14:55              4.55    Source A
2022-05-09     14:45              16.30   Source B
2022-05-09     14:50              16.30   Source B
2022-05-09     14:55              16.30   Source B
2022-05-09     15:00              -21.87  Source A
2022-05-09     15:00              4.96    Source B
-- query that I am using to get hourly values
SELECT pricet.price_source,
       COALESCE(powert.volume, 0),
       pricet.price,
       powert.status
FROM (SELECT status,
             SUM(mw_power) volume
      FROM power_table
      WHERE (interval_date || ' ' || interval_time_utc)::timestamp BETWEEN '2022-05-09 14:05:00.0' AND '2022-05-09 15:00:00.0'
      GROUP BY status) powert
RIGHT JOIN (SELECT price_source,
                   AVG(price) price
            FROM price_table
            WHERE (interval_date || ' ' || interval_time_utc)::timestamp BETWEEN '2022-05-09 14:05:00.0' AND '2022-05-09 15:00:00.0'
            GROUP BY price_source) pricet
    ON pricet.price_source = CASE WHEN powert.status = 'Y' THEN 'Source A'
                                  ELSE 'Source B'
                             END;
I am looking to get an expected output of the following for the 15:00 hour:
price_source  volume  price  status
Source A      736.00  0.54   Y
Source B      368.00  17.38  N
Result that I'm getting from query:
price_source  volume  price  status
Source A      736.00  1.54   Y
Source B      368.00  17.05  N
db fiddle link of tables and query and results: https://dbfiddle.uk/?rdbms=postgres_14&fiddle=474b009c5cf5366961751a61c0f96c6c
I think you made a calculation error. I changed your fiddle to add a rolling sum and rolling average for the second part of your query. To get an average of 0.54 (Source A), your sum would need to be about 12 less than the total of the values for this hour. 12 is also the count of values for the hour, so perhaps 12 was subtracted from the sum by mistake before dividing by 12?
For the other source (B), the total would need to be off by about 4 (an addition of 4 to the sum). Not sure how this could have happened, but ...
Anyway, the fiddle is at https://dbfiddle.uk/?rdbms=postgres_14&fiddle=e65c38677f3ab92607bbff778bc0f69e
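The arithmetic behind this answer can be re-checked outside the database. The sketch below is only a hand check in Python, not part of the original fiddle: it copies the twelve price_table values per source that fall in the 14:05 to 15:00 window and compares their averages with the figures the asker expected. The helper name summarize is made up for the example.

```python
from decimal import Decimal

# Prices between 14:05 and 15:00 inclusive, copied from price_table.
SOURCE_A = ["4.60", "4.60", "2.46", "2.46", "2.46", "3.36",
            "3.36", "3.36", "4.55", "4.55", "4.55", "-21.87"]
SOURCE_B = ["18.09", "18.09", "16.66", "16.66", "16.66", "21.52",
            "21.52", "21.52", "16.30", "16.30", "16.30", "4.96"]


def summarize(prices):
    """Return (sum, count, average rounded to 2 decimals) for the prices."""
    values = [Decimal(p) for p in prices]
    total = sum(values, Decimal("0"))
    average = (total / len(values)).quantize(Decimal("0.01"))
    return total, len(values), average


total_a, count_a, avg_a = summarize(SOURCE_A)
total_b, count_b, avg_b = summarize(SOURCE_B)

# How far the sum would have to move to produce the expected averages.
gap_a = Decimal("0.54") * count_a - total_a
gap_b = Decimal("17.38") * count_b - total_b

print("Source A:", total_a, count_a, avg_a, gap_a)
print("Source B:", total_b, count_b, avg_b, gap_b)
```

The averages agree with what the query returns (1.54 and 17.05), which supports the reading that the query is fine and the expected values were miscalculated.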

Change dataframe by group

I have a pandas dataframe that looks something like this
activity time date
0 Phone 04:00 20210810
1 Phone 08:30 20210810
2 Coffee 10:30 20210810
3 Lunch 04:00 20210810
4 Phone 10:30 20210810
5 Phone 04:00 20210810
6 Lunch 08:30 20210810
7 Lunch 10:30 20210810
0 Phone 08:45 20210811
1 Pooping 08:50 20210811
2 Coffee 10:30 20210811
3 Lunch 04:00 20210811
4 Phone 10:30 20210811
5 Meeting 04:00 20210811
6 Lunch 08:30 20210811
7 Lunch 10:30 20210811
and I need to change it to:
date      activity  time
20210810  Phone     04:00
                    08:30
                    10:30
                    04:00
          Coffee    10:30
          Lunch     04:00
                    08:30
                    10:30
20210811  Phone     08:45
                    10:30
          Pooping   08:50
          Coffee    10:30
          Meeting   04:00
          Lunch     04:00
                    08:30
                    10:30
Basically, sort by date and activity, then show '' (an empty string) for repeated values of the same type.
Set as index and sort:
df.set_index(['date', 'activity']).sort_index()
Or, if the values need to be sorted as well:
df.set_index(['date', 'activity']).sort_values(by='time').sort_index()
By default, in jupyter/ipython the index will display only the first value of the successive rows. If you need another format, please update your question.

Calculate difference between time over midnight and condition

When I calculate the difference between two times, I also have to account for what happens when the interval crosses midnight.
select *
,datediff(second,
cast([schedule_deptime]as time),
cast([prev_announce_time]as time))
as kpi1_delta
id         date_event               schedule_deptime  prev_announce_time       kpi1_delta
79643204   2021-02-11 19:55:52.000  19:15             2021-02-11 19:39:01.000  1441
79569510   2021-02-11 16:51:05.000  16:50             2021-02-11 16:48:17.000  -103
106160161  2021-01-21 20:43:44.000  20:28             2021-01-21 01:03:41.000  -69859
106216877  2021-01-21 23:50:10.000  23:45             2021-01-21 00:06:52.000  -85088
79703534   2021-02-11 23:58:01.000  00:04             2021-02-11 00:03:01.000  -59
For the third and fourth rows, I should get:
id         date_event               schedule_deptime  prev_announce_time       kpi1_delta
106160161  2021-01-21 20:43:44.000  20:28             2021-01-21 01:03:41.000  16541
106216877  2021-01-21 23:50:10.000  23:45             2021-01-21 00:06:52.000  1312
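One way to read the expected values is that a difference of more than half a day in either direction should be treated as having wrapped around midnight. The Python sketch below illustrates that rule on the sample rows from the question; the 12-hour threshold and the function names are assumptions, not something the question states, and the same rule would still have to be translated into the SQL query.

```python
HALF_DAY = 12 * 60 * 60  # assumed threshold for "crossed midnight"
FULL_DAY = 24 * 60 * 60


def seconds_of_day(clock):
    """Convert 'HH:MM' or 'HH:MM:SS' into seconds since midnight."""
    parts = [int(p) for p in clock.split(":")]
    while len(parts) < 3:
        parts.append(0)
    hours, minutes, seconds = parts
    return hours * 3600 + minutes * 60 + seconds


def kpi1_delta(schedule_deptime, prev_announce_time):
    """Announce time minus scheduled time in seconds, wrapped at midnight."""
    delta = seconds_of_day(prev_announce_time) - seconds_of_day(schedule_deptime)
    if delta < -HALF_DAY:
        delta += FULL_DAY   # announcement falls just after midnight
    elif delta > HALF_DAY:
        delta -= FULL_DAY   # schedule falls just after midnight
    return delta


# Time-of-day parts of the five sample rows from the question.
samples = [
    ("19:15", "19:39:01"),
    ("16:50", "16:48:17"),
    ("20:28", "01:03:41"),
    ("23:45", "00:06:52"),
    ("00:04", "00:03:01"),
]
deltas = [kpi1_delta(sched, announce) for sched, announce in samples]
print(deltas)
```

With this rule the first, second and fifth rows keep their original values and the third and fourth rows match the values the question asks for.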

Can I Duplicate an Entire Row based on a Beginning Date and Ending Date in SQL

In SQL Server Management Studio, is there a way to take some Excel data that has a starting date and an ending date, and duplicate each record once per month from the start date until the end date? I have a spreadsheet with the data (see the examples below), and I need to convert it to the second layout, with one record for each month in which the well exists. For example, if the first well starts in Jan 2018 and ends in Apr 2018, I need the row duplicated for Jan 2018, Feb 2018 and Mar 2018. It can even list Apr 2018, but it should not duplicate the record past its end date. Hopefully this makes sense. I can do it manually, but I am trying to write a stored procedure that creates a new table from the original table, like the example below.
Starting Data
Well Operator Date_Start DateEnd Months
--------------------------------------------------------------------------------------
JIM TOM LONTOS 30 23S 28E RB MATADOR RESOURCES 1/1/2018 4/2/2018 3
ODIE 1606 BCE-MACH III LLC 1/1/2018 4/16/2018 3
SIEGRIST 1307 MARATHON OIL 1/1/2018 5/23/2018 4
SILVERTIP 76-7 UNIT A OCCIDENTAL PETROLEUM 1/1/2018 12/6/2018 11
Ending Data
Date Lease Operator Start_date End_Date
--------------------------------------------------------------------------------------
Jan-18 JIM TOM LONTOS 30 23S 28E RB MATADOR RESOURCES 1/1/2018 4/2/2018
Feb-18 JIM TOM LONTOS 30 23S 28E RB MATADOR RESOURCES 1/1/2018 4/2/2018
Mar-18 JIM TOM LONTOS 30 23S 28E RB MATADOR RESOURCES 1/1/2018 4/2/2018
Jan-18 ODIE 1606 BCE-MACH III LLC 1/1/2018 4/16/2018
Feb-18 ODIE 1606 BCE-MACH III LLC 1/1/2018 4/16/2018
Mar-18 ODIE 1606 BCE-MACH III LLC 1/1/2018 4/16/2018
Jan-18 SIEGRIST 1307 MARATHON OIL 1/1/2018 5/23/2018
Feb-18 SIEGRIST 1307 MARATHON OIL 1/1/2018 5/23/2018
Mar-18 SIEGRIST 1307 MARATHON OIL 1/1/2018 5/23/2018
Apr-18 SIEGRIST 1307 MARATHON OIL 1/1/2018 5/23/2018
Jan-18 SILVERTIP 76-7 UNIT A OCCIDENTAL PETROLEUM 1/1/2018 12/6/2018
Feb-18 SILVERTIP 76-7 UNIT A OCCIDENTAL PETROLEUM 1/1/2018 12/6/2018
Mar-18 SILVERTIP 76-7 UNIT A OCCIDENTAL PETROLEUM 1/1/2018 12/6/2018
Apr-18 SILVERTIP 76-7 UNIT A OCCIDENTAL PETROLEUM 1/1/2018 12/6/2018
May-18 SILVERTIP 76-7 UNIT A OCCIDENTAL PETROLEUM 1/1/2018 12/6/2018
Jun-18 SILVERTIP 76-7 UNIT A OCCIDENTAL PETROLEUM 1/1/2018 12/6/2018
Jul-18 SILVERTIP 76-7 UNIT A OCCIDENTAL PETROLEUM 1/1/2018 12/6/2018
Aug-18 SILVERTIP 76-7 UNIT A OCCIDENTAL PETROLEUM 1/1/2018 12/6/2018
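One option uses a recursive query. This version lists every month up to and including the month of date_end:
with cte as (
    -- anchor: first day of the month in which each well starts
    select datefromparts(year(date_start), month(date_start), 1) dt, well, operator, date_start, date_end
    from mytable t
    union all
    -- recursive step: add a month while the next month still starts on or before date_end
    select dateadd(month, 1, dt), well, operator, date_start, date_end
    from cte c
    where dateadd(month, 1, c.dt) <= c.date_end
)
select * from cte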
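The month-by-month expansion that the recursive query aims for can also be sketched in plain Python, which makes the stopping condition easier to inspect. This is only an illustration: the function name month_starts is made up, and including the month that contains the end date is an assumption (the question says listing that month is acceptable).

```python
from datetime import date


def month_starts(date_start, date_end):
    """First day of every month from date_start's month through date_end's month."""
    current = date(date_start.year, date_start.month, 1)
    months = []
    while current <= date_end:
        months.append(current)
        # Step to the first day of the following month.
        if current.month == 12:
            current = date(current.year + 1, 1, 1)
        else:
            current = date(current.year, current.month + 1, 1)
    return months


# Two wells from the question: (name, start, end).
wells = [
    ("JIM TOM LONTOS 30 23S 28E RB", date(2018, 1, 1), date(2018, 4, 2)),
    ("SILVERTIP 76-7 UNIT A", date(2018, 1, 1), date(2018, 12, 6)),
]

# One output row per well per month, like the "Ending Data" layout.
expanded = [
    (month, name, start, end)
    for name, start, end in wells
    for month in month_starts(start, end)
]

for row in expanded:
    print(row[0].strftime("%b-%y"), row[1], row[2], row[3])
```

To stop one month earlier, at the last month that starts before the end date's month, the loop condition would have to compare against the first day of the end month instead.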

Create an event log from an excel file by turning columns into repeated rows

I have an Excel sheet like the following:
ID Arrival Passed Berthing Date UnBerthing Date Departure Passed
1 13/05/2017 15:30 13/05/2017 16:00 31/05/2017 20:44 31/05/2017
2 15/05/2017 16:56 15/05/2017 17:15 16/05/2017 00:00 16/05/2017
3 20/05/2017 09:54 20/05/2017 10:26 20/05/2017 18:07 20/05/2017
4 24/05/2017 16:09 24/05/2017 16:35 25/05/2017 01:03 25/05/2017
5 29/05/2017 10:30 29/05/2017 10:45 29/05/2017 17:33 29/05/2017
I need this in the following format:
ID Event Time
1 Arrival 13/05/2017 15:30
1 Berth 13/05/2017 16:00
1 UnBerth 31/05/2017 20:44
1 Departure 31/05/2017 20:58
2 Arrival 15/05/2017 16:56
2 Berth 15/05/2017 17:15
2 UnBerth 16/05/2017 00:00
2 Departure 16/05/2017 00:04
etc
I've searched the web, this site and YouTube without finding the right answer. I've tried the TRANSPOSE function and a pivot table, but I couldn't make it work.
Any help would be appreciated.
Thank you.
Assuming that your dataset is in range A2:E6.
For getting ID:
=INDEX($A$2:$E$6,CEILING(ROWS($A$1:A1)/4,1),1)
For getting Event:
=CHOOSE(MOD(ROWS($A$1:A1)-1,4)+1,"Arrival","Berth","UnBerth","Departure")
For getting Time:
=INDEX($A$2:$E$6,CEILING(ROWS($A$1:A1)/4,1),MOD(ROWS($A$1:A1)-1,4)+2)
Then copy the formulas down until you get an error.
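The index arithmetic in these formulas may be easier to follow outside Excel. The Python sketch below mirrors it: for the k-th output row, CEILING(k/4) picks the source row and MOD(k-1,4) picks which of the four time columns to read. The function names are made up for the illustration, and the sample values are taken from the expected output in the question.

```python
import math

EVENTS = ["Arrival", "Berth", "UnBerth", "Departure"]


def unpivot_position(k):
    """For 1-based output row k, return (source_row, source_col, event)."""
    source_row = math.ceil(k / 4)   # CEILING(ROWS($A$1:A1)/4,1)
    offset = (k - 1) % 4            # MOD(ROWS($A$1:A1)-1,4)
    return source_row, offset + 2, EVENTS[offset]


def to_event_log(rows):
    """Turn rows of [ID, arrival, berth, unberth, departure] into (ID, event, time)."""
    log = []
    for k in range(1, 4 * len(rows) + 1):
        source_row, source_col, event = unpivot_position(k)
        row = rows[source_row - 1]  # Excel positions are 1-based
        log.append((row[0], event, row[source_col - 1]))
    return log


sheet = [
    [1, "13/05/2017 15:30", "13/05/2017 16:00", "31/05/2017 20:44", "31/05/2017 20:58"],
    [2, "15/05/2017 16:56", "15/05/2017 17:15", "16/05/2017 00:00", "16/05/2017 00:04"],
]
event_log = to_event_log(sheet)

for entry in event_log:
    print(entry)
```

In the spreadsheet the loop bound is implicit: copying the formulas past the last source row is what produces the error mentioned above.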