LAG and GROUP BY not compatible in MariaDB SQL

I have this SQL query in MariaDB
SELECT substr(sqlth_te.tagpath, 32), stringvalue,
((t_stamp - (CASE WHEN sqlth_te.tagpath = LAG(sqlth_te.tagpath,1) OVER (ORDER BY sqlth_te.tagpath, t_stamp) Then LAG(t_stamp,1) OVER (ORDER BY sqlth_te.tagpath, t_stamp)
ELSE NULL
END))/1000) as seconds
FROM sqlt_data_1_2022_04
LEFT JOIN sqlth_te
ON sqlt_data_1_2022_04.tagid = sqlth_te.id
WHERE stringvalue IS NOT NULL
ORDER BY sqlth_te.tagpath, t_stamp
Which returns 3 columns; a column with machine names, running status, and duration since status change.
I'd like to sum the duration by machine name and running status, but when I try to add a sum and group by I get an error.
SELECT substr(sqlth_te.tagpath, 32), stringvalue,
SUM((t_stamp - (CASE WHEN sqlth_te.tagpath = LAG(sqlth_te.tagpath,1) OVER (ORDER BY sqlth_te.tagpath, t_stamp) Then LAG(t_stamp,1) OVER (ORDER BY sqlth_te.tagpath, t_stamp)
ELSE NULL
END))/1000) as seconds
FROM sqlt_data_1_2022_04
LEFT JOIN sqlth_te
ON sqlt_data_1_2022_04.tagid = sqlth_te.id
WHERE stringvalue IS NOT NULL
ORDER BY sqlth_te.tagpath, t_stamp
GROUP BY substr(sqlth_te.tagpath, 32), stringvalue
Error:
java.sql.SQLSyntaxErrorException: (conn=8) You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near 'GROUP BY substr(sqlth_te.tagpath, 32), stringvalue' at line 10
Any ideas of what I'm doing wrong or if it's possible to group a column generated with the lag function?
Thanks

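First thing: the GROUP BY must come before the ORDER BY.
A window function such as LAG also cannot be used inside an aggregate like SUM, so you will most likely need to compute it in a derived table and aggregate in the outer query, like this: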
SELECT tagpath, stringvalue, SUM(seconds) as seconds
FROM (
SELECT substr(sqlth_te.tagpath, 32) as tagpath, stringvalue,
((t_stamp - (CASE WHEN sqlth_te.tagpath = LAG(sqlth_te.tagpath,1) OVER (ORDER BY sqlth_te.tagpath, t_stamp) Then LAG(t_stamp,1) OVER (ORDER BY sqlth_te.tagpath, t_stamp)
ELSE NULL
END))/1000) as seconds
FROM sqlt_data_1_2022_04
LEFT JOIN sqlth_te
ON sqlt_data_1_2022_04.tagid = sqlth_te.id
WHERE stringvalue IS NOT NULL
) AS t
GROUP BY tagpath, stringvalue
ORDER BY tagpath, stringvalue
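To check the nesting idea without a MariaDB server, here is a minimal sketch using Python's built-in sqlite3 module (assuming a bundled SQLite of 3.25 or newer, which has window functions). The `events` table and its rows are made up for illustration, and `PARTITION BY tagpath` is swapped in for the CASE comparison against the previous row's tagpath; it should give the same NULL on the first row of each machine.

```python
import sqlite3

# Hypothetical stand-in for the joined tag history: (tagpath, stringvalue, t_stamp in ms).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (tagpath TEXT, stringvalue TEXT, t_stamp INTEGER)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        ("m1", "RUN", 0),
        ("m1", "STOP", 5000),
        ("m1", "RUN", 8000),
        ("m1", "STOP", 20000),
        ("m2", "RUN", 1000),
        ("m2", "STOP", 4000),
    ],
)

# Inner query: seconds since the previous row of the same machine (NULL for the first row).
# Outer query: the aggregate runs on the already-computed column, so SUM never wraps LAG.
query = """
SELECT tagpath, stringvalue, SUM(seconds) AS seconds
FROM (
    SELECT tagpath, stringvalue,
           (t_stamp - LAG(t_stamp) OVER (PARTITION BY tagpath ORDER BY t_stamp)) / 1000.0 AS seconds
    FROM events
) AS d
GROUP BY tagpath, stringvalue
ORDER BY tagpath, stringvalue
"""
totals = conn.execute(query).fetchall()
for row in totals:
    print(row)
```

A group whose only row is the first one for its machine sums to NULL, because that row has no previous timestamp.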

Related

SQL calculation with previous row + current row

I want to make a calculation based on the Excel file. I managed to obtain the first 2 records with LAG (as you can see in the 2nd screenshot). I'm out of ideas on how to proceed from here and need help. I just need the Calculation column to use its own previous value, calculated automatically over all the dates. I also tried applying LAG to the calculation manually, and the result was one more row of data instead of NULL. This is a headache.
LAG(Data ingested, 1) OVER ( ORDER BY DATE ASC ) AS LAG
You seem to want cumulative sums:
select t.*,
(sum(reconciliation + aves - microa) over (order by date) -
first_value(aves - microa) over (order by date)
) as calculation
from CalcTable t;
Here is a SQL Fiddle.
EDIT:
Based on your comment, you just need to define a group:
select t.*,
(sum(reconciliation + aves - microa) over (partition by grp order by date) -
first_value(aves - microa) over (partition by grp order by date)
) as calculation
from (select t.*,
count(nullif(reconciliation, 0)) over (order by date) as grp
from CalcTable t
) t
order by date;
IMO this could be solved using a "gaps and islands" approach. When Reconciliation > 0, create a gap. SUM(gap) OVER converts the gaps into island groupings. In the outer query, the 'sum_over' column (which corresponds to 'Calculation') is a cumulative sum partitioned by the island groupings.
with
gap_cte as (
select *, case when [Reconciliation]>0 then 1 else 0 end gap
from CalcTable),
grp_cte as (
select *, sum(gap) over (order by [Date]) grp
from gap_cte)
select *, sum([Reconciliation]+
(case when gap=1 then 0 else Aves end)-
(case when gap=1 then 0 else Microa end))
over (partition by grp order by [Date]) sum_over
from grp_cte;
[EDIT]
The CASE statement could be CROSS APPLY'ed instead
with
grp_cte as (
select c.*, v.gap, sum(v.gap) over (order by [Date]) grp
from #CalcTable c
cross apply (values (case when [Reconciliation]>0 then 1 else 0 end)) v(gap))
select *, sum([Reconciliation]+
(case when gap=1 then 0 else Aves end)-
(case when gap=1 then 0 else Microa end))
over (partition by grp order by [Date]) sum_over
from grp_cte;
Here is a fiddle
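For anyone without access to the fiddles, the grouping idea can be tried locally. This is only a sketch with Python's sqlite3 module (SQLite 3.25+ assumed); the table contents are invented, and the date column is named `day` here purely for convenience. It follows the first answer's query: a running count of non-zero reconciliations defines the group, and the cumulative sum restarts inside each group.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: a non-zero reconciliation starts a new island.
conn.execute("CREATE TABLE CalcTable (day TEXT, reconciliation INTEGER, aves INTEGER, microa INTEGER)")
conn.executemany(
    "INSERT INTO CalcTable VALUES (?, ?, ?, ?)",
    [
        ("2021-01-01", 100, 10, 4),
        ("2021-01-02", 0, 5, 2),
        ("2021-01-03", 0, 7, 1),
        ("2021-01-04", 50, 9, 2),
        ("2021-01-05", 0, 4, 1),
    ],
)

query = """
SELECT day,
       SUM(reconciliation + aves - microa) OVER (PARTITION BY grp ORDER BY day)
       - FIRST_VALUE(aves - microa) OVER (PARTITION BY grp ORDER BY day) AS calculation
FROM (
    -- grp increases by one on every row with a non-zero reconciliation
    SELECT c.*,
           COUNT(NULLIF(reconciliation, 0)) OVER (ORDER BY day) AS grp
    FROM CalcTable AS c
) AS g
ORDER BY day
"""
calc = conn.execute(query).fetchall()
for row in calc:
    print(row)
```

With this sample data the calculation restarts at the reconciliation value on 2021-01-04 and accumulates aves - microa from there.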

Select a line equal to 'X' without TOP 'N' plus the previous line 'Y' in SQL Server?

I need to return in a query only the last lines with 'ProductStatus' equal 'Stop' and the previous line.
I have the table:
And need to get this result:
How do I do this in SQL Server?
One method uses window functions to calculate the last stop and then get the row before that:
select t.*
from (select t.*,
lead(seqnum_ps) over (partition by producttype order by datevalue) as next_seqnum_ps,
lead(product_status) over (partition by producttype order by datevalue) as next_product_status
from (select t.*,
row_number() over (partition by producttype, product_status order by datevalue desc) as seqnum_ps
from t
) t
) t
where (seqnum_ps = 1 and product_status = 'Stop') or
(next_seqnum_ps = 1 and next_product_status = 'Stop');
An alternative method gets the maximum stop time and uses that:
select t.*
from (select t.*,
max(case when product_status = 'Stop' then datevalue end) over (partition by producttype) as max_stop_dv,
lead(datevalue) over (partition by producttype order by datevalue) as next_dv
from t
) t
where datevalue = max_stop_dv or
next_dv = max_stop_dv;
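The second method is easy to try on toy data. Below is a rough sketch using Python's sqlite3 module (SQLite 3.25+ assumed) rather than SQL Server; the rows in `t` are hypothetical, and timestamps are stored as ISO text so MAX and ORDER BY compare them correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (producttype TEXT, product_status TEXT, datevalue TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("A", "Run", "2020-01-01 01:00"),
        ("A", "Stop", "2020-01-01 02:00"),
        ("A", "Run", "2020-01-01 03:00"),
        ("A", "Stop", "2020-01-01 04:00"),
        ("A", "Run", "2020-01-01 05:00"),
        ("B", "Run", "2020-01-01 01:00"),
        ("B", "Stop", "2020-01-01 02:00"),
    ],
)

query = """
SELECT producttype, product_status, datevalue
FROM (
    SELECT t.*,
           -- time of the last 'Stop' per product type
           MAX(CASE WHEN product_status = 'Stop' THEN datevalue END)
               OVER (PARTITION BY producttype) AS max_stop_dv,
           -- time of the following row, used to spot the row just before the last 'Stop'
           LEAD(datevalue) OVER (PARTITION BY producttype ORDER BY datevalue) AS next_dv
    FROM t
) AS x
WHERE datevalue = max_stop_dv OR next_dv = max_stop_dv
ORDER BY producttype, datevalue
"""
last_stop_rows = conn.execute(query).fetchall()
for row in last_stop_rows:
    print(row)
```

Each product type returns two rows: the last 'Stop' and the row immediately before it.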

SQL Query to get percentages of two selects?

I currently use two separate queries to retrieve lists of total runs and lists of errors, then I use Excel to divide these numbers to get percentages.
The problem is that I use a subselect to get the errors, because I group the first select and therefore cannot use the conditions in the first.
So my query to get all runs is:
Select
Count(*) as All, year([US-Date]) as year, month([US-Date]) as month, day([US-Date]) as day
FROM
(Select
ROW_NUMBER() OVER (PARTITION BY Int_No ORDER BY Time desc) AS RowNumber, [US-Date]
FROM
dbo.Mydatabase
Where
[US-Date] between '2017-10-01' and '2018-03-01') AS a
WHERE
a.RowNumber = 1
GROUP BY
year([US-Date]), month([US-Date]), day([US-Date])
ORDER BY
year([US-Date]), month([US-Date]), day([US-Date])
which gives me a list of all test runs for each day.
Then I use this query to get the errors:
Select
Count(*) as fejlende, year([US-Date]) as år,
month([US-Date]) as måned, day([US-Date]) as dag
From
(Select
ROW_NUMBER() OVER (PARTITION BY Int_No ORDER BY Time desc) AS RowNumber, [US-Date]
From
dbo.Mydatabase
Where
[US-Date] between '2017-10-01' and '2018-03-01'
and ErrorCode in
(Select
ErrorCode from dbo.Mydatabase
Where
(ErrorCode like '2374' or ErrorCode like '2373' or ErrorCode like '2061'))) AS a
WHERE
a.RowNumber = 1
GROUP BY
year([US-Date]), month([US-Date]), day([US-Date])
ORDER BY
year([US-Date]), month([US-Date]), day([US-Date])
So my question is: can I make one query that finds both lists and divides them, so I don't have to put them into Excel and so on :-)?
You can use a CASE expression for this (I simplified the error-code check; note that the subquery also has to return ErrorCode so the outer query can test it):
Select COUNT(*) as [All]
, COUNT(CASE WHEN ErrorCode IN ('2374', '2373', '2061') THEN 1 END) AS fejlende
, YEAR([US-Date]) as year
, MONTH([US-Date]) as month
, DAY([US-Date]) as day
from (
Select ROW_NUMBER() OVER (PARTITION BY Int_No ORDER BY Time desc) AS RowNumber, [US-Date], ErrorCode
From dbo.Mydatabase
Where [US-Date] between '2017-10-01' and '2018-03-01') AS a
where a.RowNumber = 1
GROUP BY year([US-Date]), month([US-Date]), day([US-Date])
ORDER BY year([US-Date]), month([US-Date]), day([US-Date])
Something like this??
SELECT
Count(*) as [Total],
SUM(CASE WHEN (ErrorCode like '2374' or ErrorCode like '2373' or ErrorCode like '2061') THEN 1 ELSE 0 END) AS Errors,
year([US-Date]) as [Year],
month([US-Date]) as [Month],
day([US-Date]) as [Day]
FROM (
-- window functions are not allowed in WHERE, so number the rows in a derived table first
SELECT [US-Date], ErrorCode,
ROW_NUMBER() OVER (PARTITION BY Int_No ORDER BY Time desc) AS RowNumber
FROM dbo.Mydatabase
WHERE [US-Date] between '2017-10-01' and '2018-03-01') AS a
WHERE a.RowNumber = 1
GROUP BY year([US-Date]), month([US-Date]), day([US-Date])
ORDER BY year([US-Date]), month([US-Date]), day([US-Date])
Not really sure what your ROW_NUMBER is used for, but hopefully you get the idea and can adapt it to your needs now that you know the SUM(CASE WHEN) method?
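Since the question asks for the percentage itself, here is a small sketch of the conditional-aggregation idea with the division included. It uses Python's sqlite3 module (SQLite 3.25+ assumed) instead of SQL Server, and the `runs` table, its column names and its rows are all hypothetical. Like the answers above, it counts errors among the latest run per Int_No, which is an assumption about what is wanted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE runs (int_no INTEGER, run_time TEXT, us_date TEXT, error_code TEXT)")
conn.executemany(
    "INSERT INTO runs VALUES (?, ?, ?, ?)",
    [
        (1, "2017-10-01 08:00", "2017-10-01", "0"),     # superseded by the 09:00 run
        (1, "2017-10-01 09:00", "2017-10-01", "2374"),
        (2, "2017-10-01 10:00", "2017-10-01", "0"),
        (3, "2017-10-02 08:00", "2017-10-02", "2061"),
        (4, "2017-10-02 09:00", "2017-10-02", "0"),
        (5, "2017-10-02 09:30", "2017-10-02", "0"),
        (6, "2017-10-02 10:00", "2017-10-02", "0"),
    ],
)

query = """
SELECT us_date,
       COUNT(*) AS total,
       SUM(CASE WHEN error_code IN ('2374', '2373', '2061') THEN 1 ELSE 0 END) AS errors,
       -- multiply by 100.0 so the division is done in floating point
       100.0 * SUM(CASE WHEN error_code IN ('2374', '2373', '2061') THEN 1 ELSE 0 END)
             / COUNT(*) AS error_pct
FROM (
    SELECT us_date, error_code,
           ROW_NUMBER() OVER (PARTITION BY int_no ORDER BY run_time DESC) AS rn
    FROM runs
) AS a
WHERE rn = 1
GROUP BY us_date
ORDER BY us_date
"""
daily = conn.execute(query).fetchall()
for row in daily:
    print(row)
```

The same `100.0 * SUM(CASE ...) / COUNT(*)` expression should carry over to T-SQL unchanged.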

Selecting 1 row per day closest to 4am? [duplicate]

This question already has answers here:
Get top 1 row of each group
(19 answers)
Closed 6 years ago.
We're currently working on a query for a report that returns a series of data. The customer has specified that they want to receive 5 rows total, with the data from the previous 5 days (as defined by a start date and an end date variable). For each day, they want the data from the row that's closest to 4am.
I managed to get it to work for a single day, but I certainly don't want to union 5 separate select statements simply to fetch these values. Is there any way to accomplish this via CTEs?
select top 1
'W' as [RecordType]
, [WellIdentifier] as [ProductionPtID]
, t.Name as [Device Name]
, t.RecordDate --convert(varchar, t.RecordDate, 112) as [RecordDate]
, TubingPressure as [Tubing Pressure]
, CasingPressure as [Casing Pressure]
from #tTempData t
Where cast (t.recorddate as time) = '04:00:00.000'
or datediff (hh,'04:00:00.000',cast (t.recorddate as time)) < -1.2
order by Name, RecordDate desc
Assuming that #tTempData only contains the previous 5 days' records:
SELECT *
FROM
(
SELECT *, rn = row_number() over
(
partition by convert(date, recorddate)
order by ABS ( datediff(minute, convert(time, recorddate) , '04:00' ) )
)
FROM #tTempData
) AS d
WHERE rn = 1
You can use row_number() like this to get, for each of the last 5 days, the row closest to 04:00:
SELECT TOP 5 * FROM (
select t.* ,
ROW_NUMBER() OVER(PARTITION BY cast(t.recorddate as date)
ORDER BY abs(datediff (minute,'04:00:00.000',cast (t.recorddate as time)))) rnk
from #tTempData t) AS d
WHERE rnk = 1
ORDER BY recorddate DESC
You can use row_number() for this purpose:
select t.*
from (select t.*,
row_number() over (partition by cast(t.recorddate as date)
order by abs(datediff(ms, '04:00:00.000',
cast(t.recorddate as time)
))
) seqnum
from #tTempData t
) t
where seqnum = 1;
You can add an appropriate where clause in the subquery to get the dates that you are interested in.
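Here is a quick way to sanity-check the "one row per day, closest to 04:00" pattern on sample data. It is a sketch in Python's sqlite3 module (SQLite 3.25+ assumed), so DATEDIFF is replaced by a difference of `strftime('%s', ...)` epoch seconds; the `readings` table and its values are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (recorddate TEXT, tubing_pressure INTEGER)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [
        ("2020-01-01 03:10:00", 100),
        ("2020-01-01 04:05:00", 101),
        ("2020-01-01 06:00:00", 102),
        ("2020-01-02 03:58:00", 110),
        ("2020-01-02 04:30:00", 111),
        ("2020-01-03 12:00:00", 120),
    ],
)

query = """
SELECT recorddate, tubing_pressure
FROM (
    SELECT r.*,
           ROW_NUMBER() OVER (
               PARTITION BY date(recorddate)
               -- absolute distance in seconds from 04:00 on the same calendar day
               ORDER BY ABS(
                   CAST(strftime('%s', recorddate) AS INTEGER)
                   - CAST(strftime('%s', date(recorddate) || ' 04:00:00') AS INTEGER)
               )
           ) AS seqnum
    FROM readings AS r
) AS x
WHERE seqnum = 1
ORDER BY recorddate
"""
closest = conn.execute(query).fetchall()
for row in closest:
    print(row)
```

Ties (two rows equally far from 04:00) are broken arbitrarily here; add a second ORDER BY key inside the window if that matters.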
Try something like this:
select
'W' as [RecordType]
, [WellIdentifier] as [ProductionPtID]
, t.Name as [Device Name]
, t.RecordDate --convert(varchar, t.RecordDate, 112) as [RecordDate]
, TubingPressure as [Tubing Pressure]
, CasingPressure as [Casing Pressure]
from #tTempData t
Where not exists
(select 1 from #tTempData t1 where
ABS(datediff (hh,'04:00:00.000',cast (t1.recorddate as time))) <
ABS(datediff (hh,'04:00:00.000',cast (t.recorddate as time)))
and cast(t1.RecordDate as date) = cast(t.RecordDate as date)
)
and t.RecordDate between YOURDATERANGE
order by Name, RecordDate desc;

Calculating a running count & running total across customers with SQL

I have the following table (SQL Server 2012):
DID - cust id
GID - order id
AMT - order amt
Gf_Date - order date
SC - order reversal amount
I'm trying to calculate a running count of orders and a running total of sales by customer so that I can assign a flag to the point in time where a customer achieved cumulative sales of $1,000. As a first step, I've run this query:
Select
[DID]
, [AMT]
, [Gf_Date]
, COUNT([GID]) OVER (PARTITION BY [DID] ORDER BY [Gf_Date]) [RunningGift_Count]
, SUM([AMT]) OVER (PARTITION BY [DID] ORDER BY [Gf_Date]) [CumlativeTotal]
FROM [dbo].[MCT]
WHERE [SC] is null
ORDER BY [DID]
But I get the error message:
Msg 102, Level 15, State 1, Line 3 Incorrect syntax near 'order'
I posted this earlier with the wrong error message pasted in. Regrets and apologies. What you see above is the result I'm getting. Someone commented that this syntax is incorrect. Now that all is in order, can someone tell me what I'm doing wrong?
You should use ROW_NUMBER instead of COUNT:
DECLARE @Threshold NUMERIC(19,2)=1000; -- Use the same data type as `[AMT]`'s data type
Select
[DID]
, [AMT]
, [Gf_Date]
--, COUNT([GID]) OVER (PARTITION BY [DID] ORDER BY [Gf_Date]) [RunningGift_Count]
, ROW_NUMBER() OVER (PARTITION BY [DID] ORDER BY [Gf_Date]) [RunningGift_Count]
, SUM([AMT]) OVER (PARTITION BY [DID] ORDER BY [Gf_Date]) [CumlativeTotal]
, CASE
WHEN SUM([AMT]) OVER (PARTITION BY [DID] ORDER BY [Gf_Date]) >= @Threshold THEN 1
ELSE 0
END IsThresholdPassed
FROM [dbo].[MCT]
WHERE [SC] is null
ORDER BY [DID]
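To see what the running count, running total and threshold flag look like on data, here is a minimal sketch using Python's sqlite3 module (SQLite 3.25+ assumed) in place of SQL Server. The rows in `MCT` are made up, and the threshold is passed as a bound parameter instead of a T-SQL variable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MCT (DID INTEGER, GID INTEGER, AMT INTEGER, Gf_Date TEXT, SC INTEGER)")
conn.executemany(
    "INSERT INTO MCT VALUES (?, ?, ?, ?, ?)",
    [
        (1, 10, 400, "2020-01-01", None),
        (1, 11, 500, "2020-02-01", None),
        (1, 12, 300, "2020-03-01", None),
        (1, 13, 250, "2020-03-15", 250),   # reversed order, excluded by SC IS NULL
        (2, 20, 1200, "2020-01-05", None),
        (2, 21, 50, "2020-02-05", None),
    ],
)

threshold = 1000  # hypothetical cumulative-sales threshold
query = """
SELECT DID, AMT, Gf_Date,
       ROW_NUMBER() OVER (PARTITION BY DID ORDER BY Gf_Date) AS RunningGift_Count,
       SUM(AMT) OVER (PARTITION BY DID ORDER BY Gf_Date) AS CumulativeTotal,
       CASE WHEN SUM(AMT) OVER (PARTITION BY DID ORDER BY Gf_Date) >= ?
            THEN 1 ELSE 0 END AS IsThresholdPassed
FROM MCT
WHERE SC IS NULL
ORDER BY DID, Gf_Date
"""
running = conn.execute(query, (threshold,)).fetchall()
for row in running:
    print(row)
```

The first row per customer where IsThresholdPassed switches to 1 marks the point in time the customer reached the threshold.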