Cumulative values minus overs - sql

I have a query that produces the following table with a cumulative column (cumulate):
+--+---+--------+------+
|id|qty|cumulate|value |
+--+---+--------+------+
|1 |5 |5 |419.6 |
+--+---+--------+------+
|2 |2 |7 |167.84|
+--+---+--------+------+
|3 |1 |8 |83.92 |
+--+---+--------+------+
|4 |2 |10 |167.84|
+--+---+--------+------+
|5 |1 |11 |83.92 |
+--+---+--------+------+
|6 |5 |16 |419.6 |
+--+---+--------+------+
However, I need an extension to the query that selects only the rows that cumulate up to 9. In this case the first four rows accumulate to 10 and the first three to 8.
I need the sum of the value for a qty of exactly 9, no more and no less.
The rows are in date order (date not shown) and therefore cannot be reordered.
How would one achieve this?
EDIT
Here is my query (note that the results table above is not the output this query would produce):
select branch,
case
when DATEDIFF(MONTH, dateInv, getDate()) between 0 and 6 then '0-6'
when DATEDIFF(MONTH, dateInv, getDate()) between 7 and 12 then '7-12'
when DATEDIFF(MONTH, dateInv, getDate()) between 13 and 18 then '13-18'
when DATEDIFF(MONTH, dateInv, getDate()) between 19 and 24 then '19-24'
when DATEDIFF(MONTH, dateInv, getDate()) between 25 and 36 then '25-36'
when DATEDIFF(MONTH, dateInv, getDate()) > 36 then '>36'
end [period]
,sum(qty*cost) [costs]
from (
select branch,qty, dateInv, max(cost)cost, max(soh)[qoh], SUM(qty*cost)[sumqty]
, sum(qty) over (partition by product order by dateInv desc) [cumulate]
from openquery(linkedserver,
'select branch,product, soh, cost, dateInv, qty
from table
group by branch,product, soh, cost, dateInv, qty
order by dateInv DESC
')
group by branch,product,qty, dateInv
)t
where cumulate <= qoh
group by branch, dateInv

Well, from the looks of it, each unit of quantity has a value of 83.92, so 9 * 83.92 = 755.28.
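The same figure can be reached by walking the rows in order and taking only a pro-rata share of the row that crosses the target. Below is a minimal Python sketch of that logic, not the asker's eventual SQL; the function name capped_value is made up for illustration, and it assumes every qty is greater than zero.

```python
def capped_value(rows, target_qty):
    """Sum value over rows, in order, until target_qty units are consumed.

    rows: iterable of (qty, value) pairs, already in the required order.
    A row that overshoots the target contributes only its pro-rata share.
    """
    remaining = target_qty
    total = 0.0
    for qty, value in rows:
        if remaining <= 0:
            break
        take = min(qty, remaining)       # units used from this row
        total += value * take / qty      # pro-rata value for those units
        remaining -= take
    return total


# (qty, value) pairs from the table in the question, in date order.
rows = [(5, 419.6), (2, 167.84), (1, 83.92), (2, 167.84), (1, 83.92), (5, 419.6)]
result = round(capped_value(rows, 9), 2)
print(result)
```

Here the first three rows are taken whole (8 units) and the fourth row contributes one of its two units.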

Thanks everyone for your attempts. I managed to solve this problem by using a series of nested queries with "over (partition by)", Row_number and calculations.
Lots of fun !

Related

group by value but only for continue value

OK, the title is far from obvious, I could not explain it better.
Let's consider the table with columns (date, xvalue, some other columns), what I need is to group them by xvalue but only when they are not interrupted considering time (column date), so for example, for:
Date |xvalue |yvalue|
1 Mar |10 |1 |
2 Mar |10 |2 |
3 Mar |20 |6 |
4 Mar |20 |1 |
5 Mar |10 |4 |
6 Mar |10 |2 |
From the above data, I would like to get three rows, for the first xvalue==10, for xvalue==20 and again for xvalue==10 and for each group aggregate of the other values, for example for sum:
1 Mar, 10, 3
3 Mar, 20, 7
5 Mar, 10, 6
It's like the query:
select min(date), xvalue, sum(yvalue) from t group by xvalue
except that the above merges the 1st, 2nd, 5th and 6th of March, and I want them kept separate.
This is an example of a gaps-and-islands problem. But you need an ordering column. With such a column, you can use the difference of row numbers:
select min(date), xvalue, sum(yvalue)
from (select t.*,
row_number() over (partition by xvalue order by date) as seqnum_d,
row_number() over (order by date) as seqnum
from t
) t
group by xvalue, (seqnum - seqnum_d)
order by min(date)
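To see the difference-of-row-numbers approach run end to end, here is a small sketch using Python's built-in sqlite3 module (it needs a SQLite build with window functions, 3.25 or later). The table layout and the integer day column standing in for the date are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "day" is the day of March from the question, used as the ordering column.
conn.execute("CREATE TABLE t (day INTEGER, xvalue INTEGER, yvalue INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, 10, 1), (2, 10, 2), (3, 20, 6), (4, 20, 1), (5, 10, 4), (6, 10, 2)],
)

islands = conn.execute(
    """
    SELECT MIN(day), xvalue, SUM(yvalue)
    FROM (
        SELECT t.*,
               ROW_NUMBER() OVER (PARTITION BY xvalue ORDER BY day) AS seqnum_d,
               ROW_NUMBER() OVER (ORDER BY day) AS seqnum
        FROM t
    ) AS s
    -- Within one uninterrupted run of xvalue, seqnum - seqnum_d is constant.
    GROUP BY xvalue, (seqnum - seqnum_d)
    ORDER BY MIN(day)
    """
).fetchall()
print(islands)
```

The two runs of xvalue 10 get different values of seqnum - seqnum_d (0 and 2), which is what keeps them in separate groups.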
Data in a database is logically stored in mathematical sets, inside which there is absolutely no order and no way to get a default ordering. They are comparable to bags in which objects can move around during use.
So there is no way to answer your query until you add a specific column that defines the sort order you need.

How To Increment Date By One Year, Based on Last Result (DateTime Banding)

Hopefully I'll be able to explain this better than the title.
I have an activity table that looks like this:
|ID| |LicenseNumber| |DateTime|
|1 | |123 | |2017-11-17 11:19:04.420|
|2 | |123 | |2017-11-26 10:16:52.790|
|3 | |123 | |2018-02-06 11:13:21.480|
|4 | |123 | |2018-02-19 10:12:32.493|
|5 | |123 | |2018-05-16 09:33:05.440|
|6 | |123 | |2019-01-02 10:05:25.193|
What I need is a count of rows per License Number, grouped in essentially 12 month intervals. But, the year needs to start from when the previous entry ended.
For example, I need a count of all records for 12 months from 2017-11-17 11:19:04.420, and then I need a count of all records starting from (2017-11-17 11:19:04.420 + 12 months) for another 12 months, and so on.
I've considered using recursive CTEs, the LAG function etc. but can't quite figure it out. I could probably do something with a CASE statement and static values, but that would require updating the report code every year.
Any help pointing me in the right direction would be much appreciated!
I think the following code using CTE can help you but I am not totally sure what you want to achieve:
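WITH CTE AS
(
-- Anchor: the earliest activity date starts the first 12-month period
SELECT MIN([DateTime]) AS PeriodStart
FROM YourTable
UNION ALL
-- Recursive step: each further period starts one year after the previous one
SELECT DATEADD(YEAR, 1, PeriodStart)
FROM CTE
WHERE PeriodStart <= GETDATE()
)
SELECT t.LicenseNumber, c.PeriodStart, COUNT(*) AS NumRows
FROM CTE AS c
INNER JOIN YourTable AS t
-- Half-open range, so a row on a period boundary is counted only once
ON t.[DateTime] >= c.PeriodStart AND t.[DateTime] < DATEADD(YEAR, 1, c.PeriodStart)
GROUP BY t.LicenseNumber, c.PeriodStart;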
Hmmm. Do you just need the number of records in 12-month intervals after the first record?
If so:
select dateadd(year, yr - 1, min_datetime),
dateadd(year, yr, min_datetime),
count(t.id)
from (values (1), (2), (3)) v(yr) left join
(select t.*,
min(datetime) over () as min_datetime
from t
) t
on t.datetime >= dateadd(year, yr - 1, min_datetime) and
t.datetime < dateadd(year, yr, min_datetime)
group by dateadd(year, yr - 1, min_datetime),
dateadd(year, yr, min_datetime)
order by min(v.yr);
This can easily be extended to more years, if it is what you want.
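The banding itself (consecutive 12-month windows anchored at the first record) can be sketched outside SQL as follows. This is an illustrative Python version for a single license number; the function name yearly_band_counts is hypothetical, and it assumes the anchor date is not 29 February.

```python
from datetime import datetime


def yearly_band_counts(timestamps, n_bands):
    """Count timestamps in consecutive 12-month bands anchored at the earliest one."""
    start = min(timestamps)
    bands = []
    for k in range(n_bands):
        # Assumption: start is not Feb 29, otherwise replace() raises ValueError.
        lo = start.replace(year=start.year + k)
        hi = start.replace(year=start.year + k + 1)
        # Half-open interval [lo, hi) so boundary rows are counted once.
        bands.append((lo, sum(lo <= ts < hi for ts in timestamps)))
    return bands


fmt = "%Y-%m-%d %H:%M:%S.%f"
activity = [
    datetime.strptime(s, fmt)
    for s in [
        "2017-11-17 11:19:04.420",
        "2017-11-26 10:16:52.790",
        "2018-02-06 11:13:21.480",
        "2018-02-19 10:12:32.493",
        "2018-05-16 09:33:05.440",
        "2019-01-02 10:05:25.193",
    ]
]
bands = yearly_band_counts(activity, 3)
for band_start, count in bands:
    print(band_start, count)
```

With the sample rows, the first band holds five records, the second holds one, and the third is empty. Handling several license numbers would mean running the same logic per license.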

Select rows that are duplicates on two columns

I have data in a table. There are 3 columns (ID, Interval, ContactInfo). This table lists all phone contacts. I'm attempting to get a count of phone numbers that called twice on the same day and have no idea how to go about this. I can get duplicate entries for the same number but it does not match on date. The code I have so far is below.
SELECT ContactInfo, COUNT(Interval) AS NumCalls
FROM AllCalls
GROUP BY ContactInfo
HAVING COUNT(AllCalls.ContactInfo) > 1
I'd like to have it return the date, the number of calls on that date if more than 1, and the phone number.
Sample data:
|ID |Interval |ContactInfo|
|--------|------------|-----------|
|1 |3/1/2017 |8009999999 |
|2 |3/1/2017 |8009999999 |
|3 |3/2/2017 |8001234567 |
|4 |3/2/2017 |8009999999 |
|5 |3/3/2017 |8007771111 |
|6 |3/3/2017 |8007771111 |
|--------|------------|-----------|
Expected result:
|Interval |ContactInfo|NumCalls|
|------------|-----------|--------|
|3/1/2017 |8009999999 |2 |
|3/3/2017 |8007771111 |2 |
|------------|-----------|--------|
Just as juergen d suggested, you should try to add Interval in your GROUP BY. Like so:
SELECT AC.ContactInfo
, AC.Interval
, COUNT(*) AS qnty
FROM AllCalls AS AC
GROUP BY AC.ContactInfo
, AC.Interval
HAVING COUNT(*) > 1
The code should look like this:
select Interval , ContactInfo, count(ID) AS NumCalls from AllCalls group by Interval, ContactInfo having count(ID)>1;
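As a quick check against the sample data, the grouping on both columns can be run through Python's sqlite3 module. This is only a sketch: the column types and the text dates are assumptions, and the ORDER BY is added so the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE AllCalls (ID INTEGER, "Interval" TEXT, ContactInfo TEXT)')
conn.executemany(
    "INSERT INTO AllCalls VALUES (?, ?, ?)",
    [
        (1, "3/1/2017", "8009999999"),
        (2, "3/1/2017", "8009999999"),
        (3, "3/2/2017", "8001234567"),
        (4, "3/2/2017", "8009999999"),
        (5, "3/3/2017", "8007771111"),
        (6, "3/3/2017", "8007771111"),
    ],
)

# Group on the date AND the number, then keep only groups with more than one call.
repeat_callers = conn.execute(
    """
    SELECT "Interval", ContactInfo, COUNT(*) AS NumCalls
    FROM AllCalls
    GROUP BY "Interval", ContactInfo
    HAVING COUNT(*) > 1
    ORDER BY "Interval"
    """
).fetchall()
print(repeat_callers)
```

The call from 8009999999 on 3/2/2017 is excluded because it is that number's only call on that date.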

SQL query to return a grouped result as a single row

If I have a jobs table like:
|id|created_at |status |
----------------------------
|1 |01-01-2015 |error |
|2 |01-01-2015 |complete |
|3 |01-01-2015 |error |
|4 |01-02-2015 |complete |
|5 |01-02-2015 |complete |
|6 |01-03-2015 |error |
|7 |01-03-2015 |on hold |
|8 |01-03-2015 |complete |
I want a query that will group them by date and count the occurrence of each status and the total status for that date.
SELECT created_at, status, count(status)
FROM jobs
GROUP BY created_at, status;
Which gives me
|created_at |status |count|
-------------------------------
|01-01-2015 |error |2
|01-01-2015 |complete |1
|01-02-2015 |complete |2
|01-03-2015 |error |1
|01-03-2015 |on hold |1
|01-03-2015 |complete |1
I would like to now condense this down to a single row per created_at unique date with some sort of multi column layout for each status. One constraint is that status is any one of 5 possible words but each date might not have one of every status. Also I would like a total of all statuses for each day. So desired results would look like:
|date |total |errors|completed|on_hold|
----------------------------------------------
|01-01-2015 |3 |2 |1 |null
|01-02-2015 |2 |null |2 |null
|01-03-2015 |3 |1 |1 |1
The columns could be built dynamically from something like
SELECT DISTINCT status FROM jobs;
with a null result for any day that doesn't contain any of that type of status. I am no SQL expert, but I am trying to do this in a DB view so that I don't have to bog down doing multiple queries in Rails.
I am using PostgreSQL but would like to keep it to standard SQL. I have tried to understand aggregate functions well enough to use some other tools, but without success.
The following should work in any RDBMS:
SELECT created_at, count(status) AS total,
sum(case when status = 'error' then 1 end) as errors,
sum(case when status = 'complete' then 1 end) as completed,
sum(case when status = 'on hold' then 1 end) as on_hold
FROM jobs
GROUP BY created_at;
The query uses conditional aggregation to pivot the grouped data. It assumes that the status values are known beforehand. If there are additional status values, just add the corresponding sum(case ...) expression for each.
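A small way to try the conditional aggregation on the sample rows is Python's sqlite3 module, as sketched below. Storing created_at as text and adding ORDER BY are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (id INTEGER, created_at TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO jobs VALUES (?, ?, ?)",
    [
        (1, "01-01-2015", "error"),
        (2, "01-01-2015", "complete"),
        (3, "01-01-2015", "error"),
        (4, "01-02-2015", "complete"),
        (5, "01-02-2015", "complete"),
        (6, "01-03-2015", "error"),
        (7, "01-03-2015", "on hold"),
        (8, "01-03-2015", "complete"),
    ],
)

# A CASE without ELSE yields NULL, and SUM over only NULLs is NULL,
# so a status that never occurs on a date shows up as None.
pivot = conn.execute(
    """
    SELECT created_at, count(status) AS total,
           sum(case when status = 'error' then 1 end) AS errors,
           sum(case when status = 'complete' then 1 end) AS completed,
           sum(case when status = 'on hold' then 1 end) AS on_hold
    FROM jobs
    GROUP BY created_at
    ORDER BY created_at
    """
).fetchall()
for row in pivot:
    print(row)
```

The None values correspond to the null cells in the desired result from the question.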
An actual crosstab query would look like this:
SELECT * FROM crosstab(
$$SELECT created_at, status, count(*) AS ct
FROM jobs
GROUP BY 1, 2
ORDER BY 1, 2$$
,$$SELECT unnest('{error,complete,"on hold"}'::text[])$$)
AS ct (date date, errors int, completed int, on_hold int);
Should perform very well.
Basics:
PostgreSQL Crosstab Query
The above does not yet include the total per date.
Postgres 9.5 introduces the ROLLUP clause, which is perfect for the case:
SELECT * FROM crosstab(
$$SELECT created_at, COALESCE(status, 'total'), ct
FROM (
SELECT created_at, status, count(*) AS ct
FROM jobs
GROUP BY created_at, ROLLUP(status)
) sub
ORDER BY 1, 2$$
,$$SELECT unnest('{total,error,complete,"on hold"}'::text[])$$)
AS ct (date date, total int, errors int, completed int, on_hold int);
Up to Postgres 9.4, use this query instead:
WITH cte AS (
SELECT created_at, status, count(*) AS ct
FROM jobs
GROUP BY 1, 2
)
TABLE cte
UNION ALL
SELECT created_at, 'total', sum(ct)
FROM cte
GROUP BY 1
ORDER BY 1
Related:
Grouping() equivalent in PostgreSQL?
If you want to stick to a simple query, this is a bit shorter:
SELECT created_at
, count(*) AS total
, count(status = 'error' OR NULL) AS errors
, count(status = 'complete' OR NULL) AS completed
, count(status = 'on hold' OR NULL) AS on_hold
FROM jobs
GROUP BY 1;
count(status) for the total per date is error-prone, because it would not count rows with NULL values in status. Use count(*) instead, which is also shorter and a bit faster.
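The difference between count(*) and count(status), and the count(... OR NULL) idiom, can be observed with a small sqlite3 sketch in Python. The ninth row with a NULL status is a hypothetical addition to the sample data, included only to make the difference visible; SQLite evaluates the OR NULL expression with the same three-valued logic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (id INTEGER, created_at TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO jobs VALUES (?, ?, ?)",
    [
        (1, "01-01-2015", "error"),
        (2, "01-01-2015", "complete"),
        (3, "01-01-2015", "error"),
        (4, "01-02-2015", "complete"),
        (5, "01-02-2015", "complete"),
        (6, "01-03-2015", "error"),
        (7, "01-03-2015", "on hold"),
        (8, "01-03-2015", "complete"),
        (9, "01-03-2015", None),  # hypothetical row with a NULL status
    ],
)

counts = conn.execute(
    """
    SELECT created_at,
           count(*)                        AS total_rows,
           count(status)                   AS non_null_status,
           count(status = 'error' OR NULL) AS errors
    FROM jobs
    GROUP BY created_at
    ORDER BY created_at
    """
).fetchall()
for row in counts:
    print(row)
```

Note that count(...) returns 0 rather than NULL when nothing matches, unlike sum(case ... end) without an ELSE branch.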
Here is a list of techniques:
For absolute performance, is SUM faster or COUNT?
In Postgres 9.4+ use the new aggregate FILTER clause, like @a_horse mentioned:
SELECT created_at
, count(*) AS total
, count(*) FILTER (WHERE status = 'error') AS errors
, count(*) FILTER (WHERE status = 'complete') AS completed
, count(*) FILTER (WHERE status = 'on hold') AS on_hold
FROM jobs
GROUP BY 1;
Details:
How can I simplify this game statistics query?

Get history of record states sql server 2008

Say I had a Bus Garage application that contained a data table representing whether buses were in the garage, out of the garage, or in the shop for maintenance. It looks like this:
+-----------------------+-----+-----+
|Date |BusId|State|
+-----------------------+-----+-----+
|2013-09-12 15:02:41,844|1 |IN |
+-----------------------+-----+-----+
|2013-09-12 15:02:41,844|2 |IN |
+-----------------------+-----+-----+
|2013-09-12 15:02:41,844|3 |OUT |
+-----------------------+-----+-----+
|2013-09-12 15:02:41,844|4 |OUT |
+-----------------------+-----+-----+
|2013-09-12 15:02:41,844|5 |OUT |
+-----------------------+-----+-----+
|2013-09-13 15:02:41,844|1 |OUT |
+-----------------------+-----+-----+
|2013-09-14 15:02:41,844|1 |IN |
+-----------------------+-----+-----+
|2013-09-15 15:02:41,844|1 |OUT |
+-----------------------+-----+-----+
|2013-09-15 15:02:41,844|2 |OUT |
+-----------------------+-----+-----+
Now I want to make a nice day-by-day (or hour-by-hour, etc.) dataset giving me an overview of how many buses were in the garage and how many were out of it.
+-------------------+-----+------------+
|Date |State|Count(buses)|
+-------------------+-----+------------+
|2013-09-12 16:00:00|IN |2 |
+-------------------+-----+------------+
|2013-09-12 16:00:00|OUT |3 |
+-------------------+-----+------------+
|2013-09-13 16:00:00|IN |1 |
+-------------------+-----+------------+
|2013-09-13 16:00:00|OUT |4 |
+-------------------+-----+------------+
|2013-09-14 16:00:00|IN |2 |
+-------------------+-----+------------+
|2013-09-14 16:00:00|OUT |3 |
+-------------------+-----+------------+
|2013-09-15 16:00:00|IN |0 |
+-------------------+-----+------------+
|2013-09-15 16:00:00|OUT |5 |
+-------------------+-----+------------+
How (not necessarily explained in code) would I go about doing this using only T-SQL?
I have one requirement: I cannot use variable declarations in my statement, since I will have this as a view.
I asked a very similar question, but I felt that one got too verbose and was not as general as this one.
Do you really want multiple records per day/hour just to display the different states? I would make them columns. You can use a CTE and the OVER clause to count per day/hour group:
WITH CTE AS
(
SELECT [Date] = DATEADD(day, DATEDIFF(day, 0, [Date]),0),
[BusId], [State],
[IN] = SUM(CASE WHEN State='IN' THEN 1 END) OVER (PARTITION BY DATEADD(day, DATEDIFF(day, 0, [Date]),0)),
[OUT] = SUM(CASE WHEN State='OUT' THEN 1 END) OVER (PARTITION BY DATEADD(day, DATEDIFF(day, 0, [Date]),0)),
[DayNum] = ROW_NUMBER() OVER (PARTITION BY DATEADD(day, DATEDIFF(day, 0, [Date]),0)
ORDER BY [Date], [BusID], [State])
FROM dbo.Garage g
)
SELECT [Date], [BusId], [State], [IN], [OUT]
FROM CTE
WHERE [DayNum] = 1
Result:
|DATE                            |BUSID|STATE|IN    |OUT   |
|--------------------------------|-----|-----|------|------|
|September, 12 2013 00:00:00+0000|1    |IN   |2     |3     |
|September, 13 2013 00:00:00+0000|1    |OUT  |(null)|1     |
|September, 14 2013 00:00:00+0000|1    |IN   |1     |(null)|
|September, 15 2013 00:00:00+0000|1    |OUT  |(null)|2     |
This works even in SQL-Server 2005. If you want to group by hour instead of day you have to change DATEADD(day, DATEDIFF(day, 0, [Date]),0) to DATEADD(hour, DATEDIFF(hour, 0, [Date]),0) everywhere.
Try this:
DECLARE @businfo AS TABLE([date] datetime,busid int,[state] varchar(5))
INSERT INTO @businfo VALUES('2013-09-12 15:02:41',1,'IN')
INSERT INTO @businfo VALUES('2013-09-12 15:02:41',2,'IN')
INSERT INTO @businfo VALUES('2013-09-12 15:02:41',3,'OUT')
INSERT INTO @businfo VALUES('2013-09-12 15:02:41',4,'OUT')
INSERT INTO @businfo VALUES('2013-09-12 15:02:41',5,'OUT')
INSERT INTO @businfo VALUES('2013-09-13 15:02:41',1,'OUT')
INSERT INTO @businfo VALUES('2013-09-14 15:02:41',1,'IN')
INSERT INTO @businfo VALUES('2013-09-15 15:02:41',1,'OUT')
INSERT INTO @businfo VALUES('2013-09-15 15:02:41',2,'OUT')
select [date],[state],COUNT(busid) as [count(buses)] from @businfo
group by [date],[state]
order by [date]
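The expected output in the question carries each bus's most recent state forward to every reporting day, rather than counting only the rows logged on that day. Since the asker said the explanation need not be in code, here is a minimal Python sketch of that carry-forward logic rather than a T-SQL view. The function name daily_state_counts is hypothetical, and timestamps are truncated to dates for simplicity.

```python
from datetime import date


def daily_state_counts(events, report_days, states=("IN", "OUT")):
    """For each report day, count buses by their most recent state as of that day."""
    events = sorted(events)              # chronological order
    result = []
    for day in report_days:
        latest = {}                      # bus id -> last known state
        for ev_day, bus, state in events:
            if ev_day > day:
                break                    # later events do not affect this day
            latest[bus] = state
        for s in states:
            result.append((day, s, sum(1 for v in latest.values() if v == s)))
    return result


# (date, bus id, state) rows from the question's table.
events = [
    (date(2013, 9, 12), 1, "IN"),
    (date(2013, 9, 12), 2, "IN"),
    (date(2013, 9, 12), 3, "OUT"),
    (date(2013, 9, 12), 4, "OUT"),
    (date(2013, 9, 12), 5, "OUT"),
    (date(2013, 9, 13), 1, "OUT"),
    (date(2013, 9, 14), 1, "IN"),
    (date(2013, 9, 15), 1, "OUT"),
    (date(2013, 9, 15), 2, "OUT"),
]
report_days = [date(2013, 9, d) for d in range(12, 16)]
overview = daily_state_counts(events, report_days)
for row in overview:
    print(row)
```

On the sample data this reproduces the IN/OUT counts in the question's desired table. In T-SQL the same idea would amount to picking, per bus and per reporting day, the latest row at or before that day and then grouping by state.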