SQL select and count all items that have occurred before - sql

I have a table with rows that represent order dates:
2009-05-15 13:31:47.713
2009-05-15 22:09:32.227
2009-05-16 02:38:36.027
2009-05-16 12:06:49.743
2009-05-16 16:20:26.680
2009-05-17 01:36:19.480
2009-05-18 09:44:46.993
2009-05-18 14:06:12.073
2009-05-18 15:25:47.540
2009-05-19 10:28:24.150
I would like a query that returns the following:
2009-05-15 2
2009-05-16 5
2009-05-17 6
2009-05-18 9
2009-05-19 10
Basically, it keeps a running total of all the orders placed by the end of the day indicated. The counts are not just the orders on that day but all the orders since the earliest date in the table.
This is MSSQL 2000, and the datatype in the first table is just datetime; in the second it could be datetime or string, it doesn't really matter for my purposes.

I got this to work on SQL Server 2005. I think it should work with 2000, as well.
SELECT dt, count(q2.YourDate)
FROM (SELECT DISTINCT CONVERT(varchar,YourDate,101) dt FROM YourTable) t1
JOIN YourTable q2 ON DATEADD(d,-1,CONVERT(varchar,YourDate,101)) < dt
GROUP BY dt
This will query the table twice, but at least gives correct output.
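A correlated subquery is another option that runs on SQL Server 2000 (a sketch, still assuming the YourTable/YourDate names used above):
-- Sketch: for each distinct order day, count every order on or before that day.
-- Style 120 truncated to 10 characters gives 'yyyy-mm-dd', which compares and
-- sorts correctly as a string.
SELECT d.OrderDay,
       (SELECT COUNT(*)
        FROM YourTable t
        WHERE CONVERT(varchar(10), t.YourDate, 120) <= d.OrderDay) AS RunningTotal
FROM (SELECT DISTINCT CONVERT(varchar(10), YourDate, 120) AS OrderDay
      FROM YourTable) d
ORDER BY d.OrderDay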

I recommend a two-query solution. This is slow, but I use this method almost daily. The important thing is NOT to join the 2 tables in the first query; you want each order duplicated for every date in your lookup table.
You will need a lookup table with one row for each date of the time period you're interested in. Let's call it dboDateLookup. Here's what it will look like:
DtIndex
2009-05-15
2009-05-16
2009-05-17
2009-05-18
2009-05-19
Let's also assume the order table, dboOrders, has two columns, orderdate and ordernumber:
orderdate ordernumber
2009-05-15 13:31:47.713 1
2009-05-15 22:09:32.227 2
2009-05-16 02:38:36.027 3
2009-05-16 12:06:49.743 4
2009-05-16 16:20:26.680 5
Query1:
SELECT
Format([DtIndex],"yyyy-mm-dd") AS ByDate,
ordernumber,
IIf(Format([orderdate],"yyyy-mm-dd")<=Format([DtIndex],"yyyy-mm-dd"),1,0) AS NumOrdersBefore
FROM [dboOrders], [dboDateLookUp];
Query2:
Select
[ByDate],
sum([NumOrdersBefore]) as RunningTotal
from [Query1]
group by [ByDate];
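On SQL Server 2000 the same lookup-table idea can be written as a single T-SQL statement; a sketch, assuming the dboDateLookup and dboOrders tables described above, and that DtIndex compares cleanly as yyyy-mm-dd:
-- Sketch: cross join every order against every lookup date, flag the orders
-- placed on or before that date, then sum the flags per date.
SELECT d.DtIndex AS ByDate,
       SUM(CASE WHEN CONVERT(varchar(10), o.orderdate, 120) <= CONVERT(varchar(10), d.DtIndex, 120)
                THEN 1 ELSE 0 END) AS RunningTotal
FROM dboDateLookup d
CROSS JOIN dboOrders o
GROUP BY d.DtIndex
ORDER BY d.DtIndex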

Try this (returns string dates):
SELECT
LEFT(CONVERT(char(23),YourDate,121),10) AS Date
,COUNT(*) AS CountOf
FROM YourTable
GROUP BY LEFT(CONVERT(char(23),YourDate,121),10)
ORDER BY 1
This will table scan. If it is too slow, consider using a persisted computed column with an index for the date; that will run much faster. However, I'm not sure if you can do all that in SQL 2000.
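For example, something along these lines (a sketch; PERSISTED is 2005+ syntax, and whether this particular expression can be indexed depends on determinism rules, so treat it as untested on 2000):
-- Sketch: store just the date part in a computed column and index it, so the
-- GROUP BY can use the index instead of converting every row at query time.
-- YourDateOnly / IX_YourTable_YourDateOnly are illustrative names.
ALTER TABLE YourTable
    ADD YourDateOnly AS LEFT(CONVERT(char(23), YourDate, 121), 10) PERSISTED

CREATE INDEX IX_YourTable_YourDateOnly ON YourTable (YourDateOnly)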
EDIT: I read the question better; try this:
SELECT
d.Date
,SUM(dt.CountOf) AS CountOf
FROM (SELECT
LEFT(CONVERT(char(23),YourDate,121),10) AS Date
,COUNT(*) AS CountOf
FROM YourTable
GROUP BY LEFT(CONVERT(char(23),YourDate,121),10)
) dt
INNER JOIN (SELECT
DISTINCT LEFT(CONVERT(char(23),YourDate,121),10) AS Date
FROM YourTable
) d ON dt.Date<=d.Date
GROUP BY d.Date
ORDER BY d.Date

I have another one. It is not so fancy. I ran it on Access so the syntax may differ a little bit, but it seems to work.
P.S. I'm relatively new to SQL.
Data:
ID F1 F2
1 15/05/2009 13:31:47.713
2 15/05/2009 22:09:32.227
3 16/05/2009 02:38:36.027
4 16/05/2009 12:06:49.743
5 16/05/2009 16:20:26.680
6 17/05/2009 01:36:19.480
7 18/05/2009 09:44:46.993
8 18/05/2009 14:06:12.073
9 18/05/2009 15:25:47.540
10 19/05/2009 10:28:24.150
Query:
SELECT Table1.F1 AS Dates, SUM(REPLACE(Len(Table1.F2), Len(Table1.F2), 1)) AS Occurred
FROM Table1
GROUP BY Table1.F1;
Result:
Dates Occurred
15/05/2009 2
16/05/2009 3
17/05/2009 1
18/05/2009 3
19/05/2009 1
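Note that this gives per-day counts rather than the running total asked for; to accumulate them you could layer a second query on top (a sketch in Access-style SQL, assuming the query above is saved as a query named qryDaily and that Dates is a real date field):
SELECT a.Dates, SUM(b.Occurred) AS RunningTotal
FROM qryDaily AS a, qryDaily AS b
WHERE b.Dates <= a.Dates
GROUP BY a.Dates
ORDER BY a.Dates;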

SELECT Count(*), x.Date FROM
(SELECT
DISTINCT LEFT(CONVERT(char(23),YourDate,121),10) AS Date
FROM YourTable) x -- Gets the distinct dates.
INNER JOIN YourTable y ON x.Date >= LEFT(CONVERT(char(23),y.YourDate,121),10)
GROUP BY x.Date
It's going to be slow. REALLY REALLY slow. I hate to think what run times would be.

Related

Extra column looking at where OpenDate > ClosedDate in previous records

I am really struggling with this problem.
I have the following table:
TicketNumber  OpenTicketDate (YYYY_MM)  ClosedTicketDate (YYYY_MM)
1             2018-1                    2020-1
2             2018-2                    2021-2
3             2019-1                    2020-6
4             2020-7                    2021-1
I would like to create an extra column which would monitor the open tickets at the given OpenTicketDate.
So the new table would look like this:
TicketNumber  OpenTicketDate (YYYY_MM)  ClosedTicketDate (YYYY_MM)  OpenTicketsLookingBackwards
1             2018-1                    2020-1                      1
2             2018-2                    2021-2                      2
3             2019-1                    2020-6                      3
4             2020-7                    2021-1                      2
The logic behind the 4th (extra) column is that it looks at the previous records and the current record where ClosedTicketDate > OpenTicketDate.
For example, TicketNumber 4 has '2' open tickets because among the previous and current records there are only 2 where ClosedTicketDate > its OpenTicketDate.
The new column is filled based only on previous records; it is backward looking, not forward.
Is there anyone who can help me out?
You could perform a self join and aggregate as follows:
Select T.TicketNumber, T.OpenTicketDate, T.ClosedTicketDate,
Count(*) as OpenTicketsLookingBackwards
From table_name T Left Join table_name D
On Cast(concat(T.OpenTicketDate,'-1') as Date) < Cast(concat(D.ClosedTicketDate,'-1') as Date)
And T.ticketnumber >= D.ticketnumber
Group By T.TicketNumber, T.OpenTicketDate, T.ClosedTicketDate
Order By T.TicketNumber
You may also try a scalar subquery, as follows:
Select T.TicketNumber, T.OpenTicketDate, T.ClosedTicketDate,
(
Select Count(*) From table_name D
Where Cast(concat(T.OpenTicketDate,'-1') as Date) <
Cast(concat(D.ClosedTicketDate,'-1') as Date)
And T.ticketnumber >= D.ticketnumber
) As OpenTicketsLookingBackwards
From table_name T
Order By T.TicketNumber
Mostly, joins tend to outperform subqueries.

SQL: How to count the sum of values without GROUP BY

I have the following table:
visitorId visitNumber DATE
1 1 20180101
1 2 20180101
1 3 20180105
2 1 20171230
2 2 20180106
What I would like to return is:
visitorId totalVisits max_visits_in_1_day
1 3 2
2 2 1
I managed to get everything working without max_visits_in_1_day using:
SELECT visitorId,
MAX(visitNumber) - MIN(visitNumber) + 1 as totalVisits
FROM myTable
GROUP BY visitorId
What I need to do is improve the code such that max_visits_in_1_day gets added. Something like MAX(COUNT(GROUP BY(DATE)))
I first tried adding MAX(COUNT(DATE)), but this aggregates all dates and doesn't actually look for the maximum per unique date. In a sense, I would need to do a GROUP BY on DATE and then sum the counts.
I tried adding GROUP BY visitorId, DATE but this creates extra rows.
You will have to take two steps like this:
SELECT visitorId, SUM(perDay) AS totalVisits, MAX(perDay) AS max_visits_in_1_day
FROM
(SELECT visitorId, COUNT(visitNumber) AS perDay, DATE
FROM myTable
GROUP BY visitorId, DATE) A
GROUP BY visitorId
You can try the following query:
SELECT YT.visitorId
,COUNT(YT.visitNumber) totalVisits
,mv1d.max_count max_visits_in_1_day
FROM YOUR_TABLE YT
INNER JOIN (SELECT visitorId, MAX(perDay) max_count
            FROM (SELECT visitorId, DATE, COUNT(*) perDay
                  FROM YOUR_TABLE
                  GROUP BY visitorId, DATE) d
            GROUP BY visitorId) mv1d
ON YT.visitorId = mv1d.visitorId
GROUP BY YT.visitorId, mv1d.max_count

An aggregation is affecting results in a major way

I seem to be getting duplicates as a result of this query. The only analysis I want to do is the sum of calls/the total orders, and to be able to see how many support_tickets were generated from orders within an order range, up to a call_date. Very simple, but surprisingly complex to code up. Here is my attempt. I have also tried to change the below into a union, but still get wrong aggregate results.
The query:
SELECT marketing_code,
count(order_code) order_code_count,
order_date,
sum(support_ticket_call) call_count,
call_date
FROM
(select distinct marketing_code, order_code, order_date from table1) a
left join
(select count(call_ids) as support_ticket_Call, call_date
FROM table2 group by call_date) b
on b.order_ID_code = a.order_id_code
group by marketing_code, order_date, call_date
Please note, the call can happen at a much later date than the order. The order date is in table 1, but not in table 2; the call_date is in table 2, but not in table 1. Also, in the data, the marketing code is either AB16 or AB17.
Sample data:
Marketing code order_code_count call_count call_date order_date
AB16 30 45 2016-01-01 2015-12-27
AB17 13 17 2016-01-02 2015-12-29
AB16 24 29 2016-01-02 2016-01-01
The sum of support ticket calls should be lower than the order count.
You join your tables by order_id_code, but in the right part of your join you count all calls from one day. This doesn't seem right. Try something like this:
select
marketing_code,
count(order_code) order_code_count,
order_date,
count(call_ids) call_count,
call_date
from
table1 a left join table2 b on b.order_ID_code = a.order_id_code
group by
marketing_code, order_date, call_date

Joining next Sequential Row

I am planning an SQL statement right now and would need someone to look over my thoughts.
This is my Table:
id stat period
--- ------- --------
1 10 1/1/2008
2 25 2/1/2008
3 5 3/1/2008
4 15 4/1/2008
5 30 5/1/2008
6 9 6/1/2008
7 22 7/1/2008
8 29 8/1/2008
Create Table
CREATE TABLE tbstats
(
id INT IDENTITY(1, 1) PRIMARY KEY,
stat INT NOT NULL,
period DATETIME NOT NULL
)
go
INSERT INTO tbstats
(stat,period)
SELECT 10,CONVERT(DATETIME, '20080101')
UNION ALL
SELECT 25,CONVERT(DATETIME, '20080102')
UNION ALL
SELECT 5,CONVERT(DATETIME, '20080103')
UNION ALL
SELECT 15,CONVERT(DATETIME, '20080104')
UNION ALL
SELECT 30,CONVERT(DATETIME, '20080105')
UNION ALL
SELECT 9,CONVERT(DATETIME, '20080106')
UNION ALL
SELECT 22,CONVERT(DATETIME, '20080107')
UNION ALL
SELECT 29,CONVERT(DATETIME, '20080108')
go
I want to calculate the difference between each statistic and the next, and then calculate the mean value of the 'gaps.'
Thoughts:
I need to join each record with its subsequent row. I can do that using the ever-flexible joining syntax, thanks to the fact that I know the id field is an integer sequence with no gaps.
By aliasing the table I could incorporate it into the SQL query twice, then join them together in a staggered fashion by adding 1 to the id of the first aliased table. The first record in the table has an id of 1. 1 + 1 = 2 so it should join on the row with id of 2 in the second aliased table. And so on.
Now I would simply subtract one from the other.
Then I would use the ABS function to ensure that I always get positive integers as a result of the subtraction regardless of which side of the expression is the higher figure.
Is there an easier way to achieve what I want?
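In code, the staggered self-join described above might look like this (a sketch against the tbstats table defined above):
-- Join each row to the row with id + 1, take the absolute difference,
-- then average the gaps (* 1.0 avoids integer division).
SELECT AVG(ABS(t1.stat - t2.stat) * 1.0) AS mean_gap
FROM tbstats t1
JOIN tbstats t2 ON t2.id = t1.id + 1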
The lead analytic function should do the trick:
SELECT period, stat, stat - LEAD(stat) OVER (ORDER BY period) AS gap
FROM tbstats
The average value of the gaps can be done by calculating the difference between the first value and the last value and dividing by one less than the number of elements:
select sum(case when seqnum = num then stat else - stat end) * 1.0 / (max(num) - 1)
from (select period, stat, row_number() over (order by period) as seqnum,
count(*) over () as num
from tbstats
) t
where seqnum = num or seqnum = 1;
Of course, you can also do the calculation using lead(), but this will also work in SQL Server 2005 and 2008.
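For reference, the lead() version of the same average (signed gaps, SQL Server 2012 or later) could be sketched as:
-- AVG ignores the NULL gap produced for the last row, so this divides the sum
-- of the gaps by the number of gaps, i.e. one less than the number of rows.
SELECT AVG(gap * 1.0) AS avg_gap
FROM (SELECT LEAD(stat) OVER (ORDER BY period) - stat AS gap
      FROM tbstats) t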
You can also achieve this by using a join:
SELECT t1.period,
t1.stat,
t1.stat - t2.stat gap
FROM tbstats t1
LEFT JOIN tbstats t2
ON t1.id + 1 = t2.id
To calculate the difference between each statistic and the next, LEAD() and LAG() may be the simplest option. You provide an ORDER BY, and LEAD(something) returns the next something and LAG(something) returns the previous something in the given order.
select
x.id thisStatId,
LAG(x.id) OVER (ORDER BY x.id) lastStatId,
x.stat thisStatValue,
LAG(x.stat) OVER (ORDER BY x.id) lastStatValue,
x.stat - LAG(x.stat) OVER (ORDER BY x.id) diff
from tbStats x

SQL Query to generate an extra field from data in the table

I have a table with 3 fields like this sample table Tbl1
Person Cost FromDate
1 10 2009-1-1
1 20 2010-1-1
2 10 2009-1-1
I want to query it and get back the 3 fields and a generated field called ToDate that defaults to 2099-1-1 unless there is an actual ToDate implied from another entry for the person in the table.
select Person,Cost,FromDate,ToDate From Tbl1
Person Cost FromDate ToDate
1 10 2009-1-1 2010-1-1
1 20 2010-1-1 2099-1-1
2 10 2009-1-1 2099-1-1
You can select the minimum date from all dates that are after the record's date. If there is none you get NULL. With COALESCE you change NULL into the default date:
select
Person,
Cost,
FromDate,
coalesce((select min(FromDate) from Tbl1 later where later.Person = Tbl1.Person and later.FromDate > Tbl1.FromDate), '2099-01-01') as ToDate
From Tbl1
order by Person, FromDate;
Although Thorsten's answer is perfectly fine, it would be more efficient to use window functions to match the derived end-dates.
;WITH nbrdTbl
AS ( SELECT Person, Cost, FromDate, row_nr = ROW_NUMBER() OVER (PARTITION BY Person ORDER BY FromDate ASC)
FROM Tbl1)
SELECT t.Person, t.Cost, t.FromDate, derived_end_date = COALESCE(nxt.FromDate, '2099-01-01')
FROM nbrdTbl t
LEFT OUTER JOIN nbrdTbl nxt
ON nxt.Person = t.Person
AND nxt.row_nr = t.row_nr + 1
ORDER BY t.Person, t.FromDate
Testing on a 2,000-record table, it's about 3 times as efficient according to the execution plan (78% vs 22%).