Refactor subqueries using GROUP BY/HAVING?


I'm building a MySQL query to determine how many items from each of several categories appear in a given date range. My initial attempt looked like this:
select Title,
(select count(*) from entries where CategoryID=1
and Date >= @StartDate and Date <= @EndDate) as Cat1,
(select count(*) from entries where CategoryID=2
and Date >= @StartDate and Date <= @EndDate) as Cat2,
(select count(*) from entries where CategoryID is null
and Date >= @StartDate and Date <= @EndDate) as UnknownCategory
from entries
where Date >= @StartDate and Date <= @EndDate
The table is quite large and I'd like to refactor the query to speed it up, but I'm not sure how - can this be rewritten using GROUP BY/HAVING statements or is there another way I'm missing?
Edit: Sample result set - something like this:
Title | Category 1 Total | Category 2 Total | Unknown Category Total
ABC 1 3 0
DEF 2 7 2

select Title, SUM(CategoryID=1) as Cat1, SUM(categoryID=2) as Cat2,
SUM(categoryID IS NULL) as UnknownCategory
FROM entries
WHERE Date BETWEEN @StartDate AND @EndDate
GROUP BY Title
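You can put boolean expressions inside SUM(): in MySQL a true comparison evaluates to 1 and a false one to 0, so each sum counts the matching rows. I also used the BETWEEN operator, which is inclusive on both ends and equivalent to the >= / <= pair, just shorter.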
An alternative that would return a different result layout but is a little conceptually simpler:
select Title, CategoryID, count(*)
from entries
WHERE Date BETWEEN @StartDate AND @EndDate
group by Title, CategoryID
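For anyone who wants to check the conditional-sum query on a small data set, here is a minimal sketch in Python using the standard-library sqlite3 module. SQLite stands in for MySQL only because it needs no server; it also evaluates a comparison to 1 or 0. The sample rows and the date range are made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption: demo only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (Title TEXT, CategoryID INTEGER, Date TEXT)")
conn.executemany(
    "INSERT INTO entries VALUES (?, ?, ?)",
    [
        ("ABC", 1, "2020-01-05"),
        ("ABC", 2, "2020-01-06"),
        ("ABC", 2, "2020-01-07"),
        ("ABC", 2, "2020-01-08"),
        ("DEF", 1, "2020-01-05"),
        ("DEF", None, "2020-01-09"),
        ("DEF", 1, "2019-12-31"),  # outside the range, must not be counted
    ],
)

# One pass over the table: each SUM counts the rows where its condition is true.
rows = conn.execute(
    """
    SELECT Title,
           SUM(CategoryID = 1)     AS Cat1,
           SUM(CategoryID = 2)     AS Cat2,
           SUM(CategoryID IS NULL) AS UnknownCategory
    FROM entries
    WHERE Date BETWEEN ? AND ?
    GROUP BY Title
    ORDER BY Title
    """,
    ("2020-01-01", "2020-01-31"),
).fetchall()

for row in rows:
    print(row)
```

One thing to watch: a comparison against a NULL CategoryID yields NULL rather than 0, and SUM skips NULLs. A Title whose in-range rows all have a NULL category would therefore get NULL, not 0, in Cat1 and Cat2; wrapping each sum in COALESCE(..., 0) avoids that.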

How about grouping by category ID and then using a HAVING clause to keep only specific categories, like:
select CategoryID, count(*)
from entries
where Date >= @StartDate AND Date <= @EndDate
group by CategoryID
having CategoryID = 1 or CategoryID = 2 or CategoryID is null
If there are multiple titles per category you could group by both fields:
select Title, CategoryID, count(*)
from entries
where Date >= @StartDate AND Date <= @EndDate
group by Title, CategoryID
having CategoryID = 1 or CategoryID = 2 or CategoryID is null

Select COUNT(*), sTitle, CategoryID FROM entries
WHERE Date >= @StartDate and Date <= @EndDate
GROUP BY CategoryID, sTitle

Related

How can I sum a date in SQL

I have to compute:
the total number of jobs by status for the current day
the total number of jobs closed in the last 30 days
How can I do this? I have searched but can't find a way.
This is the corresponding data:

Select JobStatus, Count(*) as StatusCount
from myTable
where JobStartDate <= currentDate and currentDate < JobEndDate
group by JobStatus;
Select Count(*) as ClosedCount
from myTable
where JobStatus = 'C' and JobEndDate >= thirtyDaysAgo;
currentDate and thirtyDaysAgo are placeholders; how to compute them depends on your database, which you didn't tag.

For question #1:
SELECT JOBSTATUS,COUNT(JOBID)
FROM BTRS_JOBS
WHERE JOBSTARTDATE <= CAST(GETDATE() AS DATE)
AND JOBENDDATE > CAST(GETDATE() AS DATE)
GROUP BY JOBSTATUS
--YOU CAN USE ANY HARDCODED DATE
For question #2:
SELECT COUNT(JOBID) NOOFJOBSCLOSED
FROM BTRS_JOBS
WHERE JOBSTATUS = 'C'
AND JOBENDDATE >= CAST((GETDATE()-30) AS DATE) -- YOU CAN USE ANY HARDCODED DATE

Need to generate "to date" from 2 (or more) different "from dates"

I have a cost table with id, price, and from date. I don't have a "to date". So item1 may have the price of £50 from 01/01/2019 in one row, then item1 will then have the price of £55 from 01/01/2020, in a second row.
If I want to know the price of item 1 today, I couldn't use WHERE today >= fromdate and <= todate.
How do I add "todate"? Where todate is the day before the next row's fromdate?
Ideally need to do this as view, want to avoid creating tables/stored proc, if possible?
Thanks

To get today's price get the row with the latest date not greater than today:
select c.price
from cost c
where c.id = 'Item1'
and c.fromdate = (
select max(fromdate) from cost
where id = c.id and fromdate <= getdate()
)
Or:
select top 1 price
from cost
where id = 'Item1' and fromdate <= getdate()
order by fromdate desc
To create a todate column:
with cte as (
select *, row_number() over (partition by id order by fromdate) rn
from cost
)
select c.id, c.price, c.fromdate, dateadd(day, -1, cc.fromdate) todate
from cte c left join cte cc
on cc.id = c.id and cc.rn = c.rn + 1

As commented by Peter Schneider, a sensible option would be to use window function lead() to recover the fromdate of the next record for the same id:
select
t.*,
lead(fromdate) over(partition by id order by fromdate) todate
from mytable t
Note that with this technique, the record that has the highest fromdate for each id will have todate set to null. If you want to assign a default end date you can use coalesce().
You can put this in a view:
create view myview as
select
t.*,
lead(fromdate) over(partition by id order by fromdate) todate
from mytable t
And then you can query the view for the current price of a given item:
select *
from myview
where
id = ?
and getdate() >= fromdate
and (todate is null or getdate() < todate)
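To see the lead() view behave on concrete rows, here is a small sketch in Python with the standard-library sqlite3 module. SQLite is only a stand-in for SQL Server here, and it needs version 3.25 or later for window functions. The view name cost_ranges, the helper price_on, and the sample rows are hypothetical, and the "today" date is passed in as a parameter instead of calling getdate().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cost (id TEXT, price INTEGER, fromdate TEXT)")
conn.executemany(
    "INSERT INTO cost VALUES (?, ?, ?)",
    [
        ("item1", 50, "2019-01-01"),
        ("item1", 55, "2020-01-01"),
        ("item2", 10, "2019-06-01"),
    ],
)

# View that derives todate as the next row's fromdate (an exclusive upper bound).
conn.execute(
    """
    CREATE VIEW cost_ranges AS
    SELECT id, price, fromdate,
           LEAD(fromdate) OVER (PARTITION BY id ORDER BY fromdate) AS todate
    FROM cost
    """
)

ranges = conn.execute(
    "SELECT id, price, fromdate, todate FROM cost_ranges ORDER BY id, fromdate"
).fetchall()


def price_on(item_id, as_of):
    """Return the price in effect for item_id on the ISO date as_of, or None."""
    row = conn.execute(
        """
        SELECT price FROM cost_ranges
        WHERE id = ? AND fromdate <= ? AND (todate IS NULL OR ? < todate)
        """,
        (item_id, as_of, as_of),
    ).fetchone()
    return row[0] if row else None


print(ranges)
print(price_on("item1", "2019-07-15"), price_on("item1", "2021-03-01"))
```

Because todate here is the next row's fromdate rather than the day before it, the upper comparison has to stay strict (<); subtracting a day in the view and switching to <= would match the wording of the question instead.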

SQL Sum of orders by date

I should know this, but for some reason it has me stumped.
This simple code is outputting all orders by day
USE [K.1]
Select CreatedAt,Identifier,RoundedPriceSum from StarOrder
where SiteID = 1
and OrderType <>2
and CreatedAt between '2015/01/01' and '2015/08/20'
CreatedAt is a date, Identifier is unique order ID and RoundedPriceSum the total of the order.
Is it possible to amend the code to provide a total of RoundedPriceSum per day?

Use GROUP BY:
Select cast(CreatedAt as date) as CreatedDay, SUM(RoundedPriceSum)
from StarOrder so
where SiteID = 1 and OrderType <> 2 and
CreatedAt >= '2015-01-01' and
CreatedAt < '2015-08-20'
group by cast(CreatedAt as date)
order by CreatedDay;
Notes on changes to the query:
Changed the dates to ISO standard YYYY-MM-DD format.
Replaced the BETWEEN with >= and <. This works better for dates with times.
Use cast(as date) to remove the time component.
Added an ORDER BY so the results are in order by day.
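The effect of the half-open range and of stripping the time can be checked with a small sketch in Python using the standard-library sqlite3 module. SQLite is a stand-in for SQL Server here, so date(CreatedAt) plays the role of cast(CreatedAt as date); the SiteID and OrderType filters are left out, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE StarOrder (CreatedAt TEXT, RoundedPriceSum REAL)")
conn.executemany(
    "INSERT INTO StarOrder VALUES (?, ?)",
    [
        ("2015-01-01 09:00:00", 10.0),
        ("2015-01-01 17:30:00", 5.5),
        ("2015-01-02 08:00:00", 7.0),
        ("2015-01-03 00:00:00", 100.0),  # at the exclusive upper bound: excluded
    ],
)

# Half-open range [start, end): every time of day on the last wanted day is kept,
# and nothing from the end date itself leaks in.
daily_totals = conn.execute(
    """
    SELECT date(CreatedAt) AS CreatedDay, SUM(RoundedPriceSum) AS Total
    FROM StarOrder
    WHERE CreatedAt >= ? AND CreatedAt < ?
    GROUP BY date(CreatedAt)
    ORDER BY CreatedDay
    """,
    ("2015-01-01", "2015-01-03"),
).fetchall()

print(daily_totals)
```

With BETWEEN '2015-01-01' AND '2015-01-02' the 08:00 order on January 2 would be dropped, because a bare date used as the upper bound compares as midnight at the start of that day.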

select s.CreatedAt,s.Identifier,x.tot
from StarOrder s
join
(select CreatedAt,sum(RoundedPriceSum) as tot
from StarOrder
where SiteID = 1
and OrderType <>2
and CreatedAt between '2015/01/01' and '2015/08/20'
group by createdat) x
on x.createdat = s.createdat
where SiteID = 1
and OrderType <>2
and s.CreatedAt between '2015/01/01' and '2015/08/20'

How can I construct this T-SQL query involving missing date ranges?

I'll try to keep the specific details of my problem out of this question and focus only on the pertinent issues.
Let's say I have an Assets table with a primary key of AssetID.
I have another table called ProcessedDates with primary key PID and with additional columns AssetID, StartDate, EndDate.
I want to run a process for a list of assets between a start date and end date. Before I can run this process, I need to know which assets and which date ranges have already been processed.
For example, there are 2 entries in ProcessedDates:
AssetID StartDate EndDate
--------------------------
Asset1 Day4 day7
Asset1 Day10 Day12
I want to process Asset1 between day2 and day11. I don't need to waste time by processing on days that have already been done so in this example, I will only process asset1 from day2 to day3 and from day8 to day 9.
So what I need is a query that returns the gaps in the date ranges. In this case, the result set will be 2 lines:
AssetID StartDate EndDate
--------------------------
Asset1 day2 day3
Asset1 day8 day9
In my actual requirement I have many assetIDs. The ProcessedDates table may have multiple entries for each asset or none at all and each asset does not necessarily have the same processed dates as any other asset.
declare @StartDate date, @EndDate date -- assume these are given
--get distinct assets
select distinct AssetID into #Assets from (some query)
--get the already processed date ranges
select p.AssetID, p.StartDate, p.EndDate
from ProcessedDates p inner join #Assets a on p.AssetID = a.AssetID
where p.StartDate between @StartDate and @EndDate
or p.EndDate between @StartDate and @EndDate
From here I have no clue how to proceed. How do I get it to return AssetID, StartDate, EndDate for all the gaps in between?

Something like this:
declare @StartDate date = '2015-01-01', @EndDate date = '2015-05-05'
declare @Assets table (AssetID varchar(50), StartDate date, EndDate date)
declare @AssetTypes table (AssetID varchar(50))
insert into @AssetTypes values
('Asset1'),
('Asset2')
insert into @Assets values
('Asset1', '2014-12-10', '2014-12-31'), -- Ignored
('Asset1', '2015-02-02', '2015-03-02'),
('Asset1', '2015-03-05', '2015-05-01'),
('Asset1', '2015-06-01', '2015-06-06') -- Ignored
;WITH Base AS (
SELECT AT.AssetID
, CASE WHEN A.AssetID IS NULL THEN 1 ELSE 0 END EmptyAsset
, A.StartDate
, A.EndDate
, ROW_NUMBER() OVER (PARTITION BY AT.AssetID ORDER BY StartDate) RN
FROM @AssetTypes AT
LEFT JOIN @Assets A ON A.AssetID = AT.AssetID
WHERE A.AssetID IS NULL -- case of totally missing asset
OR (StartDate <= @EndDate AND EndDate >= @StartDate)
)
-- first missing range, before the first row
SELECT AssetID, @StartDate StartDate, DATEADD(dd, -1, StartDate) EndDate
FROM Base
WHERE RN = 1 AND StartDate > @StartDate
UNION ALL
-- each row joined with the next one
SELECT B1.AssetID, DATEADD(dd, 1, B1.EndDate), ISNULL(DATEADD(dd, -1, B2.StartDate), @EndDate)
FROM Base B1
LEFT JOIN Base B2 ON B2.AssetID = B1.AssetID AND B2.RN = B1.RN + 1
WHERE B1.EmptyAsset = 0
AND (B2.AssetID IS NULL -- Last row case
OR DATEADD(dd, 1, B1.EndDate) < B2.StartDate) -- Other rows case
AND B1.EndDate < @EndDate -- If the range ends after @EndDate, nothing to do
UNION ALL
-- case of totally missing asset
SELECT AssetID, @StartDate, @EndDate
FROM Base
WHERE EmptyAsset = 1
The main idea is that each row is joined with the next one. A new range is generated (if necessary) between EndDate + 1 of one row and StartDate - 1 of the next. The last row gets special handling (B2.AssetID IS NULL and ISNULL(... @EndDate)). The first SELECT generates a row before the first range, and the last SELECT covers the special case of an asset with no ranges at all.
As I've written in the comments, it gets ugly quite quickly.
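The "pair each range with the next one" idea can be tried on the question's own example with a reduced sketch in Python and the standard-library sqlite3 module. This is not the query above: SQLite stands in for SQL Server, days are plain integers, LEAD() replaces the self-join on RN + 1, and the table name ProcessedDays and the window parameters are hypothetical. It assumes ranges for an asset do not overlap and leaves out the branch for assets with no processed rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ProcessedDays (AssetID TEXT, StartDay INTEGER, EndDay INTEGER)"
)
conn.executemany(
    "INSERT INTO ProcessedDays VALUES (?, ?, ?)",
    [("Asset1", 4, 7), ("Asset1", 10, 12)],
)

gaps = conn.execute(
    """
    WITH base AS (
        -- keep only ranges that overlap the requested window
        SELECT AssetID, StartDay, EndDay,
               LEAD(StartDay) OVER (PARTITION BY AssetID ORDER BY StartDay) AS NextStart,
               ROW_NUMBER()   OVER (PARTITION BY AssetID ORDER BY StartDay) AS rn
        FROM ProcessedDays
        WHERE StartDay <= :win_end AND EndDay >= :win_start
    )
    -- gap between the window start and the first processed range
    SELECT AssetID, :win_start AS GapStart, StartDay - 1 AS GapEnd
    FROM base
    WHERE rn = 1 AND StartDay > :win_start
    UNION ALL
    -- gap between each range and the next one (or the window end)
    SELECT AssetID, EndDay + 1, COALESCE(NextStart - 1, :win_end)
    FROM base
    WHERE EndDay < :win_end
      AND (NextStart IS NULL OR EndDay + 1 < NextStart)
    ORDER BY 1, 2
    """,
    {"win_start": 2, "win_end": 11},
).fetchall()

print(gaps)
```

For the window day 2 to day 11 this returns the two gaps asked for in the question, day 2 to day 3 and day 8 to day 9.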

Here's a simple version that gets the result you want. I use integers as dates, and assume the min date is 0 and the max date is 999.
--DDL
create table Assets (AssetID integer, StartDate integer, EndDate integer);
insert into Assets values
(1,4,7),
(1,10,12),
(1,15,17),
(2,5,7),
(2,9,10);
with temp as(
select a1.AssetId,
a1.enddate+1 as StartDate,
coalesce(min(a2.startdate) - 1,999) as EndDate
from Assets a1
left join Assets a2
on a1.assetid = a2.assetid
and a1.enddate < a2.startdate
group by a1.assetid,a1.enddate
union all
select a.assetid,0,min(startdate) -1
from Assets a
group by a.assetid
)
select AssetId,
case when StartDate<2 then 2 else StartDate end as StartDate,
case when EndDate>11 then 11 else EndDate end as EndDate
from temp
where StartDate<=11 and EndDate>=2
order by AssetId,StartDate
The temp CTE produces all the missing ranges. Filtering those ranges to the window between Day2 and Day11, and clipping them to it, gives the result you want.
AssetId StartDate EndDate
1 2 3
1 8 9
2 2 4
2 8 8
2 11 11

How to count number of records per day?

I have a table with the following structure:
CustID --- DateAdded ---
396 2012-02-09
396 2012-02-09
396 2012-02-08
396 2012-02-07
396 2012-02-07
396 2012-02-07
396 2012-02-06
396 2012-02-06
I would like to know how I can count the number of records per day, for the last 7 days in SQL and then return this as an integer.
At present I have the following SQL query written:
SELECT *
FROM Responses
WHERE DateAdded >= dateadd(day, datediff(day, 0, GetDate()) - 7, 0)
RETURN
However this only returns all entries for the past 7 days. How can I count the records per day for the last 7 days?

select DateAdded, count(CustID)
from Responses
WHERE DateAdded >=dateadd(day,datediff(day,0,GetDate())- 7,0)
GROUP BY DateAdded

select DateAdded, count(CustID)
from tbl
group by DateAdded
As for the 7-day interval, how to express it depends on the database.

SELECT DateAdded, COUNT(1) AS NUMBERADDBYDAY
FROM Responses
WHERE DateAdded >= dateadd(day,datediff(day,0,GetDate())- 7,0)
GROUP BY DateAdded

This one is like the answer above which uses the MySql DATE_FORMAT() function. I also selected just one specific week in Jan.
SELECT
DatePart(day, DateAdded) AS date,
COUNT(entryhash) AS count
FROM Responses
where DateAdded > '2020-01-25' and DateAdded < '2020-02-01'
GROUP BY
DatePart(day, DateAdded )

If your timestamp includes time, not only date, use:
SELECT DATE_FORMAT(`timestamp`, '%Y-%m-%d') AS date, COUNT(id) AS count FROM your_table GROUP BY DATE_FORMAT(`timestamp`, '%Y-%m-%d')

You could also try this:
SELECT DISTINCT (DATE(dateadded)) AS unique_date, COUNT(*) AS amount
FROM table
GROUP BY unique_date
ORDER BY unique_date ASC

SELECT count(*), dateadded FROM Responses
WHERE DateAdded >=dateadd(day,datediff(day,0,GetDate())- 7,0)
group by dateadded
RETURN
This will give you a count of records for each dateadded value. Don't make the mistake of adding more columns to the select, expecting to get just one count per day. The group by clause will give you a row for every unique instance of the columns listed.

select DateAdded, count(DateAdded) as num_records
from your_table
WHERE DateAdded >=dateadd(day,datediff(day,0,GetDate())- 7,0)
group by DateAdded
order by DateAdded

Unfortunately the best answer here, IMO, is a comment by @Profex on an incorrect answer, but the solution I went with is:
SELECT FORMAT(DateAdded, 'yyyy-MM-dd'), count(CustID)
FROM Responses
WHERE DateAdded >= dateadd(day,datediff(day,0,GetDate())- 7,0)
GROUP BY FORMAT(DateAdded, 'yyyy-MM-dd')
ORDER BY FORMAT(DateAdded, 'yyyy-MM-dd')
Note that I haven't tested this SQL since I don't have the OP's DB, but this approach works well in my scenario, where the date is stored to the second.
The important part here is using FORMAT(DateAdded, 'yyyy-MM-dd') to drop the time without losing the year and month, as would happen if you used DATEPART(day, DateAdded).
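The difference between grouping on the day number and grouping on the full date is easy to demonstrate. The sketch below uses Python's standard-library sqlite3 module as a stand-in for SQL Server, so strftime('%d', ...) plays the role of DATEPART(day, ...) and date(...) the role of FORMAT(..., 'yyyy-MM-dd'); the rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Responses (CustID INTEGER, DateAdded TEXT)")
conn.executemany(
    "INSERT INTO Responses VALUES (?, ?)",
    [
        (396, "2020-01-15 10:00:00"),
        (396, "2020-02-15 11:00:00"),
        (396, "2020-02-15 12:00:00"),
    ],
)

# Grouping by day-of-month only: January 15 and February 15 collapse together.
by_day_of_month = conn.execute(
    "SELECT strftime('%d', DateAdded) AS d, COUNT(*) FROM Responses GROUP BY d ORDER BY d"
).fetchall()

# Grouping by the full calendar date keeps the two days apart.
by_full_date = conn.execute(
    "SELECT date(DateAdded) AS d, COUNT(*) FROM Responses GROUP BY d ORDER BY d"
).fetchall()

print(by_day_of_month)
print(by_full_date)
```

The day-number grouping only looks right when the WHERE clause restricts the data to a span in which no day number repeats, such as a single week.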

When a day in the last 7 days has no records, the following code still lists that day, with a count of zero.
DECLARE @startDate DATE = GETDATE() - 6,
@endDate DATE = GETDATE();
DECLARE @daysTable TABLE
(
OrderDate date
)
DECLARE @daysOrderTable TABLE
(
OrderDate date,
OrderCount int
)
Insert into @daysTable
SELECT TOP (DATEDIFF(DAY, @startDate, @endDate) + 1)
Date = DATEADD(DAY, ROW_NUMBER() OVER(ORDER BY a.object_id) - 1, @startDate)
FROM sys.all_objects a
CROSS JOIN sys.all_objects b;
Insert into @daysOrderTable
select OrderDate, ISNULL((SELECT COUNT(*) AS OdrCount
FROM [dbo].[MyOrderTable] odr
WHERE CAST(odr.[CreatedDate] as date) = dt.OrderDate
group by CAST(odr.[CreatedDate] as date)
), 0) AS OrderCount from @daysTable dt
select * from @daysOrderTable
RESULT
OrderDate     OrderCount
2022-11-22     42
2022-11-23     6
2022-11-24     34
2022-11-25     0
2022-11-26     28
2022-11-27     0
2022-11-28     22
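The same zero-filling idea, a generated list of days left-joined to the data, can be sketched compactly in Python with the standard-library sqlite3 module. This swaps the sys.all_objects row generator for a recursive CTE, since SQLite is used here as a stand-in for SQL Server; the date range and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyOrderTable (CreatedDate TEXT)")
conn.executemany(
    "INSERT INTO MyOrderTable VALUES (?)",
    [("2022-11-22 10:00:00",), ("2022-11-22 12:00:00",), ("2022-11-24 09:00:00",)],
)

counts = conn.execute(
    """
    WITH RECURSIVE days(OrderDate) AS (
        -- one row per calendar day from start_date to end_date inclusive
        SELECT :start_date
        UNION ALL
        SELECT date(OrderDate, '+1 day') FROM days WHERE OrderDate < :end_date
    )
    SELECT days.OrderDate, COUNT(o.CreatedDate) AS OrderCount
    FROM days
    LEFT JOIN MyOrderTable o ON date(o.CreatedDate) = days.OrderDate
    GROUP BY days.OrderDate
    ORDER BY days.OrderDate
    """,
    {"start_date": "2022-11-22", "end_date": "2022-11-25"},
).fetchall()

for row in counts:
    print(row)
```

COUNT(o.CreatedDate) counts only joined rows, so a day with no orders yields 0 without needing ISNULL.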

SELECT DATE_FORMAT(DateAdded, '%Y-%m-%d'),
COUNT(CustID)
FROM Responses
GROUP BY DATE_FORMAT(DateAdded, '%Y-%m-%d');
