I couldn't find a past question exactly like this problem. I have an orders table containing a customer id, an order date, and several numeric columns (how many of a particular item were ordered on that date). With some of the numeric columns removed, it looks like this:
customer_id date a b c d
0001 07/01/22 0 3 3 5
0001 07/12/22 12 0 50 0
0002 06/30/22 5 65 0 30
0002 07/20/22 1 0 19 2
0003 08/01/22 0 0 99 0
I need to sum each numeric column by customer_id, then return the top n customers for each column. Obviously that means a single customer may appear multiple times, once for each column. Assuming top 2, the desired output would look something like this:
column_ranked customer_id sum rank
'a' 0001 12 1
'a' 0002 6 2
'b' 0002 65 1
'b' 0001 3 2
'c' 0003 99 1
'c' 0001 53 2
'd' 0002 32 1
'd' 0001 5 2
(this assumes no date range filter)
My first thought was a CTE to collapse the table into its per-customer sums, followed by a union of selects from the CTE, one per summed column, each with a limit n clause. That works if the date range is hard-coded into the CTE, but I want to define this as a view so that users can call it something like this:
SELECT * from top_customers_view WHERE date_range BETWEEN ( date1 and date2 )
How can I pass the date restriction down to the CTE? Or am I taking the wrong approach entirely? If a view isn't possible, can it be done as a function? (without using a costly cursor, that is.)
Since date ranges produce a massive number of combinations, you cannot precompute them in a view. You can, however, write a query like the one shown below:
with
p as (select cast ('2022-01-01' as date) as ds, cast ('2022-12-31' as date) as de),
a as (
select top 10 customer_id, 'a' as col, sum(a) as s
from t cross join p where date between ds and de
group by customer_id order by s desc
),
b as (
select top 10 customer_id, 'b' as col, sum(b) as s
from t cross join p where date between ds and de
group by customer_id order by s desc
),
c as (
select top 10 customer_id, 'c' as col, sum(c) as s
from t cross join p where date between ds and de
group by customer_id order by s desc
),
d as (
select top 10 customer_id, 'd' as col, sum(d) as s
from t cross join p where date between ds and de
group by customer_id order by s desc
)
select * from a
union all select * from b
union all select * from c
union all select * from d
order by col, s desc
The date range is defined in the second line of the query, in the CTE p.
See db<>fiddle.
Alternatively, you could create a data warehousing solution, but it would require much more effort to make it work.
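The question also asks whether this can be done as a function. One hedged way to sketch that idea is to keep the per-column query but supply the date range and n as bound parameters from application code. The example below uses Python's built-in sqlite3 module purely for illustration; the table name orders, the ISO date strings, and the function name top_customers are assumptions, not part of the original schema, and LIMIT stands in for TOP.

```python
import sqlite3

# Sample rows from the question; dates rewritten as ISO strings (an assumption)
# so that BETWEEN compares them correctly as text.
ROWS = [
    ("0001", "2022-07-01", 0, 3, 3, 5),
    ("0001", "2022-07-12", 12, 0, 50, 0),
    ("0002", "2022-06-30", 5, 65, 0, 30),
    ("0002", "2022-07-20", 1, 0, 19, 2),
    ("0003", "2022-08-01", 0, 0, 99, 0),
]
COLUMNS = ("a", "b", "c", "d")  # fixed whitelist, never taken from user input


def make_db():
    """Build an in-memory database with a hypothetical `orders` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (customer_id TEXT, date TEXT, a INT, b INT, c INT, d INT)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?, ?, ?)", ROWS)
    return conn


def top_customers(conn, date_from, date_to, n):
    """Return (column, customer_id, sum, rank) rows: the top n customers per column."""
    result = []
    for col in COLUMNS:
        query = (
            f"SELECT customer_id, SUM({col}) AS s FROM orders "
            "WHERE date BETWEEN ? AND ? "
            "GROUP BY customer_id ORDER BY s DESC, customer_id LIMIT ?"
        )
        rows = conn.execute(query, (date_from, date_to, n)).fetchall()
        for rank, (customer_id, total) in enumerate(rows, start=1):
            result.append((col, customer_id, total, rank))
    return result


conn = make_db()
for row in top_customers(conn, "2022-01-01", "2022-12-31", 2):
    print(row)
```

The column name is interpolated only from the fixed COLUMNS tuple, while the dates and n travel as bound parameters, so callers can vary the range without editing the SQL.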
I have two tables: job_category, with the structure | id | name |, and jobs, with the structure | id | job_name | job_category |.
How do I count how many jobs are in each category?
select c.name, count(j.id)
from job_category c
left join jobs j on j.job_category = c.name
group by c.name
You can do it with a left join (see other answers) or with a subquery:
SELECT
c.Name
, (SELECT COUNT(*) FROM jobs j WHERE j.job_category=c.id) AS Count
FROM job_category c
Why do you need a join at all? Isn't it just
select job_category, count(*)
from jobs
group by job_category
If not, do post your own query (and, possibly, some sample data which might help us help you).
[EDIT, after reading comments]
It appears that my (over)simplified "solution" lacks some details. True, it omits categories with no jobs, while the OP asked that those also be displayed, with "0" as the result.
An outer join, with COUNT applied to a column from the jobs table, would fix that; here's an example.
SQL> with job_category(id, name) as
2 (select 1, 'categ 1' from dual union
3 select 2, 'categ 2' from dual
4 ),
5 job (id, job_name, job_category) as
6 (select 100, 'job 1', 1 from dual union
7 select 200, 'job 2', 1 from dual
8 )
9 select c.name, count(j.id)
10 from job_category c left join job j on j.job_category = c.id
11 group by c.name
12 order by c.name;
NAME COUNT(J.ID)
------- -----------
categ 1 2
categ 2 0
SQL>
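The difference between the plain GROUP BY and the outer join can be checked on the same two-category data. This is a minimal sketch using Python's built-in sqlite3 module rather than Oracle, and it assumes that jobs.job_category stores the category id, as in the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE job_category (id INTEGER, name TEXT);
CREATE TABLE jobs (id INTEGER, job_name TEXT, job_category INTEGER);
INSERT INTO job_category VALUES (1, 'categ 1'), (2, 'categ 2');
INSERT INTO jobs VALUES (100, 'job 1', 1), (200, 'job 2', 1);
""")

# Grouping the jobs table alone: categories without jobs never appear.
inner_counts = conn.execute(
    "SELECT job_category, COUNT(*) FROM jobs GROUP BY job_category"
).fetchall()

# LEFT JOIN keeps every category; COUNT(j.id) skips the NULLs of unmatched rows.
outer_counts = conn.execute(
    "SELECT c.name, COUNT(j.id) FROM job_category c "
    "LEFT JOIN jobs j ON j.job_category = c.id "
    "GROUP BY c.name ORDER BY c.name"
).fetchall()

print(inner_counts)
print(outer_counts)
```

Counting j.id rather than * matters here: COUNT(*) would report 1 for an empty category, because the outer join still produces one row for it.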
I have two tables on db:
1. memberOne
memberName | gender
===================
Jack | M
Steve | M
Audrey | F
2. memberTwo
memberName | gender
===================
Sarah | F
Steve | M
Audrey | F
Alvin | M
I want to display this view:
Gender | Total
=======================
M | 4
F | 3
I ran this query:
SELECT t.Gender, COUNT(t.Gender) Total FROM memberOne t
GROUP BY t.Gender
UNION ALL
SELECT d.Gender, COUNT(d.Gender) Total FROM memberTwo d
GROUP BY d.Gender
;
And this is what I got:
Gender | Total
------------- ----------
M 2
F 1
M 2
F 2
How can I sum the totals of M and F across both tables? Should I use a condition to check the gender?
Any help would be appreciated, thanks.
You need to use UNION ALL and then apply COUNT
SELECT
gender as Gender,
COUNT(*) as Total
FROM
(
SELECT gender
FROM memberOne
UNION ALL
SELECT gender
FROM memberTwo
) t group by gender
One approach here would be to union together only the genders from the two tables, and then do a single aggregation to get the male and female counts.
SELECT
gender,
COUNT(*) AS total
FROM
(
SELECT gender
FROM memberOne
UNION ALL
SELECT gender
FROM memberTwo
) t
GROUP BY gender
ORDER BY gender DESC
Demo here:
Rextester
Wrap your last query in another query that sums the count of M and F.
SELECT
G, SUM(Total)
FROM
(SELECT
t.Gender G, COUNT(t.Gender) Total
FROM
memberOne t
GROUP BY
t.Gender
UNION ALL
SELECT
d.Gender G, COUNT(d.Gender) Total
FROM
memberTwo d
GROUP BY
d.Gender) q
GROUP BY
G
One way to accomplish your goal working off your initial query would be:
select Gender, sum(Total) Total
From (your existing query) q
group by q.Gender
A different way would be:
select Gender, Count(Gender) Total
from
( select Gender from memberOne
Union all
Select Gender from memberTwo ) q
group by q.Gender
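Both variants can be checked against the sample data from the question. The sketch below uses Python's built-in sqlite3 module as a stand-in database; the variable names are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE memberOne (memberName TEXT, gender TEXT);
CREATE TABLE memberTwo (memberName TEXT, gender TEXT);
INSERT INTO memberOne VALUES ('Jack', 'M'), ('Steve', 'M'), ('Audrey', 'F');
INSERT INTO memberTwo VALUES ('Sarah', 'F'), ('Steve', 'M'), ('Audrey', 'F'), ('Alvin', 'M');
""")

# Variant 1: sum the per-table counts produced by the original query.
sum_of_counts = conn.execute("""
SELECT gender, SUM(total) FROM (
    SELECT gender, COUNT(gender) AS total FROM memberOne GROUP BY gender
    UNION ALL
    SELECT gender, COUNT(gender) AS total FROM memberTwo GROUP BY gender
) q
GROUP BY gender ORDER BY gender DESC
""").fetchall()

# Variant 2: union the raw rows first, then count once.
count_of_union = conn.execute("""
SELECT gender, COUNT(gender) FROM (
    SELECT gender FROM memberOne
    UNION ALL
    SELECT gender FROM memberTwo
) q
GROUP BY gender ORDER BY gender DESC
""").fetchall()

print(sum_of_counts)
print(count_of_union)
```

UNION ALL is needed in both variants; a plain UNION would collapse duplicate rows and undercount.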
SELECT n.Gender, SUM(n.Total) Total FROM
(
SELECT t.Gender, COUNT(t.Gender) Total FROM memberOne t
GROUP BY t.Gender
UNION ALL
SELECT d.Gender, COUNT(d.Gender) Total FROM memberTwo d
GROUP BY d.Gender
) n
GROUP BY n.Gender
This wraps the original query in an outer SELECT and GROUP BY. The outer query must SUM the per-table totals, because COUNT would only count the rows returned for each gender.
I have the following table:
custid custname channelid channel dateViewed
--------------------------------------------------------------
1 A 1 ABSS 2016-01-09
2 B 2 STHHG 2016-01-19
3 C 4 XGGTS 2016-01-09
6 D 4 XGGTS 2016-01-09
2 B 2 STHHG 2016-01-26
2 B 2 STHHG 2016-01-28
1 A 3 SSJ 2016-01-28
1 A 1 ABSS 2016-01-28
2 B 2 STHHG 2016-02-02
2 B 7 UUJKS 2016-02-10
2 B 8 AKKDC 2016-02-10
2 B 9 GGSK 2016-02-10
2 B 9 GGSK 2016-02-11
2 B 7 UUJKS 2016-02-27
And I want the results to be:
custid custname month count
------------------------------
1 A 1 1
2 B 1 1
2 B 2 4
3 C 1 1
6 D 1 1
According to the following rules:
All channel view subscriptions are billed every 15 days. If the customer views the same channel more than once within those 15 days, he is billed only once for that channel. For instance, for custid 2 (custname B) the billing cycles are 19 Jan - 3 Feb (one billing cycle), 4 Feb - 20 Feb (one billing cycle), and so on. He is therefore billed only once in Jan, since he watched the same channel throughout that billing cycle, and 4 times in Feb: for watching channelid 7, 8 and 9, and again for channelid 7 watched on 27 Feb (this falls in another billing cycle, so customer B is charged again). Customer B is not charged on 2 Feb for watching channel 2, since he was already billed for it in the 19 Jan - 3 Feb billing cycle.
An invoice is generated every month for each customer, so the results should show the 'Month' and the 'Count' of the channels viewed for each customer.
Can this be done in SQL Server?
;WITH cte AS (
SELECT custid,
custname,
channelid,
channel,
dateViewed,
CAST(DATEADD(day,15,dateViewed) as date) as dateEnd,
ROW_NUMBER() OVER (PARTITION BY custid, channelid ORDER BY dateViewed) AS rn
FROM (VALUES
(1, 'A', 1, 'ABSS', '2016-01-09'),(2, 'B', 2, 'STHHG', '2016-01-19'),
(3, 'C', 4, 'XGGTS', '2016-01-09'),(6, 'D', 4, 'XGGTS', '2016-01-09'),
(2, 'B', 2, 'STHHG', '2016-01-26'),(2, 'B', 2, 'STHHG', '2016-01-28'),
(1, 'A', 3, 'SSJ', '2016-01-28'),(1, 'A', 1, 'ABSS', '2016-01-28'),
(2, 'B', 2, 'STHHG', '2016-02-02'),(2, 'B', 7, 'UUJKS', '2016-02-10'),
(2, 'B', 8, 'AKKDC', '2016-02-10'),(2, 'B', 9, 'GGSK', '2016-02-10'),
(2, 'B', 9, 'GGSK', '2016-02-11'),(2, 'B', 7, 'UUJKS', '2016-02-27')
) as t(custid, custname, channelid, channel, dateViewed)
), res AS (
SELECT custid, channelid, dateViewed, dateEnd, 1 as Lev
FROM cte
WHERE rn = 1
UNION ALL
SELECT c.custid, c.channelid, c.dateViewed, c.dateEnd, lev + 1
FROM res r
INNER JOIN cte c ON c.dateViewed > r.dateEnd and c.custid = r.custid and c.channelid = r.channelid
), final AS (
SELECT * ,
ROW_NUMBER() OVER (PARTITION BY custid, channelid, lev ORDER BY dateViewed) rn,
DENSE_RANK() OVER (ORDER BY custid, channelid, dateEnd) dr
FROM res
)
SELECT b.custid,
b.custname,
MONTH(f.dateViewed) as [month],
COUNT(distinct dr) as [count]
FROM cte b
LEFT JOIN final f
ON b.channelid = f.channelid and b.custid = f.custid and b.dateViewed between f.dateViewed and f.dateEnd
WHERE f.rn = 1
GROUP BY b.custid,
b.custname,
MONTH(f.dateViewed)
Output:
custid custname month count
----------- -------- ----------- -----------
1 A 1 3
2 B 1 1
2 B 2 4
3 C 1 1
6 D 1 1
(5 row(s) affected)
I don't know why you expect 1 in the count field for customer A. He has:
ABSS 2016-01-09 +1 to count (+15 days = 2016-01-24)
SSJ 2016-01-28 +1 to count
ABSS 2016-01-28 +1 to count (2016-01-28 > 2016-01-24)
So in January the count must be 3.
Whenever I am trying to count things with complex criteria, I use a sum and case statement. Something like below:
SELECT custid, custname,
SUM(CASE WHEN somecriteria
THEN 1
ELSE 0
END) As CriteriaCount
FROM whateverTable
GROUP BY custid, custname
You can make that somecriteria expression as complicated as you like, so long as it returns a boolean. If it passes, the row returns a 1. If it fails, the row returns a 0. We then sum up the values returned to get the count.
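As a concrete instance of the SUM/CASE pattern, the sketch below counts each customer's February views from the sample data. It runs on Python's built-in sqlite3 module; the table name channel_views and the choice of criterion (dateViewed on or after 2016-02-01) are assumptions made only for illustration, and this does not implement the 15-day billing rule.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE channel_views (custid INTEGER, custname TEXT, channelid INTEGER, dateViewed TEXT);
INSERT INTO channel_views VALUES
  (1, 'A', 1, '2016-01-09'), (2, 'B', 2, '2016-01-19'), (3, 'C', 4, '2016-01-09'),
  (6, 'D', 4, '2016-01-09'), (2, 'B', 2, '2016-01-26'), (2, 'B', 2, '2016-01-28'),
  (1, 'A', 3, '2016-01-28'), (1, 'A', 1, '2016-01-28'), (2, 'B', 2, '2016-02-02'),
  (2, 'B', 7, '2016-02-10'), (2, 'B', 8, '2016-02-10'), (2, 'B', 9, '2016-02-10'),
  (2, 'B', 9, '2016-02-11'), (2, 'B', 7, '2016-02-27');
""")

# Each row contributes 1 when the criterion holds and 0 otherwise;
# summing those values gives the conditional count per customer.
february_views = conn.execute("""
SELECT custid, custname,
       SUM(CASE WHEN dateViewed >= '2016-02-01' THEN 1 ELSE 0 END) AS CriteriaCount
FROM channel_views
GROUP BY custid, custname
ORDER BY custid
""").fetchall()

print(february_views)
```

Unlike a WHERE filter, this keeps customers whose count is zero in the result.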
Generally, this is how you can get any number (10 in this example) of fixed 15-day intervals starting at a given date (@dd in this example).
DECLARE @dd date = CAST('2016-01-19 17:30' AS DATE);
WITH E1(N) AS (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1),
E2(N) AS (SELECT 1 FROM E1 a, E1 b),
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10,000 rows max
tally(N) AS (SELECT TOP (10) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4)
SELECT
startd = DATEADD(D,(N-1)*15, @dd),
endd = DATEADD(D, N*15-1, @dd)
FROM tally
Adapt it to the rules defining how the start date must be calculated for the user (and probably the channel).
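The interval arithmetic used above (start = day (N-1)*15, end = day N*15-1, both counted from the start date) can be sanity-checked outside the database. This is a small sketch in plain Python; the function name fixed_intervals is hypothetical.

```python
from datetime import date, timedelta


def fixed_intervals(start, count, length_days=15):
    """Return `count` consecutive (first_day, last_day) pairs of `length_days` each."""
    intervals = []
    for n in range(1, count + 1):
        first = start + timedelta(days=(n - 1) * length_days)
        last = start + timedelta(days=n * length_days - 1)
        intervals.append((first, last))
    return intervals


for first, last in fixed_intervals(date(2016, 1, 19), 3):
    print(first, last)
```

With this formula each interval covers exactly 15 calendar days, both ends inclusive, so the first cycle starting on 19 Jan ends on 2 Feb.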
@Sturgus what if I want to define it in the code? Any other alternatives besides defining it in the table? How to write a query that can be run every month to generate the monthly invoice. – saturday 15 mins ago
Well, one way or another, you will have to save each customer's billing start date (minimally). If you want to do this entirely in SQL without 'editing the database', something like the following should work. The drawback to this approach is that you would need to manually edit the "INSERT INTO" statement every month to suit your needs. If you were allowed to edit the already existing customers table or create a new one, then it would reduce this manual effort.
DECLARE @CustomerBillingPeriodsTVP AS Table(
custID int UNIQUE,
BillingCycleID int,
BillingStartDate Date,
BillingEndDate Date
);
INSERT INTO @CustomerBillingPeriodsTVP (custID, BillingCycleID, BillingStartDate, BillingEndDate) VALUES
(1, 1, '2016-01-03', '2016-01-18'), (2, 1, '2016-01-18', '2016-02-03'), (3, 1, '2016-01-15', '2016-01-30'), (6, 1, '2016-01-14', '2016-01-29');
SELECT A.custid, A.custname, B.BillingCycleID AS [month], COUNT(DISTINCT A.channelid) AS [count]
FROM dbo.tblCustomerChannelViews AS A INNER JOIN @CustomerBillingPeriodsTVP AS B ON A.custid = B.CustID
GROUP BY A.custid, A.custname, B.BillingCycleID;
GO
Where are you getting your customers' billing start dates as it is?
I'm not sure how this solution will scale, but with some good index candidates and decent data housekeeping, it'll work.
You're going to need some extra info for starters, and to normalize your data. You will need to know the first charging period start date for each customer. So store that in a customer table.
Here are the tables I used:
create table #channelViews
(
custId int, channelId int, viewDate datetime
)
create table #channel
(
channelId int, channelName varchar(max)
)
create table #customer
(
custId int, custname varchar(max), chargingStartDate datetime
)
I'll populate some data. I won't get the same results as your sample output, because I don't have the appropriate start dates for each customer. Customer 2 will be OK though.
insert into #channel (channelId, channelName)
select 1, 'ABSS'
union select 2, 'STHHG'
union select 4, 'XGGTS'
union select 3, 'SSJ'
union select 7, 'UUJKS'
union select 8, 'AKKDC'
union select 9, 'GGSK'
insert into #customer (custId, custname, chargingStartDate)
select 1, 'A', '4 Jan 2016'
union select 2, 'B', '19 Jan 2016'
union select 3, 'C', '5 Jan 2016'
union select 6, 'D', '5 Jan 2016'
insert into #channelViews (custId, channelId, viewDate)
select 1,1,'2016-01-09'
union select 2,2,'2016-01-19'
union select 3,4,'2016-01-09'
union select 6,4,'2016-01-09'
union select 2,2,'2016-01-26'
union select 2,2,'2016-01-28'
union select 1,3,'2016-01-28'
union select 1,1,'2016-01-28'
union select 2,2,'2016-02-02'
union select 2,7,'2016-02-10'
union select 2,8,'2016-02-10'
union select 2,9,'2016-02-10'
union select 2,9,'2016-02-11'
union select 2,7,'2016-02-27'
And here is the somewhat unwieldy query, in a single statement.
The two underlying sub-queries are actually the same data, so there may be more appropriate / efficient ways to generate these.
We need to exclude from billing any channel that was already charged in the same charging period during the previous month. This is the essence of the join. I used a right join so that I could exclude all such matches from the results (using old.custId is null).
select c.custId, c.[custname], [month], count(*) [count] from
(
select new.custId, new.channelId, new.month, new.chargingPeriod
from
(
select distinct cv.custId, cv.channelId, month(viewdate) [month], (convert(int, cv.viewDate) - convert(int, c.chargingStartDate))/15 chargingPeriod
from #channelViews cv join #customer c on cv.custId = c.custId
) old
right join
(
select distinct cv.custId, cv.channelId, month(viewdate) [month], (convert(int, cv.viewDate) - convert(int, c.chargingStartDate))/15 chargingPeriod
from #channelViews cv join #customer c on cv.custId = c.custId
) new
on old.custId = new.custId
and old.channelId = new.channelId
and old.month = new.Month -1
and old.chargingPeriod = new.chargingPeriod
where old.custId is null
group by new.custId, new.month, new.chargingPeriod, new.channelId
) filteredResults
join #customer c on c.custId = filteredResults.custId
group by c.custId, [month], c.custname
order by c.custId, [month], c.custname
And finally my results:
custId custname month count
1 A 1 3
2 B 1 1
2 B 2 4
3 C 1 1
6 D 1 1
This query does the same thing:
select c.custId, c.custname, [month], count(*) from
(
select cv.custId, min(month(viewdate)) [month], cv.channelId
from #channelViews cv join #customer c on cv.custId = c.custId
group by cv.custId, cv.channelId, (convert(int, cv.viewDate) - convert(int, c.chargingStartDate))/15
) x
join #customer c
on c.custId = x.custId
group by c.custId, c.custname, x.[month]
order by custId, [month]
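The bucketing idea behind this shorter query (days since the customer's charging start date, integer-divided by 15) can be reproduced on the same sample data. The sketch below is a SQLite translation run through Python's built-in sqlite3 module, not the original T-SQL: julianday and strftime replace convert(int, ...) and month(), and the dates are written as ISO strings, all of which are assumptions of this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (custId INTEGER, custname TEXT, chargingStartDate TEXT);
CREATE TABLE channelViews (custId INTEGER, channelId INTEGER, viewDate TEXT);
INSERT INTO customer VALUES
  (1, 'A', '2016-01-04'), (2, 'B', '2016-01-19'),
  (3, 'C', '2016-01-05'), (6, 'D', '2016-01-05');
INSERT INTO channelViews VALUES
  (1, 1, '2016-01-09'), (2, 2, '2016-01-19'), (3, 4, '2016-01-09'),
  (6, 4, '2016-01-09'), (2, 2, '2016-01-26'), (2, 2, '2016-01-28'),
  (1, 3, '2016-01-28'), (1, 1, '2016-01-28'), (2, 2, '2016-02-02'),
  (2, 7, '2016-02-10'), (2, 8, '2016-02-10'), (2, 9, '2016-02-10'),
  (2, 9, '2016-02-11'), (2, 7, '2016-02-27');
""")

# Inner query: one row per (customer, channel, 15-day charging period),
# attributed to the earliest month in which the channel was viewed.
# Outer query: count those rows per customer and month.
QUERY = """
SELECT c.custId, c.custname, x.month, COUNT(*) AS cnt
FROM (
    SELECT cv.custId, cv.channelId,
           MIN(CAST(strftime('%m', cv.viewDate) AS INTEGER)) AS month
    FROM channelViews cv JOIN customer c ON c.custId = cv.custId
    GROUP BY cv.custId, cv.channelId,
             CAST(julianday(cv.viewDate) - julianday(c.chargingStartDate) AS INTEGER) / 15
) x
JOIN customer c ON c.custId = x.custId
GROUP BY c.custId, c.custname, x.month
ORDER BY c.custId, x.month
"""

billing = conn.execute(QUERY).fetchall()
for row in billing:
    print(row)
```

On this data the translation returns the same five rows as the results listed above, including 3 for customer A in January.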
I have information in the format of the sample table below. Each file can have multiple grades, and I need to select the most recent grade (based on completion date) for each file. If a file has several grades with the same completion date, I would select the best grade (A being best, with subsequent letters being lesser grades). This seems easy, but for some reason I'm having a brain fart.
Sample Table:
ID_PK File_No Grade Completion_Date
1 Smith A 10/1/2010
2 Smith C 9/25/2010
3 Davis B 11/1/2010
4 Johnson D 12/5/2010
5 Johnson A 11/1/2010
6 Johnson C 10/1/2010
7 Miller X 9/1/2010
8 Miller F 12/1/2010
9 Miller D 10/1/2010
Ideal Results:
1 Smith A 10/1/2010
3 Davis B 11/1/2010
4 Johnson D 12/5/2010
8 Miller F 12/1/2010
Using a window function is more efficient and also simpler:
with cte AS(
select '1' AS ID_no,'Smith' AS FILE_NO,'A' AS GRADE,
CAST('10/1/2010' AS DATE) AS CREATION_DATE
union all
select '2','Smith','C','9/25/2010'
union all
select '3','Davis','B','11/1/2010'
union all
select '4','Johnson','D','12/5/2010'
union all
select '5','Johnson','A','11/1/2010'
union all
select '6','Johnson','C','10/1/2010'
union all
select '7','Miller','X','9/1/2010'
union all
select '8','Miller','F','12/1/2010'
union all
select '9','Miller','D','10/1/2010')
SELECT X.ID_NO,X.FILE_NO,X.GRADE,X.CREATION_DATE FROM(
SELECT ID_NO,FILE_NO,GRADE,CREATION_DATE ,
ROW_NUMBER() OVER(PARTITION BY FILE_NO ORDER BY CREATION_DATE DESC,GRADE ASC ) AS RN
FROM CTE)AS X
WHERE X.RN=1
ORDER BY ID_NO
Try this (untested):
select st.*
from `Sample Table` st
inner join (
select File_No, max(Completion_Date) as Completion_Date
from `Sample Table`
group by File_No
) max_date on st.File_No = max_date.File_No and st.Completion_Date = max_date.Completion_Date
inner join (
-- min(Grade) picks the best grade, since A is best
select File_No, Completion_Date, min(Grade) as Grade
from `Sample Table`
group by File_No, Completion_Date
) best_grade on st.File_No = best_grade.File_No and st.Completion_Date = best_grade.Completion_Date and st.Grade = best_grade.Grade
Note that you may need to modify the syntax and table name for your particular DB.
I created a table with your example data and tested the following query against it. Everything seems to work correctly, and the output matches the example results.
SELECT
ID_PK,
StudentGrade.File_No,
MIN(StudentGrade.Grade),
StudentGrade.Completion_Date
FROM
(
SELECT File_No, MAX(Completion_Date) Completion_Date
FROM StudentGrade
GROUP BY File_No
) Student
INNER JOIN StudentGrade ON
Student.File_No = StudentGrade.File_No
AND StudentGrade.Completion_Date = Student.Completion_Date
GROUP BY ID_PK, StudentGrade.File_No, StudentGrade.Completion_Date
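The sample data contains no file with two grades on the same completion date, so the tie-break rule is never exercised by it. The sketch below adds one hypothetical tie row (ID_PK 10, not in the original data) and selects the winner per file with a NOT EXISTS filter. This is a different technique from the queries above, shown only as a cross-check; it runs on Python's built-in sqlite3 module and assumes the dates are stored as ISO strings so that they compare correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE StudentGrade (ID_PK INTEGER, File_No TEXT, Grade TEXT, Completion_Date TEXT);
INSERT INTO StudentGrade VALUES
  (1, 'Smith', 'A', '2010-10-01'), (2, 'Smith', 'C', '2010-09-25'),
  (3, 'Davis', 'B', '2010-11-01'), (4, 'Johnson', 'D', '2010-12-05'),
  (5, 'Johnson', 'A', '2010-11-01'), (6, 'Johnson', 'C', '2010-10-01'),
  (7, 'Miller', 'X', '2010-09-01'), (8, 'Miller', 'F', '2010-12-01'),
  (9, 'Miller', 'D', '2010-10-01');
-- Hypothetical extra row: same file and date as ID_PK 3, but a better grade.
INSERT INTO StudentGrade VALUES (10, 'Davis', 'A', '2010-11-01');
""")

# Keep a row only if no other row for the same file is more recent,
# or equally recent with a better (alphabetically smaller) grade.
best = conn.execute("""
SELECT s.ID_PK, s.File_No, s.Grade, s.Completion_Date
FROM StudentGrade s
WHERE NOT EXISTS (
    SELECT 1 FROM StudentGrade o
    WHERE o.File_No = s.File_No
      AND (o.Completion_Date > s.Completion_Date
           OR (o.Completion_Date = s.Completion_Date AND o.Grade < s.Grade))
)
ORDER BY s.ID_PK
""").fetchall()

for row in best:
    print(row)
```

With the extra row in place, Davis resolves to grade A (ID_PK 10) rather than B, while the other three files match the ideal results from the question.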
ORDER BY ID_PK