SQL - One Table with Two Date Columns. Count and Join - sql

I have a table (vOKPI_Tickets) that has the following columns:
|CreationDate | CompletionDate|
I'd like to get a count on each of those columns, and group them by date. It should look something like this when complete:
| Date | Count-Created | Count-Completed |
I can get each of the counts individually, by doing something like this:
SELECT COUNT(TicketId)
FROM vOKPI_Tickets
GROUP BY CreationDate
and
SELECT COUNT(TicketId)
FROM vOKPI_Tickets
GROUP BY CompletionDate
How can I combine the output into one table? I should also note that this will become a View.
Thanks in advance

Simple generic approach:
select
coalesce(crte.creationdate, cmpl.CompletionDate) as theDate,
crte.cnt as created,
cmpl.cnt as completed
from
(select creationdate, count (*) as cnt from vOKPI_Tickets where creationdate is not null group by creationdate) crte
full join
(select CompletionDate, count (*) as cnt from vOKPI_Tickets where CompletionDate is not null group by CompletionDate) cmpl
on crte.creationdate = cmpl.CompletionDate

You can unpivot and aggregate. A general method is:
select dte, sum(created), sum(completed)
from ((select creationdate as dte, 1 as created, 0 as completed
from vOKPI_Tickets
) union all
(select CompletionDate as dte, 0 as created, 1 as completed
from vOKPI_Tickets
)
) t
group by dte;
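The unpivot-and-aggregate idea can be tried end to end with a small sketch. This one runs the query through Python's built-in sqlite3 module rather than SQL Server; the three sample rows and the IS NOT NULL filter are assumptions added for illustration, not part of the original question.

```python
import sqlite3

# In-memory database with hypothetical sample tickets (not from the question).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE vOKPI_Tickets (TicketId INTEGER, CreationDate TEXT, CompletionDate TEXT)"
)
conn.executemany(
    "INSERT INTO vOKPI_Tickets VALUES (?, ?, ?)",
    [
        (1, "2020-01-01", "2020-01-02"),
        (2, "2020-01-01", None),  # still open, so no completion date
        (3, "2020-01-02", "2020-01-02"),
    ],
)

# Unpivot both date columns into one column with 0/1 flags, then sum the flags.
query = """
SELECT dte,
       SUM(created)   AS count_created,
       SUM(completed) AS count_completed
FROM (
    SELECT CreationDate AS dte, 1 AS created, 0 AS completed FROM vOKPI_Tickets
    UNION ALL
    SELECT CompletionDate AS dte, 0 AS created, 1 AS completed FROM vOKPI_Tickets
) t
WHERE dte IS NOT NULL   -- assumption: open tickets should not produce a NULL date row
GROUP BY dte
ORDER BY dte
"""
counts = conn.execute(query).fetchall()
for row in counts:
    print(row)
```

Each ticket contributes one row per date column, so a date on which tickets were only created still shows up with a completed count of 0.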

In SQL Server, you can use cross apply for this:
select d.dt, sum(d.is_created) count_created, sum(d.is_completed) count_completed
from vokpi_tickets t
cross apply (values (t.CreationDate, 1, 0), (t.CompletionDate, 0, 1)) as d(dt, is_created, is_completed)
where d.dt is not null
group by d.dt

Related

Using MAX function in DateADD SQL. Error - Invalid aggregate function in where clause [MAX(date)]

I have a table 'CSALES' having columns such as customerid,transactiondate,quantity,price. I'm trying to find customers who have not been active in 1 month from a list of dates present in the transactiondate column. I've tried the following code but I'm unsure about the approach and the code is giving a compilation error
SELECT C.CUSTOMERID
FROM CSALES C
WHERE C.CUSTOMERID NOT IN
(
SELECT CS.CUSTOMERID FROM CSALES as CS
WHERE CS.TRANSACTIONDATE > DATEADD(month, -1, MAX(CS.TRANSACTIONDATE))
);
I'm getting the following error
SQL compilation error: Invalid aggregate function in where clause [MAX(CS.TRANSACTIONDATE)]
What changes should I make in the code to reflect the requirement? Would MAX(date) be a right approach ?
SELECT CUSTOMERID
FROM
CSALES
GROUP BY CUSTOMERID
HAVING
MAX(TRANSACTIONDATE) < ADD_MONTHS(CURRENT_DATE(),-1)
Shawnt00 is right: the max date in the transaction table is irrelevant if you just want any customer who hasn't been active in the last calendar month.
In Snowflake, use CURRENT_DATE() to get today's date, then ADD_MONTHS(date, int) to shift it by months. Other functions work too, but these are pretty easy. Grouping by CUSTOMERID also removes duplicate CUSTOMERIDs from the result.
I think I am about to just repeat Matt's code, but...
With a CTE for some test data:
WITH CSALES(CUSTOMERID, TRANSACTIONDATE) as (
SELECT * FROM VALUES
(1, '2022-05-08'::date), -- too recent
(1, '2021-05-08'::date),
(2, '2021-05-08'::date), -- old enough
(2, '2020-05-08'::date)
)
We can use HAVING for a post aggregation filter.
SELECT C.CUSTOMERID, MAX(C.TRANSACTIONDATE) as last_trans
FROM CSALES C
GROUP BY 1
HAVING last_trans < DATEADD(month,-1,current_date());
As Matt noted, there are a few ways to find "one month ago today": he used ADD_MONTHS, I have used DATEADD.
| CUSTOMERID | LAST_TRANS |
| ---------: | ---------- |
| 2 | 2021-05-08 |
Now this code works the same as:
SELECT CUSTOMERID
FROM (
SELECT C.CUSTOMERID, MAX(C.TRANSACTIONDATE) as last_trans
FROM CSALES C
GROUP BY 1
)
WHERE last_trans < DATEADD(month,-1,current_date());
which gives:
CUSTOMERID
2
Although this now hides the last transaction date (if that was what was wanted), it adds extra select layers for no real benefit.
So if we want to hide last_trans in the HAVING version, we can start from the code that already works and push the MAX into the HAVING (which gives Matt's code):
SELECT C.CUSTOMERID
FROM CSALES C
GROUP BY 1
HAVING MAX(C.TRANSACTIONDATE) < DATEADD(month,-1,current_date());
which gives for the demo code:
CUSTOMERID
2
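To see the difference between filtering in WHERE and in HAVING without a Snowflake account, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in: its date(x, '-1 month') modifier plays the role of DATEADD/ADD_MONTHS, and the fixed `today` value is an assumption chosen to match the demo output in this thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CSALES (CUSTOMERID INTEGER, TRANSACTIONDATE TEXT)")
conn.executemany(
    "INSERT INTO CSALES VALUES (?, ?)",
    [(1, "2022-05-08"), (1, "2021-05-08"), (2, "2021-05-08"), (2, "2020-05-08")],
)

today = "2022-05-07"  # assumed fixed "today" so the result is reproducible

# An aggregate inside WHERE is rejected, much like the error in the question.
try:
    conn.execute(
        "SELECT CUSTOMERID FROM CSALES "
        "WHERE TRANSACTIONDATE > date(MAX(TRANSACTIONDATE), '-1 month')"
    )
    where_rejected = False
except sqlite3.OperationalError as exc:
    where_rejected = True
    print("aggregate in WHERE failed:", exc)

# HAVING runs after grouping, so MAX() is allowed there.
inactive = conn.execute(
    """
    SELECT CUSTOMERID
    FROM CSALES
    GROUP BY CUSTOMERID
    HAVING MAX(TRANSACTIONDATE) < date(?, '-1 month')
    ORDER BY CUSTOMERID
    """,
    (today,),
).fetchall()
print(inactive)
```

Customer 1 has a transaction after the cutoff and is dropped; customer 2's latest transaction is more than a month old and is kept.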
Date Options:
There are a couple of ways to shift a date/time, depending on how you like to order your logic; I tend to prefer DATEADD:
SELECT
current_date() as cd_a,
CURRENT_DATE as cd_b,
DATEADD(month, -1, cd_a) as one_month_ago_a,
ADD_MONTHS(cd_a, -1) as one_month_ago_b;
gives:
| CD_A | CD_B | ONE_MONTH_AGO_A | ONE_MONTH_AGO_B |
| ---------- | ---------- | --------------- | --------------- |
| 2022-05-07 | 2022-05-07 | 2022-04-07 | 2022-04-07 |
SELECT
C.CUSTOMERID
FROM
CSALES C
GROUP BY
C.CUSTOMERID
HAVING
MAX(C.TRANSACTIONDATE)
<
DATEADD(
month,
-1,
(SELECT MAX(TRANSACTIONDATE) FROM CSALES)
)
Or, assuming you have a customer table...
SELECT
*
FROM
CUSTOMER C
WHERE
NOT EXISTS (
SELECT *
FROM CSALES CS
WHERE CS.CUSTOMERID = C.ID
AND CS.TRANSACTIONDATE >= DATEADD(
month,
-1,
(SELECT MAX(TRANSACTIONDATE) FROM CSALES)
)
)
There are multiple possibilities; you must check which is faster.
SELECT C.CUSTOMERID
FROM CSALES C
WHERE C.CUSTOMERID NOT IN
(
SELECT CS.CUSTOMERID FROM CSALES as CS CROSS JOIN (SELECT MAX(TRANSACTIONDATE) maxdate FROM CSALES) t1
WHERE CS.TRANSACTIONDATE > DATEADD(month, -1, maxdate)
);
GO
| CUSTOMERID |
| ---------: |
| 4 |
SELECT DISTINCT C.CUSTOMERID
FROM CSALES C CROSS JOIN (SELECT MAX(TRANSACTIONDATE) maxdate FROM CSALES) t1
WHERE NOT EXISTS (SELECT 1 FROM CSALES WHERE CUSTOMERID = c.CUSTOMERID AND TRANSACTIONDATE > DATEADD(month, -1, maxdate))
;
GO
| CUSTOMERID |
| ---------: |
| 4 |

SQLite Getting multiple results with LIMIT 1

I have the following problem.
Part of a task is to determine the visitor(s) with the most money spent between 2000 and 2020.
It just looks like this.
SELECT UserEMail FROM Visitor
JOIN Ticket ON Visitor.UserEMail = Ticket.VisitorUserEMail
where Ticket.Date> date('2000-01-01') AND Ticket.Date < date ('2020-12-31')
Group by Ticket.VisitorUserEMail
order by SUM(Price) DESC;
Is it possible to output more than one person if both have spent the same amount?
Use rank():
SELECT VisitorUserEMail
FROM (SELECT VisitorUserEMail, SUM(PRICE) as sum_price,
RANK() OVER (ORDER BY SUM(Price) DESC) as seqnum
FROM Ticket t
WHERE t.Date >= date('2000-01-01') AND t.Date < date('2021-01-01')
GROUP BY t.VisitorUserEMail
) t
WHERE seqnum = 1;
Note: You don't need the JOIN, assuming that ticket buyers are actually visitors. If that assumption is not true, then use the JOIN.
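A quick way to check that RANK() really returns every tied top spender is the sketch below, which runs through Python's built-in sqlite3 module (window functions need SQLite 3.25 or newer). The e-mail addresses and prices are made-up sample data, and the totals are computed in a CTE first and ranked afterwards, which is a slight restructuring of the query above rather than a copy of it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Ticket (VisitorUserEMail TEXT, Date TEXT, Price REAL)")
conn.executemany(
    "INSERT INTO Ticket VALUES (?, ?, ?)",
    [
        ("a@example.com", "2005-03-01", 50.0),
        ("a@example.com", "2010-07-15", 50.0),
        ("b@example.com", "2019-12-31", 100.0),
        ("c@example.com", "2015-01-01", 30.0),
        ("c@example.com", "2021-06-01", 500.0),  # outside the date range, ignored
    ],
)

# Sum per visitor inside the date range, rank the sums, keep every rank-1 row.
query = """
WITH totals AS (
    SELECT VisitorUserEMail, SUM(Price) AS sum_price
    FROM Ticket
    WHERE Date >= '2000-01-01' AND Date <= '2020-12-31'
    GROUP BY VisitorUserEMail
)
SELECT VisitorUserEMail
FROM (
    SELECT VisitorUserEMail,
           RANK() OVER (ORDER BY sum_price DESC) AS seqnum
    FROM totals
) ranked
WHERE seqnum = 1
ORDER BY VisitorUserEMail
"""
top_spenders = [email for (email,) in conn.execute(query)]
print(top_spenders)
```

Because RANK() gives tied rows the same number, both visitors with a total of 100 come back, whereas LIMIT 1 would pick only one of them.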
Use a CTE that returns all the total prices for each email and with NOT EXISTS select the rows with the top total price:
WITH cte AS (
SELECT VisitorUserEMail, SUM(Price) SumPrice
FROM Ticket
WHERE Date >= '2000-01-01' AND Date <= '2020-12-31'
GROUP BY VisitorUserEMail
)
SELECT c.VisitorUserEMail
FROM cte c
WHERE NOT EXISTS (
SELECT 1 FROM cte
WHERE SumPrice > c.SumPrice
)
or:
WITH cte AS (
SELECT VisitorUserEMail, SUM(Price) SumPrice
FROM Ticket
WHERE Date >= '2000-01-01' AND Date <= '2020-12-31'
GROUP BY VisitorUserEMail
)
SELECT VisitorUserEMail
FROM cte
WHERE SumPrice = (SELECT MAX(SumPrice) FROM cte)
Note that you don't need the function date() because the result of date('2000-01-01') is '2000-01-01'.
Also I think that the conditions in the WHERE clause should include the =, right?

Group by in columns and rows, counts and percentages per day

I have a table that has data like following.
attr |time
----------------|--------------------------
abc |2018-08-06 10:17:25.282546
def |2018-08-06 10:17:25.325676
pqr |2018-08-05 10:17:25.366823
abc |2018-08-06 10:17:25.407941
def |2018-08-05 10:17:25.449249
I want to group them by the attr column, one row per attr, and add columns showing the counts and percentages per day, as shown below.
attr |day1_count| day1_%| day2_count| day2_%
----------------|----------|-------|-----------|-------
abc |2 |66.6% | 0 | 0.0%
def |1 |33.3% | 1 | 50.0%
pqr |0 |0.0% | 1 | 50.0%
I'm able to display one count by using group by but unable to find out how to even separate them into multiple columns. I tried to generate the day1 percentage with
SELECT attr, count(attr), count(attr) / sum(sub.day1_count) * 100 as percentage from (
SELECT attr, count(*) as day1_count FROM my_table WHERE DATEPART(week, time) = DATEPART(day, GETDate()) GROUP BY attr) as sub
GROUP BY attr;
But this also is not giving me the correct answer: I'm getting all zeroes for the percentage and 1 for the count. Any help is appreciated. I'm trying to do this in Redshift, which follows PostgreSQL syntax.
Let's nail the logic before presenting:
with CTE1 as
(
select attr, DATEPART(day, time) as theday, count(*) as thecount
from MyTable
group by attr, DATEPART(day, time)
)
, CTE2 as
(
select theday, sum(thecount) as daytotal
from CTE1
group by theday
)
select t1.attr, t1.theday, t1.thecount, 1.0 * t1.thecount / t2.daytotal as percentofday
from CTE1 t1
inner join CTE2 t2
on t1.theday = t2.theday
From here you can pivot to create a day by day if you feel the need
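The pivot step can be sketched with conditional aggregation. The example below uses Python's built-in sqlite3 module as a stand-in for Redshift, loads the five sample rows from the question, and passes the two days in as parameters; the table name my_table, the parameter names, and the rounding to one decimal are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (attr TEXT, time TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [
        ("abc", "2018-08-06 10:17:25.282546"),
        ("def", "2018-08-06 10:17:25.325676"),
        ("pqr", "2018-08-05 10:17:25.366823"),
        ("abc", "2018-08-06 10:17:25.407941"),
        ("def", "2018-08-05 10:17:25.449249"),
    ],
)

query = """
WITH per_attr_day AS (          -- count per attr and day
    SELECT attr, date(time) AS theday, COUNT(*) AS thecount
    FROM my_table
    GROUP BY attr, date(time)
),
per_day AS (                    -- total rows per day
    SELECT theday, SUM(thecount) AS daytotal
    FROM per_attr_day
    GROUP BY theday
),
joined AS (                     -- 100.0 forces non-integer division
    SELECT p.attr, p.theday, p.thecount,
           100.0 * p.thecount / d.daytotal AS pct
    FROM per_attr_day p
    JOIN per_day d ON d.theday = p.theday
)
SELECT attr,
       SUM(CASE WHEN theday = :day1 THEN thecount ELSE 0 END)      AS day1_count,
       ROUND(SUM(CASE WHEN theday = :day1 THEN pct ELSE 0 END), 1) AS day1_pct,
       SUM(CASE WHEN theday = :day2 THEN thecount ELSE 0 END)      AS day2_count,
       ROUND(SUM(CASE WHEN theday = :day2 THEN pct ELSE 0 END), 1) AS day2_pct
FROM joined
GROUP BY attr
ORDER BY attr
"""
pivot = conn.execute(query, {"day1": "2018-08-06", "day2": "2018-08-05"}).fetchall()
for row in pivot:
    print(row)
```

Each extra day needs its own pair of CASE columns, which is why the pivot is usually left to the presentation layer when the number of days is not fixed.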
This builds on @johnHC's query. If you need all 7 days, you have to add a CASE WHEN branch for each of those days:
with CTE1 as
(
select attr, time::date as theday, count(*) as thecount
from t group by attr,time::date
)
, CTE2 as
(
select theday, sum(thecount) as daytotal
from CTE1
group by theday
)
,
CTE3 as
(
select t1.attr, EXTRACT(DOW FROM t1.theday) as day_nmbr,t1.theday, t1.thecount, t1.thecount/t2.daytotal as percentofday
from CTE1 t1
inner join CTE2 t2
on t1.theday = t2.theday
)
select CTE3.attr,
max(case when day_nmbr=0 then CTE3.thecount end) as day1Cnt,
max(case when day_nmbr=0 then percentofday end) as day1,
max(case when day_nmbr=1 then CTE3.thecount end) as day2Cnt,
max( case when day_nmbr=1 then percentofday end) day2
from CTE3 group by CTE3.attr
http://sqlfiddle.com/#!17/54ace/20
In case that you have only 2 days:
http://sqlfiddle.com/#!17/3bdad/3 (days descending as in your example from left to right)
http://sqlfiddle.com/#!17/3bdad/5 (days ascending)
The main idea is already mentioned in the other answers. Instead of joining the CTEs for calculating the values I am using window functions which is a bit shorter and more readable I think. The pivot is done the same way.
SELECT
attr,
COALESCE(max(count) FILTER (WHERE day_number = 0), 0) as day1_count, -- D
COALESCE(max(percent) FILTER (WHERE day_number = 0), 0) as day1_percent,
COALESCE(max(count) FILTER (WHERE day_number = 1), 0) as day2_count,
COALESCE(max(percent) FILTER (WHERE day_number = 1), 0) as day2_percent
/*
Add more days here
*/
FROM(
SELECT *, (count::float/count_per_day)::decimal(5, 2) as percent -- C
FROM (
SELECT DISTINCT
attr,
MAX(time::date) OVER () - time::date as day_number, -- B
count(*) OVER (partition by time::date, attr) as count, -- A
count(*) OVER (partition by time::date) as count_per_day
FROM test_table
)s
)s
GROUP BY attr
ORDER BY attr
A counting the rows per day AND attr, and counting the rows per day
B for more readability I convert the date into a number. Here I take the difference between the maximum date available in the table and the date of the row. So I get a counter from 0 (most recent day) up to n - 1 (oldest day)
C calculating the percentage (as a fraction of the day's rows) and rounding to two decimals
D pivot by filtering on the day numbers. The COALESCE turns NULL values into 0. To add more days, repeat these columns with the next day numbers.
Edit: Made the day counter more flexible for more days; new SQL Fiddle
Basically, I see this as conditional aggregation. But you need to get an enumerator for the date for the pivoting. So:
SELECT attr,
COUNT(*) FILTER (WHERE day_number = 1) as day1_count,
COUNT(*) FILTER (WHERE day_number = 1) / cnt as day1_percent,
COUNT(*) FILTER (WHERE day_number = 2) as day2_count,
COUNT(*) FILTER (WHERE day_number = 2) / cnt as day2_percent
FROM (SELECT attr,
DENSE_RANK() OVER (ORDER BY time::date DESC) as day_number,
1.0 * COUNT(*) OVER (PARTITION BY attr) as cnt
FROM test_table
) s
GROUP BY attr, cnt
ORDER BY attr;

sql query - find distinct user from table

I am trying to solve a problem using SQL query and need some expert's advice.
I have below transaction table.
-- UserID, ProductId, TransactionDate
-- 1 , 2 , 2014-01-01
-- 1 , 3 , 2014-01-05
-- 2 , 2 , 2014-01-02
-- 2 , 3 , 2014-05-07
.
.
.
What I am trying to achieve is to find all users who purchased more than one product WITHIN 30 DAYS.
My query so far is like
select UserID, COUNT(distinct ProductID)
from tableA
GROUP BY UserID HAVING COUNT(distinct ProductID) > 1
I am not sure where to apply the "WITHIN 30 DAYS" logic in the query.
The outcome should be :
1, 2
2, 1
Thanks in advance for your help.
Edit: Within 30 Days
SELECT
a.UserID,
COUNT(DISTINCT ProductID)
FROM TableA a
INNER JOIN (
SELECT UserID, TransactionDate = MAX(TransactionDate)
FROM TableA
GROUP BY UserID
) AS t
ON t.UserID = a.UserID
AND a.TransactionDate >= DATEADD(DAY, -30, t.TransactionDate)
AND a.TransactionDate <= t.TransactionDate
GROUP BY a.UserID
You can use GROUP BY YEAR(TransactionDate), MONTH(TransactionDate)
SELECT
UserID,
COUNT(DISTINCT ProductID)
FROM TableA
GROUP BY
UserID, YEAR(TransactionDate), MONTH(TransactionDate)
HAVING
COUNT(DISTINCT ProductID) > 1
Just add a where clause.
SELECT UserID, COUNT(DISTINCT ProductID) cnt
FROM tableA
WHERE TransactionDate >= CAST(DATEADD(DAY,-30,GETDATE()) AS DATE)
GROUP BY UserID
HAVING COUNT(DISTINCT ProductID) > 1
This works because the WHERE clause is evaluated BEFORE the GROUP BY and HAVING. So it first filters out all transactions more than 30 days old and then returns only people who bought at least two distinct products among the remaining rows.
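The effect of that ordering can be checked with a small sketch in Python's built-in sqlite3 module. The rows and the fixed `today` value are hypothetical (GETDATE() is replaced by a parameter so the result is reproducible), and SQLite's date(x, '-30 days') stands in for DATEADD.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tableA (UserID INTEGER, ProductID INTEGER, TransactionDate TEXT)"
)
conn.executemany(
    "INSERT INTO tableA VALUES (?, ?, ?)",
    [
        (1, 2, "2014-04-20"),
        (1, 3, "2014-05-01"),
        (2, 2, "2014-01-02"),  # older than 30 days, removed by WHERE
        (2, 3, "2014-05-07"),
    ],
)

today = "2014-05-10"  # assumed "current date" for the example

# WHERE trims the rows first; HAVING then sees only recent purchases.
with_where = conn.execute(
    """
    SELECT UserID, COUNT(DISTINCT ProductID) AS cnt
    FROM tableA
    WHERE TransactionDate >= date(?, '-30 days')
    GROUP BY UserID
    HAVING COUNT(DISTINCT ProductID) > 1
    ORDER BY UserID
    """,
    (today,),
).fetchall()

# Without the WHERE clause, user 2's old purchase is counted as well.
without_where = conn.execute(
    """
    SELECT UserID, COUNT(DISTINCT ProductID) AS cnt
    FROM tableA
    GROUP BY UserID
    HAVING COUNT(DISTINCT ProductID) > 1
    ORDER BY UserID
    """
).fetchall()

print(with_where)
print(without_where)
```

Note that this measures "within the last 30 days from today", not "any two purchases at most 30 days apart", which is a different reading of the question.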
Query Processing Order:
http://blog.sqlauthority.com/2009/04/06/sql-server-logical-query-processing-phases-order-of-statement-execution/

SQL query to return data corresponding to all values of a column except for the min value of that column

I have a table with the following columns:
userid, datetime, type
Sample data:
userid datetime type
1 2013-08-01 08:10:00 I
1 2013-08-01 08:12:00 I
1 2013-08-01 08:12:56 I
I need to fetch data for only the two rows other than the row with min(datetime).
My query to fetch data for min(datetime) is:
SELECT
USERID, MIN(CHECKTIME) as ChkTime, CHECKTYPE, COUNT(*) AS CountRows
FROM
T1
WHERE
MONTH(CONVERT(DATETIME, CHECKTIME)) = MONTH(DATEADD(MONTH, -1,
CONVERT(DATE, GETDATE())))
AND YEAR(CONVERT(DATETIME, CHECKTIME)) = YEAR(GETDATE()) AND USERID=35
AND CHECKTYPE='I'
GROUP BY
CONVERT(DATE, CHECKTIME), USERID, CHECKTYPE
HAVING
COUNT(*) > 1
a lil help'll be much appreciated..thnx
Maybe something like this will help you:
WITH CTE AS
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY userid ORDER BY checktime) RN
FROM dbo.T1
WHERE CHECKTYPE = 'I'
--add your conditions here
)
SELECT * FROM CTE
WHERE RN > 1
Using CTE and ROW_NUMBER() function this will select all rows except min(date) for each user.
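A minimal way to try the ROW_NUMBER() approach is the sketch below, run through Python's built-in sqlite3 module instead of SQL Server (window functions need SQLite 3.25 or newer). User 1's rows come from the question's sample data; user 2 is an added, hypothetical user to show that the numbering restarts per partition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T1 (USERID INTEGER, CHECKTIME TEXT, CHECKTYPE TEXT)")
conn.executemany(
    "INSERT INTO T1 VALUES (?, ?, ?)",
    [
        (1, "2013-08-01 08:10:00", "I"),  # earliest row for user 1, excluded
        (1, "2013-08-01 08:12:00", "I"),
        (1, "2013-08-01 08:12:56", "I"),
        (2, "2013-08-01 09:00:00", "I"),  # earliest row for user 2, excluded
        (2, "2013-08-01 09:30:00", "I"),
    ],
)

# Number each user's rows by time; RN = 1 is the minimum, so keep RN > 1.
query = """
WITH CTE AS (
    SELECT USERID, CHECKTIME,
           ROW_NUMBER() OVER (PARTITION BY USERID ORDER BY CHECKTIME) AS RN
    FROM T1
    WHERE CHECKTYPE = 'I'
)
SELECT USERID, CHECKTIME
FROM CTE
WHERE RN > 1
ORDER BY USERID, CHECKTIME
"""
later_rows = conn.execute(query).fetchall()
for row in later_rows:
    print(row)
```

If two rows share the same minimum time, ROW_NUMBER() still drops only one of them; RANK() would drop all rows tied for the minimum instead.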
SELECT * FROM YOURTABLE A
INNER JOIN
(SELECT USERID,TYPE,MIN(datetime) datetime FROM YOURTABLE GROUP BY USERID,TYPE )B
ON
A.USERID=B.USERID AND
A.TYPE=B.TYPE
WHERE A.DATETIME<>B.DATETIME