Selecting latest entry ordered by group, adding a user defined column for the latest one - sql

I currently have two SQL tables:
Employees
ID Created RecordID Status
------ ----------------------- ------------ ----------
1 2020-07-29 12:38:54.070 1 1
2 2020-08-03 14:28:59.803 1 1
3 2020-08-04 13:47:49.427 2 0
and
Payments
ID EmployeeID Amount
------ -------------- ------------
1 1 1022
2 1 1090
3 1 2105
4 2 1112
5 2 1450
6 2 2923
7 3 1064
How do I add another (user defined) column called Latest to the following query, that can check if the row corresponds to the Employee RecordID number with the latest Created, and returns a string true or false value for that row? In other words, a row should be Latest: true if its Created has the latest timestamp for all rows corresponding to that RecordID number.
Here's my SQL so far:
SELECT e.ID, e.Created, e.Status, p.EmployeeID, p.Amount
FROM Employees AS e
JOIN Payments AS p
ON e.ID = p.EmployeeID
Here are the expected results:
ID Created Status EmployeeID Amount Latest
------ --------------------------- ---------- -------------- ---------- ----------
1 2020-07-29 12:38:54.070 1 1 1022 false
1 2020-07-29 12:38:54.070 1 1 1090 false
1 2020-07-29 12:38:54.070 1 1 2105 false
2 2020-08-03 14:28:59.803 1 2 1112 true
2 2020-08-03 14:28:59.803 1 2 1450 true
2 2020-08-03 14:28:59.803 1 2 2923 true
3 2020-08-04 13:47:49.427 0 3 1064 true
Thanks!

;WITH x AS
(
SELECT e.ID, e.Created, e.Status, p.EmployeeID, p.Amount,
Latest = CASE DENSE_RANK() OVER (PARTITION BY e.RecordID ORDER BY e.Created DESC)
WHEN 1 THEN 'true' ELSE 'false' END
FROM dbo.Employees AS e
JOIN dbo.Payments AS p
ON e.ID = p.EmployeeID
)
SELECT * FROM x ORDER BY ID;

Related

sql finding cid with most expired cards [closed]

Closed. This question needs debugging details. It is not currently accepting answers.
Edit the question to include desired behavior, a specific problem or error, and the shortest code necessary to reproduce the problem. This will help others answer the question.
Closed 4 months ago.
Improve this question
I have a table Cards(card_id,status,cid)
With the columns:
cid - customer id
status - exp/vld
card_id - card id's
How to find the cid with the most expired cards?
From Oracle 12, you can use:
SELECT cid,
COUNT(*) AS num_exp
FROM cards
WHERE status = 'exp'
GROUP BY cid
ORDER BY num_exp DESC
FETCH FIRST ROW WITH TIES;
You can get count of expired cards for individual customers and then choose customer with MAX count. The below query should give results.
WITH t AS(
SELECT cid, count(1) customer_exp_cards_count
FROM Cards where status = 'exp'
group by cid)
SELECT cid FROM t t1
WHERE t1.customer_exp_cards_count IN (SELECT MAX(t2.customer_exp_cards_count)
FROM t t2)
Sample data and its result:
cardid status cid
3 exp 5
1 exp 1
2 exp 1
3 vld 1
5 vld 1
1 exp 2
2 exp 2
3 exp 2
4 vld 2
5 vld 2
6 exp 3
7 vld 4
4 vld 5
Result:
2
Suppose you have these two tables (just a sample data)
CUSTOMERS
CUST_ID
CUST_NAME
CUST_STATUS
101
John
ACTIVE
102
Annie
ACTIVE
103
Jane
ACTIVE
104
Bob
INACTIVE
CARDS
CARD_ID
CARD_STATUS
CUST_ID
1001001
VALID
101
1001002
VALID
101
1001003
EXPIRED
101
1001004
EXPIRED
101
1001005
VALID
101
1002010
VALID
102
1002020
EXPIRED
102
1002030
EXPIRED
102
1002040
EXPIRED
102
1003100
VALID
103
1003200
VALID
103
If you want just a CUST_ID with the number of most expired cards you can do it without table CUSTOMERS:
Select CUST_ID, EXPIRED_CARDS
From (Select CUST_ID, Count(CARD_ID) "EXPIRED_CARDS" From cards Where CARD_STATUS = 'EXPIRED' Group By CUST_ID)
Where EXPIRED_CARDS = (Select Max(EXPIRED_CARDS) From (Select Count(CARD_ID) "EXPIRED_CARDS" From cards Where CARD_STATUS = 'EXPIRED' Group By CUST_ID) )
--
-- R e s u l t
-- CUST_ID EXPIRED_CARDS
-- ---------- -------------
-- 102 3
Maybe you could consider creating a CTE with the data from both tables which will give you dataset that you could use later for different questions not just for this one. Something like this:
WITH
customers_cards AS
(
Select
cst.CUST_ID,
cst.CUST_NAME,
cst.CUST_STATUS,
crd.CARD_ID,
crd.CARD_STATUS,
Sum(CASE WHEN crd.CUST_ID Is Null Then 0 Else 1 End) OVER(Partition By crd.CUST_ID) "TOTAL_NUM_OF_CARDS",
Sum(CASE WHEN crd.CARD_ID Is Null Then Null WHEN crd.CARD_STATUS = 'VALID' And crd.CARD_ID Is Not Null Then 1 Else 0 End) OVER(Partition By crd.CUST_ID) "VALID_CARDS",
Sum(CASE WHEN crd.CARD_ID Is Null Then Null WHEN crd.CARD_STATUS = 'EXPIRED' And crd.CARD_ID Is Not Null Then 1 Else 0 End) OVER(Partition By crd.CUST_ID) "EXPIRED_CARDS"
From
customers cst
Left Join
cards crd on(crd.CUST_ID = cst.CUST_ID)
)
/* R e s u l t :
CUST_ID CUST_NAME CUST_STATUS CARD_ID CARD_STATUS TOTAL_NUM_OF_CARDS VALID_CARDS EXPIRED_CARDS
---------- --------- ----------- ------- ----------- ------------------ ----------- -------------
101 John ACTIVE 1001001 VALID 5 3 2
101 John ACTIVE 1001002 VALID 5 3 2
101 John ACTIVE 1001003 EXPIRED 5 3 2
101 John ACTIVE 1001004 EXPIRED 5 3 2
101 John ACTIVE 1001005 VALID 5 3 2
102 Annie ACTIVE 1002010 VALID 4 1 3
102 Annie ACTIVE 1002040 EXPIRED 4 1 3
102 Annie ACTIVE 1002030 EXPIRED 4 1 3
102 Annie ACTIVE 1002020 EXPIRED 4 1 3
103 Jane ACTIVE 1003100 VALID 2 2 0
103 Jane ACTIVE 1003200 VALID 2 2 0
104 Bob INACTIVE 0
*/
This can be used to answer many more potential questions. Here is the list of customers sorted by number of expired cards (descending):
Select Distinct
CUST_ID, CUST_NAME, TOTAL_NUM_OF_CARDS, VALID_CARDS, EXPIRED_CARDS
From
customers_cards
Order By
EXPIRED_CARDS Desc Nulls Last, CUST_ID
--
-- R e s u l t :
-- CUST_ID CUST_NAME TOTAL_NUM_OF_CARDS VALID_CARDS EXPIRED_CARDS
-- ---------- --------- ------------------ ----------- -------------
-- 102 Annie 4 1 3
-- 101 John 5 3 2
-- 103 Jane 2 2 0
-- 104 Bob 0
OR to answer your question:
Select Distinct
CUST_ID, CUST_NAME, TOTAL_NUM_OF_CARDS, VALID_CARDS, EXPIRED_CARDS
From
customers_cards
Where
EXPIRED_CARDS = (Select Max(EXPIRED_CARDS) From customers_cards)
Order By
CUST_ID
--
-- R e s u l t :
-- CUST_ID CUST_NAME TOTAL_NUM_OF_CARDS VALID_CARDS EXPIRED_CARDS
-- ---------- --------- ------------------ ----------- -------------
-- 102 Annie 4 1 3
Regards...

Select max date for each register, null if does not exists

I have these tables: Employee (id, name, number), Configuration (id, years, licence_days), Periods (id, start_date, end_date, configuration_id, employee_id, period_type):
Employee table:
id name number
---- ----- -------
1 Bob 355
2 John 467
3 Maria 568
4 Josh 871
configuration table:
id years licence_days
---- ----- ------------
1 1 8
2 3 16
3 5 24
Periods table:
id start_date end_date configuration_id employee_id period_type
---- ---------- ------- ---------------- ----------- -----------
1 2021-05-23 2021-05-31 1 1 vaccation
2 2021-05-24 2021-06-01 1 2 vaccation
3 2021-03-01 2021-03-17 2 2 vaccation
4 2021-05-05 2021-05-21 2 2 vaccation
5 2021-01-01 2021-01-17 2 4 vaccation
I want this result:
Result:
employee_id years licence_days max(end_date)
1 1 8 2021-05-31
1 3 16 null
1 5 24 null
2 1 8 2021-06-01
2 3 16 2021-05-21
2 5 24 null
3 1 8 null
3 3 16 null
3 5 24 null
4 1 8 null
4 3 16 2021-01-17
4 5 24 null
i.e., I want to select all Employees with all configuration, and for each one of that, the max end_date of the "vaccation" type (or null if it does not exists).
How can I do that
Oracle supports cross joins, right? So may be something like that?
SELECT e.employee_id, c.years, c.licence_days, max(p.end_date)
FROM Employee e
CROSS JOIN configuration c
LEFT JOIN Periods p
ON e.employee_id = p.employee_id
AND c.configuration_id = p.configuration_id
GROUP BY e.employee_id, c.years, c.licence_days
ORDER BY e.employee_id, c.years
#umberto-petrov chooses wisely with the ANSI CROSS JOIN syntax for a cartesian join. However, in the very weak probability that your requires output of configurations even where there is no employees, you can go with something like :
EDIT: Filtering the Periods join with 'vaccation' as asked in the comments.
If you have to filter for some employee ids, change ON 1 = 1 by ON Employee.id IN (id1, id2, ...). It still keeps every configurations but only takes employees that match the ids.
SELECT Employee.employee_id,
Configuration.years,
Configuration.licence_days,
MAX(Configuration.end_date) max_end_date
FROM Configuration LEFT JOIN Employee ON 1 = 1
LEFT JOIN Periods ON Periods.configuration_id = Configuration.id
AND Periods.employee_id = Employee.id
AND Periods.period_type = 'vaccation'
GROUP BY Employee.employee_id,
Configuration.years,
Configuration.licence_days
ORDER BY Employee.employee_id,
Configuration.years,
Configuration.licence_days
We start from configuration to take every records from this one at least, then made a LEFT CARTESIAN JOIN with Employee and finally a full LET JOIN on Periods for both. That way , if there is no employees, this will output configuration_id and NULL for years, licence_days and max end_date.

Multiple joins with aggregates

I have the two following tables:
Person:
EntityId FirstName LastName
----------- ------------------ -----------------
1 Ion Ionel
2 Fane Fanel
3 George Georgel
4 Mircea Mircel
SalesQuotaHistory
SalesQuotaId EntityId SalesQuota SalesOrderDate
------------ ----------- ----------- -----------------------
1 1 1000 2014-01-01 00:00:00.000
2 1 1000 2014-01-02 00:00:00.000
3 1 1000 2014-01-03 00:00:00.000
4 3 3000 2013-01-01 00:00:00.000
5 3 3000 2013-01-01 00:00:00.000
7 4 4000 2015-01-01 00:00:00.000
8 4 4000 2015-01-02 00:00:00.000
9 4 4000 2015-01-03 00:00:00.000
10 1 1000 2015-01-01 00:00:00.000
11 1 1000 2015-01-02 00:00:00.000
I am trying to get the SalesQuota for each user in 2014 and 2015.
Using this query i am getting an erroneous result:
SELECT p.EntityId
, p.FirstName
, SUM(sqh2014.SalesQuota) AS '2014'
, SUM(sqh2015.SalesQuota) AS '2015'
FROM Person p
LEFT OUTER JOIN SalesQuotaHistory sqh2014
ON p.EntityId = sqh2014.EntityId
AND YEAR(sqh2014.SalesOrderDate) = 2014
LEFT OUTER JOIN SalesQuotaHistory sqh2015
ON p.EntityId = sqh2015.EntityId
AND YEAR(sqh2015.SalesOrderDate) = 2015
GROUP BY p.EntityId, p.FirstName
EntityId FirstName 2014 2015
--------- ----------- ---------- --------------------
1 Ion 6000 6000
2 Fane NULL NULL
3 George NULL NULL
4 Mircea NULL 12000
In fact, Id 1 has a total SalesQuota of 3000 in 2014 and 2000 in 2015.
What i am asking here, is .. what is really happening behind the scenes? What is the order of operation in this specific case?
Thanks to my last post i was able to solve this using the following query:
SELECT p.EntityId
, p.FirstName
, SUM(CASE WHEN YEAR(sqh.SalesOrderDate) = 2014 THEN sqh.SalesQuota ELSE 0 END) AS '2014'
, SUM(CASE WHEN YEAR(sqh.SalesOrderDate) = 2015 THEN sqh.SalesQuota ELSE 0 END) AS '2015'
FROM Person p
LEFT OUTER JOIN SalesQuotaHistory sqh
ON p.EntityId = sqh.EntityId
GROUP BY p.EntityId, p.FirstName
EntityId FirstName 2014 2015
----------- --------------------- ----------- -----------
1 Ion 3000 2000
2 Fane 0 0
3 George 0 0
4 Mircea 0 12000
but without understanding what's wrong with the first attempt .. i can't get over this ..
Any explanation would be greatly appreciated.
Is easy to see what is happening if you change your select to
SELECT *
and remove the group by
You first approach need something like this
Sql Fiddle Demo
SELECT p.[EntityId]
, p.FirstName
, COALESCE(s2014,0) as [2014]
, COALESCE(s2015,0) as [2015]
FROM Person p
LEFT JOIN (SELECT EntityId, SUM(SalesQuota) s2014
FROM SalesQuotaHistory
WHERE YEAR(SalesOrderDate) = 2014
GROUP BY EntityId
) as s1
ON p.[EntityId] = s1.EntityId
LEFT JOIN (SELECT EntityId, SUM(SalesQuota) s2015
FROM SalesQuotaHistory
WHERE YEAR(SalesOrderDate) = 2015
GROUP BY EntityId
) as s2
ON p.[EntityId] = s2.EntityId
Joining with the result data only if exist for that id and year.
OUTPUT
| EntityId | FirstName | 2014 | 2015 |
|----------|-----------|------|-------|
| 1 | Ion | 3000 | 2000 |
| 2 | Fane | 0 | 0 |
| 3 | George | 0 | 0 |
| 4 | Mircea | 0 | 12000 |
You have multiple rows for each year, so the first method is producing a Cartesian product.
For instance, consider EntityId 100:
1 1 1000 2014-01-01 00:00:00.000
2 1 1000 2014-01-02 00:00:00.000
3 1 1000 2014-01-03 00:00:00.000
10 1 1000 2015-01-01 00:00:00.000
11 1 1000 2015-01-02 00:00:00.000
The intermediate result from the join produces six rows, with these SalesQuotaId:
1 10
1 11
2 10
2 11
3 10
3 11
You can then do the math -- the result is off because of the multiple rows.
You seem to know how to fix the problem. The conditional aggregation approach produces the correct answer.
You could improve the speed of your query by adding a WHERE condition to filter only the years over which you're looking for data:
SELECT p.EntityId
, p.FirstName
, SUM(CASE WHEN YEAR(sqh.SalesOrderDate) = 2014
THEN sqh.SalesQuota ELSE 0 END) AS '2014'
, SUM(CASE WHEN YEAR(sqh.SalesOrderDate) = 2015
THEN sqh.SalesQuota ELSE 0 END) AS '2015'
FROM Person p
LEFT OUTER JOIN SalesQuotaHistory sqh
ON p.EntityId = sqh.EntityId
WHERE YEAR(sqh.SalesOrderDate) IN (2014, 2015)
GROUP BY p.EntityId, p.FirstName
Otherwise, the query that you found is the way to go (good job!)

get multiple column from both table by joining a table on the basis of other max table column

1) vendor table
--------------------------------------------
VENDid VENDname
--- -----
1 ABC
2 XYZ
3 WXY
2)purchase table
---------------------------------------------
VENDid Purchasedate
------ ------------
1 12-01-2012
1 10-11-2013
2 22-02-2014
2 11-04-2014
3 10-05-2014
3 11-06-2014
1 14-06-2014
output(list all rows of vendor table and only max(purchasedate) from purchase table)
---------------------------------------------
VENDid VENDname PurchaseDate
------- -------- -------------
1 ABC 14-06-2014
2 XYZ 11-04-2014
3 WXY 11-06-2014
i got some queries like to solve previous problem-
SELECT v.VendID, VendName, Max(PurchaseDate)
FROM vendor v
INNER JOIN purchase p
ON v.VendID = p.VendID
Group By v.VendID, VendName
select VENDid, VENDname,
(select top 1 purchaseDate from purchase p
where p.VENDid=v.VENDid order by purchaseDate desc) as 'Purchase date'
from Vendor v
Que. If i will add some more column in purchase table like -
2)purchase table
------------------------------------------
VENDid Purchasedate amount_paid
------ ------------ ------------
1 12-01-2012 10000
1 10-11-2013 20000
2 22-02-2014 15000
2 11-04-2014 30000
3 10-05-2014 80000
3 11-06-2014 17000
1 14-06-2014 28000
and i want amount_paid along with previous output like-
---------------------------------------------
VENDid VENDname PurchaseDate amount_paid
------- -------- ------------- -------------
1 ABC 14-06-2014 28000
2 XYZ 11-04-2014 30000
3 WXY 11-06-2014 17000
then what will be query..
You appear to be using SQL Server. If so, you can use cross apply:
select v.VENDid, v.VENDname, p.PurchaseDate, p.Amount_Paid
from Vendor v cross apply
(select top 1 p.*
from purchase p
where p.VENDid = v.VENDid
order by p.purchaseDate desc
) p ;

How to filter first appearance in table only

Here is the table structure:
tblApplicants:
applicantID (index) | ApplyingForYear (nvarchar)
------------------------------------------------------
1 2013/14
11 2013/14
13 2013/14
12 2013/14
15 2013/14
21 2012/13
tblApplicantSchools_shadow:
id (index) | applicantID | updated (datetime) | statusID (int) | schoolID (int)
-----------------------------------------------------------------------------------------------------
1 11 2012-09-24 00:00:00.000 3 2
1 13 2012-10-24 00:00:00.000 4 2
2 15 2012-11-24 00:00:00.000 3 4
3 13 2012-03-24 00:00:00.000 4 3
4 12 2012-09-24 00:00:00.000 4 1
5 21 2012-11-03 00:00:00.000 5 2
6 11 2012-09-04 00:00:00.000 4 4
What I need to do is:
get all applicants, that have an ApplyingForYear of '2013/14' in tblApplicants
have a statusID of 4
I only want to count them once - even if they appear twice or more in tblApplicantschools_show
group the number of distinct applicants (as per the above) - by the updated date column (grouped by week)
So based on the sample data above, there should be 3 rows that come out, (because ApplicantID 13 appears twice and I only want him once).
This is how the result should look:
Datesubmitted TotalAppsPerWeek
-------------------------------------------------------
2012-10-24 00:00:00.000 1
2012-09-24 00:00:00.000 1
2012-09-04 00:00:00.000 1
This is what I have so far - but it results in 4 rows, not 3 :(
select
DATEADD(ww,(DATEDIFF(ww,0,[tblApplicantSchools_shadow].updated)),0) AS Datesubmitted,
count(DISTINCT [tblApplicantSchools_shadow].applicantID) as TotalAppsPerWeek
FROM tblApplicants
INNER JOIN tblApplicantSchools_shadow
ON tblApplicantS.ApplicantID = tblApplicantSchools_shadow.applicantID
WHERE
ApplyingForYear = '2013/14'
AND [tblApplicantSchools_shadow].statusID = 4
GROUP BY
DATEADD(ww, (DATEDIFF(ww, 0, [tblApplicantSchools_shadow].updated)), 0)
And here is a Fiddle: http://sqlfiddle.com/#!3/3aa61/42
From your title, I'm assuming the one row you want from each applicant is the one with the smallest id. You can select one row per applicant ID with the ROW_NUMBER() function:
;with latestApplication AS
(
SELECT DATEADD(ww,(DATEDIFF(ww,0,[tblApplicantSchools_shadow].updated)),0)
AS Datesubmitted,
[tblApplicantSchools_shadow].applicantID,
ROW_NUMBER() OVER (PARTITION BY [tblApplicantSchools_shadow].applicantID
ORDER BY [tblApplicantSchools_shadow].id)
AS rn
FROM tblApplicants
INNER JOIN tblApplicantSchools_shadow
ON tblApplicantS.ApplicantID = tblApplicantSchools_shadow.applicantID
WHERE ApplyingForYear = '2013/14'
AND [tblApplicantSchools_shadow].statusID = 4
)
select Datesubmitted, COUNT(1) AS TotalAppsPerWeek
FROM latestApplication
WHERE rn = 1
group by Datesubmitted
order by Datesubmitted DESC
http://sqlfiddle.com/#!3/3aa61/57