SQL Joining table to itself with GROUP BY - sql

I have one table that I want to GROUP BY ID, and at the same time I want a second GROUP BY on ID and DATE, then join that result to the ID grouping. Example code:
Here is how the two result sets are built. I want to LEFT JOIN TEMP to ORG on ID and [DATE] to get FIRST and LAST alongside the ORG result.
SELECT
ID,
MIN([DATE]) AS MIN_DATE
FROM [ORG] AS ORG
GROUP BY ID
SELECT
ID,
[DATE],
MIN(EXEC_AS_OF_TIME) AS [FIRST],
MAX(EXEC_AS_OF_TIME) AS [LAST]
FROM [ORG] AS TEMP
GROUP BY ID, [DATE]
Here is how I thought it would work, but it doesn't. Where am I going wrong?
SELECT
ORG.ID,
MIN([ORG.DATE]) AS MIN_DATE,
TEMP.[FIRST],
TEMP.[LAST]
FROM [ORG] AS ORG
GROUP BY ID
LEFT JOIN (
SELECT
ID,
[DATE],
MIN(EXEC_AS_OF_TIME) AS [FIRST],
MAX(EXEC_AS_OF_TIME) AS [LAST]
FROM [TEMP]
GROUP BY ID, [DATE]
) AS TEMP ON ORG.ID = TEMP.ID AND ORG.[MIN_DATE] = TEMP.[DATE]

GROUP BY cannot come before a JOIN in the same SELECT, so aggregate in derived tables first and then join the results:
SELECT ORG.ID, ORG.MIN_DATE, TEMP.[FIRST], TEMP.[LAST]
FROM
(
SELECT ID, MIN([DATE]) AS MIN_DATE
FROM [ORG] GROUP BY ID) AS ORG
LEFT JOIN
(
SELECT ID, [DATE],
MIN(EXEC_AS_OF_TIME) AS [FIRST], MAX(EXEC_AS_OF_TIME) AS [LAST]
FROM [ORG]
GROUP BY ID, [DATE]) AS TEMP ON ORG.ID = TEMP.ID AND ORG.[MIN_DATE] = TEMP.[DATE]

You don't need such a complicated query for this. Just use a window function:
SELECT id.*
FROM (SELECT ID, [DATE],
MIN(EXEC_AS_OF_TIME) AS [FIRST],
MAX(EXEC_AS_OF_TIME) AS [LAST],
MIN([DATE]) OVER (PARTITION BY ID) as MIN_DATE
FROM [ORG] AS TEMP
GROUP BY ID, [DATE]
) id
WHERE [DATE] = min_date;
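To check the window-function version on some data, here is a minimal sketch that runs the same shape of query against an in-memory SQLite database through Python's sqlite3 module. The sample rows, the lowercase table and column names, and the use of SQLite rather than SQL Server are all assumptions made for illustration; it also assumes the bundled SQLite is 3.25 or newer so that window functions are available.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data: two IDs, several execution times per date.
conn.executescript("""
CREATE TABLE org (id INTEGER, date TEXT, exec_as_of_time TEXT);
INSERT INTO org VALUES
  (1, '2020-01-01', '09:00'),
  (1, '2020-01-01', '17:00'),
  (1, '2020-01-02', '10:00'),
  (2, '2020-01-05', '08:30'),
  (2, '2020-01-06', '11:00'),
  (2, '2020-01-06', '12:00');
""")

# Group by (id, date), compute the earliest date per id with a window
# function over the grouped rows, then keep only the earliest date.
query = """
SELECT id, min_date, first_time, last_time
FROM (
    SELECT id, date,
           MIN(exec_as_of_time) AS first_time,
           MAX(exec_as_of_time) AS last_time,
           MIN(date) OVER (PARTITION BY id) AS min_date
    FROM org
    GROUP BY id, date
) AS grouped
WHERE date = min_date
ORDER BY id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Each ID comes back once, with the first and last execution time taken from its earliest date only.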

Related

How to query and display longest name of row results with same id, same date, and same first name

For each same id, same date, and same first name, I would like to display the shortest last name through a group by or any other workaround:
Here's the table:
https://imgur.com/LnJRqMZ
Here's the results I would like to see:
https://imgur.com/BSv1ibi
I was wondering if someone could show me how to do this with a query.
Thanks
There are several ways to do this; here are two. As you haven't tagged anything other than sql, I am not sure which database you are using or which syntax you need. I use SQL Server 2008 R2, so that is the dialect you see here. If you use MySQL, Oracle, another SQL Server version, or something else, please tag it.
Here is the base data:
SELECT 1 as ID, '2019-01-01' as [Date], 'Edward Brady' as LastName, 'Tom' as FirstName
into #table
UNION
SELECT 1 as ID, '2019-01-01' as [Date], 'Brady' as LastName, 'Tom' as FirstName
UNION
SELECT 2 as ID, '2019-02-02' as [Date], 'Wardell Curry' as LastName, 'Steph' as FirstName
UNION
SELECT 2 as ID, '2019-02-02' as [Date], 'Curry' as LastName, 'Steph' as FirstName
UNION
SELECT 2 as ID, '2019-02-02' as [Date], 'Curry II' as LastName, 'Steph' as FirstName
UNION
SELECT 3 as ID, '2019-03-03' as [Date], 'Ronaldo' as LastName, 'Christiano' as FirstName
UNION
SELECT 3 as ID, '2019-03-03' as [Date], 'Ronaldo' as LastName, 'Christiano' as FirstName
You can use a Common Table Expression (CTE) and join it back on the actual length of the string.
WITH CTE AS
(SELECT ID
,[Date]
,FirstName
,MIN(LEN(LastName)) AS Shortest
FROM #table
GROUP BY ID
,[Date]
,FirstName)
SELECT t.ID
,t.[Date]
,t.LastName
,t.FirstName
FROM #table t
JOIN CTE c ON c.ID = t.ID
AND c.[Date] = t.[Date]
AND c.FirstName = t.FirstName
AND LEN(t.LastName) = c.Shortest
This is a similar idea, but it uses ROW_NUMBER instead of a join.
WITH CTE AS
(SELECT Distinct ID
,Date
,Firstname
,LastName
,ROW_NUMBER() over (Partition By ID ORDER BY LEN(LastName) ASC) as Rnk
FROM #table)
SELECT *
FROM CTE
WHERE Rnk = 1
If you want all last names that are shortest for each id:
select t.*
from t
where len(t.lastname) = (select min(len(t2.lastname))
from t t2
where t2.id = t.id
);
Actually, in this case, to choose one of them, row_number() is a good solution:
select t.*
from (select t.*, row_number() over (partition by id order by len(lastname) asc) as seqnum
from t
) t
where seqnum = 1;
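For anyone who wants to try the row_number() approach without a SQL Server instance, here is a small sketch using Python's sqlite3 module. It is an illustration under assumptions: SQLite 3.25+ for window functions, LENGTH() standing in for LEN(), a table named t, and a partition on id, date and firstname to match the grouping the question describes. The rows are the sample data from the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (id INTEGER, date TEXT, lastname TEXT, firstname TEXT);
INSERT INTO t VALUES
  (1, '2019-01-01', 'Edward Brady', 'Tom'),
  (1, '2019-01-01', 'Brady', 'Tom'),
  (2, '2019-02-02', 'Wardell Curry', 'Steph'),
  (2, '2019-02-02', 'Curry', 'Steph'),
  (2, '2019-02-02', 'Curry II', 'Steph'),
  (3, '2019-03-03', 'Ronaldo', 'Christiano');
""")

# Number the rows of each (id, date, firstname) group by last-name length
# and keep the first one, i.e. the shortest last name.
query = """
SELECT id, date, lastname, firstname
FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (
               PARTITION BY id, date, firstname
               ORDER BY LENGTH(lastname)
           ) AS seqnum
    FROM t
) AS ranked
WHERE seqnum = 1
ORDER BY id
"""
shortest = conn.execute(query).fetchall()
for row in shortest:
    print(row)
```

If two last names tie for shortest, ROW_NUMBER picks one of them arbitrarily; the correlated MIN(LEN(...)) query returns all of them instead.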

Get the maximum values of column B per each distinct value of column A sql

I have this table:
I am trying to pull all records from this table for the max value in the DIST_NO column for every distinct ID in the left most column, but I still want to pull every record for each ID in which there are different Product_ID's as well.
I tried partitioning and using row_number, but I am having trouble at the moment.
Here are my desired results:
This is what my code looks like currently:
select *
from
(SELECT *,
ROW_NUMBER() OVER (PARTITION BY ID ORDER BY DIST_NO DESC) RN
FROM Table) V
WHERE RN<=3
Do you want the max(DIST_NO) for each ID and product_ID?
If so, you can use:
SELECT
ID, product_ID, max(DIST_NO)
from table
group by ID, product_ID
If you want the detail rows related to the max row, you just need to join it back to your table:
Select
t.ID, max_dist_no, TRANSaction_ID , LINE_NO , PRODUCT_ID
from
table t inner join
(SELECT
ID, max(DIST_NO) as max_dist_no
from table
group by ID) mx on
t.ID = mx.ID and
t.DIST_NO = max_DIST_NO
Try
SELECT MT.ID
, MT.DIST_NO
, MT.TRANS_ID
, MT.LINE_NO
, MT.PRODUCT_ID
FROM MYTABLE MT
INNER JOIN (
SELECT T.ID, MAX(T.DIST_NO) as DIST_NO FROM MYTABLE T
GROUP BY T.ID
) MAX_MT ON MT.Id = MAX_MT.ID AND MT.DIST_NO = MAX_MT.DIST_NO
The sub query returns each combination of ID and Max value of DIST_NO:
SELECT T.ID, MAX(T.DIST_NO) as DIST_NO FROM MYTABLE T
GROUP BY T.ID
Joining this back to your original table will basically filter your original data-set by only these combinations of values.
Tested on PostgreSQL:
WITH t1 AS (
SELECT id, product_id, MAX(dist_no) AS dist_no
FROM test
GROUP BY 1,2)
SELECT t1.id, t1.dist_no, t2.trans_id, t2.line_no, t1.product_id
FROM test t2, t1
WHERE t1.id=t2.id AND t1.product_id=t2.product_id AND t1.dist_no=t2.dist_no
Use rank() or dense_rank():
select t.*
from (SELECT t.*,
RANK() OVER (PARTITION BY ID ORDER BY DIST_NO DESC) as seqnum
FROM Table t
) t
WHERE seqnum = 1;
This is almost a literal translation of your request:
I am trying to pull all records from this table for the max value in
the DIST_NO column for every distinct ID in the left most column.
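Here is a quick way to see what the RANK() filter returns, sketched with Python's sqlite3 module rather than SQL Server (an assumption, as is the table name example; SQLite 3.25+ is needed for window functions). The rows are the sample values used in another answer in this thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE example (
    id INTEGER, dist_no INTEGER, trans_id INTEGER,
    line_no INTEGER, product_id INTEGER
);
INSERT INTO example VALUES
  (102657, 1, 1105365, 1, 109119),
  (102657, 1, 1105366, 2, 109114),
  (102657, 2, 1105365, 1, 109119),
  (102657, 2, 1105366, 2, 109114),
  (104371, 1, 1190538, 1, 110981),
  (104371, 2, 1190538, 1, 110981);
""")

# RANK() gives every row that ties for the highest dist_no within an id
# the value 1, so all of those rows survive the filter.
query = """
SELECT id, dist_no, trans_id, line_no, product_id
FROM (
    SELECT e.*,
           RANK() OVER (PARTITION BY id ORDER BY dist_no DESC) AS seqnum
    FROM example e
) AS ranked
WHERE seqnum = 1
ORDER BY id, line_no
"""
top_rows = conn.execute(query).fetchall()
for row in top_rows:
    print(row)
```

ROW_NUMBER() would keep only one row per ID here, which is why RANK() or DENSE_RANK() fits better when several rows share the maximum DIST_NO.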
You can try something like this. (But is your expected result correct? I think there is a small mistake in TRANS_ID...)
DECLARE @ExampleTable TABLE
(ID INT,
DIST_NO INT,
TRANS_ID INT,
LINE_NO INT,
PRODUCT_ID INT)
INSERT INTO @ExampleTable
( ID, DIST_NO, TRANS_ID, LINE_NO, PRODUCT_ID )
VALUES ( 102657, 1, 1105365, 1, 109119 ),
( 102657, 1, 1105366, 2, 109114 ),
( 102657, 2, 1105365, 1, 109119 ),
( 102657, 2, 1105366, 2, 109114 ),
( 104371, 1, 1190538, 1, 110981 ),
( 104371, 2, 1190538, 1, 110981 )
;WITH CTE AS ( SELECT DISTINCT ID, LINE_NO
FROM @ExampleTable)
SELECT a.ID,
x.DIST_NO,
x.TRANS_ID,
x.LINE_NO,
x.PRODUCT_ID
FROM CTE a
CROSS APPLY (SELECT TOP 1 *
FROM @ExampleTable f
WHERE a.ID = f.ID AND
a.LINE_NO = f.LINE_NO
ORDER BY DIST_NO DESC) x

SQL - Return previous record details as column by date

I am trying to get a list of records showing changes in location and dates to display as one row for each record showing previous location.
Basically a query to take data like:
And display it like:
I tried using lag, but it mixes up some of the records. Would anyone be able to suggest a good way to do this?
Thanks!
DECLARE @TABLE TABLE
(ID INT, NAME VARCHAR(20), LOCATIONDATE DATETIME, REASON VARCHAR(20))
INSERT INTO @TABLE
(ID, NAME, LOCATIONDATE, REASON)
VALUES
( 1,'abc',CAST('2016/01/01' AS SMALLDATETIME),'move'),
( 2,'def',CAST('2016/02/01' AS SMALLDATETIME),'move'),
( 1,'abc',CAST('2016/06/01' AS SMALLDATETIME),'move'),
( 2,'def',CAST('2016/07/01' AS SMALLDATETIME),'move'),
( 1,'abc',CAST('2016/08/01' AS SMALLDATETIME),'move'),
( 3,'ghi',CAST('2016/08/01' AS SMALLDATETIME),'move')
select s.*
,t1.*
from
(
select t.*,
row_number() over (partition by id order by locationdate desc) rn
from @TABLE t
) s
left join
(
select t.*,
row_number() over (partition by id order by locationdate desc) rn
from @TABLE t
) t1
on t1.id = s.id and t1.rn = s.rn + 1
You can try it:
SELECT a.id,a.name,a.location as currentLocation,a.locationdatedate as currrentLocate,b.location as preLocation,b.locationdatedate as prevLocate,a.changereason
FROM test as a
JOIN test as b ON a.name = b.name
where a.locationdatedate > b.locationdatedate
group by a.name
Please try this one; it works here:
SELECT l.id
,l.name
,l.location as currentLocation
,l.locationdatedate as currrentLocate
,r.location as preLocation
,r.locationdatedate as prevLocate
,r.changereason
FROM tableName AS l inner join
tableName AS r ON r.id=l.id
WHERE l.locationdatedate !=r.locationdatedate AND l.locationdatedate > r.locationdatedate
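Since the question mentions LAG, here is a minimal sketch of how LAG can pull the previous row's values when it is partitioned by the record ID and ordered by date. The original attempt is not shown, so the guess that a missing PARTITION BY caused the mixed-up rows is only an assumption. The moves table, its location values, and the use of SQLite (3.25+) through Python's sqlite3 module are also made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: same ids, names and dates as the sample above,
# with an invented location column.
conn.executescript("""
CREATE TABLE moves (id INTEGER, name TEXT, location TEXT, locationdate TEXT);
INSERT INTO moves VALUES
  (1, 'abc', 'A', '2016-01-01'),
  (2, 'def', 'X', '2016-02-01'),
  (1, 'abc', 'B', '2016-06-01'),
  (2, 'def', 'Y', '2016-07-01'),
  (1, 'abc', 'C', '2016-08-01'),
  (3, 'ghi', 'Q', '2016-08-01');
""")

# PARTITION BY id keeps LAG from reaching into another record's history;
# ORDER BY locationdate defines which row counts as "previous".
query = """
SELECT id, name,
       location,
       locationdate,
       LAG(location) OVER (PARTITION BY id ORDER BY locationdate) AS prev_location,
       LAG(locationdate) OVER (PARTITION BY id ORDER BY locationdate) AS prev_locationdate
FROM moves
ORDER BY id, locationdate
"""
history = conn.execute(query).fetchall()
for row in history:
    print(row)
```

The first row of each ID has no predecessor, so its previous-location columns come back as NULL (None in Python).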

Get average time between record creation

So I have data like this:
UserID CreateDate
1 10/20/2013 4:05
1 10/20/2013 4:10
1 10/21/2013 5:10
2 10/20/2012 4:03
I need to group by each user and get the average time between CreateDates. My desired results would be like this:
UserID AvgTime(minutes)
1 753.5
2 0
How can I find the difference between CreateDates for all records returned for a User grouping?
EDIT:
Using SQL Server 2012
Try this:
SELECT A.UserID,
AVG(CAST(DATEDIFF(MINUTE,B.CreateDate,A.CreateDate) AS FLOAT)) AvgTime
FROM #YourTable A
OUTER APPLY (SELECT TOP 1 *
FROM #YourTable
WHERE UserID = A.UserID
AND CreateDate < A.CreateDate
ORDER BY CreateDate DESC) B
GROUP BY A.UserID
This approach should also work:
;WITH CTE AS (
Select userId, createDate,
row_number() over (partition by userid order by createdate) rn
from Table1
)
select t1.userid,
isnull(avg(datediff(second, t1.createdate, t2.createdate)*1.0/60),0) AvgTime
from CTE t1 left join CTE t2 on t1.UserID = t2.UserID and t1.rn +1 = t2.rn
group by t1.UserID;
Updated: Thanks to @Lemark for pointing out that the number of differences is recordCount - 1.
Since you're using SQL Server 2012, you can use LEAD() to do this:
with cte as
(select
userid,
(datediff(second, createdate,
lead(CreateDate) over (Partition by userid order by createdate)
) / 60.0) datdiff
From table1
)
select
userid,
avg(datdiff)
from cte
group by userid
Something like this:
;WITH CTE AS
(
SELECT
ROW_NUMBER() OVER (PARTITION BY UserID ORDER BY CreateDate) RN,
UserID,
CreateDate
FROM Tbl
)
SELECT
T1.UserID,
ISNULL(AVG(DATEDIFF(mi, T1.CreateDate, T2.CreateDate) * 1.0), 0) AvgTime
FROM CTE T1
LEFT JOIN CTE T2
ON T1.UserID = T2.UserID
AND T1.RN = T2.RN - 1
GROUP BY T1.UserID
With SQL 2012 you can use the ROW_NUMBER function and self-join to find the "previous" row in each group:
WITH Base AS
(
SELECT
ROW_NUMBER() OVER (PARTITION BY UserID ORDER BY CreateDate) RowNum,
UserId,
CreateDate
FROM Users
)
SELECT
B1.UserID,
ISNULL(
AVG(
DATEDIFF(mi,B2.CreateDate,B1.CreateDate) * 1.0
)
,0) [Average]
FROM Base B1
LEFT JOIN Base B2
ON B1.UserID = B2.UserID
AND B1.RowNum = B2.RowNum + 1
GROUP BY B1.UserId
Although I get a different answer for UserID 1: I get an average of (5 + 1500) / 2 = 752.5.
This only works in SQL Server 2012 and later, where you can use the LEAD analytic function:
CREATE TABLE dates (
id integer,
created datetime not null
);
INSERT INTO dates (id, created)
SELECT 1 AS id, '10/20/2013 4:05' AS created
UNION ALL SELECT 1, '10/20/2013 4:10'
UNION ALL SELECT 1, '10/21/2013 5:10'
UNION ALL SELECT 2, '10/20/2012 4:03';
SELECT id, isnull(avg(diff), 0)
FROM (
SELECT id,
datediff(MINUTE,
created,
LEAD(created, 1, NULL) OVER(partition BY id ORDER BY created)
) AS diff
FROM dates
) as diffs
GROUP BY id;
http://sqlfiddle.com/#!6/4ce89/22
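To double-check the arithmetic, here is a sketch of the LEAD approach run against SQLite through Python's sqlite3 module. The switch to SQLite (3.25+), the ISO-formatted timestamps, and the strftime('%s', ...) trick used in place of DATEDIFF are assumptions for the sake of a self-contained example; the four rows are the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dates (id INTEGER, created TEXT NOT NULL);
INSERT INTO dates VALUES
  (1, '2013-10-20 04:05'),
  (1, '2013-10-20 04:10'),
  (1, '2013-10-21 05:10'),
  (2, '2012-10-20 04:03');
""")

# strftime('%s', ...) converts a timestamp to Unix seconds, so the
# subtraction gives the gap to the next row in seconds. The last row of
# each id has no next row, yields NULL, and is ignored by AVG.
query = """
SELECT id, COALESCE(AVG(diff_minutes), 0.0) AS avg_minutes
FROM (
    SELECT id,
           (CAST(strftime('%s', LEAD(created) OVER (PARTITION BY id ORDER BY created)) AS INTEGER)
            - CAST(strftime('%s', created) AS INTEGER)) / 60.0 AS diff_minutes
    FROM dates
) AS diffs
GROUP BY id
ORDER BY id
"""
averages = conn.execute(query).fetchall()
for row in averages:
    print(row)
```

User 1 has gaps of 5 and 1500 minutes, so the average is 752.5; user 2 has a single row and therefore no gap, which COALESCE turns into 0.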

SQL query: how to distinct count of a column group by another column

In my table I need to know if each ID has one and only one ID_name. How can I write such a query?
I tried:
select ID, count(distinct ID_name) as count_name
from table
group by ID
having count_name > 1
But it takes forever to run.
Any thoughts?
select ID
from YourTable
group by
ID
having count(distinct ID_name) > 1
or
select *
from YourTable yt1
where exists
(
select *
from YourTable yt2
where yt1.ID = yt2.ID
and yt1.ID_Name <> yt2.ID_Name
)
Now, most ID columns are defined as primary key and are unique. So in a regular database you'd expect both queries to return an empty set.
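As a small sanity check, the two queries can be compared on toy data. The sketch below uses Python's sqlite3 module with a hypothetical your_table and made-up rows; the composite index is an assumption about what may help the slow query, not a guaranteed fix.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: id 1 repeats one name, id 2 has two names, id 3 has one row.
conn.executescript("""
CREATE TABLE your_table (id INTEGER, id_name TEXT);
INSERT INTO your_table VALUES
  (1, 'alpha'), (1, 'alpha'),
  (2, 'beta'),  (2, 'gamma'),
  (3, 'delta');
-- Assumption: an index on (id, id_name) may let the engine answer both
-- queries from the index instead of scanning and sorting the whole table.
CREATE INDEX ix_your_table_id_name ON your_table (id, id_name);
""")

# Variant 1: count distinct names per id directly in HAVING.
having_ids = [r[0] for r in conn.execute("""
    SELECT id
    FROM your_table
    GROUP BY id
    HAVING COUNT(DISTINCT id_name) > 1
    ORDER BY id
""")]

# Variant 2: an id qualifies if another row with the same id has a different name.
exists_ids = [r[0] for r in conn.execute("""
    SELECT DISTINCT yt1.id
    FROM your_table yt1
    WHERE EXISTS (
        SELECT 1
        FROM your_table yt2
        WHERE yt1.id = yt2.id
          AND yt1.id_name <> yt2.id_name
    )
    ORDER BY yt1.id
""")]

print(having_ids, exists_ids)
```

Both variants flag only the ID that carries two different names. Note that the HAVING clause repeats the aggregate expression rather than the count_name alias, which is what SQL Server requires.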
select tt.ID, max(tt.myRank) as name_count
from
(
select
ip.ID, ip.ID_name,
DENSE_RANK() over (partition by ip.ID order by ip.ID_name) as myRank
from YourTable ip
) tt
group by tt.ID
This gives you every ID with its total number of distinct ID_name values.
If you want only those IDs that have more than one name associated, just add a where clause,
e.g.
select tt.ID, max(tt.myRank) as name_count
from
(
select
ip.ID, ip.ID_name,
DENSE_RANK() over (partition by ip.ID order by ip.ID_name) as myRank
from YourTable ip
) tt
where tt.myRank > 1
group by tt.ID