Simple CTE recursive query - sql

I am sorry to bother you with such a simple question, but I decided to learn recursive CTE queries and I cannot get my query to work, even after scouring many sources and threads. So I am humbly asking you to point out my mistake(s).
Here is a part of table I am querying:
ID ContainerInstanceID ItemID ContentContainerInstanceID
--------- -------------------- ----------- --------------------------
73 40 NULL 41
69 40 23885 NULL
68 40 29683 NULL
67 40 29686 NULL
72 41 27392 NULL
71 41 29235 NULL
70 41 29213 NULL
I assembled this simple CTE query:
;WITH ContainerContent_CTE(InstanceID,ItemID,ContentContainerInstanceID) AS
(
-- ROOT set according to input parameter
SELECT ContainerInstanceID,SCA.ItemID,SCA.ContentContainerInstanceID
FROM StockContainerAssignments as SCA
WHERE SCA.ContainerInstanceID = 40 -- input parameter
UNION ALL
-- recursive data
SELECT ContainerInstanceID,SCA2.ItemID,SCA2.ContentContainerInstanceID
FROM ContainerContent_CTE AS CC
JOIN StockContainerAssignments as SCA2 on CC.InstanceID = SCA2.ContentContainerInstanceID
)
SELECT * FROM ContainerContent_CTE;
What I am trying to do is take a top-level container (ID = 40 in this example), which is my input parameter. Then I try to connect the other levels by linking ContainerInstanceID with ContentContainerInstanceID. In my example it is not null at row ID = 73. This should add another 3 rows to my result set (so it should look similar to the example data above), but I still get only the top-level rows:
InstanceID ItemID ContentContainerInstanceID
----------- ----------- --------------------------
40 29686 NULL
40 29683 NULL
40 23885 NULL
40 NULL 41
I would appreciate any hints to help me get past this.

You just had a few little things out of place. This should work for you.
with ContainerContent_CTE as
(
select SCA.ContainerInstanceID
,SCA.ItemID
,SCA.ContentContainerInstanceID
FROM StockContainerAssignments as SCA
WHERE SCA.ContainerInstanceID = 40 -- input parameter
UNION ALL
select SCA.ContainerInstanceID
,SCA.ItemID
,SCA.ContentContainerInstanceID
FROM StockContainerAssignments as SCA
inner join ContainerContent_CTE cte on cte.ContentContainerInstanceID = SCA.ContainerInstanceID
)
select *
from ContainerContent_CTE

This works for me
declare @t table (id int, instance int, container int);
insert into @t values
(73, 40, 41)
, (69, 40, NULL)
, (68, 40, NULL)
, (67, 40, NULL)
, (72, 41, NULL)
, (71, 41, NULL)
, (70, 41, NULL);
select * from @t;
with cte as
( select t.id, t.instance, t.container
from @t t
where t.instance = 40
union all
select t.id, t.instance, t.container
from cte
join @t t
on t.instance = cte.container
)
select * from cte;

As expected, a dumb little mistake got in the way: in the ON clause, I was connecting the parent and child with the opposite pairs of IDs. Since I'm only learning CTEs, it was hard for me to see. Here it is fixed (for reference):
;WITH ContainerContent_CTE(InstanceID,ItemID,ContentContainerInstanceID) AS
(
-- ROOT set according to input parameter
SELECT ContainerInstanceID,SCA.ItemID,SCA.ContentContainerInstanceID
FROM StockContainerAssignments as SCA
WHERE SCA.ContainerInstanceID = 40 -- input parameter
UNION ALL
-- recursive data
SELECT ContainerInstanceID,SCA2.ItemID,SCA2.ContentContainerInstanceID
FROM ContainerContent_CTE AS CC
INNER JOIN StockContainerAssignments as SCA2 on CC.ContentContainerInstanceID = SCA2.ContainerInstanceID)
SELECT * FROM ContainerContent_CTE;
Thank you for suggestions.
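For anyone who wants to try the corrected join direction without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. It assumes SQLite semantics (which require the `WITH RECURSIVE` keyword) rather than T-SQL, and it reuses the sample rows from the question. The key point is the same: the parent's ContentContainerInstanceID is matched against the child's ContainerInstanceID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE StockContainerAssignments ("
    "ID INTEGER, ContainerInstanceID INTEGER, "
    "ItemID INTEGER, ContentContainerInstanceID INTEGER)"
)
# Sample rows copied from the question.
conn.executemany(
    "INSERT INTO StockContainerAssignments VALUES (?, ?, ?, ?)",
    [
        (73, 40, None, 41),
        (69, 40, 23885, None),
        (68, 40, 29683, None),
        (67, 40, 29686, None),
        (72, 41, 27392, None),
        (71, 41, 29235, None),
        (70, 41, 29213, None),
    ],
)

query = """
WITH RECURSIVE ContainerContent_CTE(InstanceID, ItemID, ContentContainerInstanceID) AS (
    -- anchor: rows of the top-level container
    SELECT ContainerInstanceID, ItemID, ContentContainerInstanceID
    FROM StockContainerAssignments
    WHERE ContainerInstanceID = ?
    UNION ALL
    -- recursive step: parent's content pointer = child's container ID
    SELECT SCA.ContainerInstanceID, SCA.ItemID, SCA.ContentContainerInstanceID
    FROM ContainerContent_CTE AS CC
    JOIN StockContainerAssignments AS SCA
      ON CC.ContentContainerInstanceID = SCA.ContainerInstanceID
)
SELECT InstanceID, ItemID, ContentContainerInstanceID
FROM ContainerContent_CTE
"""

rows = conn.execute(query, (40,)).fetchall()
for row in rows:
    print(row)
```

Rows whose ContentContainerInstanceID is NULL never match the join, so the recursion stops by itself once the innermost container is reached.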

Related

Specific format Join results on SQL Server

I have trawled the internet looking for a solution but nothing so far.
Here are 2 sample tables joined on SID/ID
SID Name Attendance Class
1 abc good 1A
2 xyz bad 1B
3 dsk good 1A
4 uij bad 1B
5 sss bad 1A
6 fff good 1D
7 ccc good 1A
ID Lesson Result
1 Read Pass-67%
1 Write Pass-89%
1 Sing Pass-99%
2 Read Pass-75%
3 Sing Fail-47%
3 Read Pass-55%
4 Write Pass-90%
4 Sing Fail-10%
The results need to be in the following format.
A row showing the student name, followed by rows of the students' results.
If a student does not have any results they will not be included.
1, abc, good, 1A
1, Read, Pass-67%
1, Write, Pass-89%
1, Sing, Pass-99%
2, xyz, bad, 1B
2, Read, Pass-75%
3, dsk, good, 1A
3, Sing, Fail-47%
3, Read, Pass-55%
4, uij, bad, 1B
4, Write, Pass-90%
4, Sing, Fail-10%
I attempted using UNION to no avail. It is similar to a pivot, but I have not had any luck with that either. I assume I'm missing a trick here; how can I get this done?
I have included the data if it makes it any easier!
CREATE TABLE RESULTS (ID Int, Lesson varchar(12), Result nvarchar(8))
insert into RESULTS (ID, Lesson, Result)
values
(1,'Read', 'Pass-67%'),
(1,'Write', 'Pass-89%'),
(1,'Sing', 'Pass-99%'),
(2,'Read', 'Pass-75%'),
(3,'Sing', 'Fail-47%'),
(3,'Read','Pass-55%'),
(4,'Write', 'Pass-90%'),
(4,'Sing', 'Fail-10%')
CREATE TABLE STUDENTS (ID int, Name varchar(5), Attendance nvarchar(10),
Class nvarchar (3))
insert into STUDENTS values
(1,'abc','good','1A'),
(2,'xyz','bad','1B'),
(3,'dsk','good','1A'),
(4,'uij','bad','1B'),
(5,'sss','bad','1A'),
(6,'fff','good','1D'),
(7,'ccc','good','1A')
You can use a UNION with a few workarounds.
;WITH Data AS
(
SELECT
S.ID,
S.Name,
S.Attendance,
S.Class,
IsStudent = 1
FROM
Students AS S
WHERE
EXISTS (SELECT 'at least one result' FROM Results AS R WHERE R.ID = S.ID)
UNION ALL
SELECT
ID = R.ID,
Name = R.Lesson,
Attendance = R.Result,
Class = NULL,
IsStudent = 0
FROM
Results AS R
)
SELECT
D.ID,
D.Name,
D.Attendance,
D.Class
FROM
Data AS D
ORDER BY
ID,
IsStudent DESC
But, as you can see from the final column names, you are mixing different kinds of data in the same columns, which is not a good thing to do.
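The same UNION ALL idea can be tried quickly with Python's sqlite3 module. This is only a sketch under SQLite semantics, using a subset of the question's data (student 5 is kept to show that students without results are filtered out). The helper column IsStudent exists only to sort each student's header row ahead of that student's result rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENTS (ID INT, Name TEXT, Attendance TEXT, Class TEXT);
CREATE TABLE RESULTS (ID INT, Lesson TEXT, Result TEXT);
INSERT INTO STUDENTS VALUES
    (1, 'abc', 'good', '1A'),
    (2, 'xyz', 'bad',  '1B'),
    (5, 'sss', 'bad',  '1A');
INSERT INTO RESULTS VALUES
    (1, 'Read',  'Pass-67%'),
    (1, 'Write', 'Pass-89%'),
    (2, 'Read',  'Pass-75%');
""")

query = """
SELECT ID, Name, Attendance, Class
FROM (
    -- header rows: only students that have at least one result
    SELECT ID, Name, Attendance, Class, 1 AS IsStudent
    FROM STUDENTS AS S
    WHERE EXISTS (SELECT 1 FROM RESULTS AS R WHERE R.ID = S.ID)
    UNION ALL
    -- detail rows: results squeezed into the same column layout
    SELECT ID, Lesson, Result, NULL, 0
    FROM RESULTS
)
ORDER BY ID, IsStudent DESC
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The order of the result rows within one student is not specified by this ORDER BY; add another sort key if that matters.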
Use union all :
select t.*
from(select ID, Name, Attendance, class
from STUDENTS s
where exists (select 1 from RESULTS where id = s.id) union all
select ID, Lesson, Result, null
from RESULTS r
) t
order by id, (case when class is not null then 0 else 1 end);
Simply concat those columns and Union
SELECT CONVERT(VARCHAR(10),id)+' , '+Name+' , '+Attendance
AS ResultSet INTO #T FROM dbo.STUDENTS
UNION ALL
SELECT CONVERT(VARCHAR(10),ID)+' , '+Lesson+' , '+ Result
FROM dbo.RESULTS
SELECT * FROM #T ORDER BY ResultSet
DROP TABLE #T

SQL Server: ORDER BY parameters in IN statement

I have a SQL statement that is the following:
SELECT A.ID, A.Name
FROM Properties A
WHERE A.ID IN (110, 105, 104, 106)
When I run this SQL statement, the output is ordered according to the IN list by ID automatically and returns
104 West
105 East
106 North
110 South
I want to know if it is possible to order by the order the parameters are listed within the IN clause. so it would return
110 South
105 East
104 West
106 North
I think the easiest way in SQL Server is to use a JOIN with VALUES:
SELECT p.ID, p.Name
FROM Properties p JOIN
(VALUES (110, 1), (105, 2), (104, 3), (106, 4)) ids(id, ordering)
ON p.id = ids.id
ORDER BY ids.ordering;
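A runnable sketch of this idea, using Python's sqlite3 module instead of SQL Server. SQLite does not accept the `ids(id, ordering)` column list on a derived table, so the VALUES list is wrapped in a CTE here; that rewrite and the extra row 120 'Center' are assumptions made for the demo, not part of the original query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Properties (ID INTEGER PRIMARY KEY, Name TEXT);
INSERT INTO Properties VALUES
    (104, 'West'), (105, 'East'), (106, 'North'), (110, 'South'),
    (120, 'Center');  -- hypothetical row that must not appear in the output
""")

query = """
WITH ids(id, ordering) AS (
    -- each wanted ID is paired with its position in the list
    VALUES (110, 1), (105, 2), (104, 3), (106, 4)
)
SELECT p.ID, p.Name
FROM Properties AS p
JOIN ids ON p.ID = ids.id
ORDER BY ids.ordering
"""

rows = conn.execute(query).fetchall()
print(rows)
```

The join does the filtering that the IN list did, and the second column carries the order.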
Sure...
just add an Order clause with a case in it
SELECT A.ID, A.Name
FROM Properties A
WHERE A.ID IN (110,105,104,106)
Order By case A.ID
when 110 then 0
when 105 then 1
when 104 then 2
when 106 then 3 end
With the help of a parsing function which returns the sequence as well
SELECT B.Key_PS
, A.ID
, A.Name
FROM Properties A
Join (Select * from [dbo].[udf-Str-Parse]('110,105,104,106',',')) B on A.ID=B.Key_Value
WHERE A.ID IN (110,105,104,106)
Order by Key_PS
The UDF if you need
CREATE FUNCTION [dbo].[udf-Str-Parse] (@String varchar(max),@Delimeter varchar(10))
--Usage: Select * from [dbo].[udf-Str-Parse]('Dog,Cat,House,Car',',')
-- Select * from [dbo].[udf-Str-Parse]('John Cappelletti was here',' ')
-- Select * from [dbo].[udf-Str-Parse]('id26,id46|id658,id967','|')
-- Select * from [dbo].[udf-Str-Parse]('hello world. It. is. . raining.today','.')
Returns @ReturnTable Table (Key_PS int IDENTITY(1,1), Key_Value varchar(max))
As
Begin
Declare @XML xml;Set @XML = Cast('<x>' + Replace(@String,@Delimeter,'</x><x>')+'</x>' as XML)
Insert Into @ReturnTable Select Key_Value = ltrim(rtrim(String.value('.', 'varchar(max)'))) FROM @XML.nodes('x') as T(String)
Return
End
The Parser alone would return
Select * from [dbo].[udf-Str-Parse]('110,105,104,106',',')
Key_PS Key_Value
1 110
2 105
3 104
4 106
What you could potentially do is:
Create a TVF that would split string and keep original order.
This question seems to have this function already written: MS SQL: Select from a split string and retain the original order (keep in mind that there might be other approaches besides those covered in that question; I give it only as an example of what the function should do)
So now if you'd run this query:
SELECT *
FROM dbo.Split('110,105,104,106', ',') AS T;
It would bring back this table as a result.
items rownum
------------
110 1
105 2
104 3
106 4
Following that, you could simply query your table, join with this TVF passing your IDs as a parameter:
SELECT P.ID, P.Name
FROM Properties AS P
INNER JOIN dbo.Split('110,105,104,106', ',') AS T
ON T.items = P.ID
ORDER BY T.rownum;
This should retain order of parameters.
If you need better performance, I'd advise putting the records from the TVF into a temp table (a # table), indexing it, and then joining it with the actual table. See the query below:
SELECT T.items AS ID, T.rownum AS SortOrder
INTO #Temporary
FROM dbo.Split('110,105,104,106', ',') AS T;
CREATE CLUSTERED INDEX idx_Temporary_ID
ON #Temporary(ID);
SELECT P.ID, P.Name
FROM Properties AS P
INNER JOIN #Temporary AS T
ON T.ID = P.ID
ORDER BY T.SortOrder;
This should work better on larger data sets and equally well on small ones.
Here is a solution that does not rely on hard-coded values or on dynamic SQL (to eliminate the hard-coded values).
I would build a table (maybe temp or variable) with OrderByValue and OrderBySort and insert from the application.
OrderByValue OrderBySort
110 1
105 2
104 3
106 4
Then I would join on the value and sort by the sort. The join will be the same as the In clause.
SELECT A.ID, A.Name
FROM Properties A
JOIN TempTable B On A.ID = B.OrderByValue
Order By B.OrderBySort
Another solution for this problem is to prepare a table variable for the IN clause, like this:
declare @InTable table (ID int, SortOrder int not null identity(1,1));
We can fill this table variable with your data in the order you want:
insert into @InTable values (110), (105), (104), (106);
Finally, we need to modify your query to use this table variable, like this:
select A.ID, A.Name
from Properties A
inner join @InTable as Sort on A.ID = Sort.ID
order by Sort.SortOrder
On the output you can see this
ID Name
110 South
105 East
104 West
106 North
In this solution you don't need to provide order in special way. You just need to insert values in order you want.
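When the ID list comes from application code, the ordering table can be filled directly from that list. The sketch below uses Python's sqlite3 module and a TEMP table named InTable as a stand-in for the table variable. It writes the position explicitly with `enumerate` instead of depending on an auto-generated column, which is a design choice of this sketch and not part of the answer above.

```python
import sqlite3

# IDs in the order the caller wants them back.
wanted_ids = [110, 105, 104, 106]

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Properties (ID INTEGER PRIMARY KEY, Name TEXT);
INSERT INTO Properties VALUES
    (104, 'West'), (105, 'East'), (106, 'North'), (110, 'South');
CREATE TEMP TABLE InTable (ID INTEGER, SortOrder INTEGER);
""")

# Store each ID together with its 1-based position in the list.
conn.executemany(
    "INSERT INTO InTable (ID, SortOrder) VALUES (?, ?)",
    [(value, position) for position, value in enumerate(wanted_ids, start=1)],
)

rows = conn.execute("""
SELECT A.ID, A.Name
FROM Properties AS A
JOIN InTable AS Sort ON A.ID = Sort.ID
ORDER BY Sort.SortOrder
""").fetchall()
print(rows)
```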

Need query to select direct and indirect customerID aliases

I need a query that will return all related alias id's from either column. Shown here are some alias customer ids, among thousands of other rows. If the input parameter to a query is id=7, I need a query that would return 5 rows (1,5,7,10,22). That is because they are all aliases of one-another. For example, 22 and 10 are indirect aliases of 7.
CustomerAlias
--------------------------
AliasCuID AliasCuID2
--------------------------
1 5
1 7
5 7
10 5
22 1
Here is an excerpt from the customer table.
Customer
----------------------------------
CuID CuFirstName CuLastName
----------------------------------
1 Mike Jones
2 Fred Smith
3 Jack Jackson
4 Emily Simpson
5 Mike Jones
6 Beth Smith
7 Mike jones
8 Jason Robard
9 Emilie Jiklonmie
10 Michael jones
11 Mark Lansby
12 Scotty Slash
13 Emilie Jiklonmy
22 mike jones
I've been able to come close, but I cannot seem to select the indirectly related aliases correctly. Given this query:
SELECT DISTINCT Customer.CuID, Customer.CuFirstName, Customer.CuLastName
FROM Customer WHERE
(Customer.CuID = 7) OR (Customer.CuID IN
(SELECT AliasCuID2
FROM CustomerAlias AS CustomerAlias_2
WHERE (AliasCuID = 7))) OR (Customer.CuID IN
(SELECT AliasCuID
FROM CustomerAlias AS CustomerAlias_1
WHERE (AliasCuID2 = 7)))
Returns 3 out of 5 of the desired ids of course. This lacks the indirectly related aliased id of 10 and 22 in the result rows.
1 Mike Jones
5 Mike Jones
7 Mike jones
* Based on suggestions below, I am trying a CTE hierarchical query.
I have this now after following some suggestions. It works for some, as long as the records in the table reference enough immediate ids. But, if the query uses id=10, then it still comes up short, just by the nature of the data.
DECLARE @id INT
SET @id = 10;
DECLARE @tmp TABLE ( a1 INT, a2 INT, Lev INT );
WITH Results (AliasCuID, AliasCuID2, [Level]) AS (
SELECT AliasCuID,
AliasCuID2,
0 as [Level]
FROM CustomerAlias
WHERE AliasCuID = @id OR AliasCuID2 = @id
UNION ALL
-- Recursive step
SELECT a.AliasCuID,
a.AliasCuID2,
r.[Level] + 1 AS [Level]
FROM CustomerAlias a
INNER JOIN Results r ON a.AliasCuID = r.AliasCuID2 )
INSERT INTO @tmp
SELECT * FROM Results;
WITH Results3 (AliasCuID, AliasCuID2, [Level]) AS (
SELECT AliasCuID,
AliasCuID2,
0 as [Level]
FROM CustomerAlias
WHERE AliasCuID = @id OR AliasCuID2 = @id
UNION ALL
-- Recursive step
SELECT a.AliasCuID,
a.AliasCuID2,
r.[Level] + 1 AS [Level]
FROM CustomerAlias a
INNER JOIN Results3 r ON a.AliasCuID2 = r.AliasCuID )
INSERT INTO @tmp
SELECT * FROM Results3;
SELECT DISTINCT a1 AS id FROM @tmp
UNION ALL
SELECT DISTINCT a2 AS id FROM @tmp
ORDER BY id
Note that this is a simplified query that just gives a list of related ids.
---
id
---
5
5
7
10
But, it is still unable to pull in ids 1 and 22.
This is not an easy problem to solve unless you have some idea of the depth of your search (https://stackoverflow.com/a/7569520/1803682) - which it looks like you do not - and take a brute force approach to it.
Assuming you do not know the depth you will need to write a stored proc. I followed this approach for a nearly identical problem: https://dba.stackexchange.com/questions/7147/find-highest-level-of-a-hierarchical-field-with-vs-without-ctes/7161#7161
UPDATE
If you don't care about the chain of how the aliases were created, I would run a script recursively to make them all refer to a single (master?) record. Then you can easily do the search and it will be quick. This is not a solution if you care about how the aliases get traversed, though.
I created a SQL Fiddle for SQL Server 2012. Please let me know if you can or cannot access it.
My thought here was that you'd want to just keep checking the left and right branches recursively, separately. This logic probably falls apart if the relationships bounce between left and right. You could set up a third CTE to reference the first two, but joining on left to right and right to left, but ain't nobody got time for that.
The code is below as well.
CREATE TABLE CustomerAlias
(
AliasCuID INT,
AliasCuID2 INT
)
GO
INSERT INTO CustomerAlias
SELECT 1,5
UNION SELECT 1, 7
UNION SELECT 5, 7
UNION SELECT 10, 5
UNION SELECT 22, 1
GO
DECLARE @Value INT
SET @Value = 7
; WITH LeftAlias AS
(
SELECT AliasCuID
, AliasCuID2
FROM CustomerAlias
WHERE AliasCuID2 = @Value
UNION ALL
SELECT a.AliasCuID
, a.AliasCuID2
FROM CustomerAlias a
JOIN LeftAlias b
ON a.AliasCuID = b.AliasCuID2
)
, RightAlias AS
(
SELECT AliasCuID
, AliasCuID2
FROM CustomerAlias
WHERE AliasCuID = @Value
UNION ALL
SELECT a.AliasCuID
, a.AliasCuID2
FROM CustomerAlias a
JOIN LeftAlias b
ON a.AliasCuID2 = b.AliasCuID
)
SELECT DISTINCT A
FROM
(
SELECT A = AliasCuID
FROM LeftAlias
UNION ALL
SELECT A = AliasCuID2
FROM LeftAlias
UNION ALL
SELECT A = AliasCuID
FROM RightAlias
UNION ALL
SELECT A = AliasCuID2
FROM RightAlias
) s
ORDER BY A
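As a point of comparison, here is a sketch that treats every alias row as an undirected link and collects everything reachable from the starting id, so relationships that bounce between the two columns are still followed. It runs on SQLite through Python's sqlite3 module and relies on SQLite allowing `UNION` (which discards duplicates and therefore stops on cycles) inside a recursive CTE. That is an assumption about SQLite only; it is not something to copy unchanged into T-SQL. The function name `aliases_of` is made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CustomerAlias (AliasCuID INT, AliasCuID2 INT);
INSERT INTO CustomerAlias VALUES (1, 5), (1, 7), (5, 7), (10, 5), (22, 1);
""")

QUERY = """
WITH RECURSIVE
edges(a, b) AS (
    -- make every alias link usable in both directions
    SELECT AliasCuID, AliasCuID2 FROM CustomerAlias
    UNION
    SELECT AliasCuID2, AliasCuID FROM CustomerAlias
),
reach(id) AS (
    SELECT ?                      -- start from the requested id
    UNION                         -- UNION drops ids already seen
    SELECT e.b
    FROM edges AS e
    JOIN reach AS r ON e.a = r.id
)
SELECT id FROM reach ORDER BY id
"""


def aliases_of(connection, cu_id):
    """Return the sorted ids directly or indirectly aliased to cu_id."""
    return [row[0] for row in connection.execute(QUERY, (cu_id,))]


print(aliases_of(conn, 7))
print(aliases_of(conn, 10))
```

An id with no alias rows simply returns itself.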

SQL Query to eliminate similar entries

I am working on a problem in SQL Server 2008
I have a table with a primary key and six value columns:
PK INT
dOne SmallINT
dTwo SmallINT
dThree SmallINT
dFour SmallINT
dFive SmallINT
dSix SmallINT
The table contains around a million records. It's probably worth noting that value in column n+1 > value in column n i.e. 97, 98, 99, 120, 135. I am trying to eliminate all rows which have 5 DIGITS in common (ignoring the PK) i.e.:
76, 89, 99, 102, 155, 122
11, 89, 99, 102, 155, 122
89, 99, 102, 155, 122, 130
In this case the algorithm should start at the first row and delete the second and third rows because they contain 5 matching digits. The first row persists.
I have tried to brute force the solution but finding all the duplicates for only the first record takes upwards of 25 seconds meaning processing the whole table would take... way too long (this should be a repeatable process).
I am fairly new to SQL but this is what I have come up with (I have come up with a few solutions but none were adequate... this is the latest attempt):
(I won't include all the code but I will explain the method, I can paste more if it helps)
Save the digits of record n into variables. SELECT all records which have one digit in common with record n FROM largeTable.
Insert all selected digits into #oneMatch and include [matchingOne] with the digit that matched.
Select all records which have one digit in common with record n FROM the temp table WHERE 'digit in common' != [matching]. INSERT all selected digits into #twoMatch and include [matchingOne] AND [matchingTwo]...
Repeat until inserting into #fiveMatch. Delete #fiveMatch from largeTable and move to record n+1
I am having a problem implementing this solution. How can I assign the matching variable depending on the WHERE clause?
-- SELECT all records with ONE matching field:
INSERT INTO #oneMatch (ID_pk, dOne, dTwo, dThree, dFour, dFive, dSix, mOne)
SELECT ID_pk, dOne, dTwo, dThree, dFour, dFive, dSix
FROM dbo.BaseCombinationsExtended
WHERE ( [dOne] IN (@dOne, @dTwo, @dThree, @dFour, @dFive, @dSix) **mOne = dOne?
OR [dTwo] IN (@dOne, @dTwo, @dThree, @dFour, @dFive, @dSix) **mOne = dTwo?
OR [dThree] IN (@dOne, @dTwo, @dThree, @dFour, @dFive, @dSix) **mOne = dThree?
...
OR [dSix] IN (@dOne, @dTwo, @dThree, @dFour, @dFive, @dSix) **mOne = dSix?
)
I am able to 'fake' the above using six queries but that is too inefficient...
Sorry for the long description. Any help would be greatly appreciated (new solution or implementation of my attempt above) as this problem has been nagging at me for a while...
Unless I miss something this should produce the correct result.
declare @T table
(
PK INT identity primary key,
dOne SmallINT,
dTwo SmallINT,
dThree SmallINT,
dFour SmallINT,
dFive SmallINT,
dSix SmallINT
)
insert into @T values
(76, 89, 99, 102, 155, 122),
(11, 89, 99, 102, 155, 122),
(89, 99, 102, 155, 122, 130)
;with q1(PK, d1, d2, d3, d4, d5) as
(
select PK, dTwo, dThree, dFour, dFive, dSix
from @T
union all
select PK, dOne, dThree, dFour, dFive, dSix
from @T
union all
select PK, dOne, dTwo, dFour, dFive, dSix
from @T
union all
select PK, dOne, dTwo, dThree, dFive, dSix
from @T
union all
select PK, dOne, dTwo, dThree, dFour, dSix
from @T
union all
select PK, dOne, dTwo, dThree, dFour, dFive
from @T
),
q2 as
(
select PK,
row_number() over(partition by d1, d2, d3, d4, d5 order by PK) as rn
from q1
),
q3 as
(
select PK
from q2
where rn = 1
group by PK
having count(*) = 6
)
select T.*
from @T as T
inner join q3 as Q
on T.PK = Q.PK
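The logic of the query above (six five-column projections per row, then keep a row only if it has the lowest PK in every one of its projections) can be mirrored in plain Python to check it on small inputs. This is a sketch, not a replacement for the set-based query. The function name `rows_to_keep` is made up, and it assumes the six values within a row are distinct, as the question states.

```python
def rows_to_keep(rows):
    """rows maps PK -> tuple of six values; returns the PKs that survive.

    A row survives only if, for each of its six "drop one column"
    projections, no row with a lower PK has the same projection.
    """
    def projections(values):
        # Drop each position in turn, keeping the remaining five in order.
        return [values[:skip] + values[skip + 1:] for skip in range(6)]

    # Lowest PK seen for every five-value projection (q1 + q2 in the SQL).
    first_owner = {}
    for pk in sorted(rows):
        for key in projections(rows[pk]):
            first_owner.setdefault(key, pk)

    # Keep rows that own all six of their projections (q3 in the SQL).
    return [
        pk
        for pk in sorted(rows)
        if all(first_owner[key] == pk for key in projections(rows[pk]))
    ]


sample = {
    1: (76, 89, 99, 102, 155, 122),
    2: (11, 89, 99, 102, 155, 122),
    3: (89, 99, 102, 155, 122, 130),
}
kept = rows_to_keep(sample)
print(kept)
```

Because each projection is looked up in a dictionary, the work grows with the number of rows times six rather than with the number of row pairs.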
I can't make any promises on performance, but you can try this. The first thing that I do is put the data into a more normalized structure.
CREATE TABLE dbo.Test_Sets_Normalized (my_id INT NOT NULL, c SMALLINT NOT NULL)
GO
INSERT INTO dbo.Test_Sets_Normalized (my_id, c)
SELECT my_id, c1 FROM dbo.Test_Sets UNION ALL
SELECT my_id, c2 FROM dbo.Test_Sets UNION ALL
SELECT my_id, c3 FROM dbo.Test_Sets UNION ALL
SELECT my_id, c4 FROM dbo.Test_Sets UNION ALL
SELECT my_id, c5 FROM dbo.Test_Sets UNION ALL
SELECT my_id, c6 FROM dbo.Test_Sets
GO
SELECT DISTINCT
T2.my_id
FROM
(SELECT DISTINCT my_id FROM dbo.Test_Sets_Normalized) T1
INNER JOIN (SELECT DISTINCT my_id FROM dbo.Test_Sets_Normalized) T2 ON T2.my_id > T1.my_id
WHERE
(
SELECT
COUNT(*)
FROM
dbo.Test_Sets_Normalized T3
INNER JOIN dbo.Test_Sets_Normalized T4 ON
T4.my_id = T2.my_id AND
T4.c = T3.c
WHERE
T3.my_id = T1.my_id) >= 5
That should get you the IDs that you need. Once you've confirmed that it does what you want, you can JOIN back to the original table and delete by IDs.
There's probably an improvement possible somewhere that doesn't require the DISTINCT. I'll give it a little more thought.
Edit - the following approach might be better than N squared performance, depending on the optimizer. If all 5 columns are indexed it should only need 6 index seeks per row, which is still N * logN. It does seem a little dopey though.
You could code generate the where condition based on all the permutations of 5 matches: so the records to delete would be given by:
SELECT * FROM SillyTable ToDelete WHERE EXISTS
(
SELECT PK From SillyTable Duplicate
WHERE ( (
(Duplicate.dOne=ToDelete.dOne)
AND (Duplicate.dTwo=ToDelete.dTwo)
AND (Duplicate.dThree=ToDelete.dThree)
AND (Duplicate.dFour=ToDelete.dFour)
AND (Duplicate.dFive=ToDelete.dFive)
) OR (
(Duplicate.dOne=ToDelete.dTwo)
AND (Duplicate.dTwo=ToDelete.dThree)
AND (Duplicate.dThree=ToDelete.dFour)
AND (Duplicate.dFour=ToDelete.dFive)
AND (Duplicate.dFive=ToDelete.dSix)
) OR (
(Duplicate.dTwo=ToDelete.dOne)
AND (Duplicate.dThree=ToDelete.dTwo)
AND (Duplicate.dFour=ToDelete.dThree)
AND (Duplicate.dFive=ToDelete.dFour)
AND (Duplicate.dSix=ToDelete.dFive)
) OR (
(Duplicate.dTwo=ToDelete.dTwo)
AND (Duplicate.dThree=ToDelete.dThree)
AND (Duplicate.dFour=ToDelete.dFour)
AND (Duplicate.dFive=ToDelete.dFive)
AND (Duplicate.dSix=ToDelete.dSix)
) ...
This goes on to cover all 36 combinations (there is one non-match on each side of the join, out of 6 possible columns, so 6*6 gives you all the possibilities). I would code generate this because it's a lot of typing, and what if you want 4 out of 6 matches tomorrow, but you could hand code it I guess.

Simple Query to Grab Max Value for each ID

OK I have a table like this:
ID Signal Station OwnerID
111 -120 Home 1
111 -130 Car 1
111 -135 Work 2
222 -98 Home 2
222 -95 Work 1
222 -103 Work 2
This is all for the same day. I just need the Query to return the max signal for each ID:
ID Signal Station OwnerID
111 -120 Home 1
222 -95 Work 1
I tried using MAX() and the aggregation messes up with the Station and OwnerID being different for each record. Do I need to do a JOIN?
Something like this? Join your table with itself, and exclude the rows for which a higher signal was found.
select cur.id, cur.signal, cur.station, cur.ownerid
from yourtable cur
where not exists (
select *
from yourtable high
where high.id = cur.id
and high.signal > cur.signal
)
This would list one row for each highest signal, so there might be multiple rows per id.
You are doing a group-wise maximum/minimum operation. This is a common trap: it feels like something that should be easy to do, but in SQL it aggravatingly isn't.
There are a number of approaches (both standard ANSI and vendor-specific) to this problem, most of which are sub-optimal in many situations. Some will give you multiple rows when more than one row shares the same maximum/minimum value; some won't. Some work well on tables with a small number of groups; others are more efficient for a larger number of groups with smaller rows per group.
Here's a discussion of some of the common ones (MySQL-biased but generally applicable). Personally, if I know there are no multiple maxima (or don't care about getting them) I often tend towards the null-left-self-join method, which I'll post as no-one else has yet:
SELECT reading.ID, reading.Signal, reading.Station, reading.OwnerID
FROM readings AS reading
LEFT JOIN readings AS highersignal
ON highersignal.ID=reading.ID AND highersignal.Signal>reading.Signal
WHERE highersignal.ID IS NULL;
In classic SQL-92 (not using the OLAP operations used by Quassnoi), you can use:
SELECT g.ID, g.MaxSignal, t.Station, t.OwnerID
FROM (SELECT id, MAX(Signal) AS MaxSignal
FROM t
GROUP BY id) AS g
JOIN t ON g.id = t.id AND g.MaxSignal = t.Signal;
(Unchecked syntax; assumes your table is 't'.)
The sub-query in the FROM clause identifies the maximum signal value for each id; the join combines that with the corresponding data row from the main table.
NB: if there are several entries for a specific ID that all have the same signal strength and that strength is the MAX(), then you will get several output rows for that ID.
Tested against IBM Informix Dynamic Server 11.50.FC3 running on Solaris 10:
+ CREATE TEMP TABLE signal_info
(
id INTEGER NOT NULL,
signal INTEGER NOT NULL,
station CHAR(5) NOT NULL,
ownerid INTEGER NOT NULL
);
+ INSERT INTO signal_info VALUES(111, -120, 'Home', 1);
+ INSERT INTO signal_info VALUES(111, -130, 'Car' , 1);
+ INSERT INTO signal_info VALUES(111, -135, 'Work', 2);
+ INSERT INTO signal_info VALUES(222, -98 , 'Home', 2);
+ INSERT INTO signal_info VALUES(222, -95 , 'Work', 1);
+ INSERT INTO signal_info VALUES(222, -103, 'Work', 2);
+ SELECT g.ID, g.MaxSignal, t.Station, t.OwnerID
FROM (SELECT id, MAX(Signal) AS MaxSignal
FROM signal_info
GROUP BY id) AS g
JOIN signal_info AS t ON g.id = t.id AND g.MaxSignal = t.Signal;
111 -120 Home 1
222 -95 Work 1
I named the table Signal_Info for this test - but it seems to produce the right answer.
This only shows that there is at least one DBMS that supports the notation. However, I am a little surprised that MS SQL Server does not - which version are you using?
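The tie behaviour mentioned in the NB can be checked with a small sketch using Python's sqlite3 module (SQLite, not Informix or SQL Server). The extra row (222, -95, 'Car', 3) is hypothetical and is added only to create a tie for the maximum.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE signal_info (id INTEGER, signal INTEGER, station TEXT, ownerid INTEGER);
INSERT INTO signal_info VALUES
    (111, -120, 'Home', 1), (111, -130, 'Car',  1), (111, -135, 'Work', 2),
    (222,  -98, 'Home', 2), (222,  -95, 'Work', 1), (222, -103, 'Work', 2);
""")

query = """
SELECT g.id, g.MaxSignal, t.station, t.ownerid
FROM (SELECT id, MAX(signal) AS MaxSignal
      FROM signal_info
      GROUP BY id) AS g
JOIN signal_info AS t ON g.id = t.id AND g.MaxSignal = t.signal
ORDER BY g.id, t.station
"""

rows = conn.execute(query).fetchall()
print(rows)

# Hypothetical second reading with the same maximum for id 222.
conn.execute("INSERT INTO signal_info VALUES (222, -95, 'Car', 3)")
rows_with_tie = conn.execute(query).fetchall()
print(rows_with_tie)
```

With the tie in place, id 222 comes back twice, once per station sharing the maximum.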
It never ceases to surprise me how often SQL questions are submitted without table names.
WITH q AS
(
SELECT c.*, ROW_NUMBER() OVER (PARTITION BY id ORDER BY signal DESC) rn
FROM mytable c
)
SELECT *
FROM q
WHERE rn = 1
This will return one row even if there are duplicates of MAX(signal) for a given ID.
Having an index on (id, signal) will greatly improve this query.
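A sketch of the ROW_NUMBER() approach on SQLite through Python's sqlite3 module, assuming a SQLite build with window-function support (3.25 or later). The tied row (222, -95, 'Car', 3) and the extra `station` sort key are additions for this demo: the tie shows that only one row per id is returned, and the extra key makes the choice between tied rows repeatable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mytable (id INTEGER, signal INTEGER, station TEXT, ownerid INTEGER);
INSERT INTO mytable VALUES
    (111, -120, 'Home', 1), (111, -130, 'Car',  1), (111, -135, 'Work', 2),
    (222,  -98, 'Home', 2), (222,  -95, 'Work', 1), (222, -103, 'Work', 2),
    (222,  -95, 'Car',  3);  -- hypothetical tie for the maximum of id 222
""")

query = """
WITH q AS (
    SELECT c.*,
           ROW_NUMBER() OVER (
               PARTITION BY id
               ORDER BY signal DESC, station  -- station breaks ties (demo choice)
           ) AS rn
    FROM mytable AS c
)
SELECT id, signal, station, ownerid
FROM q
WHERE rn = 1
ORDER BY id
"""

rows = conn.execute(query).fetchall()
print(rows)
```

Without a tie-breaking sort key, which of the tied rows gets rn = 1 is up to the database.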
with tab(id, sig, sta, oid) as
(
select 111 as id, -120 as signal, 'Home' as station, 1 as ownerId union all
select 111, -130, 'Car', 1 union all
select 111, -135, 'Work', 2 union all
select 222, -98, 'Home', 2 union all
select 222, -95, 'Work', 1 union all
select 222, -103, 'Work', 2
) ,
tabG(id, maxS) as
(
select id, max(sig) as sig from tab group by id
)
select g.*, p.* from tabG g
cross apply ( select top(1) * from tab t where t.id=g.id order by t.sig desc ) p
We can do this using a self join
SELECT T1.ID,T1.Signal,T2.Station,T2.OwnerID
FROM (select ID,max(Signal) as Signal from mytable group by ID) T1
LEFT JOIN mytable T2
ON T1.ID=T2.ID and T1.Signal=T2.Signal;
Or you can also use the following query
SELECT t0.ID,t0.Signal,t0.Station,t0.OwnerID
FROM mytable t0
LEFT JOIN mytable t1 ON t0.ID=t1.ID AND t1.Signal>t0.Signal
WHERE t1.ID IS NULL;
select a.id, b.signal, a.station, a.ownerid from
mytable a
join
(SELECT ID, MAX(Signal) as Signal FROM mytable GROUP BY ID) b
on a.id = b.id AND a.Signal = b.Signal
SELECT * FROM StatusTable AS S
WHERE S.Signal = (
SELECT MAX(M.Signal)
FROM StatusTable AS M
WHERE M.ID = S.ID -- compare only against rows with the same ID
);
select
id,
signal,
station,
ownerId
FROM (
select * , rank() over(partition by id order by signal desc) as signal_rank from mytable
) ranked
where signal_rank = 1;