How to pivot two rows into two columns - sql

I have the following SQL Query:
select
distinct
Equipment_Reserved.Equipment_Attached_To,
Equipment.Name
from
Equipment,
Studies,
Equipment_Reserved
where
Studies.Study = 'MAINT19-01'
and
Equipment.idEquipment = Equipment_Reserved.Equipment_idEquipment
and
Studies.idStudies = Equipment_Reserved.Studies_idStudies
and
Equipment.Type = 'Probe'
This query produces the following results:
Equipment_Attached_To Name
2297 R1-P1
2297 R1-P2
2299 R1-P3
I would like to change it to the following:
Equipment_Attached_To Name1 Name2
2297 R1-P1 R1-P2
2299 R1-P3 NULL
Thanks for your help!

I'd first change your query from the old, legacy JOIN syntax to an explicit join as it makes the query easier to understand:
SELECT
DISTINCT
Equipment_Reserved.Equipment_Attached_To,
Equipment.Name
FROM
Equipment
INNER JOIN Equipment_Reserved ON Equipment_Reserved.Equipment_idEquipment = Equipment.idEquipment
INNER JOIN Studies ON Studies.idStudies = Equipment_Reserved.Studies_idStudies
WHERE
Studies.Study = 'MAINT19-01'
AND
Equipment.Type = 'Probe'
I don't think you actually need a PIVOT - I think you can do this with a nested query with the ROW_NUMBER function. I've seen that PIVOT queries often have worse query execution plans than nested-queries.
Let's add ROW_NUMBER (which require an ORDER BY as it's a windowing-function) and a matching ORDER BY in the whole query to make it consistent). Let's also use PARTITION BY so it resets the row-number for each Equipment_Attached_To value:
SELECT
DISTINCT
Equipment_Reserved.Equipment_Attached_To,
Equipment.Name,
ROW_NUMBER() OVER (PARTITION BY Equipment_Attached_To ORDER BY [Name]) AS RowNumber
FROM
Equipment
INNER JOIN Equipment_Reserved ON Equipment_Reserved.Equipment_idEquipment = Equipment.idEquipment
INNER JOIN Studies ON Studies.idStudies = Equipment_Reserved.Studies_idStudies
WHERE
Studies.Study = 'MAINT19-01'
AND
Equipment.Type = 'Probe'
ORDER BY
Equipment_Attached_To,
[Name]
This will give output like this:
Equipment_Attached_To Name RowNumber
2297 R1-P1 1
2297 R1-P2 2
2299 R1-P3 1
This can then be split out into explicit columns like so below. The use of MAX() is arbitrary (we could use MIN() instead) and only because we're dealing with a GROUP BY and because the CASE WHEN... restricts the input set to just 1 row anyway.
SELECT
Equipment_Attached_To,
MAX( CASE WHEN RowNumber = 1 THEN [Name] END ) AS Name1,
MAX( CASE WHEN RowNumber = 2 THEN [Name] END ) AS Name2
FROM
(
-- the query from above
)
GROUP BY
Equipment_Attached_To
ORDER BY
Equipment_Attached_To,
Name1,
Name2
So the final query is:
SELECT
Equipment_Attached_To,
MAX( CASE WHEN RowNumber = 1 THEN [Name] END ) AS Name1,
MAX( CASE WHEN RowNumber = 2 THEN [Name] END ) AS Name2
FROM
(
SELECT
DISTINCT
Equipment_Reserved.Equipment_Attached_To,
Equipment.Name,
ROW_NUMBER() OVER (PARTITION BY Equipment_Attached_To ORDER BY [Name]) AS RowNumber
FROM
Equipment
INNER JOIN Equipment_Reserved ON Equipment_Reserved.Equipment_idEquipment = Equipment.idEquipment
INNER JOIN Studies ON Studies.idStudies = Equipment_Reserved.Studies_idStudies
WHERE
Studies.Study = 'MAINT19-01'
AND
Equipment.Type = 'Probe'
)
GROUP BY
Equipment_Attached_To
ORDER BY
Equipment_Attached_To,
Name1,
Name2

Let's start with some basics.
To facilitate reading the code, I added alias to the tables using their initials.
Then, I converted the old join syntax which is partly deprecated to use the standard syntax since 1992 (27 years and people still use the old syntax).
Finally, since there are only 2 possible values, we can use MIN and MAX to separate them in 2 columns.
And because we're using aggregate functions, we remove the DISTINCT and use GROUP BY
The code now looks like this:
SELECT er.Equipment_Attached_To,
--Gets the first row for the id
MIN( e.Name) AS Name1,
--If the MAX is equal to the MIN, returns a NULL. If not, it returns the second value.
NULLIF( MAX(e.Name), MIN( e.Name)) AS Name2
FROM Equipment e
JOIN Studies s ON s.idStudies = er.Studies_idStudies
JOIN Equipment_Reserved er ON e.idEquipment = er.Equipment_idEquipment
WHERE s.Study = 'MAINT19-01'
AND e.Type = 'Probe'
GROUP BY er.Equipment_Attached_To;

Related

Find NON-duplicate column value combinations

This is for a migration script.
CompanyTable:
EmployeeId
DivisionId
abc
div1
def
div1
abc
div1
abc
div2
xyz
div2
In the below code I am Selecting duplicate EmployeeId-DivisionId combinations, that is, the records that have the same EmployeeId and DivisionId will be selected. So from the above table, the two rows that have abc-div1 combination will be selected by the below code.
How can I invert it? It seems so simple but I can't figure it out. I tried replacing with HAVING count(*) = 0 instead of > 1, I've tried fiddling with the equality signs in the ON and AND lines. Basically from the above table, I want to select the other three rows that don't have the abc-div1 combination. If there is a way to select all the unique EmployeeID-DivisionId combinations, let me know.
SELECT a.EmployeeID, a.DivisionId FROM CompanyTable a
JOIN ( SELECT EmployeeID, DivisionId
FROM CompanyTable
GROUP BY EmployeeID, DivisionId
HAVING count(*) > 1 ) b
ON a.EmployeeID = b.EmployeeID
AND a.DivisionId = b.DivisionId;
EmployeeId and DivisionId are both nvarchar(50) columns.
A windowed count would seem a suitable method:
select employeeid, divisionid
from (
select *, Count(*) over(partition by employeeid, divisionid) ct
from t
)t
where ct = 1;
As already mentioned, you must replace > 1 by its real opposite <= 1, this works: db<>fiddle
First, let's try rewriting your query using a common table expression (CTE), instead of a subquery:
WITH cteCompanyTableStats as (
SELECT
EmployeeID, DivisionId,
HasDuplicates = CASE WHEN count(*) > 1 THEN1 ELSE 0 END
FROM CompanyTable
GROUP BY EmployeeID, DivisionId
)
SELECT ct.*
FROM CompanyTable ct
inner join cteCompanyTableStats cts on
ct.EmployeeId = cts.EmployeeId
and ct.DivisionId = cts.DivisionId
and cts.HasDuplicates = 1
Notice how I've removed the HAVING clause & added a new HasDuplicates column? We're going to use that new column to find all of the table rows that -DON'T- have duplicates:
WITH cteCompanyTableStats as (
SELECT
EmployeeID, DivisionId,
HasDuplicates = CASE WHEN count(*) > 1 THEN1 ELSE 0 END
FROM CompanyTable
GROUP BY EmployeeID, DivisionId
)
SELECT ct.*
FROM CompanyTable ct
inner join cteCompanyTableStats cts on
ct.EmployeeId = cts.EmployeeId
and ct.DivisionId = cts.DivisionId
and cts.HasDuplicates = 0
The only character of SQL code that changed between the two queries was the last line, where and cts.HasDuplicates = ### is set.

SQL ROW_NUMBER OVER syntax

I have this:
SELECT ROW_NUMBER() OVER (ORDER BY vwmain.ch) as RowNumber,
vwmain.vehicleref,vwmain.capid,
vwmain.manufacturer,vwmain.model,vwmain.derivative,
vwmain.isspecial,
vwmain.created,vwmain.updated,vwmain.stocklevel,
vwmain.[type],
vwmain.ch,vwmain.co2,vwmain.mpg,vwmain.term,vwmain.milespa
FROM vwMain_LATEST vwmain
INNER JOIN HomepageFeatured
on vwMain.vehicleref = homepageFeatured.vehicleref
WHERE homepagefeatured.siteskinid = 1
AND homepagefeatured.Rotator = 1
AND RowNumber = 1
ORDER BY homepagefeatured.orderby
It fails on "Invalid column name RowNumber"
Not sure how to prefix it to access it?
Thanks
You can't reference the field like that. You can however use a subquery or a common-table-expression:
Here's a subquery:
SELECT *
FROM (
SELECT ROW_NUMBER() OVER (ORDER BY vwmain.ch) as RowNumber,
vwmain.vehicleref,vwmain.capid,
vwmain.manufacturer,vwmain.model,vwmain.derivative,
vwmain.isspecial,
vwmain.created,vwmain.updated,vwmain.stocklevel,
vwmain.[type],
vwmain.ch,vwmain.co2,vwmain.mpg,vwmain.term,vwmain.milespa,
homepagefeatured.orderby
FROM vwMain_LATEST vwmain
INNER JOIN HomepageFeatured on vwMain.vehicleref = homepageFeatured.vehicleref
WHERE homepagefeatured.siteskinid = 1
AND homepagefeatured.Rotator = 1
) T
WHERE RowNumber = 1
ORDER BY orderby
Rereading your query, since you aren't partitioning by any fields, the order by at the end is useless (it contradicts the order of the ranking function). You're probably better off using top 1...
Using top:
SELECT top 1 vwmain.vehicleref,vwmain.capid,
vwmain.manufacturer,vwmain.model,vwmain.derivative,
vwmain.isspecial,
vwmain.created,vwmain.updated,vwmain.stocklevel,
vwmain.[type],
vwmain.ch,vwmain.co2,vwmain.mpg,vwmain.term,vwmain.milespa,
homepagefeatured.orderby
FROM vwMain_LATEST vwmain
INNER JOIN HomepageFeatured on vwMain.vehicleref = homepageFeatured.vehicleref
WHERE homepagefeatured.siteskinid = 1
AND homepagefeatured.Rotator = 1
ORDER BY homepagefeatured.orderby
(EDIT: This answer is a nearby pitfall. I leave it for documentation.)
Have a look here: Referring to a Column Alias in a WHERE Clause
This is the same situation.
It is a matter of how the sql query is parsed/compiled internally, so your field alias names are not known at the time the where clause is interpreted. Therefore you might try, in reference to the example above:
SELECT ROW_NUMBER() OVER (ORDER BY vwmain.ch) as RowNumber,
vwmain.vehicleref,vwmain.capid, vwmain.manufacturer,vwmain.model,vwmain.derivative, vwmain.isspecial,vwmain.created,vwmain.updated,vwmain.stocklevel, vwmain.[type],
vwmain.ch,vwmain.co2,vwmain.mpg,vwmain.term,vwmain.milespa
FROM vwMain_LATEST vwmain
INNER JOIN HomepageFeatured on vwMain.vehicleref = homepageFeatured.vehicleref
WHERE homepagefeatured.siteskinid = 1
AND homepagefeatured.Rotator = 1
AND ROW_NUMBER() OVER (ORDER BY vwmain.ch) = 1
ORDER BY homepagefeatured.orderby
Thus you see your expression in the select statement is exactly reused in the where clause.

Remove duplicates in SQL Result set of ONE table

Afternoon/Evening all,
I'm looking for the final touches to the below query. I need to remove the duplicate occurrences of a column in a particular row. Currently using the below SQL:
SELECT CBNEW.*
FROM CallbackNewID CBNEW
INNER JOIN (SELECT IDNEW, MAX(CallbackDate) AS MaxDate
FROM CallbackNewID
GROUP BY IDNEW) AS groupedCBNEW
ON (CBNEW.CallbackDate = groupedCBNEW.MaxDate) AND (CBNEW.IDNEW = groupedCBNEW.IDNEW);
My result set looks like the below
ID RecID Comp Rem Date_ IDNEW IDOLD CB? CallbackDate
138618 83209 1 0 2012-03-16 12:40:00 83209 83209 2 16-Mar-12
138619 83209 1 0 2012-03-16 12:40:00 83209 83209 2 16-Mar-12
110470 83799 1 0 2011-07-27 11:46:00 83799 83799 10 27-Jul-11
110471 83799 1 0 2011-07-27 11:46:00 83799 83799 10 27-Jul-11
This however gives me duplicate values in the CallBackDate and IDNEW Column because in the table there are some different Primary Keys with the same IDNEW and CallbackDate values.
If I dump this result into Excel, I can just use remove duplicates on the first ID column, and the problem's solved.
But what I want to do is make sure my result only includes the FIRST instance of the ID column, where IDNEW and CallbackDate are duplicated.
I'm sure I just need to append a tiny piece of SQL, but I'm stuck if I can find the answer so far.
Your help is very much appreciated.
Try adding MIN(ID) to the inner query and then adding it also on the ON clause:
SELECT CBNEW.*
FROM CallbackNewID CBNEW
INNER JOIN (SELECT IDNEW, MIN(ID) AS MinId, MAX(CallbackDate) AS MaxDate
FROM CallbackNewID
GROUP BY IDNEW) AS groupedCBNEW
ON (CBNEW.CallbackDate = groupedCBNEW.MaxDate)
AND (CBNEW.IDNEW = groupedCBNEW.IDNEW)
AND (CBNEW.ID = groupedCBNEW.MinId) ;
sqlfiddle demo
Here is a rather "brute force" approach. It just takes the results of your original query and does Min() on [ID], Max() on [Comp] and [Rem], and GROUP BY on everything else:
SELECT
Min(t.ID) AS MinOfID,
t.RecID,
Max(t.Comp) AS MaxOfComp,
Max(t.Rem) AS MaxOfRem,
t.Date_,
t.IDNEW,
t.IDOLD,
t.[CB?],
t.CallbackDate
FROM
(
SELECT CBNEW.*
FROM
CallbackNewID CBNEW
INNER JOIN
(
SELECT IDNEW, MAX(CallbackDate) AS MaxDate
FROM CallbackNewID
GROUP BY IDNEW
) AS groupedCBNEW
ON (CBNEW.CallbackDate = groupedCBNEW.MaxDate)
AND (CBNEW.IDNEW = groupedCBNEW.IDNEW)
) t
GROUP BY
t.RecID,
t.Date_,
t.IDNEW,
t.IDOLD,
t.[CB?],
t.CallbackDate;
It might not be terribly elegant, but if it works....
In MS SQL Server, I think you are looking for the ROW_NUMBER() function.
Something like this should help you get what you are looking for:
SELECT
X.*
FROM
(
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY DBNEW.IDNEW, DBNEW.MaxDate) [row_num]
FROM
CallbackNewID CBNEW
INNER JOIN
(
SELECT
IDNEW,
MAX(CallbackDate) AS MaxDate
FROM
CallbackNewID
GROUP BY
IDNEW
) AS groupedCBNEW ON (CBNEW.CallbackDate = groupedCBNEW.MaxDate) AND (CBNEW.IDNEW = groupedCBNEW.IDNEW)
) X
WHERE
X.row_num = 1
SELECT
A.*
FROM
(SELECT
*,
ROW_NUMBER() OVER (PARTITION BY IDNEW ORDER BY CallbackDate DESC)
AS [row_num]
FROM CallbackNewID
) A
WHERE
A.row_num = 1

Fastest way to check if the the most recent result for a patient has a certain value

Mssql < 2005
I have a complex database with lots of tables, but for now only the patient table and the measurements table matter.
What I need is the number of patient where the most recent value of 'code' matches a certain value. Also, datemeasurement has to be after '2012-04-01'. I have fixed this in two different ways:
SELECT
COUNT(P.patid)
FROM T_Patients P
WHERE P.patid IN (SELECT patid
FROM T_Measurements M WHERE (M.code ='xxxx' AND result= 'xx')
AND datemeasurement =
(SELECT MAX(datemeasurement) FROM T_Measurements
WHERE datemeasurement > '2012-01-04' AND patid = M.patid
GROUP BY patid
GROUP by patid)
AND:
SELECT
COUNT(P.patid)
FROM T_Patient P
WHERE 1 = (SELECT TOP 1 case when result = 'xx' then 1 else 0 end
FROM T_Measurements M
WHERE (M.code ='xxxx') AND datemeasurement > '2012-01-04' AND patid = P.patid
ORDER by datemeasurement DESC
)
This works just fine, but it makes the query incredibly slow because it has to join the outer table on the subquery (if you know what I mean). The query takes 10 seconds without the most recent check, and 3 minutes with the most recent check.
I'm pretty sure this can be done a lot more efficient, so please enlighten me if you will :).
I tried implementing HAVING datemeasurment=MAX(datemeasurement) but that keeps throwing errors at me.
So my approach would be to write a query just getting all the last patient results since 01-04-2012, and then filtering that for your codes and results. So something like
select
count(1)
from
T_Measurements M
inner join (
SELECT PATID, MAX(datemeasurement) as lastMeasuredDate from
T_Measurements M
where datemeasurement > '01-04-2012'
group by patID
) lastMeasurements
on lastMeasurements.lastmeasuredDate = M.datemeasurement
and lastMeasurements.PatID = M.PatID
where
M.Code = 'Xxxx' and M.result = 'XX'
The fastest way may be to use row_number():
SELECT COUNT(m.patid)
from (select m.*,
ROW_NUMBER() over (partition by patid order by datemeasurement desc) as seqnum
FROM T_Measurements m
where datemeasurement > '2012-01-04'
) m
where seqnum = 1 and code = 'XXX' and result = 'xx'
Row_number() enumerates the records for each patient, so the most recent gets a value of 1. The result is just a selection.

SQL Server : convert sub select query to join

I have 2 two tables questionpool and question where question is a many to one of question pool. I have created a query using a sub select query which returns the correct random results but I need to return more than one column from the question table.
The intent of the query is to return a random test from the 'question' table for each 'QuizID' from the 'Question Pool' table.
SELECT QuestionPool.QuestionPoolID,
(
SELECT TOP (1) Question.QuestionPoolID
FROM Question
WHERE Question.GroupID = QuestionPool.QuestionPoolID
ORDER BY NEWID()
)
FROM QuestionPool
WHERE QuestionPool.QuizID = '5'
OUTER APPLY is suited to this:
Select *
FROM QuestionPool
OUTER APPLY
(
SELECT TOP 1 *
FROM Question
WHERE Question.GroupID = QuestionPool.QuestionPoolID
ORDER BY NEWID()
) x
WHERE QuestionPool.QuizID = '5'
Another example of OUTER APPLY use http://www.ienablemuch.com/2012/04/outer-apply-walkthrough.html
Live test: http://www.sqlfiddle.com/#!3/d8afc/1
create table m(i int, o varchar(10));
insert into m values
(1,'alpha'),(2,'beta'),(3,'delta');
create table x(i int, j varchar, k varchar(10));
insert into x values
(1,'a','hello'),
(1,'b','howdy'),
(2,'x','great'),
(2,'y','super'),
(3,'i','uber'),
(3,'j','neat'),
(3,'a','nice');
select m.*, '' as sep, r.*
from m
outer apply
(
select top 1 *
from x
where i = m.i
order by newid()
) r
Not familiar with SQL server, but I hope this would do:
Select QuestionPool.QuestionPoolID, v.QuestionPoolID, v.xxx -- etc
FROM QuestionPool
JOIN
(
SELECT TOP (1) *
FROM Question
WHERE Question.GroupID = QuestionPool.QuestionPoolID
ORDER BY NEWID()
) AS v ON v.QuestionPoolID = QuestionPool.QuestionPoolID
WHERE QuestionPool.QuizID = '5'
Your query appears to be bringing back an arbitrary Question.QuestionPoolId for each QuestionPool.QuestionPoolId subject to the QuizId filter.
I think the following query does this:
select qp.QuestionPoolId, max(q.QuestionPoolId) as any_QuestionPoolId
from Question q join
qp.QuestionPoolId qp
on q.GroupId = qp.QuestionPoolId
WHERE QuestionPool.QuizID = '5'
group by qp.QuestionPoolId
This returns a particular question.
The following query would allow you to get more fields:
select qp.QuestionPoolId, q.*
from (select q.*, row_number() over (partition by GroupId order by (select NULL)) as randrownum
from Question q
) join
(select qp.QuestionPoolId, max(QuetionPool qp
on q.GroupId = qp.QuestionPoolId
WHERE QuestionPool.QuizID = '5' and
randrownum = 1
This uses the row_number() to arbitrarily enumerate the rows. The "Select NULL" provides the random ordering (alternatively, you could use "order by GroupId".
Common Table Expressions (CTEs) are rather handy for this type of thing...
http://msdn.microsoft.com/en-us/library/ms175972(v=sql.90).aspx