SQL Server : convert sub select query to join


I have two tables, QuestionPool and Question, where Question is many-to-one with QuestionPool. I have written a query with a sub-select that returns the correct random results, but I need to return more than one column from the Question table.
The intent of the query is to return a random question from the Question table for each QuestionPool row belonging to a given QuizID.
SELECT QuestionPool.QuestionPoolID,
(
SELECT TOP (1) Question.QuestionPoolID
FROM Question
WHERE Question.GroupID = QuestionPool.QuestionPoolID
ORDER BY NEWID()
)
FROM QuestionPool
WHERE QuestionPool.QuizID = '5'

OUTER APPLY is suited to this:
Select *
FROM QuestionPool
OUTER APPLY
(
SELECT TOP 1 *
FROM Question
WHERE Question.GroupID = QuestionPool.QuestionPoolID
ORDER BY NEWID()
) x
WHERE QuestionPool.QuizID = '5'
Another example of OUTER APPLY use http://www.ienablemuch.com/2012/04/outer-apply-walkthrough.html
Live test: http://www.sqlfiddle.com/#!3/d8afc/1
create table m(i int, o varchar(10));
insert into m values
(1,'alpha'),(2,'beta'),(3,'delta');
create table x(i int, j varchar, k varchar(10));
insert into x values
(1,'a','hello'),
(1,'b','howdy'),
(2,'x','great'),
(2,'y','super'),
(3,'i','uber'),
(3,'j','neat'),
(3,'a','nice');
select m.*, '' as sep, r.*
from m
outer apply
(
select top 1 *
from x
where i = m.i
order by newid()
) r

Not familiar with SQL server, but I hope this would do:
Select QuestionPool.QuestionPoolID, v.QuestionPoolID, v.xxx -- etc
FROM QuestionPool
JOIN
(
SELECT TOP (1) *
FROM Question
WHERE Question.GroupID = QuestionPool.QuestionPoolID
ORDER BY NEWID()
) AS v ON v.QuestionPoolID = QuestionPool.QuestionPoolID
WHERE QuestionPool.QuizID = '5'

Your query appears to be bringing back an arbitrary Question.QuestionPoolId for each QuestionPool.QuestionPoolId subject to the QuizId filter.
I think the following query does this:
select qp.QuestionPoolId, max(q.QuestionPoolId) as any_QuestionPoolId
from Question q join
QuestionPool qp
on q.GroupId = qp.QuestionPoolId
where qp.QuizID = '5'
group by qp.QuestionPoolId
This returns a particular question.
The following query would allow you to get more fields:
select qp.QuestionPoolId, q.*
from (select q.*, row_number() over (partition by GroupId order by (select NULL)) as randrownum
from Question q
) q join
QuestionPool qp
on q.GroupId = qp.QuestionPoolId
where qp.QuizID = '5' and
randrownum = 1
This uses row_number() to enumerate the rows within each group. The "select NULL" makes the ordering arbitrary rather than genuinely random (alternatively, you could use "order by GroupId"); to pick a different row on each run, order by newid() instead.

Common Table Expressions (CTEs) are rather handy for this type of thing...
http://msdn.microsoft.com/en-us/library/ms175972(v=sql.90).aspx
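
As a rough illustration of the CTE idea, here is a minimal sketch run through Python's sqlite3 module rather than SQL Server. The QuestionID and QuestionText columns and all sample rows are invented for the demo, SQLite's random() stands in for NEWID(), and window functions need SQLite 3.25 or later. On SQL Server the same shape would order by NEWID() instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE QuestionPool (QuestionPoolID INTEGER, QuizID TEXT);
    -- QuestionID and QuestionText are hypothetical columns, added only for the demo
    CREATE TABLE Question (QuestionID INTEGER, GroupID INTEGER, QuestionText TEXT);
    INSERT INTO QuestionPool VALUES (1, '5'), (2, '5'), (3, '9');
    INSERT INTO Question VALUES
        (10, 1, 'q10'), (11, 1, 'q11'), (20, 2, 'q20'), (30, 3, 'q30');
""")

# The CTE numbers the questions of each pool in random order;
# the outer query keeps only the row numbered 1 for each pool.
sql = """
WITH ranked AS (
    SELECT q.*,
           ROW_NUMBER() OVER (PARTITION BY q.GroupID ORDER BY random()) AS rn
    FROM Question q
)
SELECT qp.QuestionPoolID, r.QuestionID, r.QuestionText
FROM QuestionPool qp
LEFT JOIN ranked r ON r.GroupID = qp.QuestionPoolID AND r.rn = 1
WHERE qp.QuizID = '5'
ORDER BY qp.QuestionPoolID
"""
rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
```

Every column of the chosen Question row is available through the CTE, which is what the scalar sub-select in the question could not provide.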

Related

Apache Phoenix SQL Join Limitation when using sub-queries

I have this query in Apache Phoenix SQL:
select WO.* from (
select "nr_id", "txt_commrcial_label"
from "e_application" APP
where "txt_commrcial_label" in ('a','b')
and "nr_id" not in (select "nr_ap_id"
from "e_workorder"
where "nr_id" in ('888'))
and "epochtimestampchanged" = (select max("epochtimestampchanged")
from "e_application"
where "nr_id" = APP."nr_id") ) as APP2,
--
(select Y.ID as WO_ID, Y."nr_id" as WO_nr_id, Y."nr_ap_id" as WO_nr_ap_id
from ( select "nr_id", max("epochtimestampchanged") as max_epochtimestampchanged
from "e_workorder"
where CAST(TO_NUMBER("epochtimestampchanged") AS TIMESTAMP) < TO_TIMESTAMP('2020-10-21 19:22:20.0')
group by "nr_id" ) as X, "e_workorder" as Y
where Y."nr_id" = X."nr_id"
and Y."epochtimestampchanged" < X.max_epochtimestampchanged ) as WO
--
where APP2."nr_id" = WO.WO_nr_ap_id;
I get a java language illegal ... error for this not overly complex statement, and I cannot find the reason here or in the manuals.
Each sub-query works on its own (imagine the ( and , are not there), but the statement fails once the two sub-queries are combined in a join.
Do I need to persist the results to tables and then JOIN? Or is there a way around this? I have the impression this is too complex in terms of sub-queries.

For others to note: this is a bug, and a different SQL approach is needed. Below is a workaround, with a note from Cloudera:
The best workaround is to explicitly define a join in the APP2 query.
See the APP_MAX_TIMESTAMP table joined with the APP table, defining
basically the same condition as in the original query (but using a
table join instead of an inner select):
The query that should work and should do the same as the original
query:
select
WO.*
from
(
select
"nr_id",
"txt_commrcial_label"
from
"e_application" APP
LEFT JOIN (
select
max("epochtimestampchanged") as max_app_timestamp,
"nr_id" as max_app_timestamp_nr_id
from
"e_application"
group by "nr_id"
) APP_MAX_TIMESTAMP
ON APP_MAX_TIMESTAMP.max_app_timestamp_nr_id = APP."nr_id"
where
"txt_commrcial_label" in
( list
)
and "nr_id" not in
(
select
"nr_ap_id"
from
"e_workorder"
where
"nr_id" in
(
'888'
)
)
and "epochtimestampchanged" = max_app_timestamp
)
as APP2,
(
select
Y.ID as WO_ID,
Y."nr_id" as WO_nr_id,
Y."nr_ap_id" as WO_nr_ap_id
from
(
select
"nr_id",
max("epochtimestampchanged") as max_epochtimestampchanged
from
"e_workorder"
where
CAST(TO_NUMBER("epochtimestampchanged") AS TIMESTAMP) < TO_TIMESTAMP('2022-10-10 19:22:20.0')
group by
"nr_id"
)
as X,
"e_workorder" as Y
where
Y."nr_id" = X."nr_id"
and Y."epochtimestampchanged" < X.max_epochtimestampchanged
)
as WO
where
APP2."nr_id" = WO.WO_nr_ap_id;
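
The core move in that workaround, replacing a correlated max() sub-select with a join against a grouped derived table, can be checked on a toy table. This is only a sketch run on SQLite through Python, not on Phoenix, so it shows that the two forms return the same rows and says nothing about how Phoenix handles either one. The table is cut down to two columns and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- reduced, made-up version of e_application
    CREATE TABLE e_application (nr_id TEXT, epochtimestampchanged INTEGER);
    INSERT INTO e_application VALUES
        ('a1', 100), ('a1', 200), ('a2', 150), ('a2', 120);
""")

# Original shape: correlated sub-select for the latest timestamp per nr_id.
correlated = """
SELECT app.nr_id, app.epochtimestampchanged
FROM e_application app
WHERE app.epochtimestampchanged = (SELECT max(epochtimestampchanged)
                                   FROM e_application
                                   WHERE nr_id = app.nr_id)
ORDER BY app.nr_id
"""

# Workaround shape: join against a derived table grouped by nr_id.
joined = """
SELECT app.nr_id, app.epochtimestampchanged
FROM e_application app
LEFT JOIN (SELECT nr_id AS max_app_timestamp_nr_id,
                  max(epochtimestampchanged) AS max_app_timestamp
           FROM e_application
           GROUP BY nr_id) app_max_timestamp
  ON app_max_timestamp.max_app_timestamp_nr_id = app.nr_id
WHERE app.epochtimestampchanged = app_max_timestamp.max_app_timestamp
ORDER BY app.nr_id
"""

correlated_rows = conn.execute(correlated).fetchall()
joined_rows = conn.execute(joined).fetchall()
print(correlated_rows)
print(joined_rows)
```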

ORACLE SQL Pivot Issue

I am trying to pivot a sql result. I need to do this all in the one query. The below is telling me invalid identifier for header_id. I am using an Oracle database.
Code
Select * From (
select ppd.group_id,g.group_name, ct.type_desc,ht.hos_cat_descr
from item_history ih, item ci, contract ppd,
header ch, group g, cd_std_type ct, cd_hos h,
cd_std_hospital_cat ht
where ih.item_id = ci.item_id
and ih.header_id = ch.header_id
and ci.hos_id = h.hos_id
and ih.item_id = ci.item_id
and ch.user_no = ppd.user_no
and ppd.group_id = g.group_id
and ch.header_type = ct.header_type_id
and ci.hos_id = h.hos_id
and h.cat_id = ht.cat_id
)
Pivot
(
count(distinct header_id) as Volume
For hos_cat_descr IN ('A')
)

Your inner query doesn't have header_id in its projection, so the pivot clause doesn't have that column available to use. You need to add it, either as:
Select * From (
select ppd.group_id,g.group_name, ct.type_desc,ht.hos_cat_descr,ih.header_id
---------------------------------------------------------------^^^^^^^^^^^^^
from ...
)
Pivot
(
count(distinct header_id) as Volume
For hos_cat_descr IN ('A')
)
or:
Select * From (
select ppd.group_id,g.group_name, ct.type_desc,ht.hos_cat_descr,ch.header_id
---------------------------------------------------------------^^^^^^^^^^^^^
from ...
)
Pivot
(
count(distinct header_id) as Volume
For hos_cat_descr IN ('A')
)
It doesn't really matter which, since those two values must be equal as they are part of a join condition.
You could achieve the same thing with simpler aggregation instead of a pivot, but presumably you are doing more work in the pivot really.
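
The "simpler aggregation" mentioned above could be conditional aggregation. The sketch below runs on SQLite through Python rather than Oracle; joined_rows is a hypothetical single table standing in for the multi-table inner query, the data is invented, and a_volume is an illustrative column alias.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- joined_rows is a hypothetical stand-in for the joined inner query
    CREATE TABLE joined_rows (group_id INTEGER, group_name TEXT, type_desc TEXT,
                              hos_cat_descr TEXT, header_id INTEGER);
    INSERT INTO joined_rows VALUES
        (1, 'g1', 't1', 'A', 100),
        (1, 'g1', 't1', 'A', 100),
        (1, 'g1', 't1', 'A', 101),
        (1, 'g1', 't1', 'B', 102),
        (2, 'g2', 't1', 'B', 103);
""")

# The CASE yields header_id only for category 'A' and NULL otherwise;
# COUNT(DISTINCT ...) ignores the NULLs.
sql = """
SELECT group_id, group_name, type_desc,
       COUNT(DISTINCT CASE WHEN hos_cat_descr = 'A' THEN header_id END) AS a_volume
FROM joined_rows
GROUP BY group_id, group_name, type_desc
ORDER BY group_id
"""
rows = conn.execute(sql).fetchall()
print(rows)  # [(1, 'g1', 't1', 2), (2, 'g2', 't1', 0)]
```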

How to pivot two rows into two columns

I have the following SQL Query:
select
distinct
Equipment_Reserved.Equipment_Attached_To,
Equipment.Name
from
Equipment,
Studies,
Equipment_Reserved
where
Studies.Study = 'MAINT19-01'
and
Equipment.idEquipment = Equipment_Reserved.Equipment_idEquipment
and
Studies.idStudies = Equipment_Reserved.Studies_idStudies
and
Equipment.Type = 'Probe'
This query produces the following results:
Equipment_Attached_To Name
2297 R1-P1
2297 R1-P2
2299 R1-P3
I would like to change it to the following:
Equipment_Attached_To Name1 Name2
2297 R1-P1 R1-P2
2299 R1-P3 NULL
Thanks for your help!

I'd first change your query from the old, legacy JOIN syntax to an explicit join as it makes the query easier to understand:
SELECT
DISTINCT
Equipment_Reserved.Equipment_Attached_To,
Equipment.Name
FROM
Equipment
INNER JOIN Equipment_Reserved ON Equipment_Reserved.Equipment_idEquipment = Equipment.idEquipment
INNER JOIN Studies ON Studies.idStudies = Equipment_Reserved.Studies_idStudies
WHERE
Studies.Study = 'MAINT19-01'
AND
Equipment.Type = 'Probe'
I don't think you actually need a PIVOT - I think you can do this with a nested query with the ROW_NUMBER function. I've seen that PIVOT queries often have worse query execution plans than nested-queries.
Let's add ROW_NUMBER (which requires an ORDER BY, as it's a windowing function) and a matching ORDER BY on the whole query to make it consistent. Let's also use PARTITION BY so that the row number resets for each Equipment_Attached_To value:
SELECT
DISTINCT
Equipment_Reserved.Equipment_Attached_To,
Equipment.Name,
ROW_NUMBER() OVER (PARTITION BY Equipment_Attached_To ORDER BY [Name]) AS RowNumber
FROM
Equipment
INNER JOIN Equipment_Reserved ON Equipment_Reserved.Equipment_idEquipment = Equipment.idEquipment
INNER JOIN Studies ON Studies.idStudies = Equipment_Reserved.Studies_idStudies
WHERE
Studies.Study = 'MAINT19-01'
AND
Equipment.Type = 'Probe'
ORDER BY
Equipment_Attached_To,
[Name]
This will give output like this:
Equipment_Attached_To Name RowNumber
2297 R1-P1 1
2297 R1-P2 2
2299 R1-P3 1
This can then be split out into explicit columns as shown below. MAX() is needed only because the query uses GROUP BY; the choice is arbitrary (MIN() would work equally well), since each CASE WHEN ... expression yields at most one non-NULL value per group.
SELECT
Equipment_Attached_To,
MAX( CASE WHEN RowNumber = 1 THEN [Name] END ) AS Name1,
MAX( CASE WHEN RowNumber = 2 THEN [Name] END ) AS Name2
FROM
(
-- the query from above
) AS t
GROUP BY
Equipment_Attached_To
ORDER BY
Equipment_Attached_To,
Name1,
Name2
So the final query is:
SELECT
Equipment_Attached_To,
MAX( CASE WHEN RowNumber = 1 THEN [Name] END ) AS Name1,
MAX( CASE WHEN RowNumber = 2 THEN [Name] END ) AS Name2
FROM
(
SELECT
DISTINCT
Equipment_Reserved.Equipment_Attached_To,
Equipment.Name,
ROW_NUMBER() OVER (PARTITION BY Equipment_Attached_To ORDER BY [Name]) AS RowNumber
FROM
Equipment
INNER JOIN Equipment_Reserved ON Equipment_Reserved.Equipment_idEquipment = Equipment.idEquipment
INNER JOIN Studies ON Studies.idStudies = Equipment_Reserved.Studies_idStudies
WHERE
Studies.Study = 'MAINT19-01'
AND
Equipment.Type = 'Probe'
) AS t
GROUP BY
Equipment_Attached_To
ORDER BY
Equipment_Attached_To,
Name1,
Name2
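
To see the row-number pivot produce the requested output, here is a small sketch using SQLite through Python (window functions need SQLite 3.25 or later). The probes table is a hypothetical stand-in for the joined and filtered result set, loaded with the three rows from the question; [Name] is written without brackets for SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- probes stands in for the joined, filtered, de-duplicated result set
    CREATE TABLE probes (Equipment_Attached_To INTEGER, Name TEXT);
    INSERT INTO probes VALUES (2297, 'R1-P1'), (2297, 'R1-P2'), (2299, 'R1-P3');
""")

sql = """
SELECT Equipment_Attached_To,
       MAX(CASE WHEN RowNumber = 1 THEN Name END) AS Name1,
       MAX(CASE WHEN RowNumber = 2 THEN Name END) AS Name2
FROM (
    SELECT Equipment_Attached_To, Name,
           ROW_NUMBER() OVER (PARTITION BY Equipment_Attached_To
                              ORDER BY Name) AS RowNumber
    FROM probes
) AS numbered
GROUP BY Equipment_Attached_To
ORDER BY Equipment_Attached_To
"""
rows = conn.execute(sql).fetchall()
print(rows)  # [(2297, 'R1-P1', 'R1-P2'), (2299, 'R1-P3', None)]
```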

Let's start with some basics.
To make the code easier to read, I added aliases to the tables using their initials.
Then I converted the old join syntax, which is partly deprecated, to the explicit JOIN syntax that has been standard since 1992 (27 years and people still use the old syntax).
Finally, since there are at most 2 values per group, we can use MIN and MAX to separate them into 2 columns.
And because we're using aggregate functions, we remove the DISTINCT and use GROUP BY.
The code now looks like this:
SELECT er.Equipment_Attached_To,
--Gets the first row for the id
MIN( e.Name) AS Name1,
--If the MAX is equal to the MIN, returns a NULL. If not, it returns the second value.
NULLIF( MAX(e.Name), MIN( e.Name)) AS Name2
FROM Equipment e
JOIN Equipment_Reserved er ON e.idEquipment = er.Equipment_idEquipment
JOIN Studies s ON s.idStudies = er.Studies_idStudies
WHERE s.Study = 'MAINT19-01'
AND e.Type = 'Probe'
GROUP BY er.Equipment_Attached_To;
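
A quick check of the MIN/MAX idea, again as a sketch on SQLite through Python with a hypothetical probes table in place of the joined tables. The rows for 2297 and 2299 come from the question; group 2300 is invented to show what happens when a group holds three names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE probes (Equipment_Attached_To INTEGER, Name TEXT);
    INSERT INTO probes VALUES
        (2297, 'R1-P1'), (2297, 'R1-P2'), (2299, 'R1-P3'),
        -- 2300 is an invented group with three names
        (2300, 'R2-P1'), (2300, 'R2-P2'), (2300, 'R2-P3');
""")

sql = """
SELECT Equipment_Attached_To,
       MIN(Name) AS Name1,
       NULLIF(MAX(Name), MIN(Name)) AS Name2
FROM probes
GROUP BY Equipment_Attached_To
ORDER BY Equipment_Attached_To
"""
rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
# (2297, 'R1-P1', 'R1-P2')
# (2299, 'R1-P3', None)
# (2300, 'R2-P1', 'R2-P3')   <- 'R2-P2' does not appear
```

The approach matches the requested output for groups of one or two names; with three or more, only the smallest and largest names survive.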

error incorporating a select within a IFNULL in MariaDB

I'm creating a view in MariaDB and I'm having trouble making it work for a couple of fields. Currently this is working:
( SELECT DISTINCT IFNULL(grades.`grade`,'No Grade')
FROM `table` grades
WHERE userinfo.`id` = grades.`id`
AND grades.`Item Name` = 'SOMEINFO'
) 'SOMENAME',
But I need to add a select where the 'No Grade' is, in the following form:
( SELECT DISTINCT IFNULL( grades.`grade`,
SELECT IF( EXISTS
( SELECT *
FROM `another_table`
WHERE userid = 365
AND courseid = 2
), 'Enrolled', 'Not enrolled'
)
)
FROM `table` grades
WHERE userinfo.`id` = grades.`id`
AND grades.`Item Name` = 'SOMEINFO'
) 'SOMENAME',
I know that
SELECT IF( EXISTS( SELECT *
FROM `another_table`
WHERE userid = 365
AND courseid = 2
),
'Enrolled', 'Not enrolled'
)
works too, but now the whole thing is giving me an error, so any suggestions would be greatly appreciated.
Thanks

This looks like a subquery:
(SELECT DISTINCT IFNULL(grades.`grade`,
SELECT IF( EXISTS (SELECT *
FROM `another_table`
WHERE userid = 365 AND courseid = 2
), 'Enrolled', 'Not enrolled'
)
)
FROM `table` grades
WHERE userinfo.`id` = grades.`id` AND
grades.`Item Name` = 'SOMEINFO'
) as SOMENAME,
You are using a subquery that returns two columns in a position where a scalar subquery is expected. A scalar subquery returns one column in at most one row.
Unfortunately, there is no easy way to do what you want in MySQL, because of the restrictions on views. I would advise you to rewrite the logic so the exists is handled using a left join in the from clause.
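
One possible reading of the left join suggestion is sketched below on SQLite through Python, not MariaDB. Several things here are assumptions: the enrolment check is tied to each user's id instead of the hard-coded 365, the `Item Name` column is renamed item_name, (userid, courseid) is assumed unique in another_table, and all sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE userinfo (id INTEGER);
    -- item_name replaces the `Item Name` column of the question
    CREATE TABLE grades (id INTEGER, item_name TEXT, grade TEXT);
    CREATE TABLE another_table (userid INTEGER, courseid INTEGER);
    INSERT INTO userinfo VALUES (365), (366), (367);
    INSERT INTO grades VALUES
        (365, 'SOMEINFO', NULL), (366, 'SOMEINFO', 'B'), (367, 'SOMEINFO', NULL);
    INSERT INTO another_table VALUES (365, 2);
""")

# The enrolment lookup moves into the FROM clause as a LEFT JOIN;
# a missing match leaves e.userid NULL, which the CASE turns into 'Not enrolled'.
sql = """
SELECT u.id,
       COALESCE(g.grade,
                CASE WHEN e.userid IS NOT NULL
                     THEN 'Enrolled' ELSE 'Not enrolled' END) AS SOMENAME
FROM userinfo u
LEFT JOIN grades g ON g.id = u.id AND g.item_name = 'SOMEINFO'
LEFT JOIN another_table e ON e.userid = u.id AND e.courseid = 2
ORDER BY u.id
"""
rows = conn.execute(sql).fetchall()
print(rows)  # [(365, 'Enrolled'), (366, 'B'), (367, 'Not enrolled')]
```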

Reuse subquery result in WHERE-Clause for INSERT

I am using Microsoft SQL Server 2008.
I would like to save the result of a subquery to reuse it in a following subquery.
Is this possible?
What is the best practice for this? (I am very new to SQL.)
My query looks like:
INSERT INTO [dbo].[TestTable]
(
[a]
,[b]
)
SELECT
(
SELECT TOP 1 MAT_WS_ID
FROM #TempTableX AS X_ALIAS
WHERE OUTERBASETABLE.LT_ALL_MATERIAL = X_ALIAS.MAT_RM_NAME
)
,(
SELECT TOP 1 MAT_WS_NAME
FROM #TempTableY AS Y_ALIAS
WHERE Y_ALIAS.MAT_WS_ID = MAT_WS_ID
--(
--SELECT TOP 1 MAT_WS_ID
--FROM #TempTableX AS X_ALIAS
--WHERE OUTERBASETABLE.LT_ALL_MATERIAL = X_ALIAS.MAT_RM_NAME
--)
)
FROM [dbo].[LASERTECHNO] AS OUTERBASETABLE
My question is:
Is what I did correct?
I replaced the second SELECT statement in the WHERE clause for [b] (which is commented out and exactly the same as for [a]) with the result of the first SELECT statement of [a] (= MAT_WS_ID).
It seems to give the right results.
But I don't understand why!
I mean, MAT_WS_ID is part of both temporary tables, X_ALIAS and Y_ALIAS.
So in the SELECT statement for [b], in the scope of the [b] select query, MAT_WS_ID could only be known from the Y_ALIAS table. (Or am I wrong? I am more of a C++ person; maybe scoping in SQL and C++ is totally different.)
I just want to know what the best way is in SQL Server to reuse a scalar select result.
Or should I not care, copy the select for every column, and let SQL Server optimize it on its own?

One approach would be outer apply:
SELECT mat.MAT_WS_ID
, (
SELECT TOP 1 MAT_WS_NAME
FROM #TempTableY AS Y_ALIAS
WHERE Y_ALIAS.MAT_WS_ID = mat.MAT_WS_ID
)
FROM [dbo].[LASERTECHNO] AS OUTERBASETABLE
OUTER APPLY
(
SELECT TOP 1 MAT_WS_ID
FROM #TempTableX AS X_ALIAS
WHERE OUTERBASETABLE.LT_ALL_MATERIAL = X_ALIAS.MAT_RM_NAME
) as mat

You could rank rows in #TempTableX and #TempTableY partitioning them by MAT_RM_NAME in the former and by MAT_WS_ID in the latter, then use normal joins with filtering by rownum = 1 in both tables (rownum being the column containing the ranking numbers in each of the two tables):
WITH x_ranked AS (
SELECT
*,
rownum = ROW_NUMBER() OVER (PARTITION BY MAT_RM_NAME ORDER BY (SELECT 1))
FROM #TempTableX
),
y_ranked AS (
SELECT
*,
rownum = ROW_NUMBER() OVER (PARTITION BY MAT_WS_ID ORDER BY (SELECT 1))
FROM #TempTableY
)
INSERT INTO dbo.TestTable (a, b)
SELECT
x.MAT_WS_ID,
y.MAT_WS_NAME
FROM dbo.LASERTECHNO t
LEFT JOIN x_ranked x ON t.LT_ALL_MATERIAL = x.MAT_RM_NAME AND x.rownum = 1
LEFT JOIN y_ranked y ON x.MAT_WS_ID = y.MAT_WS_ID AND y.rownum = 1
;
The ORDER BY (SELECT 1) bit is a trick to specify an indeterminate ordering, which, accordingly, would result in indeterminate rownum = 1 rows picked by the query. That is to more or less duplicate your TOP 1 without an explicit order, but I would recommend you to specify a more sensible ORDER BY clause to make the results more predictable.
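
Following that recommendation, here is a sketch of the ranked-join insert with explicit ORDER BY columns, run on SQLite through Python (3.25 or later for window functions). Ordinary tables replace the # temp tables, the T-SQL `rownum = ...` alias form becomes `AS rownum`, and both the sample rows and the chosen sort columns are assumptions for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE LASERTECHNO (LT_ALL_MATERIAL TEXT);
    CREATE TABLE TempTableX (MAT_RM_NAME TEXT, MAT_WS_ID INTEGER);
    CREATE TABLE TempTableY (MAT_WS_ID INTEGER, MAT_WS_NAME TEXT);
    CREATE TABLE TestTable (a INTEGER, b TEXT);
    INSERT INTO LASERTECHNO VALUES ('steel'), ('brass'), ('wood');
    INSERT INTO TempTableX VALUES ('steel', 2), ('steel', 1), ('brass', 3);
    INSERT INTO TempTableY VALUES (1, 'one-b'), (1, 'one-a'), (3, 'three');

    -- Deterministic ordering: lowest MAT_WS_ID per material,
    -- alphabetically first MAT_WS_NAME per id.
    WITH x_ranked AS (
        SELECT *, ROW_NUMBER() OVER (PARTITION BY MAT_RM_NAME
                                     ORDER BY MAT_WS_ID) AS rownum
        FROM TempTableX
    ),
    y_ranked AS (
        SELECT *, ROW_NUMBER() OVER (PARTITION BY MAT_WS_ID
                                     ORDER BY MAT_WS_NAME) AS rownum
        FROM TempTableY
    )
    INSERT INTO TestTable (a, b)
    SELECT x.MAT_WS_ID, y.MAT_WS_NAME
    FROM LASERTECHNO t
    LEFT JOIN x_ranked x ON t.LT_ALL_MATERIAL = x.MAT_RM_NAME AND x.rownum = 1
    LEFT JOIN y_ranked y ON x.MAT_WS_ID = y.MAT_WS_ID AND y.rownum = 1;
""")

rows = conn.execute("SELECT a, b FROM TestTable ORDER BY a IS NULL, a").fetchall()
print(rows)  # [(1, 'one-a'), (3, 'three'), (None, None)]
```

The id picked from the first table is reused by the second join, so the lookup is written only once.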
