How to use order by and rownum without subselect? - sql

I need to build a query with a order by and rownum but without use a sublect.
It is needed to get the first row of the query ordered.
In other words, I want the result of
select * from (
SELECT CAMP1,ORDERCAMP
FROM TABLENAME
ORDER BY ORDERCAMP) where rownum=1;
but whithout use a subselect. Is it possible?
I have a Oracle 11. You could say this is my whole query:
SELECT T1.CAMP_ID,
T2.CAMP
(SELECT OT.CAMP
FROM OTHERTABLE OT
WHERE OT.FK_TO_TABLE1=T1.CAMP_ID
ORDER BY OT.ORDERCAMP
)
FROM TABLE1 T1,
TABLE2 T2
WHERE T1.FK_TO_T2=T2.PK;
The subquery returns more than one row, and I cant use another subquery like
SELECT T1.CAMP_ID,
T2.CAMP
(SELECT *
FROM
(SELECT OT.CAMP
FROM OTHERTABLE OT
WHERE OT.FK_TO_TABLE1=T1.CAMP_ID
ORDER BY OT.ORDERCAMP
)
WHERE ROWNUM=1
)
FROM TABLE1 T1,
TABLE2 T2
WHERE T1.FK_TO_T2=T2.PK;
SELECT CAMP1,ORDERCAMP FROM TABLE2 ORDER BY ORDERCAMP
Because the T1.CAMP_ID is an invalid identifier in the third level subquery.
I hope I have explained myself enough.

Your current query (without the invalid ORDER BY) gets ORA-01427: single-row subquery returns more than one row. You can nest subqueries, but you can only refer back one level when joining; so if you did:
SELECT T1.CAMP_ID, T2.CAMP,
(SELECT CAMP FROM
FROM
(SELECT OT.CAMP
FROM OTHERTABLE OT
WHERE OT.FK_TO_TABLE1=T1.CAMP_ID
ORDER BY OT.ORDERCAMP
)
WHERE ROWNUM = 1)
FROM TABLE1 T1, TABLE2 T2 WHERE T1.FK_TO_T2=T2.PK;
... then you would get ORA-00904: "T1"."CAMP_ID": invalid identifier. Hence your question, presumably.
What you could do instead is join to the third table, and use the analytic ROW_NUMBER() function to assign the row number, and then use an outer select wrapped around the whole thing to only find the records with the lowest ORDERCAMP:
SELECT CAMP_ID, CAMP, OT_CAMP
FROM (
SELECT T1.CAMP_ID, T2.CAMP, OT.CAMP AS OT_CAMP,
ROW_NUMBER() OVER (PARTITION BY T1.CAMP_ID ORDER BY OT.ORDERCAMP) AS RN
FROM TABLE2 T2
JOIN TABLE1 T1 ON T1.FK_TO_T2=T2.PK
JOIN OTHERTABLE OT ON OT.FK_TO_TABLE1=T1.CAMP_ID
)
WHERE RN = 1;
The ROW_NUMBER() can partition on the T1.CAMP_ID primary key value, or anything else that is unique.
SQL Fiddle demo, including the inner query run on its own so you can see the RN numbers assigned before the outer filter is applied.
Another approach is to use the aggregate KEEP DENSE_RANK FIRST function
SELECT T1.CAMP_ID, T2.CAMP,
MAX(OT.CAMP) KEEP (DENSE_RANK FIRST ORDER BY OT.ORDERCAMP) AS OT_CAMP
FROM TABLE2 T2
JOIN TABLE1 T1 ON T1.FK_TO_T2=T2.PK
JOIN OTHERTABLE OT ON OT.FK_TO_TABLE1=T1.CAMP_ID
GROUP BY T1.CAMP_ID, T2.CAMP;
Which is a bit shorter and doesn't need an inner query. I'm not sure if there's any real advantage of one over the other.
SQL Fiddle demo.

In the most recent version of Oracle, you can do:
SELECT CAMP1, ORDERCAMP
FROM TABLENAME
ORDER BY ORDERCAMP
FETCH FIRST 1 ROWS ONLY;
Otherwise, I think you need a subquery of some sort.

You could use LIMIT or SELECT TOP 1
SELECT CAMP1, ORDERCAMP FROM TABLENAME ORDER BY ORDERCAMP LIMIT 1

Related

SQL MAX funtion where not all atributes are in the group by

So my current problem is that I have two tables that look like this:
table1(name, num_patient, quant, inst)
table2(inst_name, num_region)
Where I want to find the patient with max quantity per region.
I first had the idea of doing something like this:
SELECT num_region, num_patient, MAX(quant)
FROM
(SELECT num_patient, quant, num_region
FROM table1
INNER JOIN table2
ON table1.inst = table2.inst_name) AS joined_tables
GROUP BY num_region;
But this doesn't work since either num_patient has to be on the GROUP BY (and this way it doesn't return the max value by region anymore) or I have to remove it from the SELECT (also doesn't work because I need the name of each patient). I have tried to fix my issue with a WHERE quant = MAX() statement but couldn't get it to work. Is there any workaround to this?
Use DISTINCT ON:
SELECT DISTINCT ON (num_region), num_patient, quant, num_region
FROM table1 t1 JOIN
table2 t2
ON t1.inst = t2.inst_name
ORDER BY num_region, quant DESC;
DISTINCT ON is a convenient Postgres extension. It returns one row per keys specified in the SELECT, based on the ordering in the ORDER BY.
Being an extension, not all databases support this functionality -- even databases derived from Postgres. The traditional method would use ROW_NUMBER():
SELECT t.*
FROM (SELECT num_patient, quant, num_region,
ROW_NUMBER() OVER (PARTITION BY num_region ORDER BY quant DESC) as seqnum
FROM table1 t1 JOIN
table2 t2
ON t1.inst = t2.inst_name
) t
WHERE seqnum = 1;
This is a duplicate of the DISTINCT ON question I linked.
SELECT distinct on (num_region) num_patient, quant, num_region
FROM table1
INNER JOIN table2
ON table1.inst = table2.inst_name
ORDER BY num_region, quant desc

SQL Server query showing most recent distinct data

I am trying to build a SQL query to recover only the most young record of a table (it has a Timestamp column already) where the item by which I want to filter appears several times, as shown in my table example:
.
Basically, I have a table1 with Id, Millis, fkName and Price, and a table2 with Id and Name.
In table1, items can appear several times with the same fkName.
What I need to achieve is building up a single query where I can list the last record for every fkName, so that I can get the most actual price for every item.
What I have tried so far is a query with
SELECT DISTINCT [table1].[Millis], [table2].[Name], [table1].[Price]
FROM [table1]
JOIN [table2] ON [table2].[Id] = [table1].[fkName]
ORDER BY [table2].[Name]
But I don't get the correct listing.
Any advice on this? Thanks in advance,
A simple and portable approach to this greatest-n-per-group problem is to filter with a subquery:
select t1.millis, t2.name, t1.price
from table1 t1
inner join table2 t2 on t2.id = t1.fkName
where t1.millis = (select max(t11.millis) from table1 t11 where t11.fkName = t1.fkName)
order by t1.millis desc
using Common Table Expression:
;with [LastPrice] as (
select [Millis], [Price], ROW_NUMBER() over (Partition by [fkName] order by [Millis] desc) rn
from [table1]
)
SELECT DISTINCT [LastPrice].[Millis],[table2].[Name],[LastPrice].[Price]
FROM [LastPrice]
JOIN [table2] ON [table2].[Id] = [LastPrice].[fkName]
WHERE [LastPrice].rn = 1
ORDER BY [table2].[Name]

How to run the subquery with the non equality clause on Spark?

I have this query and I want to execute it on spark
SELECT A.PFR,
A.MFR,
A.MST,
(SELECT COUNT(*)
FROM Table1 T2
WHERE T1.PFR = T2.PFR
AND T1.MFR = T2.MFR
AND T1.MST >= T2.MST) AS RANK
FROM Table1 A
But spark didn't support subquery with non equality clause
I get this error
The correlated scalar subquery can only contain equality predicates
So I tried to use group by but I didn't get the correct results (I have the input and the out result)
SELECT A.PFR,
A.MFR,
A.MST,
B.countRank
FROM Table1 A
LEFT OUTER JOIN
(SELECT PFR,
MFR,
MST,
COUNT(MFR) countRank
FROM Table1 B
GROUP BY PFR,
MFR,
MST) B ON B.PFR = A.PFR
There are a method to convert this query to a join query.
Thanks in advance.
Just use rank():
SELECT A.PFR, A.MFR, A.MST,
RANK() OVER (PARTITION BY PFR, MFR
ORDER BY MST DESC
) as rank
FROM Table1 A;
If rank() doesn't do exactly what you want, then perhaps row_number() or dense_rank() work.

Different way of writing this SQL query with partition

Hi I have the below query in Teradata. I have a row number partition and from that I want rows with rn=1. Teradata doesn't let me use the row number as a filter in the same query. I know that I can put the below into a subquery with a where rn=1 and it gives me what I need. But the below snippet needs to go into a larger query and I want to simplify it if possible.
Is there a different way of doing this so I get a table with 2 columns - one row per customer with the corresponding fc_id for the latest eff_to_dt?
select cust_grp_id, fc_id, row_number() over (partition by cust_grp_id order by eff_to_dt desc) as rn
from table1
Have you considered using the QUALIFY clause in your query?
SELECT cust_grp_id
, fc_id
FROM table1
QUALIFY ROW_NUMBER()
OVER (PARTITION BY cust_grp_id
ORDER BY eff_to_dt desc)
= 1;
Calculate MAX eff_to_dt for each cust_grp_id and then join result to main table.
SELECT T1.cust_grp_id,
T1.fc_id,
T1.eff_to_dt
FROM Table1 AS T1
JOIN
(SELECT cust_grp_id,
MAX(eff_to_dt) AS max_eff_to_dt
FROM Table
GROUP BY cust_grp_id) AS T2 ON T2.cust_grp_id = T1.cust_grp_id
AND T2.max_eff_to_dt = T1.eff_to_dt
You can use a pair of JOINs to accomplish the same thing:
INNER JOIN My_Table T1 ON <some criteria>
LEFT OUTER JOIN My_Table T2 ON <some criteria> AND T2.eff_to_date > T1.eff_to_date
WHERE
T2.my_id IS NULL
You'll need to sort out the specific criteria for your larger query, but this is effectively JOINing all of the rows (T1), but then excluding any where a later row exists. In the WHERE clause you eliminate these by checking for a NULL value in a column that is NOT NULL (in this case I just assumed some ID value). The only way that would happen is if the LEFT OUTER JOIN on T2 failed to find a match - i.e. no rows later than the one that you want exist.
Also, whether or not the JOIN to T1 is LEFT OUTER or INNER is up to your specific requirements.

entry cannot be referenced in this part of the query (subquery) Error

I'm getting the following error on my query:
here is an entry for table "table1", but it cannot be referenced from this part of the query.
This is my query:
SELECT id
FROM property_import_image_results table1
LEFT JOIN (
SELECT created_at
FROM property_import_image_results
WHERE external_url = table1.external_url
ORDER BY created_at DESC NULLS LAST
LIMIT 1
) as table2 ON (pimr.created_at = table2.created_at)
WHERE table2.created_at is NULL
You need a lateral join to be able to reference the outer table in the sub-select for the join.
You are also referencing an alias pimr in the join condition, which isn't available anywhere in the query. So you need to change that to table1 in the join condition.
You should also given the table in the inner query an alias to avoid confusion:
SELECT id
FROM property_import_image_results table1
LEFT JOIN LATERAL (
SELECT p2.created_at
FROM property_import_image_results p2
WHERE p2.external_url = table1.external_url
ORDER BY p2.created_at DESC NULLS LAST
LIMIT 1
) as table2 ON (table1.created_at = table2.created_at)
WHERE table2.created_at is NULL
Edit
This kind of query can also be solved using window functions:
select id
from (
select id,
max(created_at) over (partition by external_url) as max_created
FROM property_import_image_results
) t
where created_at <> max_created;
This might be faster than aggregating and joining as you do. But it's hard to tell. The lateral joins are quite efficient as well. It has the advantage that you can add any column you like to the result because no grouping is required.