SQL Server ROW_NUMBER Left Join + when you don't know column names

I'm writing a page that builds a query (for non-database users), runs it, and returns the results to them.
I am using row_number to handle custom pagination.
How do I do a left join and a row_number in a subquery when I don't know the specific columns I need to return? I tried to use * but I get this error:
The column '' was specified multiple times
Here is the query I tried:
SELECT * FROM
(SELECT ROW_NUMBER() OVER (ORDER BY Test) AS ROW_NUMBER, *
FROM table1 a
LEFT JOIN table2 b
ON a.ID = b.ID) x
WHERE ROW_NUMBER BETWEEN 1 AND 50

Your query is going to fail in SQL Server regardless of the row_number() call. The * returns all columns, including a.id and b.id. These both have the same name. This is fine for a query, but for a subquery, all columns need distinct names.
You can use row_number() for an arbitrary ordering by using a "subquery with constant" in the order by clause:
SELECT * FROM
(SELECT ROW_NUMBER() OVER (ORDER BY (select NULL)) AS ROW_NUMBER, *
FROM table1 a
LEFT JOIN table2 b
ON a.ID = b.ID) x
WHERE ROW_NUMBER BETWEEN 1 AND 50 ;
This removes the dependency on the underlying column name (assuming none are named ROW_NUMBER).

Try this sql. It should work.
SELECT * FROM
(SELECT ROW_NUMBER() OVER (ORDER BY a.Test) AS ROW_NUMBER, a.*,b.*
FROM table1 a
LEFT JOIN table2 b
ON a.ID = b.ID) x
WHERE ROW_NUMBER BETWEEN 1 AND 50
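
Note that both rewrites above still expand * (or a.*, b.*) to two columns named ID, and SQL Server rejects a derived table whose columns are not uniquely named. Since the page generates the query anyway, one option is to have it generate an explicit, aliased column list. The following is a minimal Python sketch of that idea, not a definitive implementation. It assumes the page has already looked up each table's column names (for example from INFORMATION_SCHEMA.COLUMNS); the helper names quote_name and build_paged_query, the column Other, and the rn alias are hypothetical.

```python
def quote_name(name):
    """Bracket-quote an identifier, doubling any closing bracket (like T-SQL QUOTENAME)."""
    return "[" + name.replace("]", "]]") + "]"


def build_paged_query(left, left_cols, right, right_cols, key, order_col, first, last):
    """Build a paginated LEFT JOIN whose output columns all have distinct names."""
    # Prefix every output column with its table alias so a.ID and b.ID no longer clash.
    select_list = [
        f"{alias}.{quote_name(col)} AS {quote_name(alias + '_' + col)}"
        for alias, cols in (("a", left_cols), ("b", right_cols))
        for col in cols
    ]
    return (
        "SELECT * FROM (\n"
        f"  SELECT ROW_NUMBER() OVER (ORDER BY a.{quote_name(order_col)}) AS [rn],\n"
        "         " + ",\n         ".join(select_list) + "\n"
        f"  FROM {quote_name(left)} a\n"
        f"  LEFT JOIN {quote_name(right)} b ON a.{quote_name(key)} = b.{quote_name(key)}\n"
        ") x\n"
        f"WHERE rn BETWEEN {int(first)} AND {int(last)}"
    )


# Hypothetical column lists; in practice they would come from the database metadata.
sql = build_paged_query("table1", ["ID", "Test"], "table2", ["ID", "Other"], "ID", "Test", 1, 50)
print(sql)
```

Because every generated alias starts with a_ or b_, none of them can collide with the rn column used for paging.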

Related

Oracle 11G DB : Result from 'clob' type column in view changed while using the view in a where clause

I have the following query that I'm running in Oracle:
WITH viewa
AS (SELECT c.columna
FROM sometable c
LEFT JOIN othertable u
ON ( c.id = u.id )
WHERE c.id = '111'
ORDER BY c.created_date)
SELECT columna
FROM (SELECT rownum AS row_num,
t.*
FROM viewa t)
WHERE row_num > (SELECT CASE
WHEN ( Count(*) > 100 ) THEN Count(*) - 100
ELSE 0
END AS num
FROM viewa)
The idea is to always get the first 100 rows.
As you can see, I'm creating a view at the beginning and using it twice:
in the FROM and in the WHERE.
I'm doing that so I don't need to run the first select twice, and it also makes the query more readable.
Note that columna is of type CLOB.
When I run the same query with other column types, it works,
so it's probably something related to the CLOB column.
The weird thing is that the results I'm getting are empty values even though I have values in the DB.
When I remove the subselect in the WHERE, I get the right result:
WITH viewa
AS (SELECT c.columna
FROM sometable c
LEFT JOIN othertable u
ON ( c.id = u.id )
WHERE c.id = '111'
ORDER BY c.created_date)
SELECT columna
FROM (SELECT rownum AS row_num,
t.*
FROM viewa t)
WHERE row_num > 0
It seems like Oracle is turning the values of the CLOB column "columnA" into null when the view is used in the WHERE.
Is anyone familiar with this?
Is there a way around it?
I solved it with a different query, but I would still like to know whether Oracle changes the view while fetching from it.
Thank you.
Without sample data this is hard but I'm guessing the reason is you're depending on rownum. Use the FETCH clause instead to limit the number of rows.
WITH viewa
AS (SELECT c.columna
FROM sometable c
LEFT JOIN othertable u ON ( c.id = u.id )
-- an order by clause should go here
FETCH FIRST 100 ROWS ONLY)
SELECT columna
FROM viewa
But you don't need that CTE at all, just do
SELECT c.columna
FROM sometable c
LEFT JOIN othertable u ON ( c.id = u.id )
-- an order by clause should go here
FETCH FIRST 100 ROWS ONLY
Note that the "first" rows are not guaranteed to be a specific set of rows unless you explicitly add an ORDER BY clause.
Since 11g does not have FETCH FIRST, you can just use ROWNUM as the limiting criterion. See Example at Oracle Live
select columna, created_date
from (
select c.columna, c.created_date
from sometable c
left join othertable u
on ( c.id = u.id )
where c.id = '111'
order by c.created_date
)
where rownum <= 10;

How to run the subquery with the non equality clause on Spark?

I have this query and I want to execute it on Spark:
SELECT A.PFR,
A.MFR,
A.MST,
(SELECT COUNT(*)
FROM Table1 T2
WHERE A.PFR = T2.PFR
AND A.MFR = T2.MFR
AND A.MST >= T2.MST) AS RANK
FROM Table1 A
But Spark doesn't support subqueries with non-equality predicates.
I get this error:
The correlated scalar subquery can only contain equality predicates
So I tried to use GROUP BY, but I didn't get the correct results (I have the input and the expected output):
SELECT A.PFR,
A.MFR,
A.MST,
B.countRank
FROM Table1 A
LEFT OUTER JOIN
(SELECT PFR,
MFR,
MST,
COUNT(MFR) countRank
FROM Table1 B
GROUP BY PFR,
MFR,
MST) B ON B.PFR = A.PFR
Is there a way to convert this query to a join query?
Thanks in advance.
Just use rank():
SELECT A.PFR, A.MFR, A.MST,
RANK() OVER (PARTITION BY PFR, MFR
ORDER BY MST DESC
) as rank
FROM Table1 A;
If rank() doesn't do exactly what you want, then perhaps row_number() or dense_rank() work.
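
The correlated subquery in the question counts the rows in the same (PFR, MFR) group whose MST is less than or equal to the current row's MST. A windowed COUNT(*) ordered by MST ascending appears to reproduce that count exactly, ties included, because the default window frame with an ORDER BY runs from the start of the partition through the current row and its peers. The sketch below checks this with Python's built-in sqlite3 module as a stand-in for Spark; it assumes a SQLite build with window functions (3.25 or later) and uses made-up sample rows, and Spark SQL is assumed to apply the same default frame.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (PFR TEXT, MFR TEXT, MST INTEGER)")
# Made-up sample data, including a tie on MST inside one (PFR, MFR) group.
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?, ?)",
    [("p1", "m1", 10), ("p1", "m1", 20), ("p1", "m1", 20), ("p1", "m2", 5), ("p2", "m1", 7)],
)

# The original correlated scalar subquery with the non-equality predicate.
correlated = conn.execute("""
    SELECT A.PFR, A.MFR, A.MST,
           (SELECT COUNT(*)
            FROM Table1 T2
            WHERE A.PFR = T2.PFR AND A.MFR = T2.MFR AND A.MST >= T2.MST) AS rnk
    FROM Table1 A
    ORDER BY 1, 2, 3, 4
""").fetchall()

# Window-function rewrite: count the rows up to and including the current row's peers.
windowed = conn.execute("""
    SELECT PFR, MFR, MST,
           COUNT(*) OVER (PARTITION BY PFR, MFR ORDER BY MST) AS rnk
    FROM Table1
    ORDER BY 1, 2, 3, 4
""").fetchall()

print(correlated == windowed)
```

On tied MST values this count differs from RANK(), which gives tied rows the lowest position in the tie rather than the highest.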

Only one expression can be specified in the select list when the subquery is not introduced with EXISTS. in subquery sqlserver

I want to execute this query in my database. Tables A and B have a one-to-many relationship, but I need only the latest record in B. Here is my query:
select *,(select top 1 ResultTest ,ResultState2 from B where GasReceptionId=A.Id order by Id desc)
from A where OrganizationGasId= 4212
But I get this error:
Msg 116, Level 16, State 1, Line 2
Only one expression can be specified in the select list when the subquery is not introduced with EXISTS.
You can rephrase this query as a basic join which uses an analytic function (e.g. row number) to identify the correct row's data from B to include with each record coming from the A table.
SELECT *
FROM
(
SELECT a.*, b.ResultTest, b.ResultState2,
ROW_NUMBER() OVER (PARTITION BY a.Id ORDER BY b.Id DESC) rn
FROM A a
LEFT JOIN B b
ON a.Id = b.GasReceptionId
WHERE
a.OrganizationGasId = 4212
) t
WHERE t.rn = 1;
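
For rn = 1 to pick the latest B record, the window has to be ordered by B's own Id within each A row; ordering by the partition key would leave the choice arbitrary. The following sketch illustrates this with Python's built-in sqlite3 module as a stand-in for SQL Server, assuming a SQLite build with window functions (3.25 or later). The sample rows are made up, and the schema is reduced to the columns named in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE A (Id INTEGER PRIMARY KEY, OrganizationGasId INTEGER);
CREATE TABLE B (Id INTEGER PRIMARY KEY, GasReceptionId INTEGER,
                ResultTest TEXT, ResultState2 TEXT);
-- A.Id = 1 has two B rows, A.Id = 2 has none, A.Id = 3 is another organization.
INSERT INTO A VALUES (1, 4212), (2, 4212), (3, 9999);
INSERT INTO B VALUES (10, 1, 'old', 'x'), (11, 1, 'new', 'y'), (12, 3, 'other', 'z');
""")

rows = conn.execute("""
    SELECT Id, ResultTest, ResultState2
    FROM (
        SELECT a.Id, b.ResultTest, b.ResultState2,
               -- Number B rows per A row, newest B.Id first.
               ROW_NUMBER() OVER (PARTITION BY a.Id ORDER BY b.Id DESC) AS rn
        FROM A a
        LEFT JOIN B b ON a.Id = b.GasReceptionId
        WHERE a.OrganizationGasId = 4212
    ) t
    WHERE rn = 1
    ORDER BY Id
""").fetchall()

print(rows)  # [(1, 'new', 'y'), (2, None, None)]
```

The LEFT JOIN keeps A rows that have no B record at all; they come back with NULLs in the B columns.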
A subquery in the SELECT clause must return exactly one column (and one or zero rows). So you can either have two subqueries:
select
a.*,
(select top 1 resulttest from b where gasreceptionid = a.id order by id desc) as test,
(select top 1 resultstate2 from b where gasreceptionid = a.id order by id desc) as state
from a
where a.organizationgasid = 4212;
Or, much better, move the subquery to the FROM clause. One way is OUTER APPLY:
select
a.*, r.resulttest, r.resultstate2
from a
outer apply
(
select top 1 resulttest, resultstate2
from b
where gasreceptionid = a.id
order by id desc
) r
where a.organizationgasid = 4212;

How to use order by and rownum without subselect?

I need to build a query with an ORDER BY and ROWNUM but without using a subselect.
I need it to get the first row of the ordered query.
In other words, I want the result of
select * from (
SELECT CAMP1,ORDERCAMP
FROM TABLENAME
ORDER BY ORDERCAMP) where rownum=1;
but without using a subselect. Is it possible?
I have Oracle 11. You could say this is my whole query:
SELECT T1.CAMP_ID,
T2.CAMP,
(SELECT OT.CAMP
FROM OTHERTABLE OT
WHERE OT.FK_TO_TABLE1=T1.CAMP_ID
ORDER BY OT.ORDERCAMP
)
FROM TABLE1 T1,
TABLE2 T2
WHERE T1.FK_TO_T2=T2.PK;
The subquery returns more than one row, and I can't use another subquery like
SELECT T1.CAMP_ID,
T2.CAMP,
(SELECT *
FROM
(SELECT OT.CAMP
FROM OTHERTABLE OT
WHERE OT.FK_TO_TABLE1=T1.CAMP_ID
ORDER BY OT.ORDERCAMP
)
WHERE ROWNUM=1
)
FROM TABLE1 T1,
TABLE2 T2
WHERE T1.FK_TO_T2=T2.PK;
SELECT CAMP1,ORDERCAMP FROM TABLE2 ORDER BY ORDERCAMP
Because T1.CAMP_ID is an invalid identifier in the third-level subquery.
I hope I have explained myself well enough.
Your current query (without the invalid ORDER BY) gets ORA-01427: single-row subquery returns more than one row. You can nest subqueries, but you can only refer back one level when joining; so if you did:
SELECT T1.CAMP_ID, T2.CAMP,
(SELECT CAMP
FROM
(SELECT OT.CAMP
FROM OTHERTABLE OT
WHERE OT.FK_TO_TABLE1=T1.CAMP_ID
ORDER BY OT.ORDERCAMP
)
WHERE ROWNUM = 1)
FROM TABLE1 T1, TABLE2 T2 WHERE T1.FK_TO_T2=T2.PK;
... then you would get ORA-00904: "T1"."CAMP_ID": invalid identifier. Hence your question, presumably.
What you could do instead is join to the third table, and use the analytic ROW_NUMBER() function to assign the row number, and then use an outer select wrapped around the whole thing to only find the records with the lowest ORDERCAMP:
SELECT CAMP_ID, CAMP, OT_CAMP
FROM (
SELECT T1.CAMP_ID, T2.CAMP, OT.CAMP AS OT_CAMP,
ROW_NUMBER() OVER (PARTITION BY T1.CAMP_ID ORDER BY OT.ORDERCAMP) AS RN
FROM TABLE2 T2
JOIN TABLE1 T1 ON T1.FK_TO_T2=T2.PK
JOIN OTHERTABLE OT ON OT.FK_TO_TABLE1=T1.CAMP_ID
)
WHERE RN = 1;
The ROW_NUMBER() can partition on the T1.CAMP_ID primary key value, or anything else that is unique.
SQL Fiddle demo, including the inner query run on its own so you can see the RN numbers assigned before the outer filter is applied.
Another approach is to use the aggregate KEEP DENSE_RANK FIRST function
SELECT T1.CAMP_ID, T2.CAMP,
MAX(OT.CAMP) KEEP (DENSE_RANK FIRST ORDER BY OT.ORDERCAMP) AS OT_CAMP
FROM TABLE2 T2
JOIN TABLE1 T1 ON T1.FK_TO_T2=T2.PK
JOIN OTHERTABLE OT ON OT.FK_TO_TABLE1=T1.CAMP_ID
GROUP BY T1.CAMP_ID, T2.CAMP;
Which is a bit shorter and doesn't need an inner query. I'm not sure if there's any real advantage of one over the other.
SQL Fiddle demo.
In Oracle 12c and later, you can do:
SELECT CAMP1, ORDERCAMP
FROM TABLENAME
ORDER BY ORDERCAMP
FETCH FIRST 1 ROWS ONLY;
Otherwise, I think you need a subquery of some sort.
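You could use LIMIT or SELECT TOP 1 in databases that support them (for example MySQL or PostgreSQL for LIMIT, SQL Server for TOP); note that Oracle 11g supports neither: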
SELECT CAMP1, ORDERCAMP FROM TABLENAME ORDER BY ORDERCAMP LIMIT 1

SQL: Turn a subquery into a join: How to refer to outside table in nested join where clause?

I am trying to change my sub-query into a join where it selects only one record in the sub-query. It seems to run the sub-query for each found record, taking over a minute to execute:
select afield1, afield2, (
select top 1 b.field1
from anothertable as b
where b.aForeignKey = a.id
order by field1
) as bfield1
from sometable as a
If I try to only select related records, it doesn't know how to bind a.id in the nested select.
select afield1, afield2, bfield1
from sometable a left join (
select top 1 id, bfield, aForeignKey
from anothertable
where anothertable.aForeignKey = a.id
order by bfield) b on
b.aForeignKey = a.id
-- Results in the multi-part identifier "a.id" could not be bound
If I hard code values in the nested where clause, the select duration drops from 60 seconds to under five. Anyone have any suggestions on how to join the two tables while not processing every record in the inner table?
EDIT:
I ended up adding
left outer join (
select *, row_number() over (partition by / order by) as rank) b on
b.aforeignkey = a.id and b.rank = 1
went from ~50 seconds to 8 for 22M rows.
Try this:
WITH qry AS
(
SELECT afield1,
afield2,
b.field1 AS bfield1,
ROW_NUMBER() OVER(PARTITION BY a.id ORDER BY field1) rn
FROM sometable a LEFT JOIN anothertable b
ON b.aForeignKey = a.id
)
SELECT *
FROM qry
WHERE rn = 1
Try this
select afield1,
afield2,
bfield1
from sometable a
left join
(select top 1 id, bfield, aForeignKey from anothertable where aForeignKey in(a.id) order by bfield) b on b.aForeignKey = a.id