How to replace IN CLAUSE USING EXISTS? - sql

select
TV.ATTRIBUTE
FROM
TABLE_VALUE TV
WHERE
TV.NUMBERS IN (SELECT MAX(TV1.NUMBERS) FROM TABLE_VALUE TV1
WHERE TV.UNIQUE_ID=TV1.UNIQUE_ID GROUP BY UNIQUE_ID )

I'm not sure EXISTS would help here because, as you put it, for each unique_id there may be many numbers values, and you want to select the attribute for the highest numbers value for that particular unique_id.
EXISTS is useful when you want to check whether something, well, exists, but that's not the case here.

You do not want EXISTS; instead, you can use the RANK or DENSE_RANK analytic functions:
SELECT attribute
FROM (
SELECT attribute,
DENSE_RANK() OVER (PARTITION BY unique_id ORDER BY numbers DESC) AS rnk
FROM table_value
)
WHERE rnk = 1
or use the MAX analytic function:
SELECT attribute
FROM (
SELECT attribute,
numbers,
MAX(numbers) OVER (PARTITION BY unique_id) AS max_numbers
FROM table_value
)
WHERE numbers = max_numbers;
Either option will only read from the table once.
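One difference worth keeping in mind, sketched below with hypothetical sample rows: when two rows tie for the highest numbers value within a unique_id, RANK, DENSE_RANK and the MAX comparison all return every tied row. If exactly one row per unique_id is needed, ROW_NUMBER with a tie-breaker is one option. Using attribute as the tie-breaker is only an assumption made for this sketch.

```sql
-- Hypothetical sample data; column names follow the question.
CREATE TABLE table_value (unique_id INT, numbers INT, attribute VARCHAR(10));
INSERT INTO table_value VALUES (1, 20, 'a');
INSERT INTO table_value VALUES (1, 20, 'b');  -- ties with 'a' on numbers
INSERT INTO table_value VALUES (1, 10, 'c');

-- ROW_NUMBER assigns 1 to exactly one row per unique_id, even when numbers ties.
-- The second ORDER BY column (attribute) is an assumed tie-breaker.
SELECT attribute
FROM (
  SELECT attribute,
         ROW_NUMBER() OVER (PARTITION BY unique_id
                            ORDER BY numbers DESC, attribute) AS rn
  FROM table_value
)
WHERE rn = 1;
```

With these rows the query returns only 'a', whereas the DENSE_RANK and MAX versions return both 'a' and 'b'.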
If you really did want to use EXISTS (or IN), it will be less efficient because you will query the same table twice, but you can do it with a HAVING clause:
SELECT tv.attribute
FROM table_value tv
WHERE EXISTS(
SELECT 1
FROM table_value tv1
WHERE tv1.unique_id = tv.unique_id
HAVING MAX(tv1.numbers) = tv.numbers
)
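Another way to phrase the same check, offered as a sketch rather than as the answer's own method, is NOT EXISTS: keep a row only if no other row with the same unique_id has a higher numbers value. The sample rows are hypothetical.

```sql
-- Hypothetical sample data; column names follow the question.
CREATE TABLE table_value (unique_id INT, numbers INT, attribute VARCHAR(10));
INSERT INTO table_value VALUES (1, 10, 'a');
INSERT INTO table_value VALUES (1, 20, 'b');
INSERT INTO table_value VALUES (2, 5, 'c');

-- Keep a row only when no row for the same unique_id has a higher numbers value.
SELECT tv.attribute
FROM table_value tv
WHERE NOT EXISTS (
  SELECT 1
  FROM table_value tv1
  WHERE tv1.unique_id = tv.unique_id
    AND tv1.numbers > tv.numbers
);
```

With these rows the query returns 'b' and 'c'. Note one behavioural difference: a row whose numbers value is NULL passes the NOT EXISTS test (the comparison is never true), while the MAX-based versions drop it.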

Related

Listing multiple columns in a single row in SQL

(select ID,EXTERNAL_TRANSACTION_ID,EXTERNAL_TRANSACTION_TYPE,ROW_NUMBER() OVER(PARTITION BY EXTERNAL_TRANSACTION_ID ORDER BY ID ) AS SEQNUM
from AC_POS_TRANSACTION_TRK aptt WHERE [RESULT] ='Success'
GROUP BY ID, EXTERNAL_TRANSACTION_ID,EXTERNAL_TRANSACTION_TYPE )
Hello,
In the query above, I want to get the rows for transaction IDs that have both seqnum=1 and seqnum=2.
But if a transaction ID has no second row (seqnum=2), I don't want to get any rows for that transaction ID.
Thanks!!
Something like this
I'm not 100% sure this is correct without your table definition, but my understanding is that you want to EXCLUDE records if that record has an entry with seqnum=2. You can't use a WHERE clause alone because that would still return seqnum = 1.
You can use an EXISTS/NOT EXISTS or IN/NOT IN clause like this:
(select ID,EXTERNAL_TRANSACTION_ID,EXTERNAL_TRANSACTION_TYPE,ROW_NUMBER() OVER(PARTITION BY EXTERNAL_TRANSACTION_ID ORDER BY ID ) AS SEQNUM
from AC_POS_TRANSACTION_TRK aptt WHERE [RESULT] ='Success'
and not exists ( select 1 from AC_POS_TRANSACTION_TRK a where a.id = aptt.id
and a.seqnum = 2)
GROUP BY ID, EXTERNAL_TRANSACTION_ID,EXTERNAL_TRANSACTION_TYPE )
Basically, this excludes records if a record exists as specified in the NOT EXISTS query.
One option you can try is to add a count of rows per group using the same partitioning criteria and then filter accordingly. I'm not entirely sure about your query without seeing it in context and with sample data: there's no aggregation, so why use GROUP BY?
However, can you try something along these lines:
select * from (
select ID,EXTERNAL_TRANSACTION_ID,EXTERNAL_TRANSACTION_TYPE,
Row_Number() over(partition by EXTERNAL_TRANSACTION_ID order by ID) as SEQNUM,
Count(*) over(partition by EXTERNAL_TRANSACTION_ID) Qty
from AC_POS_TRANSACTION_TRK
where [RESULT] ='Success'
)x
where SEQNUM in (1,2) and Qty>1
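The subquery wrapper matters here: window functions such as ROW_NUMBER are evaluated after the WHERE clause of their own query level, so SEQNUM and Qty can only be filtered one level further out. A minimal sketch with hypothetical sample rows (SQL Server syntax, modelling only the columns used in the question; the CTE name numbered is made up for illustration):

```sql
-- Hypothetical sample data for illustration.
CREATE TABLE AC_POS_TRANSACTION_TRK (
  ID INT,
  EXTERNAL_TRANSACTION_ID INT,
  EXTERNAL_TRANSACTION_TYPE VARCHAR(10),
  [RESULT] VARCHAR(10)
);
INSERT INTO AC_POS_TRANSACTION_TRK VALUES
  (1, 100, 'SALE', 'Success'),
  (2, 100, 'SALE', 'Success'),
  (3, 100, 'SALE', 'Success'),
  (4, 200, 'SALE', 'Success'),
  (5, 200, 'SALE', 'Failed');

-- The CTE computes SEQNUM and Qty; the outer query is then free to filter on them.
WITH numbered AS (
  SELECT ID, EXTERNAL_TRANSACTION_ID, EXTERNAL_TRANSACTION_TYPE,
         ROW_NUMBER() OVER (PARTITION BY EXTERNAL_TRANSACTION_ID ORDER BY ID) AS SEQNUM,
         COUNT(*) OVER (PARTITION BY EXTERNAL_TRANSACTION_ID) AS Qty
  FROM AC_POS_TRANSACTION_TRK
  WHERE [RESULT] = 'Success'
)
SELECT ID, EXTERNAL_TRANSACTION_ID, SEQNUM
FROM numbered
WHERE SEQNUM IN (1, 2) AND Qty > 1
ORDER BY EXTERNAL_TRANSACTION_ID, SEQNUM;
```

With these rows, transaction 100 contributes IDs 1 and 2, and transaction 200 contributes nothing because it has only one 'Success' row.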
This should do the job.
With Qry As (
-- Your original query goes here
)
Select Qry.*
From Qry
Where Exists (
Select *
From Qry Qry1
Where Qry1.EXTERNAL_TRANSACTION_ID = Qry.EXTERNAL_TRANSACTION_ID
And Qry1.SEQNUM = 1
)
And Exists (
Select *
From Qry Qry2
Where Qry2.EXTERNAL_TRANSACTION_ID = Qry.EXTERNAL_TRANSACTION_ID
And Qry2.SEQNUM = 2
)
BTW, your original query looks problematic to me. Specifically, I think that instead of being in a GROUP BY, those columns should be in the PARTITION BY of the OVER clause, but without knowing more about the table structures and what you're trying to achieve, I could not say for sure.

Get minimum without using row number/window function in Bigquery

I have a table like as shown below
What I would like to do is get the minimum of each subject. Though I am able to do this with row_number function, I would like to do this with groupby and min() approach. But it doesn't work.
row_number approach - works fine
SELECT * FROM (select subject_id,value,id,min_time,max_time,time_1,
row_number() OVER (PARTITION BY subject_id ORDER BY value) AS rank
from table A) WHERE RANK = 1
min() approach - doesn't work
select subject_id,id,min_time,max_time,time_1,min(value) from table A
GROUP BY SUBJECT_ID,id
As you can see, just the two columns (subject_id and id) are enough to group the items together; they differentiate the groups. But why am I not able to use the other columns in the SELECT clause? If I add the other columns to the grouping, I may not get the expected output because time_1 has different values.
I expect my output to be like as shown below
In BigQuery you can use aggregation for this:
SELECT ARRAY_AGG(a ORDER BY value LIMIT 1)[SAFE_OFFSET(0)].*
FROM table A
GROUP BY SUBJECT_ID;
This uses ARRAY_AGG() to aggregate each record (the a in the argument list). ARRAY_AGG() allows you to order the result (by value) and to limit the size of the array. The latter is important for performance.
After aggregating the records into an array, you want its first element (array offsets are zero-based). The .* expands the record referred to by a into its component columns.
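To make the pattern concrete, here is a minimal sketch that runs against hypothetical inline rows instead of a real table. The CTE name sample and its three rows are assumptions for illustration; the columns are a subset of those in the question.

```sql
#standardSQL
-- Hypothetical inline rows standing in for the question's table.
WITH sample AS (
  SELECT 1 AS subject_id, 'x' AS id, 10 AS value UNION ALL
  SELECT 1, 'y', 5 UNION ALL
  SELECT 2, 'z', 7
)
-- Per subject_id: build a one-element array holding the row with the
-- smallest value, take that element (offset 0), and expand it to columns.
SELECT ARRAY_AGG(s ORDER BY value LIMIT 1)[SAFE_OFFSET(0)].*
FROM sample s
GROUP BY subject_id;
```

For these rows the expected result is one row per subject_id: (1, 'y', 5) and (2, 'z', 7).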
I'm not sure why you don't want to use ROW_NUMBER(). If the problem is the lingering rank column, you can easily remove it:
SELECT a.* EXCEPT (rank)
FROM (SELECT a.*,
ROW_NUMBER() OVER (PARTITION BY subject_id ORDER BY value) AS rank
FROM A
) a
WHERE RANK = 1;
Are you looking for something like below-
SELECT
A.subject_id,
A.id,
A.min_time,
A.max_time,
A.time_1,
A.value
FROM table A
INNER JOIN(
SELECT subject_id, MIN(value) Value
FROM table
GROUP BY subject_id
) B ON A.subject_id = B.subject_id
AND A.Value = B.Value
If you do not need to select the Time_1 column's value, the following query will work (as far as I can see, the values in min_time and max_time are the same within each group):
SELECT
A.subject_id,A.id,A.min_time,A.max_time,
--A.time_1,
MIN(A.value)
FROM table A
GROUP BY
A.subject_id,A.id,A.min_time,A.max_time
Finally, the best approach is if you can apply something like CAST(Time_1 AS DATE) on your time column. This will consider only the date part regardless of the time part. The query will be
SELECT
A.subject_id,A.id,A.min_time,A.max_time,
CAST(A.time_1 AS DATE) Time_1,
MIN(A.value)
FROM table A
GROUP BY
A.subject_id,A.id,A.min_time,A.max_time,
CAST(A.time_1 AS DATE)
-- Check that the CAST ... AS DATE syntax in BigQuery
-- matches what is written here; it may differ slightly.
Below is for BigQuery Standard SQL and is the most efficient way for cases like the one in your question:
#standardSQL
SELECT AS VALUE ARRAY_AGG(t ORDER BY value LIMIT 1)[OFFSET(0)]
FROM `project.dataset.table` t
GROUP BY subject_id
Using ROW_NUMBER is not efficient and in many cases leads to a Resources exceeded error.
Note: a self join is also a very inefficient way of achieving your objective.
A bit late to the party, but here is a cte-based approach which made sense to me:
with mins as (
select subject_id, id, min(value) as min_value
from table
group by subject_id, id
)
select distinct t.subject_id, t.id, t.time_1, t.min_time, t.max_time, m.min_value
from table t
join mins m on m.subject_id = t.subject_id and m.id = t.id

How to avoid order by in group by query result [duplicate]

I am trying to display records in the same order as the values in the WHERE clause.
Example:
select name from table where name in ('Yaksha','Arun','Naveen');
It displays Arun, Naveen, Yaksha (alphabetical order).
I want to display them in the same order, i.e. 'Yaksha', 'Arun', 'Naveen'.
How can I do this?
I am using an Oracle database.
Add this ORDER BY at the query's end:
order by case name when 'Yaksha' then 1
when 'Arun' then 2
when 'Naveen' then 3
end
(There's no other way to get that order. You need an ORDER BY to get a specific result set order.)
It may be a bit clunky, but you can create a custom ordering with a case expression:
SELECT *
FROM my_table
WHERE name IN ('Yaksha', 'Arun','Naveen')
ORDER BY CASE name WHEN 'Yaksha' THEN 1
WHEN 'Arun' THEN 2
WHEN 'Naveen' THEN 3
END ASC
A slightly longer option, but one that prevents duplication of the string literals is to use a subquery:
SELECT m.*
FROM my_table m
JOIN (SELECT 'Yaksha' AS name, 1 AS name_order FROM dual
UNION ALL
SELECT 'Arun' AS name, 2 AS name_order FROM dual
UNION ALL
SELECT 'Naveen' AS name, 3 AS name_order FROM dual) o
ON o.name = m.name
ORDER BY o.name_order ASC
You can try with something like the following:
SELECT *
FROM test
WHERE name IN ( 'Yaksha', 'Arun', 'Naveen' )
ORDER BY instr ( q'['Yaksha', 'Arun', 'Naveen']', name ) ASC
This way could be useful if your IN list is somehow dynamic.
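One caveat with the INSTR idea, sketched here with hypothetical rows: if one name in the list is a prefix of another (say 'Arun' and 'Arunima'), searching for the bare name can match inside the longer entry. Wrapping the searched name in quotes, so that it matches only a complete list entry, is one possible guard. The table contents and the extra name are assumptions for illustration.

```sql
-- Hypothetical table and rows for illustration (Oracle syntax).
CREATE TABLE test (name VARCHAR2(20));
INSERT INTO test VALUES ('Arun');
INSERT INTO test VALUES ('Arunima');
INSERT INTO test VALUES ('Yaksha');

-- Searching for the quoted name ('''' is a one-character string holding a
-- single quote) avoids matching 'Arun' inside 'Arunima'.
SELECT name
FROM test
WHERE name IN ('Yaksha', 'Arunima', 'Arun')
ORDER BY INSTR(q'['Yaksha', 'Arunima', 'Arun']', '''' || name || '''');
```

With these rows the result order is Yaksha, Arunima, Arun. Searching for the bare name instead would give 'Arun' and 'Arunima' the same position, leaving their relative order undefined.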
If the list of values is dynamic or you just don't want to repeat the values you could use (or abuse, depending on your point of view) a table collection, and join your real table to a table collection expression instead of using IN:
select your_table.name
from table(sys.odcivarchar2list('Yaksha','Arun','Naveen')) t
join your_table on your_table.name = t.column_value;
Which will generally work, but of course without an order-by clause is not guaranteed to work, so you can use an inline view to assign the order:
select your_table.name from (
select row_number() over (order by null) as rn, column_value as name
from table(sys.odcivarchar2list('Yaksha','Arun','Naveen'))
) t
join your_table on your_table.name = t.name
order by t.rn;
This still relies on row_number() over (order by null) using the order of the elements in the collection; which relies on collection unnesting preserving the element order. I don't think that's guaranteed either, so there is still some risk involved.

Trouble with computed column in SQL

I have a table that I wish to select a subset of columns from but also add on the end a computed column based upon where you are located in a queue. There are the following fields (that are pertinent):
id: int, auto increment, primary key
answertime: datetime, nullable
By default, when something is submitted to the queue, its answertime is NULL. So, I wish to select the ID of the thing in the queue as well as its rank in the queue (i.e. rank 1 is the next item that is unanswered, etc). Here's what I was thinking:
rank = id - COUNT(ids below my id where answertime is not null). However, I'm having an issue with the syntax of this query:
SELECT id AS outerid, COUNT(
SELECT * FROM tablename WHERE id<outerid AND answertime IS NOT NULL
)
FROM tablename
WHERE answertime IS NULL;
Now, obviously, this is wrong because I'm fairly confident you can't embed a SELECT inside an aggregate function. Likewise, flipping the SELECT and COUNT doesn't work, as you can't embed a SELECT at that point in the code (it can only be used in a WHERE clause).
Is this even possible to do with just SQL or do I need to add some logic on the program end?
If it helps, I'm doing this on SQL Server 2008, although I doubt that would add any value.
You can do that, you just can't use SELECT * in an aggregate sub-query. Try this, which gets the COUNT value as a scalar result:
SELECT
id AS outerid,
(SELECT COUNT(Id) FROM tablename
WHERE id<outie.id AND answertime IS NOT NULL)
FROM tablename outie
WHERE answertime IS NULL;
You may need to choose for yourself between using COUNT(*), COUNT(Id) or some other column depending on what you're really after.
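SELECT o.id AS outerid,
(SELECT COUNT(*) FROM tablename i WHERE i.id < o.id AND i.answertime IS NOT NULL) AS othercol -- a select-list alias such as outerid is not visible inside the subquery, so use table aliases
FROM tablename o -- ?
WHERE o.answertime IS NULL;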
also, where's the FROM statement?
As suggested by @HLGEM, you could use ROW_NUMBER() to obtain your results. The method involves ranking the rows in tablename twice: once by id without partitioning, and once by id with partitioning by answertime. For every row where answertime is NULL, the difference between the two rankings gives the same value as the one you are calculating using COUNT() in the subquery.
Here's an implementation of the method:
;
WITH ranked AS (
SELECT
*,
Rnk = ROW_NUMBER() OVER ( ORDER BY id),
PartRnk = ROW_NUMBER() OVER (PARTITION BY answertime ORDER BY id)
FROM tablename
)
SELECT
id, /* AS outerid, if you like */
Cnt = Rnk - PartRnk
FROM ranked
WHERE answertime IS NULL
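As a quick check, here is a small sketch with hypothetical rows (SQL Server syntax). It computes the COUNT from the question and, next to it, a position among the unanswered rows only. The second column rests on an assumed reading of "rank in the queue", and the column names answered_below and queue_position are made up for illustration.

```sql
-- Hypothetical sample data (SQL Server syntax).
CREATE TABLE tablename (id INT PRIMARY KEY, answertime DATETIME NULL);
INSERT INTO tablename (id, answertime) VALUES
  (1, '20120101'),
  (2, NULL),
  (3, '20120102'),
  (4, NULL),
  (5, NULL);

-- answered_below: answered rows with a lower id (the COUNT from the question).
-- queue_position: position among unanswered rows only; ROW_NUMBER is computed
-- after the WHERE clause, so answered rows are not counted.
SELECT o.id,
       (SELECT COUNT(*)
        FROM tablename i
        WHERE i.id < o.id AND i.answertime IS NOT NULL) AS answered_below,
       ROW_NUMBER() OVER (ORDER BY o.id) AS queue_position
FROM tablename o
WHERE o.answertime IS NULL
ORDER BY o.id;
```

With these rows the result is (2, 1, 1), (4, 2, 2), (5, 2, 3). The answered_below values match Rnk - PartRnk from the ROW_NUMBER method (2 - 1, 4 - 2, 5 - 3).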