How to optimise SQL query without using NOT IN


I have this Student table:
Id companyId status
----------------------------------------
101 1001 In-Progress
102 1001 In-Progress
103 1001 Final
104 1002 In-Progress
105 1003 Pending With Company
106 1003 In-Progress
107 1004 In-Progress
108 1004 In-Progress
109 1005 In-Progress
110 1005 Completed
111 1006 In-Progress
112 1006 Canceled
113 1007 In-Progress
114 1007 Pending with Student
I want output that meets these conditions:
The status is "In-Progress".
If the companyId is repeated, the only other statuses allowed for that company are Completed and Canceled (in other words, the only allowed combination of statuses is In-Progress, Completed and Canceled).
With the above conditions, the output will look like this:
Id companyId status
--------------------------------
104 1002 In-Progress
107 1004 In-Progress
108 1004 In-Progress
109 1005 In-Progress
111 1006 In-Progress
We can achieve this by using NOT IN
SELECT *
FROM student
WHERE status = 'In-Progress'
AND companyId NOT IN (SELECT companyId FROM student
WHERE status IN ('Final', 'Pending With Company', 'Pending with Student'));
But I'm looking for a solution without using NOT IN.

You can use NOT EXISTS:
SELECT *
FROM student a
WHERE a.status = 'In-Progress'
AND NOT EXISTS (SELECT NULL
FROM student b
WHERE a.companyId = b.companyId
AND b.status NOT IN ('In-Progress', 'Completed', 'Canceled')
)
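A minimal sketch of the NOT EXISTS approach, run against the sample data with Python's built-in sqlite3 module (the choice of SQLite is an assumption; the question does not name a database). It also shows one practical reason to prefer NOT EXISTS: if the subquery of a NOT IN can return a NULL companyId, NOT IN matches nothing. Row 115 with a NULL companyId is hypothetical and is not part of the original data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (Id INTEGER, companyId INTEGER, status TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [
        (101, 1001, "In-Progress"), (102, 1001, "In-Progress"),
        (103, 1001, "Final"), (104, 1002, "In-Progress"),
        (105, 1003, "Pending With Company"), (106, 1003, "In-Progress"),
        (107, 1004, "In-Progress"), (108, 1004, "In-Progress"),
        (109, 1005, "In-Progress"), (110, 1005, "Completed"),
        (111, 1006, "In-Progress"), (112, 1006, "Canceled"),
        (113, 1007, "In-Progress"), (114, 1007, "Pending with Student"),
    ],
)

NOT_IN_SQL = """
SELECT Id FROM student
WHERE status = 'In-Progress'
  AND companyId NOT IN (SELECT companyId FROM student
                        WHERE status IN ('Final', 'Pending With Company',
                                         'Pending with Student'))
ORDER BY Id
"""

NOT_EXISTS_SQL = """
SELECT a.Id FROM student a
WHERE a.status = 'In-Progress'
  AND NOT EXISTS (SELECT 1 FROM student b
                  WHERE b.companyId = a.companyId
                    AND b.status NOT IN ('In-Progress', 'Completed', 'Canceled'))
ORDER BY a.Id
"""


def ids(sql):
    """Run a query and return the Id column as a list."""
    return [row[0] for row in conn.execute(sql)]


# On the original data both forms return the same rows.
not_in_ids = ids(NOT_IN_SQL)
not_exists_ids = ids(NOT_EXISTS_SQL)

# Hypothetical extra row: a disallowed status with a NULL companyId.
conn.execute("INSERT INTO student VALUES (115, NULL, 'Final')")
not_in_ids_with_null = ids(NOT_IN_SQL)          # NOT IN now matches nothing
not_exists_ids_with_null = ids(NOT_EXISTS_SQL)  # NOT EXISTS is unaffected

print(not_in_ids, not_exists_ids)
print(not_in_ids_with_null, not_exists_ids_with_null)
```

The status literals have to match the stored values exactly ('In-Progress', 'Canceled'), otherwise the NOT EXISTS filter rejects every company.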

You can achieve it using a LEFT JOIN and keeping only the rows that have no match:
SELECT a.*
FROM student a
LEFT JOIN (SELECT companyId
FROM student
WHERE status IN ('Final', 'Pending With Company', 'Pending with Student')) b
ON a.companyId = b.companyId
WHERE b.companyId IS NULL
AND a.status = 'In-Progress'

WITH status as (
select 1 as s,'In-Progress' d union all
select 100 as s,'Final' d union all
select 100 as s,'Pending With Company' d union all
select 100 as s,'Pending with Student' d union all
select 1 as s,'Canceled' d union all
select 1 as s,'Completed'
)
SELECT * FROM (
SELECT
s.Id,
s.CompanyID,
s.status,
MAX(status.s) OVER (partition by s.CompanyId) m
FROM student s
LEFT JOIN status ON s.status = status.d
) x where x.m=1 and x.status='In-Progress'
Note that the above simply avoids NOT IN. Whether this query performs well is a separate discussion.

Related

Duplicates with condition (SQL)

I would like to get the number of duplicates for article_id for each merchant_id, where the zip_code is identical. Please see example below:
Table
merchant_id article_id zip_code
1 4555 1000
1 4555 1003
1 4555 1000
1 3029 1000
2 7539 1005
2 7539 1005
2 7539 1002
2 1232 1006
3 5555 1000
3 5555 1001
3 5555 1001
3 5555 1001
Output Table
merchant_id count_duplicate zip_code
1 2 1000
2 2 1005
3 3 1001
This is the query that I am currently using but I am struggling to include the zip_code condition:
SELECT merchant_id
,duplicate_count
FROM main_table mt
JOIN(select article_id, count(*) AS duplicate_count
from main_table
group by article_id
having count(article_id) =1) mt_1
ON mt_1.article_id = mt.article_id

This seems to return what you want. I'm not sure why article_id is not included in the result set:
select merchant_id, zip_code, count(*)
from main_table
group by merchant_id, article_id, zip_code
having count(*) > 1
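As a rough check, the GROUP BY / HAVING query above can be run against the sample rows with Python's built-in sqlite3 module (SQLite is an assumption here; the question does not say which database is used). This sketch also selects article_id so each duplicate group is identifiable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE main_table (merchant_id INTEGER, article_id INTEGER, zip_code INTEGER)"
)
conn.executemany(
    "INSERT INTO main_table VALUES (?, ?, ?)",
    [
        (1, 4555, 1000), (1, 4555, 1003), (1, 4555, 1000), (1, 3029, 1000),
        (2, 7539, 1005), (2, 7539, 1005), (2, 7539, 1002), (2, 1232, 1006),
        (3, 5555, 1000), (3, 5555, 1001), (3, 5555, 1001), (3, 5555, 1001),
    ],
)

# One group per (merchant, article, zip); keep only groups seen more than once.
duplicates = conn.execute(
    """
    SELECT merchant_id, article_id, zip_code, COUNT(*) AS count_duplicate
    FROM main_table
    GROUP BY merchant_id, article_id, zip_code
    HAVING COUNT(*) > 1
    ORDER BY merchant_id
    """
).fetchall()

for row in duplicates:
    print(row)
```

On the sample data this yields one row per merchant, matching the counts and zip codes in the desired output table.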

duplicates with condition

I would like to get the number of duplicates for article_id for each merchant_id, where the zip_code is not identical. Please see example below:
Table
merchant_id article_id zip_code
1 4555 1000
1 4555 1003
1 4555 1002
1 3029 1000
2 7539 1005
2 7539 1005
2 7539 1002
2 1232 1006
3 5555 1000
3 5555 1001
3 5555 1002
3 5555 1003
Output Table
merchant_id count_duplicate
1 3
2 2
3 4
This is the query that I am currently using but I am struggling to include the zip_code condition:
SELECT merchant_id
,duplicate_count
FROM main_table mt
JOIN(select article_id, count(*) AS duplicate_count
from main_table
group by article_id
having count(article_id) >1) mt_1
ON mt_1.article_id = mt.article_id

If I understand correctly, you can use two levels of aggregation:
SELECT merchant_id, SUM(num_zips)
FROM (SELECT merchant_id, article_id, COUNT(DISTINCT zip_code) AS num_zips
FROM main_table
GROUP BY merchant_id, article_id
) ma
WHERE ma.num_zips > 1
GROUP BY merchant_id;
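A small sketch of the two-level aggregation, run on the question's sample rows with Python's built-in sqlite3 module (SQLite is an assumption; the alias count_duplicate is taken from the desired output table). The inner query counts distinct zip codes per merchant and article, and the outer query sums those counts per merchant.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE main_table (merchant_id INTEGER, article_id INTEGER, zip_code INTEGER)"
)
conn.executemany(
    "INSERT INTO main_table VALUES (?, ?, ?)",
    [
        (1, 4555, 1000), (1, 4555, 1003), (1, 4555, 1002), (1, 3029, 1000),
        (2, 7539, 1005), (2, 7539, 1005), (2, 7539, 1002), (2, 1232, 1006),
        (3, 5555, 1000), (3, 5555, 1001), (3, 5555, 1002), (3, 5555, 1003),
    ],
)

per_merchant = conn.execute(
    """
    SELECT merchant_id, SUM(num_zips) AS count_duplicate
    FROM (SELECT merchant_id, article_id,
                 COUNT(DISTINCT zip_code) AS num_zips
          FROM main_table
          GROUP BY merchant_id, article_id) ma
    WHERE ma.num_zips > 1        -- articles seen in more than one zip code
    GROUP BY merchant_id
    ORDER BY merchant_id
    """
).fetchall()

print(per_merchant)
```

Because of COUNT(DISTINCT ...), merchant 2's repeated zip code 1005 is counted once, which is why its total is 2 rather than 3.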

How to join three tables in SQL Server 2012 and calculate ranking based on 2 attributes

I have 3 tables:
tblEmployee
E_ID E_Name E_City
--------------------------------
101 sasa Mumbai
102 sdf California
103 trt Illinois
104 dssd Texas
105 trt Pennsylvania
106 wee Arizona
107 rer Texas
108 wqe California
109 sadd Michigan
tblGen
Tgenerate is a boolean value
Emp_ID Tgenerate
--------------------
105 1
108 1
102 1
102 1
102 0
104 1
107 0
108 1
109 0
And the tblStat:
Emp_ID Status
------------------
103 Pending
107 Pending
103 Pending
101 Delivered
104 Pending
104 Pending
108 Pending
101 Delivered
105 Delivered
I have to join these 3 tables and want output like this:
E_Name EmployeeID City TgenerateCount Delivered_Count Ranking
TgenerateCount is calculated for every employee. It is the count of Tgenerate rows with value 1; for example, 102 has a TgenerateCount of 2 and 109 has a TgenerateCount of 0.
Delivered_Count is the count of rows with the 'Delivered' status; for example, 101 has 2 Delivered. I want to display every employee in the output table.
Any help would be greatly appreciated.

As your two fact tables have a many:1 relationship with your dimension table, you should aggregate them before joining them.
SELECT
e.*,
COALESCE(g.cnt, 0) AS TgenerateCount,
COALESCE(s.cnt, 0) AS DeliveredCount,
RANK() OVER (ORDER BY COALESCE(g.cnt, 0) + COALESCE(s.cnt, 0) DESC) AS ranking
FROM
tblEmployee e
LEFT JOIN
(
-- tblGen and tblStat key on Emp_ID, not E_ID
SELECT Emp_ID, COUNT(*) AS cnt FROM tblGen WHERE Tgenerate = 1 GROUP BY Emp_ID
)
g
ON g.Emp_ID = e.E_ID
LEFT JOIN
(
SELECT Emp_ID, COUNT(*) AS cnt FROM tblStat WHERE Status = 'Delivered' GROUP BY Emp_ID
)
s
ON s.Emp_ID = e.E_ID
You've been unclear on how the ranking should be completed, so this simply gives an example ranking.
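To illustrate the aggregate-before-join idea, here is a minimal sketch on the question's sample data using Python's built-in sqlite3 module (SQLite stands in for SQL Server 2012 here, which is an assumption). It leaves the ranking out, since the ranking rule is not specified, and only checks the two counts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tblEmployee (E_ID INTEGER, E_Name TEXT, E_City TEXT);
    CREATE TABLE tblGen (Emp_ID INTEGER, Tgenerate INTEGER);
    CREATE TABLE tblStat (Emp_ID INTEGER, Status TEXT);
    """
)
conn.executemany(
    "INSERT INTO tblEmployee VALUES (?, ?, ?)",
    [
        (101, "sasa", "Mumbai"), (102, "sdf", "California"),
        (103, "trt", "Illinois"), (104, "dssd", "Texas"),
        (105, "trt", "Pennsylvania"), (106, "wee", "Arizona"),
        (107, "rer", "Texas"), (108, "wqe", "California"),
        (109, "sadd", "Michigan"),
    ],
)
conn.executemany(
    "INSERT INTO tblGen VALUES (?, ?)",
    [(105, 1), (108, 1), (102, 1), (102, 1), (102, 0),
     (104, 1), (107, 0), (108, 1), (109, 0)],
)
conn.executemany(
    "INSERT INTO tblStat VALUES (?, ?)",
    [(103, "Pending"), (107, "Pending"), (103, "Pending"),
     (101, "Delivered"), (104, "Pending"), (104, "Pending"),
     (108, "Pending"), (101, "Delivered"), (105, "Delivered")],
)

# Each detail table is collapsed to one row per employee before the join,
# so the LEFT JOINs cannot multiply rows; COALESCE turns "no rows" into 0.
counts = conn.execute(
    """
    SELECT e.E_ID,
           COALESCE(g.cnt, 0) AS TgenerateCount,
           COALESCE(s.cnt, 0) AS DeliveredCount
    FROM tblEmployee e
    LEFT JOIN (SELECT Emp_ID, COUNT(*) AS cnt
               FROM tblGen WHERE Tgenerate = 1 GROUP BY Emp_ID) g
           ON g.Emp_ID = e.E_ID
    LEFT JOIN (SELECT Emp_ID, COUNT(*) AS cnt
               FROM tblStat WHERE Status = 'Delivered' GROUP BY Emp_ID) s
           ON s.Emp_ID = e.E_ID
    ORDER BY e.E_ID
    """
).fetchall()

for row in counts:
    print(row)
```

Every employee appears exactly once, including those with no rows in either detail table.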

SQL: Find max number of rows

I am totally new to coding, so my question might be silly; sorry about that in advance.
I have a table where CUST_REFERRED holds the CUST_NUM of the customer who made the referral:
CUST_NUM NAME_S NAME_F ADDRESS Z_CODE CUST_REFERRED
1001 MORALES BONITA P.O. BOX 651 32328
1002 THOMPSON RYAN P.O. BOX 9835 90404
1003 SMITH LEILA P.O. BOX 66 32306
1004 PIERSON THOMAS 69821 SOUTH AVENUE 83707
1005 GIRARD CINDY P.O. BOX 851 98115
1006 CRUZ MESHIA 82 DIRT ROAD 12211
1007 GIANA TAMMY 9153 MAIN STREET 78710 1003
1008 JONES KENNETH P.O. BOX 137 82003
1009 PEREZ JORGE P.O. BOX 8564 91510 1003
1010 LUCAS JAKE 114 EAST SAVANNAH 30314
1011 MCGOVERN REESE P.O. BOX 18 60606
1012 MCKENZIE WILLIAM P.O. BOX 971 02110
1013 NGUYEN NICHOLAS 357 WHITE EAGLE AVE 34711 1006
1014 LEE JASMINE P.O. BOX 2947 82414
1015 SCHELL STEVE P.O. BOX 677 33111
1016 DAUM MICHELL 9851231 LONG ROAD 91508 1010
1017 NELSON BECCA P.O. BOX 563 49006
1018 MONTIASA GREG 1008 GRAND AVENUE 31206
1019 SMITH JENNIFER P.O. BOX 1151 07962 1003
1020 FALAH KENNETH P.O. BOX 335 08607
My idea is to find the customer who made the most referrals. As you can see, customer number 1003, whose name is LEILA SMITH, made 3 referrals.
I tried this code:
SELECT
CUST_REFERRED,
COUNT(*)
FROM
CUSTOMER
GROUP BY
CUST_REFERRED
ORDER BY CUST_REFERRED ASC;
This code gives me:
1003 3
1006 1
1010 1
My first question is that I could not use LIMIT to find the max number,
and my second question is: how can I add more information about the customer?

SELECT NAME_F,
NAME_S,
ADDRESS,
CUST_REFERRED
FROM CUSTOMER
WHERE CUST_NUM = (SELECT MOST_CUS_REF
FROM (SELECT CUST_REFERRED MOST_CUS_REF, COUNT(CUST_REFERRED)
MOST_CUS_REF_COUNT
FROM (SELECT CUST_REFERRED
FROM customer
WHERE cust_referred IS NOT NULL
)
GROUP BY CUST_REFERRED
HAVING COUNT(CUST_REFERRED) = (SELECT MAX (cust_ref_num)
FROM (SELECT CUST_REFERRED,
COUNT(CUST_REFERRED) cust_ref_num
FROM (SELECT CUST_REFERRED
FROM customer
WHERE cust_referred IS NOT NULL
)
GROUP BY CUST_REFERRED
)
)
)
)
;

try this:
Select CUST_REFERRED, z.cnt from
(SELECT CUST_REFERRED, COUNT(*) cnt
FROM CUSTOMER where CUST_REFERRED is Not null
GROUP BY CUST_REFERRED) Z
where z.cnt =
(select Max(cnt) from
(SELECT COUNT(*) cnt
FROM CUSTOMER where CUST_REFERRED is Not null
GROUP BY CUST_REFERRED) ZZ)
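As a hedged sketch of how the query above could be extended to return the referrer's details, the version below joins the counted referrals back to CUSTOMER. It runs on Python's built-in sqlite3 module (an assumption; the original question appears to target a database without LIMIT) and loads only the columns it needs from the sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CUSTOMER (CUST_NUM INTEGER, NAME_S TEXT, NAME_F TEXT, "
    "CUST_REFERRED INTEGER)"
)
conn.executemany(
    "INSERT INTO CUSTOMER VALUES (?, ?, ?, ?)",
    [
        (1001, "MORALES", "BONITA", None), (1002, "THOMPSON", "RYAN", None),
        (1003, "SMITH", "LEILA", None), (1004, "PIERSON", "THOMAS", None),
        (1005, "GIRARD", "CINDY", None), (1006, "CRUZ", "MESHIA", None),
        (1007, "GIANA", "TAMMY", 1003), (1008, "JONES", "KENNETH", None),
        (1009, "PEREZ", "JORGE", 1003), (1010, "LUCAS", "JAKE", None),
        (1011, "MCGOVERN", "REESE", None), (1012, "MCKENZIE", "WILLIAM", None),
        (1013, "NGUYEN", "NICHOLAS", 1006), (1014, "LEE", "JASMINE", None),
        (1015, "SCHELL", "STEVE", None), (1016, "DAUM", "MICHELL", 1010),
        (1017, "NELSON", "BECCA", None), (1018, "MONTIASA", "GREG", None),
        (1019, "SMITH", "JENNIFER", 1003), (1020, "FALAH", "KENNETH", None),
    ],
)

# z: referrals per referrer; the WHERE clause keeps only the maximum count,
# so ties would all be returned. The join adds the referrer's own details.
top_referrers = conn.execute(
    """
    SELECT c.CUST_NUM, c.NAME_F, c.NAME_S, z.cnt
    FROM (SELECT CUST_REFERRED, COUNT(*) AS cnt
          FROM CUSTOMER
          WHERE CUST_REFERRED IS NOT NULL
          GROUP BY CUST_REFERRED) z
    JOIN CUSTOMER c ON c.CUST_NUM = z.CUST_REFERRED
    WHERE z.cnt = (SELECT MAX(cnt)
                   FROM (SELECT COUNT(*) AS cnt
                         FROM CUSTOMER
                         WHERE CUST_REFERRED IS NOT NULL
                         GROUP BY CUST_REFERRED) zz)
    """
).fetchall()

print(top_referrers)
```

Unlike a "first row only" approach, comparing against MAX(cnt) returns every customer tied for the most referrals.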

Try this query --
;WITH CTE
AS (
SELECT CUST_REFERRED, COUNT(*) AS REF_COUNT
FROM CUSTOMER
GROUP BY CUST_REFERRED
)
SELECT TOP 1 C2.CUST_NUM
,C2.NAME_F
,C2.NAME_S
,REF_COUNT
FROM CTE C1
INNER JOIN CUSTOMER C2
ON C1.CUST_REFERRED = C2.CUST_NUM
ORDER BY REF_COUNT DESC
edited to add the referred customer details.
with data (cust_num, name_s, name_f, addr, code, cust_referred) as
(
/* begin: test data */
select 1001 ,'MORALES ','BONITA ','P.O. BOX 651 ',32328, null from dual union all
select 1002 ,'THOMPSON ','RYAN ','P.O. BOX 9835 ',90404, null from dual union all
select 1003 ,'SMITH ','LEILA ','P.O. BOX 66 ',32306, null from dual union all
select 1004 ,'PIERSON ','THOMAS ','69821 SOUTH AVENUE ',83707, null from dual union all
select 1005 ,'GIRARD ','CINDY ','P.O. BOX 851 ',98115, null from dual union all
select 1006 ,'CRUZ ','MESHIA ','82 DIRT ROAD ',12211, null from dual union all
select 1007 ,'GIANA ','TAMMY ','9153 MAIN STREET ',78710, 1003 from dual union all
select 1008 ,'JONES ','KENNETH ','P.O. BOX 137 ',82003, null from dual union all
select 1009 ,'PEREZ ','JORGE ','P.O. BOX 8564 ',91510, 1003 from dual union all
select 1010 ,'LUCAS ','JAKE ','114 EAST SAVANNAH ',30314, null from dual union all
select 1011 ,'MCGOVERN ','REESE ','P.O. BOX 18 ',60606, null from dual union all
select 1012 ,'MCKENZIE ','WILLIAM ','P.O. BOX 971 ',02110, null from dual union all
select 1013 ,'NGUYEN ','NICHOLAS ','357 WHITE EAGLE AVE ',34711, 1006 from dual union all
select 1014 ,'LEE ','JASMINE ','P.O. BOX 2947 ',82414, null from dual union all
select 1015 ,'SCHELL ','STEVE ','P.O. BOX 677 ',33111, null from dual union all
select 1016 ,'DAUM ','MICHELL ','9851231 LONG ROAD ',91508, 1010 from dual union all
select 1017 ,'NELSON ','BECCA ','P.O. BOX 563 ',49006, null from dual union all
select 1018 ,'MONTIASA ','GREG ','1008 GRAND AVENUE ',31206, null from dual union all
select 1019 ,'SMITH ','JENNIFER ','P.O. BOX 1151 ',07962, 1003 from dual union all
select 1020 ,'FALAH ','KENNETH ','P.O. BOX 335 ',08607, null from dual
/* end: test data */
-- replace the above block with your table
-- eg. select * from customers_table
)
,
max_referred as
(
-- just interested in the first row after sorting by
-- the count of referred column values
select rownum, cust_referred, cnt from
(
select cust_referred, count(cust_referred) cnt from data group by cust_referred order by 2 desc
)
where rownum = 1
)
-- joining on cust_referred column in *data* and *max_referred* tables to get the customer details
-- and joining again to the *data* table for fetching the referred customer name
select
cust.cust_num, cust.name_s, cust.name_f, cust.addr, cust.code, cust.cust_referred, ms.name_f || ms.name_s as "Referred Customer"
from
data cust
join
max_referred mr on (cust.cust_referred = mr.cust_referred)
join
data ms
on (mr.cust_referred = ms.cust_num)
;

You can do it in a single table scan (i.e. without any self joins) using analytic functions:
SELECT *
FROM (
SELECT t.*,
MIN( CUST_REFERRED )
KEEP ( DENSE_RANK FIRST ORDER BY num_referrals DESC )
OVER ()
AS best_referrer
FROM (
SELECT c.*,
COUNT( CUST_REFERRED )
OVER ( PARTITION BY CUST_REFERRED )
AS num_referrals
FROM CUSTOMER c
) t
)
WHERE cust_num = best_referrer;

Google Interview Question: Recursive Query or Common Table Expression for the following scenario

I have two tables TableA and TableB in the following fashion:
Table A(ID, PairId)
--Here the Pair represented by PairId will always have 2 elements in it.
Data:
100,1
101,1
-----
104,2
109,2
TableB(A.ID, GroupId)
--Here the Group represented by GroupId may have any number of elements.
--Also, A.ID means it is a foreign key from TableA
Data:
100,1000
102,1000
103,1000
--------
101,1001
104,1001
105,1001
-------
105,1002
106,1002
107,1002
Given an id from table A (say X),
Find the ids of its pairmates,
and then the groupmates(of the previous pairmates)
and then the pairmates(of all the ones we found in previous step)
and their groupmates (of all the ones we found in previous step) and so on....
until you don't find any more pairmates or groupmates.
For instance, given X as 100
you will accumulate data in this fashion:
Include PairMates
100
101
Include GroupMates(of all the ones in prevstep)
100--groupmates of 100
102
103
101--groupmates of 101
104
105
Include PairMates(of all the ones in the prevstep)
100
102
103
101
104--Pairmate of 104
109
105
Include Groupmates(of all the ones in the prev step)
100
102
103
101
104
109
105--Groupmates of 105
106
107
Include Pairmates( of all the ones in the prevstep)
100
102
103
101
104
109
105
106
107
[None found]
Include Groupmates( of all the ones in the prevstep)
100
102
103
101
104
109
105
106
107
[None found]
---since no pairmates and groupmates were added so the recursion ends

DECLARE @ID INT = <Given ID>;
WITH CTE AS
(
SELECT * FROM TABLEB
WHERE ID IN (SELECT ID FROM TABLEA WHERE pairID IN (SELECT pairId FROM TABLEA WHERE ID = @ID))
UNION
SELECT * FROM TABLEB WHERE GroupID IN (SELECT GroupID FROM CTE)
)
SELECT * FROM CTE
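The traversal described in the question amounts to a transitive closure over "shares a pair" and "shares a group" links. Below is a minimal sketch of that idea as a recursive CTE, run with Python's built-in sqlite3 module. It relies on SQLite allowing UNION (which drops duplicates) in a recursive CTE so the recursion stops once no new IDs appear; other engines may need UNION ALL plus explicit cycle handling. The CTE names edges and reach and the helper closure are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE TableA (ID INTEGER, PairId INTEGER);
    CREATE TABLE TableB (ID INTEGER, GroupId INTEGER);
    """
)
conn.executemany(
    "INSERT INTO TableA VALUES (?, ?)",
    [(100, 1), (101, 1), (104, 2), (109, 2)],
)
conn.executemany(
    "INSERT INTO TableB VALUES (?, ?)",
    [(100, 1000), (102, 1000), (103, 1000),
     (101, 1001), (104, 1001), (105, 1001),
     (105, 1002), (106, 1002), (107, 1002)],
)

CLOSURE_SQL = """
WITH RECURSIVE
edges(a, b) AS (
    -- two IDs are linked if they share a pair ...
    SELECT p1.ID, p2.ID
    FROM TableA p1 JOIN TableA p2 ON p1.PairId = p2.PairId
    UNION
    -- ... or if they share a group
    SELECT g1.ID, g2.ID
    FROM TableB g1 JOIN TableB g2 ON g1.GroupId = g2.GroupId
),
reach(id) AS (
    SELECT ?                      -- the starting ID X
    UNION                         -- UNION discards IDs already reached
    SELECT e.b FROM reach r JOIN edges e ON e.a = r.id
)
SELECT id FROM reach ORDER BY id
"""


def closure(start_id):
    """Return every ID reachable from start_id via pairmates and groupmates."""
    return [row[0] for row in conn.execute(CLOSURE_SQL, (start_id,))]


reachable = closure(100)
print(reachable)
```

Starting from 100, this reaches the same nine IDs as the step-by-step expansion in the question.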
