Can criteria be re-used in a T-SQL query? - sql

I have a SQL query that looks something like this:
SELECT
o.name,
o.type_id,
(SELECT COUNT(*) FROM a WHERE type_id = o.type_id AND id IN ('1, 2, 3 ... 1000')) AS count_a,
(SELECT COUNT(*) FROM b WHERE type_id = o.type_id AND id IN ('1, 2, 3 ... 1000')) AS count_b,
(SELECT COUNT(*) FROM c WHERE type_id = o.type_id AND id IN ('1, 2, 3 ... 1000')) AS count_c
FROM o
In the subqueries (count_a, count_b and count_c) the criteria specified in the IN clause are the same for each, but it's a REALLY long list of numbers (which aren't in fact sequential) and I'm concerned that:
a) I'm slowing the query down by making it too long
b) It's going to get too long and cause an error eventually
Is there a way to alias/reference that list of criteria (perhaps as a variable?) so that it can be re-used in each of the three places it appears in the query? Or am I worrying for nothing?
UPDATE
Given the suggestion of using a CTE, I have changed the query above to work like this for now:
WITH id_list AS (SELECT id FROM source WHERE id IN ('1, 2, 3 ... 1000'))
SELECT
o.name,
o.type_id,
(SELECT COUNT(*) FROM a WHERE type_id = o.type_id AND id IN (SELECT id FROM id_list)) AS count_a,
(SELECT COUNT(*) FROM b WHERE type_id = o.type_id AND id IN (SELECT id FROM id_list)) AS count_b,
(SELECT COUNT(*) FROM c WHERE type_id = o.type_id AND id IN (SELECT id FROM id_list)) AS count_c
FROM o
This cuts the overall length of the query down to about a third of what it was, and although the DB appears to take a couple of milliseconds longer to execute the query, at least I won't run into an error caused by the query being too long.
QUESTION: Is there a quick way to break a comma separated list of numbers (1, 2, 3 ... 1000) into a result set that could be used as the CTE?

You could use a common table expression (CTE):
WITH Numbers AS
(
SELECT N
FROM (SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 1000)
AS T(N)
)
SELECT
o.name,
o.type_id,
(SELECT COUNT(*) FROM a WHERE type_id = o.type_id AND id IN (SELECT N FROM Numbers)) AS count_a,
(SELECT COUNT(*) FROM b WHERE type_id = o.type_id AND id IN (SELECT N FROM Numbers)) AS count_b,
(SELECT COUNT(*) FROM c WHERE type_id = o.type_id AND id IN (SELECT N FROM Numbers)) AS count_c
FROM o

If your ids are foreign keys into some other table, then you could make a table variable from your list, and then join into that repeatedly.
-- Table variables are declared with an @ prefix (# is for temp tables)
DECLARE @id_list TABLE (
id INT
)
INSERT INTO @id_list
SELECT id
FROM pk_location
WHERE id IN ('1, 2, 3 ... 1000')
SELECT
o.name,
o.type_id,
(SELECT COUNT(*) FROM a INNER JOIN @id_list i ON a.id = i.id WHERE type_id = o.type_id) AS count_a,
(SELECT COUNT(*) FROM b INNER JOIN @id_list i ON b.id = i.id WHERE type_id = o.type_id) AS count_b,
(SELECT COUNT(*) FROM c INNER JOIN @id_list i ON c.id = i.id WHERE type_id = o.type_id) AS count_c
FROM o
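On the follow-up question about turning a comma-separated list into a result set: one option is to split the list in application code and load it into a temporary table once, then reference that table from each subquery. The sketch below uses Python's built-in sqlite3 module rather than SQL Server, purely so that it is self-contained. The tables a, b, c and o follow the question; the sample rows, the id list string and the counts_by_type helper are made up for illustration.

```python
import sqlite3

def counts_by_type(conn, id_csv):
    """Load a comma-separated id list into a temp table once, then reuse it."""
    # Split in application code; int() tolerates the spaces around each number.
    ids = [(int(part),) for part in id_csv.split(",") if part.strip()]
    conn.execute("DROP TABLE IF EXISTS temp.id_list")
    conn.execute("CREATE TEMP TABLE id_list (id INTEGER PRIMARY KEY)")
    conn.executemany("INSERT OR IGNORE INTO id_list (id) VALUES (?)", ids)
    # Each subquery refers to the same id_list table instead of repeating the list.
    return conn.execute("""
        SELECT o.name,
               o.type_id,
               (SELECT COUNT(*) FROM a WHERE a.type_id = o.type_id
                    AND a.id IN (SELECT id FROM id_list)) AS count_a,
               (SELECT COUNT(*) FROM b WHERE b.type_id = o.type_id
                    AND b.id IN (SELECT id FROM id_list)) AS count_b,
               (SELECT COUNT(*) FROM c WHERE c.type_id = o.type_id
                    AND c.id IN (SELECT id FROM id_list)) AS count_c
        FROM o
        ORDER BY o.name
    """).fetchall()

# Hypothetical sample data, not taken from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE o (name TEXT, type_id INTEGER);
    CREATE TABLE a (id INTEGER, type_id INTEGER);
    CREATE TABLE b (id INTEGER, type_id INTEGER);
    CREATE TABLE c (id INTEGER, type_id INTEGER);
    INSERT INTO o VALUES ('widgets', 1), ('gadgets', 2);
    INSERT INTO a VALUES (1, 1), (2, 1), (5, 2);
    INSERT INTO b VALUES (1, 2), (3, 1);
    INSERT INTO c VALUES (2, 2), (4, 2);
""")
rows = counts_by_type(conn, "1, 2, 3, 1000")
for row in rows:
    print(row)
```

On SQL Server 2016 and later, the built-in STRING_SPLIT function can do the splitting on the server side instead, provided the list is passed as a single string.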

Related

How to select groups which has only the values we want and not select it if it also has other values in SQL

id code value
A cod 2
A buy 34
A cod 4
B cod 44
B F 23
C thk 45
C cod 33
C F 31
D cod 22
In this table, for example, I want the groups of id whose 'code' column contains ONLY the values cod and F. So the query should return the rows with id = B and nothing else: not id = C, because C also has 'thk' in code, and not id = D either. The output should contain only ids that have exactly the two mentioned values.
Expected output:
id code value
B cod 44
B F 23
You want all rows for an id for which no forbidden row exists:
select id, code, value
from mytable
where not exists
(
select null
from mytable forbidden_row
where forbidden_row.id = mytable.id
and forbidden_row.code not in ('cod', 'F')
);
One approach uses nested queries:
SELECT ID,Code, value FROM (
select ID, Code,
(SELECT count(*) FROM TableA a where Code = 'cod' and a.ID = TableA.ID) Cod,
(SELECT count(*) FROM TableA a where Code = 'F' and a.ID = TableA.ID) F,
(SELECT count(*) FROM TableA a where Code not in ('F','cod') and a.ID = TableA.ID) Other,
Value
from TableA
) SOURCE
WHERE Cod <> 0 AND F <> 0 and Other = 0
We can achieve this using a CTE:
-- Keep only the ids with exactly two rows, then require both 'cod' and 'F'.
WITH Count2 AS (
SELECT id
FROM YourTable
GROUP BY id
HAVING COUNT(id) = 2
),
codORF AS (
-- Columns are qualified with T1 because id exists in both joined sources
SELECT T1.id
FROM YourTable T1
INNER JOIN Count2 T2 ON T1.id = T2.id
WHERE T1.code = 'cod' OR T1.code = 'F'
GROUP BY T1.id
HAVING COUNT(DISTINCT T1.code) = 2
)
-- Finally, return all rows for the matching ids
SELECT T1.*
FROM YourTable T1
INNER JOIN codORF T2 ON T1.id = T2.id
with main as (
select *, count(id) over(partition by id order by id) as total_rows
from sample
), next_and_before as (
select *,
COALESCE(lag(code) over(partition by id order by id),lead(code) over(partition by id order by id)) as before_next
from main where total_rows <= 2
)
select * from next_and_before
where lower(trim(concat(code,before_next)))in('codf','fcod','cod','f')
It's a bit of a hacky solution:
First, you keep only the ids that have at most two rows. Since an id could have a single row with a code value of 'f' or 'cod', the final filter also accepts those; if you don't want that, simply change the last part to: in ('codf','fcod')
Then, for each remaining row, you look at the previous (or next) code within the same id and check whether the pair contains anything other than 'f' or 'cod'.
The where clause filters out the rows that do.
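For comparison, the requirement as stated in the question (both 'cod' and 'F' present, and no other code) can also be expressed with a single GROUP BY and HAVING. This is a minimal sketch using Python's built-in sqlite3 module and the sample rows from the question; the table name mytable is borrowed from the first answer, and the approach is an illustration rather than any answerer's method.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id TEXT, code TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        ("A", "cod", 2), ("A", "buy", 34), ("A", "cod", 4),
        ("B", "cod", 44), ("B", "F", 23),
        ("C", "thk", 45), ("C", "cod", 33), ("C", "F", 31),
        ("D", "cod", 22),
    ],
)

query = """
SELECT id, code, value
FROM mytable
WHERE id IN (
    SELECT id
    FROM mytable
    GROUP BY id
    -- no row with a code outside the allowed pair ...
    HAVING SUM(CASE WHEN code NOT IN ('cod', 'F') THEN 1 ELSE 0 END) = 0
    -- ... and both allowed codes are present
       AND COUNT(DISTINCT code) = 2
)
ORDER BY id, value DESC
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Dropping the COUNT(DISTINCT code) = 2 condition would also admit ids such as D that have only one of the two codes.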

Easy way to get a single resultset from three identical tables?

The DB I'm working with has three tables with identical column layouts: OPEX, NOPEX and CAPEX. I would like to query all three for items with a matching AssetId and get a single result set so that I can process them all at the same time in my .NET code.
The twist is that I do need to know which table they came from.
I know I can do this with a series of CASE in the SELECT clause, perhaps using the ID column in each where it's non-zero to decide which of the tables it came from. But I would have to have one for each column and the tables are pretty wide.
Is there some other way to solve this problem?
In order to get them into one set, you would use a combination of UNION and EXISTS() checks. The UNION ALL will give you a single result set that contains data from all three tables, and the EXISTS check on each will confirm the table you are querying from has corresponding records in the other tables.
SELECT *, 'OPEX' AS table_name
FROM OPEX o
WHERE EXISTS (
SELECT 1
FROM NOPEX n
WHERE n.asset_id = o.asset_id)
AND EXISTS (
SELECT 1
FROM CAPEX c
WHERE c.asset_id = o.asset_id)
UNION ALL
SELECT *, 'NOPEX' AS table_name
FROM NOPEX n
WHERE EXISTS (
SELECT 1
FROM Opex o
WHERE o.asset_id = n.asset_id)
AND EXISTS (
SELECT 1
FROM CAPEX c
WHERE c.asset_id = n.asset_id)
UNION ALL
SELECT *, 'CAPEX' AS table_name
FROM CAPEX c
WHERE EXISTS (
SELECT 1
FROM Opex o
WHERE o.asset_id = c.asset_id)
AND EXISTS (
SELECT 1
FROM NOPEX n
WHERE n.asset_id = c.asset_id)
I guess you could also do INNER JOINs?
SELECT c.*, 'CAPEX' AS table_name
FROM CAPEX c
INNER JOIN OPEX o
ON o.asset_id = c.asset_id
INNER JOIN NOPEX n
ON n.asset_id = c.asset_id
UNION ALL
SELECT o.*, 'OPEX' AS table_name
FROM OPEX o
INNER JOIN CAPEX c
ON c.asset_id = o.asset_id
INNER JOIN NOPEX n
ON n.asset_id = o.asset_id
UNION ALL
SELECT n.*, 'NOPEX' AS table_name
FROM NOPEX n
INNER JOIN OPEX o
ON o.asset_id = n.asset_id
INNER JOIN CAPEX c
ON c.asset_id = n.asset_id
Similar to dfundako's answer, but this resolves up front which AssetIDs are in all three tables and hits the indexes on the related tables less often:
;with cte as (
select
AssetID
from (
select distinct
AssetID
from Opex
union all
select distinct
AssetID
from Nopex
union all
select distinct
AssetID
from Capex
) as AssetIDs
group by AssetId
having count(AssetId) = 3
)
select 'Opex', * from Opex as o
inner join cte
on o.AssetID = cte.AssetID
union all
select 'Nopex', * from Nopex as n
inner join cte
on n.AssetID = cte.AssetID
union all
select 'Capex', * from Capex as c
inner join cte
on c.AssetID = cte.AssetID
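The idea the answers above share, tagging each branch of a UNION ALL with a literal table name, can be shown in isolation. The sketch below assumes the simpler reading of the question: fetch the rows for one AssetId from all three tables without requiring that the asset appear in every table. It uses Python's built-in sqlite3 module; the Id and Amount columns, the sample rows and the SourceTable alias are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE OPEX  (Id INTEGER, AssetId INTEGER, Amount REAL);
    CREATE TABLE NOPEX (Id INTEGER, AssetId INTEGER, Amount REAL);
    CREATE TABLE CAPEX (Id INTEGER, AssetId INTEGER, Amount REAL);
    INSERT INTO OPEX  VALUES (1, 10, 100.0), (2, 20, 50.0);
    INSERT INTO NOPEX VALUES (1, 10, 75.0);
    INSERT INTO CAPEX VALUES (1, 30, 20.0), (2, 10, 5.0);
""")

# Each branch adds a constant column naming its table, so no per-column CASE is needed.
query = """
SELECT 'OPEX' AS SourceTable, Id, AssetId, Amount FROM OPEX WHERE AssetId = :asset_id
UNION ALL
SELECT 'NOPEX' AS SourceTable, Id, AssetId, Amount FROM NOPEX WHERE AssetId = :asset_id
UNION ALL
SELECT 'CAPEX' AS SourceTable, Id, AssetId, Amount FROM CAPEX WHERE AssetId = :asset_id
"""
rows = conn.execute(query, {"asset_id": 10}).fetchall()
for source_table, row_id, asset_id, amount in rows:
    print(source_table, row_id, asset_id, amount)
```

Because the three tables share a column layout, the branches line up column for column, and the calling code can switch on the first column.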

Set contains other set in SQL

Does anybody know how to find the sets that contain all the elements of some other set?
table
id elem
1 A
1 C
2 B
2 D
2 A
2 C
3 A
3 E
3 F
4 F
4 F
I want to get this:
id
2
3
In this case id 2 contains (A, C), the whole set of id 1,
and id 3 contains (F), the whole set of id 4.
My query needs to return every id that contains the entire set (all elements) of at least one other id.
I would be very grateful for any help. Thank you.
I didn't have much to go on for this one, but it was interesting, so I guessed. I'm using SQL Server.
First I give the answer, which uses nested selects; then I walk through the logic step by step to clarify what's going on.
Build a quick table:
CREATE TABLE a (id int, t varchar(10))
INSERT INTO a (id, t)
VALUES (1,'A'),(1,'B'),(1,'B'),(1,'B'),(2,'A')
,(2,'B'),(2,'C'),(3,'C'),(3,'D'),(4,'A'),(5,'P');
Here is the solution:
select distinct p_id supersets from
(
select e.p_id, e.c_id, count(*) matches from (
select distinct c.id p_id, c.t p_t, d.id c_id, d.t c_t from a c
inner join a d on c.t = d.t and c.id <> d.id) e
group by e.p_id, e.c_id) sup
inner join
(select id, count(distinct t) reqs from a group by id) sub
on sub.id = sup.c_id and sup.matches = sub.reqs;
Here are the logical steps broken out to help explain why I'm doing what I'm doing:
--Step1
--Create a list of matches between (distinct) values where IDs are not the same
select distinct c.id p_id, c.t p_t, d.id c_id, d.t c_t from a c
inner join a d on c.t = d.t and c.id <> d.id;
--Step2
--Create a unique list of parent IDs and their child IDs and the # of distinct matches
--For example 2 has 2 matches with 1 and vice versa
select e.p_id, e.c_id, count(*) matches from (
select distinct c.id p_id, c.t p_t, d.id c_id, d.t c_t from a c
inner join a d on c.t = d.t and c.id <> d.id) e
group by e.p_id, e.c_id;
--Step2a
--Create a sub query to see how many distinct values are in each "Set"
select id, count(distinct t) reqs from a group by id;
Now we put it all together in the join (the first query above) to make sure the total number of matches from parent to child covers 100% of the child's values, i.e. the parent is a superset.
I think the following does what you want:
select t1.id
from t t1 join
(select t.*, count(*) over (partition by id) as cnt
from t
) t2
on t1.elem = t2.elem and t1.id <> t2.id
group by t1.id, t2.id, t2.cnt
having count(*) = cnt;
This matches each id to each other id based on the elements. If the number of matches equals the count in the second set, then all match -- and you have a superset.
I notice that you have duplicate elements. Let's handle this with a CTE:
-- The CTE gets its own name (td) so that it does not reference itself
with td as (
select distinct id, elem
from t
)
select t1.id
from td t1 join
(select td.*, count(*) over (partition by id) as cnt
from td
) t2
on t1.elem = t2.elem and t1.id <> t2.id
group by t1.id, t2.id, t2.cnt
having count(*) = t2.cnt;
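The counting argument above (a set is a superset of another when the number of shared distinct elements equals the size of the other set) can be checked against the question's data. This is a minimal sketch using Python's built-in sqlite3 module; the CTE names d and sizes are illustrative, and a grouped subquery stands in for the window function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, elem TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [
        (1, "A"), (1, "C"),
        (2, "B"), (2, "D"), (2, "A"), (2, "C"),
        (3, "A"), (3, "E"), (3, "F"),
        (4, "F"), (4, "F"),  # duplicate element, as in the question
    ],
)

query = """
WITH d AS (                      -- de-duplicated (id, elem) pairs
    SELECT DISTINCT id, elem FROM t
),
sizes AS (                       -- number of distinct elements per id
    SELECT id, COUNT(*) AS cnt FROM d GROUP BY id
)
SELECT DISTINCT sup.id
FROM d AS sup
JOIN d AS sub ON sup.elem = sub.elem AND sup.id <> sub.id
JOIN sizes ON sizes.id = sub.id
GROUP BY sup.id, sub.id, sizes.cnt
HAVING COUNT(*) = sizes.cnt      -- every element of sub was matched in sup
ORDER BY sup.id
"""
superset_ids = [row[0] for row in conn.execute(query)]
print(superset_ids)
```

With the question's rows this yields ids 2 and 3, matching the expected output.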

How to compare two tables in Hive based on counts

I have the Hive tables below:
Table_1
ID
1
1
2
Table_2
ID
1
2
2
I am comparing the two tables based on the count of each ID in both tables. I need output like the following:
ID
1 - 2 records in Table 1 and 1 record in Table 2
2 - 1 record in Table 1 and 2 records in Table 2
Table_1 is the parent table.
I am using the queries below:
select count(*),ID from Table_1 group by ID;
select count(*),ID from Table_2 group by ID;
Just do a full outer join on your queries with the on condition as X.id = Y.id, and then select * from the resultant table checking for nulls on either side.
SELECT COALESCE(X.id, Y.id) AS id,
CONCAT(COALESCE(X.cnt1, 0), ' entries in table 1, ', COALESCE(Y.cnt2, 0), ' entries in table 2')
FROM (SELECT COUNT(*) AS cnt1, id FROM table_1 GROUP BY id) X
FULL OUTER JOIN (SELECT COUNT(*) AS cnt2, id FROM table_2 GROUP BY id) Y
ON X.id = Y.id
Try this. You could use a CASE expression to choose between 'record' and 'records'.
SELECT m.id,
CONCAT (COALESCE(a.ct, 0), ' record in table 1, ', COALESCE(b.ct, 0),
' record in table 2')
FROM (SELECT id
FROM table_1
UNION
SELECT id
FROM table_2) m
LEFT JOIN (SELECT Count(*) AS ct,
id
FROM table_1
GROUP BY id) a
ON m.id = a.id
LEFT JOIN (SELECT Count(*) AS ct,
id
FROM table_2
GROUP BY id) b
ON m.id = b.id;
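The UNION plus LEFT JOIN pattern above can be tried without a Hive cluster. The sketch below runs the same shape of query through Python's built-in sqlite3 module, which is an assumption made only for portability, and does the singular/plural wording in Python instead of a CASE expression. The rows for ids 1 and 2 come from the question; id 3 is an added, hypothetical row that exists only in table_2 to show the COALESCE fallback.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_1 (id INTEGER);
    CREATE TABLE table_2 (id INTEGER);
    INSERT INTO table_1 VALUES (1), (1), (2);
    INSERT INTO table_2 VALUES (1), (2), (2), (3);  -- id 3 is an added example row
""")

query = """
SELECT m.id,
       COALESCE(a.ct, 0) AS ct_1,
       COALESCE(b.ct, 0) AS ct_2
FROM (SELECT id FROM table_1 UNION SELECT id FROM table_2) m
LEFT JOIN (SELECT id, COUNT(*) AS ct FROM table_1 GROUP BY id) a ON m.id = a.id
LEFT JOIN (SELECT id, COUNT(*) AS ct FROM table_2 GROUP BY id) b ON m.id = b.id
ORDER BY m.id
"""
counts = conn.execute(query).fetchall()

def describe(n):
    """Format a count with the right singular or plural noun."""
    return f"{n} record" if n == 1 else f"{n} records"

lines = [
    f"{row_id} - {describe(ct_1)} in table 1 and {describe(ct_2)} in table 2"
    for row_id, ct_1, ct_2 in counts
]
for line in lines:
    print(line)
```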
You could use this Python program to do a full comparison of 2 Hive tables:
https://github.com/bolcom/hive_compared_bq
If you want a quick comparison just based on counts, then pass the "--just-count" option (you can also specify the group by column with "--group-by-column").
The script also allows you to visually see all the differences on all rows and all columns if you want a complete validation.

Select on records based on two column criterias

I would like to do an SQL query to select from the following table:
id type num
1 a 3
1 b 4
2 a 5
2 c 6
I want the rows that share the same 'id' and have type 'a' or 'b', so the result would look something like this:
id type num
1 a 3
1 b 4
Does anyone have any idea how that can be accomplished?
SELECT table1.*
FROM table1,
(
SELECT COUNT(*) as cnt, id
FROM (
SELECT *
FROM table1
WHERE type = 'a' OR type = 'b'
) sub1
GROUP BY id
HAVING cnt > 1
)sub2
WHERE table1.id = sub2.id
Tested here: http://sqlfiddle.com/#!2/4a031/1 seems to work fine.
Method 1:
select a.*
from some_table t
join some_table a on a.id = t.id and a.type = 'a'
join some_table b on b.id = t.id and b.type = 'b'
Method 2:
select *
from some_table t
where exists ( select *
from some_table x
where x.id = t.id
and x.type = 'a'
)
and exists ( select *
from some_table x
where x.id = t.id
and x.type = 'b'
)
The first technique can produce duplicate rows in the result set, depending on the cardinality of id and type. The latter is guaranteed to return a proper subset of the table.
Either query, assuming you have reasonable indices defined on the table, should provide roughly equivalent performance.
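A third variant, shown here only as a sketch, restricts the rows to types 'a' and 'b' and keeps the ids that have both, using COUNT(DISTINCT type) so that repeated rows do not inflate the count. It uses Python's built-in sqlite3 module with the question's sample rows; the table name table1 is borrowed from the first answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER, type TEXT, num INTEGER)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [(1, "a", 3), (1, "b", 4), (2, "a", 5), (2, "c", 6)],
)

query = """
SELECT id, type, num
FROM table1
WHERE type IN ('a', 'b')
  AND id IN (
      SELECT id
      FROM table1
      WHERE type IN ('a', 'b')
      GROUP BY id
      HAVING COUNT(DISTINCT type) = 2   -- both 'a' and 'b' are present
  )
ORDER BY id, type
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Each qualifying row is returned once, because the filter is an IN test against the grouped ids rather than a join.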