SQL: Joining data from table with multiple conditions where value matches - sql

Table1 has columns POS, ID, NAME, TYPE
I have been running a standard query as so;
SELECT POS, NAME FROM Table1 WHERE ID = 100 AND TYPE LIKE '%someType%' ORDER BY POS ASC
Table2 has columns POS, ID, VALUE, ROLE
SELECT POS, VALUE FROM Table2 WHERE ID = 100 AND ROLE LIKE '%someType%' ORDER BY POS ASC
I would like to combine these two to return a recordset with 3 columns: POS, Table1.NAME, and Table2.VALUE. No matter what I try with inner joins, I keep getting way more rows than I should. Also, if the corresponding value in the other table does not exist, I would like it to return a null or something, so that a recordset could look like this:
POS NAME VALUE
1 A DF1
2 B DF1
3 C DF2
4 C null
5 null DF3
etc...
Is this possible at all?

You seem to be looking for a simple join. Something like:
select t1.pos, t1.name, t2.value
from table1 t1
inner join table2 t2
on t2.pos = t1.pos
and t2.id = t1.id
and t2.role like '%someType%'
where
t1.type like '%someType%'
and t1.id = 100
order by t1.pos
Please note that if either of your original queries returns more than one row for the same pos, you will get a cartesian product of those rows in the result set.
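To make the row multiplication concrete, here is a minimal sketch that runs the inner join through Python's built-in sqlite3 module. SQLite and the sample rows are assumptions chosen for illustration, not the asker's data: two Table1 rows and three Table2 rows share pos 1, so the join should return 2 × 3 = 6 rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (pos INTEGER, id INTEGER, name TEXT, type TEXT);
    CREATE TABLE table2 (pos INTEGER, id INTEGER, value TEXT, role TEXT);
    -- Hypothetical rows: pos 1 appears twice in table1 and three times in table2.
    INSERT INTO table1 VALUES (1, 100, 'A', 'someType'), (1, 100, 'B', 'someType');
    INSERT INTO table2 VALUES (1, 100, 'X', 'someType'), (1, 100, 'Y', 'someType'),
                              (1, 100, 'Z', 'someType');
""")

dup_rows = conn.execute("""
    SELECT t1.pos, t1.name, t2.value
    FROM table1 t1
    INNER JOIN table2 t2
            ON t2.pos = t1.pos
           AND t2.id = t1.id
           AND t2.role LIKE '%someType%'
    WHERE t1.type LIKE '%someType%'
      AND t1.id = 100
    ORDER BY t1.name, t2.value
""").fetchall()

# Every table1 row at pos 1 pairs with every table2 row at pos 1.
print(len(dup_rows))
```

If the real data has this shape, the extra rows come from the data rather than from the join syntax, and one side needs to be deduplicated or aggregated before joining.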
Or if you want to allow rows coming from both tables, even if there is no match in the other table, then you can full outer join both queries, like:
select coalesce(t1.pos, t2.pos) pos, t1.name, t2.value
from (
select pos, name from table1 where id = 100 and type like '%someType%'
) t1
full outer join (
select pos, value from table2 where id = 100 and role like '%someType%'
) t2
on t1.pos = t2.pos
order by coalesce(t1.pos, t2.pos)
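The full outer join can be tried on rows shaped like the expected output in the question. This is a sketch under assumptions: it uses Python's sqlite3 module, the sample rows are invented to reproduce the asker's table, and because SQLite builds older than 3.39 have no FULL OUTER JOIN, the join is emulated with a LEFT JOIN plus a UNION ALL of the unmatched table2 rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (pos INTEGER, id INTEGER, name TEXT, type TEXT);
    CREATE TABLE table2 (pos INTEGER, id INTEGER, value TEXT, role TEXT);
    -- Hypothetical rows: table1 has no pos 5, table2 has no pos 4.
    INSERT INTO table1 VALUES (1, 100, 'A', 'someType'), (2, 100, 'B', 'someType'),
                              (3, 100, 'C', 'someType'), (4, 100, 'C', 'someType');
    INSERT INTO table2 VALUES (1, 100, 'DF1', 'someType'), (2, 100, 'DF1', 'someType'),
                              (3, 100, 'DF2', 'someType'), (5, 100, 'DF3', 'someType');
""")

rows = conn.execute("""
    WITH t1 AS (SELECT pos, name  FROM table1 WHERE id = 100 AND type LIKE '%someType%'),
         t2 AS (SELECT pos, value FROM table2 WHERE id = 100 AND role LIKE '%someType%')
    -- All table1 rows, with the matching table2 value where one exists.
    SELECT t1.pos, t1.name, t2.value
    FROM t1 LEFT JOIN t2 ON t2.pos = t1.pos
    UNION ALL
    -- Plus the table2 rows that have no partner in table1.
    SELECT t2.pos, NULL, t2.value
    FROM t2 LEFT JOIN t1 ON t1.pos = t2.pos
    WHERE t1.pos IS NULL
    ORDER BY 1
""").fetchall()

for row in rows:
    print(row)
```

On engines with native FULL OUTER JOIN support, the query from the answer can be used directly and the emulation is unnecessary.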

Related

Joining and grouping to equate on two tables

I've tried to minify this problem as much as possible. I've got two tables which share some Id's (among other columns)
id id
---- ----
1 1
1 1
2 1
2
2
Firstly, I can get each table to resolve to a simple count of how many of each Id there is:
select id, count(*) from tbl1 group by id
select id, count(*) from tbl2 group by id
id | tbl1-count id | tbl2-count
--------------- ---------------
1 2 1 3
2 1 2 2
but then I'm at a loss, I'm trying to get the following output which shows the count from tbl2 for each id, divided by the count from tbl1 for the same id:
id | count of id in tbl2 / count of id in tbl1
==========
1 | 1.5
2 | 2
So far I've got this:
select tbl1.Id, tbl2.Id, count(*)
from tbl1
join tbl2 on tbl1.Id = tbl2.Id
group by tbl1.Id, tbl2.Id
which just gives me... well... something nowhere near what I need, to be honest! I was trying count(tbl1.Id), count(tbl2.Id) but get the same multiplied amount (because I'm joining I guess?) - I can't get the individual representations into individual columns where I can do the division.
This gives consideration to your naming of tables -- the query from tbl2 needs to be first so the results will include all records from tbl2. The LEFT JOIN will include all results from the first query, but only join those results that exist in tbl1. (Alternatively, you could use a FULL OUTER JOIN or UNION both results together in the first query.) I also added an IIF to give you an option if there are no records in tbl1 (dividing by null would produce null anyway, but you can do what you want).
Counts are cast as decimal so that the ratio will be returned as a decimal. You can adjust precision as required.
SELECT tb2.id, tb2.table2Count, tb1.table1Count,
IIF(ISNULL(tb1.table1Count, 0) != 0, tb2.table2Count / tb1.table1Count, null) AS ratio
FROM (
SELECT id, CAST(COUNT(1) AS DECIMAL(18, 5)) AS table2Count
FROM tbl2
GROUP BY id
) AS tb2
LEFT JOIN (
SELECT id, CAST(COUNT(1) AS DECIMAL(18, 5)) AS table1Count
FROM tbl1
GROUP BY id
) AS tb1 ON tb1.id = tb2.id
(A subquery with a LEFT JOIN lets the query optimizer decide how to generate the results and will generally outperform a CROSS APPLY, which executes its calculation for every record.)
Assuming your expected results are wrong, then this is how I would do it:
CREATE TABLE T1 (ID int);
CREATE TABLE T2 (ID int);
GO
INSERT INTO T1 VALUES(1),(1),(2);
INSERT INTO T2 VALUES(1),(1),(1),(2),(2);
GO
SELECT T1.ID AS OutID,
(T2.T2Count * 1.) / COUNT(T1.ID) AS OutCount --Might want a CONVERT to a smaller scale and precision decimal here
FROM T1
CROSS APPLY (SELECT T2.ID, COUNT(T2.ID) AS T2Count
FROM T2
WHERE T2.ID = T1.ID
GROUP BY T2.ID) T2
GROUP BY T1.ID,
T2.T2Count;
GO
DROP TABLE T1;
DROP TABLE T2;
You can aggregate in subqueries and then join:
select t1.id, t2.cnt * 1.0 / t1.cnt
from (select id, count(*) as cnt
from tbl1
group by id
) t1 join
(select id, count(*) as cnt
from tbl2
group by id
) t2
on t1.id = t2.id
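As a quick check of the aggregate-then-join approach, the following sketch loads the ids from the question into an in-memory SQLite database (SQLite via Python's sqlite3 is an assumption; the thread itself uses SQL Server) and compares it with joining first and counting afterwards.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl1 (id INTEGER);
    CREATE TABLE tbl2 (id INTEGER);
    INSERT INTO tbl1 VALUES (1), (1), (2);
    INSERT INTO tbl2 VALUES (1), (1), (1), (2), (2);
""")

# Joining before counting multiplies rows: id 1 gives 2 * 3 joined rows.
naive = conn.execute("""
    SELECT tbl1.id, COUNT(*)
    FROM tbl1 JOIN tbl2 ON tbl1.id = tbl2.id
    GROUP BY tbl1.id
    ORDER BY tbl1.id
""").fetchall()

# Counting per table first keeps one row per id on each side of the join.
ratios = conn.execute("""
    SELECT t1.id, t2.cnt * 1.0 / t1.cnt AS ratio
    FROM (SELECT id, COUNT(*) AS cnt FROM tbl1 GROUP BY id) t1
    JOIN (SELECT id, COUNT(*) AS cnt FROM tbl2 GROUP BY id) t2
      ON t1.id = t2.id
    ORDER BY t1.id
""").fetchall()

print(naive)
print(ratios)
```

The multiplication by 1.0 forces a non-integer division; without it, integer counts would be divided as integers.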

Querying two tables to filter data using select case

I have two tables
Table 1 looks like this
ID Repeats
-----------
A 1
A 1
A 0
B 2
B 2
C 2
D 1
Table 2 looks like this
ID values
-----------
A 100
B 200
C 100
D 300
Using a view I need a result like this
ID values Repeats
-------------------
A 100 NA
B 200 2
C 100 2
D 300 1
That is, I want each unique ID with its values and Repeats. Repeats should display NA when there are multiple different values for a single ID, and the Repeats value itself when there is only one.
Initially I needed to display the max value of repeats so I tried the following view
ALTER VIEW [dbo].[BookingView1]
AS
SELECT bv.*, bd2.Repeats FROM Table1 bv
JOIN
(
SELECT distinct bd.id, bd.Repeats FROM table2 bd
JOIN
(
SELECT Id, MAX(Repeats) AS MaxRepeatCount
FROM table2
GROUP BY Id
) bd1
ON bd.Id = bd1.Id
AND bd.Repeats = bd1.MaxRepeatCount
) bd2
ON bv.Id = bd2.Id;
and this returns the correct result but when trying to implement the CASE it fails to return unique ID results. Please help!!
One method uses outer apply:
select t2.*, t1.repeats
from table2 t2 outer apply
(select (case when max(repeats) = min(repeats) then max(repeats)
else 'NA'
end) as repeats
from table1 t1
where t1.id = t2.id
) t1;
Two notes:
This assumes that repeats is a string. If it is a number, you need to cast it to a string.
It also assumes that repeats is never NULL.
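The MIN/MAX comparison can be checked against the sample data from the question. The sketch below is a variation, not the answer's exact query: SQLite (used here through Python's sqlite3) has no OUTER APPLY, so it uses a LEFT JOIN with GROUP BY instead, renames the values column to val to avoid the keyword, and casts the numeric repeats to text so both CASE branches return strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id TEXT, repeats INTEGER);
    CREATE TABLE table2 (id TEXT, val INTEGER);  -- val stands in for the "values" column
    INSERT INTO table1 VALUES ('A', 1), ('A', 1), ('A', 0),
                              ('B', 2), ('B', 2), ('C', 2), ('D', 1);
    INSERT INTO table2 VALUES ('A', 100), ('B', 200), ('C', 100), ('D', 300);
""")

rows = conn.execute("""
    SELECT t2.id, t2.val,
           CASE WHEN MIN(t1.repeats) = MAX(t1.repeats)
                THEN CAST(MAX(t1.repeats) AS TEXT)  -- single distinct value
                ELSE 'NA'                           -- several distinct values
           END AS repeats
    FROM table2 t2
    LEFT JOIN table1 t1 ON t1.id = t2.id
    GROUP BY t2.id, t2.val
    ORDER BY t2.id
""").fetchall()

for row in rows:
    print(row)
```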
For the sake of completeness, I'm including another approach that will work if repeats is NULL. However, Gordon's answer has a much simpler query plan and should be preferred.
Option 1 (Works with NULLs):
SELECT
t1.ID, t2.[Values],
CASE
WHEN COUNT(*) > 1 THEN 'NA'
ELSE CAST(MAX(Repeats) AS VARCHAR(2))
END Repeats
FROM (
SELECT DISTINCT t1.ID, t1.Repeats
FROM #table1 t1
) t1
LEFT OUTER JOIN #table2 t2
ON t1.ID = t2.ID
GROUP BY t1.ID, t2.[Values]
Option 2 (does not contain explicit subqueries, but does not work with NULLs):
SELECT DISTINCT
t1.ID,
t2.[Values],
CASE
WHEN COUNT(t1.Repeats) OVER (PARTITION BY COUNT(DISTINCT t1.Repeats), t1.ID) > 1 THEN 'NA'
ELSE CAST(t1.Repeats AS VARCHAR(2))
END Repeats
FROM #table1 t1
LEFT OUTER JOIN #table2 t2
ON t1.ID = t2.ID
GROUP BY t1.ID, t2.[Values], t1.Repeats
NOTE:
This may not give desired results if table2 has different values for the same ID.

array_agg contains another array_agg

t1
id|entity_type
9|3
9|4
9|5
2|3
2|5
t2
id|entity_type
1|3
1|4
1|5
SELECT t1.id, array_agg(t1.entity_type)
FROM t1
GROUP BY
t1.id
HAVING ARRAY_AGG(t1.entity_type by t1.entity_type) =
(SELECT ARRAY_AGG(t2.entity_type by t2.entity_type)
FROM t2
WHERE t2.id = 1
GROUP BY t2.id);
Result:
t1.id = 9|array_agg{3,4,5}
I have two tables t1 and t2. I want to get value of t1.id where t1.entity_type array equals t2.entity_type array.
In this scenario everything works fine. For t2.id = 1 I receive t1.id = 9.
Both have the same array of entity_type: {3,4,5}
Now I'd like to get t1.id not only for equal sets, but also for smaller sets.
If I modify t2 this way:
t2
id|entity_type
1|3
1|4
and modify query this way:
SELECT t1.id, array_agg(t1.entity_type)
FROM t1
GROUP BY
t1.id
HAVING ARRAY_AGG(t1.entity_type by t1.entity_type) >= /*MODIFICATION*/
(SELECT ARRAY_AGG(t2.entity_type by t2.entity_type)
FROM t2
WHERE t2.id = 1
GROUP BY t2.id);
I don't receive the expected result:
t1.id = 9 has {3, 4, 5}
t2.id = 1 has {3, 4}
Arrays in t1 that contain the array in t2 should qualify. I expect to receive results as in first case but I get no rows.
Is there any method like: ARRAY_AGG contains another ARRAY_AGG?
Clean up
First of all, syntax error. I assume you mean:
ARRAY_AGG(t1.entity_type ORDER BY t1.entity_type)
Details in the manual.
Next, it would be inefficient to use two differing invocations of array_agg(). Use the same (ORDER BY in SELECT list and HAVING clause):
SELECT id, array_agg(entity_type ORDER BY entity_type) AS arr
FROM t1
GROUP BY 1
HAVING array_agg(entity_type ORDER BY entity_type) = (
SELECT array_agg(entity_type ORDER BY entity_type)
FROM t2
WHERE id = 1
-- GROUP BY id -- not needed
);
"contains" operator @>
Like Nick commented, your 2nd query would work with the "contains" operator @>
SELECT id, array_agg(entity_type ORDER BY entity_type) AS arr
FROM t1
GROUP BY 1
HAVING array_agg(entity_type ORDER BY entity_type) @> (
SELECT array_agg(entity_type ORDER BY entity_type)
FROM t2
WHERE id = 1
);
But this is very inefficient for big tables.
Faster query
This is a case of relational division. Depending on your (missing) exact table definition, there are more efficient techniques. We have gathered a whole arsenal under this related question:
How to filter SQL results in a has-many-through relation
Assuming (id, entity_type) is unique in both tables, this should be substantially faster for big tables, especially because it can use an index on t1 (as opposed to your original query):
SELECT t1.id
FROM t2
JOIN t1 USING (entity_type)
WHERE t2.id = 1
GROUP BY 1
HAVING count(*) = (SELECT count(*) FROM t2 WHERE id = 1);
You need two indexes:
First on t2.id, typically covered by the primary key.
Second on t1.entity_type:
CREATE INDEX t1_foo_idx ON t1 (entity_type, id);
The added id column is optional to allow index-only scans. Sequence of columns is essential:
Is a composite index also good for queries on the first field?
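The count-based relational division query can be tried on the sample rows from the question, with t2 reduced to {3, 4}. This is a sketch that assumes SQLite through Python's sqlite3 module rather than PostgreSQL, and, like the answer, assumes (id, entity_type) is unique in both tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER, entity_type INTEGER, UNIQUE (id, entity_type));
    CREATE TABLE t2 (id INTEGER, entity_type INTEGER, UNIQUE (id, entity_type));
    INSERT INTO t1 VALUES (9, 3), (9, 4), (9, 5), (2, 3), (2, 5);
    INSERT INTO t2 VALUES (1, 3), (1, 4);
""")

# A t1.id qualifies when it matches every entity_type listed for t2.id = 1,
# i.e. its number of matches equals the number of rows in that t2 set.
matching_ids = conn.execute("""
    SELECT t1.id
    FROM t2
    JOIN t1 USING (entity_type)
    WHERE t2.id = 1
    GROUP BY t1.id
    HAVING COUNT(*) = (SELECT COUNT(*) FROM t2 WHERE id = 1)
""").fetchall()

print(matching_ids)
```

Here id 9 matches both 3 and 4, while id 2 matches only 3 and is filtered out by the HAVING clause.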

Select rows from table such that sum of computed values of their column is less than given limit

I have a table myTable of the following structure:
id: int PRIMARY KEY
number: int
I'd like to randomly select 3 rows at most from myTable under the conditions:
Three rows at most should be selected
We should select id and a value calculated as 0.3*RAND()*number converted to INT. The alias of the computed column is randomValue
Only rows with randomValue>0 should be included in the result
Sum of randomValues should be less than a given threshold, say 60.
So far, I've written this:
SELECT TOP 3 id,randomValue
FROM(
SELECT id, CONVERT(INT,(0.3*RAND()*number)) AS randomValue
FROM myTable
WHERE number>0
) AS D
WHERE randomValue>0
ORDER BY NEWID()
The code above selects at most 3 random rows where randomValue is greater than zero. However, I don't know how to fulfill condition 4, i.e. how to achieve that sum of randomValues in selected rows is less than 60.
This is myTable where I'm testing the solutions:
WITH
random_values AS (
SELECT
id,
CONVERT(INT,(0.3*RAND()*number)) AS randomValue
FROM myTable
WHERE number>0
),
valid_sets AS (
SELECT
t1.id id1,
t2.id id2,
t3.id id3
FROM random_values t1
INNER JOIN random_values t2 ON (t2.id > t1.id)
INNER JOIN random_values t3 ON (t3.id > t2.id)
WHERE t1.randomValue + t2.randomValue + t3.randomValue < 60
)
SELECT c.id,c.number
FROM (SELECT TOP 1 * FROM valid_sets ORDER BY NEWID()) a
UNPIVOT(id FOR n IN (id1,id2,id3)) b
INNER JOIN myTable c ON (b.id = c.id)
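The core of this approach is the valid_sets step: a triple self-join that enumerates every combination of three distinct rows and keeps those whose sum is under the limit. The sketch below isolates that step with Python's sqlite3 module. The random_values table and its fixed numbers are hypothetical stand-ins for the computed column, so the result is reproducible, and random.choice stands in for TOP 1 ... ORDER BY NEWID().

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, fixed stand-ins for the randomly computed values.
    CREATE TABLE random_values (id INTEGER PRIMARY KEY, randomValue INTEGER);
    INSERT INTO random_values VALUES (1, 10), (2, 20), (3, 25), (4, 15);
""")

# The id inequalities make each combination appear exactly once.
valid_sets = conn.execute("""
    SELECT t1.id, t2.id, t3.id
    FROM random_values t1
    JOIN random_values t2 ON t2.id > t1.id
    JOIN random_values t3 ON t3.id > t2.id
    WHERE t1.randomValue + t2.randomValue + t3.randomValue < 60
    ORDER BY t1.id, t2.id, t3.id
""").fetchall()

# Pick one qualifying combination at random.
chosen = random.choice(valid_sets)
print(valid_sets, chosen)
```

The combination (2, 3, 4) sums to exactly 60 and is excluded by the strict comparison. Note that this join only produces sets of exactly three rows, so smaller sets would need separate handling.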

mysql - union tables by unique field

I have two tables with the same structure:
id name
1 Merry
2 Mike
and
id name
1 Mike
2 Alis
I need to union second table to first with keeping unique names, so that result is:
id name
1 Merry
2 Mike
3 Alis
Is it possible to do this with MySQL query, without using php script?
This is not a join (set multiplication), this is a union (set addition).
SELECT @r := @r + 1 AS id, name
FROM (
SELECT @r := 0
) vars,
(
SELECT name
FROM table1
UNION
SELECT name
FROM table2
) q
This will select all names from table1 and combine those with all the names from table2 which are not in table1.
(
select *
from table1
)
union
(
select t2.*
from table2 t2
left join table1 t1 on t2.name = t1.name
where t1.id is null
)
Use:
SELECT a.id,
a.name
FROM TABLE_A a
UNION
SELECT b.id,
b.name
FROM TABLE_B b
UNION will remove duplicates.
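One caveat worth checking: UNION compares whole rows, so whether duplicates disappear depends on which columns are selected. The sketch below uses the two sample tables from the question in an in-memory SQLite database (SQLite via Python's sqlite3 is an assumption; the question is about MySQL, which treats UNION the same way in this respect).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, name TEXT);
    CREATE TABLE table2 (id INTEGER, name TEXT);
    INSERT INTO table1 VALUES (1, 'Merry'), (2, 'Mike');
    INSERT INTO table2 VALUES (1, 'Mike'), (2, 'Alis');
""")

# Selecting only name: the two 'Mike' rows are identical and collapse into one.
names_only = conn.execute(
    "SELECT name FROM table1 UNION SELECT name FROM table2"
).fetchall()

# Selecting id as well: (2, 'Mike') and (1, 'Mike') differ, so both survive.
with_ids = conn.execute(
    "SELECT id, name FROM table1 UNION SELECT id, name FROM table2"
).fetchall()

print(sorted(names_only))
print(len(with_ids))
```

With this data, deduplicating by name therefore means leaving id out of the UNION and generating new ids afterwards.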
As commented, it all depends on what your 'id' means, because in the example it means nothing.
SELECT DISTINCT(name) FROM t1 JOIN t2 ON something
if you only want the names
SELECT SUM(something), name FROM t1 JOIN t2 ON something GROUP BY name
if you want to do some group by
SELECT DISTINCT(name) FROM t1 JOIN t2 ON t1.id = t2.id
if the id's are the same
SELECT DISTINCT COALESCE(t1.name,t2.name) FROM
mytable t1 LEFT JOIN mytable t2 ON (t1.name=t2.name);
will get you a list of unique names from the 2 tables. If you want them to get new ids (like Alis does in your desired results), that's something else and requires the answers to a couple of questions:
Do any of the names need to keep their previous id? If so, which table's id should be preferred?
Why do you have 2 tables with the same structure, i.e. what are you trying to accomplish when you generate the unique name list?