Scalar subquery produced more than one element, using UNNEST - sql

I have the following query. From what I have read, UNNEST should be used here, but I don't know how:
select
(
select
case
when lin = 4 then 1
when lin != 4 then null
end as FLAG,
from table_1
WHERE
table_1.date = table_2.date
AND
table_1.acc = table_2.acc
) as FLAG
FROM
table_2
I have a lot of subqueries and that's why I can't use LEFT JOIN.
Currently table_2 has 13 million records and table_1 has 400 million. What I want is to show a FLAG for each account, knowing that the two tables cover different sets of data.

I can't use LEFT JOIN ...
Scalar subquery produced more than one element ...
Simply use the version of your query below. It is logically equivalent to your original query but eliminates the issue you have:
select *,
(
select 1 as FLAG
from table_1
WHERE
table_1.date = table_2.date
AND
table_1.acc = table_2.acc
AND lin = 4
LIMIT 1
) as FLAG
FROM
table_2

This indicates that you expect to see one and only one value from the subquery but when you join to table_1 you are getting duplicates on the date and acc fields. If you aggregate the value or remove the duplicates from table_1 that should solve your issue, although at that point why not just use a more efficient JOIN?
-- This will solve your immediate problem, use any aggregation
-- technique, I picked MAX because why not. I do not know your use case.
select
(
select
MAX(case
when lin = 4 then 1
when lin != 4 then null
end) as FLAG
from table_1
WHERE
table_1.date = table_2.date
AND
table_1.acc = table_2.acc
) as FLAG
FROM
table_2
A better way to do this would be
select
case
when t1.lin = 4 then 1
when t1.lin != 4 then null
end as FLAG
FROM
table_2 t2
LEFT JOIN
table_1 t1 ON (
t1.date = t2.date
AND t1.acc = t2.acc
)
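One caveat with the plain LEFT JOIN above: because table_1 can hold several rows per (date, acc), the join returns one output row per matching table_1 row rather than one per table_2 row. A possible workaround, sketched here under the assumption that the flag should be 1 whenever at least one matching row has lin = 4, is to collapse table_1 to one row per key before joining. The CTEs are toy data standing in for the real tables, and the alias f is introduced only for this illustration.

```sql
-- Toy data standing in for the real tables (assumed values).
WITH table_1 AS (
  SELECT '2024-01-01' AS date, 10 AS acc, 4 AS lin UNION ALL
  SELECT '2024-01-01', 10, 4 UNION ALL   -- duplicate (date, acc) key
  SELECT '2024-01-01', 20, 7
),
table_2 AS (
  SELECT '2024-01-01' AS date, 10 AS acc UNION ALL
  SELECT '2024-01-01', 20 UNION ALL
  SELECT '2024-01-01', 30
)
SELECT
  t2.date,
  t2.acc,
  f.FLAG              -- 1 when a matching lin = 4 row exists, otherwise NULL
FROM table_2 AS t2
LEFT JOIN (
  -- At most one row per (date, acc), so the join cannot multiply rows.
  SELECT date, acc, 1 AS FLAG
  FROM table_1
  WHERE lin = 4
  GROUP BY date, acc
) AS f
  ON  f.date = t2.date
  AND f.acc  = t2.acc
```

With the toy data, each of the three table_2 rows comes back exactly once, and only acc 10 gets FLAG = 1.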
As you said, left joins do not work for you. If you want multiple results to be nested in the same subquery then just wrap your subquery in an ARRAY() function to allow for repeated values.
select
ARRAY(
select
case
when lin = 4 then 1
when lin != 4 then null
end as FLAG
from table_1
WHERE
table_1.date = table_2.date
AND
table_1.acc = table_2.acc
-- Now you must filter out results that will return
-- NULL because NULL is not allowed in an ARRAY
AND
table_1.lin = 4
) as FLAG
FROM
table_2
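To confirm that duplicated (date, acc) pairs in table_1 are what triggers the error, a quick check along these lines may help. It is only a sketch: the CTE is toy data standing in for the real table, and row_count is a name chosen for this example.

```sql
-- Toy data standing in for table_1 (assumed values).
WITH table_1 AS (
  SELECT '2024-01-01' AS date, 10 AS acc, 4 AS lin UNION ALL
  SELECT '2024-01-01', 10, 5 UNION ALL   -- same key as the row above
  SELECT '2024-01-01', 20, 7
)
-- List the keys for which a correlated scalar subquery
-- would see more than one row.
SELECT date, acc, COUNT(*) AS row_count
FROM table_1
GROUP BY date, acc
HAVING COUNT(*) > 1
ORDER BY row_count DESC
```

Any key returned here is one for which the original scalar subquery would produce more than one element.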

Related

Querying two tables to filter data using select case

I have two tables
Table 1 looks like this
ID Repeats
-----------
A 1
A 1
A 0
B 2
B 2
C 2
D 1
Table 2 looks like this
ID values
-----------
A 100
B 200
C 100
D 300
Using a view I need a result like this
ID values Repeats
-------------------
A 100 NA
B 200 2
C 100 2
D 300 1
That is, I want each unique ID with its values and Repeats. Repeats should display NA when there are multiple distinct Repeats values for a single ID, and the Repeats value itself when there is only one.
Initially I needed to display the max value of repeats so I tried the following view
ALTER VIEW [dbo].[BookingView1]
AS
SELECT bv.*, bd2.Repeats FROM Table1 bv
JOIN
(
SELECT distinct bd.id, bd.Repeats FROM table2 bd
JOIN
(
SELECT Id, MAX(Repeats) AS MaxRepeatCount
FROM table2
GROUP BY Id
) bd1
ON bd.Id = bd1.Id
AND bd.Repeats = bd1.MaxRepeatCount
) bd2
ON bv.Id = bd2.Id;
This returns the correct result, but when I try to add the CASE it no longer returns one row per ID. Please help!
One method uses outer apply:
select t2.*, t1.repeats
from table2 t2 outer apply
(select (case when max(repeats) = min(repeats) then max(repeats)
else 'NA'
end) as repeats
from table1 t1
where t1.id = t2.id
) t1;
Two notes:
This assumes that repeats is a string. If it is a number, you need to cast it to a string.
It also assumes that repeats is not NULL.
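For the numeric case mentioned in the first note, a T-SQL sketch might look like the following. The cast matters because a CASE expression that mixes an integer branch with 'NA' would try to convert 'NA' to a number. The CTEs only reproduce the sample data from the question, and VARCHAR(10) is an assumed width.

```sql
-- T-SQL sketch; the CTEs stand in for the tables in the question.
WITH table1 AS (
    SELECT * FROM (VALUES ('A', 1), ('A', 1), ('A', 0),
                          ('B', 2), ('B', 2), ('C', 2), ('D', 1)
                  ) AS v (ID, Repeats)
),
table2 AS (
    SELECT * FROM (VALUES ('A', 100), ('B', 200), ('C', 100), ('D', 300)
                  ) AS v (ID, [values])
)
SELECT t2.ID, t2.[values], r.Repeats
FROM table2 t2
OUTER APPLY (
    -- MIN = MAX means every row for this ID carries the same value.
    SELECT CASE WHEN MAX(t1.Repeats) = MIN(t1.Repeats)
                THEN CAST(MAX(t1.Repeats) AS VARCHAR(10))  -- number to string
                ELSE 'NA'
           END AS Repeats
    FROM table1 t1
    WHERE t1.ID = t2.ID
) r;
```

On the sample data this gives NA for A and the single value for B, C and D, which matches the result requested in the question.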
For the sake of completeness, I'm including another approach that will work if repeats is NULL. However, Gordon's answer has a much simpler query plan and should be preferred.
Option 1 (Works with NULLs):
SELECT
t1.ID, t2.[Values],
CASE
WHEN COUNT(*) > 1 THEN 'NA'
ELSE CAST(MAX(Repeats) AS VARCHAR(2))
END Repeats
FROM (
SELECT DISTINCT t1.ID, t1.Repeats
FROM #table1 t1
) t1
LEFT OUTER JOIN #table2 t2
ON t1.ID = t2.ID
GROUP BY t1.ID, t2.[Values]
Option 2 (does not contain explicit subqueries, but does not work with NULLs):
SELECT DISTINCT
t1.ID,
t2.[Values],
CASE
WHEN COUNT(t1.Repeats) OVER (PARTITION BY COUNT(DISTINCT t1.Repeats), t1.ID) > 1 THEN 'NA'
ELSE CAST(t1.Repeats AS VARCHAR(2))
END Repeats
FROM #table1 t1
LEFT OUTER JOIN #table2 t2
ON t1.ID = t2.ID
GROUP BY t1.ID, t2.[Values], t1.Repeats
NOTE:
This may not give desired results if table2 has different values for the same ID.

Join two tables and return rows with N values in common

I have two databases with identical schema that I merged and now I want to return the records that are possible matches.
That is, return three records in the updated database that might look like:
id foo bar baz meow mix
36 123 234 567
962 123 345 456 567
962 345
I want to be able to search for records that have any n column values in common (here n=2). In the example, records 1 and 2 have identical 'foo' and 'mix' values, and records 2 and 3 have identical 'id' and 'bar' values.
I know it should be an INNER JOIN but my problem is that I want it to be able to return any records that have any n column values in common so I don't know what to join them on.
SELECT * FROM table t1 INNER JOIN table t2 ON ...
Any help would be greatly appreciated!
Addendum:
@Gordon Linoff
Ok, that worked! I generalized it for any table and any number of columns and excluded identical matches with:
"SELECT t1.*, t2.* FROM {0} t1 JOIN {0} t2 ON {1} WHERE ({2}) BETWEEN 2 AND {3}".format(table, ' or '.join(['t1.{0}=t2.{0}'.format(c) for c in columns]), '+'.join(['(CASE WHEN t1.{0}=t2.{0} THEN 1 ELSE 0 END)'.format(c) for c in columns]),len(columns)-1)
Thanks!
UPDATE: The table I am performing this on has ~10k records and this is so slow! Is there a faster way to do this?
You can do this with a self-join and order by:
select t1.*, t2.*,
((case when t1.id = t2.id then 1 else 0 end) +
(case when t1.foo = t2.foo then 1 else 0 end) +
(case when t1.bar = t2.bar then 1 else 0 end) +
. . .
) as NumMatches
from table t1 join
table t2
on t1.id = t2.id or
t1.foo = t2.foo or
t1.bar = t2.bar or
. . .
order by NumMatches desc;
If you want only the pairs with two or more matches, then that depends on the database. In MySQL you can say having NumMatches >= 2. In other databases, you either have to repeat the case expressions in the where clause or use a subquery.
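The subquery variant could be sketched as below. Several things here are assumptions made for the illustration: the table name records, the unique row key rid (used to skip self-pairs and mirrored pairs), and the sample values, which are a guess at the flattened example in the question.

```sql
-- Illustrative data; rid is a hypothetical unique row key.
WITH records (rid, id, foo, bar, mix) AS (
    SELECT 1,  36, 123, 234, 567 UNION ALL
    SELECT 2, 962, 123, 345, 567 UNION ALL
    SELECT 3, 962, NULL, 345, NULL
)
SELECT m.*
FROM (
    SELECT t1.rid AS rid1,
           t2.rid AS rid2,
           -- A comparison with NULL is not true, so it counts as 0.
           (CASE WHEN t1.id  = t2.id  THEN 1 ELSE 0 END) +
           (CASE WHEN t1.foo = t2.foo THEN 1 ELSE 0 END) +
           (CASE WHEN t1.bar = t2.bar THEN 1 ELSE 0 END) +
           (CASE WHEN t1.mix = t2.mix THEN 1 ELSE 0 END) AS NumMatches
    FROM records t1
    JOIN records t2
      ON t1.rid < t2.rid          -- each unordered pair once, no self-pairs
) m
WHERE m.NumMatches >= 2
ORDER BY m.NumMatches DESC;
```

Computing NumMatches in a derived table lets the outer WHERE filter on the alias without repeating the CASE expressions. The pairwise comparison still grows with the square of the row count, so this sketch does not by itself address speed.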
I doubt you are going to find any shortcuts...you're going to need to spell out all the matches you want, ie:
SELECT * FROM table WHERE foo=bar or foo=baz or foo=meow or foo=mix, ...

Optimizing tricky SQL search query

I am trying to come up with a simple, performant query for the following problem:
Let's say there are several entities (items) which all have a unique ID. The entities have a variable set of attributes (properties), which therefore have been moved to a separate table:
T_Items_Props
=======================
Item_ID Prop_ID Value
-----------------------
101 1 'abc'
101 2 '123'
102 1 'xyz'
102 2 '123'
102 3 '102'
... ... ...
Now I want to search for an item, that matches some specified search-criteria, like this:
<<Pseudo-SQL>>
SELECT Item_Id(s)
FROM T_Items_Props
WHERE Prop 1 = 'abc'
AND Prop 2 = '123'
...
AND Prop n = ...
This would be fairly easy if I had a table like Items(Id, Prop_1, Prop_2, ..., Prop_n). Then I could do a simple SELECT where the search criteria could simply (even programmatically) be inserted in the WHERE-clause, but in this case I would have to do something like:
SELECT t1.Item_ID
FROM T_Items_Props t1
, T_Items_Props t2
, ...
, T_Items_Props tn -- (depending on how many properties to compare)
WHERE t1.Item_ID = t2.Item_ID
AND t1.Prop_ID = 1 AND t1.Value = 'abc'
AND t2.Prop_ID = 2 AND t2.Value = '123'
...
AND tn.Prop_ID = n AND tn.Value = ...
Is there a better/simpler/faster way to do this?
To make the query more readable, you could do something like:
SELECT
t1.Item_ID
FROM
T_Items_Props t1
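where convert(varchar(10), t1.Prop_ID) + ';' + t1.Value in (
'1;abc',
'2;123',
...
)
NOTE: This assumes that your IDs will not have more than 10 digits. It might also slow your query down, due to the extra type conversion and string concatenation. As written, it returns items that match any of the listed pairs, so requiring all of them still needs a GROUP BY Item_ID with a count of the matches.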
You could count the number of correct Props. This isn't very good in case there could be duplicates. E.g.:
Prop_ID = 1 AND Value = 'abc'
Prop_ID = 2 AND Value = '123'
and the table would look like:
T_Items_Props
=======================
Item_ID Prop_ID Value
-----------------------
101 1 'abc'
101 1 'abc'
this would then be true, although it shouldn't.
But if you wanna give it a try, here's how:
SELECT nested.* FROM (
SELECT item_id, count(*) AS c FROM t_items_props
WHERE ((prop_id = 1 AND value = 'abc')
OR (prop_id = 2 AND value = '123')
... more rules here ...)
GROUP BY item_id) nested
WHERE nested.c >= 2 ... number of rules ...
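One way to make the counting approach robust against the duplicate rows described above is to count distinct properties instead of rows. This is a sketch under the assumption that every rule targets a different Prop_ID; the CTE holds illustrative data only.

```sql
-- Illustrative data: item 101 has a duplicated row, item 102 matches both rules.
WITH T_Items_Props (Item_ID, Prop_ID, Value) AS (
    SELECT 101, 1, 'abc' UNION ALL
    SELECT 101, 1, 'abc' UNION ALL   -- duplicate row
    SELECT 102, 1, 'abc' UNION ALL
    SELECT 102, 2, '123'
)
SELECT Item_ID
FROM T_Items_Props
WHERE (Prop_ID = 1 AND Value = 'abc')
   OR (Prop_ID = 2 AND Value = '123')
GROUP BY Item_ID
-- Duplicated rows for one property count once.
HAVING COUNT(DISTINCT Prop_ID) = 2;   -- 2 = number of rules
```

Here item 101 is rejected even though two of its rows match, because both rows belong to the same property, while item 102 is returned.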
I've offered this in a previous post with similar querying intentions. The user could have two criteria one time and five another, and wanted an easy way to build the SQL command. To avoid having to add FROM tables and update the WHERE clause, you can use joins and put each criterion right at the join level, so each criterion is its own set added to the mix.
SELECT
t1.Item_ID
FROM
T_Items_Props t1
JOIN T_Items_Props t2
on t1.Item_ID = t2.Item_ID
AND t2.Prop_ID = 2
AND t2.Value = '123'
JOIN T_Items_Props t3
on t1.Item_ID = t3.Item_ID
AND t3.Prop_ID = 6
AND t3.Value = 'anything'
JOIN T_Items_Props t4
on t1.Item_ID = t4.Item_ID
AND t4.Prop_ID = 15
AND t4.Value = 'another value'
WHERE
t1.Prop_ID = 1
AND t1.Value = 'abc'
Notice the primary query will always start with a minimum of the "T1" property/value criteria, but then, notice the JOIN clauses... they are virtually the same so it is very easy to implement via a loop... Just keep aliasing the T2, T3, T4... as needed. This will start with any items that meet the T1 criteria, but then also require all the rest to be found too.
You can use a join statement together with filtering or faceted search. It gives better performance because you can limit the search space. Here is a good example: Faceted Search (solr) vs Good old filtering via PHP?.

SQL Server--Is it possible to work around using a temporary table for a query that filters based on an alias case column?

I am trying to alter a base query that selects data from several joined tables, and filters out rows based on the CASE WHEN below. The result set is to be returned as follows:
If all of the rows return 0 in the CASE column, return one line with '0' in the OVERDUE column (the "return one line" portion is taken care of by DISTINCT.)
If any of the rows return 1 for the CASE column, return one line with '1' in the OVERDUE column.
The base is as follows:
SELECT DISTINCT t1.*,
CASE WHEN t3.MTemp > t3.MTempLimit
then 1
when t3.TotHours > t3.THoursLimit
then 1
else 0
end [Overdue]
from table_1 t1
LEFT JOIN table_2 t2 on t1.ResNo = t2.ResNo and t1.PCode = t2.PCode
LEFT JOIN table_3 t3 on t2.RepJobNo = t3.RepJobNo
LEFT JOIN table_4 t4 on t4.TypeID = t2.RepType
WHERE t2.RepStat = 1
The catch is, I've already created a working version of this by using a temp table and doing a IF EXISTS/ELSE query on the temp table's OVERDUE column. However, I've been informed that this solution may not be useable (due to having to go through certain front-end software).
Is it possible to do a workaround for this that does not involve using a temporary table? I've been making attempts at using both a derived table and CTEs, neither of which have yielded anything usable, due to the fact that one cannot use IF/ELSE clauses after those (which was what I was counting on).
I'm still getting the hang of T-SQL, so any help would be greatly appreciated.
Sounds like a simple ROW_NUMBER() and a couple of CTEs will work:
;WITH RS1 as (
SELECT t1.*,
CASE WHEN t3.MTemp > t3.MTempLimit
then 1
when t3.TotHours > t3.THoursLimit
then 1
else 0
end [Overdue]
from table_1 t1
LEFT JOIN table_2 t2 on t1.ResNo = t2.ResNo and t1.PCode = t2.PCode
LEFT JOIN table_3 t3 on t2.RepJobNo = t3.RepJobNo
LEFT JOIN table_4 t4 on t4.TypeID = t2.RepType
WHERE t2.RepStat = 1
), RS2 as (
select *,ROW_NUMBER() OVER (ORDER BY Overdue DESC) rn
from RS1
)
select * from RS2 where rn = 1
(There's no need for a DISTINCT now that we're only returning one row)
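If only the 0/1 flag is needed rather than a full row, a single aggregate might be enough, since the maximum of the CASE values is 1 as soon as any row is overdue. The sketch below assumes exactly that reading of the requirement. The CTE joined_rows and its values are hypothetical and stand in for the joined tables in the question.

```sql
-- Hypothetical rows standing in for the result of the joins.
WITH joined_rows (ResNo, MTemp, MTempLimit, TotHours, THoursLimit) AS (
    SELECT 1, 50, 80,  10, 100 UNION ALL   -- within both limits
    SELECT 2, 90, 80,  10, 100 UNION ALL   -- temperature over the limit
    SELECT 3, 50, 80, 120, 100             -- hours over the limit
)
SELECT COALESCE(
           MAX(CASE WHEN MTemp > MTempLimit THEN 1
                    WHEN TotHours > THoursLimit THEN 1
                    ELSE 0
               END),
           0) AS Overdue                   -- 0 also when there are no rows
FROM joined_rows;
```

Unlike the ROW_NUMBER() version, this returns only the Overdue column and none of the t1 columns, so it fits only when the front end needs just the flag.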
In general, any temporary table referenced in another query can simply be replaced by a derived table, so that this:
insert #temp
select -- definition of temptable
;
select ...
from #temp
join ...
becomes
select ...
from (
-- definition of temptable
) temp
join ...

MYSQL join, return first matching row only from where join condition using OR

I'm having a problem with a particular MySQL query.
I have table1 and table2; table2 is joined onto table1.
Now the problem is that I am joining table2 to table1 with a condition that looks like:
SELECT
table1.*, table2.*
FROM table1
JOIN table2 ON ( table2.table1_id = table1.id
AND ( table2.lang = 'fr'
OR table2.lang = 'eu'
OR table2.lang = 'default') )
I need it to return only 1 row from table2, even though there might exist many table2 rows for the correct table1.id, with many different locales.
I am looking for a way to join only ONE row with a priority of the locales, first check for one where lang = something, then if that doesn't manage to join/return anything, then where lang = somethingelse, and lastly lang = default.
FR, EU can be different for many users, and rows in the database might exist for many different locales.. I need to select the most suitable ones with the correct fallback priority.
I tried doing the query above with a GROUP BY table2.table1_id, and it seemed to work, but I realised that if the best match (the first OR) was entered later in the table (higher primary ID), it would return the 2nd or default priority as the grouped-by row.
Any tips?
Thank you!
it still doesn't seem to "know" what t1.id is :(
Here follows my entire query, it goes to show table1 = _product_sku, table2 = _product_sku_data, t1 = ps, t2 = psd
SELECT ps.id, psd.description, psd.lang
FROM _product_sku ps
CROSS JOIN
( SELECT lang, title, description
FROM _product_sku_data
WHERE product_sku_id = ps.id
ORDER BY CASE WHEN lang='$this->profile_language_preference' THEN 0
WHEN lang='$this->browser_language' THEN 1
WHEN lang='default' THEN 2
ELSE 3
END
LIMIT 1
) AS psd
Edit:
This version uses variables to provide some sort of ranking within the available languages. Tables and test-data are not from the original question, but from the query OP provided as an answer.
It produced the expected results when I tried it:
SELECT id, description, lang
FROM
(
SELECT ps.id, psd.description, psd.lang,
CASE
WHEN @id != ps.id THEN @rownum := 1
ELSE @rownum := @rownum + 1
END AS rank,
@id := ps.id
FROM _product_sku ps
JOIN _product_sku_data psd ON ( psd.product_sku_id = ps.id )
JOIN ( SELECT @id:=NULL, @rownum:=0 ) x
ORDER BY id,
CASE WHEN lang='$this->profile_language_preference' THEN 0
WHEN lang='$this->browser_language' THEN 1
WHEN lang='default' THEN 2
ELSE 3
END
) x
WHERE rank = 1;
Old version which did not work, since ps.id is not known in the WHERE clause:
This one should return you the rows of table1 with the "best matching" row of table2 by using LIMIT 1 and ordering languages as defined:
SELECT t1.id, t2.lang, t2.some_column
FROM table1 t1
CROSS JOIN
( SELECT lang, some_column
FROM table2
WHERE table1_id = t1.id
ORDER BY CASE WHEN lang='fr' THEN 0
WHEN lang='eu' THEN 1
WHEN lang='default' THEN 2
ELSE 3
END
LIMIT 1
) t2
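On MySQL 8.0 or later, which is an assumption since the question does not state a version, the same priority pick can be sketched with a window function instead of a correlated derived table or user variables. The CTEs contain made-up rows using the table1/table2 names from the question, and rn is a name chosen for this example.

```sql
-- Made-up rows; requires CTE and window function support (e.g. MySQL 8.0+).
WITH table1 (id) AS (
    SELECT 1 UNION ALL
    SELECT 2
),
table2 (table1_id, lang, some_column) AS (
    SELECT 1, 'default', 'Default text 1' UNION ALL
    SELECT 1, 'fr',      'Texte 1'        UNION ALL
    SELECT 2, 'default', 'Default text 2' UNION ALL
    SELECT 2, 'eu',      'EU text 2'
)
SELECT id, lang, some_column
FROM (
    SELECT t1.id, t2.lang, t2.some_column,
           -- Number the candidate rows per table1 row, best language first.
           ROW_NUMBER() OVER (
               PARTITION BY t1.id
               ORDER BY CASE WHEN t2.lang = 'fr'      THEN 0
                             WHEN t2.lang = 'eu'      THEN 1
                             WHEN t2.lang = 'default' THEN 2
                             ELSE 3
                        END
           ) AS rn
    FROM table1 t1
    JOIN table2 t2
      ON t2.table1_id = t1.id
     AND t2.lang IN ('fr', 'eu', 'default')
) ranked
WHERE rn = 1;
```

Because the ranking happens per t1.id inside the window, the insertion order of the table2 rows no longer matters: id 1 picks the 'fr' row and id 2 falls back to 'eu'.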