What is the difference between the IN operator and = operator in SQL? - sql

I am just learning SQL, and I'm wondering what the difference is between the following lines:
WHERE s.parent IN (SELECT l.parent .....)
versus
WHERE s.parent = (SELECT l.parent .....)

IN
will not generate an error if you have multiple results on the subquery. Allows to have more than one value in the result returned by the subquery.
=
will generate an error if you have more than one result on the subquery.
SQLFiddle Demo (IN vs =)

when you are using 'IN' it can compare multiple values....like
select * from tablename where student_name in('mari','sruthi','takudu')
but when you are using '=' you can't compare multiple values
select * from tablenamewhere student_name = 'sruthi'
i hope this is the right answer

The "IN" clause is also much much much much slower. If you have many results in the select portion of
IN (SELECT l.parent .....),
it will be extremely inefficient as it actually generates a separate select sql statement for each and every result within the select statement ... so if you return 'Cat', 'Dog', 'Cow'
it will essentially create a sql statement for each result... if you have 200 results... you get the full sql statement 200 times...takes forever... (This was as of a few years ago... maybe imporved by now... but it was horribly slow on big result sets.)
Much more efficient to do an inner join such as:
Select id, parent
from table1 as T
inner join (Select parent from table2) as T2 on T.parent = T2.parent

For future visitors.
Basically in case of equals (just remember that here we are talking like where a.name = b.name), each cell value from table 1 will be compared one by one to each cell value of all the rows from table 2, if it matches then that row will be selected (here that row will be selected means that row from table 1 and table 2) for the overall result set otherwise will not be selected.
Now, in case of IN, complete result set on the right side of the IN will be used for comparison, so its like each value from table 1 will be checked on whether this cell value is present in the complete result set of the IN, if it is present then that value will be shown for all the rows of the IN’s result set, so let say IN result set has 20 rows, so that cell value from table 1 will be present in overall result set 20 times (i.e. that particular cell value will have 20 rows).
For more clarity see below screen shot, notice below that how complete result set from the right of the IN (and NOT IN) is considered in the overall result set; whole emphasis is on the fact that in case comparison using =, matching row from second table is selected, while in case of IN complete result from the second table is selected.

In can match a value with more than one values, in other words it checks if a value is in the list of values so for e.g.
x in ('a', 'b', 'x') will return true result as x is in the the list of values
while = expects only one value, its as simple as
x = y returns false
and
x = x returns true

The general rule of thumb is:
The = expects a single value to compare with. Like this:
WHERE s.parent = 'father_name'
IN is extremely useful in scenarios where = cannot work i.e. scenarios where you need the comparison with multiple values.
WHERE s.parent IN ('father_name', 'mother_name', 'brother_name', 'sister_name')
Hope this is useful!!!

IN
This helps when a subquery returns more than one result.
=
This operator cannot handle more than one result.
Like in this example:
SQL>
Select LOC from dept where DEPTNO = (select DEPTNO from emp where
JOB='MANAGER');
Gives ERROR ORA-01427: single-row subquery returns more than one row
Instead use
SQL>
Select LOC from dept where DEPTNO in (select DEPTNO from emp
where JOB='MANAGER');

1) Sometimes = also used as comparison operator in case of joins which IN doesn't.
2) You can pass multiple values in the IN block which you can't do with =. For example,
SELECT * FROM [Products] where ProductID IN((select max(ProductID) from Products),
(select min(ProductID) from Products))
would work and provide you expected number of rows.However,
SELECT * FROM [Products] where ProductID = (select max(ProductID) from Products)
and ProductID =(select min(ProductID) from Products)
will provide you 'no result'. That means, in case subquery supposed to return multiple number of rows , in that case '=' isn't useful.

Related

how to use listagg operator so that the query should fetch comma separated values

SELECT (SELECT STRING_VALUE
FROM EMP_NODE_PROPERTIES
WHERE NODE_ID=AN.ID ) containedWithin
FROM EMP_NODE AN
WHERE AN.STORE_ID = ALS.ID
AND an.TYPE_QNAME_ID=(SELECT ID
FROM EMP_QNAME
where LOCAL_NAME = 'document')
AND
AND AN.UUID='13456677';
from the above query I am getting below error.
ORA-01427: single-row subquery returns more than one row
so how to change the above query so that it should fetch comma separated values
This query won't return error you mentioned because
there are two ANDs and
there's no ALS table (or its alias).
I suggest you post something that is correctly written, then we can discuss other errors.
Basically, it is either select string_value ... or select id ... (or even both of them) that return more than a single value.
The most obvious "solution" is to use select DISTINCT
another one is to include where rownum = 1
or, use aggregate functions, e.g. select max(string_value) ...
while the most appropriate option would be to join all tables involved and decide which row (value) is correct and adjust query (i.e. its WHERE clause) to make sure that desired value will be returned.
You would seem to want something like this:
SELECT LISTAGG(NP.STRING_VALUE, ',') WITHIN GROUP(ORDER BY NP.STRING_VALUE)
as containedWithin
FROM EMP_NODE N
JOIN EMP_QNAME Q
ON N.TYPE_QNAME_ID = Q.ID
LEFT JOIN EMP_NODE_PROPERTIES NP
ON NP.NODE_ID = N.ID
WHERE Q.LOCAL_NAME = 'document'
AND AN.UUID = '13456677';
This is a bit speculative because your original query would not run for the reason explained by Littlefoot.

SQL Server ISNULL multiple columns

I have the following query which works great but how do I add multiple columns in its select statement? Following is the query:
SELECT ISNULL(
(SELECT DISTINCT a.DatasourceID
FROM [Table1] a
WHERE a.DatasourceID = 5 AND a.AgencyID = 4 AND a.AccountingMonth = 201907), NULL) TEST
So currently I only get one column (TEST) but would like to add other columns such as DataSourceID, AgencyID and AccountingMonth.
If you want to output a row for some condition (or requested values ) and output a row when it does not meet condition,
you can set a pseudo table for your requested values in the FROM clause and make a left outer join with your Table1.
SELECT ISNULL(Table1.DatasourceId, 999999),
Table1.AgencyId,
Table1.AccountingMonth,
COUNT(*) as count
FROM ( VALUES (5, 4, 201907 ),
(6, 4, 201907 ))
AS requested(DatasourceId, AgencyId, AccountingMonth)
LEFT OUTER JOIN Table1 ON requested.agencyid=Table1.AgencyId
AND requested.datasourceid = Table1.DatasourceId
AND requested.AccountingMonth = Table1.AccountingMonth
GROUP BY Table1.DatasourceId, Table1.AgencyId, Table1.AccountingMonth
Note that:
I have put a ISNULL for the first column like you did to output a particular value (9999) when no value is found.
I did not put the ISNULL(...,NULL) like your query in the other columns since IMHO it is not necessary: if there is no value, a null will be output anyway.
I added a COUNT(*) column to illustrate an aggregate, you could use another (SUM, MIN, MAX) or none if you do not need it.
The set of requested values is provided as a constant table values (see https://learn.microsoft.com/en-us/sql/t-sql/queries/table-value-constructor-transact-sql?view=sql-server-2017)
I have added multiple rows for requested conditions : you can request for multiple datasources, agencies or months in one query with one line for each in the output.
If you want only one row, put only one row in "requested" pseudo table values.
There must be a GROUP BY, even if you do not want to use an aggregate (count, sum or other) in order to have the same behavior as your distinct clause , it restricts the output to single lines for requested values.
To me it seems that you want to see does data exists, i guess that your's AgencyID is foreign key to agency table, DataSourceID also to DataSource, and that you have AccountingMonth table which has all accounting periods:
SELECT ds.ID as DataSourceID , ag.ID as AgencyID , am.ID as AccountingMonth ,
ISNULL(COUNT(a.*),0) as Count
FROM [Table1] a
RIGHT JOIN [Datasource] ds ON ds.ID = a.DataSourceID
RIGHT JOIN [Agency] ag ON ag.ID = a.AgencyID
RIGHT JOIN [AccountingMonth] am on am.ID = a.AccountingMonth
GROUP BY ds.ID, ag.ID, am.ID
In this way you can see count of records per group by criteria. Notice RIGHT join, you must use RIGHT JOIN if you want to include all record from "Right" table.
In yours query you have DISTINCT a.DatasourceID and WHERE a.DatasourceID = 5 and it returns 5 if in table exists rows that match yours WHERE criteria, and returns null if there is no data. If you remove WHERE a.DatasourceID = 5 your query would break with error: subquery returned multiple rows.
the way you are doing only allows for one column and one record and giving it the name of test. It does not look like you really need to test for null. because you are returning null so that does nothing to help you. Remove all the null testing and return a full recordset distinct will also limit your returns to 1 record. When working with a single table you don't need an alias, if there are no spaces or keywords braced identifiers not required. if you need to see if you have an empty record set, test for it in the calling program.
SELECT DatasourceID, AgencyID,AccountingMonth
FROM Table1
WHERE DatasourceID = 5 AND AgencyID = 4 AND AccountingMonth = 201907

Select values based on DISTINCT combination of rest of columns Oracle DB

I want to select row IDs associated with distinct column combinations in the remainder of a table. For instance, if the distinct rows are
I want to get the row IDs associated with each row. I can't query for distinct IDs since they are the row's primary key (and hence are all distinct).
So far I have:
SELECT e.ID
FROM E_UPLOAD_TEST e
INNER JOIN (
SELECT DISTINCT WHAT, MATERIALS, ERROR_FIELD, UNITS, SEASONALITY, DATA_TYPE, DETAILS, METHODS, DATA_FORMAT
FROM E_UPLOAD_TEST) c
ON e.WHAT = c.WHAT AND e.MATERIALS = c.MATERIALS AND e.ERROR_FIELD = c.ERROR_FIELD AND e.DATA_TYPE = c.DATA_TYPE AND e.METHODS = c.METHODS AND e.DATA_FORMAT = c.DATA_FORMAT;
which runs but doesn't return anything. Am I missing a GROUP BY and/or MIN() statement?
#serg is correct. Every single row in your example has at least one column value that is null. That means that no row will match your join condition. That is why your query results in no rows found.
Modifying your condition might get you what you want so long has your data isn't changing frequently. If it is changing frequently, then you probably want a single query for the entire job otherwise you'll have to set your transaction so that it is immune to data changes.
An example of such a condition change is this:
( (e.WHAT is null and c.WHAT is null) or (e.WHAT = c.WHAT) )
But such a change makes sense only if two rows having a null value in the same column means the same thing for both rows and it has to mean the same thing as time marches on. What "WHAT is null" means today might not be the same thing tomorrow. And that is probably why C. J. Date hates nulls so much.
Instead of comparing, use the decode function which compares two null values correctly.
e.WHAT = c.WHAT -> DECODE(e.WHAT, c.WHAT, 1) = 1

performing an update query with a select subquery returning either the same value for ALL of the records or ora-01427 error

I need to update a column in one table with the results from a select sub-query (and they should ultimately be different). But When I do this, I either get the exact same number for the hundreds of records, or I get the ORA-01427: single row sub-query returns more than one row query. error.
Can you please take a look and see what it is that I am overlooking? (I could just be overlooking something simple for all I know)
UPDATE WD_PRODUCT_CLASS
SET CURRENT_CASES = ( WITH STUFF_COUNT AS
(
SELECT sum(CURRENT_DETAIL.COMBINED_QTY) AS TOTAL_CASES
FROM CURRENT_DETAIL, SKU_MAJORS, WD_PRODUCT_CLASS
WHERE CURRENT_DETAIL.LOC_ID =
&PARM_LOC_ID
AND CURRENT_DETAIL.INVEN_ID = SKU_MAJORS.INVEN_ID
AND WD_PRODUCT_CLASS.CATEGORY = SKU_MAJORS.CONT_DESC
GROUP BY WD_PRODUCT_CLASS.CATEGORY
)
(
SELECT SUM(Z.TOTAL_CASES) FROM STUFF_COUNT Z
)
);
Maybe you need someting like this:
UPDATE WD_PRODUCT_CLASS wpc
SET wpc.CURRENT_CASES = (
SELECT sum(cd.COMBINED_QTY)
FROM CURRENT_DETAIL cd join SKU_MAJORS sm ON cd.INVEN_ID = sm.INVEN_ID
WHERE cd.LOC_ID = &PARM_LOC_ID
AND sm.CONT_DESC = wpc.CATEGORY
)
WHERE 1=1; -- if you don't set a condition all the rows will be updated
Your query updates the table with the same values because you're using a not correlated subquery in the SET clause. This subquery don't depends on the parent query, so it's calculated only once.
I suppose you need a correlated subquery so I changed your update + removed some extra parts.

sql server - how to execute the second half of 'or' clause only when first one fails

Suppose I have a table with following records
value text
company/about about Us
company company
company/contactus company contact
I have a very simple query in sql server as below. I am having problem with the 'or' condition. In below query, I am trying to find text for value 'company/about'. If it is not found, then only I want to run the other side of 'or'. The below query returns two records as below
value text
company/about about Us
company company
Query
select
*
from
tbl
where
value='company/about' or
value=substring('company/about',0,charindex('/','company/about'))
How can I modify the query so the result set looks like
value text
company/about about Us
A bit roundabout, but you can check for the existence of results from the first where clause:
select
*
from
tbl
where
value='company/about' or
(
not exists (select * from tbl where value='company/about')
and
value=substring('company/about',0,charindex('/','company/about'))
)
Since your second condition can be re-written as value = 'company' this would work (at least for the data and query you've presented):
select top(1) [value], [text]
from dbo.MyTable
where value in ('company/about', 'company')
order by len(value) desc
The TOP() ignores the second row if both are found, and the ORDER BY ensures that the first row is always the one with 'company/about', if it exists.