SQL query for selecting records where any of a given list of integers is between columnA and columnB - sql

How can I get records from my table where any of a list of integers is in the range defined by columnA and columnB integer values?
I know about the IN operator when comparing against a column value instead of a range defined by a pair of columns.
For example: select * from mytable where mytable.colA in (1,3,5,6); would get all records where colA is either 1,3,5 or 6.
Is there anything like that for ranges? Or should I do like:
select * from mytable where 1 between mytable.colA and mytable.colb
OR
3 between mytable.colA and mytable.colb
OR
5 between mytable.colA and mytable.colb
OR
6 between mytable.colA and mytable.colb;

Maybe this way:
select distinct mytable.*
from mytable
join (select 1 nr union all select 3 union all select 5 union all select 6) n
on n.nr between mytable.colA and mytable.colb
Update:
Just tested on MariaDB (10.0.19) with a 1M-row indexed table. Your original query is way faster.
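To check that the two forms above really select the same rows, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows and the id column are made up for illustration, and SQLite stands in for whatever engine you actually use, so this says nothing about relative performance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data: each row defines the range [colA, colB].
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, colA INTEGER, colB INTEGER)")
conn.executemany(
    "INSERT INTO mytable (id, colA, colB) VALUES (?, ?, ?)",
    [(1, 0, 2), (2, 2, 4), (3, 7, 9), (4, 4, 10)],
)

# Form 1: one BETWEEN predicate per list value, combined with OR.
or_rows = conn.execute(
    """
    SELECT id FROM mytable
    WHERE 1 BETWEEN colA AND colB
       OR 3 BETWEEN colA AND colB
       OR 5 BETWEEN colA AND colB
       OR 6 BETWEEN colA AND colB
    ORDER BY id
    """
).fetchall()

# Form 2: join against a derived table of the list values. DISTINCT removes
# the duplicates produced when several values fall inside the same range.
join_rows = conn.execute(
    """
    SELECT DISTINCT m.id
    FROM mytable AS m
    JOIN (SELECT 1 AS nr UNION ALL SELECT 3 UNION ALL SELECT 5 UNION ALL SELECT 6) AS n
      ON n.nr BETWEEN m.colA AND m.colB
    ORDER BY m.id
    """
).fetchall()

print(or_rows, join_rows)
```

Row 4 matches both 5 and 6, which is why the join form needs DISTINCT.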

A common tactic is to set up a temporary table, and use that to join on your main table.
A simple way to set one up is like so:
CREATE TABLE #TempList (LookFor int not null)
INSERT #TempList (LookFor) values
(1)
,(3)
,(5)
,(6)
As this is a table, you can use querying logic to populate it.
Next up, join this into your target table. For your example above:
SELECT mt.*
from myTable mt
inner join #TempList tl
on tl.LookFor = mt.ColA
And, if I'm interpreting correctly, this might be what you're really looking for:
SELECT mt.*
from myTable mt
inner join #TempList tl
on tl.LookFor between mt.ColA and mt.ColB

Related

Finding the id's which include multiple criteria in long format

Suppose I have a table like this,
id tagId
--------
1  1
1  2
1  5
2  1
2  5
3  2
3  4
3  5
3  8
I want to select the ids whose tagId values include both 2 and 5. For this fake data set, it should return 1 and 3.
I tried,
select id from [dbo].[mytable] where tagId IN(2,5)
But that matches 2 and 5 separately, so it returns every id that has either tag. I also do not want to keep my table in wide format, since tagId is dynamic and could grow to any number of columns. I also considered filtering with two different queries and (somehow) finding the intersection. However, since in real life I may search for more than two tagId values, that sounds inefficient to me.
I am sure others have faced this before when searching by tags. What do you suggest? Changing the table format?
One option is to count the number of distinct tagIds (from the ones you're looking for) each id has:
SELECT id
FROM [dbo].[mytable]
WHERE tagId IN (2,5)
GROUP BY id
HAVING COUNT(DISTINCT tagId) = 2
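Here is a small runnable sketch of that counting approach, using Python's sqlite3 module and the sample rows from the question. The variable names (wanted, matching_ids) are my own, and SQLite is only a stand-in for SQL Server here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, tagId INTEGER)")
conn.executemany(
    "INSERT INTO mytable (id, tagId) VALUES (?, ?)",
    [(1, 1), (1, 2), (1, 5), (2, 1), (2, 5), (3, 2), (3, 4), (3, 5), (3, 8)],
)

# Tags that every returned id must have. The values must be distinct,
# because the HAVING clause compares against the length of this list.
wanted = [2, 5]
placeholders = ",".join("?" for _ in wanted)

sql = f"""
    SELECT id
    FROM mytable
    WHERE tagId IN ({placeholders})
    GROUP BY id
    HAVING COUNT(DISTINCT tagId) = ?
    ORDER BY id
"""
matching_ids = [row[0] for row in conn.execute(sql, [*wanted, len(wanted)])]
print(matching_ids)
```

Because the required count is passed as a parameter, the same query works for any number of tags.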
This is actually a Relational Division With Remainder question.
First, you have to place your input into proper table format. I suggest you use a Table Valued Parameter if executing from client code. You can also use a temp table or table variable.
CREATE TABLE #ids (tagId int PRIMARY KEY);
INSERT #ids VALUES (2), (5);
There are a number of different solutions to this type of question.
Classic double-negative EXISTS
SELECT DISTINCT
mt.Id
FROM mytable mt
WHERE NOT EXISTS (SELECT 1
FROM #ids i
WHERE NOT EXISTS (SELECT 1
FROM mytable mt2
WHERE mt2.id = mt.id
AND mt2.tagId = i.tagId)
);
This is not usually efficient, though.
Comparing to the total number of IDs to match
SELECT mt.id
FROM mytable mt
JOIN #ids i ON i.tagId = mt.tagId
GROUP BY mt.id
HAVING COUNT(*) = (SELECT COUNT(*) FROM #ids);
This is much more efficient. You can also do this using a window function; it may be more or less efficient, YMMV.
SELECT mt.Id
FROM mytable mt
JOIN (
SELECT *,
total = COUNT(*) OVER ()
FROM #ids i
) i ON i.tagId = mt.tagId
GROUP BY mt.id
HAVING COUNT(*) = MIN(i.total);
Another solution involves cross-joining everything and checking how many matches there are using conditional aggregation
SELECT mt.id
FROM (
SELECT
mt.id,
mt.tagId,
matches = SUM(CASE WHEN i.tagId = mt.tagId THEN 1 END),
total = COUNT(*)
FROM mytable mt
CROSS JOIN #ids i
GROUP BY
mt.id,
mt.tagId
) mt
GROUP BY mt.id
HAVING SUM(matches) = MIN(total)
AND MIN(matches) >= 0;
There are other solutions also, see High Performance Relational Division in SQL Server

Microsoft SQL Server - Convert column values to list for SELECT IN

I have this (3 int columns in one table)
Int1 Int2 Int3
---------------
1 2 3
I would like to run such query with another someTable:
SELECT * FROM someTable WHERE someInt NOT IN (1,2,3)
where 1,2,3 is the list of INTs taken from the three columns and converted into a list that I can use with a SELECT ... NOT IN statement.
Any suggestions on how to achieve this without stored procedures in Microsoft SQL Server 2019?
If you want rows in some table that are not in one of three columns of another table, then use not exists:
select t.*
from sometable t
where not exists (select 1
from oneTable t2  -- the table holding int1, int2, int3
where t.someint in (t2.int1, t2.int2, t2.int3)
);
The subquery returns a row where there is a match. The outer query then rejects any rows with a match.
Seems like you actually want a NOT EXISTS?
SELECT {Your Columns}
FROM dbo.someTable sT
WHERE NOT EXISTS (SELECT 1
FROM dbo.oneTable oT
WHERE sT.someInt IN (oT.int1,oT.int2,oT.int3));
An alternative method would be to unpivot the data, and then use an equality operator:
SELECT {Your Columns}
FROM dbo.someTable sT
WHERE NOT EXISTS (SELECT 1
FROM dbo.oneTable oT
CROSS APPLY (VALUES(oT.int1),(oT.int2),(oT.int3))V(I)
WHERE V.I = sT.someInt);
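A quick way to sanity-check the NOT EXISTS approach is to run it against a tiny data set. The sketch below uses Python's sqlite3 module with the single (1, 2, 3) row from the question. The someTable contents are invented for illustration, and only the IN form is shown because SQLite has no CROSS APPLY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The table with three int columns, holding the row from the question.
conn.execute("CREATE TABLE oneTable (int1 INTEGER, int2 INTEGER, int3 INTEGER)")
conn.execute("INSERT INTO oneTable VALUES (1, 2, 3)")

# Hypothetical contents for someTable: the integers 1 through 5.
conn.execute("CREATE TABLE someTable (someInt INTEGER)")
conn.executemany("INSERT INTO someTable VALUES (?)", [(n,) for n in range(1, 6)])

# Keep only the rows whose someInt appears in none of the three columns.
remaining = [
    row[0]
    for row in conn.execute(
        """
        SELECT sT.someInt
        FROM someTable AS sT
        WHERE NOT EXISTS (SELECT 1
                          FROM oneTable AS oT
                          WHERE sT.someInt IN (oT.int1, oT.int2, oT.int3))
        ORDER BY sT.someInt
        """
    )
]
print(remaining)
```

The same query keeps working if oneTable later holds more than one row, which a hard-coded NOT IN (1,2,3) list would not.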

Combine three columns from different tables into one row

I am new to SQL and am trying to combine a column value from three different tables into one row in DB2 Warehouse on Cloud. Each table consists of only one row and has a unique column name. So what I want is to join these three into one row that keeps their original column names.
Each table is built from a statement that looks like this:
SELECT SUM(FUEL_TEMP.FUEL_MLAD_VALUE) AS FUEL
FROM
(SELECT ML_ANOMALY_DETECTION.MLAD_METRIC AS MLAD_METRIC, ML_ANOMALY_DETECTION.MLAD_VALUE AS FUEL_MLAD_VALUE, ML_ANOMALY_DETECTION.TAG_NAME AS TAG_NAME, ML_ANOMALY_DETECTION.DATETIME AS DATETIME, DATA_CONFIG.SYSTEM_NAME AS SYSTEM_NAME
FROM ML_ANOMALY_DETECTION
INNER JOIN DATA_CONFIG ON
(ML_ANOMALY_DETECTION.TAG_NAME =DATA_CONFIG.TAG_NAME AND
DATA_CONFIG.SYSTEM_NAME = 'FUEL')
WHERE ML_ANOMALY_DETECTION.MLAD_METRIC = 'IFOREST_SCORE'
AND ML_ANOMALY_DETECTION.DATETIME >= (CURRENT DATE - 9 DAYS)
ORDER BY DATETIME DESC)
AS FUEL_TEMP
I have tried JOIN, INNER JOIN, UNION/UNION ALL, but can't get it to work as it should. How can I do this?
Use a cross-join like this:
create table table1 (field1 char(10));
create table table2 (field2 char(10));
create table table3 (field3 char(10));
insert into table1 values('value1');
insert into table2 values('value2');
insert into table3 values('value3');
select *
from table1
cross join table2
cross join table3;
Result:
field1 field2 field3
---------- ---------- ----------
value1 value2 value3
A cross join joins all the rows on the left with all the rows on the right. You will end up with a product of rows (table1 rows x table2 rows x table3 rows). Since each table only has one row, you will get (1 x 1 x 1) = 1 row.
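The same idea applies when each "table" is really a single-row aggregate subquery, as in the question. Below is a minimal sketch with Python's sqlite3 module. The readings table, its values, and the OIL and WATER names are hypothetical stand-ins for the real sources; only FUEL comes from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the joined ML_ANOMALY_DETECTION / DATA_CONFIG data.
conn.execute("CREATE TABLE readings (system_name TEXT, mlad_value REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [("FUEL", 1.5), ("FUEL", 2.5), ("OIL", 3.0), ("WATER", 0.5), ("WATER", 0.25)],
)

# Each derived table yields exactly one row with one named column,
# so cross-joining them yields exactly one combined row.
cur = conn.execute(
    """
    SELECT *
    FROM (SELECT SUM(mlad_value) AS FUEL FROM readings WHERE system_name = 'FUEL') AS f
    CROSS JOIN (SELECT SUM(mlad_value) AS OIL FROM readings WHERE system_name = 'OIL') AS o
    CROSS JOIN (SELECT SUM(mlad_value) AS WATER FROM readings WHERE system_name = 'WATER') AS w
    """
)
columns = [d[0] for d in cur.description]
rows = cur.fetchall()
print(columns, rows)
```

An aggregate without GROUP BY always returns one row, so the result stays a single row even when a filter matches nothing (that column is then NULL).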
Using UNION should solve your problem. Something like this:
SELECT
WarehouseDB1.WarehouseID AS TheID,
'A' AS TheSystem,
WarehouseDB1.TheValue AS TheValue
FROM WarehouseDB1
UNION
SELECT
WarehouseDB2.WarehouseID AS TheID,
'B' AS TheSystem,
WarehouseDB2.TheValue AS TheValue
FROM WarehouseDB2
UNION
SELECT
WarehouseDB3.WarehouseID AS TheID,
'C' AS TheSystem,
WarehouseDB3.TheValue AS TheValue
FROM WarehouseDB3
I'll adapt the code with your table names and rows if you tell me what they are. This kind of query would return something like the following:
TheID TheSystem TheValue
1 A 10
2 A 20
3 B 30
4 C 40
5 C 50
As long as your column names match in each query, you should get the desired results.

Query Optimization with millions of row in table

I have a table with 4 columns:
PKID, OutMailID, JobMailingDate, InsertDatetime
This is how the data gets inserted into the table:
PKID is the primary key of the table.
For a single OutMailID with a JobMailingDate, there are on average 3 records in the table, each with a
different insert date time. The table has millions of records.
I have many other tables that hold the same kind of data, each pertaining to a different category.
Now I would like to:
1) Find all OutMailIDs whose InsertDatetime is within the parameter date range.
2) Once I have the list of OutMailIDs, find the minimum InsertDatetime for each of these OutMailIDs where this minimum date falls between Param1 and Param2.
The data in the table looks like this:
Select 1 as PKID,1 as OutMailID,'2010/01/01' as JobMailingDate,'2010/01/01' as InsertDatetime
UNION ALL
Select 2 as PKID,1 as OutMailID,'2010/01/01' as JobMailingDate,'2010/01/02' as InsertDatetime
UNION ALL
Select 3 as PKID,1 as OutMailID,'2010/01/01' as JobMailingDate,'2010/01/03' as InsertDatetime
UNION ALL
Select 4 as PKID,1 as OutMailID,'2010/01/01' as JobMailingDate,'2010/01/04' as InsertDatetime
I want to perform both of the above steps in a single query, so my query is something like this:
Select
OutMailID,Min(InsertDatetime)
from
Table T
INNER JOIN
(
Select
OutMailID
from
Table
Where
InsertDatetime Between @Param1 and @Param2
) as T1 On (T1.OutMailID = T.outMailID)
Group by
OutMailID
Having Min(InsertDatetime) Between @Param1 and @Param2
But this is not performing well. Can anyone please suggest a good way of doing this?
The second problem is that once I have the output of the first query, I run the same query for the other categories to find the minimum InsertDatetime in each category, and once I have the minimum date for every category, I have to find the minimum insert date among all the categories.
Can you please help me with this?
Thanks
Atul
Does this query give you the desired results?
Select T.OutMailID, Min(T.InsertDatetime)
from Table T
INNER JOIN Table T1 On T1.OutMailID = T.outMailID
And T1.InsertDatetime Between @Param1 and @Param2
Group by T.OutMailID
How about using a WITH clause (a common table expression)? It works like a named inline view that you can reference later in the same statement. Here is an example:
with Table1 as (
Select OutMailID from Table Where InsertDatetime Between @Param1 and @Param2
),
Table2 as (
Select 4 as PKID,1 as OutMailID,'2010/01/01' as JobMailingDate,'2010/01/04' as InsertDatetime
)
select T.OutMailID, Min(T.InsertDatetime) as MinInsertDatetime
from Table as T
inner join Table1 as T1 on T1.OutMailID = T.outMailID
group by T.OutMailID
That way you can reference Table1 multiple times in the statement without repeating its definition. Whether the database evaluates it once or on every reference depends on the engine.
I think a simpler way to express your requirement is that you want all OutMailId whose first InsertDateTime is in the period specified.
It turns out that the JOIN is not necessary at all for this. This is a simpler version of your query:
Select t.OutMailID, Min(InsertDatetime)
from Table T
Group by OutMailID
Having Min(InsertDatetime) Between @Param1 and @Param2;
Many databases could take advantage of an index on Table(OutMailId, InsertDateTime) for this query.
Now, this query might not be super efficient, particularly if the range is small relative to the entire data. So, sticking with the above index, the following might work better:
select t.*
from (select OutMailId, min(InsertDatetime) as min_InsertDatetime
from table t
where InsertDatetime Between @Param1 and @Param2
group by OutMailId
) t
where not exists (select 1
from table t2
where t2.OutMailId = t.OutMailId and
t2.InsertDateTime < @Param1
);
This should use the index for the first subquery, limiting the number of ids. It should use the same index for the not exists, on a reduced number of rows.
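To see that the two formulations agree, here is a small sketch using Python's sqlite3 module. The table name mail, the sample rows, and the parameter values are made up for illustration, and SQLite's query planner is not SQL Server's, so this checks results only, not performance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mail (PKID INTEGER PRIMARY KEY, OutMailID INTEGER, InsertDatetime TEXT)")
conn.executemany(
    "INSERT INTO mail (OutMailID, InsertDatetime) VALUES (?, ?)",
    [
        (1, "2010-01-01"), (1, "2010-01-02"), (1, "2010-01-03"),  # first insert before the range
        (2, "2010-01-03"), (2, "2010-01-05"),                     # first insert inside the range
        (3, "2010-01-10"),                                        # first insert after the range
    ],
)
# The index suggested in the answer.
conn.execute("CREATE INDEX ix_mail ON mail (OutMailID, InsertDatetime)")

params = {"p1": "2010-01-02", "p2": "2010-01-06"}

# Simple form: aggregate everything, then filter on the minimum.
simple = conn.execute(
    """
    SELECT OutMailID, MIN(InsertDatetime)
    FROM mail
    GROUP BY OutMailID
    HAVING MIN(InsertDatetime) BETWEEN :p1 AND :p2
    """,
    params,
).fetchall()

# Narrowed form: aggregate only rows inside the range, then drop any
# OutMailID that also has an earlier row.
narrowed = conn.execute(
    """
    SELECT t.OutMailID, t.min_InsertDatetime
    FROM (SELECT OutMailID, MIN(InsertDatetime) AS min_InsertDatetime
          FROM mail
          WHERE InsertDatetime BETWEEN :p1 AND :p2
          GROUP BY OutMailID) AS t
    WHERE NOT EXISTS (SELECT 1
                      FROM mail AS t2
                      WHERE t2.OutMailID = t.OutMailID
                        AND t2.InsertDatetime < :p1)
    """,
    params,
).fetchall()

print(simple, narrowed)
```

OutMailID 1 has rows inside the range, but its first insert is earlier, so both forms exclude it.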

Using the distinct function in SQL

I have a SQL query I am running. What I want to know is whether there is a way of selecting the rows in a table where the value in one of those columns is distinct. When I use the distinct function, it returns all of the distinct rows, so...
select distinct teacher from class etc.
This works fine, but I am selecting multiple columns, so...
select distinct teacher, student etc.
but I don't want to retrieve the distinct rows, I want the distinct rows where the teacher is distinct. So this query would probably return the same teacher's name multiple times because the student value is different but what I would like is to return rows where the teachers are distinct, even if it means returning the teacher and one student name (because I don't need all the students).
I hope what I am trying to ask is clear but is it possible to use the distinct function on a single column even when selecting multiple columns or is there any other solution to this problem? Thanks.
The above is just an example I am giving. I don't know if using 'distinct' is the solution to my problem. I am not using teacher etc.; that was just an example to get the idea across. I am selecting multiple columns (about 10) from different tables. I have a query that gets the tabular result I want. Now I want to query that table to find the unique values in one particular column. So using the teacher example again, say I have written a query and I have all the teachers and all the pupils they teach. Now I want to go through each row in this table and email the teacher a message. But I don't want to email the teacher numerous times, just once, so I want to return all the columns from the table I have, where only the teacher value is distinct.
Col A Col B Col C Col D
a b c d
a c d b
b a a c
b c c c
A query I have produces the above table. Now I want only those rows where Col A values are unique. How would I go about it?
You have misunderstood the DISTINCT keyword. It is not a function and it does not modify a column. You cannot SELECT a, DISTINCT(b), c, DISTINCT(d) FROM SomeTable. DISTINCT is a modifier for the query itself, i.e. you don't select a distinct column, you make a SELECT DISTINCT query.
In other words: DISTINCT tells the server to go through the whole result set and remove all duplicate rows after the query has been performed.
If you need a column to contain every value once, you need to GROUP BY that column. Once you do that, the server needs to decide which student to select with each teacher if there are multiple, so you need to provide a so-called aggregate function like COUNT(). Example:
SELECT teacher, COUNT(student) AS amountStudents
FROM ...
GROUP BY teacher;
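The difference between the two is easy to see on a few rows. This is a minimal sketch using Python's sqlite3 module. The class table and its contents are invented, and MIN(student) is just one possible way of picking a single student per teacher.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE class (teacher TEXT, student TEXT)")
conn.executemany(
    "INSERT INTO class VALUES (?, ?)",
    [("T1", "S1"), ("T1", "S2"), ("T2", "S3")],
)

# DISTINCT applies to the whole row, so T1 still appears twice.
distinct_rows = conn.execute(
    "SELECT DISTINCT teacher, student FROM class ORDER BY teacher, student"
).fetchall()

# GROUP BY collapses to one row per teacher. Every other column must go
# through an aggregate that says how to combine that teacher's rows.
grouped_rows = conn.execute(
    """
    SELECT teacher, COUNT(student) AS amountStudents, MIN(student) AS oneStudent
    FROM class
    GROUP BY teacher
    ORDER BY teacher
    """
).fetchall()

print(distinct_rows)
print(grouped_rows)
```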
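One option is to use a GROUP BY on Col A. Note that selecting non-grouped columns like this is only accepted by permissive engines such as MySQL with ONLY_FULL_GROUP_BY disabled; SQL Server, for one, rejects it, and which row you get per group is arbitrary. Example:
SELECT * FROM table_name
GROUP BY ColA
That could return, for example:
a b c d
b a a c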
Based on the limited details you provided in your question (you should explain how/why your data is in different tables, what DB server you are using, etc) you can approach this from 2 different directions.
1. Reduce the number of columns in your query to only return the "teacher" and "email" columns, but keep the existing WHERE criteria. The problem with your current attempt is that neither DISTINCT nor GROUP BY understands that you only want one row for each value of the column you are trying to be distinct about. From what I understand, MySQL has support for what you are doing using GROUP BY, but MSSQL does not support result columns that are not included in the GROUP BY clause. If you don't need the "student" columns, don't put them in your result set.
2. Convert your existing query to use column-based sub-queries so that you only return a single result for non-grouped data.
Example:
SELECT t1.a
, (SELECT TOP 1 b FROM Table1 t2 WHERE t1.a = t2.a) AS b
, (SELECT TOP 1 c FROM Table1 t2 WHERE t1.a = t2.a) AS c
, (SELECT TOP 1 d FROM Table1 t2 WHERE t1.a = t2.a) AS d
FROM dbo.Table1 t1
WHERE (your criteria here)
GROUP BY t1.a
This query will not be fast if you have a lot of data, but it will return a single row per teacher with a somewhat random value for the remaining columns. You can also add an ORDER BY to each sub-query to further tweak the values returned for the additional columns.
I'm not sure if I am understanding this right but couldn't you do
SELECT * FROM class WHERE teacher IN (SELECT DISTINCT teacher FROM class)
This would return all of the data in each row where the teacher is distinct
distinct requires a unique result-set row. This means that whatever values you select from your table will need to be distinct together as a row from any other row in the result-set.
Using distinct can return the same value more than once from a given field as long as the other corresponding fields in the row are distinct as well.
As soulmerge and Shiraz have mentioned you'll need to use a GROUP BY and subselect. This worked for me.
CREATE TABLE #table (
[Teacher] [NVarchar](256) NOT NULL ,
[Student] [NVarchar](256) NOT NULL
)
INSERT INTO #table VALUES ('Teacher 1', 'Student 1')
INSERT INTO #table VALUES ('Teacher 1', 'Student 2')
INSERT INTO #table VALUES ('Teacher 2', 'Student 3')
INSERT INTO #table VALUES ('Teacher 2', 'Student 4')
SELECT
T.[Teacher],
(
SELECT TOP 1 T2.[Student]
FROM #table AS T2
WHERE T2.[Teacher] = T.[Teacher]
) AS [Student]
FROM #table AS T
GROUP BY T.[Teacher]
Results
Teacher 1, Student 1
Teacher 2, Student 3
You need to do it with a sub select where you take TOP 1 of student where the teacher is the same.
You may try "GROUP BY teacher" to return what you need.
What is the question your query is trying to answer?
Do you need to know which classes have only one teacher?
select class_name, count(teacher)
from class group by class_name having count(teacher)=1
Or are you looking for teachers with only one student?
select teacher, count(student)
from class group by teacher having count(student)=1
Or is it something else? The question you've posed assumes that using DISTINCT is the correct approach to the query you're trying to construct. It seems likely this is not the case. Could you describe the question you're trying to answer with DISTINCT?
You will need to say how your data is stored in-memory for us to say how you can query it.
But you could do a separate query to just get the distinct teachers.
select distinct teacher from class
I am struggling to understand exactly what you wish to do.. but you can do something like this:
SELECT DISTINCT ColA FROM Table WHERE ...
If you only select a singular column, the distinct will only grab those.
If you could clarify a little more, I could try to help a bit more.
You could use GROUP BY to separate the return values based on a single column value.
All you have to do is select just the column you want and do a SELECT DISTINCT:
Select Distinct column1 -- where your criteria...
The following might help you get to your solution. The other poster did point to this but his syntax for group by was incorrect.
Get all teachers that teach any classes.
Select teacher_table.teacher_id, count(*)
from teacher_table inner join classes_table
on teacher_table.teacher_id = classes_table.teacher_id
group by teacher_table.teacher_id
No one seems to understand what you want. I will take another guess.
Select * from tbl
Where ColA in (Select ColA from tbl Group by ColA Having Count(ColA) = 1)
This will return all data from rows where ColA is unique -i.e. there isn't another row with the same ColA value. Of course, that means zero rows from the sample data you provided.
select cola,colb,colc
from yourtable
where cola in
(
select cola from yourtable where your criteria group by cola having count(*) = 1
)
create table #temp (colA nchar, colB nchar, colC nchar, colD nchar, rownum int)
insert #temp (colA, colB, colC, colD, rownum)
select Test.ColA, Test.ColB, Test.ColC, Test.ColD, ROW_NUMBER() over (order by ColA) as rownum
from Test
select t1.ColA, ColB, ColC, ColD
from #temp as t1
join (
select ColA, MIN(rownum) [min]
from #temp
group by Cola)
as t2 on t1.Cola = t2.Cola and t1.rownum = t2.[min]
This will return a single row for each value of the colA.
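A more compact variant of the same idea numbers the rows within each ColA group and keeps the first one, with no temp table. The sketch below runs it with Python's sqlite3 module on the four sample rows from the question. It assumes SQLite 3.25 or later for window functions, and ordering by ColB is my own arbitrary choice of which row counts as "first".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Test (ColA TEXT, ColB TEXT, ColC TEXT, ColD TEXT)")
conn.executemany(
    "INSERT INTO Test VALUES (?, ?, ?, ?)",
    [("a", "b", "c", "d"), ("a", "c", "d", "b"), ("b", "a", "a", "c"), ("b", "c", "c", "c")],
)

# Number the rows inside each ColA group, then keep only row 1 of each group.
one_per_cola = conn.execute(
    """
    SELECT ColA, ColB, ColC, ColD
    FROM (
        SELECT t.*,
               ROW_NUMBER() OVER (PARTITION BY ColA ORDER BY ColB) AS rn
        FROM Test AS t
    )
    WHERE rn = 1
    ORDER BY ColA
    """
).fetchall()
print(one_per_cola)
```

Changing the ORDER BY inside OVER (...) controls which row survives for each ColA value.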
CREATE FUNCTION dbo.DistinctList
(
@List VARCHAR(MAX),
@Delim CHAR
)
RETURNS
VARCHAR(MAX)
AS
BEGIN
-- Holds each item parsed out of the delimited list
DECLARE @ParsedList TABLE
(
Item VARCHAR(MAX)
)
DECLARE @list1 VARCHAR(MAX), @Pos INT, @rList VARCHAR(MAX)
SET @list = LTRIM(RTRIM(@list)) + @Delim
SET @pos = CHARINDEX(@delim, @list, 1)
-- Peel off one item per iteration until no delimiter remains
WHILE @pos > 0
BEGIN
SET @list1 = LTRIM(RTRIM(LEFT(@list, @pos - 1)))
IF @list1 <> ''
INSERT INTO @ParsedList VALUES (CAST(@list1 AS VARCHAR(MAX)))
SET @list = SUBSTRING(@list, @pos+1, LEN(@list))
SET @pos = CHARINDEX(@delim, @list, 1)
END
-- Rebuild a comma-separated list from the distinct items
SELECT @rlist = COALESCE(@rlist+',','') + item
FROM (SELECT DISTINCT Item FROM @ParsedList) t
RETURN @rlist
END
GO