Joining a list of values with table rows in SQL

Suppose I have a list of values, such as 1, 2, 3, 4, 5 and a table where some of those values exist in some column. Here is an example:
id name
1 Alice
3 Cindy
5 Elmore
6 Felix
I want to create a SELECT statement that will include all of the values from my list as well as the information from those rows that match the values, i.e., perform a LEFT OUTER JOIN between my list and the table, so the result would be as follows:
id name
1 Alice
2 (null)
3 Cindy
4 (null)
5 Elmore
How do I do that without creating a temp table or using multiple UNION operators?

If you're using Microsoft SQL Server 2008 or later, you can use a Table Value Constructor:
SELECT v.valueId, m.name
FROM (VALUES (1), (2), (3), (4), (5)) v(valueId)
LEFT JOIN otherTable m
    ON m.id = v.valueId
Postgres also has this construct (VALUES Lists):
SELECT * FROM (VALUES (1, 'one'), (2, 'two'), (3, 'three')) AS t (num,letter)
Also note the Common Table Expression syntax, which can be handy for joins:
WITH my_values(num, str) AS (
    VALUES (1, 'one'), (2, 'two'), (3, 'three')
)
SELECT num, str FROM my_values
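For instance, a sketch of the original left join written with a CTE (reusing otherTable from the SQL Server example above as an assumed table name):
WITH my_values(valueId) AS (
    VALUES (1), (2), (3), (4), (5)
)
SELECT v.valueId, m.name
FROM my_values v
LEFT JOIN otherTable m ON m.id = v.valueId;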
With Oracle it's also possible, though heavier (from Ask Tom):
with id_list as (
select 10 id from dual union all
select 20 id from dual union all
select 25 id from dual union all
select 70 id from dual union all
select 90 id from dual
)
select * from id_list;
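The id_list CTE can then be left-joined to your table in the same way; a minimal sketch (otherTable and its id column are assumptions about your schema):
with id_list as (
  select 1 id from dual union all
  select 2 id from dual union all
  select 3 id from dual union all
  select 4 id from dual union all
  select 5 id from dual
)
select l.id, t.name
from id_list l
left join otherTable t on t.id = l.id;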

The following solution for Oracle is adapted from this source. The basic idea is to exploit Oracle's hierarchical queries: you have to specify a maximum length of the list (100 in the sample query below).
select d.lstid
     , t.name
from (
    select substr(
             csv
           , instr(csv, ',', 1, lev) + 1
           , instr(csv, ',', 1, lev + 1) - instr(csv, ',', 1, lev) - 1
           ) lstid
    from (select ',' || '1,2,3,4,5' || ',' csv from dual)
       , (select level lev from dual connect by level <= 100)
    where lev <= length(csv) - length(replace(csv, ',')) - 1
) d
left join test t on (d.lstid = t.id)
;
Check out this SQL Fiddle to see it work.

Bit late on this, but for Oracle you could do something like this to get a table of values:
SELECT rownum + 5 /*start*/ - 1 as myval
FROM dual
CONNECT BY LEVEL <= 100 /*end*/ - 5 /*start*/ + 1
... And then join that to your table:
SELECT *
FROM
(SELECT rownum + 1 /*start*/ - 1 myval
FROM dual
CONNECT BY LEVEL <= 5 /*end*/ - 1 /*start*/ + 1) mypseudotable
left outer join myothertable
on mypseudotable.myval = myothertable.correspondingval

Assuming myTable is the name of your table, the following code should work.
;with x as
(
select top (select max(id) from [myTable]) number from [master]..spt_values
),
y as
(select row_number() over (order by x.number) as id
from x)
select y.id, t.name
from y left join myTable as t
on y.id = t.id;
Caution: this is a SQL Server implementation.
fiddle

To generate the sequential numbers needed for part of the output (this method saves you from typing out the values for n numbers):
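The loop below assumes the global temporary table ##table already exists; a minimal sketch of that setup (a single int column named id, to match the final query below):
create table ##table (id int)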
declare @site as int
set @site = 1
while @site <= 200
begin
insert into ##table
values (@site)
set @site = @site + 1
end
Final output (after the step above):
select * from ##table
select v.id,m.name from ##table as v
left outer join [source_table] m
on m.id=v.id

Suppose your table holding the values 1, 2, 3, 4, 5 is named list_of_values, and suppose the table that contains some of those values along with the name column is named some_values. You can do:
SELECT B.id,A.name
FROM [list_of_values] AS B
LEFT JOIN [some_values] AS A
ON B.ID = A.ID
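A minimal sketch of such a list table, in case you need to materialize it first (names follow the answer above; the single-column layout and a database supporting multi-row VALUES are assumptions):
CREATE TABLE list_of_values (ID int);
INSERT INTO list_of_values (ID) VALUES (1), (2), (3), (4), (5);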

Related

How to find missing numbers by 100s in Oracle

I need to find the missing numbers in a table column in Oracle, where the missing numbers must be taken by 100s, meaning that if at least one number is found between 2000 and 2099, all the missing numbers between 2000 and 2099 must be returned, and so on.
Here is an example that clarifies what I need:
create table test1 ( a number(9,0));
insert into test1 values (2001);
insert into test1 values (2002);
insert into test1 values (2004);
insert into test1 values (2105);
insert into test1 values (3006);
insert into test1 values (9410);
commit;
The result must be 2000, 2003, 2005 to 2099, 2100 to 2104, 2106 to 2199, 3000 to 3005, 3007 to 3099, 9400 to 9409, 9411 to 9499.
I started with this query, but it's obviously not returning what I need:
SELECT Level+(2000-1) FROM dual CONNECT BY LEVEL <= 9999
MINUS SELECT a FROM test1;
You can use a hierarchical query as follows:
SELECT A FROM (
    SELECT A + COLUMN_VALUE - 1 AS A
    FROM (SELECT DISTINCT TRUNC(A, -2) A
          FROM TEST_TABLE) T
    CROSS JOIN TABLE(CAST(MULTISET(
        SELECT LEVEL FROM DUAL CONNECT BY LEVEL <= 100
    ) AS SYS.ODCINUMBERLIST)) LEVELS
)
MINUS
SELECT A FROM TEST_TABLE;
A
----------
2000
2003
2005
2006
2007
2008
2009
.....
.....
I like to use standard recursive queries for this.
with nums (a, max_a) as (
select min(a), max(a) from test1
union all
select a + 1, max_a from nums where a < max_a
)
select n.a
from nums n
where not exists (select 1 from test1 t where t.a = n.a)
order by n.a
The with clause takes the minimum and maximum value of a in the table, and generates all numbers in between. Then, the outer query filters on those that do not exist in the table.
If you want to generate ranges of missing numbers instead of a comprehensive list, you can use window functions instead:
select a + 1 start_a, lead_a - 1 end_a
from (
select a, lead(a) over(order by a) lead_a
from test1
) t
where lead_a <> a + 1
Demo on DB Fiddle
EDIT:
If you want the missing values within ranges of hundreds, then we can slightly adapt the recursive solution:
with nums (a, max_a) as (
select distinct floor(a / 100) * 100 a, floor(a / 100) * 100 + 100 from test1
union all
select a + 1, max_a from nums where a < max_a
)
select n.a
from nums n
where not exists (select 1 from test1 t where t.a = n.a)
order by n.a
Assuming you define fixed upper and lower bounds for the range, you then just need to eliminate the existing values from the generated sequence by using NOT EXISTS, such as:
SQL> exec :min_val := 2000
SQL> exec :max_val := 2499
SQL> SELECT *
FROM
(
SELECT level + :min_val - 1 AS nr
FROM dual
CONNECT BY level <= :max_val - :min_val + 1
)
WHERE NOT EXISTS ( SELECT * FROM test1 WHERE a = nr )
ORDER BY nr;
/
Demo

can't use JOIN with generate_series on Redshift

The generate_series function on Redshift works as expected when used in a simple select statement.
WITH series AS (
SELECT n as id from generate_series (-10, 0, 1) n
) SELECT * FROM series;
-- Works fine
As soon as I add a JOIN condition, Redshift throws
com.amazon.support.exceptions.ErrorException: Function
"generate_series(integer,integer,integer)" not supported
DROP TABLE testing;
CREATE TABLE testing (
id INT
);
WITH series AS (
SELECT n as id from generate_series (-10, 0, 1) n
) SELECT * FROM series S JOIN testing T ON S.id = T.id;
-- Function "generate_series(integer,integer,integer)" not supported.
Redshift Version
SELECT version();
-- PostgreSQL 8.0.2 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.4.2 20041017 (Red Hat 3.4.2-6.fc3), Redshift 1.0.1485
Are there any workarounds to make this work?
generate_series is not supported by Redshift; it only works standalone on the leader node.
A workaround would be using row_number against any table that has a sufficient number of rows:
with
series as (
select (row_number() over ()) - 11 as id from some_table limit 11
) ...
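Put together with the testing table from the question, that workaround might look like this (a sketch; some_table is any existing table with at least 11 rows, so the arithmetic covers -10 through 0):
with series as (
    select (row_number() over ()) - 11 as id
    from some_table
    limit 11
)
select *
from series s
join testing t on s.id = t.id;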
Also, this question has been asked multiple times already.
You are correct that this does not work on Redshift.
See here.
The easiest workaround is to create a permanent table "manually" beforehand with the values you need in it, e.g. you could have rows in that table for -1000 to +1000, and then select the range you need from that table.
So for your example you would have something like
WITH series AS (
SELECT n as id from (select num as n from newtable where num between -10 and 0) n
) SELECT * FROM series S JOIN testing T ON S.id = T.id;
Does that work for you?
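For completeness, here is one way such a permanent numbers table could be populated up front, reusing the row_number idea from the earlier workaround (a sketch; some_existing_table is an assumed table with at least 2001 rows, and the num column name matches the query above):
CREATE TABLE newtable (num INT);

INSERT INTO newtable (num)
SELECT n.num
FROM (
    SELECT (ROW_NUMBER() OVER ()) - 1001 AS num
    FROM some_existing_table
) n
WHERE n.num BETWEEN -1000 AND 1000;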
Alternatively, if you cannot create the table beforehand or prefer not to, you could use something like this
with ten_numbers as (select 1 as num union select 2 union select 3 union select 4 union select 5 union select 6 union select 7 union select 8 union select 9 union select 0)
,generated_numbers AS
(
SELECT (1000*t1.num) + (100*t2.num) + (10*t3.num) + t4.num-5000 as gen_num
FROM ten_numbers AS t1
JOIN ten_numbers AS t2 ON 1 = 1
JOIN ten_numbers AS t3 ON 1 = 1
JOIN ten_numbers AS t4 ON 1 = 1
)
select gen_num from generated_numbers
where gen_num between -10 and 0
order by 1;

INSERTing results FROM table USING CTE

I'm trying to insert values from a table using a CTE, but something doesn't work. I suppose a CTE doesn't work this way?
Can anyone explain what's wrong here and suggest an alternative, or correct my query if I'm missing something?
WITH t1 AS( SELECT id_sitio as id_sitio FROM Sitio WHERE nombre = 'Insert your site name here')
INSERT INTO Sitio_tipo_equipo (id_sitio, id_tipo_equipo)
VALUES
(t1.id_sitio ,4),
(t1.id_sitio, 6),
(t1.id_sitio, 7),
(t1.id_sitio, 9);
I think this is your intention:
WITH t1 AS (
SELECT id_sitio
FROM Sitio
WHERE nombre = 'Insert your site name here'
)
INSERT INTO Sitio_tipo_equipo(id_sitio, id_tipo_equipo)
SELECT t1.id_sitio, n.n
FROM t1 CROSS JOIN
(SELECT 4 as n UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 9
) n
You wouldn't normally use a CTE with an INSERT ... VALUES statement.
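On SQL Server 2008+ or Postgres, the same idea can also be written with a row constructor in place of the UNION ALL derived table (a sketch equivalent to the query above; the version requirement is the only assumption):
WITH t1 AS (
    SELECT id_sitio
    FROM Sitio
    WHERE nombre = 'Insert your site name here'
)
INSERT INTO Sitio_tipo_equipo (id_sitio, id_tipo_equipo)
SELECT t1.id_sitio, v.n
FROM t1
CROSS JOIN (VALUES (4), (6), (7), (9)) AS v(n);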
With the information at hand...
INSERT INTO Sitio_tipo_equipo (id_sitio, id_tipo_equipo)
SELECT id_sitio,eq.id_tipo_equipo
FROM Sitio
CROSS APPLY (
SELECT 4 [id_tipo_equipo] UNION
SELECT 6 UNION
SELECT 7 UNION
SELECT 9
) eq
WHERE nombre = 'Insert your site name here'
What I would do if I can correctly guess your database structure:
INSERT INTO Sitio_tipo_equipo (id_sitio, id_tipo_equipo)
SELECT id_sitio,id_tipo_equipo
FROM Sitio
CROSS APPLY tipo_equipo
WHERE nombre = 'Insert your site name here'
AND id_tipo_equipo IN (4,6,7,9)

Find overlapping sets of data in a table

I need to identify duplicate sets of data and give the sets whose data is similar a group id.
id threshold cost
-- ---------- ----------
1 0 9
1 100 7
1 500 6
2 0 9
2 100 7
2 500 6
I have thousands of these sets; most are the same but with different ids. I need to find all the sets that share the same thresholds and cost amounts and give them a group id. I'm just not sure where to begin. Is the best way to iterate and insert each set into a table, and then iterate through each set in the table to find what already exists?
This is one of those cases where you can try to do something with relational operators. Or, you can just say: "let's put all the information in a string and use that as the group id". SQL Server seems to discourage this approach, but it is possible. So, let's characterize the groups using:
select d.id,
(select cast(threshold as varchar(8000)) + '-' + cast(cost as varchar(8000)) + ';'
from data d2
where d2.id = d.id
order by threshold
for xml path ('')
) as groupname
from data d
group by d.id;
Oh, I think that solves your problem. The groupname can serve as the group id. If you want a numeric id (which is probably a good idea), use dense_rank():
select d.id, dense_rank() over (order by groupname) as groupid
from (select d.id,
(select cast(threshold as varchar(8000)) + '-' + cast(cost as varchar(8000)) + ';'
from data d2
where d2.id = d.id
order by threshold
for xml path ('')
) as groupname
from data d
group by d.id
) d;
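On SQL Server 2017 or later, STRING_AGG can replace the FOR XML PATH trick for building the group name; a sketch under that version assumption, using the same data table and columns:
select d.id, dense_rank() over (order by d.groupname) as groupid
from (select id,
             string_agg(cast(threshold as varchar(20)) + '-' + cast(cost as varchar(20)), ';')
                 within group (order by threshold) as groupname
      from data
      group by id
     ) d;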
Here's the solution to my interpretation of the question:
IF OBJECT_ID('tempdb..#tempGrouping') IS NOT NULL DROP Table #tempGrouping;
;
WITH BaseTable AS
(
SELECT 1 id, 0 as threshold, 9 as cost
UNION SELECT 1, 100, 7
UNION SELECT 1, 500, 6
UNION SELECT 2, 0, 9
UNION SELECT 2, 100, 7
UNION SELECT 2, 500, 6
UNION SELECT 3, 1, 9
UNION SELECT 3, 100, 7
UNION SELECT 3, 500, 6
)
, BaseCTE AS
(
SELECT
id
--,dense_rank() over (order by threshold, cost ) as GroupId
,
(
SELECT CAST(TblGrouping.threshold AS varchar(8000)) + '/' + CAST(TblGrouping.cost AS varchar(8000)) + ';'
FROM BaseTable AS TblGrouping
WHERE TblGrouping.id = BaseTable.id
ORDER BY TblGrouping.threshold, TblGrouping.cost
FOR XML PATH ('')
) AS MultiGroup
FROM BaseTable
GROUP BY id
)
,
CTE AS
(
SELECT
*
,DENSE_RANK() OVER (ORDER BY MultiGroup) AS GroupId
FROM BaseCTE
)
SELECT *
INTO #tempGrouping
FROM CTE
-- SELECT * FROM #tempGrouping;
UPDATE BaseTable
SET BaseTable.GroupId = #tempGrouping.GroupId
FROM BaseTable
INNER JOIN #tempGrouping
ON BaseTable.Id = #tempGrouping.Id
IF OBJECT_ID('tempdb..#tempGrouping') IS NOT NULL DROP Table #tempGrouping;
Here BaseTable is your table, and you don't need the CTE "BaseTable", because you already have a data table.
You may need to take extra precautions if your threshold and cost fields can be NULL.

How do you find a missing number in a table field starting from a parameter and incrementing sequentially?

Let's say I have a SQL Server table:
NumberTaken CompanyName
2           Fred
3           Fred
4           Fred
6           Fred
7           Fred
8           Fred
11          Fred
I need an efficient way to pass in a parameter [StartingNumber] and to count from [StartingNumber] sequentially until I find a number that is missing.
For example notice that 1, 5, 9 and 10 are missing from the table.
If I supplied the parameter [StartingNumber] = 1, it would check to see if 1 exists; if it did, it would check to see if 2 exists, and so on. Since 1 is missing, 1 would be returned here.
If [StartingNumber] = 6, the function would return 9.
In C# pseudocode it would basically be:
int ctr = [StartingNumber]
while([SELECT NumberTaken FROM tblNumbers Where NumberTaken = ctr] != null)
ctr++;
return ctr;
The problem with that code is that it seems really inefficient if there are thousands of numbers in the table. Also, I can write it in C# code or in a stored procedure, whichever is more efficient.
Thanks for the help
A solution using JOIN:
select min(r1.NumberTaken) + 1
from MyTable r1
left outer join MyTable r2 on r2.NumberTaken = r1.NumberTaken + 1
where r1.NumberTaken >= 1 --your starting number
and r2.NumberTaken is null
I called my table Blank, and used the following:
declare @StartOffset int = 2
; With Missing as (
select @StartOffset as N where not exists(select * from Blank where ID = @StartOffset)
), Sequence as (
select @StartOffset as N from Blank where ID = @StartOffset
union all
select b.ID from Blank b inner join Sequence s on b.ID = s.N + 1
)
select COALESCE((select N from Missing),(select MAX(N)+1 from Sequence))
You basically have two cases - either your starting value is missing (so the Missing CTE will contain one row), or it's present, so you count forwards using a recursive CTE (Sequence), and take the max from that and add 1
Edit from comment. Yes, create another CTE at the top that has your filter criteria, then use that in the rest of the query:
declare @StartOffset int = 2
; With BlankFilters as (
select ID from Blank where hasEntered <> 1
), Missing as (
select @StartOffset as N where not exists(select * from BlankFilters where ID = @StartOffset)
), Sequence as (
select @StartOffset as N from BlankFilters where ID = @StartOffset
union all
select b.ID from BlankFilters b inner join Sequence s on b.ID = s.N + 1
)
select COALESCE((select N from Missing),(select MAX(N)+1 from Sequence))
This may now return an ID that does exist in the table, but with hasEntered = 1.
Tables:
create table Blank (
ID int not null,
Name varchar(20) not null
)
insert into Blank(ID,Name)
select 2 ,'Fred' union all
select 3 ,'Fred' union all
select 4 ,'Fred' union all
select 6 ,'Fred' union all
select 7 ,'Fred' union all
select 8 ,'Fred' union all
select 11 ,'Fred'
go
Try the set-based approach; it should be faster:
select min(t1.NumberTaken)+1 as "min_missing" from t t1
where not exists (select 1 from t t2
where t1.NumberTaken = t2.NumberTaken+1)
and t1.NumberTaken > @StartingNumber
This is Sybase syntax, so massage for SQL server consumption if needed.
Create a temp table with all numbers from StartingValue to EndValue and LEFT OUTER JOIN to your data table.
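A minimal sketch of that approach in SQL Server (table and column names follow the question's pseudocode; the @StartingValue/@EndValue parameters and the recursive CTE used to fill the temp table are assumptions):
DECLARE @StartingValue int = 6, @EndValue int = 1000;

CREATE TABLE #AllNumbers (Number int PRIMARY KEY);

;WITH n AS (
    SELECT @StartingValue AS Number
    UNION ALL
    SELECT Number + 1 FROM n WHERE Number < @EndValue
)
INSERT INTO #AllNumbers (Number)
SELECT Number FROM n
OPTION (MAXRECURSION 0);

-- the first number at or after @StartingValue that is not taken
SELECT MIN(a.Number) AS FirstMissing
FROM #AllNumbers a
LEFT OUTER JOIN tblNumbers t ON t.NumberTaken = a.Number
WHERE t.NumberTaken IS NULL;

DROP TABLE #AllNumbers;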