Find missing values in a sequence (SQL)

Table1
Empid number
----------------
100 1
100 2
100 4
100 5
100 6
101 1
I'm teaching myself SQL, and one task I've come across is finding the missing values in a sequence up to 12 and outputting which empid each missing value is associated with.
I've attempted an approach that takes the above table and starts like:
SELECT a number +1 , Min("through), MIn(by number) - 1
The whole approach uses the existing numbers to find the missing next/previous number. I'm able to output which numbers are missing. However, I do not know how to group them with the associated id.
I also feel like I've overcomplicated the task. I'm looking for guidance on the best / most efficient way of going about this.

Assuming that all empids and numbers are in the table somewhere, you can do this with a cross join and filter. In MS Access, this looks like:
select e.empid, n.number
from (select distinct empid from t) as e,
(select distinct number from t) as n
where not exists (select 1
from t
where t.empid = e.empid and t.number = n.number
);
This will not quite work for the data you have supplied, because some numbers (3 and 7 through 12) do not appear in the table for any empid. To handle that situation, you need a table that has the 12 numbers you are looking for.

This assumes you create a Numbers table with a Number column holding 12 rows, with values 1 to 12.
SELECT N.*, E.*
FROM NUMBERS N
CROSS JOIN (SELECT Distinct EmpID FROM table1) E
LEFT JOIN table1 T
on T.EmpID = E.EmpID
and T.Number = N.Number
WHERE T.EmpID is null
Or substitute a derived table for the Numbers table above, something like:
(Select 1 as Number UNION ALL
Select 2 as Number UNION ALL
Select 3 as Number UNION ALL
Select 4 as Number UNION ALL
Select 5 as Number UNION ALL
Select 6 as Number UNION ALL
Select 7 as Number UNION ALL
Select 8 as Number UNION ALL
Select 9 as Number UNION ALL
Select 10 as Number UNION ALL
Select 11 as Number UNION ALL
Select 12 as Number)
I can't remember if MS Access will let you do this, though...
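As a rough, runnable illustration of the numbers-table approach, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MS Access. The function name missing_numbers, the upper parameter, and the recursive CTE used to build 1 to 12 are assumptions made for this example, not part of the answers above.

```python
import sqlite3

def missing_numbers(rows, upper=12):
    """Return (empid, number) pairs that are absent from rows, for numbers 1..upper."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE table1 (empid INTEGER, number INTEGER)")
    con.executemany("INSERT INTO table1 VALUES (?, ?)", rows)
    query = """
        WITH RECURSIVE numbers(number) AS (   -- stand-in for a real numbers table
            SELECT 1
            UNION ALL
            SELECT number + 1 FROM numbers WHERE number < ?
        )
        SELECT e.empid, n.number
        FROM numbers AS n
        CROSS JOIN (SELECT DISTINCT empid FROM table1) AS e
        LEFT JOIN table1 AS t
               ON t.empid = e.empid AND t.number = n.number
        WHERE t.empid IS NULL                 -- keep only combinations with no match
        ORDER BY e.empid, n.number
    """
    result = con.execute(query, (upper,)).fetchall()
    con.close()
    return result

# Sample data copied from the question.
sample = [(100, 1), (100, 2), (100, 4), (100, 5), (100, 6), (101, 1)]
for empid, number in missing_numbers(sample):
    print(empid, number)
```

The cross join produces every (empid, number) combination, and the left join with the IS NULL filter keeps only the combinations that have no row in the table.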

Related

SQLite - Return Rows Even If They Are Duplicates

I have a simple SQLite table which has just one ID column.
I have some variable IDs that may be duplicates of each other like: 1,2,3,4,3,1 (These IDs are just examples, there could be hundreds of them).
And I have a simple query as follows:
SELECT ID FROM TABLE WHERE ID in (1,2,3,4,3,1)
In the usual case the answer contains only 4 rows with ids 1,2,3,4. Is there any way to force SQLite to return rows in the order of the request (1,2,3,4,3,1) even if they are duplicates?
I have n IDs in my query and I want n rows in return even if they are duplicates.
Edit: The Table Definition is:
CREATE TABLE TEST(ID TEXT PRIMARY KEY)
You can use left join:
select t.*
from (select 1 as id, 1 as ord union all
select 2 as id, 2 as ord union all
select 3 as id, 3 as ord union all
select 4 as id, 4 as ord union all
select 3 as id, 5 as ord union all
select 1 as id, 6 as ord
) ids left join
t
on t.id = ids.id
order by ids.ord;
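The same idea can be tried end to end from Python's sqlite3 module. This is only a sketch: the temporary table ids, its ord column, and the function name rows_in_request_order are assumptions for illustration. Loading the requested IDs into a temporary table avoids writing one "union all" branch per ID when there are hundreds of them.

```python
import sqlite3

def rows_in_request_order(stored_ids, requested_ids):
    """Return one result per requested id, in request order, duplicates included."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE test (id TEXT PRIMARY KEY)")
    con.executemany("INSERT INTO test VALUES (?)", [(i,) for i in stored_ids])
    # ord records the position of each requested id so the order can be restored.
    con.execute("CREATE TEMP TABLE ids (ord INTEGER PRIMARY KEY, id TEXT)")
    con.executemany("INSERT INTO ids (ord, id) VALUES (?, ?)",
                    list(enumerate(requested_ids)))
    rows = con.execute("""
        SELECT t.id
        FROM ids
        LEFT JOIN test AS t ON t.id = ids.id
        ORDER BY ids.ord
    """).fetchall()
    con.close()
    return [r[0] for r in rows]

print(rows_in_request_order(["1", "2", "3", "4"], ["1", "2", "3", "4", "3", "1"]))
```

Because this is a left join, a requested ID that is not in the table still yields a row, with NULL (None in Python) in place of the ID.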

SQL Duplicate Row Results

I have a very simple select query which is being used to create an input file for a piece of software. I have the query pulling all the required fields, however I need to replicate the results six times with a hard coded ID number (1,2,3,4,5,6).
I have seen CROSS APPLY and PIVOT, but the problem is that the column I need to use for these doesn't exist, as I'm hard-coding the number.
Any help would be much appreciated.
Thanks in Advance
Maybe like this:
select CJ.ID,T.* from dbo.Table T
CROSS JOIN
(select 1 ID UNION ALL select 2 ID UNION ALL select 3 ID UNION ALL select 4 ID UNION ALL select 5 ID UNION ALL select 6 ID) CJ
Bit of a pure guess here, but are you saying that every row in your table needs to be repeated 6 times with the ID 1-6? If so, you can use a CTE of the values 1-6 and CROSS APPLY to that.
WITH Nums AS(
SELECT *
FROM (VALUES (1),(2),(3),(4),(5),(6)) V(N))
SELECT *
FROM YourTable YT
CROSS APPLY Nums;
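For a quick check of the row-replication idea outside SQL Server, here is a small sketch using Python's sqlite3 module. SQLite has no CROSS APPLY, so this uses a plain CROSS JOIN against a VALUES list; the table your_table, its name column, and the function replicate_rows are hypothetical.

```python
import sqlite3

def replicate_rows(names):
    """Repeat every row of a one-column table six times, tagged with ids 1..6."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE your_table (name TEXT)")
    con.executemany("INSERT INTO your_table VALUES (?)", [(n,) for n in names])
    rows = con.execute("""
        WITH nums(id) AS (VALUES (1), (2), (3), (4), (5), (6))
        SELECT nums.id, yt.name
        FROM your_table AS yt
        CROSS JOIN nums          -- every row paired with every hard-coded id
        ORDER BY yt.name, nums.id
    """).fetchall()
    con.close()
    return rows

for row in replicate_rows(["a", "b"]):
    print(row)
```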

Sql in oracle to find out missing records from its distinct values

I am sorry, this one is not working... Maybe I should have clarified this earlier. The values A, B, C, D, etc. are the distinct values for CODE in the table. There are several hundred IDs in the table, and each ID can have one to many Code values. In the example, assume that there are 5 distinct values of Code in Table A. There are 4 IDs, and each ID is associated in Table A as follows:
ID Code
1 A
1 B
1 C
2 D
2 A
3 B
3 C
4 A
4 B
4 C
4 D
4 E
As you see above there are several IDs associated with different Code values. I need the result as follows
ID CODE
1 D
1 E
2 B
2 C
2 E
3 A
3 D
3 E
ID 4 should not return anything because it contain all possible Codes (in this case A,B,C,D,E)
First, take the distinct values of each column in separate sub-queries. Second, cross join them, which gives you all possible combinations.
Finally, exclude the combinations that are already present:
select *
from
(select distinct ID
from your_table) ytI, /* this sub-query returns every distinct ID */
(select distinct code
from your_table) ytc /* this sub-query returns every distinct code */
where (ytI.ID,ytc.Code) /* the two sub-queries are cross joined, as there is no join condition between them */
not in /* exclude those records which are already present */
(select id,code
from your_table yt_i)
Try this:
select T2.ID, T1.missing_value
from
(
select 'A' missing_value from dual UNION
select 'B' from dual UNION
select 'C' from dual UNION
select 'D' from dual UNION
select 'E' from dual
) T1,
(
select distinct id from MYTABLE
) T2
WHERE NOT EXISTS
(
SELECT * FROM MYTABLE M WHERE M.CODE = T1.missing_value and M.ID = T2.ID
)
ORDER BY T2.ID, T1.missing_value
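To check the cross join plus NOT EXISTS pattern against the sample data from the question, here is a minimal sketch using Python's sqlite3 module rather than Oracle. The function name missing_codes is an assumption; the table and column names follow the first answer.

```python
import sqlite3

def missing_codes(pairs):
    """Return (id, code) combinations that do not occur in pairs."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE your_table (id INTEGER, code TEXT)")
    con.executemany("INSERT INTO your_table VALUES (?, ?)", pairs)
    rows = con.execute("""
        SELECT i.id, c.code
        FROM (SELECT DISTINCT id FROM your_table) AS i
        CROSS JOIN (SELECT DISTINCT code FROM your_table) AS c
        WHERE NOT EXISTS (
            SELECT 1 FROM your_table AS yt
            WHERE yt.id = i.id AND yt.code = c.code
        )
        ORDER BY i.id, c.code
    """).fetchall()
    con.close()
    return rows

# Sample data copied from the question.
data = [(1, "A"), (1, "B"), (1, "C"), (2, "D"), (2, "A"), (3, "B"), (3, "C"),
        (4, "A"), (4, "B"), (4, "C"), (4, "D"), (4, "E")]
for row in missing_codes(data):
    print(row)
```

ID 4 produces no rows because it already has every code. Note that this variant only works when each code appears for at least one ID; otherwise the list of codes has to come from somewhere else, as in the second answer.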

Selecting a sequence in SQL

There seem to be a few blog posts on this topic, but the solutions really are not so intuitive. Surely there's a "canonical" way?
I'm using Teradata SQL.
How would I select:
A range of numbers
A date range
E.g.
SELECT 1:10 AS Nums
SELECT 1-1-2010:5-1-2014 AS Dates1
The result would be 10 rows (1 - 10) in the first SELECT query and ~(365 * 3.5) rows in the second?
The "canonical" way to do this in SQL is using recursive CTEs, which the more recent versions of Teradata support.
For your first example:
with recursive nums(n) as (
select 1 as n
union all
select n + 1
from nums
where n < 10
)
select *
from nums;
You can do something similar for dates.
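As an illustration of the date variant, here is a sketch of a recursive CTE that walks a date range one day at a time. It runs on SQLite through Python's sqlite3 module, not on Teradata, so the date(d, '+1 day') arithmetic is SQLite-specific and would need to be rewritten for Teradata. The function name date_range is an assumption.

```python
import sqlite3

def date_range(start, end):
    """Return every ISO date from start to end inclusive, one row per day."""
    con = sqlite3.connect(":memory:")
    rows = con.execute("""
        WITH RECURSIVE dates(d) AS (
            SELECT date(?)                 -- anchor row: the first day
            UNION ALL
            SELECT date(d, '+1 day')       -- recursive step: add one day
            FROM dates
            WHERE d < date(?)              -- stop once the end date is reached
        )
        SELECT d FROM dates
    """, (start, end)).fetchall()
    con.close()
    return [r[0] for r in rows]

days = date_range("2010-01-01", "2014-05-01")
print(days[0], days[-1], len(days))
```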
EDIT:
You can also do this by using row_number() and an existing table:
with nums(n) as (
select n
from (select row_number() over (order by col) as n
from ExistingTable t
) t
where n <= 10
)
select *
from nums;
ExistingTable is just any table with enough rows. The best choice of col is the primary key.
with digits(n) as (
select 1 as n union all select 2 union all select 3 union all select 4 union all select 5 union all
select 6 union all select 7 union all select 8 union all select 9 union all select 10
)
select *
from digits;
If your version of Teradata supports multiple CTEs, you can build on the above:
with digits(n) as (
select 1 as n union all select 2 union all select 3 union all select 4 union all select 5 union all
select 6 union all select 7 union all select 8 union all select 9 union all select 10
),
nums(n) as (
select d1.n*100 + d2.n*10 + d3.n
from digits d1 cross join digits d2 cross join digits d3
)
select *
from nums;
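One detail worth checking when combining digits this way is which range comes out. With the digits 1 to 10 used above, the expression d1.n*100 + d2.n*10 + d3.n yields the 1000 values 111 to 1110; with digits 0 to 9 it yields 0 to 999. The sketch below confirms this using Python's sqlite3 module as a stand-in for Teradata; the function name three_digit_numbers and the output alias num are assumptions for the example.

```python
import sqlite3

def three_digit_numbers(digit_values):
    """Cross join a digits table with itself three times and combine the digits."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE digits (n INTEGER PRIMARY KEY)")
    con.executemany("INSERT INTO digits VALUES (?)", [(d,) for d in digit_values])
    rows = con.execute("""
        SELECT d1.n * 100 + d2.n * 10 + d3.n AS num
        FROM digits AS d1
        CROSS JOIN digits AS d2
        CROSS JOIN digits AS d3
        ORDER BY num
    """).fetchall()
    con.close()
    return [r[0] for r in rows]

zero_based = three_digit_numbers(range(0, 10))   # digits 0..9
one_based = three_digit_numbers(range(1, 11))    # digits 1..10, as in the answer
print(zero_based[0], zero_based[-1], len(zero_based))
print(one_based[0], one_based[-1], len(one_based))
```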
In Teradata you can use the existing sys_calendar to get those dates:
SELECT calendar_date
FROM sys_calendar.CALENDAR
WHERE calendar_date BETWEEN DATE '2010-01-01' AND DATE '2014-05-01';
Note:
DATE '2010-01-01' is the only recommended way to write a date in Teradata
There's probably another custom calendar for the specific business needs of your company, too. Everyone will have access rights to it.
You might also use this for the range of numbers:
SELECT day_of_calendar
FROM sys_calendar.CALENDAR
WHERE day_of_calendar BETWEEN 1 AND 10;
But you should check Explain to see if the estimated number of rows is correct. sys_calendar is a kind of template and day_of_calendar is a calculated column, so no statistics exist on it, and Explain will return an estimate of 14683 rows (20 percent of the number of rows in that table) instead of 10. If you use it in additional joins, the optimizer might produce a bad plan based on that totally wrong number.
Note:
If you use sys_calendar, you are limited to a maximum of 73414 rows: dates between 1900-01-01 and 2100-12-31, and numbers between 1 and 73414. Your business calendar might vary.
Gordon Linoff's recursive query is not really efficient in Teradata, as it's sequential row-by-row processing in a parallel database (each loop is an "all-AMPs step" in Explain) and the optimizer doesn't know how many rows will be returned.
If you need those ranges regularly, you might consider creating a numbers table. I usually have one with a million rows, or I use my calendar with the full range of 10000 years :-)
--DROP TABLE nums;
CREATE TABLE nums(n INT NOT NULL PRIMARY KEY CHECK (n BETWEEN 0 AND 999999));
INSERT INTO Nums
WITH cte(n) AS
(
SELECT day_of_calendar - 1
FROM sys_calendar.CALENDAR
WHERE day_of_calendar BETWEEN 1 AND 1000
)
SELECT
t1.n +
t2.n * 1000
FROM cte t1 CROSS JOIN cte t2;
COLLECT STATISTICS COLUMN(n) ON Nums;
The COLLECT STATS is the most important step to get correct estimates.
Now it's a simple:
SELECT n FROM nums WHERE n BETWEEN 1 AND 10;
There's also a nice UDF on GitHub for creating sequences which is easy to use:
SELECT DATE '2010-01-01' + SEQUENCE
FROM TABLE(gen_sequence(0,DATE '2014-05-01' - DATE '2010-01-01')) AS t;
SELECT SEQUENCE
FROM TABLE(gen_sequence(1,10)) AS t;
But it's usually hard to convince your DBA to install any C-UDFs and the number of rows returned is unknown again.
A sequence from 1 to 10:
sel sum (1) over (ROWS UNBOUNDED PRECEDING) as seq_val
from sys_calendar.CALENDAR
qualify row_number () over (order by 1)<=10

Inserting rows where column can have many values

I am writing a stored proc that inserts rows into a table. The issue is that many of the columns can have a list of different values and all of the rows in the db need to reflect these values. For example:
I have a table: Table1(state, number)
state will need to be 1-50 as its value and number is 1-3. There needs to be a row for each state with each number.
(1,1)
(1,2)
(1,3)
(2,1)...etc
There has got to be a nice way to do this but my research has not been fruitful. Does anyone have any suggestions?
A good way to generate the values is using a cross join. Here is an example:
insert into table(state, number)
select s.state, n.number
from (select 'AK' as state union all select 'AL' union all . . .
) s cross join
(select 1 as number union all select 2 union all select 3
) n
You may already have a list of states and/or numbers, in which case you can use it. For example:
insert into table(state, number)
select s.state, n.number
from (select state from states
) s cross join
(select 1 as number union all select 2 union all select 3
) n
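The insert-from-cross-join pattern can be tried on a small scale with Python's sqlite3 module. This is a sketch under assumptions: the helper tables states and numbers, the target table table1, and the function fill_state_numbers are made up for the example, and states are numbered 1 to 50 as in the question rather than using abbreviations.

```python
import sqlite3

def fill_state_numbers(state_count=50, numbers=(1, 2, 3)):
    """Insert one row per (state, number) combination and return the rows."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE table1 (state INTEGER, number INTEGER)")
    con.execute("CREATE TABLE states (state INTEGER PRIMARY KEY)")
    con.execute("CREATE TABLE numbers (number INTEGER PRIMARY KEY)")
    con.executemany("INSERT INTO states VALUES (?)",
                    [(s,) for s in range(1, state_count + 1)])
    con.executemany("INSERT INTO numbers VALUES (?)", [(n,) for n in numbers])
    # The cross join generates every combination; the insert stores them all at once.
    con.execute("""
        INSERT INTO table1 (state, number)
        SELECT s.state, n.number
        FROM states AS s
        CROSS JOIN numbers AS n
    """)
    rows = con.execute(
        "SELECT state, number FROM table1 ORDER BY state, number").fetchall()
    con.close()
    return rows

rows = fill_state_numbers()
print(len(rows), rows[:4])
```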
What you need is a cross join between two tables, one containing 50 rows and the other 3 rows.
In Oracle:
select *
from
(
select rownum as state
from dual
connect by rownum <= 50
) t1
,
(
select rownum as num
from dual
connect by rownum <= 3
) t2