Select the row_number or fake identity - sql

I have a table that did not have an identity column added to it. I don't really need one for any specific purpose.
I had 821 rows and I did a test of 500 more. Now I need to check those 500 files and I was looking for a simple way to
select * from table where row_number > 821
I've tried row_number() but I can't have them be ordered, I need all rows above 821 returned.

A table is an unordered bag of rows. You can't identify the first 820 rows that were inserted unless you have some column to identify the order of insertion (a trusted IDENTITY or datetime column, for example). Otherwise you are throwing a bunch of marbles on the floor and asking the first person who walks into the room to identify the first 820 marbles that fell.
Here is a very simple example that demonstrates that output order cannot be predicted, and certainly can't be relied upon to be FIFO (and also shows a case where HABO's "solution" breaks):
CREATE TABLE dbo.foo(id INT, x CHAR(1));
CREATE CLUSTERED INDEX x ON dbo.foo(x);
-- or CREATE INDEX x ON dbo.foo(x, id); -- doesn't require a clustered index to prove
INSERT dbo.foo VALUES(1,'z');
INSERT dbo.foo VALUES(2,'y');
INSERT dbo.foo VALUES(3,'x');
INSERT dbo.foo VALUES(4,'w');
INSERT dbo.foo VALUES(5,'v');
INSERT dbo.foo VALUES(6,'u');
INSERT dbo.foo VALUES(7,'t');
INSERT dbo.foo VALUES(8,'s');
SELECT TOP (5) id, x, ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM dbo.foo;
SELECT * FROM dbo.foo;
Results:
---- ---- ----
8 s 1
7 t 2
6 u 3
5 v 4
4 w 5
---- ----
8 s
7 t
6 u
5 v
4 w
3 x
2 y
1 z
SQLfiddle demo

The following is a hack that may help, but is NOT a generally useful solution:
Row_Number() over (order by (select NULL)) as UntrustworthyRowNumber

Related

Add column to ensure composite key is unique

I have a table which needs to have a composite primary key based on 2 columns (Material number, Plant).
For example, this is how it is currently (note that these rows are not unique):
MATERIAL_NUMBER PLANT NUMBER
------------------ ----- ------
000000000000500672 G072 1
000000000000500672 G072 1
000000000000500672 G087 1
000000000000500672 G207 1
000000000000500672 G207 1
However, I'll need to add the additional column (NUMBER) to the composite key such that each row is unique, and it must work like this:
For each MATERIAL_NUMBER, for each PLANT, let NUMBER start at 1 and increment by 1 for each duplicate record.
This would be the desired output:
MATERIAL_NUMBER PLANT NUMBER
------------------ ----- ------
000000000000500672 G072 1
000000000000500672 G072 2
000000000000500672 G087 1
000000000000500672 G207 1
000000000000500672 G207 2
How would I go about achieving this, specifically in SQL Server?
Best Regards!
SOLVED.
See below:
SELECT MATERIAL_NUMBER, PLANT, (ROW_NUMBER() OVER (PARTITION BY MATERIAL_NUMBER, PLANT ORDER BY VALID_FROM)) as NUMBER
FROM Table_Name
Will output the table in question, with the NUMBER column properly defined
Suppose this is actual table,
create table #temp1(MATERIAL_NUMBER varchar(30),PLANT varchar(30), NUMBER int)
Suppose you want to insert only single record then,
declare #Num int
select #Num=isnull(max(number),0) from #temp1 where MATERIAL_NUMBER='000000000000500672' and PLANT='G072'
insert into #temp1 (MATERIAL_NUMBER,PLANT , NUMBER )
values ('000000000000500672','G072',#Num+1)
Suppose you want to insert bulk record.Your bulk record sample data is like
create table #temp11(MATERIAL_NUMBER varchar(30),PLANT varchar(30))
insert into #temp11 (MATERIAL_NUMBER,PLANT)values
('000000000000500672','G072')
,('000000000000500672','G072')
,('000000000000500672','G087')
,('000000000000500672','G207')
,('000000000000500672','G207')
You want to insert `#temp11` in `#temp1` maintaining number id
insert into #temp1 (MATERIAL_NUMBER,PLANT , NUMBER )
select t11.MATERIAL_NUMBER,t11.PLANT
,ROW_NUMBER()over(partition by t11.MATERIAL_NUMBER,t11.PLANT order by (select null))+isnull(maxnum,0) as Number from #temp11 t11
outer apply(select MATERIAL_NUMBER,PLANT,max(NUMBER)maxnum from #temp1 t where t.MATERIAL_NUMBER=t11.MATERIAL_NUMBER
and t.PLANT=t11.PLANT group by MATERIAL_NUMBER,PLANT) t
select * from #temp1
drop table #temp1
drop table #temp11
Main question is Why you need number column ? In mot of the cases you don't need number column,you can use ROW_NUMBER()over(partition by t11.MATERIAL_NUMBER,t11.PLANT order by (select null)) to display where you need. This will be more efficient.
Or tell the actual situation and number of rows involved where you will be needing Number column.

SQL - List all pages in between record while maintaining ID key

I'm trying to come up with a useful way to list all pages in between the first of last page of a document into new rows while maintaining the ID number as a key, or cross reference. I have a few ways of getting pages in between, but I'm not exactly sure how to maintain the key in a programmatic way.
Example Input:
First Page Last Page ID
ABC_001 ABC_004 1
ABC_005 ABC_005 2
ABC_006 ABC_010 3
End Result:
All Pages ID
ABC_001 1
ABC_002 1
ABC_003 1
ABC_004 1
ABC_005 2
ABC_006 3
ABC_007 3
ABC_008 3
ABC_009 3
ABC_010 3
Any help is much appreciated. I'm using SQL mgmt studio.
One approach would be to set up a numbers table, that contains a list of numbers that you may possibly find in the column content:
CREATE TABLE numbers( idx INTEGER);
INSERT INTO numbers VALUES(1);
INSERT INTO numbers VALUES(2);
...
INSERT INTO numbers VALUES(10);
Now, assuming that all page values have 7 characters, with the last 3 being digits, we can JOIN the original table with the numbers table to generate the missing records:
SELECT
CONCAT(
SUBSTRING(t.First_Page, 1, 4),
REPLICATE('0', 3 - LEN(n.idx)),
n.idx
) AS [ALl Pages],
t.id
FROM
mytable t
INNER JOIN numbers n
ON n.idx >= CAST(SUBSTRING(t.First_Page, 5, 3) AS int)
AND n.idx <= CAST(SUBSTRING(t.Last_Page, 5, 3) AS int)
This demo on DB Fiddle with your sample data returns:
ALl Pages | id
:-------- | -:
ABC_001 | 1
ABC_002 | 1
ABC_003 | 1
ABC_004 | 1
ABC_005 | 2
ABC_006 | 3
ABC_007 | 3
ABC_008 | 3
ABC_009 | 3
ABC_010 | 3
To find all pages from First Page to Last Page per Book ID, CAST your page numbers from STRING to INTEGER, then add +1 to each page number until you reach the Last Page.
First, turn your original table into a table variable with the Integer data types using a TRY_CAST.
DECLARE #Book TABLE (
[ID] INT
,[FirstPage] INT
,[LastPage] INT
)
INSERT INTO #Book
SELECT [ID]
,TRY_CAST(RIGHT([FirstPage], 3) AS int) AS [FirstPage]
,TRY_CAST(RIGHT([LastPage], 3) AS int) AS [LastPage]
FROM [YourOriginalTable]
Set the maximum page that your pages will increment to using a variable. This will cap out your results to the correct number of pages. Otherwise your table would have many more rows than you need.
DECLARE #LastPage INT
SELECT #LastPage = MAX([LastPage]) FROM #Book
Turning a three-column table (ID, First Page, Last Page) into a two-column table (ID, Page) will require an UNPIVOT.
We're tucking that UNPIVOT into a CTE (Common Table Expression: basically a smart version of a temporary table (like a #TempTable or #TableVariable, but which you can only use once, and is a little more efficient in certain circumstances).
In addition to the UNPIVOT of your [First Name] and [Last Name] columns into a tall table, we're going to append every other combination of page number per ID using a UNION ALL.
;WITH BookCTE AS (
SELECT [ID]
,[Page]
FROM (SELECT [ID]
,[FirstPage]
,[LastPage]
FROM #Book) AS bp
UNPIVOT
(
[Page] FOR [Pages] IN ([FirstPage], [LastPage])
) AS up
UNION ALL
SELECT [ID], [Page] + 1 FROM BookCTE WHERE [Page] + 1 < #LastPage
)
Now that your data is held in a table format using a CTE with all combinations of [ID] and [Page] up to the maximum page in your #Book table, it's time to join your CTE with the #Book table.
SELECT DISTINCT
cte.ID
,cte.Page
FROM BookCTE AS cte
INNER JOIN #Book AS bk
ON bk.ID = cte.ID
WHERE cte.Page <= bk.[LastPage]
ORDER BY
cte.ID
,cte.Page
OPTION (MAXRECURSION 10000)
See also:
How to generate a range of numbers between two numbers (I based my code off of #Jayvee's answer)
Assigning variables using SET vs SELECT
SQL Server UNPIVOT
SQL Server CTE Basics
Recursive CTEs Explained
Note: will update with re-integrating string portion of FirstPage and LastPage (which I assume is based on book title). Stand by.

Normalization into three tables in postgres incl one association table

Say I have original data like so:
foo bar baz
1 a b
1 x y
2 z q
And I want to end up with three tables, of which I and III are the main tables and II is an association table between I and III
I. e.:
I
id foo
1 1
2 2
II
id I_id III_id
1 1 1
2 1 2
3 2 3
NB that I_ID is a serial and not foo
III
id bar baz
1 a b
2 x y
3 z q
How would I go about inserting this in one go?
I have played around with CTEs but I am stuck at the following: if I start with III and then return the ID's I cannot see how I can get back to the I table since there is nothing connecting them (yet)
My previous solutions have ended up in pre-generating id sequences which feels so-so
What if you generate a dense rank ?
you generate first a big table with the information you need.
select foo,
bar,
baz,
dense_rank() over (order by foo) as I_id,
dense_rank() over (order by bar, baz) as III_id,
row_number() over (order by 1) as II_id
from main_Table
Then you just have to transfer in the table you want with a distinct.
Start with "main" tables, create two main entities and then use their IDs to insert a record to "connection table" between them, you can use CTE for that of course (I assume that "main" tables I and III have default nextval(..) in PK column, pooling next ID from sequences):
with ins1 as (
insert into tabl1(foo)
values(...)
returning *
), ins3 as (
insert into tabl3(bar, baz)
values (.., ..)
returning *
)
insert into tabl2(i_id, ii_id)
select ins1.id, ins3.id
from ins1, ins3 -- implicit CROSS JOIN here:
-- we assume that only single row was
-- inserted to each "main" table
-- or you need Cartesian product to be inserted to `II`
returning *
;

Select exactly 5 rows from a table

I have an odd requirement which ideally should be solved in SQL, not the surrounding app.
I need to select exactly 5 rows regardless of how many are actually available. In practice the number of rows available will usually be less than 5 and on some rare occasions it will be more than 5. The "extra" rows should have null in every column.
The app is written in a technology that isn't Turing Complete. This requirement is much more difficult to solve in the app's code than you might imagine! To describe it, the app is effectively a transformer: It takes in a bunch of queries and spits out a report. So please understand the app is NOT written in a "programming language" in the traditional sense.
So for example, if I have a table:
A | B
-----
1 | X
2 | Y
3 | Z
Then a valid result would be
A | B
-----------
2 | Y
1 | X
3 | Z
null | null
null | null
I know this is an unusual requirement. Sadly it can't be solved in the application due to the technology being used.
Ideally this shouldn't require changes to the database but if there is no other way that changes can be arranged.
Any suggestions?
You can do something like this:
select top 5 a, b
from (select a, b, 1 as priority from t union all
select null, null, 2 cross join
(values(1, 2, 3, 4, 5)) v(5)
) x
order by priority;
That is, create dummy rows, append them, and then choose the first five.
I do think that this work should be done in the app, but you can do it in SQL.
Create Table #Test (A int, B int)
Insert #Test Values (1,1)
Insert #Test Values (2,1)
Insert #Test Values (3,1)
Select Top 5 * From
(
Select A, B From #Test
Union All
Select Null, Null
Union All
Select Null, Null
Union All
Select Null, Null
Union All
Select Null, Null
Union All
Select Null, Null
) A
Wrap this in a stored proc..
declare #rowcount int
select top 5* from dbo.test
set #rowcount=##rowcount
if #rowcount<5
Begin
select * from dbo.test
union all
select null from dbo.numbers where n<=5-#rowcount
End
If you use some sort of tally table (although the numbers themselves do not matter, only that the table has enough records), you can use it to create the dummy rows. e.g. using sys.columns:
select top 5 a,b from
(
select a, b, 0 ord from yourTable
union all
select null a,null b, 1 from sys.columns
) t
order by ord
The advantage of the tally would be that if you need another number of rows in the future, you only need to change the top x (provided the tally table has enough rows)
Get those 3 records from your table.
Take a counter variable.
and then from your code add the NULL content until your counter gets 5.

Given a parent / child key table, how can we recursively insert a copy of the structure into another table?

I have a recursive CTE which gives me a listing of a set of parent child keys as follows, lets say its in a temp table called [#relationtree]:
Parent | Child
--------------
1 | 3
3 | 5
5 | 6
5 | 9
I want to create a copy of these relationships into a table with, lets say, the following stucture:
CREATE TABLE [dbo].[Relations]
(
[Id] int identity(1,1)
[ParentId] int
)
How can I insert the above records but recursively obtain the previously inserted identity value to be able to insert that value as the ParentId column for each copy of a child I insert?
I would expect to have at the end of this in [dbo].[Relations] (given our current seed value is, say 50)
Id | ParentId
-------------
... other rows present before this query ...
50 | NULL
51 | 50
52 | 51
53 | 51
I'm not sure that scope_identity can work in this situation, or that creating a new temp table with a list of new IDs and inserting identity columns manually is the correct approach?
I could write a cursor / loop to do this, but there must be a nice way of doing some recursive select magic!
Since you're trying to put the tree into a segment of the table it looks like you're going to need to use SET IDENTITY_INSERT ON for the table anyway. You're going to need to make sure that there is room for the new tree. In this case, I'll assume that 49 is the current maximum id in your table so that we don't need to be concerned with overrunning a tree that's later in the table.
You'll need to be able to map the IDs from the old tree to the new tree. Unless there's some rule around the ids, the exact mapping should be irrelevant as long as it's accurate, so in that case, I'd just do something like this:
SET IDENTITY_INSERT dbo.Relations ON
;WITH CTE_MappedIDs AS
(
SELECT
old_id,
ROW_NUMBER() OVER(ORDER BY old_id) + 49 AS new_id
FROM
(
SELECT DISTINCT parent AS old_id FROM #relationtree
UNION
SELECT DISTINCT child AS old_id FROM #relationtree
) SQ
)
INSERT INTO dbo.Relations (Id, ParentId)
SELECT
CID.new_id,
PID.new_id
FROM
#relationtree RT
INNER JOIN CTE_MappedIDs PID ON PID.old_id = RT.parent
INNER JOIN CTE_MappedIDs CID ON CID.old_id = RT.parent
-- We need to also add the root node
UNION ALL
SELECT
NID.new_id,
NULL
FROM
#relationtree RT2
INNER JOIN CTE_MappedIDs NID ON NID.old_id = RT2.parent
WHERE
RT2.parent NOT IN (SELECT DISTINCT child FROM #relationtree)
SET IDENTITY_INSERT dbo.Relations OFF
I haven't tested that, but if it doesn't work as expected then hopefully it will point you in the right direction.
I know you already have a working answer, but I think you can accomplish the same thing a little more simply (not that there is anything at all wrong with Tom H's answer) using the LAG function to inspect the previous row, assuming you have SQL Server 2012 or later.
Setup:
CREATE TABLE #relationtree (
Parent INT,
Child INT
)
CREATE TABLE #relations (
Id INT IDENTITY(1,1),
ParentId INT
)
INSERT INTO #relationtree (Parent, Child) VALUES(1,3), (3,5), (5,6), (5,9)
INSERT INTO #relations (ParentId) values(1), (3), (5)
Solution:
DECLARE #offset INT = IDENT_CURRENT('#relations')
;WITH relationtreeids AS (
SELECT *,
ROW_NUMBER() OVER(ORDER BY Parent, Child) - 2 AS UnmodifiedParentId -- Simulate an identity field
FROM #relationtree
)
INSERT INTO #relations
-- The LAG window function allows you to inspect the previous row
SELECT CASE WHEN LAG(Parent) OVER(ORDER BY Parent) IS NULL
THEN NULL
WHEN LAG(Parent) OVER(ORDER BY Parent) = Parent
THEN UnmodifiedParentId + #offset ELSE UnmodifiedParentId + #offset + 1
END AS ParentId
FROM relationtreeids
Output:
Id ParentId
1 1
2 3
3 5
4 NULL
5 4
6 5
7 5