SQL Query Create isDuplicate Column with IDs

I have a SQL Server 2005 database I'm working with. For the query I am using, I want to add a custom column that can start at any number and increment based on the row entry number.
For example, I start at number 10. Each row in my results will have an incrementing number: 10, 11, 12, etc.
This is an example of the SELECT statement I would be using.
int customVal = 10;
SELECT
ID, customVal++
FROM myTable
The format of the above is clearly wrong, but it is conceptually what I am looking for.
RESULTS:
ID CustomColumn
-------------------
1 10
2 11
3 12
4 13
How can I go about implementing this kind of functionality?
I cannot find any reference to incrementing variables within results. Is it not possible?
EDIT: The customVal number will be pulled from another table, i.e. I will probably SELECT it into the customVal variable. You cannot assume that the ID column will hold any usable values.
The CustomColumn will be auto-incrementing starting at the customVal.

Use the ROW_NUMBER ranking function - http://technet.microsoft.com/en-us/library/ms186734.aspx
DECLARE @Offset INT
SET @Offset = 9
SELECT
ID
, ROW_NUMBER() OVER (ORDER BY ID) + @Offset AS CustomColumn
FROM
myTable
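
A minimal sketch of the same idea, run against an in-memory SQLite database from Python so that it is self-contained. SQLite 3.25 or later is assumed for window functions, and SQLite is only a stand-in here: on SQL Server the T-SQL form is what applies. The `settings` table and its `startValue` column are hypothetical names, used to show the starting number being read from another table as the question's edit describes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE myTable (ID INTEGER PRIMARY KEY);
    -- IDs are deliberately non-contiguous: they are not usable as counters.
    INSERT INTO myTable (ID) VALUES (7), (21), (22), (40);
    -- Hypothetical table holding the starting number.
    CREATE TABLE settings (startValue INTEGER);
    INSERT INTO settings (startValue) VALUES (10);
""")

# ROW_NUMBER() starts at 1, so subtract 1 to make the first row equal startValue.
rows = conn.execute("""
    SELECT ID,
           ROW_NUMBER() OVER (ORDER BY ID)
             + (SELECT startValue FROM settings) - 1 AS CustomColumn
    FROM myTable
    ORDER BY ID
""").fetchall()

for row in rows:
    print(row)
```

The numbering depends only on the ORDER BY inside OVER, not on the actual ID values.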

Related

Select records from a specific key onwards

I have a table that has more than three trillion records
The main key of this table is guid
As below
GUID Value mid id
0B821574-8E85-4FB7-8047-553393E385CB 4 51 15
716F74B0-80D8-4869-86B4-99FF9EB10561 0 510 153
7EBA2C31-FFC8-4071-B11A-9E2B7ED16B2B 2 5 3
85491F90-E4C6-4030-B1E5-B9CA36238AE2 1 58 7
F04FA30C-0C35-4B9F-A01C-708C0189815D 20 50 13
guid is primary key
I want to select 10 records from where the key is equal to, for example, 85491F90-E4C6-4030-B1E5-B9CA36238AE2

You can use order by and top. Assuming that guid defines the ordering of the rows:
select top (10) t.*
from mytable t
where guid >= '85491F90-E4C6-4030-B1E5-B9CA36238AE2'
order by guid
If the ordering is defined by another column, say id (which should be unique as well), then you would use a subquery for filtering:
select top (10) t.*
from mytable t
where id >= (select id from mytable t1 where guid = '85491F90-E4C6-4030-B1E5-B9CA36238AE2')
order by id
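
As a rough illustration of the first query, here is a sketch against an in-memory SQLite database from Python. `LIMIT` plays the role of `TOP`, and the helper name `page_from` is hypothetical. Note that SQLite compares the GUIDs as plain text, which is not necessarily the order SQL Server uses for uniqueidentifier values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (guid TEXT PRIMARY KEY, value INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [
        ("0B821574-8E85-4FB7-8047-553393E385CB", 4),
        ("716F74B0-80D8-4869-86B4-99FF9EB10561", 0),
        ("7EBA2C31-FFC8-4071-B11A-9E2B7ED16B2B", 2),
        ("85491F90-E4C6-4030-B1E5-B9CA36238AE2", 1),
        ("F04FA30C-0C35-4B9F-A01C-708C0189815D", 20),
    ],
)

def page_from(conn, start_guid, page_size=10):
    """Return up to page_size rows whose key is >= start_guid, in key order."""
    return conn.execute(
        "SELECT guid, value FROM mytable WHERE guid >= ? ORDER BY guid LIMIT ?",
        (start_guid, page_size),
    ).fetchall()

page = page_from(conn, "85491F90-E4C6-4030-B1E5-B9CA36238AE2")
for row in page:
    print(row)
```

Because the filter and the ordering both use the primary key, the database can seek straight to the starting key instead of scanning the table.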

To read data onward, you can use OFFSET .. FETCH in the ORDER BY clause, available since SQL Server 2012. According to learn.microsoft.com, something like this:
-- Declare and set the variables for the OFFSET and FETCH values.
DECLARE @StartingRowNumber INT = 1
, @RowCountPerPage INT = 10;
-- Create the condition to stop the transaction after all rows have been returned:
WHILE (SELECT COUNT(*) FROM mytable WHERE guid >= '85491F90-E4C6-4030-B1E5-B9CA36238AE2') >= @StartingRowNumber
BEGIN
-- Run the query until the stop condition is met:
SELECT *
FROM mytable WHERE guid >= '85491F90-E4C6-4030-B1E5-B9CA36238AE2'
ORDER BY guid
OFFSET @StartingRowNumber - 1 ROWS
FETCH NEXT @RowCountPerPage ROWS ONLY;
-- Increment @StartingRowNumber value:
SET @StartingRowNumber = @StartingRowNumber + @RowCountPerPage;
CONTINUE
END;
In the real world this will not be enough, because other processes could read or write data in your table at the same time.
Please read the documentation; for example, search for "Running multiple queries in a single transaction" in https://learn.microsoft.com/en-us/sql/t-sql/queries/select-order-by-clause-transact-sql
Proper indexes on the id and guid columns must be created to get good performance.

Group BY Statement error to get unique records

I am new to SQL Server (I used to work with MySQL) and am trying to get the records from a table using GROUP BY.
The table structure is given below:
SELECT S1.ID,S1.Template_ID,S1.Assigned_By,S1.Assignees,S1.Active FROM "Schedule" AS S1;
Output:
ID Template_ID Assigned_By Assignees Active
2 25 1 3 1
3 25 5 6 1
6 26 5 6 1
I need to get the values of all columns using the Group By statement below
SELECT Template_ID FROM "Schedule" WHERE "Assignees" IN(6, 3) GROUP BY "Template_ID";
Output:
Template_ID
25
26
I tried the following code to fetch the table using Group By, but it's fetching all the rows.
SELECT S1.ID,S1.Template_ID,S1.Assigned_By,S1.Assignees,S1.Active FROM "Schedule" AS S1 INNER JOIN(SELECT Template_ID FROM "Schedule" WHERE "Assignees" IN(6, 3) GROUP BY "Template_ID") AS S2 ON S2.Template_ID=S1.Template_ID
My output should look like this:
ID Template_ID Assigned_By Assignees Active
2 25 1 3 1
6 26 5 6 1
I was wondering whether I can get ID of the column as well? I use the ID for editing the records in the web.

The query doesn't work as expected in MySQL either, except by accident.
Selecting nonaggregated columns that are not in the GROUP BY clause is a MySQL extension that isn't part of the SQL standard, and it is not even allowed in MySQL 5.7 and later unless the default ONLY_FULL_GROUP_BY mode is disabled.
In earlier versions the result is non-deterministic.
The server is free to choose any value from each group, so unless they are the same, the values chosen are nondeterministic. Furthermore, the selection of values from each group cannot be influenced by adding an ORDER BY clause.
This means there was no way to know which rows would be returned by this query:
SELECT S1.ID,S1.Template_ID,S1.Assigned_By,S1.Assignees,S1.Active
FROM "Schedule" AS S1
GROUP BY Template_ID;
To get deterministic results you need a way to rank rows, using ranking functions such as ROW_NUMBER(). These were introduced in MySQL 8 and have been available in SQL Server since SQL Server 2005. The syntax is the same for both databases:
WITH ranked AS
(
SELECT
ID, Template_ID, Assigned_By, Assignees, Active,
ROW_NUMBER() OVER (PARTITION BY Template_ID ORDER BY ID) AS RN
FROM Schedule
WHERE Assignees IN (6, 3)
)
SELECT ID, Template_ID, Assigned_By, Assignees, Active
FROM ranked
WHERE RN = 1
PARTITION BY Template_ID splits the result rows based on their Template_ID value into separate partitions. Within that partition, the rows are ordered based on the ORDER BY clause. Finally, ROW_NUMBER calculates a row number for each ordered partition row.
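
To make the partitioning concrete, here is a small sketch that loads the question's three rows into an in-memory SQLite database from Python (SQLite 3.25+ assumed for window functions, used here only as a stand-in for SQL Server or MySQL 8). It first lists every row with its row number, then keeps only the rows numbered 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Schedule (
        ID INTEGER PRIMARY KEY, Template_ID INTEGER,
        Assigned_By INTEGER, Assignees INTEGER, Active INTEGER
    );
    INSERT INTO Schedule VALUES
        (2, 25, 1, 3, 1), (3, 25, 5, 6, 1), (6, 26, 5, 6, 1);
""")

# Every matching row, numbered within its Template_ID partition.
ranked = conn.execute("""
    SELECT ID, Template_ID, Assigned_By, Assignees, Active,
           ROW_NUMBER() OVER (PARTITION BY Template_ID ORDER BY ID) AS RN
    FROM Schedule
    WHERE Assignees IN (6, 3)
    ORDER BY Template_ID, RN
""").fetchall()

# Keep the first row of each partition and drop the RN column.
first_per_template = [row[:5] for row in ranked if row[5] == 1]

for row in ranked:
    print(row)
print(first_per_template)
```

Changing the ORDER BY inside OVER (for example to ID DESC) changes which row of each group gets number 1.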

Updating tables from generate_series(int, int) function Postgres 9.0

I have a table with data in it I want to keep, but I have to add a new column of integers for ordering purposes. Now this ordering will be different depending on the clientID as each different client wants different ordering. So in my example there are 3 different clients, the first client has 10 rows of data the second has 15, and the third has 87. So basically I'm looking for a query that will let me update the ordering column in a way that will allow me to do a select on the table that would give results like this.
Select ordering from table Where clientID = 1
-----------
Ordering
1
2
3
4
5
6
7
8
9
10
Now the query I'm currently using to do this is
UPDATE data SET ordering = generate_series
FROM (SELECT * FROM generate_series(1,87)) as k
where clientid = '3'
This updates all the correct rows, but only with the first value, so every value in ordering ends up as 1. I feel like I'm missing something here, or this just doesn't work in Postgres as it does in other SQL dialects. Any solution will help. I would also like to know why my update does not work as I expected in Postgres. Also, I cannot change the Postgres version because of the environment I work in.

I don't see why you would need generate_series(). A window function that numbers all rows for each client should do:
update data
set ordering = t.rn
from (select pk_column,
row_number() over (partition by clientid order by pk_column) as rn
from data
) t
where t.pk_column = data.pk_column;
Here pk_column is the primary key column of the table data.
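
A sketch of the per-client numbering, run against an in-memory SQLite database from Python (SQLite 3.25+ assumed for window functions). This is an illustration rather than the Postgres statement itself: it uses a correlated subquery in SET instead of UPDATE ... FROM, and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE data (
        pk_column INTEGER PRIMARY KEY, clientid INTEGER, ordering INTEGER
    );
    -- Made-up sample rows: three clients with 3, 2 and 1 rows.
    INSERT INTO data (pk_column, clientid) VALUES
        (1, 1), (2, 2), (3, 1), (4, 3), (5, 2), (6, 1);
""")

# Number the rows of each client, then copy each number onto its own row.
conn.execute("""
    UPDATE data
    SET ordering = (
        SELECT t.rn
        FROM (SELECT pk_column,
                     ROW_NUMBER() OVER (PARTITION BY clientid
                                        ORDER BY pk_column) AS rn
              FROM data) AS t
        WHERE t.pk_column = data.pk_column
    )
""")

result = conn.execute(
    "SELECT clientid, pk_column, ordering FROM data ORDER BY clientid, pk_column"
).fetchall()
for row in result:
    print(row)
```

Each client's numbering restarts at 1 because of the PARTITION BY clientid clause.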

Selecting most recent and specific version in each group of records, for multiple groups

The problem:
I have a table that records data rows in foo. Each time the row is updated, a new row is inserted along with a revision number. The table looks like:
id rev field
1 1 test1
2 1 fsdfs
3 1 jfds
1 2 test2
Note: the last record is a newer version of the first row.
Is there an efficient way to query for the latest version of a record and for a specific version of a record?
For instance, a query for rev=2 would return the 2, 3 and 4th row (not the replaced 1st row though) while a query for rev=1 yields those rows with rev <= 1 and in case of duplicated ids, the one with the higher revision number is chosen (record: 1, 2, 3).
I would prefer not to build the result in an iterative way.

To get only the latest revisions:
SELECT * from foo t1
WHERE t1.rev =
(SELECT max(rev) FROM foo t2 WHERE t2.id = t1.id)
To get a specific revision, in this case 1 (or, if an item doesn't have that revision, its next lower revision):
SELECT * from foo t1
WHERE t1.rev =
(SELECT max(rev)
FROM foo t2
WHERE t2.id = t1.id
AND t2.rev <= 1)
It might not be the most efficient way to do this, but right now I cannot think of a better one.
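
The second query can be tried out with the question's sample rows. The sketch below uses an in-memory SQLite database from Python; the helper name `rows_as_of` is hypothetical, and the revision cap is passed as a parameter instead of being hard-coded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE foo (id INTEGER, rev INTEGER, field TEXT);
    INSERT INTO foo VALUES
        (1, 1, 'test1'), (2, 1, 'fsdfs'), (3, 1, 'jfds'), (1, 2, 'test2');
""")

def rows_as_of(conn, max_rev):
    """For each id, return the row with the highest rev that is <= max_rev."""
    return conn.execute(
        """
        SELECT id, rev, field FROM foo AS t1
        WHERE t1.rev = (SELECT MAX(rev) FROM foo AS t2
                        WHERE t2.id = t1.id AND t2.rev <= ?)
        ORDER BY id
        """,
        (max_rev,),
    ).fetchall()

print(rows_as_of(conn, 1))
print(rows_as_of(conn, 2))
```

Passing a cap at least as large as the highest revision in the table gives the "latest revision" query as a special case.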

Here's an alternative solution that incurs an update cost but is much more efficient for reading the latest data rows as it avoids computing MAX(rev). It also works when you're doing bulk updates of subsets of the table. I needed this pattern to ensure I could efficiently switch to a new data set that was updated via a long running batch update without any windows of time where we had partially updated data visible.
Aging
Replace the rev column with an age column
Create a view of the current latest data with filter: age = 0
To create a new version of your data ...
INSERT: new rows with age = -1 - This was my slow long running batch process.
UPDATE: UPDATE table-name SET age = age + 1 for all rows in the subset. This switches the view to the new latest data (age = 0) and also ages older data in a single transaction.
DELETE: rows having age > N in the subset - Optionally purge old data
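
The INSERT and UPDATE steps above can be sketched as follows, using an in-memory SQLite database from Python. The names `foo_rows`, `foo` (the view) and `ix_foo_rows_age_id` are hypothetical, and the "subset" being updated is simply the rows with id = 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE foo_rows (id INTEGER, age INTEGER, field TEXT);
    CREATE INDEX ix_foo_rows_age_id ON foo_rows (age, id);
    -- The view exposes only the current version of each row.
    CREATE VIEW foo AS SELECT id, field FROM foo_rows WHERE age = 0;
    INSERT INTO foo_rows VALUES (1, 0, 'test1'), (2, 0, 'fsdfs');
""")

def current(conn):
    """Rows visible through the age = 0 view."""
    return conn.execute("SELECT id, field FROM foo ORDER BY id").fetchall()

before = current(conn)

# INSERT step: stage the new version with age = -1; the view ignores it.
conn.execute("INSERT INTO foo_rows VALUES (1, -1, 'test2')")
conn.commit()
staged = current(conn)

# UPDATE step: age every row of the subset in a single transaction, so the
# staged row becomes age 0 and the old row becomes age 1 at the same time.
with conn:
    conn.execute("UPDATE foo_rows SET age = age + 1 WHERE id = 1")
after = current(conn)

print(before, staged, after)
```

Readers of the view see either the old version or the new one, never a mix, because the switch happens inside one transaction.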
Indexing
Create a composite index on age and then id so the view will be nice and fast and can also be used to look up by id. Although this key is effectively unique, it's temporarily non-unique while you're aging the rows (during UPDATE SET age=age+1), so you'll need to make it non-unique and ideally the clustered index. If you need to find all versions of a given id ordered by age, you may need an additional non-unique index on id then age.
Rollback
Finally ... Let's say you're having a bad day and the batch processing breaks. You can quickly revert to a previous data set version by running:
UPDATE table-name SET age = age - 1 -- Roll back a version
DELETE table-name WHERE age < 0 -- Clean up bad stuff
Existing Table
Suppose you have an existing table that now needs to support aging. You can use this pattern by first renaming the existing table, then add the age column and indexing and then create the view that includes the age = 0 condition with the same name as the original table name.
This strategy may or may not work depending on the nature of technology layers that depended on the original table but in many cases swapping a view for a table should drop in just fine.
Notes
I recommend naming the age column RowAge in order to indicate this pattern is being used, since it's clearer that it's a database-related value and it complements SQL Server's RowVersion naming convention. It also won't conflict with a column or view that needs to return a person's age.
Unlike other solutions, this pattern works for non SQL Server databases.
If the subsets you're updating are very large then this might not be a good solution, as your final transaction will update not just the current records but all past versions of the records in this subset (which could even be the entire table!), so you may end up locking the table.

This is how I would do it. ROW_NUMBER() requires SQL Server 2005 or later.
Sample data:
DECLARE @foo TABLE (
id int,
rev int,
field nvarchar(10)
)
INSERT @foo VALUES
( 1, 1, 'test1' ),
( 2, 1, 'fdsfs' ),
( 3, 1, 'jfds' ),
( 1, 2, 'test2' )
The query:
DECLARE @desiredRev int
SET @desiredRev = 2
SELECT * FROM (
SELECT
id,
rev,
field,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY rev DESC) rn
FROM @foo WHERE rev <= @desiredRev
) numbered
WHERE rn = 1
The inner SELECT returns all relevant records, and within each id group (that's the PARTITION BY), computes the row number when ordered by descending rev.
The outer SELECT just selects the first member (so, the one with highest rev) from each id group.
Output when @desiredRev = 2 :
id rev field rn
----------- ----------- ---------- --------------------
1 2 test2 1
2 1 fdsfs 1
3 1 jfds 1
Output when @desiredRev = 1 :
id rev field rn
----------- ----------- ---------- --------------------
1 1 test1 1
2 1 fdsfs 1
3 1 jfds 1

If you want the latest revision of each record, you can use
SELECT C.rev, C.field FROM (
SELECT MAX(A.rev) AS rev, A.id
FROM yourtable A
GROUP BY A.id)
AS B
INNER JOIN yourtable C
ON B.id = C.id AND B.rev = C.rev
In the case of your example, that would return
rev field
1 fsdfs
1 jfds
2 test2

SELECT
MaxRevs.id,
revision.field
FROM
(SELECT
id,
MAX(rev) AS MaxRev
FROM revision
GROUP BY id
) MaxRevs
INNER JOIN revision
ON MaxRevs.id = revision.id AND MaxRevs.MaxRev = revision.rev

SELECT foo.* from foo
left join foo as later
on foo.id=later.id and later.rev>foo.rev
where later.id is null;

How about this?
select id, max(rev), field from foo group by id
For querying specific revision e.g. revision 1,
select id, max(rev), field from foo where rev <= 1 group by id

How to select 10 rows below the result returned by the SQL query?

Here is the SQL table:
KEY | NAME | VALUE
---------------------
13b | Jeffrey | 23.5
F48 | Jonas | 18.2
2G8 | Debby | 21.1
Now, if I type:
SELECT *
FROM table
WHERE VALUE = 23.5
I will get the first row.
What I need to accomplish is to get the first and the next two rows below. Is there a way to do it?
Columns are not sorted and WHERE condition doesn't participate in the selection of the rows, except for the first one. I just need the two additional rows below the returned one - the ones that were entered after the one which has been returned by the SELECT query.

Without a date column or an auto-increment column, you can't reliably determine the order the records were entered.
The physical order with which rows are stored in the table is non-deterministic.

You need to define an order to the results to do this. There is no guaranteed order to the data otherwise.
If by "the next 2 rows after" you mean "the next 2 records that were inserted into the table AFTER that particular row", you will need to use an auto incrementing field or a "date create" timestamp field to do this.

If each row has an ID column that is unique and auto incrementing, you could do something like:
SELECT * FROM table WHERE id > (SELECT id FROM table WHERE value = 23.5)

If I understand correctly, you're looking for something like:
SELECT * FROM table WHERE value <> 23.5

You can obviously write a program to do that, but I am assuming you want a query. What about using a UNION? You would also have to create a new column called value_id or something along those lines, which is incremented sequentially (probably using a sequence). The idea is that value_id is incremented on every insert, and using that you can write a WHERE clause to return the remaining two rows you want.
For example:
Select * from table where value = 23.5
Union
Select * from table where value_id > 2 limit 2;
Limit 2 because you already got the first value in the first query

You need an order if you want to be able to think in terms of "before" and "after".
Assuming you have one you can use ROW_NUMBER() (see more here http://msdn.microsoft.com/en-us/library/ms186734.aspx) and do something like:
With MyTable as
(select row_number() over (order by key) as n, key, name, value
from table)
select key, name, value
from MyTable
where n >= (select n from MyTable where value = 23.5)
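
Here is a minimal sketch of "the matching row plus the next two", assuming the table has an auto-incrementing id column that records insertion order. It runs against an in-memory SQLite database from Python; the table name `people`, the column `row_key` (standing in for KEY) and the two extra rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        row_key TEXT, name TEXT, value REAL
    );
    -- The first and last rows are made up to show that they are excluded.
    INSERT INTO people (row_key, name, value) VALUES
        ('A01', 'Alice', 30.0),
        ('13b', 'Jeffrey', 23.5),
        ('F48', 'Jonas', 18.2),
        ('2G8', 'Debby', 21.1),
        ('Z99', 'Zed', 11.0);
""")

# Start at the row matching the value, then take it and the two rows after it.
rows = conn.execute(
    """
    SELECT row_key, name, value FROM people
    WHERE id >= (SELECT id FROM people WHERE value = ?)
    ORDER BY id
    LIMIT 3
    """,
    (23.5,),
).fetchall()

for row in rows:
    print(row)
```

Without the id column (or a creation timestamp) there is nothing to put in the ORDER BY, which is why the insertion order cannot be recovered from the original three columns alone.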
