How to select the last inserted row in a table - SQL

As shown below, I have a large database table with several columns. I want to be able to select the last inserted row in that table. How can I achieve that in PostgreSQL?
Output:
"timeofinsertion","selectedsiteid","devenv","threshold","ec50ewco","dose","apprateofproduct","concentrationofactingr","unclass_intr_nzccs","unclass_inbu_nzccs","vl_intr_nzccs","percentage_vl_per_total_nzccs_intr","vl_inbu_nzccs","percentage_vl_per_total_nzccs_inbu","totalvlnzccsinsite","percentage_total_vlnzccs_per_site","l_intr_nzccs","percentage_l_per_total_nzccs_intr","l_inbu_nzccs","percentage_l_per_total_nzccs_inbu","totallnzccsinsite","percentage_total_lnzccs_per_site","m_intr_nzccs","percentage_m_per_total_nzccs_intr","m_inbu_nzccs","percentage_m_per_total_nzccs_inbu","totalmnzccsinsite","percentage_total_mnzccs_per_site","h_intr_nzccs","percentage_h_per_total_nzccs_intr","h_inbu_nzccs","percentage_h_per_total_nzccs_inbu","totalhnzccsinsite","percentage_total_hnzccs_per_site","unclass_intr_zccs","unclass_inbu_zccs","vl_intr_zccs","percentage_vl_per_total_zccs_intr","vl_inbu_zccs","percentage_vl_per_total_zccs_inbu","totalvlzccsinsite","percentage_total_vlzccs_per_site","l_intr_zccs","percentage_l_per_total_zccs_intr","l_inbu_zccs","percentage_l_per_total_zccs_inbu","totallzccsinsite","percentage_total_lzccs_per_site","m_intr_zccs","percentage_m_per_total_zccs_intr","m_inbu_zccs","percentage_m_per_total_zccs_inbu","totalmlzccsinsite","percentage_total_mzccs_per_site","h_intr_zccs","percentage_h_per_total_zccs_intr","h_inbu_zccs","percentage_h_per_total_zccs_inbu","totalhzccsinsite","percentage_total_hzccs_per_site","totalunclassnzccs","totalunclasszccs","totalnzccsintr","totalnzccsinbu","totalnzccsinsite","totalzccsintr","totalzccsinbu","totalzccsinsite","totalvlinsite","percentageof_total_vl_insite_per_site","totallinsite","percentageof_total_l_insite_per_site","totalminsite","percentageof_total_m_insite_per_site","totalhinsite","percentageof_total_h_insite_per_site","total_unclass_with_nodatacells_excluded","total_unclass_with_nodatacells_included","total_with_nodatacells_excluded","total_with_nodatacells_included"
"3-2-2023 10:0:3:745762","202311011423",test,1,"3.125","0.75","75","100","0","0","0","0","0","0","0","0.0","0","0","0","0","0","0.0","0","0","0","0","0","0.0","0","0","0","0","0","0.0","0","0","0","0.0","32","91.4","32","82.1","0","0.0","3","8.6","3","7.7","4","100.0","0","0.0","4","10.3","0","0.0","0","0.0","0","0.0","0","0","0","0","0","4","35","39","32","82.1","3","7.7","4","10.3","0","0.0","0","0","39","39"

If you have a column to sort by, you can use:
select *
from the_table
order by timeofinsertion desc
limit 1;
If you want to get the complete row that you have just inserted, it might be easier to use the RETURNING clause with your INSERT statement:
insert into the_table (timeofinsertion, selectedsiteid, ...)
values (current_timestamp, ....)
returning *;

In SQL Server you can use SCOPE_IDENTITY(); it returns the last inserted identity column value.
Otherwise you can use ORDER BY ... DESC with TOP 1 on the identity column.
Example:
Create Table tbl_Name
(
RowId Int Not Null Primary Key Identity(1,1),
Name Varchar(100)
)
Insert Into tbl_Name (Name) Values ('My Name')
Select * From tbl_Name Where RowId = SCOPE_IDENTITY()
Select Top 1 * From tbl_Name Order by RowId desc

Related

Increment column IDs by 1 in insert into statement

I have a table I want to insert into based on two other tables.
In the table I'm inserting into, I need to find the Max value and then do +1 every time to basically create new IDs for each of the 2000 values I'm inserting.
I tried something like
MAX(column_name) + 1
But it didn't work. I CANNOT make the column an IDENTITY and ideally the increment by one should happen in the INSERT INTO ... SELECT ... statement.
Many Thanks!
You can declare a variable holding the last value from the table and use it in the insert statement, like this:
DECLARE @Id INT
SET @Id = (SELECT TOP 1 Id FROM YourTable ORDER BY Id DESC)
INSERT INTO YourTable VALUES (@Id + 1, Value, Value)
If it's MySQL, you could do something like this:
insert into yourtable
select
@rownum:=@rownum+1 'colname', t.* from yourtable t, (SELECT @rownum:=2000) r
The example to generate the row number is taken from here.
If it's PostgreSQL, you could use
insert into yourtable
select t.*,((row_number() over () ) + 2000) from yourtable t
Please note the order for the select is different on both the queries, you may need to adjust your insert statement accordingly.
Use a sequence, that's what they are for.
create sequence table_id_sequence;
Then adjust the sequence to the current max value:
select setval('table_id_sequence', (select max(id_column) from the_table));
The above only needs to be done once.
After the sequence is set up, always use that for any subsequent inserts:
insert into the_table (id_column, column_2, column_3)
select nextval('table_id_sequence'), column_2, column_3
from some_other_table;
If you will never have any concurrent inserts into that table (but only then) you can get away with using max() + 1:
insert into the_table (id_column, column_2, column_3)
select row_number() over () + mx.id, column_2, column_3
from some_other_table
cross join (
select max(id_column) from the_table
) as mx(id);
But again: the above is NOT safe for concurrent inserts.
The sequence solution will also perform better (especially as the target table grows in size).
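If you want the sequence applied automatically, you can also make it the column's default; a sketch, assuming the table and column names used above:

```sql
-- Let the sequence supply the ID on every insert automatically
alter table the_table
    alter column id_column set default nextval('table_id_sequence');

-- Subsequent inserts can then omit the ID column entirely
insert into the_table (column_2, column_3)
select column_2, column_3
from some_other_table;
```

With the default in place, concurrent inserts are safe because each session draws its own value from the sequence.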

Deleting duplicate rows from Redshift

I am trying to delete some duplicate data in my Redshift table.
Below is my query:
With duplicates
As
(Select *, ROW_NUMBER() Over (PARTITION by record_indicator Order by record_indicator) as Duplicate From table_name)
delete from duplicates
Where Duplicate > 1 ;
This query is giving me an error.
Amazon Invalid operation: syntax error at or near "delete";
Not sure what the issue is, as the syntax for the WITH clause seems to be correct.
Has anybody faced this situation before?
Redshift being what it is (no enforced uniqueness for any column), Ziggy's third option is probably best. Once we decide to go the temp-table route, it is more efficient to swap things out wholesale. Deletes and inserts are expensive in Redshift.
begin;
create table table_name_new as select distinct * from table_name;
alter table table_name rename to table_name_old;
alter table table_name_new rename to table_name;
drop table table_name_old;
commit;
If space isn't an issue you can keep the old table around for a while and use the other methods described here to validate that the row count in the original, accounting for duplicates, matches the row count in the new table.
If you're doing constant loads to such a table you'll want to pause that process while this is going on.
If the number of duplicates is a small percentage of a large table, you might want to try copying distinct records of the duplicates to a temp table, then deleting all records from the original that join with the temp table, and finally appending the temp table back to the original. Make sure you vacuum the original table afterwards (which you should be doing for large tables on a schedule anyway).
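A sketch of that approach; the table name big_table and the key column record_id are placeholders:

```sql
-- 1. Copy one distinct copy of each duplicated record into a temp table
create temp table dedup as
select distinct t.*
from big_table t
join (
    select record_id
    from big_table
    group by record_id
    having count(*) > 1
) d on t.record_id = d.record_id;

-- 2. Delete every copy of the duplicated records from the original
delete from big_table
using dedup
where big_table.record_id = dedup.record_id;

-- 3. Append the single copies back
insert into big_table
select * from dedup;

-- 4. Reclaim space and re-sort (Redshift)
vacuum big_table;
```

Because only the duplicated records are touched, the delete and insert stay small relative to the table.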
If you're dealing with a lot of data, it's not always possible or smart to recreate the whole table. It may be easier to locate and delete those rows:
-- First identify all the rows that are duplicate
CREATE TEMP TABLE duplicate_saleids AS
SELECT saleid
FROM sales
WHERE saledateid BETWEEN 2224 AND 2231
GROUP BY saleid
HAVING COUNT(*) > 1;
-- Extract one copy of all the duplicate rows
CREATE TEMP TABLE new_sales(LIKE sales);
INSERT INTO new_sales
SELECT DISTINCT *
FROM sales
WHERE saledateid BETWEEN 2224 AND 2231
AND saleid IN(
SELECT saleid
FROM duplicate_saleids
);
-- Remove all rows that were duplicated (all copies).
DELETE FROM sales
WHERE saledateid BETWEEN 2224 AND 2231
AND saleid IN(
SELECT saleid
FROM duplicate_saleids
);
-- Insert back in the single copies
INSERT INTO sales
SELECT *
FROM new_sales;
-- Cleanup
DROP TABLE duplicate_saleids;
DROP TABLE new_sales;
COMMIT;
Full article: https://elliot.land/post/removing-duplicate-data-in-redshift
That should have worked. Alternatively, you can do:
With
duplicates As (
Select *, ROW_NUMBER() Over (PARTITION by record_indicator
Order by record_indicator) as Duplicate
From table_name)
delete from table_name
where id in (select id from duplicates Where Duplicate > 1);
or
delete from table_name
where id in (
select id
from (
Select id, ROW_NUMBER() Over (PARTITION by record_indicator
Order by record_indicator) as Duplicate
From table_name) x
Where Duplicate > 1);
If you have no primary key, you can do the following:
BEGIN;
CREATE TEMP TABLE mydups ON COMMIT DROP AS
SELECT DISTINCT ON (record_indicator) *
FROM table_name
ORDER BY record_indicator --, other_optional_priority_field DESC
;
DELETE FROM table_name
WHERE record_indicator IN (
SELECT record_indicator FROM mydups);
INSERT INTO table_name SELECT * FROM mydups;
COMMIT;
This method preserves the permissions and the table definition of the original_table.
The most upvoted answer does not preserve the permissions on the table or its original definition.
In a real-world production environment this is how you should do it, as it is the safest and easiest way to execute.
Note that this approach incurs DOWN TIME in production.
Create Table with unique rows
CREATE TABLE unique_table as
(
SELECT DISTINCT * FROM original_table
)
;
Backup the original_table
CREATE TABLE backup_table as
(
SELECT * FROM original_table
)
;
Truncate the original_table
TRUNCATE original_table;
Insert records from unique_table into original_table
INSERT INTO original_table
(
SELECT * FROM unique_table
)
;
To avoid down time, run the queries below in a TRANSACTION, and use DELETE instead of TRUNCATE:
BEGIN transaction;
CREATE TABLE unique_table as
(
SELECT DISTINCT * FROM original_table
)
;
CREATE TABLE backup_table as
(
SELECT * FROM original_table
)
;
DELETE FROM original_table;
INSERT INTO original_table
(
SELECT * FROM unique_table
)
;
END transaction;
A simple answer to this question:
First, create a temporary table from the main table containing only the rows where row_number = 1.
Second, delete all the rows from the main table on which we had duplicates.
Then insert the values of the temporary table into the main table.
Queries:
Temporary table:
select id, date into #temp_a
from
(select *
 from (select a.*,
              row_number() over(partition by id order by etl_createdon desc) as rn
       from table a
       where a.id between 59 and 75 and a.date = '2018-05-24') b
 where rn = 1) a
Deleting all the rows from the main table:
delete from table a
where a.id between 59 and 75 and a.date = '2018-05-24'
Inserting all values from the temp table into the main table:
insert into table a select * from #temp_a
The following deletes all records in 'tablename' that have a duplicate, it will not deduplicate the table:
DELETE FROM tablename
WHERE id IN (
SELECT id
FROM (
SELECT id,
ROW_NUMBER() OVER (partition BY column1, column2, column3 ORDER BY id) AS rnum
FROM tablename
) t
WHERE t.rnum > 1);
Your query does not work because Redshift does not allow DELETE after the WITH clause. Only SELECT, UPDATE, and a few others are allowed (see the WITH clause documentation).
Solution (in my situation):
I have an id column on my table events that uniquely identifies the record, but the table contained duplicate rows. This column id is the same as your record_indicator.
Unfortunately, I was unable to create a temporary table with SELECT DISTINCT because I ran into the following error:
ERROR: Intermediate result row exceeds database block size
But this worked like a charm:
CREATE TABLE temp as (
SELECT *,ROW_NUMBER() OVER (PARTITION BY id ORDER BY id) AS rownumber
FROM events
);
resulting in the temp table:
id | rownumber | ...
----------------
1 | 1 | ...
1 | 2 | ...
2 | 1 | ...
2 | 2 | ...
Now the duplicates can be deleted by removing the rows having rownumber larger than 1:
DELETE FROM temp WHERE rownumber > 1
After that, rename the tables and you're done.
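The rename step might look like this; a sketch that assumes the original table is called events (as above) and first drops the helper rownumber column so the schemas match:

```sql
begin;
-- Drop the helper column so temp matches the original schema
alter table temp drop column rownumber;
-- Swap the tables
alter table events rename to events_old;
alter table temp rename to events;
drop table events_old;
commit;
```

Note that, unlike the transactional swap shown earlier in this thread, this plain rename does not carry over permissions granted on the original table.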
with duplicates as
(
select a.*, row_number() over (partition by first_name, last_name, email order by first_name, last_name, email) as rn from contacts a
)
delete from contacts
where contact_id in (
select contact_id from duplicates where rn >1
)

One column as two columns

I have a table which displays user IDs in one column. I want to display that column as two columns.
For example:
In the first row, the first user ID in column 1 and the second user ID in column 2;
in the second row, the second user ID in column 1 and the third user ID in column 2;
in the third row, the third user ID in column 1 and the fourth user ID in column 2;
and so on.
SQL tables represent unordered sets. There is no concept of "first userid", "second userid" etc. unless you have a column that specifies the ordering.
You can do what you want using the ANSI standard lead() function. You haven't specified the database you are using, so this seems like the best answer:
select userid, lead(userid order by rownumber) as next_userid
from (<subquery>) s;
You can use row_number to create an order column and use it to join the table with itself; see this example (MS-SQL):
declare #Users as table
(
Id int not null
,Name varchar(20) not null
)
insert into #Users
values
(10, 'First')
,(220,'Second')
,(301,'Third')
,(999,'Last')
declare #Temp table
(
Id int not null
,Name varchar(20) not null
,[Order] int not null
)
insert into #Temp
select us.Id, us.Name, ROW_NUMBER() OVER(order by us.Id) as [Order]
from #Users us
select *
from #Temp t0
join #Temp t1 on t1.[Order] = t0.[Order] + 1
Try this solution; it works in PostgreSQL:
SELECT firstTable.id as firstid, MIN(nextTable.id) as nextid
FROM your_table_name as firstTable
LEFT JOIN your_table_name as nextTable
ON firstTable.id < nextTable.id
GROUP BY firstTable.id
ORDER BY firstid ASC

Remove duplicated rows in SQL

I want to remove duplicated rows in SQL. My table looks like this:
CREATE TABLE test_table
(
id Serial,
Date Date,
Time Time,
Open double precision,
High double precision,
Low double precision
);
DELETE FROM test_table
WHERE ctid IN (SELECT min(ctid)
FROM test_table
GROUP BY id
HAVING count(*) > 1);
With the above DELETE statement I am searching the hidden ctid column for duplicate entries and deleting them. However, this does not work correctly: the query executes without error, but does not delete anything.
I appreciate your answer!
UPDATE
This is some sample data(without the generated id):
2013.11.07,12:43,1.35162,1.35162,1.35143,1.35144
2013.11.07,12:43,1.35162,1.35162,1.35143,1.35144
2013.11.07,12:44,1.35144,1.35144,1.35141,1.35142
2013.11.07,12:45,1.35143,1.35152,1.35143,1.35151
2013.11.07,12:46,1.35151,1.35152,1.35149,1.35152
Get out of the habit of using ctid, xid, etc. - they're not advertised for a reason.
One way of dealing with duplicate rows in one shot, depending on how recent your postgres version is:
with unique_rows
as
(
select distinct on (id) *
from test_table
),
delete_rows
as
(
delete
from test_table
)
insert into test_table
select *
from unique_rows
;
Or break everything up in three steps and use temp tables:
create temp table unique_rows
as
select distinct on (id) *
from test_table
;
delete
from test_table
;
insert into test_table
select *
from unique_rows
;
Not sure if you can use row_number with partitions in PostgreSQL, but if so you can do this to find duplicates. You can add or remove columns in the partition by clause to define what counts as a duplicate in the set:
WITH cte AS
(
SELECT id,ROW_NUMBER() OVER(PARTITION BY Date, Time ORDER BY date, time) AS rown
FROM test_table
)
delete From test_table
where id in (select id from cte where rown > 1);

The row number in query output

I have a numeric field (say num) in a table, along with a primary key.
select * from mytable order by num
Now, how can I get the row number, in the query output, of a particular row for which I have the primary key?
I'm using SQL Server 2000.
Sounds like you want a row number for each record returned.
In SQL 2000, you can either do this:
SELECT (SELECT COUNT(*) FROM MyTable t2 WHERE t2.num <= t.num) AS RowNo, *
FROM MyTable t
ORDER BY num
which assumes num is unique. If it's not, then you'd have to use the PK field and order by that.
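For a non-unique num, a sketch that uses the primary key (here assumed to be a column named pk) as the tie-breaker might look like:

```sql
SELECT (SELECT COUNT(*)
        FROM MyTable t2
        WHERE t2.num < t.num
           OR (t2.num = t.num AND t2.pk <= t.pk)) AS RowNo,
       *
FROM MyTable t
ORDER BY num, pk
```

The (num, pk) pair is unique, so each row gets a distinct, stable row number.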
Or, use a temp table (or table var):
CREATE TABLE #Results
(
RowNo INTEGER IDENTITY(1,1),
MyField VARCHAR(10)
)
INSERT #Results
SELECT MyField
FROM MyTable
ORDER BY num
SELECT * FROM #Results
DROP TABLE #Results
In SQL 2005, there is a ROW_NUMBER() function you could use which makes life a lot easier.
As I understand your question, you want to get the number of all rows returned, right?
If so, use @@ROWCOUNT.
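A minimal sketch; @@ROWCOUNT reports the rows affected by the most recently executed statement:

```sql
SELECT * FROM mytable ORDER BY num;

-- Number of rows returned by the SELECT above
SELECT @@ROWCOUNT AS RowsReturned;
```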
As Ada points out, this task became a lot easier in SQL Server 2005....
SELECT whatever, RowNumber FROM (
SELECT pk
, whatever
, ROW_NUMBER() OVER(ORDER BY num) AS RowNumber
FROM mytable
) t
WHERE pk = 23;