Updating a 3 Million record table using LEAD() function - sql

I need to update an EDW_END_DATE column in a dimension table using the LEAD() function. The table has 3 million records, and the Oracle query seems to run forever.
UPDATE Edwstu.Stu_Class_D A
SET EDW_END_DATE =
(
  SELECT Edw_New_End_Dt
  FROM
  (
    SELECT
      LEAD(Edw_Begin_Date - 1, 1, '31-DEC-2099') OVER (
        PARTITION BY Acad_Term_Code, Class_Number
        ORDER BY Edw_Begin_Date ASC
      ) AS Edw_New_End_Dt,
      STU_CLASS_KEY
    FROM Edwstu.Stu_Class_D
  ) B
  WHERE A.STU_CLASS_KEY = B.STU_CLASS_KEY
);

Try updating it with a MERGE statement instead:
MERGE INTO EDWSTU.STU_CLASS_D A
USING (
SELECT
LEAD(EDW_BEGIN_DATE - 1, 1, '31-DEC-2099') OVER(
PARTITION BY ACAD_TERM_CODE, CLASS_NUMBER
ORDER BY
EDW_BEGIN_DATE ASC
) AS EDW_NEW_END_DT,
STU_CLASS_KEY
FROM
EDWSTU.STU_CLASS_D
)
B ON ( A.STU_CLASS_KEY = B.STU_CLASS_KEY )
WHEN MATCHED THEN
UPDATE SET A.EDW_END_DATE = B.EDW_NEW_END_DT;
Cheers!!

You are updating all the rows in the table. This is generally an expensive operation due to locking and logging.
You might consider regenerating the table entirely. Note: back up the table before doing this.
-- create the table with the results you want
create table temp_stu_class_d as
select d.*,
       lead(Edw_Begin_Date - 1, 1, date '2099-12-31') over (
           partition by Acad_Term_Code, Class_Number
           order by Edw_Begin_Date
       ) as Edw_New_End_Dt
from Edwstu.Stu_Class_D d;
-- remove the contents of the current table
truncate table Edwstu.Stu_Class_D;
-- reload the table from the staged results
insert into Edwstu.Stu_Class_D ( . . . , Edw_End_Dt) -- list the columns here
select . . . , Edw_New_End_Dt -- and here
from temp_stu_class_d;
The insert is generally much more efficient than logging each update.
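The reload step can also be done as a direct-path load with the /*+ append */ hint (a sketch, not a guaranteed win; the column lists are still elided as above, and a direct-path insert must be committed before the table is read again):

insert /*+ append */ into Edwstu.Stu_Class_D ( . . . , Edw_End_Dt)
select . . . , Edw_New_End_Dt
from temp_stu_class_d;
commit;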


Find the most recently updated rows according to a multi-column grouping

I'm using SQL Server and T-SQL.
Sample Data:
I have data similar to the following readily consumable test data.
--===== Set the proper date format for the test data.
SET DATEFORMAT dmy
;
--===== Create and populate the Test Table
DROP TABLE IF EXISTS #TestTable
;
CREATE TABLE #TestTable
(
Item VARCHAR(10) NOT NULL
,GroupA TINYINT NOT NULL
,GroupB SMALLINT NOT NULL
,Updated DATE NOT NULL
,Idx INT NOT NULL
)
;
INSERT INTO #TestTable WITH (TABLOCK)
(Item,GroupA,GroupB,Updated,Idx)
VALUES ('ABC',7,2020,'14/11/2019',8) --Return this row
,('ABC',7,2020,'10/11/2019',7)
,('ABC',6,2019,'14/11/2019',6) --Return this row
,('ABC',5,2018,'13/11/2019',5) --Return this row
,('ABC',5,2018,'12/11/2019',4)
,('ABC',7,2018,'14/11/2019',3) --Return this row
,('ABC',7,2019,'25/11/2019',2) --Return this row
,('ABC',7,2019,'18/11/2019',1)
;
--===== Display the test data
SELECT * FROM #TestTable
;
Problem Description:
I need help in writing a query that will return the rows marked as "--Return this row". I know how to write a basic SELECT but have no idea how to pull this off.
The basis of the problem is to return the latest updated row for each "group" of rows. A "group" of rows is determined by the combination of the Item, GroupA, and GroupB columns and I need to return the full rows found.
Use row_number():
select t.*
from (select t.*, row_number() over (partition by item, groupa, groupb order by updated desc) as seq
from table t
) t
where seq = 1;
select t.Item, t.GroupA, t.GroupB, t.Updated, t.Idx
FROM (select Item, GroupA, GroupB, max(Updated) Updated
      from #TestTable
      group by Item, GroupA, GroupB) a
inner join #TestTable t
  on a.Item = t.Item and a.GroupA = t.GroupA and a.GroupB = t.GroupB
 and a.Updated = t.Updated

SQL query is taking too long to insert

I am trying to insert data into P_TABLE, which is taking a lot of time (~5-6 hours). It's a simple insert joining with big tables. Is there any way to reduce the run time? It's a truncate-and-load process.
I have provided the necessary information, including the explain plan.
P_TABLE  -- partitioned on TEAM
WH_TAB   -- total count = 2,222,000,000
         -- unique index on (EX_ID, PROD_CD, CAM_CD, SEG_CD, LIST_CD, MAIL_DT)
         -- partitioned by range (MAIL_DT)
REF_TAB  -- total count = 240,000,000
ACT_TAB  -- total count = 31,239,890
ALTER SESSION ENABLE PARALLEL DML;
INSERT /*+ append */ INTO P_TABLE
(
V_CODE,
CST_ID,
EX_ID,
PROD_CD,
CAM_CD,
SEG_CD,
LIST_CD,
MAIL_DT
)
SELECT
'ABC',
COALESCE(REF.CST_ID, WH.CST_ID),
WH.EX_ID,
PROD_CD,
CAM_CD,
SEG_CD,
LIST_CD,
F.TEAM
FROM WH_TAB WH
LEFT OUTER JOIN
(
SELECT EX_ID, CST_ID, ACCT_ID, row_number() over(partition by EX_ID order by CST_ID asc) RN
FROM REF_TAB
) REF
LEFT OUTER JOIN ACT_TAB F
on F.CST_ID=REF.CST_ID
ON REF.RN=1 AND REF.EX_ID=WH.EX_ID
WHERE TRUNC(MAIL_DT) >= add_months(TRUNC(sysdate),-13)
AND WH.CAM_CD NOT LIKE 'ORD%';
COMMIT;
I'd suggest you run the select part of the statement without the insert to establish whether the slow part is the query or the insert.
It's likely that it's the select that's the problem, so the table structure, including indexes and explain output are really needed to say much more.
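One way to isolate the query cost (a sketch: wrapping the SELECT in a COUNT(*) forces all the rows to be produced without writing anything, though the optimizer may do somewhat less work than for the real insert):

select count(*)
from (
    -- paste the original SELECT from the INSERT here, unchanged
    select 'ABC', coalesce(REF.CST_ID, WH.CST_ID), . . .
) q;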

How to retrieve the id of an inserted row when using upsert with a WITH clause in Postgres 9.5?

I'm trying to do an upsert query in Postgres 9.5 using WITH:
with s as (
select id
from products
where product_key = 'test123'
), i as (
insert into products (product_key, count_parts)
select 'test123', 33
where not exists (select 1 from s)
returning id
)
update products
set product_key='test123', count_parts=33
where id = (select id from s)
returning id
Apparently I'm retrieving the id only on updates and get nothing on insertions, even though I know the insertions succeeded.
I need to modify this query so that I get the id on both insertions and updates.
Thanks!
It wasn't clear to me why you start the WITH with a SELECT, but the reason you only get an id back on updates is that you never select the INSERT's RETURNING result.
As mentioned (and linked) in the comments, Postgres 9.5 supports the INSERT ... ON CONFLICT clause, which is a much cleaner approach.
And some examples of before and after 9.5:
Before 9.5: common way using WITH
WITH u AS (
UPDATE products
SET product_key='test123', count_parts=33
WHERE product_key = 'test123'
RETURNING id
),i AS (
INSERT
INTO products ( product_key, count_parts )
SELECT 'test123', 33
WHERE NOT EXISTS( SELECT 1 FROM u )
RETURNING id
)
SELECT *
FROM ( SELECT id FROM u
UNION SELECT id FROM i
) r;
After 9.5: using INSERT .. ON CONFLICT
INSERT INTO products ( product_key, count_parts )
VALUES ( 'test123', 33 )
ON CONFLICT ( product_key ) DO
UPDATE
SET product_key='test123', count_parts=33
RETURNING id;
UPDATE:
As hinted in a comment, there can be a slight downside to the INSERT .. ON CONFLICT approach.
If the table uses an auto-incrementing id and this query runs a lot, the WITH version might be a better option, because every ON CONFLICT attempt consumes a sequence value even when it ends up updating.
See more: https://stackoverflow.com/a/39000072/1161463
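To see the sequence gap concretely (a sketch; it assumes products has a serial id column and a unique constraint on product_key):

-- upsert an existing key; the attempted insert still consumes an id
INSERT INTO products ( product_key, count_parts )
VALUES ( 'test123', 33 )
ON CONFLICT ( product_key ) DO
UPDATE SET count_parts = EXCLUDED.count_parts
RETURNING id;
-- a genuinely new key now gets an id that skips one value
-- for every conflicting attempt made above
INSERT INTO products ( product_key, count_parts )
VALUES ( 'brand_new_key', 1 )
RETURNING id;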

How to select values by date field (not as simple as it sounds)

I have a table called tblMK. The table contains a datetime field.
What I wish to do is create a query which will, each time, select the 2 latest entries (by the datetime column), get the date difference between them, and show only that.
How would I go about creating this expression? It doesn't necessarily need to be a query; it could be a view/function/procedure or whatever works. I have created a function called getdatediff which receives two dates and returns a string that says (x days y hours z minutes); basically that will be the calculated field. So how would I go about doing this?
Edit: I need to select the rows in pairs, two at a time, back to the oldest one. There will always be an even number of rows.
Using only SQL, like this:
create table t1(c1 integer, dt datetime);
insert into t1 values
(1, getdate()),
(2, dateadd(day,1,getdate())),
(3, dateadd(day,2,getdate()));
with temp as (select top 2 dt
from t1
order by dt desc)
select datediff(day,min(dt),max(dt)) as diff_of_dates
from temp;
sql fiddle
On MySQL, use the LIMIT clause:
select max(a.updated_at)-min(a.updated_at)
From
( select * from mytable order by updated_at desc limit 2 ) a
Thanks guys, I found the solution. Please ignore the additional columns, they are for my db:
; with numbered as (
Select part, taarich, hulia, mesirakabala,
rowno = row_number() OVER (Partition by part Order by taarich)
From tblMK)
Select a.rowno - 1, a.part, a.hulia, b.taarich as taarich_kabala, a.taarich, a.mesirakabala, getdatediff(b.taarich, a.taarich) as due
From numbered a
Left join numbered b ON b.part = a.part
And b.rowno = a.rowno - 1
Where b.taarich is not null
Order by part, taarich
Sorry about mistakes I might of made, I'm on my smartphone.

Make SQL Select same row multiple times

I need to test my mail server. How can I make a SELECT statement
that selects, say, ID=5469 a thousand times?
If I get your meaning, then a very simple way is to cross join on a derived query of a table with more than 1000 rows in it and put a TOP 1000 on that. This would duplicate your results 1000 times.
EDIT: As an example (This is MSSQL, I don't know if Access is much different)
SELECT
MyTable.*
FROM
MyTable
CROSS JOIN
(
SELECT TOP 1000
*
FROM
sysobjects
) [BigTable]
WHERE
MyTable.ID = 1234
You can use the UNION ALL statement.
Try something like:
SELECT * FROM tablename WHERE ID = 5469
UNION ALL
SELECT * FROM tablename WHERE ID = 5469
You'd have to repeat the SELECT statement a bunch of times but you could write a bit of VB code in Access to create a dynamic SQL statement and then execute it. Not pretty but it should work.
Create a helper table for this purpose:
JUST_NUMBER(NUM INT primary key)
Insert (with the help of some (VB) script) numbers from 1 to N. Then execute this query (a comma-style cross join, filtered on NUM):
SELECT MYTABLE.*
FROM MYTABLE,
JUST_NUMBER
WHERE MYTABLE.ID = 5469
AND JUST_NUMBER.NUM <= 1000
Here's a way of using a recursive common table expression to generate some empty rows, then to cross join them back onto your desired row:
declare #myData table (val int) ;
insert #myData values (666),(888),(777) --some dummy data
;with cte as
(
select 100 as a
union all
select a-1 from cte where a>0
--generate 100 rows, the max recursion depth
)
,someRows as
(
select top 1000 0 a from cte,cte x1,cte x2
--xjoin the hundred rows a few times
--to generate 1030301 rows, then select top n rows
)
select m.* from #myData m,someRows where m.val=666
substitute #myData for your real table, and alter the final predicate to suit.
easy way...
Suppose only one row exists in the DB:
sku = 52 , description = Skullcandy Inkd Green , price = 50,00
Cross join another table that has no key constraint relating it to the main table.
Original Query
SELECT Prod_SKU , Prod_Descr , Prod_Price FROM dbo.TB_Prod WHERE Prod_SKU = N'52'
The working query, adding an unrelated table called 'dbo.TB_Labels' (replace 1000 with the number of copies you need):
SELECT TOP (1000) Prod_SKU , Prod_Descr , Prod_Price FROM dbo.TB_Prod, dbo.TB_Labels WHERE Prod_SKU = N'52'
In postgres there is a nice function called generate_series. So in postgreSQL it is as simple as:
select information from test_table, generate_series(1, 1000) where id = 5469
In this way, the row is returned 1000 times.
Example for postgreSQL:
CREATE EXTENSION IF NOT EXISTS "uuid-ossp"; --To be able to use function uuid_generate_v4()
--Create a test table
create table test_table (
id serial not null,
uid UUID NOT NULL,
CONSTRAINT uid_pk PRIMARY KEY(id));
-- Insert 10000 rows
insert into test_table (uid)
select uuid_generate_v4() from generate_series(1, 10000);
-- Read the data from id=5469 one thousand times
select id, uid, uuid_generate_v4() from test_table, generate_series(1, 1000) where id = 5469;
As you can see in the result below, the data from uid is read 1000 times as confirmed by the generation of a new uuid at every new row.
id |uid |uuid_generate_v4
----------------------------------------------------------------------------------------
5469|"10791df5-ab72-43b6-b0a5-6b128518e5ee"|"5630cd0d-ee47-4d92-9ee3-b373ec04756f"
5469|"10791df5-ab72-43b6-b0a5-6b128518e5ee"|"ed44b9cb-c57f-4a5b-ac9a-55bd57459c02"
5469|"10791df5-ab72-43b6-b0a5-6b128518e5ee"|"3428b3e3-3bb2-4e41-b2ca-baa3243024d9"
5469|"10791df5-ab72-43b6-b0a5-6b128518e5ee"|"7c8faf33-b30c-4bfa-96c8-1313a4f6ce7c"
5469|"10791df5-ab72-43b6-b0a5-6b128518e5ee"|"b589fd8a-fec2-4971-95e1-283a31443d73"
5469|"10791df5-ab72-43b6-b0a5-6b128518e5ee"|"8b9ab121-caa4-4015-83f5-0c2911a58640"
5469|"10791df5-ab72-43b6-b0a5-6b128518e5ee"|"7ef63128-b17c-4188-8056-c99035e16c11"
5469|"10791df5-ab72-43b6-b0a5-6b128518e5ee"|"5bdc7425-e14c-4c85-a25e-d99b27ae8b9f"
5469|"10791df5-ab72-43b6-b0a5-6b128518e5ee"|"9bbd260b-8b83-4fa5-9104-6fc3495f68f3"
5469|"10791df5-ab72-43b6-b0a5-6b128518e5ee"|"c1f759e1-c673-41ef-b009-51fed587353c"
5469|"10791df5-ab72-43b6-b0a5-6b128518e5ee"|"4a70bf2b-ddf5-4c42-9789-5e48e2aec441"
Of course other DBs won't necessarily have the same function, but it can usually be emulated.
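For example, in SQL Server a recursive common table expression can stand in for generate_series (a sketch, reusing the test_table idea from above; the default recursion limit of 100 must be raised to generate 1000 rows):

with nums as (
    select 1 as n
    union all
    select n + 1 from nums where n < 1000
)
select t.*
from test_table t
cross join nums
where t.id = 5469
option (maxrecursion 1000);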
If you are doing this in SQL Server:
declare @cnt int
set @cnt = 0
while @cnt < 1000
begin
select '12345'
set @cnt = @cnt + 1
end
select '12345' can be any expression
This may be another solution: repeat rows based on a column value of TestTable. First run the CREATE TABLE and INSERT statements, then run the following query for the desired result.
CREATE TABLE TestTable
(
ID INT IDENTITY(1,1),
Col1 varchar(10),
Repeats INT
)
INSERT INTO TESTTABLE
VALUES ('A',2), ('B',4),('C',1),('D',0)
WITH x AS
(
SELECT TOP (SELECT MAX(Repeats)+1 FROM TestTable) rn = ROW_NUMBER()
OVER (ORDER BY [object_id])
FROM sys.all_columns
ORDER BY [object_id]
)
SELECT * FROM x
CROSS JOIN TestTable AS d
WHERE x.rn <= d.Repeats
ORDER BY Col1;
This trick helped me in my requirement. Here, PRODUCTDETAILS is my data table
and orderid is my column.
declare @Req_Rows int = 12
;WITH cte AS
(
SELECT 1 AS Number
UNION ALL
SELECT Number + 1 FROM cte WHERE Number < @Req_Rows
)
SELECT PRODUCTDETAILS.*
FROM cte, PRODUCTDETAILS
WHERE PRODUCTDETAILS.orderid = 3
create table #tmp1 (id int, fld varchar(max))
insert into #tmp1 (id, fld)
values (1,'hello!'),(2,'world'),(3,'nice day!')
select * from #tmp1
go
select * from #tmp1 where id=3
go 1000
drop table #tmp1
In SQL Server, try:
print 'wow'
go 5
output:
Beginning execution loop
wow
wow
wow
wow
wow
Batch execution completed 5 times.
The easy way is to create a table with 1000 rows. Let's call it BigTable. Then you would query for the data you want and join it with the big table, like this:
SELECT MyTable.*
FROM MyTable, BigTable
WHERE MyTable.ID = 5469