PostgreSQL query for multiple update

I have a table with 4 columns: emp_no, desig_name, from_date and to_date:
emp_no desig_name from_date to_date
1001 engineer 2004-08-01 00:00:00
1001 sr.engineer 2010-08-01 00:00:00
1001 chief.engineer 2013-08-01 00:00:00
So my question is: how do I update the to_date column of the first row to one day before the from_date of the second row, and likewise for the second row?
After update it should look like:
emp_no desig_name from_date to_date
1001 engineer 2004-08-01 00:00:00 2010-07-31 00:00:00
1001 sr.engineer 2010-08-01 00:00:00 2013-07-31 00:00:00
1001 chief.engineer 2013-08-01 00:00:00

You can calculate the "next" date using the lead() function.
This calculated value can then be used to update the table:
with calc as (
select promotion_id,
emp_no,
from_date,
lead(from_date) over (partition by emp_no order by from_date) as next_date
from emp
)
update emp
set to_date = c.next_date - interval '1' day
from calc c
where c.promotion_id = emp.promotion_id;
As you can see getting that value is quite easy, and storing derived information is very often not a good idea. You might want to consider a view that calculates this information on the fly so you don't need to update your table each time you insert a new row.
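The view idea could be sketched roughly like this. It is a minimal example that runs against an in-memory SQLite database through Python's sqlite3 module as a stand-in for PostgreSQL (window functions need SQLite 3.25 or later). The view name emp_with_to_date is made up for illustration, and SQLite's date(..., '-1 day') takes the place of the "- interval '1' day" arithmetic from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (emp_no INTEGER, desig_name TEXT, from_date TEXT);
    INSERT INTO emp VALUES
        (1001, 'engineer',       '2004-08-01'),
        (1001, 'sr.engineer',    '2010-08-01'),
        (1001, 'chief.engineer', '2013-08-01');

    -- Hypothetical view name; to_date is derived on the fly, never stored.
    CREATE VIEW emp_with_to_date AS
    SELECT emp_no,
           desig_name,
           from_date,
           date(lead(from_date) OVER (PARTITION BY emp_no ORDER BY from_date),
                '-1 day') AS to_date
    FROM emp;
""")

rows = conn.execute(
    "SELECT desig_name, from_date, to_date "
    "FROM emp_with_to_date ORDER BY from_date"
).fetchall()
for row in rows:
    print(row)
```

The latest row per employee has no following row, so its to_date comes back as NULL, and inserting a new promotion needs no extra UPDATE.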
SQLFiddle example: http://sqlfiddle.com/#!15/31665/1

Related

PGSQL query to get a list of sequential dates from today

I have a calendar table to which I have added the list of dates on which no action should be performed.
The table is as follows; the date format is YYYY-MM-DD.
date
2021-01-01
2021-04-05
2021-04-06
2021-04-07
2021-08-10
2021-11-22
2021-11-23
2021-11-24
2021-12-25
2021-12-31
Considering today is 2021-11-24.
The expected output is
date
2021-11-24
2021-11-23
2021-11-22
And Considering today is 2021-12-25
then the expected output is
date
2021-12-25
And Considering today is 2021-12-27
then the output should contain no data.
date
It should return the sequence of dates ending with today's date, in descending order, stopping at the first break in the sequence.
I searched various posts and found some related to my question, but the queries were a little complex, with nested subqueries. Is there a way to achieve the output in a more optimized way? I am new to pgsql.
Create example table:
CREATE TABLE calendar (d date);
INSERT INTO calendar VALUES ('2021-11-23'),('2021-11-20');
Query:
SELECT * FROM
(SELECT CURRENT_DATE - '1 day'::interval * generate_series(0,10) AS d) a
LEFT JOIN calendar c ON (c.d=a.d);
a.d | c.d
---------------------+------------
2021-11-14 00:00:00 | Null
2021-11-15 00:00:00 | Null
2021-11-16 00:00:00 | Null
2021-11-17 00:00:00 | Null
2021-11-18 00:00:00 | Null
2021-11-19 00:00:00 | Null
2021-11-20 00:00:00 | 2021-11-20
2021-11-21 00:00:00 | Null
2021-11-22 00:00:00 | Null
2021-11-23 00:00:00 | 2021-11-23
2021-11-24 00:00:00 | Null
Subquery "a" generates a date series, and then we join it to the table.
You can add conditions, for example "WHERE c.d IS NULL" or "WHERE c.d IS NOT NULL", depending on the filtering you want.
You can simply filter by a date range, building it by subtracting 2 days from today:
select "date"
from maintenance_dates_70099898
where "date" <= now()::date --you want to see today and 2 days prior; Last 3 days total
and "date" >= now()::date - '2 days'::interval
order by 1 desc;
With a runnable test:
drop table if exists maintenance_dates_70099898;
create table maintenance_dates_70099898 ("date" date);
insert into maintenance_dates_70099898
("date")
values
('2021-01-01'),
('2021-04-05'),
('2021-04-06'),
('2021-04-07'),
('2021-08-10'),
('2021-11-22'),
('2021-11-23'),
('2021-11-24'),
('2021-12-25'),
('2021-12-31');
select "date"
from maintenance_dates_70099898
where "date" <= now()::date --you want to see today and 2 days prior; Last 3 days total
and "date" >= now()::date - '2 days'::interval
order by 1 desc;
-- date
--------------
-- 2021-11-24
-- 2021-11-23
-- 2021-11-22
--(3 rows)
select "date"
from maintenance_dates_70099898
where "date" >= '2021-12-25'::date - '2 days'::interval
and "date" <= '2021-12-25'::date
order by 1 desc;
-- date
--------------
-- 2021-12-25
--(1 row)
I assume that for 2021-12-27 you do want to see 2021-12-25, as it's within the 3 day range prior.
select "date"
from maintenance_dates_70099898
where "date" >= '2021-12-28'::date - '2 days'::interval
and "date" <= '2021-12-28'::date
order by 1 desc;
-- date
--------
--(0 rows)
The main issue appears to be that the number of days is not known in advance, which rules out a simple range check. A RECURSIVE CTE comes to the rescue: starting from the given date, it plucks off each earlier date that is exactly 1 day before the last one found, and terminates when that no longer holds.
with recursive no_action(no_act_dt) as
( select no_act_dt
from no_action_calendar
where no_act_dt = :parm_date::date
union all
select c.no_act_dt
from no_action_calendar c
join no_action a
on (c.no_act_dt = a.no_act_dt - 1)
)
select *
from no_action
order by no_act_dt desc;
If you use this often or from several points, you can parametrize it with a SQL function. (see demo for both).
create or replace
function consective_no_action_dates (date_in date)
returns setof date
language sql
as $$
with recursive no_action(no_act_dt) as
( select no_act_dt
from no_action_calendar
where no_act_dt = date_in
union all
select c.no_act_dt
from no_action_calendar c
join no_action a
on (c.no_act_dt = a.no_act_dt - 1)
)
select *
from no_action
order by no_act_dt desc;
$$;
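To try the recursive approach without a PostgreSQL instance, here is a hedged sketch that runs the same CTE shape on an in-memory SQLite database from Python. The helper name consecutive_no_action_dates and the text-typed date column are assumptions made for this example, and SQLite's date(x, '-1 day') replaces the PostgreSQL "date - 1" arithmetic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE no_action_calendar (no_act_dt TEXT)")
conn.executemany(
    "INSERT INTO no_action_calendar VALUES (?)",
    [(d,) for d in (
        "2021-01-01", "2021-04-05", "2021-04-06", "2021-04-07", "2021-08-10",
        "2021-11-22", "2021-11-23", "2021-11-24", "2021-12-25", "2021-12-31",
    )],
)


def consecutive_no_action_dates(conn, date_in):
    """Return the unbroken run of calendar dates ending at date_in, newest first."""
    sql = """
        WITH RECURSIVE no_action(no_act_dt) AS (
            -- Anchor: the starting date, only if it is in the calendar.
            SELECT no_act_dt FROM no_action_calendar WHERE no_act_dt = :date_in
            UNION ALL
            -- Step: the calendar row exactly one day before the last one found.
            SELECT c.no_act_dt
            FROM no_action_calendar c
            JOIN no_action a ON c.no_act_dt = date(a.no_act_dt, '-1 day')
        )
        SELECT no_act_dt FROM no_action ORDER BY no_act_dt DESC
    """
    return [r[0] for r in conn.execute(sql, {"date_in": date_in})]


print(consecutive_no_action_dates(conn, "2021-11-24"))
print(consecutive_no_action_dates(conn, "2021-12-25"))
print(consecutive_no_action_dates(conn, "2021-12-27"))
```

The three calls correspond to the three "today" values in the question: a run of three dates, a single date, and an empty result when the starting date is not in the calendar at all.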

How can I extract the values of the last aggregation date in sql

I have the following table.
id user time_stamp
1 Mike 2020-02-13 00:00:00 UTC
2 John 2020-02-13 00:00:00 UTC
3 Levy 2020-02-12 00:00:00 UTC
4 Sam 2020-02-12 00:00:00 UTC
5 Frodo 2020-02-11 00:00:00 UTC
Let's say 2020-02-13 00:00:00 UTC is the last day, and I would like to query this table to display only the last day's results. I want to create a view in BigQuery so that I always get only the last day's results.
So that in the end I get something like this (For last day which is 2020-02-13 00:00:00 UTC )
id user time_stamp
1 Mike 2020-02-13 00:00:00 UTC
2 John 2020-02-13 00:00:00 UTC
You can use window functions:
select t.* except (seqnum)
from (select t.*,
dense_rank() over (order by time_stamp desc) as seqnum
from t
) t
where seqnum = 1;
This may not work well on a large amount of data -- because of the way that BQ implements window functions with no partitioning. So, you might find that this works better (especially if the above runs out of resources):
select t.*
from t join
(select max(time_stamp) as max_time_stamp
from t
) tt
on t.time_stamp = max_time_stamp;
Also, if the timestamps actually have time components, then you will want to convert to a date or remove the time component somehow.
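As a rough illustration of stripping the time part, the sketch below applies the max-join variant to timestamps that do carry a time of day. It uses Python's sqlite3 as a stand-in, with SQLite's date() doing the truncation (BigQuery has its own DATE() function for this); the times of day are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (id INTEGER, user TEXT, time_stamp TEXT);
    -- Same rows as in the question, but with made-up times of day.
    INSERT INTO t VALUES
        (1, 'Mike',  '2020-02-13 09:15:00'),
        (2, 'John',  '2020-02-13 17:40:00'),
        (3, 'Levy',  '2020-02-12 00:00:00'),
        (4, 'Sam',   '2020-02-12 08:00:00'),
        (5, 'Frodo', '2020-02-11 23:59:59');
""")

# Compare on the calendar day, not on the full timestamp.
last_day_rows = conn.execute("""
    SELECT t.id, t.user
    FROM t
    JOIN (SELECT max(date(time_stamp)) AS max_day FROM t) tt
      ON date(t.time_stamp) = tt.max_day
    ORDER BY t.id
""").fetchall()
print(last_day_rows)
```

Joining on the raw time_stamp here would return only John's row, because his is the single latest timestamp.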

Oracle SQL, Recursion, Spreadsheet calculation

I want to transfer a calculation from an Excel spreadsheet to an Oracle SQL Query.
There are three predefined columns ID, IncommingDate and ProcessingTime.
Now I want to calculate two additional columns namely Processing Start and Processing End.
The result should look as follows:
With the Formulas:
One can see that the ProcessingStart of one entry should be the maximum of its IncommingDate and the ProcessingEnd of the previous entry.
How can I achieve this using SQL?
I have prepared an example query here:
WITH example AS
(
SELECT
1 AS id,
to_date ('01.01.2018 00:00:00','dd.MM.yyyy HH24:mi:ss') AS IncommingDate,
60 AS "Processing Time [sec.]"
FROM
dual
UNION ALL
SELECT
2,
to_date ('01.01.2018 00:05:00','dd.MM.yyyy HH24:mi:ss'),
60
FROM
dual
UNION ALL
SELECT
3,
to_date ('01.01.2018 00:05:30','dd.MM.yyyy HH24:mi:ss'),
60
FROM
dual
UNION ALL
SELECT
4,
to_date ('01.01.2018 00:10:00','dd.MM.yyyy HH24:mi:ss'),
60
FROM
dual
)
SELECT
*
FROM
example
Does anybody know a way to do this?
It looks like you need to use recursive subquery factoring:
with rcte (id, IncommingDate, ProcessingTime, ProcessingStart, ProcessingEnd) as (
select id,
IncommingDate,
ProcessingTime,
IncommingDate,
IncommingDate + (ProcessingTime/86400)
from example
where id = 1
union all
select e.id,
e.IncommingDate,
e.ProcessingTime,
greatest(e.IncommingDate, r.ProcessingEnd),
greatest(e.IncommingDate, r.ProcessingEnd) + (e.ProcessingTime/86400)
from rcte r
-- assumes IDs are the ordering criteria and are contiguous
join example e on e.id = r.id + 1
)
select * from rcte;
ID INCOMMINGDATE PROCESSINGTIME PROCESSINGSTART PROCESSINGEND
---------- ------------------- -------------- ------------------- -------------------
1 2018-01-01 00:00:00 60 2018-01-01 00:00:00 2018-01-01 00:01:00
2 2018-01-01 00:05:00 60 2018-01-01 00:05:00 2018-01-01 00:06:00
3 2018-01-01 00:05:30 60 2018-01-01 00:06:00 2018-01-01 00:07:00
4 2018-01-01 00:10:00 60 2018-01-01 00:10:00 2018-01-01 00:11:00
The anchor member is ID 1, and can do a simple calculation for that first step to get the start/end times.
The recursive member then finds the next original row and uses greatest() to decide whether to base its calculations on its own incoming time or on the previous end time.
This is assuming that the ordering is based on the IDs, and that they are contiguous. If that isn't how you are actually ordering then it's only a bit more complicated.
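One way the non-contiguous case might be handled is to number the rows first and recurse on that number instead of on the ID. The sketch below is an assumption-laden stand-in run on SQLite from Python, not Oracle: the IDs with gaps, the column names incoming and processing_secs, and the epoch-seconds arithmetic are all choices made for this example, and two-argument max() plays the role of greatest().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE example (id INTEGER, incoming TEXT, processing_secs INTEGER);
    -- IDs deliberately have gaps, so "id = previous id + 1" would not work.
    INSERT INTO example VALUES
        (10, '2018-01-01 00:00:00', 60),
        (20, '2018-01-01 00:05:00', 60),
        (35, '2018-01-01 00:05:30', 60),
        (70, '2018-01-01 00:10:00', 60);
""")

rows = conn.execute("""
    WITH RECURSIVE ordered AS (
        -- Assign a gap-free sequence number in processing order.
        SELECT id,
               CAST(strftime('%s', incoming) AS INTEGER) AS incoming_s,
               processing_secs,
               row_number() OVER (ORDER BY incoming, id) AS rn
        FROM example
    ),
    rcte (rn, id, start_s, end_s) AS (
        -- Anchor: the first row starts as soon as it arrives.
        SELECT rn, id, incoming_s, incoming_s + processing_secs
        FROM ordered
        WHERE rn = 1
        UNION ALL
        -- Step: start at the later of arrival and the previous end.
        SELECT o.rn, o.id,
               max(o.incoming_s, r.end_s),
               max(o.incoming_s, r.end_s) + o.processing_secs
        FROM rcte r
        JOIN ordered o ON o.rn = r.rn + 1
    )
    SELECT id,
           datetime(start_s, 'unixepoch') AS processing_start,
           datetime(end_s, 'unixepoch')   AS processing_end
    FROM rcte
    ORDER BY rn
""").fetchall()
for row in rows:
    print(row)
```

The start and end times match the output shown above; only the join condition changed from the ID to the generated row number.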

Teradata SQL: Determine how many accounts had status change in given month

Ok, so I have a table that looks something like this:
Acct_id Eff_dt Expr_dt Prod_cd Open_dt
-------------------------------------------------------
111 2012-05-01 2013-06-01 A 2012-05-01
111 2013-06-02 2014-03-08 A 2012-05-01
111 2014-03-09 9999-12-31 B 2012-05-01
222 2015-07-15 2015-11-11 A 2015-07-15
222 2015-11-12 2016-08-08 B 2015-07-15
222 2016-08-09 9999-12-31 A 2015-07-15
333 2016-01-01 2016-04-15 B 2016-01-01
333 2016-04-16 2016-08-08 B 2016-01-01
333 2016-08-09 9999-12-31 A 2016-01-01
444 2017-02-03 2017-05-15 A 2017-02-03
444 2017-05-16 2017-12-02 A 2017-02-03
444 2017-12-03 9999-12-31 B 2017-02-03
555 2017-12-12 9999-12-31 B 2017-12-12
There are many more columns that I'm not including as they're otherwise not relevant.
What I'm trying to determine is how many accounts had a change in Prod_cd in a given month, but then only in one direction (so from A > B in this example). Sometimes however an account was first opened as B, and then later changed to A. Or it was opened as A, changed to B, and moved back to A. I only want to know the current set of accounts where in a given month the Prod_cd changed from A to B.
Eff_dt is the date when a change was made to an account (could be any change, such as address change, name change, or what I'm looking for, product code change).
Expr_dt is the expiration date of that row, essentially the last day before a new change was made. When the date of that row is 9999-12-31, that's the most current row.
Open_dt is the date the account was created.
I created a query at first that was something like this:
select
count(distinct acct_id)
from table
where prod_cd = 'B'
and expr_dt = '9999-12-31'
and eff_dt between '2017-12-01' and '2017-12-31'
and open_dt < '2017-12-01'
But it's giving me results that don't look right. I want to specifically track the # of conversions that happened, but the count of accounts I'm getting seems way too high.
There is probably a way to create a more reliable query using window functions, but given that the Prod_cd changes can happen in multiple directions, I'm not sure how to write that query. Any help would be appreciated!
If you are specifically looking for the switch A --> B, then the simplest method is to use lag(). But, Teradata requires a slightly different formulation:
select count(distinct acct_id)
from (select t.*,
max(prod_cd) over (partition by acct_id order by eff_dt rows between 1 preceding and 1 preceding) as prev_prod_cd
from t
) t
where prod_cd = 'B' and prev_prod_cd = 'A' and
expr_dt = '9999-12-31' and
eff_dt between '2017-12-01' and '2017-12-31' and
open_dt < '2017-12-01';
I am guessing that the date conditions go in the outer query -- meaning that the lag() does not use them.
Similar to Gordon's answer, but using a supported window function (instead of LAG) and using Teradata's QUALIFY clause to do the lag-gy lookup:
SELECT DISTINCT acct_id
FROM mytable
QUALIFY
MAX(prod_cd) OVER (PARTITION BY acct_id ORDER BY eff_dt ASC ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) = 'A'
AND prod_cd = 'B'
AND expr_dt = '9999-12-31'
AND eff_dt between DATE '2013-01-01' and DATE '2017-12-31'
AND open_dt < DATE '2017-12-01'
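To check the previous-row logic against the sample rows from the question, here is a small sketch on an in-memory SQLite database via Python. SQLite does support lag(), so it is used directly in place of the Teradata "ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING" workaround; the table name t and the December 2017 window are taken from the question, everything else is illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (acct_id INTEGER, eff_dt TEXT, expr_dt TEXT,"
    " prod_cd TEXT, open_dt TEXT)"
)
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?, ?)", [
    (111, "2012-05-01", "2013-06-01", "A", "2012-05-01"),
    (111, "2013-06-02", "2014-03-08", "A", "2012-05-01"),
    (111, "2014-03-09", "9999-12-31", "B", "2012-05-01"),
    (222, "2015-07-15", "2015-11-11", "A", "2015-07-15"),
    (222, "2015-11-12", "2016-08-08", "B", "2015-07-15"),
    (222, "2016-08-09", "9999-12-31", "A", "2015-07-15"),
    (333, "2016-01-01", "2016-04-15", "B", "2016-01-01"),
    (333, "2016-04-16", "2016-08-08", "B", "2016-01-01"),
    (333, "2016-08-09", "9999-12-31", "A", "2016-01-01"),
    (444, "2017-02-03", "2017-05-15", "A", "2017-02-03"),
    (444, "2017-05-16", "2017-12-02", "A", "2017-02-03"),
    (444, "2017-12-03", "9999-12-31", "B", "2017-02-03"),
    (555, "2017-12-12", "9999-12-31", "B", "2017-12-12"),
])

# The previous product code is computed over all rows of the account first;
# the date filters are applied afterwards, in the outer query.
converted = [r[0] for r in conn.execute("""
    SELECT DISTINCT acct_id
    FROM (SELECT t.*,
                 lag(prod_cd) OVER (PARTITION BY acct_id ORDER BY eff_dt)
                     AS prev_prod_cd
          FROM t) x
    WHERE prod_cd = 'B'
      AND prev_prod_cd = 'A'
      AND expr_dt = '9999-12-31'
      AND eff_dt BETWEEN '2017-12-01' AND '2017-12-31'
      AND open_dt < '2017-12-01'
    ORDER BY acct_id
""")]
print(converted)
```

Account 555 is current on B with an effective date in December 2017 but has no previous row, so the prev_prod_cd = 'A' condition excludes it.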

what is the difference between setting date condition with extract date and date between d1 and d2 in sql

I have written two queries which I expected would give me the same data.
Query 1
select transaction, count(*)
from table
where create_date between to_Date('02/11/2017','MM/DD/YYYY') and to_date('02/17/2017','MM/DD/YYYY')
group by transaction
Query 2
select transaction, count(*)
from table
where extract(day from create_date) between 11 and 17
and extract(month from create_date)=2
and extract(year from create_date)=2017
group by transaction
Results from query 1
Transaction1 1155
Transaction2 333
Transaction3 5188
Results from query 2
Transaction1 1422
Transaction2 415
Transaction3 6155
Why am I getting different results?
The first query gets the values where the values are between 2017-02-11 00:00:00 and 2017-02-17 00:00:00.
The second query gets the values where the values are between 2017-02-11 00:00:00 and 2017-02-17 23:59:59.
So, if there are values between 2017-02-17 00:00:01 and 2017-02-17 23:59:59 then they will be included in the COUNT of the second query but not the first.
Try:
select transaction, count(*)
from table
where create_date >= DATE '2017-02-11'
AND create_date < DATE '2017-02-18'
group by transaction
or
select transaction, count(*)
from table
where TRUNC( create_date ) BETWEEN DATE '2017-02-11' AND DATE '2017-02-17'
group by transaction
(Note: the latter query will not use indexes on create_date and would need a function-based index on TRUNC( create_date ) instead.)
TO_DATE converts a string to a datetime, so the BETWEEN in your first query compares down to HH:MI:SS, with both bounds at 00:00:00. To get the same result as the second query, the upper bound of the first query has to account for the time of day.
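The effect is easy to reproduce on a handful of rows. The sketch below uses Python's sqlite3 with text timestamps as a stand-in for Oracle DATE values; the table name tx and the sample rows are invented for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tx (create_date TEXT);
    INSERT INTO tx VALUES
        ('2017-02-11 00:00:00'),
        ('2017-02-14 12:30:00'),
        ('2017-02-17 00:00:00'),
        ('2017-02-17 15:45:00'),  -- on the last day, after midnight
        ('2017-02-18 00:00:00');
""")


def count(where):
    """Count rows of tx matching the given WHERE clause."""
    return conn.execute(f"SELECT count(*) FROM tx WHERE {where}").fetchone()[0]


# Upper bound at midnight: rows later on 2017-02-17 are missed.
between_midnights = count(
    "create_date BETWEEN '2017-02-11 00:00:00' AND '2017-02-17 00:00:00'")

# Half-open range up to the next day: all of 2017-02-17 is included.
half_open = count(
    "create_date >= '2017-02-11 00:00:00' AND create_date < '2017-02-18 00:00:00'")

print(between_midnights, half_open)
```

The row at 15:45 on the 17th accounts for the difference between the two counts, which is the same kind of gap seen between Query 1 and Query 2.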