subquery calculate days between dates - sql

Sub query, SQL, Oracle
I'm new to subqueries and hoping to get some assistance. My thought was that the subquery would run first and the outer query would then execute based on the subquery's filter of trans_code = 'ABC'. The query works, but it pulls dates from all transaction codes ('ABC', 'DEF', etc.), not just 'ABC'.
The end goal is to calculate the number of days between dates.
The table structure is:
acct_num effective_date
1234 01/01/2020
1234 02/01/2020
1234 03/01/2020
1234 04/01/2021
I want to execute a query to look like this:
account Effective_Date Effective_Date_2 Days_Diff
1234 01/01/2020 02/01/2020 31
1234 02/01/2020 03/01/2020 29
1234 03/01/2020 04/01/2021 396
1234 04/01/2021 0
Query:
SELECT t3.acct_num,
t3.trans_code,
t3.effective_date,
MIN (t2.effective_date) AS effective_date2,
MIN (t2.effective_date) - t3.effective_date AS days_diff
FROM (SELECT t1.acct_num, t1.trans_code, t1.effective_date
FROM lawd.trans t1
WHERE t1.trans_code = 'ABC') t3
LEFT JOIN lawd.trans t2 ON t3.acct_num = t2.acct_num
WHERE t3.acct_num = '1234' AND t2.effective_date > t3.effective_date
GROUP BY t3.acct_num, t3.effective_date, t3.trans_code
ORDER BY t3.effective_date asc
TIA!

Use lead():
select t.*,
lead(effective_date) over (partition by acct_num order by effective_date) as next_effective_date,
(lead(effective_date) over (partition by acct_num order by effective_date) - effective_date) as diff
from lawd.trans t
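A minimal sketch of how lead() could be combined with the trans_code filter from the question, assuming Oracle syntax. The inline sample rows (including the 'DEF' row) are made up for illustration, and NVL is only there to show 0 on the last row, as in the desired output.

```sql
-- Sketch only: hypothetical sample rows are inlined so the query is self-contained.
WITH trans (acct_num, trans_code, effective_date) AS (
  SELECT '1234', 'ABC', DATE '2020-01-01' FROM dual UNION ALL
  SELECT '1234', 'ABC', DATE '2020-02-01' FROM dual UNION ALL
  SELECT '1234', 'DEF', DATE '2020-02-15' FROM dual UNION ALL  -- should be ignored
  SELECT '1234', 'ABC', DATE '2020-03-01' FROM dual UNION ALL
  SELECT '1234', 'ABC', DATE '2021-04-01' FROM dual
)
SELECT acct_num,
       effective_date,
       -- next 'ABC' date for the same account; NULL on the last row
       LEAD(effective_date) OVER (PARTITION BY acct_num ORDER BY effective_date) AS effective_date_2,
       -- date - date yields a number of days in Oracle; NVL turns the trailing NULL into 0
       NVL(LEAD(effective_date) OVER (PARTITION BY acct_num ORDER BY effective_date)
           - effective_date, 0) AS days_diff
FROM   trans
WHERE  trans_code = 'ABC'   -- applied before the window function is evaluated
ORDER  BY acct_num, effective_date;
```

Because the WHERE clause is evaluated before the window function, rows with other transaction codes never take part in the lead() calculation.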

Related

SQL - Get MIN of a column for each key in other table

I have two tables as such:
ORDERS
Date     TransactID  COL3
2021-06  1234        4
2021-09  1238        8
Agg
Date     User  TransactID
2021-06  3333  1234
2021-03  3333  XXXX
2021-02  3333  XXXX
2021-09  4444  1238
2021-05  4444  XXXX
2021-01  4444  XXXX
In AGG, a User can have many transactions; the ORDERS table is just a subset of it.
For each TransactID in Orders, I need to go into the Agg table and get the MIN date for the User associated with the TransactID.
Then, I need to calculate the date difference between the ORDERS.Date and the minimum AGG.DATE. The result is stored in SDP.COL3. COL3 can basically be described as Days Since First Transaction.
I have never done a SQL problem that is this multistep, and need some guidance. Any help would be greatly appreciated!
If I've got it right
SELECT SDP.TXN_ID, sdp.dt, datediff(sdp.dt, min(a1.DT)) diff
FROM SDP
JOIN AGG a1 on a1.UserID =
(SELECT a2.UserID
FROM AGG a2
WHERE SDP.TXN_ID = a2.TXN_ID
ORDER BY a2.UserID
limit 1)
GROUP BY SDP.TXN_ID, sdp.dt
You can omit
ORDER BY a2.UserID
limit 1
provided each transaction always belongs to a single user.
The fiddle
Based on your SQL Fiddle (http://sqlfiddle.com/#!9/101497/1), this should get you started:
SELECT TXN_ID, DT, USERID
FROM (SELECT ROW_NUMBER() OVER (PARTITION BY sdp.TXN_ID ORDER BY sdp.DT ASC) AS [index],
sdp.TXN_ID,
sdp.DT,
agg.USERID
FROM sdp
LEFT JOIN agg ON sdp.TXN_ID = agg.TXN_ID) A
WHERE [index] = 1
For more information you should look at
https://www.sqlshack.com/sql-partition-by-clause-overview/
https://www.sqltutorial.org/sql-window-functions/sql-partition-by/
https://learnsql.com/blog/partition-by-with-over-sql/
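One possible way to put the steps from the question together (find the user for each transaction, find that user's earliest date, take the difference) is sketched below in MySQL syntax. The table and column names SDP, AGG, TXN_ID, DT and UserID follow the first answer; the full dates and the placeholder transaction IDs (A001 and so on) are assumptions, since the question only shows year-month values and XXXX.

```sql
-- Hypothetical sample data: first-of-month dates and placeholder IDs are assumed.
CREATE TABLE SDP (DT DATE, TXN_ID VARCHAR(10), COL3 INT);
CREATE TABLE AGG (DT DATE, UserID INT, TXN_ID VARCHAR(10));

INSERT INTO SDP (DT, TXN_ID) VALUES
  ('2021-06-01', '1234'),
  ('2021-09-01', '1238');

INSERT INTO AGG (DT, UserID, TXN_ID) VALUES
  ('2021-06-01', 3333, '1234'),
  ('2021-03-01', 3333, 'A001'),
  ('2021-02-01', 3333, 'A002'),
  ('2021-09-01', 4444, '1238'),
  ('2021-05-01', 4444, 'A003'),
  ('2021-01-01', 4444, 'A004');

-- Step 1: a maps each SDP transaction to its user.
-- Step 2: f holds the earliest AGG date per user.
-- Step 3: DATEDIFF(later, earlier) gives the days since the first transaction.
SELECT s.TXN_ID,
       s.DT,
       DATEDIFF(s.DT, f.first_dt) AS days_since_first
FROM SDP s
JOIN AGG a ON a.TXN_ID = s.TXN_ID
JOIN (SELECT UserID, MIN(DT) AS first_dt
      FROM AGG
      GROUP BY UserID) f ON f.UserID = a.UserID;
```

This assumes each TXN_ID appears only once in AGG; if it can repeat, the join on a would duplicate rows and would need a DISTINCT or a GROUP BY.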

How to calculate the average monthly number of some action in some period in Teradata SQL?

I have a table in Teradata SQL like the one below:
ID trans_date
------------------------
123 | 2021-01-01
887 | 2021-01-15
123 | 2021-02-10
45 | 2021-03-11
789 | 2021-10-01
45 | 2021-09-02
I need to calculate the average monthly number of transactions made by customers in the period between 2021-01-01 and 2021-09-01, so the client with "ID" = 789 will not be counted because he made his transaction later.
In the first month (01) there were 2 transactions
In the second month there was 1 transaction
In the third month there was 1 transaction
In the ninth month there was 1 transaction
So the result should be (2+1+1+1) / 4 = 1.25, shouldn't it?
How can I calculate this in Teradata SQL? Of course, this is only a sample of my data.
SELECT ID, AVG(txns) FROM
(SELECT ID, TRUNC(trans_date,'MON') as mth, COUNT(*) as txns
FROM mytable
-- WHERE condition matches the question but likely want to
-- use end date 2021-09-30 or use mth instead of trans_date
WHERE trans_date BETWEEN date'2021-01-01' and date'2021-09-01'
GROUP BY id, mth) mth_txn
GROUP BY id;
Your logic translated to SQL:
--(2+1+1+1) / 4
SELECT id, COUNT(*) / COUNT(DISTINCT TRUNC(trans_date,'MON')) AS avg_tx
FROM mytable
WHERE trans_date BETWEEN date'2021-01-01' and date'2021-09-01'
GROUP BY id;
You should compare it to Fred's answer to see which is more efficient on your data.
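Both queries above return one figure per ID, whereas the arithmetic in the question, (2+1+1+1) / 4, is a single overall figure. A rough sketch of that overall variant follows, assuming Teradata syntax. Two assumptions are made here that are not in the original answers: the end date is moved to 2021-09-30 so that the 2021-09-02 row is included (as the question's arithmetic implies), and the count is cast to FLOAT because dividing one integer count by another would otherwise truncate the result.

```sql
-- Hypothetical volatile table holding the sample rows from the question.
CREATE VOLATILE TABLE mytable (ID INTEGER, trans_date DATE) ON COMMIT PRESERVE ROWS;

INSERT INTO mytable VALUES (123, DATE '2021-01-01');
INSERT INTO mytable VALUES (887, DATE '2021-01-15');
INSERT INTO mytable VALUES (123, DATE '2021-02-10');
INSERT INTO mytable VALUES (45,  DATE '2021-03-11');
INSERT INTO mytable VALUES (789, DATE '2021-10-01');
INSERT INTO mytable VALUES (45,  DATE '2021-09-02');

-- Total transactions in the period divided by the number of months
-- that had at least one transaction; no GROUP BY, so one row comes back.
SELECT CAST(COUNT(*) AS FLOAT)
       / COUNT(DISTINCT TRUNC(trans_date, 'MON')) AS avg_monthly_tx
FROM mytable
WHERE trans_date BETWEEN DATE '2021-01-01' AND DATE '2021-09-30';
```

Note that months with no transactions at all are not part of the denominator, which matches the question's calculation but may not be what you want for a true monthly average over nine months.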

Join records only on first match

I'm trying to join two tables. I only want the first matching row to be joined; the others have to be null.
One of the tables contains daily records per User and the second table contains the goal for each user and day.
The joined result table should only join the first occurrence of User and Day and set the others to null. The Goal in the joined table can be interpreted as DailyGoal.
Example:
Table1
Id  Day         User  Value
===========================
01  01/01/2020  Bob   100
02  01/01/2020  Bob   150
03  01/01/2020  Bob   50
04  02/01/2020  Carl  200
05  02/01/2020  Carl  30
Table2
Id  Day         User  Goal
==========================
01  01/01/2020  Bob   300
02  02/01/2020  Carl  170
ResultTable
Day User Value Goal
============================================
01/01/2020 Bob 100 300
01/01/2020 Bob 150 (null)
01/01/2020 Bob 50 (null)
02/01/2020 Carl 200 170
02/01/2020 Carl 30 (null)
I tried TOP 1, DISTINCT, and subqueries, but I can't find a way to do it. Is this possible?
One option uses window functions:
select t1.*, t2.goal
from (
select t1.*,
row_number() over(partition by day, user order by id) as rn
from table1 t1
) t1
left join table2 t2 on t2.day = t1.day and t2.user = t1.user and t1.rn = 1
A case expression is even simpler:
select t1.*,
case when row_number() over(partition by t1.day, t1.user order by t1.id) = 1
then t2.goal
end as goal
from table1 t1
left join table2 t2 on t2.day = t1.day and t2.user = t1.user

Select Most Recent Entry in SQL

I'm trying to select the most recent non zero entry from my data set in SQL. Most examples of this are satisfied with returning only the date and the group by variables, but I would also like to return the relevant Value. For example:
ID Date Value
----------------------------
001 2014-10-01 32
001 2014-10-05 10
001 2014-10-17 0
002 2014-10-03 17
002 2014-10-20 60
003 2014-09-30 90
003 2014-10-10 7
004 2014-10-06 150
005 2014-10-17 0
005 2014-10-18 9
Using
SELECT ID, MAX(Date) AS MDate FROM Table WHERE Value > 0 GROUP BY ID
Returns:
ID Date
-------------------
001 2014-10-05
002 2014-10-20
003 2014-10-10
004 2014-10-06
005 2014-10-18
But whenever I try to include Value as one of the selected variables, SQLServer results in an error:
"Column 'Value' is invalid in the select list because it is not
contained in either an aggregate function or the GROUP BY clause."
My desired result would be:
ID Date Value
----------------------------
001 2014-10-05 10
002 2014-10-20 60
003 2014-10-10 7
004 2014-10-06 150
005 2014-10-18 9
One solution I have thought of would be to look up the results back in the original Table and return the Value that corresponds to the relevant ID & Date (I have already trimmed down and so I know these are unique), but this seems to me like a messy solution. Any help on this would be appreciated.
NOTE: I do not want to group by Value as this is the result I am trying to pull out in the end (i.e. for each ID, I want the most recent Value). Further Example:
ID Date Value
----------------------------
001 2014-10-05 10
001 2014-10-06 10
001 2014-10-10 10
001 2014-10-12 8
001 2014-10-18 0
Here, I only want the last non zero entry. (001, 2014-10-12, 8)
SELECT ID, MAX(Date) AS MDate, Value FROM Table WHERE Value > 0 GROUP BY ID, Value
Would return:
ID Date Value
----------------------------
001 2014-10-10 10
001 2014-10-12 8
This can also be done using a window function, which is very often faster than a join on a grouped query:
select id, date, value
from (
select id,
date,
value,
row_number() over (partition by id order by date desc) as rn
from the_table
where value > 0
) t
where rn = 1
order by id;
Assuming you don't have repeated dates for the same ID in the table, this should work:
SELECT A.ID, A.Date, A.Value
FROM
T1 AS A
INNER JOIN (SELECT ID,MAX(Date) AS Date FROM T1 WHERE Value > 0 GROUP BY ID) AS B
ON A.ID = B.ID AND A.Date = B.Date
select a.id, a.date, a.value from Table1 a inner join (
select id, max(date) mydate from table1
where Value>0 group by ID) b on a.ID=b.ID and a.Date=b.mydate
Using a subquery:
SELECT ID, Date AS MDate, VALUE
FROM table t1
where date = (Select max(date)
from table t2
where Value >0
and t1.id = t2.id
)
The answers provided are perfectly adequate, but here is one using a CTE:
;WITH cteTable
AS
(
SELECT
Table.ID [ID], MAX(Date) [MaxDate]
FROM
Table
WHERE
Table.Value > 0
GROUP BY
Table.ID
)
SELECT
cteTable.ID, cteTable.MaxDate, Table.Value
FROM
Table INNER JOIN cteTable ON (Table.ID = cteTable.ID AND Table.Date = cteTable.MaxDate)
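To check the "last non-zero entry" requirement against the further example in the question, here is a small self-contained sketch in SQL Server syntax (2008 or later, because of the date type and the multi-row VALUES). The table variable @t is hypothetical; the rows are the ones from the question, and the column names are bracketed because Date and Value can clash with keywords.

```sql
-- Hypothetical table variable loaded with the "further example" rows.
DECLARE @t TABLE (ID char(3), [Date] date, [Value] int);

INSERT INTO @t (ID, [Date], [Value]) VALUES
  ('001', '2014-10-05', 10),
  ('001', '2014-10-06', 10),
  ('001', '2014-10-10', 10),
  ('001', '2014-10-12', 8),
  ('001', '2014-10-18', 0);

-- Zero rows are filtered out first, then the newest remaining row per ID is kept.
SELECT ID, [Date], [Value]
FROM (
    SELECT ID, [Date], [Value],
           ROW_NUMBER() OVER (PARTITION BY ID ORDER BY [Date] DESC) AS rn
    FROM @t
    WHERE [Value] > 0
) AS ranked
WHERE rn = 1;
-- returns the single row (001, 2014-10-12, 8)
```

Filtering on Value inside the derived table matters: if the filter were applied after numbering, the 2014-10-18 row with Value 0 would be ranked first and the ID would drop out of the result entirely.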

Select info from table where row has max date

My table looks something like this:
group date cash checks
1 1/1/2013 0 0
2 1/1/2013 0 800
1 1/3/2013 0 700
3 1/1/2013 0 600
1 1/2/2013 0 400
3 1/5/2013 0 200
-- Do not need cash just demonstrating that table has more information in it
I want to get, for each unique group, the row where date is max and checks is greater than 0. So the return would look something like:
group date checks
2 1/1/2013 800
1 1/3/2013 700
3 1/5/2013 200
attempted code:
SELECT group,MAX(date),checks
FROM table
WHERE checks>0
GROUP BY group
ORDER BY group DESC
The problem with that, though, is that it gives me all the dates and checks rather than just the max-date row.
I'm using MS SQL Server 2005.
SELECT group,MAX(date) as max_date
FROM table
WHERE checks>0
GROUP BY group
That works to get the max date. Join it back to your data to get the other columns:
Select t.group, a.max_date, t.checks
from table t
inner join
(SELECT group,MAX(date) as max_date
FROM table
WHERE checks>0
GROUP BY group)a
on a.group = t.group and a.max_date = t.date
The inner join acts as the filter that keeps only the max-date record.
FYI, your column names are horrid, don't use reserved words for columns (group, date, table).
You can use a window MAX() like this:
SELECT
*,
max_date = MAX(date) OVER (PARTITION BY group)
FROM table
to get max dates per group alongside other data:
group date cash checks max_date
----- -------- ---- ------ --------
1 1/1/2013 0 0 1/3/2013
2 1/1/2013 0 800 1/1/2013
1 1/3/2013 0 700 1/3/2013
3 1/1/2013 0 600 1/5/2013
1 1/2/2013 0 400 1/3/2013
3 1/5/2013 0 200 1/5/2013
Using the above output as a derived table, you can then get only rows where date matches max_date:
SELECT
group,
date,
checks
FROM (
SELECT
*,
max_date = MAX(date) OVER (PARTITION BY group)
FROM table
) AS s
WHERE date = max_date
;
to get the desired result.
Basically, this is similar to @Twelfth's suggestion but avoids a join and may thus be more efficient.
You can try the method at SQL Fiddle.
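Since group, date and table are reserved words, the queries in this thread only parse in SQL Server if those identifiers are quoted. The following is a sketch of the same idea with brackets, written to stay within SQL Server 2005 features (datetime instead of date, INSERT ... SELECT ... UNION ALL instead of multi-row VALUES). The table variable @t is hypothetical, and ROW_NUMBER() with a checks > 0 filter is swapped in for the windowed MAX() so that rows with zero checks can never be picked.

```sql
-- Hypothetical table variable with the sample rows from the question.
DECLARE @t TABLE ([group] int, [date] datetime, cash int, checks int);

INSERT INTO @t ([group], [date], cash, checks)
SELECT 1, '20130101', 0, 0   UNION ALL
SELECT 2, '20130101', 0, 800 UNION ALL
SELECT 1, '20130103', 0, 700 UNION ALL
SELECT 3, '20130101', 0, 600 UNION ALL
SELECT 1, '20130102', 0, 400 UNION ALL
SELECT 3, '20130105', 0, 200;

-- Number the rows of each group from newest to oldest, ignoring rows
-- without checks, and keep only the newest one.
SELECT [group], [date], checks
FROM (
    SELECT [group], [date], checks,
           ROW_NUMBER() OVER (PARTITION BY [group] ORDER BY [date] DESC) AS rn
    FROM @t
    WHERE checks > 0
) AS ranked
WHERE rn = 1
ORDER BY [group];
```

If two rows in a group can share the same latest date, ROW_NUMBER() picks one of them arbitrarily; RANK() would return both.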
Using IN can have a performance impact. Joining two subqueries will not have the same performance impact and can be accomplished like this:
SELECT *
FROM (SELECT msisdn
,callid
,Change_color
,play_file_name
,date_played
FROM insert_log
WHERE play_file_name NOT IN('Prompt1','Conclusion_Prompt_1','silent')
ORDER BY callid ASC) t1
JOIN (SELECT MAX(date_played) AS date_played
FROM insert_log GROUP BY callid) t2
ON t1.date_played = t2.date_played
SELECT distinct
group,
max_date = MAX(date) OVER (PARTITION BY group), checks
FROM table
Should work.