Join Running Sum in TSQL - sql

I would like to join a running sum until a specific time point. E.g. I have two tables
Table A
TimestampOfInterest
2001-01-01
2001-02-01
2001-03-01
Table B
Timestamp Credits
2001-01-01 1
2001-01-05 1
2001-02-10 1
2001-03-15 1
Joining B -> A should lead to
TimestampOfInterest Credits
2001-01-01 0
2001-02-01 2
2001-03-01 3
That is the sum of credits until the given TimestampOfInterest.
Can someone help?
Lazloo

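I'm not sure you need a join. A correlated subquery in the select list does this:
Select A.TimestampOfInterest,
-- ISNULL turns the NULL that SUM returns when no earlier rows exist into 0
ISNULL((Select SUM(B.Credits)
from TableB B
-- if both tables also carry a Category column, add: and B.Category = A.Category
where B.Timestamp < A.TimestampOfInterest), 0) Credits
From TableA A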
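If you do want it written as a join, here is a minimal sketch of the same running sum using LEFT JOIN plus GROUP BY. The table names follow the answer above; the column name EventDate is my stand-in for the question's Timestamp column (an assumption, chosen to avoid clashing with the T-SQL type name), and the sample rows are the ones from the question.

```sql
-- Sample data from the question; EventDate is a hypothetical stand-in for "Timestamp".
CREATE TABLE TableA (TimestampOfInterest DATE);
CREATE TABLE TableB (EventDate DATE, Credits INT);

INSERT INTO TableA (TimestampOfInterest) VALUES
  ('2001-01-01'), ('2001-02-01'), ('2001-03-01');
INSERT INTO TableB (EventDate, Credits) VALUES
  ('2001-01-01', 1), ('2001-01-05', 1), ('2001-02-10', 1), ('2001-03-15', 1);

-- Each row of A is paired with every B row strictly before it, then summed.
-- The LEFT JOIN keeps A rows that have no earlier B rows; COALESCE turns their NULL sum into 0.
SELECT a.TimestampOfInterest,
       COALESCE(SUM(b.Credits), 0) AS Credits
FROM TableA AS a
LEFT JOIN TableB AS b
       ON b.EventDate < a.TimestampOfInterest
GROUP BY a.TimestampOfInterest
ORDER BY a.TimestampOfInterest;
-- 2001-01-01 | 0
-- 2001-02-01 | 2
-- 2001-03-01 | 3
```

The strict < matches the expected output in the question, where the credit dated 2001-01-01 is not counted for the 2001-01-01 row.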

Related

SQL For each value from a table, execute a query on another table depending on that value

I have two simple tables:
Table #1 apples_consumption
report_date  apples_consumed
2022-01-01   5
2022-02-01   7
2022-03-01   2
Table #2 hotel_visitors
visitor_id  check_in_date  check_out_date
1           2021-12-01     2022-02-01
2           2022-01-01     NULL
3           2022-02-01     NULL
4           2022-03-01     NULL
My purpose is to get a table which shows the ratio between number of visitors in the hotel to number of apples consumed by that time.
For the example above the desired query output should look like this:
report_date  visitors_count              apples_consumed
2022-01-01   2 -->(visitors #1, #2)      5
2022-02-01   3 -->(visitors #1, #2, #3)  7
2022-03-01   3 -->(visitors #2, #3, #4)  2
If I were to solve this procedurally, I would go over each report_date in the apples_consumption table and count the visitors whose check_in_date is on or before that report_date and whose check_out_date is either NULL or on or after that report_date.
I came up with this query:
select
ac.report_date,
ac.apples_consumed,
(
select count(*)
from hotel_visitors hv
where
hv.check_in_date <= ac.report_date and
(hv.check_out_date is null or hv.check_out_date >= ac.report_date)
) as visitors_count
from
apples_consumption ac
order by
ac.report_date
The query above works but it is very inefficient: its execution time is relatively long on larger datasets, and the way it is written it runs the inner count(*) query once for every row of the outer apples_consumption table.
I am looking for a more efficient way to achieve this result and your help will be highly appreciated!
It is very rarely a good idea to put a subselect in your select list.
Join your tables and then use an aggregate count:
select a.report_date, count(v.visitor_id) as visitors_count, a.apples_consumed
from apples_consumption a
left join hotel_visitors v
on a.report_date
between v.check_in_date
and coalesce(v.check_out_date, '9999-12-31')
group by a.report_date, a.apples_consumed
order by a.report_date;
db<>fiddle here
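For anyone who wants to try this without the fiddle, here is a small self-contained sketch built from the sample rows in the question. It is the same join-and-count idea, but it writes the open-ended stay as an explicit IS NULL check instead of coalescing to '9999-12-31'; that variation is my own and not part of the answer above.

```sql
-- Sample data copied from the question.
CREATE TABLE apples_consumption (report_date DATE, apples_consumed INT);
CREATE TABLE hotel_visitors (visitor_id INT, check_in_date DATE, check_out_date DATE);

INSERT INTO apples_consumption (report_date, apples_consumed) VALUES
  ('2022-01-01', 5), ('2022-02-01', 7), ('2022-03-01', 2);
INSERT INTO hotel_visitors (visitor_id, check_in_date, check_out_date) VALUES
  (1, '2021-12-01', '2022-02-01'),
  (2, '2022-01-01', NULL),
  (3, '2022-02-01', NULL),
  (4, '2022-03-01', NULL);

-- A visitor counts for a report date if they checked in on or before it
-- and have either not checked out yet or checked out on or after it.
-- COUNT(v.visitor_id) ignores the NULLs produced by the LEFT JOIN,
-- so a report date with no visitors would show 0 rather than 1.
SELECT a.report_date,
       COUNT(v.visitor_id) AS visitors_count,
       a.apples_consumed
FROM apples_consumption AS a
LEFT JOIN hotel_visitors AS v
       ON v.check_in_date <= a.report_date
      AND (v.check_out_date IS NULL OR v.check_out_date >= a.report_date)
GROUP BY a.report_date, a.apples_consumed
ORDER BY a.report_date;
-- 2022-01-01 | 2 | 5
-- 2022-02-01 | 3 | 7
-- 2022-03-01 | 3 | 2
```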

CASE statement in the WHERE clause, with further conditioning after THEN

I have a table of invoices and one of contracts that invoices are linked to. However, not every invoice is linked to a contract. I want to return all invoices linked to contracts which are within a timeframe (Starting before 2021-03-01), and also the invoices with no contracts linked at all. I believe this is something like: contract IS null OR CASE WHEN contract IS NOT NULL THEN (condition on timestamp), but I don't know how to write it. Note that the actual conditioning will be done on multiple columns, so I am looking for the general form of multiple sub-conditions under the conditions in the WHERE clause.
Example:
Invoice Table
InvoiceID  ContractID
1          NULL
2          1
3          NULL
4          2
5          3
6          4
Contract Table
ContractID  Contract Start Timestamp
1           2021-01-01 00:00:00
2           2021-02-01 00:00:00
3           2021-03-02 00:00:00
4           2021-05-01 00:00:00
Desired Result
InvoiceID  ContractID
1          NULL
2          1
3          NULL
4          2
Maybe a simple left join like this can help with the filtering:
select
I.*
from Invoice I
left outer join Contract C -- left join gets even the NULL contractid invoices
on C.ContractID= I.ContractID
where
C.[Contract Start Timestamp] IS NULL
OR C.[Contract Start Timestamp]< '2021-03-01'
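On the "general form" part of the question: no CASE is needed, since plain boolean logic with parentheses covers it. Here is a rough sketch on the sample data. The column name ContractStart and the DATE type are my simplifications of the question's "Contract Start Timestamp" column, so treat them as assumptions. It also tests I.ContractID for NULL directly, which is a slight variation on the query above.

```sql
-- Sample data from the question; ContractStart is a hypothetical, simplified column name.
CREATE TABLE Invoice (InvoiceID INT, ContractID INT);
CREATE TABLE Contract (ContractID INT, ContractStart DATE);

INSERT INTO Invoice (InvoiceID, ContractID) VALUES
  (1, NULL), (2, 1), (3, NULL), (4, 2), (5, 3), (6, 4);
INSERT INTO Contract (ContractID, ContractStart) VALUES
  (1, '2021-01-01'), (2, '2021-02-01'), (3, '2021-03-02'), (4, '2021-05-01');

-- General shape: <no contract> OR (<condition 1> AND <condition 2> AND ...).
-- Further per-contract conditions would be ANDed inside the parentheses.
SELECT i.InvoiceID, i.ContractID
FROM Invoice AS i
LEFT JOIN Contract AS c
       ON c.ContractID = i.ContractID
WHERE i.ContractID IS NULL
   OR (c.ContractStart < '2021-03-01')
ORDER BY i.InvoiceID;
-- 1 | NULL
-- 2 | 1
-- 3 | NULL
-- 4 | 2
```

Checking i.ContractID rather than the contract's start column means an invoice that points at a missing contract row is not mistaken for an invoice without a contract.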

SQL get count(*) before several dates

It sounds so simple but I can't figure it out. I have 2 tables:
TABLE 1 contains a list of projects with the dates at which they were approved.
PROJECT  APPROVAL_DATE
A        12/06/2019
A        01/09/2020
A        05/08/2021
A        07/12/2021
B        01/05/2018
B        06/09/2019
B        12/23/2020
TABLE 2 contains dates when tests were performed on these projects.
PROJECT  TEST_DATE
A        01/06/2019
A        01/07/2019
A        02/21/2019
...      ...
A        06/22/2021
...      ...
B        01/12/2021
...      ...
THIS IS WHAT I NEED: For each project, I want to count the total number of tests prior to each APPROVAL_DATE, so I would have this:
PROJECT  APPROVAL_DATE  TOTAL_TESTS_BEFORE_APPROVAL_DATE
A        12/06/2019     1264
A        01/09/2020     1568
A        05/08/2021     1826
A        07/12/2021     2209
B        01/05/2018     560
B        06/09/2019     790
B        12/23/2020     1560
Here is how you can do it using a left join:
select t1.project, t1.APPROVAL_DATE, count(t2.test_date) TOTAL_TESTS_BEFORE_APPROVAL_DATE
from table1 t1
left join table2 t2
on t1.project = t2.project
and t1.APPROVAL_DATE > t2.TEST_DATE
group by t1.project, t1.APPROVAL_DATE
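To see what the join does, here is a small sketch on made-up data. The full test list in the question is elided, so the rows below are hypothetical (a handful of dates, written in ISO format) and the counts are not the ones from the question. The table and column names follow the query above.

```sql
-- Hypothetical miniature of the two tables; not the real data from the question.
CREATE TABLE table1 (project VARCHAR(10), approval_date DATE);
CREATE TABLE table2 (project VARCHAR(10), test_date DATE);

INSERT INTO table1 (project, approval_date) VALUES
  ('A', '2019-12-06'), ('A', '2020-01-09'), ('B', '2018-01-05');
INSERT INTO table2 (project, test_date) VALUES
  ('A', '2019-01-06'), ('A', '2019-01-07'), ('A', '2019-02-21'),
  ('A', '2019-12-20'), ('B', '2021-01-12');

-- Each approval row is matched with every earlier test of the same project.
-- COUNT(t2.test_date) skips the NULL left behind when nothing matches,
-- so an approval with no earlier tests reports 0.
SELECT t1.project,
       t1.approval_date,
       COUNT(t2.test_date) AS total_tests_before_approval_date
FROM table1 AS t1
LEFT JOIN table2 AS t2
       ON t2.project = t1.project
      AND t2.test_date < t1.approval_date
GROUP BY t1.project, t1.approval_date
ORDER BY t1.project, t1.approval_date;
-- A | 2019-12-06 | 3
-- A | 2020-01-09 | 4
-- B | 2018-01-05 | 0
```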

Remove Transaction using ID and Date range in Redshift SQL

I want to create a query in Redshift that removes transaction items (Table A) based on their validity dates (Table B).
Let's call them Table A and Table B.
Table A contains all item transactions.
Table A
Item_Code Date
1. I0001 2019-12-01
2. I0002 2019-12-02
3. I0001 2020-01-01
4. I0003 2020-01-01
Table B contains each item's validity dates.
Table B
Item_Code Valid_From Valid_To
1. I0001 2019-01-01 2019-12-31
2. I0002 2019-01-01 2019-12-31
3. I0003 2019-01-01 2019-12-31
and my expected output will be
Item_Code Date
1. I0001 2019-12-01
2. I0002 2019-12-02
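This select returns the transactions that fall outside their item's validity window, i.e. the rows that will be removed:
Select * from TableA a where not exists(select 1 from TableB b where b.Item_Code = a.Item_Code
and a.Date between b.Valid_From and b.Valid_To)
And once you are sure that's what you want, delete them with
delete from TableA where not exists(select 1 from TableB b where b.Item_Code = TableA.Item_Code
and TableA.Date between b.Valid_From and b.Valid_To)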
I would join both tables and put the validity check in the join condition.
This returns only the valid items; reverse the validity condition to find which items are invalid.
SELECT A.ITEM_CODE, A.DATE
FROM TableA A
INNER JOIN TableB B
ON A.ITEM_CODE = B.ITEM_CODE AND A.DATE >= B.Valid_From AND A.DATE <= B.Valid_To
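A quick way to check the delete against the sample rows from the question is sketched below. The VARCHAR(10) length is an assumption of mine, and I have only reasoned through this on the sample data, not run it on an actual Redshift cluster.

```sql
-- Sample data from the question.
CREATE TABLE TableA (Item_Code VARCHAR(10), Date DATE);
CREATE TABLE TableB (Item_Code VARCHAR(10), Valid_From DATE, Valid_To DATE);

INSERT INTO TableA (Item_Code, Date) VALUES
  ('I0001', '2019-12-01'), ('I0002', '2019-12-02'),
  ('I0001', '2020-01-01'), ('I0003', '2020-01-01');
INSERT INTO TableB (Item_Code, Valid_From, Valid_To) VALUES
  ('I0001', '2019-01-01', '2019-12-31'),
  ('I0002', '2019-01-01', '2019-12-31'),
  ('I0003', '2019-01-01', '2019-12-31');

-- Remove every transaction that has no validity row for the same item covering its date.
DELETE FROM TableA
WHERE NOT EXISTS (
    SELECT 1
    FROM TableB AS b
    WHERE b.Item_Code = TableA.Item_Code
      AND TableA.Date BETWEEN b.Valid_From AND b.Valid_To
);

-- What is left should match the expected output in the question.
SELECT Item_Code, Date
FROM TableA
ORDER BY Date, Item_Code;
-- I0001 | 2019-12-01
-- I0002 | 2019-12-02
```

Matching on Item_Code inside the subquery matters: without it, a transaction would count as valid whenever any item's window covered its date.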

Query to complete empty spaces on table in Postgres

I need to create a view that given the first two tables (A and B), I get the result like in table C.
Basically I need to fill empty spaces on table B, using the first previous value available like shown below.
I've accomplished this using two loops in a procedure, but I'd like to try a solution using just select statements.
table_a
date
1/1/2013
2/1/2013
3/1/2013
4/1/2013
5/1/2013
6/1/2013
7/1/2013
8/1/2013
9/1/2013
10/1/2013
....
table_b
date value
1/1/2013 10
3/1/2013 5
7/1/2013 30
10/1/2013 40
table_c - Desired result
date value
1/1/2013 10
2/1/2013 10
3/1/2013 5
4/1/2013 5
5/1/2013 5
6/1/2013 5
7/1/2013 30
8/1/2013 30
9/1/2013 30
10/1/2013 40
Does anyone have an idea how to accomplish this?
My sql is very rusty so I have this nagging feeling there's a better way, but what I came up with was to join against a sub-select that's a self join of table_b to make a new table b with date ranges. With that, it's easy to match table_a with the proper value.
I left a test on sqlfiddle so you can see the assumptions I made. This is the code (note that date_format and IFNULL are MySQL functions):
select date_format(a.date,'%m/%d/%Y') as date, b.value as value
from table_a as a join
(select b1.date as start, IFNULL(min(b2.date),'9999-12-31') as end, b1.value as value
from table_b as b1 left outer join table_b as b2
on b1.date < b2.date
group by b1.date) as b
on a.date >= b.start and a.date < b.end
The self join trims out the extra b2 entries with a group by and taking the min b2 date that's larger than b1's date. In the case of the very last entry, there is no b2 date larger so it ends up null; that I map to 12/31/9999 to be a really large date.
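Since the question is about Postgres, here is a sketch of the same date-range idea using the LEAD window function instead of the self join. I'm assuming the dates are monthly (written in ISO format below) and that the table and column names are as in the question. The aliases range_start and range_end are my own.

```sql
-- Sample data from the question, assuming monthly dates.
CREATE TABLE table_a (date DATE);
CREATE TABLE table_b (date DATE, value INT);

INSERT INTO table_a (date) VALUES
  ('2013-01-01'), ('2013-02-01'), ('2013-03-01'), ('2013-04-01'), ('2013-05-01'),
  ('2013-06-01'), ('2013-07-01'), ('2013-08-01'), ('2013-09-01'), ('2013-10-01');
INSERT INTO table_b (date, value) VALUES
  ('2013-01-01', 10), ('2013-03-01', 5), ('2013-07-01', 30), ('2013-10-01', 40);

-- LEAD gives each table_b row the date of the next row, which closes its range.
-- The last row has no successor, so its range_end is NULL and is treated as open-ended.
SELECT a.date, r.value
FROM table_a AS a
JOIN (
    SELECT date AS range_start,
           LEAD(date) OVER (ORDER BY date) AS range_end,
           value
    FROM table_b
) AS r
  ON a.date >= r.range_start
 AND (r.range_end IS NULL OR a.date < r.range_end)
ORDER BY a.date;
-- 2013-01-01 | 10
-- 2013-02-01 | 10
-- 2013-03-01 | 5
-- 2013-04-01 | 5
-- 2013-05-01 | 5
-- 2013-06-01 | 5
-- 2013-07-01 | 30
-- 2013-08-01 | 30
-- 2013-09-01 | 30
-- 2013-10-01 | 40
```

Treating a NULL range_end as open-ended avoids the 9999-12-31 sentinel. Dates in table_a that fall before the first table_b row are dropped by the inner join, just as in the self-join version.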