SQL get count(*) before several dates

It sounds so simple but I can't figure it out. I have 2 tables:
TABLE 1 contains a list of projects with the dates at which they were approved.
PROJECT  APPROVAL_DATE
A        12/06/2019
A        01/09/2020
A        05/08/2021
A        07/12/2021
B        01/05/2018
B        06/09/2019
B        12/23/2020
TABLE 2 contains dates when tests were performed on these projects.
PROJECT  TEST_DATE
A        01/06/2019
A        01/07/2019
A        02/21/2019
...      ...
A        06/22/2021
...      ...
B        01/12/2021
...      ...
THIS IS WHAT I NEED: For each project, I want to count the total number of tests prior to each APPROVAL_DATE, so I would have this:
PROJECT  APPROVAL_DATE  TOTAL_TESTS_BEFORE_APPROVAL_DATE
A        12/06/2019     1264
A        01/09/2020     1568
A        05/08/2021     1826
A        07/12/2021     2209
B        01/05/2018     560
B        06/09/2019     790
B        12/23/2020     1560

Here is how you can do it using a left join:
select t1.project, t1.APPROVAL_DATE, count(t2.test_date) TOTAL_TESTS_BEFORE_APPROVAL_DATE
from table1 t1
left join table2 t2
on t1.project = t2.project
and t1.APPROVAL_DATE > t2.TEST_DATE
group by t1.project, t1.APPROVAL_DATE
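To see the left join in action, here is a minimal sketch that runs the same query against an in-memory SQLite database from Python. The sample rows are made up for illustration, and the dates are stored as ISO strings (YYYY-MM-DD), which is an assumption: the `>` comparison only orders dates correctly if the column is a real date type or a sortable string, not MM/DD/YYYY text.

```python
import sqlite3

# Hypothetical sample data; ISO date strings compare chronologically.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (project TEXT, approval_date TEXT);
    CREATE TABLE table2 (project TEXT, test_date TEXT);
    INSERT INTO table1 VALUES
        ('A', '2019-12-06'), ('A', '2020-01-09'), ('B', '2018-01-05');
    INSERT INTO table2 VALUES
        ('A', '2019-01-06'), ('A', '2019-01-07'), ('A', '2019-12-20'),
        ('B', '2021-01-12');
""")

# Same shape as the answer: one output row per (project, approval_date),
# counting only the tests dated strictly before that approval.
rows = conn.execute("""
    SELECT t1.project, t1.approval_date, COUNT(t2.test_date) AS total_tests
    FROM table1 t1
    LEFT JOIN table2 t2
        ON t1.project = t2.project
       AND t1.approval_date > t2.test_date
    GROUP BY t1.project, t1.approval_date
    ORDER BY t1.project, t1.approval_date
""").fetchall()

for row in rows:
    print(row)
```

Project B still appears, with a count of 0, because the left join keeps approval rows that have no earlier test and COUNT(t2.test_date) skips the resulting NULLs.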

Related

T-SQL get values for specific group

I have a table EmployeeContract similar to this:
ContractId  EmployeeId  ValidFrom   ValidTo     Salary
12          5           2018-02-01  2019-06-31  x
25          8           2015-01-01  2099-12-31  x
50          5           2019-07-01  2021-05-31  x
52          6           2011-08-01  2021-12-31  x
72          8           2010-08-01  2014-12-31  x
52          6           2011-08-01  2021-12-31  x
The table holds the contract history for each employee in the company. I need to get the date each employee started work and the end date of their last contract. Sometimes records are duplicated.
For example, based on data from above:
EmployeeId  ValidFrom   ValidTo
5           2018-02-01  2021-05-31
8           2010-08-01  2099-12-31
6           2011-08-01  2021-12-31
Based on this article: https://www.techcoil.com/blog/sql-statement-for-selecting-the-latest-record-in-each-group/
I prepared a query like this:
select minv.*, maxv.maxvalidto from
(select distinct con.[EmployeeId], mvt.maxvalidto
from [EmployeeContract] con
join (select [EmployeeId], max(validto) as maxvalidto
FROM [EmployeeContract]
group by [EmployeeId]) mvt
on con.[EmployeeId] = mvt.[EmployeeId] and mvt.maxvalidto = con.validto) maxv
join
(select distinct con.[EmployeeId], mvf.minvalidfrom
from [EmployeeContract] con
join (select [EmployeeId], min(validfrom) as minvalidfrom
FROM [EmployeeContract]
group by [EmployeeId]) mvf
on con.[EmployeeId] = mvf.[EmployeeId] and mvf.minvalidfrom = con.validfrom) minv
on minv.[EmployeeId] = maxv.[EmployeeId]
order by 1
But I'm not satisfied: I think it's hard to read, and it is probably poorly optimized. How can I do it better?
I think you want group by:
select employeeid, min(validfrom), max(validto)
from employeecontract
group by employeeid
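As a quick check of the group by approach, here is a sketch that runs it on the question's rows in an in-memory SQLite database from Python. Two things are assumptions: the dates are stored as ISO strings, and the impossible date 2019-06-31 from the sample is entered as 2019-06-30.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EmployeeContract (
        ContractId INTEGER, EmployeeId INTEGER, ValidFrom TEXT, ValidTo TEXT
    );
    -- Rows from the question, including the duplicated contract 52.
    INSERT INTO EmployeeContract VALUES
        (12, 5, '2018-02-01', '2019-06-30'),
        (25, 8, '2015-01-01', '2099-12-31'),
        (50, 5, '2019-07-01', '2021-05-31'),
        (52, 6, '2011-08-01', '2021-12-31'),
        (72, 8, '2010-08-01', '2014-12-31'),
        (52, 6, '2011-08-01', '2021-12-31');
""")

# One row per employee: earliest start and latest end across all contracts.
rows = conn.execute("""
    SELECT EmployeeId, MIN(ValidFrom) AS ValidFrom, MAX(ValidTo) AS ValidTo
    FROM EmployeeContract
    GROUP BY EmployeeId
    ORDER BY EmployeeId
""").fetchall()

for row in rows:
    print(row)
```

Duplicate rows do not need special handling here, since MIN and MAX return the same value no matter how often a row repeats.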

Problems with complex query

There are two tables.
In the first I have columns:
id - a person
time - the time of receiving the bonus (timestamp)
money - size of bonus
And the second:
id
time - time of getting a rank (timestamp)
range - military rank (int)
The task is to output the total amount and the number of bonuses received by people while they held the rank of captain (range = 7), aggregated by day.
I have no idea how to build a table with this data. I can summarize the data across all days like this:
SELECT DISTINCTROW Payment.user_id AS user_id, Sum(IIf(IsNull(Payment.money),0,Payment.money)) AS [Sum - money], Count(Payment.money) AS [Count - Payment], Format(Payment.time, "Short Date") as day
FROM Payment
GROUP BY Payment.user_id, Format (Payment.time, "Short Date")
Having ((Count(Payment.money) > 0));
Can you help me with the second part and summarize them? Thanks.
For example: first table (Payment):
user_id time money
a 01.01.10 00:00:00 15,00
a 01.01.10 10:00:00 2,00
a 03.01.10 00:00:00 3,00
c 04.01.10 00:00:00 4,00
c 04.01.10 00:05:00 5,00
d 06.01.10 00:00:00 6,00
e 07.01.10 00:00:00 7,00
e 08.01.10 00:00:00 8,00
The second one:
user_id time range
a 01.01.10 00:00:00 6
a 01.01.10 09:00:00 7
a 04.01.10 00:00:00 8
b 04.01.10 00:00:00 4
c 04.01.10 00:05:00 7
d 06.01.10 00:00:00 5
e 07.01.10 00:00:00 6
f 08.01.10 00:00:00 6
g 08.01.10 00:00:00 7
I expected:
user_id time sum
a 01.01.10 2
a 03.01.10 3
c 04.01.10 5
Here is one possible method using joins:
select t1.user_id, datevalue(p.time) as [time], sum(p.money) as [sum]
from
(
(select t.user_id, t.time from rank t where t.range = 7) t1
inner join payment p on t1.user_id = p.user_id
)
left join
(select t.user_id, t.time from rank t where t.range > 7) t2 on p.user_id = t2.user_id
where
p.time >= t1.time and (t2.user_id is null or p.time < t2.time)
group by
t1.user_id, datevalue(p.time)
I have assumed that your second table is called rank (this was not stated in your question).
Here, the subquery t1 obtains the set of users with range = 7 (captain), and the subquery t2 obtains the set of users with range > 7. I then select all records with a payment date greater than or equal to the date of promotion to captain, but less than any subsequent promotion (if it exists).
This yields the following result:
+---------+------------+------+
| user_id | time | sum |
+---------+------------+------+
| a | 01/01/2010 | 2.00 |
| a | 03/01/2010 | 3.00 |
| c | 04/01/2010 | 5.00 |
+---------+------------+------+
Unless I have misunderstood, I would argue that your expected result is incorrect as the payment below occurs before user_id = c achieved the rank of captain:
c 04.01.10 00:00:00 4,00
c 04.01.10 00:05:00 7
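The same join logic can be tried outside Access. Below is a rough sketch using SQLite from Python; it is a port, not the original query. The table name rank_history and the columns paid_at, ranked_at and rank_level are hypothetical renames (chosen to avoid words like range that some engines reserve), timestamps are stored as ISO strings, and SQLite's date() stands in for Access's datevalue(). It also adds the bonus count the question asked for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE payment (user_id TEXT, paid_at TEXT, money REAL);
    CREATE TABLE rank_history (user_id TEXT, ranked_at TEXT, rank_level INTEGER);
    -- Subset of the question's sample data, rewritten with ISO timestamps.
    INSERT INTO payment VALUES
        ('a', '2010-01-01 00:00:00', 15), ('a', '2010-01-01 10:00:00', 2),
        ('a', '2010-01-03 00:00:00', 3),  ('c', '2010-01-04 00:00:00', 4),
        ('c', '2010-01-04 00:05:00', 5);
    INSERT INTO rank_history VALUES
        ('a', '2010-01-01 00:00:00', 6), ('a', '2010-01-01 09:00:00', 7),
        ('a', '2010-01-04 00:00:00', 8), ('c', '2010-01-04 00:05:00', 7);
""")

# cap: the promotion to captain; nxt: any later, higher rank (may be absent).
# A payment counts if it falls at or after cap and before nxt.
rows = conn.execute("""
    SELECT cap.user_id, date(p.paid_at) AS day,
           SUM(p.money) AS total, COUNT(*) AS bonuses
    FROM rank_history cap
    JOIN payment p ON p.user_id = cap.user_id
    LEFT JOIN rank_history nxt
        ON nxt.user_id = cap.user_id AND nxt.rank_level > 7
    WHERE cap.rank_level = 7
      AND p.paid_at >= cap.ranked_at
      AND (nxt.user_id IS NULL OR p.paid_at < nxt.ranked_at)
    GROUP BY cap.user_id, date(p.paid_at)
    ORDER BY cap.user_id, day
""").fetchall()

for row in rows:
    print(row)
```

One caveat of this join shape: if a user had more than one rank above captain, a payment would match each of those rows and be counted several times, so in that case the later ranks would need to be reduced to the earliest one first.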

Take the last row Group By date

I need to select content statistics grouped by date.
Here is an example of the records:
id cid viewCount created_at
1 1 50 31-12-2018 18:00:00
2 1 50 01-01-2019 18:00:00
3 2 50 01-01-2019 18:00:00
4 2 100 01-01-2019 19:00:00
5 2 150 01-01-2019 20:00:00
6 3 1000 01-01-2019 15:00:00
I need to return:
id cid viewCount date
1 1 50 31-12-2018
2 1 50 01-01-2019
5 2 150 01-01-2019
6 3 1000 01-01-2019
I tried the following code:
$qb = $this->createQueryBuilder('c');
$qb->select('a.id as id')
->addSelect('COALESCE(SUM(a.viewCount),0) as viewCount')
->addSelect('DATE_FORMAT(a.createdAt, \'%d-%m-%Y\') as date')
->innerJoin('c.analytics', 'a')
->groupBy('c.cid')
->addGroupBy('date')
->orderBy('a.createdAt', 'ASC');
It returns:
id cid viewCount date
1 1 50 31-12-2018
2 1 50 01-01-2019
3 2 50 01-01-2019
4 2 100 01-01-2019
5 2 150 01-01-2019
6 3 1000 01-01-2019
I have tried to create a subquery :
$qbLastHour = $this->createQueryBuilder('cc');
$qbLastHour->select('MAX(DATE_FORMAT(aa.createdAt, \'%H\'))')
->innerJoin('cc.analytics', 'aa')
->where('cc.id=c.id')
->groupBy('cc.cid')
->addGroupBy('s');
$qb->addSelect(sprintf("(%s) AS r", $qbLastHour->getDQl()));
But something goes wrong because I don't group by date in the subquery.
Can someone help me? Thank you.
Update
Here is an attempt, in sql again, to select only one row per date and cid based on the max time per day
SELECT id, c.cid, viewCount, max_date
FROM content a
JOIN content_analytic c ON a.id = c.content_id
RIGHT JOIN (SELECT c.cid, DATE_FORMAT(created_at, '%d-%m-%Y') dt, MAX(created_at) max_date
FROM content a
JOIN content_analytic c ON a.id = c.content_id
GROUP BY dt, c.cid) x ON x.max_date = a.created_at and x.cid = c.cid
This is how I believe the query should be in pure sql
SELECT c.cid, COALESCE(SUM(a.viewCount), 0), DATE_FORMAT(a.created_at, '%d-%m-%Y') as date
FROM content a
INNER JOIN content_analytic c ON a.id = c.content_id
GROUP BY c.cid, date
ORDER BY date
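The "one row per cid and date, taken at the latest time of that day" idea can be checked with a small sketch in SQLite from Python. It assumes a single table holding the question's sample rows (the real schema splits them across content and content_analytic), ISO timestamps, and SQLite's date() in place of MySQL's DATE_FORMAT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed single-table layout holding the question's sample rows.
    CREATE TABLE content_analytic (
        id INTEGER, cid INTEGER, viewCount INTEGER, created_at TEXT
    );
    INSERT INTO content_analytic VALUES
        (1, 1, 50,   '2018-12-31 18:00:00'),
        (2, 1, 50,   '2019-01-01 18:00:00'),
        (3, 2, 50,   '2019-01-01 18:00:00'),
        (4, 2, 100,  '2019-01-01 19:00:00'),
        (5, 2, 150,  '2019-01-01 20:00:00'),
        (6, 3, 1000, '2019-01-01 15:00:00');
""")

# The subquery finds the latest timestamp per (cid, day); joining back on it
# keeps only the last row of each group.
rows = conn.execute("""
    SELECT a.id, a.cid, a.viewCount, date(a.created_at) AS day
    FROM content_analytic a
    JOIN (
        SELECT cid, MAX(created_at) AS max_created
        FROM content_analytic
        GROUP BY cid, date(created_at)
    ) x ON x.cid = a.cid AND x.max_created = a.created_at
    ORDER BY a.id
""").fetchall()

for row in rows:
    print(row)
```

On this data the query keeps ids 1, 2, 5 and 6, which is the result the question asks for. If two rows of the same cid shared the exact same latest timestamp, both would be returned.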

Tracing original Value through Iteration SQL

Suppose there is a data collection system in which, whenever a record is altered, it is saved as a new record with a prefix (say M-[most recent number in the queue, which is unique]).
Suppose I am given the following data set:
Customer | Original_Val
1 1020
2 1011
3 1001
I need to find the most recent value for each customer given the following table:
Customer | Most_Recent_Val | Pretained_To_Val | date
1 M-2000 M-1050 20170225
1 M-1050 M-1035 20170205
1 M-1035 1020 20170131
1 1020 NULL 20170101
2 M-1031 1011 20170105
2 1011 NULL 20161231
3 1001 NULL 20150101
My desired output would be:
Customer | Original_Val | Most_Recent_Val | date
1 1020 M-2000 20170225
2 1011 M-1031 20170105
3 1001 1001 20150101
For customer 1, there are 4 levels, i.e. (M-2000 <- M-1050 <- M-1035 <- 1020). Note that there would be no more than 10 levels of depth for each customer.
Much Appreciated! Thanks in advance.
Find the rows with the min and max date for each customer and then join them together. Something like this:
-- [table] is a placeholder for the real table name
Select
    [min].Customer
    ,[min].Most_Recent_Val as Original_Val
    ,[max].Most_Recent_Val as Most_Recent_Val
    ,[max].date
From
(
    -- oldest row per customer
    Select
        t1.Customer
        ,t1.Most_Recent_Val
        ,t1.date
    From
        [table] t1
        inner join (
            Select
                Customer
                ,MIN(date) as MIN_Date
            From
                [table]
            Group By
                Customer
        ) t2 ON t2.Customer = t1.Customer
            and t2.MIN_Date = t1.date
) [min]
inner join (
    -- newest row per customer
    Select
        t1.Customer
        ,t1.Most_Recent_Val
        ,t1.date
    From
        [table] t1
        inner join (
            Select
                Customer
                ,MAX(date) as MAX_Date
            From
                [table]
            Group By
                Customer
        ) t2 ON t2.Customer = t1.Customer
            and t2.MAX_Date = t1.date
) [max] ON [max].Customer = [min].Customer
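The min/max idea can be tried on the question's data with a short sketch in SQLite from Python. This is a condensed variant, not the query above verbatim: it computes MIN and MAX in one grouped subquery and self-joins the table to pick up both rows. The table name history and the column name rec_date are hypothetical. It relies on the same assumptions as the approach itself, namely that the oldest row per customer holds the original value and that dates are unique per customer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE history (
        Customer INTEGER, Most_Recent_Val TEXT,
        Pretained_To_Val TEXT, rec_date TEXT
    );
    INSERT INTO history VALUES
        (1, 'M-2000', 'M-1050', '20170225'),
        (1, 'M-1050', 'M-1035', '20170205'),
        (1, 'M-1035', '1020',   '20170131'),
        (1, '1020',   NULL,     '20170101'),
        (2, 'M-1031', '1011',   '20170105'),
        (2, '1011',   NULL,     '20161231'),
        (3, '1001',   NULL,     '20150101');
""")

# lo = oldest row per customer, hi = newest row per customer.
rows = conn.execute("""
    SELECT lo.Customer, lo.Most_Recent_Val AS Original_Val,
           hi.Most_Recent_Val, hi.rec_date
    FROM history lo
    JOIN history hi ON hi.Customer = lo.Customer
    JOIN (
        SELECT Customer, MIN(rec_date) AS min_date, MAX(rec_date) AS max_date
        FROM history
        GROUP BY Customer
    ) b ON b.Customer = lo.Customer
       AND lo.rec_date = b.min_date
       AND hi.rec_date = b.max_date
    ORDER BY lo.Customer
""").fetchall()

for row in rows:
    print(row)
```

For customer 3, who has a single record, lo and hi are the same row, so the original and most recent values coincide.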

Zero Fill Data for Missing Dates

I have two tables:
Cust Sales Week
123 4 1/8/2015
123 3 1/22/2015
234 4 1/1/2015
.
Week
1/1/2015
1/8/2015
1/15/2015
1/22/2015
I want to combine them so that every Cust has every date and where there are no Sales it is filled with 0.
Cust Sales Week
123 4 1/1/2015
123 0 1/8/2015
123 0 1/15/2015
123 3 1/22/2015
234 4 1/1/2015
234 0 1/8/2015
234 0 1/15/2015
234 0 1/22/2015
Is there a way I can 'select distinct(Cust)' and join them somehow?
First, generate the rows you want using a cross join. Then bring in the data you want using a left join:
select c.cust, w.week, coalesce(t.sales, 0) as sales
from weeks w cross join
(select distinct cust from t) c left join
t
on t.cust = c.cust and t.week = w.week;
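Here is a small sketch of the cross join plus left join pattern, run against SQLite from Python on the question's input rows. The table names t and weeks follow the query above; storing the weeks as ISO strings is an assumption made so that they sort correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (cust INTEGER, sales INTEGER, week TEXT);
    CREATE TABLE weeks (week TEXT);
    INSERT INTO t VALUES
        (123, 4, '2015-01-08'), (123, 3, '2015-01-22'), (234, 4, '2015-01-01');
    INSERT INTO weeks VALUES
        ('2015-01-01'), ('2015-01-08'), ('2015-01-15'), ('2015-01-22');
""")

# The cross join builds every (cust, week) pair; the left join then attaches
# sales where they exist, and COALESCE turns the gaps into 0.
rows = conn.execute("""
    SELECT c.cust, w.week, COALESCE(t.sales, 0) AS sales
    FROM weeks w
    CROSS JOIN (SELECT DISTINCT cust FROM t) c
    LEFT JOIN t ON t.cust = c.cust AND t.week = w.week
    ORDER BY c.cust, w.week
""").fetchall()

for row in rows:
    print(row)
```

With 2 customers and 4 weeks this returns 8 rows, one per pair, regardless of how many of them have sales.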
You can left join on the dates table and use isnull on the sales column. Use an equivalent of isnull in Netezza.
select t1.cust, isnull(t1.sales,0), t2.week
from daystable2 t2 left join salestable1 t1 on t1.week = t2.week
I think this will do the trick
SELECT week, cust, COALESCE(sales, 0)
FROM week_tbl a LEFT JOIN cust_table b
ON a.week = b.week