How to get 100 records from a table with a value from year 2022 more than from year 2021 - sql

Here is the code that works, but can it be made more efficient and avoid the subqueries?
Condition used: (revenue_2022 - revenue_2021) > 0
or, equivalently, revenue_2022 > revenue_2021
select
id
from
main_table ts
where
(
(
select
revenue
from
main_table
where
id = ts.id
and year = 2022
and revenue > 0
) -
(
select
revenue
from
main_table
where
id = ts.id
and year = 2021
and revenue > 0
)
) > 0
limit 100
main_table:
id | revenue | year
----------------------
1 | 4500 | 2022
1 | 4600 | 2021
2 | 3300 | 2022
3 | 5800 | 2022
3 | 5500 | 2021
expected output is the id 3 since its revenue for the year 2022 is greater than the year 2021
And 2 is not considered, since it doesn't have the year 2021 to compare with.

It's a little unclear what you'd like to do if there are other years, or more than one entry per id+year, but you could do something like:
select
y2022.id
from main_table y2022, main_table y2021
where y2022.year = 2022
and y2021.year = 2021
and y2022.id = y2021.id
and y2022.revenue > y2021.revenue
limit 100
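As a minimal sketch of the self-join above, the following runs it against the sample rows from the question using Python's built-in sqlite3 module (SQLite is an assumption here, chosen only so the example is self-contained; the question does not name a database). It also shows a conditional-aggregation variant that avoids both the subqueries and the self-join; that variant is an alternative, not part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE main_table (id INTEGER, revenue INTEGER, year INTEGER);
INSERT INTO main_table VALUES
  (1, 4500, 2022), (1, 4600, 2021),
  (2, 3300, 2022),
  (3, 5800, 2022), (3, 5500, 2021);
""")

# Self-join: pair each id's 2022 row with its 2021 row.
join_query = """
SELECT y2022.id
FROM main_table AS y2022
JOIN main_table AS y2021 ON y2021.id = y2022.id
WHERE y2022.year = 2022
  AND y2021.year = 2021
  AND y2022.revenue > y2021.revenue
LIMIT 100
"""

# Conditional aggregation: one pass over the table, one output row per id.
# An id with no 2021 row compares against NULL and is filtered out.
agg_query = """
SELECT id
FROM main_table
WHERE year IN (2021, 2022)
GROUP BY id
HAVING MAX(CASE WHEN year = 2022 THEN revenue END)
     > MAX(CASE WHEN year = 2021 THEN revenue END)
LIMIT 100
"""

join_ids = [row[0] for row in conn.execute(join_query)]
agg_ids = [row[0] for row in conn.execute(agg_query)]
print(join_ids, agg_ids)
```

Both queries return only id 3 for the sample data. If an id can have several rows for the same year, the self-join returns one row per matching pair, while the aggregation compares the largest revenue of each year.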

Related

SQL: Find number of active "events" each month

Background
I have an SQL table that contains all events, with each event containing a unique identifier.
As you can see for some IDs the "event" stretches across multiple months. What I'm trying to find is the number of "active events" per month.
For example event ID:342, is active in both the month of Jan and Feb. So it should count towards both Jan and Feb's final count.
Example dataset
ID | Start Date | End Date
-------------------------------
342 | 01 Jan 2022 | 12 Feb 2022
231 | 12 Feb 2022 | 26 Feb 2022
123 | 20 Jan 2022 | 10 Apr 2022
Desired output:
Month | Count
--------------
Jan | 2
Feb | 3
Mar | 1
Apr | 1
By the way, I'm using Alibaba's ODPS SQL rather than MySQL or Postgres, so I'd appreciate a solution that is as SQL-system-agnostic as possible. Thanks!
Here is an example in MySQL 8, using a recursive CTE to construct the list of months. It would be more efficient to use a calendar table.
If you are not using MySQL, you will need to adapt the syntax of the query.
create table dataset(
ID int, Start_date Date, End_date Date);
insert into dataset values
(342,'2022-01-01','2022-02-12'),
(231,'2022-02-12','2022-02-26'),
(123,'2022-01-20','2022-04-10');
/*
Desired output:
Month Count
Jan 2
Feb 3
Mar 1
Apr 1
*/
select
min(month(Start_date)),
max(month(End_date))
from dataset;
min(month(Start_date)) | max(month(End_date))
---------------------: | -------------------:
1 | 4
-- note: comparing month numbers only works while all dates fall in the same year
with recursive m as
(select min(month(Start_date)) mon from dataset
union all
select mon + 1 from m
where mon < (select max(month(End_date)) from dataset)
)
select
mon "month",
count(id) "Count"
from m
left join dataset
on month(Start_date)<= mon
and month(End_date) >= mon
group by mon
order by mon;
month | Count
----: | ----:
1 | 2
2 | 3
3 | 1
4 | 1
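The same idea can be sketched so that it also works across year boundaries, by generating the first day of each month instead of a bare month number. The example below uses Python's sqlite3 module purely as an assumption for a runnable demo (the asker uses ODPS SQL, where the date functions will differ); the table and column names follow the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dataset (id INTEGER, start_date TEXT, end_date TEXT);
INSERT INTO dataset VALUES
  (342, '2022-01-01', '2022-02-12'),
  (231, '2022-02-12', '2022-02-26'),
  (123, '2022-01-20', '2022-04-10');
""")

query = """
WITH RECURSIVE m(month_start) AS (
  -- first day of the earliest month in the data
  SELECT date(min(start_date), 'start of month') FROM dataset
  UNION ALL
  -- keep adding one month until we pass the latest end date
  SELECT date(month_start, '+1 month') FROM m
  WHERE date(month_start, '+1 month') <= (SELECT max(end_date) FROM dataset)
)
SELECT strftime('%Y-%m', m.month_start) AS month,
       count(d.id) AS active_events
FROM m
LEFT JOIN dataset d
  -- an event is active in a month if it starts before the month ends
  -- and ends on or after the month's first day
  ON d.start_date < date(m.month_start, '+1 month')
 AND d.end_date >= m.month_start
GROUP BY m.month_start
ORDER BY m.month_start
"""

rows = conn.execute(query).fetchall()
for month, active_events in rows:
    print(month, active_events)
```

The overlap test (starts before the month ends, ends on or after the month starts) is the part that carries over to other SQL dialects; only the month-generation step is dialect-specific.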

PostgreSQL - How to get month/year even if there are no records within that date?

What I'm trying to do is get the latest ("most future") record in a bills table and then all records from the 13 months before that record. What I've tried is something like this:
SELECT
users.name,
EXTRACT(month from priority_date) as month,
EXTRACT(year from priority_date) as year,
SUM("money_balance") as "money_balance"
FROM bills
JOIN users on users.id = bills.user_id
WHERE priority_date >= ( SELECT
DATE_TRUNC('month', MAX(bills.priority_date))
FROM bills
INNER JOIN users ON bills.property_id = users.id
WHERE users.company_id = 15
AND users.active = true
AND bills.paid = false ) - interval '13 month'
AND priority_date <= ( SELECT
MAX(bills.priority_date)
FROM bills
INNER JOIN users ON bills.property_id = users.id
WHERE users.community_id = 15
AND users.active = true
AND bills.paid = false )
AND users.company_id = 15
AND bills.paid = false
AND users.active = true
GROUP BY 1,2,3
ORDER BY year, month
So for instance, let's say the latest date for a created bill is December 2022; this query will give me the info from November 2021 to December 2022.
The data will give me something like
name | month | year | money_balance
---------------------------------------
Joshua.. | 11 | 2021 | 300
Joshua.. | 1 | 2022 | 111
Mark.. | 1 | 2022 | 200
... | ... | ... | ...
John | 12 | 2022 | 399
In the case of Joshua, because he had no bills to pay in December 2021, it doesn't return anything for that month/year.
Is it possible to return the months/year where there are no records for that month, for each user?
Something like
name | month | year | money_balance
---------------------------------------
Joshua.. | 11 | 2021 | 300
Joshua.. | 12 | 2021 | 0
Joshua.. | 1 | 2022 | 111
other users | .... | ... | ...
Thank you so much!
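We can use a CTE to create the list of months, using the minimum and maximum dates from bills, and then cross join it onto users to get a row for every user for every month. We then left join onto bills to populate the last column.
The problem with this approach is that we can end up with a lot of rows with no value. Note that in the output below, months without bills show null rather than 0 (wrapping the sum in COALESCE would give 0), and the rows are ordered alphabetically by the "Mon-YY" label rather than chronologically.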
create table bills(user_id int,priority_date date, money_balance int);
create table users(id int, name varchar(25));
insert into users values(1,'Joshua'),(2,'Mark'),(3,'John');
insert into bills values(1,'2021-11-01',300),(1,'2022-01-01',111),(2,'2022-01-01',200),(3,'2021-12-01',399);
;with months as
(SELECT to_char(generate_series(min(priority_date), max(priority_date), '1 month'), 'Mon-YY') AS "Mon-YY"
from bills)
SELECT
u.name,
"Mon-YY",
--EXTRACT(month from "Mon-YY") as month,
--EXTRACT(year from "Mon-YY") as year,
SUM("money_balance") as "money_balance"
FROM months m
CROSS JOIN users u
LEFT JOIN bills b
ON u.id = b.user_id
AND to_char(priority_date,'Mon-YY') = m."Mon-YY"
GROUP BY
u.name,
"Mon-YY"
ORDER BY "Mon-YY", u.name
name | Mon-YY | money_balance
:----- | :----- | ------------:
John | Dec-21 | 399
Joshua | Dec-21 | null
Mark | Dec-21 | null
John | Jan-22 | null
Joshua | Jan-22 | 111
Mark | Jan-22 | 200
John | Nov-21 | null
Joshua | Nov-21 | 300
Mark | Nov-21 | null
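A minimal sketch of the same months-cross-join-users idea, written so that the rows come out in chronological order and empty months show 0, could look like this. It uses Python's sqlite3 module as an assumption for a runnable demo; in PostgreSQL the recursive CTE would be replaced by generate_series as in the answer above. The sample rows are the ones from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER, name TEXT);
CREATE TABLE bills (user_id INTEGER, priority_date TEXT, money_balance INTEGER);
INSERT INTO users VALUES (1, 'Joshua'), (2, 'Mark'), (3, 'John');
INSERT INTO bills VALUES
  (1, '2021-11-01', 300), (1, '2022-01-01', 111),
  (2, '2022-01-01', 200), (3, '2021-12-01', 399);
""")

query = """
WITH RECURSIVE months(month_start) AS (
  SELECT date(min(priority_date), 'start of month') FROM bills
  UNION ALL
  SELECT date(month_start, '+1 month') FROM months
  WHERE date(month_start, '+1 month') <= (SELECT max(priority_date) FROM bills)
)
SELECT u.name,
       strftime('%Y-%m', m.month_start) AS month,
       -- COALESCE turns "no bills this month" (NULL) into 0
       COALESCE(SUM(b.money_balance), 0) AS money_balance
FROM months m
CROSS JOIN users u
LEFT JOIN bills b
  ON b.user_id = u.id
 AND date(b.priority_date, 'start of month') = m.month_start
-- group and sort on the real date, not on a formatted label
GROUP BY u.name, m.month_start
ORDER BY m.month_start, u.name
"""

rows = conn.execute(query).fetchall()
for name, month, money_balance in rows:
    print(name, month, money_balance)
```

Grouping and ordering by the month's date value and formatting it only in the select list is what keeps November 2021 ahead of December 2021 and January 2022.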

How to calculate average monthly number of some events in MS SQL Server?

I have table in MS SQL Server like below:
ID_EVENT | PRODUCT| DATE
-------------------------------
123 | A | 2021-01-15
456 | A | 2021-01-22
789 | A | 2021-02-05
110 | B | 2021-01-18
124 | B | 2021-02-11
I need to calculate the average monthly number of ID_EVENT per PRODUCT (for January and February). So as a result I need something like below:
PRODUCT | AVG_PER_MNTH
-----------------
A | 1.5
B | 1
A = 1.5 because 3 / 2 = 1.5 --> (number of ID_EVENT for January + number of ID_EVENT for February) / number of months analysed (2 -> January and February)
B = 1 because 2 / 2 = 1 --> (number of ID_EVENT for January + number of ID_EVENT for February) / number of months analysed (2 -> January and February)
How can I do that in MS SQL Server?
One option, aggregating first by month and year, then by product:
WITH cte AS (
SELECT PRODUCT, 1.0*COUNT(*) AS cnt
FROM yourTable
GROUP BY PRODUCT, FORMAT(DATE, 'MM.yyyy')
)
SELECT PRODUCT, AVG(cnt) AS AVG_PER_MNTH
FROM cte
GROUP BY PRODUCT;
Another option, grouping by the first day of each month:
WITH CTE AS
(SELECT PRODUCT
,1.0*COUNT(*) AS cnt -- 1.0* avoids an integer average
,DATEADD(month, DATEDIFF(month, 0, Date), 0) AS StartOfMonth
FROM YourTable
GROUP BY PRODUCT
,DATEADD(month, DATEDIFF(month, 0, Date), 0)
)
SELECT
PRODUCT
,AVG(cnt) AS MonthlyAverage
FROM CTE
GROUP BY PRODUCT
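The two-step aggregation (count per product and month, then average those counts per product) can be checked against the question's sample rows with the sketch below. It uses Python's sqlite3 module as an assumption for a runnable demo, so strftime stands in for SQL Server's FORMAT/DATEADD; the table name events and column name event_date are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE events (id_event INTEGER, product TEXT, event_date TEXT);
INSERT INTO events VALUES
  (123, 'A', '2021-01-15'),
  (456, 'A', '2021-01-22'),
  (789, 'A', '2021-02-05'),
  (110, 'B', '2021-01-18'),
  (124, 'B', '2021-02-11');
""")

query = """
WITH monthly AS (
  -- step 1: number of events per product and calendar month
  SELECT product,
         strftime('%Y-%m', event_date) AS month,
         COUNT(*) AS cnt
  FROM events
  GROUP BY product, strftime('%Y-%m', event_date)
)
-- step 2: average of the monthly counts per product
SELECT product, AVG(cnt) AS avg_per_month
FROM monthly
GROUP BY product
ORDER BY product
"""

rows = conn.execute(query).fetchall()
print(rows)
```

Note that this averages over the months in which a product has at least one event. If a product had no events in February, its average would be taken over one month, not two; dividing the total count by a fixed number of months would be needed for that case.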

Using a window function in BigQuery to create running sum of active quarters

I am working to enhance a dataset by creating a column that would allow me to track how many active quarters a given company has had for a given row. A company is "active" if they recognize revenue within that quarter.
Each row of my dataset represents one month's performance for a single company.
I have been able to use a WINDOW function to create a running sum for active months successfully:
COUNTIF(Revenue IS NOT NULL) OVER
(partition by Company_Name ORDER BY month_end ASC ROWS BETWEEN unbounded preceding and current row) AS cumulative_active_months
I am now struggling to convert my logic to count the quarters rather than the months.
This is a rough idea of what my table currently looks like.
Row Month Month_end Fiscal_Quarter Company_Name Revenue Active month count
----- ------- ------------ ---------------- -------------- --------- --------------------
1 Jul 2016-07-31 FY17-Q2 Foo x,xxx 1
2 Jul 2016-07-31 FY17-Q2 Bar xxx,xxx 1
3 Aug 2016-08-31 FY17-Q2 Foo xx,xxx 2
4 Aug 2016-08-31 FY17-Q2 Bar xxx 2
5 Sep 2016-09-30 FY17-Q2 Foo xx 3
6 Sep 2016-09-30 FY17-Q2 Bar x,xxx 3
7 Oct 2016-10-31 FY17-Q3 Foo xx 4
8 Oct 2016-10-31 FY17-Q3 Bar Null 3
This is what I'd ideally like my table to look like.
Row Month Month_end Fiscal_Quarter Company_Name Revenue Active month count Active quarter count
----- ------- ------------ ---------------- -------------- --------- -------------------- ----------------------
1 Jul 2016-07-31 FY17-Q2 Foo x,xxx 1 1
2 Jul 2016-07-31 FY17-Q2 Bar xxx,xxx 1 1
3 Aug 2016-08-31 FY17-Q2 Foo xx,xxx 2 1
4 Aug 2016-08-31 FY17-Q2 Bar xxx 2 1
5 Sep 2016-09-30 FY17-Q2 Foo xx 3 1
6 Sep 2016-09-30 FY17-Q2 Bar x,xxx 3 1
7 Oct 2016-10-31 FY17-Q3 Foo xx 4 2
8 Oct 2016-10-31 FY17-Q3 Bar Null 3 1
If this is counting active months:
COUNTIF(Revenue IS NOT NULL) OVER (PARTITION BY Company_Name ORDER BY month_end ASC) AS cumulative_active_months
then the corresponding count for quarters would use COUNT(DISTINCT):
COUNT(DISTINCT CASE WHEN Revenue IS NOT NULL THEN Fiscal_Quarter END) OVER (PARTITION BY Company_Name ORDER BY month_end ASC) AS cumulative_active_quarters
Unfortunately, BigQuery does not support COUNT(DISTINCT) in a window that has an ORDER BY, so you can use a subquery and a cumulative sum instead:
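select t.* except (seqnum),
countif(seqnum = 1) over (partition by company_name order by month_end) as cnt
from (select t.*,
(case when revenue is not null
-- number only the revenue-bearing months of each quarter, so a quarter whose
-- first month has no revenue is still counted once revenue appears
then row_number() over (partition by Company_Name, Fiscal_Quarter, revenue is not null order by month_end)
else 0
end) as seqnum
from t
) t;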
Note: This does not count the current quarter until there is revenue, which I think makes sense.
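The flag-then-running-sum technique can be tried out with the sketch below, which uses Python's sqlite3 module as an assumption for a runnable demo (it needs SQLite 3.25 or later for window functions). SQLite has no COUNTIF, so a running SUM over a 0/1 flag is used instead. The revenue figures are hypothetical stand-ins for the masked values in the question; only which ones are NULL matters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (month_end TEXT, fiscal_quarter TEXT, company_name TEXT, revenue INTEGER)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        ("2016-07-31", "FY17-Q2", "Foo", 1000),
        ("2016-07-31", "FY17-Q2", "Bar", 100000),
        ("2016-08-31", "FY17-Q2", "Foo", 10000),
        ("2016-08-31", "FY17-Q2", "Bar", 100),
        ("2016-09-30", "FY17-Q2", "Foo", 10),
        ("2016-09-30", "FY17-Q2", "Bar", 1000),
        ("2016-10-31", "FY17-Q3", "Foo", 10),
        ("2016-10-31", "FY17-Q3", "Bar", None),  # no revenue in the new quarter
    ],
)

query = """
WITH flagged AS (
  SELECT *,
         -- 1 on the first revenue-bearing month of each company/quarter, else 0
         CASE WHEN revenue IS NOT NULL
               AND ROW_NUMBER() OVER (
                     PARTITION BY company_name, fiscal_quarter, revenue IS NOT NULL
                     ORDER BY month_end) = 1
              THEN 1 ELSE 0 END AS first_active
  FROM t
)
SELECT company_name, month_end,
       -- running total of the flags = active quarters so far
       SUM(first_active) OVER (PARTITION BY company_name ORDER BY month_end)
         AS active_quarters
FROM flagged
ORDER BY month_end, company_name
"""

active_quarters = {
    (company, month_end): n for company, month_end, n in conn.execute(query)
}
print(active_quarters[("Foo", "2016-10-31")], active_quarters[("Bar", "2016-10-31")])
```

For the October rows this gives 2 for Foo and 1 for Bar, matching the "Active quarter count" column in the desired table.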

Getting value repeated for next year if value is recurring

I have records per project and release number, and I need to repeat a row in each following year if its value is recurring.
The first image shows the data that I have.
My expected output is:
Explanation of output: in year 2017, value_type ITA has frequency Recurring, so this value should be repeated in all following years (i.e. 2018, 2019 and 2020). Likewise, in year 2018 OC and PA are recurring, so they also need to be repeated in 2019 and 2020.
For that I created a new view containing only the recurring values and tried to join that view with the base table, but it does not give me the proper result.
Can anyone please help me with this?
Thanks in advance.
DECLARE @EndYear INT =2020 --You can also get this from the data with MAX(Year)
;WITH tb(PROJECT_ID,RELEASE_NO,[YEAR],VALUE_TYPE,VAL_DES,COST,RUN_TATE,FREQUENCY)
AS(
SELECT 111,1,2016,'IT','EXPENSE',0,NULL,NULL UNION
SELECT 111,1,2016,'IR','INCOME',10000,NULL,NULL UNION
SELECT 111,1,2016,'OC','EXPENSE',-200000,NULL,NULL UNION
SELECT 111,1,2016,'Vendor','EXPENSE',-5000,NULL,NULL UNION
SELECT 111,1,2017,'BC','INCOME',200000,NULL,NULL UNION
SELECT 111,1,2017,'ITA','INCOME',5000,5000,'Recurring' UNION
SELECT 111,1,2017,'OC','EXPENSE',-200000,NULL,NULL UNION
SELECT 111,1,2018,'OC','EXPENSE',-10000,-10000,'Recurring' UNION
SELECT 111,1,2018,'PA','INCOME',100000,100000,'Recurring' UNION
SELECT 111,1,2019,'icc','INCOME',500,NULL,NULL UNION
SELECT 111,1,2020,NULL,NULL,NULL,NULL,NULL
),Recurring AS (
SELECT tb.PROJECT_ID,tb.RELEASE_NO,tb.VALUE_TYPE,tb.VAL_DES,MIN([YEAR]) AS StartYear,MAX(COST) AS COST,MAX(tb.RUN_TATE) AS RUN_TATE
FROM tb WHERE FREQUENCY='Recurring'
GROUP BY tb.PROJECT_ID,tb.RELEASE_NO,tb.VALUE_TYPE,tb.VAL_DES
)
SELECT * FROM tb union
SELECT r.PROJECT_ID,r.RELEASE_NO,n.number AS [YEAR],r.VALUE_TYPE,r.VAL_DES,r.COST,r.RUN_TATE,NULL AS FREQUENCY FROM
Recurring AS r
OUTER APPLY (
SELECT sv.number FROM master.dbo.spt_values AS sv WHERE sv.type='P' AND sv.number BETWEEN r.StartYear+1 AND @EndYear
)n
PROJECT_ID RELEASE_NO YEAR VALUE_TYPE VAL_DES COST RUN_TATE FREQUENCY
----------- ----------- ----------- ---------- ------- ----------- ----------- ---------
111 1 2016 IR INCOME 10000 NULL NULL
111 1 2016 IT EXPENSE 0 NULL NULL
111 1 2016 OC EXPENSE -200000 NULL NULL
111 1 2016 Vendor EXPENSE -5000 NULL NULL
111 1 2017 BC INCOME 200000 NULL NULL
111 1 2017 ITA INCOME 5000 5000 Recurring
111 1 2017 OC EXPENSE -200000 NULL NULL
111 1 2018 ITA INCOME 5000 5000 NULL
111 1 2018 OC EXPENSE -10000 -10000 Recurring
111 1 2018 PA INCOME 100000 100000 Recurring
111 1 2019 icc INCOME 500 NULL NULL
111 1 2019 ITA INCOME 5000 5000 NULL
111 1 2019 OC EXPENSE -10000 -10000 NULL
111 1 2019 PA INCOME 100000 100000 NULL
111 1 2020 NULL NULL NULL NULL NULL
111 1 2020 ITA INCOME 5000 5000 NULL
111 1 2020 OC EXPENSE -10000 -10000 NULL
111 1 2020 PA INCOME 100000 100000 NULL
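For comparison, the carry-forward can also be sketched without master.dbo.spt_values, using a recursive CTE that copies each recurring row into the next year until the end year is reached. The example uses Python's sqlite3 module as an assumption for a runnable demo (in SQL Server the recursive CTE is written without the RECURSIVE keyword); the data and column names are taken from the answer above, and end_year is a hypothetical parameter name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE tb (project_id INTEGER, release_no INTEGER, year INTEGER,
                 value_type TEXT, val_des TEXT, cost INTEGER,
                 run_tate INTEGER, frequency TEXT)
""")
conn.executemany(
    "INSERT INTO tb VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    [
        (111, 1, 2016, "IT", "EXPENSE", 0, None, None),
        (111, 1, 2016, "IR", "INCOME", 10000, None, None),
        (111, 1, 2016, "OC", "EXPENSE", -200000, None, None),
        (111, 1, 2016, "Vendor", "EXPENSE", -5000, None, None),
        (111, 1, 2017, "BC", "INCOME", 200000, None, None),
        (111, 1, 2017, "ITA", "INCOME", 5000, 5000, "Recurring"),
        (111, 1, 2017, "OC", "EXPENSE", -200000, None, None),
        (111, 1, 2018, "OC", "EXPENSE", -10000, -10000, "Recurring"),
        (111, 1, 2018, "PA", "INCOME", 100000, 100000, "Recurring"),
        (111, 1, 2019, "icc", "INCOME", 500, None, None),
        (111, 1, 2020, None, None, None, None, None),
    ],
)

query = """
WITH RECURSIVE carried(project_id, release_no, year, value_type,
                       val_des, cost, run_tate, frequency) AS (
  -- anchor: each recurring row, copied into the following year
  SELECT project_id, release_no, year + 1, value_type, val_des, cost, run_tate, NULL
  FROM tb
  WHERE frequency = 'Recurring' AND year < :end_year
  UNION ALL
  -- step: keep copying forward one year at a time up to end_year
  SELECT project_id, release_no, year + 1, value_type, val_des, cost, run_tate, NULL
  FROM carried
  WHERE year < :end_year
)
SELECT * FROM tb
UNION ALL
SELECT * FROM carried
ORDER BY year, value_type
"""

rows = conn.execute(query, {"end_year": 2020}).fetchall()
ita_years = [r[2] for r in rows if r[3] == "ITA"]
print(len(rows), ita_years)
```

With the sample data this yields the same 18 rows as the output above: the 11 original rows plus ITA for 2018 to 2020 and OC and PA for 2019 and 2020.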