Joining two aggregated queries

I have two tables that look like this:
users:
id | created_at
payments:
id | created_at
I need a table that is grouped by year and month and contains both the number of users and the number of payments:
stats:
month | year | users | payments
Here the users column contains the number of registered users and the payments column contains the number of payments. I can get the two result sets separately, but how can I join them?
select
month(created_at) as month,
year(created_at) as year,
count(*) users
from
users
group by
month, year
having
users > 0
order by
year desc, month desc;
select
month(created_at) as month,
year(created_at) as year,
count(*) payments
from
payments
group by
month, year
having
payments > 0
order by
year desc, month desc;

The comparisons users > 0 and payments > 0 are useless: a group only exists if it contains at least one row, so count(*) is never 0. In addition, order by in subqueries is meaningless.
You can do this with a full join:
select month, year, coalesce(users, 0) as users, coalesce(payments, 0) as payments
from (select month(created_at) as month, year(created_at) as year,
count(*) as users
from users
group by month, year
) u full join
(select month(created_at) as month, year(created_at) as year,
count(*) as payments
from payments
group by month, year
) p
using (month, year)
order by year desc, month desc;
If you know you have users and payments for all months (that you care about), you can use an inner join rather than a full join.
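Some engines, MySQL among them, do not support full join. A minimal sketch of an alternative for that case: stack the two tables with union all, then count each kind of row per month in a single aggregation. The column name kind and the alias t are illustrative and not part of the original schema.

```sql
-- Sketch: combine both tables first, then aggregate once.
-- "kind" is an illustrative label column, not part of the original tables.
select year(created_at)  as year,
       month(created_at) as month,
       sum(case when kind = 'user'    then 1 else 0 end) as users,
       sum(case when kind = 'payment' then 1 else 0 end) as payments
from (select created_at, 'user' as kind from users
      union all
      select created_at, 'payment' as kind from payments
     ) t
group by year(created_at), month(created_at)
order by year desc, month desc;
```

A month that has only users or only payments still appears, with 0 in the other column, which is the same result the full join with coalesce is meant to produce.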

I think this is what you're looking for:
select a.*, b.payments from (
select month(created_at) as month, year(created_at) as year, count(*) users
from users group by month, year having users > 0 order by year desc, month desc
) a left join (
select month(created_at) as month, year(created_at) as year, count(*) payments
from payments group by month, year having payments > 0 order by year desc, month desc
) b on a.month = b.month and a.year = b.year

Related

PostgreSQL Query To Obtain Value that Occurs more than once in 12 months

I have the following query to return the number of users that booked a flight at least twice, but I need to identify those who booked a flight more than once within a 12-month range:
SELECT COUNT(*)
FROM sales
WHERE customer in
(
SELECT customer
FROM sales
GROUP BY customer
HAVING COUNT(*) > 1
)

You would use window functions. The simplest method is lag():
select count(distinct customer)
from (select s.*,
lag(date) over (partition by customer order by date) as prev_date
from sales s
) s
where prev_date > s.date - interval '12 month';
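To see what the lag() comparison does, here is a small self-contained sketch in PostgreSQL syntax. The sample rows are made up for illustration, and the column names customer and date simply follow the query above.

```sql
-- Made-up sample data: customer 'a' has two bookings about five months apart,
-- customer 'b' has two bookings fourteen months apart.
with sales(customer, date) as (
    values ('a', date '2019-01-10'),
           ('a', date '2019-06-01'),
           ('b', date '2018-01-01'),
           ('b', date '2019-03-01')
)
select count(distinct customer)
from (select s.*,
             -- previous booking date of the same customer, NULL for the first booking
             lag(date) over (partition by customer order by date) as prev_date
      from sales s
     ) s
-- keep a booking only if the previous one is less than 12 months older
where prev_date > s.date - interval '12 month';
```

With this data only customer 'a' should pass the filter, because the gap between the two bookings of 'b' is longer than 12 months.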

At the cost of a self-join, @AdrianKlaver's answer can be adapted to any 12-month period:
SELECT COUNT(DISTINCT customer) FROM
(SELECT s1.customer
FROM sales s1
JOIN sales s2
ON s1.customer = s2.customer
AND s1.ticket_id <> s2.ticket_id
AND s2.date_field BETWEEN s1.date_field AND (s1.date_field + interval '1 year')
GROUP BY s1.customer) AS subquery;

A stab at it with a made-up date field:
SELECT COUNT(*)
FROM sales
WHERE customer in
(
SELECT customer
FROM sales
WHERE date_field BETWEEN '01/01/2019' AND '12/31/2019'
GROUP BY customer
HAVING COUNT(*) > 1
)

SQL order with equal group size

I have a table with columns month, name and transaction_id. I would like to count the number of transactions per month and name. However, for each month I want to have the top N names with the highest transaction counts.
The following query groups by month and name. However the LIMIT is applied to the complete result and not per month:
SELECT
month,
name,
COUNT(*) AS transaction_count
FROM my_table
GROUP BY month, name
ORDER BY month, transaction_count DESC
LIMIT N
Does anyone have an idea how I can get the top N results per month?

Use row_number():
SELECT month, name, transaction_count
FROM (SELECT month, name, COUNT(*) AS transaction_count,
ROW_NUMBER() OVER (PARTITION BY month ORDER BY COUNT(*) DESC) as seqnum
FROM my_table
GROUP BY month, name
) mn
WHERE seqnum <= N
ORDER BY month, transaction_count DESC
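ROW_NUMBER() breaks ties arbitrarily, so when two names have the same count at the cutoff, only one of them is kept. If all tied names should be returned, a sketch using RANK() instead (same table and column names as above; N remains a placeholder for the group size):

```sql
SELECT month, name, transaction_count
FROM (SELECT month, name, COUNT(*) AS transaction_count,
             -- RANK() gives tied counts the same rank, so a month can return more than N rows
             RANK() OVER (PARTITION BY month ORDER BY COUNT(*) DESC) AS rnk
      FROM my_table
      GROUP BY month, name
     ) mn
WHERE rnk <= N
ORDER BY month, transaction_count DESC
```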

Running Count Distinct using Over Partition By

I have a data set with user ids that have made purchases over time. I would like to show a YTD distinct count of users that have made a purchase, partitioned by State and Country. The output would have 5 columns: Country, State, Year, Month, and YTD Count of Distinct Users with purchase activity.
Is there a way to do this? The following code works when I exclude the month from the view and do a distinct count:
Select Year, Country, State,
COUNT(DISTINCT (CASE WHEN ActiveUserFlag > 0 THEN MBR_ID END)) AS YTD_Active_Member_Count
From MemberActivity
Where Month <= 5
Group By 1,2,3;
The issue occurs when a user has purchases across multiple months: I can't aggregate at a monthly level and then sum, because that counts the same user more than once.
I need to see the YTD count for each month of the year, for trending purposes.

Return each member only once for the first month they make a purchase, count by month and then apply a Cumulative Sum:
select Year, Country, State, month,
sum(cnt)
over (partition by Year, Country, State
order by month
rows unbounded preceding) AS YTD_Active_Member_Count
from
(
Select Year, Country, State, month,
COUNT(*) as cnt -- first purchases per month
From
( -- this assumes there's at least one new active member per year/month/country
-- otherwise there would be missing rows
Select *
from MemberActivity
where ActiveUserFlag > 0 -- only active members
and Month <= 5
-- and year = 2019 -- seems to be for this year only
qualify row_number() -- only first purchase per member/year
over (partition by MBR_ID, year
order by month) = 1 -- probably there's a purchase_date to order by instead
) as dt
group by 1,2,3,4
) as dt
;

Count users in the first month they appear:
select Country, State, year, month,
sum(case when ActiveUserFlag > 0 and seqnum = 1 then 1 else 0 end) as YTD_Active_Member_Count
from (select ma.*,
row_number() over (partition by MBR_ID, year order by month) as seqnum
from MemberActivity ma
) ma
where Month <= 5
group by Country, State, year, month;

How can I return a row for each group even if there were no results?

I'm working with a database containing customer orders. These orders contain the customer id, order month, order year, order half month (either first half 'FH' or last half 'LH' of the month), and quantity ordered.
I want to query monthly totals for each customer for a given month. Here's what I have so far.
SELECT id, half_month, month, year, SUM(nbr_ord)
FROM Orders
WHERE month = 7
AND year = 2015
GROUP BY id, half_month, year, month
The problem with this is that if a customer did not order anything during one half_month there will not be a row returned for that period.
I want there to be a row for each customer for every half month. If they didn't order anything during a half month then a row should be returned with their id, the month, year, half month, and 0 for number ordered.

First, generate all the rows, which you can do with a cross join of the customers and the time periods. Then, bring in the information for the aggregation:
select i.id, t.half_month, t.month, t.year, coalesce(sum(nbr_ord), 0)
from (select distinct id from orders) i cross join
(select distinct half_month, month, year
from orders
where month = 7 and year = 2015
) t left join
orders o
on o.id = i.id and o.half_month = t.half_month and
o.month = t.month and o.year = t.year
group by i.id, t.half_month, t.month, t.year;
Note: you might have other sources for the id and date parts. This pulls them from orders.
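As a sketch of that note, suppose there is a customers table with one row per customer id (a hypothetical table, not mentioned in the question) and the two half-month codes are listed explicitly. Customers with no orders at all then still get their two rows:

```sql
-- Assumption: "customers" is a hypothetical table holding every customer id.
-- The half-month codes 'FH' and 'LH' are listed by hand instead of read from orders.
select c.id, t.half_month, 7 as month, 2015 as year,
       coalesce(sum(o.nbr_ord), 0) as nbr_ord
from customers c cross join
     (select 'FH' as half_month union all select 'LH') t left join
     orders o
     on o.id = c.id and o.half_month = t.half_month and
        o.month = 7 and o.year = 2015
group by c.id, t.half_month;
```

The select without a from clause is accepted by MySQL, PostgreSQL and SQL Server; other engines may need a dummy table or a values list there.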

If you know the entire dataset has an occurrence of each half_month, month, year combination, you could use the list of those three columns as the left side of a left join. That would look like this:
Select t1.half_month, t1.month, t1.year, t2.ID, t2.nbr_ord from
(Select distinct half_month, month, year from Orders where month = 7 and year = 2015)t1
Left Join
(SELECT id, half_month, month, year, SUM(nbr_ord)nbr_ord
FROM Orders
WHERE month = 7
AND year = 2015
GROUP BY id, half_month, year, month)t2
on t1.half_month = t2.half_month
and t1.month = t2.month
and t1.year = t2.year

SELECT m.id, m.half_month, m.year, t.nbr_order
FROM (
SELECT Id, sum(nbr_ord) AS nbr_order
FROM Orders
GROUP BY id
) t
INNER JOIN Orders m
ON t.Id = m.id
WHERE m.month = 7
AND m.year = 2015;

Group by select that returns only groups with updated rows

I have a table "Postings":
CREATE TABLE POSTINGS(
Account_FK INT,
Department_FK INT,
Project_FK INT,
Company_FK INT,
Year INT,
Month INT,
Amount float,
Handled BIT
)
I'm trying to write a select statement that returns the sum of Amount for each company and each month.
Like this:
SELECT Company_FK, Year, Month, Sum(Amount)
FROM Postings
GROUP BY Company_FK, Year, Month
But I only need the rows that have not been handled, i.e. the rows with Handled = 0:
SELECT Company_FK, Year, Month, Sum(Amount)
FROM Postings
WHERE Handled = 0
GROUP BY Company_FK, Year, Month
Now this query sums only the rows with Handled = 0 for each company, year, and month.
But I also need the sums to include all the other rows for the company. That is, if any row for the company is not handled, I need to return the sum over all of that company's rows.
So if Company_FK = 1 has three postings, all of which have Handled = 1, that company can be ignored. But if Company_FK = 2 has three postings and one of them has Handled = 0, I need to return the sum of all three rows.
Do you understand what I mean?
Any suggestions?

Try this:
SELECT distinct c.Company_FK, c.Year, c.Month, b.Amount FROM Postings c
INNER JOIN
(SELECT Company_FK, Year, Month, Sum(Amount) as 'Amount'
FROM Postings
GROUP BY Company_FK, Year, Month) b ON c.Company_FK = b.Company_FK and c.Year = b.Year and c.Month = b.Month
WHERE c.Handled = 0

You can add a HAVING clause after your GROUP BY:
HAVING COUNT(*) > SUM(CONVERT(INT, Handled))
You will need to remove the WHERE clause as well, because the HAVING clause becomes the filter.
Full query:
SELECT Company_FK, Year, Month, Sum(Amount)
FROM Postings
GROUP BY Company_FK, Year, Month
HAVING COUNT(*) > SUM(CONVERT(INT, Handled))
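The condition works because COUNT(*) counts every row in the group while SUM(CONVERT(INT, Handled)) counts only the handled ones, so the comparison is true exactly when at least one row is unhandled. A sketch of a form that states this directly, assuming Handled is never NULL:

```sql
SELECT Company_FK, Year, Month, SUM(Amount) AS Amount
FROM Postings
GROUP BY Company_FK, Year, Month
-- keep the group when at least one of its rows is unhandled
HAVING SUM(CASE WHEN Handled = 0 THEN 1 ELSE 0 END) > 0
```

The CASE expression also avoids the conversion, since SQL Server cannot sum a BIT column directly.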

SELECT Company_FK, Year, Month, Sum(Amount) as "Amount"
FROM Postings
GROUP BY Company_FK, Year, Month
HAVING COUNT(*) > SUM(CONVERT(INT, Handled))

SELECT Company_FK, Year, Month, Sum(Amount)
FROM Postings p
WHERE Exists (select top 1 1 from Postings po where p.company_fk=po.company_fk and Handled=0 )
GROUP BY Company_FK, Year, Month
