Aggregating table values by dates in table [SQL] - sql

I have a table with customer_id, sku, date, transaction_order (whether it is that customer's 1st, 2nd, ... order) and amount. My end goal is to find the total spent by each customer during their first year (the exact dates are relative to each customer's first purchase).
The sales data table is rld_sales with the following columns:
customer_id,
sku,
date (date of purchase),
transaction_order (1 for that customer's 1st order, 2 for the 2nd, ...),
amount
-- CREATES TABLE OF CUSTOMER_ID AND Y1, Y2 [END DATES]
CREATE TABLE customer_timetable AS (SELECT
rld_sales. "customer_id" AS cust_id,
CAST(date(rld_sales."date") + INTERVAL '1 year' AS DATE) as y1,
CAST(date(rld_sales."date") + INTERVAL '2 year' AS DATE) as y2
FROM
rld_sales
WHERE
transaction_order = 1
GROUP BY
rld_sales. "customer_id",
date
);
-- JOINS CUSTOMER_TIMETABLE WITH RLD_SALES
CREATE TABLE t1 AS (
SELECT
customer_timetable.cust_id,
customer_timetable.y1,
customer_timetable.y2,
rld_sales.*
FROM
customer_timetable
JOIN rld_sales ON (customer_timetable.cust_id = rld_sales.customer_id)
);
SELECT
t1. "cust_id",
SUM(t1. "amount"),
SUM(t2. "amount")
FROM
t1
WHERE
CAST(date AS DATE) BETWEEN y1
AND y2
LEFT JOIN (
SELECT
t1. "amount"
FROM
t1
WHERE
CAST(date AS DATE) > t1. "y2") t2 ON t1. "cust_id" = t2. "cust_id"
GROUP BY
t1. "cust_id"
SELECT
*
FROM
customer_timetable;
So far I've had minimal success creating intermediate tables and joining one at a time, but I feel like there has to be a much more elegant way to make it all happen in a single query.

Assuming you are using MySQL, the following query should give you a guideline for writing the SQL you need:
SELECT
A.customer_id,
CASE
WHEN A.`date` >= DATE_ADD(CAST(NOW() AS DATE), INTERVAL -1 YEAR) THEN 'Last_Year'
WHEN A.`date` >= DATE_ADD(CAST(NOW() AS DATE), INTERVAL -2 YEAR) THEN 'Last_Previous_Year'
ELSE 'Before_That'
END AS `Year`,
SUM(amount) -- Not sure which table holds amount. Add the table alias before the column name if required
FROM rld_sales A
INNER JOIN customer_timetable B ON A.customer_id = B.cust_id
GROUP BY
A.customer_id,
CASE
WHEN A.`date` >= DATE_ADD(CAST(NOW() AS DATE), INTERVAL -1 YEAR) THEN 'Last_Year'
WHEN A.`date` >= DATE_ADD(CAST(NOW() AS DATE), INTERVAL -2 YEAR) THEN 'Last_Previous_Year'
ELSE 'Before_That'
END
To get the yearly values as separate columns, one row per customer ID, use the following query:
SELECT A.customer_id,
SUM(CASE WHEN A.`Year` = 'Last_Year' THEN Amount ELSE 0 END) AS `Last_Year`,
SUM(CASE WHEN A.`Year` = 'Last_Previous_Year' THEN Amount ELSE 0 END) AS `Last_Previous_Year`,
SUM(CASE WHEN A.`Year` = 'Before_That' THEN Amount ELSE 0 END) AS `Before_That`
FROM
(
SELECT
A.customer_id,
CASE
WHEN A.`date` >= DATE_ADD(CAST(NOW() AS DATE), INTERVAL -1 YEAR) THEN 'Last_Year'
WHEN A.`date` >= DATE_ADD(CAST(NOW() AS DATE), INTERVAL -2 YEAR) THEN 'Last_Previous_Year'
ELSE 'Before_That'
END AS `Year`,
SUM(amount) Amount -- Not sure which table holds amount. Add the table alias before the column name if required
FROM rld_sales A
INNER JOIN customer_timetable B ON A.customer_id = B.cust_id
GROUP BY
A.customer_id,
CASE
WHEN A.`date` >= DATE_ADD(CAST(NOW() AS DATE), INTERVAL -1 YEAR) THEN 'Last_Year'
WHEN A.`date` >= DATE_ADD(CAST(NOW() AS DATE), INTERVAL -2 YEAR) THEN 'Last_Previous_Year'
ELSE 'Before_That'
END
)A
GROUP BY A.customer_id

I would write this as:
select s.customer_id, sum(s.amount)
from rld_sales s
where s.date < (select min(s2.date) + interval 1 year
from rld_sales s2
where s2.customer_id = s.customer_id
)
group by s.customer_id;
Temporary tables would only seem to make this query more complicated.
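To sanity-check the correlated-subquery approach, here is a minimal sketch that runs the same logic from Python against an in-memory SQLite database. The sample rows are made up for illustration, and SQLite's date(x, '+1 year') stands in for the `+ interval 1 year` arithmetic in the query above. Table and column names follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rld_sales (customer_id INTEGER, date TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO rld_sales VALUES (?, ?, ?)",
    [
        (1, "2019-01-10", 50.0),  # first purchase of customer 1
        (1, "2019-06-01", 25.0),  # inside the first year
        (1, "2020-03-01", 99.0),  # after the first year, excluded
        (2, "2020-02-15", 10.0),  # first purchase of customer 2
        (2, "2021-02-14", 5.0),   # last day inside the first year
        (2, "2021-02-15", 7.0),   # exactly one year later, excluded
    ],
)

# Each row is compared against that customer's own first purchase date + 1 year.
first_year_totals = conn.execute(
    """
    SELECT s.customer_id, SUM(s.amount)
    FROM rld_sales s
    WHERE s.date < (SELECT date(MIN(s2.date), '+1 year')
                    FROM rld_sales s2
                    WHERE s2.customer_id = s.customer_id)
    GROUP BY s.customer_id
    ORDER BY s.customer_id
    """
).fetchall()
print(first_year_totals)
```

The comparison is a strict `<`, so a purchase made exactly one year after the first one counts toward the second year, not the first.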

Related

Filtering query with join is not returning the correct results

I am trying to filter my query that looks at payments data by joining with another table (the accounts table), as I want the data filtered by the condition accounts.provider = 'z'. However, the results I get back are exact multiples of the real figures (13 times, 20 times, etc.), with a different multiple for different dates. The query is also really slow, so I'm looking for advice on making it run faster too.
SELECT
distinct on (t.day) t.day as day,
coalesce(collected_payments,0)
from
( SELECT day::date
FROM generate_series(timestamp '2017-03-13', current_date + interval '1 week', interval '1 day') day
) d
left JOIN (
SELECT date_trunc('day', t.payment_date)::date AS day,
sum(case when t.payment_amount > 0
and t.description not ilike '%credit%'
and t.state = 'success'
then t.payment_amount end) as collected_payments
FROM payments t
inner join payments p on p.payment_date = date_trunc('day', t.payment_date)::date
inner join accounts on accounts.id = p.account_id and accounts.provider = 'z'
where date_trunc('day', t.payment_date)::date <= current_date + interval '1 week'
and date_trunc('day', t.payment_date)::date >= current_date - interval'1 months'
GROUP BY 1
) t USING (day)
ORDER BY day desc

postgres sql query to convert group by result in multiple columns

I have two tables:
financial_account, with the column account_name
financial_transaction, with the columns transaction_date, transaction_type, transaction_amount
I need SUM(transaction_amount) where transaction_type='A' under a column SUM_A, and SUM(transaction_amount) where transaction_type='B' under a column SUM_B.
Using another Stack Overflow post as a reference, I wrote the query below:
select fa.account_name ,
to_char(current_date - interval '1' month, 'Mon-YY') as "Previous Month",
SUM(case when ft.transaction_type='A' then ft.transaction_amount else 0 end) as "SUM_A",
SUM(case when ft.transaction_type='B' then ft.transaction_amount else 0 end) as "SUM_B"
from financial_transaction ft
join financial_account fa on fa.account_name = 'XYZ'
where ft.transaction_date >= date_trunc('month', now()) - interval '1 month' and
ft.transaction_date < date_trunc('month', now())
group by ft.transaction_type,fa.account_name
having ft.transaction_type in ('A','B')
However, this query returns the data in two rows.
I need the data in a single row.
How can I get the data in one row?
Basically, you want to remove ft.transaction_type from the group by clause, so all rows of the same account are grouped together.
Let me point out that your query seems to be missing a join condition between the transactions and accounts tables.
I would write the query as:
select
fa.account_name,
date_trunc('month', current_date - interval '1 month')::date as previous_month,
sum(ft.transaction_amount) filter(where ft.transaction_type = 'A') as sum_a,
sum(ft.transaction_amount) filter(where ft.transaction_type = 'B') as sum_b
from financial_transaction ft
inner join financial_account fa on ??
where
fa.account_name = 'XYZ'
and ft.transaction_date >= date_trunc('month', current_date) - interval '1 month'
and ft.transaction_date < date_trunc('month', current_date)
and ft.transaction_type in ('A','B')
group by fa.account_name
Changes to your original code:
Fixed the group by clause
I represented that missing join condition as ??.
The condition on the transaction type should belong to the where clause rather than the having clause
The conditional sums can be simplified with the standard filter clause
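The effect of the group by fix can be checked with a small sketch run from Python against an in-memory SQLite database. Everything in the sample data is made up, and account_id is a hypothetical join key standing in for the ?? condition, since the question does not show the real one. The sums use the CASE form so the sketch also runs on older SQLite builds; in PostgreSQL the filter clause gives the same result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- account_id is a hypothetical join key; the question does not show one
    CREATE TABLE financial_account (account_id INTEGER, account_name TEXT);
    CREATE TABLE financial_transaction (
        account_id INTEGER, transaction_type TEXT, transaction_amount REAL);
    INSERT INTO financial_account VALUES (1, 'XYZ'), (2, 'OTHER');
    INSERT INTO financial_transaction VALUES
        (1, 'A', 100), (1, 'A', 50), (1, 'B', 30), (2, 'A', 999);
    """
)

pivot = """
    SELECT fa.account_name,
           SUM(CASE WHEN ft.transaction_type = 'A'
                    THEN ft.transaction_amount ELSE 0 END) AS sum_a,
           SUM(CASE WHEN ft.transaction_type = 'B'
                    THEN ft.transaction_amount ELSE 0 END) AS sum_b
    FROM financial_transaction ft
    JOIN financial_account fa ON fa.account_id = ft.account_id
    WHERE fa.account_name = 'XYZ'
      AND ft.transaction_type IN ('A', 'B')
    GROUP BY {group_by}
    ORDER BY sum_a DESC
"""

# Grouping by transaction_type as well yields one row per type.
two_rows = conn.execute(
    pivot.format(group_by="ft.transaction_type, fa.account_name")).fetchall()
# Grouping by the account only collapses both sums into a single row.
one_row = conn.execute(pivot.format(group_by="fa.account_name")).fetchall()

print(two_rows)
print(one_row)
```

With a real join condition in place, the transactions of account 'OTHER' no longer leak into the sums for 'XYZ', which the original `on fa.account_name = 'XYZ'` join would not prevent.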
Assuming the rest of your query works as intended, you can write it like this:
select
fa.account_name ,
to_char(current_date - interval '1' month, 'Mon-YY') as "Previous Month",
SUM(case when ft.transaction_type='A' then ft.transaction_amount else 0 end) as "SUM_A",
SUM(case when ft.transaction_type='B' then ft.transaction_amount else 0 end) as "SUM_B"
from financial_transaction ft
join financial_account fa on fa.account_name = 'XYZ'
where ft.transaction_date >= date_trunc('month', now()) - interval '1 month' and
ft.transaction_date < date_trunc('month', now()) and ft.transaction_type in ('A','B')
group by 1,2
You can also write it with the filter clause:
select
fa.account_name ,
to_char(current_date - interval '1' month, 'Mon-YY') as "Previous Month",
SUM(ft.transaction_amount) filter (where ft.transaction_type='A') as "SUM_A",
SUM(ft.transaction_amount) filter (where ft.transaction_type='B') as "SUM_B"
from financial_transaction ft
join financial_account fa on fa.account_name = 'XYZ'
where ft.transaction_date >= date_trunc('month', now()) - interval '1 month' and
ft.transaction_date < date_trunc('month', now()) and ft.transaction_type in ('A','B')
group by 1,2

How to calculate SUM for each criteria in 1 field in SQL?

I am back again lol, I am trying to calculate the following:
Find out how many users had a balance above £2000 at least once in the last 30 days. Each user's balance should be credit minus debit.
I have attached the database
I have tried the following, basically a self join, but the output is missing values.
SELECT user_id, (credit_amount - debit_amount) AS balance
FROM (SELECT A.user_id, A.type, B.type, A.amount AS debit_amount, B.amount AS credit_amount
FROM public.transaction A, public.transaction B
WHERE A.user_id = B.user_id
AND a.type LIKE 'debit'
AND b.type LIKE 'credit'
AND A.created_at >= CURRENT_DATE - INTERVAL '30 days'
AND A.created_at <= CURRENT_DATE) AS table_1
WHERE (credit_amount - debit_amount) > 2000
;
However, user_id 3 is skipped because it has no credit during the time interval, and some other values are missing too. Any help would be nice, thank you.
find out how many users had a balance above £2000 at least once in the last 30 days,
You can use window functions to compute the running balance of each user during the period. Then, you just need to count the distinct users whose running balance ever exceeded the threshold:
select count(distinct user_id) no_users
from (
select
user_id,
sum(case when type = 'credit' then amount else -amount end)
over(partition by user_id order by created_at) balance
from transaction
where created_at >= current_date - interval '30' day and created_at < current_date
) t
where balance > 2000
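A minimal sketch of the difference between "balance exceeded the threshold at least once" and "final balance exceeds the threshold", run from Python against an in-memory SQLite database (window functions need SQLite 3.25 or newer). The rows are invented, and the 30-day filter is left out so the result does not depend on today's date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE "transaction" '
    "(user_id INTEGER, type TEXT, amount REAL, created_at TEXT)"
)
conn.executemany(
    'INSERT INTO "transaction" VALUES (?, ?, ?, ?)',
    [
        (1, "credit", 1500, "2024-01-01"),
        (1, "credit", 1000, "2024-01-02"),  # running balance peaks at 2500
        (1, "debit", 900, "2024-01-03"),    # ends at 1600
        (2, "credit", 1000, "2024-01-01"),
        (2, "debit", 200, "2024-01-02"),    # never above 2000
        (3, "debit", 100, "2024-01-05"),    # debits only, no credit at all
        (4, "credit", 2500, "2024-01-04"),  # above 2000 and stays there
    ],
)

signed = "CASE WHEN type = 'credit' THEN amount ELSE -amount END"

# Running balance per user: count users whose balance ever exceeded 2000.
no_users = conn.execute(
    f"""
    SELECT COUNT(DISTINCT user_id)
    FROM (SELECT user_id,
                 SUM({signed}) OVER (PARTITION BY user_id
                                     ORDER BY created_at) AS balance
          FROM "transaction") t
    WHERE balance > 2000
    """
).fetchone()[0]

# For comparison: users whose balance at the end of the period exceeds 2000.
final_only = conn.execute(
    f"""
    SELECT COUNT(*)
    FROM (SELECT user_id FROM "transaction"
          GROUP BY user_id HAVING SUM({signed}) > 2000) t
    """
).fetchone()[0]

print(no_users, final_only)
```

User 1 is counted by the running-balance query but not by the plain aggregate, and user 3 (debits only) is handled without any special casing because no self join is involved.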
Use conditional aggregation:
select user_id,
(coalesce(sum(amount) filter (where type = 'credit'), 0) -
coalesce(sum(amount) filter (where type = 'debit'), 0)
) as balance
from public.transaction t
where t.created_at >= CURRENT_DATE - INTERVAL '30 days' and
t.created_at < CURRENT_DATE
group by user_id;
SELECT a.user_id,
c.credit_amount - b.debit_amount AS balance
FROM public.transaction a
JOIN (SELECT
user_id, type, amount AS debit_amount
FROM public.transaction
WHERE type LIKE 'debit') b ON a.user_id = b.user_id
JOIN (SELECT
user_id, type, amount AS credit_amount
FROM public.transaction
WHERE type LIKE 'credit') c ON a.user_id = c.user_id
WHERE a.created_at >= CURRENT_DATE - INTERVAL '30 days'
AND a.created_at <= CURRENT_DATE
AND (c.credit_amount - b.debit_amount) > 2000
GROUP BY a.user_id, c.credit_amount, b.debit_amount;

Summing Moving Range and Criteria, Grouping by Day

What I'm trying to do is sum the last 30 days based on criteria and group by the day, one day of code looks like this:
select
sum(case when f.hire_date__c between '2017-08-01 00:00:00' and '2017-09-01 00:00:00'
and t.createddate between '2017-08-01 00:00:00' and '2017-09-01 00:00:00'
and t.name = 'Request' then 1 else 0 end) as Requests
from case_task_c as t
join case_file_c as f
on f.id = t.case_file__c
I could adjust the dates for the 30-day lookback based on today's date, etc. What I can't figure out is how to have this query group by day, i.e. produce yesterday's results, the day before's, and so on, each with its own adjusted date range.
So far I have this:
select
date(cast(f.hire_date__c as date)),
row_number() over (order by f.hire_date__c desc) as rownumber,
rr.Cancels as Cancels,
qq.hires as hires,
sum(rr.Cancels) over (rows between 1 following and 30 following) as CumulCancel,
sum(qq.Hires) over (rows between 1 following and 30 following) as Hires
from case_file_c as f
left join(
select
cast(f.hire_date__c as date) as date1,
sum(case when
t.name = 'Cancellation Request' then 1 else 0 end) as Cancels
from case_task_c as t
join case_file_c as f
on f.id = t.case_file__c
group by date1)
as rr
on rr.date1 = cast(f.hire_date__c as date)
left join(
select
cast(f.hire_date__c as date) as date2,
sum(case when f.hire_date__c is not null then 1 else 0 end) as hires
from sf_case_file_c as f
group by date2) as qq
on qq.date2 = cast(f.hire_date__c as date)
where f.hire_date__c is not null
and f.hire_date__c >= '2017-01-01 00:00:00'
and f.hire_date__c between date_add('day',-30,current_date) and current_date
group by f.hire_date__c, rr.Cancels, qq.hires
order by f.hire_date__c desc
Even using 'current_date - interval -30 day' just looks up the current date.
Using Postgres 8.0.2
Use a group by like the following. You are converting the datetime to a date in the select list but not in the group by:
GROUP BY date(cast(f.hire_date__c as date)),rr.Cancels, qq.hires

Postgresql group month wise with missing values

First, an example of my table:
id_object;time;value;status
1;2014-05-22 09:30:00;1234;1
1;2014-05-22 09:31:00;2341;2
1;2014-05-22 09:32:00;1234;1
...
1;2014-06-01 00:00:00;4321;1
...
Now I need to count all rows with status=1 and id_object=1 per month, for example. This is my query:
SELECT COUNT(*)
FROM my_table
WHERE id_object=1
AND status=1
AND extract(YEAR FROM time)=2014
GROUP BY extract(MONTH FROM time)
The result for this example is:
2
1
2 for May and 1 for June, but I need an output with all 12 months, including months with no data. For this example I need this output:
0 0 0 0 2 1 0 0 0 0 0 0
Thx for help.
You can use the generate_series() function like this:
select
g.month,
count(m)
from generate_series(1, 12) as g(month)
left outer join my_table as m on
m.id_object = 1 and
m.status = 1 and
extract(year from m.time) = 2014 and
extract(month from m.time) = g.month
group by g.month
order by g.month
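The same left-join idea can be tried out from Python against an in-memory SQLite database. This is only a sketch: SQLite has no built-in generate_series(), so a recursive CTE producing 1..12 stands in for it, and strftime() replaces extract(). The rows are the ones from the question's example table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table "
    "(id_object INTEGER, time TEXT, value INTEGER, status INTEGER)"
)
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?)",
    [
        (1, "2014-05-22 09:30:00", 1234, 1),
        (1, "2014-05-22 09:31:00", 2341, 2),  # status 2, not counted
        (1, "2014-05-22 09:32:00", 1234, 1),
        (1, "2014-06-01 00:00:00", 4321, 1),
    ],
)

rows = conn.execute(
    """
    WITH RECURSIVE g(month) AS (
        SELECT 1
        UNION ALL
        SELECT month + 1 FROM g WHERE month < 12
    )
    SELECT g.month, COUNT(m.id_object)
    FROM g
    LEFT JOIN my_table m
      ON m.id_object = 1
     AND m.status = 1
     AND CAST(strftime('%Y', m.time) AS INTEGER) = 2014
     AND CAST(strftime('%m', m.time) AS INTEGER) = g.month
    GROUP BY g.month
    ORDER BY g.month
    """
).fetchall()

counts = [n for _, n in rows]
print(counts)
```

The filter conditions sit in the ON clause rather than in a WHERE clause. Moving them to WHERE would discard the months without matching rows and undo the outer join.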
Rather than comparing with an extracted value, you'll want to use a range-table instead. Something that looks like this:
month startOfMonth nextMonth
1 '2014-01-01' '2014-02-01'
2 '2014-02-01' '2014-03-01'
......
12 '2014-12-01' '2015-01-01'
As in @Roman's answer, we'll start with generate_series(), this time using it to generate the range table:
WITH Month_Range AS (SELECT EXTRACT(MONTH FROM month) AS month,
month AS startOfMonth,
month + INTERVAL '1 MONTH' AS nextMonth
FROM generate_series(CAST('2014-01-01' AS DATE),
CAST('2014-12-01' AS DATE),
INTERVAL '1 month') AS mr(month))
SELECT Month_Range.month, COUNT(My_Table)
FROM Month_Range
LEFT JOIN My_Table
ON My_Table.time >= Month_Range.startOfMonth
AND My_Table.time < Month_Range.nextMonth
AND my_table.id_object = 1
AND my_table.status = 1
GROUP BY Month_Range.month
ORDER BY Month_Range.month
(As a side note, I'm now annoyed at how PostgreSQL handles intervals)
The use of the range will allow any index including My_Table.time to be used (although not if an index was built over an EXTRACTed column).
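A sketch of the half-open range join, run from Python against an in-memory SQLite database. It is an approximation of the PostgreSQL query rather than a port: a recursive CTE builds the range table, date(x, '+1 month') replaces the interval arithmetic, and timestamps are stored as ISO-8601 text so they compare correctly as strings. The column names start_of_month and next_month are chosen for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table "
    "(id_object INTEGER, time TEXT, value INTEGER, status INTEGER)"
)
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?)",
    [
        (1, "2014-05-22 09:30:00", 1234, 1),
        (1, "2014-05-22 09:31:00", 2341, 2),
        (1, "2014-05-22 09:32:00", 1234, 1),
        (1, "2014-06-01 00:00:00", 4321, 1),  # sits exactly on a boundary
    ],
)

range_rows = conn.execute(
    """
    WITH RECURSIVE month_range(start_of_month, next_month) AS (
        SELECT '2014-01-01', date('2014-01-01', '+1 month')
        UNION ALL
        SELECT next_month, date(next_month, '+1 month')
        FROM month_range
        WHERE next_month < '2015-01-01'
    )
    SELECT r.start_of_month, COUNT(m.id_object)
    FROM month_range r
    LEFT JOIN my_table m
      ON m.time >= r.start_of_month   -- inclusive lower bound
     AND m.time <  r.next_month       -- exclusive upper bound
     AND m.id_object = 1
     AND m.status = 1
    GROUP BY r.start_of_month
    ORDER BY r.start_of_month
    """
).fetchall()

range_counts = [n for _, n in range_rows]
print(range_counts)
```

Because the upper bound is exclusive, the row stamped 2014-06-01 00:00:00 is counted once, for June, and the column itself is compared directly, which is what lets a plain index on it be used.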
EDIT:
Modified query to take advantage of the fact that generate_series(...) will also handle date/time series.
generate_series can generate timestamp series
select
g.month,
count(t)
from
generate_series(
(select date_trunc('year', min(t.time)) from t),
(select date_trunc('year', max(t.time)) + interval '11 months' from t),
interval '1 month'
) as g(month)
left outer join
t on
t.id_object = 1 and
t.status = 1 and
date_trunc('month', t.time) = g.month
where date_trunc('year', g.month) = '2014-01-01'::date
group by g.month
order by g.month