How to get row count based on other column groups - sql

I need to number the invoice dates for each branch within each month.
Consider the table named Table1
Branch Month Date Amount
----------------------------------------
B1 April 01/04/20 10000
B1 April 14/04/20 13000
B1 May 01/05/20 25000
B1 May 14/05/20 23000
I tried the following code:
Select
Branch, Month, Date, Amount,
Row_Number() over (partition by Branch order by Month) as rowcount
from table1
and the result was
Branch Month Date Amount rowcount
----------------------------------------
B1 April 01/04/20 10000 1
B1 April 14/04/20 13000 2
B1 May 01/05/20 25000 3
B1 May 14/05/20 23000 4
The numbering I need should restart for each month within a branch. The desired result is:
Branch Month Date Amount rowcount
---------------------------------------------
B1 April 01/04/20 10000 1
B1 April 14/04/20 13000 2
B1 May 01/05/20 25000 1
B1 May 14/05/20 23000 2
Here rowcount is based on both the Branch and Month columns. How can I get this result?

You need to partition by month. Something like this:
row_number() over (partition by Branch, month order by Month)
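Note that Month appears in both the partition by and the order by, so the ordering within each partition is arbitrary. SQL Server requires an order by for row_number(), so you have to supply something. Other options are: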
row_number() over (partition by Branch, month order by (select null))
row_number() over (partition by Branch, month order by date)
I suspect the last is what you really want.
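To check the last variant against the sample data, here is a minimal sketch that runs it in an in-memory SQLite database through Python's sqlite3 module. SQLite stands in for SQL Server here (an assumption; window functions need SQLite 3.25 or later), and the dates are rewritten as ISO strings so that ordering by Date sorts chronologically.

```python
import sqlite3

# In-memory SQLite stand-in for Table1; dates are ISO strings so they sort correctly.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (Branch TEXT, Month TEXT, Date TEXT, Amount INTEGER)")
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?, ?, ?)",
    [
        ("B1", "April", "2020-04-01", 10000),
        ("B1", "April", "2020-04-14", 13000),
        ("B1", "May", "2020-05-01", 25000),
        ("B1", "May", "2020-05-14", 23000),
    ],
)

# Numbering restarts for every (Branch, Month) pair and follows the invoice date.
query = """
SELECT Branch, Month, Date, Amount,
       ROW_NUMBER() OVER (PARTITION BY Branch, Month ORDER BY Date) AS rowcount
FROM Table1
ORDER BY Date
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The last column comes out as 1, 2, 1, 2, which matches the desired result in the question.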

Related

How to calculate average monthly number of some action in some period in Teradata SQL?

I have table in Teradata SQL like below:
ID trans_date
------------------------
123 | 2021-01-01
887 | 2021-01-15
123 | 2021-02-10
45 | 2021-03-11
789 | 2021-10-01
45 | 2021-09-02
I need to calculate the average monthly number of transactions made by customers in the period between 2021-01-01 and 2021-09-01, so the client with "ID" = 789 will not be counted because he made his transaction later.
In the first month (01) there were 2 transactions
In the second month there was 1 transaction
In the third month there was 1 transaction
In the ninth month there was 1 transaction
So the result should be (2+1+1+1) / 4 = 1.25, shouldn't it?
How can I calculate it in Teradata SQL? Of course, this is only a sample of my data.
SELECT ID, AVG(txns) FROM
(SELECT ID, TRUNC(trans_date,'MON') as mth, COUNT(*) as txns
FROM mytable
-- WHERE condition matches the question but likely want to
-- use end date 2021-09-30 or use mth instead of trans_date
WHERE trans_date BETWEEN date'2021-01-01' and date'2021-09-01'
GROUP BY id, mth) mth_txn
GROUP BY id;
Your logic translated to SQL:
-- (2+1+1+1) / 4; cast to FLOAT so the integer division does not truncate
SELECT id, CAST(COUNT(*) AS FLOAT) / COUNT(DISTINCT TRUNC(trans_date,'MON')) AS avg_tx
FROM mytable
WHERE trans_date BETWEEN date'2021-01-01' and date'2021-09-01'
GROUP BY id;
You should compare to Fred's answer to see which is more efficient on your data.
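As a quick sanity check of the arithmetic in the question, the sketch below computes the overall figure (not the per-ID one) on the sample rows. It runs in SQLite through Python's sqlite3 module rather than Teradata, so strftime('%Y-%m', ...) replaces TRUNC(trans_date,'MON'), and it assumes the intended period is all of January to September, written as a half-open range so that the 2021-09-02 row is included.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (ID INTEGER, trans_date TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [
        (123, "2021-01-01"), (887, "2021-01-15"), (123, "2021-02-10"),
        (45, "2021-03-11"), (789, "2021-10-01"), (45, "2021-09-02"),
    ],
)

# Transactions divided by the number of distinct months that have any transaction.
# The 1.0 factor forces floating-point division.
query = """
SELECT COUNT(*) AS txns,
       COUNT(DISTINCT strftime('%Y-%m', trans_date)) AS active_months,
       1.0 * COUNT(*) / COUNT(DISTINCT strftime('%Y-%m', trans_date)) AS avg_per_month
FROM mytable
WHERE trans_date >= '2021-01-01' AND trans_date < '2021-10-01'
"""
txns, active_months, avg_per_month = conn.execute(query).fetchone()
print(txns, active_months, avg_per_month)
```

Five transactions fall in four active months, giving 1.25. Months with no transactions at all are not counted in the denominator, which is what the question's own calculation does.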

Apply SUM( where date between date1 and date2)

My table currently looks like this:
+---------+---------------+------------+------------------+
| Segment | Product | Pre_Date | ON_Prepaid |
+---------+---------------+------------+------------------+
| RB | 01. Auto Loan | 2020-01-01 | 10645976180.0000 |
| RB | 01. Auto Loan | 2020-01-02 | 4489547174.0000 |
| RB | 01. Auto Loan | 2020-01-03 | 1853117000.0000 |
| RB | 01. Auto Loan | 2020-01-04 | 9350258448.0000 |
+---------+---------------+------------+------------------+
I'm trying to sum values of 'ON_Prepaid' over the course of 7 days, let's say from '2020-01-01' to '2020-01-07'.
Here is what I've tried
drop table if exists ##Prepay_summary_cash
select *,
[1W_Prepaid] = sum(ON_Prepaid) over (partition by SEGMENT, PRODUCT order by PRE_DATE rows between 1 following and 7 following),
[2W_Prepaid] = sum(ON_Prepaid) over (partition by SEGMENT, PRODUCT order by PRE_DATE rows between 8 following and 14 following),
[3W_Prepaid] = sum(ON_Prepaid) over (partition by SEGMENT, PRODUCT order by PRE_DATE rows between 15 following and 21 following),
[1M_Prepaid] = sum(ON_Prepaid) over (partition by SEGMENT, PRODUCT order by PRE_DATE rows between 22 following and 30 following),
[1.5M_Prepaid] = sum(ON_Prepaid) over (partition by SEGMENT, PRODUCT order by PRE_DATE rows between 31 following and 45 following),
[2M_Prepaid] = sum(ON_Prepaid) over (partition by SEGMENT, PRODUCT order by PRE_DATE rows between 46 following and 60 following),
[3M_Prepaid] = sum(ON_Prepaid) over (partition by SEGMENT, PRODUCT order by PRE_DATE rows between 61 following and 90 following),
[6M_Prepaid] = sum(ON_Prepaid) over (partition by SEGMENT, PRODUCT order by PRE_DATE rows between 91 following and 181 following)
into ##Prepay_summary_cash
from ##Prepay1
Things should be fine if the dates are continuous; however, there are some missing days in 'Pre_Date' (you know banks don't work on Sundays, etc.).
So I'm trying to work on something like
[1W] = SUM(ON_Prepaid) over (where Pre_date between dateadd(d,1,Pre_date) and dateadd(d,7,Pre_date))
or something like that. So if, say, there is no record for 2020-01-05, the result should sum only the 1st, 2nd, 3rd, 4th, 6th and 7th of Jan 2020, instead of the 1st, 2nd, 3rd, 4th, 6th, 7th and 8th (the 8th being pulled in by "rows 7 following"). Likewise, if records are missing for a span of 30 days, those 30 days should be summed as 0s, so a 45-day window should return only the value of the 15 days that have data.
I've looked all over the forum and the answers did not suffice. Can you please help me out, or link me to a thread where the problem has already been solved?
Thank you so much.
"Things should be fine if the dates are continuous"
Then make them continuous. Left join your real data (grouped up so it is one row per day) onto a calendar table (make one, or use a recursive CTE to generate a list of 360 dates from X onward) and your query will work:
WITH d as
(
SELECT *
FROM
(
SELECT *
FROM cal
CROSS JOIN
(SELECT DISTINCT segment s, product p FROM ##Prepay1) x
) c
LEFT JOIN ##Prepay1 p
ON
c.d = p.pre_date AND
c.s = p.segment AND
c.p = p.product
WHERE
c.d BETWEEN '2020-01-01' AND '2021-01-01' -- date range on c.d not p.pre_date
)
--use d.d/s/p not d.pre_date/segment/product in your query (sometimes the latter are null)
select *,
[1W_Prepaid] = sum(ON_Prepaid) over (partition by d.s, d.p order by d.d rows between 1 following and 7 following),
...
CAL is just a table with a single column of dates, one per day, no time, extending for n thousand days into the past/future
Note that months have a variable number of days, so 6M is a bit of a misnomer. It might be better to call the month columns 180D, 90D, etc.
Also note that your query divides the data into groups per row. If you want to sum up to 180 days after the date of the row, you need to pull a year's worth of data so that on row 180 (June) you have the December data available to sum (December being 6 months from June).
If you then want to restrict your query to only showing up to June (but including data summed from 6 months after June), you need to wrap it all again in a subquery. You cannot "where between jan and jun" in the query that does the sum over, because where clauses are evaluated before window functions (doing so will remove the December data before it is summed).
Some other databases make this easier; Oracle and Postgres spring to mind. They can sum over a range where the other rows' values are within some distance of the current row's value. SQL Server only usefully supports distancing based on a row's index rather than its values (the distancing-based-on-values support is limited to "rows that have the same value", rather than "rows that have values n higher or lower than the current row"). I suppose the requirement could be met with a cross apply, or a correlated subquery in the select, though I'd be careful to check the performance:
SELECT *,
(SELECT SUM(tt.a) FROM x tt WHERE tt.x = t.x AND tt.y = t.y AND tt.z BETWEEN DATEADD(d, 1, t.z) AND DATEADD(d, 7, t.z)) AS [1W]
FROM
x t
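To see how the row-based frame and the date-based correlated subquery differ when a day is missing, here is a small sketch run in SQLite through Python's sqlite3 module (an assumption: SQLite 3.25 or later stands in for SQL Server, with date(..., '+1 day') in place of DATEADD). The table name prepay and its values are invented for illustration, and 2020-01-05 is deliberately absent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prepay (pre_date TEXT, on_prepaid INTEGER)")
conn.executemany(
    "INSERT INTO prepay VALUES (?, ?)",
    [
        ("2020-01-01", 10), ("2020-01-02", 20), ("2020-01-03", 30), ("2020-01-04", 40),
        # 2020-01-05 is missing on purpose
        ("2020-01-06", 60), ("2020-01-07", 70), ("2020-01-08", 80), ("2020-01-09", 90),
    ],
)

# by_rows counts the next 7 rows; by_dates counts rows dated 1 to 7 days ahead.
query = """
SELECT t.pre_date,
       SUM(t.on_prepaid) OVER (ORDER BY t.pre_date
                               ROWS BETWEEN 1 FOLLOWING AND 7 FOLLOWING) AS by_rows,
       (SELECT SUM(tt.on_prepaid)
        FROM prepay tt
        WHERE tt.pre_date BETWEEN date(t.pre_date, '+1 day')
                              AND date(t.pre_date, '+7 days')) AS by_dates
FROM prepay t
ORDER BY t.pre_date
"""
first = conn.execute(query).fetchone()
print(first)
```

For the 2020-01-01 row the row-based frame reaches as far as 2020-01-09 and returns 390, while the date-based subquery stops at 2020-01-08 and returns 300.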

How to get monthwise sum from table?

Table Transaction(Id, DateTime, Debit, Credit)
I want a monthwise sum of Debit and Credit.
What is a good option to retrieve monthwise result?
Sample Output:
Month Id Debit Credit
January 1 200 70
January 2 400 80
February 1 400 90
February 2 300 50
Try the script below, which uses a GROUP BY clause. I have included YEAR in the grouping; otherwise the same month from different years would be counted as one month.
SELECT YEAR(DateTime),
MONTH(DateTime),
Id,
SUM(Debit) total_debit,
SUM(Credit) total_credit
FROM your_table
GROUP BY YEAR(DateTime), MONTH(DateTime), Id
Apply a GROUP BY clause to the SQL query:
group by month(DateTime),Year(DateTime)
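The effect of including the year in the grouping can be sketched as follows. This runs in SQLite through Python's sqlite3 module, so strftime takes the place of YEAR() and MONTH() (an assumption about dialect), and the rows are invented for illustration; note the January 2021 row, which would be merged into January 2020 if only the month were grouped.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE your_table (Id INTEGER, DateTime TEXT, Debit INTEGER, Credit INTEGER)"
)
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?, ?)",
    [
        (1, "2020-01-05 10:00:00", 150, 30),
        (1, "2020-01-20 09:30:00", 50, 40),
        (2, "2020-01-11 12:00:00", 400, 80),
        (1, "2020-02-03 08:15:00", 400, 90),
        (1, "2021-01-07 16:45:00", 999, 1),  # same month, different year
    ],
)

# One output row per (year, month, Id) with the summed debit and credit.
query = """
SELECT CAST(strftime('%Y', DateTime) AS INTEGER) AS yr,
       CAST(strftime('%m', DateTime) AS INTEGER) AS mon,
       Id,
       SUM(Debit) AS total_debit,
       SUM(Credit) AS total_credit
FROM your_table
GROUP BY yr, mon, Id
ORDER BY yr, mon, Id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```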

Display # of customers per month that have previous Sale date > 3 months and # of these customers that have a "Sale Date" in the given month

Basically, my requirement is - for a given month, how many customers had their "previous Sale date" 3 months before the given month and of these customers how many of them have a "Sale date" in the given month.
I tried using the Lag function, but my column "Reactivated_Guests" always gives me a null value.
SELECT datepart(month,["sale date"]) "Sale_Month",
count(distinct ["user id"]) "Lost_Guests",
lag("Guests",4) OVER (ORDER BY "Sale_Month")+
lag("Guests",5) OVER (ORDER BY "Sale_Month")+
lag("Guests",6) OVER (ORDER BY "Sale_Month")+
lag("Guests",7) OVER (ORDER BY "Sale_Month")+
lag("Guests",8) OVER (ORDER BY "Sale_Month")+
lag("Guests",9) OVER (ORDER BY "Sale_Month")+
lag("Guests",10) OVER (ORDER BY "Sale_Month")+
lag("Guests",11) OVER (ORDER BY "Sale_Month")+
lag("Guests",12) OVER (ORDER BY "Sale_Month") "Reactivated_Guests"
group by "Sale_Month"
order by "Sale_Month"
My expected output is month-wise # of guests that have their previous "Sale date" greater than 3 months before the given month (Lost_Guests) and of these customers how many have a "Sale date" in the given month (Reactivated_Guests)
Expected Result :
Sale_Month Lost_Guests Reactivated_Guests
(prev Sale date > 3 months) (Prev Sale date > 3 months and
have a Sale date in given month)
June 1,200 110
July 1,800 130
Aug 1,900 140
Actual Result :
Sale_Month Lost_Guests Reactivated_Guests
June 1,200 null
July 1,800 null
Aug 1,900 null
Sample Data :
Customer Sale Date
AAAAA 11/15/2018
BBBBB 11/16/2018
CCCCC 9/23/2018
CCCCC 1/25/2019
AAAAA 3/16/2019 ----> so for given month of March, AAAAA to be
CCCCC 3/18/2019 considered in "Lost_Guests" because
AAAAA's previous sale date (11/15/2018) is
more than 3 months from the given month
(March - 2019) and AAAAA to be considered in
"Reactivated_guests" because AAAAA has a
Sale date in the given month (March-2019)
----> for given month of March, CCCCC shall not
be considered in "Lost guests" and
"Reactivated Guests" because
previous sale date (1/25/2019) is less
than 3 months from given month (March-2019)
and hence does not appear in
"Reactivated_Guests" as well
This addresses the original version of the question.
You seem to want something like this:
select sale_month, count(distinct user_id) as guests,
count(distinct case when min_sale_date < sale_date - interval '3 month' then user_id end) as old_guests
from (select t.*,
min(sale_date) over (partition by user_id) as min_sale_date
from t
) t
group by sale_month
order by sale_month;
Note that date functions are very database dependent, so the exact syntax might vary depending on your database.
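A different way to get at the "reactivated" part of the question is to look at each customer's previous sale with LAG, partitioned by customer. The sketch below is only one reading of the requirement: it treats a sale as a reactivation when the customer's previous sale is more than 3 months before that sale date (not before the start of the month), and it does not compute Lost_Guests. It runs in SQLite (3.25 or later) through Python's sqlite3 module; the table name sales and the column names are assumptions, and the rows are the sample data from the question in ISO format.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (customer TEXT, sale_date TEXT)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [
        ("AAAAA", "2018-11-15"), ("BBBBB", "2018-11-16"), ("CCCCC", "2018-09-23"),
        ("CCCCC", "2019-01-25"), ("AAAAA", "2019-03-16"), ("CCCCC", "2019-03-18"),
    ],
)

# prev_sale_date is NULL for a customer's first sale, so those rows drop out of the WHERE.
query = """
WITH with_prev AS (
    SELECT customer, sale_date,
           LAG(sale_date) OVER (PARTITION BY customer ORDER BY sale_date) AS prev_sale_date
    FROM sales
)
SELECT strftime('%Y-%m', sale_date) AS sale_month,
       COUNT(DISTINCT customer) AS reactivated_guests
FROM with_prev
WHERE prev_sale_date < date(sale_date, '-3 months')
GROUP BY sale_month
ORDER BY sale_month
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

On the sample data, AAAAA counts as reactivated in March 2019 and CCCCC does not, which agrees with the explanation given in the question; CCCCC's January 2019 sale also qualifies under this reading.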

sql quest with amount and exchange rate

How can I select the customers who made the largest total payments in December 2018, taking the exchange rate into account?
I have a table:
Trandate date - transaction date
Transum numeric (20,2) - amount of payment
CurrencyRate numeric (20,2) - currency exchange rate
ID_Client Trandate Transum CurrencyRate Currency
--------------------------------------------------------
1 2018.12.01 100 1 UAH
1 2018.12.02 150 2 USD
2 2018.12.01 200 1 UAH
3 2018.12.01 250 3 EUR
3 2018.12.02 300 1 UAH
3 2018.12.03 350 2 USD
7 2019.01.08 600 1 UAH
I tried the following, but I think that "max" is not at all what I need:
SELECT ID_Client, MAX(Transum*CurrencyRate)
FROM `Payment.TotalPayments`
WHERE YEAR(Trandate) = 2018
AND MONTH(Trandate) = 12
I need something like this:
ID_Client Transum
3 1750
Here 1750 is in UAH: 350 USD + 300 UAH + 250 EUR, where the exchange rate of USD is 2 and the exchange rate of EUR is 3 (350 × 2 + 300 × 1 + 250 × 3 = 1750).
If you're trying to get the sum of transaction amounts by client for the year 2018 and month of December, you could write it like this:
SELECT ID_Client, SUM(Transum*CurrencyRate) as payment_total_converted
FROM `Payment.TotalPayments`
WHERE YEAR(Trandate) = 2018
and MONTH(Trandate) = 12
group by ID_Client
If you want things grouped by each client, year, and month in a given date range, you'd write it like this:
SELECT ID_Client, YEAR(Trandate) as tran_year, MONTH(Trandate) as tran_month,
SUM(Transum*CurrencyRate) as payment_total_converted
FROM `Payment.TotalPayments`
WHERE Trandate between '2018-12-01' and '2019-01-01'
group by ID_Client, YEAR(Trandate), MONTH(Trandate)
I added a column name for your computed column so that the result set is still relational (columns need distinct names).
I'd recommend reading up on the SQL 'group by' clause (https://www.w3schools.com/sql/sql_groupby.asp) and aggregate (https://www.w3schools.com/sql/sql_count_avg_sum.asp, https://www.w3schools.com/sql/sql_min_max.asp) operators.
I think you want sum(). Then you can order by the result:
SELECT ID_Client, SUM(Transum*CurrencyRate) as total
FROM `Payment.TotalPayments`
WHERE Trandate >= '2018-12-01' AND Trandate < '2019-01-01'
GROUP BY ID_Client
ORDER BY total DESC;
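To confirm that the converted totals come out as expected, here is a minimal sketch on the sample rows, run in SQLite through Python's sqlite3 module. The table is simply called TotalPayments here and the dates are stored as ISO strings; both are assumptions made so the example is self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TotalPayments ("
    "ID_Client INTEGER, Trandate TEXT, Transum NUMERIC, CurrencyRate NUMERIC, Currency TEXT)"
)
conn.executemany(
    "INSERT INTO TotalPayments VALUES (?, ?, ?, ?, ?)",
    [
        (1, "2018-12-01", 100, 1, "UAH"),
        (1, "2018-12-02", 150, 2, "USD"),
        (2, "2018-12-01", 200, 1, "UAH"),
        (3, "2018-12-01", 250, 3, "EUR"),
        (3, "2018-12-02", 300, 1, "UAH"),
        (3, "2018-12-03", 350, 2, "USD"),
        (7, "2019-01-08", 600, 1, "UAH"),
    ],
)

# Convert each payment to UAH, total per client for December 2018, largest first.
query = """
SELECT ID_Client, SUM(Transum * CurrencyRate) AS total
FROM TotalPayments
WHERE Trandate >= '2018-12-01' AND Trandate < '2019-01-01'
GROUP BY ID_Client
ORDER BY total DESC
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Client 3 comes first with 1750, and client 7 is excluded because the only payment is dated January 2019.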