sum values up to current cursor for average saldo analytics - sql

I made a really simple example table with the columns date and credit, so summing all credit values gives the account saldo (balance). That total is not what I want, though. I want to calculate the average saldo, and for that I need a DateRange table with one row per day and a query with this logic:
SELECT DRA.date, SUM(ACB.credit)
FROM AccountBalance ACB
JOIN DateRange DRA ON ACB.date <= DRA.date
GROUP BY DRA.date
http://sqlfiddle.com/#!18/88afa/10
The problem appears when running this query over a larger range of dates, a whole year for example.
The query tells the SQL engine to sum all credit rows up to and including the current date (the join condition ACB.date <= DRA.date) in order to get the account's saldo for that day.
This is inefficient and slow for big tables, because that sum was already computed for the previous row. I would like to tell the SQL engine to take that previous sum and add only the credit of the current row.
Someone told me that I should use the LAG function, but I need an example first.

I think you simply need an analytic function:
SELECT DRA.date,
SUM(ACB.credit) OVER (ORDER BY DRA.date) AS saldo
FROM DateRange DRA
LEFT JOIN AccountBalance ACB ON ACB.date = DRA.date;
Demo.
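To see that the windowed running total returns the same numbers as the triangular join from the question, here is a minimal sketch using Python's built-in sqlite3 module. SQLite (3.25 or newer for window functions) is only an assumption made to keep the example runnable; the question itself targets SQL Server. Table and column names follow the question, the sample rows are invented, and the per-day pre-aggregation is my own addition so that each calendar date yields exactly one row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE AccountBalance (date TEXT, credit INTEGER);
    CREATE TABLE DateRange (date TEXT);
    -- Invented sample data: no booking on 2020-01-02.
    INSERT INTO AccountBalance VALUES
        ('2020-01-01', 100), ('2020-01-03', -30), ('2020-01-04', 50);
    INSERT INTO DateRange VALUES
        ('2020-01-01'), ('2020-01-02'), ('2020-01-03'), ('2020-01-04');
""")

# Aggregate per day first, then let the window function carry the
# running total forward; days without bookings contribute 0.
running = conn.execute("""
    SELECT DRA.date,
           SUM(COALESCE(D.credit, 0)) OVER (ORDER BY DRA.date) AS saldo
    FROM DateRange DRA
    LEFT JOIN (SELECT date, SUM(credit) AS credit
               FROM AccountBalance
               GROUP BY date) D
           ON D.date = DRA.date
    ORDER BY DRA.date
""").fetchall()

# The triangular join from the question, for comparison.
triangular = conn.execute("""
    SELECT DRA.date, SUM(ACB.credit)
    FROM AccountBalance ACB
    JOIN DateRange DRA ON ACB.date <= DRA.date
    GROUP BY DRA.date
    ORDER BY DRA.date
""").fetchall()

# Average saldo over the date range: mean of the daily balances.
average_saldo = sum(saldo for _, saldo in running) / len(running)
```

Both result sets agree on this data; the average saldo is then just the mean of the daily running totals.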

Related

How to write an SQL query to get max number of counts for the most number of travelling of a user within a month

I have been given a task by my manager to write a SQL query that returns the maximum count (number of records) for the user who has travelled the most within a month, with the rule that if the user travels to multiple places on the same date, it is counted as one. For instance, if you look at the following table design, my query must return a count of 2 for this scenario. Although traveller_id "1" has travelled three times within the month, he travelled to Thailand and the USA on the same date, so his count is reduced to 2.
I have also developed my logic for this query but I am unable to write it due to lack of syntax knowledge. I split up this query into 3 parts:
1. Select all records from the table within a month using the MONTH function of SQL.
2. Select all distinct DateTime records from the above result so that duplicates of the same DateTime are eliminated.
3. Select the max count for the traveller who visited the most places.
Please help me in completing my query. You can also use a different approach from mine.
You can use the count aggregation in a cte then select top(1):
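with u as
(select traveller_id,
        -- cast so that two visits on the same day count once even if visit_date has a time part
        count(distinct cast(visit_date as date)) as n
 from travellers_log
 -- half-open range, so all of 31 March is included
 where visit_date >= '2022-03-01' and visit_date < '2022-04-01'
 group by traveller_id)
select top(1) traveller_id, name, n
from u inner join table_travellers
  on u.traveller_id = table_travellers.id
order by n desc;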
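Here is a small runnable sketch of the same idea, using Python's built-in sqlite3 module instead of SQL Server (so LIMIT 1 stands in for top(1) and DATE() strips the time part). The table name follows the answer; the place column and the sample rows are assumptions made up to mirror the scenario in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE travellers_log (traveller_id INTEGER, visit_date TEXT, place TEXT);
    -- Invented rows: traveller 1 visits two places on the same day.
    INSERT INTO travellers_log VALUES
        (1, '2022-03-02 09:00:00', 'Thailand'),
        (1, '2022-03-02 18:30:00', 'USA'),
        (1, '2022-03-15 10:00:00', 'Japan'),
        (2, '2022-03-20 12:00:00', 'France');
""")

top = conn.execute("""
    SELECT traveller_id,
           COUNT(DISTINCT DATE(visit_date)) AS n  -- same-day visits count once
    FROM travellers_log
    WHERE visit_date >= '2022-03-01' AND visit_date < '2022-04-01'
    GROUP BY traveller_id
    ORDER BY n DESC
    LIMIT 1                                       -- SQLite's counterpart of TOP(1)
""").fetchone()
```

Traveller 1 has three rows in March but only two distinct dates, so the query reports a count of 2 for that traveller.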

Cohort retention with SQL BigQuery

I am trying to create a retention table like the following using SQL in Big Query but with MONTHLY cohorts;
I have the following columns to use in my dataset. I am only using one table, and its name is 'curious-furnace-341507.TEST.Test_Dataset_-_Orders'.
order_date    order_id    customer_id
2020-01-02    12345       6789
I do not need the new user column, and the data goes through June 2020. Ideally I would have a cohort month column that lists the January-June cohorts, followed by 5 periods across.
I have tried so many different things and keep getting errors in BigQuery, so I think I am approaching it all wrong. The online queries I am trying to adapt seem to use dates rather than months, which is also causing some confusion: I think I need to truncate my date column to months in the query?
Does anyone have a go-to query that will work in BigQuery for a retention table or can help me approach this? Thanks!
This may help you:
WITH cohorts AS (
  -- first order date per customer defines the cohort
  SELECT
    customer_id,
    MIN(DATE(order_date)) AS cohort_date
  FROM `curious-furnace-341507.TEST.Test_Dataset_-_Orders`
  GROUP BY 1)
SELECT
  FORMAT_DATE("%Y-%m", c.cohort_date) AS cohort_mth,
  t.customer_id AS cust_id,
  DATE_DIFF(t.order_date, c.cohort_date, month) AS order_period
FROM `curious-furnace-341507.TEST.Test_Dataset_-_Orders` t
JOIN cohorts c ON t.customer_id = c.customer_id
WHERE cohort_date >= ('2020-01-01')
  AND DATE_DIFF(t.order_date, c.cohort_date, month) <= 5
GROUP BY 1, 2, 3
I typically do pivots and % calcs in Excel/Sheets, so this gives you just the input data you need for that.
NOTE:
Counting the rows per cohort_mth and order_period gives the number of unique customers who ordered in period X (repeat orders within a period are ignored).
This also includes period 0 (orders placed in the cohort_mth itself), which you may wish to keep or exclude.
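If you would rather do the pivot and percentage step in code than in a spreadsheet, here is a minimal Python sketch. It assumes the query output has already been fetched as (cohort_mth, cust_id, order_period) tuples; the sample rows and the function name retention_table are made up for illustration.

```python
from collections import defaultdict

# Hypothetical rows as returned by the query: (cohort_mth, cust_id, order_period)
rows = [
    ("2020-01", 6789, 0), ("2020-01", 6789, 1), ("2020-01", 6789, 3),
    ("2020-01", 1111, 0), ("2020-01", 1111, 1),
    ("2020-02", 2222, 0), ("2020-02", 2222, 2),
]

# cohort -> period -> set of customers (a set ignores duplicate rows)
customers = defaultdict(lambda: defaultdict(set))
for cohort_mth, cust_id, order_period in rows:
    customers[cohort_mth][order_period].add(cust_id)


def retention_table(customers, periods=6):
    """Return {cohort: [share of the cohort ordering in period 0..periods-1]}."""
    table = {}
    for cohort in sorted(customers):
        # Every customer has an order in period 0 (their first order),
        # so period 0 gives the cohort size.
        size = len(customers[cohort][0])
        table[cohort] = [len(customers[cohort][p]) / size for p in range(periods)]
    return table


retention = retention_table(customers)
```

With six periods (0 to 5) each cohort becomes one row of shares, which matches the layout of a typical retention table.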

SQL query solution for getting statistics from my tables

I have been trying to get statistics from my tables. In the following, I have tried to draw my table structure:
In the above, I have my Transaction table, where each transaction is recorded for a user Profile (profile_id). I also record the transaction's creation date (pub_date), its transaction_type (the type options are shown in the circle) and the transaction amount. When a transaction is created, I give a Bonus for that transaction to the corresponding Profile (profile_id).
So, I want to get statistics from above tables within the date range. More precisely:
1. Total transaction amount per profile within the date range for the transaction_types WITHDRAW and WITHDRAW_MANUAL.
2. Total transaction amount per profile within the date range for the transaction_types DEPOSIT and DEPOSIT_MANUAL.
3. Total Bonus amount per profile within the date range.
Visually, I want this result.
Here, "chosen date range" means that I will pass a startDate and an endDate to the query.
I managed to solve 1. and 2., but I couldn't find a way (for 3.) to SUM each profile's Bonus amount within the date range. My solution is as follows:
SELECT mtd.profile_id, sum(mtd.amount) AS summa_deposit,
(
SELECT sum(mtw.amount) AS summa_withdraw
FROM public.main_transaction AS mtw
WHERE mtw.profile_id=mtd.profile_id AND mtw.pub_date>='2017-01-01' AND mtw.pub_date<='2017-10-01' AND mtw.transaction_type IN ('WITHDRAW','WITHDRAW_MANUAL')
)
FROM public.main_transaction AS mtd
WHERE mtd.pub_date>='2017-01-01' AND mtd.pub_date<='2017-10-01' AND mtd.transaction_type IN ('DEPOSIT','DEPOSIT_MANUAL')
GROUP BY mtd.profile_id
ORDER BY mtd.profile_id;
Getting the total of each profile's bonus is not a problem here. The problem is getting those amounts within the date range, because I don't record a date in my Bonus table; I only have transaction_id and profile_id.
P.S. My tables are in PostgreSQL.
Do you just need another join in a subquery?
(SELECT SUM(b.amount) AS summa_bonus
FROM public.bonus b JOIN
public.main_transaction mtw
ON b.transaction_id = mtw.id
WHERE mtw.profile_id = mtd.profile_id AND
mtw.pub_date >= '2017-01-01' AND
mtw.pub_date <= '2017-10-01' AND
mtw.transaction_type IN ('WITHDRAW', 'WITHDRAW_MANUAL')
)
I don't know if the filter on transaction_type is necessary.
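As an alternative to stacking correlated subqueries, all three totals can be computed in one pass with conditional aggregation. This is a different technique from the one above, shown here as a rough sketch with Python's built-in sqlite3 module rather than PostgreSQL. The column lists and sample rows are invented, and it assumes at most one bonus row per transaction (otherwise the join would duplicate transaction amounts).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE main_transaction (
        id INTEGER PRIMARY KEY, profile_id INTEGER, pub_date TEXT,
        transaction_type TEXT, amount INTEGER);
    CREATE TABLE bonus (transaction_id INTEGER, profile_id INTEGER, amount INTEGER);
    INSERT INTO main_transaction VALUES
        (1, 10, '2017-02-01', 'DEPOSIT',         100),
        (2, 10, '2017-03-01', 'WITHDRAW',         40),
        (3, 10, '2017-11-01', 'DEPOSIT_MANUAL',   70),  -- outside the date range
        (4, 20, '2017-05-01', 'WITHDRAW_MANUAL',  25);
    INSERT INTO bonus VALUES (1, 10, 5), (2, 10, 2), (3, 10, 9), (4, 20, 1);
""")

# The bonus inherits its date from the transaction it belongs to,
# so filtering on pub_date restricts the bonus sum as well.
# Assumption: at most one bonus row per transaction.
stats = conn.execute("""
    SELECT mt.profile_id,
           SUM(CASE WHEN mt.transaction_type IN ('DEPOSIT', 'DEPOSIT_MANUAL')
                    THEN mt.amount ELSE 0 END) AS summa_deposit,
           SUM(CASE WHEN mt.transaction_type IN ('WITHDRAW', 'WITHDRAW_MANUAL')
                    THEN mt.amount ELSE 0 END) AS summa_withdraw,
           COALESCE(SUM(b.amount), 0) AS summa_bonus
    FROM main_transaction mt
    LEFT JOIN bonus b ON b.transaction_id = mt.id
    WHERE mt.pub_date >= ? AND mt.pub_date <= ?
    GROUP BY mt.profile_id
    ORDER BY mt.profile_id
""", ("2017-01-01", "2017-10-01")).fetchall()
```

Unlike the original query, this version also returns profiles that only have withdrawals in the range (profile 20 in the sample data), with a deposit total of 0.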

SQL Statement for MS Access Query to Calculate Quarterly Growth Rate

I have a table named "Historical_Stock_Prices" in a MS Access database. This table has the columns: Ticker, Date1, Open1, High, Low, Close1, Volume, Adj_Close. The rows consist of the data for each ticker for every business day.
I need to run a query from inside my VB.net program that will return a table in my program that displays the growth rates for each quarter of every year for each ticker symbol listed. So for this example I would need to find the growth rate for GOOG in the 4th quarter of 2012.
To calculate this manually I would take the Close Price on the last BUSINESS day of the 4th quarter (12/31/2012) and divide it by the Open Price on the first BUSINESS day of the 4th quarter (10/1/2012). Then I subtract 1 and multiply by 100 to get a percentage.
The actual calculation would look like this: ((707.38/759.05)-1)*100 = -6.807%
The first and last days of each quarter may vary due to weekend days.
I cannot come up with the correct syntax for a SQL statement that creates a table of growth rates from a table of raw historical prices. Can anyone help me with the SQL statement?
Here's how I would approach the problem:
I'd start by creating a saved query in Access named [Stock_Price_with_qtr] that calculates the year and quarter for each row:
SELECT
Historical_Stock_Prices.*,
Year([Date1]) AS Yr,
Switch(Month([Date1])<4,1,Month([Date1])<7,2,Month([Date1])<10,3,True,4) AS Qtr
FROM Historical_Stock_Prices
Then I'd create another saved query in Access named [Qtr_Dates] that finds the first and last business days for each ticker and quarter:
SELECT
Stock_Price_with_qtr.Ticker,
Stock_Price_with_qtr.Yr,
Stock_Price_with_qtr.Qtr,
Min(Stock_Price_with_qtr.Date1) AS Qtr_Start,
Max(Stock_Price_with_qtr.Date1) AS Qtr_End
FROM Stock_Price_with_qtr
GROUP BY
Stock_Price_with_qtr.Ticker,
Stock_Price_with_qtr.Yr,
Stock_Price_with_qtr.Qtr
That would allow me to use the following query in VB.NET (or C#, or Access itself) to calculate the quarterly growth rates:
SELECT
Qtr_Dates.Ticker,
Qtr_Dates.Yr,
Qtr_Dates.Qtr,
(([Close_Prices]![Close1]/[Open_Prices]![Open1])-1)*100 AS Qtr_Growth
FROM
(
Historical_Stock_Prices AS Open_Prices
INNER JOIN Qtr_Dates
ON (Open_Prices.Ticker = Qtr_Dates.Ticker)
AND (Open_Prices.Date1 = Qtr_Dates.Qtr_Start)
)
INNER JOIN
Historical_Stock_Prices AS Close_Prices
ON (Qtr_Dates.Ticker = Close_Prices.Ticker)
AND (Qtr_Dates.Qtr_End = Close_Prices.Date1)
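To check the three-step approach against the worked GOOG example from the question, here is a rough sketch that chains the same steps as CTEs in SQLite via Python's sqlite3 module. This is not Access SQL: strftime replaces Year/Month/Switch, and dates are assumed to be stored as ISO text. Only the 10/1/2012 open and the 12/31/2012 close come from the question; the other prices are placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Historical_Stock_Prices (
        Ticker TEXT, Date1 TEXT, Open1 REAL, Close1 REAL);
    -- Open on 2012-10-01 and close on 2012-12-31 are from the question;
    -- every other price here is a placeholder.
    INSERT INTO Historical_Stock_Prices VALUES
        ('GOOG', '2012-10-01', 759.05, 761.78),
        ('GOOG', '2012-11-15', 650.00, 647.26),
        ('GOOG', '2012-12-31', 700.00, 707.38);
""")

growth = conn.execute("""
    WITH Stock_Price_with_qtr AS (
        SELECT *,
               CAST(strftime('%Y', Date1) AS INTEGER) AS Yr,
               -- integer division maps months 1-3 to 1, 4-6 to 2, and so on
               (CAST(strftime('%m', Date1) AS INTEGER) + 2) / 3 AS Qtr
        FROM Historical_Stock_Prices
    ),
    Qtr_Dates AS (
        -- first and last trading day actually present in each quarter
        SELECT Ticker, Yr, Qtr,
               MIN(Date1) AS Qtr_Start,
               MAX(Date1) AS Qtr_End
        FROM Stock_Price_with_qtr
        GROUP BY Ticker, Yr, Qtr
    )
    SELECT q.Ticker, q.Yr, q.Qtr,
           (c.Close1 / o.Open1 - 1) * 100 AS Qtr_Growth
    FROM Qtr_Dates q
    JOIN Historical_Stock_Prices o
      ON o.Ticker = q.Ticker AND o.Date1 = q.Qtr_Start
    JOIN Historical_Stock_Prices c
      ON c.Ticker = q.Ticker AND c.Date1 = q.Qtr_End
""").fetchall()
```

Because the quarter boundaries are taken from the rows that exist, weekends and holidays at the start or end of a quarter need no special handling.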

Joining a second instance of Sales table to get last week's Sales

I have a Sales table showing product number, sales value, and sales volume per week. I need to build a report to display these values and volumes along with the equivalent values from the previous week. I also have a Weeks table which gives me the previous week number for the current week (for instance if current week is 2013-01, then the previous week value is 2012-52).
I therefore assumed it would be simple enough to join to another instance of Sales on product number and the previous week number from the Weeks table. However, Teradata is not letting me do this. Initially it threw the error "Improper column reference in the search condition of a joined table", and when I re-ordered the query to reference Weeks before the second instance of Sales, it ran but failed with a "No more spool space" error, so I assume my approach is incorrect. My SQL is as follows:
select s.Week_Number,
s.Product_Number,
s.Sales_Value,
s.Sales_Volume,
s_lw.Sales_Value,
s_lw.Sales_Volume
from SALES s
inner join WEEKS w
on s.Week_Number = w.Week_Number
left join SALES s_lw
on s.Product_Number = s_lw.Product_Number
and s_lw.Week_Number = w.Last_week_Number
Could anyone please suggest what I'm doing wrong here? It seems like this should be achievable.
I would suggest using a Window Aggregate Function to accomplish this with a single pass of the SALES table:
SELECT DISTINCT
s.Week_Number,
s.Product_Number,
s.Sales_Value,
s.Sales_Volume,
-- ascending order, so the single preceding row is the previous week
MAX(s.Sales_Value) OVER (PARTITION BY s.Product_Number
ORDER BY s.Week_Number
ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) AS LW_Sales_Value,
MAX(s.Sales_Volume) OVER (PARTITION BY s.Product_Number
ORDER BY s.Week_Number
ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) AS LW_Sales_Volume
FROM SALES s;
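On engines that provide LAG, the same "previous row within the product" lookup can be written more directly. Here is a minimal sketch using SQLite through Python's sqlite3 module (not Teradata). The sample rows are invented, and it assumes every product has a row for every week; with gaps, the previous row is not necessarily the previous week.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SALES (Week_Number TEXT, Product_Number INTEGER,
                        Sales_Value INTEGER, Sales_Volume INTEGER);
    -- Invented sample data; 'YYYY-WW' strings sort chronologically.
    INSERT INTO SALES VALUES
        ('2012-52', 1, 100, 10),
        ('2013-01', 1, 120, 12),
        ('2013-02', 1,  90,  9),
        ('2013-01', 2,  50,  5);
""")

# LAG returns the value from the previous row of the window,
# or NULL when there is no previous row for that product.
rows = conn.execute("""
    SELECT Week_Number, Product_Number, Sales_Value, Sales_Volume,
           LAG(Sales_Value)  OVER w AS LW_Sales_Value,
           LAG(Sales_Volume) OVER w AS LW_Sales_Volume
    FROM SALES
    WINDOW w AS (PARTITION BY Product_Number ORDER BY Week_Number)
    ORDER BY Product_Number, Week_Number
""").fetchall()
```

The first week of each product has no predecessor, so its last-week columns come back as NULL (None in Python).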