mix two sql queries

I want to combine two SQL queries: one to get the total amount for a particular year, and another to get the amount for a particular month of that year.
SELECT SUM(money) As Anual FROM Deposito WHERE Year(FechaDeposito)=2011
SELECT SUM(money) As monthly FROM Deposito WHERE Year(FechaDeposito)= 2011 AND Month(FechaDeposito)=10
How can I do that in the most efficient way?

Simply combining your two queries would give:
SELECT SUM(money) As Anual,
SUM(CASE WHEN Month(FechaDeposito)=10 THEN money else 0 end) as monthly
FROM Deposito
WHERE Year(FechaDeposito)=2011
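It will perform poorly, however, because wrapping the column in Year() in the WHERE clause prevents an index on it from being used for a seek. Ideally you want an index on FechaDeposito, and to construct date ranges to test against instead of running functions over the column, e.g.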
SELECT SUM(money) As Anual,
SUM(CASE WHEN Month(FechaDeposito)=10 THEN money else 0 end) as monthly
FROM Deposito
WHERE FechaDeposito >= '2011-01-01' and FechaDeposito < '2012-01-01'
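
To check the combined query on a small data set, here is a minimal sketch that runs it through Python's built-in sqlite3 module. SQLite only stands in for SQL Server here, the sample rows are invented, and dates are stored as ISO text so that plain comparisons sort correctly. As a variation on the answer, the monthly CASE also uses a half-open date range instead of Month(), so no function touches the column at all.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Deposito (FechaDeposito TEXT, money REAL)")
conn.executemany(
    "INSERT INTO Deposito VALUES (?, ?)",
    [
        ("2010-12-31", 5.0),    # previous year: excluded by the WHERE range
        ("2011-03-15", 100.0),  # counts towards the year only
        ("2011-10-01", 40.0),   # counts towards the year and October
        ("2011-10-31", 60.0),   # counts towards the year and October
        ("2012-01-01", 7.0),    # next year: excluded by the half-open upper bound
    ],
)

# One pass over the table: the WHERE clause limits rows to 2011 and the
# CASE picks out October inside the aggregate.
query = """
SELECT SUM(money) AS Anual,
       SUM(CASE WHEN FechaDeposito >= '2011-10-01'
                 AND FechaDeposito <  '2011-11-01'
                THEN money ELSE 0 END) AS monthly
FROM Deposito
WHERE FechaDeposito >= '2011-01-01' AND FechaDeposito < '2012-01-01'
"""
anual, monthly = conn.execute(query).fetchone()
print(anual, monthly)
```

The half-open form (>= start, < next start) also keeps working if the column later stores a time of day, which a BETWEEN on the last day of the month would not.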

Related

How to calculate average and latest cost of a product over date range in same sql query

I have a table that holds each product and its cost over a time range. I need to calculate the average cost over the period, with the latest cost counted up to today's date, and I also need to fetch the current cost. How can I achieve this in a single query?
Input Table
I am looking for output like
product | average_cost | current_cost
(average cost is the sum of (cost * days at that cost) divided by the total days up to today's date)

You can use date arithmetic and conditional aggregation:
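select product,
-- weight each cost by the days it applied, capping open periods at today;
-- the 1.0 guards against integer division if cost is an integer column
sum(cost * datediff(day, beg_date, case when end_date > getdate() then getdate() else end_date end)) * 1.0 /
sum(datediff(day, beg_date, case when end_date > getdate() then getdate() else end_date end)) as average_cost,
-- the row whose period is still open holds the current cost
max(case when end_date > getdate() then cost end) as current_cost
from t
group by product;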
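
The day-weighted average can be checked by hand on two rows. The sketch below is an illustration only: it uses Python's sqlite3 module in place of SQL Server, julianday() differences in place of datediff(), a fixed :today parameter in place of getdate(), and a far-future end_date ('9999-12-31') as an assumed marker for the period that is still open. The sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (product TEXT, cost REAL, beg_date TEXT, end_date TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        ("A", 10.0, "2024-01-01", "2024-01-11"),  # 10 days at cost 10
        ("A", 20.0, "2024-01-11", "9999-12-31"),  # still open: the current cost
    ],
)

# MIN(end_date, :today) is SQLite's two-argument scalar min; it caps the
# open-ended period at "today", like the CASE expression in the answer.
query = """
SELECT product,
       SUM(cost * (julianday(MIN(end_date, :today)) - julianday(beg_date)))
         / SUM(julianday(MIN(end_date, :today)) - julianday(beg_date)) AS average_cost,
       MAX(CASE WHEN end_date > :today THEN cost END) AS current_cost
FROM t
GROUP BY product
"""
row = conn.execute(query, {"today": "2024-01-21"}).fetchone()
print(row)
```

With "today" fixed at 2024-01-21, each cost applies for 10 days, so the weighted average is (10 * 10 + 20 * 10) / 20 = 15 and the current cost is 20.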

Filter results without affecting all columns in SQL Server 2017

I am using SQL Server 2017, and through asking numerous questions on here I have discovered CASE expressions, which act like if-else in SQL. This is good, but it does not satisfy what I need from my result set. Say I have a sales table with an amount, a date of sale and an item description. I am trying to write something like this:
Select
sum(amount), -- total amount
count(date_of_sale), -- number of days selling
sum(amount where date_of_sale between certain date and certain date) -- pseudocode, not valid SQL
I don't want to put a WHERE clause outside this because I don't want it to affect the result of the other columns. From what I have tried, I can't get around this using a CASE expression.

We can use conditional aggregation here, and sum a CASE expression which includes in the sum only amounts from your date range of interest.
SELECT
SUM(amount) AS total_sales,
COUNT(date_of_sale) AS total_items,
SUM(CASE WHEN date_of_sale BETWEEN start_date AND end_date
THEN amount ELSE 0 END) AS partial_sales,
COUNT(CASE WHEN date_of_sale BETWEEN start_date AND end_date
THEN 1 END) AS partial_items
FROM yourTable;
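
One detail in the query above is easy to miss: the COUNT version has no ELSE branch, so rows outside the range yield NULL and are not counted. A small sketch of the difference, run through Python's sqlite3 module as a stand-in for SQL Server; the table name, sample rows and the miscounted_items column are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (amount REAL, date_of_sale TEXT)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [
        (10.0, "2017-01-05"),
        (20.0, "2017-02-10"),  # inside the range of interest
        (30.0, "2017-02-20"),  # inside the range of interest
        (40.0, "2017-03-01"),
    ],
)

# The last column shows the pitfall: COUNT counts every non-NULL value,
# so "ELSE 0" makes it count all rows, not just the ones in range.
query = """
SELECT SUM(amount) AS total_sales,
       COUNT(date_of_sale) AS total_items,
       SUM(CASE WHEN date_of_sale BETWEEN :start_date AND :end_date
                THEN amount ELSE 0 END) AS partial_sales,
       COUNT(CASE WHEN date_of_sale BETWEEN :start_date AND :end_date
                  THEN 1 END) AS partial_items,
       COUNT(CASE WHEN date_of_sale BETWEEN :start_date AND :end_date
                  THEN 1 ELSE 0 END) AS miscounted_items
FROM sales
"""
params = {"start_date": "2017-02-01", "end_date": "2017-02-28"}
row = conn.execute(query, params).fetchone()
print(row)
```

Here partial_items is 2 while miscounted_items is 4, the same as total_items, because the filter lives inside the aggregate rather than in a WHERE clause.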

Joining sums of different periods from the same data set

I often face the situation where I need to compare aggregated data of different periods from the same source.
I usually deal with it this way:
SELECT
COALESCE(SalesThisYear.StoreId, SalesLastYear.StoreId) StoreId
, SalesThisYear.Sum_Revenue RevenueThisYear
, SalesLastYear.Sum_Revenue RevenueLastYear
FROM
(
SELECT StoreId, SUM(Revenue) Sum_Revenue
FROM Sales
WHERE Date BETWEEN '2017-09-01' AND '2017-09-30'
GROUP BY StoreId
) SalesThisYear
FULL JOIN (
SELECT StoreId, SUM(Revenue) Sum_Revenue
FROM Sales
WHERE Date BETWEEN '2016-09-01' AND '2016-09-30'
GROUP BY StoreId
) SalesLastYear
ON (SalesLastYear.StoreId = SalesThisYear.StoreId)
-- execution time 337 ms
It is not very elegant in my opinion, because it visits the table twice, but it works.
Another similar way to achieve the same is:
SELECT
Sales.StoreId
, SUM(CASE YEAR(Date) WHEN 2017 THEN Revenue ELSE 0 END) RevenueThisYear
, SUM(CASE YEAR(Date) WHEN 2016 THEN Revenue ELSE 0 END) RevenueLastYear
FROM
Sales
WHERE
Date BETWEEN '2017-09-01' AND '2017-09-30'
or Date BETWEEN '2016-09-01' AND '2016-09-30'
GROUP BY
StoreId
-- execution time 548 ms
Both solutions perform almost the same on my data set (1,929,419 rows in the selected period, all indexes in place), the first one a little better in time. It doesn't matter if I include more periods; the first one is always better on my data set.
This is only a simple example but, sometimes, it involves more than two intervals and even some logic (e.g. compare isoweek/weekday instead of month/day, compare different stores, etc).
Although I have already figured out several ways to achieve this, I was wondering if there is a cleverer way: maybe a cleaner solution, or one more suitable for big data sets (over a TB).
For example, I suppose the second one is less resource intensive for a big data set, since it does a single Index Scan over the table. The first one, on the other hand, requires two Index Scans and a Merge. If the table is too big to fit in memory, what will happen? Or is the first one always better?

There is very rarely a "this way of doing things is always better" answer, especially when the alternatives are doing very similar things.
What I will suggest, however, is that you try to utilise best practice wherever you can, such as minimising the use of scalar functions on columns in your queries, as this inhibits index usage.
For example, by changing your second query to the following I would imagine you will see at least some improvement performance wise:
SELECT
Sales.StoreId
, SUM(CASE WHEN Date BETWEEN '2017-09-01' AND '2017-09-30' THEN Revenue ELSE 0 END) RevenueThisYear
, SUM(CASE WHEN Date BETWEEN '2016-09-01' AND '2016-09-30' THEN Revenue ELSE 0 END) RevenueLastYear
FROM
Sales
WHERE
Date BETWEEN '2017-09-01' AND '2017-09-30'
or Date BETWEEN '2016-09-01' AND '2016-09-30'
GROUP BY
StoreId

The second looks better, but I guess the YEAR() call is slowing the query. Let's take out YEAR() and compare against a boundary date instead: every date in this year's range ('2017-09-01' to '2017-09-30') is greater than '2017-01-01', and every date in last year's range ('2016-09-01' to '2016-09-30') is less than it.
SELECT
Sales.StoreId
, SUM(CASE WHEN date > '2017-01-01' THEN Revenue ELSE 0 END) RevenueThisYear
, SUM(CASE WHEN date < '2017-01-01' THEN Revenue ELSE 0 END) RevenueLastYear
FROM
Sales
WHERE
Date BETWEEN '2017-09-01' AND '2017-09-30'
or Date BETWEEN '2016-09-01' AND '2016-09-30'
GROUP BY
StoreId
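If the FULL JOIN is working well, let's try this, although joining the unaggregated rows pairs every 2017 row with every 2016 row of the same store, so verify the sums before relying on it.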
SELECT
COALESCE(SalesThisYear.StoreId, SalesLastYear.StoreId) StoreId
, sum(SalesThisYear.Revenue) RevenueThisYear
, sum(SalesLastYear.Revenue) RevenueLastYear
FROM Sales SalesThisYear full join
Sales SalesLastYear
ON SalesLastYear.StoreId = SalesThisYear.StoreId
WHERE SalesThisYear.Date BETWEEN '2017-09-01' AND '2017-09-30'
AND SalesLastYear.Date BETWEEN '2016-09-01' AND '2016-09-30'
GROUP BY COALESCE(SalesThisYear.StoreId, SalesLastYear.StoreId)
Edit:
SELECT q.StoreId
, SUM(CASE WHEN date > '2017-01-01' THEN Revenue ELSE 0 END) RevenueThisYear
, SUM(CASE WHEN date < '2017-01-01' THEN Revenue ELSE 0 END) RevenueLastYear
FROM
(Select StoreId, date, revenue
from Sales
WHERE Date BETWEEN '2017-09-01' AND '2017-09-30'
or Date BETWEEN '2016-09-01' AND '2016-09-30') q
GROUP BY q.StoreId
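
The self-join variant in this answer is worth testing on a few rows before trusting its totals. The sketch below is an illustration with invented data, run through Python's sqlite3 module rather than SQL Server. It uses a plain JOIN: since every row matches at least itself on StoreId and the WHERE clause filters both sides, the FULL JOIN cannot produce any extra rows here, and older SQLite versions do not support FULL JOIN anyway.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales (StoreId INTEGER, Date TEXT, Revenue REAL)")
conn.executemany(
    "INSERT INTO Sales VALUES (?, ?, ?)",
    [
        (1, "2017-09-05", 100.0), (1, "2017-09-20", 200.0),  # store 1, this year
        (1, "2016-09-05", 10.0), (1, "2016-09-20", 20.0),    # store 1, last year
        (2, "2017-09-10", 50.0),                             # store 2, this year only
    ],
)

# Self-join on unaggregated rows: each 2017 row of a store is paired with
# each 2016 row of that store, so both sums are inflated.
joined = conn.execute("""
SELECT ty.StoreId, SUM(ty.Revenue), SUM(ly.Revenue)
FROM Sales ty JOIN Sales ly ON ly.StoreId = ty.StoreId
WHERE ty.Date BETWEEN '2017-09-01' AND '2017-09-30'
  AND ly.Date BETWEEN '2016-09-01' AND '2016-09-30'
GROUP BY ty.StoreId
ORDER BY ty.StoreId
""").fetchall()

# Conditional aggregation over a single scan, for comparison.
aggregated = conn.execute("""
SELECT StoreId,
       SUM(CASE WHEN Date BETWEEN '2017-09-01' AND '2017-09-30' THEN Revenue ELSE 0 END),
       SUM(CASE WHEN Date BETWEEN '2016-09-01' AND '2016-09-30' THEN Revenue ELSE 0 END)
FROM Sales
WHERE Date BETWEEN '2017-09-01' AND '2017-09-30'
   OR Date BETWEEN '2016-09-01' AND '2016-09-30'
GROUP BY StoreId
ORDER BY StoreId
""").fetchall()

print(joined)
print(aggregated)
```

On this data the join reports 600 and 60 for store 1 instead of 300 and 30, and drops store 2 entirely because it has no 2016 rows. Aggregating each period first, as in the question's first query, avoids both problems.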

SQL Over partition by

I basically have a CASE expression that displays the sum of profit and a month-to-date total for each person. My idea is that I want to display a daily figure for that person as well as their whole month total.
My issue is that when I limit results to just yesterday (to get the daily figure), this also affects the calculation of the month value (it just calculates the sum for that day rather than the whole month).
This is because the rest of the month's values are out of the scope of the query. Is there any way to calculate the whole month value for each person correctly without the WHERE limits affecting the result?
e.g.
The result:
08/09/17: 25
09/09/17: 25
10/09/17: 25
11/09/17: 25 <<<< but only display one day and month total
Overall Month total: 100
Can this also include nulls? I think I'm looking for something like a dynamically stored month-to-date value that isn't affected by WHERE clauses.
SELECT SUM(Figure) AS 'Daily Figure',
CASE WHEN
MONTH([DATE]) = MONTH(getdate()) AND
YEAR([DATE]) = YEAR(getdate())
THEN
SUM(Figure)
OVER (PARTITION BY [Name],
MONTH([DATE]))
ELSE 0 END
as [Month To Date Total]
WHERE
dateadd(day,datediff(day,1,GETDATE()),0)

If you want month-to-date and the current amount, then use conditional aggregation:
SELECT NAME,
SUM(CASE WHEN DAY([DATE]) = DAY(GETDATE()) - 1 THEN Figure ELSE 0 END) AS DailyFigure,
SUM(Figure) as MonthToDate
FROM yourTable -- placeholder: the question does not name the table
WHERE MONTH([DATE]) = MONTH(getdate()) AND
YEAR([DATE]) = YEAR(getdate())
GROUP BY NAME;
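This works on all but the first day of the month: on the 1st, DAY(GETDATE()) - 1 is 0, which matches no day, and yesterday falls in the previous month, which the WHERE clause excludes.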
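
One way around the first-of-the-month gap is to anchor both figures on yesterday's date, so the month-to-date window is "first day of yesterday's month through yesterday". The following is a sketch of that idea, not the answerer's method: it computes the two boundary dates in Python and passes them as parameters, uses sqlite3 in place of SQL Server, and assumes a hypothetical table called figures with invented rows.

```python
import sqlite3
from datetime import date, timedelta


def month_to_date(conn, today):
    """Return (Name, DailyFigure, MonthToDate) rows as of the day before `today`."""
    yesterday = today - timedelta(days=1)
    month_start = yesterday.replace(day=1)  # first day of yesterday's month
    query = """
    SELECT Name,
           SUM(CASE WHEN [DATE] = :yesterday THEN Figure ELSE 0 END) AS DailyFigure,
           SUM(Figure) AS MonthToDate
    FROM figures
    WHERE [DATE] >= :month_start AND [DATE] <= :yesterday
    GROUP BY Name
    ORDER BY Name
    """
    params = {
        "yesterday": yesterday.isoformat(),
        "month_start": month_start.isoformat(),
    }
    return conn.execute(query, params).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE figures (Name TEXT, [DATE] TEXT, Figure INTEGER)")
conn.executemany(
    "INSERT INTO figures VALUES (?, ?, ?)",
    [
        ("Ann", "2017-09-08", 25),
        ("Ann", "2017-09-09", 25),
        ("Ann", "2017-09-10", 25),
        ("Ann", "2017-09-30", 25),
        ("Ann", "2017-10-01", 99),  # "today" on 1 October: never included
    ],
)

# Run on the 1st of the month: yesterday is 30 September.
print(month_to_date(conn, date(2017, 10, 1)))
```

Because the filter is a plain date range, it also avoids calling MONTH() and YEAR() on the column.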

Average for partition bounded by last 7 days

I have a query spanning the last 30 days that sums total revenue. Alongside the 30-day sum, I also want the average over the last 7 days. I want something like this:
select
country
, avg(revenue) over (partition by country range between current_date - 7 and current_date) avg_revenue_last_7_days
, sum(revenue) total_revenue_30_days
from table
group by 1,2
Is it possible to get average for a smaller number of days than what aggregation is based on?
I want to avoid subqueries because the query is already quite complex.

You don't need window functions for this, just conditional aggregation:
select country,
avg(case when datecol between current_date - 7 and current_date
then revenue
end) as avg_revenue_last_7_days,
sum(case when datecol between current_date - 30 and current_date
then revenue
end) as total_revenue_30_days
from table
group by country;
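
The missing ELSE in the avg() matters: rows outside the 7-day window become NULL and are ignored, whereas ELSE 0 would pull the average down. A minimal sketch of that effect, using Python's sqlite3 module as a stand-in for the asker's database; the table name daily_revenue, the fixed "today" and the sample rows are assumptions for illustration, and the window boundaries are computed in Python and passed as parameters.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_revenue (country TEXT, datecol TEXT, revenue REAL)")
conn.executemany(
    "INSERT INTO daily_revenue VALUES (?, ?, ?)",
    [
        ("US", "2024-03-30", 10.0),    # inside both windows
        ("US", "2024-03-28", 30.0),    # inside both windows
        ("US", "2024-03-10", 60.0),    # inside the 30-day window only
        ("US", "2024-02-01", 1000.0),  # outside both windows
    ],
)

today = date(2024, 3, 31)  # fixed so the result is reproducible
params = {
    "today": today.isoformat(),
    "d7": (today - timedelta(days=7)).isoformat(),
    "d30": (today - timedelta(days=30)).isoformat(),
}

# avg_with_else_zero is included only to show the pitfall.
query = """
SELECT country,
       AVG(CASE WHEN datecol BETWEEN :d7 AND :today THEN revenue END) AS avg_revenue_last_7_days,
       AVG(CASE WHEN datecol BETWEEN :d7 AND :today THEN revenue ELSE 0 END) AS avg_with_else_zero,
       SUM(CASE WHEN datecol BETWEEN :d30 AND :today THEN revenue END) AS total_revenue_30_days
FROM daily_revenue
GROUP BY country
"""
row = conn.execute(query, params).fetchone()
print(row)
```

The correct 7-day average is 20 (two rows), while the ELSE 0 variant averages over all four rows and reports 10. Note that BETWEEN is inclusive at both ends, so "current_date - 7 and current_date" spans eight calendar days.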
