Group by Month and Year in SQL - sql

I am trying to write a query that receives a date and produces a report showing the sum of the amounts for each month.
What I have so far is this:
CREATE PROCEDURE consulta
@fecha DATE
AS
SELECT
SUM(dca.UNIDADES) as Amount,
MONTH(ca.FINICIO) as Month,
YEAR(ca.FINICIO)
FROM
DETALLE_CONTRATO_ALQUILER dca
INNER JOIN
CONTRATOALQUILER ca ON dca.CODCONTRATO = ca.CODCONTRATO
AND ca.FINICIO >= @fecha
AND YEAR(ca.FINICIO) = YEAR(@fecha)
GROUP BY
MONTH(ca.FINICIO), YEAR(ca.FINICIO)
HAVING
SUM(dca.UNIDADES) > 2;
The years are compared because I only need the months of that same year.
I also attach my diagram:
The database is about product rentals; the tables I use are the rental contract and its detail.
I know something is wrong because when I enter a specific date, I get no results. I do not know where I am going wrong. Is the logic of my query correct?
What I expect to obtain is:
Amount | Month | Year
12     | 1     | 2017
45     | 2     | 2017
...
Here's the example

I would assume all rows of both tables have matching row(s) in the other table, so an INNER JOIN is what you need.
There's a small detail in your query that smells fishy. Your join includes filtering conditions that may throw rows out of the query. Maybe you should place the filtering conditions in a WHERE clause instead of a JOIN clause, as in:
SELECT
SUM(dca.UNIDADES) as Amount,
MONTH(ca.FINICIO) as Month,
YEAR(ca.FINICIO)
FROM
DETALLE_CONTRATO_ALQUILER dca
INNER JOIN
CONTRATOALQUILER ca ON dca.CODCONTRATO = ca.CODCONTRATO
WHERE ca.FINICIO >= @fecha -- Using WHERE instead of JOIN here!
AND YEAR(ca.FINICIO) = YEAR(@fecha)
GROUP BY
MONTH(ca.FINICIO), YEAR(ca.FINICIO)
HAVING
SUM(dca.UNIDADES) > 2;
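You can place filtering conditions in the JOIN clause, and that is very useful for OUTER JOINs, where a condition in ON and the same condition in WHERE behave differently. For an INNER JOIN the condition is applied as part of the join itself, so rows that fail it are dropped either way; keeping the filters in WHERE mainly makes it easier to see which rows are being excluded.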
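To see the ON versus WHERE difference concretely, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for the real server, and the contract/detail tables and their columns are made up for the illustration; they are not the tables from the question.

```python
import sqlite3

# Hypothetical two-table schema, loosely modelled on contract + detail.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contract (id INTEGER PRIMARY KEY, start_date TEXT);
    CREATE TABLE detail (contract_id INTEGER, units INTEGER);
    INSERT INTO contract VALUES (1, '2017-01-15'), (2, '2017-03-10');
    INSERT INTO detail VALUES (1, 5), (2, 7);
""")

cutoff = ("2017-02-01",)

def rows(sql):
    """Run a query with the cutoff date bound to its single placeholder."""
    return conn.execute(sql, cutoff).fetchall()

# INNER JOIN: the date filter in ON ...
inner_on = rows("""
    SELECT d.contract_id, d.units FROM detail d
    INNER JOIN contract c ON c.id = d.contract_id AND c.start_date >= ?
    ORDER BY d.contract_id""")
# ... and the same filter in WHERE return the same rows.
inner_where = rows("""
    SELECT d.contract_id, d.units FROM detail d
    INNER JOIN contract c ON c.id = d.contract_id
    WHERE c.start_date >= ?
    ORDER BY d.contract_id""")

# LEFT JOIN: a filter in ON keeps the unmatched detail row (with NULLs) ...
left_on = rows("""
    SELECT d.contract_id, c.start_date FROM detail d
    LEFT JOIN contract c ON c.id = d.contract_id AND c.start_date >= ?
    ORDER BY d.contract_id""")
# ... while the same filter in WHERE removes it.
left_where = rows("""
    SELECT d.contract_id, c.start_date FROM detail d
    LEFT JOIN contract c ON c.id = d.contract_id
    WHERE c.start_date >= ?
    ORDER BY d.contract_id""")

print(inner_on, inner_where, left_on, left_where)
```

So if a query with an INNER JOIN returns nothing, moving the condition between ON and WHERE will not change that; it is worth checking the data against the filter itself.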

Related

Query two tables joined over a third table with the same foreign key?

I have a Postgres database schema, groceries.
There are two tables, purchases_2019 and purchases_2020, connected through a third one, categories.
I can join each table with categories on its own without a problem.
To calculate the year change I need 2019 and 2020 together.
The problem seems to be that categories has only one foreign key for both tables, so the query always returns a column of zeros because one of the tables has no match. Maybe I am wrong.
Any suggestions on how to query the tables?
More info below.
The groceries database has a subset dairies: 'whole milk','yogurt', 'domestic eggs'.
There are no clear primary keys.
I share the database file with this link:
https://drive.google.com/drive/folders/1BBXr-il7rmDkHAukETUle_ZYcDC7t44v?usp=sharing
I want to answer:
For each month of 2020, what was the percentage increase or decrease in total monthly dairy purchases compared to the same month in 2019 (i.e., the year_change)?
How can I do this?
I have tried different queries along this line:
SELECT
a.month,
COUNT(a.purchaseid) as sales_2020,
COUNT(b.purchase_id) as sales_2019,
ROUND(((CAST(COUNT(purchaseid) as decimal) /
(SELECT COUNT(purchaseid)FROM purchases_2020)) *100),2)
as market_share,
(COUNT(a.purchaseid) - COUNT(b.purchase_id) ) as year_change
FROM purchases_2020 as a
Left Outer Join categories as cat ON a.purchaseid = cat.purchase_id
Left Outer Join purchases_2019 as b ON cat.purchase_id = b.purchase_id
WHERE cat.category in ('whole milk','yogurt', 'domestic eggs')
GROUP BY a.month
ORDER BY a.month
;
It gives me either no result or a result with an empty sales_2019 column.
The expected result is a table
with the monthly dairy sales for 2020, the monthly market share of dairy among all products in 2020, and the monthly year change between 2019 and 2020 as a percentage.
How can I calculate the year change?
Thanks for your help.
%%sql
postgresql:///groceries
with p2019Sales as (
select
month,
count(p.purchase_id) as total_sales
from purchases_2019 p
left join categories c
using (purchase_id)
where c.category in ('whole milk', 'yogurt' ,'domestic eggs')
group by month
order by month
),
mkS as (
select
cast(extract(month from fulldate::date)as int) as month,
count(*) as total_share
from purchases_2020
group by month
order by month
),
p2020Sales as (
select
cast(extract(month from fulldate::date)as int) as month,
count(p.purchaseid) as total_sales,
round(count(p.purchaseid)*100::numeric/ m.total_share,2) as market_share,
sum(count(*)) over() as tos
from purchases_2020 p
left join categories c
on p.purchaseid = c.purchase_id
left join mks m
on cast(extract(month from p.fulldate::date)as int) = m.month
where c.category in ('whole milk', 'yogurt' ,'domestic eggs')
group by 1,m.total_share
order by 1,m.total_share
),
finalSale as (
select
month,
p2.total_sales,
p2.market_share,
round((p2.total_sales - p1.total_sales)*100::numeric/p1.total_sales,2) as year_change
from p2019Sales p1
inner join p2020Sales p2
using(month)
)
select *
from finalSale
The answer of user18262778 is excellent, but as Jeremy Caney points out:
"add additional details that will help others understand how this addresses the question asked."
So here are some details.
My goal:
get the output I want in one query.
My problem:
The query is long and complicated.
There are several approaches to the problem:
joins
subqueries
All are prone to circular dependencies.
The subqueries and joins produce results, but discard data that is needed further on to reach the final result.
The solution:
The WITH statement lets you compute an aggregation once and reference it by name within the query.
Once you know that the WITH statement is what you need, there is of course a lot of information on the web. The description below summarises the general benefits of the given solution.
"In PostgreSQL, the WITH query provides a way to write auxiliary statements for use in a larger query. It helps in breaking down complicated and large queries into simpler forms, which are easily readable. These statements often referred to as Common Table Expressions or CTEs, can be thought of as defining temporary tables that exist just for one query.
The WITH query being CTE query, is particularly useful when subquery is executed multiple times. It is equally helpful in place of temporary tables. It computes the aggregation once and allows us to reference it by its name (may be multiple times) in the queries.
The WITH clause must be defined before it is used in the query."
PostgreSQL - WITH Clause
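As a small illustration of the idea, here is a sketch of a year-over-year comparison built from two CTEs. It runs on Python's built-in sqlite3 rather than PostgreSQL, and the sales_2019/sales_2020 tables and their contents are invented for the example; they are not the groceries data.

```python
import sqlite3

# Hypothetical, tiny stand-in tables: one row per purchase.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales_2019 (month INTEGER, item TEXT);
    CREATE TABLE sales_2020 (month INTEGER, item TEXT);
    INSERT INTO sales_2019 VALUES
        (1, 'yogurt'), (1, 'yogurt'),
        (2, 'yogurt'), (2, 'yogurt'), (2, 'yogurt'), (2, 'yogurt');
    INSERT INTO sales_2020 VALUES
        (1, 'yogurt'), (1, 'yogurt'), (1, 'yogurt'),
        (2, 'yogurt'), (2, 'yogurt'), (2, 'yogurt');
""")

# Each CTE aggregates one year on its own; the final SELECT only has to
# join two small per-month result sets by name.
sql = """
    WITH y2019 AS (
        SELECT month, COUNT(*) AS total FROM sales_2019 GROUP BY month
    ),
    y2020 AS (
        SELECT month, COUNT(*) AS total FROM sales_2020 GROUP BY month
    )
    SELECT y2020.month,
           y2020.total,
           ROUND((y2020.total - y2019.total) * 100.0 / y2019.total, 2)
               AS year_change
    FROM y2020
    JOIN y2019 ON y2019.month = y2020.month
    ORDER BY y2020.month
"""
year_change = conn.execute(sql).fetchall()
print(year_change)  # month 1: 2 -> 3 purchases, month 2: 4 -> 3 purchases
```

Aggregating each year before joining is what avoids the empty column: the two yearly tables are never joined row by row, only month by month.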

SQL Query to pull date range based on dates in a table

I have some SQL code in an SSRS report which, when run, pulls a list of student detentions for a set period such as a week or a month. I have been asked to make the report run automatically from the start of the current school term to the date the report is run. Is this possible? We have 3 terms per year and the dates change each year. The report has multiple subscriptions which run weekly and filter to students in particular day houses and years, so ideally the report should update itself.
We have a table in our database titled TblSchoolManagementTermDates which includes txtStartDate and txtFinishDate columns for each term.
The date of the detention is stored in the column detPpl.dDetentionDate
The full SQL code I am currently using is:
SELECT ppl.txtSchoolID AS SchoolID,
detPpl.TblDisciplineManagerDetentionsPupilsID AS DetentionID,
ppl.txtSurname AS Surname,
ppl.txtForename AS Forename,
ppl.txtPrename AS PreferredName,
ppl.intNCYear AS Year,
ppl.txtAcademicHouse AS House,
schTermDates.intSchoolYear AS AcademicYear,
schTerms.txtName AS TermName,
CAST(schTermDates.intSchoolYear AS CHAR(4)) + '/' +
RIGHT(CAST(schTermDates.intSchoolYear + 1 AS CHAR(4)), 2) AS AcademicYearName,
detPpl.dDetentionDate AS DetentionDate,
detSessions.txtSessionName AS DetentionName,
detPpl.txtOffenceDescription AS OffenceDescription,
LEFT(Staff.Firstname, 1) + '. ' + Staff.Surname AS PutInBy,
detPpl.intPresent AS AttendedDetention
FROM dbo.TblPupilManagementPupils AS ppl
INNER JOIN
dbo.TblDisciplineManagerDetentionsPupils AS detPpl
ON detPpl.txtSchoolID = ppl.txtSchoolID
INNER JOIN
dbo.TblDisciplineManagerDetentionsSessions AS detSessions
ON detPpl.intDetentionSessionID = detSessions.TblDisciplineManagerDetentionsSessionsID
INNER JOIN
dbo.TblStaff AS Staff
ON Staff.User_Code = detPpl.txtSubmittedBy
INNER JOIN
dbo.TblSchoolManagementTermDates AS schTermDates
ON detPpl.dDetentionDate BETWEEN schTermDates.txtStartDate AND schTermDates.txtFinishDate
INNER JOIN
dbo.TblSchoolManagementTermNames AS schTerms
ON schTermDates.intTerm = schTerms.TblSchoolManagementTermNamesID
LEFT OUTER JOIN
dbo.TblDisciplineManagerDetentionsCancellations AS Cancelled
ON Cancelled.intSessionID = detPpl.intDetentionSessionID
AND Cancelled.dDetDate = detPpl.dDetentionDate
WHERE (ppl.txtAcademicHouse = 'Challoner') AND (Cancelled.TblDisciplineManagerDetentionsCancellationsID IS NULL) AND (CAST(detPpl.dDetentionDate AS DATE) >= CAST (GETDATE()-28 AS DATE))
ORDER BY ppl.txtSurname, ppl.txtForename, detPpl.dDetentionDate
What you need is to add a couple of parameters to this code.
Let's call the parameters
@term_start
and
@term_end
In your WHERE clause, simply remove this piece
AND (CAST(detPpl.dDetentionDate AS DATE) >= CAST (GETDATE()-28 AS DATE))
and add this piece in its place
AND (CAST(detPpl.dDetentionDate AS DATE) BETWEEN @term_start AND @term_end)
Now create another dataset based on your term dates; let's call the dataset term_dates.
Something like this (using the table and columns you mentioned; I have no sample data, so adapt the idea to your requirements):
select
min(txtStartDate) as start_date
,max(txtFinishDate) as end_date
from TblSchoolManagementTermDates
where convert(date,getdate()) between txtStartDate and txtFinishDate
Now your report should have two parameters. You simply need to set their default values.
Set the default value for @term_start to start_date and for @term_end to end_date from your term_dates dataset.
Run your report. You should get the data between the term dates.
This should work, unless I've misunderstood the requirement.
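For anyone who wants to try the "current term" lookup outside SSRS, here is a rough sketch using Python's built-in sqlite3. The term_dates and detentions tables, their columns and the dates are all hypothetical, and "today" is passed in as a parameter so the result is repeatable.

```python
import sqlite3

# Hypothetical, simplified tables with ISO-formatted dates.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE term_dates (term INTEGER, start_date TEXT, finish_date TEXT);
    CREATE TABLE detentions (pupil TEXT, detention_date TEXT);
    INSERT INTO term_dates VALUES
        (1, '2023-09-04', '2023-12-15'),
        (2, '2024-01-08', '2024-03-22'),
        (3, '2024-04-15', '2024-07-12');
    INSERT INTO detentions VALUES
        ('A', '2023-11-20'),
        ('B', '2024-01-10'),
        ('C', '2024-02-01'),
        ('D', '2024-02-20');
""")

def detentions_this_term(today):
    """Detentions from the start of the term containing `today` up to `today`."""
    sql = """
        SELECT d.pupil, d.detention_date
        FROM detentions AS d
        JOIN term_dates AS t
          ON :today BETWEEN t.start_date AND t.finish_date
        WHERE d.detention_date BETWEEN t.start_date AND :today
        ORDER BY d.detention_date
    """
    return conn.execute(sql, {"today": today}).fetchall()

in_term = detentions_this_term("2024-02-05")
in_holiday = detentions_this_term("2024-03-30")
print(in_term, in_holiday)
```

Note that on a date that falls between terms no term row matches, so nothing is returned; the report defaults would need a fallback for the holidays if subscriptions run then.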

SQL to calculate value of Shares at a particular time

I'm looking for a way to calculate what the value of shares is at a given time.
In the example I need to calculate and report on the redemptions of shares in a given month.
There are 3 tables that I need to look at:
Redemptions table that has the Date of the redemption, the number of shares that were redeemed and the type of share.
The share type table which has the share type and links the 1st and 3rd tables.
The Share price table which has the share type, valuation date, value.
So what I need is a report, broken down by month, of the value of the redeemed shares, calculated from the number of shares redeemed.
Does that make sense?
Thanks in advance for your help!
Apologies, I think I should elaborate a little further as there might have been some misunderstandings. This isn't to calculate daily changing stocks and shares; it's more for fund management. What this means is that the share price only changes on a monthly basis, and it's also normally a month behind.
The effect of this is that the query needs to look at the date of the redemption and work out its month and year. It then looks at the share price table: if there is a share price for that date (this will need to be worked out, as the price is recorded for a single day, i.e. the price was x on day y), multiply the number of units by this value. If there isn't a share price for that date, use the last price for that particular share type.
Hopefully this might be a little more clear but if there's any other information I can provide to make this easier then please let me know and I'll supply you with the information.
Regards,
Phil
This should do the trick (note: updated to group by ShareType):
SELECT
ST.ShareType,
RedemptionMonth = DateAdd(month, DateDiff(month, 0, R.RedemptionDate), 0),
TotalShareValueRedeemed = Sum(P.SharePrice * R.SharesRedeemed)
FROM
dbo.Redemption R
INNER JOIN dbo.ShareType ST
ON R.ShareTypeID = ST.ShareTypeID
CROSS APPLY (
SELECT TOP 1 P.*
FROM dbo.SharePrice P
WHERE
R.ShareTypeID = P.ShareTypeID
AND R.RedemptionDate >= P.SharePriceDate
ORDER BY P.SharePriceDate DESC
) P
GROUP BY
ShareType,
DateAdd(month, DateDiff(month, 0, R.RedemptionDate), 0)
ORDER BY
ShareType,
RedemptionMonth
;
See it working in a Sql Fiddle.
This can easily be parameterized by simply adding a WHERE clause with conditions on the Redemption table. If you need to show a 0 for share types in months where they had no Redemptions, please let me know and I'll improve my answer--it would help if you would fill out your use case scenario a little bit, and describe exactly what you want to input and what you want to see as output.
Also please note: I'm assuming here that there will always be a price for a share redemption--if a redemption exists that is before any share price for it, that redemption will be excluded.
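The "latest price on or before the redemption date" lookup can be tried out without SQL Server. Below is a sketch using Python's built-in sqlite3, where a correlated subquery with ORDER BY ... DESC LIMIT 1 plays the role of CROSS APPLY with TOP 1. The table names, column names and figures are invented for the illustration.

```python
import sqlite3

# Hypothetical schema: monthly prices, and redemptions on arbitrary days.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE share_price (share_type TEXT, price_date TEXT, price REAL);
    CREATE TABLE redemption (share_type TEXT, redemption_date TEXT, shares INTEGER);
    INSERT INTO share_price VALUES
        ('A', '2012-01-31', 10.0),
        ('A', '2012-02-29', 12.0);
    INSERT INTO redemption VALUES
        ('A', '2012-02-10', 100),  -- latest earlier price: 10.0
        ('A', '2012-02-29', 50),   -- price on that exact day: 12.0
        ('A', '2012-03-15', 10);   -- no March price yet, falls back to 12.0
""")

sql = """
    SELECT r.share_type,
           strftime('%Y-%m', r.redemption_date) AS redemption_month,
           SUM(r.shares * (
               -- most recent price dated on or before the redemption
               SELECT p.price
               FROM share_price AS p
               WHERE p.share_type = r.share_type
                 AND p.price_date <= r.redemption_date
               ORDER BY p.price_date DESC
               LIMIT 1
           )) AS total_value
    FROM redemption AS r
    GROUP BY r.share_type, redemption_month
    ORDER BY r.share_type, redemption_month
"""
totals = conn.execute(sql).fetchall()
print(totals)
```

One difference from the CROSS APPLY version: a redemption dated before any price gets a NULL price here and is skipped by SUM, rather than the row being excluded by the join; the monthly total comes out the same.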
If you have the valuations for every day, then the calculation is a simple join followed by an aggregation. The resulting query is something like:
select year(redemptiondate), month(redemptiondate),
sum(r.NumShares*sp.Price) as TotalPrice
from Redemptions r left outer join
ShareType st
on r.sharetype = st.sharetype left outer join
SharePrice sp
on st.sharename = sp.sharename and r.redemptiondate = sp.pricedate
group by year(redemptiondate), month(redemptiondate)
order by 1, 2;
If I understand your question, you need a query like
select shares.id, shares.name, sum (redemption.quant * shareprices.price)
from shares
inner join redemption on shares.id = redemption.share
inner join shareprices on shares.id = shareprices.share
where redemption.curdate between :p1 and :p2
group by shares.id, shares.name
order by shares.id
:p1 and :p2 are date parameters
If you just need it for one date range:
SELECT s.ShareType, SUM(ISNULL(sp.SharePrice, 0) * ISNULL(r.NumRedemptions, 0)) [RedemptionPrice]
FROM dbo.Shares s
LEFT JOIN dbo.Redemptions r
ON r.ShareType = s.ShareType
OUTER APPLY (
SELECT TOP 1 SharePrice
FROM dbo.SharePrice p
WHERE p.ShareType = s.ShareType
AND p.ValuationDate <= r.RedemptionDate
ORDER BY p.ValuationDate DESC) sp
WHERE r.RedemptionDate BETWEEN @Date1 AND @Date2
GROUP BY s.ShareType
Where @Date1 and @Date2 are your dates
The ISNULL checks are just there so it actually gives you a value if something is null (it'll be 0). It's completely optional in this case, just a personal preference.
The OUTER APPLY acts like a LEFT JOIN that will filter down the results from SharePrice to make sure you get the most recent ValuationDate from table based on the RedemptionDate, even if it wasn't from the same date range as that date. It could probably be achieved another way, but I feel like this is easily readable.
If you don't feel comfortable with the OUTER APPLY, you could use a subquery in the SELECT part (i.e., ISNULL(r.NumRedemptions, 0) * (/* subquery from dbo.SharePrice here */)).

Window moving average in sql server

I am trying to create a function that computes a windowed moving average in SQLServer 2008. I am quite new to SQL so I am having a fair bit of difficulty. The data that I am trying to perform the moving average on needs to be grouped by day (it is all timestamped data) and then a variable moving average window needs to be applied to it.
I already have a function that groups the data by day (and @id) which is shown at the bottom. I have a few questions:
Would it be better to call the grouping function inside the moving average function or should I do it all at once?
Is it possible to get the moving average for the dates input into the function, but go back n days to begin the moving average so that the first n days of the returned data will not have 0 for their average? (ie. if they want a 7 day moving average from 01-08-2011 to 02-08-2011 that I start the moving average calculation on 01-01-2011 so that the first day they defined has a value?)
I am looking into how to compute the moving average, and a sliding window seems to be the best option (currentSum = prevSum + todayCount - nthDayAgoCount, then average = currentSum / nDays), but I am still working out the SQL implementation of this.
I have a grouping function that looks like this (some variables removed for visibility purposes):
SELECT
'ALL' as GeogType,
CAST(v.AdmissionOn as date) as dtAdmission,
CASE WHEN @id IS NULL THEN 99 ELSE v.ID END,
COUNT(*) as nVisits
FROM dbo.Table1 v INNER JOIN dbo.Table2 t ON v.FSLDU = t.FSLDU5
WHERE v.AdmissionOn >= '01-01-2010' AND v.AdmissionOn < DATEADD(day,1,'02-01-2010')
AND v.ID = Coalesce(@id,ID)
GROUP BY
CAST(v.AdmissionOn as date),
CASE WHEN @id IS NULL THEN 99 ELSE v.ID END
ORDER BY 2,3,4
Which returns a table like so:
ALL 2010-01-01 1 103
ALL 2010-01-02 1 114
ALL 2010-01-03 1 86
ALL 2010-01-04 1 88
ALL 2010-01-05 1 84
ALL 2010-01-06 1 87
ALL 2010-01-07 1 82
EDIT: To answer the first question I asked:
I ended up creating a function which declared a temporary table and inserted the results from the count function into it, then used the example from user662852 to compute the moving average.
Take the hardcoded date range out of your query. Write the output (like your sample at the end) to a temp table (I called it #visits below).
Try this self join to the temp table:
Select list.dtadmission
, AVG(data.nvisits) as Avg
, SUM(data.nvisits) as sum
, COUNT(data.nvisits) as RollingDayCount
, MIN(data.dtadmission) as Verifymindate
, MAX(data.dtadmission) as Verifymaxdate
from #visits as list
inner join #visits as data
on list.dtadmission between data.dtadmission and DATEADD(DD,6,data.dtadmission)
group by list.dtadmission
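If you want to experiment with the self join outside SQL Server, here is a rough equivalent using Python's built-in sqlite3. The visits table is a stand-in for the temp table, loaded with the seven daily counts from the question; the 3-day window and the report start date are arbitrary choices for the demo.

```python
import sqlite3

# Stand-in for the temp table, filled with the sample counts from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (dtadmission TEXT, nvisits INTEGER)")
counts = [103, 114, 86, 88, 84, 87, 82]
conn.executemany(
    "INSERT INTO visits VALUES (?, ?)",
    [(f"2010-01-{day:02d}", n) for day, n in enumerate(counts, start=1)],
)

WINDOW_DAYS = 3  # assumed window length, kept short for the demo

# For each reported day ("list"), pull in every row ("data") dated from
# WINDOW_DAYS - 1 days earlier up to that day, then average the counts.
sql = """
    SELECT list.dtadmission,
           AVG(data.nvisits) AS moving_avg,
           COUNT(*) AS days_in_window
    FROM visits AS list
    JOIN visits AS data
      ON data.dtadmission BETWEEN date(list.dtadmission, :offset)
                              AND list.dtadmission
    WHERE list.dtadmission >= :report_start
    GROUP BY list.dtadmission
    ORDER BY list.dtadmission
"""
params = {"offset": f"-{WINDOW_DAYS - 1} days", "report_start": "2010-01-03"}
moving = conn.execute(sql, params).fetchall()
for row in moving:
    print(row)
```

Because the table holds rows from before report_start, the first reported day already has a full window. That is one way to handle the "go back n days" part of the question: load the earlier days, but only report from the requested start date.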
EDIT: I didn't have enough room in Comments to say this in response to your question:
My join is "kinda cartesian" because it uses a BETWEEN in the join constraint. Each record in list is compared against every record in data, and I keep the pairs where the data date falls between six days before the list date and the list date itself. Every data date is available to every list date; this is the key to your question. I could have written the join condition as
data.dtadmission between DATEADD(DD,-6,list.dtadmission) and list.dtadmission
But what really happened was I tested it as
list.dtadmission between DATEADD(DD,6,data.dtadmission) and data.dtadmission
Which returns no records because the syntax is "Between LOW and HIGH". I facepalmed on 0 records and swapped the arguments, that's all.
Try the following, see what I mean: This is the cartesian join for just one listdate:
SELECT
list.[dtAdmission] as listdate
,data.[dtAdmission] as datadate
,data.nVisits as datadata
,DATEADD(dd,6,list.dtadmission) as listplus6
,DATEADD(dd,6,data.dtAdmission ) as dataplus6
from [sandbox].[dbo].[admAvg] as list inner join [sandbox].[dbo].[admAvg] as data
on
1=1
where list.dtAdmission = '5-Jan-2011'
Compare this to the actual join condition
SELECT
list.[dtAdmission] as listdate
,data.[dtAdmission] as datadate
,data.nVisits as datadata
,DATEADD(dd,6,list.dtadmission) as listplus6
,DATEADD(dd,6,data.dtAdmission ) as dataplus6
from [sandbox].[dbo].[admAvg] as list inner join [sandbox].[dbo].[admAvg] as data
on
list.dtadmission between data.dtadmission and DATEADD(DD,6,data.dtadmission)
where list.dtAdmission = '5-Jan-2011'
See how list date is between datadate and dataplus6 in all the records?

join two tables to get tomorrow's price (and the price two days from "now")

I'm trying to do a JOIN query to analyze some stocks. In my first table called top10perday, I list 10 stocks per day that I have chosen to "buy" the next day and sell the following day:
date symbol
07-Aug-08 PM
07-Aug-08 HNZ
07-Aug-08 KFT
07-Aug-08 MET
...
08-Aug-08 WYE
08-Aug-08 XOM
08-Aug-08 SGP
08-Aug-08 JNJ
For instance, for record #1:
the date of the record is 07-Aug-08
I want to buy a share of PM stock on the next trading day after 07-Aug-08 (which is 08-Aug-08)
I want to sell that share of PM stock two trading days after 07-Aug-08, which turns out to be 11-Aug-08
My stock prices are in a table called prices, which looks like this:
date symbol price
07-Aug-08 PM 54.64
08-Aug-08 PM 55.21
11-Aug-08 PM 55.75
12-Aug-08 PM 55.95
... many more records with trading day, symbol, price
I want to do a JOIN so that my result set looks like this:
date symbol price-next-day price-two-days
07-Aug-08 PM 55.21 55.75
...
with one record per date and symbol in top10perday.
I have tried doing something like:
SELECT top10perday.date, top10perday.symbol, Min(prices.date) AS MinOfdate
FROM prices INNER JOIN top10perday ON prices.symbol = top10perday.symbol
GROUP BY top10perday.date, top10perday.symbol
HAVING (((Min(prices.date))>[date]));
I have tried many variations of this, but I'm clearly not on the right path, because the result set just includes 10 rows as of the earliest date shown in my top10perday table.
I am using Microsoft Access. Thanks in advance for your help! :-)
This syntax worked in Access 2003:
SELECT t10.Date, t10.Symbol, p1.date, p1.price, p2.date, p2.price
FROM
(top10perday AS t10
LEFT JOIN prices AS p1
ON t10.Symbol = p1.symbol)
INNER JOIN prices AS p2 ON t10.Symbol = p2.symbol
WHERE (
((p1.date)=((Select Min([date]) as md
from prices
where [date]>t10.[Date] and symbol = t10.symbol
))
) AND ((p2.date)=((Select Min([date]) as md
from prices
where [date]>p1.[Date] and symbol = t10.symbol)
))
);
The idea is to get the first (minimum) price date that is later than the date in the previous table: later than top10perday's date for p1, and later than p1's date for p2.
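Here is the same MIN(date) idea as a runnable sketch, using Python's built-in sqlite3 instead of Access and inner joins throughout. The date column is renamed trade_date and stored in ISO format so that plain comparisons sort correctly; both are choices made for this example, and only the PM rows from the question are loaded.

```python
import sqlite3

# Sample rows from the question, with dates rewritten as ISO strings.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE top10perday (trade_date TEXT, symbol TEXT);
    CREATE TABLE prices (trade_date TEXT, symbol TEXT, price REAL);
    INSERT INTO top10perday VALUES ('2008-08-07', 'PM');
    INSERT INTO prices VALUES
        ('2008-08-07', 'PM', 54.64),
        ('2008-08-08', 'PM', 55.21),
        ('2008-08-11', 'PM', 55.75),
        ('2008-08-12', 'PM', 55.95);
""")

sql = """
    SELECT t.trade_date, t.symbol,
           p1.price AS price_next_day,
           p2.price AS price_two_days
    FROM top10perday AS t
    JOIN prices AS p1
      ON p1.symbol = t.symbol
     -- first price date after the pick date
     AND p1.trade_date = (SELECT MIN(q.trade_date) FROM prices AS q
                          WHERE q.symbol = t.symbol
                            AND q.trade_date > t.trade_date)
    JOIN prices AS p2
      ON p2.symbol = t.symbol
     -- first price date after p1's date (skips the weekend here)
     AND p2.trade_date = (SELECT MIN(q.trade_date) FROM prices AS q
                          WHERE q.symbol = t.symbol
                            AND q.trade_date > p1.trade_date)
"""
picks = conn.execute(sql).fetchall()
print(picks)
```

Since "next trading day" is defined as the next date that has a price row for that symbol, weekends and holidays are skipped without needing a calendar table.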
This should just be a join between three copies of the prices table. The problem is that you need to join to the next trading day, and that's a slightly trickier problem, since it's not always the next calendar day. So we end up with a more complex situation (particularly as some days are skipped because of holidays).
If it weren't Access you could use row_number() to order your prices by date (using a different sequence per stock code).
WITH OrderedPrices AS
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY symbol ORDER BY date) AS RowNum
FROM Prices
)
SELECT orig.*, next_day.price, two_days.price
FROM OrderedPrices orig
JOIN
OrderedPrices next_day
ON next_day.symbol = orig.symbol AND next_day.RowNum = orig.RowNum + 1
JOIN
OrderedPrices two_days
ON two_days.symbol = orig.symbol AND two_days.RowNum = orig.RowNum + 2
;
But you're using Access, so I don't think you have ROW_NUMBER().
Instead, you could have a table which lists the dates, having a TradingDayNumber... then use that to facilitate your join.
SELECT orig.*, next_day.price, two_days.price
FROM Prices orig
JOIN
TradingDays d0
ON d0.date = orig.date
JOIN
TradingDays d1
ON d1.TradingDayNum = d0.TradingDayNum + 1
JOIN
TradingDays d2
ON d2.TradingDayNum = d0.TradingDayNum + 2
JOIN
Prices next_day
ON next_day.symbol = orig.symbol AND next_day.date = d1.date
JOIN
Prices two_days
ON two_days.symbol = orig.symbol AND two_days.date = d2.date
But obviously you'll need to construct your TradingDays table...
Rob
My guess is:
SELECT top10perday.date, top10perday.symbol, MIN(pnd.price) AS PriceNextDay, MIN(ptd.price) AS PriceTwoDays
FROM top10perday
LEFT OUTER JOIN prices AS pnd ON (pnd.symbol = top10perday.symbol AND pnd.date > top10perday.date)
LEFT OUTER JOIN prices AS ptd ON (ptd.symbol = top10perday.symbol AND ptd.date > pnd.date)
GROUP BY top10perday.date, top10perday.symbol
HAVING (pnd.date = Min(pnd.date) AND ptd.date = Min(ptd.date));
It's just a shot in the dark, but my reasoning is: list all the stocks you want (top10perday) and, for each stock, get the price, if it exists, with the minimum date after its date to populate PriceNextDay, and the price with the minimum date after PriceNextDay's date to populate PriceTwoDays. The performance may stink. But test it and see if it works. Later we can try to improve it.
**EDIT**ed to include Rob Farley's comment.
I'm not a guru on this transformation but I can point you at an idea. Try using Pivot on the date column for each symbol in your query from a date to a date. This should give you a table with many columns with the name of the date you're using, and the price on each day. Indeed it should do this for every stock symbol you have over a given time.
Based on what you're trying to graph though, I think it would be interesting for you to look at the VWSP not just the spot price on your trades if you're trying to plot the stock performance.