I'm pretty new to SQL, so I'm reading the SQL Server 2012 T-SQL Fundamentals book to get introduced to these topics. There are two examples that I'm trying to dissect to get a clear understanding. First, the author executes the following query:
SELECT
    empid,
    YEAR(orderdate) AS orderyear,
    SUM(freight) AS totalfreight,
    COUNT(*) AS numorders
FROM
    Sales.Orders
WHERE
    custid = 71
GROUP BY
    empid, YEAR(orderdate);
to obtain this:
empid orderyear totalfreight numorders
----------- ----------- --------------------- -----------
1 2006 126.56 1
2 2006 89.16 1
9 2006 214.27 1
1 2007 711.13 2
2 2007 352.69 1
3 2007 297.65 2
4 2007 86.53 1
5 2007 277.14 3
6 2007 628.31 3
7 2007 388.98 1
8 2007 371.07 4
1 2008 357.44 3
2 2008 672.16 2
4 2008 651.83 3
6 2008 227.22 1
7 2008 1231.56 2
But, in the 2nd example the author runs the following query:
SELECT
    empid, YEAR(orderdate) AS orderyear
FROM
    Sales.Orders
WHERE
    custid = 71
GROUP BY
    empid, YEAR(orderdate)
HAVING
    COUNT(*) > 1;
This query returns the following output:
empid orderyear
----------- -----------
1 2007
3 2007
5 2007
6 2007
8 2007
1 2008
2 2008
4 2008
7 2008
My questions are:
Why does the result set exclude the 2006 rows?
Why are there two rows with empid 1?
How does the HAVING clause determine which rows are returned in both columns?
Thank you in advance.
In the first query, we see this:
COUNT(*) AS numorders
And in the second:
COUNT(*) > 1;
In the second query, this value isn't displayed, but we can use the first set of results to figure it out. None of the following rows are included in the second query's output:
empid orderyear totalfreight numorders
----------- ----------- --------------------- -----------
1 2006 126.56 1
2 2006 89.16 1
9 2006 214.27 1
2 2007 352.69 1
4 2007 86.53 1
7 2007 388.98 1
6 2008 227.22 1
Why?
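Because numorders is only 1 for each of them, and the second query keeps only the groups where COUNT(*) > 1. The groups are formed per (empid, orderyear) pair, which is why empid 1 appears twice: it has more than one order in both 2007 and 2008.
As to your question, HAVING is the counterpart of WHERE that filters on aggregate functions (such as COUNT()).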
http://www.w3schools.com/sql/sql_having.asp
You use HAVING together with GROUP BY.
The criteria in WHERE apply to individual rows before aggregation, while the criteria in HAVING apply to the aggregated results.
In other words, you have two places to filter the dataset: WHERE, which runs before grouping, and HAVING, which runs after grouping.
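To see the two filtering stages side by side, here is a minimal sketch that runs both kinds of query against SQLite from Python. It is not the book's Sales.Orders table: the `orders` table, its `orderyear` column and the five rows are invented for illustration, and SQLite stands in for SQL Server.

```python
import sqlite3

# Hypothetical miniature of Sales.Orders; table, columns and rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (custid INTEGER, empid INTEGER, orderyear INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (71, 1, 2006),  # employee 1: one order in 2006
        (71, 1, 2007),  # employee 1: two orders in 2007
        (71, 1, 2007),
        (71, 2, 2007),  # employee 2: one order in 2007
        (99, 2, 2007),  # different customer, removed by WHERE before grouping
    ],
)

# WHERE filters rows, GROUP BY forms the groups, COUNT(*) is computed per group.
grouped = conn.execute(
    """
    SELECT empid, orderyear, COUNT(*) AS numorders
    FROM orders
    WHERE custid = 71
    GROUP BY empid, orderyear
    ORDER BY orderyear, empid
    """
).fetchall()

# Same grouping, but HAVING then discards every group whose count is 1.
filtered = conn.execute(
    """
    SELECT empid, orderyear
    FROM orders
    WHERE custid = 71
    GROUP BY empid, orderyear
    HAVING COUNT(*) > 1
    ORDER BY orderyear, empid
    """
).fetchall()

print(grouped)   # one row per (empid, orderyear) group with its count
print(filtered)  # only the groups that survived HAVING
```

The row for custid 99 never reaches a group because WHERE removes it first; the groups with a single order are formed but then dropped by HAVING.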
Related
This is my table:
index_melanoma_yr Total_Melanoma Total_Virus
2000 700 12
2001 746 7
2002 724 12
2003 815 15
2004 893 16
2005 1020 22
I would like to count by 5 year increments. So, 2000-2004, 2005-2009, etc. I can hard code this, but since there are so many years, I'm wondering if there is a more efficient way.
Here's how I got the initial counts:
SELECT index_melanoma_yr,
       COUNT(DISTINCT PersonID) AS Total_Melanoma,
       SUM(CASE WHEN index_virus_yr IS NOT NULL THEN 1 ELSE 0 END) AS Total_Virus
FROM Asare_ViralMelanoma_IndexDates
GROUP BY index_melanoma_yr
ORDER BY index_melanoma_yr
You can perform some simple maths on the year column, year / 5 * 5, and then GROUP BY that. This assumes the year column is an integer, so the division truncates (for example, 2003 / 5 * 5 = 2000).
SELECT MIN(index_melanoma_yr) AS Year_Start,
       MAX(index_melanoma_yr) AS Year_End,
       COUNT(DISTINCT PersonID) AS Total_Melanoma,
       SUM(CASE WHEN index_virus_yr IS NOT NULL THEN 1 ELSE 0 END) AS Total_Virus
FROM Asare_ViralMelanoma_IndexDates
GROUP BY index_melanoma_yr / 5 * 5
ORDER BY Year_Start
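As a rough check of the bucketing arithmetic, the sketch below runs the same integer-division trick in SQLite from Python. The `cases` table, its columns and the six rows are made up for the example; only the `yr / 5 * 5` expression comes from the answer above.

```python
import sqlite3

# Hypothetical data: one row per person with the year of diagnosis.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cases (person_id INTEGER, yr INTEGER)")
conn.executemany(
    "INSERT INTO cases VALUES (?, ?)",
    [(1, 2000), (2, 2003), (3, 2004), (4, 2005), (5, 2009), (6, 2010)],
)

# Integer division truncates, so yr / 5 * 5 maps each year to the first
# year of its 5-year bucket (2003 -> 2000, 2009 -> 2005, and so on).
buckets = conn.execute(
    """
    SELECT yr / 5 * 5 AS bucket_start,
           MIN(yr)    AS year_start,
           MAX(yr)    AS year_end,
           COUNT(*)   AS total
    FROM cases
    GROUP BY yr / 5 * 5
    ORDER BY bucket_start
    """
).fetchall()

for row in buckets:
    print(row)
```

Note that MIN and MAX report the first and last years actually present in each bucket, so a bucket with missing years shows a narrower range than the nominal five; `bucket_start + 4` gives the nominal end if that is what you want to display.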
I have a table relating to products:
PRD_SLD table
ID DATE SALE_IND
3 2012 0
3 2013 0
3 2014 1
3 2014 1
3 2015 1
3 2016 0
3 2017 1
I would like my final results to look like this:
PRD_SLD table
ID DATE SALE_IND STRT END
3 2012 0 2012 2014
3 2013 0 2012 2014
3 2014 1 2014 2016
3 2014 1 2014 2016
3 2015 1 2014 2016
3 2016 0 2016 2017
3 2017 1 2017 2017
I currently have a working CTE for retrieving the rows in which the values change. This CTE returns:
PRD_SLD table
ID DATE SALE_IND
3 2012 0
3 2014 1
3 2016 0
3 2017 1
So it returns the first instance of the value in the table, and returns every time the SALE_IND changes.
Is there a way to create a start and end date based on the date column? I am still very new to this and was enrolled in an advanced course. I'm sure there is a better way to do this, but is there a way to do it with the CTE results I have created? I know there is a BETWEEN operator, but I don't know how to use it in this query.
One method is to define groups of adjacent records. You don't have a solid ordering of the rows, but you do have just enough information for this to work -- assuming the indicator is constant in each year.
select t.*,
       -- rows with the same (id, sale_ind, seqnum - seqnum_s) form one run of adjacent years
       min(date) over (partition by id, sale_ind, seqnum - seqnum_s) as min_year,
       max(date) over (partition by id, sale_ind, seqnum - seqnum_s) as max_year
from (select t.*,
             dense_rank() over (partition by id order by date) as seqnum,
             dense_rank() over (partition by id, sale_ind order by date) as seqnum_s
      from t
     ) t;
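To check how the two dense_rank() values carve out the runs, here is a small sketch that applies the same idea to the question's sample rows in SQLite from Python. The table name `prd_sld` and the column name `yr` (instead of DATE) are choices made for this example, and window functions need SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prd_sld (id INTEGER, yr INTEGER, sale_ind INTEGER)")
conn.executemany(
    "INSERT INTO prd_sld VALUES (?, ?, ?)",
    [(3, 2012, 0), (3, 2013, 0), (3, 2014, 1), (3, 2014, 1),
     (3, 2015, 1), (3, 2016, 0), (3, 2017, 1)],
)

# seqnum ranks the years per id; seqnum_s ranks them per (id, sale_ind).
# Within one unbroken run of the same sale_ind both ranks advance together,
# so their difference is constant and can be used as a group key.
runs = conn.execute(
    """
    SELECT id, yr, sale_ind,
           MIN(yr) OVER (PARTITION BY id, sale_ind, seqnum - seqnum_s) AS min_year,
           MAX(yr) OVER (PARTITION BY id, sale_ind, seqnum - seqnum_s) AS max_year
    FROM (
        SELECT t.*,
               DENSE_RANK() OVER (PARTITION BY id ORDER BY yr) AS seqnum,
               DENSE_RANK() OVER (PARTITION BY id, sale_ind ORDER BY yr) AS seqnum_s
        FROM prd_sld AS t
    ) AS ranked
    ORDER BY yr
    """
).fetchall()

for row in runs:
    print(row)
```

On this data max_year is the last year inside each run (2013 for the first run), whereas the desired output in the question shows the first year of the following run (2014), so a further step would be needed if that exact END value is required.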
I have a table like this:
ID Region CreatedDate Value
--------------------------------
1 USA 2016-01-01 5
2 USA 2016-02-02 10
3 Canada 2016-02-02 2
4 USA 2016-02-03 7
5 Canada 2016-03-03 3
6 Canada 2016-03-04 10
7 USA 2016-03-04 1
8 Cuba 2016-01-01 4
I need to sum column Value grouped by Region and CreatedDate by year and month. The result will be
Region Year Month SumOfValue
--------------------------------
USA 2016 1 5
USA 2016 2 17
USA 2016 3 1
Canada 2016 2 2
Canada 2016 3 13
Cuba 2016 1 4
BUT I want to replace every repeated value in the Region column with an empty string, except in the first row where it appears. The final result must be:
Region Year Month SumOfValue
--------------------------------
USA 2016 1 5
2016 2 17
2016 3 1
Canada 2016 2 2
2016 3 13
Cuba 2016 1 4
Thank you for a solution. It would be a bonus if the solution also did the same for the Year column.
You need to use SUM and GROUP BY to get the SumOfValue. For the formatting, you can use ROW_NUMBER:
WITH Cte AS(
SELECT
Region,
[Year] = YEAR(CreatedDate),
[Month] = MONTH(CreatedDate),
SumOfValue = SUM(Value),
Rn = ROW_NUMBER() OVER(PARTITION BY Region ORDER BY YEAR(CreatedDate), MONTH(CreatedDate))
FROM #tbl
GROUP BY
Region, YEAR(CreatedDate), MONTH(CreatedDate)
)
SELECT
Region = CASE WHEN Rn = 1 THEN c.Region ELSE '' END,
[Year],
[Month],
SumOfValue
FROM Cte c
ORDER BY
c.Region, Rn
Although this can be done in TSQL, I suggest you do the formatting on the application side.
Query that follows the same order as the OP.
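As for doing the formatting on the application side, a minimal sketch in Python could look like the following. It assumes the query has already returned rows sorted so that each region's rows are adjacent; the `blank_repeats` helper is a name made up for this example.

```python
# Rows as they might come back from the grouped query, already ordered
# so that all rows of one region sit next to each other.
rows = [
    ("USA", 2016, 1, 5),
    ("USA", 2016, 2, 17),
    ("USA", 2016, 3, 1),
    ("Canada", 2016, 2, 2),
    ("Canada", 2016, 3, 13),
    ("Cuba", 2016, 1, 4),
]


def blank_repeats(rows):
    """Return the rows with Region blanked wherever it repeats the previous row."""
    out = []
    previous_region = None
    for region, year, month, total in rows:
        shown = region if region != previous_region else ""
        out.append((shown, year, month, total))
        previous_region = region
    return out


formatted = blank_repeats(rows)
for row in formatted:
    print(row)
```

Keeping the query output complete and blanking values only at display time means the data stays usable for sorting or further processing.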
I have a table that looks like this:
YEAR  RESOLUTION_DATE  CREATION_DATE
2013                   2013/02/18
2012                   2012/05/26
2009                   2009/11/11
2013  2013/12/08       2013/12/01
2000                   2000/17/31
2007                   2007/12/08
2012                   2012/12/08
2012  2012/03/23       2012/03/10
2012                   2012/12/08
2007                   2007/01/17
2012  2012/01/17       2012/01/10
2009                   2009/02/14
I am trying to make a query that will output the following:
YEAR COUNT_RESOLUTION_DATE COUNT_CREATION_DATE
2000 0 1
2007 0 2
2009 0 2
2011 0 0
2012 2 5
2013 1 2
The caveat is that I would like the query to count the number of RESOLUTION_DATE values by YEAR where RESOLUTION_DATE IS NOT NULL, and I want to count ALL CREATION_DATE values. The SQL is needed for an Oracle database.
Try this:
SELECT
    YEAR,
    COUNT(RESOLUTION_DATE) AS COUNT_RESOLUTION_DATE,
    COUNT(CREATION_DATE) AS COUNT_CREATION_DATE
FROM MyTable
GROUP BY YEAR
ORDER BY YEAR
COUNT(RESOLUTION_DATE) already ignores NULLs, so the query above counts only the non-NULL resolution dates. If you prefer to make that explicit, this is equivalent:
SELECT
    YEAR,
    SUM(CASE WHEN RESOLUTION_DATE IS NULL THEN 0 ELSE 1 END) AS COUNT_RESOLUTION_DATE,
    COUNT(CREATION_DATE) AS COUNT_CREATION_DATE
FROM MyTable
GROUP BY YEAR
ORDER BY YEAR;
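To check how COUNT on a column treats NULLs, here is a small sketch using SQLite from Python rather than Oracle. The `tickets` table and the `yr` column name are choices made for the example; the four rows are taken from the 2012 and 2013 rows in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tickets (yr INTEGER, resolution_date TEXT, creation_date TEXT)"
)
conn.executemany(
    "INSERT INTO tickets VALUES (?, ?, ?)",
    [
        (2013, None, "2013/02/18"),          # unresolved: resolution_date is NULL
        (2013, "2013/12/08", "2013/12/01"),
        (2012, None, "2012/05/26"),
        (2012, "2012/03/23", "2012/03/10"),
    ],
)

# COUNT(column) skips NULLs, so it should match the explicit SUM(CASE ...) form.
counts = conn.execute(
    """
    SELECT yr,
           COUNT(resolution_date) AS count_resolution,
           SUM(CASE WHEN resolution_date IS NULL THEN 0 ELSE 1 END) AS sum_resolution,
           COUNT(creation_date) AS count_creation
    FROM tickets
    GROUP BY yr
    ORDER BY yr
    """
).fetchall()

for row in counts:
    print(row)
```

Neither form produces a row for a year that has no rows in the table, such as the 2011 line in the desired output; that would need a separate list of years to join against.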
I could use some help with SQL to solve the following problem.
I have a table that has the following columns and data.
CUSTOMER PAYMENT_YEAR BILLS_PER_YEAR BILL_NUMBER
Chris 2010 1 1
Chris 2010 2 2
Chris 2010 3 3
I would like to return all three rows but with the max bills_per_year value in all 3 rows.
It would look like this.
CUSTOMER PAYMENT_YEAR BILLS_PER_YEAR BILL_NUMBER
Chris 2010 3 1
Chris 2010 3 2
Chris 2010 3 3
Will this work for you?
SELECT CUSTOMER, PAYMENT_YEAR, BILLS_PER_YEAR, BILL_NUMBER,
MAX(BILLS_PER_YEAR) OVER (PARTITION BY CUSTOMER,PAYMENT_YEAR) as max_per_year
FROM table1
Is this what you mean?
SELECT
CUSTOMER,
PAYMENT_YEAR,
MAX(BILLS_PER_YEAR) OVER(PARTITION BY CUSTOMER, PAYMENT_YEAR) BILLS_PER_YEAR,
BILL_NUMBER
FROM TableName
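Both answers rely on MAX() as a window function, which keeps every row instead of collapsing the group. A minimal sketch of that behaviour, run in SQLite from Python on the three rows from the question (the table name `bills` is made up, and window functions need SQLite 3.25 or later):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bills (customer TEXT, payment_year INTEGER, "
    "bills_per_year INTEGER, bill_number INTEGER)"
)
conn.executemany(
    "INSERT INTO bills VALUES (?, ?, ?, ?)",
    [("Chris", 2010, 1, 1), ("Chris", 2010, 2, 2), ("Chris", 2010, 3, 3)],
)

# MAX(...) OVER (PARTITION BY ...) computes the maximum per customer and year
# and repeats it on every row of that partition.
result = conn.execute(
    """
    SELECT customer,
           payment_year,
           MAX(bills_per_year) OVER (PARTITION BY customer, payment_year) AS bills_per_year,
           bill_number
    FROM bills
    ORDER BY bill_number
    """
).fetchall()

for row in result:
    print(row)
```

A plain GROUP BY with MAX would return a single row per customer and year, which is why the window form is the one that matches the requested output.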