I'm looking to improve the performance of the Last Year Attended (LYA) query. Right now, it's taking 20+ minutes to run this block.
The LYA takes the most recent year a company attended a particular event and finds the year they attended prior to that maximum. For example, if they attended an event in 2018, the query will look for the last year attended prior to 2018.
LYA for 2018 should return a Null
The data should return the following:
CompanyID MarketID Industry LAST YEAR ATTENDED
-------------------------------------------------------
123456 1234 GIFT 2018
123457 1234 HOME 2017
123458 1234 GIFT 2018
123459 1234 HOME 2018
123460 1234 APPAREL 2018
123461 1234 HOME 2018
123462 1234 HOME 2017
123463 1234 APPAREL 2018
Can anyone assist?
SELECT DISTINCT
COMPANYID, MARKETID, INDUSTRY,
[LAST YEAR ATTENDED] = (SELECT MAX(YEAR(attdate))
FROM v_marketatt va
WHERE va.companyid = vm.companyid
AND YEAR(attdate) <> (SELECT MAX(YEAR(attdate))
FROM v_marketatt vb
WHERE vb.companyid = vm.companyid)
AND MARKETCODE LIKE 'SM1%')
FROM
v_marketatt vm
WHERE
MARKETID IN (835, 1032, 1101)
UPDATE:
Found that this version is more efficient than the rest. Run time is down to 7 minutes on a clone. Instead of allowing the subquery to dip into my view twice, I had it dip once.
select
DISTINCT COMPANYID,
MARKETID,
INDUSTRY,
CSTATUS,
[LAST YEAR ATTENDED] = (select max(year(attdate)) from v_marketatt va where year(attdate) <> (select max(year(attdate)) from v_marketatt) AND MARKETCODE LIKE 'SM1%' AND va.COMPANYID = vm.COMPANYID)
from v_marketatt vm
WHERE MARKETID IN (835,1032,1101)
;
Thanks to all who responded.
The field [LAST YEAR ATTENDED] has a subquery that computes the max year on each iteration. You can try moving this piece of the query into a join, something like:
select DISTINCT vm.COMPANYID, vm.MARKETID, vm.INDUSTRY,
va.[LAST YEAR ATTENDED]
from v_marketatt vm
-- an inner join drops companies with no earlier year; use LEFT JOIN to keep them with a NULL
inner join
( select ivm.companyid, max(year(ivm.attdate)) as [LAST YEAR ATTENDED]
from v_marketatt ivm
where year(ivm.attdate) <> (select max(year(vb.attdate))
from v_marketatt vb
where vb.companyid = ivm.companyid)
AND ivm.MARKETCODE LIKE 'SM1%'
group by ivm.companyid ) va on va.companyid = vm.companyid
--where companyid not in (select distinct companyid from v_marketatt where marketid in (602))
WHERE vm.MARKETID IN (835,1032,1101)
I have not run this query, so there could be some minor syntax corrections needed, but if you get the concept it should be easy to pick up and fix.
Apologies for the syntax; I'm throwing this together quickly. But I suspect making use of a CTE should improve performance dramatically. I'm also not quite sure what you're doing here:
WHERE va.companyid = vm.companyid
AND YEAR(attdate) <> (SELECT MAX(YEAR(attdate))
FROM v_marketatt vb)
AND MARKETCODE LIKE 'SM1%'
So I've left that piece alone. Try something like this, which should help; clarification on the part I've noted above might unlock other things to tweak.
;with Year_CTE (companyid, [year])
as
(SELECT va.companyid, MAX(YEAR(va.attdate))
FROM v_marketatt va
WHERE YEAR(va.attdate) <> (SELECT MAX(YEAR(vb.attdate))
FROM v_marketatt vb)
AND va.MARKETCODE LIKE 'SM1%'
GROUP BY va.companyid)
SELECT DISTINCT
vm.COMPANYID, vm.MARKETID, vm.INDUSTRY,
yc.[year]
FROM
v_marketatt vm
join Year_CTE yc on yc.companyid = vm.companyid
WHERE
vm.MARKETID IN (835, 1032, 1101)
If you need 'the one before this one', I'd suggest using the LEAD() or LAG() functions.
Although I'm not quite sure I fully understand your example (see Thorsten Kettner's comments), going by the explanation I think what you want is something along the lines of:
;WITH years
AS (
SELECT COMPANYID, MARKETID, INDUSTRY, YEAR_ATTENDED = Year(attdate)
FROM v_marketatt
WHERE MARKETID IN (835, 1032, 1101)
AND MARKETCODE LIKE 'SM1%' -- not sure about this one, the example isn't very clear
GROUP BY COMPANYID, MARKETID, INDUSTRY, Year(attdate)
),
last_ones
AS (
SELECT row_nbr = ROW_NUMBER() OVER ( PARTITION BY COMPANYID, MARKETID, INDUSTRY ORDER BY YEAR_ATTENDED DESC),
COMPANYID, MARKETID, INDUSTRY,
LAST_YEAR_ATTENDED = YEAR_ATTENDED,
PREV_YEAR_ATTENDED = LEAD(YEAR_ATTENDED, 1, NULL) OVER (PARTITION BY COMPANYID, MARKETID, INDUSTRY ORDER BY YEAR_ATTENDED DESC)
FROM years
)
SELECT COMPANYID, MARKETID, INDUSTRY,
LAST_YEAR_ATTENDED,
PREV_YEAR_ATTENDED
FROM last_ones
WHERE row_nbr = 1
Since I don't have the tables nor the data here, I haven't tested the query, but I hope it will get you going...
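For anyone who wants to try the ROW_NUMBER()/LEAD() idea without access to the real view, here is a minimal sketch that runs the same pattern against an in-memory SQLite database from Python. Everything about the setup is an assumption for illustration: the table is a cut-down stand-in for v_marketatt, the sample rows are made up, the partition is per company only, and SQLite's strftime('%Y', ...) replaces T-SQL's YEAR(). It needs SQLite 3.25 or later for window functions.

```python
import sqlite3

def previous_year_attended(rows):
    """rows: (companyid, marketid, attdate as 'YYYY-MM-DD').
    Returns {companyid: (most_recent_year, year_before_that_or_None)}."""
    con = sqlite3.connect(":memory:")
    # Hypothetical, simplified stand-in for the v_marketatt view.
    con.execute(
        "CREATE TABLE v_marketatt (companyid INTEGER, marketid INTEGER, attdate TEXT)"
    )
    con.executemany("INSERT INTO v_marketatt VALUES (?, ?, ?)", rows)
    sql = """
        WITH years AS (
            -- one row per company and distinct year attended
            SELECT DISTINCT companyid,
                   CAST(strftime('%Y', attdate) AS INTEGER) AS year_attended
            FROM v_marketatt
        ),
        ranked AS (
            SELECT companyid,
                   year_attended,
                   ROW_NUMBER() OVER (PARTITION BY companyid
                                      ORDER BY year_attended DESC) AS row_nbr,
                   -- next row in descending order = the year before this one
                   LEAD(year_attended) OVER (PARTITION BY companyid
                                             ORDER BY year_attended DESC) AS prev_year
            FROM years
        )
        SELECT companyid, year_attended, prev_year
        FROM ranked
        WHERE row_nbr = 1
        ORDER BY companyid
    """
    result = {cid: (last, prev) for cid, last, prev in con.execute(sql)}
    con.close()
    return result

# Made-up sample data: company 123458 attended in one year only.
sample = [
    (123456, 835, "2019-01-10"),
    (123456, 835, "2018-01-12"),
    (123456, 835, "2018-07-02"),
    (123457, 835, "2019-01-10"),
    (123457, 835, "2017-01-09"),
    (123458, 835, "2018-01-12"),
]
print(previous_year_attended(sample))
```

A company with only one year of attendance gets None (NULL) for the previous year, because LEAD() finds no next row in its partition.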
Related
I have spent some time on StackOverflow looking for this answer and trying a bunch of these solutions without luck. I feel like I am missing something minor but I cannot resolve it. NOTE - Learning SQL and using Access because that is what work uses.
I have 2 tables. The first has consultant information in it (columns: Consultant ID, First Name, Last Name, Active (Yes/No checkbox)). The second has their weekly numbers (columns: AutoGenID, Consultant ID, WeekOf (Date), FullService, Consulting, Classified, Reallocations, RecruitmentSrvs). I joined them on ConsultantID (primary key).
I built a simple query to join the 2 tables and show all the results ONLY for the active consultants in qry_Join (anyone marked not active does not show in this query). qry_Join returns Consultant ID, First Name, and Last Name (from tbl_Consultants) and then WeekOf (Date), FullService, Consulting, Classified, Reallocations, RecruitmentSrvs from tbl_WeeklyNumbers.
Question:
I would like to have a query that shows ONLY the most recent WeekOf (Date) entry for each consultant.
Issue:
The SQL I am using is below. The issue is that if ConsultantIDs 3, 4, 5, 6, and 7 use a 10/11/2017 date and ConsultantID 8 uses 10/12/2017, the query returns only ConsultantID 8's row, since it is the most recent. I need it to still return every other consultant's most recent row as well, even if its date is earlier than ConsultantID 8's.
SQL:
SELECT ConsultantID, FirstName, WeekOf, USFullService, USConsulting, Classified, Reallocations, RecruitmentSrvs
FROM qry_Join
Where WeekOf = (SELECT MAX(WeekOf) FROM qry_Join)
Just pass ConsultantID to your subquery:
SELECT ConsultantID, FirstName, WeekOf, USFullService, USConsulting, Classified, Reallocations, RecruitmentSrvs
FROM qry_Join q
Where WeekOf = (SELECT MAX(s.WeekOf) FROM qry_Join s WHERE s.ConsultantID = q.ConsultantID)
SELECT ConsultantID,
FirstName,
WeekOf,
USFullService,
USConsulting,
Classified,
Reallocations,
RecruitmentSrvs
FROM qry_Join INNER JOIN
(SELECT MAX(WeekOf) AS maxwkof, ConsultantID AS cid
FROM qry_Join
GROUP BY ConsultantID) AS m
ON ConsultantID = cid
WHERE maxwkof = WeekOf
The issue in your query is that the subquery below returns the max date in the whole table, and that max date matches only one consultant, so you get one row:
WeekOf = (SELECT MAX(WeekOf) FROM qry_Join)
The query below gives the max date for each consultant, so you can join on the max date and ConsultantID to get the details for each consultant's latest WeekOf:
SELECT MAX(WeekOf) maxwkof ,ConsultantID cid
FROM qry_Join
GROUP BY ConsultantID
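To see the per-consultant version return one row per consultant, here is a small sketch that runs the same join against an in-memory SQLite database from Python. The table is a cut-down stand-in for qry_Join with made-up rows, and dates are stored as ISO text ('YYYY-MM-DD') so that MAX() orders them correctly; both are assumptions for the example, not how Access stores them.

```python
import sqlite3

def latest_row_per_consultant(rows):
    """rows: (ConsultantID, WeekOf as 'YYYY-MM-DD', FullService)."""
    con = sqlite3.connect(":memory:")
    # Hypothetical, simplified stand-in for qry_Join.
    con.execute(
        "CREATE TABLE qry_Join (ConsultantID INTEGER, WeekOf TEXT, FullService INTEGER)"
    )
    con.executemany("INSERT INTO qry_Join VALUES (?, ?, ?)", rows)
    sql = """
        SELECT q.ConsultantID, q.WeekOf, q.FullService
        FROM qry_Join AS q
        INNER JOIN (
            -- one row per consultant holding that consultant's latest date
            SELECT ConsultantID AS cid, MAX(WeekOf) AS maxwkof
            FROM qry_Join
            GROUP BY ConsultantID
        ) AS m
          ON q.ConsultantID = m.cid AND q.WeekOf = m.maxwkof
        ORDER BY q.ConsultantID
    """
    result = con.execute(sql).fetchall()
    con.close()
    return result

# Made-up data: consultant 3's latest week is earlier than consultant 8's.
weekly = [
    (3, "2017-10-04", 5),
    (3, "2017-10-11", 7),
    (8, "2017-10-05", 1),
    (8, "2017-10-12", 2),
]
print(latest_row_per_consultant(weekly))
```

Consultant 3 still appears with its 2017-10-11 row even though consultant 8 has a later date, which is the behaviour the question asks for.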
I hope someone can guide me down the correct path. It's my first class in SQL.
SELECT distinct
a.LICENSEID,
a.license,
a.business_name,
a,year
a.TOTAL_AMOUNT_PAID,
SUM(e.COMPUTED_AMOUNT) over (partition by e.LICENSEID) as AMOUNT_OWNED,
FROM vw_business AS a
INNER JOIN vw_fees AS e ON e.LICENSEID = a.LICENSEID
WHERE LICENSE = '1000'
AND(e.STATUS='BILLED' OR e.STATUS='PAID')
This will give me a result like this:
LICENSEID LICENSE BUSINESS_NAME YEAR TOTAL_AMOUNT_PAID AMOUNT_OWNED
1CA6918B 1000 CORTANA 2016 0.00 1000.00
EE6DBDD0 1000 CORTANA 2017 1000.00 1000.00
Basically, I want to add another column to calculate the Total Balance which should be the difference between AMOUNT_OWNED and TOTAL_AMOUNT_PAID. I tried adding another line after SUM like this:
(AMOUNT_OWNED - TOTAL_AMOUNT_PAID) AS TOTAL_BALANCED,
However, I get an error that doesn't recognize TOTAL_BALANCED. I also tried adding the entire SUM line again, with no luck.
Can you guys guide me down the correct path, if this is possible? Thank you.
Alias names cannot be referenced in the same SELECT list where they are defined. You need to write the SUM() OVER() window aggregate again to compute the difference.
Try this way
SELECT DISTINCT a.LICENSEID,
a.license,
a.business_name,
a.year, -- use a dot here (a.year), not a comma (a,year)
a.TOTAL_AMOUNT_PAID,
Sum(e.COMPUTED_AMOUNT)OVER (partition BY e.LICENSEID) AS AMOUNT_OWNED,
Sum(e.COMPUTED_AMOUNT)
OVER (
partition BY e.LICENSEID) - a.TOTAL_AMOUNT_PAID AS TOTAL_BALANCED
FROM vw_business AS a
INNER JOIN vw_fees AS e
ON e.LICENSEID = a.LICENSEID
WHERE LICENSE = '1000'
AND e.STATUS IN ( 'BILLED', 'PAID' ) -- use IN clause
Or use a derived table. This is the better option when the expression is long, since the query will be more readable:
SELECT LICENSEID,
license,
business_name,
year,
TOTAL_AMOUNT_PAID,
AMOUNT_OWNED,
AMOUNT_OWNED - TOTAL_AMOUNT_PAID as TOTAL_BALANCED
FROM (SELECT DISTINCT a.LICENSEID,
a.license,
a.business_name,
a.year, -- use a dot here (a.year), not a comma (a,year)
a.TOTAL_AMOUNT_PAID,
Sum(e.COMPUTED_AMOUNT)OVER (partition BY e.LICENSEID) AS AMOUNT_OWNED
FROM vw_business AS a
INNER JOIN vw_fees AS e
ON e.LICENSEID = a.LICENSEID
WHERE LICENSE = '1000'
AND e.STATUS IN ( 'BILLED', 'PAID' ) -- use IN clause
) a
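As a quick way to check the derived-table idea, here is a sketch that runs it on an in-memory SQLite database from Python. The two tables are cut-down stand-ins for vw_business and vw_fees with made-up rows, and the inner query uses a plain GROUP BY in place of DISTINCT plus SUM() OVER(); both are simplifications I've assumed for the example. The balance follows the question's definition, AMOUNT_OWNED minus TOTAL_AMOUNT_PAID.

```python
import sqlite3

def balances(business_rows, fee_rows):
    """business_rows: (LICENSEID, TOTAL_AMOUNT_PAID)
    fee_rows: (LICENSEID, STATUS, COMPUTED_AMOUNT)"""
    con = sqlite3.connect(":memory:")
    # Hypothetical, simplified stand-ins for the two views.
    con.execute("CREATE TABLE vw_business (LICENSEID TEXT, TOTAL_AMOUNT_PAID REAL)")
    con.execute("CREATE TABLE vw_fees (LICENSEID TEXT, STATUS TEXT, COMPUTED_AMOUNT REAL)")
    con.executemany("INSERT INTO vw_business VALUES (?, ?)", business_rows)
    con.executemany("INSERT INTO vw_fees VALUES (?, ?, ?)", fee_rows)
    sql = """
        SELECT LICENSEID, TOTAL_AMOUNT_PAID, AMOUNT_OWNED,
               -- the alias is visible here because it was defined one level down
               AMOUNT_OWNED - TOTAL_AMOUNT_PAID AS TOTAL_BALANCED
        FROM (
            SELECT a.LICENSEID, a.TOTAL_AMOUNT_PAID,
                   SUM(e.COMPUTED_AMOUNT) AS AMOUNT_OWNED
            FROM vw_business AS a
            INNER JOIN vw_fees AS e ON e.LICENSEID = a.LICENSEID
            WHERE e.STATUS IN ('BILLED', 'PAID')
            GROUP BY a.LICENSEID, a.TOTAL_AMOUNT_PAID
        ) AS t
        ORDER BY LICENSEID
    """
    result = con.execute(sql).fetchall()
    con.close()
    return result

# Made-up rows; the 'VOID' fee is filtered out by the STATUS condition.
business = [("1CA6918B", 0.0), ("EE6DBDD0", 1000.0)]
fees = [
    ("1CA6918B", "BILLED", 600.0),
    ("1CA6918B", "BILLED", 400.0),
    ("EE6DBDD0", "PAID", 1000.0),
    ("EE6DBDD0", "VOID", 50.0),
]
print(balances(business, fees))
```

The outer SELECT can use AMOUNT_OWNED by name only because the alias belongs to the derived table, not to the same SELECT list.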
I have a library database and I am trying to assign the most borrowed title to each year, like:
2015 - The Great Gatsby
2014 - Da vinci code
2013 - Harry Potter
....
I've tried this, but I am not sure about it:
select to_char(borrow_date,'YYYY'),title_name
from k_title
join k_book
using(title_id)
join k_rent_books
using(book_id)
group by to_char(borrow_date,'YYYY'),title_name
having count(title_id) = (
select max(cnt) FROM(select count(title_name) as cnt
from k_title
join k_book
using(title_id)
join k_rent_books
using(book_id)
group by title_id,title_name,to_char(borrow_date,'YYYY')));
I've got only 3 results
2016 - Shogun
2006 - The Revolt of Mamie Stover
1996 - The Great Gatsby
I will be happy for any help :)
Oracle has the nice capability to get the first or last value in an aggregation (as opposed to the min() or max()). This requires using something called keep.
So, the way to express what you want to do is:
select yyyy,
max(title_name) keep (dense_rank first order by cnt desc) as title_name
from (select to_char(borrow_date, 'YYYY') as yyyy,
title_name, count(*) as cnt
from k_title t join
k_book b
using (title_id) join
k_rent_books
using (book_id)
group by to_char(borrow_date, 'YYYY'), title_name
) yt
group by yyyy;
Your query is returning the year/title combinations that have the overall maximum count over all years, not the maximum per year.
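To make the "overall maximum versus maximum per year" difference concrete, here is a sketch that runs a per-year version on an in-memory SQLite database from Python. SQLite has no KEEP (DENSE_RANK FIRST ...), so this swaps in a correlated MAX per year instead; unlike the Oracle query above, it returns every title that ties for the top count in a year. The single rents table and its rows are made up and stand in for the three joined k_ tables.

```python
import sqlite3

def most_borrowed_per_year(rents):
    """rents: (title_name, borrow_date as 'YYYY-MM-DD')."""
    con = sqlite3.connect(":memory:")
    # Hypothetical flattened stand-in for k_title/k_book/k_rent_books.
    con.execute("CREATE TABLE rents (title_name TEXT, borrow_date TEXT)")
    con.executemany("INSERT INTO rents VALUES (?, ?)", rents)
    sql = """
        WITH counts AS (
            SELECT strftime('%Y', borrow_date) AS yyyy,
                   title_name,
                   COUNT(*) AS cnt
            FROM rents
            GROUP BY strftime('%Y', borrow_date), title_name
        )
        SELECT c.yyyy, c.title_name, c.cnt
        FROM counts AS c
        -- compare against the maximum of the SAME year, not of all years
        WHERE c.cnt = (SELECT MAX(c2.cnt) FROM counts AS c2 WHERE c2.yyyy = c.yyyy)
        ORDER BY c.yyyy DESC, c.title_name
    """
    result = con.execute(sql).fetchall()
    con.close()
    return result

# Made-up borrowings.
rents = [
    ("The Great Gatsby", "2015-03-01"),
    ("The Great Gatsby", "2015-06-11"),
    ("Shogun", "2015-02-02"),
    ("Shogun", "2014-05-05"),
    ("Harry Potter", "2014-01-20"),
    ("Harry Potter", "2014-09-09"),
]
print(most_borrowed_per_year(rents))
```

Dropping the `WHERE c2.yyyy = c.yyyy` condition reproduces the behaviour of the original attempt, which only keeps year/title pairs that match the single highest count across all years.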
SELECT DISTINCT
PART ,
MakeName ,
ModelName ,
YearID
FROM COMPATMATRIX..PART_TABLE
LEFT JOIN MATRIX_ACES(NOLOCK) ON PART_TABLE.MFGID = MATRIX_ACES.MFGID
AND PART_TABLE.MFG_PART = MATRIX_ACES.MFG_PART
LEFT JOIN ACES..BASEVEHICLE(NOLOCK) ON MATRIX_ACES.BaseVehicleID = BaseVehicle.BaseVehicleID
LEFT JOIN ACES..Make(NOLOCK) ON BaseVehicle.MakeID = Make.MakeID
LEFT JOIN ACES..Model(NOLOCK) ON BaseVehicle.ModelID = Model.ModelID
WHERE PART_TABLE.MFGID IN ( 'ACC', 'DRT' )
AND MakeName IS NOT NULL
ORDER BY PART ,
MakeName ,
ModelName ,
YearID
I'm trying to concatenate all the years into a single row. There may be multiple Ford F-150s where the only thing that differs is the year, and I would like all the years to be in one row instead of each year being a new row.
I have tried using GROUP BY but then I have to use an aggregate and that only selects one year. I'm a little stumped. I'm using SQL Server 2008.
sample of what currently happens
ACC1234 Ford F-150 2001
ACC1234 Ford F-150 2002
ACC1234 Dodge Ram 2000
What I would like
ACC1234 Ford F-150 2001, 2002
ACC1234 Dodge Ram 2000
Look into STUFF and FOR XML PATH.
http://sqlandme.com/2011/04/27/tsql-concatenate-rows-using-for-xml-path/
Taking your code as an example:
SELECT DISTINCT
PART ,
MakeName ,
ModelName ,
STUFF( ( SELECT ',' + CONVERT( VARCHAR(10), YearID ) FROM ... WHERE ... FOR XML PATH( '' ) ), 1, 1, '' ) AS YearIDs
:
:
ORDER BY PART ,
MakeName ,
ModelName
Because I don't know what table YearID comes from and don't know the primary keys, I couldn't build the FROM or WHERE clause for you, but I think this will get you on the right path.
Good luck!
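If it helps to pin down the target shape before writing the FOR XML PATH subquery, here is a tiny Python sketch of the same grouping done outside the database. It is not the STUFF/FOR XML PATH technique itself, just the equivalent logic: collect the distinct years per (PART, MakeName, ModelName) and join them with ", ". The function name is made up for the example and the rows are the sample from the question.

```python
from collections import defaultdict

def concat_years(rows):
    """rows: (part, make, model, year). Returns one row per part/make/model
    with the distinct years joined into a single comma-separated string."""
    grouped = defaultdict(set)
    for part, make, model, year in rows:
        grouped[(part, make, model)].add(year)
    return [
        (part, make, model, ", ".join(str(y) for y in sorted(years)))
        for (part, make, model), years in sorted(grouped.items())
    ]

rows = [
    ("ACC1234", "Ford", "F-150", 2001),
    ("ACC1234", "Ford", "F-150", 2002),
    ("ACC1234", "Dodge", "Ram", 2000),
]
for row in concat_years(rows):
    print(row)
```

In the SQL version, the correlated subquery inside STUFF(...) plays the role of the inner join over `sorted(years)`, and the outer SELECT DISTINCT plays the role of the grouping key.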
When I had to do something similar, I ended up using dynamic SQL, but this was with Oracle (I guess it's possible to do the same in SQL Server based on this answer):
SELECT Truckname,
SUM(Year2013) Year2013,
SUM(Year2014) Year2014,
SUM(Year2015) Year2015
FROM
(
SELECT Truckname, NULL Year2013, NULL Year2014, COUNT(Field) Year2015
FROM Trucks
WHERE Year = '2015'
GROUP BY Truckname -- needed because Truckname is selected next to COUNT()
UNION ALL
SELECT Truckname, NULL Year2013, COUNT(Field) Year2014, NULL Year2015
FROM Trucks
WHERE Year = '2014'
GROUP BY Truckname
UNION ALL
SELECT Truckname, COUNT(Field) Year2013, NULL Year2014, NULL Year2015
FROM Trucks
WHERE Year = '2013'
GROUP BY Truckname
)
GROUP BY Truckname
All the "union all selects" of the years are generated dynamically depending how many years you want.
It's probably not the most elegant or optimized solution, but you get one row with the information for all the years. If you want text instead of a counter, you can use MAX instead of COUNT to get the value.
I hope I understood correctly what you are trying to achieve here.
Imagine I have a table showing the sales of Acme Widgets, and where they were sold. It's fairly easy to produce a report grouping sales by country. It's fairly easy to find the top 10. But what I'd like is to show the top 10, and then have a final row saying Other. E.g.,
Ctry | Sales
=============
GB | 100
US | 80
ES | 60
...
IT | 10
Other | 50
I've been searching for ages but can't seem to find any help which takes me beyond the standard top 10.
TIA
I tried some of the other solutions here, however they seem to be either slightly off, or the ordering wasn't quite right.
My attempt at a Microsoft SQL Server solution appears to work correctly:
SELECT Ctry, Sales FROM
(
SELECT TOP 2
Ctry,
SUM(Sales) AS Sales
FROM
Table1
GROUP BY
Ctry
ORDER BY
Sales DESC
) AS Q1
UNION ALL
SELECT
'Other' AS Ctry,
SUM(Sales) AS Sales
FROM
Table1
WHERE
Ctry NOT IN (SELECT TOP 2
Ctry
FROM
Table1
GROUP BY
Ctry
ORDER BY
SUM(Sales) DESC)
Note that in my example, I'm only using TOP 2 rather than TOP 10. This is simply due to my test data being rather more limited. You can easily substitute the 2 for a 10 in your own data.
Here's the SQL Script to create the table:
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_PADDING ON
GO
CREATE TABLE [dbo].[Table1](
[Ctry] [varchar](50) NOT NULL,
[Sales] [float] NOT NULL
) ON [PRIMARY]
GO
SET ANSI_PADDING OFF
And my data looks like this:
GB 10
GB 21.2
GB 34
GB 16.75
US 10
US 11
US 56.43
FR 18.54
FR 98.58
WE 44.33
WE 11.54
WE 89.21
KR 10
PO 10
DE 10
Note that the query result is correctly ordered by the Sales aggregate and not by the alphabetic country code, and that the "Other" category is always last, even if its Sales aggregate would ordinarily push it to the top of the list.
I'm not saying this is the best (read: most optimal) solution, however, for the dataset that I provided it seems to work pretty well.
SELECT Ctry, sum(Sales) Sales
FROM (SELECT COALESCE(T2.Ctry, 'OTHER') Ctry, T1.Sales
FROM (SELECT Ctry, sum(Sales) Sales
FROM Table1
GROUP BY Ctry) T1
LEFT JOIN
(SELECT TOP 10 Ctry, sum(sales) Sales
FROM Table1
GROUP BY Ctry
ORDER BY sum(sales) DESC) T2
on T1.Ctry = T2.Ctry
) T
GROUP BY Ctry
Most of the pure SQL solutions to this problem make multiple passes over the individual records. The following solution queries the data only once and uses a SQL ranking function, ROW_NUMBER(), to determine whether a result belongs in the "Other" category. The ROW_NUMBER() function has been available in SQL Server since SQL Server 2005. In my database, this seems to have resulted in a more efficient query. Please note that the "Other" row will appear above some rows if the total of the "Other" sales exceeds the sales of some of the top 10. If this is not desired, some adjustments would need to be made to this query:
SELECT CASE WHEN RowNumber > 10 THEN 'Other' ELSE Ctry END AS Ctry,
SUM(Sales) as Sales FROM
(
SELECT Ctry, SUM(Sales) as Sales,
ROW_NUMBER() OVER(ORDER BY SUM(Sales) DESC) AS RowNumber
FROM Table1 GROUP BY Ctry
) as AggregateQuery
GROUP BY CASE WHEN RowNumber > 10 THEN 'Other' ELSE Ctry END
ORDER BY SUM(Sales) DESC
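One possible adjustment for keeping "Other" last is to sort on whether a row is the bucket before sorting on the total. Here is a sketch of that, run on an in-memory SQLite database from Python with the sample data posted in the earlier answer and a top 2 instead of a top 10. The function and alias names are made up for the example, the tie-break on Ctry is my own addition, and SQLite 3.25 or later is needed for ROW_NUMBER().

```python
import sqlite3

# Sample rows taken from the earlier answer's test data.
SALES = [
    ("GB", 10), ("GB", 21.2), ("GB", 34), ("GB", 16.75),
    ("US", 10), ("US", 11), ("US", 56.43),
    ("FR", 18.54), ("FR", 98.58),
    ("WE", 44.33), ("WE", 11.54), ("WE", 89.21),
    ("KR", 10), ("PO", 10), ("DE", 10),
]

def top_n_with_other(rows, n):
    """Returns [(label, total rounded to 2 decimals)], top n first, 'Other' last."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE Table1 (Ctry TEXT NOT NULL, Sales REAL NOT NULL)")
    con.executemany("INSERT INTO Table1 VALUES (?, ?)", rows)
    sql = """
        WITH totals AS (
            SELECT Ctry, SUM(Sales) AS Sales
            FROM Table1
            GROUP BY Ctry
        ),
        ranked AS (
            SELECT Ctry, Sales,
                   ROW_NUMBER() OVER (ORDER BY Sales DESC, Ctry) AS rn
            FROM totals
        )
        SELECT CASE WHEN rn > :n THEN 'Other' ELSE Ctry END AS label,
               SUM(Sales) AS total_sales
        FROM ranked
        GROUP BY CASE WHEN rn > :n THEN 'Other' ELSE Ctry END
        -- first key: 0 for ranked countries, 1 for the 'Other' bucket
        ORDER BY CASE WHEN MIN(rn) > :n THEN 1 ELSE 0 END, total_sales DESC
    """
    result = [(label, round(total, 2)) for label, total in con.execute(sql, {"n": n})]
    con.close()
    return result

print(top_n_with_other(SALES, 2))
```

With this data the "Other" total is larger than either of the top two, so sorting by total alone would put it first; the extra sort key keeps it at the bottom.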
Using an analytics SQL engine such as Apache Spark, you can use a common table expression (WITH) to do this:
with t as (
select rank() over (order by sales desc) as r, sales,city
from DB
order by sales desc
)
select sales, city, r
from t where r <= 10
union
select sum(sales) as sales, "Other" as city, 11 as r
from t where r > 10
In pseudo SQL:
select top 10 order by sales
UNION
select 'Other',SUM(sales) where Ctry not in (select top 10 like above)
Union the top ten with an outer join of the top ten against the table itself to aggregate the rest.
I don't have access to SQL here, but I'll hazard a guess:
select Ctry, sales
from (select top (10) Ctry, sales from table1 order by sales desc) as top10
union all
select 'other', sum(table1.sales)
from table1
left outer join (select top (10) Ctry, sales from table1 order by sales desc) as table2
on table1.Ctry = table2.Ctry
where table2.Ctry is null
Of course if this is a rapidly changing top(10) then you either lock or maintain a copy of the top(10) for the duration of the query.
Keep in mind that, depending on your use case (and database volume or restrictions), you can achieve the same results in application code (Python, Node, C#, Java, etc.). It will depend on your use case, but it's possible.
I ended up doing this in C# for instance:
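// Needs: using System.Collections.Generic; using System.Linq;
// Mockup class that has a category and its volume
class YourModel { public string category; public double volume; }
// wholeList is assumed to be a List<YourModel> already sorted by volume, descending
List<YourModel> groupedList = wholeList.Take (5).ToList ();
groupedList.Add (new YourModel()
{
category = "Others",
volume = wholeList.Skip (5).Select (t => t.volume).Sum ()
});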
Disclaimer
I understand that this is a "SQL only" tagged question, but there might be other people like me out there who can make use of the application layer instead of relying only on SQL to make it happen. I am just trying to show people other ways of doing the same thing that might be helpful. Even if this gets downvoted to oblivion, I know that someone will be happy to read this because they were taught to use each tool to its best and to think "outside the box".