concatenate years

SELECT DISTINCT
PART ,
MakeName ,
ModelName ,
YearID
FROM COMPATMATRIX..PART_TABLE
LEFT JOIN MATRIX_ACES(NOLOCK) ON PART_TABLE.MFGID = MATRIX_ACES.MFGID
AND PART_TABLE.MFG_PART = MATRIX_ACES.MFG_PART
LEFT JOIN ACES..BASEVEHICLE(NOLOCK) ON MATRIX_ACES.BaseVehicleID = BaseVehicle.BaseVehicleID
LEFT JOIN ACES..Make(NOLOCK) ON BaseVehicle.MakeID = Make.MakeID
LEFT JOIN ACES..Model(NOLOCK) ON BaseVehicle.ModelID = Model.ModelID
WHERE PART_TABLE.MFGID IN ( 'ACC', 'DRT' )
AND MakeName IS NOT NULL
ORDER BY PART ,
MakeName ,
ModelName ,
YearID
I'm trying to concatenate all the years into a single row. There may be multiple rows for the Ford F-150 where the only thing that differs is the year, and I would like all of those years in one row instead of each year getting its own row.
I have tried using GROUP BY, but then I have to use an aggregate, and that only selects one year. I'm a little stumped. I'm using SQL Server 2008.
Sample of what currently happens:
ACC1234 Ford F-150 2001
ACC1234 Ford F-150 2002
ACC1234 Dodge Ram 2000
What I would like:
ACC1234 Ford F-150 2001, 2002
ACC1234 Dodge Ram 2000

Look into STUFF and FOR XML PATH.
http://sqlandme.com/2011/04/27/tsql-concatenate-rows-using-for-xml-path/
Taking your code as an example:
SELECT DISTINCT
PART ,
MakeName ,
ModelName ,
STUFF( ( SELECT ',' + CONVERT( VARCHAR(10), YearID ) FROM ... WHERE ... FOR XML PATH( '' ) ), 1, 1, '' ) AS YearIDs
:
:
ORDER BY PART ,
MakeName ,
ModelName
Because I don't know what table YearID comes from and don't know the primary keys, I couldn't build the FROM or WHERE clause for you, but I think this will get you on the right path.
Good luck!
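For readers who want to see the target shape on sample data, here is a minimal sketch. It is not the SQL Server syntax above: it assumes Python's built-in sqlite3 module and uses SQLite's GROUP_CONCAT aggregate as a stand-in for the STUFF / FOR XML PATH idiom. The fitment table and its columns are hypothetical.

```python
import sqlite3

# Hypothetical sample rows: (part, make, model, year), as in the question.
rows = [
    ("ACC1234", "Ford", "F-150", 2001),
    ("ACC1234", "Ford", "F-150", 2002),
    ("ACC1234", "Dodge", "Ram", 2000),
]

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE fitment (part TEXT, make TEXT, model TEXT, year INTEGER)")
con.executemany("INSERT INTO fitment VALUES (?, ?, ?, ?)", rows)

# One output row per part/make/model; the years are joined into one string.
# The inner DISTINCT removes duplicate years before they are concatenated.
years_by_vehicle = con.execute(
    """
    SELECT part, make, model, GROUP_CONCAT(year, ', ') AS years
    FROM (SELECT DISTINCT part, make, model, year FROM fitment)
    GROUP BY part, make, model
    ORDER BY part, make, model
    """
).fetchall()
con.close()

for row in years_by_vehicle:
    print(row)
```

SQLite does not promise any particular order for the values inside the concatenated string, so sort them afterwards if the order matters.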

When I had to do something similar, I ended up using dynamic SQL, but that was with Oracle (I guess it's possible to do the same in SQL Server based on this answer):
SELECT Truckname,
SUM(Year2013) Year2013,
SUM(Year2014) Year2014,
SUM(Year2015) Year2015
FROM
(
SELECT Truckname, NULL Year2013, NULL Year2014, COUNT(Field) Year2015
FROM Trucks
WHERE Year = '2015'
GROUP BY Truckname
UNION ALL
SELECT Truckname, NULL Year2013, COUNT(Field) Year2014, NULL Year2015
FROM Trucks
WHERE Year = '2014'
GROUP BY Truckname
UNION ALL
SELECT Truckname, COUNT(Field) Year2013, NULL Year2014, NULL Year2015
FROM Trucks
WHERE Year = '2013'
GROUP BY Truckname
)
GROUP BY Truckname
All the "union all selects" of the years are generated dynamically depending how many years you want.
It's probably not the most elegant or optimized solution, but you get one row with the information for all the years. If you want text instead of a count, you can use MAX instead of COUNT to get the value.
I hope I understood correctly what you are trying to achieve here.
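To illustrate the "generated dynamically" part, here is a rough sketch of how the UNION ALL branches could be built for an arbitrary list of years. It assumes Python's built-in sqlite3 module rather than Oracle, and the build_pivot_sql helper and the sample Trucks rows are hypothetical.

```python
import sqlite3

def build_pivot_sql(years):
    """Build the UNION ALL pivot for the given years (hypothetical Trucks table)."""
    years = [int(y) for y in years]  # ints only, so the generated SQL is safe
    outer_cols = ", ".join(f"SUM(Year{y}) AS Year{y}" for y in years)
    branches = []
    for y in years:
        # COUNT(Field) in this year's column, NULL in all the others.
        cols = ", ".join(
            f"COUNT(Field) AS Year{c}" if c == y else f"NULL AS Year{c}"
            for c in years
        )
        branches.append(
            f"SELECT Truckname, {cols} FROM Trucks "
            f"WHERE Year = '{y}' GROUP BY Truckname"
        )
    inner = " UNION ALL ".join(branches)
    return (
        f"SELECT Truckname, {outer_cols} FROM ({inner}) "
        "GROUP BY Truckname ORDER BY Truckname"
    )

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Trucks (Truckname TEXT, Year TEXT, Field TEXT)")
con.executemany(
    "INSERT INTO Trucks VALUES (?, ?, ?)",
    [("F-150", "2014", "a"), ("F-150", "2015", "b"),
     ("F-150", "2015", "c"), ("Ram", "2013", "d")],
)
pivot = con.execute(build_pivot_sql([2013, 2014, 2015])).fetchall()
con.close()
```

A year with no rows for a given truck comes back as NULL (None in Python) rather than 0, because SUM over only NULLs is NULL.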

Related

Improve the performance of a SQL Query via SQL Server

I'm looking to improve the performance of the Last Year Attended (LYA) query. Right now, it's taking 20+ minutes to run this block.
The LYA query takes the most recent year a company attended a particular event and finds the year they attended prior to that maximum. For example, if they attended an event in 2018, the query will look for the last year attended prior to 2018.
LYA for 2018 should return a Null
The data should return the following:
CompanyID MarketID Industry LAST YEAR ATTENDED
-------------------------------------------------------
123456 1234 GIFT 2018
123457 1234 HOME 2017
123458 1234 GIFT 2018
123459 1234 HOME 2018
123460 1234 APPAREL 2018
123461 1234 HOME 2018
123462 1234 HOME 2017
123463 1234 APPAREL 2018
Can anyone assist?
SELECT DISTINCT
COMPANYID, MARKETID, INDUSTRY,
[LAST YEAR ATTENDED] = (SELECT MAX(YEAR(attdate))
FROM v_marketatt va
WHERE va.companyid = vm.companyid
AND YEAR(attdate) <> (SELECT MAX(YEAR(attdate))
FROM v_marketatt vb
WHERE vb.companyid = vm.companyid)
AND MARKETCODE LIKE 'SM1%')
FROM
v_marketatt vm
WHERE
MARKETID IN (835, 1032, 1101)
UPDATE:
Found that this version is more efficient than the rest. Run time is down to 7 minutes on a clone. Instead of allowing the subquery to dip into my view twice, I had it dip once.
select
DISTINCT COMPANYID,
MARKETID,
INDUSTRY,
CSTATUS,
[LAST YEAR ATTENDED] = (select max(year(attdate)) from v_marketatt va where year(attdate) <> (select max(year(attdate)) from v_marketatt) AND MARKETCODE LIKE 'SM1%' AND va.COMPANYID = vm.COMPANYID)
from v_marketatt vm
WHERE MARKETID IN (835,1032,1101)
;
Thanks to all who responded.

The [LAST YEAR ATTENDED] field has a subquery that computes the max year on each iteration. You can try moving this piece of the query to a join, something like:
select DISTINCT vm.COMPANYID, vm.MARKETID, vm.INDUSTRY,
va.[LAST YEAR ATTENDED]
from v_marketatt vm
inner join
( select ivm.companyid, max(year(ivm.attdate)) as [LAST YEAR ATTENDED]
from v_marketatt ivm
where year(ivm.attdate) <> (select max(year(vb.attdate))
from v_marketatt vb
where vb.companyid = ivm.companyid)
AND ivm.MARKETCODE LIKE 'SM1%'
group by ivm.companyid) va on va.companyid = vm.companyid
--where companyid not in (select distinct companyid from v_marketatt where marketid in (602))
WHERE vm.MARKETID IN (835,1032,1101)
I have not run this query, so there could be some minor syntax corrections needed, but if you get the concept it should be easy to pick up and fix.

Apologies for the syntax, I'm throwing this together quickly. But I suspect making use of a CTE should improve performance dramatically. I'm also not quite sure what you're doing here:
WHERE va.companyid = vm.companyid
AND YEAR(attdate) <> (SELECT MAX(YEAR(attdate))
FROM v_marketatt vb)
AND MARKETCODE LIKE 'SM1%'
So I've left that piece alone. Try something like this, which should help; clarification on the part I've noted above might unlock other things to tweak.
;with Year_CTE (companyid, [year])
as
(SELECT va.companyid, MAX(YEAR(attdate))
FROM v_marketatt va
WHERE YEAR(attdate) <> (SELECT MAX(YEAR(attdate))
FROM v_marketatt vb)
AND MARKETCODE LIKE 'SM1%'
GROUP BY va.companyid)
SELECT DISTINCT
vm.COMPANYID, vm.MARKETID, vm.INDUSTRY,
vb.[year]
FROM
v_marketatt vm
join Year_CTE vb on vb.companyid = vm.companyid
WHERE
vm.MARKETID IN (835, 1032, 1101)

If you need 'the one before this one', I'd suggest using the LEAD() or LAG() functions (available from SQL Server 2012).
Although I'm not quite sure I fully understand your example (see Thorsten Kettner's comments), going by the explanation I think what you want is something along the lines of:
;WITH years
AS (
SELECT COMPANYID, MARKETID, INDUSTRY, YEAR_ATTENDED = Year(attdate)
FROM v_marketatt
WHERE MARKETID IN (835, 1032, 1101)
AND MARKETCODE LIKE 'SM1%' -- not sure about this one, the example isn't very clear
GROUP BY COMPANYID, MARKETID, INDUSTRY, Year(attdate)
),
last_ones
AS (
SELECT row_nbr = ROW_NUMBER() OVER ( PARTITION BY COMPANYID, MARKETID, INDUSTRY ORDER BY YEAR_ATTENDED DESC),
COMPANYID, MARKETID, INDUSTRY,
LAST_YEAR_ATTENDED = YEAR_ATTENDED,
PREV_YEAR_ATTENDED = LEAD(YEAR_ATTENDED, 1, NULL) OVER (PARTITION BY COMPANYID, MARKETID, INDUSTRY ORDER BY YEAR_ATTENDED DESC)
FROM years
)
SELECT COMPANYID, MARKETID, INDUSTRY,
LAST_YEAR_ATTENDED,
PREV_YEAR_ATTENDED
FROM last_ones
WHERE row_nbr = 1
Since I don't have the tables nor the data here, I haven't tested the query, but I hope it will get you going...
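As a small runnable illustration of the same LEAD() / ROW_NUMBER() idea, here is a sketch on made-up data. It assumes Python's built-in sqlite3 module with SQLite 3.25 or later (needed for window functions), not SQL Server, and the attendance table is a hypothetical, simplified stand-in for v_marketatt.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE attendance (companyid INTEGER, attdate TEXT)")
con.executemany(
    "INSERT INTO attendance VALUES (?, ?)",
    [(1, "2016-05-01"), (1, "2018-05-01"), (1, "2017-05-01"), (2, "2018-05-01")],
)

query = """
WITH years AS (
    -- one row per company and year attended
    SELECT DISTINCT companyid,
           CAST(strftime('%Y', attdate) AS INTEGER) AS yr
    FROM attendance
),
ranked AS (
    SELECT companyid,
           yr AS last_year,
           -- with years sorted newest first, the "next" row is the prior year
           LEAD(yr) OVER (PARTITION BY companyid ORDER BY yr DESC) AS prev_year,
           ROW_NUMBER() OVER (PARTITION BY companyid ORDER BY yr DESC) AS row_nbr
    FROM years
)
SELECT companyid, last_year, prev_year
FROM ranked
WHERE row_nbr = 1
ORDER BY companyid
"""
prev_years = con.execute(query).fetchall()
con.close()
```

A company that attended in only one year gets NULL as its previous year, which matches the "LYA for 2018 should return a Null" requirement in the question.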

select different Max ID's for different customer

Situation:
We have monthly files that get loaded into our data warehouse. However, instead of replacing the old loads, the new ones are just piled on top of them. The files are loaded over a period of days.
So when running a SQL script we would get duplicate records. To counteract this, we run a union over 10-20 'customers', selecting MAX(LoadID) for each, e.g.
SELECT
Customer,
column2,
column3
FROM
MyTable
WHERE
LOADID = (SELECT MAX (LOADID) FROM MyTable WHERE Customer= 'ASDA')
UNION
SELECT
Customer,
column2,
column3
FROM
MyTable
WHERE
LOADID = (SELECT MAX (LOADID) FROM MyTable WHERE Customer= 'TESCO')
The above union would have to be done for multiple customers, so I was thinking there surely has to be a more efficient way.
We can't use a MAX(LoadID) in the SELECT statement, as a possible scenario could entail the following:
Monday: Asda, Tesco, Waitrose loaded into DW (with LoadID as 124)
Tuesday: Sainsburys loaded into DW (with LoadID as 125)
Wednesday: New Tesco loaded into DW (with LoadID as 126)
So I would want LoadID 124 for Asda & Waitrose, 125 for Sainsburys, and 126 for Tesco.

Use window functions:
SELECT t.*
FROM (SELECT t.*, MAX(LOADID) OVER (PARTITION BY Customer) as maxLOADID
FROM MyTable t
) t
WHERE LOADID = maxLOADID;

Would a subquery to a derived table meet your needs?
select yourfields
from yourtables realTable join
(select customer, max(loadID) maxLoadId
from yourtables
group by customer) derivedTable on derivedTable.customer = realTable.customer
and realTable.loadId = derivedTable.maxLoadId
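Here is a minimal sketch of the derived-table join run against the Monday/Tuesday/Wednesday scenario from the question. It assumes Python's built-in sqlite3 module; the Amount column and its values are made up for illustration.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE MyTable (Customer TEXT, LoadID INTEGER, Amount INTEGER)")
con.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?)",
    [
        ("Asda", 124, 10), ("Tesco", 124, 20), ("Waitrose", 124, 30),  # Monday
        ("Sainsburys", 125, 40),                                       # Tuesday
        ("Tesco", 126, 25),                                            # Wednesday
    ],
)

# The derived table finds each customer's newest load; the join then keeps
# only the rows that belong to that load.
latest = con.execute(
    """
    SELECT t.Customer, t.LoadID, t.Amount
    FROM MyTable AS t
    JOIN (SELECT Customer, MAX(LoadID) AS maxLoadID
          FROM MyTable
          GROUP BY Customer) AS m
      ON m.Customer = t.Customer AND m.maxLoadID = t.LoadID
    ORDER BY t.Customer
    """
).fetchall()
con.close()
```

The old Tesco row from load 124 is dropped, while Asda and Waitrose keep their load 124 rows, with no per-customer UNION needed.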

creating a calculated column using SQL COUNT

Let's say I work for a call center and I closed 10 calls but opened 20 calls in the day. The "real" figure is actually -10. Even though the target is 10 calls to close, the worker failed because 20 calls were opened.
I would like to write an SQL report to reflect this. But my problem seems to be I cannot calculate figures from aggregate counts.
SELECT workername AS Name,
(SELECT Count(closeddate)
FROM mybanksupport
WHERE closeddate = NULL) OPENCALLS,
(SELECT Count(closeddate)
FROM mybanksupport
WHERE closeddate = NOT NULL) CLOSEDCALLS,
(SELECT opencalls - closedcalls) REALCALLS
FROM mybanksupport
In short, I want to calculate 2 column count values and then use that value to produce another calculated column called Real Calls

COUNT only counts values, i.e., it ignores NULLs. This property can be used to simplify your expression:
SELECT workername, closedCalls, totalCalls - closedCalls AS openCalls
FROM (SELECT workername, COUNT(closeddate) AS closedCalls, COUNT(*) totalCalls
FROM mybanksupport
GROUP BY workername) t
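The NULL-skipping behaviour of COUNT can be checked with a small sketch. It assumes Python's built-in sqlite3 module and made-up rows, and it defines realCalls as closed minus open, following the "-10" example in the question.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE mybanksupport (workername TEXT, closeddate TEXT)")
con.executemany(
    "INSERT INTO mybanksupport VALUES (?, ?)",
    [
        ("Ann", "2020-01-02"), ("Ann", "2020-01-03"), ("Ann", None),  # 2 closed, 1 open
        ("Bob", None), ("Bob", None),                                 # 0 closed, 2 open
    ],
)

# COUNT(closeddate) skips NULLs, so it counts closed calls only;
# COUNT(*) counts every row, so the difference is the number of open calls.
calls = con.execute(
    """
    SELECT workername,
           closedCalls,
           totalCalls - closedCalls                 AS openCalls,
           closedCalls - (totalCalls - closedCalls) AS realCalls
    FROM (SELECT workername,
                 COUNT(closeddate) AS closedCalls,
                 COUNT(*)          AS totalCalls
          FROM mybanksupport
          GROUP BY workername)
    ORDER BY workername
    """
).fetchall()
con.close()
```

Because the counts are computed in the inner query, the outer query can freely combine them into further calculated columns.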

Write it as a subquery, then you can use the fields from that as you want:
Select
Name
, OPENCALLS
, CLOSEDCALLS
, (OPENCALLS - CLOSEDCALLS) REALCALLS
From
(
SELECT workername AS Name,
(SELECT Count(*)
FROM mybanksupport
WHERE closeddate IS NULL) OPENCALLS,
(SELECT Count(*)
FROM mybanksupport
WHERE closeddate IS NOT NULL) CLOSEDCALLS
FROM mybanksupport
) T1

SQL query ...multiple max value selection. Help needed

Business World 1256987 monthly 10 2009-10-28
Business World 1256987 monthly 10 2009-09-23
Business World 1256987 monthly 10 2009-08-18
Linux 4 U 456734 monthly 25 2009-12-24
Linux 4 U 456734 monthly 25 2009-11-11
Linux 4 U 456734 monthly 25 2009-10-28
I get this result with the query:
SELECT DISTINCT ljm.journelname, ljm.subscription_id,
ljm.frequency, ljm.publisher, ljm.price, ljd.receipt_date
FROM lib_journals_master ljm,
lib_subscriptionhistory lsh,
lib_journal_details ljd
WHERE ljd.journal_id=ljm.id
ORDER BY ljm.publisher
What I need is the latest date in each journal?
I tried this query:
SELECT DISTINCT ljm.journelname, ljm.subscription_id,
ljm.frequency, ljm.publisher, ljm.price,ljd.receipt_date
FROM lib_journals_master ljm,
lib_subscriptionhistory lsh,
lib_journal_details ljd
WHERE ljd.journal_id=ljm.id
AND ljd.receipt_date = (
SELECT max(ljd.receipt_date)
from lib_journal_details ljd)
But it gives me the maximum from the entire column. My needed result will have two dates (maximum of each magazine), but this query gives me only one?

You could change the WHERE clause to look up the last date for each journal:
AND ljd.receipt_date = (
SELECT max(subljd.receipt_date)
from lib_journal_details subljd
where subljd.journal_id = ljd.journal_id)
Make sure to give the table in the subquery a different alias from the table in the main query.

You should use Group By if you need the Max from date.
Should look something like this:
SELECT
ljm.journelname
, ljm.subscription_id
, ljm.frequency
, ljm.publisher
, ljm.price
, MAX(ljd.receipt_date)
FROM
lib_journals_master ljm
, lib_subscriptionhistory lsh
, lib_journal_details ljd
WHERE
ljd.journal_id=ljm.id
GROUP BY
ljm.journelname
, ljm.subscription_id
, ljm.frequency
, ljm.publisher
, ljm.price

Something like this should work for you.
SELECT ljm.journelname
, ljm.subscription_id
, ljm.frequency
, ljm.publisher
, ljm.price
,md.max_receipt_date
FROM lib_journals_master ljm
, ( SELECT journal_id
, max(receipt_date) as max_receipt_date
FROM lib_journal_details
GROUP BY journal_id) md
WHERE ljm.id = md.journal_id
/
Note that I have removed the tables from the FROM clause which don't contribute anything to the query. You may need to put them back if you simplified your scenario for our benefit.

Separate this into two queries. The first one gets the journal name and the latest date:
create table #table (journalName varchar(255), saleDate datetime)
insert into #table
select journalName,max(saleDate) from JournalTable group by journalName
Then select all the fields you need from your table and join #table to it on journalName.

Sounds like top of group. You can use a CTE in SQL Server:
;WITH journeldata AS
(
SELECT
ljm.journelname
,ljm.subscription_id
,ljm.frequency
,ljm.publisher
,ljm.price
,ljd.receipt_date
,ROW_NUMBER() OVER (PARTITION BY ljm.journelname ORDER BY ljd.receipt_date DESC) AS RowNumber
FROM
lib_journals_master ljm
,lib_subscriptionhistory lsh
,lib_journal_details ljd
WHERE
ljd.journal_id=ljm.id
AND ljm.subscription_id = ljm.subscription_id
)
SELECT
journelname
,subscription_id
,frequency
,publisher
,price
,receipt_date
FROM journeldata
WHERE RowNumber = 1

SQL to produce Top 10 and Other

Imagine I have a table showing the sales of Acme Widgets, and where they were sold. It's fairly easy to produce a report grouping sales by country. It's fairly easy to find the top 10. But what I'd like is to show the top 10, and then have a final row saying Other. E.g.,
Ctry | Sales
=============
GB | 100
US | 80
ES | 60
...
IT | 10
Other | 50
I've been searching for ages but can't seem to find any help which takes me beyond the standard top 10.
TIA

I tried some of the other solutions here, however they seem to be either slightly off, or the ordering wasn't quite right.
My attempt at a Microsoft SQL Server solution appears to work correctly:
SELECT Ctry, Sales FROM
(
SELECT TOP 2
Ctry,
SUM(Sales) AS Sales
FROM
Table1
GROUP BY
Ctry
ORDER BY
Sales DESC
) AS Q1
UNION ALL
SELECT
'Other' AS Ctry,
SUM(Sales) AS Sales
FROM
Table1
WHERE
Ctry NOT IN (SELECT TOP 2
Ctry
FROM
Table1
GROUP BY
Ctry
ORDER BY
SUM(Sales) DESC)
Note that in my example, I'm only using TOP 2 rather than TOP 10. This is simply due to my test data being rather more limited. You can easily substitute the 2 for a 10 in your own data.
Here's the SQL Script to create the table:
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_PADDING ON
GO
CREATE TABLE [dbo].[Table1](
[Ctry] [varchar](50) NOT NULL,
[Sales] [float] NOT NULL
) ON [PRIMARY]
GO
SET ANSI_PADDING OFF
And my data looks like this:
GB 10
GB 21.2
GB 34
GB 16.75
US 10
US 11
US 56.43
FR 18.54
FR 98.58
WE 44.33
WE 11.54
WE 89.21
KR 10
PO 10
DE 10
Note that the query result is correctly ordered by the Sales value aggregate and not the alphabetic country code, and that the "Other" category is always last, even if its Sales value aggregate would ordinarily push it to the top of the list.
I'm not saying this is the best (read: most optimal) solution, however, for the dataset that I provided it seems to work pretty well.

SELECT Ctry, sum(Sales) Sales
FROM (SELECT COALESCE(T2.Ctry, 'OTHER') Ctry, T1.Sales
FROM (SELECT Ctry, sum(Sales) Sales
FROM Table1
GROUP BY Ctry) T1
LEFT JOIN
(SELECT TOP 10 Ctry, sum(sales) Sales
FROM Table1
GROUP BY Ctry
ORDER BY sum(sales) DESC) T2
on T1.Ctry = T2.Ctry
) T
GROUP BY Ctry

The pure SQL solutions to this problem make more than one pass through the individual records. The following solution queries the data only once and uses a SQL ranking function, ROW_NUMBER(), to determine which results belong in the "Other" category. The ROW_NUMBER() function has been available in SQL Server since SQL Server 2005. In my database, this seems to have resulted in a more efficient query. Please note that the "Other" row will appear above some of the top 10 rows if the "Other" total exceeds their sales. If this is not desired, some adjustments would need to be made to this query:
SELECT CASE WHEN RowNumber > 10 THEN 'Other' ELSE Ctry END AS Ctry,
SUM(Sales) as Sales FROM
(
SELECT Ctry, SUM(Sales) as Sales,
ROW_NUMBER() OVER(ORDER BY SUM(Sales) DESC) AS RowNumber
FROM Table1 GROUP BY Ctry
) as AggregateQuery
GROUP BY CASE WHEN RowNumber > 10 THEN 'Other' ELSE Ctry END
ORDER BY SUM(Sales) DESC
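One possible adjustment that keeps "Other" last is to order by the smallest row number in each group instead of by the total. The sketch below is only an illustration of that idea: it assumes Python's built-in sqlite3 module with SQLite 3.25 or later (for window functions), uses TOP 2 rather than TOP 10, and reuses the sample data posted in an earlier answer.

```python
import sqlite3

sales = [
    ("GB", 10), ("GB", 21.2), ("GB", 34), ("GB", 16.75),
    ("US", 10), ("US", 11), ("US", 56.43),
    ("FR", 18.54), ("FR", 98.58),
    ("WE", 44.33), ("WE", 11.54), ("WE", 89.21),
    ("KR", 10), ("PO", 10), ("DE", 10),
]

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Table1 (Ctry TEXT NOT NULL, Sales REAL NOT NULL)")
con.executemany("INSERT INTO Table1 VALUES (?, ?)", sales)

query = """
WITH totals AS (
    SELECT Ctry, SUM(Sales) AS total FROM Table1 GROUP BY Ctry
),
ranked AS (
    SELECT Ctry, total,
           ROW_NUMBER() OVER (ORDER BY total DESC, Ctry) AS rn
    FROM totals
)
SELECT CASE WHEN rn <= :n THEN Ctry ELSE 'Other' END AS label,
       SUM(total) AS total_sales
FROM ranked
GROUP BY CASE WHEN rn <= :n THEN Ctry ELSE 'Other' END
-- top rows have rn 1..n, 'Other' starts at n + 1, so it always sorts last
ORDER BY MIN(rn)
"""
top_with_other = con.execute(query, {"n": 2}).fetchall()
con.close()
```

With this data the "Other" total is larger than either top country, yet it still appears as the final row.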

Using a real analytics SQL engine, such as Apache Spark, you can use a common table expression (WITH) to do this:
with t as (
select rank() over (order by sales desc) as r, sales,city
from DB
order by sales desc
)
select sales, city, r
from t where r <= 10
union
select sum(sales) as sales, "Other" as city, 11 as r
from t where r > 10

In pseudo SQL:
select top 10 order by sales
UNION
select 'Other',SUM(sales) where Ctry not in (select top 10 like above)

Union the top ten with an outer join of the top ten against the table itself to aggregate the rest.
I don't have access to SQL here but I'll hazard a guess:
select Ctry, sales
from (select top (10) Ctry, sales from table1 order by sales desc) as top10
union all
select 'other', sum(table1.sales)
from table1
left outer join (select top (10) Ctry, sales from table1 order by sales desc) as table2
on table1.Ctry = table2.Ctry
where table2.Ctry is null
Of course if this is a rapidly changing top(10) then you either lock or maintain a copy of the top(10) for the duration of the query.

Keep in mind that, depending on your use (and database volume / restrictions), you can achieve the same results using application code (Python, Node, C#, Java, etc.). Sure, it will depend on your use case, but hey, it's possible.
I ended up doing this in C#, for instance:
// Mock-up class that has a category and its volume
class YourModel { public string category; public double volume; }
// wholeList is assumed to be already sorted by volume, descending
List<YourModel> groupedList = wholeList.Take(5).ToList();
groupedList.Add(new YourModel()
{
category = "Others",
volume = wholeList.Skip(5).Select(t => t.volume).Sum()
});
Disclaimer
I understand that this is a "SQL Only" tagged question, but there might be other people like me out there who can make use of the application layer instead of relying only on SQL to make it happen. I am just trying to show people other ways of doing the same thing that might be helpful. Even if this gets downvoted to oblivion, I know that someone will be happy to read this because they were taught to use each tool to its best and to think "outside the box".
