Add SQL SELECT COUNT result to an existing column as text - sql

I have two tables:
1. TABLE [dbo].[ItemCategories](
[Id] [int] IDENTITY(1,1) NOT NULL,
[CategoryId] [int] NULL,
[StockId] [int] NULL,
2. TABLE [dbo].[Categories](
[Id] [int] IDENTITY(1,1) NOT NULL,
[ParentCategoryId] [int] NULL,
[CategoryName] [nvarchar](100) NULL,
[Slug] [nvarchar](150) NULL
And this query in SQL Server 2012
SELECT [CategoryName], [Slug], [ParentCategoryId], [Id]
FROM [Categories]
ORDER BY [ParentCategoryId] DESC
Which returns these rows
[CategoryName] [Slug] [ParentCategoryId] [Id]
Exercise exercise 42 46
Fashion fashion 42 47
And I have a second query:
SELECT COUNT(*)
FROM [ItemCategories]
WHERE CategoryId = 46 -- This Id is the same as [Id] from the first query
How can I modify the first query to append the total count from the second query to the returned CategoryName column (as a single string)?
Like this:
[CategoryName] [Slug] [ParentCategoryId] [Id]
Exercise (31) exercise 42 46
Fashion (56) fashion 42 47
I have created this join, but I don't know how to add the COUNT(*) as text
SELECT [CategoryName], [Slug], [ParentCategoryId], [Categories].[Id]
FROM [Categories]
INNER JOIN [ItemCategories] ON [Categories].[Id]=[ItemCategories].[CategoryId]
ORDER BY [ParentCategoryId] DESC

You can use the count(*) window function. I would put it in a separate column, but you can do:
SELECT [CategoryName] + ' (' + cast(count(*) over (partition by Id) as varchar(255)) + ')',
[Slug], [ParentCategoryId], [Id]
FROM [Categories]
ORDER BY [ParentCategoryId] DESC;
EDIT:
For two tables, use a JOIN and GROUP BY:
SELECT c.CategoryName + ' (' + cast(count(ic.Id) as varchar(255)) + ')',
c.Slug, c.ParentCategoryId, c.Id
FROM Categories c LEFT JOIN
ItemCategories ic
on ic.CategoryId = c.Id
GROUP BY c.CategoryName, c.slug, c.ParentCategoryId, c.id
ORDER BY ParentCategoryId DESC;
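To try the JOIN and GROUP BY version on a few rows, here is a minimal sketch in Python using the standard sqlite3 module. SQLite only stands in for SQL Server here, so concatenation uses || instead of +; the sample rows, the item counts and the extra 'Empty' category are made up for illustration.

```python
import sqlite3

# In-memory SQLite database as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Categories (Id INTEGER PRIMARY KEY, ParentCategoryId INTEGER,
                             CategoryName TEXT, Slug TEXT);
    CREATE TABLE ItemCategories (Id INTEGER PRIMARY KEY, CategoryId INTEGER,
                                 StockId INTEGER);
    -- Hypothetical sample data; 'Empty' has no items on purpose.
    INSERT INTO Categories VALUES (46, 42, 'Exercise', 'exercise'),
                                  (47, 42, 'Fashion', 'fashion'),
                                  (48, 42, 'Empty', 'empty');
    INSERT INTO ItemCategories (CategoryId, StockId)
    VALUES (46, 1), (46, 2), (46, 3), (47, 4), (47, 5);
""")

# LEFT JOIN keeps categories without items; COUNT(ic.Id) ignores the NULLs
# produced for them, so they get a count of 0.
rows = conn.execute("""
    SELECT c.CategoryName || ' (' || CAST(COUNT(ic.Id) AS TEXT) || ')' AS CategoryName,
           c.Slug, c.ParentCategoryId, c.Id
    FROM Categories c
    LEFT JOIN ItemCategories ic ON ic.CategoryId = c.Id
    GROUP BY c.CategoryName, c.Slug, c.ParentCategoryId, c.Id
    ORDER BY c.Id
""").fetchall()

for row in rows:
    print(row)
```

Counting ic.Id rather than * is what makes a category without items show (0) instead of (1).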

Related

Return list of Students by ZipCode Count

I am trying to get a list of students that live in the same zip code where zip code count > 1.
I tried the following and get nothing in my query. If I remove s.Student, I get results of zipcode and count, but I want to include student also.
SELECT s.Student, z.ZipCode, COUNT(s.ZipCodeId) As 'Zip Code Count'
FROM Students s
INNER JOIN ZipCodes z ON z.ZipCodeId = s.ZipCodeId
GROUP BY s.Student, z.ZipCode
HAVING COUNT(z.ZipCode) > 1
Below are the database tables I am using.
CREATE TABLE [dbo].[Instructors](
[InstructorId] [int] IDENTITY(1,1) NOT NULL,
[Instructor] [varchar](50) NOT NULL,
[ZipCodeId] [int] NOT NULL
) ON [PRIMARY]
GO
CREATE TABLE [dbo].[Students](
[StudentId] [int] IDENTITY(1,1) NOT NULL,
[Student] [varchar](50) NOT NULL,
[ZipCodeId] [int] NOT NULL
) ON [PRIMARY]
GO
CREATE TABLE [dbo].[ZipCodes](
[ZipCodeId] [int] IDENTITY(1,1) NOT NULL,
[ZipCode] [varchar](9) NULL,
[City] [varchar](50) NULL,
[State] [varchar](25) NULL
) ON [PRIMARY]
I think you need to find the zip codes that are used more than once, then join Students back on along with the zip code details, e.g.:
SELECT S.Student, Z.ZipCode, Z1.Num AS "Zip Code Count"
FROM (
SELECT COUNT(*) Num, ZipCodeId
FROM Students S
GROUP BY ZipCodeId
HAVING COUNT(*) > 1
) Z1
INNER JOIN Students S on S.ZipCodeId = Z1.ZipCodeId
INNER JOIN ZipCodes Z on Z.ZipCodeId = Z1.ZipCodeId;
Note: You don't use single quotes (') to delimit a column name - you use double quotes (") or square brackets ([]).
Also, sample data would allow testing of our solutions.
You can do this using a window function, without re-joining
SELECT
S.Student,
Z.ZipCode,
S.Num AS [Zip Code Count]
FROM (
SELECT *,
COUNT(*) OVER (PARTITION BY S.ZipCodeId) Num
FROM Students S
) S
INNER JOIN ZipCodes Z on Z.ZipCodeId = S.ZipCodeId
WHERE S.Num > 1;
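The window-function approach can be checked on a small data set. This is a rough sketch in Python with the standard sqlite3 module, assuming SQLite (3.25 or later, for window functions) as a stand-in for SQL Server; the students and zip codes are invented. It also hints at why the original query returned nothing: grouping by s.Student puts each student in a group of their own, so the count never exceeds 1 when names are unique.

```python
import sqlite3

# In-memory SQLite database as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ZipCodes (ZipCodeId INTEGER PRIMARY KEY, ZipCode TEXT);
    CREATE TABLE Students (StudentId INTEGER PRIMARY KEY, Student TEXT,
                           ZipCodeId INTEGER NOT NULL);
    -- Hypothetical sample data: Ann and Bob share a zip code, Cid does not.
    INSERT INTO ZipCodes VALUES (1, '10001'), (2, '20002');
    INSERT INTO Students (Student, ZipCodeId)
    VALUES ('Ann', 1), ('Bob', 1), ('Cid', 2);
""")

# COUNT(*) OVER (PARTITION BY ...) attaches the per-zip count to every
# student row, so no GROUP BY and no second join to Students is needed.
shared = conn.execute("""
    SELECT S.Student, Z.ZipCode, S.Num AS "Zip Code Count"
    FROM (
        SELECT Student, ZipCodeId,
               COUNT(*) OVER (PARTITION BY ZipCodeId) AS Num
        FROM Students
    ) S
    INNER JOIN ZipCodes Z ON Z.ZipCodeId = S.ZipCodeId
    WHERE S.Num > 1
    ORDER BY S.Student
""").fetchall()

print(shared)
```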

I can't figure out how to Order by with string_agg

I have this query (I am using SQL Server 2019) and it is working fine (combining Dates and Notes into one column). However, I would like the latest date to show up first.
How can I achieve that from this query?
SELECT ID,
(SELECT string_agg(concat(Date, ': ', Notes), CHAR(13) + CHAR(10) + CHAR(13) + CHAR(10)) as Expr1
FROM(SELECT DISTINCT nd.Notes, nd.Date
FROM dbo.ReleaseTrackerNotes AS nd
INNER JOIN dbo.ReleaseTracker AS ac4 ON ac4.ID = nd.ReleaseTrackerID
WHERE (ac4.ID = ac.ID)) AS z_1) AS vNotes
FROM dbo.ReleaseTracker AS ac
GROUP BY ID
I have tried ORDER BY, but it is not working.
Here is my table:
CREATE TABLE [dbo].[ReleaseTrackerNotes](
[ID] [int] IDENTITY(1,1) NOT NULL,
[ReleaseTrackerID] [int] NULL,
[AOC_ModelID] [int] NULL,
[Date] [date] NULL,
[Notes] [nvarchar](800) NULL,
CONSTRAINT [PK_ReleaseTrackerNotes] PRIMARY KEY CLUSTERED
CREATE TABLE [dbo].[ReleaseTracker](
[ID] [int] IDENTITY(1,1) NOT NULL,
[AOC_ModelID] [int] NOT NULL,
[MotherboardID] [int] NOT NULL,
[StatusID] [int] NOT NULL,
[TestCateoryID] [int] NULL,
[TestTypeID] [int] NULL,
[DateStarted] [date] NULL,
[DateCompleted] [date] NULL,
[LCS#/ORS#] [nvarchar](20) NULL,
[ETCDate] [date] NULL,
[CardsNeeded] [nvarchar](2) NULL,
CONSTRAINT [PK_Compatibility] PRIMARY KEY CLUSTERED
Use WITHIN GROUP (ORDER BY ...):
SELECT
ID,
STRING_AGG(TRY_CONVERT(varchar(10), Date, 101) + ': ' + Notes,
CHAR(13) + CHAR(10) + CHAR(13) + CHAR(10))
WITHIN GROUP (ORDER BY Date DESC) AS Expr1
FROM
(
SELECT DISTINCT ac4.ID, nd.Notes, nd.Date
FROM dbo.ReleaseTrackerNotes AS nd
INNER JOIN dbo.ReleaseTracker AS ac4
ON ac4.ID = nd.ReleaseTrackerID
) AS vNotes
GROUP BY ID;

How do I aggregate 3 columns that are different with MIN(DATE)?

I'm facing a simple problem here that I can't solve, I have this query:
SELECT
MIN(TEA_InicioTarefa),
PFJ_Id_Analista,
ATC_Id,
SRV_Id
FROM
dbo.TarefaEtapaAreaTecnica
INNER JOIN Tarefa t ON t.TRF_Id = TarefaEtapaAreaTecnica.TRF_Id
WHERE SRV_Id = 88
GROUP BY SRV_Id, ATC_Id, PFJ_Id_Analista
ORDER BY ATC_Id ASC
It returns me this:
I was able to group it a little with GROUP BY SRV_Id, ATC_Id, PFJ_Id_Analista that gave me these 8 records, but as you can see some PFJ_Id_Analista are different.
What I want is to select only the earliest date for each SRV_Id and ATC_Id. PFJ_Id_Analista doesn't need to be grouped: if I remove PFJ_Id_Analista from the grouping the query works, but I need the column.
For example, between rows 2 and 3 I want only the earliest date, so it will be row 2. The same goes for rows 5 to 8: I want only row 6.
DDL for TarefaEtapaAreaTecnica (important key: TRF_Id)
CREATE TABLE [dbo].[TarefaEtapaAreaTecnica](
[TEA_Id] [int] IDENTITY(1,1) NOT NULL,
[TRF_Id] [int] NOT NULL,
[ETS_Id] [int] NOT NULL,
[ATC_Id] [int] NOT NULL,
[TEA_Revisao] [int] NOT NULL,
[PFJ_Id_Projetista] [int] NULL,
[TEA_DoctosQtd] [int] NULL,
[TEA_InicioTarefa] [datetime2](7) NULL,
[PFJ_Id_Analista] [int] NULL,
[TEA_FimTarefa] [datetime2](7) NULL,
[TEA_HorasQtd] [numeric](18, 1) NULL,
[TEA_NcfQtd] [int] NULL,
[PAT_Id] [int] NULL
DDL for Tarefa (important keys TRF_Id and SRV_Id (which I need it)):
CREATE TABLE [dbo].[Tarefa](
[TRF_Id] [int] IDENTITY(1,1) NOT FOR REPLICATION NOT NULL,
[SRV_Id] [int] NOT NULL,
[TRT_Id] [int] NOT NULL,
[TRF_Descr] [varchar](255) NULL,
[TRF_Entrada] [datetime] NOT NULL,
[TRF_DoctosQtd] [int] NOT NULL,
[TRF_Devolucao] [datetime] NULL,
[TRF_NcfQtd] [int] NULL,
[TRF_EhDocInsuf] [bit] NULL,
[TRF_Observ] [varchar](255) NULL,
[TRF_AreasTrfQtd] [int] NULL,
[TRF_AreasTrfLiqQtd] [int] NULL
Thanks a lot.
EDIT:
CORRECT QUERY
Based on @Gordon Linoff's post:
select t.TEA_InicioTarefa, t.PFJ_Id_Analista, t.ATC_Id, t.SRV_Id
from (select t.*,
row_number() over (partition by ATC_Id, SRV_Id
order by TEA_InicioTarefa) as seqnum, ta.SRV_Id
from dbo.TarefaEtapaAreaTecnica t
inner join dbo.Tarefa ta on t.TRF_Id = ta.TRF_Id
) t
where seqnum = 1 AND t.SRV_Id = 88
Just use window functions:
select t.*
from (select t.*,
row_number() over (partition by ATC_Id, SRV_Id
order by ini) as seqnum
from dbo.TarefaEtapaAreaTecnica t
) t
where seqnum = 1;
This is really an example of filtering, not aggregation. The problem is getting the right value to filter on.
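The row_number() filter can be tried on a few rows. Here is a minimal sketch in Python with the standard sqlite3 module, assuming SQLite (3.25 or later) in place of SQL Server, with only the columns the query touches and invented dates and analyst ids. Because SRV_Id lives in Tarefa, the sketch joins the two tables inside the subquery, as the asker's corrected query does.

```python
import sqlite3

# In-memory SQLite database as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Tarefa (TRF_Id INTEGER PRIMARY KEY, SRV_Id INTEGER NOT NULL);
    CREATE TABLE TarefaEtapaAreaTecnica (
        TEA_Id INTEGER PRIMARY KEY, TRF_Id INTEGER NOT NULL,
        ATC_Id INTEGER NOT NULL, TEA_InicioTarefa TEXT, PFJ_Id_Analista INTEGER);
    -- Hypothetical sample data: ATC_Id 10 has two candidate rows.
    INSERT INTO Tarefa VALUES (1, 88), (2, 88);
    INSERT INTO TarefaEtapaAreaTecnica
        (TRF_Id, ATC_Id, TEA_InicioTarefa, PFJ_Id_Analista)
    VALUES (1, 10, '2016-03-02', 501),
           (2, 10, '2016-03-01', 502),
           (1, 11, '2016-03-05', 503);
""")

# Number the rows inside each (ATC_Id, SRV_Id) group by start date and keep
# only the first one; the analyst column comes along without being grouped.
earliest = conn.execute("""
    SELECT TEA_InicioTarefa, PFJ_Id_Analista, ATC_Id, SRV_Id
    FROM (
        SELECT t.TEA_InicioTarefa, t.PFJ_Id_Analista, t.ATC_Id, ta.SRV_Id,
               ROW_NUMBER() OVER (PARTITION BY t.ATC_Id, ta.SRV_Id
                                  ORDER BY t.TEA_InicioTarefa) AS seqnum
        FROM TarefaEtapaAreaTecnica t
        INNER JOIN Tarefa ta ON ta.TRF_Id = t.TRF_Id
    )
    WHERE seqnum = 1 AND SRV_Id = 88
    ORDER BY ATC_Id
""").fetchall()

print(earliest)
```

For ATC_Id 10 the row from analyst 502 is kept because it has the earlier date, which is the behaviour a plain MIN() with GROUP BY cannot give without dropping the analyst column.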
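Alternatively, compute the grouping first and then JOIN back to it on the minimum date, like:
SELECT
x.Min_TEA_InicioTarefa,
t.PFJ_Id_Analista,
t.ATC_Id,
ta.SRV_Id
FROM
dbo.TarefaEtapaAreaTecnica t
INNER JOIN dbo.Tarefa ta ON ta.TRF_Id = t.TRF_Id
INNER JOIN (
-- SRV_Id lives in Tarefa, so the grouping needs the same join
select ta2.SRV_Id, t2.ATC_Id, MIN(t2.TEA_InicioTarefa) as Min_TEA_InicioTarefa
from dbo.TarefaEtapaAreaTecnica t2
inner join dbo.Tarefa ta2 on ta2.TRF_Id = t2.TRF_Id
GROUP BY ta2.SRV_Id, t2.ATC_Id
) x ON ta.SRV_Id = x.SRV_Id
AND t.ATC_Id = x.ATC_Id
AND t.TEA_InicioTarefa = x.Min_TEA_InicioTarefa
WHERE ta.SRV_Id = 88
ORDER BY t.ATC_Id ASC;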

Query is very very slow for processing 200000 plus records

I have 200,000 rows in Patient & Person table, and the query shown takes 30 secs to execute.
I have defined the primary key (and clustered index) in the Person table on PersonId and on PatientId in the Patient table. What else can I do here to improve performance of my procedure?
I am new to database development and know only basic SQL. I am also not sure whether SQL Server can handle 200,000 rows quickly.
Whole dynamic Procedure you can see at https://github.com/Padayappa/SQLProblem/blob/master/Performance
Anyone faced handling huge rows like this? How do I improve performance here?
DECLARE @return_value int,
@unitRows bigint,
@unitPages int,
@TenantId int,
@unitItems int,
@page int
SET @TenantId = 1
SET @unitItems = 20
SET @page = 1
DECLARE @PatientSearch TABLE(
[PatientId] [bigint] NOT NULL,
[PatientIdentifier] [nvarchar](50) NULL,
[PersonNumber] [nvarchar](20) NULL,
[FirstName] [nvarchar](100) NOT NULL,
[LastName] [nvarchar](100) NOT NULL,
[ResFirstName] [nvarchar](100) NOT NULL,
[ResLastName] [nvarchar](100) NOT NULL,
[AddFirstName] [nvarchar](100) NOT NULL,
[AddLastName] [nvarchar](100) NOT NULL,
[Address] [nvarchar](255) NULL,
[City] [nvarchar](50) NULL,
[State] [nvarchar](50) NULL,
[ZipCode] [nvarchar](20) NULL,
[Country] [nvarchar](50) NULL,
[RowNumber] [bigint] NULL
)
INSERT INTO @PatientSearch SELECT PAT.PatientId
,PAT.PatientIdentifier
,PER.PersonNumber
,PER.FirstName
,PER.LastName
,RES_PER.FirstName AS ResFirstName
,RES_PER.LastName AS ResLastName
,ADD_PER.FirstName AS AddFirstName
,ADD_PER.LastName AS AddLastName
,PER.Address
,PER.City
,PER.State
,PER.ZipCode
,PER.Country
,ROW_NUMBER() OVER (ORDER BY PAT.PatientId DESC) AS RowNumber
FROM dbo.Patient AS PAT
INNER JOIN dbo.Person AS PER
ON PAT.PersonId = PER.PersonId
INNER JOIN dbo.Person AS RES_PER
ON PAT.ResponsiblePersonId = RES_PER.PersonId
INNER JOIN dbo.Person AS ADD_PER
ON PAT.AddedBy = ADD_PER.PersonId
INNER JOIN dbo.Booking AS B
ON PAT.PatientId = B.PatientId
WHERE PAT.TenantId = @TenantId AND B.CategoryId = @CategoryId
GROUP BY PAT.PatientId
,PAT.PatientIdentifier
,PER.PersonNumber
,PER.FirstName
,PER.LastName
,RES_PER.FirstName
,RES_PER.LastName
,ADD_PER.FirstName
,ADD_PER.LastName
,PER.Address
,PER.City
,PER.State
,PER.ZipCode
,PER.Country
;
SELECT @unitRows = @@ROWCOUNT
,@unitPages = (@unitRows / @unitItems) + 1;
SELECT *
FROM @PatientSearch AS IT
WHERE RowNumber BETWEEN (@page - 1) * @unitItems + 1 AND @unitItems * @page
Well, unless I am missing something (like duplicate rows?) you should be able to remove the GROUP BY
GROUP BY PAT.PatientId
,PAT.PatientIdentifier
,PER.PersonNumber
,PER.FirstName
,PER.LastName
,RES_PER.FirstName
,RES_PER.LastName
,ADD_PER.FirstName
,ADD_PER.LastName
,PER.Address
,PER.City
,PER.State
,PER.ZipCode
,PER.Country
as you are grouping by all fields in the select list, and you are partitioning by PAT.PatientId
Further to that, you should create indexes on the tables, containing the columns that you join or filter on.
So for instance I would create an index on table Patient with columns (TenantId,PersonId,ResponsiblePersonId,AddedBy) with included columns (PatientId,PatientIdentifier)
Frankly speaking, 200,000 rows is nothing to SQL Server. First remove the redundant logic: you have a primary key, so why group by so many columns, and why do you need to join the same table (Person) three times? After that, create at least some composite or covering (INCLUDE) indexes. Get the execution plan (Ctrl+L for the estimated plan, Ctrl+M to include the actual plan) to see which indexes are missing. If you need further help, please post your table schema with a few rows of sample data.

What's the right way of joining two tables, grouping by a column, and selecting only one row for each record?

I have a crews table
CREATE TABLE crew(crew_id INT, crew_name nvarchar(20))
And a time log table, which is just a very long list of actions performed by the crew
CREATE TABLE [dbo].[TimeLog](
[time_log_id] [int] IDENTITY(1,1) NOT NULL,
[experiment_id] [int] NOT NULL,
[crew_id] [int] NOT NULL,
[starting] [bit] NULL,
[ending] [bit] NULL,
[exception] [nchar](10) NULL,
[sim_time] [time](7) NULL,
[duration] [int] NULL,
[real_time] [datetime] NOT NULL )
I want to have a view that shows only one row for each crew with the latest sim_time + duration .
Is a view the way to go? If yes, how do I write it? If not, what's the best way of doing this?
Thanks
Here is a query to select what you want:
select * from (
select
c.crew_name,
l.*, -- crew_id comes from TimeLog, so each column name appears only once
row_number() over (partition by c.crew_id order by l.sim_time desc) as rNum
from crew as c
inner join TimeLog as l on c.crew_id = l.crew_id
) as t
where rNum = 1
It depends on what you need that data for.
Anyway, a simple query to find the latest sim time would be something like
select C.*, TL.sim_time
from crew C /*left? right? inner?*/ join TimeLog TL on TL.crew_id = C.crew_id
where TL.sim_time in (select max(timelog_subquery.sim_time) from TimeLog timelog_subquery where timelog_subquery.crew_id = C.crew_id )
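The correlated-subquery variant can be exercised the same way. This is a small sketch in Python with the standard sqlite3 module, assuming SQLite instead of SQL Server, a cut-down TimeLog table and invented crews and times (sim_time stored as HH:MM:SS text so that it sorts correctly).

```python
import sqlite3

# In-memory SQLite database as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE crew (crew_id INTEGER, crew_name TEXT);
    CREATE TABLE TimeLog (time_log_id INTEGER PRIMARY KEY, crew_id INTEGER NOT NULL,
                          sim_time TEXT, duration INTEGER);
    -- Hypothetical sample data: crew 1 has two log rows, crew 2 has one.
    INSERT INTO crew VALUES (1, 'Alpha'), (2, 'Bravo');
    INSERT INTO TimeLog (crew_id, sim_time, duration)
    VALUES (1, '08:00:00', 30), (1, '09:15:00', 10), (2, '07:45:00', 20);
""")

# For each crew, keep the log row whose sim_time equals that crew's maximum.
latest = conn.execute("""
    SELECT C.crew_id, C.crew_name, TL.sim_time, TL.duration
    FROM crew C
    INNER JOIN TimeLog TL ON TL.crew_id = C.crew_id
    WHERE TL.sim_time = (SELECT MAX(sub.sim_time)
                         FROM TimeLog sub
                         WHERE sub.crew_id = C.crew_id)
    ORDER BY C.crew_id
""").fetchall()

print(latest)
```

If two log rows of one crew share the same maximum sim_time, this form returns both, whereas the row_number() query always keeps exactly one row per crew.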