Grouping data in SQL Server

Please consider the following table:
As you can see, the Name column has some repeated values, each of which forms a group.
I need a query that fetches just the first row of each group, something like this:
Please take into account that I need the fastest approach, because the real table is much larger and could have a lot of data to filter this way.
Thanks in advance.

This depends greatly on how you define 'the first in the group'.
For example, taking the lowest code in each group:
select name, min(code)
from mytable
group by name
order by name
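If more columns than name and code are needed, MIN alone is not enough, because each aggregate is computed independently. A minimal sketch using ROW_NUMBER, assuming 'first' still means the lowest code per name; the CTE name ranked and the alias rn are made up for illustration:

```sql
-- Number the rows inside each name group, lowest code first,
-- then keep only the first row of every group.
WITH ranked AS (
    SELECT name,
           code,
           ROW_NUMBER() OVER (PARTITION BY name ORDER BY code) AS rn
    FROM mytable
)
SELECT name, code
FROM ranked
WHERE rn = 1
ORDER BY name;
```

Changing the ORDER BY inside the OVER clause changes which row counts as first.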

Assuming the table name is test (change to match yours), try this:
CREATE TABLE [dbo].[test](
[name] [varchar](3) NULL,
[code] [varchar](5) NULL,
[RowNumber] [int] NOT NULL,
CONSTRAINT [PK_test] PRIMARY KEY CLUSTERED
(
[RowNumber] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
INSERT INTO [test] ([name],[code],[RowNumber])VALUES('A1','AED',1)
INSERT INTO [test] ([name],[code],[RowNumber])VALUES('A1','BG',2)
INSERT INTO [test] ([name],[code],[RowNumber])VALUES('A1','WS',3)
INSERT INTO [test] ([name],[code],[RowNumber])VALUES('A2','CER',4)
INSERT INTO [test] ([name],[code],[RowNumber])VALUES('A2','HJY',5)
INSERT INTO [test] ([name],[code],[RowNumber])VALUES('A5','OLP',6)
INSERT INTO [test] ([name],[code],[RowNumber])VALUES('A6','LOO',7)
INSERT INTO [test] ([name],[code],[RowNumber])VALUES('A6','AED',8)
SELECT a.*
FROM dbo.test a
INNER JOIN(SELECT name,
MIN(rownumber) AS rownumber
FROM dbo.test
GROUP BY name) b
ON a.name = b.name
AND a.rownumber = b.rownumber
ORDER BY a.name
If the RowNumber column is always going to be sequential, then put an index on that column.
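In the sample table above, RowNumber is already the clustered primary key, so the grouped subquery may benefit more from an index that leads with name. A minimal sketch; the index name is made up, and whether it helps depends on the real data and workload:

```sql
-- Lets MIN(RowNumber) per name be read from an index ordered by (name, RowNumber).
-- INCLUDE (code) allows the outer SELECT to be answered from the same index.
CREATE NONCLUSTERED INDEX IX_test_name_RowNumber
    ON dbo.test (name, RowNumber)
    INCLUDE (code);
```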

Related

sql - Add calculated column to existing table of max date grouped by id

I have an existing table I have created and I would like to alter my table by creating a new calculated column which will obtain the max date by ID.
Multiple times a day, new data will flow into the table from a form input (web app). Every time the ID and datestamp are entered into the database, I would like the calculated column to update.
I created my table to interact with Powerapps web form:
CREATE TABLE [dbo].[test_table2](
[Id] [int] IDENTITY(1,1) NOT NULL,
[datestamp] [date] NULL,
CONSTRAINT [tableId] PRIMARY KEY CLUSTERED
(
[Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]
) ON [PRIMARY]
GO
I have tried to create a column with something like this:
SELECT id, datestamp FROM (
SELECT id, datestamp,
RANK() OVER (PARTITION BY id ORDER BY datestamp DESC) max_date_id
FROM test_table2
) where max_date_id = 1
Ultimately I would like to alter the table so that it updates automatically, but this code is only a query, and it does not work either. How can I achieve this?
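One likely reason the query above fails is that SQL Server requires an alias on a derived table. The sketch below adds that alias and wraps the query in a view, so the result stays current as new rows arrive. A view is swapped in here because a plain computed column expression cannot contain a subquery. The view name and the alias ranked are made up for illustration:

```sql
-- Hypothetical view name; the derived table now has the alias SQL Server expects.
CREATE VIEW dbo.vw_test_table2_latest
AS
SELECT ranked.Id, ranked.datestamp
FROM (
    SELECT Id,
           datestamp,
           RANK() OVER (PARTITION BY Id ORDER BY datestamp DESC) AS max_date_id
    FROM dbo.test_table2
) AS ranked
WHERE ranked.max_date_id = 1;
```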

SQL Server - Compare Fields in 2 Tables with PIVOT query

I have the same table in two databases (dbSource and dbTarget) and I'm trying to write a query to compare the field values in each table with a Source/Target diff.
This is the table structure:
CREATE TABLE [dbo].[tblWidget](
[ID] [int] NOT NULL,
[Description] [varchar](50) NOT NULL,
[UpdatedBy] [varchar](50) NOT NULL,
CONSTRAINT [PK_tblWidget] PRIMARY KEY CLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
GO
For simplicity, I'll say the table in each database has one row:
[dbSource]:
[dbTarget]:
I want the result to look like this:
The closest I could get to the result is a PIVOT query that only returns one of the fields (UpdatedBy).
Is there a simple way to include all of the fields in one query vs. doing some kind of UNION with multiple PIVOT statements?
I realize if there is more than one row (example: a row with ID = 2), the expected results won't make sense, so please assume I will only be comparing one row between databases.
This is what I have so far:
SELECT 'UpdatedBy' Field, [0] AS [Source], [1] AS [Target]
FROM
(
SELECT [ID]
,[Description]
,[UpdatedBy]
,1 IsSource
FROM [dbSource].[dbo].[tblWidget]
UNION ALL
SELECT [ID]
,[Description]
,[UpdatedBy]
,0 IsSource
FROM [dbTarget].[dbo].[tblWidget]
) a
PIVOT (
MAX(UpdatedBy)
FOR IsSource IN ([0], [1])
) AS pvt
Thanks

You could join the two tables, and then unpivot the columns to rows:
select s.id, x.*
from dbSource.dbo.tblWidget s
inner join dbTarget.dbo.tblWidget t on t.id = s.id
cross apply (values
('Description', s.description, t.description),
('UpdatedBy', s.updatedby, t.updatedby)
) x (field, source, target)
This assumes that the id column can be used to relate the two tables. As a consequence, it does not make sense to have a row for id in the result, since the source and target values are always the same.
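If only the fields that actually differ are of interest, the same join and unpivot can be filtered. A sketch under the assumption that the compared columns are NOT NULL, as in the table definition in the question, so a plain <> comparison is enough:

```sql
-- Same CROSS APPLY (VALUES ...) unpivot, keeping only rows where the values differ.
SELECT s.ID, x.Field, x.[Source], x.[Target]
FROM dbSource.dbo.tblWidget AS s
INNER JOIN dbTarget.dbo.tblWidget AS t
    ON t.ID = s.ID
CROSS APPLY (VALUES
    ('Description', s.[Description], t.[Description]),
    ('UpdatedBy',   s.UpdatedBy,     t.UpdatedBy)
) AS x (Field, [Source], [Target])
WHERE x.[Source] <> x.[Target];
```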

SELECT * INTO FAILS

I have the following table structure:
CREATE TABLE [dbo].[UTS_USERCLIENT_MAPPING_USER_LIST]
(
[MAPPING_ID] [int] IDENTITY(1,1) NOT NULL,
[USER_ID] [varchar](50) NULL,
[USER_EMAIL_ID] [varchar](100) NULL,
[USER_CREATED_DATE] [datetime] NULL,
[USER_IS_ACTIVE] [bit] NULL,
CONSTRAINT [PK_UTS_USERCLIENT_MAPPING_USER_LIST]
PRIMARY KEY CLUSTERED ([MAPPING_ID] ASC)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON,
ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
In a stored procedure I have this code:
ALTER PROCEDURE [dbo].[PROC_UTS_USER_CLIENTMAPPING_LIST_SET]
(@RETURN_CODE INT OUTPUT,
@RETURN_MESSAGE NVARCHAR(512) OUTPUT,
@XML_USER_LIST xml)
AS
BEGIN TRY
SELECT
ROW_NUMBER() OVER(ORDER BY x.value('USERNAME[1]','nvarchar(50)')) AS MAPPING_ID,
x.value('USERNAME[1]', 'nvarchar(50)') as USER_ID,
x.value('EMAILID[1]', 'nvarchar(50)') as USER_EMAIL_ID,
x.value('CREATEDDATE[1]', 'datetime') as USER_CREATED_DATE,
x.value('ISACTIVE[1]', 'bit') as USER_IS_ACTIVE
INTO #tempXML
FROM @XML_USER_LIST.nodes('/DocumentElement/dtLstUsers') AS TEMPTABLE(x)
SELECT *
INTO UTS_USERCLIENT_MAPPING_USER_LIST
FROM #tempXML
END TRY
My problem is that the stored procedure above does not insert data into UTS_USERCLIENT_MAPPING_USER_LIST from the #tempXML table.
I have verified that the #tempXML table contains rows.

There are a few flaws in your query:
1 - You are trying to insert a value into an IDENTITY column without turning IDENTITY_INSERT on for the table first (and turning it off again afterwards):
SET IDENTITY_INSERT UTS_USERCLIENT_MAPPING_USER_LIST ON
2 - SELECT * INTO assumes the target table does not exist and tries to create it, so it fails when the table is already there. Use INSERT INTO ... SELECT instead:
INSERT INTO UTS_USERCLIENT_MAPPING_USER_LIST (cols)
SELECT cols
FROM #temp
3 - You are calculating MAPPING_ID with ROW_NUMBER, which runs from 1 to n (where n is the number of nodes in the XML) on every call. Your table has a PRIMARY KEY on MAPPING_ID, which implies uniqueness, so the second time you try to insert MAPPING_ID 1 it will fail.
4 - If you have an empty CATCH block, it will hide your errors.
Now, without really understanding your requirements for the MAPPING_ID column, the solution is to change the insert statement to:
INSERT INTO UTS_USERCLIENT_MAPPING_USER_LIST ([USER_ID], [USER_EMAIL_ID], [USER_CREATED_DATE], [USER_IS_ACTIVE])
SELECT [USER_ID], [USER_EMAIL_ID], [USER_CREATED_DATE], [USER_IS_ACTIVE]
FROM #tempXML
Or, if you somehow have a valid MAPPING_ID from the XML:
SET IDENTITY_INSERT UTS_USERCLIENT_MAPPING_USER_LIST ON
INSERT INTO UTS_USERCLIENT_MAPPING_USER_LIST ([MAPPING_ID], [USER_ID], [USER_EMAIL_ID], [USER_CREATED_DATE], [USER_IS_ACTIVE])
SELECT [MAPPING_ID], [USER_ID], [USER_EMAIL_ID], [USER_CREATED_DATE], [USER_IS_ACTIVE]
FROM #tempXML
SET IDENTITY_INSERT UTS_USERCLIENT_MAPPING_USER_LIST OFF
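To illustrate point 4, here is a sketch of a CATCH block that records the error and re-raises it instead of swallowing it. It assumes the target table and #tempXML already exist and that THROW is available (SQL Server 2012 or later). The local variables stand in for the procedure's OUTPUT parameters, and treating 0 as the success code is an assumed convention:

```sql
-- Stand-ins for the procedure's OUTPUT parameters, so the batch runs on its own.
DECLARE @RETURN_CODE INT, @RETURN_MESSAGE NVARCHAR(512);

BEGIN TRY
    INSERT INTO dbo.UTS_USERCLIENT_MAPPING_USER_LIST
        ([USER_ID], [USER_EMAIL_ID], [USER_CREATED_DATE], [USER_IS_ACTIVE])
    SELECT [USER_ID], [USER_EMAIL_ID], [USER_CREATED_DATE], [USER_IS_ACTIVE]
    FROM #tempXML;

    SET @RETURN_CODE = 0;          -- assumed convention: 0 means success
    SET @RETURN_MESSAGE = N'OK';
END TRY
BEGIN CATCH
    -- Capture what went wrong, then re-raise so the caller still sees the failure.
    SET @RETURN_CODE = ERROR_NUMBER();
    SET @RETURN_MESSAGE = ERROR_MESSAGE();
    THROW;
END CATCH;
```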

As I understand it, you need to list all the column names apart from MAPPING_ID. See the code below:
INSERT INTO UTS_USERCLIENT_MAPPING_USER_LIST (
USER_ID,
USER_EMAIL_ID,
USER_CREATED_DATE,
USER_IS_ACTIVE)
select USER_ID,
USER_EMAIL_ID,
USER_CREATED_DATE,
USER_IS_ACTIVE
from #tempXML

Looks like you need to turn on IDENTITY_INSERT as Sparky has suggested.

Another possible solution is to drop the IDENTITY property altogether if it is not needed (if this table only gets its data from this stored procedure, there is no need for an IDENTITY column):
CREATE TABLE [dbo].[UTS_USERCLIENT_MAPPING_USER_LIST](
[MAPPING_ID] [int] NOT NULL,
[USER_ID] [varchar](50) NULL,
[USER_EMAIL_ID] [varchar](100) NULL,
[USER_CREATED_DATE] [datetime] NULL,
[USER_IS_ACTIVE] [bit] NULL,
CONSTRAINT [PK_UTS_USERCLIENT_MAPPING_USER_LIST] PRIMARY KEY CLUSTERED
(
[MAPPING_ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]

The real problem was that SELECT * INTO always creates a new table, so I needed to drop the existing table before it was recreated.
When I ran:
DROP TABLE UTS_USERCLIENT_MAPPING_USER_LIST
SELECT *
INTO UTS_USERCLIENT_MAPPING_USER_LIST
FROM #tempXML
it worked.
Thank you.
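A slightly more defensive sketch of the same drop-and-recreate approach, which only drops the table when it exists. Note that a table created by SELECT ... INTO from #tempXML does not get the primary key constraint or the IDENTITY property of the original definition, so those would have to be re-added separately if they matter. The dbo schema is assumed from the CREATE TABLE statement in the question:

```sql
-- Drop the table only if it is there, then let SELECT ... INTO recreate it.
IF OBJECT_ID(N'dbo.UTS_USERCLIENT_MAPPING_USER_LIST', N'U') IS NOT NULL
    DROP TABLE dbo.UTS_USERCLIENT_MAPPING_USER_LIST;

SELECT *
INTO dbo.UTS_USERCLIENT_MAPPING_USER_LIST
FROM #tempXML;
```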

Couldn't you just skip the temporary table altogether?
-- TRUNCATE TABLE HERE, IF NEED BE
INSERT INTO UTS_USERCLIENT_MAPPING_USER_LIST (<ColumnList>)
SELECT
ROW_NUMBER() OVER(ORDER BY x.value('USERNAME[1]','nvarchar(50)')) AS MAPPING_ID,
x.value('USERNAME[1]', 'nvarchar(50)') as USER_ID,
x.value('EMAILID[1]', 'nvarchar(50)') as USER_EMAIL_ID,
x.value('CREATEDDATE[1]', 'datetime') as USER_CREATED_DATE,
x.value('ISACTIVE[1]', 'bit') as USER_IS_ACTIVE
FROM @XML_USER_LIST.nodes('/DocumentElement/dtLstUsers') AS TEMPTABLE(x)

Select count from another table to each row in result rows

Here are the tables:
CREATE TABLE [dbo].[Classes](
[ClassId] [int] NOT NULL,
[ClassName] [nvarchar](50) NOT NULL,
CONSTRAINT [PK_Classes] PRIMARY KEY CLUSTERED
(
[ClassId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[Students](
[StudentId] [int] NOT NULL,
[ClassId] [int] NOT NULL,
CONSTRAINT [PK_Students] PRIMARY KEY CLUSTERED
(
[StudentId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[Students] WITH CHECK ADD CONSTRAINT [FK_Students_Classes] FOREIGN KEY([ClassId])
REFERENCES [dbo].[Classes] ([ClassId])
GO
ALTER TABLE [dbo].[Students] CHECK CONSTRAINT [FK_Students_Classes]
GO
I want to get a list of classes and, for each class, the number of students that belong to it.
How can I do this?

You need to do this:
SELECT C.ClassId, C.ClassName, count(S.StudentId) AS studentCount
FROM CLASSES C LEFT JOIN STUDENTS S ON (C.ClassId=S.ClassId)
GROUP BY C.ClassId, C.ClassName

You mean something like this?
SELECT C.[ClassName], COUNT(*) AS 'Number of Students'
FROM [dbo].[Classes] AS C
INNER JOIN [dbo].[Students] AS S ON S.[ClassId] = C.[ClassId]
GROUP BY C.[ClassName]
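The two join-based answers above differ for classes that have no students: the LEFT JOIN version reports them with a count of 0, while the INNER JOIN version leaves them out. A small self-contained sketch using table variables; the sample rows are invented purely for illustration:

```sql
-- Invented sample data: one class with two students, one class with none.
DECLARE @Classes  TABLE (ClassId INT PRIMARY KEY, ClassName NVARCHAR(50) NOT NULL);
DECLARE @Students TABLE (StudentId INT PRIMARY KEY, ClassId INT NOT NULL);

INSERT INTO @Classes  VALUES (1, N'Maths'), (2, N'Art');
INSERT INTO @Students VALUES (10, 1), (11, 1);

-- LEFT JOIN keeps the empty class; COUNT(S.StudentId) ignores the NULL and gives 0.
SELECT C.ClassId, C.ClassName, COUNT(S.StudentId) AS studentCount
FROM @Classes AS C
LEFT JOIN @Students AS S ON S.ClassId = C.ClassId
GROUP BY C.ClassId, C.ClassName;

-- INNER JOIN drops the class without students from the result.
SELECT C.ClassName, COUNT(*) AS [Number of Students]
FROM @Classes AS C
INNER JOIN @Students AS S ON S.ClassId = C.ClassId
GROUP BY C.ClassName;
```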

Without adding a GROUP BY clause, you can do the following:
Create a function to get the student count:
go
CREATE FUNCTION [dbo].GetStudentsCountByClass(@classId int) RETURNS INT
AS BEGIN
declare @count as int
select @count = count(*) from STUDENTS
where ClassId = @classId
RETURN @count
END
Then use it in your select statement:
SELECT * , dbo.GetStudentsCountByClass(ClassId) AS StudentsCount
FROM Classes
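The same result can be had without a function or a GROUP BY by using a correlated subquery. This is offered as an alternative sketch, not as the answer's method; on some SQL Server versions it may perform better than a scalar function called once per row, but that should be checked against the real data:

```sql
-- Correlated subquery: counts the students of each class inline.
SELECT C.ClassId,
       C.ClassName,
       (SELECT COUNT(*)
        FROM dbo.Students AS S
        WHERE S.ClassId = C.ClassId) AS StudentsCount
FROM dbo.Classes AS C;
```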

select c.ClassId,C.ClassName,COUNT(*) [Number of students]
from Classes C,Students S
where c.ClassId=S.ClassId
group by C.ClassId,C.ClassName

SELECT class.ClassId, count(student.StudentId) AS studentCount
FROM dbo.CLASSES class LEFT JOIN dbo.STUDENTS student ON (class.ClassId=student.ClassId)
GROUP BY class.ClassId

Query to update rows and then create a table if it does not exist

I have two different things that need to happen together. First, I need to insert two new records into a table if they are not already there; they will always have the same ID and name (I have done this part). Then, immediately afterwards, I need to check whether a table exists and create it if it does not (if it does exist, I do NOT want to drop it, just leave it alone).
Please see my code below. Can you help me with checking whether the table exists? If you see room for improvement, please say so.
Thank you
--ADD LOCKS
BEGIN TRAN
IF EXISTS (SELECT myID, myName
FROM myTable
WHERE myID = 7 AND myName = 'Pedro')
SELECT 1
ELSE
INSERT INTO myTable (myID , myName) values ( 7, 'Pedro')
IF EXISTS (SELECT myID, myName
FROM myTable
WHERE myID = 8 AND myName = 'Joseph')
SELECT 1
ELSE
INSERT INTO myTable (myID , myName) values ( 8, 'Joseph')
COMMIT
--NOW BELOW I WANT TO DO THE CREATION OF A TABLE IF IT DOES NOT EXIST
--NOT SURE HOW TO CHECK IT.. BUT KNOW HOW TO CREATE IT
--IF TABLE DOES NOT EXIST DO THE FOLLOWING
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[myTable](
[myID] [int] IDENTITY(1,1) NOT NULL,
[Name] [varchar](MAX) NOT NULL,
CONSTRAINT [PK_myTable] PRIMARY KEY CLUSTERED
(
[myID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
--ELSE DONT DO NOTHING

As far as I know, you have to check whether the table exists first, and create it if it doesn't, before inserting. I hope the query below is of some use to you.
IF NOT EXISTS (SELECT * FROM SYSOBJECTS WHERE ID = OBJECT_ID(N'[dbo].[myTable]') AND OBJECTPROPERTY(id, N'IsUserTable') = 1)
BEGIN
CREATE TABLE [dbo].[myTable]
(
[myID] [int] IDENTITY(1,1) NOT NULL,
[Name] [varchar](MAX) NOT NULL,
CONSTRAINT [PK_myTable] PRIMARY KEY CLUSTERED
( [myID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON
) ON [PRIMARY] ) ON [PRIMARY]
END
IF NOT EXISTS (SELECT myID, myName FROM myTable WHERE myID = 7 AND myName = 'Pedro')
BEGIN
INSERT INTO myTable (myID , myName) values ( 7, 'Pedro')
END
IF NOT EXISTS (SELECT myID, myName FROM myTable WHERE myID = 8 AND myName = 'Joseph')
BEGIN
INSERT INTO myTable (myID , myName) values ( 8, 'Joseph')
END
Edit: I should also note that this will not work without turning IDENTITY_INSERT ON. Because the myID column is an identity column, inserts that supply explicit values for it will fail.
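A sketch of the two conditional inserts wrapped in IDENTITY_INSERT, as the edit above describes. It assumes the table lives in the dbo schema and that the name column is called myName, as in the INSERT statements (the CREATE TABLE statement calls it Name, so adjust to whichever is correct):

```sql
-- Allow explicit values for the identity column for the duration of the inserts.
SET IDENTITY_INSERT dbo.myTable ON;

IF NOT EXISTS (SELECT 1 FROM dbo.myTable WHERE myID = 7 AND myName = 'Pedro')
    INSERT INTO dbo.myTable (myID, myName) VALUES (7, 'Pedro');

IF NOT EXISTS (SELECT 1 FROM dbo.myTable WHERE myID = 8 AND myName = 'Joseph')
    INSERT INTO dbo.myTable (myID, myName) VALUES (8, 'Joseph');

-- Switch it back off so later inserts get generated identity values again.
SET IDENTITY_INSERT dbo.myTable OFF;
```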

For inserting the records, a cleaner script would be:
IF NOT EXISTS (SELECT 1 FROM myTable WHERE ...)
BEGIN
-- Insert record
END
And for checking whether a table exists (see "Check if table exists in SQL Server"):
IF (NOT EXISTS (SELECT * FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = 'TheSchema' AND TABLE_NAME = 'TheTable'))
BEGIN
-- Create table
END
You might want to switch the order of these statements so you don't query a table that may not exist.
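Putting the existence check first, a compact variant uses OBJECT_ID rather than INFORMATION_SCHEMA. This is a sketch, not the answer's exact method; the column name myName is assumed from the INSERT statements in the question:

```sql
-- OBJECT_ID returns NULL when no user table ('U') with that name exists.
IF OBJECT_ID(N'dbo.myTable', N'U') IS NULL
BEGIN
    CREATE TABLE dbo.myTable (
        myID   INT IDENTITY(1,1) NOT NULL,
        myName VARCHAR(MAX) NOT NULL,
        CONSTRAINT PK_myTable PRIMARY KEY CLUSTERED (myID ASC)
    );
END;
```

Running this block before the conditional inserts means the inserts never query a table that is missing.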
