How to remove duplicate values from a datatable in SQL

I am getting duplicate values:
╔══════╦══════╦═══════╦════════════╦═════════╦═════════╦══════╦═══════╗
║ ID ║ Name ║ Class ║ Date ║ Intime ║ Outtime ║ INAM ║ OUTPM ║
╠══════╬══════╬═══════╬════════════╬═════════╬═════════╬══════╬═══════╣
║ 1001 ║ Paul ║ 1st ║ 29-11-2022 ║ Holiday ║ Holiday ║ H ║ H ║
╠══════╬══════╬═══════╬════════════╬═════════╬═════════╬══════╬═══════╣
║ 1001 ║ Paul ║ 1st ║ 29-11-2022 ║ Holiday ║ Holiday ║ H ║ H ║
╠══════╬══════╬═══════╬════════════╬═════════╬═════════╬══════╬═══════╣
║ 1001 ║ Paul ║ 1st ║ 29-11-2022 ║ Holiday ║ Holiday ║ H ║ H ║
╚══════╩══════╩═══════╩════════════╩═════════╩═════════╩══════╩═══════╝
Code:
SELECT DISTINCT COALESCE(tt.ID,t1.ID) AS ID,
COALESCE(tt.Name,t1.Name) AS Name,
COALESCE(tt.Class,t1.Class) AS Class,tt.Date,
COALESCE(tt.Intime,t1.Intime) AS Intime,
COALESCE(tt.Outtime,t1.Outtime) AS Outtime,
COALESCE(tt.INAM,t1.INAM) AS INAM,
COALESCE(tt.OUTPM,t1.OUTPM) AS OUTPM
FROM stuattrecordAMPM AS t1
CROSS JOIN (SELECT * FROM stuattrecordAMPM UNION ALL
SELECT null,null,null,Date,Holiday_Name,Holiday_Name,Status,Status FROM HolidayList) AS tt
order by [ID]
DELETE FROM stuattrecordAMPM
WHERE Date IS NULL
This code returns duplicate rows. How can I avoid duplicates in the datatable?

You can give a row number to each row grouped by all the columns, then delete the rows having row number greater than 1.
Query
with cte as(
select *, row_number() over(
partition by [id], [name], [class], [date], [intime], [outtime], [inam], [outpm]
order by [id]
) as [rn]
from [your_table_name]
)
delete from cte
where [rn] > 1;
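The same idea can be tried outside SQL Server. Below is a minimal sketch using Python's built-in sqlite3 module; the table name attendance and the helper remove_duplicates are made up for illustration. SQLite cannot delete through a CTE, so the sketch matches rows by rowid instead, and it assumes SQLite 3.25 or later for window function support.

```python
import sqlite3


def remove_duplicates(conn):
    """Delete every row whose ROW_NUMBER inside its duplicate group exceeds 1."""
    conn.execute("""
        DELETE FROM attendance
        WHERE rowid IN (
            SELECT rid FROM (
                SELECT rowid AS rid,
                       ROW_NUMBER() OVER (
                           PARTITION BY id, name, class, date,
                                        intime, outtime, inam, outpm
                           ORDER BY rowid
                       ) AS rn
                FROM attendance
            )
            WHERE rn > 1
        )
    """)


# Hypothetical table holding the three identical rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE attendance "
    "(id, name, class, date, intime, outtime, inam, outpm)"
)
row = (1001, "Paul", "1st", "29-11-2022", "Holiday", "Holiday", "H", "H")
conn.executemany(
    "INSERT INTO attendance VALUES (?, ?, ?, ?, ?, ?, ?, ?)", [row, row, row]
)

remove_duplicates(conn)
remaining = conn.execute("SELECT * FROM attendance").fetchall()
print(remaining)
```

Only one copy of each fully identical row should survive, whichever copy happens to have the lowest rowid.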

Related

cast string to array of structs then look up values in another table SQL

I currently have a string which represents a list of structs in my table. I want to look up values in another table based on the values of elements in the struct.
For example, below, the car info struct is [spare, carType, carColour].
╔═══════════════════════════╗
║ CarInfo ║
╠═══════════════════════════╣
║ “[1,1,1]” ║
║ “[1,2,1] [1,1,2]” ║
║ null ║
║ “[1,2,1] [1,1,2] [1,1,1]” ║
╚═══════════════════════════╝
and I want to look up the table:
╔═══════════╦═══════════════╦═════════════╦═════════════════╦══╗
║ CarTypeId ║ CarTypeString ║ CarColourId ║ CarColourString ║ ║
╠═══════════╬═══════════════╬═════════════╬═════════════════╬══╣
║ 1 ║ "Hyundai" ║ 1 ║ "Red" ║ ║
║ 1 ║ "Hyundai" ║ 2 ║ "Blue" ║ ║
║ 2 ║ "Toyota" ║ 1 ║ "Green" ║ ║
║ 2 ║ "Toyota" ║ 2 ║ "Yellow" ║ ║
╚═══════════╩═══════════════╩═════════════╩═════════════════╩══╝
and obtain the following result:
╔═════════════════════════════════════════════════════╗
║ CarInfo ║
╠═════════════════════════════════════════════════════╣
║ “[1,Hyundai,Red]” ║
║ “[1,Toyota,Green] [1,Hyundai,Blue]” ║
║ null ║
║ “[1,Toyota,Green] [1,Hyundai,Blue] [1,Hyundai,Red]” ║
╚═════════════════════════════════════════════════════╝
I've found out that I can split the strings into arrays with someString.split(CarInfo,' ') but thereafter I'm not sure how to do the cast to struct or the "looped" left join after.
Below is for BigQuery Standard SQL
#standardSQL
SELECT STRING_AGG('[' || spare || ',' || carTypeString || ',' || carColourString || ']', ' ') AS CarInfo
FROM `project.dataset.cars` t
LEFT JOIN UNNEST(SPLIT(CarInfo, ' ')) info,
UNNEST([STRUCT(
SPLIT(TRIM(info, '[]'))[OFFSET(0)] AS spare,
CAST(SPLIT(TRIM(info, '[]'))[OFFSET(1)] AS INT64) AS carTypeId,
CAST(SPLIT(TRIM(info, '[]'))[OFFSET(2)] AS INT64) AS carColourId
)])
LEFT JOIN `project.dataset.lookup` l
USING(carTypeId, carColourId)
GROUP BY FORMAT('%t', t)
Applied to the sample data from your question, as in the example below:
#standardSQL
WITH `project.dataset.cars` AS (
SELECT '[1,1,1]' CarInfo UNION ALL
SELECT '[1,2,1] [1,1,2]' UNION ALL
SELECT NULL UNION ALL
SELECT '[1,2,1] [1,1,2] [1,1,1]'
), `project.dataset.lookup` AS (
SELECT 1 CarTypeId, 'Hyundai' CarTypeString, 1 CarColourId, 'Red' CarColourString UNION ALL
SELECT 1, 'Hyundai', 2, 'Blue' UNION ALL
SELECT 2, 'Toyota', 1, 'Green' UNION ALL
SELECT 2, 'Toyota', 2, 'Yellow'
)
SELECT STRING_AGG('[' || spare || ',' || carTypeString || ',' || carColourString || ']', ' ') AS CarInfo
FROM `project.dataset.cars` t
LEFT JOIN UNNEST(SPLIT(CarInfo, ' ')) info,
UNNEST([STRUCT(
SPLIT(TRIM(info, '[]'))[OFFSET(0)] AS spare,
CAST(SPLIT(TRIM(info, '[]'))[OFFSET(1)] AS INT64) AS carTypeId,
CAST(SPLIT(TRIM(info, '[]'))[OFFSET(2)] AS INT64) AS carColourId
)])
LEFT JOIN `project.dataset.lookup` l
USING(carTypeId, carColourId)
GROUP BY FORMAT('%t', t)
The output is:
Row CarInfo
1 [1,Hyundai,Red]
2 [1,Toyota,Green] [1,Hyundai,Blue]
3 null
4 [1,Toyota,Green] [1,Hyundai,Blue] [1,Hyundai,Red]
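For readers who want to check the split, trim, and lookup steps outside BigQuery, here is a rough Python sketch of the same transformation. The LOOKUP dictionary and the function translate_car_info are hypothetical names, and unlike the LEFT JOIN in the query, this sketch raises a KeyError when a (type, colour) pair is missing from the lookup.

```python
# Hypothetical in-memory copy of the lookup table,
# keyed by (CarTypeId, CarColourId).
LOOKUP = {
    (1, 1): ("Hyundai", "Red"),
    (1, 2): ("Hyundai", "Blue"),
    (2, 1): ("Toyota", "Green"),
    (2, 2): ("Toyota", "Yellow"),
}


def translate_car_info(car_info):
    """Replace the type and colour ids in each '[spare,type,colour]' token."""
    if car_info is None:
        return None  # mirrors the NULL row in the sample data
    parts = []
    for token in car_info.split(" "):
        # '[1,2,1]' -> ['1', '2', '1']
        spare, type_id, colour_id = token.strip("[]").split(",")
        type_name, colour_name = LOOKUP[(int(type_id), int(colour_id))]
        parts.append(f"[{spare},{type_name},{colour_name}]")
    return " ".join(parts)


for sample in ["[1,1,1]", "[1,2,1] [1,1,2]", None]:
    print(translate_car_info(sample))
```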

Select maximum/minimum with another column

Is there a way to select the maximum of a value together with another column without using TOP and ORDER BY?
Assume we have a list of people and their ages, and we want the oldest/youngest. I want to select the name and the age. Grouping by name does not work, because it returns the maximum age per name rather than the single oldest person.
SELECT nom,
max(age)
from Agents
group by nom
╔════════╦═════╗
║ Name ║ Age ║
╠════════╬═════╣
║ John ║ 200 ║
║ Bob ║ 150 ║
║ GSkill ║ 300 ║
║ Smith ║ 250 ║
║ John ║ 400 ║
║ Zid ║ 300 ║
║ Wick ║ 250 ║
║ Smith ║ 140 ║
╚════════╩═════╝
You could use ROW_NUMBER or DENSE_RANK. For example, if you have to show those employees having the MIN and MAX salary then you could use following SQL statement:
SELECT x.Name, x.Salary,
IIF(x.RowNumMIN = 1, 1, 0) AS IsMin,
IIF(x.RowNumMAX = 1, 1, 0) AS IsMax
FROM (
SELECT x.Name, x.Salary,
ROW_NUMBER() OVER(ORDER BY x.Salary ASC) AS RowNumMIN,
ROW_NUMBER() OVER(ORDER BY x.Salary DESC) AS RowNumMAX
FROM dbo.SourceTable AS x
) AS x
WHERE x.RowNumMIN = 1 OR x.RowNumMAX = 1
If two or more people share the same min or max salary and you have to show all of them, you could use the DENSE_RANK function instead of ROW_NUMBER.
Try this query --
;WITH CTE
AS (
SELECT [NAME]
,AGE
,DENSE_RANK() OVER (
ORDER BY AGE DESC
) AS Older
,DENSE_RANK() OVER (
ORDER BY AGE ASC
) AS Younger
FROM tblSample
)
SELECT [NAME] + ': ' + CAST(AGE AS VARCHAR(50))
FROM CTE
WHERE Older = 1 OR Younger = 1
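As a quick way to check the DENSE_RANK approach against the sample data in the question, here is a small sketch using Python's sqlite3 module. The table name agents is an assumption, and SQLite 3.25 or later is required for window functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE agents (name TEXT, age INTEGER)")  # hypothetical table
conn.executemany(
    "INSERT INTO agents VALUES (?, ?)",
    [
        ("John", 200), ("Bob", 150), ("GSkill", 300), ("Smith", 250),
        ("John", 400), ("Zid", 300), ("Wick", 250), ("Smith", 140),
    ],
)

# Rank once from the oldest and once from the youngest, then keep rank 1
# from either side. DENSE_RANK keeps every row that ties for first place.
extremes = conn.execute("""
    WITH ranked AS (
        SELECT name, age,
               DENSE_RANK() OVER (ORDER BY age DESC) AS older,
               DENSE_RANK() OVER (ORDER BY age ASC)  AS younger
        FROM agents
    )
    SELECT name, age
    FROM ranked
    WHERE older = 1 OR younger = 1
    ORDER BY age
""").fetchall()

print(extremes)
```

With this data there are no ties at either extreme, so the query returns one row for the youngest (Smith, 140) and one for the oldest (John, 400).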

How can I merge 3 distinct Sqls by the same grouping but different columns (Sql Server Stored Procedure)

I have 3 distinct queries in a Stored Procedure in Sql Server. I need to merge the results grouping by
"Date, Team, Account", and having the columns:
(Query1.NumberUnits + Query2.NumberUnits) AS TotalUnits,
(Query2.NumberCartons) AS TotalCartons,
(Query3.TotalPallets) AS TotalPallets
My queries are fairly complex, so I have not posted them here to keep
things simple. I need something like MERGE, UNION ALL, or even
temporary tables, but I don't know how to use them in this case.
Query 1
╔═══════════╦════════╦═══════════╦════════════════╦═════════════╗
║ Date ║ TeamId ║ AccountId ║ TransactionQty ║ NumberUnits ║
╠═══════════╬════════╬═══════════╬════════════════╬═════════════╣
║ 8/12/2014 ║ 4 ║ 1989 ║ 4 ║ 4 ║
╚═══════════╩════════╩═══════════╩════════════════╩═════════════╝
Query 2
╔═══════════╦════════╦═══════════╦════════════════╦═══════════════╦═════════════╗
║ Date ║ TeamId ║ AccountId ║ TransactionQty ║ NumberCartons ║ NumberUnits ║
╠═══════════╬════════╬═══════════╬════════════════╬═══════════════╬═════════════╣
║ 8/12/2014 ║ 4 ║ 1989 ║ 6 ║ 6 ║ 1 ║
╚═══════════╩════════╩═══════════╩════════════════╩═══════════════╩═════════════╝
Query 3
╔═══════════╦════════╦═══════════╦══════════════╗
║ Date ║ TeamId ║ AccountId ║ TotalPallets ║
╠═══════════╬════════╬═══════════╬══════════════╣
║ 8/12/2014 ║ 5 ║ 2000 ║ 2 ║
║ 9/12/2014 ║ 4 ║ 1989 ║ 1 ║
╚═══════════╩════════╩═══════════╩══════════════╝
Query Result
╔═══════════╦════════╦═══════════╦════════════╦══════════════╦══════════════╗
║ Date ║ TeamId ║ AccountId ║ TotalUnits ║ TotalCartons ║ TotalPallets ║
╠═══════════╬════════╬═══════════╬════════════╬══════════════╬══════════════╣
║ 8/12/2014 ║ 4 ║ 1989 ║ 5 ║ 6 ║ 0 ║
║ 8/12/2014 ║ 5 ║ 2000 ║ 0 ║ 0 ║ 2 ║
║ 9/12/2014 ║ 4 ║ 1989 ║ 0 ║ 0 ║ 1 ║
╚═══════════╩════════╩═══════════╩════════════╩══════════════╩══════════════╝
You can do this with either full outer join or with union all and group by. Here is the union all method:
with q1 as (<query1>),
q2 as (<query2>),
q3 as (<query3>)
select date, TeamId, AccountId,
sum(NumberUnits) as TotalUnits,
sum(NumberCartons) as TotalCartons,
sum(TotalPallets) as TotalPallets
from ((select date, TeamId, AccountId, NumberUnits, 0 as NumberCartons, 0 as TotalPallets
from q1
) union all
(select date, TeamId, AccountId, NumberUnits, NumberCartons, 0 as TotalPallets
from q2
) union all
(select date, TeamId, AccountId, 0 as NumberUnits, 0 as NumberCartons, TotalPallets
from q3
)
) qqq
group by date, TeamId, AccountId
order by date, TeamId, AccountId;
Create table
DECLARE @q1 TABLE ([Date] DATE, TeamId INT, AccountId INT, TransactionQty INT, NumberUnits INT)
DECLARE @q2 TABLE ([Date] DATE, TeamId INT, AccountId INT, TransactionQty INT, NumberCartons INT, NumberUnits INT)
DECLARE @q3 TABLE ([Date] DATE, TeamId INT, AccountId INT, TotalPallets INT)
Sample data
INSERT INTO @q1 VALUES ('8/12/2014', 4, 1989, 4, 4)
INSERT INTO @q2 VALUES ('8/12/2014', 4, 1989, 6, 6, 1)
INSERT INTO @q3 VALUES ('8/12/2014', 5, 2000, 2)
,('9/12/2014', 4, 1989, 1)
Query
SELECT [Date], TeamId, AccountId,
ISNULL(SUM(NumberUnits), 0) AS TotalUnits,
ISNULL(SUM(NumberCartons), 0) AS TotalCartons,
ISNULL(SUM(TotalPallets), 0) AS TotalPallets
FROM (
SELECT [Date], TeamId, AccountId, NULL AS NumberCartons, NumberUnits, NULL AS TotalPallets FROM @q1
UNION ALL
SELECT [Date], TeamId, AccountId, NumberCartons, NumberUnits, NULL FROM @q2
UNION ALL
SELECT [Date], TeamId, AccountId, NULL, NULL, TotalPallets FROM @q3
) AS t
GROUP BY [Date], TeamId, AccountId
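To see the UNION ALL plus GROUP BY pattern produce the expected result from the question, here is a minimal sketch using Python's sqlite3 module. The lowercase table and column names are illustrative stand-ins for the three queries, the TransactionQty column is left out because the result does not use it, and the dates are kept as plain strings exactly as they appear in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables standing in for the results of the three queries.
    CREATE TABLE q1 (date TEXT, team_id INT, account_id INT, number_units INT);
    CREATE TABLE q2 (date TEXT, team_id INT, account_id INT,
                     number_cartons INT, number_units INT);
    CREATE TABLE q3 (date TEXT, team_id INT, account_id INT, total_pallets INT);

    INSERT INTO q1 VALUES ('8/12/2014', 4, 1989, 4);
    INSERT INTO q2 VALUES ('8/12/2014', 4, 1989, 6, 1);
    INSERT INTO q3 VALUES ('8/12/2014', 5, 2000, 2), ('9/12/2014', 4, 1989, 1);
""")

# Pad each source to the same column list with zeros, stack the rows,
# then sum per (date, team, account).
merged = conn.execute("""
    SELECT date, team_id, account_id,
           SUM(number_units)   AS total_units,
           SUM(number_cartons) AS total_cartons,
           SUM(total_pallets)  AS total_pallets
    FROM (
        SELECT date, team_id, account_id, number_units,
               0 AS number_cartons, 0 AS total_pallets FROM q1
        UNION ALL
        SELECT date, team_id, account_id, number_units,
               number_cartons, 0 FROM q2
        UNION ALL
        SELECT date, team_id, account_id, 0, 0, total_pallets FROM q3
    ) AS stacked
    GROUP BY date, team_id, account_id
    ORDER BY date, team_id, account_id
""").fetchall()

for line in merged:
    print(line)
```

Padding with 0 rather than NULL means the sums never come back as NULL, so no ISNULL or COALESCE wrapper is needed in this variant.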

How to sort a column based on length of data in it in SQL server

As we all know, general sorting is done with ORDER BY. The sort I want to perform is different: I want the shortest values in the middle of the table and the longest ones at the top and bottom. One half should be descending and the other half ascending. Can you help? It was an interview question.
This is one way:
;WITH CTE AS
(
SELECT *,
RN = ROW_NUMBER() OVER(ORDER BY LEN(YourColumn))
FROM dbo.YourTable
)
SELECT *
FROM CTE
ORDER BY RN%2, (CASE WHEN RN%2 = 0 THEN 1 ELSE -1 END)*RN DESC
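To make the ordering logic of that query easier to follow, here is a small Python sketch of the same parity trick. The function name valley_order is made up for illustration: rows are ranked by length, the even-ranked rows come first from longest to shortest, and the odd-ranked rows follow from shortest to longest.

```python
def valley_order(values):
    """Longest values at both ends, shortest in the middle."""
    # Rank 1 is the shortest value, like ROW_NUMBER() OVER (ORDER BY LEN(...)).
    ranked = sorted(values, key=len)
    evens = [v for rn, v in enumerate(ranked, start=1) if rn % 2 == 0]
    odds = [v for rn, v in enumerate(ranked, start=1) if rn % 2 == 1]
    # Even ranks descending (RN % 2 = 0 sorts first), then odd ranks ascending.
    return evens[::-1] + odds


ordered = valley_order(["A", "AB", "ABC", "ABCD", "ABCDE"])
print(ordered)
```

Because the two halves are built from alternating ranks, both ends receive values of similar length, rather than one half holding all the long values.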
Test Data
DECLARE @Table TABLE
(ID INT, Value VARCHAR(10))
INSERT INTO @Table VALUES
(1 , 'A'),
(2 , 'AB'),
(3 , 'ABC'),
(4 , 'ABCD'),
(5 , 'ABCDE'),
(6 , 'ABCDEF'),
(7 , 'ABCDEFG'),
(8 , 'ABCDEFGI'),
(9 , 'ABCDEFGIJ'),
(10 ,'ABCDEFGIJK')
Query
;WITH CTE AS (
SELECT *
,NTILE(2) OVER (ORDER BY LEN(Value) DESC) rn
FROM @Table )
SELECT *
FROM CTE
ORDER BY CASE WHEN rn = 1 THEN LEN(Value) END DESC
,CASE WHEN rn = 2 THEN LEN(Value) END ASC
Result
╔════╦════════════╦════╗
║ ID ║ Value ║ rn ║
╠════╬════════════╬════╣
║ 10 ║ ABCDEFGIJK ║ 1 ║
║ 9 ║ ABCDEFGIJ ║ 1 ║
║ 8 ║ ABCDEFGI ║ 1 ║
║ 7 ║ ABCDEFG ║ 1 ║
║ 6 ║ ABCDEF ║ 1 ║
║ 1 ║ A ║ 2 ║
║ 2 ║ AB ║ 2 ║
║ 3 ║ ABC ║ 2 ║
║ 4 ║ ABCD ║ 2 ║
║ 5 ║ ABCDE ║ 2 ║
╚════╩════════════╩════╝
Here's a short approach that should get you started:
WITH cte AS
(
SELECT TOP 1000 number
FROM master..spt_values
WHERE type = 'P' and number >0
)
SELECT number, row_number() OVER(ORDER BY CASE WHEN number %2 = 1 THEN number ELSE -(number) END) pos
FROM cte

SQL - Group rows via criteria until exception is found

I am trying to add a Group column to a data set based on some criteria. For a simple example:
╔════╦══════╗
║ ID ║ DATA ║
╠════╬══════╣
║ 1 ║ 12 ║
║ 2 ║ 20 ║
║ 3 ║ 3 ║
║ 4 ║ 55 ║
║ 5 ║ 11 ║
╚════╩══════╝
Let's say our criteria is that the Data should be greater than 10. Then the result should be similar to:
╔════╦══════╦═══════╗
║ ID ║ DATA ║ GROUP ║
╠════╬══════╬═══════╣
║ 1 ║ 12 ║ 1 ║
║ 2 ║ 20 ║ 1 ║
║ 3 ║ 3 ║ 2 ║
║ 4 ║ 55 ║ 3 ║
║ 5 ║ 11 ║ 3 ║
╚════╩══════╩═══════╝
So, all the rows that satisfied the criteria until an exception to the criteria occurred became part of a group. The numbering of the group doesn't necessarily need to follow this pattern, I just felt like this was a logical/simple numbering to explain the solution I am looking for.
You can calculate the group identifier by finding each row where data <= 10. Then, the group identifier is simply the number of rows where that condition is true, before the given row.
select t.*,
(select count(*)
from t t2
where t2.id <= t.id and
t2.data <= 10
) as groupId
from t;
SQL Server 2012 has cumulative sum syntax. The statement would be simpler in that database:
select t.*,
sum(case when t.data <= 10 then 1 else 0 end) over (order by id) as groupId
from t;
EDIT:
The above does not take into account that the values less than 10 are in their own group. The logic above is that they start a new group.
The following assigns a group id with this constraint:
select t.*,
((select 2*count(*)
from t t2
where t2.id < t.id and
t2.data <= 10
) + (case when t.data <= 10 then 1 else 0 end)
) as groupId
from t;
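The sketch below runs this last query against the sample data with Python's sqlite3 module. It assumes the final CASE is meant to test the data column, since the criterion in the question is on Data and not on ID. The group ids it produces start at 0 rather than 1, which the question allows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, data INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, 12), (2, 20), (3, 3), (4, 55), (5, 11)],
)

# Each earlier exception row (data <= 10) advances the id by 2: one step
# for the exception's own group and one for the group that follows it.
# An exception row itself adds 1 so that it sits alone in its group.
groups = conn.execute("""
    SELECT t.id, t.data,
           (SELECT 2 * COUNT(*)
            FROM t AS t2
            WHERE t2.id < t.id AND t2.data <= 10)
           + (CASE WHEN t.data <= 10 THEN 1 ELSE 0 END) AS group_id
    FROM t
    ORDER BY t.id
""").fetchall()

for line in groups:
    print(line)
```

Rows 1 and 2 share a group, row 3 is alone, and rows 4 and 5 share the next group, which matches the grouping asked for in the question.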
This can be done easily with a recursive query:
;WITH CTE
AS (SELECT *,
1 AS [GROUP]
FROM TABLEB
WHERE ID = 1
UNION ALL
SELECT T1.ID,
T1.DATA,
CASE
WHEN T1.DATA <= 10 THEN T2.[GROUP] + 1
ELSE T2.[GROUP]
END [GROUP]
FROM TABLEB T1
INNER JOIN CTE T2
ON T1.ID = T2.ID + 1)
SELECT *
FROM CTE
A working example can be found on SQL Fiddle.
Good Luck!