How to find minimum values in a column in SQL

If I have a table like this:
id name value
1 abc 1
2 def 4
3 ghi 1
4 jkl 2
How can I select a new table that still has id, name, and value, but contains only the rows with the minimum value?
In this example I need this table back:
1 abc 1
3 ghi 1

Finding those values is pretty straightforward:
SELECT *
FROM YourTable
WHERE value = (SELECT MIN(Value) FROM YourTable);
As for the right syntax for putting those rows in another table, that will depend on the database engine that you are using.
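As a rough sketch of how that syntax differs between engines, the two most common forms are shown below. The target table name MinRows is a made-up placeholder, and you would run only the variant that matches your engine:

```sql
-- SQL Server: SELECT ... INTO creates the target table on the fly.
-- MinRows is a hypothetical name used only for illustration.
SELECT id, name, value
INTO MinRows
FROM YourTable
WHERE value = (SELECT MIN(value) FROM YourTable);

-- PostgreSQL, MySQL, SQLite, Oracle: CREATE TABLE ... AS SELECT.
CREATE TABLE MinRows AS
SELECT id, name, value
FROM YourTable
WHERE value = (SELECT MIN(value) FROM YourTable);
```

Both forms create a new table, so they fail if MinRows already exists; in that case an INSERT INTO ... SELECT is the usual choice.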

An alternative to @Lamak's solution could be to use the RANK window function. Depending on the exact scenario, it may perform considerably better:
SELECT id, name, value
FROM (SELECT id, name, value, RANK() OVER (ORDER BY value ASC) AS rk
FROM mytable) t
WHERE rk = 1

Not sure exactly if this is what you're trying to do, but I think this would work:
--creating #temp1 to recreate your table/example
CREATE TABLE #TEMP1
(id INT NOT NULL PRIMARY KEY,
name CHAR(3) NOT NULL,
value INT NOT NULL)
INSERT INTO #TEMP1
VALUES
(1,'abc',1),
(2,'def',4),
(3,'ghi',1),
(4,'jkl',2)
--verify correct
SELECT * FROM #temp1
--populate new table with min value from table 1
SELECT *
INTO #TEMP2
FROM #TEMP1
WHERE value = (SELECT MIN(value)
FROM #TEMP1)
SELECT * FROM #TEMP2

Related

Finding Occurrence of the duplicate values

I have a table with 3 columns (id, Name, Occurrence). I want to update the Occurrence column based on the id column (a screenshot was attached for reference).
For example, if the id column contains the value "606" 3 times, then the Occurrence column should contain 3 for every row with id "606".
Below is the method which I tried.
I tried to find the duplicate values using GROUP BY and a HAVING clause, saved them in a temp table, and then tried to join the original table to that temp table.
You can use window functions in an updatable CTE for this.
You haven't supplied any actual sample data so this is untested, however the following should work:
with x as (
select Id, Occurrence, count(*) over(partition by Id) as qty
from YourTable -- placeholder name; TABLE itself is a reserved word
)
update x
set Occurrence = qty;
You can also use a GROUP BY based approach.
declare @Table TABLE(ID INT, NAME CHAR(3), occurance int null)
insert into @Table VALUES
(1,'AAA',NULL),(1,'AAA',NULL),(2,'CCC',NULL),(3,'DDD',NULL), (3,'DDD',NULL),(4,'EEE',NULL),(5,'FFF',NULL);
;WITH CTE_Table as
(
SELECT ID, COUNT(*) AS Occurance
FROM @Table
group by id
)
UPDATE t
SET occurance = c.occurance
FROM @Table t
INNER JOIN CTE_Table as c
on C.ID = T.ID
SELECT * FROM @Table
ID NAME occurance
1 AAA 2
1 AAA 2
2 CCC 1
3 DDD 2
3 DDD 2
4 EEE 1
5 FFF 1
You can use a CTE to count the rows for each Id and update your table from it:
;WITH q
AS
(
SELECT Id, COUNT(1) AS RowCnt
FROM YourTable
GROUP BY Id
)
UPDATE t
SET Occurrence = q.RowCnt
FROM YourTable t
INNER JOIN q
ON t.Id = q.Id

Output the results of several SELECT statements to an excel sheet in their own columns

I have a query that I want to turn into a stored proc which has, right now, about 6 select statements in it of similar data. Each one just brings back phone numbers in one column except each of the columns is named differently.
Basically it is:
SELECT PhoneNumber as PhoneGroup1 FROM PhoneNumberTable
SELECT PhoneNumber as PhoneGroup2 FROM PhoneNumberTable
SELECT PhoneNumber as PhoneGroup3 FROM PhoneNumberTable
It is actually more complex than that, but those are the results I get in a nutshell.
I then will go and copy/paste each column and its header name into a spreadsheet into Column A for PhoneGroup1, Column B for PhoneGroup2, etc.
PhoneGroup1 | PhoneGroup2 | PhoneGroup3
4856562281 | 9498675309 | 6238471273
7452837719 | 5739542855 | 4745856147
8472639273 | 6495232247 | 9516538847
Is there any way I can have this export to an excel sheet?
Thank you guys for any guidance!
I think I understand what you're trying to do. Do you have something like this:
declare @tbl1 table ( id int )
declare @tbl2 table ( id int )
insert into @tbl1 values(1),(2),(3)
insert into @tbl2 values(10),(20),(30)
select * from @tbl1
union
select * from @tbl2
which returns this result set:
id
----
1
2
3
10
20
30
but you really want this result set?
id1 id2
---- ----
1 10
2 20
3 30
I can see a way to do this using row numbers. Basically, you give each row returned from the individual tables a row number, and then you join the tables together matching on the row numbers. It looks like this in my example:
declare @tbl1 table ( id int )
declare @tbl2 table ( id int )
insert into @tbl1 values(1),(2),(3)
insert into @tbl2 values(10),(20),(30)
select t1.id as id1, t2.id as id2
from
(
select 'table1' as header, id, row_number() over (order by id) rnum
from @tbl1 t1
) t1
inner join
(
select 'table2' as header, id, row_number() over (order by id) rnum
from @tbl2 t2
) t2 on t1.rnum = t2.rnum
To add a column, add another join to the query. If your tables have different numbers of rows and you want to see all rows, use full outer joins instead of inner joins.
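Here is a minimal sketch of the full outer join variant with a third column, assuming SQL Server. The third table variable @tbl3 and the uneven row counts are made up for illustration:

```sql
DECLARE @tbl1 TABLE (id INT);
DECLARE @tbl2 TABLE (id INT);
DECLARE @tbl3 TABLE (id INT);  -- hypothetical third result set
INSERT INTO @tbl1 VALUES (1),(2),(3);
INSERT INTO @tbl2 VALUES (10),(20);
INSERT INTO @tbl3 VALUES (100),(200),(300),(400);

SELECT t1.id AS id1, t2.id AS id2, t3.id AS id3
FROM (SELECT id, ROW_NUMBER() OVER (ORDER BY id) AS rnum FROM @tbl1) t1
FULL OUTER JOIN
     (SELECT id, ROW_NUMBER() OVER (ORDER BY id) AS rnum FROM @tbl2) t2
     ON t1.rnum = t2.rnum
FULL OUTER JOIN
     (SELECT id, ROW_NUMBER() OVER (ORDER BY id) AS rnum FROM @tbl3) t3
     -- either earlier side may be NULL, so match on whichever row number exists
     ON t3.rnum = COALESCE(t1.rnum, t2.rnum)
ORDER BY COALESCE(t1.rnum, t2.rnum, t3.rnum);
```

The COALESCE in the second join condition matters: once the first full outer join can produce NULLs on either side, joining the third set on t1.rnum alone would split rows that only exist in the second set.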

Recursive Update Statement

I need to create a recursive update statement that updates from another table, for example:
Table1
(
IdNumberGeneratedFromAService INT NOT NULL,
CodeName NVARCHAR(MAX)
)
Table2
(
Table2Id Auto_Increment,
Name NVARCHAR(MAX),
IdNumberThatComesFromTabl1,
CodeNameForTable1ToMatch
)
The issue is that CodeNameForTable1ToMatch is not unique. If Table1 has two id numbers for the same code and Table2 has two rows with that CodeName, I want to update the rows in Table2 in sequence, so the first row gets the first id number and the second row gets the second id number.
I also want to do it without a cursor.
SAMPLE DATA
Table1
idNumber Code
C145-6678-90 Code1
C145-6678-91 Code1
C145-6678-92 Code1
C145-6678-93 Code1
C145-6678-94 Code1
Table 2
AutoIncrementIdNumber Code IdNumber
1 Code1 {NULL}
2 Code1 {NULL}
3 Code1 {NULL}
4 Code1 {NULL}
5 Code1 {NULL}
C145-6678-90 needs to go to row 1
C145-6678-91 needs to go to row 2
C145-6678-92 needs to go to row 3
C145-6678-93 needs to go to row 4
C145-6678-94 needs to go to row 5
all in one update statement.
Using the ROW_NUMBER windowing function on each of the tables, partitioned by the code, you can number the rows that share a code, then join the two numbered results on both the code and the row number. The first Code A in Table 1 is then matched to the first Code A in Table 2, and so on.
Sample code showing this (SQL 2005 or higher):
-- Sample code prep
CREATE TABLE #Table1
(
IdNumberGeneratedFromAService INT NOT NULL,
CodeName NVARCHAR(MAX)
);
CREATE TABLE #Table2
(
Table2Id INT NOT NULL IDENTITY(1,1),
Name NVARCHAR(MAX),
IdNumberThatComesFromTabl1 INT NULL,
CodeNameForTable1ToMatch NVARCHAR(MAX)
);
INSERT INTO #Table1(IdNumberGeneratedFromAService, CodeName)
VALUES(100,'Code A'),(150,'Code A'),(200,'Code B'),(250,'Code A'),(300,'Code C'),(400,'Nonexistent');
INSERT INTO #Table2(Name, IdNumberThatComesFromTabl1, CodeNameForTable1ToMatch)
VALUES('A1-100',0,'Code A'),('A2-150',0,'Code A'),('A3-250',0,'Code A'),('B1-200',0,'Code B'),('C1-300',0,'Code C'),('No Id For Me',0,'Code No Id :(');
-- Sample select statement that shows the row numbers
--SELECT *
--FROM
-- (SELECT *, ROW_NUMBER() OVER (Partition By IT2.CodeNameForTable1ToMatch Order By IT2.Table2Id) as RowNum
-- FROM #Table2 IT2) T2
-- INNER JOIN
-- (SELECT *, ROW_NUMBER() OVER (Partition By IT1.CodeName Order By IT1.IdNumberGeneratedFromAService) as RowNum
-- FROM #Table1 IT1) T1
-- ON T1.CodeName = T2.CodeNameForTable1ToMatch AND T1.RowNum = T2.RowNum;
-- Table 2 Before
SELECT * FROM #Table2;
-- Actual update statement
-- Table 2 rows are numbered by Table2Id, as in the sample select above, so the order is deterministic
UPDATE AT2
SET IdNumberThatComesFromTabl1 = T1.IdNumberGeneratedFromAService
FROM #Table2 AT2
INNER JOIN
(SELECT *, ROW_NUMBER() OVER (Partition By IT2.CodeNameForTable1ToMatch Order By IT2.Table2Id) as RowNum
FROM #Table2 IT2) T2
ON T2.Table2Id = AT2.Table2Id
INNER JOIN
(SELECT *, ROW_NUMBER() OVER (Partition By IT1.CodeName Order By IT1.IdNumberGeneratedFromAService) as RowNum
FROM #Table1 IT1) T1
ON T1.CodeName = T2.CodeNameForTable1ToMatch AND T1.RowNum = T2.RowNum;
-- Table 2 after
SELECT * FROM #Table2;
-- Cleanup
DROP TABLE #Table1;
DROP TABLE #Table2;
I turned your two sample tables into temp tables and added 3 records for 'Code A', a record for 'Code B', and a record for 'Code C'. The codes in Table 1 are numbered based on the order of the Table 1 ID; the codes in Table 2 are ordered by the auto-incrementing Table 2 id. I also included a record in each table that wouldn't have a match in the other. I tried to make the codes descriptive so it would be easier to see that a correct match has occurred (the order for Table 2 is important since it has an auto-incrementing id).
The commented out sample select is there to help understand how the select works before I join it into the UPDATE statement.
So we can see before the update Table 2 is all 0's, then we update the values in table 2 where the unique table 2 id matches the unique table 2 id from our nicely numbered and matched join, then we select from table 2 again to see the results.
A riff on Tarwn's solution:
with cte1 as (
select code, idNumber, row_number() over (partition by code order by idNumber) as [rn]
from table1
), cte2 as (
select code, idNumber, row_number() over (partition by code order by AutoIncrementIdNumber) as [rn]
from table2
)
update cte2
set idNumber = cte1.idNumber
from cte2
inner join cte1
on cte2.code = cte1.code
and cte2.rn = cte1.rn
I only present this because people are often amazed that you can update a common table expression.
One answer claimed this isn't possible without a cursor, but the ROW_NUMBER-based updates above do it in a single statement.

Checking for duplicate data in SQL Server

Please don't ask me why but there is a lot of duplicate data where every field is duplicated.
For example
alex, 1
alex, 1
liza, 32
hary, 34
I need to eliminate one of the alex, 1 rows from this table.
I know this algorithm will be very inefficient, but it does not matter. I need to remove the duplicate data.
What is the best way to do this? Please keep in mind I do not have 2 fields, I actually have about 10 fields to check on.
As you said, yes, this will be very inefficient, but you can try something like:
DECLARE @TestTable TABLE(
Name VARCHAR(20),
SomeVal INT
)
INSERT INTO @TestTable SELECT 'alex', 1
INSERT INTO @TestTable SELECT 'alex', 1
INSERT INTO @TestTable SELECT 'liza', 32
INSERT INTO @TestTable SELECT 'hary', 34
SELECT *
FROM @TestTable
;WITH DuplicateVals AS (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY Name, SomeVal ORDER BY (SELECT NULL)) RowID
FROM @TestTable
)
DELETE FROM DuplicateVals WHERE RowID > 1
SELECT *
FROM @TestTable
I understand this does not answer the specific question (eliminating dupes in the SAME table), but I'm offering this solution because it is very fast and might work best for the author.
Speedy solution: if you don't mind creating a new table, create one with the same schema named NewTable.
Execute this SQL:
Insert into NewTable
Select
name,
num
from
OldTable
group by
name,
num
Just include every field name in both the select and group by clauses.
Method A. You can get a deduped version of your data using
SELECT field1, field2, ...
INTO Deduped
FROM Source
GROUP BY field1, field2, ...
for example, for your sample data,
SELECT name, number
FROM Source
GROUP BY name, number
yields
alex 1
hary 34
liza 32
then simply delete the old table, and rename the new one. Of course, there are a number of fancy in-place solutions, but this is the clearest way to do it.
Method B. An in-place method is to create a primary key and delete duplicates that way. For example, you can
ALTER TABLE Source ADD sid INT IDENTITY(1,1);
which makes Source look like this
alex 1 1
alex 1 2
liza 32 3
hary 34 4
then you can use
DELETE FROM Source
WHERE sid NOT IN
(SELECT MIN(sid)
FROM Source
GROUP BY name, number)
which will give the desired result. Of course, "NOT IN" is not exactly the most efficient, but it will do the job. Alternatively, you can LEFT JOIN the grouped table (maybe stored in a TEMP table), and do the DELETE that way.
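The LEFT JOIN variant mentioned above could look roughly like this in SQL Server. It assumes Source already has the sid identity column; the temp table #Keep and the alias keep_sid are hypothetical names:

```sql
-- Collect the one sid to keep for each distinct (name, number) pair.
SELECT MIN(sid) AS keep_sid
INTO #Keep
FROM Source
GROUP BY name, number;

-- Delete every row whose sid is not in the keep list.
DELETE s
FROM Source AS s
LEFT JOIN #Keep AS k
       ON k.keep_sid = s.sid
WHERE k.keep_sid IS NULL;   -- no match means the row is a duplicate

DROP TABLE #Keep;
```

Because keep_sid comes from MIN(sid) on a non-nullable identity column, the IS NULL test only ever flags unmatched rows.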
create table DuplicateTable(name varchar(10), number int)
insert DuplicateTable
values
('alex', 1),
('alex', 1),
('liza', 32),
('hary', 34);
with cte
as
(
select *, row_number() over(partition by name, number order by name) RowNumber
from DuplicateTable
)
delete cte
where RowNumber > 1
A slightly different solution, which requires a primary key (or unique index):
Suppose you have a table your_table(id - PK, name, num):
DELETE t2
FROM your_table AS t2
WHERE
(select COUNT(*) FROM your_table y
where t2.name = y.name and t2.num = y.num) > 1
AND t2.id !=
(SELECT top 1 z.id FROM your_table z
WHERE t2.name = z.name and t2.num = z.num
ORDER BY z.id); -- ORDER BY makes the kept row deterministic
I assumed that name and num are NOT NULL; if they can contain NULL values, you need to change the WHERE clauses in the sub-queries.
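If the columns are nullable, one possible way to adjust the comparison is sketched below. It keeps the same your_table(id, name, num) layout but swaps in MIN(id) for the TOP 1 lookup, which is my own simplification rather than part of the answer above:

```sql
-- NULL-safe duplicate removal: two NULLs are treated as equal.
DELETE t2
FROM your_table AS t2
WHERE t2.id <>
      (SELECT MIN(z.id)
       FROM your_table AS z
       WHERE (z.name = t2.name OR (z.name IS NULL AND t2.name IS NULL))
         AND (z.num  = t2.num  OR (z.num  IS NULL AND t2.num  IS NULL)));
```

Rows that are alone in their group have id equal to the group minimum, so they are left untouched without a separate COUNT(*) check.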

Select DISTINCT, return entire row

I have a table with 10 columns.
I want to return all rows for which Col006 is distinct, but return all columns...
How can I do this?
if column 6 appears like this:
| Column 6 |
| item1 |
| item1 |
| item2 |
| item1 |
I want to return two rows, one of the records with item1 and the other with item2, along with all other columns.
In SQL Server 2005 and above:
;WITH q AS
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY col6 ORDER BY id) rn
FROM mytable
)
SELECT *
FROM q
WHERE rn = 1
In SQL Server 2000, provided that you have a primary key column:
SELECT mt.*
FROM (
SELECT DISTINCT col6
FROM mytable
) mto
JOIN mytable mt
ON mt.id =
(
SELECT TOP 1 id
FROM mytable mti
WHERE mti.col6 = mto.col6
-- ORDER BY
-- id
-- Uncomment the lines above if the order matters
)
Update:
Check your database version and compatibility level:
SELECT @@VERSION
SELECT COMPATIBILITY_LEVEL
FROM sys.databases
WHERE name = DB_NAME()
The key word "DISTINCT" in SQL has the meaning of "unique value". When applied to a column in a query it will return as many rows from the result set as there are unique, different values for that column. As a consequence it creates a grouped result set, and values of other columns are random unless defined by other functions (such as max, min, average, etc.)
If you meant to say you want to return all rows for which Col006 has a specific value, then use the "where Col006 = value" clause.
If you meant to say you want to return all rows for which Col006 is different from all other values of Col006, then you still need to specify what that value is => see above.
If you want to say that the value of Col006 can only be evaluated once all rows have been retrieved, then use the "having Col006 = value" clause. This has the same effect as the "where" clause, but "where" gets applied when rows are retrieved from the raw tables, whereas "having" is applied once all other calculations have been made (i.e. aggregation functions have been run etc.) and just before the result set is returned to the user.
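To make the WHERE versus HAVING distinction concrete, here is a small sketch. The table name TheTable and the value 'item1' are just placeholders taken from the question's example:

```sql
-- WHERE filters individual rows before any grouping happens.
SELECT Col006, COUNT(*) AS cnt
FROM TheTable
WHERE Col006 = 'item1'
GROUP BY Col006;

-- HAVING filters whole groups after aggregation, so it can test aggregates.
SELECT Col006, COUNT(*) AS cnt
FROM TheTable
GROUP BY Col006
HAVING COUNT(*) > 1;
```

The second query returns only the Col006 values that occur more than once, which a WHERE clause cannot express because the count does not exist yet when WHERE is evaluated.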
UPDATE:
After having seen your edit, I have to point out that if you use any of the other suggestions, you will end up with random values in all other 9 columns for the row that contains the value "item1" in Col006, due to the constraint further up in my post.
You can group on Col006 to get the distinct values, but then you have to decide what to do with the multiple records in each group.
You can use aggregates to pick a value from the records. Example:
select Col006, min(Col001), max(Col002)
from TheTable
group by Col006
order by Col006
If you want the values to come from a specific record in each group, you have to identify it somehow. Example of using Col002 to identify the record in each group:
select t.Col006, t.Col001, t.Col002
from TheTable t
inner join (
select Col006, min(Col002) as Col002
from TheTable
group by Col006
) x on t.Col006 = x.Col006 and t.Col002 = x.Col002
order by t.Col006
SELECT *
FROM (SELECT DISTINCT YourDistinctField FROM YourTable) AS A
CROSS APPLY
( SELECT TOP 1 * FROM YourTable B
WHERE B.YourDistinctField = A.YourDistinctField ) AS NewTableName
I tried the answers posted above with no luck... but this does the trick!
select * from yourTable where column6 in (select distinct column6 from yourTable);
SELECT *
FROM harvest
GROUP BY estimated_total;
You can use GROUP BY and MIN() to get a more specific result.
Let's say that you have id as the primary key.
We want all the DISTINCT values of a column, say estimated_total, plus one complete sample row for each distinct value. The following query should do the trick. Note that selecting non-aggregated columns that are not in the GROUP BY is non-standard: MySQL accepts it only when ONLY_FULL_GROUP_BY is disabled, and SQL Server rejects it.
SELECT *, min(id)
FROM harvest
GROUP BY estimated_total;
create table #temp
(C1 TINYINT,
C2 TINYINT,
C3 TINYINT,
C4 TINYINT,
C5 TINYINT,
C6 TINYINT)
INSERT INTO #temp
SELECT 1,1,1,1,1,6
UNION ALL SELECT 1,1,1,1,1,6
UNION ALL SELECT 3,1,1,1,1,3
UNION ALL SELECT 4,2,1,1,1,6
SELECT * FROM #temp
SELECT *
FROM(
SELECT ROW_NUMBER() OVER (PARTITION BY C6 Order by C1) ID,* FROM #temp
)T
WHERE ID = 1