SQL Server 2008 select query difficulty - sql

I have a table with over 100k records. Here's my issue: the table has these columns:
CompanyID  CompanyName  CompanyServiceID  ServiceTypeID  Active
----------------------------------------------------------------
1          Xerox        17                33             Yes
2          Microsoft    19                39             Yes
3          Oracle       22                54             Yes
2          Microsoft    19                36             Yes
That's how my table looks. It has about 30 other columns, but they are irrelevant to this question.
Here's my quandary: I'm trying to select all records where CompanyID and CompanyServiceID are the same. As you can see in the table above, Microsoft appears twice with the same CompanyID and CompanyServiceID but a different ServiceTypeID.
I need to be able to find all records that have duplicates. The person maintaining this data was very messy and did not update some of the columns properly, so I have to go through all the records and find those that share the same CompanyID and CompanyServiceID.
Is there a generic query that would be able to do that?
None of these columns is my primary key; I have a record-number column that increments by 1.

You can try something like this:
SELECT CompanyName, COUNT(CompanyServiceID)
FROM YourTable -- table name here
GROUP BY CompanyName
HAVING ( COUNT(CompanyServiceID) > 1 )
This will return a grouped list of all companies with multiple entries. You can modify what columns you want in the SELECT statement if you need other info from the record as well.

Here's one option using row_number to create the groupings of duplicated data:
select *
from (
select *,
row_number () over (partition by companyId, companyserviceid
order by servicetypeid) rn
from yourtable
) t
where rn > 1

Another option GROUP BY, HAVING and INNER JOIN
SELECT
*
FROM
Tbl A INNER JOIN
(
SELECT
CompanyID,
CompanyServiceID
FROM
Tbl
GROUP BY
CompanyID,
CompanyServiceID
HAVING COUNT(1) > 1
) B ON A.CompanyID = B.CompanyID AND
A.CompanyServiceID = B.CompanyServiceID
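To try the GROUP BY / HAVING / INNER JOIN approach on the sample rows from the question, here is a minimal sketch. It assumes SQLite (through Python's sqlite3 module) as a stand-in for SQL Server, and the function name find_duplicates is made up for illustration.

```python
import sqlite3

# Sample rows from the question; the table name "Tbl" follows the answer above.
ROWS = [
    (1, "Xerox", 17, 33, "Yes"),
    (2, "Microsoft", 19, 39, "Yes"),
    (3, "Oracle", 22, 54, "Yes"),
    (2, "Microsoft", 19, 36, "Yes"),
]


def find_duplicates(rows):
    """Return every row whose (CompanyID, CompanyServiceID) pair occurs more than once."""
    conn = sqlite3.connect(":memory:")  # SQLite here is an assumption, not SQL Server
    conn.execute(
        "CREATE TABLE Tbl (CompanyID INT, CompanyName TEXT, "
        "CompanyServiceID INT, ServiceTypeID INT, Active TEXT)"
    )
    conn.executemany("INSERT INTO Tbl VALUES (?, ?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT A.CompanyID, A.CompanyName, A.CompanyServiceID, A.ServiceTypeID
        FROM Tbl A
        INNER JOIN (
            -- pairs that appear on more than one row
            SELECT CompanyID, CompanyServiceID
            FROM Tbl
            GROUP BY CompanyID, CompanyServiceID
            HAVING COUNT(*) > 1
        ) B ON A.CompanyID = B.CompanyID
           AND A.CompanyServiceID = B.CompanyServiceID
        ORDER BY A.ServiceTypeID
        """
    ).fetchall()
    conn.close()
    return result


print(find_duplicates(ROWS))
```

Both Microsoft rows come back, which is the difference from a rn > 1 filter: that one returns only the extra copies, not the first row of each group.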

Using a join:
Select *
from
Yourtable t1
join
(
select companyid,companyserviceid,count(*) as cnt
from
Yourtable
group by companyid,companyserviceid
having count(*)>1)b
on b.companyid=t1.companyid
and b.companyserviceid=t1.companyserviceid

Related

How to find Max value in a column in SQL Server 2012

I want to find the max value in a column
ID CName Tot_Val PName
--------------------------------
1 1 100 P1
2 1 10 P2
3 2 50 P2
4 2 80 P1
Above is my table structure. I want to find the row with the maximum Tot_Val for each CName. In these four rows, IDs 1 and 2 have the same CName but different Tot_Val and PName values, so for that pair I expect the row with the larger Tot_Val.
Expected result:
ID CName Tot_Val PName
--------------------------------
1 1 100 P1
4 2 80 P1
I need a result like the one shown above.
select Max(Tot_Val), CName
from table1
where PName in ('P1', 'P2')
group by CName
This is the query I have tried, but my problem is that I am not able to bring PName into the result. If I add PName to the select list and the GROUP BY list, the row count multiplies: for example, the result is 100 rows, but with PName added it shows 600 rows. That is the problem.
Can someone please help me resolve this?
One possible option is to use a subquery. Give each row a number within each CName group ordered by Tot_Val. Then select the rows with a row number equal to one.
select x.*
from ( select mt.ID,
mt.CName,
mt.Tot_Val,
mt.PName,
row_number() over(partition by mt.CName order by mt.Tot_Val desc) as No
from MyTable mt ) x
where x.No = 1;
An alternative would be to use a common table expression (CTE) instead of a subquery to isolate the first result set.
with x as
(
select mt.ID,
mt.CName,
mt.Tot_Val,
mt.PName,
row_number() over(partition by mt.CName order by mt.Tot_Val desc) as No
from MyTable mt
)
select x.*
from x
where x.No = 1;
See both solutions in action in this fiddle.
You can search top-n-per-group for this kind of a query.
There are two common ways to do it. The most efficient method depends on your indexes and data distribution and whether you already have another table with the list of all CName values.
Using ROW_NUMBER
WITH
CTE
AS
(
SELECT
ID, CName, Tot_Val, PName,
ROW_NUMBER() OVER (PARTITION BY CName ORDER BY Tot_Val DESC) AS rn
FROM table1
)
SELECT
ID, CName, Tot_Val, PName
FROM CTE
WHERE rn=1
;
Using CROSS APPLY
WITH
CTE
AS
(
SELECT CName
FROM table1
GROUP BY CName
)
SELECT
A.ID
,A.CName
,A.Tot_Val
,A.PName
FROM
CTE
CROSS APPLY
(
SELECT TOP(1)
table1.ID
,table1.CName
,table1.Tot_Val
,table1.PName
FROM table1
WHERE
table1.CName = CTE.CName
ORDER BY
table1.Tot_Val DESC
) AS A
;
See a very detailed answer on dba.se, Retrieving n rows per group, or here: Get top 1 row of each group.
CROSS APPLY might be as fast as a correlated subquery, but the correlated subquery below often has very good performance (and better than ROW_NUMBER()):
select t.*
from t
where t.tot_val = (select max(t2.tot_val)
from t t2
where t2.cname = t.cname
);
Note: The performance depends on having an index on (cname, tot_val).
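Here is a small sketch of the correlated-subquery form run against the question's four rows. It assumes SQLite via Python's sqlite3 as a stand-in for SQL Server; the index name ix_t_cname_totval and the function name max_per_cname are made up. One thing it makes visible: if two rows tie for the maximum, both are returned.

```python
import sqlite3

# The four sample rows from the question: (ID, CName, Tot_Val, PName).
ROWS = [(1, 1, 100, "P1"), (2, 1, 10, "P2"), (3, 2, 50, "P2"), (4, 2, 80, "P1")]


def max_per_cname(rows):
    """Return the rows whose Tot_Val equals the maximum Tot_Val of their CName."""
    conn = sqlite3.connect(":memory:")  # SQLite is an assumption for this demo
    conn.execute("CREATE TABLE t (ID INT, CName INT, Tot_Val INT, PName TEXT)")
    # Hypothetical index matching the (cname, tot_val) note above.
    conn.execute("CREATE INDEX ix_t_cname_totval ON t (CName, Tot_Val)")
    conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT t.ID, t.CName, t.Tot_Val, t.PName
        FROM t
        WHERE t.Tot_Val = (SELECT MAX(t2.Tot_Val)
                           FROM t t2
                           WHERE t2.CName = t.CName)
        ORDER BY t.ID
        """
    ).fetchall()
    conn.close()
    return result


print(max_per_cname(ROWS))
# A tie on the maximum keeps both rows for CName 2:
print(max_per_cname(ROWS + [(5, 2, 80, "P2")]))
```

If exactly one row per CName is required even with ties, the ROW_NUMBER or TOP(1) variants above are the safer choice.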

Select single first occurrence of row against distinct local ID from a table and insert it in another table

I want a PostgreSQL query that selects only the first row for each distinct LocalID and inserts the result into another table.
Records:
ID| LocalID| Name
1 233 Tim
2 633 John
3 633 Alex
4 234 Mike
5 233 Dave
6 556 Kim
Wanted result:
ID| LocalID| Name
1 233 Tim
2 633 John
4 234 Mike
6 556 Kim
I tried using
CREATE TABLE Weeklylist AS (select distinct on (localid) * from Monthlylist)
But this query selects the last record for each LocalID and enters it into the table. All I want is for the first row containing each distinct LocalID to be entered into the table.
The use of distinct on in your existing statement indicates that you are using Postgres.
The problem with your query is that it is missing an ORDER BY clause. Without it, it is undefined which record will be selected (you are seeing the last record being picked, but this is not guaranteed to be consistent over subsequent executions of the same query). So, add the ORDER BY clause:
create table Weeklylist as
select distinct on (localid) * from Monthlylist order by localid, id
Side note: parentheses around the select statement are superfluous here.
You can use DISTINCT ON in PostgreSQL:
CREATE TABLE Weeklylist
AS
SELECT DISTINCT ON (LocalID) *
FROM Monthlylist ml
ORDER BY LocalID, ID -- Missing in your query
In older versions of MySQL, a correlated subquery is one way:
SELECT ml.*
FROM Monthlylist ml
WHERE ml.id = (SELECT MIN(ml1.id) FROM Monthlylist ml1 WHERE ml1.LocalID = ml.LocalID);
This will give you what you need:
select *
from Monthlylist
where id in (
select min(id)
from Monthlylist
group by localid
)
create table WeeklyList as
select *
from Monthlylist
where id in (
select min(id)
from Monthlylist
group by localid
)
Demo on DB Fiddle
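For anyone who wants to check the MIN(id)-per-group version locally, here is a minimal sketch on the question's six rows. It assumes SQLite through Python's sqlite3 rather than Postgres (SQLite has no DISTINCT ON, but this portable form runs as is); build_weeklylist is a made-up helper name.

```python
import sqlite3

# The six sample rows from the question: (ID, LocalID, Name).
ROWS = [
    (1, 233, "Tim"),
    (2, 633, "John"),
    (3, 633, "Alex"),
    (4, 234, "Mike"),
    (5, 233, "Dave"),
    (6, 556, "Kim"),
]


def build_weeklylist(rows):
    """Create Weeklylist holding the lowest-ID row of each LocalID and return its rows."""
    conn = sqlite3.connect(":memory:")  # SQLite is an assumption for this demo
    conn.execute("CREATE TABLE Monthlylist (ID INT, LocalID INT, Name TEXT)")
    conn.executemany("INSERT INTO Monthlylist VALUES (?, ?, ?)", rows)
    conn.execute(
        """
        CREATE TABLE Weeklylist AS
        SELECT *
        FROM Monthlylist
        WHERE ID IN (SELECT MIN(ID) FROM Monthlylist GROUP BY LocalID)
        """
    )
    result = conn.execute(
        "SELECT ID, LocalID, Name FROM Weeklylist ORDER BY ID"
    ).fetchall()
    conn.close()
    return result


print(build_weeklylist(ROWS))
```

Here "first" means the smallest ID within each LocalID, which matches the wanted result in the question.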
You need a subquery and join to get your desired output.
create table Weeklylist AS (
select t.* from Monthlylist t
inner join (select distinct on (localid) * from Monthlylist order by localid, id) t1 on t1.localid = t.localid and t.id = t1.id
order by t.id, t.localid)
see sqlfiddle

SQL Separating Distinct Values using single column

Does anyone happen to know a way of applying the 'Distinct' command to a single column only? For lack of a better example, something similar to this:
Select (Distinct ID), Name, Term from Table
So it would get rid of rows with duplicate IDs but still return the other columns' information. I would use distinct on the full query, but the rows all differ because of the data in certain columns. I need to output only the topmost term of each set of duplicates:
ID Name Term
1 Suzy A
1 Suzy B
2 John A
2 John B
3 Pete A
4 Carl A
5 Sally B
Any suggestions would be helpful.
select t.ID, t.Name, min(t.Term) as Term
from Table t
group by t.ID, t.Name
You can use ROW_NUMBER for this:
Select ID, Name, Term from (
Select ID, Name, Term, ROW_NUMBER()
OVER (PARTITION BY ID order by Term) as rn from Table
) as tbl
Where rn = 1
The ORDER BY inside OVER determines which row in each ID group is numbered 1 and is therefore the one kept.
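A runnable sketch of the row-number idea on the question's data, numbering rows by Term within each ID so the alphabetically first term is kept. Assumptions: SQLite 3.25 or later (needed for window functions) via Python's sqlite3, a made-up table name Terms in place of the placeholder Table, and a made-up function name first_term_per_id.

```python
import sqlite3

# Sample rows from the question: (ID, Name, Term).
ROWS = [
    (1, "Suzy", "A"),
    (1, "Suzy", "B"),
    (2, "John", "A"),
    (2, "John", "B"),
    (3, "Pete", "A"),
    (4, "Carl", "A"),
    (5, "Sally", "B"),
]


def first_term_per_id(rows):
    """Keep one row per ID: the one with the lowest Term."""
    conn = sqlite3.connect(":memory:")  # needs SQLite >= 3.25 for ROW_NUMBER()
    conn.execute("CREATE TABLE Terms (ID INT, Name TEXT, Term TEXT)")
    conn.executemany("INSERT INTO Terms VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT ID, Name, Term
        FROM (
            SELECT ID, Name, Term,
                   ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Term) AS rn
            FROM Terms
        ) AS tbl
        WHERE rn = 1      -- filter outside the subquery, where rn is visible
        ORDER BY ID
        """
    ).fetchall()
    conn.close()
    return result


print(first_term_per_id(ROWS))
```

Swapping ORDER BY Term for ORDER BY Term DESC would keep the last term of each ID instead.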

Deletion of duplicate records using one query only

I am using SQL Server 2005.
I have a table like this -
ID Name
1 a
1 a
1 a
2 b
2 b
3 c
4 d
4 d
In this, I want to delete all duplicate entries and retain only one instance as -
ID Name
1 a
2 b
3 c
4 d
I can do this easily by adding another identity column to this table and having unique numbers in it and then deleting the duplicate records. However I want to know if I can delete the duplicate records without adding that additional column to this table.
Additionally, can this be done using only one query statement, i.e. without using stored procedures or temp tables?
Using a ROW_NUMBER in a CTE allows you to delete duplicate values while retaining unique rows.
WITH q AS (
SELECT RN = ROW_NUMBER() OVER (PARTITION BY ID ORDER BY ID )
, ID
, Name
FROM ATable
)
DELETE FROM q WHERE RN > 1
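Lieven is right. However, you may want to tweak Lieven's code by adding a TOP clause to the delete statement, like this:
delete top(1) from q where RN > 1;
Note that TOP(1) deletes at most one row per execution, so the statement has to be rerun until no duplicates remain. Hope this helps.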
You may use this query:
delete a from
(select id,name, ROW_NUMBER() over (partition by id,name order by id) row_Count
from dup_table) a
where a.row_Count >1
delete from table1
USING table1, table1 as vtable
WHERE (NOT table1.ID=vtable.ID)
AND (table1.Name=vtable.Name)
DELETE FROM tbl
WHERE ID NOT IN (
SELECT MIN(ID)
FROM tbl
GROUP BY Name
)
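The MIN-per-group delete only works when each row has a unique key, which the question's table lacks (ID repeats). As a rough illustration of the same idea in a single statement, the sketch below leans on SQLite's implicit rowid as that key. This is SQLite-specific and an assumption of the demo: SQL Server has no rowid, so there the ROW_NUMBER CTE remains the way to go. The function name dedupe is made up.

```python
import sqlite3

# Sample rows from the question: (ID, Name), with exact duplicates.
ROWS = [(1, "a"), (1, "a"), (1, "a"), (2, "b"), (2, "b"), (3, "c"), (4, "d"), (4, "d")]


def dedupe(rows):
    """Delete all but one copy of each (ID, Name) pair and return what is left."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ATable (ID INT, Name TEXT)")
    conn.executemany("INSERT INTO ATable VALUES (?, ?)", rows)
    # rowid is SQLite's hidden per-row key; it stands in for the missing identity column.
    conn.execute(
        """
        DELETE FROM ATable
        WHERE rowid NOT IN (
            SELECT MIN(rowid)
            FROM ATable
            GROUP BY ID, Name
        )
        """
    )
    result = conn.execute("SELECT ID, Name FROM ATable ORDER BY ID").fetchall()
    conn.close()
    return result


print(dedupe(ROWS))
```

Grouping by both ID and Name keeps one row per distinct pair, so rows that share an ID but differ in Name would both survive.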

Fetch the row which has the Max value for a column in SQL Server

I found a question that was very similar to this one, but using features that seem exclusive to Oracle. I'm looking to do this in SQL Server.
I have a table like this:
MyTable
--------------------
MyTableID INT PK
UserID INT
Counter INT
Each user can have multiple rows, with different values for Counter in each row. I need to find the rows with the highest Counter value for each user.
How can I do this in SQL Server 2005?
The best I can come up with is a query that returns the MAX(Counter) for each UserID, but I need the entire row because of other data in this table not shown in my table definition for simplicity's sake.
EDIT: It has come to my attention from some of the answers in this post, that I forgot an important detail. It is possible to have 2+ rows where a UserID can have the same MAX counter value. Example below updated for what the expected data/output should be.
With this data:
MyTableID UserID Counter
--------- ------- --------
1 1 4
2 1 7
3 4 3
4 11 9
5 11 3
6 4 6
...
9 11 9
I want these results. For the duplicate MAX values, select the first occurrence in whatever order SQL Server returns them. Which rows are returned isn't important in this case, as long as the UserID/Counter pairs are distinct:
MyTableID UserID Counter
--------- ------- --------
2 1 7
4 11 9
6 4 6
I like to use a Common Table Expression for that case, with a suitable ROW_NUMBER() function in it:
WITH MaxPerUser AS
(
SELECT
MyTableID, UserID, Counter,
ROW_NUMBER() OVER(PARTITION BY userid ORDER BY Counter DESC) AS 'RowNumber'
FROM dbo.MyTable
)
SELECT MyTableID, UserID, Counter
FROM MaxPerUser
WHERE RowNumber = 1
That partitions the data by UserID, orders it by Counter (descending) within each user, and then numbers the rows starting with 1 for each user. Select only the rows with a row number of 1 and you have the max values per user.
It's that easy :-) And I get results something like this:
MyTableID UserID Counter
2 1 7
6 4 6
4 11 9
Only one entry per user, no matter how many rows per user happen to have the same max value.
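To see the partition / order / number steps run end to end, here is a sketch using only the rows actually listed in the question (the elided rows are left out). Assumptions: SQLite 3.25 or later via Python's sqlite3 instead of SQL Server, a made-up function name max_row_per_user, and an extra MyTableID tiebreaker in the ORDER BY so the demo picks the same row every time when two rows tie on Counter.

```python
import sqlite3

# Rows shown in the question: (MyTableID, UserID, Counter). User 11 has a tie at 9.
ROWS = [(1, 1, 4), (2, 1, 7), (3, 4, 3), (4, 11, 9), (5, 11, 3), (6, 4, 6), (9, 11, 9)]


def max_row_per_user(rows):
    """Return exactly one row per UserID: the one with the highest Counter."""
    conn = sqlite3.connect(":memory:")  # needs SQLite >= 3.25 for ROW_NUMBER()
    conn.execute("CREATE TABLE MyTable (MyTableID INT, UserID INT, Counter INT)")
    conn.executemany("INSERT INTO MyTable VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        WITH MaxPerUser AS (
            SELECT MyTableID, UserID, Counter,
                   ROW_NUMBER() OVER (
                       PARTITION BY UserID
                       ORDER BY Counter DESC, MyTableID  -- tiebreaker added for the demo
                   ) AS RowNumber
            FROM MyTable
        )
        SELECT MyTableID, UserID, Counter
        FROM MaxPerUser
        WHERE RowNumber = 1
        ORDER BY UserID
        """
    ).fetchall()
    conn.close()
    return result


print(max_row_per_user(ROWS))
```

Without the tiebreaker, which of the two tied rows for user 11 gets number 1 is up to the engine; the question says either is acceptable.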
I think this will help you.
SELECT a.userid, MAX(a.counter) as counter
FROM mytable a INNER JOIN mytable b ON a.mytableid = b.mytableid
GROUP BY a.userid
There are several ways to do this. Take a look at Including an Aggregated Column's Related Values; several methods are shown there, including the performance differences.
Here is one example
select t1.*
from(
select UserID, max(counter) as MaxCount
from MyTable
group by UserID) t2
join MyTable t1 on t2.UserID =t1.UserID
and t1.counter = t2.MaxCount
Try this... I'm pretty sure this is the only way to truly make sure you get one row per User.
SELECT MT.*
FROM MyTable MT
INNER JOIN (
SELECT MAX(MID.MyTableId) AS MaxMyTableId,
MID.UserId
FROM MyTable MID
INNER JOIN (
SELECT MAX(Counter) AS MaxCounter, UserId
FROM MyTable
GROUP BY UserId
) AS MC
ON (MID.UserId = MC.UserId
AND MID.Counter = MC.MaxCounter)
GROUP BY MID.UserId
) AS MID
ON (MT.UserId = MID.UserId
AND MT.MyTableId = MID.MaxMyTableId)
select m.*
from MyTable m
inner join (
select UserID, max(Counter) as MaxCounter
from MyTable
group by UserID
) mm on m.UserID = mm.UserID and m.Counter = mm.MaxCounter
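The sketch below contrasts the plain join on MAX(Counter) with a version that breaks ties by taking the highest MyTableID, using the rows listed in the question. It assumes SQLite via Python's sqlite3 in place of SQL Server; the function names and the choice of MAX(MyTableID) as the tiebreaker are illustrative.

```python
import sqlite3

# Rows shown in the question: (MyTableID, UserID, Counter). User 11 has a tie at 9.
ROWS = [(1, 1, 4), (2, 1, 7), (3, 4, 3), (4, 11, 9), (5, 11, 3), (6, 4, 6), (9, 11, 9)]

JOIN_ON_MAX = """
    SELECT m.MyTableID, m.UserID, m.Counter
    FROM MyTable m
    INNER JOIN (
        SELECT UserID, MAX(Counter) AS MaxCounter
        FROM MyTable
        GROUP BY UserID
    ) mm ON m.UserID = mm.UserID AND m.Counter = mm.MaxCounter
    ORDER BY m.MyTableID
"""

JOIN_WITH_TIEBREAK = """
    SELECT MT.MyTableID, MT.UserID, MT.Counter
    FROM MyTable MT
    INNER JOIN (
        -- among the rows at the max Counter, keep the highest MyTableID per user
        SELECT MAX(M.MyTableID) AS MaxMyTableID, M.UserID
        FROM MyTable M
        INNER JOIN (
            SELECT UserID, MAX(Counter) AS MaxCounter
            FROM MyTable
            GROUP BY UserID
        ) MC ON M.UserID = MC.UserID AND M.Counter = MC.MaxCounter
        GROUP BY M.UserID
    ) MID ON MT.UserID = MID.UserID AND MT.MyTableID = MID.MaxMyTableID
    ORDER BY MT.MyTableID
"""


def run(query, rows):
    """Load rows into an in-memory MyTable and return the query result."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE MyTable (MyTableID INT, UserID INT, Counter INT)")
    conn.executemany("INSERT INTO MyTable VALUES (?, ?, ?)", rows)
    result = conn.execute(query).fetchall()
    conn.close()
    return result


print(run(JOIN_ON_MAX, ROWS))         # user 11 appears twice (rows 4 and 9)
print(run(JOIN_WITH_TIEBREAK, ROWS))  # one row per user
```

The plain join is fine when ties cannot occur or when all tied rows are wanted; the extra grouping level is only needed to guarantee a single row per user.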