solution for writing query - sql

I have a match table with winningteamid and stadiumid as attributes,
I need to retrieve the winningteamid which has won all its games in the same stadium.
I tried this, and I'm getting additional unwanted rows:
select winningteamid
from match
group by winningteamid
having count(winningteamid) in (
select count(*) from match group by (winningteamid,stadiumid)

Try this please (please write which RDBMS you are using):
with cte as (
select winningteamid, stadiumid, count(stadiumid) over (partition by winningteamid) as count
from match
group by winningteamid, stadiumid
)
select * from cte where count = 1;

You should use this:
SELECT MAX(winningteamid)
FROM (
SELECT DISTINCT winningteamid, stadiumid
FROM match
)
GROUP BY stadiumid
HAVING COUNT(*) = 1;

I suppose it's as easy as using HAVING to verify only one distinct stadiumid:
select winningteamid
from match
group by winningteamid
having count(distinct stadiumid) = 1

Related

best way to get count and distinct count of rows in single query

What is the best way to get count of rows and distinct rows in a single query?
To get distinct count we can use subquery like this:
select count(*) from
(
select distinct * from table
)
I have 15+ columns and have many duplicates rows as well and I want to calculate count of rows as well as distinct count of rows in one query.
More if I use this
select count(*) as Rowcount , count(distinct *) as DistinctCount from table
This will not give accurate results as count(distinct *) doesn't work.
Why don't you just put the subquery inside another query?
select count(*),
(select count(*) from (select distinct * from table))
from table;
create table tbl
(
col int
);
insert into tbl values(1),(2),(1),(3);
select count(*) as distinct_count, sum(sum) as all_count
from (
select count(col) sum from tbl group by col
)A
I think I have understood what you are looking for. You need to use some window function. So, you query should be look like =>
Select COUNT(*) OVER() YourRowcount ,
COUNT(*) OVER(Partition BY YourColumnofGroup) YourDistinctCount --Basic of the distinct count
FROM Yourtable
NEW Update
select top 1
COUNT(*) OVER() YourRowcount,
DENSE_RANK() OVER(ORDER BY YourColumn) YourDistinctCount
FROM Yourtable ORDER BY TT DESC
Note: This code is written sql server. Please check the code and let me know.

How to use count & distinct together for all columns

I am getting error while executing the following query.
SELECT COUNT(distinct *) AS "total unique records" FROM table
Please help me to resolve this issue
Because * is not allowed there. If you want to do this, use a subquery:
select count(*)
from (select distinct t.*
from t
) t;
You probably want :
select count(*)
from (select distinct t.*
from table t
) t;
Use this code too count row
SELECT COUNT(*) AS TotalRecords FROM table
AND use that too count row and return order columns
SELECT COUNT(*) OVER() AS TotalRecords,* FROM table

Select a NON-DISTINCT column in a query that return distincts rows

The following query returns the results that I need but I have to add the ID of the row to then update it. If I add the ID directly in the select statement it will return me more results then I need because each ID is unique so the DISTINCT statement see the line as unique.
SELECT DISTINCT ucpse.MemberID, ucpse.ProductID, ucpse.UserID
FROM UserCustomerProductSalaryExceptions as ucpse
WHERE EXISTS (SELECT NULL
FROM UserCustomerProductSalaryExceptions as upcse2
WHERE ucpse.userid = upcse2.userid AND ucpse.MemberID = upcse2.MemberID AND ucpse.ProductID = upcse2.ProductID
GROUP BY upcse2.UserID, upcse2.memberid, upcse2.productid
HAVING COUNT(UserID) >= 2
)
So basically I need to add ucpse.ID in the Select statement while keeping DISTINCT values for MemberID,ProductID and UserID.
Any Ideas ?
Thank you
According to you comment:
If the data has been duplicated 67 times for a given employee with a given product and a given client, I need to keep only one of thoses records. It's not important which one, so this is why I use DISTINC to obtain unique combinaison of given employee with a given product and a given client.
You can use MIN() or MAX() and GROUP BY instead of DISTINCT
SELECT MAX(ucpse.ID) AS ID, ucpse.MemberID, ucpse.ProductID, ucpse.UserID
FROM UserCustomerProductSalaryExceptions as ucpse
WHERE EXISTS (SELECT NULL
FROM UserCustomerProductSalaryExceptions as upcse2
WHERE ucpse.userid = upcse2.userid AND ucpse.MemberID = upcse2.MemberID AND ucpse.ProductID = upcse2.ProductID
GROUP BY upcse2.UserID, upcse2.memberid, upcse2.productid
HAVING COUNT(UserID) >= 2
)
GROUP BY ucpse.MemberID, ucpse.ProductID, ucpse.UserID
UPDATE:
From you comments I think the below query is what you need
DELETE FROM UserCustomerProductSalaryExceptions
WHERE ID NOT IN ( SELECT MAX(ucpse.ID) AS ID
FROM #UserCustomerProductSalaryExceptions
GROUP BY ucpse.MemberID, ucpse.ProductID, ucpse.UserID
HAVING COUNT(ucpse.ID) >= 2
)
If all you want is to delete the duplicates, this will do it:
WITH X AS
(SELECT ID,
ROW_NUMBER() OVER (PARTITION BY MemberID, ProductID, UserID ORDER BY ID) AS DupRowNum<br
FROM UserCustomerProductSalaryExceptions
)
DELETE X WHERE DupRowNum > 1
ID's not necessary - try:
UPDATE uu SET
<your settings here>
FROM UserCustomerProductSalaryExceptions uu
JOIN ( <paste your entire query above here>
) uc ON uc.MemberID=uu.MemberId AND uc.ProductID=uu.ProductId AND uc.UserID=uu.UserId
From the sound of your data structure (which I would STRONGLY advise normalizing as soon as possible), it sounds like you should be updating all the records. It sounds as if each duplicate is important because it contains some information about an employee's relation to a customer or product.
I would probably update all the records. Try this:
UPDATE UCPSE
SET
--Do your updates here
FROM UserCustomerProductSalaryExceptions as ucpse
JOIN
(
SELECT UserID, MemberID, ProductID
FROM UserCustomerProductSalaryExceptions
GROUP BY UserID, MemberID, ProductID
HAVING COUNT(UserID) >= 2
) T
ON ucpse.UserID = T.UserID AND ucpse.MemberID = T.MemberID AND ucpse.ProductID = T.ProductID

Compare SQL groups against eachother

How can one filter a grouped resultset for only those groups that meet some criterion compared against the other groups? For example, only those groups that have the maximum number of constituent records?
I had thought that a subquery as follows should do the trick:
SELECT * FROM (
SELECT *, COUNT(*) AS Records
FROM T
GROUP BY X
) t HAVING Records = MAX(Records);
However the addition of the final HAVING clause results in an empty recordset... what's going on?
In MySQL (Which I assume you are using since you have posted SELECT *, COUNT(*) FROM T GROUP BY X Which would fail in all RDBMS that I know of). You can use:
SELECT T.*
FROM T
INNER JOIN
( SELECT X, COUNT(*) AS Records
FROM T
GROUP BY X
ORDER BY Records DESC
LIMIT 1
) T2
ON T2.X = T.X
This has been tested in MySQL and removes the implicit grouping/aggregation.
If you can use windowed functions and one of TOP/LIMIT with Ties or Common Table expressions it becomes even shorter:
Windowed function + CTE: (MS SQL-Server & PostgreSQL Tested)
WITH CTE AS
( SELECT *, COUNT(*) OVER(PARTITION BY X) AS Records
FROM T
)
SELECT *
FROM CTE
WHERE Records = (SELECT MAX(Records) FROM CTE)
Windowed Function with TOP (MS SQL-Server Tested)
SELECT TOP 1 WITH TIES *
FROM ( SELECT *, COUNT(*) OVER(PARTITION BY X) [Records]
FROM T
)
ORDER BY Records DESC
Lastly, I have never used oracle so apolgies for not adding a solution that works on oracle...
EDIT
My Solution for MySQL did not take into account ties, and my suggestion for a solution to this kind of steps on the toes of what you have said you want to avoid (duplicate subqueries) so I am not sure I can help after all, however just in case it is preferable here is a version that will work as required on your fiddle:
SELECT T.*
FROM T
INNER JOIN
( SELECT X
FROM T
GROUP BY X
HAVING COUNT(*) =
( SELECT COUNT(*) AS Records
FROM T
GROUP BY X
ORDER BY Records DESC
LIMIT 1
)
) T2
ON T2.X = T.X
For the exact question you give, one way to look at it is that you want the group of records where there is no other group that has more records. So if you say
SELECT taxid, COUNT(*) as howMany
GROUP by taxid
You get all counties and their counts
Then you can treat that expressions as a table by making it a subquery, and give it an alias. Below I assign two "copies" of the query the names X and Y and ask for taxids that don't have any more in one table. If there are two with the same number I'd get two or more. Different databases have proprietary syntax, notably TOP and LIMIT, that make this kind of query simpler, easier to understand.
SELECT taxid FROM
(select taxid, count(*) as HowMany from flats
GROUP by taxid) as X
WHERE NOT EXISTS
(
SELECT * from
(
SELECT taxid, count(*) as HowMany FROM
flats
GROUP by taxid
) AS Y
WHERE Y.howmany > X.howmany
)
Try this:
SELECT * FROM (
SELECT *, MAX(Records) as max_records FROM (
SELECT *, COUNT(*) AS Records
FROM T
GROUP BY X
) t
) WHERE Records = max_records
I'm sorry that I can't test the validity of this query right now.

nested select in sql server 2008

I save user logins in a table that named loginstats, I want to retrieve last login of every users, I use these code but I meet some error, What is my mistake?
select *
from loginStats
where id in (
select distinct username, MAX(id) as id
from loginStats
group by username)
You are doing id IN, but are trying to compare it to multiple columns. Try this instead:
SELECT A.*
FROM LoginStats A
INNER JOIN (SELECT DISTINCT username, MAX(id) as id
FROM loginStats
GROUP BY username) B
ON A.username = B.username AND A.id = B.id
select *
from loginStats
where id in (
select distinct MAX(id) as id
from loginStats
group by username)
You can't have multiple field outputs in your IN subquery.
Or you could also do this like this:
;WITH CTE
AS
(
SELECT
RANK() OVER(
PARTITION BY loginStats.username
ORDER BY loginStats.id DESC
) AS iRank,
loginStats.*
FROM
loginStats
)
SELECT
*
FROM
CTE
WHERE
CTE.iRank=1
Here is some information how to use the rank function and how it is applied.
Here is some information on msdn about the rank function.
Here is some information about cte function and usage
Here is some information about With clause and how it is used
Hope it helps you understand
The mistake is that in the IN clause only one column can be specified and you are specifying two: username and id.
You're selecting two fields, where you should only return one. This correlated subquery should work for you:
select *
from loginStats AS a
where id = (
select MAX(id) as id
from loginStats AS b
where b.username = a.username)