Sql select the winner - sql

Given this table (bold = PK):
CERTIFICATE(USERID, CERTIFICATENAME)
I need to find the userid with the maximum number of certificates using a SQL query.
sample data:
USERID, CERTIFICATENAME
1,cert1
1,cert2
1,cert3
2,cert4
2,cert5
3,cert2
4,cert1
With this sample data, I need a query that finds that user 1 has 3 certificates, which is the maximum number of certificates.
Requested result:
USERID, COUNT
1,3
In this case my DBMS is Oracle, but I'm looking for a generic SQL solution to my problem.

Using plain old GROUP BY (TOP is SQL Server syntax; on Oracle 12c and later, drop TOP 1 and append FETCH FIRST 1 ROW ONLY instead):
select top 1 userid, count(certificatename) total
from certificate
group by userid -- but not certificatename
order by 2 desc -- you can use total or count(certificatename) here
Common table expressions (CTEs) offer no performance advantage here, because you need the GROUP BY in any case.

As a subquery:
SELECT UserId, COUNT(CertificateName) AS Total
FROM YourTable
GROUP BY UserId
HAVING COUNT(CertificateName) = ( -- keep only the users with the max count
    SELECT MAX(Total) FROM ( -- create the counts per user
        SELECT COUNT(CertificateName) AS Total
        FROM YourTable
        GROUP BY UserId
    ) counts
)
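To check the grouping approach against the sample data from the question, here is a minimal sketch using Python's built-in sqlite3 module. SQLite stands in for Oracle purely so the example is runnable (an assumption, not the asker's setup), which is why LIMIT 1 appears in place of TOP 1 or FETCH FIRST.

```python
import sqlite3

# In-memory SQLite database standing in for the asker's Oracle schema (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE certificate (userid INTEGER, certificatename TEXT)")
conn.executemany(
    "INSERT INTO certificate VALUES (?, ?)",
    [(1, "cert1"), (1, "cert2"), (1, "cert3"),
     (2, "cert4"), (2, "cert5"), (3, "cert2"), (4, "cert1")],
)

# Count certificates per user, sort by the count, keep the first row.
winner = conn.execute(
    """
    SELECT userid, COUNT(certificatename) AS total
    FROM certificate
    GROUP BY userid
    ORDER BY total DESC
    LIMIT 1
    """
).fetchall()
print(winner)  # [(1, 3)]
```

Note that LIMIT 1 returns a single row even if two users tie for the maximum; a comparison against MAX of the per-user counts returns all tied users.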

Related

Get Total Sum with User Sum

SQL Table:
UserId ReportsRead
1 4
2 6
3 5
I would like to query that table so that I can get the following out:
UserId ReportsRead TotalReports
1 4 15
The problem is that, because I apply the WHERE clause, the sum I get is the same as that user's ReportsRead value.
SELECT UserId, ReportsRead, SUM(ReportsRead) AS TotalReports FROM MyTable WHERE UserId = 1
Is there a built-in function that will allow me to do this? I would like to avoid subqueries entirely.
I don't usually recommend subqueries in this situation, but in this case, it seems like a simple approach:
SELECT UserId, ReportsRead,
(SELECT SUM(ReportsRead) from MyTable) AS TotalReports
FROM MyTable
WHERE UserId = 1;
If you want rows for all users, then window functions are the way to go:
select t.*, sum(reportsread) over () as totalreports
from mytable t;
However, you can't add a WHERE clause to this query and still expect the correct total, because the filter is applied before the window function is evaluated.
Use the sum window function.
SELECT UserId, ReportsRead, SUM(ReportsRead) OVER() AS TotalReports
FROM MyTable
To get a specific UserId, apply the filter in an outer query:
SELECT *
FROM (SELECT UserId, ReportsRead, SUM(ReportsRead) OVER() AS TotalReports
FROM MyTable
) t
WHERE UserId=1
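The difference between filtering before and after the window function can be seen in a small sketch. It uses Python's built-in sqlite3 module as a stand-in database (an assumption; window functions need SQLite 3.25 or later) with the three rows from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (UserId INTEGER, ReportsRead INTEGER)")
conn.executemany("INSERT INTO MyTable VALUES (?, ?)", [(1, 4), (2, 6), (3, 5)])

# WHERE runs before the window function, so the "total" only covers user 1.
filtered_first = conn.execute(
    "SELECT UserId, ReportsRead, SUM(ReportsRead) OVER () AS TotalReports "
    "FROM MyTable WHERE UserId = 1"
).fetchall()

# Computing the window in a derived table and filtering outside keeps the full total.
filtered_last = conn.execute(
    "SELECT * FROM ("
    "  SELECT UserId, ReportsRead, SUM(ReportsRead) OVER () AS TotalReports"
    "  FROM MyTable"
    ") t WHERE UserId = 1"
).fetchall()

print(filtered_first)  # [(1, 4, 4)]
print(filtered_last)   # [(1, 4, 15)]
```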

SQL statement to get the MIN() from the AVG() returned from second query

I have a question about subqueries in a SQL statement. I am trying to find the minimum of the average values returned by another query.
SELECT userID
FROM myTable
WHERE time = MIN(...)
SELECT userID, AVG(date_time)
FROM myTable
GROUP BY userID
The second query returns the average time, grouped by user.
My first query then needs to find the minimum of the averages returned by the second query. How can I combine the two queries?
The sample data for my second query is like:
user1 20
user2 45
user3 10
Then for my first query, I need to get the user with minimum average:
user3 10
Thanks in advance.
If you want one row with the minimum average time, then you can do:
SELECT TOP 1 userID
FROM myTable
GROUP BY userID
ORDER BY AVG(date_time) ASC;
If several users can tie for the minimum average and you want all of them, use TOP WITH TIES:
SELECT TOP (1) WITH TIES userID
FROM myTable
GROUP BY userID
ORDER BY AVG(date_time) ASC;
Try this query:
SELECT TOP 1 userID
FROM myTable
GROUP BY userID
ORDER BY AVG(date_time) ASC
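For databases without TOP, the two queries from the question can also be combined directly: compute the averages once, then keep the rows whose average equals the minimum. The sketch below is one way to do that, using Python's built-in sqlite3 module as a stand-in database. The raw values are made up so that the per-user averages match the sample (20, 45, 10), and date_time is stored as a plain number, which is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (userID TEXT, date_time REAL)")
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?)",
    [("user1", 10), ("user1", 30),   # average 20
     ("user2", 40), ("user2", 50),   # average 45
     ("user3", 5), ("user3", 15)],   # average 10
)

# The CTE is the asker's second query; the outer query picks its minimum.
lowest = conn.execute(
    """
    WITH averages AS (
        SELECT userID, AVG(date_time) AS avg_time
        FROM myTable
        GROUP BY userID
    )
    SELECT userID, avg_time
    FROM averages
    WHERE avg_time = (SELECT MIN(avg_time) FROM averages)
    """
).fetchall()
print(lowest)  # [('user3', 10.0)]
```

Like TOP WITH TIES, this form returns every user that shares the minimum average.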

GroupBy Query Which Shows Related Data In Addition to Grouping Clause

Ugh, I know it's a terrible title, but I can't think of a way to summarize my question in a simple statement. It's a fairly basic T-SQL query question but I haven't used T-SQL much in the last year or so and my brain simply doesn't want to work today.
Basically I have a table with usernames (email address) and a client id. There can't be multiple emails per client, but there can be multiple emails for different clients. I'm trying to do a group on email addresses to get a count of how many emails are associated with 1 or more clients - that's the easy part. Where I'm struggling is trying to also list which client ids the email address is associated to.
For example, I have this query, which gives me half of what I'm looking for:
select UserName, COUNT(*)
from UserTable
group by UserName
having COUNT(*) > 1
order by COUNT(*) desc
But I would also like to have either a row per client, or even just multiple new columns showing each client id the email address is associated with, such as:
user1@test.com 3
user1@test.com 34
user1@test.com 9
OR
user1@test.com 3 34 9
Any assistance is appreciated.
If you're using SQL Server, you can use the COUNT window function:
SELECT UserName, UserId, COUNT(UserId) OVER (PARTITION BY UserName) AS Counts
FROM UserTable
Then to pick out only those with a count greater than 1:
SELECT * FROM (
SELECT UserName, UserId, COUNT(UserId) OVER (PARTITION BY UserName) AS Counts
FROM UserTable
) rows
WHERE rows.Counts > 1
To get them into the second format, you'd need to use some row concatenation strategy - FOR XML PATH is a popular one.
You can use FOR XML PATH:
Select UserName, COUNT(*),
substring(
(
Select ',' + CAST(clientID AS varchar(20)) AS [text()] -- CAST so a numeric clientID can be concatenated
From UserTable UTI
Where UTI.UserName = UTO.UserName
ORDER BY UTI.clientID
For XML PATH ('')
), 2, 1000)
From UserTable UTO
group by UserName
having COUNT(*) > 1
order by COUNT(*) desc
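The row concatenation idea is easier to try out with a built-in string aggregate. The sketch below is not T-SQL: it swaps in SQLite's GROUP_CONCAT through Python's sqlite3 module, purely as a runnable illustration (SQL Server 2017 and later offer STRING_AGG for the same purpose). The second user and its client id are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE UserTable (UserName TEXT, clientID INTEGER)")
conn.executemany(
    "INSERT INTO UserTable VALUES (?, ?)",
    [("user1@test.com", 3), ("user1@test.com", 34),
     ("user1@test.com", 9), ("user2@test.com", 7)],  # user2 is hypothetical
)

# One row per email used by more than one client, with the client ids joined.
rows = conn.execute(
    """
    SELECT UserName, COUNT(*) AS Clients, GROUP_CONCAT(clientID, ',') AS ClientIds
    FROM UserTable
    GROUP BY UserName
    HAVING COUNT(*) > 1
    ORDER BY Clients DESC
    """
).fetchall()

# GROUP_CONCAT does not guarantee an order, so sort the ids after splitting.
summary = [(name, n, sorted(int(c) for c in ids.split(","))) for name, n, ids in rows]
print(summary)  # [('user1@test.com', 3, [3, 9, 34])]
```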

Obtaining the most reoccurring attribute in SQL

Using SQL, I have a table with a list of usernames and I am trying to output the most repeated one without using MAX. I am very new to SQL, so any help would be much appreciated!
Thanks
You can use the aggregate function count() to get the total number of times a username is repeated:
select username, count(username) Total
from yourtable
group by username
order by total desc
Then depending on your database you can return the username that appears the most.
In MySQL, you can use LIMIT:
select username, count(username) Total
from yourtable
group by username
order by total desc
limit 1;
In SQL Server, you can use TOP:
select TOP 1 with Ties username, count(username) Total
from yourtable
group by username
order by total desc
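Where neither LIMIT nor TOP WITH TIES fits, ranking the counts is another way to keep tied usernames without calling MAX. This is a sketch under assumptions: it runs on SQLite (3.25 or later for window functions) via Python's sqlite3 module, and the usernames are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (username TEXT)")
conn.executemany(
    "INSERT INTO yourtable VALUES (?)",
    [("alice",), ("bob",), ("alice",), ("bob",), ("alice",), ("bob",), ("carol",)],
)

# Count per username, rank the counts, and keep every username ranked first.
most_repeated = conn.execute(
    """
    WITH counts AS (
        SELECT username, COUNT(*) AS total
        FROM yourtable
        GROUP BY username
    ),
    ranked AS (
        SELECT username, total, RANK() OVER (ORDER BY total DESC) AS rnk
        FROM counts
    )
    SELECT username, total
    FROM ranked
    WHERE rnk = 1
    ORDER BY username
    """
).fetchall()
print(most_repeated)  # [('alice', 3), ('bob', 3)]
```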

Efficient query for the first result in groups (postgresql 9)

I have a table with 200,000 rows and two columns: name and date. Both the dates and the names may contain repeated values. I would like to get the first 300 unique names, sorted by date in ascending order, and have this run fast, as my table may grow to a million rows.
I am using postgresql 9.
SELECT name, date
FROM
(
SELECT DISTINCT ON (name) name, date
FROM table
ORDER BY name, date
) AS id_date
ORDER BY date
LIMIT 300;
The last query of @jachguate will miss names that have two entries on the same date; this one doesn't.
The query takes about 100 ms on a non-optimized PostgreSQL 9.1 with about 100,000 entries, so it may not scale to millions of entries.
An upgrade to PostgreSQL 9.2 may help, as according to the release notes there are many performance improvements.
Use a CTE:
with unique_date_name as (
select date, name, count(*) rcount
from table
group by date, name
having count(*) = 1
)
select name, date
from unique_date_name
order by date limit 300;
Edit
According to the comments, this results in poor performance, so try this one instead:
select date, name, count(*) rcount
from table
group by date, name
having count(*) = 1
order by date limit 300;
or, transforming the original query into a nested subquery in FROM instead of a CTE:
select name, date
from (
select date, name, count(*) rcount
from table
group by date, name
having count(*) = 1
) unique_date_name
order by date limit 300;
Unfortunately I don't have PostgreSQL at hand to check whether it works, but the optimizer should do a better job with it.
An index on (date, name) is a must for optimal performance.
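The two approaches in this thread do not return the same rows, which a tiny data set makes visible. The sketch below runs on SQLite through Python's sqlite3 module rather than PostgreSQL, so DISTINCT ON is replaced by the equivalent GROUP BY name with MIN(date); the table name events and the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (name TEXT, date TEXT)")
conn.executemany("INSERT INTO events VALUES (?, ?)", [
    ("a", "2020-01-03"), ("a", "2020-01-01"),
    ("b", "2020-01-02"), ("b", "2020-01-02"),  # same name twice on one date
    ("c", "2020-01-05"),
])

# Earliest date per name: the effect of DISTINCT ON (name) ... ORDER BY name, date.
first_per_name = conn.execute(
    "SELECT name, MIN(date) AS first_date FROM events "
    "GROUP BY name ORDER BY first_date LIMIT 300"
).fetchall()

# The HAVING COUNT(*) = 1 variant keeps (date, name) pairs that occur exactly once.
unique_pairs = conn.execute(
    "SELECT name, date FROM events GROUP BY date, name "
    "HAVING COUNT(*) = 1 ORDER BY date LIMIT 300"
).fetchall()

print(first_per_name)  # [('a', '2020-01-01'), ('b', '2020-01-02'), ('c', '2020-01-05')]
print(unique_pairs)    # [('a', '2020-01-01'), ('a', '2020-01-03'), ('c', '2020-01-05')]
```

The second result drops b entirely and lists a twice, which matches the remark above about names with two entries on the same date.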