SQL Server - Possible Pivot Solution? - sql

I have a simple enough issue that has been surprisingly difficult to locate online. Perhaps I am searching on improper keywords so I wanted to stop in and ask you guys because your site has been a blessing with my studies. See below scenario:
Select student, count(*) as Total, (the unknown variable: book1, book2, book3, book4, etc.) from mystudies.
Essentially, all I would like to do is list out all books for a unique student id, matching the Total count. Could someone point me in the right direction, with a good read or anything else, so I can take a step the correct way? I am assuming it would be done via a left join (I'm not sure how to do the x1, x2, x3 part), linking the two by the unique student id number (no duplicates). Everyone online points to PIVOT, but PIVOT appears to turn the rows into separate columns instead of one single column. SQL Server 2005 is the platform of choice.
Thanks!
Sorry
The following query produces my unique id (the student) and the student's count for all duplicate entries in the table:
select student, count(*) as Total
from mystudies
group by student order by total desc
The part I don't know is how to create the left join on the table's unique id (bookid):
select mystudies1.student, mystudies1.total, mystudies2.bookid
from ( select student, count(*) as Total
from mystudies
group by student
) mystudies1
left join
( select student, bookid
from mystudies
) mystudies2
on mystudies1.student=mystudies2.student
order by mystudies1.total desc, mystudies1.student asc
Obviously the above query will produce results similar to the following:
Student Total BookID
000001 3 100001
000001 3 100002
000001 3 100003
000002 2 200001
000002 2 200002
000003 1 300001
But what I actually want is something similar to the following:
Student Total BookID
000001 3 100001, 100002, 100003
000002 2 200001, 200002
000003 1 300001
I assumed it had to be done in a left join so that it didn't alter the actual count being performed on the student. thanks!

In SQL-Server use the FOR XML Path Method:
SELECT Student,
Total,
STUFF(( SELECT ', ' + BookID
FROM MyStudies books
WHERE Books.Student = MyStudies.Student
FOR XML PATH(''), TYPE
).value('.', 'VARCHAR(MAX)'), 1, 2, '') AS Books
FROM ( SELECT Student, COUNT(*) AS Total
FROM myStudies
GROUP BY Student
) MyStudies
I have previously given a full explanation of how the XML PATH Method works here. With a further improvement to my answer pointed out here
SQL Server Fiddle
In MySQL and SQLite you can use the GROUP_CONCAT function:
SELECT Student,
COUNT(*) AS Total,
GROUP_CONCAT(BookID) AS Books
FROM myStudies
GROUP BY Student
MySQL Fiddle
SQLite Fiddle
In PostgreSQL you can use the ARRAY_AGG function:
SELECT Student,
COUNT(*) AS Total,
ARRAY_AGG(BookID) AS Books
FROM myStudies
GROUP BY Student
Postgresql Fiddle
In Oracle you can use the LISTAGG function:
SELECT Student,
COUNT(*) AS Total,
LISTAGG(BookID, ', ') WITHIN GROUP (ORDER BY BookID) AS Books
FROM myStudies
GROUP BY Student
Oracle SQL Fiddle
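As a quick way to try the aggregate approach without a database server, the GROUP_CONCAT query can be run against an in-memory SQLite database from Python. This is only a sketch: the table layout (student and bookid stored as text) and the sample rows are assumptions taken from the desired output in the question, and SQLite does not promise any particular order for the concatenated values.

```python
import sqlite3

# In-memory database; the schema is assumed from the question's sample output.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mystudies (student TEXT, bookid TEXT)")
conn.executemany(
    "INSERT INTO mystudies (student, bookid) VALUES (?, ?)",
    [
        ("000001", "100001"), ("000001", "100002"), ("000001", "100003"),
        ("000002", "200001"), ("000002", "200002"),
        ("000003", "300001"),
    ],
)

# One row per student: the count plus all book ids in a single column.
rows = conn.execute(
    """
    SELECT student,
           COUNT(*) AS total,
           GROUP_CONCAT(bookid, ', ') AS books
    FROM mystudies
    GROUP BY student
    ORDER BY total DESC, student
    """
).fetchall()

for student, total, books in rows:
    print(student, total, books)
```

Because the grouping and the concatenation happen in the same query, the count is not affected by the list of books, which was the concern behind the left join idea.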

Related

How to return all names that appear multiple times in table [duplicate]

Suppose I have the following schema:
student(name, siblings)
The table holds names and siblings. Note that a name appears in as many rows as that individual has siblings. For instance, a table could be as follows:
Jack, Lucy
Jack, Tim
Meaning that Jack has Lucy and Tim as his siblings.
I want to identify an SQL query that reports the names of all students who have 2 or more siblings. My attempt is the following:
select name
from student
where count(name) >= 1;
I'm not sure I'm using count correctly in this SQL query. Can someone please help with identifying the correct SQL query for this?
You're almost there:
select name
from student
group by name
having count(*) > 1;
HAVING is a where clause that runs after grouping is done. In it you can use things that the grouping makes available (like counts and other aggregates). By grouping on the name and filtering on the count (> 1 if you want two or more; not >= 1, because that would include 1) you get the names you want.
This will just deliver "Jack" as a single result (in the example data from the question). If you then want all the detail, like who Jack's siblings are, you can join your grouped, filtered list of names back to the table:
select *
from
student
INNER JOIN
(
select name
from student
group by name
having count(*) > 1
) morethanone ON morethanone.name = student.name
You can't avoid doing this "joining back" because the grouping has thrown the detail away in order to create the group. The only way to get the detail back is to take the name list the group gave you and use it to filter the original detail data again.
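To see both steps side by side, here is a small sketch that runs the grouped HAVING query and the "join back" query against an in-memory SQLite database from Python. The sample rows are an assumption chosen for illustration; the SQL itself is the same shape as the queries in this answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (name TEXT, siblings TEXT)")
conn.executemany(
    "INSERT INTO student (name, siblings) VALUES (?, ?)",
    [
        ("jack", "lucy"), ("jack", "tim"),
        ("jane", "bill"), ("jane", "fred"), ("jane", "tom"),
        ("john", "dave"),
    ],
)

# Step 1: grouping keeps only the names that occur more than once.
names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM student GROUP BY name HAVING COUNT(*) > 1 ORDER BY name"
    )
]

# Step 2: join the filtered name list back to the table to recover the detail.
detail = conn.execute(
    """
    SELECT student.name, student.siblings
    FROM student
    INNER JOIN (
        SELECT name FROM student GROUP BY name HAVING COUNT(*) > 1
    ) morethanone ON morethanone.name = student.name
    ORDER BY student.name, student.siblings
    """
).fetchall()

print(names)
print(detail)
```

The first query returns one row per qualifying name; the second returns every sibling row for those names and drops john, who has only one row.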
Full disclosure; it's a bit of a lie to say "can't avoid doing this": SQL Server supports something called a window function, which will effectively perform a grouping in the background and join it back to the detail. Such a query would look like:
select student.*, count(*) over(partition by name) n
from student
And for a table like this:
jack, lucy
jack, tim
jane, bill
jane, fred
jane, tom
john, dave
It would produce:
jack, lucy, 2
jack, tim, 2
jane, bill, 3
jane, fred, 3
jane, tom, 3
john, dave, 1
The jack rows would show 2 because there are two jack rows. There are 3 janes and 1 john. You could then wrap all that in a subquery and filter for n > 1, which would remove john:
select *
from
(
select student.*, count(*) over(partition by name) n
from student
) x
where x.n > 1
If SQL Server didn't have window functions, it would look more like:
select *
from
student
INNER JOIN
(
select name, count(*) as n
from student
group by name
) x ON x.name = student.name
The COUNT(*) OVER(PARTITION BY name) is like a mini "group by name and return the count, then auto join back to the main detail using the name as key" i.e. a short form of the latter query
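The equivalence can be checked with a short sketch. It uses Python's sqlite3 module as a stand-in for SQL Server, which assumes the bundled SQLite is version 3.25 or newer (the first release with window functions); the sample rows mirror the table shown earlier in this answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (name TEXT, siblings TEXT)")
conn.executemany(
    "INSERT INTO student (name, siblings) VALUES (?, ?)",
    [
        ("jack", "lucy"), ("jack", "tim"),
        ("jane", "bill"), ("jane", "fred"), ("jane", "tom"),
        ("john", "dave"),
    ],
)

# Window function: the per-name count is attached to every detail row.
windowed = conn.execute(
    """
    SELECT name, siblings, COUNT(*) OVER (PARTITION BY name) AS n
    FROM student
    ORDER BY name, siblings
    """
).fetchall()

# Long form: group to get the counts, then join them back to the detail.
joined = conn.execute(
    """
    SELECT student.name, student.siblings, x.n
    FROM student
    INNER JOIN (
        SELECT name, COUNT(*) AS n FROM student GROUP BY name
    ) x ON x.name = student.name
    ORDER BY student.name, student.siblings
    """
).fetchall()

print(windowed == joined)
```

Both queries yield the same rows here, which is the sense in which the window function is a short form of the grouped join.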
You can do:
select name
from student as s1
where exists (
select 1
from student as s2
where s1.name = s2.name and s1.siblings != s2.siblings
)
I think the best approach is what 'Caius Jard' mentioned. However, here is an additional way if you also want to see how many siblings each name has:
SELECT name, COUNT(*) AS Occurrences
FROM student
GROUP BY name
HAVING (COUNT(*) > 1)
I wanted to share another solution I came up with:
select s1.name
from student s1, student s2
where s1.name = s2.name and s1.siblings != s2.siblings;

How to concatenate rows delimited with comma using standard SQL?

Let's suppose we have a table T1 and a table T2. There is a relation of 1:n between T1 and T2. I would like to select all T1 along with all their T2, every row corresponding to T1 records with T2 values concatenated, using only SQL-standard operations.
Example:
T1 = Person
T2 = Popularity (by year)
for each year a person has a certain popularity
I would like to write a selection using SQL-standard operations, resulting in something like this:
Person.Name Popularity.Value
John Smith 1.2,5,4.2
John Doe NULL
Jane Smith 8
where there are 3 records in the popularity table for John Smith, none for John Doe and one for Jane Smith, their values being the values represented above. Is this possible? How?
I'm using Oracle but would like to do this using only standard SQL.
Here's one technique, using recursive Common Table Expressions. Unfortunately, I'm not confident on its performance.
I'm sure that there are ways to improve this code, but it shows that there doesn't seem to be an easy way to do something like this using just the SQL standard.
As far as I can see, there really should be some kind of STRINGJOIN aggregate function that would be used with GROUP BY. That would make things like this much easier...
This query assumes that there is some kind of PersonID that joins the two relations, but the Name would work too.
WITH cte (id, Name, Value, ValueCount) AS (
SELECT id,
Name,
CAST(Value AS VARCHAR(MAX)) AS Value,
1 AS ValueCount
FROM (
SELECT ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Name) AS id,
Name,
Value
FROM Person AS per
INNER JOIN Popularity AS pop
ON per.PersonID = pop.PersonID
) AS e
WHERE id = 1
UNION ALL
SELECT e.id,
e.Name,
cte.Value + ',' + CAST(e.Value AS VARCHAR(MAX)) AS Value,
cte.ValueCount + 1 AS ValueCount
FROM (
SELECT ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Name) AS id,
Name,
Value
FROM Person AS per
INNER JOIN Popularity AS pop
ON per.PersonID = pop.PersonID
) AS e
INNER JOIN cte
ON e.id = cte.id + 1
AND e.Name = cte.Name
)
SELECT p.Name, agg.Value
FROM Person p
LEFT JOIN (
SELECT Name, Value
FROM (
SELECT Name,
Value,
ROW_NUMBER() OVER (PARTITION BY Name ORDER BY ValueCount DESC)AS id
FROM cte
) AS p
WHERE id = 1
) AS agg
ON p.Name = agg.Name
This is an example result:
--------------------------------
| Name | Value |
--------------------------------
| John Smith | 1.2,5,4.2 |
--------------------------------
| John Doe | NULL |
--------------------------------
| Jane Smith | 8 |
--------------------------------
In Oracle you can use listagg to achieve this:
select t1.Person_Name, listagg(t2.Popularity_Value, ',')
within group(order by t2.Popularity_Value)
from t1, t2
where t1.Person_Name = t2.Person_Name (+)
group by t1.Person_Name
I hope this will solve your problem.
But regarding the comment you gave after @DavidJashi's question: this is not standard SQL, and I think he is correct. I agree with David that you cannot achieve this with a pure standard SQL statement.
I know that I'm SUPER late to the party, but for anyone else that might find this, I don't believe that this is possible using pure SQL92. As I discovered in the last few months fighting with NetSuite to try to figure out what Oracle methods I can and cannot use with their ODBC driver, I discovered that they only "support and guarantee" SQL92 standard.
I discovered this, because I had a need to perform a LISTAGG(). Once I found out I was restricted to SQL92, I did some digging through the historical records, and LISTAGG() and recursive queries (common table expressions) are NOT supported in SQL92, at all.
LISTAGG() was added in Oracle SQL version 11g Release 2 (2009 – 11 years ago: reference https://oracle-base.com/articles/misc/string-aggregation-techniques#listagg) , CTEs were added to Oracle SQL in version 9.2 (2007 – 13 years ago: reference https://www.databasestar.com/sql-cte-with/).
VERY frustrating that it's completely impossible to accomplish this kind of effect in pure SQL92, so I had to solve the problem in my C# code after I pulled a ton of extra unnecessary data. Very frustrating.
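For readers in the same position, the client-side workaround can be sketched as follows. This is not the C# code mentioned above; it is a Python illustration under stated assumptions: the table and column names (Person, Popularity, PersonID, Year, Value) are hypothetical, an in-memory SQLite database stands in for the real source, and the query is restricted to a plain LEFT JOIN with ORDER BY so the concatenation happens entirely in application code.

```python
import sqlite3
from itertools import groupby

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Person (PersonID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Popularity (PersonID INTEGER, Year INTEGER, Value REAL);
    INSERT INTO Person VALUES (1, 'John Smith'), (2, 'John Doe'), (3, 'Jane Smith');
    INSERT INTO Popularity VALUES
        (1, 2001, 1.2), (1, 2002, 5), (1, 2003, 4.2), (3, 2001, 8);
    """
)

# Plain join only: one row per (person, popularity) pair, NULL when none exist.
rows = conn.execute(
    """
    SELECT p.PersonID, p.Name, pop.Value
    FROM Person p
    LEFT JOIN Popularity pop ON pop.PersonID = p.PersonID
    ORDER BY p.PersonID, pop.Year
    """
).fetchall()

# Concatenate per person in the client; rows are already sorted by PersonID.
result = []
for (_, name), group in groupby(rows, key=lambda r: (r[0], r[1])):
    values = [format(value, "g") for _, _, value in group if value is not None]
    result.append((name, ",".join(values) if values else None))

for name, value in result:
    print(name, value)
```

The cost is the one described above: every detail row travels to the client before it is collapsed, so this only makes sense when the server cannot aggregate strings itself.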

SQL: How to get the AVG(MIN(number))?

I am looking for the AVERAGE (overall) of the MINIMUM number (grouped by person).
My table looks like this:
Rank Name
1 Amy
2 Amy
3 Amy
2 Bart
1 Charlie
2 David
5 David
1 Ed
2 Frank
4 Frank
5 Frank
I want to know the AVERAGE of the lowest scores. For these people, the lowest scores are:
Rank Name
1 Amy
2 Bart
1 Charlie
2 David
1 Ed
2 Frank
Giving me a final answer of 1.5 - because three people have a MIN(Rank) of 1 and the other three have a MIN(Rank) of 2. That's what I'm looking for - a single number.
My real data has a couple hundred rows, so it's not terribly big. But I can't figure out how to do this in a single, simple statement. Thank you for any help.
Try this:
;WITH MinScores
AS
(
SELECT
"Rank",
Name,
ROW_NUMBER() OVER(PARTITION BY Name ORDER BY "Rank") row_num
FROM Table1
)
SELECT
CAST(SUM("Rank") AS DECIMAL(10, 2)) /
COUNT("Rank")
FROM MinScores
WHERE row_num = 1;
SQL Fiddle Demo
Selecting the set of minimum values is straightforward. The cast() is necessary to avoid integer division later. You could also avoid integer division by casting to float instead of decimal. (But you should be aware that floats are "useful approximations".)
select name, cast(min(rank) as decimal) as min_rank
from Table1
group by name
Now you can use the minimums as a common table expression, and select from it.
with minimums as (
select name, cast(min(rank) as decimal) as min_rank
from Table1
group by name
)
select avg(min_rank) avg_min_rank
from minimums
If you happen to need to do the same thing on a platform that doesn't support common table expressions, you can a) create a view of minimums, and select from that view, or b) use the minimums as a derived table.
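The derived-table variant, and the integer-division pitfall that the cast guards against, can be tried with a small sketch. It uses Python's sqlite3 module as a stand-in database and the sample data from the question; note that SQLite's AVG always returns a floating-point value, so the cast is not needed there, while dividing an integer SUM by an integer COUNT truncates just as described.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE Table1 ("Rank" INTEGER, Name TEXT)')
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?)",
    [
        (1, "Amy"), (2, "Amy"), (3, "Amy"),
        (2, "Bart"),
        (1, "Charlie"),
        (2, "David"), (5, "David"),
        (1, "Ed"),
        (2, "Frank"), (4, "Frank"), (5, "Frank"),
    ],
)

# Derived table of per-person minimums, averaged in the outer query.
avg_min = conn.execute(
    """
    SELECT AVG(MinRank)
    FROM (SELECT MIN("Rank") AS MinRank FROM Table1 GROUP BY Name) AS a
    """
).fetchone()[0]

# Same minimums, but integer SUM / integer COUNT truncates the result.
truncated = conn.execute(
    """
    SELECT SUM(MinRank) / COUNT(MinRank)
    FROM (SELECT MIN("Rank") AS MinRank FROM Table1 GROUP BY Name) AS a
    """
).fetchone()[0]

print(avg_min, truncated)
```

The per-person minimums sum to 9 over 6 people, so the true average is 1.5 while the integer division yields 1.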
You might try using a derived table to get the minimums, then get the average minimum in the outer query, as in:
-- Get the avg min rank as a decimal
select avg(MinRank * 1.0) as AvgRank
from (
-- Get everyone's min rank
select min([Rank]) as MinRank
from MyTable
group by Name
) as a
I think the easiest one will be
for max
select name , max_rank = max(rank)
from table
group by name;
for average
select name , avg_rank = avg(rank)
from table
group by name;

write a query to identify discrepancy

I have a table with Student IDs and Student Names. There have been issues with assigning unique Student IDs to students, and hence I want to find the duplicates.
Here is the sample Table:
Student ID Student Name
1 Jack
1 John
1 Bill
2 Amanda
2 Molly
3 Ron
4 Matt
5 James
6 Kathy
6 Will
Here I want a third column "Duplicate_Count" to display count of duplicate records.
For example, "Duplicate_Count" would display "3" for Student ID = 1, and so on. How can I do this?
Thanks in advance
Select StudentId, Count(*) DupCount
From Table
Group By StudentId
Having Count(*) > 1
Order By Count(*) desc
Select
aa.StudentId, aa.StudentName, bb.DupCount
from
Table as aa
join
(
Select StudentId, Count(*) as DupCount from Table group by StudentId
) as bb
on aa.StudentId = bb.StudentId
The derived table (the subquery aliased bb) gives the count for each StudentId; it is joined back to the original table to add the count to each student record.
If you want to add a column to the table to hold dupcount, this query can be used in an update statement to update that column in the table.
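A runnable sketch of the join-back query, using the sample data from the question, is shown below. The table name Students is an assumption (the answer's Table is a reserved word in most databases), and an in-memory SQLite database accessed from Python stands in for the real one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (StudentId INTEGER, StudentName TEXT)")
conn.executemany(
    "INSERT INTO Students VALUES (?, ?)",
    [
        (1, "Jack"), (1, "John"), (1, "Bill"),
        (2, "Amanda"), (2, "Molly"),
        (3, "Ron"), (4, "Matt"), (5, "James"),
        (6, "Kathy"), (6, "Will"),
    ],
)

# Count rows per StudentId, then attach that count to every student record.
rows = conn.execute(
    """
    SELECT aa.StudentId, aa.StudentName, bb.DupCount
    FROM Students AS aa
    JOIN (
        SELECT StudentId, COUNT(*) AS DupCount
        FROM Students
        GROUP BY StudentId
    ) AS bb ON aa.StudentId = bb.StudentId
    ORDER BY aa.StudentId, aa.StudentName
    """
).fetchall()

for student_id, name, dup_count in rows:
    print(student_id, name, dup_count)
```

Every student keeps their own row, and IDs shared by several students show a count greater than 1.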
This should work:
update mytable
set duplicate_count = (select count(*) from mytable t where t.id = mytable.id)
UPDATE:
As mentioned by @HansUp, adding a new column with the duplicate count probably doesn't make sense, but that really depends on what the OP originally thought of using it for. I'm leaving the answer in case it is of help for someone else.

count without group

I have one table named GUYS(ID, NAME, PHONE) and I need to add a count of how many guys have the same name while still showing all of them, so I can't group them.
example:
ID NAME PHONE
1 John 335
2 Harry 444
3 James 367
4 John 742
5 John 654
the wanted output should be
ID NAME PHONE COUNT
1 John 335 3
2 Harry 444 1
3 James 367 1
4 John 742 3
5 John 654 3
How could I do that? I only manage to get lots of guys with different counts.
thanks
Update for 8.0+: This answer was written well before MySQL version 8, which introduced window functions with mostly the same syntax as the existing ones in Oracle.
In this new syntax, the solution would be
SELECT
t.name,
t.phone,
COUNT('x') OVER (PARTITION BY t.name) AS namecounter
FROM
Guys t
The answer below still works on newer versions as well, and in this particular case is just as simple, but depending on the circumstances, these window functions are way easier to use.
Older versions: Since MySQL, until version 8, didn't have analytical functions like Oracle, you'd have to resort to a sub-query.
Don't use GROUP BY, use a sub-select to count the number of guys with the same name:
SELECT
t.name,
t.phone,
(SELECT COUNT('x') FROM Guys ct
WHERE ct.name = t.name) as namecounter
FROM
Guys t
You'd think that running a sub-select for every row would be slow, but if you've got proper indexes, MySQL will optimize this query and you'll see that it runs just fine.
In this example, you should have an index on Guys.name. If you have multiple columns in the where clause of the subquery, the query would probably benefit from a single combined index on all of those columns.
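The sub-select approach together with the suggested index can be sketched like this. SQLite (through Python's sqlite3 module) is used here only as a convenient stand-in for MySQL, so it says nothing about MySQL's optimizer; the index name idx_guys_name is hypothetical and the rows come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Guys (id INTEGER PRIMARY KEY, name TEXT, phone INTEGER)")
# Index on the column used by the correlated sub-select.
conn.execute("CREATE INDEX idx_guys_name ON Guys (name)")
conn.executemany(
    "INSERT INTO Guys VALUES (?, ?, ?)",
    [
        (1, "John", 335),
        (2, "Harry", 444),
        (3, "James", 367),
        (4, "John", 742),
        (5, "John", 654),
    ],
)

# No GROUP BY: each row gets its own count from a correlated sub-select.
rows = conn.execute(
    """
    SELECT t.id, t.name, t.phone,
           (SELECT COUNT(*) FROM Guys ct WHERE ct.name = t.name) AS namecounter
    FROM Guys t
    ORDER BY t.id
    """
).fetchall()

for row in rows:
    print(row)
```

All five rows are returned, and each John row carries a count of 3, matching the wanted output in the question.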
Use a window (analytic) function:
select g.ID, g.Name, g.Phone, count(*) over ( partition by g.name ) as Count
from
Guys g;
You can still use a GROUP BY for the count, you just need to JOIN it back to your original table to get all the records, like this:
select g.ID, g.Name, g.Phone, gc.Count
from Guys g
inner join (
select Name, count(*) as Count
from Guys
group by Name
) gc on g.Name = gc.Name
In Oracle DB you can use
SELECT g.ID, g.NAME, g.PHONE,
(SELECT COUNT(g2.ID) FROM GUYS g2 WHERE g2.NAME = g.NAME) AS NAME_COUNT
FROM GUYS g;
DECLARE @tbl table
(ID int,NAME varchar(20), PHONE int)
insert into @tbl
select
1 ,'John', 335
union all
select
2 ,'Harry', 444
union all
select
3 ,'James', 367
union all
select
4 ,'John', 742
union all
select
5 ,'John', 654
SELECT
ID
, Name
, Phone
, count(*) over(partition by Name)
FROM @tbl
ORDER BY ID
select id, name, phone,(select count(name) from users u1 where u1.name=u2.name) count from users u2
try
select column1, count(1) over ()
it should help