I use Oracle 11g. I have read a lot of articles about it, but I don't understand
exactly what happens in the database. Let's say we have two tables:
select * from Employee
select * from student
When we want to group by multiple columns:
SELECT SUBJECT, YEAR, Count(*)
FROM Student
GROUP BY SUBJECT, YEAR;
My question is: what exactly happens in the database? Is COUNT(*) evaluated first for every column in the GROUP BY and the result sorted afterwards, or something else? Can anyone explain it in detail?
SQL is a declarative language, not a procedural language.
What the query does is determine all rows in the original data where the group by keys are the same. It then reduces them to one row.
For example, in your data, these rows all have the same subject and year:
subject year name
English 1 Harsh
English 1 Pratik
English 1 Ramesh
You are saying to group by subject, year, so these become:
Subject Year Count(*)
English 1 3
Often, this aggregation is implemented using sorting. However, that is up to the database, and there are many other algorithms (hashing, for example). You cannot assume that the database will sort the data. But if it is easier for you to think of it that way, you can picture the data being sorted by the group by keys in order to identify the groups. Just one caution: the returned rows are not necessarily in any particular order (unless your query includes an order by).
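To see the grouping concretely, here is a minimal sketch that runs the query against an in-memory SQLite database through Python's sqlite3 module (not Oracle, but the GROUP BY behaviour shown is the same). The three English rows come from the question; the other two rows are made up for illustration. An ORDER BY is added only so the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (subject TEXT, year INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?)",
    [
        ("English", 1, "Harsh"),
        ("English", 1, "Pratik"),
        ("English", 1, "Ramesh"),
        ("English", 2, "Asha"),  # hypothetical row, not in the question
        ("Maths", 1, "Ravi"),    # hypothetical row, not in the question
    ],
)

# Rows with the same (subject, year) pair collapse into one output row.
grouped = conn.execute(
    "SELECT subject, year, COUNT(*) FROM Student "
    "GROUP BY subject, year "
    "ORDER BY subject, year"  # without this, the row order is unspecified
).fetchall()

for row in grouped:
    print(row)
```

Each distinct (subject, year) combination yields exactly one row, and COUNT(*) is the number of input rows that fell into that group.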
I keep getting the same value multiple times and I don't know what I am doing wrong. It's probably something very simple, but nothing seems to work for me. I need this for a school project and I have only been doing this for about a week.
This is my code:
select hobby
from preshobby
order by hobby asc
When I click execute I get the same value a couple of times. For example:
Wrestling
Wlking
Walking
Walking
Walking
Walking
Touch Football
Tennis
I need the result to be in ascending order and each value should only appear once.
Use distinct:
select distinct hobby
from preshobby
order by hobby
Note that you don't need to specify asc with order by as ascending is the default sort order in most versions of SQL.
Your table probably has many entries with repeated hobbies, so you need to group them like this:
select hobby
from preshobby
group by hobby order by hobby asc
You are selecting every value of hobby stored in that column. Since many people share the same hobby, you see repeated values when you query the column. Use distinct like this:
select distinct hobby from preshobby;
The default order is ascending, so you need not specify a direction unless you want descending.
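A quick way to convince yourself that the DISTINCT and GROUP BY answers return the same rows is to run both. This is only a sketch on an in-memory SQLite database via Python's sqlite3; the table contents are a cleaned-up subset of the values shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE preshobby (hobby TEXT)")
conn.executemany(
    "INSERT INTO preshobby VALUES (?)",
    [("Wrestling",), ("Walking",), ("Walking",), ("Walking",),
     ("Walking",), ("Touch Football",), ("Tennis",)],
)

# DISTINCT removes duplicate rows from the result.
distinct_rows = conn.execute(
    "SELECT DISTINCT hobby FROM preshobby ORDER BY hobby"
).fetchall()

# GROUP BY on the same column collapses duplicates into one row per value.
grouped_rows = conn.execute(
    "SELECT hobby FROM preshobby GROUP BY hobby ORDER BY hobby"
).fetchall()

print(distinct_rows)
print(grouped_rows)
```

Both queries list each hobby once, in ascending order. DISTINCT states the intent more directly when no aggregate is needed.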
I have a table called correctObjects. In this table there are a lot of groups (GRUP), each with a different number of records. One example is given below: grup 544 has 5 rows in the table. First I need to group all records by the GRUP column, then I need to do an inner matching by the CAP column. Grup #544 has three different CAP values, so I must assign an inner group number to these records. How can I do this two-level grouping? The GRUP column is already filled in. The Inner Grup column is null in every record.
After the inner grouping, it must look like this:
I am using Oracle 11g R2 and PL/SQL Developer
Your question lacks certain details, so I'll just give you a starting point, and you can tweak it to suit your needs.
It's not entirely clear, but the way I understand it, you want to rank the different rows by cap. And I think the ranking is independent for every distinct grup value.
What's not clear to me is why 125 mm is ranked 1, and 62 mm is ranked 2. Is it based on the value? Is it based on which row is the first one, and if so, how are the rows ordered? Or maybe you don't really care which one is first or second, as long as they are grouped correctly. I'll have to assume the latter.
In any case, it sounds like you want to use the dense_rank() analytic function in some form:
select mip, startmi, cap, grup,
dense_rank() over (partition by grup order by cap) as inner_grup
from tbl
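For illustration, here is roughly what that dense_rank() query produces. The rows below are entirely made up (the question's data was not included), cap is assumed to be numeric, and the sketch runs on SQLite through Python's sqlite3 rather than Oracle. It needs SQLite 3.25 or later for window functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (mip INTEGER, startmi INTEGER, cap INTEGER, grup INTEGER)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?, ?)",
    [
        # hypothetical sample rows: (mip, startmi, cap, grup)
        (1, 0, 125, 544),
        (2, 10, 125, 544),
        (3, 20, 62, 544),
        (4, 30, 62, 544),
        (5, 40, 90, 544),
        (6, 0, 200, 545),
    ],
)

# Within each grup, rows sharing a cap get the same number, and the
# numbers are consecutive (no gaps) in order of cap.
ranked = conn.execute(
    "SELECT mip, cap, grup, "
    "       DENSE_RANK() OVER (PARTITION BY grup ORDER BY cap) AS inner_grup "
    "FROM tbl "
    "ORDER BY grup, cap, mip"
).fetchall()

for row in ranked:
    print(row)
```

Because the ranking is ordered by cap ascending, the smallest cap in each grup gets inner group 1. If a different numbering is wanted, change the ORDER BY inside the OVER clause.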
So I'm taking a course on learning basic SQL (using Oracle), and I felt like I had become fairly fluent with using SELECT statements (grouping, joining, having, etc), but now I'm at a loss on how to deal with this latest problem.
I need to write a statement that would only display rows with more than one piece of data. So, say I had
COMPANY PRODUCT
One Car
One Book
Two Game
it should only list company 'One'. But I can't find anything online to help me.
Select Company
From YourTableName
Group By Company
Having Count(*) > 1
A better way, which also shows the count for each company, is:
Select Company,Count(*)
From YourTableName
Group By Company
Having Count(*) > 1
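Here is a small runnable sketch of the HAVING approach using the three rows from the question. It uses an in-memory SQLite database via Python's sqlite3, and the table name Products is just a placeholder.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Products (company TEXT, product TEXT)")  # placeholder name
conn.executemany(
    "INSERT INTO Products VALUES (?, ?)",
    [("One", "Car"), ("One", "Book"), ("Two", "Game")],
)

# GROUP BY builds one group per company; HAVING then keeps only the
# groups that contain more than one row.
repeated = conn.execute(
    "SELECT company, COUNT(*) FROM Products "
    "GROUP BY company "
    "HAVING COUNT(*) > 1"
).fetchall()

print(repeated)
```

Company 'Two' has a single row, so its group is dropped by the HAVING clause.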
I am trying to solve some queries from SQLZoo to practice my SQL skills.
I have a table nobel with columns (year, subject, winner) which holds information on the people who have earned Nobel prizes in a given year for a given subject.
So I assume the primary key to be composite of (year, subject, winner).
The problem I am trying to solve is: Show winners who have won more than one subject.
The output of the SQL query should have just one column with the winner names.
I feel that I should be using group by together with having count(winner) > 1. But I think I need to group by subject, and that's where my problem is.
I am not looking for a query. If you can provide me with more logic than query that would help. Also do not worry about the database it needs to be implemented on. I am just practicing these questions.
This should give you people who won on multiple different subjects in any number of years:
SELECT winner
FROM nobel
GROUP BY winner
HAVING COUNT(DISTINCT subject) > 1
For a specific year, just add WHERE year = <whatever> (or GROUP BY year for all years), though no person in history won a Nobel for two subjects in a single year - but who knows what future brings ;)
You should really consider creating a SQLite or MySQL database on your local machine so that you can practice making queries.
The primary key is irrelevant to your question, don't worry about it.
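You are thinking along the correct lines: you will need a group by and a having clause. If you group by winner, then your proposed query should work. I've added a count of wins. Note that count(winner) counts every win, so someone who won twice in the same subject would also be returned.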
select
winner, count(winner) as wins
from
nobel
group by winner
having (count(winner)) > 1
SELECT winner
FROM nobel
GROUP BY winner
HAVING COUNT(winner) > 1
This is the simplest way to do it, although it returns anyone with more than one win, including repeat wins in the same subject.
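The difference between COUNT(DISTINCT subject) and COUNT(winner) shows up as soon as someone wins the same subject twice. A minimal sketch on an in-memory SQLite database via Python's sqlite3, with a handful of hand-entered rows rather than the real SQLZoo table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nobel (year INTEGER, subject TEXT, winner TEXT)")
conn.executemany(
    "INSERT INTO nobel VALUES (?, ?, ?)",
    [
        # small hand-entered sample, not the full SQLZoo data
        (1903, "Physics", "Marie Curie"),
        (1911, "Chemistry", "Marie Curie"),
        (1954, "Chemistry", "Linus Pauling"),
        (1962, "Peace", "Linus Pauling"),
        (1958, "Chemistry", "Frederick Sanger"),
        (1980, "Chemistry", "Frederick Sanger"),
        (1921, "Physics", "Albert Einstein"),
    ],
)

# Winners with prizes in more than one distinct subject.
multi_subject = conn.execute(
    "SELECT winner FROM nobel GROUP BY winner "
    "HAVING COUNT(DISTINCT subject) > 1 ORDER BY winner"
).fetchall()

# Winners with more than one prize of any kind.
multi_win = conn.execute(
    "SELECT winner FROM nobel GROUP BY winner "
    "HAVING COUNT(winner) > 1 ORDER BY winner"
).fetchall()

print(multi_subject)
print(multi_win)
```

Frederick Sanger appears only in the second result, because both of his prizes are in Chemistry. For "more than one subject", the COUNT(DISTINCT subject) form is the one that matches the wording of the problem.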
I'm trying to make sense of the right way to use JOIN, COUNT(*), and GROUP BY to do a pretty simple query. I've actually gotten it to work (see below) but from what I've read, I'm using an extra GROUP BY that I shouldn't be.
(Note: The problem below isn't my actual problem (which deals with more complicated tables), but I've tried to come up with an analogous problem)
I have two tables:
Table: Person
-------------
key name cityKey
1 Alice 1
2 Bob 2
3 Charles 2
4 David 1
Table: City
-------------
key name
1 Albany
2 Berkeley
3 Chico
I'd like to do a query on the Person table (with some WHERE clause) that returns:
the number of matching people in each city
the key for the city
the name of the city.
If I do
SELECT COUNT(Person.key) AS count, City.key AS cityKey, City.name AS cityName
FROM Person
LEFT JOIN City ON Person.cityKey = City.key
GROUP BY Person.cityKey, City.name
I get the result that I want
count cityKey cityName
2 1 Albany
2 2 Berkeley
However, I've read that throwing in that last part of the GROUP BY clause (City.name) just to make it work is wrong.
So what's the right way to do this? I've been trying to google for an answer, but I feel like there's something fundamental that I'm just not getting.
I don't think that it's "wrong" in this case, because you've got a one-to-one relationship between city name and city key. You could rewrite it so that a sub-select counts the persons per city key, and then join that result to the city table to pick up the name, but it's debatable whether that would be better. It's a matter of style and opinion, I guess.
select PC.ct, City.key, City.name
from City
join (select count(Person.key) ct, cityKey key from Person group by cityKey) PC
on City.key = PC.key
if my SQL isn't too rusty :-)
...I've read that throwing in that last part of the GROUP BY clause (City.name) just to make it work is wrong.
You misunderstand, you got it backwards.
Standard SQL requires you to specify in the GROUP BY all the columns mentioned in the SELECT that are not wrapped in aggregate functions. If you don't want certain columns in the GROUP BY, wrap them in aggregate functions. Depending on the database, you could use the analytic/windowing function OVER...
However, MySQL and SQLite provide the "feature" where you can omit these columns from the group by, which leads to no end of "why doesn't this port from MySQL to fill_in_the_blank database?!" questions on Stack Overflow and numerous other sites & forums.
"However, I've read that throwing in that last part of the GROUP BY clause (City.name) just to make it work is wrong."
It's not wrong. You have to understand how the Query Optimizer sees your query. The order in which it is parsed is what requires you to "throw the last part in." The optimizer sees your query in something akin to this order:
the required tables are joined
the composite dataset is filtered through the WHERE clause
the remaining rows are chopped into groups by the GROUP BY clause, and aggregated
they are then filtered again, through the HAVING clause
finally operated on, by SELECT / ORDER BY, UPDATE or DELETE.
The point here is that it's not that the GROUP BY has to name all the columns in the SELECT, but in fact it is the opposite - the SELECT cannot include any columns not already in the GROUP BY.
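The order described above can be observed directly: WHERE removes rows before the groups are formed, and HAVING removes whole groups afterwards. A small sketch using the Person rows from the question, run on in-memory SQLite via Python's sqlite3 (the column named key is double-quoted to keep it safe as an identifier; the filter on 'David' is just an example condition):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE Person ("key" INTEGER, name TEXT, cityKey INTEGER)')
conn.executemany(
    "INSERT INTO Person VALUES (?, ?, ?)",
    [(1, "Alice", 1), (2, "Bob", 2), (3, "Charles", 2), (4, "David", 1)],
)

# No WHERE: both cities have two people, so both groups pass HAVING.
without_where = conn.execute(
    "SELECT cityKey, COUNT(*) FROM Person "
    "GROUP BY cityKey HAVING COUNT(*) > 1 ORDER BY cityKey"
).fetchall()

# WHERE drops David before grouping, so city 1 is down to one person
# by the time HAVING is evaluated and that group is filtered out.
with_where = conn.execute(
    "SELECT cityKey, COUNT(*) FROM Person "
    "WHERE name <> 'David' "
    "GROUP BY cityKey HAVING COUNT(*) > 1 ORDER BY cityKey"
).fetchall()

print(without_where)
print(with_where)
```

The SELECT list is evaluated on the grouped rows, which is why it can only refer to grouping columns or aggregates.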
Your query would only work on MySQL, because you group on Person.cityKey but select city.key. All other databases would require you to use an aggregate like min(city.key), or to add City.key to the group by clause.
Because the combination of city name and city key is unique, the following are equivalent:
select count(person.key), min(city.key), min(city.name)
...
group by person.citykey
Or:
select count(person.key), city.key, city.name
...
group by person.citykey, city.key, city.name
Or:
select count(person.key), city.key, max(city.name)
...
group by city.key
All rows in the group will have the same city name and key, so it doesn't matter if you use the max or min aggregate.
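As a sanity check, the three forms can be run side by side on the sample tables from the question. This is a sketch on in-memory SQLite via Python's sqlite3; the elided "..." part is assumed to be the question's FROM Person LEFT JOIN City clause, and the column named key is double-quoted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE Person ("key" INTEGER, name TEXT, cityKey INTEGER)')
conn.execute('CREATE TABLE City ("key" INTEGER, name TEXT)')
conn.executemany(
    "INSERT INTO Person VALUES (?, ?, ?)",
    [(1, "Alice", 1), (2, "Bob", 2), (3, "Charles", 2), (4, "David", 1)],
)
conn.executemany(
    "INSERT INTO City VALUES (?, ?)",
    [(1, "Albany"), (2, "Berkeley"), (3, "Chico")],
)

# Assumed to stand in for the "..." in the queries above.
FROM_JOIN = 'FROM Person LEFT JOIN City ON Person.cityKey = City."key" '

# Form 1: group on the person side, aggregate both city columns.
q1 = ('SELECT COUNT(Person."key"), MIN(City."key"), MIN(City.name) '
      + FROM_JOIN + "GROUP BY Person.cityKey ORDER BY 2")
# Form 2: list every non-aggregated column in the GROUP BY.
q2 = ('SELECT COUNT(Person."key"), City."key", City.name '
      + FROM_JOIN + 'GROUP BY Person.cityKey, City."key", City.name ORDER BY 2')
# Form 3: group on the city key, aggregate the name.
q3 = ('SELECT COUNT(Person."key"), City."key", MAX(City.name) '
      + FROM_JOIN + 'GROUP BY City."key" ORDER BY 2')

results = [conn.execute(q).fetchall() for q in (q1, q2, q3)]
for rows in results:
    print(rows)
```

All three return the same two rows for this data. Chico does not appear because the join starts from Person and nobody lives there.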
P.S. If you'd like to count only different persons, even if they have multiple rows, try:
count(DISTINCT person.key)
instead of
count(person.key)