Getting value count from an Oracle Table - sql

I have a table that contains employees. Since the company I work for is quite big (>3k employees), it is only natural that some of them share the same name. They can be differentiated by their usernames, but since a webpage needs a drop-down listing all of these users, I need to add some extra data to their names.
I know I could first grab all of the users, run them through a foreach, and add a count to each user object. That would be quite inefficient, though. So I need a good SQL query that does this for me. Could a sub-query be what I need?
My Table looks something like this:
name ----- surname ----- username
John Mayer jmaye
Suzan Harvey sharv
John Mayer jmay3
Ideally, the query would return the same 3 fields plus a boolean indicating whether more than one person has the same name and surname combination.

Adding the flag to Daniel's answer...
SELECT NAME, SURNAME, USERNAME, DECODE(COUNT(*) OVER (PARTITION BY NAME, SURNAME), 1, 'N', 'Y')
FROM
YOUR_TABLE;
Please note that Oracle SQL has traditionally had no BOOLEAN type (sigh...); one was only added in Oracle Database 23c, so a 'Y'/'N' flag is the portable choice.

This can be easily done with a count over partition:
SELECT NAME, SURNAME, USERNAME, COUNT(*) OVER (PARTITION BY NAME, SURNAME)
FROM
YOUR_TABLE;
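The flag idea from the two answers above can be tried out with a small sketch. It uses Python's built-in sqlite3 module instead of Oracle, purely as an assumption to keep the example self-contained (window functions need SQLite 3.25 or later). The table name employees and the alias has_namesake are hypothetical, and CASE stands in for DECODE, which is Oracle-specific.

```python
import sqlite3

# In-memory database standing in for the Oracle table (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, surname TEXT, username TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        ("John", "Mayer", "jmaye"),
        ("Suzan", "Harvey", "sharv"),
        ("John", "Mayer", "jmay3"),
    ],
)

# Count rows per (name, surname) without collapsing them, then turn the
# count into a 'Y'/'N' flag. has_namesake is a hypothetical alias.
rows = conn.execute(
    """
    SELECT name, surname, username,
           CASE WHEN COUNT(*) OVER (PARTITION BY name, surname) > 1
                THEN 'Y' ELSE 'N' END AS has_namesake
    FROM employees
    ORDER BY username
    """
).fetchall()

for row in rows:
    print(row)
```

Both John Mayer rows come back flagged 'Y' and Suzan Harvey comes back 'N', so the drop-down only needs to show the username next to flagged names.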

Related

Count specific column in DB2

I have a table contact with three columns e.g. name, surname and age.
I would like to count the number of entries from the specific column surname.
What does the SELECT statement look like in DB2 to achieve this?
You can name the result column using AS:
select count(surname) as surname_count
from contact c;
I assume you want to perform a
select count(surname)
from contact
group by surname
but you need to put some effort into the question and show that you have already done some research beforehand
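The two readings of "count the surname column" give different results, which a quick sketch can show. It runs on SQLite through Python's sqlite3 module rather than DB2 (an assumption for runnability; the SQL itself is plain standard SQL), and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contact (name TEXT, surname TEXT, age INTEGER)")
# Made-up sample rows; one contact has no surname (NULL).
conn.executemany(
    "INSERT INTO contact VALUES (?, ?, ?)",
    [("Ann", "Smith", 30), ("Bob", "Smith", 41), ("Cid", "Jones", 25), ("Dee", None, 52)],
)

# COUNT(*) counts rows; COUNT(surname) counts only rows where surname is not NULL.
total_rows = conn.execute("SELECT COUNT(*) FROM contact").fetchone()[0]
non_null = conn.execute(
    "SELECT COUNT(surname) AS surname_count FROM contact"
).fetchone()[0]

# GROUP BY gives one count per distinct surname instead of a single total.
per_surname = conn.execute(
    """
    SELECT surname, COUNT(*) AS surname_count
    FROM contact
    WHERE surname IS NOT NULL
    GROUP BY surname
    ORDER BY surname
    """
).fetchall()

print(total_rows, non_null, per_surname)
```

Selecting surname alongside the count in the grouped query makes it clear which count belongs to which surname.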

Number of tuples outputted by GROUP BY primitive [duplicate]

I know that if you have one aggregate function in a SELECT statement, then all the other values in the statement must be either aggregate functions, or listed in a GROUP BY clause. I don't understand why that's the case.
If I do:
SELECT Name, 'Jones' AS Surname FROM People
I get:
NAME SURNAME
Dave Jones
Susan Jones
Amy Jones
So, the DBMS has taken a value from each row, and appended a single value to it in the result set. That's fine. But if that works, why can't I do:
SELECT Name, COUNT(Name) AS Surname FROM People
It seems like the same idea, take a value from each row and append a single value. But instead of:
NAME SURNAME
Dave 3
Susan 3
Amy 3
I get:
You tried to execute a query that does not include the specified expression 'Name' as part of an aggregate function.
I know it's not allowed, but the two circumstances seem so similar that I don't understand why. Is it to make the DBMS easier to implement? If anyone can explain to me why it doesn't work like I think it should, I'd be very grateful.
Aggregates don't work on a complete result; they only work on a group in a result.
Consider a table containing:
Person Pet
-------- --------
Amy Cat
Amy Dog
Amy Canary
Dave Dog
Susan Snake
Susan Spider
If you use a query that groups on Person, it will divide the data into these groups:
Amy:
Amy Cat
Amy Dog
Amy Canary
Dave:
Dave Dog
Susan:
Susan Snake
Susan Spider
If you use an aggregate, for example the count aggregate, it will produce one result for each group:
Amy:
Amy Cat
Amy Dog
Amy Canary count(*) = 3
Dave:
Dave Dog count(*) = 1
Susan:
Susan Snake
Susan Spider count(*) = 2
So, the query select Person, count(*) from People group by Person gives you one record for each group:
Amy 3
Dave 1
Susan 2
If you try to get the Pet field in the result also, that doesn't work because there may be multiple values for that field in each group.
(Some databases, like MySQL, do allow that anyway and just return an arbitrary value from within the group; it's your responsibility to know whether the result is sensible or not.)
If you use an aggregate but don't specify any grouping, the query will still be grouped, and the entire result is a single group. So the query select count(*) from People will create a single group containing all records, and the aggregate can count the records in that group. The result contains one row for each group, and as there is only one group, there will be one row in the result.
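The grouping described above can be reproduced with a short sketch. It assumes SQLite via Python's sqlite3 module as a stand-in database, and reuses the Person/Pet rows from this answer in a table called People.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE People (Person TEXT, Pet TEXT)")
conn.executemany(
    "INSERT INTO People VALUES (?, ?)",
    [
        ("Amy", "Cat"), ("Amy", "Dog"), ("Amy", "Canary"),
        ("Dave", "Dog"),
        ("Susan", "Snake"), ("Susan", "Spider"),
    ],
)

# One output row per group: each Person with the size of its group.
per_person = conn.execute(
    "SELECT Person, COUNT(*) FROM People GROUP BY Person ORDER BY Person"
).fetchall()

# No GROUP BY: the whole table is one implicit group, so one row comes back.
whole_table = conn.execute("SELECT COUNT(*) FROM People").fetchall()

print(per_person)
print(whole_table)
```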
Think about it this way: when you call COUNT without grouping, it "collapses" the table to a single group making it impossible to access the individual items within a group in a select clause.
You can still get your result using a subquery or a cross join:
SELECT p1.Name, COUNT(p2.Name) AS Surname FROM People p1 CROSS JOIN People p2 GROUP BY p1.Name
SELECT Name, (SELECT COUNT(Name) FROM People) AS Surname FROM People
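Here is a runnable sketch of both workarounds, assuming SQLite through Python's sqlite3 module and the three-row People table from the question. The alias name_count is hypothetical (the queries above reuse Surname as the alias). Note that the cross-join form only agrees with the subquery while every Name is unique; with repeated names, each group's count would be multiplied by the number of repeats.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE People (Name TEXT)")
conn.executemany("INSERT INTO People VALUES (?)", [("Dave",), ("Susan",), ("Amy",)])

# Scalar subquery: computed once over the whole table, appended to every row.
via_subquery = conn.execute(
    """
    SELECT Name, (SELECT COUNT(Name) FROM People) AS name_count
    FROM People
    ORDER BY Name
    """
).fetchall()

# Cross join: pair every row with every row, then count the pairs per name.
via_cross_join = conn.execute(
    """
    SELECT p1.Name, COUNT(p2.Name) AS name_count
    FROM People p1 CROSS JOIN People p2
    GROUP BY p1.Name
    ORDER BY p1.Name
    """
).fetchall()

print(via_subquery)
print(via_cross_join)
```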
As others explained, when you have a GROUP BY or you are using an aggregate function like COUNT() in the SELECT list, you are doing a grouping of rows and therefore collapsing matching rows into one for every group.
When you only use aggregate functions in the SELECT list, without GROUP BY, think of it as you have a GROUP BY 1, so all rows are grouped, collapsed into one. So, if you have a hundred rows, the database can't really show you a name as there are a hundred of them.
However, for RDBMSs that have "windowing" functions, what you want is feasible: they let you use aggregate functions without a GROUP BY.
Example for SQL-Server, where all rows (names) in the table are counted:
SELECT Name
, COUNT(*) OVER() AS cnt
FROM People
How does the above work?
It shows the Name as if the COUNT(*) OVER() AS cnt did not exist, and it shows the COUNT(*) as if it were making a total grouping of the table.
Another example. If you have a Surname field on the table, you can have something like this to show all rows partitioned by Surname, counting how many people have the same Surname:
SELECT Name
, Surname
, COUNT(*) OVER(PARTITION BY Surname) AS cnt
FROM People
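To see the two window forms side by side, here is a small sketch. It assumes SQLite 3.25 or later via Python's sqlite3 module instead of SQL Server, and the surnames and the aliases total_cnt and surname_cnt are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE People (Name TEXT, Surname TEXT)")
# Made-up sample rows: two people share a surname, one does not.
conn.executemany(
    "INSERT INTO People VALUES (?, ?)",
    [("Dave", "Jones"), ("Susan", "Jones"), ("Amy", "Smith")],
)

rows = conn.execute(
    """
    SELECT Name,
           Surname,
           COUNT(*) OVER ()                     AS total_cnt,   -- whole table as one window
           COUNT(*) OVER (PARTITION BY Surname) AS surname_cnt  -- one window per surname
    FROM People
    ORDER BY Name
    """
).fetchall()

for row in rows:
    print(row)
```

Every input row survives, which is the difference from GROUP BY: the aggregate is computed per window and attached to each row rather than collapsing the rows.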
Your query implicitly asks for different types of rows in your result set, and that is not allowed. All rows returned should be of the same type and have the same kind of columns.
'SELECT name, surname' wants to return a row for every row in the table.
'SELECT COUNT(*)' wants to return a single row combining the results of all the rows in the table.
I think you're correct that in this case the database could plausibly just do both queries and then copy the result of 'SELECT COUNT(*)' into every result. One reason for not doing this is that it would be a stealth performance hit: you'd effectively be doing an extra self-join without declaring it anywhere.
Other answers have explained how to write a working version of this query, so I won't go into that.
The aggregate function and the group by clause aren't separate things, they're parts of the same thing that appear in different places in the query. If you wish to aggregate on a column, you must say what function to use for aggregation; if you wish to have an aggregation function, it has to be applied over some column.
The aggregate function takes values from multiple rows that meet a specific condition and combines them into one value. This condition is defined by the GROUP BY in your statement. So you can't combine an aggregate function with plain, non-aggregated columns unless those columns appear in a GROUP BY.
With
SELECT Name, 'Jones' AS Surname FROM People
you simply select an additional column with a fixed value... but with
SELECT Name, COUNT(Name) AS Surname FROM People GROUP BY Name
you tell the DBMS to select the Names, remember how often every Name occurred in the table, and collapse them into one row. So if you omit the GROUP BY, the DBMS can't tell how to collapse the records.

SQL Duplicate Rows

I am new to SQL and was wondering if anyone could help me solve my problem.
I have a table that contains information as follows:
firstname lastname group orderinggroup date
tim s A Facebook 6/4/13
tim s A Facebook 6/4/13
tim s A Facebook 6/4/13
dan d B Google 4/5/12
dan d B Google 4/5/12
Something like that. I want it to look like this
firstname lastname group orderinggroup date
tim s A Facebook 6/4/13
dan d B Google 4/5/12
Where there aren't duplicates for tim and dan. I tried using DISTINCT, but that only makes one column distinct, and I actually have many people named Tim, Dan, groups that are A/B, etc. I was wondering if there is a way to take the distinct of multiple columns, e.g., the distinct of firstname, lastname, group, orderinggroup, and date. Last names matter. Thanks!
You could really use the max() function to aggregate some of the columns and do something like this:
select
firstname,
lastname,
[group],
max(orderinggroup) as orderinggroup,
max([date]) as [date]
from (VALUES
('tim','s','A','Facebook','6/4/13'),
('tim','s','A','Facebook','6/4/13'),
('tim','s','A','Facebook','6/4/13'),
('dan','d','B','Google','4/5/12'),
('dan','d','B','Google','4/5/12')) as A (firstname,lastname,[group],orderinggroup,[date])
--replace this with your tablename
GROUP BY firstname, lastname, [group]
ORDER BY [group]
This is bad data and DISTINCT will work. DISTINCT is actually great for this.
SELECT DISTINCT * from table
Why even list columns unless there is direct manipulation or subqueries going on? There's not even a JOIN in your statement. Please tell me with a comment why DISTINCT does not work in this situation as you mention you've used it before.
DISTINCT checks each column for matching values. If it hits a value in record B that doesn't match record A, it will output record B as well, because it's distinct from A, even if everything before that column matches.
In your example:
firstname lastname group orderinggroup date
tim s A Facebook 6/4/13
tim s A Facebook 6/4/13
tim s A Facebook 6/4/13
dan d B Google 4/5/12
dan d B Google 4/5/12
Let's start with Tim. Every field is exactly the same. Therefore Distinct would collapse all these records into one row. Same for Dan. Now if the lastname is different in your actual database (which should be reflected in your example) then DISTINCT will not work. However, the premise of your question would need to change. You would need to discern what in fact you want to reflect in your data set. Do you want to ignore the last name? Do you want to consider it? These questions are pertinent. Group By works also, but is unnecessary as you're not doing aggregates. I hope this helps.
You can use the query below:
SELECT firstname, lastname, [group], orderinggroup, [date]
FROM tablename
GROUP BY firstname, lastname, [group], orderinggroup, [date];
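The difference between these approaches is easy to check on the sample data. The sketch below assumes SQLite via Python's sqlite3 module and a hypothetical table name orders; "group" is double-quoted because it is a reserved word. A HAVING COUNT(*) = 1 filter, by contrast, keeps only rows that never had a duplicate, so on this data it returns nothing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE orders (firstname TEXT, lastname TEXT, "group" TEXT, '
    'orderinggroup TEXT, "date" TEXT)'
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?, ?)",
    [("tim", "s", "A", "Facebook", "6/4/13")] * 3
    + [("dan", "d", "B", "Google", "4/5/12")] * 2,
)

# DISTINCT compares whole rows, so fully identical rows collapse into one.
distinct_rows = conn.execute(
    "SELECT DISTINCT * FROM orders ORDER BY firstname"
).fetchall()

# Grouping by every column gives the same rows.
grouped_rows = conn.execute(
    """
    SELECT firstname, lastname, "group", orderinggroup, "date"
    FROM orders
    GROUP BY firstname, lastname, "group", orderinggroup, "date"
    ORDER BY firstname
    """
).fetchall()

# Adding HAVING COUNT(*) = 1 drops every group that had duplicates.
never_duplicated = conn.execute(
    """
    SELECT firstname, lastname, "group", orderinggroup, "date"
    FROM orders
    GROUP BY firstname, lastname, "group", orderinggroup, "date"
    HAVING COUNT(*) = 1
    """
).fetchall()

print(distinct_rows, grouped_rows, never_duplicated)
```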

Select 2 distinct columns in 4GL

Needed for my 4gl program:
Let's say I have a table that holds a phone number and a name. There can be 2 people with the same phone number, or 2 names with 1 phone number.
I need to select just 1 of each phone number in the table.
I did:
SELECT DISTINCT phone_number, last_name FROM table
The results will show 2 records. Even though the phone number is the same, the names are different, so the row is no longer a duplicate. How can I get each phone number only once, regardless of its last_name? (But I want to get the last name as well. I don't care which one.)
DISTINCT, as you've noticed, will return rows that are distinct in their entirety.
It sounds like you're looking for something like group by. Essentially, GROUP BY phone_number will return one row for each phone number. Because you also want to get last_name, you'll need to instruct the database how you want it to be returned. You said you don't care which so you could simply write:
SELECT phone_number, MAX(last_name) as last_name
FROM table
GROUP BY phone_number
Informix also supports a FIRST_VALUE aggregate function although I've only used that in OLAP situations so I don't recall if it will work in this context.
If you don't care which last name, then try this out:
SELECT phone_number,
MAX(last_name) AS last_name
FROM table
GROUP BY phone_number
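A quick way to convince yourself of the difference is the sketch below. It assumes SQLite via Python's sqlite3 module rather than Informix 4GL, with a hypothetical table phones and made-up rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE phones (phone_number TEXT, last_name TEXT)")
# Made-up rows: two people share the first number.
conn.executemany(
    "INSERT INTO phones VALUES (?, ?)",
    [("555-0100", "Adams"), ("555-0100", "Baker"), ("555-0199", "Clark")],
)

# DISTINCT over both columns keeps all three rows, since each pair differs.
distinct_pairs = conn.execute(
    "SELECT DISTINCT phone_number, last_name FROM phones"
).fetchall()

# GROUP BY returns one row per number; MAX picks one last_name deterministically.
one_per_number = conn.execute(
    """
    SELECT phone_number, MAX(last_name) AS last_name
    FROM phones
    GROUP BY phone_number
    ORDER BY phone_number
    """
).fetchall()

print(len(distinct_pairs), one_per_number)
```

MAX on a text column returns the value that sorts last, so which name you get depends on the collation; that is fine here because the question says any name will do.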

Return all Fields and Distinct Rows

What's the best way to do this when looking for distinct rows?
SELECT DISTINCT name, address
FROM table;
I still want to return all fields, ie address1, city etc but not include them in the DISTINCT row check.
Then you have to decide what to do when there are multiple rows with the same value in the column you want the distinct check to run against, but with different values in the other columns. In that case, how does the query processor know which of the multiple values in the other columns to output? If you don't care, just write a GROUP BY on the distinct column, with MIN() or MAX() on all the other ones.
EDIT: I agree with the comments from others that as long as you have multiple dependent columns in the same table (e.g., Address1, Address2, City, State), this approach is going to give you mixed (and therefore inconsistent) results. If each column in the table is independent (for instance, if addresses all live in an Address table and only an AddressId is in this table), then it's not as significant an issue, because at least all the columns joined from the Address table will come from the same address. But you are still getting a more or less random selection of one of the multiple addresses.
This will not mix and match your city, state, etc. and should give you the last one added even:
select b.*
from (
select max(id) id, Name, Address
from table a
group by Name, Address) as a
inner join table b
on a.id = b.id
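A minimal sketch of this max(id) join, assuming SQLite via Python's sqlite3 module, a hypothetical table name addresses (since table is a reserved word), an auto-assigned integer id, and made-up rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE addresses (id INTEGER PRIMARY KEY, Name TEXT, Address TEXT, City TEXT)"
)
conn.executemany(
    "INSERT INTO addresses (Name, Address, City) VALUES (?, ?, ?)",
    [("abc", "123", "def"), ("abc", "123", "hij"), ("xyz", "9", "klm")],
)

# Inner query: the highest id per (Name, Address) pair.
# Outer join: fetch the complete row for each of those ids, so all
# columns in a result row come from the same source row.
last_per_pair = conn.execute(
    """
    SELECT b.*
    FROM (SELECT MAX(id) AS id, Name, Address
          FROM addresses
          GROUP BY Name, Address) AS a
    INNER JOIN addresses b ON a.id = b.id
    ORDER BY b.id
    """
).fetchall()

print(last_per_pair)
```

This returns the most recently added row only if ids grow with insertion order, which is an assumption about how the id column is populated.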
When you have a mixed set of fields, some of which you want to be DISTINCT and others that you just want to appear, you require an aggregate query rather than DISTINCT. DISTINCT is only for returning single copies of identical fieldsets. Something like this might work:
SELECT name,
GROUP_CONCAT(DISTINCT address) AS addresses,
GROUP_CONCAT(DISTINCT city) AS cities
FROM the_table
GROUP BY name;
The above will get one row for each name. addresses contains a comma-delimited string of all the addresses for that name, each listed once. cities does the same for all the cities.
However, I don't see how the results of this query are going to be useful. It will be impossible to tell which address belongs to which city.
If, as is often the case, you are trying to create a query that will output rows in the format you require for presentation, you're much better off accepting multiple rows and then processing the query results in your application layer.
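The loss of pairing mentioned above shows up clearly in a small sketch. It assumes SQLite via Python's sqlite3 module, whose group_concat behaves similarly to MySQL's GROUP_CONCAT for this purpose, and uses made-up rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE the_table (name TEXT, address TEXT, city TEXT)")
# Made-up rows: one name, two addresses, two cities.
conn.executemany(
    "INSERT INTO the_table VALUES (?, ?, ?)",
    [("abc", "123", "def"), ("abc", "123", "hij"), ("abc", "456", "hij")],
)

name, addresses, cities = conn.execute(
    """
    SELECT name,
           GROUP_CONCAT(DISTINCT address) AS addresses,
           GROUP_CONCAT(DISTINCT city)    AS cities
    FROM the_table
    GROUP BY name
    """
).fetchone()

# The order inside each concatenated string is not guaranteed,
# so compare as sorted lists rather than as raw strings.
print(name, sorted(addresses.split(",")), sorted(cities.split(",")))
```

The result holds two addresses and two cities, but nothing in it says that 456 only ever appeared with hij.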
I don't think you can do this because it doesn't really make sense.
name | address | city | etc...
abc | 123 | def | ...
abc | 123 | hij | ...
If you were to include city but not have it as part of the distinct clause, the value of city would be unpredictable unless you did something like MAX(city).
You can do
SELECT Name, Address, MAX(Address1), MAX(City)
FROM table
GROUP BY Name, Address
Use @JBrooks's answer below. He has a better answer.
If you're using SQL Server 2005 or above, you can use the ROW_NUMBER function. This will get you the row with the lowest ID for each name. If you want to 'group' by more columns, add them to the PARTITION BY clause of ROW_NUMBER.
SELECT id, Name, Address, ...
FROM (SELECT id, Name, Address, ...,
ROW_NUMBER() OVER (PARTITION BY Name ORDER BY id) AS RowNo
FROM table) sub
WHERE RowNo = 1
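A runnable sketch of the ROW_NUMBER approach, assuming SQLite 3.25 or later via Python's sqlite3 module in place of SQL Server. The table name addresses, the City column standing in for the elided columns, and the sample rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE addresses (id INTEGER PRIMARY KEY, Name TEXT, Address TEXT, City TEXT)"
)
conn.executemany(
    "INSERT INTO addresses (Name, Address, City) VALUES (?, ?, ?)",
    [("abc", "123", "def"), ("abc", "123", "hij"), ("xyz", "9", "klm")],
)

# Number the rows within each Name by ascending id, then keep only the
# first row of each partition. All columns come from that single row.
first_per_name = conn.execute(
    """
    SELECT id, Name, Address, City
    FROM (SELECT id, Name, Address, City,
                 ROW_NUMBER() OVER (PARTITION BY Name ORDER BY id) AS RowNo
          FROM addresses) sub
    WHERE RowNo = 1
    ORDER BY id
    """
).fetchall()

print(first_per_name)
```

Changing ORDER BY id to ORDER BY id DESC inside the OVER clause would keep the row with the highest id per name instead.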