Filter Rows by highest version - sql

I have a set of data; an example is shown in SQL Fiddle.
I want to return the highest myValue for each name.
NAME MYVALUE
A1 22
A2 22
A3 21
A4 36
A6 12
A9 5
There are 3 rows named A6; I only want the row with the highest value returned.

SELECT name,
MAX(myvalue)
FROM mytable t
GROUP BY name
SQL Fiddle
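A quick way to check the GROUP BY approach is to run it against an in-memory SQLite database from Python. This is only a sketch: the table name mytable and the sample rows are made up for illustration, and only the three A6 rows mirror the situation in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (name TEXT, myvalue INTEGER)")
# Hypothetical sample rows: A6 appears three times with different values.
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [("A1", 22), ("A6", 3), ("A6", 12), ("A6", 7), ("A9", 5)],
)

# One row per name, carrying the highest myvalue seen for that name.
rows = conn.execute(
    "SELECT name, MAX(myvalue) FROM mytable GROUP BY name ORDER BY name"
).fetchall()
print(rows)
```

With this data, A6 collapses to a single row holding 12, the largest of its three values.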

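I find analytic functions a nice way to achieve these kinds of results. In your case that would mean something like the following (DISTINCT is needed because an analytic function returns one row per input row, not one row per group):
SELECT DISTINCT
name
, MAX(myvalue) OVER (PARTITION BY name) AS max_name_value
FROM mytable;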
Analytic queries IMO offer more flexibility than using a GROUP BY alone. The article to which I've linked offers the following example:
We want to see a list of all departments (that have employees) with the average salary in that department. The following two queries give the same result: one, traditional, with and one without using the group by. The second, analytical query, can more easily be extended to return other information as well, such as an aggregation at a different grouping level.
select d.dname
, avg(e.sal)
from dept d
natural join
emp e
group
by d.dname
Using an analytical function:
select distinct
d.dname
, avg(e.sal) over (partition by d.dname) Average_Salary_in_Dept
from dept d
natural join
emp e
Read the analytical part of this query as follows: 'for each record returned, select the average of salary values for all rows in the partition (subset, group) with the same department-name as the current record’s department-name'.
You can also partition by more than one column. For example, to show the maximum ID for each unique combination of name and department, add more column names to the PARTITION BY clause: (PARTITION BY name, department).
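Partitioning by two columns can be sketched with SQLite from Python (window functions need SQLite 3.25 or newer). The table layout and rows below are assumptions for illustration, not data from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, name TEXT, department TEXT)")
# Hypothetical rows: A6 occurs twice in sales and once in admin.
conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", [
    (1, "A6", "sales"), (2, "A6", "sales"), (3, "A6", "admin"), (4, "A9", "admin"),
])

# The window function alone keeps every input row ...
per_row = conn.execute(
    "SELECT name, MAX(id) OVER (PARTITION BY name, department) FROM mytable"
).fetchall()

# ... so DISTINCT is what reduces the output to one row per partition.
result = conn.execute("""
    SELECT DISTINCT name, department,
           MAX(id) OVER (PARTITION BY name, department) AS max_id
    FROM mytable
    ORDER BY name, department
""").fetchall()
print(len(per_row), result)
```

The first query returns four rows (one per input row), while the DISTINCT version returns one row per (name, department) pair.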

Related

Nth salary in SQL

I'm trying to understand how the query below works.
SELECT *
FROM Employee Emp1
WHERE (N-1) = (
SELECT COUNT(DISTINCT(Emp2.Salary))
FROM Employee Emp2
WHERE Emp2.Salary > Emp1.Salary
)
Let's say I have 5 distinct salaries and want to get the 3rd largest salary. Will the inner query run first and then the outer query?
I'm confused about how the SQL engine does this. If it's the 3rd largest, then 3-1 = 2, so that 2 needs to match the inner count as well. How is the inner count evaluated?
Can anyone explain how it works?
The subquery is a correlated subquery, so it conceptually executes once for each row in the outer query (database optimizations aside). It counts how many distinct salaries are greater than the one on the current row of the outer query: if there are 2 distinct higher salaries, then you know that the employee on the current row has the third highest salary.
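The conceptual "once per outer row" evaluation can be imitated in plain Python. This is only a mental model, not how a database engine actually executes the query, and the salary values are made up.

```python
def nth_highest_rows(salaries, n):
    """Return the salaries that equal the n-th highest distinct value."""
    matches = []
    for outer in salaries:  # outer query: visit each row once
        # inner query: COUNT(DISTINCT salary) WHERE salary > outer salary
        higher = len({s for s in salaries if s > outer})
        if higher == n - 1:  # WHERE (N-1) = (inner count)
            matches.append(outer)
    return matches


# Hypothetical data: five distinct salaries, 4000 appears twice.
salaries = [5000, 4000, 4000, 3000, 2000, 1000]
print(nth_highest_rows(salaries, 3))
```

For n = 3 the only row with exactly two distinct higher salaries (5000 and 4000) is 3000.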
Another way to phrase this is to use row_number() for this:
select *
from (
select
e.*,
row_number() over(order by salary desc) rn
from employee e
) t
where rn = 3
Depending on how you want to handle duplicates, dense_rank() might also be an option.
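How the three ranking functions treat duplicates can be compared side by side. The sketch below uses SQLite from Python (3.25 or newer for window functions) with a made-up employee table in which the top salary appears twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, salary INTEGER)")
# Hypothetical rows: a and b share the highest salary.
conn.executemany("INSERT INTO employee VALUES (?, ?)", [
    ("a", 1500), ("b", 1500), ("c", 1400), ("d", 1200),
])

ranked = conn.execute("""
    SELECT name, salary,
           ROW_NUMBER() OVER (ORDER BY salary DESC, name) AS rn,
           RANK()       OVER (ORDER BY salary DESC)       AS rnk,
           DENSE_RANK() OVER (ORDER BY salary DESC)       AS drnk
    FROM employee
    ORDER BY salary DESC, name
""").fetchall()
for row in ranked:
    print(row)
```

ROW_NUMBER numbers the tied rows 1 and 2, RANK gives both 1 and then skips to 3, and DENSE_RANK gives both 1 and continues with 2, so "third highest" means a different row depending on the function.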
SELECT * FROM (SELECT EMP.ID, RANK() OVER (ORDER BY SALARY DESC) AS NOS FROM EMPLOYEE EMP) T WHERE T.NOS = 3
Then from this select the one with any desired rank.
It is easier to understand when you run this query:
select e1.*,
(select count(distinct e2.salary)
from employee e2
where e2.salary > e1.salary) as n
from employee e1
This is my sample table:
create table employee(salary) as (
select * from table(sys.odcinumberlist(1500, 1200, 1400, 1500, 1100)));
so my output is:
SALARY N
---------- ----------
1500 0
1200 2
1400 1
1500 0
1100 3
As you can see, the subquery counts, for each row, the salaries that are greater than the salary in the current row. For instance, for 1400 there is one DISTINCT greater salary (1500). 1500 appears twice in my table, but DISTINCT ensures it is counted once. So 1400 is second in order.
Your query moves this count to the WHERE clause and compares it with the required value. We have to subtract one, because for the highest salary there is no higher value, for the second salary there is one, and so on.
It's one of the methods used to find such values. Newer Oracle versions introduced analytic functions (rank, row_number, dense_rank), which eliminate the need for subqueries for such purposes and are often more efficient. For your query dense_rank() would be useful.
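The relationship between the counting subquery and dense_rank() can be checked with the same five salaries. The sketch below uses SQLite from Python instead of Oracle (an assumption made for portability; window functions need SQLite 3.25 or newer) and orders by rowid only to keep the insertion order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?)",
    [(1500,), (1200,), (1400,), (1500,), (1100,)],
)

# n is the count of distinct higher salaries; drnk is the dense rank.
rows = conn.execute("""
    SELECT e1.salary,
           (SELECT COUNT(DISTINCT e2.salary)
            FROM employee e2
            WHERE e2.salary > e1.salary) AS n,
           DENSE_RANK() OVER (ORDER BY e1.salary DESC) AS drnk
    FROM employee e1
    ORDER BY e1.rowid
""").fetchall()
for row in rows:
    print(row)
```

In every row drnk equals n + 1, which is why the WHERE clause compares the count with N-1.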

Oracle WHERE Clause/Searching

I'm a beginner with Oracle. In a recent search I've seen conditions like WHERE N-1, 3-2, and so on.
How does this work when searching data?
This is my code attempt so far:
SELECT name, salary
FROM #Employee e1
WHERE N-1 = (SELECT COUNT(DISTINCT salary) FROM #Employee e2
WHERE e2.salary > e1.salary)
It's a classic (and pretty old) SQL query to get the nth highest salary. I assume it is no longer used in production code (I haven't seen it), but it could be a favourite question among interviewers.
The N you are referring to is not a column or some unknown entity but a placeholder that should translate to a valid integer or bind argument in the working query. It is a correlated subquery, a subquery that is evaluated once for each row processed by the outer query. For each employee coming from the outer query, it counts the distinct salary values that are greater than that employee's salary, and it restricts the result to rows where that count equals N-1. This means you get the rows with the nth highest salary.
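To make the "N is a placeholder" point concrete, here is a sketch that supplies N as a bind argument. It uses Python's sqlite3 with a ? placeholder rather than Oracle's bind syntax, and the helper nth_highest, the table layout, and the rows are all hypothetical.

```python
import sqlite3


def nth_highest(conn, n):
    """Return rows whose salary is the n-th highest distinct salary."""
    sql = """
        SELECT name, salary
        FROM employee e1
        WHERE ? - 1 = (SELECT COUNT(DISTINCT salary)
                       FROM employee e2
                       WHERE e2.salary > e1.salary)
        ORDER BY name
    """
    return conn.execute(sql, (n,)).fetchall()  # n is bound to the ? above


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, salary INTEGER)")
conn.executemany("INSERT INTO employee VALUES (?, ?)", [
    ("ann", 3000), ("bob", 2000), ("cy", 2000), ("dee", 1000),
])
print(nth_highest(conn, 2))
```

Because the subquery counts distinct salaries, asking for n = 2 returns both employees who share the second highest salary.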
A more commonly used way to do this is the analytic function dense_rank() (or rank(), depending on your need). Read the documentation for these functions in case you aren't aware of them.
SELECT first_name,
salary
FROM (
SELECT e.*,
dense_rank() OVER(
ORDER BY salary desc
) rn
FROM employees e
)
WHERE rn = 6; -- ( n = 6 )
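In Oracle 12c and higher, even though the above query works, a handy alternative is the OFFSET ... FETCH syntax. Note that OFFSET skips rows rather than distinct salary values, so to reach the nth row you skip n-1 rows, and the result can differ from dense_rank() when salaries repeat.
SELECT *
FROM employees
ORDER BY salary DESC OFFSET 5 ROWS FETCH FIRST 1 ROWS WITH TIES; -- n = 6: skip n-1 = 5 rows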

How to count repetitions of each record in a query having multiple tables

I have a query which uses MULTIPLE tables and joins. It returns a list of items. I need to count how many times each item appears in that list. I'm working on Oracle database using SQL Developer.
Use GROUP BY, for example:
select COUNT(employee_id), department_id
from employee
GROUP BY department_id
ORDER BY department_id;
Result:
COUNT(EMPLOYEE_ID) DEPARTMENT_ID
------------------ -------------
                 6            10
                 2            20
                 2            30
                 1
Do you mean like this one?
SELECT COUNT(e.employee_id), d.dept_name
FROM employee e
INNER JOIN department d
ON d.id = e.dept_id
GROUP BY d.dept_name
I would need to see your query to understand your issue better and give an accurate solution, but if you use the COUNT function like this, you get the list of repeated elements with a counter column:
SELECT COUNT(*) AS "Quantity", l1.attrib, l2.attrib
FROM list1 l1, list2 l2
WHERE l1.attrib = l2.attrib
GROUP BY l1.attrib, l2.attrib;
What the result shows depends on the GROUP BY clause: every additional column you group by splits the counts into finer groups.
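If the multi-table query already exists, one option is to wrap it as a subquery and count over its output. The sketch below does this with SQLite from Python; the tables, columns, and rows are assumptions, since the original query was not posted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE department (id INTEGER, dept_name TEXT)")
conn.execute("CREATE TABLE employee (employee_id INTEGER, dept_id INTEGER)")
conn.executemany("INSERT INTO department VALUES (?, ?)",
                 [(10, "Admin"), (20, "Sales")])
conn.executemany("INSERT INTO employee VALUES (?, ?)",
                 [(1, 10), (2, 20), (3, 20)])

# The inner SELECT stands in for the existing join query; the outer
# SELECT counts how often each item appears in its result.
counts = conn.execute("""
    SELECT dept_name, COUNT(*) AS times
    FROM (SELECT d.dept_name
          FROM employee e
          INNER JOIN department d ON d.id = e.dept_id) joined
    GROUP BY dept_name
    ORDER BY dept_name
""").fetchall()
print(counts)
```

Keeping the original query untouched inside the parentheses means the counting logic does not need to know how many tables it joins.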

What would the query be for the following sample table?

Can you please help me with a query that would display a table like this:
Dept_ID Dept_Name
10 Admin
10 Whalen
20 Sales
20 James
20 King
20 Smith
40 Marketing
40 Neena
and so on. The schema is HR.
Display the department ID and the department name, followed by the last names of the employees working in that department.
SELECT Dept_ID, Dept_Name
FROM Your_Table
Simple as I can make it. It's very difficult (near impossible) to tell exactly what the query should be without more detail in terms of your table structure and some sample data.
From your edit, you may need something more like this:
SELECT DT.Dept_ID, DT.Dept_Name, ET.Emp_Name
FROM Dept_Table DT INNER JOIN Emp_Table ET ON DT.Dept_ID = ET.Dept_ID
ORDER BY DT.Dept_ID
This shows the employees of each department in the next column; you don't really want all of that in the same column.
When you union two data sets, there is NO implicit ordering; you could get the results in any order.
To get a particular order you must use ORDER BY.
To use ORDER BY, you must have fields to order by.
In your case, the pseudo code would be...
- ORDER BY [dept_id], [depts-then-employees], [dept_name]
The middle of those three is something that YOU are going to have to create.
One way of doing that is as follows.
Note: just because you have a field to order by does not mean that you have to select it.
SELECT
dept_id,
dept_name
FROM
(
SELECT
d.dept_id,
d.dept_name,
0 AS entity_type_ordinal
FROM
department d
UNION ALL
SELECT
d.dept_id,
e.employee_name,
1 AS entity_type_ordinal
FROM
department d
INNER JOIN
employee e
ON e.dept_id = d.dept_id
)
dept_and_emp
ORDER BY
dept_id,
entity_type_ordinal,
dept_name
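The ordering trick can be tried out with SQLite from Python. The table definitions below are assumptions that match the column names used in the query, and the rows reuse the names from the question's sample output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE department (dept_id INTEGER, dept_name TEXT)")
conn.execute("CREATE TABLE employee (employee_name TEXT, dept_id INTEGER)")
conn.executemany("INSERT INTO department VALUES (?, ?)",
                 [(20, "Sales"), (10, "Admin")])
conn.executemany("INSERT INTO employee VALUES (?, ?)",
                 [("Smith", 20), ("Whalen", 10), ("King", 20), ("James", 20)])

# entity_type_ordinal is 0 for the department row and 1 for employee rows,
# so each department sorts ahead of its own employees.
rows = conn.execute("""
    SELECT dept_id, dept_name
    FROM (
        SELECT d.dept_id, d.dept_name, 0 AS entity_type_ordinal
        FROM department d
        UNION ALL
        SELECT d.dept_id, e.employee_name, 1 AS entity_type_ordinal
        FROM department d
        INNER JOIN employee e ON e.dept_id = d.dept_id
    ) dept_and_emp
    ORDER BY dept_id, entity_type_ordinal, dept_name
""").fetchall()
for row in rows:
    print(row)
```

Even though the rows were inserted out of order, each department name comes first, followed by its employees alphabetically.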
Assuming there's a table in your database called departments that holds this information, your code might look like this:
select
dept_id, dept_name
from
departments
If you want to display certain columns of the table, as you have asked in the question above, you can use the following syntax:
select column_names from table_name
Replace:
1. column_names with the column names you want to display, separated by commas
2. table_name with the name of the table whose columns you wish to display
For the above question, the following code will do:
select Dept_Id, Dept_Name from Department;
The above code works if your table name is 'Department'.

What kind of query is this?

In my Employee table, I wanted to find the 3rd highest salary. Someone provided me with the following query to do this:
SELECT *
FROM employee C1
WHERE 3 = (SELECT Count(DISTINCT( C2.salary ))
FROM employee C2
WHERE C2.salary >= C1.salary)
This query works, but I don't know how it works. What kind of query is this?
As others have said, this type of query is called a correlated sub-query. It's a sub-query because there is a query within a query and it's correlated because the inner query references the outer query in its definition.
Consider the inner query:
SELECT Count(DISTINCT( C2.salary ))
FROM employee C2
WHERE C2.salary >= C1.salary
Conceptually, this inner query will be evaluated once for every row produced by the outer query before the WHERE clause is applied, basically once for every row in employee. It will produce a single value: the count of distinct salary values in employee that are greater than or equal to the salary of the outer row.
The outer query will only return records where the value produced by the inner query is exactly 3. Assuming unique salary values, there is only one row in the employee table for which exactly 3 records (including the row itself) have a salary greater than or equal to its own, and that row necessarily holds the third-highest salary value.
It's clever, but unnecessarily weird and probably not as optimal as something more straightforward.
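The >= form with 3 and the more common > form with N-1 select the same rows, which can be checked with a small sketch. It uses SQLite from Python with a made-up employee table; the duplicate 400 is there to show the effect of DISTINCT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, salary INTEGER)")
# Hypothetical rows: b and c share the second highest salary.
conn.executemany("INSERT INTO employee VALUES (?, ?)", [
    ("a", 500), ("b", 400), ("c", 400), ("d", 300), ("e", 200),
])

# Count distinct salaries >= the current one and compare with 3.
ge_form = conn.execute("""
    SELECT name, salary FROM employee c1
    WHERE 3 = (SELECT COUNT(DISTINCT c2.salary)
               FROM employee c2
               WHERE c2.salary >= c1.salary)
    ORDER BY name
""").fetchall()

# Count distinct salaries strictly greater and compare with 3 - 1.
gt_form = conn.execute("""
    SELECT name, salary FROM employee c1
    WHERE 3 - 1 = (SELECT COUNT(DISTINCT c2.salary)
                   FROM employee c2
                   WHERE c2.salary > c1.salary)
    ORDER BY name
""").fetchall()
print(ge_form, gt_form)
```

Both forms return the row with salary 300: including the row's own salary in the count simply shifts the comparison value by one.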
Maybe a better solution would have been
SELECT TOP 1 *
FROM (
SELECT TOP 3 * FROM employee ORDER BY Salary DESC
) t
ORDER BY Salary ASC
This is easier to read and usually more efficient than a correlated sub-query.
You can also use DENSE_RANK to rank the salaries from greatest to least, then select the ones that are ranked 3rd. This also prevents you from getting the wrong salary when the top 2 are identical, which the other answers above mine do not handle. It also has a better-looking execution plan than the DISTINCT count version.
SELECT *
FROM (
SELECT *,
DENSE_RANK() OVER (ORDER BY Salary DESC) salary_rank
FROM employee e
) t
WHERE salary_rank = 3
Could also rewrite this with a common table expression.
WITH top_three
AS
(
SELECT TOP 3 * FROM employee ORDER BY Salary DESC
)
SELECT TOP 1 *
FROM top_three
ORDER BY Salary ASC;
Or, if you need to look for other ranks, you can use ROW_NUMBER().
WITH ranked
AS
(
-- ORDER BY belongs inside OVER(); a CTE itself cannot carry an ORDER BY without TOP
SELECT rank = ROW_NUMBER() OVER (ORDER BY Salary DESC), *
FROM employee
)
SELECT *
FROM ranked
WHERE rank = @whatever_rank_you_want; -- a variable holding the desired rank