Controlling result of oracle query - sql

I have a schema like this:
create table sample(id number, name varchar2(30), mark number);
Now I have to return the names for the top three marks. How can I write an SQL query for this?
If I use max(mark) it returns only the maximum, and
select name from sample
returns all the names! I tried many approaches but was unable to limit the result to 3 rows.
Please suggest a way to solve my problem.

How do you want to handle ties? If Mary gets a mark of 100, Tom gets a mark of 95, and John and Dave both get a mark of 90, what results do you want, for example? Do you want both John and Dave to be returned since they both tied for third? Or do you want to pick one of the two so that the result always has exactly three rows? What happens if Beth also tied for second with a score of 95? Do you still consider John and Dave tied for third place or do you consider them tied for fourth place?
You can use analytic functions to get the top N results, though which analytic function you pick depends on how you want to resolve ties.
SELECT id,
       name,
       mark
  FROM (SELECT id,
               name,
               mark,
               rank() over (order by mark desc) rnk
          FROM sample)
 WHERE rnk <= 3
will return the top three rows using the RANK analytic function to rank them by MARK. RANK returns the same rank for people that are tied and uses the standard sports approach to determining your rank so that if two people tie for second, the next competitor is in fourth place, not third. DENSE_RANK ensures that numeric ranks are not skipped so that if two people tie for second, the next row is third. ROW_NUMBER assigns each row a different rank by arbitrarily breaking ties.
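To make the difference between the three functions concrete, here is a minimal sketch that runs them side by side on the tie scenario described above (Mary 100, Tom and Beth 95, John and Dave 90). It uses Python's built-in sqlite3 module as a stand-in for Oracle, which is an assumption for illustration only; the sample rows are invented, and window functions need SQLite 3.25 or later.

```python
import sqlite3

# In-memory SQLite stands in for Oracle here; the rows are invented sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sample (id INTEGER, name TEXT, mark INTEGER)")
conn.executemany(
    "INSERT INTO sample VALUES (?, ?, ?)",
    [(1, "Mary", 100), (2, "Tom", 95), (3, "Beth", 95),
     (4, "John", 90), (5, "Dave", 90)],
)

# Compute all three rankings for every row, highest mark first.
rows = conn.execute(
    """
    SELECT name,
           mark,
           RANK()       OVER (ORDER BY mark DESC) AS rnk,
           DENSE_RANK() OVER (ORDER BY mark DESC) AS dense_rnk,
           ROW_NUMBER() OVER (ORDER BY mark DESC) AS row_num
    FROM sample
    ORDER BY mark DESC, name
    """
).fetchall()


def top3(rank_index):
    """Names whose ranking at the given tuple position is <= 3, sorted."""
    return sorted(r[0] for r in rows if r[rank_index] <= 3)


top3_rank = top3(2)        # RANK: John and Dave are 4th, so they drop out
top3_dense = top3(3)       # DENSE_RANK: John and Dave are 3rd, so all five stay
top3_row_number = top3(4)  # ROW_NUMBER: always exactly three rows
```

With this data, RANK keeps Mary, Tom and Beth, DENSE_RANK keeps all five people, and ROW_NUMBER keeps exactly three rows regardless of ties.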
If you really want to use ROWNUM rather than analytic functions, you can also do
SELECT id,
       name,
       mark
  FROM (SELECT id,
               name,
               mark
          FROM sample
         ORDER BY mark DESC)
 WHERE rownum <= 3
You cannot, however, have the ROWNUM predicate at the same level as the ORDER BY clause since the predicate is applied before the ordering.
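The effect of that evaluation order can be mimicked outside the database. The following is a rough Python sketch, not Oracle code: the list stands for rows in whatever order the database happens to read them, and the two helper functions (hypothetical names) model "filter on the row count, then sort" versus "sort in an inline view, then filter".

```python
# Rows in the order the database happens to read them (invented data).
stored = [("John", 90), ("Mary", 100), ("Dave", 90), ("Tom", 95), ("Beth", 95)]


def limit_then_sort(rows, n):
    """Mimics WHERE rownum <= n at the same level as ORDER BY."""
    first_n = rows[:n]  # the row-count filter runs first
    return sorted(first_n, key=lambda r: r[1], reverse=True)


def sort_then_limit(rows, n):
    """Mimics ordering in an inline view and filtering on rownum outside it."""
    ordered = sorted(rows, key=lambda r: r[1], reverse=True)
    return ordered[:n]


wrong = limit_then_sort(stored, 3)  # Mary, John, Dave: three arbitrary rows, sorted
right = sort_then_limit(stored, 3)  # Mary, Tom, Beth: the actual top three
```

The first variant returns three arbitrary rows that merely happen to be sorted, which is why the ORDER BY has to live in the inner query.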

-- rownum is assigned before ORDER BY is applied, so test it outside the ordered subquery
SELECT t2.name FROM
(
SELECT t.*
FROM sample t
ORDER BY t.mark DESC
) t2
WHERE rownum <= 3

SQL group by not returning row value for an aggregate column

I was using an SQL statement to get an aggregate (MAX) for one column, with the rest of the columns coming from that same row. I was using a group by clause, but then I must also apply max, min, etc. to the other columns. This was a budget-constrained project, so I did not have time to do it using LINQ (where I could have used first or default). Anyway, I believe this is a serious limitation of the SQL language.
Again, this could be done in many ways, but not with a simple SQL group by.
Any ideas?
Your question is a bit light on details, but it sounds like you want to know, for some set of items, which item has the maximum of something and then what its other properties are.
You cannot group by all the non-max columns because this breaks the groups down into chunks too small for the max to work.
You cannot max all the other columns because this mixes row data up.
Here is a simple example:
Name, JobRole, StartDate
John, JuniorProgrammer, 2000-01-01
John, SeniorProgrammer, 2010-01-01
John was promoted to senior programmer in 2010. We want John's most recent promotion and what he does now. If we do this:
SELECT name, jobrole, max(startdate)
FROM emp
GROUP BY name
The database will complain that jobrole is not in the group by. If we add it to the group by, John will appear twice, which is not what we want. If instead we max(jobrole), it DOES accidentally work out OK because, alphabetically, SeniorProgrammer is higher than JuniorProgrammer.
If however, John then gets a promotion again in 2019:
Name, JobRole, StartDate
John, JuniorProgrammer, 2000-01-01
John, SeniorProgrammer, 2010-01-01
John, ExecutiveDirector, 2019-01-01
This time our query is wrong:
SELECT name, max(jobrole), max(startdate)
FROM emp
GROUP BY name
The row data will be mixed up: the date will be 2019 but the job will still be SeniorProgrammer because it's alphabetically the maximum value.
Instead we have to find the max for the person and then join it back to find the rest of the data:
SELECT emp.name, emp.jobrole, emp.startdate
FROM
  emp
  INNER JOIN
  (
    SELECT name, max(startdate) d
    FROM emp
    GROUP BY name
  ) findmax
  ON findmax.d = emp.startdate AND findmax.name = emp.name
There are other ways of achieving the same thing without a join. The join method has an issue: if an employee was promoted twice on the same day, two records would result. In a DB that supports analytic functions we can do:
SELECT name, jobrole, row_number() over (partition by name order by startdate desc)
FROM emp
This establishes an incrementing counter in order of descending start date. The counter restarts from 1 for every different employee. There is no group by, so there are no complaints that the extra data isn't grouped or in an aggregate function. All we need to do to choose the most recent promotion date is wrap the whole thing in a select that demands the row number be 1:
SELECT * FROM
(
  SELECT name, jobrole, row_number() over (partition by name order by startdate desc) r
  FROM emp
) emp_with_rownum
WHERE r = 1
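As a quick check of the reasoning above, the following sketch loads the three John rows into an in-memory SQLite database (standing in for whatever database is actually in use, which is an assumption) and compares the max-everything query with the row_number approach. Dates are stored as ISO text so that they sort correctly as strings.

```python
import sqlite3

# SQLite is only a stand-in here; the rows are the example from this answer.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (name TEXT, jobrole TEXT, startdate TEXT)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [
        ("John", "JuniorProgrammer", "2000-01-01"),
        ("John", "SeniorProgrammer", "2010-01-01"),
        ("John", "ExecutiveDirector", "2019-01-01"),
    ],
)

# Taking MAX of every column mixes values from different rows.
mixed = conn.execute(
    "SELECT name, MAX(jobrole), MAX(startdate) FROM emp GROUP BY name"
).fetchall()

# Numbering rows per employee and keeping row 1 returns one intact row.
latest = conn.execute(
    """
    SELECT name, jobrole, startdate
    FROM (
        SELECT name, jobrole, startdate,
               ROW_NUMBER() OVER (PARTITION BY name ORDER BY startdate DESC) AS r
        FROM emp
    ) AS emp_with_rownum
    WHERE r = 1
    """
).fetchall()
```

Here `mixed` pairs SeniorProgrammer with the 2019 date, a combination that exists in no row, while `latest` returns the ExecutiveDirector row unchanged.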
You don't want a group by. You seem to want a window function:
select t.*, max(col) over () as overall_max
from t;

SQL Merge two rows with same ID but different column values (Oracle)

I am trying to merge different rows into one when they have the same id but different column values.
For example :
(table1)
id colour
1 red
1 blue
2 green
2 red
I would like this to be combined so that the result is:
id colour1 colour2
1 red blue
2 green red
Or
id colour
1 red, blue
2 green, red
Or any other variation of the above so that the rows are joined together some way.
Any help would be appreciated! Thanks in advance.
Please read my Comment first - you shouldn't even think about doing this unless it is ONLY for reporting purposes, and you want to see how this can be done in plain SQL (as opposed to the correct solution, which is to use your reporting tool for this job).
The second format is easiest, especially if you don't care about the order in which the colors appear:
select id, listagg(colour, ', ') within group (order by null)
from table1
group by id
order by null means the order is arbitrary (no particular order is guaranteed). If you want to order by something else, use that in order by with listagg(). For example, to order the colors alphabetically, you could say within group (order by colour).
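For readers who want to see what the aggregated output looks like with an alphabetical order, here is a small plain-Python sketch of the same grouping and joining. It does not call Oracle; it only mirrors what listagg(colour, ', ') within group (order by colour) would produce for the sample rows in the question.

```python
from collections import defaultdict

# The sample rows from the question: (id, colour).
rows = [(1, "red"), (1, "blue"), (2, "green"), (2, "red")]

# Collect the colours of each id, like GROUP BY id.
colours_by_id = defaultdict(list)
for row_id, colour in rows:
    colours_by_id[row_id].append(colour)

# Join each group alphabetically, like WITHIN GROUP (ORDER BY colour).
joined = {row_id: ", ".join(sorted(colours))
          for row_id, colours in colours_by_id.items()}
# joined == {1: "blue, red", 2: "green, red"}
```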
For the first format, you need to have an a priori limit on the number of columns, and how you do it depends on the version of Oracle you are using (which you should always include in every question you post here and on other discussion boards). The concept is called "pivoting"; since version 11, Oracle has an explicit PIVOT operator that you can use.
The following would solve your problem in the first of the two ways that you proposed. Listagg is what you would use to solve it the second of the two ways (as pointed out in the other answer):
select id,
       min(decode(rn,1,colour,null)) as colour1,
       min(decode(rn,2,colour,null)) as colour2,
       min(decode(rn,3,colour,null)) as colour3
from (
       select id,
              colour,
              row_number() over(partition by id order by colour) as rn
       from table1
     )
group by id;
In this approach, you need to add additional decode expressions up to the maximum number of possible colours for a given ID (this solution is not dynamic).
Additionally, this puts the colours into colour1, colour2, etc. based on the alphabetical order of the colour names. If you prefer a random order, or some other order, you need to change the order by.
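The same row_number-then-aggregate idea can be tried out without an Oracle instance. This sketch uses Python's sqlite3 module as a stand-in (an assumption for illustration) and CASE in place of decode, which SQLite does not provide; the data is the sample from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER, colour TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [(1, "red"), (1, "blue"), (2, "green"), (2, "red")],
)

# Number the colours within each id, then spread them over fixed columns.
pivoted = conn.execute(
    """
    SELECT id,
           MIN(CASE WHEN rn = 1 THEN colour END) AS colour1,
           MIN(CASE WHEN rn = 2 THEN colour END) AS colour2
    FROM (
        SELECT id,
               colour,
               ROW_NUMBER() OVER (PARTITION BY id ORDER BY colour) AS rn
        FROM table1
    )
    GROUP BY id
    ORDER BY id
    """
).fetchall()
# pivoted == [(1, "blue", "red"), (2, "green", "red")]
```

Because the inner query orders by colour, blue lands in colour1 for id 1 even though red was inserted first.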
Try this, it works for me:
Here student is the name of the table and studentId is a column. We can merge all subjects for a particular student using GROUP_CONCAT (this is MySQL syntax; in Oracle the equivalent is LISTAGG).
SELECT studentId, GROUP_CONCAT(subjects) FROM student GROUP BY studentId

PostgreSQL: get the max values from a query

I need to get the max values from a list of values obtained from a query.
Basically, the problem is this:
I have 2 tables:
Lawyer
    id (PK)
    surname
    name
Case
    id (PK)
    id_Client
    date
    id_Lawyer (FK)
And I need to get the lawyer with the largest number of cases (there is no problem with that), but if more than one lawyer has the largest number of cases, I should list them all.
Any help on this would be appreciated.
SELECT l.*, cases
FROM  (
   SELECT "id_Lawyer", count(*) AS cases, rank() OVER (ORDER BY count(*) DESC) AS rnk
   FROM   "Case"
   GROUP  BY 1
   ) c
JOIN   "Lawyer" l ON l.id = c."id_Lawyer"
WHERE  c.rnk = 1;
Basics for the technique (like @FuzzyTree provided):
PostgreSQL equivalent for TOP n WITH TIES: LIMIT "with ties"?
You only need a single subquery level since you can run window functions over aggregate functions:
Get the distinct sum of a joined table column
Best way to get result count before LIMIT was applied
Aside: It's better to use legal, lower case, unquoted identifiers in Postgres. Never use a reserved word like Case, that can lead to very confusing errors.
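A small runnable sketch of the rank-and-filter technique follows. It uses Python's sqlite3 module instead of PostgreSQL, lower-case unquoted names (lawyer, lawyer_case, case_date are hypothetical renamings chosen to avoid the reserved word), and invented rows. To stay conservative it aggregates in an inner query and ranks in an outer one rather than ranking directly over count(*).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE lawyer (id INTEGER PRIMARY KEY, surname TEXT, name TEXT)")
conn.execute(
    "CREATE TABLE lawyer_case ("
    "id INTEGER PRIMARY KEY, id_client INTEGER, case_date TEXT, "
    "id_lawyer INTEGER REFERENCES lawyer(id))"
)
conn.executemany(
    "INSERT INTO lawyer VALUES (?, ?, ?)",
    [(1, "Smith", "Ann"), (2, "Jones", "Bob"), (3, "Brown", "Cy")],
)
# Smith and Jones have two cases each, Brown has one (invented data).
conn.executemany(
    "INSERT INTO lawyer_case (id_client, case_date, id_lawyer) VALUES (?, ?, ?)",
    [(10, "2020-01-01", 1), (11, "2020-02-01", 1),
     (12, "2020-03-01", 2), (13, "2020-04-01", 2),
     (14, "2020-05-01", 3)],
)

# Count cases per lawyer, rank the counts, keep everyone tied for first place.
busiest = conn.execute(
    """
    SELECT l.surname, c.cases
    FROM (
        SELECT id_lawyer, cases, RANK() OVER (ORDER BY cases DESC) AS rnk
        FROM (SELECT id_lawyer, COUNT(*) AS cases
              FROM lawyer_case
              GROUP BY id_lawyer)
    ) AS c
    JOIN lawyer AS l ON l.id = c.id_lawyer
    WHERE c.rnk = 1
    ORDER BY l.surname
    """
).fetchall()
# busiest == [("Jones", 2), ("Smith", 2)]
```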

Find row number in a sort based on row id, then find its neighbours

Say that I have some SELECT statement:
SELECT id, name FROM people
ORDER BY name ASC;
I have a few million rows in the people table and the ORDER BY clause can be much more complex than what I have shown here (possibly operating on a dozen columns).
I retrieve only a small subset of the rows (say rows 1..11) in order to display them in the UI. Now, I would like to solve following problems:
Find the number of a row with a given id.
Display the 5 items before and the 5 items after a row with a given id.
Problem 2 is easy to solve once I have solved problem 1, as I can then use something like this if I know that the item I was looking for has row number 1000 in the sorted result set (this is the Firebird SQL dialect):
SELECT id, name FROM people
ORDER BY name ASC
ROWS 995 TO 1005;
I also know that I can find the rank of a row by counting all of the rows which come before the one I am looking for, but this can lead to very long WHERE clauses with tons of OR and AND in the condition. And I have to do this repeatedly. With my test data, this takes hundreds of milliseconds, even when using properly indexed columns, which is way too slow.
Is there some means of achieving this by using some SQL:2003 features (such as row_number supported in Firebird 3.0)? I am by no means an SQL guru and I need some pointers here. Could I create a cached view where the result would include a rank/dense rank/row index?
Firebird appears to support window functions (called analytic functions in Oracle). So you can do the following:
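To find the "row" number of a row with a given id:
select rn
from (select id, row_number() over (order by name, id) as rn
      from t
     ) s
where id = <id>
The numbering has to happen in a subquery because the where clause is applied before window functions are evaluated; filtering at the same level would always return 1. This assumes the ids are unique.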
To solve the second problem:
select s.*
from (select id, name, row_number() over (order by name, id) as rn
      from t
     ) s join
     (select id, row_number() over (order by name, id) as rn
      from t
     ) sid
     on sid.id = <id> and
        s.rn between sid.rn - 5 and sid.rn + 5
I might suggest something else, though, if you can modify the table structure. Most databases offer the ability to add an auto-increment column when a row is inserted. If your records are never deleted, this can serve as your counter, simplifying your queries.
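One detail worth checking when trying this out is that a WHERE clause is evaluated before window functions, so the id filter belongs outside the query that does the numbering. The sketch below demonstrates this with Python's sqlite3 module standing in for Firebird (an assumption), a small invented people table, and a window of 2 neighbours instead of 5 to keep the data short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [(1, "Greta"), (2, "Adam"), (3, "Zoe"), (4, "Carl"),
     (5, "Eve"), (6, "Bob"), (7, "Dina"), (8, "Frank")],
)
target_id = 5  # Eve, the fifth name alphabetically

# Filtering at the same level numbers only the single surviving row.
naive = conn.execute(
    "SELECT id, ROW_NUMBER() OVER (ORDER BY name, id) FROM people WHERE id = ?",
    (target_id,),
).fetchone()

# Numbering in a subquery and filtering outside gives the real position.
position = conn.execute(
    """
    SELECT rn
    FROM (SELECT id, ROW_NUMBER() OVER (ORDER BY name, id) AS rn FROM people)
    WHERE id = ?
    """,
    (target_id,),
).fetchone()[0]

# Neighbours: join the numbered rows to the numbered target row.
neighbours = [
    row[0]
    for row in conn.execute(
        """
        WITH numbered AS (
            SELECT id, name, ROW_NUMBER() OVER (ORDER BY name, id) AS rn
            FROM people
        )
        SELECT n.name
        FROM numbered AS n
        JOIN numbered AS target
          ON target.id = ?
         AND n.rn BETWEEN target.rn - 2 AND target.rn + 2
        ORDER BY n.rn
        """,
        (target_id,),
    )
]
```

With this data the naive query reports row number 1, the subquery version reports 5, and the neighbours are Carl, Dina, Eve, Frank and Greta. Note that this still numbers every row on each call, so it does not by itself address the performance concern in the question.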

How to randomize order of data in 3 columns

I have 3 columns of data in SQL Server 2005 :
LASTNAME
FIRSTNAME
CITY
I want to randomly re-order these 3 columns (and munge the data) so that the data is no longer meaningful. Is there an easy way to do this? I don't want to change any data, I just want to re-order the index randomly.
When you say "re-order" these columns, do you mean that you want some of the last names to end up in the first name column? Or do you mean that you want some of the last names to get associated with a different first name and city?
I suspect you mean the latter, in which case you might find a programmatic solution easier (as opposed to a straight SQL solution). Sticking with SQL, you can do something like:
UPDATE the_table
SET lastname = (SELECT lastname FROM the_table ORDER BY RAND())
Depending on what DBMS you're using, this may work for only one line, may make all the last names the same, or may require some variation of syntax to work at all, but the basic approach is about right. Certainly some trials on a copy of the table are warranted before trying it on the real thing.
Of course, to get the first names and cities to also be randomly reordered, you could apply a similar query to either of those columns. (Applying it to all three doesn't make much sense, but wouldn't hurt either.)
Since you don't want to change your original data, you could do this in a temporary table populated with all rows.
Finally, if you just need a single random value from each column, you could do it in place without making a copy of the data, with three separate queries: one to pick a random first name, one a random last name, and the last a random city.
I suggest using NEWID() with CHECKSUM() for randomization:
SELECT LASTNAME, FIRSTNAME, CITY FROM table ORDER BY CHECKSUM(NEWID())
In SQL Server 2005+ you could prepare a ranked rowset containing the three target columns and three additional computed columns filled with random rankings (one for each of the three target columns). Then the ranked rowset would be joined with itself three times using the ranking columns, and finally each of the three target columns would be pulled from their own instance of the ranked rowset. Here's an illustration:
WITH sampledata (FirstName, LastName, CityName) AS (
  SELECT 'John', 'Doe', 'Chicago' UNION ALL
  SELECT 'James', 'Foe', 'Austin' UNION ALL
  SELECT 'Django', 'Fan', 'Portland'
),
ranked AS (
  SELECT
    *,
    FirstNameRank = ROW_NUMBER() OVER (ORDER BY NEWID()),
    LastNameRank = ROW_NUMBER() OVER (ORDER BY NEWID()),
    CityNameRank = ROW_NUMBER() OVER (ORDER BY NEWID())
  FROM sampledata
)
SELECT
  fnr.FirstName,
  lnr.LastName,
  cnr.CityName
FROM ranked fnr
  INNER JOIN ranked lnr ON fnr.FirstNameRank = lnr.LastNameRank
  INNER JOIN ranked cnr ON fnr.FirstNameRank = cnr.CityNameRank
This is one possible result (the ordering is random, so each run can differ):
FirstName  LastName  CityName
---------  --------  --------
James      Fan       Chicago
John       Doe       Portland
Django     Foe       Austin
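If a programmatic route is acceptable, the same idea of shuffling each column independently and then re-pairing the values by position can be sketched in a few lines of Python. This is only an illustration on the three sample rows; shuffle_columns and its seed parameter are hypothetical names, and writing the result back to SQL Server is left out.

```python
import random

# The sample rows from the query above: (FirstName, LastName, CityName).
rows = [
    ("John", "Doe", "Chicago"),
    ("James", "Foe", "Austin"),
    ("Django", "Fan", "Portland"),
]


def shuffle_columns(rows, seed=None):
    """Shuffle every column independently, then zip the columns back into rows."""
    rng = random.Random(seed)  # pass a seed only when a repeatable result is wanted
    columns = [list(column) for column in zip(*rows)]
    for column in columns:
        rng.shuffle(column)
    return list(zip(*columns))


scrambled = shuffle_columns(rows, seed=42)
```

Every original value still appears exactly once in its own column, but the combinations within a row no longer correspond to a real person, except where the shuffle happens to recreate one by chance.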
select *, rand() from table order by rand();
I understand some versions of SQL have a rand() that doesn't change for each line. Check for yours. Works on MySQL.