How to do order by min() of some column? - sql

I am pretty new to SQL. I want a query which should do order by on min of some column. Below is the query i want.
SELECT *
FROM (
SELECT p.PROJECT_ID,
p.PROJECT_NAME,
p.PROJECT_TYPE
FROM PROJECT p
LEFT OUTER JOIN code c
ON p.PROJECT_ID= c.PROJECT_ID
WHERE p.PROJECT_NAME IN ('test')
ORDER BY min(c.LABEL) ASC
)
WHERE rownum <= 25;
Why i need it this way is. I have one table PROJECT.
PROJECT_ID PROJECT_NAME PROJECT_TYPE
1 a test1
2 b test2
i have another table code which has project_id as foreign key.
ID PROJECT_ID LABEL
1 1 a
2 1 b
3 1 c
4 2 d
now when i will join it on project_id and make order by on code.label it will give me 4 records three with project id 1 and 1 with project id 2. But my requirement is to sort the project based on the codes label. so logically i want two records . One for project id 1 with min vale of label of all the possible combinations of project id 1 i.e with label a and other with project id 2. So that's why i want to sort it based on min of code label. I cannot use group by as it will degrade the performance.

For use a MIN( )
you need a group by eg:
SELECT *
FROM (
SELECT p.PROJECT_ID,
p.PROJECT_NAME,
p.PROJECT_TYPE
FROM PROJECT p
LEFT OUTER JOIN code c
ON p.codeId=c.ID
WHERE p.PROJECT_NAME IN ('test')
GROUP BY .PROJECT_ID,
p.PROJECT_NAME,
p.PROJECT_TYPE
ORDER BY min(c.LABEL) ASC
)
WHERE rownum <= 25;
and in some db you must select the column you need for order by eg:
SELECT *
FROM (
SELECT p.PROJECT_ID,
p.PROJECT_NAME,
p.PROJECT_TYPE,
min(c.LABEL)
FROM PROJECT p
LEFT OUTER JOIN code c
ON p.codeId=c.ID
WHERE p.PROJECT_NAME IN ('test')
GROUP BY .PROJECT_ID,
p.PROJECT_NAME,
p.PROJECT_TYPE
ORDER BY min(c.LABEL) ASC
)
WHERE rownum <= 25;

Related

Return only one row based on search

Query
select
a.id,
a.ba,
b.status,
b.custid
from balist as a
inner join customer as b
on a.ba = b.ba
I have a table "balist" that has a list of (ba) and i inner join table "customer" on (ba) and right now by output is like the following
id
ba
status
custid
1
ba-1234455
A
123-321-123-321a
2
ba-1234455
I
123-321-123-321a
3
ba-1234457
A
123-321-123-321b
4
ba-1234458
A
123-321-123-321c
5
ba-1234459
I
123-321-123-321d
and I want to return all A and I status but remove the row that has status I that also have a A status. Like the following.
I have a table customer like the following
id
ba
status
custid
1
ba-1234455
A
123-321-123-321a
3
ba-1234457
A
123-321-123-321b
4
ba-1234458
A
123-321-123-321c
5
ba-1234459
I
123-321-123-321d
You could use a row_number() to filter your resulting rows eg
SELECT
id,ba,status,custid
FROM (
SELECT
a.id,
a.ba,
b.status,
b.custid,
ROW_NUMBER() OVER (
PARTITION BY a.ba
ORDER BY b.status ASC
) as rn
FROM
balist as a
INNER JOIN
customer as b ON a.ba = b.ba
)
WHERE rn=1
Let me know if this works for you.

Grouping the data and showing 1 row per group in postgres

I have two tables which look like this :-
Component Table
Revision Table
I want to get the name,model_id,rev_id from this table such that the result set has the data like shown below :-
name model_id rev_id created_at
ABC 1234 2 23456
ABC 5678 2 10001
XYZ 4567
Here the data is grouped by name,model_id and only 1 data for each group is shown which has the highest value of created_at.
I am using the below query but it is giving me incorrect result.
SELECT cm.name,cm.model_id,r.created_at from dummy.component cm
left join dummy.revision r on cm.model_id=r.model_id
group by cm.name,cm.model_id,r.created_at
ORDER BY cm.name asc,
r.created_at DESC;
Result :-
Anyone's help will be highly appreciated.
use max and sub-query
select T1.name,T1.model_id,r.rev_id,T1.created_at from
(
select cm.name,
cm.model_id,
MAX(r.created_at) As created_at from dummy.component cm
left join dummy.revision r on cm.model_id=r.model_id
group by cm.name,cm.model_id
) T1
left join revision r
on T1.created_at =r.created_at
http://www.sqlfiddle.com/#!17/68cb5/4
name model_id rev_id created_at
ABC 1234 2 23456
ABC 5678 2 10001
xyz 4567
In your SELECT you're missing rev_id
Try this:
SELECT
cm.name,
cm.model_id,
MAX(r.rev_id) AS rev_id,
MAX(r.created_at) As created_at
from dummy.component cm
left join dummy.revision r on cm.model_id=r.model_id
group by 1,2
ORDER BY cm.name asc,
r.created_at DESC;
What you were missing is the statement to say you only want the max record from the join table. So you need to join records, but the join will bring in all records from table r. If you group by the 2 columns in component, then select the max from r, on the id and created date, it'll only pick the top out the available to join
I would use distinct on:
select distinct on (m.id) m.id, m.name, r.rev_id, r.created_at
from model m left join
revision r
on m.model_id = r.model_id
order by m.id, r.rev_id;

Combine rows from Mulitple tables into single table

I have one parent table Products with multiple child tables -Hoses,Steeltubes,ElectricCables,FiberOptics.
ProductId -Primary key field in Product table
ProductId- ForeignKey field in Hoses,Steeltubes,ElectricCables,FiberOptics.
Product table has 1 to many relationship with Child tables
I want to combine result of all tables .
For eg - Product P1 has PK field ProductId which is used in all child tables as FK.
If Hoses table has 4 record with ProductId 50 and Steeltubes table has 2 records with ProductId 50 when I perform left join then left join is doing cartesian product of records showing 8 record as result But it should be 4 records .
;with HOSESTEELCTE
as
(
select '' as ModeType, '' as FiberOpticQty , '' as NumberFibers, '' as FiberLength, '' as CableType , '' as Conductorsize , '' as Voltage,'' as ElecticCableLength , s.TubeMaterial , s.TubeQty, s.TubeID , s.WallThickness , s.DWP ,s.Length as SteelLength , h.HoseSeries, h.HoseLength ,h.ProductId
from Hoses h
left join
(
--'' as HoseSeries,'' as HoseLength ,
select TubeMaterial , TubeQty, TubeID , WallThickness , DWP , Length,ProductId from SteelTubes
) s on (s.ProductId = h.ProductId)
) select * from HOSESTEELCTE
Assuming there are no relationships between child tables and you simply want a list of all child entities which make up a product you could generate a cte which has a number of rows which are equal to the largest number of entries across all the child tables for a product. In the example below I have used a dates table to simplify the example.
so for this data
create table products(pid int);
insert into products values
(1),(2);
create table hoses (pid int,descr varchar(2));
insert into hoses values (1,'h1'),(1,'h2'),(1,'h3'),(1,'h4');
create table steeltubes (pid int,descr varchar(2));
insert into steeltubes values (1,'t1'),(1,'t2');
create table electriccables(pid int,descr varchar(2));
truncate table electriccables
insert into electriccables values (1,'e1'),(1,'e2'),(1,'e3'),(2,'e1');
this cte
;with cte as
(select row_number() over(partition by p.pid order by datekey) rn, p.pid
from dimdate, products p
where datekey < 20050105)
select * from cte
create a cartesian join (one of the rare ocassions where an implicit join helps) pid to rn
result
rn pid
-------------------- -----------
1 1
2 1
3 1
4 1
1 2
2 2
3 2
4 2
And if we add the child tables
;with cte as
(select row_number() over(partition by p.pid order by datekey) rn, p.pid
from dimdate, products p
where datekey < 20050106)
select c.pid,h.descr hoses,s.descr steeltubes,e.descr electriccables from cte c
left join (select h.*, row_number() over(order by h.pid) rn from hoses h) h on h.rn = c.rn and h.pid = c.pid
left join (select s.*, row_number() over(order by s.pid) rn from steeltubes s) s on s.rn = c.rn and s.pid = c.pid
left join (select e.*, row_number() over(order by e.pid) rn from electriccables e) e on e.rn = c.rn and e.pid = c.pid
where h.rn is not null or s.rn is not null or e.rn is not null
order by c.pid,c.rn
we get this
pid hoses steeltubes electriccables
----------- ----- ---------- --------------
1 h1 t1 e1
1 h2 t2 e2
1 h3 NULL e3
1 h4 NULL NULL
2 NULL NULL e1
In fact, the result having 8 rows can be expected to be the result, since your four records are joined with the first record in the other table and then your four records are joined with the second record of the other table, making it 4 + 4 = 8.
The very fact that you expect 4 records to be in the result instead of 8 shows that you want to use some kind of grouping. You can group your inner query issued for SteelTubes by ProductId, but then you will need to use aggregate functions for the other columns. Since you have only explained the structure of the desired output, but not the semantics, I am not able with my current knowledge about your problem to determine what aggregations you need.
Once you find out the answer for the first table, you will be able to easily add the other tables into the selection as well, but in case of large data you might get some scaling problems, so you might want to have a table where you store these groups, maintain it when something changes and use it for these selections.

Find the latest or earliest date

I have a table with a foreign key called team_ID, a date column called game_date, and a single char column called result. I need to find when the next volleyball game happens. I have successfully narrowed the game dates down to all the volleyball games that have not happened yet because the result IS NULL. I have all the select in line, I just need to find the earliest date.
Here is what I've got:
SELECT game.game_date, team.team_name
FROM game
JOIN team
ON team.team_id = game.team_id
WHERE team.sport_id IN
(SELECT sport.sport_id
FROM sport
WHERE UPPER(sport.sport_type_code) IN
(SELECT UPPER(sport_type.sport_type_code)
FROM sport_type
WHERE UPPER(sport_type_name) like UPPER('%VOLLEYBALL%')
)
)
AND game.result IS NULL;
I'm a time traveler so don't mind the old dates.
When I run it, I get this:
GAME_DATE TEAM_NAME
----------- ----------
11-NOV-1998 BEars
13-NOV-1998 BEars
13-NOV-1998 WildCats
14-NOV-1998 BEars
How do I set it up so I get only the MIN(DATE) and the TEAM_NAME playing on that date?
I've tried AND game.game_date = MIN(game.game_date) but it simply tells me that a group function in not allowed here. There has to be a way to retrieve the MIN(game_date) and use it as a condition to be met.
I'm using Oracle 11g pl/sql.
This should be the final working code.
SELECT *
FROM
(
SELECT g.game_date, t.team_name
FROM game g
JOIN team t
ON t.team_id = g.team_id
JOIN sport s
ON t.sport_id = s.sport_id
JOIN sport_type st
ON UPPER(s.sport_type_code) IN UPPER(st.sport_type_code)
WHERE UPPER(sport_type_name) like UPPER('%VOLLEYBALL%')
AND g.result IS NULL
ORDER BY g.game_date
)
WHERE ROWNUM = 1;
The ROWNUM pseudocolumn is generated before any ORDER BY clause is applied to the query. If you just do WHERE ROWNUM <= X then you will get X rows in whatever order Oracle produces the data from the datafiles and not the X minimum rows. To guarantee getting the minimum row you need to use ORDER BY first and then filter on ROWNUM like this:
SELECT *
FROM (
SELECT g.game_date, t.team_name
FROM game g
JOIN team t
ON t.team_id = g.team_id
INNER JOIN sport s
ON t.sport_id = s.sport_id
INNER JOIN sport_type y
ON UPPER( s.sport_type_code ) = UPPER( y.sport_type_code )
WHERE UPPER( y.sport_type_name) LIKE UPPER('%VOLLEYBALL%')
AND g.result IS NULL
ORDER BY game_date ASC -- You need to do the ORDER BY in an inner query
)
WHERE ROWNUM = 1; -- Then filter on ROWNUM in an outer query.
If you want to return multiple rows with the minimum date then:
SELECT game_date,
team_name
FROM (
SELECT g.game_date,
t.team_name,
RANK() OVER ( ORDER BY g.game_date ASC ) AS rnk
FROM game g
JOIN team t
ON t.team_id = g.team_id
INNER JOIN sport s
ON t.sport_id = s.sport_id
INNER JOIN sport_type y
ON UPPER( s.sport_type_code ) = UPPER( y.sport_type_code )
WHERE UPPER( y.sport_type_name) LIKE UPPER('%VOLLEYBALL%')
AND g.result IS NULL
)
WHERE rnk = 1;
Could you make it simple and order by date and SELECT TOP 1? I think this is the syntax in Oracle:
WHERE ROWNUM <= number;
select game.game_date,team.team_name from (
SELECT game.game_date, team.team_name, rank() over (partition by team.team_name order by game.game_date asc) T
FROM game
JOIN team
ON team.team_id = game.team_id
WHERE team.sport_id IN
(SELECT sport.sport_id
FROM sport
WHERE UPPER(sport.sport_type_code) IN
(SELECT UPPER(sport_type.sport_type_code)
FROM sport_type
WHERE UPPER(sport_type_name) like UPPER('%VOLLEYBALL%')
)
)
AND game.result IS NULL
) query1 where query1.T=1;

Row_Number() returning duplicate rows

This is my query,
SELECT top 100
UPPER(COALESCE(A.DESCR,C.FULL_NAME_ND)) AS DESCR,
COALESCE(A.STATE, (SELECT TOP 1 STATENAME
FROM M_STATEMASTER
WHERE COUNTRYCODE = B.CODE)) AS STATENAME,
COALESCE(A.STATECD, (SELECT TOP 1 CODE
FROM M_STATEMASTER
WHERE COUNTRYCODE = B.CODE)) AS STATECD,
COALESCE(A.COUNTRYCD, B.CODE) AS COUNTRYCODE
FROM
M_CITY A
JOIN
M_COUNTRYMASTER B ON A.COUNTRYCD = B.CODE
JOIN
[GEODATASOURCE-CITIES-FREE] C ON B.ALPHA2CODE = C.CC_FIPS
WHERE
EXISTS (SELECT 1
FROM [GEODATASOURCE-CITIES-FREE] Z
WHERE B.ALPHA2CODE=Z.CC_FIPS)
ORDER BY
A.CODE
Perfectly working fine, but when I'm trying to get the Row_number() over(order by a.code) I'm getting the duplicate column multiple time.
e.g
SELECT top 100
UPPER(COALESCE(A.DESCR,C.FULL_NAME_ND)) AS DESCR,
COALESCE(A.STATE, (SELECT TOP 1 STATENAME
FROM M_STATEMASTER
WHERE COUNTRYCODE = B.CODE)) AS STATENAME,
COALESCE(A.STATECD, (SELECT TOP 1 CODE
FROM M_STATEMASTER
WHERE COUNTRYCODE = B.CODE)) AS STATECD,
COALESCE(A.COUNTRYCD, B.CODE) AS COUNTRYCODE
ROW_NUMBER() OVER(ORDER BY A.CODE) AS RN -- i made a change here
FROM
M_CITY A
JOIN
M_COUNTRYMASTER B ON A.COUNTRYCD = B.CODE
JOIN
[GEODATASOURCE-CITIES-FREE] C ON B.ALPHA2CODE = C.CC_FIPS
WHERE
EXISTS (SELECT 1
FROM [GEODATASOURCE-CITIES-FREE] Z
WHERE B.ALPHA2CODE=Z.CC_FIPS)
ORDER BY
A.CODE
WHERE
EXISTS (SELECT 1
FROM [GEODATASOURCE-CITIES-FREE] Z
WHERE B.ALPHA2CODE = Z.CC_FIPS)
Another try, when I'm using ROW_NUMBER() OVER(ORDER BY newid()) AS RN it's taking logn time to execute.
Remember: CODE is the Pk of table M_CITY and there is no key in [GEODATASOURCE-CITIES-FREE] table.
Another thing: About JOIN(inner join), Join returns the matched Rows, right???
e.g:
table 1 with 20 rows,
table2 with 30 rows ,
table 3 with 30 rows
If I joined these 3 table on a certain key then the possibility of getting maximum rows is 20, am I right?
Your first query doesn't work fine. It just appears to. The reason is that you are using TOP without an ORDER BY, so an arbitrary set of 100 rows is returned.
When you add ROW_NUMBER(), the query plan changes . . . and the ordering of the result set changes as well. I would suggest that you fix the original query to use a stable sort.