Returning DISTINCT rows from database query - sql

I am fairly new to SQL so apologies if there is a simple solution to this.
I have this piece of SQL that performs a join on 3 tables.
SELECT a.group_leader, b.forum_name
FROM flightuser_group a
INNER JOIN flightacl_groups c ON a.group_id = c.group_id
JOIN flightforums b ON c.forum_id = b.forum_id
WHERE a.user_id = '60'
ORDER BY a.group_leader DESC
This query returns this:
group_leader forum_name
1 tmpSQJ
0 jobby7
0 jobby5
0 tmpSQJ
I am trying to keep only the first tmpSQJ entry and remove the second, but I cannot determine where the DISTINCT clause goes.
Many thanks in advance.

Try This:
SELECT MAX(a.group_leader), b.forum_name
FROM flightuser_group a INNER JOIN flightacl_groups c ON a.group_id = c.group_id
JOIN flightforums b ON c.forum_id = b.forum_id
WHERE a.user_id = '60'
GROUP BY b.forum_name
ORDER BY MAX(a.group_leader) DESC
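
A minimal sketch of the MAX plus GROUP BY idea, run against an in-memory SQLite database rather than the asker's server. The sample rows and the column types are assumptions, invented only to reproduce the four-row result from the question; is_leader is a hypothetical alias.

```python
import sqlite3

# In-memory SQLite stands in for the real database; the rows below are
# invented so that the un-grouped join yields the four rows in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE flightuser_group (group_id INTEGER, user_id TEXT, group_leader INTEGER);
CREATE TABLE flightacl_groups (group_id INTEGER, forum_id INTEGER);
CREATE TABLE flightforums (forum_id INTEGER, forum_name TEXT);
INSERT INTO flightuser_group VALUES (1, '60', 1), (2, '60', 0);
INSERT INTO flightacl_groups VALUES (1, 10), (2, 10), (2, 11), (2, 12);
INSERT INTO flightforums VALUES (10, 'tmpSQJ'), (11, 'jobby7'), (12, 'jobby5');
""")

# One row per forum_name; MAX keeps the leader flag if any group has it set.
rows = conn.execute("""
SELECT MAX(a.group_leader) AS is_leader, b.forum_name
FROM flightuser_group a
INNER JOIN flightacl_groups c ON a.group_id = c.group_id
INNER JOIN flightforums b ON c.forum_id = b.forum_id
WHERE a.user_id = '60'
GROUP BY b.forum_name
ORDER BY is_leader DESC, b.forum_name
""").fetchall()

for row in rows:
    print(row)
```

With this sample data tmpSQJ appears once, with the leader flag set, followed by the two jobby forums.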

For MySQL, add a LIMIT 1 after the ORDER BY.
For MS SQL, add a TOP 1 after the SELECT.
These two flavors will get you only the first record in the recordset.

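You could try a GROUP BY, which collapses duplicate forum_name values much like DISTINCT does. Note that a.group_leader is neither grouped nor aggregated here, so this form only runs on databases that allow it (for example MySQL without ONLY_FULL_GROUP_BY), and which group_leader value you get for each forum is not guaranteed: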
SELECT a.group_leader, b.forum_name
FROM flightuser_group a
INNER JOIN flightacl_groups c ON a.group_id = c.group_id
JOIN flightforums b ON c.forum_id = b.forum_id
WHERE a.user_id = '60'
GROUP BY b.forum_name
ORDER BY a.group_leader DESC
You could also look at using "INNER JOIN" for flightforums

Related

select top 1 in subquery?

I'm trying to get a top 1 returned for each code in this query but this is giving me syntax errors. Any ideas what I can do?
SELECT d.doc_no,
d.doc_short_name,
d.doc_name,
r.revision_code,
r.revision,
r.description,
dtg.group_name,
(SELECT top 1 di.index_user_id
FROM document_instance
WHERE doc_no = d.doc_no
ORDER BY entry_time DESC) AS 'last indexed by'
FROM documents d
LEFT JOIN document_revision r ON r.doc_no = d.doc_no
LEFT JOIN document_instance di ON di.doc_no = d.doc_no
LEFT JOIN document_type_group dtg ON dtg.doc_no = d.doc_no
Assuming Sybase ASE, the top # clause is only supported in derived tables, i.e., it's not supported in correlated sub-queries like the one you're attempting.
Also note that order by is not supported in any sub-queries (derived table or correlated).
If I'm reading the query correctly, you want the index_user_id for the record with the max(entry_time); if this is correct, and assuming a single record is returned for a given doc_no/entry_time combo, you could try something like:
SELECT d.doc_no,
d.doc_short_name,
d.doc_name,
r.revision_code,
r.revision,
r.description,
dtg.group_name,
(SELECT d2.index_user_id
FROM document_instance d2
WHERE d2.doc_no = d1.doc_no
and d2.entry_time = (select max(d3.entry_time)
from document_instance d3
where d3.doc_no = d1.doc_no)
) as 'last indexed by'
FROM documents d
LEFT JOIN document_revision r ON r.doc_no = d.doc_no
LEFT JOIN document_instance d1 ON d1.doc_no = d.doc_no
LEFT JOIN document_type_group dtg ON dtg.doc_no = d.doc_no
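
A minimal sketch of the "row with max(entry_time)" idea, using in-memory SQLite in place of Sybase ASE. The sample rows, the trimmed-down column list and the last_indexed_by alias are assumptions. As a variation on the query above, the subquery here correlates directly to documents d instead of to a joined document_instance alias, so each document comes back once.

```python
import sqlite3

# In-memory SQLite stands in for Sybase ASE; the data is invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE documents (doc_no INTEGER, doc_name TEXT);
CREATE TABLE document_instance (doc_no INTEGER, index_user_id TEXT, entry_time TEXT);
INSERT INTO documents VALUES (1, 'spec'), (2, 'manual'), (3, 'unindexed');
INSERT INTO document_instance VALUES
  (1, 'alice', '2020-01-01 09:00'),
  (1, 'bob',   '2020-03-01 09:00'),
  (2, 'carol', '2020-02-01 09:00');
""")

# No TOP and no ORDER BY in the subquery: the latest row is picked by
# matching entry_time against MAX(entry_time) for the same doc_no.
rows = conn.execute("""
SELECT d.doc_no,
       d.doc_name,
       (SELECT di.index_user_id
        FROM document_instance di
        WHERE di.doc_no = d.doc_no
          AND di.entry_time = (SELECT MAX(d3.entry_time)
                               FROM document_instance d3
                               WHERE d3.doc_no = d.doc_no)
       ) AS last_indexed_by
FROM documents d
ORDER BY d.doc_no
""").fetchall()

for row in rows:
    print(row)
```

A document with no instances gets NULL (None in Python) for last_indexed_by. The sketch relies on the same assumption as the answer: at most one instance row per doc_no and entry_time.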

Querying two tables for a condition (Oracle)

Suppose I have two tables, A and B, which have a similar cust_id column. I am trying to retrieve all the rows where the cust_id values are equal in A and B but the email addresses differ.
What I have now is the following:
SELECT
A.cust_id,
A.email_addr,
B.preferred_email
FROM
email_feature A
LEFT OUTER JOIN email_feature_sum B ON
(
A.cust_id = B.cust_id
AND
A.email_addr != B.preferred_email
)
ORDER BY
A.date_loaded DESC
but it isn't returning any results and I am not sure what I am doing wrong?
In response to the comment on the OP above, I believe A.cust_id needs to have its leading zeroes trimmed.
If B.cust_id is in a numeric format, you may also have to cast it to a string for the comparison to work. I have included this cast in the join, but if it is not needed because B.cust_id is already a char type, you can remove it and the query will be more efficient. This can be accomplished as follows:
SELECT
A.cust_id,
A.email_addr,
B.preferred_email
FROM
email_feature A
LEFT OUTER JOIN email_feature_sum B ON
(
ltrim(A.cust_id, '0') = to_char(B.cust_id)
AND
A.email_addr != B.preferred_email
)
ORDER BY
A.date_loaded DESC
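
A minimal sketch of why leading zeroes can make the join come up empty, using in-memory SQLite instead of Oracle. It assumes both cust_id columns are stored as text, so the to_char cast is left out; the sample rows are invented. SQLite's ltrim(X, Y) plays the role of Oracle's ltrim here.

```python
import sqlite3

# In-memory SQLite stands in for Oracle; both cust_id columns are assumed
# to be text, and the rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE email_feature (cust_id TEXT, email_addr TEXT);
CREATE TABLE email_feature_sum (cust_id TEXT, preferred_email TEXT);
INSERT INTO email_feature VALUES
  ('0012', 'old@example.com'),
  ('0034', 'same@example.com');
INSERT INTO email_feature_sum VALUES
  ('12', 'new@example.com'),
  ('34', 'same@example.com');
""")

# Plain equality: '0012' never equals '12', so nothing matches.
plain = conn.execute("""
SELECT a.cust_id, a.email_addr, b.preferred_email
FROM email_feature a
JOIN email_feature_sum b
  ON a.cust_id = b.cust_id
 AND a.email_addr <> b.preferred_email
""").fetchall()

# Trimming the leading zeroes lets the ids line up; only the customer
# whose two addresses differ is returned.
trimmed = conn.execute("""
SELECT a.cust_id, a.email_addr, b.preferred_email
FROM email_feature a
JOIN email_feature_sum b
  ON ltrim(a.cust_id, '0') = b.cust_id
 AND a.email_addr <> b.preferred_email
""").fetchall()

print(plain)
print(trimmed)
```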
Perhaps the ordering of the result set is throwing you off. What if you do a regular join?
SELECT ef.cust_id, ef.email_addr, efs.preferred_email
FROM email_feature ef JOIN
email_feature_sum efs
ON ef.cust_id = efs.cust_id and
ef.email_addr <> efs.preferred_email
ORDER BY ef.date_loaded desc;
I think your data does not match the condition. I tested the query with sample data and it ran successfully:
with email_feature as(
select 1 cust_id,'hvv#gmail.com' email_addr from dual
),
email_feature_sum as(
select 1 cust_id,'hvv#gmail2.com' preferred_email from dual
)
SELECT
A.cust_id,
A.email_addr,
B.preferred_email
FROM
email_feature A
LEFT OUTER JOIN email_feature_sum B ON
(
A.cust_id = B.cust_id
AND
A.email_addr != B.preferred_email
)

Transform a correlated subquery into a join

I want to express this:
SELECT
a.*
,b.timestamp_col
FROM weird_data_source a
LEFT JOIN weird_data_source b
ON a.id = b.id
AND b.timestamp_col = (
SELECT
MAX(sub.timestamp_col)
FROM weird_data_source sub
WHERE sub.id = a.id
AND sub.date_col <= a.date_col
AND sub.timestamp_col < a.timestamp_col
)
A couple notes here about the data:
date_col and timestamp_col aren't representing the same thing.
I'm not kidding... the data is really structured like this.
But the subquery is invalid. Netezza cannot handle the < operator in the correlated subquery. For the life of me I cannot figure out an alternative. How could I get around this?
My gut is telling me this could potentially be done with a join, but I haven't been able to be successful at this yet.
There's a dozen or so similar questions, but none of them seem to get at handling this type of inequality.
This should get you pretty close. You will get duplicate rows if there are two rows with the exact same timestamp_col that otherwise meet the criteria, but otherwise you should be good:
SELECT
a.id,
a.some_other_columns, -- Because we NEVER use SELECT *
c.timestamp_col
FROM
weird_data_source a
LEFT OUTER JOIN weird_data_source c ON
c.id = a.id AND
c.date_col <= a.date_col AND
c.timestamp_col < a.timestamp_col
LEFT OUTER JOIN weird_data_source d ON
d.id = a.id AND
d.date_col <= a.date_col AND
d.timestamp_col < a.timestamp_col AND
d.timestamp_col > c.timestamp_col
WHERE
d.id IS NULL
The query is basically looking for a matching row where no other matching row is found with a greater timestamp_col value - hence the d.id IS NULL. That column will only be NULL if no match is found.
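
A minimal sketch of that anti-join pattern on in-memory SQLite instead of Netezza. The three sample rows and the previous_timestamp alias are assumptions. In this sketch the previous timestamp is read from alias c, the candidate row for which no later candidate d exists.

```python
import sqlite3

# In-memory SQLite stands in for Netezza; the rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE weird_data_source (id INTEGER, date_col TEXT, timestamp_col TEXT);
INSERT INTO weird_data_source VALUES
  (1, '2021-01-01', '2021-01-01 08:00'),
  (1, '2021-01-02', '2021-01-02 08:00'),
  (1, '2021-01-03', '2021-01-03 08:00');
""")

# c: any earlier candidate row for a.
# d: a candidate that is later than c; if one exists, c is not the maximum.
# Keeping only rows where d is missing leaves the latest candidate in c.
rows = conn.execute("""
SELECT a.id, a.timestamp_col, c.timestamp_col AS previous_timestamp
FROM weird_data_source a
LEFT JOIN weird_data_source c
  ON c.id = a.id
 AND c.date_col <= a.date_col
 AND c.timestamp_col < a.timestamp_col
LEFT JOIN weird_data_source d
  ON d.id = a.id
 AND d.date_col <= a.date_col
 AND d.timestamp_col < a.timestamp_col
 AND d.timestamp_col > c.timestamp_col
WHERE d.id IS NULL
ORDER BY a.timestamp_col
""").fetchall()

for row in rows:
    print(row)
```

The earliest row has no earlier candidate, so it is kept with a NULL previous_timestamp, which matches what the LEFT JOIN in the original correlated query would give.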

select columns from different tables with different data type columns

I want to know how to write a query, which selects specific columns(not common) from 2 different tables and combine them together.
I tried this, but didn't work:
SELECT ii.sequence
FROM Costs ii
WHERE ii.order_ID IN (SELECT book.order_ID
FROM BookInfo ci
WHERE ii.order_ID = ci.order_ID)
UNION
SELECT ft.released_title
FROM FinishedBook ft
WHERE ft.version IN (SELECT ii.iiversion
FROM Costs ii
WHERE ii.iiorder_ID IN (SELECT ci.order_ID
FROM BookInfo ci
WHERE ii.iiorder_ID = ci.order_ID))
ORDER BY sequence;
Isn't this a case of joining these tables and calling Distinct to avoid duplicates?
Try this:
select Distinct a.Sequence, b.RELEASED_TITLE
from IncludedIn a inner join FinishedTrack b
on a.OriginatesFrom = b.IIOriginatesFrom
Inner join CdInfo c on a.IIALBUM_ID = c.ALBUM_ID
Order By a.Sequence
For MS SQL Server, use a JOIN to get the result:
SELECT I.Sequence, F.Released_Title FROM FinishedTrack AS F
INNER JOIN IncludedIn AS I ON I.ORIGINATESFROM = F.IIORIGINATESFROM
INNER JOIN CdInfo AS A ON A.ALBUM_ID = I.IIALBUM_ID
ORDER BY I.Sequence DESC
You need to use a JOIN instead of a UNION:
SELECT ii.sequence, ft.released_title
FROM IncludedIn ii
INNER JOIN CdInfo ci ON ii.iialbum_id = ci.album_id
INNER JOIN FinishedTrack ft on ft.originatesfrom = ii.iioriginatesfrom
ORDER BY ii.sequence;
This query might work for you
SELECT IncludedIn.SEQUENCE, FinishedTrack.RELEASED_TITLE
FROM FinishedTrack
INNER JOIN IncludedIn
ON FinishedTrack.ORIGINATESFROM=IncludedIn.IIORIGINATESFROM and
FinishedTrack.VERSION=IncludedIn.IIVERSION
ORDER BY IncludedIn.SEQUENCE;
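
A minimal sketch of the difference between UNION and JOIN for this kind of question, on in-memory SQLite. The real schema is not shown in the thread, so the column lists and sample rows are assumptions based on the names used in the answers (IncludedIn, FinishedTrack, CdInfo).

```python
import sqlite3

# In-memory SQLite; the schema is a guess built from the answers' column names.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CdInfo (album_id INTEGER);
CREATE TABLE FinishedTrack (originatesfrom INTEGER, released_title TEXT);
CREATE TABLE IncludedIn (iialbum_id INTEGER, iioriginatesfrom INTEGER, sequence INTEGER);
INSERT INTO CdInfo VALUES (1);
INSERT INTO FinishedTrack VALUES (100, 'Intro'), (101, 'Outro');
INSERT INTO IncludedIn VALUES (1, 101, 2), (1, 100, 1);
""")

# UNION stacks the two result sets into a single column of mixed values.
stacked = conn.execute("""
SELECT sequence FROM IncludedIn
UNION
SELECT released_title FROM FinishedTrack
""").fetchall()

# JOIN puts the columns side by side, one row per matching pair.
joined = conn.execute("""
SELECT ii.sequence, ft.released_title
FROM IncludedIn ii
INNER JOIN CdInfo ci ON ii.iialbum_id = ci.album_id
INNER JOIN FinishedTrack ft ON ft.originatesfrom = ii.iioriginatesfrom
ORDER BY ii.sequence
""").fetchall()

print(len(stacked), "single-column rows from UNION")
print(joined)
```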

Joining two tables on a key and then left outer joining a table on a number of criteria

I'm attempting to join 3 tables together in a single query. The first two have a key so each entry has a matching entry. This joined table will then be joined by a third table that could produce multiple entries for each entry from the first table (the joined ones).
select * from
(select a.bidentifier, a.bsession, a.symbol, b.jidentifier, b.JSession
from trade_monthly a, trade_monthly_second b
where
a.bidentifier = b.jidentifier AND
a.bsession = b.JSession)
left outer join
trade c
on c.symbol = a.symbol
order by a.bidentifier, a.bsession, a.symbol, b.jidentifier, b.JSession, c.symbol
There will be more criteria (not just c.symbol = a.symbol) on the left outer join, but for now this should be useful. How can I nest the queries this way? I'm getting an "SQL command not properly ended" error.
Any help is appreciated.
Thanks
As far as I know, every derived table must be given a name, and the outer query must then refer to its columns through that name, so try something like this:
SELECT * FROM
(SELECT a.bidentifier, ....
...
a.bsession = b.JSession) t
LEFT JOIN trade c
ON c.symbol = t.symbol
ORDER BY t.bidentifier, ...
Anyway I think you could use a simpler query:
SELECT a.bidentifier, a.bsession, a.symbol, b.jidentifier, b.JSession, c.*
FROM trade_monthly a
INNER JOIN trade_monthly_second b
ON a.bidentifier = b.jidentifier
AND a.bsession = b.JSession
LEFT JOIN trade c
ON c.symbol = a.symbol
ORDER BY a.bidentifier, a.bsession, a.symbol, b.jidentifier, b.JSession, c.symbol
Try this:
SELECT
`trade_monthly`.`bidentifier` AS `bidentifier`,
`trade_monthly`.`bsession` AS `bsession`,
`trade_monthly`.`symbol` AS `symbol`,
`trade_monthly_second`.`jidentifier` AS `jidentifier`,
`trade_monthly_second`.`jsession` AS `jsession`
FROM
(
(
`trade_monthly`
JOIN `trade_monthly_second` ON(
(
(
`trade_monthly`.`bidentifier` = `trade_monthly_second`.`jidentifier`
)
AND(
`trade_monthly`.`bsession` = `trade_monthly_second`.`jsession`
)
)
)
)
LEFT JOIN `trade` ON(
(
`trade`.`symbol` = `trade_monthly`.`symbol`
)
)
)
ORDER BY
`trade_monthly`.`bidentifier`,
`trade_monthly`.`bsession`,
`trade_monthly`.`symbol`,
`trade_monthly_second`.`jidentifier`,
`trade_monthly_second`.`jsession`,
`trade`.`symbol`
Why don't you just create a view of the two inner-joined tables? Then you can build a query that joins this view to the trade table, using the left outer join matching criteria.
In my opinion, views are one of the most overlooked solutions to a lot of complex queries.
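
A minimal sketch of the view approach on in-memory SQLite. The view name monthly_pairs, the price column on trade and the sample rows are all assumptions made for illustration.

```python
import sqlite3

# In-memory SQLite; monthly_pairs and trade.price are hypothetical names.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE trade_monthly (bidentifier INTEGER, bsession INTEGER, symbol TEXT);
CREATE TABLE trade_monthly_second (jidentifier INTEGER, jsession INTEGER);
CREATE TABLE trade (symbol TEXT, price REAL);
INSERT INTO trade_monthly VALUES (1, 1, 'ABC'), (2, 1, 'XYZ');
INSERT INTO trade_monthly_second VALUES (1, 1), (2, 1);
INSERT INTO trade VALUES ('ABC', 10.0), ('ABC', 11.0);

-- The view hides the key-based inner join of the first two tables.
CREATE VIEW monthly_pairs AS
SELECT a.bidentifier, a.bsession, a.symbol, b.jidentifier, b.jsession
FROM trade_monthly a
INNER JOIN trade_monthly_second b
  ON a.bidentifier = b.jidentifier
 AND a.bsession = b.jsession;
""")

# The outer query only has to express the left outer join criteria.
rows = conn.execute("""
SELECT m.bidentifier, m.symbol, c.price
FROM monthly_pairs m
LEFT JOIN trade c ON c.symbol = m.symbol
ORDER BY m.bidentifier, c.price
""").fetchall()

for row in rows:
    print(row)
```

The pair with symbol ABC comes back twice, once per matching trade, while XYZ is kept with a NULL price because of the left join.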