Remove duplicates from result in sql - sql

i have following sql in java project:
select distinct * from drivers inner join licenses on drivers.user_id=licenses.issuer_id
inner join users on drivers.user_id=users.id
where (licenses.state='ISSUED' or drivers.status='WAITING')
and users.is_deleted=false
And result i database looks like this:
And i would like to get only one result instead of two duplicated results.
How can i do that?

Solution 1 - That's Because one of data has duplicate value write distinct keyword with only column you want like this
Select distinct id, distinct creation_date, distinct modification_date from
YourTable
Solution 2 - apply distinct only on ID and once you get id you can get all data using in query
select * from yourtable where id in (select distinct id from drivers inner join
licenses
on drivers.user_id=licenses.issuer_id
inner join users on drivers.user_id=users.id
where (licenses.state='ISSUED' or drivers.status='WAITING')
and users.is_deleted=false )

Enum fields name on select, using COALESCE for fields which value is null.

usually you dont query distinct with * (all columns), because it means if one column has the same value but the rest isn't, it will be treated as a different rows. so you have to distinct only the column you want to, then get the data

I suspect that you want left joins like this:
select *
from users u left join
drivers d
on d.user_id = u.id and d.status = 'WAITING' left join
licenses l
on d.user_id = l.issuer_id and l.state = 'ISSUED'
where u.is_deleted = false and
(d.user_id is not null or l.issuer_id is not null);

Related

Can we select first row of data from column in sql?

I have a table with multiple data for same ID. I want to get the first row data for the ID.
I have added the below SQL that I have tried.
SELECT
"client"."id",
"client"."company_name",
"client_details"."address"
from Client
LEFT OUTER JOIN "client_details" ON ("client"."id" = "client_details"."client_id")
Since I have multiple address for the same ID, can we get only the first id?
Currently the output I get is 2 rows with different addresses.
You can add to your SQL LIMIT 1 and in case you want to be sure the order you can also add to your SQL ORDER BY...
You can use distinct on:
select distinct on (c.id) c.id, c.company_name, cd.address
from Client c left join
client_details cd
on c.id = cd.client_id
order by c.id, ?;
The ? is for the column that specifies the ordering (the definition of "first"). I am guessing that cd.id is what you want.
Note that this query removes the double quotes and introduces table aliases. This is easier on both the eyes (to read) and the fingers (to type).
use row_number()
select * from
(
SELECT
"client"."id",
"client"."company_name",
"client_details"."address",row_number() over(partition by "client"."id" order by "client_details"."address") as rn
from Client
LEFT OUTER JOIN "client_details" ON "client"."id" = "client_details"."client_id"
)A where rn=1
If there is a field you can order the results by you could use a lateral join e.g.
SELECT
"client"."id",
"client"."company_name",
"client_details"."address"
from Client
left join lateral (
select *
from client_details cd
where cd.client_id = client.id
order by [some_ordering_field]
limit 1
) "client_details" on true

SQL Group By Throwing Up Error (SQL Server)

I have SQL code that throws up an error saying
Error: SQLCODE=-119, SQLSTATE=42803, SQLERRMC=WONUM
The code works fine until I add the group by:
select *
from workorder
left join labtrans on labtrans.refwo=workorder.wonum and labtrans.siteid=workorder.siteid
left join matusetrans on workorder.wonum=matusetrans.refwo and workorder.siteid=matusetrans.tositeid and linetype not in (select value from synonymdomain where domainid='LINETYPE' and maxvalue='TOOL')
left join locations on locations.location = workorder.location and locations.siteid=workorder.siteid
left join person on personid in (select personid from labor where laborcode = labtrans.laborcode)
left join po on workorder.wonum=po.hflwonum and workorder.siteid=po.siteid and workorder.orgid=po.orgid
left join companies on companies.company = po.vendor and companies.orgid=po.orgid
left join pluspcustomer on pluspcustomer.customer=workorder.pluspcustomer
where workorder.wonum='10192'
group by personid
;
if you only GROUP BY personid, you cannot select everything except personid, OR the fields used by aggregate functions such as SUM,MAX, etc
UPDATE
If you just want to see the duplicate personid, you could use:
select personid
from table
group by personid
But be careful here: If you write query like this, the only field that to determine the duplicate records is persionid, if you need to uniquely identify each persionid from different CompanyId, you need to group by persionid, CompanyId, otherwise, same personId from different company will be considered as the duplicate records.
But if you want to delete those duplicate records, you should use ROW_NUMBER()OVER (Partition by persionid Order by your_criteria) to delete the duplicate records. Try to do some searches to see how does that work, usually I prefer to use that function along with the CTE table expression.
if you just need to remove duplicates, use DISTINCT with your query like this:
your query:
SELECT * FROM .....
modify it:
SELECT DISTINCT * FROM .....
Hope it helps.

How to return rows matched in a table without multiple EXISTS clauses?

I want to pull back results from one table that match ALL specified values where the specified values are in another table. I can do it like this:
SELECT * FROM Contacts
WHERE
EXISTS (SELECT 1 FROM dbo.ContactClassifications WHERE ContactID = Contacts.ID AND ClassificationID = '8C62E5DE-00FC-4994-8127-000B02E10DA5')
AND EXISTS (SELECT 1 FROM dbo.ContactClassifications WHERE ContactID = Contacts.ID AND ClassificationID = 'D2E90AA0-AC93-4406-AF93-0020009A34BA')
AND EXISTS etc...
However that falls over when I get up to about 40 EXISTS clauses. The error message is "The query processor ran out of internal resources and could not produce a query plan. This is a rare event and only expected for extremely complex queries or queries that reference a very large number of tables or partitions. Please simplify the query."
The gist of this is to
Select all contacts with any GUID from the IN statement
Use a DISTINCT COUNT to get a count for each contactid on matching GUID's
Use the HAVING to retain only those contacts that equal the amount of matching GUID's you've put into the IN statement
SQL Statement
SELECT *
FROM dbo.Contacts c
INNER JOIN (
SELECT c.ID
FROM dbo.Contacts c
INNER JOIN dbo.ContactClassifications cc ON c.ID = cc.ContactID
WHERE cc.ClassificationID IN ('..', '..', 38 other GUIDS)
GROUP BY
c.ID
HAVING COUNT(DISTINCT cc.ClassificationID) = 40
) cc ON cc.ID = c.ID
Test script at data.stackexchange
One solution is to demand that no classification exists without a matching contact. That's a double negation:
select *
from contacts c
where not exists
(
select *
from ContactClassifications cc
where not exists
(
select *
from ContactClassifications cc2
where cc2.ContactID = c.ID
and cc2.ClassificationID = cc.ClassificationID
)
)
This type of problem is known as relational division.
SELECT c.*
FROM Contacts c
INNER JOIN
(cc.ContactID, COUNT(DISTINCT cc.ClassificationID) as num_class
FROM ContactClassifications
WHERE ClassificationID IN (....)
GROUP BY cc.ContactID
) b ON c.ID = b.ContactID
WHERE b.num_class = [number of distinct values - how many different values you put in "IN"]
If you run SQLServer 2005 and higher, you can do pretty much the same with CROSS APPLY, supposedly more efficiently

SQL Server Join on Select statement using count() and group by

I have two tables in SQL Server, tbl_disputes and tbl_disputetypes. The tbl_disputes table contains a foreign key column disputetype. The table tbl_disputetypes contains the primary key field disputetypeid and disputetypedesc. The following query gives me a count of each disputetype from the tbl_disputes table.
select disputetype, count(disputetype) as numberof
from tbl_disputes
group by disputetype
What sort of join or subquery do I need to use to display the
tbl_disputetypes.dbo.disputetypedesc instead of tbl_disputes.dbo.disputetype?
EDIT Issue was because disputetypedesc was set as TEXT. I changed it to nvarchar, and the following query worked:
SELECT
tbl_disputetypes.disputetypedesc,
count(tbl_disputetypes.disputetypedesc)
FROM
tbl_disputes Left OUTER JOIN
tbl_disputetypes ON tbl_disputes.disputetype = tbl_disputetypes.disputetypeid
group by tbl_disputetypes.disputetypedesc
Unless I'm missing something, you can just LEFT JOIN the description:
select disputetypedesc, count(disputetype) as numberof
from tbl_disputes d
LEFT JOIN tbl_disputetypes dt
ON dt.disputetypeid = d.disputetype
group by disputetypedesc
Assuming 2005+:
WITH x(t, numberof) AS
(
SELECT disputetype, COUNT(*)
FROM tbl_disputes
GROUP BY disputetype
)
SELECT dt.disputetypedesc, x.numberof
FROM tbl_disputetypes AS dt
INNER JOIN x ON dt.disputetype = x.t;
A simple JOIN?
select
DT.disputetypedesc, count(*) as numberof
from
tbl_disputes D
JOIN
tbl_disputetypes DT ON D.disputetype = DT.disputetype
group by
DT.disputetypedesc
The basic idea is that you will need a sub-query. Something like this will work:
select disputetypedesc, disputetype, numberof
from (select disputetype, count(disputetype) numberof
from tbl_disputes
group by disputetype) t left outer join
tbl_disputetypes on t.disputetype = tbl_disputetypes.disputetype
I am not sure if I understand your question however you should be able to select all columns using a query similar to the code sample below.
The following query will join the two tables by the disputetypeid column. I changed the format of the SQL statement however you can obviously format it however you would like.
SELECT tbl_disputetypes.disputetypedesc
, tbl_disputes.*
, <any_column_from_either_table>
FROM tbl_disputes
INNER JOIN tbl_disputetypes
ON tbl_disputes.disputetypeid = tbl_disputetypes.disputetypeid

Combining data from 2 tables in to 1 query

Hi all
Im having some problems combining data from 2 tables in to 1 query.
Now I have one table-nr1 with raw data of restaurants and in the other table-nr2 I have a number of restaurants that have been graded.
So, now I want to select all restaurants and at the same time select grades of that restaurant from table-nr2 and get the average value of those grades.
How can I do this in a single SQL query?
SELECT r.*,
COALESCE(
(
SELECT AVG(grade)
FROM table_nr2 g
WHERE g.restaurant_id = r.id
), 0)
FROM table-nr1 r
Assuming your restaurants have a name and id, and the your reviews have a grade
SELECT re.name, avg(ra.grade)
FROM table-nr1 re
LEFT JOIN table-nr2 ra ON re.id = ra.id
GROUP BY re.name
You need to group by all fields you want to select which are not aggregated, and left join means you will get all restaurants, irrespective of whether they have any ratings.
You need to perform a join. In this case an inner left join sounds fine, which is the default join. You can use USING syntax if the field that links them is the same on both sides, so you would end up with something like this:
SELECT table-nr1.*, AVG(table-nr2.score)
FROM table-nr1
JOIN table-nr2 USING (restrauntId)
Otherwise you could do something that links them using an on clause like this:
SELECT table-nr1.*, AVG(table-nr2.score)
FROM table-nr1
JOIN table-nr2 ON (table-nr1.restrauntId = table-nr2.restrauntId)