Finding non distinct rows


Assuming the following structure, I need to find if there are cases where there's more than one DESC for each CP4+CP3 combination. I need only to know if they exist. Not where they are.
CP4, integer
CP3, integer
DESC, varchar(50)

You can check using a self join: if the following query returns any rows, there's more than one DESC for a CP4+CP3 combination:
select *
from YourTable a
inner join YourTable b
on a.CP3 = b.CP3
and a.CP4 = b.CP4
and a.DESC <> b.DESC
A group by would work too:
select count(*)
from YourTable
group by CP3, CP4
having count(distinct DESC) > 1
By the way, DESC is a SQL keyword; you might have to escape it in a way specific to your database.
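
As a rough illustration of the group by approach, here is a minimal sketch using Python's built-in sqlite3 module. The table name YourTable comes from the answer above; the sample rows, the helper name has_conflicting_desc, and the choice of SQLite (where DESC is escaped with double quotes) are assumptions made only for this example.

```python
import sqlite3

def has_conflicting_desc(conn):
    """Return True if any CP4+CP3 combination has more than one distinct DESC."""
    row = conn.execute(
        '''
        SELECT EXISTS (
            SELECT 1
            FROM YourTable
            GROUP BY CP4, CP3
            HAVING COUNT(DISTINCT "DESC") > 1
        )
        '''
    ).fetchone()
    return bool(row[0])

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE YourTable (CP4 INTEGER, CP3 INTEGER, "DESC" VARCHAR(50))')
# Made-up rows: the same description repeated is not a conflict.
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?)",
    [(1000, 1, "Alpha"), (1000, 1, "Alpha"), (2000, 5, "Beta")],
)
only_repeats = has_conflicting_desc(conn)

# Now 2000+5 carries two different descriptions.
conn.execute("INSERT INTO YourTable VALUES (2000, 5, 'Gamma')")
with_conflict = has_conflicting_desc(conn)

print(only_repeats, with_conflict)
```

Wrapping the grouped query in EXISTS is one way to get a plain yes/no answer, which matches the "I need only to know if they exist" requirement.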

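SELECT COUNT(DISTINCT [DESC])
FROM [Table]
GROUP BY CP4, CP3
HAVING COUNT(DISTINCT [DESC]) > 1
If this query returns any rows, the answer is true; otherwise it is false. The DISTINCT matters: without it, a combination that merely repeats the same DESC would also be reported.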

Related

Listing multiple columns in a single row in SQL

(select ID,EXTERNAL_TRANSACTION_ID,EXTERNAL_TRANSACTION_TYPE,ROW_NUMBER() OVER(PARTITION BY EXTERNAL_TRANSACTION_ID ORDER BY ID ) AS SEQNUM
from AC_POS_TRANSACTION_TRK aptt WHERE [RESULT] ='Success'
GROUP BY ID, EXTERNAL_TRANSACTION_ID,EXTERNAL_TRANSACTION_TYPE )
Hello,
From the query above, I want to get the rows with seqnum=1 and seqnum=2 for each transaction id.
But if a transaction id has no second row (seqnum=2), I don't want to get any row for that transaction id.
Thanks!!
Something like this

I'm not 100% sure this is correct without your table definition, but my understanding is that you want to EXCLUDE records if that record has an entry with seqnum=2. You can't use a where clause alone because that would still return seqnum = 1.
You can use an exists/not exists or in/not in clause like this:
(select ID,EXTERNAL_TRANSACTION_ID,EXTERNAL_TRANSACTION_TYPE,ROW_NUMBER() OVER(PARTITION BY EXTERNAL_TRANSACTION_ID ORDER BY ID ) AS SEQNUM
from AC_POS_TRANSACTION_TRK aptt WHERE [RESULT] ='Success'
and not exists ( select 1 from AC_POS_TRANSACTION_TRK a where a.id = aptt.id
and a.seqnum = 2)
GROUP BY ID, EXTERNAL_TRANSACTION_ID,EXTERNAL_TRANSACTION_TYPE )
Basically, this excludes a record whenever a matching record exists as specified in the NOT EXISTS subquery.

One option you can try is to add a count of rows per group using the same partitioning criteria and then filter accordingly. I'm not entirely sure about your query without seeing it in context and with sample data: there's no aggregation, so why use group by?
However, can you try something along these lines:
select * from (
select ID,EXTERNAL_TRANSACTION_ID,EXTERNAL_TRANSACTION_TYPE,
Row_Number() over(partition by EXTERNAL_TRANSACTION_ID order by ID) as SEQNUM,
Count(*) over(partition by EXTERNAL_TRANSACTION_ID) Qty
from AC_POS_TRANSACTION_TRK
where [RESULT] ='Success'
)x
where SEQNUM in (1,2) and Qty>1
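
To see how the Qty filter behaves, here is a small sketch run against an in-memory SQLite database from Python. The sample rows are invented, the EXTERNAL_TRANSACTION_TYPE column is left out for brevity, and SQLite (3.25 or later, for window functions) is only a stand-in for the asker's actual database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE AC_POS_TRANSACTION_TRK ("
    "ID INTEGER, EXTERNAL_TRANSACTION_ID TEXT, RESULT TEXT)"
)
conn.executemany(
    "INSERT INTO AC_POS_TRANSACTION_TRK VALUES (?, ?, ?)",
    [
        (1, "T1", "Success"),
        (2, "T1", "Success"),
        (3, "T1", "Success"),  # third row: dropped by SEQNUM IN (1, 2)
        (4, "T2", "Success"),  # only one successful row: T2 disappears
        (5, "T3", "Failed"),   # filtered out before counting
        (6, "T3", "Success"),  # so T3 also has Qty = 1
    ],
)
rows = conn.execute(
    """
    SELECT ID, EXTERNAL_TRANSACTION_ID, SEQNUM
    FROM (
        SELECT ID, EXTERNAL_TRANSACTION_ID,
               ROW_NUMBER() OVER (PARTITION BY EXTERNAL_TRANSACTION_ID ORDER BY ID) AS SEQNUM,
               COUNT(*) OVER (PARTITION BY EXTERNAL_TRANSACTION_ID) AS Qty
        FROM AC_POS_TRANSACTION_TRK
        WHERE RESULT = 'Success'
    ) x
    WHERE SEQNUM IN (1, 2) AND Qty > 1
    ORDER BY ID
    """
).fetchall()
print(rows)
```

Because both window functions use the same partition, Qty > 1 is equivalent to "a row with SEQNUM = 2 exists for this transaction id".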

This should do the job.
With Qry As (
-- Your original query goes here
)
Select Qry.*
From Qry
Where Exists (
Select *
From Qry Qry1
Where Qry1.EXTERNAL_TRANSACTION_ID = Qry.EXTERNAL_TRANSACTION_ID
And Qry1.SEQNUM = 1
)
And Exists (
Select *
From Qry Qry2
Where Qry2.EXTERNAL_TRANSACTION_ID = Qry.EXTERNAL_TRANSACTION_ID
And Qry2.SEQNUM = 2
)
BTW, your original query looks problematic to me. Specifically, I think those columns should be in the PARTITION BY of the OVER clause rather than in a GROUP BY, but without knowing more about the table structures and what you're trying to achieve, I could not say for sure.

Using row_number() in subquery results in ORA-00913: too many values

In Oracle, I wish to do something like the SQL below. For each row in "criteria," I want to find the latest row in another table (by last_modified_date) for the same location_id, and use that value to set default_start_interval. Or, if there is no such value, then use 30. However, as you can see, the subquery must have two values in the select statement to use row_number(). That causes an error. How do I reformat it so that it works?
update criteria pc set default_start_interval =
COALESCE(
(SELECT start_interval,
row_number() over(partition by aday.location_id
order by atime.last_modified_date desc
) as rn
FROM available_time atime
JOIN available_day aday ON aday.available_day_id = atime.available_day_id
WHERE aday.location_id = pc.location_id
and rn = 1)
, 30)

There are two issues in your update query:
The update expects only one value per row for default_start_interval, however, you have two columns in the select list.
The row number must be assigned first, in an inner query, and the filter rn = 1 then applied in the outer query.
Your update query should look like:
UPDATE criteria pc
SET default_start_interval = NVL(
(
SELECT start_interval FROM(
SELECT
start_interval, ROW_NUMBER() OVER(
PARTITION BY aday.location_id
ORDER BY atime.last_modified_date DESC
) AS rn
FROM
available_time atime
JOIN available_day aday ON aday.available_day_id = atime.available_day_id
WHERE
aday.location_id = pc.location_id
)
WHERE rn = 1)
, 30)
Note: You could simply use NVL instead of COALESCE as you only have one value to check for NULL. COALESCE is useful when you have multiple expressions.

I think a simpler method uses aggregation and keep to get the value you want:
update criteria pc
set default_start_interval =
(select coalesce(max(start_interval) keep (dense_rank first order by atime.last_modified_date desc), 30)
from available_time atime join
available_day aday
on aday.available_day_id = atime.available_day_id
where aday.location_id = pc.location_id
);
An aggregation query with no GROUP BY always returns one row. If no rows match, then the returned value is NULL -- the COALESCE() captures this case.
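
The one-row behaviour is easy to check with a small sketch. This uses Python's sqlite3 module rather than Oracle, a simplified single table with made-up values, and a plain MAX() instead of the Oracle-specific KEEP clause, so it only illustrates the COALESCE part of the idea.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE available_time (location_id INTEGER, start_interval INTEGER)")
conn.execute("INSERT INTO available_time VALUES (1, 15)")

query = (
    "SELECT COALESCE(MAX(start_interval), 30) "
    "FROM available_time WHERE location_id = ?"
)
# A matching location yields one row holding the real value.
matched = conn.execute(query, (1,)).fetchall()
# No matching rows still yields one row; MAX() is NULL and COALESCE turns it into 30.
unmatched = conn.execute(query, (99,)).fetchall()
print(matched, unmatched)
```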

SQL Oracle Find Max of count

I have this table called item:
| PERSON_id | ITEM_id |
|-----------|---------|
| CP2       | A03     |
| CP2       | A02     |
| HB3       | A02     |
| BW4       | A01     |
I need an SQL statement that would output the person with the most Items. Not really sure where to start either.

I advise you to use an inner query for this. The inner query does the group by and order by, and the outer query selects the first row, which is the person with the most items.
SELECT * FROM
(
SELECT PERSON_ID, COUNT(*) FROM TABLE1
GROUP BY PERSON_ID
ORDER BY 2 DESC
)
WHERE ROWNUM = 1
Here is the SQL Fiddle link: http://sqlfiddle.com/#!4/4c4228/5

Locating the maximum of an aggregated column requires more than a single calculation, so here you can use a "common table expression" (cte) to hold the result and then re-use that result in a where clause:
with cte as (
select
person_id
, count(item_id) count_items
from mytable
group by
person_id
)
select
*
from cte
where count_items = (select max(count_items) from cte)
Note: if more than one person shares the same maximum count, more than one row will be returned by this query.
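
Here is a short sketch of the cte approach, run with Python's sqlite3 module against the four rows from the question. SQLite stands in for Oracle here, and the extra row added to force a tie is made up for the example.

```python
import sqlite3

QUERY = """
WITH cte AS (
    SELECT person_id, COUNT(item_id) AS count_items
    FROM mytable
    GROUP BY person_id
)
SELECT person_id, count_items
FROM cte
WHERE count_items = (SELECT MAX(count_items) FROM cte)
ORDER BY person_id
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (person_id TEXT, item_id TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [("CP2", "A03"), ("CP2", "A02"), ("HB3", "A02"), ("BW4", "A01")],
)
# With the question's data, CP2 alone holds the maximum.
top = conn.execute(QUERY).fetchall()

# Hypothetical extra row: HB3 now ties with CP2, so both come back.
conn.execute("INSERT INTO mytable VALUES ('HB3', 'A01')")
tied = conn.execute(QUERY).fetchall()
print(top, tied)
```

Returning every tied person is the main behavioural difference from the ROWNUM = 1 answer, which picks a single row.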

SQL query last transactions

You can't see on the image but I have many till_id numbers. (1,2,3,4,5).
What I want to do is just showing the last "trans_num" without repeating the till_id.
For example:
till_id trans_num
1 14211
2 14333
3 14555

A typical way to do this is:
select t.*
from t
where t.trans_date = (select max(t2.trans_date)
from t t2
where t2.till_id = t.till_id
);
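
A quick way to try the correlated subquery is the sketch below, using Python's sqlite3 module. The table name transactions, the column trans_date as an ISO date string, and all dates and older trans_num values are assumptions for illustration; only the three expected till_id/trans_num pairs come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions (till_id INTEGER, trans_num INTEGER, trans_date TEXT)"
)
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?)",
    [
        (1, 14200, "2020-01-01"),
        (1, 14211, "2020-01-02"),
        (2, 14333, "2020-01-01"),
        (3, 14500, "2020-01-01"),
        (3, 14555, "2020-01-03"),
    ],
)
# For each row, keep it only if its date is the latest one for its till.
latest = conn.execute(
    """
    SELECT t.till_id, t.trans_num
    FROM transactions t
    WHERE t.trans_date = (SELECT MAX(t2.trans_date)
                          FROM transactions t2
                          WHERE t2.till_id = t.till_id)
    ORDER BY t.till_id
    """
).fetchall()
print(latest)
```

If two transactions on the same till share the latest date, both rows are returned, so a tie-breaker may be needed on real data.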

select till_id, trans_num, max(trans_date) from tableA
group by till_id, trans_num
Filter the columns you need in an outer query, or move the inner query into the where condition.

You can use group by till_id in subselect
Select a.till_id, a.trans_num
from your_table as a
where (a.trans_date, a.till_id) in (select max(b.trans_date), b.till_id
from your_table as b
group by b.till_id
);

Select a NON-DISTINCT column in a query that return distincts rows

The following query returns the results that I need, but I have to add the ID of the row so that I can then update it. If I add the ID directly in the select statement, it returns more results than I need, because each ID is unique and so DISTINCT sees every line as unique.
SELECT DISTINCT ucpse.MemberID, ucpse.ProductID, ucpse.UserID
FROM UserCustomerProductSalaryExceptions as ucpse
WHERE EXISTS (SELECT NULL
FROM UserCustomerProductSalaryExceptions as upcse2
WHERE ucpse.userid = upcse2.userid AND ucpse.MemberID = upcse2.MemberID AND ucpse.ProductID = upcse2.ProductID
GROUP BY upcse2.UserID, upcse2.memberid, upcse2.productid
HAVING COUNT(UserID) >= 2
)
So basically I need to add ucpse.ID in the Select statement while keeping DISTINCT values for MemberID,ProductID and UserID.
Any Ideas ?
Thank you

According to your comment:
If the data has been duplicated 67 times for a given employee with a given product and a given client, I need to keep only one of those records. It's not important which one, so this is why I use DISTINCT to obtain unique combinations of a given employee with a given product and a given client.
You can use MIN() or MAX() and GROUP BY instead of DISTINCT
SELECT MAX(ucpse.ID) AS ID, ucpse.MemberID, ucpse.ProductID, ucpse.UserID
FROM UserCustomerProductSalaryExceptions as ucpse
WHERE EXISTS (SELECT NULL
FROM UserCustomerProductSalaryExceptions as upcse2
WHERE ucpse.userid = upcse2.userid AND ucpse.MemberID = upcse2.MemberID AND ucpse.ProductID = upcse2.ProductID
GROUP BY upcse2.UserID, upcse2.memberid, upcse2.productid
HAVING COUNT(UserID) >= 2
)
GROUP BY ucpse.MemberID, ucpse.ProductID, ucpse.UserID
UPDATE:
From your comments I think the below query is what you need. It keeps the row with the highest ID for each combination and deletes the rest; there is no HAVING filter, so combinations that have only one row keep it.
DELETE FROM UserCustomerProductSalaryExceptions
WHERE ID NOT IN ( SELECT MAX(ucpse.ID) AS ID
FROM UserCustomerProductSalaryExceptions AS ucpse
GROUP BY ucpse.MemberID, ucpse.ProductID, ucpse.UserID
)
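
To illustrate keeping one row per MemberID, ProductID, UserID combination with MAX(ID) and GROUP BY, here is a minimal sketch using Python's sqlite3 module. The sample rows are invented and SQLite is only a stand-in for the real database. The subquery groups without a HAVING filter so that combinations with a single row keep their only row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE UserCustomerProductSalaryExceptions "
    "(ID INTEGER PRIMARY KEY, MemberID INTEGER, ProductID INTEGER, UserID INTEGER)"
)
conn.executemany(
    "INSERT INTO UserCustomerProductSalaryExceptions VALUES (?, ?, ?, ?)",
    [(1, 10, 100, 7), (2, 10, 100, 7), (3, 10, 100, 7), (4, 20, 200, 8)],
)
# One surviving ID per combination: the highest ID in each group.
keep_ids = [
    row[0]
    for row in conn.execute(
        "SELECT MAX(ID) FROM UserCustomerProductSalaryExceptions "
        "GROUP BY MemberID, ProductID, UserID ORDER BY 1"
    )
]
# Delete every row that is not the survivor of its group.
conn.execute(
    "DELETE FROM UserCustomerProductSalaryExceptions WHERE ID NOT IN ("
    "SELECT MAX(ID) FROM UserCustomerProductSalaryExceptions "
    "GROUP BY MemberID, ProductID, UserID)"
)
remaining = conn.execute(
    "SELECT ID FROM UserCustomerProductSalaryExceptions ORDER BY ID"
).fetchall()
print(keep_ids, remaining)
```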

If all you want is to delete the duplicates, this will do it:
WITH X AS
(SELECT ID,
ROW_NUMBER() OVER (PARTITION BY MemberID, ProductID, UserID ORDER BY ID) AS DupRowNum
FROM UserCustomerProductSalaryExceptions
)
DELETE X WHERE DupRowNum > 1

ID's not necessary - try:
UPDATE uu SET
<your settings here>
FROM UserCustomerProductSalaryExceptions uu
JOIN ( <paste your entire query above here>
) uc ON uc.MemberID=uu.MemberId AND uc.ProductID=uu.ProductId AND uc.UserID=uu.UserId

From the sound of your data structure (which I would STRONGLY advise normalizing as soon as possible), it sounds like you should be updating all the records. It sounds as if each duplicate is important because it contains some information about an employee's relation to a customer or product.
I would probably update all the records. Try this:
UPDATE UCPSE
SET
--Do your updates here
FROM UserCustomerProductSalaryExceptions as ucpse
JOIN
(
SELECT UserID, MemberID, ProductID
FROM UserCustomerProductSalaryExceptions
GROUP BY UserID, MemberID, ProductID
HAVING COUNT(UserID) >= 2
) T
ON ucpse.UserID = T.UserID AND ucpse.MemberID = T.MemberID AND ucpse.ProductID = T.ProductID

We Keep Coding
