SQL logic to find/drop duplicates of column combination

Hi, I have a SQL query that results in output like this:
I want the output to look something like this:
Requirement:
No two rows should be duplicates of each other, where rows count as duplicates if CONCATENATE(column one + column two) of one equals CONCATENATE(column one + column two) or CONCATENATE(column two + column one) of the other.
Of the duplicates, the row with the lower value in column three is dropped.

You can join the table with itself to find related pairs. Then discarding the unneeded ones becomes easier:
select x.*
from t x
left join t y on y.one = x.two and x.one = y.two
where y.one is null or x.one < y.one
EDIT:
If the values already come from a query, you can reuse it by placing it in a common table expression (the with clause) in front of this one. For example:
with t as (
-- your long query here
)
select x.*
from t x
left join t y on y.one = x.two and x.one = y.two
where y.one is null or x.one < y.one
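
The query above decides which row of a mirrored pair survives by comparing column one. Since the requirement says the row with the lower value in column three should be dropped, a variant that compares column three may be closer to what was asked. The following is a minimal sketch using Python's sqlite3 module; the table t, the column names one, two, three and the sample rows are assumptions made for illustration, and the tie-break on column one is my own addition.

```python
import sqlite3

# Hypothetical sample data; the real rows would come from the asker's query.
conn = sqlite3.connect(":memory:")
conn.execute("create table t (one text, two text, three integer)")
conn.executemany(
    "insert into t values (?, ?, ?)",
    [("A", "B", 10), ("B", "A", 7), ("C", "D", 5)],
)

# Self-join each row to its mirrored partner (one/two swapped).
# Keep a row when it has no partner, or when its column three is higher.
# Ties on three fall back to column one so exactly one row of the pair remains.
rows = conn.execute(
    """
    select x.one, x.two, x.three
    from t x
    left join t y on y.one = x.two and y.two = x.one
    where y.one is null
       or x.three > y.three
       or (x.three = y.three and x.one < y.one)
    order by x.one
    """
).fetchall()
print(rows)
```

With this sample data, the row (B, A, 7) is dropped because its mirrored partner (A, B, 10) has the higher value in column three.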

Related

Select only the last date when two columns are duplicate

I need to select seven columns from three different tables, only when one of the columns has a particular value. I also need to select only the last date when two columns (TAGNAME and TAGNUMMER) are both duplicate. I'm using the following code:
select c.AKEY, c.AKT_DATUM, c.TAGNAME, c.TAGNUMMER,
cd.TEILANLAGEN_ID, x.TP_GSAP_KZ, c.KLASSEN_ID
from T0EM01 c, T0EM03 x, T0AD07 cd
where cd.TEILANLAGEN_ID = '219A'
inner join
(select c.TAGNAME and c.TAGNUMMER max(C.AKT_DATUM)
where T0EM01 c c.TAGNAME and T0EM01 c c.TAGNUMMER = m.max_date
Up to where cd.TEILANLAGEN_ID = '219A' it works fine (but there are over 2 million rows).
How can I filter so that when both TAGNAME and TAGNUMMER are repeated in two or more rows I only select the latest date?

"Over 2 million rows" could be fewer if you properly joined those 3 tables. As written, the FROM clause produces a Cartesian join, which returns far too many rows:
from t0em01 c,
t0em03 x,
t0ad07 cd
I have no idea how they are to be joined to each other, so I'm just guessing; you should know.
As for the "max date value", one option is a correlated subquery, also properly joined to the other table(s). Once again, I don't know exactly how to join them.
An improved version:
select c.akey,
c.akt_datum,
c.tagname,
c.tagnummer,
cd.teilanlagen_id,
x.tp_gsap_kz,
c.klassen_id
from t0em01 c join t0em03 x on x.id = c.id --> I'm just
join t0ad07 cd on cd.id = c.id -- guessing here
where cd.teilanlagen_id = '219A'
and c.akt_datum = (select max(c1.akt_datum) --> subquery, to return
from t0em01 c1 -- only the MAX date value
where c1.tagname = c.tagname
and c1.tagnummer = c.tagnummer
);
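
To see the correlated subquery in isolation, here is a small sketch run through Python's sqlite3 module. It uses only the t0em01 table with a reduced, assumed set of columns and made-up rows, and stores dates as ISO strings so that max() orders them correctly; the joins to the other two tables are left out because their join keys are unknown.

```python
import sqlite3

# Hypothetical rows: PUMP/001 appears twice, VALVE/002 once.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table t0em01 (akey integer, akt_datum text, tagname text, tagnummer text)"
)
conn.executemany(
    "insert into t0em01 values (?, ?, ?, ?)",
    [
        (1, "2020-01-01", "PUMP", "001"),
        (2, "2021-06-15", "PUMP", "001"),
        (3, "2019-03-10", "VALVE", "002"),
    ],
)

# For every row, the subquery looks up the latest date among rows sharing
# the same tagname and tagnummer; only rows carrying that date are kept.
latest = conn.execute(
    """
    select c.akey, c.akt_datum, c.tagname, c.tagnummer
    from t0em01 c
    where c.akt_datum = (select max(c1.akt_datum)
                         from t0em01 c1
                         where c1.tagname = c.tagname
                           and c1.tagnummer = c.tagnummer)
    order by c.akey
    """
).fetchall()
print(latest)
```

Note that if two rows of the same tag share the latest date, both are returned.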

How to use NOT EXISTS and JOIN at the same time?

We are required to display each UID that is in BOS_BARCODE_IT_LOG but does not exist in BOS_BARCODE_DO_LOG.
The reason I joined with OITM is that the user asked for the selection criteria in SAP Business One.
SELECT X0."DATE",X0."ITEMCODE", X0."UID"
FROM "BOS_BARCODE_IT_LOG" X0 JOIN OITM X1 ON
X0."ITEMCODE" = X1."ItemCode"
WHERE
X1."ItemCode" = '[%0]'
AND NOT EXISTS (
SELECT X2."UID" FROM "BOS_BARCODE_DO_LOG" X2
WHERE X0."ITEMCODE" = X2."ITEMCODE" AND
X0."UID" = X2."UID" AND
X0."DATE" = X2."DATE"
)
We need that '[%0]' in order to display
The problem is that when I search for any item there, the query returns no result.
What I've tried:
I selected from only one table, without OITM, but it still did not show the Query Selection Criteria.
I tried this as well:
SELECT X0."DATE",X0."ITEMCODE", X0."UID"
FROM "BOS_BARCODE_IT_LOG" X0
WHERE
X0."ITEMCODE" = '[%0]' AND
NOT EXISTS (
SELECT X1."UID" FROM "BOS_BARCODE_DO_LOG" X1
WHERE X0."ITEMCODE" = X1."ITEMCODE" AND
X0."UID" = X1."UID" AND
X0."DATE" = X1."DATE"
)
And it produces an error:
1). [SAP AG][LIBODBCHDB DLL][HDBODBC] Syntax error or access violation;257 sql syntax error: incorrect syntax near ")": line 14 col 1 'Received Alerts' (OAIB) (at pos 299)
Another thing: is it possible to use a subquery with more than one table and still get the expected result (not just a blank result)?
From what I learned in the tutorial, a subquery is only used with one table.
This is one of the examples:
https://www.tutorialspoint.com/sql/sql-sub-queries.htm
Thanks.

We are required to display UID that is in BOS_BARCODE_IT_LOG but not exists in BOS_BARCODE_DO_LOG.
I don't understand how your queries actually relate to the question (they have one additional table, oitm, and many more columns). From the description of your question, it seems like you want:
select bi.uid
from bos_barcode_it_log bi
where not exists (select 1 from bos_barcode_do_log bd where bd.uid = bi.uid)
If you need oitm for filtering, you can join (provided there is at most one row per itemcode in oitm), or use exists:
select bi.uid
from bos_barcode_it_log bi
where
exists (select 1 from oitm o where o.itemcode = bi.itemcode)
and not exists (select 1 from bos_barcode_do_log bd where bd.uid = bi.uid)
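
The exists / not exists combination can be tried out on a toy data set. The sketch below uses Python's sqlite3 module; the table layouts are cut down to the columns the query touches and the rows are invented, so treat it as an illustration rather than the real SAP schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table bos_barcode_it_log (itemcode text, uid text);
    create table bos_barcode_do_log (itemcode text, uid text);
    create table oitm (itemcode text);
    -- hypothetical rows: U2 was already delivered, I2 is not an oitm item
    insert into bos_barcode_it_log values ('I1', 'U1'), ('I1', 'U2'), ('I2', 'U3');
    insert into bos_barcode_do_log values ('I1', 'U2');
    insert into oitm values ('I1');
    """
)

# exists keeps rows whose itemcode is known to oitm;
# not exists removes rows whose uid already appears in the DO log.
uids = conn.execute(
    """
    select bi.uid
    from bos_barcode_it_log bi
    where exists (select 1 from oitm o where o.itemcode = bi.itemcode)
      and not exists (select 1 from bos_barcode_do_log bd where bd.uid = bi.uid)
    order by bi.uid
    """
).fetchall()
print(uids)
```

This also answers the side question: a query may contain several subqueries, each against a different table, and each subquery may itself reference more than one table.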

Refer to another table and return data adjacent to Max() result

I have the following two tables:
Using SQL Server 2012, I want to know the INTERVAL from the Hourly table where the MaxWaitTime and Split match what comes from the Daily table for each day. I am assuming I need to use a window function here, but I can't figure out the right answer.
There may be times where MaxWaitTime is 0 for an entire day, and thus all rows from the hourly table match. In this scenario, I would prefer a Null answer, but the earliest INTERVAL for that day would be fine.
There will also be times where multiple INTERVALs have the same wait time. In this scenario the first INTERVAL where the MaxWaitTime is present that day should be returned.

You can use outer apply if you want at most one match:
select d.*, h.interval as maxinterval
from daily d outer apply
(select top 1 h.*
from hourly h
where convert(date, h.interval) = d.row_date and
h.split = d.split and
h.maxwaittime = d.maxwaittime
order by h.interval asc
) h;
If you want NULL for multiple matches, you can do something similar:
select d.*, h.maxinterval
from daily d outer apply
(select max(h.interval) as maxinterval -- the group has exactly one row, so this is that row's interval
from hourly h
where convert(date, h.interval) = d.row_date and
h.split = d.split and
h.maxwaittime = d.maxwaittime
group by h.split, h.maxwaittime
having count(*) = 1 -- no row (and therefore NULL) when several intervals match
) h;

Looks like a simple left join should work between the tables. I'm simply going by the data shown above...
The query should look something like this. If the join fails, then a NULL will be returned. Give it a go..
select daily.* ,hourly.callsoffered, hourly.interval as maxinterval
from daily
left join hourly
on convert(date,hourly.interval) = daily.row_date
and hourly.split = daily.split
and hourly.maxwaittime = daily.maxwaittime
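
One caveat with the plain left join: when several hourly rows match the same daily row (tied wait times, or a day where MaxWaitTime is 0 throughout), each match produces its own output row. The sketch below shows this with Python's sqlite3 module and invented data. SQLite has neither convert(date, ...) nor outer apply, so date() stands in for the conversion and min() over the join stands in for picking the earliest interval; both substitutions are mine, not part of the answers above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table daily (row_date text, split integer, maxwaittime integer);
    create table hourly (interval text, split integer, maxwaittime integer);
    -- hypothetical data: day 1 has a tie at 30, day 2 is 0 all day
    insert into daily values ('2015-01-01', 1, 30), ('2015-01-02', 1, 0);
    insert into hourly values
        ('2015-01-01 09:00:00', 1, 30),
        ('2015-01-01 10:00:00', 1, 30),
        ('2015-01-01 11:00:00', 1, 5),
        ('2015-01-02 09:00:00', 1, 0),
        ('2015-01-02 10:00:00', 1, 0);
    """
)

join_condition = """
    date(h.interval) = d.row_date
    and h.split = d.split
    and h.maxwaittime = d.maxwaittime
"""

# Plain left join: one output row per matching hourly row.
joined = conn.execute(
    f"""
    select d.row_date, h.interval
    from daily d
    left join hourly h on {join_condition}
    order by d.row_date, h.interval
    """
).fetchall()

# Collapsing with min() keeps only the earliest matching interval per day.
earliest = conn.execute(
    f"""
    select d.row_date, min(h.interval)
    from daily d
    left join hourly h on {join_condition}
    group by d.row_date, d.split
    order by d.row_date
    """
).fetchall()
print(len(joined), earliest)
```

Here the plain join yields four rows for two days, while the grouped version yields one row per day.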

Return overlapping date records in SQL

I used the following query to fetch the overlapping records in SQL:
SELECT QUOTE_ID,FUNCTION_ID,FUNCTION_DT,FUNC_SPACE_ID,FN_START_TIME,FN_END_TIME,DATE_AUTH_LEVEL
FROM R_13_ALL_RESERVED A
WHERE
A.FUNC_SPACE_ID = '401-ZFU-52'
AND A.FUNCTION_DT = TO_DATE('09/03/2015','MM/DD/YYYY')
AND EXISTS ( SELECT 'X'
FROM R_13_ALL_RESERVED B
WHERE A.PROPERTY = B.PROPERTY
AND A.FUNCTION_DT = B.FUNCTION_DT
AND A.FUNCTION_ID <> B.FUNCTION_ID
AND ( ( A.FN_START_TIME > B.FN_START_TIME
AND A.FN_START_TIME < B.FN_END_TIME)
OR ( B.FN_START_TIME > A.FN_START_TIME
AND B.FN_START_TIME < A.FN_END_TIME)
OR ( A.FN_START_TIME = B.FN_START_TIME
AND A.FN_END_TIME = B.FN_END_TIME)
)
)
But even though the dates are not overlapping, it still returns the records as overlapping.
Am I missing something here?
Also, if the date records overlap, I need to compare the count of overlapping function_id records with DATE_AUTH_LEVEL: if 2 function_id records overlap, so that the count of function_id is 2, and DATE_AUTH_LEVEL is 1, such records should be in the result set.
Please find the data set in SQLFiddle
http://sqlfiddle.com/#!9/95874/1
Desired output: the SQL should return overlapping FN_START_TIME and FN_END_TIME for a function_space_id and its function_dt.
In the provided example, rows 5 and 6 overlap for the function space id '401-ZFU-12' and function_dt 'August, 15 2015'; none of the others overlap.

The simplest predicate (where clause condition) for detecting the overlap of two ranges is to compare the start of the first range with the end of the 2nd range, and the start of the 2nd range with the end of the first range:
WHERE R1.Start_Date <= R2.End_Date
AND R2.Start_Date <= R1.End_Date
As you can see, each of the two inequalities looks at a start and an end value from separate records (R1 and R2, then R2 and R1). All that remains is to add the conditions that correlate the records and ensure that you aren't comparing a row to itself. So if you want to find all Common_IDs that have Distinct_IDs with overlapping date ranges:
select *
from Your_Table R1
where exists (select 1 from Your_Table R2
where R1.Common_ID = R2.Common_ID
and R1.Distinct_ID <> R2.Distinct_ID
and R1.Start_Date <= R2.End_Date
and R2.Start_Date <= R1.End_Date)
If there is no Distinct_ID to use, you can use R1.rowid <> R2.rowid in place of R1.Distinct_ID <> R2.Distinct_ID
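
A quick way to convince yourself that the two-inequality predicate is enough is to run it on a few hand-made ranges. The sketch below does that with Python's sqlite3 module, reusing the placeholder names Your_Table, Common_ID, Distinct_ID, Start_Date and End_Date from the query above; the rows themselves are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table Your_Table "
    "(Common_ID integer, Distinct_ID integer, Start_Date text, End_Date text)"
)
conn.executemany(
    "insert into Your_Table values (?, ?, ?, ?)",
    [
        (1, 1, "2015-08-15 09:00", "2015-08-15 11:00"),  # overlaps row 2
        (1, 2, "2015-08-15 10:00", "2015-08-15 12:00"),  # overlaps row 1
        (1, 3, "2015-08-15 13:00", "2015-08-15 14:00"),  # stands alone
        (2, 4, "2015-08-15 09:30", "2015-08-15 10:30"),  # different Common_ID
    ],
)

# A row is reported when another row with the same Common_ID starts before
# this one ends and ends after this one starts.
overlapping = conn.execute(
    """
    select R1.Distinct_ID
    from Your_Table R1
    where exists (select 1 from Your_Table R2
                  where R1.Common_ID = R2.Common_ID
                    and R1.Distinct_ID <> R2.Distinct_ID
                    and R1.Start_Date <= R2.End_Date
                    and R2.Start_Date <= R1.End_Date)
    order by R1.Distinct_ID
    """
).fetchall()
print(overlapping)
```

Because the comparisons use <=, ranges that merely touch (one ends exactly when the next starts) count as overlapping; switch to < if back-to-back bookings should be allowed.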

Here is an approach to troubleshooting the issue on your end.
My first suspicion is that the results of your exists clause are too broad and thus returning rows for every record matching in the outer clause unexpectedly. Likely there are rows that do not fall on the desired date or spaceid that share one component of their interval with your inner criteria.
Inspect the results of the inner select statement (the one within the exists clause) for an example row, exchanging all the 'A' aliased values with actual values from one of the rows returned you did not expect to receive.
Additionally, you can inspect what I think would be a semi join in the execution profile to see what the join criteria are. If you expect it to be filtered by a constant for 'FUNC_SPACE_ID' of '401-ZFU-52', you will discover that it is not.

outer query to list only if its rowcount equates to inner subquery

I need help with a query using SQL Server 2005.
I have two tables:
code (chargecode, chargeid, orgid)
entry (chargeid, itemNo, rate)
I need to list all the chargeids in the entry table when it contains multiple entries with different chargeids that are listed in the code table under the same chargecode.
data :
code
100,1,100
100,2,100
100,3,100
101,11,100
101,12,100
entry
1,x1,1
1,x2,2
2,x3,2
11,x4,1
11,x5,1
Using the above data, the query should list chargeids 1 and 2 and not 11.
I found a way to count how many rows in entry satisfy the criteria, but I'm failing to get the chargeids:
select count (distinct chargeId)
from entry where chargeid in (select chargeid from code where chargecode = (SELECT A.chargecode
from code as A join code as B
ON A.chargecode = B.chargeCode and A.chargetype = B.chargetype and A.orgId = B.orgId AND A.CHARGEID = b.CHARGEid
group by A.chargecode,A.orgid
having count(A.chargecode) > 1)
)

First off: I apologise for my completely inaccurate original answer.
The solution to your problem is a self-join. Self-joins are used when you want to select more than one row from the same table. In our case we want to select two charge IDs that have the same charge code:
SELECT DISTINCT c1.chargeid, c2.chargeid FROM code c1
JOIN code c2 ON c1.chargeid != c2.chargeid AND c1.chargecode = c2.chargecode
JOIN entry e1 ON e1.chargeid = c1.chargeid
JOIN entry e2 ON e2.chargeid = c2.chargeid
WHERE c1.chargeid < c2.chargeid
Explanation of this:
First we pick any two charge IDs from 'code'. The DISTINCT avoids duplicates. We make sure they're two different IDs and that they map to the same chargecode.
Then we join on 'entry' (twice) to make sure they both appear in the entry table.
Without the WHERE clause, this approach gives (for your example) the pairs (1,2) and (2,1). So we also insist on an ordering; this cuts the result set down to just (1,2), as you described.
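
The self-join can be checked against the sample data from the question. The sketch below loads those rows into an in-memory SQLite database through Python's sqlite3 module (SQLite instead of SQL Server 2005 is an assumption made so the example is self-contained) and runs the query unchanged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table code (chargecode integer, chargeid integer, orgid integer)")
conn.execute("create table entry (chargeid integer, itemNo text, rate integer)")

# Rows copied from the question.
conn.executemany(
    "insert into code values (?, ?, ?)",
    [(100, 1, 100), (100, 2, 100), (100, 3, 100), (101, 11, 100), (101, 12, 100)],
)
conn.executemany(
    "insert into entry values (?, ?, ?)",
    [(1, "x1", 1), (1, "x2", 2), (2, "x3", 2), (11, "x4", 1), (11, "x5", 1)],
)

# Pair up different charge IDs sharing a charge code, require both to occur
# in entry, and keep each pair once by ordering the two IDs.
pairs = conn.execute(
    """
    SELECT DISTINCT c1.chargeid, c2.chargeid FROM code c1
    JOIN code c2 ON c1.chargeid != c2.chargeid AND c1.chargecode = c2.chargecode
    JOIN entry e1 ON e1.chargeid = c1.chargeid
    JOIN entry e2 ON e2.chargeid = c2.chargeid
    WHERE c1.chargeid < c2.chargeid
    """
).fetchall()
print(pairs)
```

Charge ID 11 is absent from the result because its only partner under charge code 101, charge ID 12, never appears in entry.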
