Help required with a complex self join sql query - sql

myTable has a composite key formed of columns A and B (the columns are A, B, C, D, E).
I want to exclude records where the value of D (say, an order number) is the same and E (say, a decision) is Y in one record but N or NULL in the other. In other words, twin records with the same order number (equal D value) that were ordered first (E = Y) and then cancelled (E = N) should be ignored.
So I pulled out A and B for all records where D is the same but E is Y in one and N or NULL in the other:
SELECT *
FROM myTable A, myTable B
WHERE
(A.D=B.D)
AND
((A.E ='Y' AND (B.E ='N' OR B.E IS NULL)) OR (B.E='Y' AND (A.E='N' OR A.E IS NULL)))
Now my final output should be all records from myTable except the records found above.
I wrote a join query, but it's not working as it should. Basically, the issue is how to compare two composite keys.
Sample Data:
A B C D E
=========================
1 A xyz ONE Y
2 B pqr TWO Y
3 C lmn ONE N
4 D abc THREE Y
5 E ijk FOUR Y
=========================
Thus, my output should be records 2, 4 and 5. Records 1 and 3 are ignored because 1.D = 3.D and 1.E is Y but 3.E is N.
Thanks,
Nik

I want to exclude records where value of D is "XYZ".
Why not simply query like this directly?
select *
from myTable
where D <> 'XYZ'
To exclude rows from Temp, you could:
select *
from myTable
where not exists
(
select *
from temp
where myTable.A = temp.A
and myTable.B = temp.B
)
Or with an exclusive left join:
select *
from myTable
left join
temp
on myTable.A = temp.A
and myTable.B = temp.B
where temp.A is null

If I've correctly understood you, what you need is this:
select x.*
from mytable x left outer join
( select mt1.a, mt1.b
from mytable mt1 inner join
mytable mt2 on mt1.d = mt2.d
where ((mt1.E ='Y' AND (mt2.E ='N' OR mt2.E IS NULL)) OR (mt2.E='Y' AND (mt1.E='N' OR mt1.E IS NULL)))
) y on x.a = y.a and x.b = y.b
where y.a is NULL
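As a quick check of the idea, here is a minimal sketch that loads the sample data from the question into an in-memory SQLite database (SQLite is only an assumption here, since the question does not name a database). It uses a NOT EXISTS variant of the anti-join, correlated on D, so the composite key (A, B) never has to be compared at all.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE myTable (A INTEGER, B TEXT, C TEXT, D TEXT, E TEXT, "
    "PRIMARY KEY (A, B))"
)
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?, ?, ?, ?)",
    [
        (1, "A", "xyz", "ONE", "Y"),
        (2, "B", "pqr", "TWO", "Y"),
        (3, "C", "lmn", "ONE", "N"),
        (4, "D", "abc", "THREE", "Y"),
        (5, "E", "ijk", "FOUR", "Y"),
    ],
)

# Keep a row unless some row with the same D carries the opposite decision
# (Y versus N/NULL, in either direction).
query = """
SELECT x.A, x.B, x.C, x.D, x.E
FROM myTable x
WHERE NOT EXISTS (
    SELECT 1
    FROM myTable y
    WHERE y.D = x.D
      AND ( (x.E = 'Y' AND (y.E = 'N' OR y.E IS NULL))
         OR (y.E = 'Y' AND (x.E = 'N' OR x.E IS NULL)) )
)
ORDER BY x.A
"""
kept_rows = conn.execute(query).fetchall()
kept_keys = [(a, b) for a, b, *_ in kept_rows]
print(kept_keys)
```

With the sample data this keeps the rows keyed (2, B), (4, D) and (5, E). A group that contains only N or NULL rows (no Y) is kept as well, which matches the rule as stated in the question.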

You need something like
select A.*
from myTable A
WHERE (SELECT COUNT(*) FROM myTable B WHERE B.D = A.D AND (B.E IS NULL OR B.E = 'N')) = 0

Related

Insert missing values in column at all levels of another column in SQL?

I have been working with some big data in SQL/BigQuery and found that it has some holes in it that need to be filled with values in order to complete the dataset. What I'm struggling with is how to insert the missing values properly.
Say that I have multiple levels of a variable (1, 2, 3... no upper bound) and for each of these levels, they should have an A, B, C value. Some of these records will have data, others will not.
Current dataset:
level value data
1 A 1a_data
1 B 1b_data
1 C 1c_data
2 A 2a_data
2 C 2c_data
3 B 3b_data
What I want the dataset to look like:
level value data
1 A 1a_data
1 B 1b_data
1 C 1c_data
2 A 2a_data
2 B NULL
2 C 2c_data
3 A NULL
3 B 3b_data
3 C NULL
What would be the best way to do this?
You need a CROSS join of the distinct levels with the distinct values and a LEFT join to the table:
SELECT l.level, v.value, t.data
FROM (SELECT DISTINCT level FROM tablename) l
CROSS JOIN (SELECT DISTINCT value FROM tablename) v
LEFT JOIN tablename t ON t.level = l.level AND t.value = v.value
ORDER BY l.level, v.value;
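A small sketch of this query against the example data, run in an in-memory SQLite database rather than BigQuery (an assumption made only so the snippet is self-contained):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (level INTEGER, value TEXT, data TEXT)")
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?, ?)",
    [
        (1, "A", "1a_data"), (1, "B", "1b_data"), (1, "C", "1c_data"),
        (2, "A", "2a_data"), (2, "C", "2c_data"),
        (3, "B", "3b_data"),
    ],
)

# Every level paired with every value; data is NULL where no row exists.
filled = conn.execute(
    """
    SELECT l.level, v.value, t.data
    FROM (SELECT DISTINCT level FROM tablename) l
    CROSS JOIN (SELECT DISTINCT value FROM tablename) v
    LEFT JOIN tablename t ON t.level = l.level AND t.value = v.value
    ORDER BY l.level, v.value
    """
).fetchall()

for row in filled:
    print(row)
```

The result has one row per level/value combination (nine rows for the example), with None for the combinations that were missing.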
We can use an INSERT INTO ... SELECT that builds every level/value combination with a cross join and inserts only the missing ones:
INSERT INTO yourTable (level, value, data)
SELECT t1.level, t2.value, NULL
FROM (SELECT DISTINCT level FROM yourTable) t1
CROSS JOIN (SELECT DISTINCT value FROM yourTable) t2
LEFT JOIN yourTable t3
ON t3.level = t1.level AND
t3.value = t2.value
WHERE t3.level IS NULL; -- test the join key, so an existing row whose data is NULL is not duplicated
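The insert can be sketched the same way in an in-memory SQLite database (an assumption, used only to make the example runnable). This version tests the joined key (t3.level) for NULL to detect a missing combination, so running it a second time adds nothing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (level INTEGER, value TEXT, data TEXT)")
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?)",
    [
        (1, "A", "1a_data"), (1, "B", "1b_data"), (1, "C", "1c_data"),
        (2, "A", "2a_data"), (2, "C", "2c_data"),
        (3, "B", "3b_data"),
    ],
)

# Insert a NULL-data row for every level/value pair that has no row yet.
INSERT_MISSING = """
INSERT INTO yourTable (level, value, data)
SELECT t1.level, t2.value, NULL
FROM (SELECT DISTINCT level FROM yourTable) t1
CROSS JOIN (SELECT DISTINCT value FROM yourTable) t2
LEFT JOIN yourTable t3
       ON t3.level = t1.level AND t3.value = t2.value
WHERE t3.level IS NULL
"""


def row_count():
    return conn.execute("SELECT COUNT(*) FROM yourTable").fetchone()[0]


before = row_count()
conn.execute(INSERT_MISSING)
after_first = row_count()
conn.execute(INSERT_MISSING)  # second pass: no combination is missing any more
after_second = row_count()

print(before, after_first, after_second)
```

Filtering on the data column instead would treat the freshly inserted NULL rows as still missing and insert them again on every run.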

Suppress rows with reverse/swapped values

I would like to query a database table in which some rows are the reverse of other rows. The table looks like this:
Src Trgt ValueA ValueB
A B 1,44 5
B A 1,44 5 <--
C D 1,23 8
D C 1,23 8 <--
F G 5,12 9
G F 5,12 9 <--
What I want is a query that returns each row once and does not return it again with the source and target values swapped. The rows that should be left out are the ones that have the same ValueA and ValueB as another row, only with the source and target values swapped (the ones marked in the table above).
So, the desired results would look like this:
Src Trgt ValueA ValueB
A B 1,44 5
C D 1,23 8
F G 5,12 9
I think this is what you want:
select t.*
from t
where t.src < t.trgt
union all
select t.*
from t
where t.src > t.trgt and
not exists (select 1
from t t2
where t2.src = t.trgt and t2.trgt = t.src and
t2.a = t.a and t2.b = t.b
);
It keeps the first row encountered, filtering out equivalent rows where the first two columns are switched.
EDIT:
Another approach, if you just want one row per combo, is:
select least(src, trgt) as src, greatest(src, trgt) as trgt, a, b
from t
group by least(src, trgt), greatest(src, trgt), a, b;
This runs the risk of returning a row that is not in the original data (if the row has no duplicate and src > trgt, it comes back with the two columns swapped).
SELECT *
FROM ztable zt
WHERE zt.source < zt.target -- pick only one of the twins
OR NOT EXISTS( -- OR :if it is NOT part of a twin
SELECT *
FROM ztable nx
WHERE nx.source = zt.target
AND nx.target = zt.source
);
Assuming that rows with source=target are not present or not wanted.
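A minimal sketch of this filter on the question's data, using an in-memory SQLite database (an assumption for the sake of a runnable example). The column names value_a and value_b and the extra unpaired row ("Z", "Y") are hypothetical additions for illustration. The subquery here also compares the two value columns, as the question requires.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ztable (source TEXT, target TEXT, value_a TEXT, value_b INTEGER)"
)
conn.executemany(
    "INSERT INTO ztable VALUES (?, ?, ?, ?)",
    [
        ("A", "B", "1,44", 5), ("B", "A", "1,44", 5),
        ("C", "D", "1,23", 8), ("D", "C", "1,23", 8),
        ("F", "G", "5,12", 9), ("G", "F", "5,12", 9),
        ("Z", "Y", "7,00", 1),  # hypothetical row with no swapped twin
    ],
)

kept = conn.execute(
    """
    SELECT zt.source, zt.target
    FROM ztable zt
    WHERE zt.source < zt.target            -- pick only one of the twins
       OR NOT EXISTS (                     -- or keep it if it has no twin
            SELECT 1
            FROM ztable nx
            WHERE nx.source = zt.target
              AND nx.target = zt.source
              AND nx.value_a = zt.value_a
              AND nx.value_b = zt.value_b
       )
    ORDER BY zt.source
    """
).fetchall()

print(kept)
```

The unpaired row survives even though its source sorts after its target, and it is returned exactly as stored.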

How to find doubles in master-child table?

I need help with a query to find doubles. Let me explain the situation by example:
tableA (the master table) has a key field keyA with these values :
keyA
1
2
3
etc
tableB (the client table) has a foreign key field keyA and a value field, fieldB
keyA fieldB
1 a
1 b
2 a
2 b
3 a
3 c
4 a
4 b
4 c
etc
So, the values for fieldB in child table tableB are:
for tableA.keyA = 1 are: a and b
for tableA.keyA = 2 are: a and b
for tableA.keyA = 3 are: a and c
for tableA.keyA = 4 are: a, b and c
Now, given a value for keyA I need to find all records in tableA that have matching records in tableB for the field fieldB.
For example, if I search with keyA = 1 then
tableA.keyA = 2 is OK because both have same tableB.fieldB (a and b versus a and b)
tableA.keyA = 3 is not OK because the two do not have the same tableB.fieldB values (a and b versus a and c)
tableA.keyA = 4 is not OK because the two do not have the same tableB.fieldB values (a and b versus a, b and c)
I need a query that can give me this result. I hope someone can help me with this or can point me into the right direction.
Try this simple query; I hope it solves your problem:
DECLARE @vkey int = 1
;WITH cte_test AS (
SELECT keyA,
-- build an ordered, comma-separated list of the fieldB values for each keyA
(SELECT ',' + fieldB FROM tableB t1 WHERE t1.keyA = t.keyA ORDER BY t1.fieldB FOR XML PATH('')) AS rslt
FROM tableB t
GROUP BY t.keyA)
SELECT t2.*
FROM cte_test t1
INNER JOIN cte_test t2 ON t1.[rslt] = t2.[rslt] AND t2.[keyA] <> t1.[keyA]
WHERE t1.[keyA] = @vkey
If no other item has the same combination, the result is empty; otherwise the query returns the matching items.
Assuming there are no duplicates, you can do this with a self-join and aggregation:
select c.keyA, c2.keyA
from (select c.*, count(*) over (partition by keyA) as numBs
from clientTable c
) c join
(select c.*, count(*) over (partition by keyA) as numBs
from clientTable c
) c2
on c2.fieldB = c.fieldB and
c2.keyA <> c.keyA and
c.keyA = 1 -- or whatever key you want to check
where c.numBs = c2.numBs
group by c.keyA, c2.keyA, c.numBs, c2.numBs
having count(*) = c.numBs;
The idea is to count the number of fieldB values for each keyA. These need to be equal (where c.numBs = c2.numBs) and to check that all match (having count(*) = c.numBs).
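The counting idea can be sketched against the question's tableB in an in-memory SQLite database (an assumption, chosen only to make the example runnable). This variant uses plain grouped counts instead of window functions, and like the query above it assumes there are no duplicate (keyA, fieldB) pairs. The helper name keys_with_same_values is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableB (keyA INTEGER, fieldB TEXT)")
conn.executemany(
    "INSERT INTO tableB VALUES (?, ?)",
    [
        (1, "a"), (1, "b"),
        (2, "a"), (2, "b"),
        (3, "a"), (3, "c"),
        (4, "a"), (4, "b"), (4, "c"),
    ],
)

SAME_SET = """
SELECT b2.keyA
FROM tableB b1
JOIN tableB b2
  ON b2.fieldB = b1.fieldB AND b2.keyA <> b1.keyA
WHERE b1.keyA = :key
GROUP BY b2.keyA
-- every value of the searched key is matched ...
HAVING COUNT(*) = (SELECT COUNT(*) FROM tableB WHERE keyA = :key)
-- ... and the candidate has no additional values of its own
   AND COUNT(*) = (SELECT COUNT(*) FROM tableB x WHERE x.keyA = b2.keyA)
ORDER BY b2.keyA
"""


def keys_with_same_values(key):
    """Return the keyA values whose set of fieldB values equals that of `key`."""
    return [row[0] for row in conn.execute(SAME_SET, {"key": key})]


print(keys_with_same_values(1))
print(keys_with_same_values(3))
```

For keyA = 1 only key 2 qualifies: key 3 shares just one value, and key 4 matches both values but has a third one of its own.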

one to one distinct restriction on selection

I encountered the following problem. There are two tables (the x values are sorted in
increasing order):
Table A
id x
1 1
1 3
1 4
1 7
Table B
id x
1 2
1 5
I want to join these two tables:
1) on equality of id, and
2) so that each row of A is matched to at most one row of B, and vice versa (a one-to-one relationship), based on the absolute difference of the x values (a smaller difference has
higher priority).
That description alone is ambiguous: if two candidate pairs share a row in one of the tables and have the same difference, there is no way to decide which one goes first. So define A as the "main" table: the row in table A with the smaller line number always goes first.
Expected result of demo:
id A.x B.x abs_diff
1 1 2 1
1 4 5 1
End of table (the two extra rows in A are not matched, because of the one-to-one rule).
I am using PostgreSQL, so I tried DISTINCT ON, but it does not solve the problem:
select distinct on (A.x) id,A.x,B.x,abs_diff
from
(A join B
on A.id=B.id)
order by A.x,greatest(A.x,B.x)-least(A.x,B.x)
Do you have any ideas? It seems tricky in plain SQL.
Try:
select a.id, a.x as ax, b.x as bx, x.min_abs_diff
from table_a a
join table_b b
on a.id = b.id
join (select a.id, min(abs(a.x - b.x)) as min_abs_diff
from table_a a
join table_b b
on a.id = b.id
group by a.id) x
on x.id = a.id
and abs(a.x - b.x) = x.min_abs_diff
fiddle: http://sqlfiddle.com/#!15/ab5ae/5/0
Although it doesn't match your expected output, I think the output is correct based on what you described, as you can see each pair has a difference with an absolute value of 1.
Edit - Try the following, based on order of a to b:
select *
from (select a.id,
a.x as ax,
b.x as bx,
x.min_abs_diff,
row_number() over(partition by a.id, b.x order by a.id, a.x) as rn
from table_a a
join table_b b
on a.id = b.id
join (select a.id, min(abs(a.x - b.x)) as min_abs_diff
from table_a a
join table_b b
on a.id = b.id
group by a.id) x
on x.id = a.id
and abs(a.x - b.x) = x.min_abs_diff) x
where x.rn = 1
Fiddle: http://sqlfiddle.com/#!15/ab5ae/19/0
One possible solution for your currently ambiguous question:
SELECT *
FROM (
SELECT id, x AS a, lead(x) OVER (PARTITION BY grp ORDER BY x) AS b
FROM (
SELECT *, count(tbl) OVER (PARTITION BY id ORDER BY x) AS grp
FROM (
SELECT TRUE AS tbl, * FROM table_a
UNION ALL
SELECT NULL, * FROM table_b
) x
) y
) z
WHERE b IS NOT NULL
ORDER BY 1,2,3;
This way, every a.x is assigned the next bigger (or same) b.x, unless there is another a.x that is still smaller than the next b.x (or the same).
Produces the requested result for the demo case. Not sure about various ambiguous cases.
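To pin down the ambiguous cases, it can help to write the matching rule from the question as a procedure first. The following is a plain Python sketch of one reading of it, not a translation of any query above: candidate pairs are taken in order of absolute difference, ties go to the earlier row of A, and a pair is accepted only if neither row has been used. Breaking remaining ties by the earlier row of B is an assumption, and the function name greedy_one_to_one is hypothetical.

```python
from itertools import product


def greedy_one_to_one(a_rows, b_rows):
    """Match rows of A to rows of B one-to-one, smallest |a.x - b.x| first.

    a_rows and b_rows are lists of (id, x) tuples in table order.
    Returns a list of (id, a_x, b_x, abs_diff) tuples.
    """
    # All candidate pairs with equal id, as (abs_diff, A position, B position).
    candidates = [
        (abs(a_x - b_x), a_pos, b_pos)
        for (a_pos, (a_id, a_x)), (b_pos, (b_id, b_x)) in product(
            enumerate(a_rows), enumerate(b_rows)
        )
        if a_id == b_id
    ]
    # Smallest difference first; ties go to the earlier A row, then earlier B row.
    candidates.sort()

    used_a, used_b, matches = set(), set(), []
    for diff, a_pos, b_pos in candidates:
        if a_pos in used_a or b_pos in used_b:
            continue  # one of the two rows is already matched
        used_a.add(a_pos)
        used_b.add(b_pos)
        a_id, a_x = a_rows[a_pos]
        matches.append((a_id, a_x, b_rows[b_pos][1], diff))

    return sorted(matches)


table_a = [(1, 1), (1, 3), (1, 4), (1, 7)]
table_b = [(1, 2), (1, 5)]
result = greedy_one_to_one(table_a, table_b)
print(result)
```

On the demo tables this pairs A.x = 1 with B.x = 2 and A.x = 4 with B.x = 5, which is the expected result given in the question. Comparing a SQL candidate against such a reference on tied inputs is one way to decide whether it implements the intended rule.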

Multiple NOT distinct

I've got an MS Access database and I need an SQL query that selects all the non-distinct entries in one column while still keeping all the values.
In this case, more than ever, an example is worth a thousand words:
Table:
A B C
1 x q
2 y w
3 y e
4 z r
5 z t
6 z y
SQL magic
Result:
B C
y w
y e
z r
z t
z y
Basically, it removes all rows whose value in column B is unique but keeps every row for the
values that occur more than once. I can "group by b" and then filter on "count > 1" to find the non-distinct values, but the result lists only one row per B value, not the two or more that I need.
Any help?
Thanks.
Select B, C
From Table
Where B In
(Select B From Table
Group By B
Having Count(*) > 1)
Another way of returning the results you want would be this:
select *
from
my_table
where
B in
(select B from my_table group by B having count(*) > 1)
select
*
from
my_table t1,
my_table t2
where
t1.B = t2.B
and
t1.C != t2.C
-- apparently you need to use <> instead of != in Access
-- Thanks, Dave!
Something like that?
Join the unique values of B that you determined with group by b and count > 1 back to the original table to retrieve the C values.
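A minimal sketch of that join, run against the example table in an in-memory SQLite database rather than Access (an assumption made so the snippet is runnable; the table name my_table and the alias dup are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (A INTEGER, B TEXT, C TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [
        (1, "x", "q"),
        (2, "y", "w"),
        (3, "y", "e"),
        (4, "z", "r"),
        (5, "z", "t"),
        (6, "z", "y"),
    ],
)

# The derived table lists each B value that occurs more than once;
# joining it back to my_table returns every row for those values.
non_distinct = conn.execute(
    """
    SELECT t.B, t.C
    FROM my_table t
    INNER JOIN (
        SELECT B
        FROM my_table
        GROUP BY B
        HAVING COUNT(*) > 1
    ) dup ON dup.B = t.B
    ORDER BY t.A
    """
).fetchall()

print(non_distinct)
```

The row with B = x is dropped because x occurs only once, while both y rows and all three z rows are kept.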