Query SQL with "childs" - sql

I'm new to SQL and was wondering whether the following is possible.
I have a table like this:
A    B    C    D
-------------------------
1              ONE
     1    P
     1    PF
2              TWO
     2    PF
3              THREE
     3    P
     3    P
     3    P
     3    P
4              FOUR
     4    PF
     4    PF
5              FIVE
     5    P
I would like a query that returns the rows for every number in column "A" that has no "PF" in column "C" among the rows with the same number. I.e.:
A    B    C    D
-------------------------
3              THREE
     3    P
     3    P
     3    P
     3    P
5              FIVE
     5    P
I'm using Python 2.7 and SQLite 3

Assuming that the empty values are NULL, you can use coalesce() to get the parent ID in each row:
SELECT *
FROM MyTable
WHERE COALESCE(A, B) NOT IN (SELECT B
FROM MyTable
WHERE C = 'PF');
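To check the COALESCE query against the sample data, here is a minimal sketch using Python's sqlite3 module. It assumes the blank cells in the question are NULL and reuses the table name MyTable from the answer; the in-memory database and the variable names are only for illustration.

```python
import sqlite3

# In-memory database, used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (A INTEGER, B INTEGER, C TEXT, D TEXT)")

# Sample data from the question; blank cells are assumed to be NULL (None).
data = [
    (1, None, None, "ONE"), (None, 1, "P", None), (None, 1, "PF", None),
    (2, None, None, "TWO"), (None, 2, "PF", None),
    (3, None, None, "THREE"), (None, 3, "P", None), (None, 3, "P", None),
    (None, 3, "P", None), (None, 3, "P", None),
    (4, None, None, "FOUR"), (None, 4, "PF", None), (None, 4, "PF", None),
    (5, None, None, "FIVE"), (None, 5, "P", None),
]
conn.executemany("INSERT INTO MyTable VALUES (?, ?, ?, ?)", data)

# COALESCE(A, B) gives the group number for both parent and child rows.
query = """
    SELECT *
    FROM MyTable
    WHERE COALESCE(A, B) NOT IN (SELECT B
                                 FROM MyTable
                                 WHERE C = 'PF')
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Only the rows belonging to groups 3 and 5 should come back, because groups 1, 2 and 4 each contain at least one "PF" child row.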

I created the table below and inserted the records:
create table abcd ( a integer, b integer, c varchar(10), d varchar(100) );
insert into abcd values (1,null,null,'ONE');
insert into abcd values (null,1,'P',null);
insert into abcd values (null, 1,'PF',null);
insert into abcd values (2,null,null,'TWO');
insert into abcd values (null,2,'PF',null);
insert into abcd values (3,null,null,'THREE');
insert into abcd values (null,3,'P',null);
insert into abcd values (null,3,'P',null);
insert into abcd values (null,3,'P',null);
insert into abcd values (null,3,'P',null);
insert into abcd values (4,null,null,'FOUR');
insert into abcd values (null,4,'PF',null);
insert into abcd values (null,4,'PF',null);
insert into abcd values (5,null,null,'FIVE');
insert into abcd values (null,5,'P',null);
select * from abcd;
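The query below gives the desired result set. Note that it uses SQL Server syntax (ISNULL() and TOP), so it will not run unchanged on SQLite: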
with temp as
(
SELECT ISNULL(a, (SELECT TOP 1 a FROM abcd
WHERE a = t.b
AND a IS NOT NULL
ORDER BY d DESC)) as a , b,c,d
FROM abcd t
),
flag as
(
select a,b,c,d,case when c='PF' then 1 else 0 end as flag
from temp
where c is not null
)
select * from temp
where a in (select distinct a from flag where c <> 'PF'
and a not in ( select a from flag where flag=1))

Related

Select within select with multiple matches on the other table SQL

I have these 3 tables
Table 1:
id_Table1 field_table1_1 field_table1_2
1 A B
2 C D
3 E F
Table 2:
id_Table2 field_table2_1 field_table2_2
4 G H
5 I J
Table 3:
id_Table3 id_Table1 id_Table2
1 1 4
2 1 5
3 2 5
So table 3 holds the relation between table 1 and 2.
What I want to do is write a query that gets all the fields in table 1, plus one extra field that contains all the matching ids from table 2, separated by commas.
So the result should be something like this:
id_Table1 field_table1_1 field_table1_2 id_Table2
1 A B 4, 5
2 C D 5
3 E F
One option is to use a lateral join and string_agg() (SQL Server syntax; string_agg() requires a separator argument):
select t1.*, x.*
from table1 t1
outer apply (
select string_agg(t3.id_table2, ', ') id_table2
from table3 t3
where t3.id_table1 = t1.id_table1
) x
There is no need to bring in table2 to get the results you want.
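For readers without OUTER APPLY or string_agg(), the same idea can be sketched in SQLite with a LEFT JOIN and GROUP_CONCAT(). This is a swapped-in technique, not the answer's exact query, and the lowercase table names are assumptions based on the question's tables. Note that SQLite does not guarantee the order of the concatenated ids.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id_table1 INTEGER, field_table1_1 TEXT, field_table1_2 TEXT);
    CREATE TABLE table3 (id_table3 INTEGER, id_table1 INTEGER, id_table2 INTEGER);
    INSERT INTO table1 VALUES (1, 'A', 'B'), (2, 'C', 'D'), (3, 'E', 'F');
    INSERT INTO table3 VALUES (1, 1, 4), (2, 1, 5), (3, 2, 5);
""")

# LEFT JOIN keeps table1 rows without any relation; GROUP_CONCAT then yields NULL.
agg = conn.execute("""
    SELECT t1.id_table1, t1.field_table1_1, t1.field_table1_2,
           GROUP_CONCAT(t3.id_table2, ', ') AS id_table2
    FROM table1 t1
    LEFT JOIN table3 t3 ON t3.id_table1 = t1.id_table1
    GROUP BY t1.id_table1, t1.field_table1_1, t1.field_table1_2
    ORDER BY t1.id_table1
""").fetchall()

for row in agg:
    print(row)
```

As in the answer, table2 is never touched: table3 already holds every id that needs to be listed.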

How does this SQL Left join on itself work?

CREATE TABLE dbo.Temp_Test
(Main_id int,
Unique_id char(1) )
insert into Temp_Test(Main_id,Unique_id)
values (1, 'a'),
(2, 'b'),
(3, 'c'),
(4, 'c')
SELECT r.Main_Id, r.Unique_ID, x.Main_Id, x.Unique_id
FROM dbo.Temp_Test r
LEFT JOIN dbo.Temp_Test x
ON r.Unique_ID = x.Unique_ID
AND x.Main_Id < r.Main_Id
WHERE x.Main_Id IS NULL
I'm trying to understand how this query works. Running it in steps just made me more confused. When it's run just as
SELECT r.Main_Id, r.Unique_ID, x.Main_Id, x.Unique_id
FROM dbo.Temp_Test r
LEFT JOIN dbo.Temp_Test x
ON r.Unique_ID = x.Unique_ID
the results turn into
Main_ID Unique_ID Main_id Unique_id
1 a 1 a
2 b 2 b
3 c 3 c
3 c 4 c
4 c 3 c
4 c 4 c
But when run with the x.Main_Id < r.Main_Id condition added to the join, we get
Main_ID Unique_ID Main_id Unique_id
1 a NULL NULL
2 b NULL NULL
3 c NULL NULL
4 c 3 c
What happened to the 4 C 4 C row?
When you do a LEFT JOIN, every record from the left table is part of the result. If the query finds a match in the second table, it shows the matching row; otherwise it fills the right-hand columns with NULL.
In the first query, it found matches on Unique_id, so it gave you all possible pairs.
In the second query, for Main_id 1, 2 and 3 there is no row in the right table with the same Unique_id and a smaller Main_id, so the right-hand columns are NULL.
For Main_id 4 in the left table, the right table has Main_id 3 with the same Unique_id, and 3 < 4, so that pair is displayed.
Since Main_id 4 in the right table is not less than Main_id 4 in the left table, the 4 c 4 c pair is not part of the result.
The final WHERE x.Main_Id IS NULL then keeps only the rows that found no such match, i.e. the row with the lowest Main_id for each Unique_id.
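The two stages can be reproduced with a small sketch in Python's sqlite3 module. The table name Temp_Test comes from the question; the dbo schema prefix is dropped because SQLite has no such schema, and the ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Temp_Test (Main_id INTEGER, Unique_id TEXT);
    INSERT INTO Temp_Test VALUES (1, 'a'), (2, 'b'), (3, 'c'), (4, 'c');
""")

# Self left join: pair each row with rows that share its Unique_id
# but have a smaller Main_id.
join_sql = """
    SELECT r.Main_id, r.Unique_id, x.Main_id, x.Unique_id
    FROM Temp_Test r
    LEFT JOIN Temp_Test x
      ON r.Unique_id = x.Unique_id
     AND x.Main_id < r.Main_id
"""

# Stage 1: the join on its own (rows without a smaller partner get NULLs).
joined = conn.execute(join_sql + " ORDER BY r.Main_id").fetchall()

# Stage 2: keep only rows that found no smaller partner.
firsts = conn.execute(
    join_sql + " WHERE x.Main_id IS NULL ORDER BY r.Main_id"
).fetchall()

print(joined)
print(firsts)
```

Row 4 pairs only with row 3, so it has a non-NULL partner and is dropped by the IS NULL filter, which leaves one row per Unique_id.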

SQL joining 2 tables without repeating values

I have 2 tables with a 1:n relationship.
I want to join them without repeating (duplicating) the values from the one table.
First, I have a table with budgets:
id name budget
1 John 1000
2 Kim 3000
And second I have a table of spendings:
id amount
1 112
1 145
1 211
The result should look like this:
id name budget amount
1 John 1000 112
1 null null 145
1 null null 211
2 Kim 3000 null
Output could also be: (this is not important)
id name budget amount
1 null null 112
1 John 1000 145
1 null null 211
2 Kim 3000 null
Is this possible with SQL?
Here is a join that repeats the values:
create temporary table a (id1 int,name varchar(10),budget int);
insert into a (id1,name,budget) values(1,'Maier',1000),(2,'Mueller',2000);
create temporary table if not exists b (id2 int,betrag int);
insert into b (id2,betrag) values(1,100),(1,133),(1,234);
select * from a left join b
on a.id1=b.id2
;
The keyword DISTINCT is used to eliminate duplicate rows from a query result:
select distinct b.id, b.name, b.budget, s.amount
from budgets b left join spendings s
on b.id = s.id;
You can also use a GROUP BY clause, which works similarly to DISTINCT. In that case:
select b.id, b.name, b.budget, s.amount
from budgets b left join spendings s
on b.id = s.id
group by b.id, b.name, b.budget, s.amount;
create table a (id1 int,name varchar(10),budget int)
insert into a (id1,name,budget) values(1,'Maier',1000)
insert into a (id1,name,budget) values(2,'Mueller',2000)
create table b (id2 int,betrag int)
insert into b (id2,betrag) values(1,100)
insert into b (id2,betrag) values(1,133)
insert into b (id2,betrag) values(1,234)
insert into b (id2,betrag) values(2,300)
insert into b (id2,betrag) values(2,400)
select a.id1, CASE WHEN c.themin IS NOT NULL THEN a.name ELSE NULL END AS [name],
CASE WHEN c.themin IS NOT NULL THEN a.budget ELSE NULL END AS [budget],
b.*
from a
LEFT join b on a.id1=b.id2
LEFT OUTER JOIN (SELECT MIN(betrag) AS [themin], id2 FROM b GROUP BY id2) c ON a.id1 = c.id2 AND b.betrag = c.themin
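The MIN()-based idea above can be tried out in SQLite with the following sketch. It uses the temporary-table data from the question (where id 2 has no spendings) and adds one assumption that is not in the original query: a `b.id2 IS NULL` check, so that a budget row without any spendings still shows its name and budget. If two spendings tie for the minimum amount, both rows would keep the name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id1 INT, name VARCHAR(10), budget INT);
    INSERT INTO a (id1, name, budget) VALUES (1, 'Maier', 1000), (2, 'Mueller', 2000);
    CREATE TABLE b (id2 INT, betrag INT);
    INSERT INTO b (id2, betrag) VALUES (1, 100), (1, 133), (1, 234);
""")

# Show name and budget only on the row holding the smallest amount per id,
# or on the single row produced for an id without spendings.
rows = conn.execute("""
    SELECT a.id1,
           CASE WHEN b.id2 IS NULL OR b.betrag = m.themin THEN a.name END AS name,
           CASE WHEN b.id2 IS NULL OR b.betrag = m.themin THEN a.budget END AS budget,
           b.betrag
    FROM a
    LEFT JOIN b ON a.id1 = b.id2
    LEFT JOIN (SELECT id2, MIN(betrag) AS themin
               FROM b
               GROUP BY id2) m ON m.id2 = a.id1
    ORDER BY a.id1, b.betrag
""").fetchall()

for row in rows:
    print(row)
```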

Fastest way to find distinct matching records

I have two tables A and B. Both have same structure. We find matching records between these two. Here are the scripts
CREATE TABLE HRS.A
(
F_1 NUMBER(5,0),
F_2 NUMBER(5,0),
F_3 NUMBER(5,0)
);
CREATE TABLE HRS.B
(
F_1 NUMBER(5,0),
F_2 NUMBER(5,0),
F_3 NUMBER(5,0)
);
INSERT INTO hrs.a VALUES (1,1000,2000);
INSERT INTO hrs.a VALUES (2,1100,8000);
INSERT INTO hrs.a VALUES (3,4000,3000);
INSERT INTO hrs.a VALUES (4,2000,5000);
INSERT INTO hrs.a VALUES (5,5000,3000);
INSERT INTO hrs.a VALUES (6,6000,6000);
INSERT INTO hrs.a VALUES (7,3000,7000);
INSERT INTO hrs.a VALUES (8,1100,9000);
INSERT INTO hrs.b VALUES (1,4000,2000);
INSERT INTO hrs.b VALUES (2,6000,8000);
INSERT INTO hrs.b VALUES (3,1000,3000);
INSERT INTO hrs.b VALUES (4,2000,5000);
INSERT INTO hrs.b VALUES (5,8000,3000);
INSERT INTO hrs.b VALUES (6,1100,6000);
INSERT INTO hrs.b VALUES (7,5000,7000);
INSERT INTO hrs.b VALUES (8,1000,9000);
To find matching records
SELECT a.F_1 A_F1, b.F_1 B_F1 FROM HRS.A, HRS.B WHERE A.F_2 = B.F_2
results
A_F1 B_F1
3 1
6 2
1 3
4 4
8 6
2 6
5 7
1 8
Now I want to remove rows whose value repeats in either column, treating the two columns separately. For example, 1 repeats in A_F1 (regardless of B_F1), so rows 3 (1-3) and 8 (1-8) are removed. Likewise, 6 repeats in B_F1 (regardless of A_F1), so rows 5 (8-6) and 6 (2-6) are removed. The final result should be
A_F1 B_F1
3 1
6 2
4 4
5 7
Now the most important part: these two tables contain 500,000 records each. I was first finding these matching records and inserting them into a temp table, then removing duplicates from the first column, then from the second column, and then selecting everything from the temp table. This is far too slow. How can I do this as fast as possible?
Edit # 1
I executed the following statements multiple times to generate 4096 records in each table
INSERT INTO hrs.a SELECT F_1 + 1, F_2 + 1, 0 FROM hrs.a;
INSERT INTO hrs.b SELECT F_1 + 1, F_2 + 1, 0 FROM hrs.b;
Then I ran all the answers and got these timings
Rachcha 9.11 secs OK
techdo 1.14 secs OK
Gentlezerg 577 msecs WRONG RESULTS
Justin 218 msecs OK
Even @Justin's query took 37.69 secs for 65,536 records in each table (total = 131,072)
Waiting for more optimized answers, as the actual number of records is 1,000,000 :)
Here is the execution plan of the query based on Justin's answer
Please try:
select A_F1, B_F1 From(
SELECT a.F_1 A_F1, b.F_1 B_F1,
count(*) over (partition by a.F_1 order by a.F_1) C1,
count(*) over (partition by b.F_1 order by b.F_1) C2
FROM HRS.A A, HRS.B B WHERE A.F_2 = B.F_2
)x
where C1=1 and C2=1;
How about an INNER JOIN instead? Please check with this query.
select A_F1, B_F1 From(
SELECT a.F_1 A_F1, b.F_1 B_F1,
count(*) over (partition by a.F_1 order by a.F_1) C1,
count(*) over (partition by b.F_1 order by b.F_1) C2
FROM HRS.A A INNER JOIN HRS.B B ON A.F_2 = B.F_2
)x
where C1=1 and C2=1;
Query:
SELECT a.f_1 AS a_f_1,
b.f_1 AS b_f_1
FROM a JOIN b ON a.f_2 = b.f_2
WHERE 1 = (SELECT COUNT(*)
FROM a aa JOIN b bb ON aa.f_2 = bb.f_2
WHERE aa.f_1 = a.f_1 )
AND 1 = (SELECT COUNT(*)
FROM a aa JOIN b bb ON aa.f_2 = bb.f_2
WHERE bb.f_1 = b.f_1 )
Result:
| A_F_1 | B_F_1 |
-----------------
| 3 | 1 |
| 6 | 2 |
| 4 | 4 |
| 5 | 7 |
Building on @techdo's answer, I think this can be better:
select A_F1, B_F1 From(
SELECT a.F_1 A_F1, b.F_1 B_F1,a.F_2,
count(*) OVER(PARTITION BY A.F_2) C
FROM HRS.A A, HRS.B B WHERE A.F_2 = B.F_2
)x
where C=1 ;
Multiple rows appear because several rows share the same F_2 (this assumes F_1 is unique within each table). This query has only one COUNT(*) OVER, so since you said you have a large amount of data, I think it would be a little faster.
I have the answer.
See this fiddle here.
I used the following code:
WITH x AS (SELECT a.f_1 AS a_f_1, b.f_1 AS b_f_1
FROM a JOIN b ON a.f_2 = b.f_2)
SELECT *
FROM x x1
WHERE NOT EXISTS (SELECT 1
FROM x x2
WHERE (x2.a_f_1 = x1.a_f_1
AND x2.b_f_1 != x1.b_f_1)
OR (x2.a_f_1 != x1.a_f_1
AND x2.b_f_1 = x1.b_f_1)
)
;
EDIT
I used the following code, which runs within 14 ms on SQL Fiddle. I removed the common table expression and observed that the query performance improved.
SELECT a1.f_1 AS a_f1, b1.f_1 AS b_f1
FROM a a1 JOIN b b1 ON a1.f_2 = b1.f_2
WHERE NOT EXISTS (SELECT 1
FROM a a2 JOIN b b2 ON a2.f_2 = b2.f_2
WHERE (a2.f_1 = a1.f_1
AND b2.f_1 != b1.f_1)
OR (a2.f_1 != a1.f_1
AND b2.f_1 = b1.f_1))
;
Output:
A_F_1 B_F_1
3 1
6 2
4 4
5 7
Every one of these solutions takes a long time; the best one (Justin's) ran for almost 45 minutes without returning on 2 million records. I ended up inserting the matching records into a temp table and then removing duplicates, and I found that much faster than these solutions on this data set.
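The asker does not show the temp-table code, so the following is only a sketch of how such an approach might look, written for SQLite through Python's sqlite3 module rather than Oracle. The temp table name `matches` and the GROUP BY ... HAVING filters are assumptions for illustration; no claim is made about how this performs on millions of rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (f_1 INTEGER, f_2 INTEGER, f_3 INTEGER);
    CREATE TABLE b (f_1 INTEGER, f_2 INTEGER, f_3 INTEGER);
    INSERT INTO a VALUES (1,1000,2000),(2,1100,8000),(3,4000,3000),(4,2000,5000),
                         (5,5000,3000),(6,6000,6000),(7,3000,7000),(8,1100,9000);
    INSERT INTO b VALUES (1,4000,2000),(2,6000,8000),(3,1000,3000),(4,2000,5000),
                         (5,8000,3000),(6,1100,6000),(7,5000,7000),(8,1000,9000);

    -- Step 1: materialize the matching pairs once (hypothetical temp table).
    CREATE TEMP TABLE matches AS
        SELECT a.f_1 AS a_f1, b.f_1 AS b_f1
        FROM a JOIN b ON a.f_2 = b.f_2;
""")

# Step 2: keep only pairs whose value occurs exactly once in each column.
unique_pairs = conn.execute("""
    SELECT m.a_f1, m.b_f1
    FROM matches m
    WHERE m.a_f1 IN (SELECT a_f1 FROM matches GROUP BY a_f1 HAVING COUNT(*) = 1)
      AND m.b_f1 IN (SELECT b_f1 FROM matches GROUP BY b_f1 HAVING COUNT(*) = 1)
    ORDER BY m.a_f1
""").fetchall()

print(unique_pairs)
```

The join is evaluated once and the two duplicate checks then run against the much smaller `matches` table, which is the main difference from the correlated subquery answers.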

SQL Always return all rows from LUT for each ID

I am wondering if this query can be modified to achieve what I want:
SELECT
cv.[ID]
,cv.[CustomValue]
,cf.[SpecialInformationId]
FROM #CustomFields cf
FULL OUTER JOIN #CustomValues cv ON cf.SpecialInformationId = cv.id
This returns all cv.id values. It also returns any unmatched cf.SpecialInformationId values, with NULL for the cv columns. What I actually want is for every cf row to show for each instance of cv. cf is a lookup table. In this instance there are 12 values, but that varies every time the query runs. Here is an example:
What this query currently returns:
cv.id cv.customvalue cf.specialinformationid
1 003 1
1 abc 2
2 004 1
2 1/1/2010 4
2 abc 2
3 009 1
4 003 1
4 acb 2
4 1/2/2010 4
What I want it to return:
cv.id cv.customvalue cf.specialinformationid
1 003 1
1 abc 2
1 NULL 3
1 NULL 4
1 NULL 5
2 004 1
2 abc 2
2 NULL 3
2 1/1/2010 4
2 NULL 5
3 009 1
3 NULL 2
3 NULL 3
3 NULL 4
3 NULL 5
4 003 1
4 acb 2
4 NULL 3
4 1/2/2010 4
4 NULL 5
A LEFT JOIN cannot be used: there are only 12 rows in the lookup table, so a left join gives the same result as the full outer join.
This is a spinoff of my other question:
SQL 2 tables always return all rows from 1 table match existing on other
Thanks
I believe a CROSS JOIN will achieve the results you're looking for.
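A minimal sketch of the CROSS JOIN idea, using SQLite through Python's sqlite3 module. The question does not show the real table definitions, so the schema here (custom_fields, custom_values and their columns) is hypothetical, and only a subset of the sample data is loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema: the real table definitions are not shown in the question.
    CREATE TABLE custom_fields (special_information_id INTEGER);
    CREATE TABLE custom_values (id INTEGER, special_information_id INTEGER, custom_value TEXT);
    INSERT INTO custom_fields VALUES (1), (2), (3), (4), (5);
    INSERT INTO custom_values VALUES (1, 1, '003'), (1, 2, 'abc'), (3, 1, '009');
""")

# Build every (id, lookup row) combination first, then attach values where they exist.
grid = conn.execute("""
    SELECT ids.id, cv.custom_value, cf.special_information_id
    FROM (SELECT DISTINCT id FROM custom_values) ids
    CROSS JOIN custom_fields cf
    LEFT JOIN custom_values cv
           ON cv.id = ids.id
          AND cv.special_information_id = cf.special_information_id
    ORDER BY ids.id, cf.special_information_id
""").fetchall()

for row in grid:
    print(row)
```

Each id gets one row per lookup entry, with NULL in the value column wherever no custom value exists.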
The problems arise because your Table2 is not really a 'vehicle' table: VehicleId does not uniquely identify a record in that table. This is where all of the confusion comes from. To solve that and get your problem to work, I did a select distinct on Table2 and cross joined it against the values in Table1. (I also did a select distinct for clarity, but it was not necessary.) Hope this helps.
CREATE TABLE #Table1 (Id INT)
CREATE TABLE #Table2 (VehicleID INT, Value VARCHAR(50), Table1ID INT)
INSERT INTO #Table1 VALUES (1),(2),(3),(4),(5)
INSERT INTO #Table2 VALUES (1, 't', 1),(1, 'q', 2),(3, 'w', 3),(3, 'e', 4),(4, 't', 1),(5, 'e', 1),(5, 'f', 2),(5, 'g', 4)
SELECT * FROM #Table1
SELECT * FROM #Table2
SELECT Base.VehicleId, Base.Id, t2.Value
FROM ( SELECT t2.VehicleId, t1.Id
FROM ( SELECT DISTINCT
VehicleId
FROM #Table2 ) t2
CROSS JOIN ( SELECT Id
FROM #Table1 ) t1 ) Base
LEFT JOIN #Table2 t2
ON Base.VehicleId = t2.VehicleID
AND Base.Id = t2.Table1ID
WHERE (Base.VehicleId BETWEEN 1 AND 3)
DROP TABLE #Table1
DROP TABLE #Table2