LEFT JOIN with tables having boolean data producing unexpected result set

Table_1 has a single column, BOOLVALUE (int), with these records:
1
1
0
0
0
and another table, Table_2, also has a single column, BOOLVALUE (int), with these records:
1
1
1
0
0
I am trying to run this query:
select t1.BOOLVALUE from Table_1 t1
left join Table_2 t2 on t1.BOOLVALUE=t2.BOOLVALUE
and to my surprise the output is not what I expected. There are 12 rows, with six 1's and six 0's. Doesn't this contradict how joins work?

12 rows is completely expected as you have 2 rows related to 3 rows, resulting in 6 rows, and 3 rows related to 2 rows resulting in 6 rows; add these together and you get 12.
When you JOIN, all related rows are JOINed based on the ON clause. Your ON clause is t1.BOOLVALUE=t2.BOOLVALUE. This means all the 1s in Table_1 relate to all the 1s in Table_2; so that's 2 rows related to 3 rows (2 * 3). Then all the 0s in Table_1 relate to all the 0s in Table_2; so that's 3 rows related to 2 rows (3 * 2). Hence (2 * 3) + (3 * 2) = 6 + 6 = 12.
If we add an ID column to the table, this might become a little clearer.
Let's say you have 2 tables like this:
Table1:
ID1 I1
-------
1 1
2 1
3 0
4 0
5 0
Table2:
ID2 I2
-------
1 1
2 1
3 1
4 0
5 0
Then let's say you have the following query:
SELECT T1.ID1,
       T2.ID2,
       T1.I1,
       T2.I2
FROM dbo.Table1 T1
     JOIN dbo.Table2 T2 ON T1.I1 = T2.I2
ORDER BY T1.ID1,
         T2.ID2;
This would result in the following data set:
ID1 ID2 I1 I2
--------------
1 1 1 1
1 2 1 1
1 3 1 1
2 1 1 1
2 2 1 1
2 3 1 1
3 4 0 0
3 5 0 0
4 4 0 0
4 5 0 0
5 4 0 0
5 5 0 0
Here you can see you have a many to many join, and where the "extra" rows are coming from.
If you LEFT JOINed on the ID and I columns, starting at Table1, you would get 5 rows, with 1 row having NULL values for ID2 and I2 (in this case because although the ID matched, I did not):
SELECT T1.ID1,
       T2.ID2,
       T1.I1,
       T2.I2
FROM dbo.Table1 T1
     LEFT JOIN dbo.Table2 T2 ON T1.ID1 = T2.ID2
                            AND T1.I1 = T2.I2
ORDER BY T1.ID1,
         T2.ID2;
ID1 ID2 I1 I2
--------------
1 1 1 1
2 2 1 1
3 NULL 0 NULL
4 4 0 0
5 5 0 0

When you join on a column that has repeating values, the number of rows returned for each value is the product of the number of rows holding that value in the 2 tables.
In this case there are 2 1's in table 1 and 3 in table 2, so SQL returns the 6 possible combinations (2 x 3). As there are also 3 x 2 = 6 zero combinations, you get 12 rows in total.
If you did a cross join you would get 25 rows back (5 x 5).
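The row counts discussed in this thread can be checked with a small script. This is only a sketch: it uses Python's built-in sqlite3 module with an in-memory database as a stand-in for whatever engine the question was run on, and the function name join_row_counts is made up for illustration.

```python
import sqlite3


def join_row_counts():
    """Return (LEFT JOIN row count per BOOLVALUE, CROSS JOIN row count) for the sample data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Table_1 (BOOLVALUE INTEGER)")
    conn.execute("CREATE TABLE Table_2 (BOOLVALUE INTEGER)")
    # Same records as in the question: two 1s and three 0s, then three 1s and two 0s.
    conn.executemany("INSERT INTO Table_1 VALUES (?)", [(1,), (1,), (0,), (0,), (0,)])
    conn.executemany("INSERT INTO Table_2 VALUES (?)", [(1,), (1,), (1,), (0,), (0,)])

    # Count how many joined rows each BOOLVALUE contributes.
    per_value = conn.execute(
        """
        SELECT t1.BOOLVALUE, COUNT(*)
        FROM Table_1 t1
        LEFT JOIN Table_2 t2 ON t1.BOOLVALUE = t2.BOOLVALUE
        GROUP BY t1.BOOLVALUE
        ORDER BY t1.BOOLVALUE
        """
    ).fetchall()
    # With no join condition every row pairs with every row.
    cross = conn.execute("SELECT COUNT(*) FROM Table_1 CROSS JOIN Table_2").fetchone()[0]
    conn.close()
    return per_value, cross


per_value, cross = join_row_counts()
print(per_value)  # (value, joined row count) pairs
print(cross)
```

On the sample data this should give 6 joined rows for value 0 and 6 for value 1 (12 in total), and 25 rows for the cross join.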

Related

sqlite delete all results where column a and column b is not in first n items

Let's say I have the following table
a b c
-----------
1 1 5
1 2 3
4 1 2
1 2 4
4 2 10
And I want to delete all rows where none of the first n rows has the same value in a and b as that row.
So for example the resulting tables for various n's would be
n = 1
a b c
-----------
1 1 5
// No row other than the first has a 1 in a, and a 1 in b
n = 2
a b c
-----------
1 1 5
1 2 3
1 2 4
// The fourth row has the same values in a and b as the second, so it is not deleted. The first 2 rows of course match themselves so are not deleted
n = 3
a b c
-----------
1 1 5
1 2 3
4 1 2
1 2 4
// The fourth row has the same values in a and b as the second, so it is not deleted. The first 3 rows of course match themselves so are not deleted
n = 4
a b c
-----------
1 1 5
1 2 3
4 1 2
1 2 4
// The first 4 rows of course match themselves so are not deleted. The fifth row does not have the same value in both a and b as any of the first 4 rows, so is deleted.
I've been trying to work out how to do this using a not in or a not exists, but since I'm interested in two columns matching not just 1 or the whole record, I'm struggling.
Since you are not defining a specific order, the result is not completely defined, but depends on arbitrary choices of implementation regarding which rows are computed first in the limit clause. A different SQLite version for example may give you a different result. With that being said, I believe that you want the following query:
select t1.* from table1 t1,
(select distinct t2.a, t2.b from table1 t2 limit N) tabledist
where t1.a=tabledist.a and t1.b=tabledist.b;
where you should replace N with the desired number of rows
EDIT: So, to delete directly from the existing table you need something like:
with toremove(a, b, c) as
(select * from table1 tt
EXCEPT select t1.* from table1 t1,
(select distinct t2.a, t2.b from table1 t2 limit N) tabledist
where t1.a=tabledist.a and t1.b=tabledist.b)
delete from table1 where exists
(select * from toremove
where table1.a=toremove.a and table1.b=toremove.b and table1.c=toremove.c);
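One thing to watch in the queries above: select distinct ... limit N takes N distinct (a, b) pairs, which is not the same as the pairs found in the first N rows. With the sample data and N = 4 it would keep the fifth row, while the question expects it to be deleted. A possible variant is sketched below, run through Python's sqlite3 module. It assumes that "first n rows" means the first n rows in rowid (insertion) order, which the question does not actually state, and the helper name rows_after_delete is hypothetical.

```python
import sqlite3

# Sample rows from the question, inserted in the order shown there.
ROWS = [(1, 1, 5), (1, 2, 3), (4, 1, 2), (1, 2, 4), (4, 2, 10)]


def rows_after_delete(n):
    """Delete rows whose (a, b) pair does not occur in the first n rows; return the rest."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (a INTEGER, b INTEGER, c INTEGER)")
    conn.executemany("INSERT INTO table1 VALUES (?, ?, ?)", ROWS)
    conn.execute(
        """
        DELETE FROM table1
        WHERE NOT EXISTS (
            SELECT 1
            -- Assumption: "first n rows" = lowest n rowids.
            FROM (SELECT a, b FROM table1 ORDER BY rowid LIMIT ?) AS firstn
            WHERE firstn.a = table1.a AND firstn.b = table1.b
        )
        """,
        (n,),
    )
    remaining = conn.execute("SELECT a, b, c FROM table1 ORDER BY rowid").fetchall()
    conn.close()
    return remaining


for n in range(1, 5):
    print(n, rows_after_delete(n))
```

If the table has an explicit ordering column, that column would be the better choice in the ORDER BY than rowid.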

Update one table from another doing some math

Postgres 9.6.6, latest Ubuntu LTS.
I have a big main table and a small one that receives data from external sensors. Each record in the small table holds the id of a record in the big one.
Table_1                    Table_2
id  temp0  temp0temp1      temp1  Tab1_Id
1   3      0               35     2
2   5      0               15     3
3   8      0               75     1
4   9      0               45     4
5   3      0               .some
6   8      0
7   2      0
.tens of thousand...
I am looking for an efficient way to update each record of the big table, doing some math, i.e.:
Table 1 (after)
id temp0 temp0temp1
1 3 78
2 5 40
3 8 23
4 9 54
5 3 0
6 8 0
7 2 0
Something similar to:
UPDATE Table_1
SET temp0temp1 = Table_1.temp0 + (SELECT temp1
FROM Table_2
WHERE table_2.Tab1_Id = Table_1.Id)...
Thanks
Perez
You can use an UPDATE ... FROM:
UPDATE Table_1 t1
SET temp0temp1 = t1.temp0 + t2.temp1
from Table_2 t2
WHERE t2.Tab1_Id = t1.Id
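To check the expected "after" table on the sample data, here is a rough sketch using Python's sqlite3 module as a stand-in for Postgres (an assumption made purely so the example is self-contained). Because older SQLite versions do not support UPDATE ... FROM, it uses the correlated-subquery form from the question instead, with a WHERE EXISTS guard so that rows without a sensor reading keep their 0. It also assumes at most one Table_2 row per Tab1_Id; the function name updated_table_1 is made up.

```python
import sqlite3


def updated_table_1():
    """Apply temp0temp1 = temp0 + temp1 for matched rows and return Table_1."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Table_1 (id INTEGER PRIMARY KEY, temp0 INTEGER, temp0temp1 INTEGER DEFAULT 0)"
    )
    conn.execute("CREATE TABLE Table_2 (temp1 INTEGER, Tab1_Id INTEGER)")
    conn.executemany(
        "INSERT INTO Table_1 (id, temp0) VALUES (?, ?)",
        [(1, 3), (2, 5), (3, 8), (4, 9), (5, 3), (6, 8), (7, 2)],
    )
    conn.executemany(
        "INSERT INTO Table_2 (temp1, Tab1_Id) VALUES (?, ?)",
        [(35, 2), (15, 3), (75, 1), (45, 4)],
    )
    conn.execute(
        """
        UPDATE Table_1
        SET temp0temp1 = temp0 + (SELECT t2.temp1
                                  FROM Table_2 t2
                                  WHERE t2.Tab1_Id = Table_1.id)
        -- Only touch rows that have a matching sensor record.
        WHERE EXISTS (SELECT 1 FROM Table_2 t2 WHERE t2.Tab1_Id = Table_1.id)
        """
    )
    rows = conn.execute("SELECT id, temp0, temp0temp1 FROM Table_1 ORDER BY id").fetchall()
    conn.close()
    return rows


for row in updated_table_1():
    print(row)
```

Without the WHERE EXISTS guard, unmatched rows would be set to NULL, since temp0 + NULL is NULL.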
You can create a trigger. On each insert of table 1 run a procedure that will update table 2 with desired calculations.
update t1
set temp0temp1 = temp0 + temp1
from Table_1 t1 join
Table_2 t2 on t1.id = t2.Tab1_Id

Selecting Last change value per group

I am trying to select the last change value per group.
I have a table (the MMID column is incremental):
MMID GID MID Value Bundle DateEntered
1 1 1 1 2 17/8/15 05:05:04
2 1 2 2 3 16/8/15 05:05:06
3 1 3 3 2 15/8/15 05:05:07
4 1 1 0 2 18/8/15 05:05:08
5 2 2 1 1 18/8/15 05:05:05
6 2 2 2 2 18/8/15 06:06:06
7 2 4 3 1 17/8/15 06:06:06
8 2 4 3 2 18/8/15 06:06:07
Here, I want the last changed 'Value' in the last 24 hours (rows dated 18th August).
The query below gets me that, but it also returns a row when only the Bundle value has changed.
I want rows only when 'Value' has changed, or when both 'Value' and 'Bundle' have changed, not when only Bundle has changed.
Desired output
MMID GID MID Value Bundle DateEntered
4 1 1 0 2 18/8/15 05:05:08
6 2 2 2 2 18/8/15 06:06:06
The query I tried is :
select yt1.*
from Table1 yt1
left outer join Table1 yt2
on (yt1.GID = yt2.GID and yt1.MID = yt2.MID
and yt1.MMID < yt2.MMID)
where yt2.MMID is null and yt2.GID is null and yt2.MID is null and yt1.DateEntered > '2015-08-18 00:00:00' ;
The output i get from here is:
MMID GID MID Value Bundle DateEntered
4 1 1 0 2 18/8/15 05:05:08
6 2 2 2 2 18/8/15 06:06:06
8 2 4 3 2 18/8/15 06:06:07
I should not be getting the last row here.
Can anyone tell me what I should change?
Not really following the logic of your attempt, but here is how I would get the desired results:
WITH cte AS (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY GID, MID ORDER BY MMID) AS rn
FROM Table1
)
, cte2 AS (
SELECT t1.* FROM cte t1
INNER JOIN cte t2
ON t1.GID=t2.GID
AND t1.MID=t2.MID
AND t1.value<>t2.value
AND t1.rn=t2.rn+1
)
SELECT *
FROM cte2
WHERE MMID=(
SELECT TOP 1 MMID
FROM cte2 c2
WHERE cte2.GID=c2.GID
AND cte2.MID=c2.MID
ORDER BY MMID DESC
)
NB: If you don't want to include the rn column in the final results, use a column list instead of SELECT *.

SQL Inner Join and Partitioning To obtain RowNumbers when matching

I have 2 tables. The first table 'a' the second 'b'.
I am writing a query that grabs every row in table a (there are 33 rows defined) and inner joins table b where the EnclLocation or the BackPanLoc matches the Workcell in table a.
I only want rows from table b that match based on BackPan and EnclLocation but are not the same records. Table b has a few rows of data assigned to the same workcell as table a. I am just trying to retrieve those additional rows and partition them.
I attached table a and table b. I also attached the desired results for this query, with respect to Workcell 10 only as an example. As you can see, table b has 4 records that have either the EnclLocation or the BackPanLoc = 10, but my results only show the same DelvNumber 4 times. Any help is most appreciated.
Table a
Table b
Incorrect Results
Desired Results (showing only Workcell 10 as an example)
workcell DelvNumber RowNum
1 447910-02 1
2 445710-01 1
2 445710-01 2
3 444291-01 1
3 444291-01 2
4 447910-03 1
4 447910-03 2
5 648020-01 1
6 647800-02 1
7 646920-01 1
7 646920-01 2
8 644830-4-8 1
8 644830-4-8 2
9 443990-01 1
10 645960-01-03 1
10 445710-11 2
10 445710-02 3
10 445710-09 4
Code Used
WITH ss
AS (SELECT a.*,
Row_number()
OVER(
partition BY a.workcell
ORDER BY a.workcell) AS rownum
FROM nwcurrent a
INNER JOIN nwdeliverables b
ON b.encllocation = a.workcell
OR b.backpanloc = a.workcell
WHERE ( b.status < 9
AND ( b.encllocation <> 0
OR b.backpanloc <> 0 )
OR a.delvnumber = '123' ))
SELECT *
FROM ss
SQLFiddle
http://sqlfiddle.com/#!3/a8682/4
A new try...
SELECT a.workcell
,a.DelvNumber AS A_DelvNumber
,b.DelvNumber AS B_DelvNumber
,CASE WHEN a.DelvNumber<>b.DelvNumber THEN b.DelvNumber ELSE a.DelvNumber END AS DelvNumber_Resolved
,Row_number() OVER(partition BY a.workcell ORDER BY a.workcell) AS rownum
FROM NWCurrent a
INNER JOIN NWDeliverables AS b ON b.EnclLocation=a.WorkCell OR b.BackPanLoc=a.WorkCell
WHERE (b.status <9 AND (b.EnclLocation<>0 OR b.BackPanLoc<>0)OR a.DelvNumber='123')

MySQL SUM Query

I've got two tables.
I'm trying to calculate the SUM of quantity in tbl1.
tbl1.xid is the primary key, while tbl2.xid is the foreign key.
tbl1
xid pub quantity
1 1 10
2 1 2
3 0 1
4 1 5
tbl2
id ttype fno xid qnty
1 A 0 1 0
2 A 1 1 3
3 B 1 1 4
4 A 1 2 1
5 A 1 3 2
6 A 1 4 3
7 A 1 4 1
8 A 0 1 0
We are calculating the sum of tbl1's quantity
1) Whose tbl1.pub is 1
Thus tbl1.xid 3 is removed from the list, because its pub is 0
Results
tbl1
xid pub quantity
1 1 10
2 1 2
4 1 5
2) AND whose tbl1.xid has at least one matching tbl2.xid whose tbl2.ttype is 'A' and whose tbl2.fno is '0'
Thus tbl1.xid 2 & 4 are removed from the list, because neither has at least one tbl2 row whose fno is '0' and whose ttype is 'A'
Results
parent_tbl1
xid pub quantity
1 1 10
The final result should be 10
SELECT SUM(quantity) AS Total
FROM tbl1
WHERE pub=1
AND EXISTS
(SELECT *
FROM tbl2
WHERE tbl2.ttype = 'A'
AND tbl2.fno = 0
AND tbl1.xid = tbl2.xid
)
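A quick way to see why EXISTS suits this better than a plain join: tbl1.xid 1 has two qualifying tbl2 rows (ids 1 and 8), so a join would count its quantity twice. The sketch below checks both forms on the sample data. It uses Python's sqlite3 module rather than MySQL, purely as an assumption to keep the example self-contained, and the function name totals is made up.

```python
import sqlite3


def totals():
    """Return (SUM using EXISTS, SUM using a plain JOIN) on the sample data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tbl1 (xid INTEGER PRIMARY KEY, pub INTEGER, quantity INTEGER)")
    conn.execute(
        "CREATE TABLE tbl2 (id INTEGER PRIMARY KEY, ttype TEXT, fno INTEGER, xid INTEGER, qnty INTEGER)"
    )
    conn.executemany(
        "INSERT INTO tbl1 VALUES (?, ?, ?)",
        [(1, 1, 10), (2, 1, 2), (3, 0, 1), (4, 1, 5)],
    )
    conn.executemany(
        "INSERT INTO tbl2 VALUES (?, ?, ?, ?, ?)",
        [
            (1, "A", 0, 1, 0), (2, "A", 1, 1, 3), (3, "B", 1, 1, 4), (4, "A", 1, 2, 1),
            (5, "A", 1, 3, 2), (6, "A", 1, 4, 3), (7, "A", 1, 4, 1), (8, "A", 0, 1, 0),
        ],
    )
    # EXISTS: each tbl1 row is counted at most once, however many tbl2 rows match.
    with_exists = conn.execute(
        """
        SELECT SUM(quantity)
        FROM tbl1
        WHERE pub = 1
          AND EXISTS (SELECT 1 FROM tbl2
                      WHERE tbl2.ttype = 'A' AND tbl2.fno = 0 AND tbl2.xid = tbl1.xid)
        """
    ).fetchone()[0]
    # JOIN: a tbl1 row is repeated once per matching tbl2 row, inflating the sum.
    with_join = conn.execute(
        """
        SELECT SUM(tbl1.quantity)
        FROM tbl1
        JOIN tbl2 ON tbl2.xid = tbl1.xid
        WHERE tbl1.pub = 1 AND tbl2.ttype = 'A' AND tbl2.fno = 0
        """
    ).fetchone()[0]
    conn.close()
    return with_exists, with_join


print(totals())
```

On this data the EXISTS form should return 10 and the join form 20, because xid 1 is matched twice.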