Subtract Values from Two Different Tables - sql

Consider table X:
A
-
1
2
3
3
6
Consider table Y:
A
-
0
4
2
1
9
How do you write a query that takes the difference between these two tables, to compute the following table (say table Z):
A
-
1
-2
1
2
-3

It's not clear what you want. Could it be this?
SELECT (SELECT SUM(A) FROM X) -
(SELECT SUM(A) FROM Y)
AS MyValue

Marcelo is 100% right: in a true relational database the order of a result set is never guaranteed. That said, there are some databases that do, in practice, always return sets in an order.
So if you are willing to risk it, here is one solution. Make two tables with autoincrement keys like this:
CREATE TABLE Sets (
id integer identity(1,1)
, val decimal
)
CREATE TABLE SetY (
id integer identity(1,1)
, val decimal
)
Then fill them with the X and Y values:
INSERT INTO Sets (val) SELECT A FROM X
INSERT INTO SetY (val) SELECT A FROM Y
Then you can do this to get your answer:
SELECT X.ID, X.Val, Y.Val, X.val-Y.val as Difference
FROM Sets X
LEFT OUTER JOIN SetY Y
ON Y.id = X.ID
I would cross my fingers first though! If there is any way you can get a proper key in your table, please do so.
Cheers,
Daniel
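Daniel's two-staging-table approach can also be collapsed into a single query on engines that support window functions: number the rows of each table with ROW_NUMBER() and join on the position. The same caveat applies (without a real key, row order is not guaranteed); the sketch below leans on SQLite's implicit rowid to stand in for insertion order:

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.executescript("""
CREATE TABLE X(A INT);
CREATE TABLE Y(A INT);
INSERT INTO X VALUES (1),(2),(3),(3),(6);
INSERT INTO Y VALUES (0),(4),(2),(1),(9);
""")
# Number the rows of each table (here by SQLite's implicit rowid,
# i.e. insertion order) and join the two numberings positionally.
rows = cur.execute("""
    WITH xr AS (SELECT A, ROW_NUMBER() OVER (ORDER BY rowid) AS rn FROM X),
         yr AS (SELECT A, ROW_NUMBER() OVER (ORDER BY rowid) AS rn FROM Y)
    SELECT xr.A - yr.A AS A
    FROM xr JOIN yr ON yr.rn = xr.rn
    ORDER BY xr.rn
""").fetchall()
print([r[0] for r in rows])  # → [1, -2, 1, 2, -3]
```

In SQL Server you would need ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) instead of rowid, which is exactly the "risk it" situation Daniel warns about.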

Related

Why does Snowflake not report ambiguous column references for USING joins

Given this query, Snowflake returns a result of 2, arbitrarily resolving y to table T:
select y
from (select 1 x, 2 y) T
join (select 1 x, 3 y) T1 using (x)
while at the same time returning an ambiguous column error when using a qualified join instead:
select y
from (select 1 x, 2 y) T
join (select 1 x, 3 y) T1 on T.x = T1.x
What's the set of rules that determine whether a column reference is ambiguous in Snowflake SQL? Postgres considers both of these queries ambiguous.
This answer is just an observation. It seems the column is chosen depending on the order of the joins (left-to-right):
CREATE OR REPLACE TABLE T(x INT, y INT) AS select 1, 2 UNION SELECT 10, 20;
CREATE OR REPLACE TABLE T1(x INT, y INT) AS select 1, 3 UNION SELECT 10, 30;
-- disabling cache
ALTER SESSION SET USE_CACHED_RESULT=FALSE;
Query profile:
explain using tabular
select y
from T
join T1 using (x);
Output: (query profile screenshot omitted)
Swapped join order:
explain using tabular
select y
from T1
join T using (x);
Output: (query profile screenshot omitted)
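Whatever Snowflake's exact resolution rules turn out to be, the portable way to avoid the ambiguity is to qualify the column reference, which behaves the same in every engine. A quick check (run here in SQLite, since the qualified form is engine-independent):

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
# Qualifying the column removes the ambiguity in any engine,
# regardless of how that engine resolves bare `y` after USING.
t_y = cur.execute("""
    SELECT T.y FROM (SELECT 1 x, 2 y) T
    JOIN (SELECT 1 x, 3 y) T1 ON T.x = T1.x
""").fetchone()[0]
t1_y = cur.execute("""
    SELECT T1.y FROM (SELECT 1 x, 2 y) T
    JOIN (SELECT 1 x, 3 y) T1 ON T.x = T1.x
""").fetchone()[0]
print(t_y, t1_y)  # → 2 3
```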

Normalization into three tables in postgres incl one association table

Say I have original data like so:
foo bar baz
1 a b
1 x y
2 z q
And I want to end up with three tables, of which I and III are the main tables and II is an association table between I and III, i.e.:
I
id foo
1 1
2 2
II
id I_id III_id
1 1 1
2 1 2
3 2 3
NB that I_ID is a serial and not foo
III
id bar baz
1 a b
2 x y
3 z q
How would I go about inserting this in one go?
I have played around with CTEs but I am stuck at the following: if I start with III and then return the IDs, I cannot see how to get back to the I table since there is nothing connecting them (yet).
My previous solutions have ended up pre-generating ID sequences, which feels so-so.
What if you generate a dense rank?
First generate a single big table with all the information you need:
select foo,
bar,
baz,
dense_rank() over (order by foo) as I_id,
dense_rank() over (order by bar, baz) as III_id,
row_number() over () as II_id
from main_Table
Then you just have to transfer each column set into the table you want, using a distinct.
Start with the "main" tables: create the two main rows first, then use their IDs to insert a record into the association table between them. You can use a CTE for that, of course (I assume that "main" tables I and III have a default nextval(..) on the PK column, pulling the next ID from a sequence):
with ins1 as (
insert into tabl1(foo)
values(...)
returning *
), ins3 as (
insert into tabl3(bar, baz)
values (.., ..)
returning *
)
insert into tabl2(i_id, iii_id)
select ins1.id, ins3.id
from ins1, ins3 -- implicit CROSS JOIN here:
-- we assume that only single row was
-- inserted to each "main" table
-- or you need Cartesian product to be inserted to `II`
returning *
;
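The dense-rank idea can be turned into a runnable sketch: compute the two ranks once, then copy each distinct column set into its target table. This uses SQLite via Python's sqlite3 just to make the example self-contained; table and column names follow the question:

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.executescript("""
CREATE TABLE main_table(foo INT, bar TEXT, baz TEXT);
INSERT INTO main_table VALUES (1,'a','b'),(1,'x','y'),(2,'z','q');
CREATE TABLE I  (id INTEGER PRIMARY KEY, foo INT);
CREATE TABLE III(id INTEGER PRIMARY KEY, bar TEXT, baz TEXT);
CREATE TABLE II (id INTEGER PRIMARY KEY, I_id INT, III_id INT);

-- Rank every distinct foo and every distinct (bar, baz) pair once;
-- the ranks become the surrogate keys.
CREATE VIEW ranked AS
SELECT foo, bar, baz,
       DENSE_RANK() OVER (ORDER BY foo)      AS I_id,
       DENSE_RANK() OVER (ORDER BY bar, baz) AS III_id
FROM main_table;

INSERT INTO I  (id, foo)      SELECT DISTINCT I_id, foo        FROM ranked;
INSERT INTO III(id, bar, baz) SELECT DISTINCT III_id, bar, baz FROM ranked;
INSERT INTO II (I_id, III_id) SELECT I_id, III_id              FROM ranked;
""")
print(cur.execute(
    "SELECT I_id, III_id FROM II ORDER BY I_id, III_id").fetchall())
# → [(1, 1), (1, 2), (2, 3)]
```

Note the trade-off versus the CTE answer: the rank-based version loads everything in one pass but assigns IDs by sort order rather than by sequence, so it fits a one-off migration better than ongoing inserts.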

Select exactly 5 rows from a table

I have an odd requirement which ideally should be solved in SQL, not the surrounding app.
I need to select exactly 5 rows regardless of how many are actually available. In practice the number of rows available will usually be less than 5 and on some rare occasions it will be more than 5. The "extra" rows should have null in every column.
The app is written in a technology that isn't Turing Complete. This requirement is much more difficult to solve in the app's code than you might imagine! To describe it, the app is effectively a transformer: It takes in a bunch of queries and spits out a report. So please understand the app is NOT written in a "programming language" in the traditional sense.
So for example, if I have a table:
A | B
-----
1 | X
2 | Y
3 | Z
Then a valid result would be
A | B
-----------
2 | Y
1 | X
3 | Z
null | null
null | null
I know this is an unusual requirement. Sadly it can't be solved in the application due to the technology being used.
Ideally this shouldn't require changes to the database but if there is no other way that changes can be arranged.
Any suggestions?
You can do something like this:
select top 5 a, b
from (select a, b, 1 as priority from t
      union all
      select null, null, 2
      from (values (1), (2), (3), (4), (5)) v(n)
     ) x
order by priority;
That is, create dummy rows, append them, and then choose the first five.
I do think that this work should be done in the app, but you can do it in SQL.
Create Table #Test (A int, B int)
Insert #Test Values (1,1)
Insert #Test Values (2,1)
Insert #Test Values (3,1)
Select Top 5 * From
(
Select A, B From #Test
Union All
Select Null, Null
Union All
Select Null, Null
Union All
Select Null, Null
Union All
Select Null, Null
Union All
Select Null, Null
) A
Wrap this in a stored proc..
declare @rowcount int
select @rowcount = count(*) from (select top 5 * from dbo.test) t
if @rowcount < 5
Begin
select * from dbo.test
union all
select null, null from dbo.numbers where n <= 5 - @rowcount
End
else
select top 5 * from dbo.test
If you use some sort of tally table (the numbers themselves do not matter, only that the table has enough records), you can use it to create the dummy rows, e.g. using sys.columns:
select top 5 a,b from
(
select a, b, 0 ord from yourTable
union all
select null a,null b, 1 from sys.columns
) t
order by ord
The advantage of the tally table is that if you need a different number of rows in the future, you only need to change the top x (provided the tally table has enough rows).
Alternatively, do it in the app: get those 3 records from your table, keep a counter variable, and then from your code append NULL rows until the counter reaches 5.
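The pad-with-dummy-rows idea translates directly to other engines; here is a self-contained check using SQLite (LIMIT instead of TOP, and a VALUES list to generate the five dummy rows):

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.executescript("""
CREATE TABLE t(A INT, B TEXT);
INSERT INTO t VALUES (1,'X'),(2,'Y'),(3,'Z');
""")
# Append five all-NULL dummy rows, sort real rows first, keep five.
rows = cur.execute("""
    SELECT A, B FROM (
        SELECT A, B, 0 AS priority FROM t
        UNION ALL
        SELECT NULL, NULL, 1 FROM (VALUES (1),(2),(3),(4),(5))
    )
    ORDER BY priority
    LIMIT 5
""").fetchall()
print(rows)
```

With 3 real rows this yields the three (A, B) pairs followed by two (NULL, NULL) rows; with more than 5 real rows the LIMIT trims the excess, exactly as the question requires.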

SQL Join on sequence number

I have 2 tables (A, B). They each have a different column that is basically an order or a sequence number. Table A has 'Sequence' and the values range from 0 to 5. Table B has 'Index' and the values are 16740, 16744, 16759, 16828, 16838, and 16990. Unfortunately I do not know the significance of these values. But I do believe they will always match in sequential order. I want to join these tables on these numbers where 0 = 16740, 1 = 16744, etc. Any ideas?
Thanks
You could use a case expression to convert table a's values to table b's values (or vise-versa) and join on that:
SELECT *
FROM a
JOIN b ON a.[sequence] = CASE b.[index] WHEN 16740 THEN 0
WHEN 16744 THEN 1
WHEN 16759 THEN 2
WHEN 16828 THEN 3
WHEN 16838 THEN 4
WHEN 16990 THEN 5
ELSE NULL
END;
@Mureinik has a great example. If down the road you do end up adding more numbers, putting this information into a new table would be a good idea.
CREATE TABLE C(
AInfo INT,
BInfo INT
)
INSERT INTO C(AInfo, BInfo) VALUES(0, 16740)
INSERT INTO C(AInfo, BInfo) VALUES(1, 16744)
etc.
Then you can Join all the tables.
If the values are in ascending order as per your example, you can use the ROW_NUMBER() function to achieve this:
;WITH cte AS (SELECT *, ROW_NUMBER() OVER(ORDER BY [Index]) - 1 AS RN
              FROM B)
SELECT *
FROM A
JOIN cte ON cte.RN = A.[Sequence]
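Putting the ROW_NUMBER() idea together with the join, here is a self-contained sketch (run in SQLite via Python; the Index values come from the question, and it relies on the question's assumption that the values always match in ascending order):

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.executescript("""
CREATE TABLE A(Sequence INT);
CREATE TABLE B("Index" INT);
INSERT INTO A VALUES (0),(1),(2),(3),(4),(5);
INSERT INTO B VALUES (16740),(16744),(16759),(16828),(16838),(16990);
""")
# Rank B's values ascending, shift the rank to start at 0,
# and join the rank against A's Sequence.
rows = cur.execute("""
    WITH ranked AS (
        SELECT "Index", ROW_NUMBER() OVER (ORDER BY "Index") - 1 AS rn
        FROM B
    )
    SELECT A.Sequence, ranked."Index"
    FROM A JOIN ranked ON ranked.rn = A.Sequence
    ORDER BY A.Sequence
""").fetchall()
print(rows)
# → [(0, 16740), (1, 16744), (2, 16759), (3, 16828), (4, 16838), (5, 16990)]
```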

Optimize query with many OR statements in WHERE clause

What is the best way to write query which will give equivalent result to this:
SELECT X,Y,* FROM TABLE
WHERE (X = 1 AND Y = 2) OR (X = 2235 AND Y = 324) OR...
Table has clustered index (X, Y).
Table is huge (milions) and there can be hundreds of OR statements.
You can create another table with columns X and Y, insert the value pairs into it, and then join it with the original table:
create table XY_Values(X int, Y int)
Insert into XY_Values values
(1,2),
(2235,324),
...
Then
SELECT T.X, T.Y, T.*
FROM TABLE T
join XY_Values V
on T.X = V.X
and T.Y = V.Y
You could create an index on (X, Y) in XY_Values, which will boost performance. You could also create XY_Values as a table variable.
I think you can fill up a temp tables with the hundreds of X and Y values, and join them.
Like:
DECLARE #Temp TABLE
(
X int,
Y int
)
Prefill this with your search requirements and then join against it.
(Or use another physical table which stores the search settings.)
This will do better:
select t.*
from table t
join (select 1 as x,2 as y
union
...) t1 on t.x=t1.x and t.y=t1.y
If you use too many OR conditions, the execution plan won't use the indexes. In that case it is better to split the query into multiple statements and merge the results using UNION ALL:
SELECT X,Y,*
FROM TABLE
WHERE (X = 1 AND Y = 2)
union all
SELECT X,Y,*
FROM TABLE
WHERE (X = 2235 AND Y = 324)
union all...
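For a quick end-to-end check of the lookup-table approach, here is a minimal SQLite sketch (table names follow the answers above; `big` and its payload column are made up for the example):

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.executescript("""
CREATE TABLE big(X INT, Y INT, payload TEXT);
INSERT INTO big VALUES (1,2,'a'),(2235,324,'b'),(7,8,'c'),(1,3,'d');
-- The composite primary key doubles as the (X, Y) index
-- the answer recommends.
CREATE TABLE XY_Values(X INT, Y INT, PRIMARY KEY (X, Y));
INSERT INTO XY_Values VALUES (1,2),(2235,324);
""")
# Join against the lookup table instead of a long OR chain.
rows = cur.execute("""
    SELECT T.X, T.Y, T.payload
    FROM big T
    JOIN XY_Values V ON T.X = V.X AND T.Y = V.Y
    ORDER BY T.X
""").fetchall()
print(rows)  # → [(1, 2, 'a'), (2235, 324, 'b')]
```

Note that (1, 3, 'd') is correctly excluded even though X = 1 appears in the lookup table, because the join matches on the full (X, Y) pair.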