Weird SQL request(Join) - sql

Assuming there are two tables A={a,b} and B={0,1,2}, which can be joined
tableA tableB
a 0
b 1
a 2
3
How to get the following result
ExpectingResult:
tableA tableB
a-------0
b-------1
null----2
null----3
OR
tableA tableB
a-------2
b-------3
null----0
null----1
Just make sure the element in each table just appear once, I tried all kinds of join(inner, full, cross), none of them can achieve so. Could anybody give me a tip?
Thank you very much
Please check this link out to the question itself: http://www.sqlfiddle.com/#!3/9fc21/2

That is a crappy request. There is almost negligible reasons for producing such an output that comes to mind, although I'm not ruling out a sane reason completely.
For SQL Server 2000, you will need to go through temp tables to get a sequential key to zip up with.
SELECT IDENTITY(int,1,1) ID, Value
INTO #tblA
FROM tableA
ORDER BY Value;
SELECT IDENTITY(int,1,1) ID, Value
INTO #tblB
FROM tableB
ORDER BY Value;
SELECT A.Value, B.Value
FROM #tblA A FULL OUTER JOIN #tblB B ON A.ID = B.ID
ORDER BY Coalesce(A.ID, B.ID);

Using SQL Server 2005, you can use a combination of ROW_NUMBER and a FULL OUTER JOIN to combine both results.
See SQL Fiddle
Unfortunately, SQL Server 2000 doesn't have a ROW_NUMBER function so you are kind of stuck with using temporary tables with identity fields to simulate a rownumber.
The gist of this would be to
Select your required data from tableA into #tempTableA, adding an identity field.
Repeat for tableB
Use the temptables to FULL OUTER JOIN the results

Related

Oracle SQL - Select not using index as expected

So I haven't used Oracle in more than 5 years and I'm out of practice. I've been on SQL Server all that time.
I'm looking at some of the existing queries and trying to improve them, but they're reacting really weirdly. According to the explain plan instead of going faster they're instead doing full table scans and not using the indexes.
In the original query, there is an equijoin done between two tables done in the where statement. We'll call them table A and B. I used an explain plan followed by SELECT * FROM table(DBMS_XPLAN.DISPLAY (FORMAT=>'ALL +OUTLINE')); and it tells me that Table A is queried by Local Index.
TABLE ACCESS BY LOCAL INDEX ROWID
SELECT A.*
FROM TableA A, TableB B
WHERE A.SecondaryID = B.ID;
I tried to change the query and join TableA with a new table (Table C). Table C is a subset of Table B with 700 records instead of 100K. However the explain plan tells me that Table A is now queried with a full lookup.
CREATE TableC
AS<br>
SELECT * FROM TableB WHERE Active='Y';
SELECT A.*
FROM TableA A, TableC C
WHERE A.SecondaryID = C.ID;
Next step, I kept the join between tables A & C, but used a hint to tell it to use the index on Table A. However it still does a full lookup.
SELECT /*+ INDEX (A_NDX01) */ A.*
FROM TableA A, TableC C
WHERE A.SecondaryID = C.ID;
So I tried to change from a join to a simple Select of table A and use an IN statement to compare to table C. Still a full table scan.
SELECT A.*
FROM TableA A
WHERE A.SecondaryID in (SELECT ID FROM TableC);
Lastly, I took the previous statement and changed the subselect to pull the top 1000 records, and it used the index. The odd thing is that there are only 700 records in Table C.
SELECT A.*
FROM TableA A
WHERE A.SecondaryID in (SELECT ID FROM TableC WHERE rownum <1000
)
I was wondering if someone could help me figure out what's happening?
My best guess is that since TableC is a new table, maybe the optimizer doesn't know how many records are in it and that's why it's it will only use the index if it knows that there are fewer than 1000 records?
I tried to run dbms_stats.gather_schema_stats on my schema though and it did not help.
Thank you for your help.
As a general rule Using an index will not necessarily make your query go faster ALWAYS.
Hints are directives to the optimizer to make use of the path, it doenst mean optimizer would choose to obey the hint directive. In this case, the optimizer would have considered that an index lookup on TableA is more expensive in the
SELECT A.*
FROM TableA A, TableB B
WHERE A.SecondaryID = B.ID;
SELECT /*+ INDEX (A_NDX01) */ A.*
FROM TableA A, TableC C
WHERE A.SecondaryID = C.ID;
SELECT A.*
FROM TableA A
WHERE A.SecondaryID in (SELECT ID FROM TableC);
Internally it might have converted all of these statements(IN) into a join which when considering the data in the tableA and tableC decided to make use of full table scan.
When you did the rownum condition, this plan conversion was not done. This is because view-merging will not work when it has the rownum in the query block.
I believe this is what is happening when you did
SELECT A.*
FROM TableA A
WHERE A.SecondaryID in (SELECT ID FROM TableC WHERE rownum <1000)
Have a look at the following link
Oracle. Preventing merge subquery and main query conditions

INTERSECT two table of size 500ml rows in vertica

I am very new to vertica db and hence looking for different efficient ways for comparing two tables of average size 500ml-800ml rows in vertica. I have a process that gets the data from vertica view and dump in to SQL server for later merge to final table in sql server. for few large tables combine it is dumping about 3bl rows daily. Instead of dumping all data I want to take daily snapshot, and compare it with previous days snapshot on vertica side only and then push changed rows only in to SQL SEREVER.
lets say previous snapshot is stored in tableA, today's snapshot stored in tableB. PK on both table is column named OrderId.
Simplest way I can think of is
Select * from tableB
Where OrderId NOT IN (
SELECT * from tableA
INTERSECT
SELECT * from tbleB
)
So my questions are:
Is there any other/better option in vertica to get only changed rows between two tables? Or should I
even consider doing this compare on vertica side?
How much doing such comparison should take?
What should I consider to improve the performance of such query?
If your columns have no NULL values, then a massive LEFT JOIN would seem to do what you want:
select b.*
from tableB b left join
tableA a
on b.OrderId = a.OrderId and
b.col1 = a.col1 and
. . . -- for all the columns you care about
However, I think you want except:
select b.*
from tableB b
except
select a.*
from tableA a;
I imagine this would have reasonable performance.
Do you have a primary key in the two tables?
Then my technique, for a complete Change Data Capture, is:
SELECT
'I' AS to_do
, newrows.*
FROM tb_today newrows
LEFT
JOIN tb_yesterday oldrows USING(id)
WHERE oldrows.id IS NULL
UNION ALL
SELECT
'U' AS to_do
, newrows.*
FROM tb_today newrows
JOIN tb_yesterday oldrows
WHERE oldrows.fname <> newrows.fname
OR oldrows.lnamd <> newrows.lname
OR oldrows.bdate <> newrwos.bdate
OR oldrows.sal <> newrows.sal
[...]
OR oldrows.lastcol <> newrows.lastcol
UNION ALL
SELECT
'D' AS to_do
, oldrows.*
FROM tb_yesterday oldrows
LEFT
JOIN tb_today oldrows USING(id)
WHERE newrows.id IS NULL
;
Just leave out the last leg of the UNION SELECT if you don't want to cater for DELETEs ('D')
Good luck
you also do it nicely using joins:
SELECT b.*
FROM tableB AS b
LEFT JOIN tableA AS a ON a.id = b.id
WHERE a.id IS NULL
so above query return only diff from TableB to TableA i.e. data which is present in both table will be skipped...

counting abscense or rows

I need to write a stored procedures to update contacts who have no active pledges in our database, I can't seem to find a way of counting contacts with 0 rows on the pledges table.
The external key in the pledges table is supporter_id, I've tried using Count(*), but it only returns 1 or more.
Thanks in advance.
PS: This is on a MS SQL database.
We'd need more information to give you a specific answer, but there are a number of ways to identify non-matching records, here are two:
LEFT JOIN:
SELECT a.*
FROM TableA a
LEFT JOIN TableB b
ON a.ID = b.ID
WHERE b.ID IS NULL
NOT EXISTS:
SELECT *
FROM TableA a
WHERE NOT EXISTS (SELECT *
FROM TableB b
WHERE a.ID = b.ID)
I ended using a subquery so I first determine who has pledges of the type I want and then look for contacts who are not in that list.
thanks for the responses.

SQLite Count() and Group By not getting the count of rows

I've written my SQLite query a couple different ways: 1 with a Group By and Count(), the other with a nested select that does a count().
Here's the current version:
Select t.A,
t.B,
(Select Count(*) As CurrentCount From tableB b Where b.B = t.B) As CurrentCount
From tableA t
Why does this not work in SQLite, but it's working just fine (both versions actually, the other using a group by) in SQL Server/T-SQL? Is it because of the join in the sub/nested select?
Edit: Let me clarify, I'm getting 0 for a count every time...
Edit2: I tried taking out the where clause in my nested select and it still returns 0 even though I know the table has records (133 to be exact)
Edit (Final Solution): This was NOT a code issue, it was a data issue. It's nothing that anyone would have caught. My "refresh" script that was reading the records from SQL Server was reading all 133, but the actual insert into my SQLite database had a missing comma, therefore the table WAS empty, hence the 0 records. Sorry for wasting everyone's time.
Something like
Select t.A
,t.B
,b.CurrentCount
From tableA t
INNER JOIN (Select B, Count(*) [CurrentCount] From tableB b GROUP BY B) as B
ON b.B = t.B

Use join with a table and SQL Statement

Joins are usually used to fetch data from 2 tables using a common factor from either tables
Is it possible to use a join statement using a table and results of another SQL statement and if it is what is the syntax
Sure, this is called a derived table
such as:
select a.column, b.column
from
table1 a
join (select statement) b
on b.column = a.column
keep in mind that it will run the select for the derived table in entirety, so it can be helpful if you only select things you need.
EDIT: I've found that I rarely need to use this technique unless I am joining on some aggregated queries.... so I would carefully consider your design here.
For example, thus far most demonstrations in this thread have not required the use of a derived table.
It depends on what the other statement is, but one of the techniques you can use is common table expressions - this may not be available on your particular SQL platform.
In the case of SQL Server, if the other statement is a stored procedure, you may have to insert the results into a temporary table and join to that.
It's also possible in SQL Server (and some other platforms) to have table-valued functions which can be joined just like a view or table.
select *
from TableA a
inner join (select x from TableB) b
on a.x = b.x
Select c.CustomerCode, c.CustomerName, sq.AccountBalance
From Customers c
Join (
Select CustomerCode, AccountBalance
From Balances
)sq on c.CustomerCode = sq.CustomerCode
Sure, as an example:
SELECT *
FROM Employees E
INNER JOIN
(
SELECT EmployeeID, COUNT(EmployeeID) as ComplaintCount
FROM Complaints
GROUP BY EmployeeID
) C ON E.EmployeeID = C.EmployeeID
WHERE C.ComplaintCount > 3
It is. But what specifically are you looking to do?
That can be done with either a sub-select, a view or a temp table... More information would help us answer this question better, including which SQL software, and an example of what you'd like to do.
Try this:
SELECT T1.col1, t2.col2 FROM Table1 t1 INNER JOIN
(SELECT col1, col2, col3 FROM Table 2) t2 ON t1.col1 = t2.col1