Make a "LEFT UNION" query - sql

I have several databases (nobu and bu) with exact same tables (one is just a back up of the other).
I need to get values from a table from both databases to join them with other tables then I obviously use an UNION. The thing is, some products have different names in the tables from both bu and nobu.
I then tried to select only one database about this table (I used nobu since it's the latest one), but I noticed that some products are not in nobu, but are actually in bu (which makes it not a backup anymore).
The part of the query in which I need this looks like this :
With this I get duplicates
... INNER JOIN (SELECT * FROM nobu.dbo.product UNION SELECT * FROM bu.dbo.product) AS product
ON [...] INNER JOIN (SELECT * FROM nobu.dbo.name UNION SELECT bu.dbo.name) AS name
ON product.key = name.id ...
With this I get some of the products with NULL name since it doesn't exist on nobu
... INNER JOIN (SELECT * FROM nobu.dbo.product UNION SELECT * FROM bu.dbo.product) AS product
ON [...] INNER JOIN (SELECT * FROM nobu.dbo.name) AS name
ON product.key = name.id ...
I wanted to know if there is a way to perform a LEFT UNION or something like that, to get all the values from nobu, and if there is no data, take the ones from bu, without getting the duplicates (since they can have different names on both databases).

If only names have been changed and suggesting that table names is not a big table and will not create performance issues then this code below will do the job:
INNER JOIN (SELECT * FROM nobu.dbo.product UNION SELECT * FROM bu.dbo.product) AS product
ON [...] INNER JOIN (SELECT * FROM nobu.dbo.name UNION SELECT bu.dbo.name WHERE id NOT IN (SELECT id FROM nobu.dbo.name)) AS name
ON product.key = name.id

Related

Remove duplicates from result in sql

i have following sql in java project:
select distinct * from drivers inner join licenses on drivers.user_id=licenses.issuer_id
inner join users on drivers.user_id=users.id
where (licenses.state='ISSUED' or drivers.status='WAITING')
and users.is_deleted=false
And result i database looks like this:
And i would like to get only one result instead of two duplicated results.
How can i do that?
Solution 1 - That's Because one of data has duplicate value write distinct keyword with only column you want like this
Select distinct id, distinct creation_date, distinct modification_date from
YourTable
Solution 2 - apply distinct only on ID and once you get id you can get all data using in query
select * from yourtable where id in (select distinct id from drivers inner join
licenses
on drivers.user_id=licenses.issuer_id
inner join users on drivers.user_id=users.id
where (licenses.state='ISSUED' or drivers.status='WAITING')
and users.is_deleted=false )
Enum fields name on select, using COALESCE for fields which value is null.
usually you dont query distinct with * (all columns), because it means if one column has the same value but the rest isn't, it will be treated as a different rows. so you have to distinct only the column you want to, then get the data
I suspect that you want left joins like this:
select *
from users u left join
drivers d
on d.user_id = u.id and d.status = 'WAITING' left join
licenses l
on d.user_id = l.issuer_id and l.state = 'ISSUED'
where u.is_deleted = false and
(d.user_id is not null or l.issuer_id is not null);

Combine two SQL Server table results to pull latest data where exists

I have two tables a main table and a work in progress table. Any inserts/updates are inserted into the WIP table while the record is being manipulated, this allows for validation checks and the like. I want to create a view that combines the two tables showing the WIP table data whenever it exists and the main table data when there is no WIP data.
I have figured out a way to do this but it seems that it's not the most elegant solution. I would like to know if there are other ideas or better solutions?
Example illustrating the situation:
select mt.id, wt.id wip_id, isnull(wt.name,mt.name) name,
isnull(wt.address, mt.address) address
from main_table mt full outer join
wip_table wt on mt.id = wt.orig_id;
So that will pull results from the WIP table when they exist, if they dont it will pull results from the main table. This was a simple example but the tables could have many rows.
if you want data either from one table or the other:
select top 1 *
from
(
select 1 as prio, wt.name, wt.address, .... from wip_table wt where ...
union
select 2 as prio, mt.name, mt.address, .... from main_table mt where ...
order by prio
) x
otherwise, like you have done (checking individual columns), but maybe using a left outer join rather than a full one:
select
mt.id
, wt.id wip_id
, isnull(wt.name,mt.name) name
, isnull(wt.address, mt.address) address
from main_table mt left outer join wip_table wt
on mt.id = wt.orig_id;

INNER JOIN vs IN

SELECT C.* FROM StockToCategory STC
INNER JOIN Category C ON STC.CategoryID = C.CategoryID
WHERE STC.StockID = #StockID
VS
SELECT * FROM Category
WHERE CategoryID IN
(SELECT CategoryID FROM StockToCategory WHERE StockID = #StockID)
Which is considered the correct (syntactically) and most performant approach and why?
The syntax in the latter example seems more logical to me but my assumption is the JOIN will be faster.
I have looked at the query plans and havent been able to decipher anything from them.
Query Plan 1
Query Plan 2
The two syntaxes serve different purposes. Using the Join syntax presumes you want something from both the StockToCategory and Category table. If there are multiple entries in the StockToCategory table for each category, the Category table values will be repeated.
Using the IN function presumes that you want only items from the Category whose ID meets some criteria. If a given CategoryId (assuming it is the PK of the Category table) exists multiple times in the StockToCategory table, it will only be returned once.
In your exact example, they will produce the same output however IMO, the later syntax makes your intent (only wanting categories), clearer.
Btw, yet a third syntax which is similar to using the IN function:
Select ...
From Category
Where Exists (
Select 1
From StockToCategory
Where StockToCategory.CategoryId = Category.CategoryId
And StockToCategory.Stock = #StockId
)
Syntactically (semantically too) these are both correct. In terms of performance they are effectively equivalent, in fact I would expect SQL Server to generate the exact same physical plans for these two queries.
T think There are just two ways to specify the same desired result.
for sqlite
table device_group_folders contains 10 records
table device_groups contains ~100000 records
INNER JOIN: 31 ms
WITH RECURSIVE select_childs(uuid) AS (
SELECT uuid FROM device_group_folders WHERE uuid = '000B:653D1D5D:00000003'
UNION ALL
SELECT device_group_folders.uuid FROM device_group_folders INNER JOIN select_childs ON parent = select_childs.uuid
) SELECT device_groups.uuid FROM select_childs INNER JOIN device_groups ON device_groups.parent = select_childs.uuid;
WHERE 31 ms
WITH RECURSIVE select_childs(uuid) AS (
SELECT uuid FROM device_group_folders WHERE uuid = '000B:653D1D5D:00000003'
UNION ALL
SELECT device_group_folders.uuid FROM device_group_folders INNER JOIN select_childs ON parent = select_childs.uuid
) SELECT device_groups.uuid FROM select_childs, device_groups WHERE device_groups.parent = select_childs.uuid;
IN <1 ms
SELECT device_groups.uuid FROM device_groups WHERE device_groups.parent IN (WITH RECURSIVE select_childs(uuid) AS (
SELECT uuid FROM device_group_folders WHERE uuid = '000B:653D1D5D:00000003'
UNION ALL
SELECT device_group_folders.uuid FROM device_group_folders INNER JOIN select_childs ON parent = select_childs.uuid
) SELECT * FROM select_childs);

How to avoid large in clause?

I have 3 tables :
table_product (30 000 row)
---------
ID
label
_
table_period (225 000 row)
---------
ID
date_start
date_end
default_price
FK_ID_product
and
table_special_offer (10 000 row)
-----
ID
label
date_start,
date_end,
special_offer_price
FK_ID_period
So I need to load data from all these table, so here it's what I do :
1/ load data from "table_product" like this
select *
from table_product
where label like 'gun%'
2/ load data from "table_period" like this
select *
from table_period
where FK_ID_product IN(list of all the ids selected in the 1)
3/ load data from "table_special_offer" like this
select *
from table_special_offer
where FK_ID_period IN(list of all the ids selected in the 2)
As you may think the IN clause in the point 3 can be very very big (like 75 000 big), so I got a lot of chance of getting either a timeout or something like " An expression services limit has been reached".
Have you ever had something like this, and how did you manage to avoid it ?
PS :
the context : SQL server 2005, .net 2.0
(please don't tell me my design is bad, or I shouldn't do "select *", I just simplified my problem so it is a little bit simpler than 500 pages describing my business).
Thanks.
Switch to using joins:
SELECT <FieldList>
FROM Table_Product prod
JOIN Table_Period per ON prod.Id = per.FK_ID_Product
JOIN Table_Special_Offer spec ON per.ID = spec.FK_ID_Period
WHERE prod.label LIKE 'gun%'
Something you should be aware of is the difference of IN vs JOIN vs EXISTS - great article here.
In finally have my answer : table variable (a bit like #smirkingman's solution but not with cte) so:
declare #product(id int primary key,label nvarchar(max))
declare #period(id int primary key,date_start datetime,date_end datetime,defaultprice real)
declare #special_offer(id int,date_start datetime,date_end datetime,special_offer_price real)
insert into #product
select *
from table_product
where label like 'gun%'
insert into #period
select *
from table_period
where exists(
select * from #product p where p.id = table_period.FK_id_product
)
insert into #special_offer
select *
from table_special_offer
where exists(
select * from #period p where p.id = table_special_offer.fk_id_period
)
select * from #product
select * from #period
select * from #special_offer
this is for the sql, and with c# I use ExecuteReader, Read, and NextResult of the class sqldatareader
http://msdn.microsoft.com/en-us/library/system.data.sqlclient.sqldatareader.aspx
I got all I want :
- my datas
- i don't have too much data (unlike the solutions with join)
- i don't execute twice the same query (like solution with subquery)
- i don't have to change my mapping code (1row = 1 business object)
Don't use explicit list of values in IN clause. Instead, write your query like
... FK_ID_product IN (select ID
from table_product
where label like 'gun%')
SELECT *
FROM
table_product tp
INNER JOIN table_period tper
ON tp.ID = tper.FK_ID_product
INNER JOIN table_special_offer so
ON tper.ID = so.FK_ID_period
WHERE
tp.label like 'gun%'"
First some code...
Using JOIN:
SELECT
table_product.* --'Explicit table calls just for organisation sake'
, table_period.*
, table_special_offer.*
FROM
table_product
INNER JOIN table_period
ON table_product.ID = table_period.FK_ID_product
INNER JOIN table_special_offer
ON table_period.ID = table_special_offer.FK_ID_period
WHERE
tp.label like 'gun%'"
Using IN :
SELECT
*
FROM
table_special_offer
WHERE FK_ID_period IN
(
SELECT
FK_ID_period
FROM
table_period
WHERE FK_ID_product IN
(
SELECT
FK_ID_product
FROM
table_product
WHERE label like '%gun'
) AS ProductSub
) AS PeriodSub
Depending on how well your tables get indexed both can be used. Inner Joins as the others have suggested are definitely efficient at doing your query and returning all data for the 3 tables. If you are only needing To use the ID's from table_product and table_period Then using the nested "IN" statements can be good for adapting search criteria on indexed tables (Using IN can be ok if the criteria used are integers like I assume your FK_ID_product is).
An important thing to remember is every database and relational table setup is going to act differently, you wont have the same optimised results in one db to another. Try ALL the possibilities at hand and use the one that is best for you. The query analyser can be incredibly useful in times like these when you need to check performance.
I had this situation when we were trying to join up customer accounts to their appropriate addresses via an ID join and a linked table based condition (we had another table which showed customers with certain equipment which we had to do a string search on.) Strangely enough it was quicker for us to use both methods in the one query:
--The query with the WHERE Desc LIKE '%Equipment%' was "joined" to the client table using the IN clause and then this was joined onto the addresses table:
SELECT
Address.*
, Customers_Filtered.*
FROM
Address AS Address
INNER JOIN
(SELECT Customers.* FROM Customers WHERE ID IN (SELECT CustomerID FROM Equipment WHERE Desc LIKE '%Equipment search here%') AS Equipment ) AS Customers_Filtered
ON Address.CustomerID = Customers_Filtered.ID
This style of query (I apologise if my syntax isn't exactly correct) ended up being more efficient and easier to organise after the overall query got more complicated.
Hope this has helped - Follow #AdaTheDev 's article link, definitely a good resource.
A JOIN gives you the same results.
SELECT so.Col1
, so.Col2
FROM table_product pt
INNER JOIN table_period pd ON pd.FK_ID_product = pt.ID_product
INNER JOIN table_special_offer so ON so.FK_ID_Period = pd.ID_Period
WHERE pt.lable LIKE 'gun%'
I'd be interested to know if this might make an improvement:
WITH products(prdid) AS (
SELECT
ID
FROM
table_product
WHERE
label like 'gun%'
),
periods(perid) AS (
SELECT
ID
FROM
table_period
INNER JOIN products
ON id = prdid
),
offers(offid) AS (
SELECT
ID
FROM
table_special_offer
INNER JOIN periods
ON id = perid
)
... just a suggestion...

SQL 2005 - The column was specified multiple times

I am getting the following error when trying to run this query in SQL 2005:
SELECT tb.*
FROM (
SELECT *
FROM vCodesWithPEs INNER JOIN vDeriveAvailabilityFromPE
ON vCodesWithPEs.PROD_PERM = vDeriveAvailabilityFromPE.PEID
INNER JOIN PE_PDP ON vCodesWithPEs.PROD_PERM = PE_PDP.PEID
) AS tb;
Error: The column 'PEID' was specified multiple times for 'tb'.
I am new to SQL.
The problem, as mentioned, is that you are selecting PEID from two tables, the solution is to specify which PEID do you want, for example
SELECT tb.*
FROM (
SELECT tb1.PEID,tb2.col1,tb2.col2,tb3.col3 --, and so on
FROM vCodesWithPEs as tb1 INNER JOIN vDeriveAvailabilityFromPE as tb2
ON tb1.PROD_PERM = tb2.PEID
INNER JOIN PE_PDP tb3 ON tb1.PROD_PERM = tb3.PEID
) AS tb;
That aside, as Chris Lively cleverly points out in a comment the outer SELECT is totally superfluous. The following is totally equivalent to the first.
SELECT tb1.PEID,tb2.col1,tb2.col2,tb3.col3 --, and so on
FROM vCodesWithPEs as tb1 INNER JOIN vDeriveAvailabilityFromPE as tb2
ON tb1.PROD_PERM = tb2.PEID
INNER JOIN PE_PDP tb3 ON tb1.PROD_PERM = tb3.PEID
or even
SELECT *
FROM vCodesWithPEs as tb1 INNER JOIN vDeriveAvailabilityFromPE as tb2
ON tb1.PROD_PERM = tb2.PEID
INNER JOIN PE_PDP tb3 ON tb1.PROD_PERM = tb3.PEID
but please avoid using SELECT * whenever possible. It may work while you are doing interactive queries to save typing, but in production code never use it.
Looks like you have the column PEID in both tables: vDeriveAvailabilityFromPE and PE_PDP. The SELECT statement tries to select both, and gives an error about duplicate column name.
You're joining three tables, and looking at all columns in the output (*).
It looks like the tables have a common column name PEID, which you're going to have to alias as something else.
Solution: don't use * in the subquery, but explicitly select each column you wish to see, aliasing any column name that appears more than once.
Instead of using * to identify collecting all of the fields, rewrite your query to explicitly name the columns you want. That way there will be no confusion.
just give new alias name for the column that repeats,it worked for me.....