Create New SQL Table w/o duplicates - sql

I'm learning how to create tables in SQL pulling data from existing tables from two different databases. I am trying to create a table combining two tables without duplicates. I've seen some say using UNION but I could not get that to work.
Say TABLE 1 has 2 COLUMNS (IdNumber, Material) and TABLE 2 has 3 COLUMNS (IdNumber, Size, Description)
How can I create a new table (named TABLE3) that combines those two but only shows the columns (PartDescription, Weight, Color) but without duplicates.
What I have done so far is as follows:
CREATE TABLE #Materialsearch (IdNumber varchar(30), Material varchar(30))
CREATE TABLE #Sizesearch (idnumber varchar(30), Size varchar(30), Description varchar(50))
INSERT INTO #Materialsearch (IdNumber, Material)
SELECT [IdNumber],[Material]
FROM [datalist].[dbo].[Table1]
WHERE Material LIKE 'Steel' AND IdNumber NOT LIKE 'Steel'
INSERT INTO #Sizesearch (idnumber, Size, Description)
SELECT [idNumber],[itemSize], [ShortDesc]
FROM [515dap].[dbo].[Table2]
WHERE itemSize LIKE '1' AND idnumber NOT LIKE 'Steel'
SELECT DISTINCT #Materialsearch.IdNumber, #Materialsearch.Material,
#Sizesearch.Size, #Sizesearch.Description
FROM #Materialsearch
INNER JOIN #Sizesearch
ON #Materialsearch.IdNumber = #Sizesearch.idnumber
ORDER BY #Materialsearch.IdNumber
DROP TABLE #Materialsearch
DROP TABLE #Sizesearch
This would show all items that are made from steel but do not have steel as their itemid's.
Thanks for your help

I'm not 100% sure what you're after - but you may find this useful.
You could use a FULL OUTER JOIN, which takes all rows from both tables, matches the ones it can, then reports all rows (unmatched columns come back as NULL).
I'd suggest (for your understanding) running
SELECT A.*, B.*
FROM #Materialsearch AS A
FULL OUTER JOIN #Sizesearch AS B ON A.[IdNumber] = B.[IdNumber]
Then to get the relevant data, you just need some tweaks on that e.g.,
SELECT
ISNULL(A.[IdNumber], B.[IdNumber]) AS [IdNumber],
A.Material,
B.Size,
B.Description
FROM #Materialsearch AS A
FULL OUTER JOIN #Sizesearch AS B ON A.[IdNumber] = B.[IdNumber]
Edit: Changed the mistyped INNER JOINs to FULL OUTER JOINs. Oops :( Thank you very much @Thorsten for finding it!
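To see what the FULL OUTER JOIN above returns, here is a small sketch using Python's built-in sqlite3 module as a stand-in for SQL Server. The sample rows are invented for illustration. Because older SQLite versions have no FULL OUTER JOIN, the sketch emulates it with a LEFT JOIN plus a UNION ALL of the rows that exist only on the other side; on SQL Server you would simply write FULL OUTER JOIN as shown above.

```python
import sqlite3

def full_outer_join_demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Materialsearch (IdNumber TEXT, Material TEXT);
        CREATE TABLE Sizesearch (IdNumber TEXT, Size TEXT, Description TEXT);
        -- Hypothetical sample rows: A1 only has a material, A3 only has a size
        INSERT INTO Materialsearch VALUES ('A1', 'Steel'), ('A2', 'Steel');
        INSERT INTO Sizesearch VALUES ('A2', '1', 'bolt'), ('A3', '1', 'nut');
    """)
    rows = conn.execute("""
        -- All rows from Materialsearch, matched to Sizesearch where possible
        SELECT A.IdNumber, A.Material, B.Size, B.Description
        FROM Materialsearch AS A
        LEFT JOIN Sizesearch AS B ON A.IdNumber = B.IdNumber
        UNION ALL
        -- Plus the Sizesearch rows that had no match at all
        SELECT B.IdNumber, NULL, B.Size, B.Description
        FROM Sizesearch AS B
        LEFT JOIN Materialsearch AS A ON A.IdNumber = B.IdNumber
        WHERE A.IdNumber IS NULL
        ORDER BY 1
    """).fetchall()
    conn.close()
    return rows

for row in full_outer_join_demo():
    print(row)
```

Every IdNumber appears exactly once here, with None (NULL) in the columns that come from the table where it is missing.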

Related

SQL server how to do string comparison with left join

I have two tables: the entry table (one nvarchar column called entry) and the disease table (one nvarchar column called disease).
I would like to produce another table that has all the entry-disease combos where entry.entry is contained completely in disease.disease. However, I want all entries that do not have a disease that is completely contained inside of them to still appear in the results table as {entry, blank}.
I know it should probably be something like:
select entry disease from
entry, disease
where ...
not really sure how to write this, thanks in advance
Ok, I figured out this much:
select entry.entry, disease.disease
into new_table
from entry, disease
where CHARINDEX(entry, disease) > 0
how do I include the entries that have no match?
You can use left join in this case.
It will show all entries and provide null value for the column disease.disease if an entry is not contained in any cell of disease.disease.
SELECT entry.entry, disease.disease
FROM entry LEFT JOIN disease
ON CHARINDEX(entry.entry, disease.disease) > 0
An alternative solution is to use the LIKE keyword. Please see the example below:
create table #t (st varchar(20))
insert into #t values ('johnalexmichael'), ('johnmichael'),('alex'),('michael')
create table #t2 (st varchar(20))
insert into #t2 values ('alex'),('john')
SELECT t.st, t2.st
FROM #t t
LEFT JOIN #t2 t2 on t.st like '%'+t2.st+'%'
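For anyone who wants to check the result without a SQL Server instance, the same LIKE-based LEFT JOIN can be run through Python's sqlite3 module. This is only a sketch: SQLite concatenates strings with || instead of +, and the temp tables are replaced by ordinary in-memory tables named t and t2.

```python
import sqlite3

def substring_left_join():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE t  (st TEXT);
        CREATE TABLE t2 (st TEXT);
        INSERT INTO t  VALUES ('johnalexmichael'), ('johnmichael'), ('alex'), ('michael');
        INSERT INTO t2 VALUES ('alex'), ('john');
    """)
    rows = conn.execute("""
        SELECT t.st, t2.st
        FROM t
        -- A row of t matches every t2 value it contains as a substring
        LEFT JOIN t2 ON t.st LIKE '%' || t2.st || '%'
        ORDER BY t.st, t2.st
    """).fetchall()
    conn.close()
    return rows

for row in substring_left_join():
    print(row)
```

'johnalexmichael' comes back twice (once per match), while 'michael' contains neither value and is kept with None (NULL) in the second column.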

SQL: Using Multi-Row data in Column

This may not be possible in one query, but I'd like to see what my options are.
I have a large query that returns the data for each piece of an inventory (in this case trees). The query gets data from a few different tables. I mostly use left outer joins to bring this information in, so if it's not there I can ignore it and take the NULL. I have an interesting situation where a one-to-many relationship exists between "tree" and its "pests".
tree table: treeID, treeHeight, etc....
pest (pest to tree) table: pestID, treeID, pestRef.....
I need a query that gets the top 6 pests for each tree and returns them as columns:
pest1, pest2, pest3... and so on.
I know that I could do this in multiple queries, however that would happen thousands of times just per use and our servers can't handle that.
Some notes: we're using ColdFusionMX7, and my knowledge of stored procedures is very low.
One approach is to generate a column representing the pest rank by tree, then join the ranked pest table to the tree table with rank as a join condition. Make sure you use ROW_NUMBER not RANK because a tie would cause repeated numbers in RANK (but not ROW_NUMBER), and make sure you use LEFT OUTER joins so trees with fewer pests are not excluded. Also, I ordered by 2 conditions, but anything valid in a normal ORDER BY clause is valid here.
DECLARE @t TABLE (TreeID INT, TreeName VARCHAR(25));
DECLARE @p TABLE (PestID INT, TreeID INT, PestName VARCHAR(25));
INSERT INTO @t VALUES (1,'ash'),(2,'elm'),(3,'oak');
INSERT INTO @p VALUES (1,1,'ash borer'),(2,1,'tent caterpillar'),(3,1,'black weevil'),(4,1,'brown weevil');
INSERT INTO @p VALUES (5,2,'elm thrip'),(6,2,'wooly adelgid');
INSERT INTO @p VALUES (7,3,'oak gall wasp'),(8,3,'asian longhorn beetle'),(9,3,'aphids');
-- Number each tree's pests 1..n; ROW_NUMBER never repeats a number on ties
WITH cteRankedPests AS (
SELECT PestID, TreeID, PestName, ROW_NUMBER() OVER (PARTITION BY TreeID ORDER BY PestName, PestID) AS PestRank
FROM @p
)
-- One LEFT OUTER JOIN per output slot; add P4..P6 the same way for six pests
SELECT T.TreeID, T.TreeName
, P1.PestID AS P1ID, P1.PestName AS P1Name
, P2.PestID AS P2ID, P2.PestName AS P2Name
, P3.PestID AS P3ID, P3.PestName AS P3Name
FROM @t AS T
LEFT OUTER JOIN cteRankedPests AS P1 ON T.TreeID = P1.TreeID AND P1.PestRank = 1
LEFT OUTER JOIN cteRankedPests AS P2 ON T.TreeID = P2.TreeID AND P2.PestRank = 2
LEFT OUTER JOIN cteRankedPests AS P3 ON T.TreeID = P3.TreeID AND P3.PestRank = 3;
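Since the question asks for six pest columns rather than three, the repeated joins can also be generated instead of typed by hand. The sketch below does that in Python with the built-in sqlite3 module standing in for SQL Server (it assumes a SQLite build with window functions, 3.25 or later). The table names tree and pest and the function name top_pests_as_columns are made up for the example.

```python
import sqlite3

def top_pests_as_columns(n=3):
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE tree (TreeID INTEGER, TreeName TEXT);
        CREATE TABLE pest (PestID INTEGER, TreeID INTEGER, PestName TEXT);
        INSERT INTO tree VALUES (1,'ash'),(2,'elm'),(3,'oak');
        INSERT INTO pest VALUES
            (1,1,'ash borer'),(2,1,'tent caterpillar'),(3,1,'black weevil'),(4,1,'brown weevil'),
            (5,2,'elm thrip'),(6,2,'wooly adelgid'),
            (7,3,'oak gall wasp'),(8,3,'asian longhorn beetle'),(9,3,'aphids');
    """)
    # One output column and one LEFT JOIN per rank 1..n (n is an int, not user text)
    cols = "".join(f", P{i}.PestName AS P{i}Name" for i in range(1, n + 1))
    joins = "\n".join(
        f"LEFT JOIN ranked AS P{i} ON T.TreeID = P{i}.TreeID AND P{i}.PestRank = {i}"
        for i in range(1, n + 1)
    )
    sql = f"""
        WITH ranked AS (
            SELECT PestID, TreeID, PestName,
                   ROW_NUMBER() OVER (PARTITION BY TreeID
                                      ORDER BY PestName, PestID) AS PestRank
            FROM pest
        )
        SELECT T.TreeID, T.TreeName{cols}
        FROM tree AS T
        {joins}
        ORDER BY T.TreeID
    """
    rows = conn.execute(sql).fetchall()
    conn.close()
    return rows

for row in top_pests_as_columns(6):
    print(row)
```

Trees with fewer than n pests simply get None (NULL) in the remaining columns, which is what the LEFT joins are for.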

Most efficient way of comparing two tables in SQL Server

So I have two tables which will store sales figures for products. Table one holds the last 6 weeks' sales figures for each product and table two holds the last 12 months'. I need to find a way to compare these two tables and produce a third table which will contain the product's Sage code in column one and the difference between the two values in column two. What would be the most efficient (in terms of time) way to do this, as there will be a fair amount of products to compare and it will only continue to grow? The product Sage code is the key identifier here. The two tables are created as below.
IF OBJECT_ID('tempdb..#Last6WeeksProductSales') IS NOT NULL DROP TABLE #Last6WeeksProductSales;
CREATE TABLE #Last6WeeksProductSales
(
CompoundSageCode varchar(200),
Value decimal(18,2)
)
INSERT INTO #Last6WeeksProductSales
SELECT [SalesOrderLine].[sProductSageCode] AS [CompoundSageCode],
SUM([SalesOrderLine].[fQtyOrdered] * [SalesOrderLine].[fPricePerUnit]) AS [Value]
FROM [SalesOrderLine]
INNER JOIN [SalesOrder] ON (SalesOrder.iSalesOrderID = SalesOrderLine.iSalesOrderID)
WHERE [SalesOrder].[dOrderDateTime] > DateAdd(week, -6, CURRENT_TIMESTAMP)
GROUP BY sProductSageCode;
SELECT * FROM #Last6WeeksProductSales
ORDER BY CompoundSageCode;
IF OBJECT_ID('tempdb..#Last12MonthsProductSales') IS NOT NULL DROP TABLE #Last12MonthsProductSales;
CREATE TABLE #Last12MonthsProductSales
(
CompoundSageCode varchar(200),
Value decimal(18,2)
)
INSERT INTO #Last12MonthsProductSales SELECT [SalesOrderLine].[sProductSageCode] AS [CompoundSageCode],
SUM([SalesOrderLine].[fQtyOrdered] * [SalesOrderLine].[fPricePerUnit]) AS [Value]
FROM [SalesOrderLine]
INNER JOIN [SalesOrder] ON (SalesOrder.iSalesOrderID = SalesOrderLine.iSalesOrderID)
WHERE [SalesOrder].[dOrderDateTime] > DateAdd(month, -12, CURRENT_TIMESTAMP)
GROUP BY sProductSageCode;
SELECT * FROM #Last12MonthsProductSales
ORDER BY CompoundSageCode;
DROP TABLE #Last6WeeksProductSales;
DROP TABLE #Last12MonthsProductSales;
Use a view. That way you don't have to worry about updating your third table, and it will reflect current information. Base the view on a basic SELECT:
SELECT sixS.CompoundSageCode,
(twelveS.value - sixS.Value ) as diffValue
FROM Last6WeeksProductSales sixS
INNER JOIN Last12MonthsProductSales twelveS ON sixS.CompoundSageCode = twelveS.CompoundSageCode
(I have not tested this code, but it should be a good starting point.)
Computing the difference of two tables is usually done using a FULL OUTER JOIN. SQL Server can implement it using all three of the physical join operators. Apply reasonable indexing and it will run fine.
If you can manage it, create covering indexes on both tables that are sorted by the join key. This will result in a highly efficient merge join plan.
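As a rough illustration of the outer-join approach, here is a sketch in Python with the built-in sqlite3 module standing in for SQL Server. The sample figures and product codes are invented. One substitution to note: it uses a LEFT JOIN from the 12-month table rather than a FULL OUTER JOIN, on the assumption that both tables are built from the same orders at the same time, so every product sold in the last 6 weeks also appears in the last 12 months. The PRIMARY KEY on CompoundSageCode is there to give each table an index on the join key.

```python
import sqlite3

def sales_difference():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Last6WeeksProductSales
            (CompoundSageCode TEXT PRIMARY KEY, Value REAL);
        CREATE TABLE Last12MonthsProductSales
            (CompoundSageCode TEXT PRIMARY KEY, Value REAL);
        -- Hypothetical figures; P003 had no sales in the last 6 weeks
        INSERT INTO Last6WeeksProductSales VALUES ('P001', 100.0), ('P002', 40.0);
        INSERT INTO Last12MonthsProductSales VALUES
            ('P001', 900.0), ('P002', 40.0), ('P003', 250.0);
    """)
    rows = conn.execute("""
        SELECT t.CompoundSageCode,
               -- A missing 6-week row counts as zero sales
               t.Value - COALESCE(s.Value, 0) AS diffValue
        FROM Last12MonthsProductSales AS t
        LEFT JOIN Last6WeeksProductSales AS s
               ON s.CompoundSageCode = t.CompoundSageCode
        ORDER BY t.CompoundSageCode
    """).fetchall()
    conn.close()
    return rows

for row in sales_difference():
    print(row)
```

Unlike an INNER JOIN, this keeps P003 in the result even though it has no row in the 6-week table.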

1 to Many join for SQL View where joined table may have 0 records

I have been tasked with creating a view that needs to bring in up to 10 records from another table. Problem is this table may have 0, 5, 10, or more corresponding records.
Here is the very simplified design to only include what is relevant
SalesOrderTable    OutsideSalesRepTable    SalesRepTable
OrderID            BranchID                SalesRepID
CustID             CustID                  SalesRepName
BranchID           SalesRepID
The first join needs to be between SalesOrderTable and OutsideSalesRepTable on BranchID & CustID
The second join needs to be between OutsideSalesRepTable and SalesRepTable on SalesRepID
The view will need to have columns listed as OutsideSalesRep1, OutsideSalesRep2, ... OutsideSalesRep10 and filled with the SalesRepName. I have no control over the design of this database. I would have much rather seen 10 fields dedicated to SalesRepIDs in the customer table and just used left joins.
If only 3 OutsideSalesReps exist for the branch/customer, then OutsideSalesRep4-10 should be null.
This is the only part of the 165 column / 35+ table view I wasn't able to figure out.
Any help would be sincerely appreciated.
PS I'm semi-fresh to TSQL. Only been using it about 6 months.
EDIT: I linked to an image that shows a sample of the source data to help (I hope) explain what I'm looking for.
The pivot table needs to show
SONum   OutsideRep1   OutsideRep2   OutsideRep3   .....   OutsideRep10
5819    59            69            70            null    null
5821    59            70            null          null    null
http://www.bayernsupport.com/SQL.png
Sounds like you need to join your tables with an outer join (ie: left join or right join) (to allow for joins where there are no results) and to use a pivot to create columns from rows.
http://technet.microsoft.com/en-us/library/ms177410(v=sql.105).aspx
Got it working with the assistance of a friend. It did require a pivot, but it also required an interesting query as the source. Bear in mind that the field names below don't exactly match, but the structure and end result were dead on.
SELECT *
FROM
(
SELECT so.OrderID,
so.OrderName,
sr.SalesRepName,
'SalesRep_'+CAST(ROW_NUMBER() OVER(PARTITION BY OrderName ORDER BY SalesRepName) AS VARCHAR(30)) rn
FROM #SalesOrderTable so
JOIN #OutsideSalesRepTable osp ON so.BranchID = osp.BranchID and so.CustID=osp.CustID
JOIN #SalesRepTable sr ON osp.SalesRepID = sr.SalesRepID
) src
PIVOT
(
MAX(SalesRepName)
FOR rn in (SalesRep_1, SalesRep_2, SalesRep_3,SalesRep_4, SalesRep_5,
SalesRep_6,SalesRep_7,SalesRep_8,SalesRep_9,SalesRep_10)
) piv
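For comparison, the same rows-to-columns result can be produced without PIVOT by using conditional aggregation (MAX over a CASE per slot), which also works on engines that lack PIVOT. The sketch below is not the query above; it runs in Python's sqlite3 module as a stand-in (SQLite 3.25 or later for ROW_NUMBER), with invented rep names and an extra order 5900 that has no reps. It uses LEFT JOINs so that an order with zero reps still shows up, and only three slots to keep it short.

```python
import sqlite3

def reps_as_columns():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE SalesOrderTable (OrderID INTEGER, CustID INTEGER, BranchID INTEGER);
        CREATE TABLE OutsideSalesRepTable (BranchID INTEGER, CustID INTEGER, SalesRepID INTEGER);
        CREATE TABLE SalesRepTable (SalesRepID INTEGER, SalesRepName TEXT);
        -- Hypothetical data: order 5900 has no outside sales reps
        INSERT INTO SalesOrderTable VALUES (5819, 1, 10), (5821, 2, 10), (5900, 3, 10);
        INSERT INTO OutsideSalesRepTable VALUES
            (10, 1, 59), (10, 1, 69), (10, 1, 70), (10, 2, 59), (10, 2, 70);
        INSERT INTO SalesRepTable VALUES (59, 'Ann'), (69, 'Bob'), (70, 'Cy');
    """)
    rows = conn.execute("""
        WITH src AS (
            SELECT so.OrderID, sr.SalesRepName,
                   ROW_NUMBER() OVER (PARTITION BY so.OrderID
                                      ORDER BY sr.SalesRepName) AS rn
            FROM SalesOrderTable AS so
            LEFT JOIN OutsideSalesRepTable AS osp
                   ON so.BranchID = osp.BranchID AND so.CustID = osp.CustID
            LEFT JOIN SalesRepTable AS sr ON osp.SalesRepID = sr.SalesRepID
        )
        SELECT OrderID,
               MAX(CASE WHEN rn = 1 THEN SalesRepName END) AS OutsideSalesRep1,
               MAX(CASE WHEN rn = 2 THEN SalesRepName END) AS OutsideSalesRep2,
               MAX(CASE WHEN rn = 3 THEN SalesRepName END) AS OutsideSalesRep3
        FROM src
        GROUP BY OrderID
        ORDER BY OrderID
    """).fetchall()
    conn.close()
    return rows

for row in reps_as_columns():
    print(row)
```

Extending to ten columns means adding seven more MAX(CASE ...) lines; slots beyond the available reps stay None (NULL).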

Is it possible to report on 2 tables without using a subquery?

You have one table against which you wish to count the number of items in two different tables. In this example I used buildings, men and women
DROP TABLE IF EXISTS building;
DROP TABLE IF EXISTS men;
DROP TABLE IF EXISTS women;
CREATE TABLE building(name VARCHAR(255));
CREATE TABLE men(building VARCHAR(255), name VARCHAR(255));
CREATE TABLE women(building VARCHAR(255), name VARCHAR(255));
INSERT INTO building VALUES('building1');
INSERT INTO building VALUES('building2');
INSERT INTO building VALUES('building3');
INSERT INTO men VALUES('building1', 'andy');
INSERT INTO men VALUES('building1', 'barry');
INSERT INTO men VALUES('building2', 'calvin');
INSERT INTO men VALUES(null, 'dwain');
INSERT INTO women VALUES('building1', 'alice');
INSERT INTO women VALUES('building1', 'betty');
INSERT INTO women VALUES(null, 'casandra');
select
r1.building_name,
r1.men,
GROUP_CONCAT(women.name) as women,
COUNT(women.name) + r1.men_count as count
from
(select
building.name as building_name,
GROUP_CONCAT(men.name) as men,
COUNT(men.name) as men_count
from
building
left join
men on building.name=men.building
GROUP BY building.name) as r1
left join
women on r1.building_name=women.building
GROUP BY r1.building_name;
Might there be another way? The problem with the above approach is that the columns of the two tables in the subquery are hidden and need to be redeclared in the outer query. Doing it in two separate set operations creates an asymmetry where there is none. We could equally have joined to the women first and then the men.
In SQL Server, I would just join two subqueries with two left joins - if symmetry is what you are looking for:
SELECT *
FROM building
LEFT JOIN (SELECT building, etc. FROM men GROUP BY etc.) AS men_summary
ON building.name = men_summary.building
LEFT JOIN (SELECT building, etc. FROM women GROUP BY etc.) AS women_summary
ON building.name = women_summary.building
I tend to use common table expressions declared first instead of subqueries - it's far more readable (though not available in every database - but then neither is GROUP_CONCAT).
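Filled in with the question's sample data, the two-summaries idea might look like the sketch below. It runs through Python's sqlite3 module (SQLite also has GROUP_CONCAT) rather than MySQL or SQL Server, and the aliases m and w plus the column names men_count, women_count and total are made up for the example.

```python
import sqlite3

def building_report():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE building (name TEXT);
        CREATE TABLE men (building TEXT, name TEXT);
        CREATE TABLE women (building TEXT, name TEXT);
        INSERT INTO building VALUES ('building1'), ('building2'), ('building3');
        INSERT INTO men VALUES ('building1', 'andy'), ('building1', 'barry'),
                               ('building2', 'calvin'), (NULL, 'dwain');
        INSERT INTO women VALUES ('building1', 'alice'), ('building1', 'betty'),
                                 (NULL, 'casandra');
    """)
    rows = conn.execute("""
        SELECT b.name, m.men, w.women,
               COALESCE(m.men_count, 0) + COALESCE(w.women_count, 0) AS total
        FROM building AS b
        -- Each side is summarised on its own, so men and women are treated alike
        LEFT JOIN (SELECT building, GROUP_CONCAT(name) AS men, COUNT(*) AS men_count
                   FROM men GROUP BY building) AS m ON b.name = m.building
        LEFT JOIN (SELECT building, GROUP_CONCAT(name) AS women, COUNT(*) AS women_count
                   FROM women GROUP BY building) AS w ON b.name = w.building
        ORDER BY b.name
    """).fetchall()
    conn.close()
    return rows

for row in building_report():
    print(row)
```

The people with a NULL building form their own group inside each summary but never join to a building, so they drop out of the report. Note that GROUP_CONCAT does not promise any particular order of names within a group.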
Use Union to combine the data from the men/women tables
select building, [name] as menname, null as womenname from men
union
select building, null as menname, [name] as womenname from women
You now have a single 'table', admittedly in a subquery, against which you can join, count or whatever.
BTW I can see why Cas[s]andra is out in the cold as no-one believes her, but what about dwain, is he similarly cursed by the gods?