Combine two tables into a new one so that select rows from the other one are ignored - sql

I have two tables that have identical columns. I would like to join these two tables together into a third one that contains all the rows from the first one and from the second one all the rows that have a date that doesn't exist in the first table for the same location.
Example:
transactions:
date |location_code| product_code | quantity
------------+------------------+--------------+----------
2013-01-20 | ABC | 123 | -20
2013-01-23 | ABC | 123 | -13.158
2013-02-04 | BCD | 234 | -4.063
transactions2:
date |location_code| product_code | quantity
------------+------------------+--------------+----------
2013-01-20 | BDE | 123 | -30
2013-01-23 | DCF | 123 | -2
2013-02-05 | UXJ | 234 | -6
Desired result:
date |location_code| product_code | quantity
------------+------------------+--------------+----------
2013-01-20 | ABC | 123 | -20
2013-01-23 | ABC | 123 | -13.158
2013-01-23 | DCF | 123 | -2
2013-02-04 | BCD | 234 | -4.063
2013-02-05 | UXJ | 234 | -6
How would I go about this? I tried for example this:
SELECT date, location_code, product_code, type, quantity, location_type, updated_at
,period_start_date, period_end_date
INTO transactions_combined
FROM ( SELECT * FROM transactions_kitchen k
UNION ALL
SELECT *
FROM transactions_admin h
WHERE h.date NOT IN (SELECT k.date FROM k)
) AS t;
but that doesn't take into account that I'd like to include the rows that have the same date, but different location. I have Postgresql 9.2 in use.

UNION simply doesn't do what you describe. This query should:
CREATE TABLE transactions_combined AS
SELECT date, location_code, product_code, quantity
FROM transactions_kitchen k
UNION ALL
SELECT h.date, h.location_code, h.product_code, h.quantity
FROM transactions_admin h
LEFT JOIN transactions_kitchen k USING (location_code, date)
WHERE k.location_code IS NULL;
The LEFT JOIN / IS NULL construct excludes rows from the second table that share a location and date with a row in the first. See:
Select rows which are not present in other table
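As a rough illustration of the anti-join, here is a minimal sketch that runs the same pattern against an in-memory SQLite database through Python's sqlite3 module. SQLite stands in for Postgres only so the example is self-contained; the sample rows are hypothetical, and the join is written with ON instead of USING.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE transactions_kitchen (date TEXT, location_code TEXT,
                                       product_code INTEGER, quantity REAL);
    CREATE TABLE transactions_admin   (date TEXT, location_code TEXT,
                                       product_code INTEGER, quantity REAL);

    -- Hypothetical sample rows, not the asker's real data.
    INSERT INTO transactions_kitchen VALUES
        ('2013-01-20', 'ABC', 123, -20),
        ('2013-01-23', 'ABC', 123, -13.158);
    INSERT INTO transactions_admin VALUES
        ('2013-01-20', 'ABC', 123, -30),  -- same date and location: excluded
        ('2013-01-20', 'BDE', 123, -30),  -- same date, other location: kept
        ('2013-02-05', 'UXJ', 234, -6);   -- new date: kept

    CREATE TABLE transactions_combined AS
    SELECT date, location_code, product_code, quantity
    FROM   transactions_kitchen
    UNION ALL
    SELECT h.date, h.location_code, h.product_code, h.quantity
    FROM   transactions_admin h
    LEFT   JOIN transactions_kitchen k
           ON k.location_code = h.location_code AND k.date = h.date
    WHERE  k.location_code IS NULL;  -- keep only admin rows without a match
""")

combined = conn.execute(
    "SELECT date, location_code, quantity FROM transactions_combined "
    "ORDER BY date, location_code"
).fetchall()
for row in combined:
    print(row)
```

The admin row for ABC on 2013-01-20 is dropped because the kitchen table already has that location and date, while the BDE row on the same date survives.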
Use CREATE TABLE AS instead of SELECT INTO. The manual:
CREATE TABLE AS is functionally similar to SELECT INTO. CREATE TABLE AS is the recommended syntax, since this form of SELECT INTO
is not available in ECPG or PL/pgSQL, because they interpret the
INTO clause differently. Furthermore, CREATE TABLE AS offers a
superset of the functionality provided by SELECT INTO.
Or, if the target table already exists:
INSERT INTO transactions_combined (<list names of target column here!>)
SELECT ...
Aside: I would not use date as column name. It's a reserved word in every SQL standard and a function and data type name in Postgres.

Change UNION ALL to just UNION and it should return only unique rows from each table.

Related

SQL Server stored procedure inserting duplicate rows

I have a table with a column GetDup, and I'd like to duplicate records based on the value of this column. For example, if the value in GetDup is 1, then duplicate the record once. If the value is 2, then duplicate the record twice, and so on. The statement has to be inside a looping statement.
What will be a good way to write a stored procedures for this? Please help.
Input:
+--------+--------------+---------------+
| Getdup | CustomerName | CustomerAdd |
+--------+--------------+---------------+
| 1 | John | 123 SomeWhere |
| 2 | Bob | 987 SomeWhere |
+--------+--------------+---------------+
What I want:
+--------+--------------+---------------+
| Getdup | CustomerName | CustomerAdd |
+--------+--------------+---------------+
| 1 | John | 123 SomeWhere |
| 1 | John | 123 SomeWhere |
| 2 | Bob | 987 SomeWhere |
| 2 | Bob | 987 SomeWhere |
| 2 | Bob | 987 SomeWhere |
+--------+--------------+---------------+
Answer #2 After Clarification
Number Table to the Rescue!
The number table in my example (or tally table, if you want to call it that), is both temporary and very small. To make it bigger, just add more values to z and add more CROSS JOINs. In my opinion, a number table and a calendar table are both things that should be in every database you have. They are extremely useful.
SQL Fiddle
MS SQL Server 2017 Schema Setup:
CREATE TABLE mytable ( Getdup int, CustomerName varchar(10), CustomerAdd varchar(20) ) ;
INSERT INTO mytable (Getdup, CustomerName, CustomerAdd)
VALUES (1,'John','123 SomeWhere'), (2,'Bob','987 SomeWhere')
;
Query 1:
;WITH z AS (
SELECT *
FROM ( VALUES(0),(0),(0),(0) ) v(x)
)
, numTable AS (
SELECT num
FROM (
SELECT ROW_NUMBER() OVER (ORDER BY z1.x)-1 num
FROM z z1
CROSS JOIN z z2
) s1
)
SELECT t1.Getdup, t1.CustomerName, t1.CustomerAdd
FROM mytable t1
INNER JOIN numTable ON t1.getdup >= numTable.num
ORDER BY CustomerName, CustomerAdd
Results:
| Getdup | CustomerName | CustomerAdd |
|--------|--------------|---------------|
| 2 | Bob | 987 SomeWhere |
| 2 | Bob | 987 SomeWhere |
| 2 | Bob | 987 SomeWhere |
| 1 | John | 123 SomeWhere |
| 1 | John | 123 SomeWhere |
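The same number-table join can be tried outside SQL Server. The sketch below uses Python's sqlite3 module with an in-memory database, which is an assumption made only for portability. It also builds the numbers 0..15 arithmetically from two copies of z instead of using ROW_NUMBER(), a small variation on the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (Getdup INTEGER, CustomerName TEXT, CustomerAdd TEXT);
    INSERT INTO mytable VALUES
        (1, 'John', '123 SomeWhere'),
        (2, 'Bob',  '987 SomeWhere');
""")

rows = conn.execute("""
    WITH z(x) AS (VALUES (0), (1), (2), (3)),
    numTable(num) AS (
        -- 4 x 4 combinations of two base-4 digits give the numbers 0..15
        SELECT z1.x * 4 + z2.x
        FROM z z1
        CROSS JOIN z z2
    )
    SELECT t1.Getdup, t1.CustomerName, t1.CustomerAdd
    FROM mytable t1
    -- each row matches num = 0..Getdup, i.e. Getdup + 1 copies
    INNER JOIN numTable ON t1.Getdup >= numTable.num
    ORDER BY t1.CustomerName, t1.CustomerAdd
""").fetchall()

for row in rows:
    print(row)
```

Because num starts at 0, a row with Getdup = n comes back n + 1 times: the original plus n duplicates.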
--------------------------------------------------------------------------
ORIGINAL ANSWER
EDIT: After further clarification of the problem, this won't duplicate rows, this will only duplicate the data in a column.
Something like one of these might work.
T-SQL
SELECT replicate(mycolumn,getdup) AS x
FROM mytable
MySQL
SELECT repeat(mycolumn,getdup) AS x
FROM mytable
Oracle SQL
SELECT rpad(mycolumn,getdup*length(mycolumn),mycolumn) AS x
FROM mytable
PostgreSQL
SELECT repeat(mycolumn,getdup+1) AS x
FROM mytable
If you can provide more details for exactly what you want and what you're working with, we might be able to help you better.
NOTE 2: Depending on what you need, you may need to do some math magic. You say above that if GetDup is 1 then you want one duplicate. If that means that your output should be GetDupGetDup, then you'll want to add one in the repeat(), replicate() or rpad() functions, i.e. replicate(mycolumn,getdup+1). Oracle SQL will be a little different, since it uses rpad().
In standard SQL you can use a recursive CTE:
with recursive cte as (
select t.dup, . . .
from t
union all
select cte.dup - 1, . . .
from cte
where cte.dup > 1
)
select *
from cte;
Of course, not all databases support recursive CTEs (and the recursive keyword is not used in some of them).
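To make the template concrete, here is a minimal sketch of the recursive CTE filled in for the question's table, run on SQLite through Python's sqlite3 module (chosen only because it accepts the recursive keyword and needs no server). The extra column named remaining is a hypothetical counter, and the condition remaining > 0 is chosen so that each row appears Getdup + 1 times, as in the asker's desired output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (Getdup INTEGER, CustomerName TEXT, CustomerAdd TEXT);
    INSERT INTO mytable VALUES
        (1, 'John', '123 SomeWhere'),
        (2, 'Bob',  '987 SomeWhere');
""")

rows = conn.execute("""
    WITH RECURSIVE cte(Getdup, CustomerName, CustomerAdd, remaining) AS (
        -- anchor: every source row once, with its counter set to Getdup
        SELECT Getdup, CustomerName, CustomerAdd, Getdup
        FROM mytable
        UNION ALL
        -- recursive step: emit another copy until the counter reaches 0
        SELECT Getdup, CustomerName, CustomerAdd, remaining - 1
        FROM cte
        WHERE remaining > 0
    )
    SELECT Getdup, CustomerName, CustomerAdd
    FROM cte
    ORDER BY Getdup
""").fetchall()

for row in rows:
    print(row)
```

Changing the condition to remaining > 1 would give exactly Getdup copies instead, which matches the template above.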
So, you want a recursive solution:
with t as (
select Getdup, CustomerName, CustomerAdd, 0 as id
from table
union all
select Getdup, CustomerName, CustomerAdd, id + 1
from t
where id < getdup
)
insert into table (col1, col2, col3)
select Getdup, CustomerName, CustomerAdd
from t
order by getdup
option (maxrecursion 0);

Access query combine two tables with criteria

The code below references two tables. The tables are identical in structure; the only difference is the "PRICE" and "PRICE_DATE" values, because one is the same table created one year earlier. All I want is a new table that takes the latest price for each fund from each table. In addition, I want another column that calculates the growth.
The code below works for this purpose.
SELECT [2015_11_Fund_Prices].FUND_CODE, [2015_11_Fund_Prices].PRICE AS
[PRICE_#_112015], [2016_11_Fund_Prices].PRICE AS [PRICE_#_112016],
([2016_11_Fund_Prices].[PRICE]/[2015_11_Fund_Prices].[PRICE]-1) AS Growth INTO 2016_11_Monthly_Fund_Prices
FROM 2016_11_Fund_Prices INNER JOIN 2015_11_Fund_Prices ON [2016_11_Fund_Prices].FUND_CODE = [2015_11_Fund_Prices].FUND_CODE
GROUP BY [2015_11_Fund_Prices].FUND_CODE, [2015_11_Fund_Prices].PRICE_DATE, [2015_11_Fund_Prices].PRICE, [2016_11_Fund_Prices].PRICE, [2016_11_Fund_Prices].PRICE_DATE, ([2016_11_Fund_Prices].[PRICE]/[2015_11_Fund_Prices].[PRICE]-1)
HAVING ((([2015_11_Fund_Prices].PRICE_DATE)=#24/11/2015#) AND (([2016_11_Fund_Prices].PRICE_DATE)=#24/11/2016#));
However, this code assumes that the latest price is 24/11 in both tables. I want to replace this with a max function that will result in the query referencing only the price in the row with the highest date value.
Can anyone help?
The tables used are:
+-----------+------------+-------+
| Fund_Code | PRICE_DATE | PRICE |
+-----------+------------+-------+
| 1 | 12/12/12 | 1 |
| 1 | 13/12/12 | 1.2 |
| 1 | 14/12/12 | 1.1 |
| 2 | 12/12/12 | 1.12 |
| 2 | 13/12/12 | 1.13 |
| 2 | 14/12/12 | 1.11 |
So the second table is exactly the same but dates corresponding to the following year.
All I want is a table with:
Fund_Code Price1 Price2 Growth
Thanks
You need a sub-query like this:
SELECT FUND_CODE, MAX(PRICE_DATE) AS MaxPriceDate FROM 2016_11_Fund_Prices GROUP BY FUND_CODE
If you add this sub-query to the above and link it to the 2016_11_Fund_Prices table on FUND_CODE and PRICE_DATE=MaxPriceDate it should do what you need.
SELECT 2016_11_Fund_Prices.FUND_CODE, PRICE, PRICE_DATE
FROM 2016_11_Fund_Prices
INNER JOIN (SELECT FUND_CODE, MAX(PRICE_DATE) AS MaxPriceDate FROM 2016_11_Fund_Prices GROUP BY FUND_CODE) mp
ON 2016_11_Fund_Prices.FUND_CODE=mp.FUND_CODE AND 2016_11_Fund_Prices.PRICE_DATE=mp.MaxPriceDate
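Applying the same max-date sub-query to both years and joining the results on FUND_CODE gives the price pair and the growth in one statement. The following is only a sketch on SQLite via Python's sqlite3 module, not Access SQL; the table names prices_2015 and prices_2016 and all sample values are hypothetical, and dates are stored as ISO strings so that MAX() orders them correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE prices_2015 (FUND_CODE INTEGER, PRICE_DATE TEXT, PRICE REAL);
    CREATE TABLE prices_2016 (FUND_CODE INTEGER, PRICE_DATE TEXT, PRICE REAL);
    -- Hypothetical data; the latest date differs per fund and per table.
    INSERT INTO prices_2015 VALUES
        (1, '2015-11-23', 0.9), (1, '2015-11-24', 1.0),
        (2, '2015-11-20', 2.2), (2, '2015-11-23', 2.0);
    INSERT INTO prices_2016 VALUES
        (1, '2016-11-23', 1.4), (1, '2016-11-24', 1.5),
        (2, '2016-11-22', 2.5), (2, '2016-11-21', 2.4);
""")

growth = conn.execute("""
    SELECT a.FUND_CODE,
           a.PRICE AS price_2015,
           b.PRICE AS price_2016,
           b.PRICE / a.PRICE - 1 AS growth
    FROM prices_2015 a
    -- latest 2015 row per fund
    INNER JOIN (SELECT FUND_CODE, MAX(PRICE_DATE) AS MaxPriceDate
                FROM prices_2015 GROUP BY FUND_CODE) ma
            ON a.FUND_CODE = ma.FUND_CODE AND a.PRICE_DATE = ma.MaxPriceDate
    INNER JOIN prices_2016 b
            ON b.FUND_CODE = a.FUND_CODE
    -- latest 2016 row per fund
    INNER JOIN (SELECT FUND_CODE, MAX(PRICE_DATE) AS MaxPriceDate
                FROM prices_2016 GROUP BY FUND_CODE) mb
            ON b.FUND_CODE = mb.FUND_CODE AND b.PRICE_DATE = mb.MaxPriceDate
    ORDER BY a.FUND_CODE
""").fetchall()

for row in growth:
    print(row)
```

Each fund appears once, with the price from its own latest date in each year, so no hard-coded date is needed.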

SQL Where Query to Return Distinct Values

I have an app with a built-in initial SELECT that only allows me to enter the WHERE section. I have rows with duplicate values. I'm trying to get just one record for each distinct value but am unsure how to get the statement to work. I've found one that almost does the trick, but it doesn't return any of the rows that had a duplicate. I assume that is due to the =, so I just need a way to get one row for each value that matches my WHERE criteria. Examples below.
Initial Data Set
Date | Name | ANI | CallIndex | Duration
---------------------------------------------------------
2/2/2015 | John | 5555051000 | 00000.0001 | 60
2/2/2015 | John | | 00000.0001 | 70
3/1/2015 | Jim | 5555051001 | 00000.0012 | 80
3/4/2015 | Susan | | 00000.0022 | 90
3/4/2015 | Susan | 5555051002 | 00000.0022 | 30
4/10/2015 | April | 5555051003 | 00000.0030 | 35
4/11/2015 | Leon | 5555051004 | 00000.0035 | 10
4/15/2015 | Jane | 5555051005 | 00000.0050 | 20
4/15/2015 | Jane | 5555051005 | 00000.0050 | 60
4/15/2015 | Kevin | 5555051006 | 00000.0061 | 35
What I Want the Query to Return
Date | Name | ANI | CallIndex | Duration
---------------------------------------------------------
2/2/2015 | John | 5555051000 | 00000.0001 | 60
3/1/2015 | Jim | 5555051001 | 00000.0012 | 80
3/4/2015 | Susan | 5555051002 | 00000.0022 | 30
4/10/2015 | April | 5555051003 | 00000.0030 | 35
4/11/2015 | Leon | 5555051004 | 00000.0035 | 10
4/15/2015 | Jane | 5555051005 | 00000.0050 | 20
4/15/2015 | Kevin | 5555051006 | 00000.0061 | 35
Here is what I was able to get, but when I run it I don't get the rows that had duplicate CallIndex values. Duration doesn't matter, and the durations never match up, so if it helps to query using that as a filter, that would be fine. I've added mock data to assist.
use Database
SELECT * FROM table
WHERE Date between '4/15/15 00:00' and '4/15/15 23:59'
and callindex in
(SELECT callindex
FROM table
GROUP BY callindex
HAVING COUNT(callindex) = 1)
Any help would be greatly appreciated.
OK, with the assistance of everyone here I was able to get the query to work perfectly within SQL. That said, the app I'm trying this on apparently has a built-in character limit, and the query below is too long. This is the query I have to use given the restrictions, and I have to be able to search both IDs at the same time because calls get stamped with one or the other, rarely both. I'm hoping someone might be able to help me shorten it.
use Database
select * from tblCall
WHERE
flddate between '4/15/15 00:00' and '4/15/15 23:59'
and fldAgentLoginID='1234'
and fldcalldir='incoming'
and fldcalltype='external'
and EXISTS (SELECT * FROM (SELECT MAX(fldCallName) AS fldCallName, fldCallID FROM tblCall GROUP BY fldCallID) derv WHERE tblCall.fldCallName = derv.fldCallName AND tblCall.fldCallID = derv.fldCallID)
or
flddate between '4/15/15 00:00' and '4/15/15 23:59'
and fldPhoneLoginID='56789'
and fldcalldir='incoming'
and fldcalltype='external'
and EXISTS (SELECT * FROM (SELECT MAX(fldCallName) AS fldCallName, fldCallID FROM tblCall GROUP BY fldCallID) derv WHERE tblCall.fldCallName = derv.fldCallName AND tblCall.fldCallID = derv.fldCallID)
If the constraint is that we can only add to the WHERE clause, I don't think it's possible, due to there being 2 absolutely identical rows:
4/15/2015 | Jane | 5555051005 | 00000.0050
4/15/2015 | Jane | 5555051005 | 00000.0050
Is it possible that you can add HAVING or GROUP BY to the WHERE? or possibly UNION the SELECT to another SELECT statement? That may open up some additional possibilities.
Maybe with a union:
SELECT *
FROM table
GROUP BY Date, Name, ANI, CallIndex
HAVING ( COUNT(*) > 1 )
UNION
SELECT *
FROM table
WHERE Name not in (SELECT name from table
GROUP BY Date, Name, ANI, CallIndex
HAVING ( COUNT(*) > 1 ))
From your sample, it seems like you could just exclude rows in which there was no value in the ANI column. If that is the case you could simply do:
use Database
SELECT * FROM table
WHERE Date between '4/15/15 00:00' and '4/15/15 23:59'
and ANI is not null
If this doesn't work for you, let me know and I can see what else I can do.
Edit:
You've made it sound like the CallIndex combined with the Duration is a unique value. That seems somewhat doubtful to me, but if that is the case you could do something like this:
use Database
SELECT * FROM table
WHERE Date between '4/15/15 00:00' and '4/15/15 23:59'
and cast(callindex as varchar(80))+'-'+cast(min(duration) as varchar(80)) in
(SELECT cast(callindex as varchar(80))+'-'+cast(min(duration) as varchar(80))
FROM table
GROUP BY callindex)
There are two keywords you can use to get non-duplicated data, either DISTINCT or GROUP BY. In this case, I would use a GROUP BY, but you should read up on both.
This query groups all of the records by CallIndex and takes the MAX value for each of the other columns and should give you the results you want:
SELECT MAX(Date) AS Date, MAX(Name) AS Name, MAX(ANI) AS ANI, CallIndex
FROM table
GROUP BY CallIndex
EDIT
Since you can't use GROUP BY directly but you can have any SQL in the WHERE clause you can do:
SELECT *
FROM table
WHERE EXISTS
(
SELECT *
FROM
(
SELECT MAX(Date) AS Date, MAX(Name) AS Name, MAX(ANI) AS ANI, CallIndex
FROM table
GROUP BY CallIndex
) derv
WHERE table.Date = derv.Date
AND table.Name = derv.Name
AND table.ANI = derv.ANI
AND table.CallIndex = derv.CallIndex
)
This selects all rows from the table where there exists a matching row from the GROUP BY.
It won't be perfect: if any two rows match exactly, you'll still have duplicates, but that's the best you'll get with your restriction.
In your data, why not just do this?
SELECT *
FROM table
WHERE Date >= '2015-04-15' and Date < '2015-04-16'
AND ani is not null;
If the blank values are only a coincidence, then you have a problem just using a where clause. If the results are full duplicates (no column has a different value), then you probably cannot do what you want with just a where clause -- unless you are using SQLite, Oracle, or Postgres.
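The exception for SQLite, Oracle, and Postgres presumably refers to the hidden row identifier those systems expose (rowid in SQLite, ROWID in Oracle, ctid in Postgres), which can tell apart rows that are otherwise identical. The sketch below shows that idea on SQLite through Python's sqlite3 module; the table name calls and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE calls (Date TEXT, Name TEXT, ANI TEXT,
                        CallIndex TEXT, Duration INTEGER);
    INSERT INTO calls VALUES
        ('2015-02-02', 'John', '5555051000', '00000.0001', 60),
        ('2015-02-02', 'John', NULL,         '00000.0001', 70),
        ('2015-03-01', 'Jim',  '5555051001', '00000.0012', 80),
        -- two fully identical rows
        ('2015-04-15', 'Jane', '5555051005', '00000.0050', 20),
        ('2015-04-15', 'Jane', '5555051005', '00000.0050', 20);
""")

# Only the WHERE clause does the de-duplication: keep the row with the
# smallest rowid in each CallIndex group.
deduped = conn.execute("""
    SELECT * FROM calls
    WHERE rowid IN (SELECT MIN(rowid) FROM calls GROUP BY CallIndex)
    ORDER BY Date
""").fetchall()

for row in deduped:
    print(row)
```

MIN(rowid) keeps whichever row was inserted first for each CallIndex, so this does not prefer rows with a filled-in ANI; that would need an extra condition.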

PostgreSQL return multiple rows with DISTINCT though only latest date per second column

Let's say I have the following database table (date truncated for example only; the two columns with the 'id_' prefix join with other tables)...
+-----------+---------+------+--------------------+-------+
| id_table1 | id_tab2 | date | description | price |
+-----------+---------+------+--------------------+-------+
| 1 | 11 | 2014 | man-eating-waffles | 1.46 |
+-----------+---------+------+--------------------+-------+
| 2 | 22 | 2014 | Flying Shoes | 8.99 |
+-----------+---------+------+--------------------+-------+
| 3 | 44 | 2015 | Flying Shoes | 12.99 |
+-----------+---------+------+--------------------+-------+
...and I have a query like the following...
SELECT id, date, description FROM inventory ORDER BY date ASC;
How do I SELECT all the descriptions, but only once each, while also returning only the latest year for that description? So I need the database query to return the first and last rows from the sample data above; the second is not returned because the last row has a later date.
Postgres has something called distinct on. This is usually more efficient than using window functions. So, an alternative method would be:
SELECT distinct on (description) id, date, description
FROM inventory
ORDER BY description, date desc;
The row_number window function should do the trick:
SELECT id, date, description
FROM (SELECT id, date, description,
ROW_NUMBER() OVER (PARTITION BY description
ORDER BY date DESC) AS rn
FROM inventory) t
WHERE rn = 1
ORDER BY date ASC;
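The row_number variant is portable enough to try without a Postgres server. Here is a small sketch on SQLite through Python's sqlite3 module, assuming the bundled SQLite is version 3.25 or newer (the first release with window functions). The column named id is taken from the question's query, and the rows are the sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE inventory (id INTEGER, date INTEGER,
                            description TEXT, price REAL);
    INSERT INTO inventory VALUES
        (1, 2014, 'man-eating-waffles', 1.46),
        (2, 2014, 'Flying Shoes',       8.99),
        (3, 2015, 'Flying Shoes',      12.99);
""")

latest = conn.execute("""
    SELECT id, date, description
    FROM (SELECT id, date, description,
                 -- number the rows of each description, newest first
                 ROW_NUMBER() OVER (PARTITION BY description
                                    ORDER BY date DESC) AS rn
          FROM inventory) t
    WHERE rn = 1
    ORDER BY date ASC
""").fetchall()

for row in latest:
    print(row)
```

Only the 2015 row for Flying Shoes survives, together with the single row for man-eating-waffles.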

sql insert from table to table

I have a table Farm with these columns
FarmID:(primary)
Kenizy:
BarBedo:
BarBodo:
MorKodo:
These columns are palm types in some language. Each of those columns contains a number indicating how many palms of that type are inside a farm.
Example:
FarmID | Kenizy | BarBedo | BarBodo | MorKodo
-----------------------------------------------
3 | 20 | 12 | 45 | 60
22 | 21 | 9 | 41 | 3
I want to insert that table into the following tables:
Table Palm_Farm
FarmID:(primary)
PalmID:(primary)
PalmTypeName:
Count:
That table connects each farm with each palm type.
Example:
FarmID | PalmID | PalmTypeName | Count
-----------------------------------------------
3 | 1 | Kenizy | 20
3 | 2 | BarBedo | 12
3 | 3 | BarBodo | 45
3 | 4 | MorKodo | 60
22 | 1 | Kenizy | 21
22 | 2 | BarBedo | 9
22 | 3 | BarBodo | 41
22 | 4 | MorKodo | 3
I have to use the following table Palms in order to take the PalmID column.
PalmID:(primary)
PalmTypeName:
...other not important columns
This table is to save information about each palm type.
Example:
PalmID | PalmTypeName
-------------------------
1 | Kenizy
2 | BarBedo
3 | BarBodo
4 | MorKodo
The PalmTypeName column has the value the same as the COLUMN NAMES in the Farm table.
So my question is:
How to insert the data from Farm table to Palm_Farm considering that the PalmID exist in the Palm table
I hope I have made my question clear. I tried to solve the problem myself, but I couldn't work out how to turn the column names in the Farm table into column values in the Palm_Farm table.
I can't change the table structure because we are trying to help a customer with this already existing tables
I am using SQL Server 2008 so Merge is welcomed.
Update
After the genius answer by #GarethD, I got this exception
You can use UNPIVOT to turn the columns into rows:
INSERT Palm_Farm (FarmID, PalmID, PalmTypeName, [Count])
SELECT upvt.FarmID,
p.PalmID,
p.PalmTypeName,
upvt.[Count]
FROM Farm AS f
UNPIVOT
( [Count]
FOR PalmTypeName IN ([Kenizy], [BarBedo], [BarBodo], [MorKodo])
) AS upvt
INNER JOIN Palms AS p
ON p.PalmTypeName = upvt.PalmTypeName;
Example on SQL Fiddle
The docs for UNPIVOT state:
UNPIVOT performs almost the reverse operation of PIVOT, by rotating columns into rows. Suppose the table produced in the previous example is stored in the database as pvt, and you want to rotate the column identifiers Emp1, Emp2, Emp3, Emp4, and Emp5 into row values that correspond to a particular vendor. This means that you must identify two additional columns. The column that will contain the column values that you are rotating (Emp1, Emp2,...) will be called Employee, and the column that will hold the values that currently reside under the columns being rotated will be called Orders. These columns correspond to the pivot_column and value_column, respectively, in the Transact-SQL definition.
To explain further how unpivot works, I will look at the first row original table:
FarmID | Kenizy | BarBedo | BarBodo | MorKodo
-----------------------------------------------
3 | 20 | 12 | 45 | 60
So what UNPIVOT will do is look for columns specified in the UNPIVOT statement, and create a row for each column:
SELECT upvt.FarmID, upvt.PalmTypeName, upvt.[Count]
FROM Farm AS f
UNPIVOT
( [Count]
FOR PalmTypeName IN ([Kenizy], [BarBedo])
) AS upvt;
So here you are saying, for every row find the columns [Kenizy] and [BarBedo] and create a row for each, then for each of these rows create a new column called PalmTypeName that will contain the column name used, then put the value of that column into a new column called [Count]. Giving a result of:
FarmID | PalmTypeName | Count |
---------------------------
3 | Kenizy | 20 |
3 | BarBedo | 12 |
If you are running SQL Server 2000, or a later version with a lower compatibility level, then you may need to use a different query:
INSERT Palm_Farm (FarmID, PalmID, PalmTypeName, [Count])
SELECT f.FarmID,
p.PalmID,
p.PalmTypeName,
[Count] = CASE upvt.PalmTypeName
WHEN 'Kenizy' THEN f.Kenizy
WHEN 'BarBedo' THEN f.BarBedo
WHEN 'BarBodo' THEN f.BarBodo
WHEN 'MorKodo' THEN f.MorKodo
END
FROM Farm AS f
CROSS JOIN
( SELECT PalmTypeName = 'Kenizy' UNION ALL
SELECT PalmTypeName = 'BarBedo' UNION ALL
SELECT PalmTypeName = 'BarBodo' UNION ALL
SELECT PalmTypeName = 'MorKodo'
) AS upvt
INNER JOIN Palms AS p
ON p.PalmTypeName = upvt.PalmTypeName;
This is similar, but you have to create the additional rows yourself using UNION ALL inside the subquery upvt, then choose the value for [Count] using a case expression.
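Because the CROSS JOIN plus CASE form uses no vendor-specific syntax, it can be checked on any engine. The following sketch runs it on SQLite through Python's sqlite3 module, which is an assumption made only so that the example is self-contained; the table definitions are guesses based on the question's column lists.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Farm (FarmID INTEGER PRIMARY KEY, Kenizy INTEGER,
                       BarBedo INTEGER, BarBodo INTEGER, MorKodo INTEGER);
    CREATE TABLE Palms (PalmID INTEGER PRIMARY KEY, PalmTypeName TEXT);
    CREATE TABLE Palm_Farm (FarmID INTEGER, PalmID INTEGER, PalmTypeName TEXT,
                            "Count" INTEGER, PRIMARY KEY (FarmID, PalmID));

    INSERT INTO Farm VALUES (3, 20, 12, 45, 60), (22, 21, 9, 41, 3);
    INSERT INTO Palms VALUES
        (1, 'Kenizy'), (2, 'BarBedo'), (3, 'BarBodo'), (4, 'MorKodo');

    INSERT INTO Palm_Farm (FarmID, PalmID, PalmTypeName, "Count")
    SELECT f.FarmID, p.PalmID, p.PalmTypeName,
           -- pick the Farm column whose name matches the generated row
           CASE upvt.PalmTypeName
               WHEN 'Kenizy'  THEN f.Kenizy
               WHEN 'BarBedo' THEN f.BarBedo
               WHEN 'BarBodo' THEN f.BarBodo
               WHEN 'MorKodo' THEN f.MorKodo
           END
    FROM Farm AS f
    CROSS JOIN
        ( SELECT 'Kenizy' AS PalmTypeName UNION ALL
          SELECT 'BarBedo' UNION ALL
          SELECT 'BarBodo' UNION ALL
          SELECT 'MorKodo'
        ) AS upvt
    INNER JOIN Palms AS p
            ON p.PalmTypeName = upvt.PalmTypeName;
""")

palm_farm = conn.execute(
    'SELECT FarmID, PalmID, PalmTypeName, "Count" FROM Palm_Farm '
    'ORDER BY FarmID, PalmID'
).fetchall()
for row in palm_farm:
    print(row)
```

Each farm row fans out into four Palm_Farm rows, one per palm type, with PalmID looked up from Palms by name.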
To update when the row already exists, you can use MERGE:
WITH Data AS
( SELECT upvt.FarmID,
p.PalmID,
p.PalmTypeName,
upvt.[Count]
FROM Farm AS f
UNPIVOT
( [Count]
FOR PalmTypeName IN ([Kenizy], [BarBedo], [BarBodo], [MorKodo])
) AS upvt
INNER JOIN Palms AS p
ON p.PalmTypeName = upvt.PalmTypeName
)
MERGE Palm_Farm WITH (HOLDLOCK) AS pf
USING Data AS d
ON d.FarmID = pf.FarmID
AND d.PalmID = pf.PalmID
WHEN NOT MATCHED BY TARGET THEN
INSERT (FarmID, PalmID, PalmTypeName, [Count])
VALUES (d.FarmID, d.PalmID, d.PalmTypeName, d.[Count])
WHEN MATCHED THEN
UPDATE
SET [Count] = d.[Count],
PalmTypeName = d.PalmTypeName;