How to join 2 tables without relationship in SQL - sql

I am struggling to combine three tables with an outer join, but can't get it right.
I am using a PostgreSQL database.
My tables look like this:
My Orders table has foreign keys to Customers and Years (these two have no relation to each other).
Some years and customers are not present in my Orders table.
I am trying to write an SQL query so that each customer has a row for every year. I can get this result by cross joining Customers and Years, but this approach doesn't work when I continue to join the result with other tables. (My Orders table has other foreign keys that I also have to join.) So can I get this result by using an outer join instead?
I have tried:
SELECT * FROM Orders
RIGHT JOIN Customers ON Orders.customer_id = Customers.id
That gives me all the customers, even those who don't have an order. I would then like to do the same with the years, so that every customer has one row for each year (2015-2020). I have tried another right join with the Years table, but it doesn't work. Does anyone know how to fix this?
P.S. The table names are not the real ones; I just used these names to make it easier to understand!

If you want every customer to have a row for each year, along with their respective orders for that year if they have any:
SELECT CUS.ID AS CustomerID, CUS.Name AS CustomerName, YEA.year AS Year, ORD.ID AS OrderID
FROM Customers AS CUS
CROSS JOIN Years AS YEA
LEFT JOIN Orders AS ORD
ON CUS.ID = ORD.Customer_ID
AND ORD.YEAR_ID = YEA.ID;
This will give you every customer/year combination, with their orders as separate rows.
If you want the number of orders use this instead:
SELECT CUS.ID AS CustomerID, CUS.Name AS CustomerName, YEA.year AS Year, Count(ORD.ID) AS NrOfOrders
FROM Customers AS CUS
CROSS JOIN Years AS YEA
LEFT JOIN Orders AS ORD
ON CUS.ID = ORD.Customer_ID
AND ORD.YEAR_ID = YEA.ID
GROUP BY CUS.ID, CUS.Name, YEA.Year;
Which will show the number of orders and give each customer a row per year.
Try it out/see it in action here, I added some data to show the multiple orders.
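The counting query above can be tried locally with a minimal sketch like the one below. It uses Python's built-in `sqlite3` module rather than PostgreSQL, and the table contents and the column names `nm`, `yr`, `customer_id` and `year_id` are assumptions made up for the illustration, not the asker's real schema.

```python
import sqlite3

# In-memory SQLite database standing in for the PostgreSQL one (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, nm TEXT);
    CREATE TABLE years     (id INTEGER PRIMARY KEY, yr INTEGER);
    CREATE TABLE orders    (id INTEGER PRIMARY KEY, customer_id INTEGER, year_id INTEGER);
    INSERT INTO customers VALUES (1, 'alpha'), (2, 'bravo'), (3, 'charlie');
    INSERT INTO years     VALUES (1, 2015), (2, 2016), (3, 2017);
    -- alpha has two orders in 2015, bravo one in 2017, charlie none at all
    INSERT INTO orders    VALUES (1, 1, 1), (2, 1, 1), (3, 2, 3);
""")

# CROSS JOIN builds every customer/year pair; the LEFT JOIN then attaches
# orders that match BOTH the customer and the year. COUNT(o.id) ignores the
# NULLs produced for pairs without an order, so those pairs count as 0.
rows = conn.execute("""
    SELECT c.nm, y.yr, COUNT(o.id) AS nr_of_orders
    FROM customers AS c
    CROSS JOIN years AS y
    LEFT JOIN orders AS o
           ON o.customer_id = c.id
          AND o.year_id = y.id
    GROUP BY c.id, c.nm, y.yr
    ORDER BY c.nm, y.yr
""").fetchall()

for row in rows:
    print(row)
```

Further tables referenced by Orders can be added as additional LEFT JOINs on `o`, so the customer/year grid stays complete.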

A CROSS JOIN of customers and years, then a LEFT JOIN of that with orders seems to do exactly what you describe in your post. Or am I getting something wrong?
And don't put your data into an image next time. Just type it /paste it into your post, so we can copy-paste it into our examples ... I had to re-type them...
And "name" and "year" are reserved words. I avoided them in my example.
\pset null NULL
WITH
-- your input ...
customers ( id, nm) AS (
SELECT 1,'alpha'
UNION ALL SELECT 2,'bravo'
UNION ALL SELECT 3,'charlie'
UNION ALL SELECT 4,'delta'
UNION ALL SELECT 5,'echo'
UNION ALL SELECT 6,'foxtrot'
)
,
years(id,yr) AS (
SELECT 1,2015
UNION ALL SELECT 2,2016
UNION ALL SELECT 3,2017
UNION ALL SELECT 4,2018
UNION ALL SELECT 5,2019
UNION ALL SELECT 6,2020
)
,
orders(id,cust_id,yr_id) AS (
SELECT 1,1,1
UNION ALL SELECT 2,2,3
UNION ALL SELECT 3,4,5
UNION ALL SELECT 4,5,6
)
-- end of your input ...
SELECT
cust.nm
, years.yr
, ord.id
FROM customers cust
CROSS JOIN years
LEFT JOIN orders ord ON ord.yr_id=years.id AND ord.cust_id=cust.id
ORDER BY 1,2
;
-- out nm | yr | id
-- out ---------+------+------
-- out alpha | 2015 | 1
-- out alpha | 2016 | NULL
-- out alpha | 2017 | NULL
-- out alpha | 2018 | NULL
-- out alpha | 2019 | NULL
-- out alpha | 2020 | NULL
-- out bravo | 2015 | NULL
-- out bravo | 2016 | NULL
-- out bravo | 2017 | 2
-- out bravo | 2018 | NULL
-- out bravo | 2019 | NULL
-- out bravo | 2020 | NULL
-- out charlie | 2015 | NULL
-- out charlie | 2016 | NULL
-- out charlie | 2017 | NULL
-- out charlie | 2018 | NULL
-- out charlie | 2019 | NULL
-- out charlie | 2020 | NULL
-- out delta | 2015 | NULL
-- out delta | 2016 | NULL
-- out delta | 2017 | NULL
-- out delta | 2018 | NULL
-- out delta | 2019 | 3
-- out delta | 2020 | NULL
-- out echo | 2015 | NULL
-- out echo | 2016 | NULL
-- out echo | 2017 | NULL
-- out echo | 2018 | NULL
-- out echo | 2019 | NULL
-- out echo | 2020 | 4
-- out foxtrot | 2015 | NULL
-- out foxtrot | 2016 | NULL
-- out foxtrot | 2017 | NULL
-- out foxtrot | 2018 | NULL
-- out foxtrot | 2019 | NULL
-- out foxtrot | 2020 | NULL

Related

SQL How to select contents of a row directly above a row, and move both into a new table

I am trying to write something to automatically clean up some travel data. See these as flights:
FLIGHTS:
ID | DocType | Name   | Travel Date | Fare Paid
1  | INV     | Mrs G  | 13/03/2017  | 37.6
2  | INV     | Mrs G  | 13/03/2017  | 200
3  | INV     | Mr H   | 14/03/2017  | 60
4  | INV     | Mr H   | 15/03/2017  | 126
5  | CRN     | Mr H   | 15/03/2017  | 126
6  | INV     | Mr H   | 20/03/2017  | 126
7  | INV     | Mrs S  | 29/03/2017  | 110
8  | INV     | Mr J   | 26/03/2017  | 54
9  | INV     | Mr R   | 13/03/2017  | 200
10 | INV     | Miss C | 27/03/2017  | 78.98
Sometimes people buy a flight and then get a refund. This shows up as two identical entries in the data, except that the refund is DocType 'CRN'. I need to be able to pull both the booking and the refund line out of the dataset.
I can do this for the CRN tagged rows. But how can I pull out rows that are immediately above the CRN rows? The ID of the related INV row will always have an ID that is directly and sequentially lower than the CRN row.
I have managed
INSERT INTO TRAVEL.REFUNDS (ID, DocType, Name, [Travel Date], [Fare Paid])
SELECT ID, DocType, Name, [Travel Date], [Fare Paid]
FROM TRAVEL.FLIGHTS
WHERE [DocType] = 'CRN';
GO
Thank you in advance
using exists():
select *
from t
where DocType = 'CRN'
or exists (
select 1
from t i
where i.DocType='CRN'
and i.id-1 = t.id
)
or a left join
select t.*
from t
left join t i
on i.id-1 = t.id
where t.DocType = 'CRN'
or i.DocType = 'CRN'
rextester demo: rextester.com/MSGGX10058
returns:
+----+---------+--------+------------+----------+
| ID | DocType | Name | TravelDate | FarePaid |
+----+---------+--------+------------+----------+
| 4 | INV | Mr H | 15.03.2017 | 126.00 |
| 5 | CRN | Mr H | 15.03.2017 | 126.00 |
+----+---------+--------+------------+----------+
using not exists() for the opposite result set:
select *
from t
where DocType = 'INV'
and not exists (
select 1
from t i
where i.DocType='CRN'
and i.id-1 = t.id
)
returns:
+----+---------+--------+------------+----------+
| ID | DocType | Name | TravelDate | FarePaid |
+----+---------+--------+------------+----------+
| 1 | INV | Mrs G | 13.03.2017 | 37.60 |
| 2 | INV | Mrs G | 13.03.2017 | 200.00 |
| 3 | INV | Mr H | 14.03.2017 | 60.00 |
| 6 | INV | Mr H | 20.03.2017 | 126.00 |
| 7 | INV | Mrs S | 29.03.2017 | 110.00 |
| 8 | INV | Mr J | 26.03.2017 | 54.00 |
| 9 | INV | Mr R | 13.03.2017 | 200.00 |
| 10 | INV | Miss C | 27.03.2017 | 78.98 |
+----+---------+--------+------------+----------+
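Both result sets above can be reproduced with a small sketch like the following. It runs the same `EXISTS` / `NOT EXISTS` logic in SQLite through Python's `sqlite3` module instead of SQL Server; the table name `flights` and the trimmed-down columns are assumptions for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flights (id INTEGER PRIMARY KEY, doc_type TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO flights VALUES (?, ?, ?)",
    [
        (1, "INV", "Mrs G"), (2, "INV", "Mrs G"), (3, "INV", "Mr H"),
        (4, "INV", "Mr H"), (5, "CRN", "Mr H"), (6, "INV", "Mr H"),
        (7, "INV", "Mrs S"), (8, "INV", "Mr J"), (9, "INV", "Mr R"),
        (10, "INV", "Miss C"),
    ],
)

# A row belongs to a refund pair if it is a CRN itself, or if the row whose
# id is exactly one higher is a CRN.
refund_pairs = [r[0] for r in conn.execute("""
    SELECT t.id
    FROM flights AS t
    WHERE t.doc_type = 'CRN'
       OR EXISTS (SELECT 1 FROM flights AS i
                  WHERE i.doc_type = 'CRN' AND i.id - 1 = t.id)
    ORDER BY t.id
""")]

# The opposite set: bookings that were not refunded by the next row.
kept = [r[0] for r in conn.execute("""
    SELECT t.id
    FROM flights AS t
    WHERE t.doc_type = 'INV'
      AND NOT EXISTS (SELECT 1 FROM flights AS i
                      WHERE i.doc_type = 'CRN' AND i.id - 1 = t.id)
    ORDER BY t.id
""")]

print(refund_pairs)
print(kept)
```

Note that this relies on the stated guarantee that the booking's ID is always exactly one lower than the refund's ID; a gap in the ID sequence would break the pairing.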
This is for SELECT purposes, not sure if you wanted that or INSERT or DELETE, but hopefully it's easily reworkable to those, plus it's good to check before modifying, right?
What I'm doing is, I'm using LAG/LEAD to add one new column, which is mostly a copy of some other row column, although shifted one row up or down. With that, you'll have each row containing everything needed to decide what to do with it, which will be done in the higher query that targets the lower query's results.
-- Making an MCVE, first time I know its name though.
DECLARE @Flights TABLE (ID int, DocType char(3))
INSERT INTO @Flights VALUES
( 1, 'INV')
, ( 2, 'INV')
, ( 3, 'INV') -- Should not show up.
, ( 4, 'CRN') -- Should not show up.
, ( 5, 'INV')
, ( 6, 'INV')
, ( 7, 'INV')
, ( 8, 'INV')
, ( 9, 'INV')
, (10, 'INV') -- Should not show up.
, (11, 'CRN') -- Should not show up.
-- Querying via LEAD(), with 1 level nesting (or subquerying, I dunno which is which).
SELECT *
FROM (
SELECT ID
, DocType AS DocTypeThis
, LEAD(DocType) OVER(ORDER BY ID ASC) AS DocTypeOther -- Seems like the choice of ASC/DESC reverses LAG/LEAD behaviours into each other, although not sure.
FROM @Flights
) AS T
WHERE (DocTypeOther IS NULL AND DocTypeThis = 'INV') -- Special treatment for last row (for other implementations, might be first row).
OR DocTypeThis = DocTypeOther -- This is the core of the filtering; it fails only when the row is a 'CRN' or is directly followed by a 'CRN'.
ORDER BY ID ASC

DB2 query to find average sale for each item 1 year previous

Having some trouble figuring out how to write this query.
In general I have a table with
sales_ID
Employee_ID
sale_date
sale_price
what I want to do is have a view that shows for each sales item how much the employee on average sells for 1 year previous of the sale_date.
example: Suppose I have this in the sales table
sales_ID employee_id sale_date sale_price
1 Bob 2016/06/10 100
2 Bob 2016/01/01 75
3 Bob 2014/01/01 475
4 Bob 2015/12/01 100
5 Bob 2016/05/01 200
6 Fred 2016/01/01 30
7 Fred 2015/05/01 50
for sales_id 1 record I want to pull all sales from Bob by 1 year up to the month of the sale (so 2015-05-01 to 2016-05-31 which has 3 sales for 75, 100, 200) so the final output would be
sales_ID employee_id sale_date sale_price avg_sale
1 Bob 2016/06/10 100 125
2 Bob 2016/01/01 75 275
3 Bob 2014/01/01 475 null
4 Bob 2015/12/01 100 475
5 Bob 2016/05/01 200 87.5
6 Fred 2016/01/01 30 50
7 Fred 2015/05/01 50 null
What I tried doing is something like this
select a.sales_ID, a.sale_price, a.employee_ID, a.sale_date, b.avg_price
from sales a
left join (
select employee_id, avg(sale_price) as avg_price
from sales
where sale_date between Date(VARCHAR(YEAR(a.sale_date)-1) ||'-'|| VARCHAR(MONTH(a.sale_date)-1) || '-01')
and Date(VARCHAR(YEAR(a.sale_date)) ||'-'|| VARCHAR(MONTH(a.sale_date)) || '-01') -1 day
group by employee_id
) b on a.employee_id = b.employee_id
but DB2 doesn't like the reference to the parent table a inside the subquery, and I can't think of how to write this query properly. Any thoughts?
Ok. I think I figured it out. Please note 3 things.
I couldn't test it in DB2, so I used Oracle. But syntax would be more or less same.
I didn't use your 1 year logic exactly. I am counting current_date minus 365 days, but you can change the between part in where clause in inner query, as you mentioned in the question.
The expected output you mentioned is incorrect. So for every sale_id, I took the date, found the employee_id, took all the sales of that employee for last 1 year, excluding the current date, and then took average. If you want to change it, you can change the where clause in subquery.
select t1.*,t2.avg_sale
from
sales t1
left join
(
select a.sales_id
,avg(b.sale_price) as avg_sale
from sales a
inner join
sales b
on a.employee_id=b.employee_id
where b.sale_date between a.sale_date - 365 and a.sale_date -1
group by a.sales_id
) t2
on t1.sales_id=t2.sales_id
order by t1.sales_id
Output
+----------+-------------+-------------+------------+----------+
| SALES_ID | EMPLOYEE_ID | SALE_DATE | SALE_PRICE | AVG_SALE |
+----------+-------------+-------------+------------+----------+
| 1 | Bob | 10-JUN-2016 | 100 | 125 |
| 2 | Bob | 01-JAN-2016 | 75 | 100 |
| 3 | Bob | 01-JAN-2014 | 475 | |
| 4 | Bob | 01-DEC-2015 | 100 | |
| 5 | Bob | 01-MAY-2016 | 200 | 87.5 |
| 6 | Fred | 01-JAN-2016 | 30 | 50 |
| 7 | Fred | 01-MAY-2015 | 50 | |
+----------+-------------+-------------+------------+----------+
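As a rough cross-check of the output above, the same 365-day window can be written as a correlated scalar subquery. The sketch below runs in SQLite via Python's `sqlite3` module, not DB2 or Oracle; storing the dates as ISO text and using SQLite's `julianday()` for the day arithmetic are assumptions of this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE sales (
        sales_id    INTEGER PRIMARY KEY,
        employee_id TEXT,
        sale_date   TEXT,   -- ISO dates, so julianday() can parse them
        sale_price  REAL
    )
""")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        (1, "Bob", "2016-06-10", 100), (2, "Bob", "2016-01-01", 75),
        (3, "Bob", "2014-01-01", 475), (4, "Bob", "2015-12-01", 100),
        (5, "Bob", "2016-05-01", 200), (6, "Fred", "2016-01-01", 30),
        (7, "Fred", "2015-05-01", 50),
    ],
)

# For each sale, average the same employee's sales from the 365 days before
# it, excluding the sale date itself. AVG over no rows yields NULL (None).
rows = conn.execute("""
    SELECT a.sales_id,
           (SELECT AVG(b.sale_price)
            FROM sales AS b
            WHERE b.employee_id = a.employee_id
              AND julianday(b.sale_date)
                  BETWEEN julianday(a.sale_date) - 365
                      AND julianday(a.sale_date) - 1) AS avg_sale
    FROM sales AS a
    ORDER BY a.sales_id
""").fetchall()

avg_by_id = dict(rows)
print(avg_by_id)
```

To follow the month-aligned window from the question instead, only the two bounds inside the `BETWEEN` need to change.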
You can almost fix your original query by doing a LATERAL join. Lateral allows you to reference previously declared tables as in:
select a.sales_ID, a.sale_price, a.employee_ID, a.sale_date, b.avg_price
from sales a
left join LATERAL (
select employee_id, avg(sale_price) as avg_price
from sales
where sale_date between Date(VARCHAR(YEAR(a.sale_date)-1) ||'-'|| VARCHAR(MONTH(a.sale_date)-1) || '-01')
and Date(VARCHAR(YEAR(a.sale_date)) ||'-'|| VARCHAR(MONTH(a.sale_date)) || '-01') -1 day
group by employee_id
) b on a.employee_id = b.employee_id
However, I get a syntax error from your date arithmetic, so using @Utsav's solution for this yields:
select a.sales_ID, a.sale_price, a.employee_ID, a.sale_date, b.avg_price
from sales a
left join lateral (
select employee_id, avg(sale_price) as avg_price
from sales b
where a.employee_id = b.employee_id
and b.sale_date between a.sale_date - 365 and a.sale_date -1
group by employee_id
) b on a.employee_id = b.employee_id
Since we already pushed the predicate inside the LATERAL join, it is strictly speaking not necessary to use the on clause:
select a.sales_ID, a.sale_price, a.employee_ID, a.sale_date, b.avg_price
from sales a
left join lateral (
select employee_id, avg(sale_price) as avg_price
from sales b
where a.employee_id = b.employee_id
and b.sale_date between a.sale_date - 365 and a.sale_date -1
group by employee_id
) b on 1=1
By using a LATERAL join we removed one access against the sales table. A comparison of the plans shows:
No LATERAL Join
Access Plan:
Total Cost: 20,4571
Query Degree: 1
Rows
RETURN
( 1)
Cost
I/O
|
7
>MSJOIN
( 2)
20,4565
3
/---+----\
7 0,388889
TBSCAN FILTER
( 3) ( 6)
6,81572 13,6402
1 2
| |
7 2,72222
SORT GRPBY
( 4) ( 7)
6,81552 13,6397
1 2
| |
7 2,72222
TBSCAN TBSCAN
( 5) ( 8)
6,81488 13,6395
1 2
| |
7 2,72222
TABLE: LELLE SORT
SALES ( 9)
Q6 13,6391
2
|
2,72222
HSJOIN
( 10)
13,6385
2
/-----+------\
7 7
TBSCAN TBSCAN
( 11) ( 12)
6,81488 6,81488
1 1
| |
7 7
TABLE: LELLE TABLE: LELLE
SALES SALES
Q2 Q1
LATERAL Join
Access Plan:
Total Cost: 13,6565
Query Degree: 1
Rows
RETURN
( 1)
Cost
I/O
|
7
>^NLJOIN
( 2)
13,6559
2
/---+----\
7 0,35
TBSCAN GRPBY
( 3) ( 4)
6,81488 6,81662
1 1
| |
7 0,35
TABLE: LELLE TBSCAN
SALES ( 5)
Q5 6,81656
1
|
7
TABLE: LELLE
SALES
Q1
Window functions with framing
DB2 does not yet support range frames over dates, but by using a clever trick by @mustaccio in:
https://dba.stackexchange.com/questions/141263/what-is-the-meaning-of-order-by-x-range-between-n-preceding-if-x-is-a-dat
we can actually use only one table access and solve the problem:
select a.sales_ID, a.sale_price, a.employee_ID, a.sale_date
, avg(sale_price) over (partition by employee_id
order by julian_day(a.sale_date)
range between 365 preceding
and 1 preceding
) as avg_price
from sales a
Access Plan:
Total Cost: 6.8197
Query Degree: 1
Rows
RETURN
( 1)
Cost
I/O
|
7
TBSCAN
( 2)
6.81753
1
|
7
SORT
( 3)
6.81703
1
|
7
TBSCAN
( 4)
6.81488
1
|
7
TABLE: LELLE
SALES
Q1

How to do a count including not existing records?

How to do a count including not existing records, which should have '0' as the count?
Here is my table:
CREATE TABLE SURVEY
(year CHAR(4),
cust CHAR(2));
INSERT INTO SURVEY VALUES ('2011', 'AZ');
INSERT INTO SURVEY VALUES ('2011', 'CO');
INSERT INTO SURVEY VALUES ('2012', 'ME');
INSERT INTO SURVEY VALUES ('2014', 'ME');
INSERT INTO SURVEY VALUES ('2014', 'CO');
INSERT INTO SURVEY VALUES ('2014', 'ME');
INSERT INTO SURVEY VALUES ('2014', 'CO');
I've tried this, but of course it is missing zero counts:
select cust, year, count(*) as count from SURVEY
group by cust, year
I want to have this result:
+------+---------+--------+
| cust | year | count |
+------+---------+--------+
| AZ | 2011 | 1 |
| AZ | 2012 | 0 |
| AZ | 2014 | 0 |
| CO | 2011 | 1 |
| CO | 2012 | 0 |
| CO | 2014 | 2 |
| ME | 2011 | 0 |
| ME | 2012 | 1 |
| ME | 2014 | 2 |
+------+---------+--------+
please note:
My table has many records (~10k with different 'cust')
years may not be sequential (for example 2013 is skipped)
over time i may have 2015, 2016 and so on
the actual query will be executed in MS Access 2010 (not sure if it matters)
please help, thank you!
It sounds like you want a count for every cust x year combination with a zero when no survey record exists. If this is the case you will need two more tables: customers and years then do something like:
select leftside.cust, leftside.year, count(survey.cust) from
(select * from customers, years) as leftside left join survey
on leftside.cust = survey.cust and
leftside.year = survey.year
group by leftside.cust, leftside.year
select cust, year, (select count(cust) from survey) as count
from SURVEY
group by cust, year
But this query will return count of all records, without group condition.
If you have a domain table for years and customers:
select y.year, c.cust, count(s.year) as cnt
from customer as c
cross join year as y
left join survey as s
on s.year = y.year
and s.cust = c.cust
group by y.year, c.cust
If MS Access doesn't support CROSS JOIN, you can do the same with:
from customer as c
join year as y
on 1 = 1
If you don't have domain tables you will somehow need to "invent" the domains, since you can't create something from nothing.
If you have domain tables as others said, well and good. If you have to depend only on data in your table, the below query will do that for you.
select cp.cust, cp.year, iif(isnull(sum(cnt)), 0, sum(cnt)) as count from
(select * from (
(select distinct cust from survey) as c cross join
(select distinct year from survey) as y)
) cp left join
(select *, 1 as cnt from survey) s on cp.cust=s.cust and cp.year=s.year
group by cp.cust, cp.year
order by cp.cust,cp.year
Instead of iif(isnull(sum(cnt)), 0, sum(cnt)), you can use coalesce(sum(cnt),0) if that works. In MS Access use iif function and in other databases coalesce works.
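The "derive the domains from the data itself" idea can be checked with a short sketch like this one. It runs in SQLite through Python's `sqlite3` module rather than MS Access, so the `CROSS JOIN` keyword and `COUNT` over a nullable column are used here instead of `iif`; that substitution is an assumption of the sketch and may need adapting for Access.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE survey (year TEXT, cust TEXT)")
conn.executemany(
    "INSERT INTO survey VALUES (?, ?)",
    [
        ("2011", "AZ"), ("2011", "CO"), ("2012", "ME"), ("2014", "ME"),
        ("2014", "CO"), ("2014", "ME"), ("2014", "CO"),
    ],
)

# Build every cust/year pair from the distinct values present in the table,
# then LEFT JOIN the real rows back in. COUNT(s.cust) skips the NULLs from
# unmatched pairs, which gives 0 without any iif/coalesce.
rows = conn.execute("""
    SELECT c.cust, y.year, COUNT(s.cust) AS cnt
    FROM (SELECT DISTINCT cust FROM survey) AS c
    CROSS JOIN (SELECT DISTINCT year FROM survey) AS y
    LEFT JOIN survey AS s
           ON s.cust = c.cust
          AND s.year = y.year
    GROUP BY c.cust, y.year
    ORDER BY c.cust, y.year
""").fetchall()

for row in rows:
    print(row)
```

Because the year list comes from the data, a skipped year such as 2013 never appears, and new years show up as soon as the first survey row for them exists.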

Using SQL, what is the correct way to total columns from multiple tables into common groups?

I am coding a dashboard, and I need to pull some data out of Microsoft SQL Server.
For a simple example, I have three tables, one master Category table, and two tables containing values linked to the Category table via a primary/foreign key relationship (Blue and Green value tables).
Using Microsoft SQL Server (T-SQL), I wish to total (sum) the values in the two value tables, grouped by the common category found in the category table.
Category Table
CategoryID (PK) | CategoryName
1 | Square
2 | Circle
Blue Table
BlueID (PK) | CategoryID (FK) | BlueValue | BlueMonth | BlueYear
1 | 1 | 10 | 6 | 2012
2 | 1 | 20 | 12 | 2012
3 | 2 | 5 | 6 | 2012
4 | 2 | 9 | 12 | 2012
5 | 1 | 12 | 6 | 2013
6 | 1 | 21 | 12 | 2013
7 | 2 | 4 | 6 | 2013
8 | 2 | 8 | 12 | 2013
Green Table
GreenID (PK)| CategoryID (FK) | GreenValue| GreenMonth| GreenYear
1 | 1 | 3 | 6 | 2012
2 | 1 | 6 | 12 | 2012
3 | 2 | 2 | 6 | 2012
4 | 2 | 7 | 12 | 2012
5 | 1 | 2 | 6 | 2013
6 | 1 | 5 | 12 | 2013
7 | 2 | 4 | 6 | 2013
8 | 2 | 8 | 12 | 2013
If I use the following SQL, I get the results I expect.
SELECT
[Category].[CategoryName],
SUM([Green].[GreenValue]) AS [GreenTotal]
FROM
[Category]
LEFT JOIN
[Green] ON [Category].[CategoryID] = [Green].[CategoryID]
GROUP BY
[Category].[CategoryName]
Results:
CategoryName | GreenTotal
Square | 16
Circle | 21
However, if I add the Blue table, to try and fetch a total for BlueValue as well, my obviously incorrect T-SQL gives me unexpected results.
SELECT
[Category].[CategoryName],
SUM([Green].[GreenValue]) AS [GreenTotal],
SUM([Blue].[BlueValue]) AS [BlueTotal]
FROM
[Category]
LEFT JOIN
[Green] ON [Category].[CategoryID] = [Green].[CategoryID]
LEFT JOIN
[Blue] ON [Category].[CategoryID] = [Blue].[CategoryID]
GROUP BY
[Category].[CategoryName]
Incorrect Results:
CategoryName | GreenTotal | BlueTotal
Square | 64 | 252
Circle | 84 | 104
The results all seem to be out by a factor of 4, which is the total number of rows in each value table for each category.
I am aiming to see the following results:
CategoryName | GreenTotal | BlueTotal
Square | 16 | 63
Circle | 21 | 26
I would be over the moon if someone could tell me what on earth I am doing wrong?
Thanks,
Mark.
Something like this would be best done with an APPLY in my opinion. Fast performance-wise, simple to use, and easy to control in case of variations in the query.
IE:
SELECT C.[CategoryName], G.[GreenTotal], B.[BlueTotal]
FROM [Category] C
OUTER APPLY (SELECT SUM([GreenValue]) AS [GreenTotal] FROM [Green] WHERE [CategoryID] = C.CategoryID) G
OUTER APPLY (SELECT SUM([BlueValue]) AS [BlueTotal] FROM [Blue] WHERE [CategoryID] = C.CategoryID) B
What you're getting is a Cartesian product. You can see the effects of this by removing the grouping and looking through the data.
For example; if your green table contained 2 rows and your blue table contained 4, your join would return a total of 8 records.
To resolve the problem, well, you're nearly there. You've got all the right pieces, just not put them together quite right.
Assuming the following query returns the correct results for green:
SELECT CategoryID
, Sum(GreenValue) As GreenTotal
FROM Green
GROUP
BY CategoryID
The results for blue can be retrieved by following the same method:
SELECT CategoryID
, Sum(BlueValue) As BlueTotal
FROM Blue
GROUP
BY CategoryID
Now that we have two distinct results that are correct, we should join these results to our category table:
SELECT Category.CategoryName
, GreenSummary.GreenTotal
, BlueSummary.BlueTotal
FROM Category
LEFT
JOIN (
SELECT CategoryID
, Sum(GreenValue) As GreenTotal
FROM Green
GROUP
BY CategoryID
) As GreenSummary
ON GreenSummary.CategoryID = Category.CategoryID
LEFT
JOIN (
SELECT CategoryID
, Sum(BlueValue) As BlueTotal
FROM Blue
GROUP
BY CategoryID
) As BlueSummary
ON BlueSummary.CategoryID = Category.CategoryID
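To see the row multiplication and the fix side by side, here is a minimal sketch using Python's `sqlite3` module in place of SQL Server. The tables are reduced to the columns that matter (an assumption for brevity), with the values taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Category (CategoryID INTEGER PRIMARY KEY, CategoryName TEXT);
    CREATE TABLE Green (CategoryID INTEGER, GreenValue INTEGER);
    CREATE TABLE Blue  (CategoryID INTEGER, BlueValue  INTEGER);
    INSERT INTO Category VALUES (1, 'Square'), (2, 'Circle');
    INSERT INTO Blue  VALUES (1,10),(1,20),(2,5),(2,9),(1,12),(1,21),(2,4),(2,8);
    INSERT INTO Green VALUES (1,3),(1,6),(2,2),(2,7),(1,2),(1,5),(2,4),(2,8);
""")

# Joining both detail tables directly pairs every Green row with every Blue
# row of the same category (4 x 4 rows), so each value is summed 4 times.
fanned_out = conn.execute("""
    SELECT c.CategoryName, SUM(g.GreenValue), SUM(b.BlueValue)
    FROM Category AS c
    LEFT JOIN Green AS g ON g.CategoryID = c.CategoryID
    LEFT JOIN Blue  AS b ON b.CategoryID = c.CategoryID
    GROUP BY c.CategoryID, c.CategoryName
    ORDER BY c.CategoryID
""").fetchall()

# Aggregating each detail table first leaves one row per category on each
# side, so the joins can no longer multiply rows.
pre_aggregated = conn.execute("""
    SELECT c.CategoryName, gs.GreenTotal, bs.BlueTotal
    FROM Category AS c
    LEFT JOIN (SELECT CategoryID, SUM(GreenValue) AS GreenTotal
               FROM Green GROUP BY CategoryID) AS gs
           ON gs.CategoryID = c.CategoryID
    LEFT JOIN (SELECT CategoryID, SUM(BlueValue) AS BlueTotal
               FROM Blue GROUP BY CategoryID) AS bs
           ON bs.CategoryID = c.CategoryID
    ORDER BY c.CategoryID
""").fetchall()

print(fanned_out)
print(pre_aggregated)
```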
Create a query for each value table: group by category, create the sum column, and include the category ID column.
Then use these queries as subqueries and left outer join them with the main table. This gives you the expected result, with null values where no sum is available. You can use the isnull function to convert the null values to 0.
I would sum them up first with CTE. Then simply join the 2 together on what is common with only 1 occurrence in each, the CategoryName. This way you can't get a Cartesian product. I put the isnull statement in because there is a possibility that there are no results for a CategoryName in Blue or in Green. If you didn't do this you could get null for your CategoryName.
WITH GREENSUM as (
SELECT
[Category].[CategoryName],
SUM([Green].[GreenValue]) AS [GreenTotal]
FROM
[Category]
LEFT JOIN
[Green] ON [Category].[CategoryID] = [Green].[CategoryID]
GROUP BY
[Category].[CategoryName]
),
BLUESUM as (
SELECT
[Category].[CategoryName],
SUM([Blue].[BlueValue]) AS [BlueTotal]
FROM
[Category]
LEFT JOIN
[Blue] ON [Category].[CategoryID] = [Blue].[CategoryID]
GROUP BY
[Category].[CategoryName])
SELECT isnull(GREENSUM.CategoryName, BLUESUM.CategoryName) as CategoryName,
GreenTotal, BlueTotal
FROM [GREENSUM]
FULL OUTER JOIN
[BLUESUM] ON [GREENSUM].CategoryName = [BLUESUM].CategoryName
I also use CTE, find it easier on the eye - but rank the selects internal.
/*
create table Category ( CategoryId Integer, CategoryName nvarchar(50) )
create table Green ( CategoryId Integer, GreenValue Integer )
create table Blue ( CategoryId Integer, BlueValue Integer )
insert into Category VALUES (1,'Square'),(2,'Circle')
insert into Blue VALUES (1,10),(1,20),(2,5),(2,9),(1,12),(1,21),(2,4),(2,8)
insert into Green VALUES (1,3),(1,6),(2,2),(2,7),(1,2),(1,5),(2,4),(2,8)
*/
with CatSums(ColorRank, CategoryId, CategoryValue) as
(
select 1, CategoryId, GreenValue from Green
union all
select 2, CategoryId, BlueValue from Blue
)
select
C.CategoryName,
Sum(case when ColorRank = 1 then CategoryValue else 0 end) as GreenTotal,
Sum(case when ColorRank = 2 then CategoryValue else 0 end) as BlueTotal
from CatSums S left join Category C on C.CategoryId = S.CategoryId
group by C.CategoryName
Although I must profess to increasingly liking the OUTER APPLY solution.

Mysql4: SQL for selecting one or zero record

Table layout:
CREATE TABLE t_order (id INT, custId INT, date DATE)
I'm looking for a SQL command to select a maximum of one row per customer (the customer who owns the order is identified by a field named custId).
I want to select ONE of the customer's orders (doesn't matter which one, say sorted by id) if there is no order date given for any of the rows.
I want to retrieve an empty Resultset for the customerId, if there is already a record with given order date.
Here is an example. Per customer there should be one order at most (one without a date given). Orders that have already a date value should not appear at all.
+---------------------------------------------------------+
|id | custId | date |
+---------------------------------------------------------+
| 1 10 NULL |
| 2 11 2008-11-11 |
| 3 12 2008-10-23 |
| 4 11 NULL |
| 5 13 NULL |
| 6 13 NULL |
+---------------------------------------------------------+
|
|
| Result
\ | /
\ /
+---------------------------------------------------------+
|id | custId | date |
+---------------------------------------------------------+
| 1 10 NULL |
| |
| |
| |
| 5 13 NULL |
| |
+---------------------------------------------------------+
powered be JavE
Edit:
I've chosen glavić's answer as the correct one, because it provides
the correct result with slightly modified data:
+---------------------------------------------------------+
|id | custId | date |
+---------------------------------------------------------+
| 1 10 NULL |
| 2 11 2008-11-11 |
| 3 12 2008-10-23 |
| 4 11 NULL |
| 5 13 NULL |
| 6 13 NULL |
| 7 11 NULL |
+---------------------------------------------------------+
Sfossen's answer will not work when customers appear more than twice because of its where clause constraint a.id != b.id.
Quassnoi's answer does not work for me, as I run server version 4.0.24 which yields the following error:
(screenshot of the error message: http://img25.imageshack.us/img25/8186/picture1vyj.png)
For a specific customer it's:
SELECT *
FROM t_order
WHERE date IS NULL AND custId=? LIMIT 1
For all customers it's:
SELECT a.*
FROM t_order a
LEFT JOIN t_order b ON a.custId=b.custID and a.id != b.id
WHERE a.date IS NULL AND b.date IS NULL
GROUP BY custId;
Try this:
SELECT to1.*
FROM t_order AS to1
WHERE
to1.date IS NULL AND
to1.custId NOT IN (
SELECT to2.custId
FROM t_order AS to2
WHERE to2.date IS NOT NULL
GROUP BY to2.custId
)
GROUP BY to1.custId
For MySQL 4:
SELECT to1.*
FROM t_order AS to1
LEFT JOIN t_order AS to2 ON
to2.custId = to1.custId AND
to2.date IS NOT NULL
WHERE
to1.date IS NULL AND
to2.id IS NULL
GROUP BY to1.custId
This query will use one pass over index on custId.
For each distinct custId it will use one subquery over same index.
No GROUP BY, no TEMPORARY and no FILESORT — efficient, if your table is large.
SELECT VERSION()
--------
'4.1.22-standard'
CREATE INDEX ix_order_cust_id ON t_order(custId)
SELECT id, custId, order_date
FROM (
SELECT o.*,
CASE
WHEN custId <> @c THEN
(
SELECT 1
FROM t_order oi
WHERE oi.custId = o.custId
AND order_date IS NOT NULL
LIMIT 1
)
END AS n,
@c <> custId AS f,
@c := custId
FROM
(
SELECT @c := -1
) r,
t_order o
) r,
t_order o
ORDER BY custId
) oo
WHERE n IS NULL AND f
---------
1, 10, ''
5, 13, ''
First filter out rows with dates, then filter out any row that has a similar row with a lower id. This should work because the matching record with the least id is unique if id is unique.
select * from t_order o1
where date is null
and not exists (select * from t_order o2
where o2.date is null
and o1.custId = o2.custId
and o1.id > o2.id)
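The LEFT JOIN ... IS NULL anti-join from the "For MySQL 4" answer above can be tried against the modified sample data with the sketch below. It runs in SQLite via Python's `sqlite3` module, not MySQL 4, and it selects `MIN(id)` explicitly instead of relying on MySQL's permissive `GROUP BY` with `to1.*`; both points are assumptions of this sketch rather than part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_order (id INTEGER PRIMARY KEY, custId INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO t_order VALUES (?, ?, ?)",
    [
        (1, 10, None), (2, 11, "2008-11-11"), (3, 12, "2008-10-23"),
        (4, 11, None), (5, 13, None), (6, 13, None), (7, 11, None),
    ],
)

# to2 only ever matches dated orders of the same customer. If no such row
# exists, the LEFT JOIN yields to2.id = NULL, which keeps the customer.
# MIN(id) then picks one undated order per remaining customer.
rows = conn.execute("""
    SELECT MIN(to1.id) AS id, to1.custId
    FROM t_order AS to1
    LEFT JOIN t_order AS to2
           ON to2.custId = to1.custId
          AND to2.date IS NOT NULL
    WHERE to1.date IS NULL
      AND to2.id IS NULL
    GROUP BY to1.custId
    ORDER BY to1.custId
""").fetchall()

print(rows)
```

Customer 11 is dropped even though it has undated orders, because one of its orders already carries a date.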