Select unmatching rows from two tables grouping by two columns - sql

I have two tables with sales information that have a different number of rows, and I want to get the rows that do not match. An important detail is that records are added to the tables keyed on two columns: sale_type and sale_date.
So I think I should take a UNION of the two tables, group by these columns, and filter by count. But my current solution does not work. How should I fetch the unmatched records correctly?
Here is what I've tried:
SELECT * FROM
(SELECT * FROM sales_copy
UNION ALL
SELECT * FROM sales)
GROUP BY sale_type, sale_date
HAVING count(*)!=1;

I have successfully selected the difference between two tables using a full join.
So in your case, you would want something like:
SELECT S.*, SC.*
FROM sales AS S
FULL JOIN sales_copy AS SC
ON (S.sale_type = SC.sale_type) AND (S.sale_date = SC.sale_date)
WHERE (S.sale_type IS NULL AND S.sale_date IS NULL)
OR (SC.sale_type IS NULL AND SC.sale_date IS NULL)
This selects all the rows that are in only one of the tables and ignores the rows that are in both.
See it in action here: SQL Fiddle

You can combine the EXCEPT and UNION operators:
(SELECT sale_type, sale_date FROM sales EXCEPT SELECT sale_type, sale_date FROM sales_copy)
UNION
(SELECT sale_type, sale_date FROM sales_copy EXCEPT SELECT sale_type, sale_date FROM sales)
The parentheses matter: EXCEPT and UNION have the same precedence and are evaluated left to right, so without them the final EXCEPT would be applied to the whole preceding result.
This returns the rows from sales which are not in sales_copy and the rows from sales_copy which are not in sales.
The same thing can be achieved with a full join by filtering out the rows that match:
SELECT ISNULL(sales.sale_type, sales_copy.sale_type) AS sale_type
, ISNULL(sales.sale_date, sales_copy.sale_date) AS sale_date
FROM sales
FULL JOIN sales_copy
ON sales.sale_type = sales_copy.sale_type
AND sales.sale_date = sales_copy.sale_date
WHERE sales.sale_type IS NULL
OR sales_copy.sale_type IS NULL
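The EXCEPT/UNION approach can be tried out with a small sketch. This one assumes SQLite driven through Python's sqlite3 module, and the sample rows and the amount column are invented for illustration. SQLite does not accept parentheses around the members of a compound SELECT, so each EXCEPT is wrapped in a derived table instead:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales      (sale_type TEXT, sale_date TEXT, amount INTEGER);
    CREATE TABLE sales_copy (sale_type TEXT, sale_date TEXT, amount INTEGER);
    -- Hypothetical sample data: two shared keys, one extra key per table.
    INSERT INTO sales VALUES
        ('online', '2020-01-01', 10),
        ('store',  '2020-01-02', 20),
        ('online', '2020-01-03', 30);
    INSERT INTO sales_copy VALUES
        ('online', '2020-01-01', 10),
        ('store',  '2020-01-02', 20),
        ('store',  '2020-01-04', 40);
""")

# Symmetric difference on (sale_type, sale_date):
# keys only in sales, plus keys only in sales_copy.
unmatched_keys = conn.execute("""
    SELECT sale_type, sale_date FROM (
        SELECT sale_type, sale_date FROM sales
        EXCEPT
        SELECT sale_type, sale_date FROM sales_copy
    )
    UNION
    SELECT sale_type, sale_date FROM (
        SELECT sale_type, sale_date FROM sales_copy
        EXCEPT
        SELECT sale_type, sale_date FROM sales
    )
    ORDER BY sale_date
""").fetchall()

print(unmatched_keys)
```

With this sample data the query returns the key ('online', '2020-01-03') from sales and ('store', '2020-01-04') from sales_copy.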

Related

How to write SQL query without join?

Recently, during an interview, I was asked this question. Given a table like the one below:
The requirement is: how many orders and how many shipments are there per day (based on the date columns)? The output needs to look like this:
I wrote the following code, but the interviewer asked me to write a SQL query that achieves the same output without JOIN or UNION.
SELECT
COALESCE(a.order_date, b.ship_date), orders, shipments
FROM
(SELECT
order_date, COUNT(1) AS orders
FROM
table
GROUP BY 1) a
FULL JOIN
(SELECT
ship_date, COUNT(1) AS shipments
FROM table
GROUP BY 1) b ON a.order_date = b.ship_date
Is this possible? Could you please advise?
You can use UNION and GROUP BY with conditional aggregation as follows:
SELECT DATE_,
COUNT(CASE WHEN FLAG = 'ORDER' THEN 1 END) AS ORDERS,
COUNT(CASE WHEN FLAG = 'SHIP' THEN 1 END) AS SHIPMENTS
FROM (SELECT ORDER_DATE AS DATE_, 'ORDER' AS FLAG FROM YOUR_TABLE
UNION ALL
SELECT SHIP_DATE AS DATE_, 'SHIP' AS FLAG FROM YOUR_TABLE) T
GROUP BY DATE_;
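As a rough check of the conditional-aggregation idea, here is a minimal sketch that assumes SQLite via Python's sqlite3 module. The order_id column and the three sample rows are made up. The query groups by date_, which is what produces one row per day:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample data: one row per order.
    CREATE TABLE your_table (order_id INTEGER, order_date TEXT, ship_date TEXT);
    INSERT INTO your_table VALUES
        (1, '2021-01-01', '2021-01-02'),
        (2, '2021-01-01', '2021-01-03'),
        (3, '2021-01-02', '2021-01-03');
""")

# Unpivot both date columns into one column with a flag,
# then count each flag separately per date.
per_day = conn.execute("""
    SELECT date_,
           COUNT(CASE WHEN flag = 'ORDER' THEN 1 END) AS orders,
           COUNT(CASE WHEN flag = 'SHIP'  THEN 1 END) AS shipments
    FROM (SELECT order_date AS date_, 'ORDER' AS flag FROM your_table
          UNION ALL
          SELECT ship_date  AS date_, 'SHIP'  AS flag FROM your_table) t
    GROUP BY date_
    ORDER BY date_
""").fetchall()

for row in per_day:
    print(row)
```

COUNT ignores the NULL that the CASE expression yields when the flag does not match, so each column counts only its own kind of event.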
In BigQuery, I would express this as:
select date, countif(n = 0) as orders, countif(n = 1) as numships
from t cross join
unnest(array[order_date, ship_date]) date with offset n
group by 1
order by date;
The advantage of this approach (over union all) is two-fold. First, it only scans the table once. More importantly, the unnest() is all on the same node where the data resides -- so data does not need to be moved for the unpivot.

Select difference between two tables

I want to list four columns: date, hourly count, daily count, and the difference between the two counts.
I have used UNION ALL on the two tables, but I am getting 2 rows, as shown in the image:
Select a.date, a.hour,b.daily,sum(a.hour-b.daily)
from (select date,count(*) hour,''daily
From table a union all select '' hour,count(*) daily from table b)
Group by date, daily, hourly..
Please suggest a solution.
I see that the code supplied uses a UNION to achieve the output. This would be better served by using a JOIN of some kind.
The difference is the number of rows in table_a for each date minus the number of rows in table_b for the same date.
This code is untested but should give a good indication of how to achieve this:
SELECT a.date,
a.hour,
ISNULL(b.daily, 0) AS daily,
a.hour - ISNULL(b.daily, 0) AS difference
FROM (
SELECT date,
COUNT(*) AS hour
FROM table_a
GROUP BY date
) a
LEFT JOIN (
SELECT date,
COUNT(*) AS daily
FROM table_b
GROUP BY date
) b ON b.date = a.date
ORDER BY a.date;
This works by:
Calculating the count per date in table_a.
Calculating the count per date in table_b.
Joining all results from table_a with those matching in table_b.
Outputting the date, the hour from table_a, the daily (or 0 if NULL) from table_b, and the difference between the two.
Notes:
I have renamed table a and table b to table_a and table_b. I presume these are not the actual table names
An INNER JOIN may be preferable if you only want results that have matching date columns in both tables. Using the LEFT JOIN will return all results from table_a regardless of whether table_b has an entry.
I'm not convinced that date is an allowed column name but I have reproduced it in the code as per the example given by OP.
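The LEFT JOIN approach can be sketched as follows, assuming SQLite through Python's sqlite3 module and invented sample rows. COALESCE stands in for SQL Server's two-argument ISNULL here, and SQLite happens to accept date as a column name:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample data: one row per event.
    CREATE TABLE table_a (date TEXT);
    CREATE TABLE table_b (date TEXT);
    INSERT INTO table_a VALUES
        ('2021-03-01'), ('2021-03-01'), ('2021-03-01'),
        ('2021-03-02'), ('2021-03-02');
    INSERT INTO table_b VALUES ('2021-03-01');
""")

# Count per date in each table, then join the two sets of counts.
rows = conn.execute("""
    SELECT a.date,
           a.hour,
           COALESCE(b.daily, 0)          AS daily,
           a.hour - COALESCE(b.daily, 0) AS difference
    FROM (SELECT date, COUNT(*) AS hour  FROM table_a GROUP BY date) a
    LEFT JOIN
         (SELECT date, COUNT(*) AS daily FROM table_b GROUP BY date) b
      ON b.date = a.date
    ORDER BY a.date
""").fetchall()

for row in rows:
    print(row)
```

The second date has no rows in table_b, so its daily count falls back to 0 and the difference equals the table_a count.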
Your method is fine, but your GROUP BY columns are not correct:
Select date, sum(hourly) as hourly, sum(daily) as daily,
sum(hourly) - sum(daily) as diff
from ((select date, count(*) as hourly, 0 as daily
from table a
group by date
) union all
(select date, 0 as hourly, count(*) as daily
from table b
group by date
)
) ab
group by date;
The key idea is that the outer query aggregates only by date -- and you still need aggregation functions there as well.
You have other errors in your subquery, such as missing group bys and date columns. I assume those are transcription errors.
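A small sketch of the UNION ALL version, assuming SQLite through Python's sqlite3 module with made-up rows. The parentheses around each branch are dropped because SQLite does not accept them inside a compound SELECT:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample data: one row per event.
    CREATE TABLE table_a (date TEXT);
    CREATE TABLE table_b (date TEXT);
    INSERT INTO table_a VALUES
        ('2021-03-01'), ('2021-03-01'), ('2021-03-01'),
        ('2021-03-02'), ('2021-03-02');
    INSERT INTO table_b VALUES
        ('2021-03-01'), ('2021-03-03'), ('2021-03-03');
""")

# Each branch fills the other table's count with 0;
# the outer query collapses the two branches into one row per date.
rows = conn.execute("""
    SELECT date,
           SUM(hourly)              AS hourly,
           SUM(daily)               AS daily,
           SUM(hourly) - SUM(daily) AS diff
    FROM (SELECT date, COUNT(*) AS hourly, 0 AS daily
          FROM table_a GROUP BY date
          UNION ALL
          SELECT date, 0 AS hourly, COUNT(*) AS daily
          FROM table_b GROUP BY date) ab
    GROUP BY date
    ORDER BY date
""").fetchall()

for row in rows:
    print(row)
```

Unlike a LEFT JOIN from the first table, this form also keeps dates that appear only in the second table, such as '2021-03-03' in the sample data.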

SQL - Pull distinct row based on max value

I am trying to pull the most recent sale amount for each salesperson. The salespeople have made sales on multiple days, but I only want the most recent one.
My attempt below:
SELECT salesperson, amount
FROM table
WHERE date = (SELECT MAX(date) FROM table);
Use a correlated subquery:
SELECT t.salesperson, t.amount
FROM table t
WHERE t.date = (SELECT MAX(t1.date)
FROM table t1
WHERE t1.salesperson = t.salesperson -- for each salesperson
);
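To see the difference between the two queries, here is a minimal sketch assuming SQLite via Python's sqlite3 module. The table name sales and the sample rows are hypothetical stand-ins for the table in the question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample data: two sales per salesperson.
    CREATE TABLE sales (salesperson TEXT, amount INTEGER, date TEXT);
    INSERT INTO sales VALUES
        ('alice', 100, '2021-01-01'),
        ('alice', 150, '2021-01-05'),
        ('bob',    50, '2021-01-02'),
        ('bob',   200, '2021-01-03');
""")

# Uncorrelated subquery: compares every row against the single global maximum.
global_max_only = conn.execute("""
    SELECT salesperson, amount
    FROM sales
    WHERE date = (SELECT MAX(date) FROM sales)
""").fetchall()

# Correlated subquery: the maximum is recomputed for each salesperson.
latest_per_person = conn.execute("""
    SELECT t.salesperson, t.amount
    FROM sales t
    WHERE t.date = (SELECT MAX(t1.date)
                    FROM sales t1
                    WHERE t1.salesperson = t.salesperson)
    ORDER BY t.salesperson
""").fetchall()

print(global_max_only)
print(latest_per_person)
```

The first query returns only alice's row, because her latest sale is also the latest overall. The second returns one row for each salesperson.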
If you are using PostgreSQL, you can take advantage of DISTINCT ON:
SELECT DISTINCT ON (salesperson) salesperson, amount
FROM table t
ORDER BY salesperson, date DESC
This will return only one row for each salesperson. The ORDER BY clause says to return the one with the largest date for that salesperson.
Unfortunately, DISTINCT ON is a PostgreSQL extension rather than standard SQL, so most other databases do not support it.

MS-Access: HAVING clause not returning any records

I have a Select query to extract Customer Names and Purchase Dates from a table. My goal is to select only those names and dates for customers who have ordered on more than one distinct date. My code is as follows:
SELECT Customer, PurchDate
FROM (SELECT DISTINCT PurchDate, Customer
FROM (SELECT CDate(FORMAT(DateAdd("h",-7,Mid([purchase-date],1,10)+""+Mid([purchase-date],12,8)), "Short Date")) AS PurchDate,
[buyer-name] AS Customer
FROM RawImport
WHERE sku ALIKE "%RE%"))
GROUP BY Customer, PurchDate
HAVING COUNT(PurchDate)>1
ORDER BY PurchDate
This returns no results, even though there are many customers with more than one Purchase Date. The inner two Selects work perfectly and return a set of distinct dates for each customer, so I believe there is some problem in my GROUP/HAVING/ORDER clauses.
Thanks in advance for any help!
In the inner select you are doing
SELECT DISTINCT PurchDate, Customer
and in the outer select
GROUP BY Customer, PurchDate
That means every group contains exactly one row, so every group has
COUNT(*) = 1
and the HAVING clause filters all of them out.
I can't give you the exact syntax for Access, but you need something like this.
I will use YourTable as a stand-in for your inner derived table to make it easier to read:
SELECT DISTINCT Customer, PurchDate
FROM YourTable
WHERE Customer IN (
SELECT Customer
FROM (SELECT DISTINCT Customer, PurchDate
FROM YourTable)
GROUP BY Customer
HAVING COUNT(*) > 1
)
The inner select gives you the customers who ordered on more than one day.
The outer select brings back those customers along with all of their purchase dates.
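The same pattern can be checked with a small sketch. It assumes SQLite through Python's sqlite3 module rather than Access, so the syntax may need adjusting, and the rows in YourTable are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample data: ann bought on two dates, bob on one (twice).
    CREATE TABLE YourTable (Customer TEXT, PurchDate TEXT);
    INSERT INTO YourTable VALUES
        ('ann', '2021-05-01'),
        ('ann', '2021-05-01'),
        ('ann', '2021-05-02'),
        ('bob', '2021-05-01'),
        ('bob', '2021-05-01');
""")

# Group by Customer only, over distinct (Customer, PurchDate) pairs,
# so COUNT(*) is the number of distinct purchase dates per customer.
repeat_buyers = conn.execute("""
    SELECT DISTINCT Customer, PurchDate
    FROM YourTable
    WHERE Customer IN (
        SELECT Customer
        FROM (SELECT DISTINCT Customer, PurchDate FROM YourTable)
        GROUP BY Customer
        HAVING COUNT(*) > 1
    )
    ORDER BY PurchDate
""").fetchall()

print(repeat_buyers)
```

Only ann appears in the result, once per purchase date. Bob is excluded even though he has two rows, because both fall on the same date.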
Alternatively, you could try something simpler to get the list of customers who bought on more than one day, like this:
SELECT [buyer-name]
FROM RawImport
WHERE sku ALIKE "%RE%"
GROUP BY [buyer-name]
HAVING Format(MAX([purchase-date]), "DD/MM/YYYY") <>
Format(MIN([purchase-date]), "DD/MM/YYYY")

How to join three select queries which has one common column

I have the three select queries below, each of which gives its own output:
select DATE_FORMAT(table1.value_date,'%b')as Month,
DATE_FORMAT(table1.value_date,'%Y') as Year,
table1.open as Open
from index_main as table1
join ( select min(`value_date`) `value_date`
from index_main
group by month(`value_date`), year( `value_date`)
) as table2 on table1.`value_date` = table2.`value_date`
Output columns - Month,year,open
select DATE_FORMAT(table1.value_date,'%b')as Month,
DATE_FORMAT(table1.value_date,'%Y') as Year,
table1.close as Close
from index_main as table1
join ( select max(`value_date`) `value_date`
from index_main group by month(`value_date`), year( `value_date`)
) as table2 on(table1.`value_date` = table2.`value_date`)
Output columns - Month,year,close
select DATE_FORMAT(table1.value_date,'%b')as Month,
DATE_FORMAT(table1.value_date,'%Y') as Year,
max(table1.high) as High
FROM `index_main` as table1
GROUP BY table1.month,table1.year
ORDER BY year(table1.value_date) desc, month(table1.value_date) desc
Output columns - Month,year,high,low
I want to join these three select queries based on the common columns i.e month & year.
My final result should have the following columns - month,year,open,close,high,low.
Try this.
First create 3 views, one with each query (vw1, vw2 and vw3). Then use a query like this:
SELECT vw1.Month, vw1.Year, Open, Close, High
FROM vw1
LEFT JOIN vw2 ON vw1.Year = vw2.Year AND vw1.Month = vw2.Month
LEFT JOIN vw3 ON vw1.Year = vw3.Year AND vw1.Month = vw3.Month
Hope this helps you.
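The view-based approach might look like the sketch below. It assumes SQLite through Python's sqlite3 module instead of MySQL, so strftime replaces DATE_FORMAT and months come out as numbers rather than names. The low column, the Low output, and the sample prices are assumptions added for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample data: two trading days in each of two months.
    CREATE TABLE index_main (
        value_date TEXT, open INTEGER, close INTEGER, high INTEGER, low INTEGER
    );
    INSERT INTO index_main VALUES
        ('2021-01-04', 10, 11, 12,  9),
        ('2021-01-29', 13, 14, 15, 12),
        ('2021-02-01', 14, 16, 17, 13),
        ('2021-02-26', 16, 15, 18, 14);

    -- vw1: opening value on the first date of each month.
    CREATE VIEW vw1 AS
    SELECT strftime('%m', t.value_date) AS Month,
           strftime('%Y', t.value_date) AS Year,
           t.open AS Open
    FROM index_main t
    JOIN (SELECT MIN(value_date) AS value_date
          FROM index_main
          GROUP BY strftime('%Y-%m', value_date)) f
      ON t.value_date = f.value_date;

    -- vw2: closing value on the last date of each month.
    CREATE VIEW vw2 AS
    SELECT strftime('%m', t.value_date) AS Month,
           strftime('%Y', t.value_date) AS Year,
           t.close AS Close
    FROM index_main t
    JOIN (SELECT MAX(value_date) AS value_date
          FROM index_main
          GROUP BY strftime('%Y-%m', value_date)) l
      ON t.value_date = l.value_date;

    -- vw3: highest high and lowest low within each month.
    CREATE VIEW vw3 AS
    SELECT strftime('%m', value_date) AS Month,
           strftime('%Y', value_date) AS Year,
           MAX(high) AS High,
           MIN(low)  AS Low
    FROM index_main
    GROUP BY strftime('%Y-%m', value_date);
""")

# Join the three views on their common Month and Year columns.
monthly = conn.execute("""
    SELECT vw1.Month, vw1.Year, vw1.Open, vw2.Close, vw3.High, vw3.Low
    FROM vw1
    LEFT JOIN vw2 ON vw1.Year = vw2.Year AND vw1.Month = vw2.Month
    LEFT JOIN vw3 ON vw1.Year = vw3.Year AND vw1.Month = vw3.Month
    ORDER BY vw1.Year, vw1.Month
""").fetchall()

for row in monthly:
    print(row)
```

Qualifying each output column with its view name avoids ambiguity if more than one view exposes a column with the same name.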