Select difference between two tables - sql

I want to list four columns: date, hourly count, daily count, and the difference between the two counts.
I have used UNION ALL for the two tables, but I am getting 2 rows as shown in the image:
Select a.date, a.hour,b.daily,sum(a.hour-b.daily)
from (select date,count(*) hour,''daily
From table a union all select '' hour,count(*) daily from table b)
Group by date, daily, hourly..
Please suggest to me a solution.

I see that the code supplied uses a UNION to achieve the output. This would be better served by using a JOIN of some kind.
The result is the number of rows per date in table_b subtracted from the number of rows per date in table_a.
This code is untested but should give a good indication of how to achieve this:
SELECT a.date,
a.hour,
ISNULL(b.daily, 0) AS daily,
a.hour - ISNULL(b.daily, 0) AS difference
FROM (
SELECT date,
COUNT(*) AS hour
FROM table_a
GROUP BY date
) a
LEFT JOIN (
SELECT date,
COUNT(*) AS daily
FROM table_b
GROUP BY date
) b ON b.date = a.date
ORDER BY a.date;
This works by:
Calculating the count per date in table_a.
Calculating the count per date in table_b.
Joining all results from table_a with those matching in table_b.
Outputting the date, the hour from table_a, the daily (or 0 if NULL) from table_b, and the difference between the two.
Notes:
I have renamed table a and table b to table_a and table_b. I presume these are not the actual table names
An INNER JOIN may be preferable if you only want results that have matching date columns in both tables. Using the LEFT JOIN will return all results from table_a regardless of whether table_b has an entry.
I'm not convinced that date is an allowed column name but I have reproduced it in the code as per the example given by OP.
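The steps above can be tried out with a minimal sketch like the one below. It assumes SQLite (through Python's built-in sqlite3 module) rather than the OP's database, so COALESCE stands in for ISNULL, and the sample rows are hypothetical.

```python
import sqlite3

# Hypothetical sample data: table_a has 3 rows on 2020-01-01 and 1 row on
# 2020-01-02; table_b has 2 rows on 2020-01-01 and none on 2020-01-02.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (date TEXT);
    CREATE TABLE table_b (date TEXT);
    INSERT INTO table_a VALUES
        ('2020-01-01'), ('2020-01-01'), ('2020-01-01'), ('2020-01-02');
    INSERT INTO table_b VALUES ('2020-01-01'), ('2020-01-01');
""")

# Count per date in each table, then LEFT JOIN so every date in table_a
# survives. COALESCE replaces the NULL count of a missing date with 0.
rows = conn.execute("""
    SELECT a.date,
           a.hour,
           COALESCE(b.daily, 0) AS daily,
           a.hour - COALESCE(b.daily, 0) AS difference
    FROM (SELECT date, COUNT(*) AS hour FROM table_a GROUP BY date) a
    LEFT JOIN (SELECT date, COUNT(*) AS daily FROM table_b GROUP BY date) b
           ON b.date = a.date
    ORDER BY a.date
""").fetchall()

for row in rows:
    print(row)
```

With this data, 2020-01-02 has no match in table_b, so its daily count comes back as 0 instead of NULL and the difference is still computed.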

Your method is fine. Your group by columns are not correct:
Select date, sum(hourly) as hourly, sum(daily) as daily,
sum(hourly) - sum(daily) as diff
from ((select date, count(*) as hourly, 0 as daily
from table_a
group by date
) union all
(select date, 0 as hourly, count(*) as daily
from table_b
group by date
)
) ab
group by date;
The key idea is that the outer query aggregates only by date -- and you still need aggregation functions there as well.
You have other errors in your subquery, such as missing group bys and date columns. I assume those are transcription errors.
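As a rough check of the idea, here is a small sketch that runs the same UNION ALL shape against SQLite through Python's sqlite3 module. The table names table_a and table_b and the sample rows are assumptions for illustration, and the parentheses around each branch are dropped because SQLite does not accept them there.

```python
import sqlite3

# Hypothetical sample data: one date exists only in table_a and another
# only in table_b, to show that UNION ALL keeps both.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (date TEXT);
    CREATE TABLE table_b (date TEXT);
    INSERT INTO table_a VALUES
        ('2020-01-01'), ('2020-01-01'), ('2020-01-01'), ('2020-01-02');
    INSERT INTO table_b VALUES
        ('2020-01-01'), ('2020-01-01'), ('2020-01-03');
""")

# Each branch contributes its own count and a 0 for the other column;
# the outer query groups by date only and sums the two columns.
rows = conn.execute("""
    SELECT date,
           SUM(hourly) AS hourly,
           SUM(daily) AS daily,
           SUM(hourly) - SUM(daily) AS diff
    FROM (
        SELECT date, COUNT(*) AS hourly, 0 AS daily
        FROM table_a
        GROUP BY date
        UNION ALL
        SELECT date, 0 AS hourly, COUNT(*) AS daily
        FROM table_b
        GROUP BY date
    ) ab
    GROUP BY date
    ORDER BY date
""").fetchall()

for row in rows:
    print(row)
```

Unlike a LEFT JOIN from the first table, this form also returns dates that appear only in the second table, with a negative difference.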

Related

Finding Max(Date) BEFORE specified date in Redshift SQL

I have a table (Table A) in SQL (AWS Redshift) where I've isolated my beginning population that contains account id's and dates. I'd like to take the output from that table and LEFT join back to the "accounts" table to ONLY return the start date that precedes or comes directly before the date stored in the table from my output.
Table A (Beg Pop)
-------
select account_id,
min(start_date),
min(end_date)
from accounts
group by 1;
I want to return ONLY the date that precedes the date in my current table where account_id match. I'm looking for something like...
Table B
-------
select a.account_id,
a.start_date,
a.end_date,
b.start_date_prev,
b.end_date_prev
from accounts as a
left join accounts as b on a.account_id = b.account_id
where max(b.start_date) less than a.start_date;
Ultimately, I want to return everything from table a and only the dates where max(start_date) is less than the start_date from table A. I know aggregation is not allowed in the WHERE clause and I guess I can do a subquery but I only want the Max date BEFORE the dates in my output. Any suggestions are greatly appreciated.
"I want to return ONLY the date that precedes the date in my current table where account_id match"
If you want the previous date for a given row, use lag():
select a.*,
lag(start_date) over (partition by account_id order by start_date) as prev_start_date
from accounts a;
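To see what lag() returns, here is a minimal sketch using SQLite through Python's sqlite3 module instead of Redshift. It assumes a SQLite build with window function support (3.25 or later), and the accounts rows are hypothetical.

```python
import sqlite3

# Hypothetical accounts rows; dates are ISO strings so they sort correctly.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounts (account_id INTEGER, start_date TEXT);
    INSERT INTO accounts VALUES
        (1, '2020-10-01'), (1, '2020-10-02'), (1, '2020-10-07'),
        (2, '2020-10-01'), (2, '2020-10-15');
""")

# LAG looks one row back within each account, ordered by start_date.
# The first row of every account has no predecessor and gets NULL.
rows = conn.execute("""
    SELECT account_id,
           start_date,
           LAG(start_date) OVER (
               PARTITION BY account_id ORDER BY start_date
           ) AS prev_start_date
    FROM accounts
    ORDER BY account_id, start_date
""").fetchall()

for row in rows:
    print(row)
```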
As I understand it, the requirement is to display all rows from a base table together with the preceding row, ordered by a column and subject to some conditions.
Please check the following example, which I took from the article "Select Next and Previous Rows with Current Row using SQL CTE Expression":
WITH CTE as (
SELECT
ROW_NUMBER() OVER (PARTITION BY account_id ORDER BY start_date) as RN,
*
FROM accounts
)
SELECT
PreviousRow.*,
CurrentRow.*,
NextRow.*
FROM CTE as CurrentRow
LEFT JOIN CTE as PreviousRow ON
PreviousRow.RN = CurrentRow.RN - 1 and PreviousRow.account_id = CurrentRow.account_id
LEFT JOIN CTE as NextRow ON
NextRow.RN = CurrentRow.RN + 1 and NextRow.account_id = CurrentRow.account_id
ORDER BY CurrentRow.account_id, CurrentRow.start_date;
I tested it with the following sample data and it seems to work:
create table accounts(account_id int, start_date date, end_date date);
insert into accounts values (1,'20201001','20201003');
insert into accounts values (1,'20201002','20201005');
insert into accounts values (1,'20201007','20201008');
insert into accounts values (1,'20201011','20201013');
insert into accounts values (2,'20201001','20201002');
insert into accounts values (2,'20201015','20201016');
Output is as follows

How to filter records by their amount per date?

I have a table 'A' that has a date column, and the same date can appear in several records. I'm trying to filter the records where the number of records per day is less than 5, while still keeping all the fields of the table.
I mean that if I have only 4 records on 11/10/2017, I need to filter all 4 of these records.
You can select them with a subquery. In the subquery, group the rows by the date column and use HAVING with COUNT(*) to find the dates that have fewer than 5 records, then select every row whose date is in that set:
SELECT *
FROM A
WHERE A.date in (SELECT subA.date
FROM A subA
GROUP BY subA.date
HAVING COUNT(*) < 5 );
Take Care's answer is good. Alternatively, you can use an analytic/windowing function. I'd benchmark both and see which one works better.
with cte as (
select *, count(1) over (partition by date) as cnt
from table_a
)
select *
from cte
where cnt < 5
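Both approaches can be compared on a small data set. The sketch below assumes SQLite (3.25 or later, for window functions) through Python's sqlite3 module, a hypothetical id column added to table A so the rows can be told apart, and made-up dates.

```python
import sqlite3

# Hypothetical data: 5 rows on 2017-10-10 and 4 rows on 2017-10-11.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (id INTEGER PRIMARY KEY, date TEXT)")
conn.executemany(
    "INSERT INTO A (date) VALUES (?)",
    [("2017-10-10",)] * 5 + [("2017-10-11",)] * 4,
)

# Approach 1: subquery with GROUP BY / HAVING picks the qualifying dates.
subquery_rows = conn.execute("""
    SELECT id, date
    FROM A
    WHERE date IN (SELECT date FROM A GROUP BY date HAVING COUNT(*) < 5)
    ORDER BY id
""").fetchall()

# Approach 2: a window count attaches the per-date total to every row.
window_rows = conn.execute("""
    WITH cte AS (
        SELECT id, date, COUNT(*) OVER (PARTITION BY date) AS cnt
        FROM A
    )
    SELECT id, date
    FROM cte
    WHERE cnt < 5
    ORDER BY id
""").fetchall()

print(subquery_rows)
print(window_rows)
```

Only the four rows dated 2017-10-11 come back from either query, since the other date has exactly 5 rows.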

Showing zeroes in sql count

I'm using Redshift and trying to count different things by day, but a day is not shown when its count in table 2 is zero. How can I make it show a count of zero?
SELECT TO_CHAR(date1,'dd') AS day,
COUNT(*) as Volume,sum(CASE WHEN status = 'ANSWERED' THEN 1 ELSE 0 END )as ANSWERED , t2.Volume AS TRANSFERS
FROM table1 t1
RIGHT JOIN (SELECT TO_CHAR(date2,'dd') AS day,
COUNT(*) as Volume
FROM table2
WHERE TO_CHAR(date2,'yyyy_MM') IN (SELECT DISTINCT TO_CHAR(date2,'yyyy_MM')
FROM table2
WHERE date2 BETWEEN DATE ('2016-11-01') AND DATE ('2016-12-30'))
AND type = 'Active'
GROUP BY day) t2 ON TO_CHAR(date1,'dd') = day
WHERE TO_CHAR(date1,'yyyy_MM') IN (SELECT DISTINCT TO_CHAR(date1,'yyyy_MM')
FROM table1
WHERE date1 BETWEEN DATE ('2016-11-01') AND DATE ('2016-12-30'))
GROUP BY 1,4
ORDER BY 1
Notice that you used a right join between the tables. This means that any row from the first table that doesn't have a matching day in the second table will not display.
If your first (or left table) contains all of the unique days that should show up in the result, just switch the "right" to a "left" join.
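A minimal sketch of the switch to a left join, assuming SQLite through Python's sqlite3 module (so strftime replaces TO_CHAR) and hypothetical rows. The COALESCE is an addition that is not in the original query: after a LEFT JOIN the missing count is NULL, and COALESCE turns it into 0.

```python
import sqlite3

# Hypothetical data: table1 has rows on day 01 and day 02,
# table2 has an 'Active' row on day 01 only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (date1 TEXT, status TEXT);
    CREATE TABLE table2 (date2 TEXT, type TEXT);
    INSERT INTO table1 VALUES
        ('2016-11-01', 'ANSWERED'),
        ('2016-11-01', 'MISSED'),
        ('2016-11-02', 'ANSWERED');
    INSERT INTO table2 VALUES ('2016-11-01', 'Active');
""")

# table1 drives the result, so every day it contains is kept even when
# the table2 subquery has no row for that day.
rows = conn.execute("""
    SELECT strftime('%d', t1.date1) AS day,
           COUNT(*) AS volume,
           SUM(CASE WHEN t1.status = 'ANSWERED' THEN 1 ELSE 0 END) AS answered,
           COALESCE(t2.volume, 0) AS transfers
    FROM table1 t1
    LEFT JOIN (
        SELECT strftime('%d', date2) AS day, COUNT(*) AS volume
        FROM table2
        WHERE type = 'Active'
        GROUP BY strftime('%d', date2)
    ) t2 ON strftime('%d', t1.date1) = t2.day
    GROUP BY 1, 4
    ORDER BY 1
""").fetchall()

for row in rows:
    print(row)
```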

Count transactions within a month only once

I have a situation like the one below:
I have two database tables. The first table, which I will call TB1 contains all the salaries that the client credits & also the date when the transaction is made.
The second table, which I will call TB2, contains all the products the client has in the bank.
My purpose is to find the number of salaries the client has got before the date he/she got a product (OVERDRAFT in my case) in our bank.
Till now, everything works fine and I have made the query to extract the necessary data.
The only problem is that I need to improve the query: if a certain client has got more than one salary (for example, every 15 days) within the same month of the same year, the salary should be counted only once.
How can I do that, please?
The query is like below:
SELECT TB1.customer_id, COUNT(TB1.customer_id)
FROM table_1 TB1
JOIN
( SELECT TB2.CUSTOMER_ID, TB2.OD_START_DATE
FROM table_2 TB2
JOIN table_2 TB2_MAX
ON TB2.CUSTOMER_ID = TB2_MAX.CUSTOMER_ID
HAVING TB2.od_start_date = MAX(TB2.od_start_date)
GROUP BY TB2.customer_id, TB2.od_start_date
) TB2
ON TB1.CUSTOMER_ID = TB2.CUSTOMER_ID
WHERE TB1.DATE_FROM < TB2.OD_START_DATE
GROUP BY TB1.CUSTOMER_ID
PS: DATE_FROM field contains the date when the transaction is made, while OD_START_DATE field contains the date when the LATEST product is opened.
JOIN in your inner query is redundant. You simply need a MAX date for each customer.
In your outer query you should be counting DATE_FROM, not Customer_Id. Since you want to count transactions within a month only once, convert DATE_FROM to a year-month combination and use DISTINCT so each month is counted only once.
SELECT TB1.customer_id, COUNT(DISTINCT TO_CHAR(TB1.DATE_FROM,'YYYYMM'))
FROM table_1 TB1
JOIN
( SELECT CUSTOMER_ID, MAX(OD_START_DATE) AS OD_START_DATE
FROM table_2
GROUP BY customer_id
) TB2
ON TB1.CUSTOMER_ID = TB2.CUSTOMER_ID
WHERE TB1.DATE_FROM < TB2.OD_START_DATE
GROUP BY TB1.CUSTOMER_ID
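The effect of COUNT(DISTINCT ...) on the year-month value can be checked with a small sketch. It assumes SQLite through Python's sqlite3 module, so strftime('%Y%m', ...) stands in for TO_CHAR(..., 'YYYYMM'), and the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_1 (customer_id INTEGER, date_from TEXT);
    CREATE TABLE table_2 (customer_id INTEGER, od_start_date TEXT);
    -- Hypothetical salaries: two in January, one in February,
    -- and one in April (after the latest overdraft start date).
    INSERT INTO table_1 VALUES
        (1, '2020-01-01'), (1, '2020-01-15'),
        (1, '2020-02-01'),
        (1, '2020-04-01');
    -- Two overdrafts; only the latest start date (2020-03-10) is used.
    INSERT INTO table_2 VALUES (1, '2020-02-10'), (1, '2020-03-10');
""")

# Count each year-month once per customer, considering only salaries
# dated before the customer's latest overdraft start date.
rows = conn.execute("""
    SELECT tb1.customer_id,
           COUNT(DISTINCT strftime('%Y%m', tb1.date_from)) AS salary_months
    FROM table_1 tb1
    JOIN (
        SELECT customer_id, MAX(od_start_date) AS od_start_date
        FROM table_2
        GROUP BY customer_id
    ) tb2 ON tb1.customer_id = tb2.customer_id
    WHERE tb1.date_from < tb2.od_start_date
    GROUP BY tb1.customer_id
""").fetchall()

print(rows)
```

Three salaries fall before the latest overdraft, but two of them share January, so the customer ends up with a count of 2.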

SQL Percentage Count Query By Date

I am able to calculate the percentage count on a particular date in a Microsoft Access 2007 SQL query using:
SELECT Date, Val, Count(Val) / (SELECT Count(*) From Table HAVING Date=#7/31/2012#) as PercentVal
FROM Table
GROUP BY Date, Val
HAVING Date=#7/31/2012#
However, I would like to make this same calculation over every date using the count totals. For instance, the query:
SELECT Date, Val, Count(*) AS CountVal
FROM Table
GROUP BY Date, Val
finds the counts in every period. I would like to add an additional column with the percent counts. However, I can't seem to figure out how to calculate count percentage in every period without using the above block of text and setting up queries for each individual period.
You can subquery it like this:
SELECT A.ADate, A.Val, COUNT(A.Val) / B.DateCount
FROM Table1 AS A
INNER JOIN (
SELECT C.ADate, COUNT(*) AS DateCount
FROM Table1 C
GROUP BY C.ADate
) AS B ON A.ADate = B.ADate
GROUP BY A.ADate, A.Val, B.DateCount
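Here is a minimal sketch of the same join, run on SQLite through Python's sqlite3 module rather than Access, with hypothetical rows. One difference to be aware of: SQLite truncates integer division, so the sketch multiplies by 100.0 to force a floating-point percentage, whereas the `/` operator in Access already returns a fractional result.

```python
import sqlite3

# Hypothetical data: two values split evenly on one date,
# and a single value on another date.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (ADate TEXT, Val TEXT);
    INSERT INTO Table1 VALUES
        ('2012-07-31', 'x'), ('2012-07-31', 'x'),
        ('2012-07-31', 'y'), ('2012-07-31', 'y'),
        ('2012-08-31', 'x');
""")

# The derived table B holds the total row count per date; each
# (date, value) count is divided by the total for its own date.
rows = conn.execute("""
    SELECT A.ADate,
           A.Val,
           100.0 * COUNT(A.Val) / B.DateCount AS PercentVal
    FROM Table1 AS A
    INNER JOIN (
        SELECT C.ADate, COUNT(*) AS DateCount
        FROM Table1 C
        GROUP BY C.ADate
    ) AS B ON A.ADate = B.ADate
    GROUP BY A.ADate, A.Val, B.DateCount
    ORDER BY A.ADate, A.Val
""").fetchall()

for row in rows:
    print(row)
```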