Showing zeroes in sql count - sql

I'm using Redshift and trying to count different things by day, but a day is not shown when its count in table 2 is zero. How can I make it show a count of zero?
SELECT TO_CHAR(date1,'dd') AS day,
COUNT(*) as Volume,sum(CASE WHEN status = 'ANSWERED' THEN 1 ELSE 0 END )as ANSWERED , t2.Volume AS TRANSFERS
FROM table1 t1
RIGHT JOIN (SELECT TO_CHAR(date2,'dd') AS day,
COUNT(*) as Volume
FROM table2
WHERE TO_CHAR(date2,'yyyy_MM') IN (SELECT DISTINCT TO_CHAR(date2,'yyyy_MM')
FROM table2
WHERE date2 BETWEEN DATE ('2016-11-01') AND DATE ('2016-12-30'))
AND type = 'Active'
GROUP BY day) t2 ON TO_CHAR(date1,'dd') = day
WHERE TO_CHAR(date1,'yyyy_MM') IN (SELECT DISTINCT TO_CHAR(date1,'yyyy_MM')
FROM table1
WHERE date1 BETWEEN DATE ('2016-11-01') AND DATE ('2016-12-30'))
GROUP BY 1,4
ORDER BY 1

Notice that you used a right join between the tables. This means that any row from the first table that doesn't have a matching day in the second table will not display.
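If your first (left) table contains all of the unique days that should show up in the result, just switch the "right" join to a "left" join. Days with no match in the second table will then come back with a NULL volume, so wrap t2.Volume in COALESCE(t2.Volume, 0) to display zero.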
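To make the effect of the left join concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than Redshift. The sample rows are invented, strftime('%d', ...) stands in for TO_CHAR(..., 'dd'), and the month filters from the question are left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (date1 TEXT, status TEXT);
    CREATE TABLE table2 (date2 TEXT, type TEXT);
    -- Invented sample data: day 02 has calls but no transfers.
    INSERT INTO table1 VALUES
        ('2016-11-01', 'ANSWERED'),
        ('2016-11-01', 'MISSED'),
        ('2016-11-02', 'ANSWERED');
    INSERT INTO table2 VALUES ('2016-11-01', 'Active');
""")

rows = conn.execute("""
    SELECT t1.day,
           t1.volume,
           t1.answered,
           COALESCE(t2.volume, 0) AS transfers  -- NULL from the outer join becomes 0
    FROM (SELECT strftime('%d', date1) AS day,
                 COUNT(*) AS volume,
                 SUM(CASE WHEN status = 'ANSWERED' THEN 1 ELSE 0 END) AS answered
          FROM table1
          GROUP BY day) t1
    LEFT JOIN (SELECT strftime('%d', date2) AS day,
                      COUNT(*) AS volume
               FROM table2
               WHERE type = 'Active'
               GROUP BY day) t2
           ON t2.day = t1.day
    ORDER BY t1.day
""").fetchall()

for row in rows:
    print(row)
```

Day 02 is kept because it exists in the left-hand subquery, and its transfer count is reported as 0 instead of being dropped.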

Related

Select difference between two tables

I want to list four columns, date, hourly count, daily count and difference between two counts.
I have used UNION ALL for the two tables, but I am getting 2 rows:
Select a.date, a.hour,b.daily,sum(a.hour-b.daily)
from (select date,count(*) hour,''daily
From table a union all select '' hour,count(*) daily from table b)
Group by date, daily, hourly..
Please suggest to me a solution.
I see that the code supplied uses a UNION to achieve the output. This would be better served by using a JOIN of some kind.
The result is the number of rows in table_a per date minus the number of rows in table_b for the same date.
This code is untested but should give a good indication of how to achieve this:
SELECT a.date,
a.hour,
ISNULL(b.daily, 0) AS daily,
a.hour - ISNULL(b.daily, 0) AS difference
FROM (
SELECT date,
COUNT(*) AS hour
FROM table_a
GROUP BY date
) a
LEFT JOIN (
SELECT date,
COUNT(*) AS daily
FROM table_b
GROUP BY date
) b ON b.date = a.date
ORDER BY a.date;
This works by:
Calculating the count per date in table_a.
Calculating the count per date in table_b.
Joining all results from table_a with those matching in table_b.
Outputting the date, the hour from table_a, the daily (or 0 if NULL) from table_b, and the difference between the two.
Notes:
I have renamed table a and table b to table_a and table_b. I presume these are not the actual table names.
An INNER JOIN may be preferable if you only want results that have matching date columns in both tables. Using the LEFT JOIN will return all results from table_a regardless of whether table_b has an entry.
I'm not convinced that date is an allowed column name but I have reproduced it in the code as per the example given by OP.
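As a rough check of the LEFT JOIN approach, the sketch below runs an equivalent query in SQLite through Python's sqlite3 module. The sample rows are invented, the column is called event_date (a hypothetical name chosen to sidestep the reserved-word concern), the count alias is hourly, and COALESCE replaces ISNULL because SQLite has no two-argument ISNULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (event_date TEXT);
    CREATE TABLE table_b (event_date TEXT);
    -- Invented sample data: 2020-01-02 has no rows in table_b.
    INSERT INTO table_a VALUES ('2020-01-01'), ('2020-01-01'), ('2020-01-02');
    INSERT INTO table_b VALUES ('2020-01-01');
""")

diff_rows = conn.execute("""
    SELECT a.event_date,
           a.hourly,
           COALESCE(b.daily, 0) AS daily,
           a.hourly - COALESCE(b.daily, 0) AS difference
    FROM (SELECT event_date, COUNT(*) AS hourly
          FROM table_a
          GROUP BY event_date) a
    LEFT JOIN (SELECT event_date, COUNT(*) AS daily
               FROM table_b
               GROUP BY event_date) b
           ON b.event_date = a.event_date
    ORDER BY a.event_date
""").fetchall()

for row in diff_rows:
    print(row)
```

Note that a date present only in table_b would not appear in this output, since the join is driven by table_a.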
Your method is fine. Your group by columns are not correct:
Select date, sum(hourly) as hourly, sum(daily) as daily,
sum(hourly) - sum(daily) as diff
from ((select date, count(*) as hourly, 0 as daily
from table a
group by date
) union all
(select date, 0 as hourly, count(*) as daily
from table b
group by date
)
) ab
group by date;
The key idea is that the outer query aggregates only by date -- and you still need aggregation functions there as well.
You have other errors in your subquery, such as missing group bys and date columns. I assume those are transcription errors.
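The UNION ALL variant can be tried the same way. This is a sketch in SQLite via Python's sqlite3 module with invented rows and a hypothetical event_date column; the parentheses around each branch are dropped because SQLite does not accept them around the members of a compound select.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (event_date TEXT);
    CREATE TABLE table_b (event_date TEXT);
    -- Invented sample data: 2020-01-03 exists only in table_b.
    INSERT INTO table_a VALUES ('2020-01-01'), ('2020-01-01'), ('2020-01-02');
    INSERT INTO table_b VALUES ('2020-01-01'), ('2020-01-03');
""")

union_rows = conn.execute("""
    SELECT event_date,
           SUM(hourly) AS hourly,
           SUM(daily) AS daily,
           SUM(hourly) - SUM(daily) AS diff
    FROM (SELECT event_date, COUNT(*) AS hourly, 0 AS daily
          FROM table_a
          GROUP BY event_date
          UNION ALL
          SELECT event_date, 0 AS hourly, COUNT(*) AS daily
          FROM table_b
          GROUP BY event_date) ab
    GROUP BY event_date
    ORDER BY event_date
""").fetchall()

for row in union_rows:
    print(row)
```

Unlike a LEFT JOIN driven by table_a, this form also keeps dates that appear in only one of the two tables, which is why 2020-01-03 shows up with a negative difference.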

Need to find a difference of data from the same table in hive

I have a history table with loaded timestamp column. I need to fetch the subtracted data using the timestamp column.
Logic: get the email addresses by subtracting the data for (loaded_timestamp - 1) and current_timestamp. Only the subtracted data should be the output.
Select query :
select t1.email_addr
from (select *
from table t1
where loaded_timestamp = current_timestamp
) left outer join
(select *
from table t2
where loaded_timestamp = date_sub(current_timestamp,1)
)
where t1.email!=t2.email;
Table has following columns
Email address, First name , last name, loaded_timestamp.
xxx@gmail.com,xxx,aaa,2020-03-08.
yyy@gmail.com,yyy,bbb,2020-03-08.
zzz@gmail.com,zzz,ccc,2020-03-08.
xxx@gmail.com,xxx,aaa,2020-03-09.
yyy@gmail.com,yyy,bbb,2020-03-09.
Desired Result
zzz@gmail.com
So if I subtract the two dates from the same table, i.e. (2020-03-09 - 2020-03-08), I should get only the record that does not match. Matching records should be discarded and the unmatched record should be the output.
The best I can figure out is that you want emails that appear only once. If that is the case, use window functions:
select t.*
from (select t.*, count(*) over (partition by email) as cnt
from t
) t
where cnt = 1;
If you want emails in the data but not loaded on the current date, then:
select t.email
from t
group by t.email
having max(timestamp) <> current_date;
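The second query can be checked against the sample data from the question. The sketch below uses SQLite through Python's sqlite3 module instead of Hive; the table name history is hypothetical, the column is taken to be loaded_timestamp, and a fixed date parameter stands in for current_date so the result is reproducible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE history (email TEXT, first_name TEXT, last_name TEXT, loaded_timestamp TEXT)"
)
conn.executemany(
    "INSERT INTO history VALUES (?, ?, ?, ?)",
    [
        ("xxx@gmail.com", "xxx", "aaa", "2020-03-08"),
        ("yyy@gmail.com", "yyy", "bbb", "2020-03-08"),
        ("zzz@gmail.com", "zzz", "ccc", "2020-03-08"),
        ("xxx@gmail.com", "xxx", "aaa", "2020-03-09"),
        ("yyy@gmail.com", "yyy", "bbb", "2020-03-09"),
    ],
)

# Assumption: '2020-03-09' plays the role of current_date.
current_day = "2020-03-09"
missing_today = conn.execute(
    """
    SELECT email
    FROM history
    GROUP BY email
    HAVING MAX(loaded_timestamp) <> ?
    ORDER BY email
    """,
    (current_day,),
).fetchall()

print(missing_today)
```

Only addresses whose most recent load is older than the chosen day are returned, which matches the desired result in the question.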

Count transactions within a month only once

I have a situation like the one below:
I have two database tables. The first table, which I will call TB1 contains all the salaries that the client credits & also the date when the transaction is made.
The second table, which I will call TB2, contains all the products the client has in the bank.
My purpose is to find the number of salaries the client has got before the date he/she got a product (OVERDRAFT in my case) in our bank.
Till now, everything works fine and I have made the query to extract the necessary data.
The only problem is that I need to improve the query: if a client has received more than one salary within the same month of the same year (for example, one every 15 days), the salary should be counted only once.
How can I do that?
The query is like below:
SELECT TB1.customer_id, COUNT(TB1.customer_id)
FROM table_1 TB1
JOIN
( SELECT TB2.CUSTOMER_ID, TB2.OD_START_DATE
FROM table_2 TB2
JOIN table_2 TB2_MAX
ON TB2.CUSTOMER_ID = TB2_MAX.CUSTOMER_ID
HAVING TB2.od_start_date = MAX(TB2.od_start_date)
GROUP BY TB2.customer_id, TB2.od_start_date
) TB2
ON TB1.CUSTOMER_ID = TB2.CUSTOMER_ID
WHERE TB1.DATE_FROM < TB2.OD_START_DATE
GROUP BY TB1.CUSTOMER_ID
PS: DATE_FROM field contains the date when the transaction is made, while OD_START_DATE field contains the date when the LATEST product is opened.
The JOIN in your inner query is redundant. You simply need the MAX date for each customer.
In your outer query you should count DATE_FROM rather than customer_id. Since you want transactions in the same month to be counted only once, convert DATE_FROM to a year-month string and use DISTINCT so that each month is counted once.
SELECT TB1.customer_id, COUNT(DISTINCT TO_CHAR(TB1.DATE_FROM,'YYYYMM'))
FROM table_1 TB1
JOIN
( SELECT CUSTOMER_ID, MAX(OD_START_DATE) AS OD_START_DATE
FROM table_2
GROUP BY customer_id
) TB2
ON TB1.CUSTOMER_ID = TB2.CUSTOMER_ID
WHERE TB1.DATE_FROM < TB2.OD_START_DATE
GROUP BY TB1.CUSTOMER_ID
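Here is a small sketch of the same idea in SQLite via Python's sqlite3 module, with invented rows. strftime('%Y%m', ...) is used as a stand-in for TO_CHAR(..., 'YYYYMM'), and dates are stored as ISO strings so that the < comparison orders them correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_1 (customer_id INTEGER, date_from TEXT);
    CREATE TABLE table_2 (customer_id INTEGER, od_start_date TEXT);
    -- Invented sample data. Customer 1 is paid twice in January.
    INSERT INTO table_1 VALUES
        (1, '2020-01-05'), (1, '2020-01-20'), (1, '2020-02-05'), (1, '2020-04-05'),
        (2, '2020-01-10');
    INSERT INTO table_2 VALUES
        (1, '2020-02-10'), (1, '2020-03-01'),
        (2, '2020-01-01');
""")

salary_months = conn.execute("""
    SELECT tb1.customer_id,
           COUNT(DISTINCT strftime('%Y%m', tb1.date_from)) AS salary_months
    FROM table_1 tb1
    JOIN (SELECT customer_id, MAX(od_start_date) AS od_start_date
          FROM table_2
          GROUP BY customer_id) tb2
      ON tb1.customer_id = tb2.customer_id
    WHERE tb1.date_from < tb2.od_start_date
    GROUP BY tb1.customer_id
    ORDER BY tb1.customer_id
""").fetchall()

print(salary_months)
```

Customer 1 has three salaries before the latest overdraft date but only two distinct months, so the count is 2. Customer 2 has no salary before the overdraft date and therefore does not appear at all.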

How to include missing rows in sql return

I am currently trying to do a query like this:
(Pseudocode)
SELECT
NAME, SUM(VALUE), MONTH
FROM TABLE
WHERE MONTH BETWEEN 12 MONTHS AGO AND NOW
GROUP BY MONTH, NAME
The problem is that a name exists in some of the months but not all of them, so if I filter this down to return the values for only one name, I sometimes get only 3 or 4 rows rather than the 12 I expect to see.
My question is: is there a way to return a row for every name and month within the range, with the value set to zero where the row is missing from the previous result?
My first thought was to union another select onto it, but I can't get the logic to work with the group by, or with the where clauses that limit the names.
If you have data for all months, you can take the following approach. Generate all the rows (using a cross join), then bring in the data you want:
select m.month, n.name, sum(t.value)
from (select distinct month from table) m cross join
(select distinct name from table) n left join
table t
on t.month = m.month and t.name = n.name
group by m.month, n.name;
This will return the missing sums as NULL values. If you want zero, then use coalesce(sum(t.value), 0).
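The cross join approach can be sketched in SQLite through Python's sqlite3 module. The table name sales is hypothetical (table itself is a reserved word), the rows are invented, and month is a plain integer to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (name TEXT, month INTEGER, value INTEGER);
    -- Invented sample data: bob has nothing in month 2.
    INSERT INTO sales VALUES
        ('alice', 1, 10),
        ('alice', 2, 5),
        ('alice', 2, 3),
        ('bob',   1, 7);
""")

totals = conn.execute("""
    SELECT m.month, n.name, COALESCE(SUM(t.value), 0) AS total
    FROM (SELECT DISTINCT month FROM sales) m
    CROSS JOIN (SELECT DISTINCT name FROM sales) n
    LEFT JOIN sales t
           ON t.month = m.month AND t.name = n.name
    GROUP BY m.month, n.name
    ORDER BY m.month, n.name
""").fetchall()

for row in totals:
    print(row)
```

Every month and name combination is generated first, so the missing (2, 'bob') pair is reported with a total of 0 rather than being absent.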
You can use something like the following query to generate the past 12 months as separate rows:
SELECT add_months(trunc(add_months(sysdate, -12), 'MONTH'), LEVEL - 1) AS month_in_range
FROM dual
CONNECT BY LEVEL <= 1 + months_between(trunc(sysdate, 'MONTH'), trunc(add_months(sysdate, -12), 'MONTH'));
Then do an outer join between your table and this.
I ended up implementing a left outer join similar to @paqogomez's comment. As my team already maintains a time table, it's very easy to get the month list for an outer join.
SELECT NAME, SUM(VALUE), TIME.MONTH
FROM (SELECT DISTINCT MONTH FROM TIME_TABLE
WHERE MONTH BETWEEN 12 MONTHS AGO AND NOW) TIME
LEFT OUTER JOIN TABLE ON (TIME.MONTH = TABLE.MONTH)
GROUP BY TIME.MONTH, NAME

SQL GROUP BY ( DATEPART(), field1 ) result set to zero nulls

I want to aggregate counts, grouped by a datepart and column.
For example, a table with 3 columns with each row representing a unique event: id, name, date
I want to select total counts grouped by name and hour, with zeros when there are no events. If I'm only grouping by name, I can join it with a table of every name. With an hour I could do something similar.
How would I handle the case of grouping by both without having a table with a row for every name+hour combination?
The following is the MySQL solution:
create table hours (hour int);
insert into hours (hour) values (0), (1), ..., (23);
select hour, name, sum(case when name is null then 0 else 1 end)
from hours left outer join
event on (hour(event.date) = hours.hour)
group by hour, name
The sum(case when name is null then 0 else 1 end) handles the case where there are no events for a particular hour: the count will show as 0. Otherwise, each matching row contributes 1 to the sum.
For SQL Server, use datepart(hour, event.date) instead. The rest should be similar.
You can use a cross join to generate all the rows and then other logic to fill in the values:
select h.hour, n.name, count(a.name) as cnt
from (select distinct hour(date) as hour from atable) h cross join
(select distinct name from atable) n left join
atable a
on hour(a.date) = h.hour and a.name = n.name
group by h.hour, n.name;
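The two answers can be combined: an hours table supplies all 24 hours, even those with no events at all, and a cross join with the distinct names supplies every name and hour pair. The sketch below does this in SQLite through Python's sqlite3 module; the rows are invented, the timestamp column is called happened_at (a hypothetical name), and strftime('%H', ...) stands in for hour() or datepart().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE hours (hour INTEGER);
    CREATE TABLE event (id INTEGER, name TEXT, happened_at TEXT);
    -- Invented sample data.
    INSERT INTO event VALUES
        (1, 'a', '2020-01-01 09:15:00'),
        (2, 'a', '2020-01-01 09:45:00'),
        (3, 'b', '2020-01-01 10:05:00');
""")
# Fill the helper table with hours 0..23.
conn.executemany("INSERT INTO hours (hour) VALUES (?)", [(h,) for h in range(24)])

hourly_counts = conn.execute("""
    SELECT h.hour, n.name, COUNT(e.id) AS cnt
    FROM hours h
    CROSS JOIN (SELECT DISTINCT name FROM event) n
    LEFT JOIN event e
           ON CAST(strftime('%H', e.happened_at) AS INTEGER) = h.hour
          AND e.name = n.name
    GROUP BY h.hour, n.name
    ORDER BY h.hour, n.name
""").fetchall()

counts = {(hour, name): cnt for hour, name, cnt in hourly_counts}
print(len(hourly_counts), counts[(9, 'a')], counts[(9, 'b')], counts[(10, 'b')])
```

COUNT(e.id) counts only matched events, so unmatched name and hour pairs come out as 0 without needing a CASE expression.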