COUNT with WHERE clause giving more rows than without WHERE clause - sql

This may not be the right forum for this, but I want to understand the logical error in my query.
I wrote the query below to find out how many users have more delivered messages than sent messages (possibly an error in data capture that I just wanted to assess).
SELECT COUNT(DISTINCT user_id)
FROM wk_24_trigger
UNION
SELECT COUNT(DISTINCT user_id)
FROM (
SELECT *, (CASE WHEN delivered > 0 THEN 1 ELSE 0 END) as D,
(CASE WHEN sent > 0 THEN 1 ELSE 0 END) as S
FROM wk_24_trigger) t
WHERE t.D > t.s
The result I got is as follows:
_c0
1 1056840
2 1819729
I don't understand why row 2 > row 1.
Even if Delivered > Sent for every entry, row 2 should at most equal row 1.

Are you sure that the first row is the result of the first query and the second row the result of the second?
That is not guaranteed: UNION removes duplicates and makes no promise about the order of the rows it returns.
Try adding a label column after the count in each query and verify the result.
You can check the example below as well.
WITH TEMP
AS(
SELECT 'A' USER_ID , 1 DELIVERED , NULL SENT FROM DUAL
UNION
SELECT 'B' ID , 10 A , 1 B FROM DUAL
UNION
SELECT 'C' ID , NULL A , 1 B FROM DUAL
UNION
SELECT 'D' ID , -1 A , 1 B FROM DUAL
)
SELECT COUNT(DISTINCT USER_ID), 'QUERY_1' QUERY
FROM TEMP
UNION
(SELECT COUNT(DISTINCT USER_ID), 'QUERY_2'
FROM (
SELECT USER_ID,DELIVERED,SENT,
(CASE
WHEN DELIVERED > 0 THEN
1
ELSE
0
END) D,
(CASE
WHEN SENT > 0 THEN
1
ELSE
0
END) S
FROM TEMP) T
WHERE T.D > T.S);
The output is as follows:
COUNT(DISTINCTUSER_ID) QUERY
1 1 QUERY_2
2 4 QUERY_1
The same could be happening in your case.
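The labelling idea can also be checked outside the database. The following is a minimal sketch using Python's built-in sqlite3 module rather than the asker's engine; the sample rows and the label strings are made up for illustration, and only the table and column names come from the question. Each branch carries a literal label, so the counts can be identified no matter which order the rows come back in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE wk_24_trigger (user_id TEXT, delivered INTEGER, sent INTEGER)"
)
# Hypothetical sample rows: only user 'a' has delivered > 0 while sent is not > 0.
conn.executemany(
    "INSERT INTO wk_24_trigger VALUES (?, ?, ?)",
    [("a", 1, None), ("b", 10, 1), ("c", None, 1), ("d", -1, 1)],
)

rows = conn.execute(
    """
    SELECT 'all_users' AS label, COUNT(DISTINCT user_id) AS n
    FROM wk_24_trigger
    UNION ALL
    SELECT 'delivered_without_sent', COUNT(DISTINCT user_id)
    FROM (
        SELECT user_id,
               CASE WHEN delivered > 0 THEN 1 ELSE 0 END AS d,
               CASE WHEN sent > 0 THEN 1 ELSE 0 END AS s
        FROM wk_24_trigger
    ) t
    WHERE t.d > t.s
    """
).fetchall()

# Key the counts by label instead of relying on row position.
counts = dict(rows)
print(counts)
```

With the label in place, the filtered count can never be mistaken for the unfiltered one, and it stays less than or equal to it.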

Related

Checking if all values for user_id IS NOT NULL

I have a dataset which looks like this:
UserID AccountID CloseDate
1 1000 14/3/2022
1 2000 16/3/2022
2 1000 NULL
2 2000 4/3/2022
2 3000 NULL
I would like to check whether, within one user_id, all of the close dates are not null; in other words, whether all accounts of that user are closed. I tried MAX and MIN, but they don't work as I expected because they simply ignore NULL values. Is there another function that can check this? Let's say my output would be another column that is 1 when all CloseDates for the user are not null, and 0 otherwise.
Sample output:
UserID AccountID CloseDate Check
1 1000 14/3/2022 1
1 2000 16/3/2022 1
2 1000 NULL 0
2 2000 4/3/2022 0
2 3000 NULL 0
Use conditional aggregation to explicitly COUNT the rows where the column has the value NULL:
SELECT GroupedColumn,
COUNT(CASE WHEN NullableColumn IS NULL THEN 1 END) AS NullCount
FROM dbo.YourTable
GROUP BY GroupedColumn;
If you want to just have a 1 or 0 just wrap the count in a CASE expression:
CASE COUNT(CASE WHEN NullableColumn IS NULL THEN 1 END) WHEN 0 THEN 1 ELSE 0 END
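To see the conditional aggregation produce the per-row flag from the question, here is a small sketch run through Python's sqlite3 module (not SQL Server). The table name accounts and the column alias AllClosed are assumptions made for the example; the data is the sample from the question. The NULL count is computed once per UserID and joined back to every row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (UserID INTEGER, AccountID INTEGER, CloseDate TEXT)"
)
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?, ?)",
    [
        (1, 1000, "14/3/2022"),
        (1, 2000, "16/3/2022"),
        (2, 1000, None),
        (2, 2000, "4/3/2022"),
        (2, 3000, None),
    ],
)

flagged = conn.execute(
    """
    SELECT a.UserID, a.AccountID, a.CloseDate,
           CASE WHEN g.NullCount = 0 THEN 1 ELSE 0 END AS AllClosed
    FROM accounts a
    JOIN (
        -- COUNT skips NULLs, so only rows with a NULL CloseDate are counted.
        SELECT UserID,
               COUNT(CASE WHEN CloseDate IS NULL THEN 1 END) AS NullCount
        FROM accounts
        GROUP BY UserID
    ) g ON g.UserID = a.UserID
    ORDER BY a.UserID, a.AccountID
    """
).fetchall()

for row in flagged:
    print(row)
```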
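You can try the FIRST_VALUE window function with a condition. In SQL Server, NULLs sort first in ascending order, so the first value per UserID is 0 whenever any CloseDate is NULL: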
SELECT *,
FIRST_VALUE(IIF(CloseDate IS NULL,0,1)) OVER(PARTITION BY UserID ORDER BY CloseDate )
FROM T
with dataset as (select 1 as UserId, 1000 as AccountID, '14/3/2022' as CloseDate
union all select 1, 2000, '16/3/2022'
union all select 2, 1000, NULL
union all select 2, 2000, '4/3/2022'
union all select 2, 3000, NULL)
select userid from dataset
group by userid
having sum(case when closedate is null then 1 else 0 end) = 0;
select d.*, iif(chk>0, 0, 1) chk
from d
outer apply (
select UserId, COUNT(*) CHK
from d dd
WHERE d.UserId = dd.UserId
and dd.CloseDate IS NULL
group by UserId
) C
You can also use EXISTS, e.g.:
select y.UserID, y.AccountID, y.CloseDate,
-- [Check]: returns 0 if there is a row in the table for the
-- UserID where CloseDate is null, else 1
(case when exists(select * from YourTable y2 where y2.UserID = y.UserID
AND y2.CloseDate is null) then 0 else 1 end) as [Check]
from YourTable y

Transform a table with duplicate unique id's with different column values onto a single row

I am trying to transform my data from its current format (image 1) into the format of image 2. As you can see, the data is currently split over two rows per cust_id, one for each code, but I want it on a single row. For a given code, open and replied are mutually exclusive for 1's, i.e. custx for code A cannot have 1 & 1 for open and replied, but can have 0 & 0, 1 & 0 or 0 & 1. I am using Oracle SQL Developer 19.2.1. Thank you in advance.
Current SQL data format
Desired SQL data format
Try the following:
select
cust_key,
min(case when code = 'A' then open end) as A_open,
min(case when code = 'B' then open end) as B_open,
min(case when code = 'A' then replied end) as A_replied,
min(case when code = 'B' then replied end) as B_replied
from yourTable
group by
cust_key
Output:
cust_key  A_open  B_open  A_replied  B_replied
cust1     0       0       1          0
cust2     0       0       1          1
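The conditional-aggregation pivot above can be tried end to end with the sketch below, which uses Python's sqlite3 module instead of Oracle. The sample rows are taken from the PIVOT answer in this thread; the column name opened (instead of open) is an assumption made to avoid any keyword clash.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourTable (cust_key TEXT, code TEXT, opened INTEGER, replied INTEGER)"
)
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?, ?)",
    [
        ("cust1", "A", 0, 1),
        ("cust1", "B", 0, 0),
        ("cust2", "A", 0, 1),
        ("cust2", "B", 0, 1),
    ],
)

pivoted = conn.execute(
    """
    SELECT cust_key,
           -- Each CASE keeps one code's value and yields NULL for the other,
           -- and MIN ignores NULLs, so one value per code survives.
           MIN(CASE WHEN code = 'A' THEN opened END)  AS A_open,
           MIN(CASE WHEN code = 'B' THEN opened END)  AS B_open,
           MIN(CASE WHEN code = 'A' THEN replied END) AS A_replied,
           MIN(CASE WHEN code = 'B' THEN replied END) AS B_replied
    FROM yourTable
    GROUP BY cust_key
    ORDER BY cust_key
    """
).fetchall()

for row in pivoted:
    print(row)
```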
You can use PIVOT as shown below.
with xxx as (
select 'cust1' custkey,'A' code, 0 opened,1 replied from dual
union all
select 'cust1' custkey,'B' code, 0 opened,0 replied from dual
union all
select 'cust2' custkey,'A' code, 0 opened,1 replied from dual
union all
select 'cust2' custkey,'B' code, 0 opened,1 replied from dual
)
SELECT * FROM
(
select xxx.* from XXX
)
PIVOT
(
sum(opened) as OPENED,sum(replied) REPLIED FOR (CODE) IN ('A' AS A,'B' AS B)
)
group by custkey,a_opened,a_REPLIED,b_opened,b_REPLIED

Oracle query with group

I have a scenario where I need to fetch all the records within an ID for the same source. Given below is my input set of records
ID SOURCE CURR_FLAG TYPE
1 IBM Y P
1 IBM Y OF
1 IBM Y P
2 IBM Y P
2 TCS Y P
3 IBM NULL P
3 IBM NULL P
3 IBM NULL P
4 IBM NULL OF
4 IBM NULL OF
4 IBM Y ON
From the above set, I need to select all the records with source IBM within the same ID group. If an ID group contains at least one record with a source other than IBM, I don't want any record from that group. Also, I need only those ID groups in which at least one record has CURR_FL='Y'.
In the above scenario, even though ID=3 has IBM as its source, there is no record with CURR_FL='Y', so my query should not fetch it. For ID=4, it can fetch all the records, because one of them has the value 'Y'.
Within each group that satisfies the above conditions, I need one more condition on source_type: if there are records with source_type='P', I need to fetch only those records. If there are none with 'P', I fall back to source_type='OF', and otherwise to source_type='ON'.
I have written the query given below, but it runs for a long time and does not fetch any results. Is there a better way to write this query?
select
ID,
SOURCE,
CURR_FL,
TYPE
from TABLE a
where
not exists(select 1 from TABLE B where a.ID = B.ID and source <> 'IBM')
and exists(select 1 from TABLE C where a.ID = C.ID and CURR_FL = 'Y') and
(TYPE, ID) IN (
select case type when 1 then 'P' when 2 then 'OF' else 'ON' END TYPE,ID from
(select ID,
max(priority) keep (dense_rank first order by priority asc) as type
from ( select ID,TYPE,
case TYPE
when 'P' then 1
when 'OF' then 2
when 'ON' then 3
end as priority
from TABLE where ID
in(select ID from TABLE where CURR_FL='Y') AND SOURCE='IBM')
group by ID))
I think you can do a single aggregation over your table by ID, checking for the yes flag and asserting that no non-IBM source appears. I do this in a CTE below, and then join back to your original table to return the full matching records.
WITH cte AS (
SELECT
ID,
CASE WHEN SUM(CASE WHEN TYPE = 'P' THEN 1 ELSE 0 END) > 0
THEN 1
WHEN SUM(CASE WHEN TYPE = 'OF' THEN 1 ELSE 0 END) > 0
THEN 2
WHEN SUM(CASE WHEN TYPE = 'ON' THEN 1 ELSE 0 END) > 0
THEN 3 ELSE 4 END AS p_type
FROM yourTable
GROUP BY ID
HAVING
SUM(CASE WHEN CURR_FLAG = 'Y' THEN 1 ELSE 0 END) > 0 AND
SUM(CASE WHEN SOURCE <> 'IBM' THEN 1 ELSE 0 END) = 0
)
SELECT t1.*
FROM yourTable t1
INNER JOIN cte t2
ON t1.ID = t2.ID
WHERE
t2.p_type = 1 AND t1.TYPE = 'P' OR
t2.p_type = 2 AND t1.TYPE = 'OF' OR
t2.p_type = 3 AND t1.TYPE = 'ON';
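As a quick check of the CTE approach against the sample rows from the question, here is a sketch that runs essentially the same query through Python's sqlite3 module instead of Oracle. Parentheses are added around the AND/OR conditions only for readability; the table name yourTable follows the answer, and everything else is the question's data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourTable (ID INTEGER, SOURCE TEXT, CURR_FLAG TEXT, TYPE TEXT)"
)
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?, ?)",
    [
        (1, "IBM", "Y", "P"), (1, "IBM", "Y", "OF"), (1, "IBM", "Y", "P"),
        (2, "IBM", "Y", "P"), (2, "TCS", "Y", "P"),
        (3, "IBM", None, "P"), (3, "IBM", None, "P"), (3, "IBM", None, "P"),
        (4, "IBM", None, "OF"), (4, "IBM", None, "OF"), (4, "IBM", "Y", "ON"),
    ],
)

result = conn.execute(
    """
    WITH cte AS (
        SELECT ID,
               -- Best available type per ID: P first, then OF, then ON.
               CASE WHEN SUM(CASE WHEN TYPE = 'P'  THEN 1 ELSE 0 END) > 0 THEN 1
                    WHEN SUM(CASE WHEN TYPE = 'OF' THEN 1 ELSE 0 END) > 0 THEN 2
                    WHEN SUM(CASE WHEN TYPE = 'ON' THEN 1 ELSE 0 END) > 0 THEN 3
                    ELSE 4 END AS p_type
        FROM yourTable
        GROUP BY ID
        HAVING SUM(CASE WHEN CURR_FLAG = 'Y' THEN 1 ELSE 0 END) > 0
           AND SUM(CASE WHEN SOURCE <> 'IBM' THEN 1 ELSE 0 END) = 0
    )
    SELECT t1.ID, t1.SOURCE, t1.CURR_FLAG, t1.TYPE
    FROM yourTable t1
    INNER JOIN cte t2 ON t1.ID = t2.ID
    WHERE (t2.p_type = 1 AND t1.TYPE = 'P')
       OR (t2.p_type = 2 AND t1.TYPE = 'OF')
       OR (t2.p_type = 3 AND t1.TYPE = 'ON')
    ORDER BY t1.ID
    """
).fetchall()

for row in result:
    print(row)
```

On this data, ID 2 is dropped because of the TCS row, ID 3 because no row has the flag set, ID 1 keeps only its P rows, and ID 4 falls back to its OF rows.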

SQL query - sum of values by status for date interval

One query is driving me crazy. I have a table like the following and I want to get the sum of Value by Status for every date in an interval.
Table
Id Name Value Date Status
1 pro1 2 01.04.14 0
2 pro1 8 02.04.14 1
3 pro2 6 02.04.14 1
4 pro3 0 03.04.14 0
5 pro4 7 03.04.14 0
6 pro4 2 03.04.14 0
7 pro4 4 03.04.14 1
8 pro4 6 04.04.14 1
9 pro4 1 04.04.14 1
For example,
Input: Name = pro4, minDate = 01.02.14, maxDate = 04.09.14
Output:
Date Values sum for 0 Status Values sum for 1 Status
01.04.14 0 0
02.04.14 0 0
03.04.14 9 (=7+2) 4 (only 4 exist)
04.04.14 0 7 (6+1)
On the dates 01.02.14 and 02.04.14, pro4 has no values for either status, but I want to show those rows because I need all dates in the interval. Can anyone help me create this query?
Edit:
I cannot change the structure; I already have that table with data. Every day appears in the table at least once.
Thanks in advance.
Assuming you have a row for each date in the table, use conditional aggregation:
select date,
sum(Case when name = 'pro4' and status = 0 then Value else 0 end) as values_0,
sum(case when name = 'pro4' and status = 1 then Value else 0 end) as values_1
from Table t
where date >= '2014-04-01' and date <= '2014-04-09'
group by date
order by date;
If you don't have this list of dates, you can take this approach instead:
with dates as (
select cast('2014-04-01' as date) as thedate
union all
select dateadd(day, 1, thedate)
from dates
where thedate < '2014-04-09'
)
select dates.thedate,
sum(Case when status = 0 then Value else 0 end) as values_0,
sum(case when status = 1 then Value else 0 end) as values_1
from dates left outer join
table t
on t.date = dates.thedate and t.name = 'pro4'
group by dates.thedate;
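The generated-dates approach can be tried with the sketch below, which runs through Python's sqlite3 module rather than SQL Server. That means dateadd is replaced by SQLite's date(..., '+1 day') and the CTE is declared WITH RECURSIVE. The table name t and the ISO date strings are assumptions made for the example; the rows are the question's sample data, limited to 1 to 4 April 2014.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER, name TEXT, value INTEGER, date TEXT, status INTEGER)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?, ?)",
    [
        (1, "pro1", 2, "2014-04-01", 0),
        (2, "pro1", 8, "2014-04-02", 1),
        (3, "pro2", 6, "2014-04-02", 1),
        (4, "pro3", 0, "2014-04-03", 0),
        (5, "pro4", 7, "2014-04-03", 0),
        (6, "pro4", 2, "2014-04-03", 0),
        (7, "pro4", 4, "2014-04-03", 1),
        (8, "pro4", 6, "2014-04-04", 1),
        (9, "pro4", 1, "2014-04-04", 1),
    ],
)

sums = conn.execute(
    """
    WITH RECURSIVE dates(thedate) AS (
        SELECT '2014-04-01'
        UNION ALL
        SELECT date(thedate, '+1 day') FROM dates WHERE thedate < '2014-04-04'
    )
    SELECT dates.thedate,
           SUM(CASE WHEN t.status = 0 THEN t.value ELSE 0 END) AS values_0,
           SUM(CASE WHEN t.status = 1 THEN t.value ELSE 0 END) AS values_1
    FROM dates
    -- The name filter lives in the ON clause so dates without pro4 rows survive.
    LEFT JOIN t ON t.date = dates.thedate AND t.name = 'pro4'
    GROUP BY dates.thedate
    ORDER BY dates.thedate
    """
).fetchall()

for row in sums:
    print(row)
```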
Just a guess at the query:
select date,
sum(case when status = 0 then value else 0 end) as Status0,
sum(case when status = 1 then value else 0 end) as Status1
from table
group by date
To expand on my comment, the complete query is:
WITH [counter](N) AS
(SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1)
, days(N) AS (
SELECT row_number() over (ORDER BY (SELECT NULL)) FROM [counter])
, months (N) AS (
SELECT N - 1 FROM days WHERE N < 13)
, calendar ([date]) AS (
SELECT DISTINCT cast(dateadd(DAY, days.n
, dateadd(MONTH, months.n, '20131231')) AS date)
FROM months
CROSS JOIN days
)
SELECT a.Name
, c.Date
, [Sum of 0] = SUM(CASE Status WHEN 0 THEN Value ELSE 0 END)
, [Sum of 1] = SUM(CASE Status WHEN 1 THEN Value ELSE 0 END)
FROM Calendar c
LEFT JOIN myTable a ON c.Date = a.Date AND a.name = 'pro4'
WHERE c.date BETWEEN '20140201' AND '20140904'
GROUP BY c.Date, a.Name
ORDER BY c.Date
Note that the condition on the name needs to be in the JOIN; otherwise you'll get only the dates present in your table.
If you need multiple years, just add another CTE for the count and a dateadd(YEAR, ...) in the calendar CTE.
This is not really the exact query, but I think you can get that by having a query that looks like:
select date, status, sum(value) from table
where (date between mindate and maxdate) and name = product_name
group by date, status;
EDIT
So the above query only gives part of the answer the OP needs. A LEFT OUTER JOIN of the original table with the result of the above query on the date and status fields will give the missing info.
e.g.
select distinct y.date, y.status, coalesce(x.sum_of_values, 0) as sum_of_values
from table as y
left outer join
(select date, status, sum(value) as sum_of_values
from table
where (date between mindate and maxdate) and name = product_name
group by date, status) as x
on y.date = x.date and y.status = x.status
order by y.date;

How to transpose recordset columns into rows

I have a query whose code looks like this:
SELECT DocumentID, ComplexSubquery1 ... ComplexSubquery5
FROM Document
WHERE ...
ComplexSubquery are all numerical fields that are calculated using, duh, complex subqueries.
I would like to use this query as a subquery to a query that generates a summary like the following one:
Field DocumentCount Total
1 dc1 s1
2 dc2 s2
3 dc3 s3
4 dc4 s4
5 dc5 s5
Where:
dc<n> = SUM(CASE WHEN ComplexSubquery<n> > 0 THEN 1 END)
s<n> = SUM(CASE WHEN Field = n THEN ComplexSubquery<n> END)
How could I do that in SQL Server?
NOTE: I know I could avoid the problem by discarding the original query and using unions:
SELECT '1' AS TypeID,
SUM(CASE WHEN ComplexSubquery1 > 0 THEN 1 END) AS DocumentCount,
SUM(ComplexSubquery1) AS Total
FROM (SELECT DocumentID, BLARGH ... AS ComplexSubquery1) T
UNION ALL
SELECT '2' AS TypeID,
SUM(CASE WHEN ComplexSubquery2 > 0 THEN 1 END) AS DocumentCount,
SUM(ComplexSubquery2) AS Total
FROM (SELECT DocumentID, BLARGH ... AS ComplexSubquery2) T
UNION ALL
...
But I want to avoid this route, because redundant code makes my eyes bleed. (Besides, there is a real possibility that the number of complex subqueries grow in the future.)
WITH Document(DocumentID, Field) As
(
SELECT 1, 1 union all
SELECT 2, 1 union all
SELECT 3, 2 union all
SELECT 4, 3 union all
SELECT 5, 4 union all
SELECT 6, 5 union all
SELECT 7, 5
), CTE AS
(
SELECT DocumentID,
Field,
(select 10) As ComplexSubquery1,
(select 20) as ComplexSubquery2,
(select 30) As ComplexSubquery3,
(select 40) as ComplexSubquery4,
(select 50) as ComplexSubquery5
FROM Document
)
SELECT Field,
SUM(CASE WHEN RIGHT(Query,1) = Field AND QueryValue > 0 THEN 1 END ) AS DocumentCount,
SUM(CASE WHEN RIGHT(Query,1) = Field THEN QueryValue END ) AS Total
FROM CTE
UNPIVOT (QueryValue FOR Query IN
(ComplexSubquery1, ComplexSubquery2, ComplexSubquery3,
ComplexSubquery4, ComplexSubquery5)
)AS unpvt
GROUP BY Field
Returns
Field DocumentCount Total
----------- ------------- -----------
1 2 20
2 1 20
3 1 30
4 1 40
5 2 100
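For readers without UNPIVOT, the same columns-to-rows step can be emulated by cross joining the computed columns with a short list of field numbers. The sketch below is an illustration run through Python's sqlite3 module, not the answer's SQL Server query. The table doc_values, its columns sub1 to sub3 (standing in for the complex subqueries), and the sample values are all hypothetical. Each summary row counts the documents whose n-th value is positive and totals that value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the original query's output: one row per document,
# one column per (already computed) complex subquery.
conn.execute(
    "CREATE TABLE doc_values (DocumentID INTEGER, sub1 INTEGER, sub2 INTEGER, sub3 INTEGER)"
)
conn.executemany(
    "INSERT INTO doc_values VALUES (?, ?, ?, ?)",
    [(1, 10, 0, 5), (2, 0, 0, 7), (3, 4, 0, 0)],
)

summary = conn.execute(
    """
    WITH fields(n) AS (VALUES (1), (2), (3)),
    unpivoted AS (
        -- One output row per (document, field number) pair.
        SELECT f.n AS Field,
               CASE f.n WHEN 1 THEN d.sub1
                        WHEN 2 THEN d.sub2
                        ELSE d.sub3 END AS QueryValue
        FROM doc_values d
        CROSS JOIN fields f
    )
    SELECT Field,
           SUM(CASE WHEN QueryValue > 0 THEN 1 ELSE 0 END) AS DocumentCount,
           SUM(QueryValue) AS Total
    FROM unpivoted
    GROUP BY Field
    ORDER BY Field
    """
).fetchall()

for row in summary:
    print(row)
```

Adding another subquery then means one more entry in the fields list and one more CASE branch, rather than another UNION ALL block.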
I'm not 100% positive from your example, but perhaps the PIVOT operator will help you out here? I think if you selected your original query into a temporary table, you could pivot on the document ID and get the sums for the other queries.
I don't have much experience with it though, so I'm not sure how complex you can get with your subqueries - you might have to break it down.