Aggregate Function on an Expression Containing a Subquery

With the following T-SQL query:
select u.userid
into #temp
from user u
where u.type = 1;
select top 50
contentid,
count(*) as all_views,
sum(case when hc.userid in (select userid from #temp) then 1 else 0 end) as first_count,
sum(case when hc.userid in (40615, 40616) then 1 else 0 end) as another_count
from hitcounts hc
inner join user u on u.userid = hc.userid
group by hc.contentid
order by count(*) desc;
I get this error message:
Cannot perform an aggregate function on an expression containing an aggregate or a subquery.
However, if I include only the column 'another_count' (with the hard-coded list of identifiers), everything works as expected. Is there a way to count only the userids returned by a subquery? I plan to have multiple columns, each counting a different set/subquery of userids.
Performance is not a concern at this point and I appreciate any guidance.

You don't need a temporary table for this purpose. Just use a conditional aggregation:
select top 50 contentid,
count(*) as all_views,
sum(case when u.type = 1 then 1 else 0 end) as first_count,
sum(case when hc.userid in (40615, 40616) then 1 else 0 end) as another_count
from hitcounts hc join
user u
on u.userid = hc.userid
group by hc.contentid
order by count(*) desc;
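As a rough, runnable sketch of the idea above, the snippet below uses Python's built-in sqlite3 module with made-up sample rows. It also shows one possible way to handle "multiple columns, each counting a different set of userids" without a subquery inside the aggregate: LEFT JOIN each set and count the matches. The special_users table is hypothetical (not from the question), and the approach assumes each userid appears at most once in it. SQLite has no TOP, so the row limit is left out.

```python
import sqlite3

# Hypothetical miniature versions of the tables in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE "user" (userid INTEGER PRIMARY KEY, type INTEGER);
    CREATE TABLE hitcounts (contentid INTEGER, userid INTEGER);
    -- Hypothetical table holding one set of userids to count (assumed unique).
    CREATE TABLE special_users (userid INTEGER PRIMARY KEY);

    INSERT INTO "user" VALUES (1, 1), (2, 2), (40615, 2), (40616, 2);
    INSERT INTO hitcounts VALUES (10, 1), (10, 2), (10, 40615), (20, 40616), (20, 2);
    INSERT INTO special_users VALUES (40615), (40616);
""")

rows = conn.execute("""
    SELECT hc.contentid,
           COUNT(*) AS all_views,
           -- Conditional aggregation on a joined column: no subquery needed.
           SUM(CASE WHEN u.type = 1 THEN 1 ELSE 0 END) AS first_count,
           -- COUNT(column) skips NULLs, so only hits matched by the LEFT JOIN count.
           COUNT(s.userid) AS another_count
    FROM hitcounts hc
    JOIN "user" u ON u.userid = hc.userid
    LEFT JOIN special_users s ON s.userid = hc.userid
    GROUP BY hc.contentid
    ORDER BY COUNT(*) DESC, hc.contentid
""").fetchall()

for row in rows:
    print(row)
```

Each additional set of userids would get its own LEFT JOIN and its own COUNT column.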

Related

how to select in select postgresql

I have a column named jenis_kelamin (gender), and I want to display the number of males and females for each row in a select query. I have a query like this, but the data that comes out is still wrong. What is the correct way?
SELECT distinct mj.id, mj.nama_pd,
(select distinct count(jenis_layanan) from public.isian_kuis
where jenis_kelamin = 'Laki' ) as jumlah_laki,
(select distinct count(jenis_layanan)
from public.isian_kuis
where jenis_kelamin = 'Perempuan' ) as jumlah_perempuan
FROM public.master_jabatan mj
join isian_kuis ik on ik.jabatan_id = mj.id
group by mj.id,mj.nama_pd
order by mj.id;
The output of my query is still wrong.
The correct data is that ID 30 has two men and one woman, and ID 29 has only one woman.
There is no need for a nested select; just use GROUP BY like this:
SELECT distinct mj.id,
mj.nama_pd,
SUM(CASE WHEN jenis_kelamin = 'Laki' THEN 1 ELSE 0 end) AS jumlah_laki,
SUM(CASE WHEN jenis_kelamin = 'Perempuan' THEN 1 ELSE 0 end) AS jumlah_perempuan
FROM public.master_jabatan mj
join isian_kuis ik on ik.jabatan_id = mj.id
group by mj.id,mj.nama_pd
order by mj.id;
In Postgres, you can use conditional aggregation which looks like:
SELECT mj.id, mj.nama_pd,
COUNT(*) FILTER (WHERE jenis_kelamin = 'Laki') AS jumlah_laki,
COUNT(*) FILTER (WHERE jenis_kelamin = 'Perempuan') AS jumlah_perempuan
FROM public.master_jabatan mj JOIN
isian_kuis ik
ON ik.jabatan_id = mj.id
GROUP BY mj.id, mj.nama_pd
ORDER BY mj.id;
Note that it is very, very rare to use SELECT DISTINCT with GROUP BY. This might slow down the query. And FILTER is standard SQL so is a good way to implement this.
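To see why the original query was wrong, here is a small sketch using Python's sqlite3 module and sample rows matching the expected result from the question (the nama_pd values 'A' and 'B' are made up). The scalar subqueries in the original are not correlated with mj.id, so they return the same table-wide totals on every row, while conditional aggregation counts only the rows in each group. The portable SUM(CASE ...) form is used here rather than FILTER.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE master_jabatan (id INTEGER PRIMARY KEY, nama_pd TEXT);
    CREATE TABLE isian_kuis (jabatan_id INTEGER, jenis_kelamin TEXT, jenis_layanan TEXT);

    -- Sample rows: ID 30 has two men and one woman, ID 29 has one woman.
    INSERT INTO master_jabatan VALUES (29, 'A'), (30, 'B');
    INSERT INTO isian_kuis VALUES
        (30, 'Laki', 'x'), (30, 'Laki', 'x'), (30, 'Perempuan', 'x'),
        (29, 'Perempuan', 'x');
""")

# Uncorrelated scalar subqueries: every row gets the totals for the whole table.
wrong = conn.execute("""
    SELECT mj.id, mj.nama_pd,
           (SELECT COUNT(jenis_layanan) FROM isian_kuis
             WHERE jenis_kelamin = 'Laki') AS jumlah_laki,
           (SELECT COUNT(jenis_layanan) FROM isian_kuis
             WHERE jenis_kelamin = 'Perempuan') AS jumlah_perempuan
    FROM master_jabatan mj
    JOIN isian_kuis ik ON ik.jabatan_id = mj.id
    GROUP BY mj.id, mj.nama_pd
    ORDER BY mj.id
""").fetchall()

# Conditional aggregation: counts only the rows that belong to each group.
right = conn.execute("""
    SELECT mj.id, mj.nama_pd,
           SUM(CASE WHEN ik.jenis_kelamin = 'Laki' THEN 1 ELSE 0 END) AS jumlah_laki,
           SUM(CASE WHEN ik.jenis_kelamin = 'Perempuan' THEN 1 ELSE 0 END) AS jumlah_perempuan
    FROM master_jabatan mj
    JOIN isian_kuis ik ON ik.jabatan_id = mj.id
    GROUP BY mj.id, mj.nama_pd
    ORDER BY mj.id
""").fetchall()

print("uncorrelated subqueries:", wrong)
print("conditional aggregation:", right)
```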

Optimizing code with multiple conditions on multiple tables?

I want to check whether these customers have a LEAD action or a SELL action, both of which are stored in other tables. However, the query takes forever to finish.
create table ct_nguyendang.visitor
as
select user_id, updated_at::date,
case
when user_id in (select distinct d_visitor_id from xiti.lead_detail) then 'lead'
else 'None'
end as lead_action,
case
when user_id in (select distinct account_id from ct_nguyendang.daily_listor) then 'sell'
else 'None'
end as sell_action
I think you can use union all and aggregation:
select user_id, max(is_lead) as has_lead, max(is_sale) as has_sale
from ((select d_visitor_id as user_id, 1 as is_lead, 0 as is_sale
from xiti.lead_detail
) union all
(select account_id, 0, 1
from ct_nguyendang.daily_listor
)
) ls
group by user_id;
If you have a table of users, then you can use correlated subqueries:
select u.*,
(case when exists (select 1
from xiti.lead_detail l
where u.user_id = l.d_visitor_id
)
then 1 else 0
end) as has_lead,
(case when exists (select 1
from ct_nguyendang.daily_listor s
where u.user_id = s.account_id
)
then 1 else 0
end) as has_sale
from users u;
Note that I prefer using 1 for "true" and 0 for "false". Of course, you can use string values if you prefer.
To optimize this query, you want indexes on xiti.lead_detail(d_visitor_id) and ct_nguyendang.daily_listor(account_id).
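The EXISTS version together with the suggested indexes can be tried out with a small sketch like the one below, which uses Python's sqlite3 module. SQLite has no schemas, so the table names are shortened to users, lead_detail and daily_listor; the index names and sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY);
    CREATE TABLE lead_detail (d_visitor_id INTEGER);
    CREATE TABLE daily_listor (account_id INTEGER);

    -- Indexes on the lookup columns let each EXISTS probe be an index seek.
    CREATE INDEX idx_lead_visitor ON lead_detail (d_visitor_id);
    CREATE INDEX idx_listor_account ON daily_listor (account_id);

    INSERT INTO users VALUES (1), (2), (3), (4);
    INSERT INTO lead_detail VALUES (1), (1), (3);
    INSERT INTO daily_listor VALUES (3), (4);
""")

rows = conn.execute("""
    SELECT u.user_id,
           CASE WHEN EXISTS (SELECT 1 FROM lead_detail l
                              WHERE l.d_visitor_id = u.user_id)
                THEN 1 ELSE 0 END AS has_lead,
           CASE WHEN EXISTS (SELECT 1 FROM daily_listor s
                              WHERE s.account_id = u.user_id)
                THEN 1 ELSE 0 END AS has_sale
    FROM users u
    ORDER BY u.user_id
""").fetchall()

for row in rows:
    print(row)
```

Note that user 1 appears twice in lead_detail but still produces a single output row, because EXISTS only tests for the presence of a match.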

Sum a column and perform more calculations on the result? [duplicate]

In my query below I am counting occurrences in a table based on the Status column. I also want to perform calculations based on the counts I am returning. For example, let's say I want to add 100 to the Snoozed value... how do I do this? Below is what I thought would do it:
SELECT
pu.ID Id, pu.Name Name,
COUNT(*) LeadCount,
SUM(CASE WHEN Status = 'Working' THEN 1 ELSE 0 END) AS Working,
SUM(CASE WHEN Status = 'Uninterested' THEN 1 ELSE 0 END) AS Uninterested,
SUM(CASE WHEN Status = 'Converted' THEN 1 ELSE 0 END) AS Converted,
SUM(CASE WHEN SnoozedId > 0 THEN 1 ELSE 0 END) AS Snoozed,
Snoozed + 100 AS Test
FROM
Prospects p
INNER JOIN
ProspectsUsers pu on p.OwnerId = pu.SalesForceId
WHERE
p.Store = '108'
GROUP BY
pu.Name, pu.Id
ORDER BY
Name
I get this error:
Invalid column name 'Snoozed'.
How can I take the value of the previous SUM statement, add 100 to it, and return it as another column? What I was aiming for is an additional column labeled Test that has the Snooze count + 100.
You can't use one column to create another column in the way that you are attempting. You have 2 options:
1) Do the full calculation (as @forpas has mentioned in the comments above).
2) Use a temp table or table variable to store the data. This way you can get the first 5 columns and then add the last column, or you can select from the temp table and do the last column's calculation there.
You cannot use an alias as a column reference in the same SELECT list. The correct script is:
SELECT
pu.ID Id, pu.Name Name,
COUNT(*) LeadCount,
SUM(CASE WHEN Status = 'Working' THEN 1 ELSE 0 END) AS Working,
SUM(CASE WHEN Status = 'Uninterested' THEN 1 ELSE 0 END) AS Uninterested,
SUM(CASE WHEN Status = 'Converted' THEN 1 ELSE 0 END) AS Converted,
SUM(CASE WHEN SnoozedId > 0 THEN 1 ELSE 0 END)+100 AS Snoozed
FROM
Prospects p
INNER JOIN
ProspectsUsers pu on p.OwnerId = pu.SalesForceId
WHERE
p.Store = '108'
GROUP BY
pu.Name, pu.Id
ORDER BY
Name
MSSQL does not allow you to reference fields (or aliases) in the SELECT statement from within the same SELECT statement.
To work around this:
Use a CTE. Define the columns you want to select from in the CTE, and then select from them outside the CTE.
;WITH OurCte AS (
SELECT
5 + 5 - 3 AS OurInitialValue
)
SELECT
OurInitialValue / 2 AS OurFinalValue
FROM OurCte
Use a temp table. This is very similar in functionality to using a CTE, however, it does have different performance implications.
SELECT
5 + 5 - 3 AS OurInitialValue
INTO #OurTempTable
SELECT
OurInitialValue / 2 AS OurFinalValue
FROM #OurTempTable
Use a subquery. This tends to be more difficult to read than the above. I'm not certain what the advantage is to this - maybe someone in the comments can enlighten me.
SELECT
OurInitialValue / 2 AS OurFinalValue
FROM (
SELECT
5 + 5 - 3 AS OurInitialValue
) OurSubquery
Embed your calculations. (Opinion warning:) This is really sloppy and not a great approach, as you end up having to duplicate code and can easily throw columns out of sync if you update the calculation in one location and not the other.
SELECT
5 + 5 - 3 AS OurInitialValue
, (5 + 5 - 3) / 2 AS OurFinalValue
You can't use a column alias in the same SELECT list. Column aliases have no precedence or sequence among themselves; they are all assigned together when the SELECT list is evaluated, which happens after GROUP BY and before ORDER BY.
You must repeat the expression:
SELECT
pu.ID Id,pu.Name Name,
COUNT(*) LeadCount,
SUM(CASE WHEN Status = 'Working' THEN 1 ELSE 0 END) AS Working,
SUM(CASE WHEN Status = 'Uninterested' THEN 1 ELSE 0 END) AS Uninterested,
SUM(CASE WHEN Status = 'Converted' THEN 1 ELSE 0 END) AS Converted,
SUM(CASE WHEN SnoozedId > 0 THEN 1 ELSE 0 END) AS Snoozed,
SUM(CASE WHEN SnoozedId > 0 THEN 1 ELSE 0 END)+ 100 AS Test
FROM
Prospects p
INNER JOIN
ProspectsUsers pu on p.OwnerId = pu.SalesForceId
WHERE
p.Store = '108'
GROUP BY
pu.Name, pu.Id
ORDER BY
Name
If you don't want to repeat the code, use a subquery:
SELECT
ID, Name, LeadCount, Working, Uninterested,Converted, Snoozed, Snoozed +100 AS test
FROM
(SELECT
pu.ID Id,pu.Name Name,
COUNT(*) LeadCount,
SUM(CASE WHEN Status = 'Working' THEN 1 ELSE 0 END) AS Working,
SUM(CASE WHEN Status = 'Uninterested' THEN 1 ELSE 0 END) AS Uninterested,
SUM(CASE WHEN Status = 'Converted' THEN 1 ELSE 0 END) AS Converted,
SUM(CASE WHEN SnoozedId > 0 THEN 1 ELSE 0 END) AS Snoozed
FROM Prospects p
INNER JOIN ProspectsUsers pu on p.OwnerId = pu.SalesForceId
WHERE p.Store = '108'
GROUP BY pu.Name, pu.Id) t
ORDER BY Name
or a view
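The derived-table idea from the answers above can be checked with a short sketch in Python's sqlite3 module, written here as a CTE. The table and column names follow the question; the sample rows (and the names 'Ann' and 'Bob') are invented. The inner query defines the Snoozed alias, and only the outer query, which sees Snoozed as an ordinary column, adds 100 to it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ProspectsUsers (ID INTEGER, Name TEXT, SalesForceId TEXT);
    CREATE TABLE Prospects (OwnerId TEXT, Status TEXT, SnoozedId INTEGER, Store TEXT);

    INSERT INTO ProspectsUsers VALUES (1, 'Ann', 'sf1'), (2, 'Bob', 'sf2');
    INSERT INTO Prospects VALUES
        ('sf1', 'Working', 0, '108'),
        ('sf1', 'Converted', 5, '108'),
        ('sf1', 'Working', 7, '108'),
        ('sf2', 'Uninterested', 0, '108'),
        ('sf2', 'Working', 3, '999');  -- different store, filtered out
""")

rows = conn.execute("""
    WITH counts AS (
        SELECT pu.ID AS Id, pu.Name AS Name,
               COUNT(*) AS LeadCount,
               SUM(CASE WHEN p.Status = 'Working' THEN 1 ELSE 0 END) AS Working,
               SUM(CASE WHEN p.SnoozedId > 0 THEN 1 ELSE 0 END) AS Snoozed
        FROM Prospects p
        INNER JOIN ProspectsUsers pu ON p.OwnerId = pu.SalesForceId
        WHERE p.Store = '108'
        GROUP BY pu.Name, pu.ID
    )
    -- Snoozed is a real column of the CTE here, so it can be reused freely.
    SELECT Id, Name, LeadCount, Working, Snoozed, Snoozed + 100 AS Test
    FROM counts
    ORDER BY Name
""").fetchall()

for row in rows:
    print(row)
```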

How to add where condition if result count is greater than one

I want to build a SQL query that returns a unique id.
My problem is that I need to add another condition to the query if I get more than one result.
select u.id
from users u
where u.id in ('1','2','3')
and u.active = 'Y'
If I get more than one result, I need to add:
and u.active_contact = 'Y'
I tried to build this query:
select * from (
select u.id, count(u.id) as results
from users u
where u.id in ('1','2','3')
and u.active = 'Y'
group by u.id
) tab
If(tab.results > 1) then
where tab.u.active_contact = 'Y'
end
Thanks in advance.
I hope I explained myself well enough.
Here's a different approach:
SELECT id
FROM (SELECT id, (CASE WHEN active ='Y' THEN 1 ELSE 0 END) + (CASE WHEN active_contact ='Y' THEN 1 ELSE 0 END) as actv FROM users ORDER BY actv DESC)
WHERE actv > 0
LIMIT 1
The subquery adds a column, actv, that scores each row: 1 point for active = 'Y' plus 1 point for active_contact = 'Y'. The main SELECT then takes the row with the highest score, requiring at least one of the two flags to be set. I believe this provides the intended result.
Among the possible ways to solve this, here are two.
1) Use the active_contact id. If there is none use another id.
select coalesce( max(case when active_contact = 'Y' then id end), max(id) ) as id
from users
where id in ('1','2','3')
and active = 'Y';
2) Sort with active_contact coming first. Then get the first record.
select id
from
(
select id
from users
where id in ('1','2','3')
and active = 'Y'
order by case when active_contact = 'Y' then 1 else 2 end
) where rownum = 1;
A method using analytic functions:
SELECT id
FROM (SELECT u.id
, u.active_contact
, count(*) OVER () actives
FROM users u
WHERE u.id IN ('1','2','3')
AND u.active = 'Y')
WHERE ( actives = 1
OR ( actives > 1
AND active_contact = 'Y'))
If there is more than one record where active = 'Y' AND active_contact = 'Y' it will return them all. If only one of these is required you will need to identify the criteria for choosing that one.
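The behaviour described above can be explored with a small sketch using Python's sqlite3 module. It assumes an SQLite build with window-function support (3.25 or later); the pick_ids helper and the sample rows are hypothetical and exist only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id TEXT PRIMARY KEY, active TEXT, active_contact TEXT);
    INSERT INTO users VALUES
        ('1', 'Y', 'N'), ('2', 'Y', 'Y'), ('3', 'N', 'Y'),
        ('4', 'Y', 'N'), ('5', 'N', 'N');
""")

def pick_ids(conn, ids):
    """Return active ids; if several are active, keep only active contacts."""
    placeholders = ",".join("?" * len(ids))  # e.g. "?,?,?"
    sql = f"""
        SELECT id
        FROM (SELECT u.id, u.active_contact,
                     COUNT(*) OVER () AS actives  -- rows surviving the WHERE
              FROM users u
              WHERE u.id IN ({placeholders})
                AND u.active = 'Y')
        WHERE actives = 1
           OR (actives > 1 AND active_contact = 'Y')
        ORDER BY id
    """
    return [row[0] for row in conn.execute(sql, ids)]

print(pick_ids(conn, ["1", "2", "3"]))  # two active rows, one is an active contact
print(pick_ids(conn, ["1", "3"]))       # exactly one active row
print(pick_ids(conn, ["1", "4"]))       # two active rows, neither is an active contact
```

The last call illustrates an edge case worth deciding on: when several rows are active but none is an active contact, the query returns nothing.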

Subselect Query Improvement

How can I improve the SQL query below (SQL Server 2008)? I want to try to avoid sub-selects, and I'm using a couple of them to produce results like this
StateId  TotalCount  SFRCount  OtherCount
-----------------------------------------
AZ       102         50        52
CA       2931        2750      181
etc...
SELECT
StateId,
COUNT(*) AS TotalCount,
(SELECT COUNT(*) AS Expr1 FROM Property AS P2
WHERE (PropertyTypeId = 1) AND (StateId = P.StateId)) AS SFRCount,
(SELECT COUNT(*) AS Expr1 FROM Property AS P3
WHERE (PropertyTypeId <> 1) AND (StateId = P.StateId)) AS OtherCount
FROM Property AS P
GROUP BY StateId
HAVING (COUNT(*) > 99)
ORDER BY StateId
This should return the same results, though it is hard to test without data:
SELECT
StateId,
COUNT(*) AS TotalCount,
SUM(CASE WHEN PropertyTypeId = 1 THEN 1 ELSE 0 END) as SFRCount,
SUM(CASE WHEN PropertyTypeId <> 1 THEN 1 ELSE 0 END) as OtherCount
FROM Property AS P
GROUP BY StateId
HAVING (COUNT(*) > 99)
ORDER BY StateId
Your alternative is a single self-join of Property using your WHERE conditions as join conditions. The OtherCount can then be derived by subtracting SFRCount from TotalCount in a derived query.
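One caveat when deriving OtherCount by subtraction: it only matches the PropertyTypeId <> 1 count when PropertyTypeId is never NULL, because NULL <> 1 is not true and such rows fall through to the ELSE branch. The sketch below, using Python's sqlite3 module and invented sample rows (with the HAVING filter left out), puts both versions side by side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Property (StateId TEXT, PropertyTypeId INTEGER);
    INSERT INTO Property VALUES
        ('AZ', 1), ('AZ', 1), ('AZ', 2), ('AZ', NULL),
        ('CA', 1), ('CA', 3);
""")

rows = conn.execute("""
    SELECT StateId,
           COUNT(*) AS TotalCount,
           SUM(CASE WHEN PropertyTypeId = 1 THEN 1 ELSE 0 END) AS SFRCount,
           -- A NULL PropertyTypeId is neither = 1 nor <> 1, so it counts in neither column.
           SUM(CASE WHEN PropertyTypeId <> 1 THEN 1 ELSE 0 END) AS OtherCount,
           -- Subtraction treats every non-SFR row, including NULLs, as "other".
           COUNT(*) - SUM(CASE WHEN PropertyTypeId = 1 THEN 1 ELSE 0 END) AS OtherBySubtraction
    FROM Property
    GROUP BY StateId
    ORDER BY StateId
""").fetchall()

for row in rows:
    print(row)
```

For AZ the two "other" columns differ by the one row whose PropertyTypeId is NULL; for CA, which has no NULLs, they agree.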
Another alternative would be to use the PIVOT function like this:
SELECT StateID, [1] + [2] AS TotalCount, [1] AS SFRCount, [2] AS OtherCount
FROM Property
PIVOT ( COUNT(PropertyTypeID)
FOR PropertyTypeID IN ([1],[2])
) AS pvt
WHERE [1] + [2] > 99
You would need to add an entry for each property type which could be daunting but it is another alternative. Scott has a great answer.
If PropertyTypeId is not null then you could do this with a single join. Count is faster than Sum, but is Count plus Join faster than Sum? The test case below mimics your data: docSVsys has 800,000 rows and about 300 unique values for caseID. In this test case, Count plus Join is slightly faster than Sum, but if I remove the with (nolock) hints then Sum is about 1/4 faster. You would need to test with your data.
select GETDATE()
go
select caseID, COUNT(*) as Ttl,
SUM(CASE WHEN mimeType = 'message/rfc822' THEN 1 ELSE 0 END) as SFRCount,
SUM(CASE WHEN mimeType <> 'message/rfc822' THEN 1 ELSE 0 END) as OtherCount,
COUNT(*) - SUM(CASE WHEN mimeType = 'message/rfc822' THEN 1 ELSE 0 END) as OtherCount2
from docSVsys with (nolock)
group by caseID
having COUNT(*) > 1000
select GETDATE()
go
select docSVsys.caseID, COUNT(*) as Ttl
, COUNT(primaryCount.sID) as priCount
, COUNT(*) - COUNT(primaryCount.sID) as otherCount
from docSVsys with (nolock)
left outer join docSVsys as primaryCount with (nolock)
on primaryCount.sID = docSVsys.sID
and primaryCount.mimeType = 'message/rfc822'
group by docSVsys.caseID
having COUNT(*) > 1000
select GETDATE()
go