Having trouble with the subquery in this code - sql

I'm trying to run this code for an assignment for a class I've got. The "x" at the end of my subquery keeps on giving me errors and I can't wrap my head around why this is.
The goal of this assignment is to count (by age group) the number of reports that Carditis was a symptom after receiving a COVID shot.
Thanks in advance
Select agegroup, sum(case when died= 'Y' then 1 else 0 end) as Deaths
From (Select *,
Case
when age<=2 then 'infant'
when age<18 then 'juvenile'
when age<35 then 'adult'
when age<65 then 'old adult'
when age>=65 then 'senior'
else 'unknown' end as agegroup
from dbo.symptoms as s
join dbo.vaersvax as v on s.vaers_id=v.vaers_id
join dbo.patient as p on s.vaers_id=p.vaers_id
where v.vax_type='COVID19' and OneVax='Y' and symptom='Carditis'
) as x
Group By agegroup
Order By avg(age)

As @Schmocken already said, you can't SELECT from a subquery that returns more than one column with the same name. Judging from your outer query, this should do the job for you:
Select agegroup, sum(case when died= 'Y' then 1 else 0 end) as Deaths
From (Select died, age,
Case
when age<=2 then 'infant'
when age<18 then 'juvenile'
when age<35 then 'adult'
when age<65 then 'old adult'
when age>=65 then 'senior'
else 'unknown' end as agegroup
from dbo.symptoms as s
join dbo.vaersvax as v on s.vaers_id=v.vaers_id
join dbo.patient as p on s.vaers_id=p.vaers_id
where v.vax_type='COVID19' and OneVax='Y' and symptom='Carditis'
) as x
Group By agegroup
Order By avg(age)

By using Select * you have specified the same column name to be returned more than once.
As an example, you are returning both s.vaers_id and v.vaers_id, which are the same. This is not allowed; a subquery must return a unique set of column names.
You could return s.* successfully, but not all columns from all tables.
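One way to see which names collide is to list the column names that the joined `SELECT *` produces. The following is only a minimal sketch in Python with the standard sqlite3 module, not the asker's SQL Server setup: the three tables are cut down to a few assumed columns and the rows are made up. SQLite tolerates the duplicate names, so the sketch reports them rather than reproducing the error.

```python
import sqlite3

# In-memory stand-ins for the three tables; the columns and rows are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE symptoms (vaers_id INTEGER, symptom TEXT);
CREATE TABLE vaersvax (vaers_id INTEGER, vax_type TEXT);
CREATE TABLE patient  (vaers_id INTEGER, age REAL, died TEXT);
INSERT INTO symptoms VALUES (1, 'Carditis'), (2, 'Carditis'), (3, 'Carditis');
INSERT INTO vaersvax VALUES (1, 'COVID19'), (2, 'COVID19'), (3, 'COVID19');
INSERT INTO patient  VALUES (1, 16, 'Y'), (2, 40, NULL), (3, 70, 'Y');
""")

join_sql = """
FROM symptoms AS s
JOIN vaersvax AS v ON s.vaers_id = v.vaers_id
JOIN patient  AS p ON s.vaers_id = p.vaers_id
"""

# Column names produced by SELECT * over the join.
star_columns = [d[0] for d in conn.execute("SELECT * " + join_sql).description]
duplicated = sorted({c for c in star_columns if star_columns.count(c) > 1})

# Naming the columns explicitly yields a unique set of names.
explicit_sql = "SELECT s.vaers_id, s.symptom, v.vax_type, p.age, p.died " + join_sql
explicit_columns = [d[0] for d in conn.execute(explicit_sql).description]

print("duplicated names:", duplicated)
print("explicit names:", explicit_columns)
```

Any name that shows up in `duplicated` is one the derived table cannot expose twice, which is why listing only the needed columns resolves the error.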

Related

How to check unique values in SQL

I have a table named Bank that contains a Bank_Value column. I need a calculated Bank_Value_Unique column that shows whether each Bank_Value exists somewhere else in the table (i.e. whether its count is greater than 1).
I prepared this query, but it does not work. Could anyone help me with this and/or modify this query?
SELECT
CASE
WHEN NULLIF(LTRIM(RTRIM(Bank_Value)), '') =
(SELECT Bank_Value
FROM [Bank]
GROUP BY Bank_Value
HAVING COUNT(*) = 1)
THEN '0' ELSE '1'
END AS Bank_Key_Unique
FROM [Bank]
A windowed count should work:
SELECT
*,
CASE
COUNT(*) OVER (PARTITION BY Bank_Value)
WHEN 1 THEN 1 ELSE 0
END AS Bank_Value_Unique
FROM
Bank
;
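To make the behaviour of the windowed count concrete, here is a small sketch that runs the same expression through Python's sqlite3 module. The sample values are made up, and it assumes the bundled SQLite is 3.25 or newer, since older versions lack window functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Bank (Bank_Value TEXT);
INSERT INTO Bank VALUES ('A'), ('B'), ('A'), ('C');
""")

# COUNT(*) OVER (PARTITION BY ...) attaches the per-value row count to every
# row, so each row can be flagged without collapsing the table with GROUP BY.
rows = conn.execute("""
    SELECT Bank_Value,
           CASE COUNT(*) OVER (PARTITION BY Bank_Value)
               WHEN 1 THEN 1 ELSE 0
           END AS Bank_Value_Unique
    FROM Bank
    ORDER BY Bank_Value
""").fetchall()

for row in rows:
    print(row)
```

Both 'A' rows are flagged 0 because the value occurs twice, while 'B' and 'C' are flagged 1.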
That works too, but I also found a solution:
select CASE WHEN NULLIF(LTRIM(RTRIM(Bank_Value)),'') =
(select Bank_Value
from Bank
group by Bank_Value
having (count(distinct Bank_Value) > 2 )) THEN '1' ELSE '0' END AS
Bank_Value_Uniquness
from Bank
The "distinct" was missing in the HAVING part.

SQL aggregate function alias

I'm a beginner at SQL and this is the question I have been asked to solve:
Say that a big city is defined as a place of type city with a population of at least 100,000. Write an SQL query that returns the scheme (state_name, no_big_city, big_city_population) ordered by state_name, listing those states which have either (a) at least five big cities or (b) at least one million people living in big cities. The column state_name is the name of the state, no_big_city is the number of big cities in the state, and big_city_population is the number of people living in big cities in the state.
Now, as far as I can see, the following query returns correct results:
SELECT state.name AS state_name
, COUNT(CASE WHEN place.type = 'city' AND place.population >= 100000 THEN 1 ELSE NULL END) AS no_big_city
, SUM(CASE WHEN place.type = 'city' AND place.population >= 100000 THEN place.population ELSE NULL END) AS big_city_population
FROM state
JOIN place
ON state.code = place.state_code
GROUP BY state_name
HAVING
COUNT(CASE WHEN place.type = 'city' AND place.population >= 100000 THEN 1 ELSE NULL END) >= 5 OR
SUM(CASE WHEN place.type = 'city' AND place.population >= 100000 THEN place.population ELSE NULL END) >= 1000000
ORDER BY state_name;
However, the two aggregate functions used in the code appear twice. My question: is there any way to remove this code duplication while preserving functionality?
To be clear, I have already tried using the alias, but I just get a "column does not exist" error.
The manual clarifies:
An output column's name can be used to refer to the column's value in
ORDER BY and GROUP BY clauses, but not in the WHERE or HAVING clauses;
there you must write out the expression instead.
Bold emphasis mine.
You can avoid typing long expressions repeatedly with a subquery or CTE:
SELECT state_name, no_big_city, big_city_population
FROM (
SELECT s.name AS state_name
, COUNT(*) FILTER (WHERE p.type = 'city' AND p.population >= 100000) AS no_big_city
, SUM(population) FILTER (WHERE p.type = 'city' AND p.population >= 100000) AS big_city_population
FROM state s
JOIN place p ON s.code = p.state_code
GROUP BY s.name -- can be input column name as well, best schema-qualified to avoid ambiguity
) sub
WHERE no_big_city >= 5
OR big_city_population >= 1000000
ORDER BY state_name;
While at it, I simplified with the aggregate FILTER clause (Postgres 9.4+). See:
How can I simplify this game statistics query?
However, I suggest this simpler and faster query to begin with:
SELECT s.name AS state_name, p.no_big_city, p.big_city_population
FROM state s
JOIN (
SELECT state_code AS code -- alias just to simplify join
, count(*) AS no_big_city
, sum(population) AS big_city_population
FROM place
WHERE type = 'city'
AND population >= 100000
GROUP BY 1 -- can be ordinal number referencing position in SELECT list
HAVING count(*) >= 5 OR sum(population) >= 1000000 -- simple expressions now
) p USING (code)
ORDER BY 1; -- can also be ordinal number
I am demonstrating another option to reference expressions in GROUP BY and ORDER BY. Only use that if it doesn't impair readability and maintainability.
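As a sanity check, the derived-table form and the filter-first form can be run side by side on a tiny data set. This is a sketch only: it uses Python's sqlite3 module rather than Postgres, the rows are invented, and the first query uses CASE instead of FILTER so it also runs on older SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE state (code TEXT, name TEXT);
CREATE TABLE place (state_code TEXT, name TEXT, type TEXT, population INTEGER);
INSERT INTO state VALUES ('AA', 'Alpha'), ('BB', 'Beta'), ('CC', 'Gamma');
INSERT INTO place VALUES
  ('AA', 'a1', 'city', 1200000), ('AA', 'a2', 'town', 500000),
  ('BB', 'b1', 'city', 150000),  ('BB', 'b2', 'city', 150000),
  ('BB', 'b3', 'city', 150000),  ('BB', 'b4', 'city', 150000),
  ('BB', 'b5', 'city', 150000),  ('BB', 'b6', 'city', 50000),
  ('CC', 'c1', 'city', 200000),  ('CC', 'c2', 'city', 200000);
""")

# Variant 1: compute the aggregates once in a derived table, filter outside.
derived = conn.execute("""
    SELECT state_name, no_big_city, big_city_population
    FROM (
        SELECT s.name AS state_name,
               COUNT(CASE WHEN p.type = 'city' AND p.population >= 100000
                          THEN 1 END) AS no_big_city,
               SUM(CASE WHEN p.type = 'city' AND p.population >= 100000
                        THEN p.population END) AS big_city_population
        FROM state s
        JOIN place p ON s.code = p.state_code
        GROUP BY s.name
    ) sub
    WHERE no_big_city >= 5 OR big_city_population >= 1000000
    ORDER BY state_name
""").fetchall()

# Variant 2: filter big cities first, aggregate, then join to state.
filter_first = conn.execute("""
    SELECT s.name AS state_name, p.no_big_city, p.big_city_population
    FROM state s
    JOIN (
        SELECT state_code AS code,
               COUNT(*) AS no_big_city,
               SUM(population) AS big_city_population
        FROM place
        WHERE type = 'city' AND population >= 100000
        GROUP BY state_code
        HAVING COUNT(*) >= 5 OR SUM(population) >= 1000000
    ) p USING (code)
    ORDER BY 1
""").fetchall()

print(derived)
print(filter_first)
```

With this data both variants keep Alpha (one city above a million) and Beta (five big cities) and drop Gamma, so each expression is written only once in either form.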
Not sure if this is a comment or an answer, since it is more preference-based than technical, but I'll post it anyway.
What I usually do when I need to reference calculated columns (usually a LOT at the same time) is put the calculated columns in a derived table and then reference them by alias outside of the derived table. This syntax should be ANSI SQL, but I am not familiar with Postgres.
select * from (
SELECT STATE.NAME AS state_name
,COUNT(CASE WHEN place.type = 'city'
AND place.population >= 100000 THEN 1 ELSE NULL END) AS no_big_city
,SUM(CASE WHEN place.type = 'city'
AND place.population >= 100000 THEN place.population ELSE NULL END) AS big_city_population
FROM STATE
INNER JOIN place
ON STATE.code = place.state_code
GROUP BY state_name
) sub
where no_big_city >= 5
or big_city_population >= 1000000
--HAVING COUNT(CASE WHEN place.type = 'city'
-- AND place.population >= 100000 THEN 1 ELSE NULL END) >= 5
-- OR SUM(CASE WHEN place.type = 'city'
-- AND place.population >= 100000 THEN place.population ELSE NULL END) >= 1000000
ORDER BY state_name;
The nice thing about this approach is that, although you are adding complication via a subquery/derived table, the formula is kept in one place, so any changes only have to happen once. I do not know if this will perform worse than simply repeating the calculation in the HAVING clause, but I can't imagine it would be that much worse.
The SELECT clause lists what you want to select from the table(s) after they are filtered by the WHERE clause.
GROUP BY defines how the filtered records are grouped for the aggregate functions in the SELECT, so the alias cannot be used there.
But you can wrap your filtered records in a subquery and select from that. Something like this:
SELECT state_name, no_big_city, big_city_population
FROM
(
SELECT
state.name AS state_name,
COUNT(1) no_big_city,
MAX(place.population) max_city_population,
SUM(place.population) AS big_city_population
FROM state JOIN place ON state.code = place.state_code
WHERE
place.type = 'city' AND
place.population >= 100000
GROUP BY state.name
) sub
WHERE
no_big_city >= 5 OR
big_city_population >= 1000000
ORDER BY state_name
Also, moving the conditions
place.type = 'city' AND
place.population >= 100000
out of the CASE and into the WHERE clause will perform better: "no city" and "small city" records will not be processed at all, especially if there is an index on the place.type column.

Renaming result categories in SQL

I have the following query which outputs the number and percentage of members whose Salutation is Mr / Ms.
I want to rename the results to say 'Male' instead of 'Mr' and 'Female' instead of 'Ms'.
It's probably a fairly simple CASE thing, but can't get it to work...
SELECT AspNetUsers.Salutation AS Sex, COUNT(AspNetUsers.Salutation) as Total,
CAST(ROUND((COUNT(AspNetUsers.Salutation)* 100.0 / (SELECT COUNT(*) FROM Member, AspNetUsers WHERE Member.AspNetUserId = AspNetUsers.Id)),1) AS NUMERIC(36,1)) AS Percentage
FROM Member, AspNetUsers
WHERE Member.AspNetUserId=AspNetUsers.Id
GROUP BY Salutation
You're absolutely correct that it's a simple case expression that is needed, but you also need to group by the same case expression.
SELECT
CASE
WHEN AspNetUsers.Salutation = 'Mr' THEN 'Male'
WHEN AspNetUsers.Salutation = 'Ms' THEN 'Female'
ELSE 'Other' -- this is of course optional
END AS Sex,
COUNT(AspNetUsers.Salutation) as Total,
CAST(ROUND((COUNT(AspNetUsers.Salutation)* 100.0 / (SELECT COUNT(*) FROM Member JOIN AspNetUsers ON Member.AspNetUserId = AspNetUsers.Id)),1) AS NUMERIC(36,1)) AS Percentage
FROM Member
JOIN AspNetUsers ON Member.AspNetUserId = AspNetUsers.Id
GROUP BY
CASE
WHEN AspNetUsers.Salutation = 'Mr' THEN 'Male'
WHEN AspNetUsers.Salutation = 'Ms' THEN 'Female'
ELSE 'Other' -- this is of course optional
END;
The query could probably be improved by using a common table expression to not have to repeat the case expression, and a windowed count instead of a subquery, but I'll leave that to you.
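For what it's worth, the improvement hinted at above could look roughly like the following sketch. It runs on SQLite through Python's sqlite3 module (3.25 or newer for the window function) rather than on the asker's database, the two tables are reduced to assumed columns with made-up rows, and it counts rows with COUNT(*) instead of COUNT(Salutation), which only differs when Salutation is NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE AspNetUsers (Id INTEGER, Salutation TEXT);
CREATE TABLE Member (AspNetUserId INTEGER);
INSERT INTO AspNetUsers VALUES (1, 'Mr'), (2, 'Mr'), (3, 'Mr'), (4, 'Ms');
INSERT INTO Member VALUES (1), (2), (3), (4);
""")

rows = conn.execute("""
    WITH labelled AS (          -- the CASE expression is written once here
        SELECT CASE u.Salutation
                   WHEN 'Mr' THEN 'Male'
                   WHEN 'Ms' THEN 'Female'
                   ELSE 'Other'
               END AS Sex
        FROM Member m
        JOIN AspNetUsers u ON m.AspNetUserId = u.Id
    ),
    counted AS (                -- one row per category
        SELECT Sex, COUNT(*) AS Total
        FROM labelled
        GROUP BY Sex
    )
    SELECT Sex,
           Total,
           -- SUM(Total) OVER () is the grand total, replacing the subquery
           ROUND(Total * 100.0 / SUM(Total) OVER (), 1) AS Percentage
    FROM counted
    ORDER BY Sex
""").fetchall()

for row in rows:
    print(row)
```

The CASE expression now appears once, and the grand total comes from the window function instead of a second join.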

calculating completed task ratio

I have a table with NAME and STATUS columns, where the status is either C (completed) or N (not completed). I want to check how many tasks are not completed for each name. I tried the following code, and it returns '0' for every row:
select name, (select count(status) from alteon where status= 'n') / (select count(status) from alteon) from alteon group by name;
I'm expecting the result to be not completed / total assigned, where total assigned = completed + not completed.
As mentioned earlier, I'm getting '0' beside each employee name.
I think the following query does what you want:
select name,
sum(case when status = 'n' then 1 else 0 end) as n_status,
avg(case when status = 'n' then 1.0 else 0 end) as n_status_ratio
from alteon
group by name;
Here is a query that gives the result you described above.
select count(status) as Total_assigned,
sum(IF(status='n', 1, 0)) as Not_completed, name
from alteon group by name;
Here is the sqlfiddle
You don't have to use multiple select statements. Use CASE to count the incomplete tasks.
select name, count(case when status = 'n' then 1 else null end)/count(status)
from alteon
group by name;
sqlfiddle.
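One thing to watch with count(...)/count(...) is integer division. Some engines truncate the quotient of two integers (SQLite, used in this sketch, is one of them; MySQL's `/` operator is not), which is one possible explanation for a column full of zeros. The rows below are made up, and the sketch simply contrasts the integer quotient with the avg(case ... 1.0 ...) form from the earlier answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE alteon (name TEXT, status TEXT);
INSERT INTO alteon VALUES
  ('ann', 'n'), ('ann', 'c'), ('ann', 'c'), ('ann', 'n'),
  ('bob', 'n'),
  ('carl', 'c'), ('carl', 'c');
""")

# Integer operands: SQLite truncates, so 2 / 4 becomes 0.
int_ratio = conn.execute("""
    SELECT name,
           COUNT(CASE WHEN status = 'n' THEN 1 END) / COUNT(status)
    FROM alteon
    GROUP BY name
    ORDER BY name
""").fetchall()

# Averaging 1.0 / 0 forces floating-point arithmetic and gives the ratio.
float_ratio = conn.execute("""
    SELECT name,
           AVG(CASE WHEN status = 'n' THEN 1.0 ELSE 0 END)
    FROM alteon
    GROUP BY name
    ORDER BY name
""").fetchall()

print(int_ratio)
print(float_ratio)
```

Multiplying one operand by 1.0, or using avg as above, keeps the fraction on engines that truncate. Note also that the subqueries in the original attempt are not correlated with name, so they would return the table-wide ratio on every row even without truncation.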

SQL Nested Select statements with COUNT()

I'll try to describe as best I can, but it's hard for me to wrap my whole head around this problem let alone describe it....
I am trying to select multiple results in one query to display the current status of a database. I have the first column as one type of record, and the second column as a sub-category of the first column. The subcategory is then linked to more records underneath that, distinguished by status, forming several more columns. I need to display every main-category/subcategory combination, and then the count of how many of each sub-status there are beneath that subcategory in the subsequent columns. I've got it so that I can display the unique combinations, but I'm not sure how to nest the select statements so that I can select the count of a completely different table from the main query. My problem lies in that to display the main category and sub category, I can pull from one table, but I need to count from a different table. Any ideas on the matter would be greatly appreciated
Here's what I have. The count statements would be replaced with the count of each status:
SELECT wave_num "WAVE NUMBER",
int_tasktype "INT / TaskType",
COUNT (1) total,
COUNT (1) "LOCKED/DISABLED",
COUNT (1) released,
COUNT (1) "PARTIALLY ASSEMBLED",
COUNT (1) assembled
FROM (SELECT DISTINCT
(t.invn_need_type || ' / ' || s.code_desc) int_tasktype,
t.task_genrtn_ref_nbr wave_num
FROM sys_code s, task_hdr t
WHERE t.task_genrtn_ref_nbr IN
(SELECT ship_wave_nbr
FROM ship_wave_parm
WHERE TRUNC (create_date_time) LIKE SYSDATE - 7)
AND s.code_type = '590'
AND s.rec_type = 'S'
AND s.code_id = t.task_type),
ship_wave_parm swp
GROUP BY wave_num, int_tasktype
ORDER BY wave_num
Image here: http://i.imgur.com/JX334.png
I'm guessing a bit, both regarding your problem and Oracle (which I have, unfortunately, never used), but hopefully this will give you some ideas. Sorry for completely changing the way you write SQL; SELECT ... FROM (SELECT ... WHERE ... IN (SELECT ...)) simply confuses me, so I had to restructure:
with tmp(int_tasktype, wave_num) as
(select distinct (t.invn_need_type || ' / ' || s.code_desc), t.task_genrtn_ref_nbr
from sys_code s
join task_hdr t
on s.code_id = t.task_type
where s.code_type = '590'
and s.rec_type = 'S'
and exists(select 1 from ship_wave_parm p
where t.task_genrtn_ref_nbr = p.ship_wave_nbr
and trunc(p.create_date_time) = trunc(sysdate) - 7))
select t.wave_num "WAVE NUMBER", t.int_tasktype "INT / TaskType",
count(*) TOTAL,
sum(case when sst.sub_status = 'LOCKED' then 1 end) "LOCKED/DISABLED",
sum(case when sst.sub_status = 'RELEASED' then 1 end) RELEASED,
sum(case when sst.sub_status = 'PARTIAL' then 1 end) "PARTIALLY ASSEMBLED",
sum(case when sst.sub_status = 'ASSEMBLED' then 1 end) ASSEMBLED
from tmp t
join sub_status_table sst
on t.wave_num = sst.wave_num
group by t.wave_num, t.int_tasktype
order by t.wave_num
As you can see, I don't know anything about the table with the substatuses, so sub_status_table and its columns are placeholders.
You can use inner joins, grouping and count to get your result.
Suppose the tables are as follows:
cat (1)--->(n) subcat (1)----->(n) subcat_detail.
Then the query would be:
select cat.title cat_title, subcat.title subcat_title, count(*) as cnt
from cat
inner join subcat on cat.id = subcat.cat_id
inner join subcat_detail on subcat.id = subcat_detail.subcat_id
group by cat.title, subcat.title
Generally, when you need different counts, you need to use a CASE expression inside an aggregate.
select count(*) as total
, sum(case when field1 = 'test' then 1 else 0 end) as testcount
, sum(case when field2 = 'yes' then 1 else 0 end) as field2count
FROM table1
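Applied to the status columns from the question, conditional counting could be sketched as follows. The task_status table, its columns and its rows are hypothetical stand-ins (the real status table is never shown in the thread), and the query runs on SQLite through Python's sqlite3 module rather than on Oracle.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical table: one row per task with its wave number and status.
CREATE TABLE task_status (wave_num INTEGER, sub_status TEXT);
INSERT INTO task_status VALUES
  (1, 'LOCKED'), (1, 'RELEASED'), (1, 'RELEASED'),
  (2, 'ASSEMBLED');
""")

# Each SUM(CASE ...) adds 1 only for rows with the matching status, so one
# pass over the table fills every status column.
rows = conn.execute("""
    SELECT wave_num,
           COUNT(*) AS total,
           SUM(CASE WHEN sub_status = 'LOCKED'    THEN 1 ELSE 0 END) AS locked,
           SUM(CASE WHEN sub_status = 'RELEASED'  THEN 1 ELSE 0 END) AS released,
           SUM(CASE WHEN sub_status = 'ASSEMBLED' THEN 1 ELSE 0 END) AS assembled
    FROM task_status
    GROUP BY wave_num
    ORDER BY wave_num
""").fetchall()

for row in rows:
    print(row)
```

Using ELSE 0 keeps empty categories at 0 instead of NULL, which is usually easier to display in a status report.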