Combine GROUP BY and LIKE SQL - sql

My objective is to display the states that have a count of 20 or more in the second column.
Currently I am able to display the states and their counts, but I need to combine states that differ only in letter case and add up their counts (e.g. VIC, Vic and vic should become VIC 68).
I also want to display only the states, not their counts, but the counts keep showing. I'm guessing it involves LIKE combined with GROUP BY, but I can't figure out how.
My current SQL query:
SELECT DEPARTMENT.STATE, COUNT(ACADEMIC.DEPTNUM) FROM ACADEMIC
JOIN DEPARTMENT
ON DEPARTMENT.DEPTNUM=ACADEMIC.DEPTNUM
GROUP BY DEPARTMENT.STATE;
Output:
STATE COUNT(ACADEMIC.DEPTNUM)
----- -----------------------
NSW 82
7
QLD 21
VIC 14
vic 1
WA 42
Tas 1
SA 40
Qld 55
Vic 53
ACT 35
TAS 8
I have no idea how to do this, can anyone help?

SELECT DEPARTMENT.STATE, COUNT(ACADEMIC.DEPTNUM) FROM ACADEMIC
JOIN DEPARTMENT
ON DEPARTMENT.DEPTNUM=ACADEMIC.DEPTNUM
GROUP BY DEPARTMENT.STATE
HAVING COUNT(ACADEMIC.DEPTNUM) >= 20;
Use HAVING to return only rows where the count is 20+.
To take care of different case, do UPPER on all states:
SELECT UPPER(DEPARTMENT.STATE), COUNT(ACADEMIC.DEPTNUM) FROM ACADEMIC
JOIN DEPARTMENT
ON DEPARTMENT.DEPTNUM=ACADEMIC.DEPTNUM
GROUP BY UPPER(DEPARTMENT.STATE)
HAVING COUNT(ACADEMIC.DEPTNUM) >= 20;

I need to combine states that are similar and their values (e.g VIC and Vic and vic should equal VIC 68)
You need to use SUM and GROUP BY(UPPER/LOWER) on your sub-query or simply use UPPER/LOWER in the GROUP BY expression in your original query.
For example,
with data as (
select 'VIC' state, 14 cnt from dual union all
select 'vic' state, 1 cnt from dual union all
select 'Vic' state, 53 cnt from dual
)
select upper(state), sum(cnt) count
from data
group by upper(state);
UPP COUNT
--- ----------
VIC 68
Since you already have the query that gives you the count, all you need is to use UPPER/LOWER in the GROUP BY, so that the count treats states differing only in case as the same state:
SELECT UPPER(DEPARTMENT.STATE) AS "STATE"
FROM ACADEMIC
JOIN DEPARTMENT
ON DEPARTMENT.DEPTNUM=ACADEMIC.DEPTNUM
GROUP BY UPPER(DEPARTMENT.STATE)
HAVING COUNT(ACADEMIC.DEPTNUM) >= 20;
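As a rough check of the query above, here is a minimal sketch that rebuilds the counts from the question in an in-memory SQLite database (not Oracle) and runs the UPPER + HAVING query. The ACNUM column and the one-department-per-spelling layout are assumptions made only to reproduce the reported numbers; the ORDER BY is added just to make the output deterministic.

```python
import sqlite3

# Counts per raw STATE spelling, copied from the question's output (None = the blank state).
raw_counts = [("NSW", 82), (None, 7), ("QLD", 21), ("VIC", 14), ("vic", 1), ("WA", 42),
              ("Tas", 1), ("SA", 40), ("Qld", 55), ("Vic", 53), ("ACT", 35), ("TAS", 8)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE DEPARTMENT (DEPTNUM INTEGER PRIMARY KEY, STATE TEXT)")
# ACNUM is a hypothetical key column; only DEPTNUM matters for the query.
conn.execute("CREATE TABLE ACADEMIC (ACNUM INTEGER PRIMARY KEY, DEPTNUM INTEGER)")

# Assumption: one department per spelling, with as many academics as the reported count.
for deptnum, (state, n) in enumerate(raw_counts, start=1):
    conn.execute("INSERT INTO DEPARTMENT VALUES (?, ?)", (deptnum, state))
    conn.executemany("INSERT INTO ACADEMIC (DEPTNUM) VALUES (?)", [(deptnum,)] * n)

query = """
SELECT UPPER(DEPARTMENT.STATE) AS STATE, COUNT(ACADEMIC.DEPTNUM) AS CNT
FROM ACADEMIC
JOIN DEPARTMENT ON DEPARTMENT.DEPTNUM = ACADEMIC.DEPTNUM
GROUP BY UPPER(DEPARTMENT.STATE)
HAVING COUNT(ACADEMIC.DEPTNUM) >= 20
ORDER BY 1
"""
merged = conn.execute(query).fetchall()
# Dropping the count from the SELECT list gives the states-only result.
states_only = [state for state, _ in merged]
print(merged)
print(states_only)
```

TAS (8 + 1 = 9) and the blank state (7) fall below the threshold, so they drop out once HAVING is applied to the merged groups.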

Related

Redshift: I try to use Union but it returns 3 columns instead of 4. What can I do?

I have to find the streams that took place in a specific country and specific dates (overall_streams) and then for the same country and dates, I have to find the streams for a specific product.
In other words, I am trying to compare how the product did compared to the overall number of streams that took place in this place and time.
For this reason, I tried to use UNION (the subquery I did wouldn't give the right results).
Here is my simplified code:
Select age_group, gender, sum(streams) as product_streams
From t1
Where product='A'
And country= 'US'
And date= '1st week of July'
Group by 1,2
Union
Select age_group, gender, sum(streams) as overall_streams
From t1
Where country='US'
And date='1st week of July'
Group by 1,2
Notice the difference in the second query is that I haven't specified a product.
The result I get has 3 columns. The third column is named "product_streams" and its rows alternate between the product_streams and the overall_streams values.
Example:
0-18 f 100
0-18 f 560
0-18 m 45
0-18 m 398
The results are correct, I just want to have 4 columns instead of 3.
Like this:
age_group gender product_streams overall_streams
Any ideas?
I think you want conditional aggregation:
Select age_group, gender,
sum(streams) as overall_streams,
sum(case when product = 'A' then streams else 0 end) as product_streams
From t1
Where country = 'US' and
date = '1st week of July'
group by age_group, gender;
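A small sketch of the conditional aggregation above, run against an in-memory SQLite database. The rows are made up so that the totals match the example numbers in the question (100/560 and 45/398); product 'B' and the 'CA' row are assumptions added to show what gets filtered.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t1 (age_group TEXT, gender TEXT, product TEXT, "
    "country TEXT, date TEXT, streams INTEGER)"
)
week = "1st week of July"
conn.executemany(
    "INSERT INTO t1 VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("0-18", "f", "A", "US", week, 100),
        ("0-18", "f", "B", "US", week, 460),  # other product, counts toward overall only
        ("0-18", "m", "A", "US", week, 45),
        ("0-18", "m", "B", "US", week, 353),
        ("0-18", "f", "A", "CA", week, 999),  # other country, excluded by WHERE
    ],
)

query = """
SELECT age_group, gender,
       SUM(CASE WHEN product = 'A' THEN streams ELSE 0 END) AS product_streams,
       SUM(streams) AS overall_streams
FROM t1
WHERE country = 'US' AND date = '1st week of July'
GROUP BY age_group, gender
ORDER BY age_group, gender
"""
result = conn.execute(query).fetchall()
print(result)
```

Both totals come from a single pass over the same filtered rows, which is why no UNION or self-join is needed to get four columns.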

Update duplicate rows only with a MAX function in SQL

I have a table like this, where, suppose for the sake of an example, NAME is a unique identifier.
NAME AGE VALUE
Jack Under 65 3
Jack 66-74 5
John 66-74 7
John Over 75 9
Gill 25-35 11
Some NAMEs have more than one AGE, which is undesirable, as this is due to dirtiness of the data.
My aim is to update the duplicates only to have one AGE within each NAME. The desired output is thus:
NAME AGE VALUE
Jack Under 65 3
Jack Under 65 5
John 66-74 7
John 66-74 9
Gill 25-35 11
Something like this UPDATE statement should work, but it doesn't.
UPDATE table t1
SET t1.age=MAX(t1.age)
WHERE EXISTS (SELECT COUNT(t2.AGE)
FROM table t2
WHERE t1.NAME=t2.NAME
GROUP BY t2.NAME
HAVING COUNT(t2.AGE) > 1)
SQL Error: ORA-00934: group function is not allowed here
Second issue
Even if I got the above statement to work, there is a second issue. The idea there is to use the MAX (or MIN) function on strings to set the same value for all repeats within a group.
But unfortunately, this too would not quite work as desired. For consistency, ideally an age would default to the lowest age group. But because MAX/MIN compare alphabetic order on strings, this would give, e.g.:
"66-74" and "Under 65" => MAX="Under 65" -- Lowest
"66-74" and "Over 75" => MAX="Over 75" -- Highest
There are only four age groups, would it be possible to specify a custom order?
NB1: I am using Oracle SQL.
NB2: I do not mind if there is a way to achieve the result using a SELECT instead of an UPDATE statement.
Reproducible example
SELECT 'Jack' as NAME, 'Under 65' as AGE, 3 as VALUE from dual
UNION ALL
SELECT 'Jack' as NAME, '66-74' as AGE, 5 as VALUE from dual
UNION ALL
SELECT 'John' as NAME, '66-74' as AGE, 7 as VALUE from dual
UNION ALL
SELECT 'John' as NAME, 'Over 75' as AGE, 9 as VALUE from dual
UNION ALL
SELECT 'Gill' as NAME, '25-35' as AGE, 11 as VALUE from dual
You can define a custom order with a CASE expression and then use max() with KEEP (DENSE_RANK LAST ...) to pick the age that ranks last in that order. This worked for the given examples:
update t1 set age = (
select max(age) keep (dense_rank last
order by case when age = 'Over 75' then 1
when age = '66-74' then 2
when age = 'Under 65' then 3
when age = '25-35' then 4
end)
from t1 tx where tx.name = t1.name )
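The same idea can be tried outside Oracle. SQLite has no KEEP (DENSE_RANK LAST ...), so this sketch swaps in a different technique: a correlated subquery that sorts each name's rows by the same CASE ranking and takes the top one with LIMIT 1. The table name t1 and the ranking values are taken from the answer; everything else is an assumption for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (name TEXT, age TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO t1 VALUES (?, ?, ?)",
    [
        ("Jack", "Under 65", 3),
        ("Jack", "66-74", 5),
        ("John", "66-74", 7),
        ("John", "Over 75", 9),
        ("Gill", "25-35", 11),
    ],
)

# For each row, look up the age with the highest custom rank among rows sharing its name.
conn.execute("""
UPDATE t1
SET age = (
    SELECT tx.age
    FROM t1 AS tx
    WHERE tx.name = t1.name
    ORDER BY CASE tx.age
                 WHEN 'Over 75'  THEN 1
                 WHEN '66-74'    THEN 2
                 WHEN 'Under 65' THEN 3
                 WHEN '25-35'    THEN 4
             END DESC
    LIMIT 1
)
""")
result = conn.execute("SELECT name, age, value FROM t1 ORDER BY value").fetchall()
print(result)
```

Changing the numbers in the CASE expression is all it takes to change which age group wins within a name.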

SQL Query that shows individual grades then a row that has a caption for the average

I have the following problem. Create a query that shows all of the individual grades for student 127 in section 95 and also the average of those grades. The individual grades should come first with the average at the bottom. List the grade type code and numeric grade. The average row should have a caption of, "Average for student 127".
I'm able to get the student's grade type and grade, but I'm having trouble understanding how to do the caption. Here is the code I have right now. I know it's not correct, but I'll post it here so you can see what I have.
SELECT Grade_Type_Code, CAST(Numeric_Grade as DECIMAL(10,2)) AS Grade
FROM Grade
WHERE Student_Id = 127
AND Section_Id = 95
UNION
SELECT Grade_Type_Code, AVG(Numeric_Grade)
FROM Grade
WHERE Student_Id = 127
AND Section_Id = 95
GROUP BY Numeric_Grade, Grade_Type_Code;
I'm assuming I might have to throw COUNT(*) in there to get the average? But even if that's the way of going about it how would I add the caption?
Any help would be great, also here is the Schema.
DBMS: I'm using Oracle SQL Developer
Here is the Expected Result
GRADE_TYPE_CODE GRADE
----------------------- ------
QZ 92.00
QZ 91.00
PA 91.00
MT 88.00
HM 74.00
HM 84.00
HM 84.00
HM 74.00
FI 85.00
Average for student 127 84.78
Note: The chapter this problem is based on covers:
UNION
UNION ALL
INTERSECT
MINUS
You could do something like:
-- vvv Setting up test data vvv --
create table #temp (thing varchar(50), grade decimal(4,2))
insert into #temp (thing, grade)
select 'test', 90
union all select 'test2', 95.5
union all select 'test3', 60
union all select 'test4', 80
-- ^^^ Setting up test data ^^^ --
select thing, grade
from #temp
union all
select 'Student average', avg(grade)
from #temp
drop table #temp
Note that in this example I'm creating my own test data (using a SQL Server style #temp table); you could just swap my temp objects for your real ones. Since it sounds like this is school work, I don't want to give you the full answer :P
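To see the UNION ALL pattern with the numbers from the expected result, here is a sketch using an in-memory SQLite database rather than Oracle. The sort_key column is an assumption added so the average row reliably sorts last; the Grade table holds only the columns the query needs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Grade (Student_Id INTEGER, Section_Id INTEGER, "
    "Grade_Type_Code TEXT, Numeric_Grade REAL)"
)
# Grades copied from the expected result in the question.
grades = [("QZ", 92), ("QZ", 91), ("PA", 91), ("MT", 88), ("HM", 74),
          ("HM", 84), ("HM", 84), ("HM", 74), ("FI", 85)]
conn.executemany("INSERT INTO Grade VALUES (127, 95, ?, ?)", grades)

query = """
SELECT label, grade
FROM (
    SELECT Grade_Type_Code AS label, Numeric_Grade AS grade, 0 AS sort_key
    FROM Grade
    WHERE Student_Id = 127 AND Section_Id = 95
    UNION ALL
    -- The caption is just a string literal in the first column.
    SELECT 'Average for student 127', ROUND(AVG(Numeric_Grade), 2), 1
    FROM Grade
    WHERE Student_Id = 127 AND Section_Id = 95
)
ORDER BY sort_key
"""
rows = conn.execute(query).fetchall()
for label, grade in rows:
    print(f"{label:<25}{grade:.2f}")
```

UNION ALL matters here: plain UNION would remove the duplicate HM and QZ rows from the detail section before they are displayed.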

SQL: Select distinct one column but include all fields

How can I select distinct one column (user) and then output the rest of the fields based on this one column?
Input:
user age country
--------------------------------
Tom 34 US
Tom 32 EN
Dick 29 MX
Dick 29 DE
Harry 15 CA
output (distinct user column, and pick one row to output for rest of fields):
user age country count
--------------------------------------
Tom 34 US 2
Dick 29 MX 2
Harry 15 CA 1
Any help would be appreciated!
SELECT USER, AGE, MAX(COUNTRY), COUNT(*)
FROM TABLE
GROUP BY USER, AGE
You could try changing the MAX to a MIN. There is no need for DISTINCT here.
You could also normalize the country with something like SUBSTRING, but I'm not sure the rest of the data will always follow the same pattern (US, USS, etc.). If values share more than the first 2 or 3 characters, or if the differences start beyond a specific character, you may get wrong query results.
According to comments and updates.
SELECT USER, MAX(AGE), MAX(COUNTRY), COUNT(*)
FROM TABLE
GROUP BY USER
SELECT user, age, country, COUNT(*) AS c_rec FROM
(
SELECT DISTINCT user, age, SUBSTRING(country, 1, 2) AS country FROM yourTable
) T
GROUP BY user, age, country
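A quick sketch of the GROUP BY user variant with MAX, run on the sample rows from the question in an in-memory SQLite database. The table name people is hypothetical (TABLE itself is a reserved word). Note that MAX(age) and MAX(country) are computed independently, so in general they can come from different rows of the same user.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "people" is a hypothetical table name standing in for the question's table.
conn.execute("CREATE TABLE people (user TEXT, age INTEGER, country TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [
        ("Tom", 34, "US"),
        ("Tom", 32, "EN"),
        ("Dick", 29, "MX"),
        ("Dick", 29, "DE"),
        ("Harry", 15, "CA"),
    ],
)

query = """
SELECT user, MAX(age) AS age, MAX(country) AS country, COUNT(*) AS cnt
FROM people
GROUP BY user
ORDER BY user
"""
result = conn.execute(query).fetchall()
print(result)
```

For this sample the output happens to match the desired rows; with other data the age and country shown for a user might never have appeared together in one input row.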

SQL find entire row where only 2 columns values

I'm attempting to
select columns Age, Height, House_number, Street
from my_table
where count(combination of House_number, Street)
occurs more than once.
My table looks like this
Age, Height, House_number, Street
15 178 6 Mc Gill Crst
85 166 6 Mc Gill Crst
85 166 195 Mc Gill Crst
18 151 99 Moon Street
52 189 14a Grimm Lane
My desired outcome looks like this
Age, Height, House_number, Street
15 178 6 Mc Gill Crst
85 166 6 Mc Gill Crst
Stuck!
The best way to do this is with window functions, assuming your database supports them:
select Age, Height, House_number, Street
from (select t.*, count(*) over (partition by house_number, street) as cnt
from my_table t
) t
where cnt > 1
This uses a window function (also called an analytic function in Oracle). The expression count(*) over (partition by house_number, street) counts the number of rows for each house_number and street combination. It is similar to a group by, but it adds the count to each row rather than combining multiple rows into one.
Once you have that, it is easy to simply choose the rows where the value is greater than 1.
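Here is a minimal sketch of the window-function approach on the question's sample rows, using an in-memory SQLite database (window functions need SQLite 3.25 or newer). House_number is stored as text because of values like 14a; the ORDER BY is only there to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (Age INTEGER, Height INTEGER, House_number TEXT, Street TEXT)"
)
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?)",
    [
        (15, 178, "6", "Mc Gill Crst"),
        (85, 166, "6", "Mc Gill Crst"),
        (85, 166, "195", "Mc Gill Crst"),
        (18, 151, "99", "Moon Street"),
        (52, 189, "14a", "Grimm Lane"),
    ],
)

query = """
SELECT Age, Height, House_number, Street
FROM (
    -- cnt is repeated on every row of the same (House_number, Street) pair.
    SELECT t.*, COUNT(*) OVER (PARTITION BY House_number, Street) AS cnt
    FROM my_table t
) AS counted
WHERE cnt > 1
ORDER BY Age
"""
duplicates = conn.execute(query).fetchall()
print(duplicates)
```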
Since you haven't mentioned the RDBMS you are using, the query below should work on most RDBMS.
SELECT *
FROM tableName
WHERE (House_number, Street) IN
(
SELECT House_number, STREET
FROM tableName
GROUP BY House_number, STREET
HAVING COUNT(*) >= 2
)
Sounds like you need a NOT DISTINCT. The following might give you what you need: Multiple NOT distinct
If you do not have window functions, you can use a subquery with a JOIN. The subquery gets the list of house_number and street pairs that occur more than once, and this result is then joined back to your table:
select t1.age,
t1.height,
t1.house_number,
t1.street
from my_table t1
inner join
(
select house_number, street
from my_table
group by house_number, street
having count(*) > 1
) t2
on t1.house_number = t2.house_number
and t1.street = t2.street
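The join-back approach can be checked the same way. This sketch loads the question's sample rows into an in-memory SQLite database and runs the grouped subquery joined to the base table; the ORDER BY is an addition for a stable result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (age INTEGER, height INTEGER, house_number TEXT, street TEXT)"
)
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?)",
    [
        (15, 178, "6", "Mc Gill Crst"),
        (85, 166, "6", "Mc Gill Crst"),
        (85, 166, "195", "Mc Gill Crst"),
        (18, 151, "99", "Moon Street"),
        (52, 189, "14a", "Grimm Lane"),
    ],
)

query = """
SELECT t1.age, t1.height, t1.house_number, t1.street
FROM my_table t1
INNER JOIN (
    -- Address pairs that appear on more than one row.
    SELECT house_number, street
    FROM my_table
    GROUP BY house_number, street
    HAVING COUNT(*) > 1
) t2
  ON t1.house_number = t2.house_number
 AND t1.street = t2.street
ORDER BY t1.age
"""
joined = conn.execute(query).fetchall()
print(joined)
```

Because the subquery returns each duplicated address only once, the join does not multiply rows; it simply filters the base table down to the shared addresses.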