Select query group by alias - sql

I have this query:
SELECT * FROM TABLE1 WHERE AREA_CODE IN ('929', '718', '347', '646') GROUP BY AREA_CODE
Is it possible to get only one row, named 'NEW_YORK_AREA', that covers all four of these area codes? To be clearer: say the table has 4 records for each area code listed above, but you want only one result row with the alias 'NEW_YORK_AREA'. I hope that is clear; let me know if you have any questions and I will edit the question. Thank you all and have a great day.
UPDATE: requirements have changed and it is no longer needed. Thank you all for your help! :)

DB2 supports listagg(). So:
SELECT 'NEW_YORK_AREA' as cityname,
LISTAGG(AREA_CODE, ',') WITHIN GROUP (ORDER BY AREA_CODE) as areacodes
FROM TABLE1
WHERE AREA_CODE IN ('212', '929', '718', '347', '646') ;
I helpfully added 212, the most famous NYC area code ;)
If you have duplicates, then you need to use a subquery to remove them before aggregating.
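A minimal sketch of the "remove duplicates in a subquery first" idea. It runs on SQLite through Python's sqlite3 module only so that it is self-contained; SQLite's group_concat stands in for DB2's LISTAGG, and the sample rows are made up. Note that SQLite does not guarantee the order of the concatenated values, whereas LISTAGG ... WITHIN GROUP (ORDER BY ...) does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (area_code TEXT)")
# Hypothetical sample data with repeated area codes.
conn.executemany(
    "INSERT INTO table1 VALUES (?)",
    [("212",), ("718",), ("718",), ("929",), ("212",)],
)

# The inner query removes duplicates; the outer query aggregates
# what is left into a single row.
row = conn.execute(
    """
    SELECT 'NEW_YORK_AREA' AS cityname,
           group_concat(area_code, ',') AS areacodes
    FROM (
        SELECT DISTINCT area_code
        FROM table1
        WHERE area_code IN ('212', '929', '718', '347', '646')
    )
    """
).fetchone()

print(row)
```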

Logically, what you want to do is group everything into the same category. You could do this by explicitly grouping all rows by a single value:
select 'NEW_YORK_AREA',
--whatever functions you need to aggregate the data here.
count(var1),
max(var2)
from table1
where area_code in ('929', '718', '347', '646')
group by 1
However, if the only functions that refer to the data in the table are aggregate functions, DB2 lets you omit the group by, and it will automatically group everything into a single row. The following is equivalent to the above query:
select 'NEW_YORK_AREA',
count(var1),
max(var2)
from table1
where area_code in ('929', '718', '347', '646')
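The two forms can be compared side by side with a small sketch. This uses SQLite through Python's sqlite3 module rather than DB2, and the columns var1 and var2 hold made-up values. One caveat worth checking on your own system: when no rows match the WHERE clause, the grouped query returns no rows at all, while the ungrouped aggregate still returns one row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (area_code TEXT, var1 INTEGER, var2 INTEGER)")
# Hypothetical sample data; '305' is outside the New York list.
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [("929", 1, 10), ("718", 2, 20), ("347", 3, 30), ("646", 4, 40), ("305", 5, 50)],
)

QUERY = """
    SELECT 'NEW_YORK_AREA', count(var1), max(var2)
    FROM table1
    WHERE area_code IN ({codes})
"""
nyc = "'929', '718', '347', '646'"

# Same result when rows match: one row for the whole set.
with_group = conn.execute(QUERY.format(codes=nyc) + " GROUP BY 1").fetchall()
without_group = conn.execute(QUERY.format(codes=nyc)).fetchall()

# Different result when nothing matches.
empty_with_group = conn.execute(QUERY.format(codes="'000'") + " GROUP BY 1").fetchall()
empty_without_group = conn.execute(QUERY.format(codes="'000'")).fetchall()

print(with_group, without_group)
print(empty_with_group, empty_without_group)
```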

What about creating an AREA_CODE_GROUP table
AREA_GROUP,AREA_CODE
'NEW_YORK_AREA','929'
'NEW_YORK_AREA','718'
'NEW_YORK_AREA','347'
'NEW_YORK_AREA','646'
that you can join:
SELECT t.* FROM TABLE1 t
INNER JOIN AREA_CODE_GROUP g
ON t.AREA_CODE = g.AREA_CODE
WHERE g.AREA_GROUP = 'NEW_YORK_AREA'
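If the goal is one row per area group rather than the individual rows, the lookup table can also be used as the grouping key. A rough sketch on SQLite via Python's sqlite3 module; the customer column and the 'MIAMI_AREA' group are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table1 (area_code TEXT, customer TEXT);
    CREATE TABLE area_code_group (area_group TEXT, area_code TEXT);

    -- Hypothetical sample data.
    INSERT INTO table1 VALUES
        ('929', 'a'), ('718', 'b'), ('347', 'c'), ('646', 'd'), ('305', 'e');
    INSERT INTO area_code_group VALUES
        ('NEW_YORK_AREA', '929'),
        ('NEW_YORK_AREA', '718'),
        ('NEW_YORK_AREA', '347'),
        ('NEW_YORK_AREA', '646'),
        ('MIAMI_AREA', '305');
    """
)

# Group by the name from the lookup table instead of the raw area code.
rows = conn.execute(
    """
    SELECT g.area_group, COUNT(*) AS row_count
    FROM table1 t
    INNER JOIN area_code_group g ON t.area_code = g.area_code
    GROUP BY g.area_group
    ORDER BY g.area_group
    """
).fetchall()

print(rows)
```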

Related

Expression "neither grouped nor aggregated" in BigQuery, how to include it without having to group by it?

I have this block of code:
(SELECT SELL_STR.SELL_STR_NBR, DATES.PSTD_FSCL_YR_WK, count(*) as WKLY_DLVRY--, WORK_ORD_NBR
from `analytics-df-thd.DFS.FACTS_DLY_DELV_STATS` A
inner join R1
on r1.FSCL_YR_WK = A.DATES.PSTD_FSCL_YR_WK
group by SELL_STR.SELL_STR_NBR, DATES.PSTD_FSCL_YR_WK
)
I need to add WORK_ORD_NBR to the SELECT so I can later join it
select
WKLY_DLVRY.WKLY_DLVRY
...
from table1 a
join
WKLY_DLVRY
on WKLY_DLVRY.WORK_ORD_NBR = a.WORK_ORD_NBR
However when added to WKLY_DLVRY, I receive this error:
SELECT list expression references column WORK_ORD_NBR which is neither grouped nor aggregated at
I have read many threads, but they offer few options: they only suggest that WORK_ORD_NBR must be grouped, which changes the count. I appreciate any help.
You can keep a column without grouping by it, as shown below.
For more information please check here.
SELECT
count(*) as WKLY_DLVRY,
ARRAY_AGG(
STRUCT(WORK_ORD_NBR)
-- ORDER BY [column] DESC LIMIT 1 -- if you want to distinguish multiple values
)[OFFSET(0)].*,
FROM ~
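A different technique from the ARRAY_AGG answer, in case every WORK_ORD_NBR row is needed for the later join: a window function attaches the per-group count to each row without collapsing the rows. The sketch below runs on SQLite (3.25 or later) through Python's sqlite3 module, not on BigQuery, and the table name deliveries and its flat columns are assumptions standing in for the nested fields in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE deliveries (
        sell_str_nbr INTEGER,
        fscl_yr_wk   INTEGER,
        work_ord_nbr TEXT
    );
    -- Hypothetical sample data.
    INSERT INTO deliveries VALUES
        (1, 202301, 'W1'),
        (1, 202301, 'W2'),
        (1, 202302, 'W3'),
        (2, 202301, 'W4');
    """
)

# COUNT(*) OVER (...) computes the count per store and week,
# but keeps one output row per work order.
rows = conn.execute(
    """
    SELECT work_ord_nbr, sell_str_nbr, fscl_yr_wk,
           COUNT(*) OVER (PARTITION BY sell_str_nbr, fscl_yr_wk) AS wkly_dlvry
    FROM deliveries
    ORDER BY work_ord_nbr
    """
).fetchall()

for r in rows:
    print(r)
```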

Oracle query mistake

I need to know where the mistake is in this oracle query?
SELECT(KEY1),COUNT(*) FROM TABLE1 GROUP BY AGE
SELECT KEY1,COUNT(*) FROM TABLE1 GROUP BY KEY1
There are two problems. First, the parentheses around KEY1 are unnecessary and should be removed. Second, every column in the SELECT list that is not inside an aggregate function must appear in the GROUP BY clause; in this case that is KEY1. If you want to group by age, you have to select AGE instead:
SELECT AGE,COUNT(*) FROM TABLE1 GROUP BY AGE
Your table naming is not very good. I suggest having a look at a GROUP BY tutorial such as https://www.w3schools.com/sql/sql_groupby.asp or the SQL tutorial at https://www.w3schools.com/sql/
Your query has an issue. Modify it as below:
SELECT KEY1,COUNT(*) FROM TABLE1 GROUP BY KEY1
Observation:
All columns that appear in the SELECT list alongside aggregate functions should be included in the GROUP BY columns.
Your first column has parentheses around it, which should be removed.

SQL: How to select full rows in each group which matched conditions on some fields

I have one table in postgresql database, for example:
Is there any way to get the result shown in the output below with good performance? That is, for each group I want the full rows that match some condition, such as userid=100, plus additional fields computed by aggregate functions.
Output (with userid=100 as the condition I want, or another condition):
Note: The data is dynamic; fields such as content and seen are random.
I have used this SQL query, but it can only return two fields:
SELECT groupid,
string_agg(text(userid), ', ') AS lst_userids
FROM t1
GROUP BY groupid
Thanks for any help!
You seem to want something like this:
SELECT min(id) as id, groupid,
string_agg(text(userid), ', ') AS lst_userids,
max(case when seen then content end) as content,
bool_or(seen) as seen
FROM t1
GROUP BY groupid;
I am guessing what the actual logic is, but you can definitely have multiple columns in an aggregation query.
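A runnable sketch of the same multi-column aggregation, on SQLite through Python's sqlite3 module instead of PostgreSQL. The sample rows are invented, since the original table is not shown. SQLite has no string_agg or bool_or in older versions, so group_concat and MAX over a 0/1 seen flag are used as stand-ins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE t1 (
        id INTEGER, groupid INTEGER, userid INTEGER,
        content TEXT, seen INTEGER  -- seen is 0 or 1
    );
    -- Hypothetical sample data.
    INSERT INTO t1 VALUES
        (1, 1, 100, 'hello', 1),
        (2, 1, 200, 'world', 0),
        (3, 2, 100, 'foo',   0),
        (4, 2, 300, 'bar',   0);
    """
)

# Several aggregates side by side in one GROUP BY query.
rows = conn.execute(
    """
    SELECT MIN(id) AS id,
           groupid,
           group_concat(userid, ', ') AS lst_userids,
           MAX(CASE WHEN seen THEN content END) AS content,
           MAX(seen) AS seen
    FROM t1
    GROUP BY groupid
    ORDER BY groupid
    """
).fetchall()

for r in rows:
    print(r)
```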

To Remove Duplicates from Netezza Table

I have a scenario for a type 2 table where I have to remove duplicates at the whole-row level.
Let's consider the example below as the data in the table.
A|B|C|D|E
100|12-01-2016|2|3|4
100|13-01-2016|3|4|5
100|14-01-2016|2|3|4
100|15-01-2016|5|6|7
100|16-01-2016|5|6|7
If you consider A the key column, the last 2 rows are duplicates.
Generally, to find duplicates we use GROUP BY:
select A,C,D,E,count(1)
from table
group by A,C,D,E
having count(*)>1
For this data, the output would flag both 100|2|3|4 and 100|5|6|7 as duplicates.
However, only 100|5|6|7 is a duplicate as per type 2, not 100|2|3|4, because that value came back in the 3rd run rather than immediately after the 1st load.
If I add the date field to the GROUP BY, 100|5|6|7 will not be considered a duplicate, but in reality it is.
I am trying to identify duplicates as explained above: they should be only 100|5|6|7 and not 100|2|3|4.
Can someone please help with the SQL for this?
Regards
Raghav
Use the row_number analytic function to get rid of duplicates.
delete from
(
select a,b,c,d,e,row_number() over (partition by a,b,c,d,e) as rownumb
from table
) as a
where rownumb > 1
If you want to see all duplicated rows, you need to join the table with your GROUP BY query, or filter the table using the GROUP BY query as a subquery.
WITH cte AS (select a, B, C,D,E, count(*)
from TABLE
group by 1,2,3,4,5
having count(*)>1)
SELECT * FROM cte
WHERE B <> B + 1
Try this query and see if it works. If you get any errors, let me know.
I am assuming that your column B is in a date format; if not, cast it to date.
If you can see the duplicates, just replace select * with delete.
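The requirement in the question (a row counts as a duplicate only when it repeats the row loaded immediately before it) can be sketched with the LAG window function, which compares each row to its predecessor. This is an illustration on SQLite (3.25 or later) through Python's sqlite3 module, not tested on Netezza. The table name type2 is made up, and the dates are stored as ISO text (yyyy-mm-dd) so that text ordering matches date ordering.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE type2 (a INTEGER, b TEXT, c INTEGER, d INTEGER, e INTEGER);
    -- Data from the question, with dates rewritten as yyyy-mm-dd.
    INSERT INTO type2 VALUES
        (100, '2016-01-12', 2, 3, 4),
        (100, '2016-01-13', 3, 4, 5),
        (100, '2016-01-14', 2, 3, 4),
        (100, '2016-01-15', 5, 6, 7),
        (100, '2016-01-16', 5, 6, 7);
    """
)

# A row is flagged only if c, d and e all equal the values of the
# previous row for the same key a, ordered by the load date b.
dupes = conn.execute(
    """
    SELECT a, b, c, d, e
    FROM (
        SELECT a, b, c, d, e,
               LAG(c) OVER (PARTITION BY a ORDER BY b) AS prev_c,
               LAG(d) OVER (PARTITION BY a ORDER BY b) AS prev_d,
               LAG(e) OVER (PARTITION BY a ORDER BY b) AS prev_e
        FROM type2
    )
    WHERE c = prev_c AND d = prev_d AND e = prev_e
    """
).fetchall()

print(dupes)
```

With the sample data, 100|2|3|4 on the 14th is not flagged, because the row before it is 100|3|4|5.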

Why shouldn’t you use DISTINCT when you could use GROUP BY?

According to tips from MySQL performance wiki:
Don't use DISTINCT when you have or could use GROUP BY.
Can somebody post example of queries where GROUP BY can be used instead of DISTINCT?
If you know that two columns from your result are always directly related then it's slower to do this:
SELECT DISTINCT CustomerId, CustomerName FROM (...)
than this:
SELECT CustomerId, CustomerName FROM (...) GROUP BY CustomerId
because in the second case it only has to compare the id, but in the first case it has to compare both fields. This is a MySQL specific trick. It won't work with other databases.
SELECT Code
FROM YourTable
GROUP BY Code
vs
SELECT DISTINCT Code
FROM YourTable
The basic rule: put all the columns from the SELECT clause into the GROUP BY clause,
so
SELECT DISTINCT a,b,c FROM D
becomes
SELECT a,b,c FROM D GROUP BY a,b,c
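A quick way to convince yourself that the two forms return the same set of rows, sketched on SQLite through Python's sqlite3 module with made-up data. It says nothing about which form is faster on a given engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE D (a INTEGER, b INTEGER, c INTEGER);
    -- Hypothetical sample data with repeated rows.
    INSERT INTO D VALUES
        (1, 1, 1), (1, 1, 1), (1, 2, 1), (2, 2, 2), (2, 2, 2);
    """
)

distinct_rows = conn.execute("SELECT DISTINCT a, b, c FROM D").fetchall()
grouped_rows = conn.execute("SELECT a, b, c FROM D GROUP BY a, b, c").fetchall()

# Row order is not guaranteed without ORDER BY, so compare sorted lists.
print(sorted(distinct_rows) == sorted(grouped_rows))
```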
Example.
Relation customer(ssnum,name, zipcode, address) PK(ssnum). ssnum is social security number.
SQL:
Select DISTINCT ssnum from customer where zipcode=1234 group by name
This SQL statement returns unique records for those customers that have zipcode 1234. At the end, the results are grouped by name.
Here DISTINCT is not necessary, because you are selecting ssnum, which is already unique: ssnum is the primary key, and two people cannot have the same ssnum.
In this case, Select ssnum from customer where zipcode=1234 group by name will give better performance than "... DISTINCT.......".
DISTINCT is an expensive operation in a DBMS.