Window Function in PostgreSQL - sql

I have a problem with my SELECT statement that I just can't figure out. The query is as follows:
SELECT
count(1),
interaction_type_id
FROM
tibrptsassure.d_interaction_sub_type
GROUP BY
interaction_type_id
HAVING
count(interaction_type_id) > 1
ORDER BY
count(interaction_type_id) DESC
LIMIT 5;
Since my application does not support the use of the LIMIT keyword, I tried changing my query using the rank() function like so:
SELECT
interaction_type_id,
rank() OVER (PARTITION BY interaction_type_id ORDER BY count(interaction_type_id)
DESC)
FROM
tibrptsassure.d_interaction_sub_type;
However, this way I ended up with the following error message:
ERROR: column "d_interaction_sub_type.interaction_type_id" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: SELECT interaction_type_id, rank() OVER (PARTITION BY inter...
^
********** Error **********
ERROR: column "d_interaction_sub_type.interaction_type_id" must appear in the GROUP BY clause or be used in an aggregate function
SQL state: 42803
Character: 9
Is there an equivalent of rownum() in PostgreSQL? (Apart from using the LIMIT keyword to achieve the same result, that is.)
Does anybody have any suggestions for me? Thanks in advance.

Test whether the following works (it is standard postgresql syntax and should work):
with
t as (
select 1 as id union all
select 1 as id union all
select 2 union all
select 2 union all
select 3)
select
id
from
t
group by
id
having
count(id) > 1
order by
id desc
limit 1
If this works then you have some syntax problem. If this does not work then you have some other issue - maybe the software you are using is constrained in some really strange way.
You can also use row_number(), but it is not a very efficient way:
with
t as (
select 1 as id union all
select 1 as id union all
select 2 union all
select 2 union all
select 3)
, u as (
select
id,
count(*)
from
t
group by
id
)
, v as (
select
*,
row_number() over(order by id) c
from
u
)
select
*
from
v
where
c < 2

The problem was in my query, i.e. there was a syntax error.
What I needed was the top 5 category_id and top 5 instances of type_id in each category_id and top 5 instances of sub_type_id in each type_id. To achieve this, I changed the query in the following way and finally got the expected output:
SELECT * FROM (
SELECT t1.int_subtype_key, t2.interaction_sub_type_desc, interaction_category_id,
interaction_type_id, interaction_sub_type_id, count(interaction_sub_type_id) AS
subtype_cnt,
rank()
over (PARTITION BY interaction_category_id, interaction_type_id ORDER BY
count(interaction_sub_type_id) DESC) AS rank
FROM tibrptsassure.f_cc_call_analysis t1 INNER JOIN
tibrptsassure.d_interaction_sub_type t2 ON t1.int_cat_key = t2.intr_catg_ref_nbr
AND t1.int_subtype_key = t2.intr_sub_type_ref_nbr INNER JOIN
tibrptsassure.d_calendar t3 ON t1.interaction_date = t3.calendar_date GROUP BY
t2.interaction_sub_type_desc, t1.int_subtype_key, interaction_category_id,
interaction_type_id, interaction_sub_type_id) AS sub_type
WHERE rank <= 5;
Thanks to everyone for paying attention and helping me with this.
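As a rough illustration of ranking aggregated counts instead of using LIMIT, here is a minimal sketch. It runs in SQLite through Python's sqlite3 module rather than in PostgreSQL, an assumption made only so the example is self-contained (window functions need SQLite 3.25 or later). The sample rows and the helper name top_types are invented for the illustration. The counts are computed in an inner query first and rank() is applied to them in an outer one, which avoids mixing the aggregate and the window function in one SELECT.

```python
import sqlite3

def top_types(conn, n):
    """Return the n most frequent interaction_type_id values without LIMIT."""
    sql = """
        SELECT interaction_type_id, cnt
        FROM (
            SELECT interaction_type_id, cnt,
                   rank() OVER (ORDER BY cnt DESC) AS rnk
            FROM (
                -- Step 1: aggregate, keeping only ids that occur more than once
                SELECT interaction_type_id, count(*) AS cnt
                FROM d_interaction_sub_type
                GROUP BY interaction_type_id
                HAVING count(*) > 1
            ) AS counts
        ) AS ranked
        WHERE rnk <= ?          -- Step 2: filter on the rank instead of LIMIT
        ORDER BY cnt DESC, interaction_type_id
    """
    return conn.execute(sql, (n,)).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE d_interaction_sub_type (interaction_type_id INTEGER)")
# Hypothetical data: id 10 occurs 3 times, 20 twice, 30 once, 40 four times.
rows = [(10,)] * 3 + [(20,)] * 2 + [(30,)] + [(40,)] * 4
conn.executemany("INSERT INTO d_interaction_sub_type VALUES (?)", rows)

print(top_types(conn, 2))
```

Note that rank() gives tied counts the same rank, so the filter can return more than n rows when there are ties; row_number() would cap the result at exactly n.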

Related

oracle db from keyword not found where expected in double cte

I have a query with two CTEs: the first one joins two tables and the second one applies row_number() with PARTITION BY:
with cte as (
select *
from memuat.product p
join memuat.licence l on p.id = l.product_id
where l.managed = 'TRUE'
),
joined as (
select
*,
row_number() over (partition by id order by id) as rn
from cte
)
select * from joined;
I get the following error:
ORA-00923: FROM keyword not found where expected, ERROR at line 12.
I cannot figure out what the syntax error in my query is.
Oracle is nitpicking when it comes to SELECT *. SELECT * means "select everything", so how can you possibly add something to it? In Oracle you cannot SELECT *, 1 AS something_else FROM some_table. You must have SELECT some_table.*, 1 AS something_else FROM some_table, so you are no longer selecting "everything", but "everything from the table" :-)
You have
select
*,
row_number() over (partition by id order by id) as rn
from cte
It must be
select
cte.*,
row_number() over (partition by id order by id) as rn
from cte
instead.

Finding the highest COUNT of a group per individual GROUP BY query in Hive

I have a table of customer transactions where an individual_id appears once for every transaction.
There is a category column called Name_desc. I would like to group by individual and find the most common Name_desc category for each individual.
Suppose the data looks like this:
Id Name_desc
---- ------
1 a
2 c
1 b
2 c
1 b
I want the output below:
Id Name_desc (most occurring category)
------ ------
1 b
2 c
I tried the query below and got this error:
Error while compiling statement: FAILED: ParseException line 4:19 cannot recognize input near 'select' 'max' '(' in expression specification
select name_desc, count(*) as count_e
from db.cust_scan
group by id, name_desc
having count(*)= ( select max(count_e),id
from
(
select id, name_desc, count(*) as count_e
from
db.cust_scan
where
base_div_nbr =1
and
country_code ='US'
and
retail_channel_code=1
and visit_date between '2019-01-01' and '2019-12-31'
GROUP by
individual_id, tt_id_desc
order by individual_id, count_e desc
) as t
group by individual_id )
I would appreciate any suggestions or help with this query. If there is a more efficient way of getting this done, let me know.
The following script was written and tested for MSSQL. But as Hive also supports Row_Number() and subqueries, it should help you get the required output:
SELECT A.Id, A.Name_desc
FROM
(
SELECT Id,Name_desc,
row_number() over (partition by id order by COUNT(*) desc) AS RN
FROM your_table
GROUP BY Id,Name_desc
) A
WHERE RN = 1
You need a subquery in Hive:
SELECT s.Id, s.Name_desc
FROM
(
select s.*, row_number() over (partition by s.id order by s.cnt desc) rn
from
(
SELECT Id, Name_desc, COUNT(*) cnt
FROM your_table
GROUP BY Id, Name_desc
) s
) s
WHERE rn= 1;
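To check the nested-subquery version against the sample data from the question, here is a small sketch. It uses SQLite via Python's sqlite3 module as a stand-in for Hive (an assumption, chosen only so the example runs anywhere; window functions need SQLite 3.25 or later). The table name your_table is the placeholder already used in the answers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (Id INTEGER, Name_desc TEXT)")
# The five sample rows from the question.
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?)",
    [(1, "a"), (2, "c"), (1, "b"), (2, "c"), (1, "b")],
)

sql = """
SELECT Id, Name_desc
FROM (
    -- Number the categories of each Id, most frequent first
    SELECT Id, Name_desc,
           row_number() OVER (PARTITION BY Id ORDER BY cnt DESC) AS rn
    FROM (
        -- Count how often each (Id, Name_desc) pair occurs
        SELECT Id, Name_desc, count(*) AS cnt
        FROM your_table
        GROUP BY Id, Name_desc
    ) AS counted
) AS ranked
WHERE rn = 1
ORDER BY Id
"""
most_common = conn.execute(sql).fetchall()
print(most_common)
```

If two categories tie for an Id, row_number() picks one of them arbitrarily; adding Name_desc as a second ORDER BY key inside the window would make the choice stable.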

Select Top 100 Groups

I have thousands of groups in a table, something like:
1..
1..
2..
2..
2..
2..
3..
3..
.
.
.
10000..
10000..
How can I write a select that gives me the top 3 groups each time?
I want something like select Top 3 from rows, but it has to return the first three groups, not the first three rows.
You can try this:
;with cte as (
select distinct groupId from mytable
)
select * from mytable where groupId in (select top 3 groupId from cte order by groupId)
You can use DENSE_RANK to assign a number to each group. All members of the same group will have the same number. Then in an outer query, select top 3 groups:
SELECT *
FROM (SELECT *, DENSE_RANK() OVER (ORDER BY id) AS rnk
FROM mytable ) t
WHERE t.rnk <= 3
The above query assumes that id is the column used to group records together.
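The difference between ranking groups and numbering rows inside groups is easy to check on a toy table. The sketch below uses SQLite through Python's sqlite3 module as a stand-in for SQL Server (an assumption for portability; SQLite 3.25 or later is needed). The payload column and the sample rows are hypothetical. It runs the DENSE_RANK query and, for contrast, a ROW_NUMBER() variant partitioned by the group column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, payload TEXT)")
# Hypothetical rows: group 2 has four rows, every other group has two.
data = [
    (1, "a"), (1, "b"),
    (2, "c"), (2, "d"), (2, "e"), (2, "f"),
    (3, "g"), (3, "h"),
    (4, "i"), (4, "j"),
    (5, "k"), (5, "l"),
]
conn.executemany("INSERT INTO mytable VALUES (?, ?)", data)

# dense_rank() numbers the groups themselves: 1, 2, 3, ...
by_group = """
SELECT id, payload
FROM (SELECT id, payload, dense_rank() OVER (ORDER BY id) AS rnk
      FROM mytable) AS t
WHERE rnk <= 3
ORDER BY id, payload
"""

# row_number() with PARTITION BY restarts at 1 inside every group.
by_row = """
SELECT id, payload
FROM (SELECT id, payload,
             row_number() OVER (PARTITION BY id ORDER BY payload) AS rn
      FROM mytable) AS t
WHERE rn <= 3
ORDER BY id, payload
"""

first_groups = conn.execute(by_group).fetchall()
rows_per_group = conn.execute(by_row).fetchall()
print(sorted({r[0] for r in first_groups}))    # groups kept by dense_rank
print(sorted({r[0] for r in rows_per_group}))  # groups kept by row_number
```

The first query keeps every row of the first three groups, while the second keeps at most three rows from each of the five groups.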
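Use the ranking function ROW_NUMBER(). Note that partitioning by GroupId numbers the rows within each group, so this returns up to three rows from every group: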
SELECT *
FROM (SELECT *,
Row_number()
OVER(
partition BY GroupId
ORDER BY GroupId) AS [rn]
FROM YourTable) t
WHERE rn <= 3
Check this MSDN doc for details of all ranking functions.
There is a SQL TOP clause that does this:
SELECT TOP number|percent column_name(s) FROM table_name;
A description of what it does, and of the equivalent syntax in other databases such as MySQL and MS Access, can be found here: http://www.w3schools.com/sql/sql_top.asp
My bad, I misread your question: this returns the top rows, not the top groups. Could you explain what you are trying to do in more detail?
SELECT *
FROM
(SELECT *
,ROW_NUMBER() OVER (PARTITION BY [Group] ORDER BY [Group] ASC)rn
FROM TableName
)A
WHERE rn <= 3

ORDER BY upper(...) with a UNION giving me problems

I'm having a bit of trouble figuring out why I'm having this problem.
This code works exactly how it should. It combines the two tables (MESSAGES and MESSAGES_ARCHIVE) and orders them correctly.
SELECT * FROM (
SELECT rownum as rn, a.* FROM (
SELECT
outbound.FROM_ADDR, outbound.TO_ADDR, outbound.EMAIL_SUBJECT
from MESSAGES outbound
where (1 = 1)
UNION ALL
SELECT
outboundarch.FROM_ADDR, outboundarch.TO_ADDR, outboundarch.EMAIL_SUBJECT
from MESSAGES_ARCHIVE outboundarch
where (1 = 1)
order by FROM_ADDR DESC
) a
) where rn between 1 and 25
However, this code does not work.
SELECT * FROM (
SELECT rownum as rn, a.* FROM (
SELECT
outbound.FROM_ADDR, outbound.TO_ADDR, outbound.EMAIL_SUBJECT
from MESSAGES outbound
where (1 = 1)
UNION ALL
SELECT
outboundarch.FROM_ADDR, outboundarch.TO_ADDR, outboundarch.EMAIL_SUBJECT
from MESSAGES_ARCHIVE outboundarch
where (1 = 1)
order by upper(FROM_ADDR) DESC
) a
) where rn between 1 and 25
and returns this error
ORA-01785: ORDER BY item must be the number of a SELECT-list expression
01785. 00000 - "ORDER BY item must be the number of a SELECT-list expression"
I'm trying to get the two tables ordered regardless of letter case, which is why I'm using upper(FROM_ADDR). Any suggestions? Thanks!
I'm not quite sure why this is generating an error, but it probably has to do with scoping rules for union queries. There is an easy work-around, using row_number():
SELECT * FROM (
SELECT row_number() over (order by upper(FROM_ADDR) desc) as rn, a.*
FROM (
SELECT
outbound.FROM_ADDR, outbound.TO_ADDR, outbound.EMAIL_SUBJECT
from MESSAGES outbound
where (1 = 1)
UNION ALL
SELECT
outboundarch.FROM_ADDR, outboundarch.TO_ADDR, outboundarch.EMAIL_SUBJECT
from MESSAGES_ARCHIVE outboundarch
where (1 = 1)
) a
)
where rn between 1 and 25
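To see the row_number() workaround produce a case-insensitive ordering, here is a minimal sketch. It runs in SQLite through Python's sqlite3 module, not in Oracle (an assumption made for a self-contained example; SQLite 3.25 or later is needed), and the addresses are made-up sample data. The descending direction follows the ORDER BY in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
for table in ("messages", "messages_archive"):
    conn.execute(
        f"CREATE TABLE {table} (from_addr TEXT, to_addr TEXT, email_subject TEXT)"
    )
# Hypothetical rows with mixed-case sender addresses.
conn.executemany(
    "INSERT INTO messages VALUES (?, ?, ?)",
    [("bob@x.test", "t@x.test", "s1"), ("Alice@x.test", "t@x.test", "s2")],
)
conn.executemany(
    "INSERT INTO messages_archive VALUES (?, ?, ?)",
    [("carol@x.test", "t@x.test", "s3"), ("Dave@x.test", "t@x.test", "s4")],
)

sql = """
SELECT rn, from_addr
FROM (
    -- Number the combined rows outside the UNION ALL, where any
    -- expression is allowed in the window's ORDER BY
    SELECT row_number() OVER (ORDER BY upper(from_addr) DESC) AS rn, a.*
    FROM (
        SELECT from_addr, to_addr, email_subject FROM messages
        UNION ALL
        SELECT from_addr, to_addr, email_subject FROM messages_archive
    ) AS a
) AS numbered
WHERE rn BETWEEN 1 AND 25
ORDER BY rn
"""
page = conn.execute(sql).fetchall()
print(page)
```

Because the ordering expression lives in the window function rather than in the ORDER BY of the compound query, the restriction behind ORA-01785 does not come into play.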
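upper(FROM_ADDR) is an expression, not a column of the combined result, and the ORDER BY of a UNION query can only refer to select-list columns (by name, alias, or position).
Instead of:
order by upper(FROM_ADDR) DESC
add the expression to the select list of both branches under an alias, for example upper(FROM_ADDR) as FROM_ADDR_UPPER (a name chosen here for illustration), and then use:
order by FROM_ADDR_UPPER DESC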

SQL Server query for top rows to select with condition

I want to skip the first 5 records and then select 10 records.
I have a column email in table user. Here I am selecting the top 10 unique rows from table user using this query:
select DISTINCT TOP 10 email from user
Now I am trying to select the top 10 unique rows from the table while skipping the first 5 records:
select DISTINCT SKIP 5 TOP 10 email from user
This does not work and returns an error. Can anyone help me?
SELECT A.NAME FROM
(SELECT distinct DENSE_RANK() OVER(ORDER BY NAME) RNK,NAME FROM USERS) A
WHERE A.RNK>5 AND A.RNK<=15
Using LIMIT (or TOP) without an ORDER BY does not guarantee which rows you get.
If you use analytic functions with an explicit ORDER BY, the result is well defined.
Here is one way to do it. I like to use Common Table Expressions for some things like this because it makes the query easy to understand, although this isn't particularly complicated.
WITH CTE AS
(
Select Distinct Email From [User]
)
,
CTE1 AS
(
Select Email, ROW_NUMBER() over (ORDER BY Email) AS RowNumber
From CTE
)
Select Top 10 * From CTE1 Where RowNumber > 5 Order By RowNumber
with t2 as
(
select t1.*,
row_number() over (order by id) rn
from
(select email, max(id) as id from [user] group by email) as t1
)
select * from t2 where rn between 6 and 15
How about this:
SELECT *
FROM (SELECT *, ROW_NUMBER() OVER (ORDER BY email) AS row
FROM [user] ) a
WHERE row > 5 and row <= 15
I think you are using SKIP incorrectly: SQL Server has no SKIP keyword. From SQL Server 2012 onward, the equivalent is OFFSET ... FETCH, which is part of the ORDER BY clause.
SELECT DISTINCT TOP(10) Email FROM TableName WHERE Email NOT IN (SELECT DISTINCT TOP(5) Email FROM TableName ORDER BY Email) ORDER BY Email
You can try this query; it fetches 10 distinct email addresses after skipping the first 5, as described in the question.
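The skip-then-take pattern used in these answers can be sketched as follows. The example runs in SQLite through Python's sqlite3 module instead of SQL Server (an assumption made so it is self-contained; SQLite 3.25 or later is needed). The table name users, the helper page_of_emails, and the sample addresses are hypothetical. Duplicates are removed first, the distinct addresses are numbered, and the row numbers are then filtered.

```python
import sqlite3

def page_of_emails(conn, skip, take):
    """Return `take` distinct emails after skipping the first `skip`."""
    sql = """
        SELECT email
        FROM (
            -- Number the distinct addresses in alphabetical order
            SELECT email, row_number() OVER (ORDER BY email) AS rn
            FROM (SELECT DISTINCT email FROM users) AS d
        ) AS numbered
        WHERE rn > ? AND rn <= ?
        ORDER BY rn
    """
    return [row[0] for row in conn.execute(sql, (skip, skip + take))]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (email TEXT)")
# Hypothetical data: 20 distinct addresses, each stored twice.
emails = [f"user{i:02d}@example.test" for i in range(1, 21)]
conn.executemany("INSERT INTO users VALUES (?)", [(e,) for e in emails * 2])

print(page_of_emails(conn, 5, 10))
```

The upper bound is skip + take, so skipping 5 and taking 10 means keeping row numbers 6 through 15.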