Why is the subquery not working in Snowflake?

I am trying to run this query in Snowflake, but I keep getting an error.
select x.*, (select z.status from TBLA z where z.number_id=x.number_id and z.country=x.country and z.datetime=x.datetime) status
from
(
select a.number_id, a.country, max(datetime) as datetime
from TBLA a
group by a.number_id, a.country
) x
This is the error I am getting:
SQL compilation error: Unsupported subquery type cannot be evaluated
Does anyone know how to fix this?

To get the status for the latest datetime per number_id/country, a window function can be used:
SELECT a.*,
(ARRAY_AGG(a.status) WITHIN GROUP(ORDER BY a.datetime DESC)
OVER(PARTITION BY a.number_id, a.country))[0] AS latest_status
FROM TBLA a;

It looks like you are trying to get the latest status by number_id and country. A simple way to do that in Snowflake is the row_number() window function with QUALIFY:
select * from TBLA
qualify row_number() over (partition by number_id, country order by datetime desc) = 1;
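The same "latest row per group" idea can be tried outside Snowflake. Below is a minimal sketch, assuming Python's built-in sqlite3 module with SQLite 3.25 or newer (needed for window functions). SQLite has no QUALIFY, so the ROW_NUMBER() filter sits in a derived table instead. The table layout and sample rows are made up for illustration.

```python
import sqlite3


def latest_status_rows(rows):
    """Return the newest row per (number_id, country) using ROW_NUMBER()."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE tbla (number_id INTEGER, country TEXT, datetime TEXT, status TEXT)"
    )
    con.executemany("INSERT INTO tbla VALUES (?, ?, ?, ?)", rows)
    result = con.execute(
        """
        SELECT number_id, country, datetime, status
        FROM (
            SELECT t.*,
                   ROW_NUMBER() OVER (
                       PARTITION BY number_id, country
                       ORDER BY datetime DESC
                   ) AS rn
            FROM tbla t
        )
        WHERE rn = 1          -- keep only the newest row of each partition
        ORDER BY number_id, country
        """
    ).fetchall()
    con.close()
    return result


# Hypothetical sample data: two rows for (1, 'US'), one for (2, 'DE').
sample = [
    (1, "US", "2023-01-01 10:00:00", "open"),
    (1, "US", "2023-01-02 09:00:00", "closed"),
    (2, "DE", "2023-01-01 08:00:00", "open"),
]
print(latest_status_rows(sample))
```

In Snowflake itself the QUALIFY form above is shorter; the derived table is only needed on engines without QUALIFY.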

Only scalar subqueries are allowed in the SELECT list. Your subquery is not inherently scalar: nothing guarantees that it returns at most one row for each outer row.
https://docs.snowflake.com/en/user-guide/querying-subqueries.html

Related

MAX TO_DATE OVER two IDs of an IF statement

I'm connecting to Amazon Redshift and running the following query:
Select b.*, c."releasedate",
DENSE_RANK() OVER(PARTITION BY b.originboardid ORDER BY TO_DATE(SUBSTRING(b.sprintenddate,0,9), 'DD/Mon/YY') DESC) AS "rank_sprint",
DENSE_RANK() OVER(PARTITION BY b.originboardid ORDER BY TO_DATE(c.releasedate, 'YYYY-MM-DD') DESC) AS "rank_release",
RANK() OVER (ORDER BY b.issueid, b.sprintid DESC) as "rank_issue",
MAX(IF (b.issueorigin='completed') AND (b.changeto='In Progress') and (b.changefield='status')
max(TO_DATE(SUBSTRING(b.changecreation,0,10),'YYYY-MM-DD')) OVER(b.issueid,b.sprintid)
) OVER (b.issueid,b.sprintid) as "lastinprogress"
from digitalplatforms.issues_braze b
Left join jira.releases c
On b.version_id=c.versionid
It outputs the following error:
[Amazon](500310) Invalid operation: syntax error at or near "max"
Position: 459;
Also if I query just:
Select b.*, c."releasedate",
DENSE_RANK() OVER(PARTITION BY b.originboardid ORDER BY TO_DATE(SUBSTRING(b.sprintenddate,0,9), 'DD/Mon/YY') DESC) AS "rank_sprint",
DENSE_RANK() OVER(PARTITION BY b.originboardid ORDER BY TO_DATE(c.releasedate, 'YYYY-MM-DD') DESC) AS "rank_release",
RANK() OVER (ORDER BY b.issueid, b.sprintid DESC) as "rank_issue"
from digitalplatforms.issues_braze b
Left join jira.releases c
On b.version_id=c.versionid
it works.
Can someone help?
Thank you

There is no "IF" statement in SQL. SQL is not procedural. You need to rewrite your query using "CASE" or "DECODE".
Also, you cannot nest window functions. If your logic requires this, they need to operate at different levels of the query (separate SELECT levels). However, are you sure both of these need to be window functions (MAX() OVER vs. MAX())? They use the same OVER clause, so I expect not.
Just guessing based on your query, but does this give you what you want?
MAX(CASE WHEN b.issueorigin='completed' AND b.changeto='In Progress' AND b.changefield='status'
THEN TO_DATE(SUBSTRING(b.changecreation,0,10),'YYYY-MM-DD')
END) OVER (PARTITION BY b.issueid, b.sprintid) as "lastinprogress"
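To see how a conditional expression inside a single windowed MAX() replaces the nested IF/MAX, here is a minimal sketch, assuming Python's built-in sqlite3 module with SQLite 3.25 or newer rather than Redshift. The table is a cut-down, hypothetical version of issues_braze, and dates are stored as ISO strings so no TO_DATE call is needed.

```python
import sqlite3


def last_in_progress(rows):
    """For every row, attach the latest 'In Progress' date of its issue/sprint."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE issues (issueid INTEGER, sprintid INTEGER, "
        "changeto TEXT, changecreation TEXT)"
    )
    con.executemany("INSERT INTO issues VALUES (?, ?, ?, ?)", rows)
    result = con.execute(
        """
        SELECT issueid, changeto, changecreation,
               -- CASE yields NULL for non-matching rows, and MAX ignores NULLs
               MAX(CASE WHEN changeto = 'In Progress' THEN changecreation END)
                   OVER (PARTITION BY issueid, sprintid) AS lastinprogress
        FROM issues
        ORDER BY issueid, changecreation
        """
    ).fetchall()
    con.close()
    return result


# Hypothetical sample data: issue 2 never entered 'In Progress'.
sample = [
    (1, 10, "In Progress", "2021-03-01"),
    (1, 10, "Done", "2021-03-05"),
    (1, 10, "In Progress", "2021-03-03"),
    (2, 10, "Done", "2021-03-04"),
]
for row in last_in_progress(sample):
    print(row)
```

Every row of issue 1 carries 2021-03-03, while issue 2 gets NULL because no row in its partition satisfies the condition.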

Filter out null values resulting from window function lag() in SQL query

Example query:
SELECT *,
lag(sum(sales), 1) OVER(PARTITION BY department
ORDER BY date ASC) AS end_date_sales
FROM revenue
GROUP BY department, date;
I want to show only the rows where end_date_sales is not NULL.
Is there a clause used specifically for these cases? WHERE or HAVING does not allow aggregate or window function cases.

One method uses a subquery:
SELECT r.*
FROM (SELECT department, date, SUM(sales) AS sales,
LAG(SUM(sales), 1) OVER (PARTITION BY department ORDER BY date ASC) AS end_date_sales
FROM revenue
GROUP BY department, date
) r
WHERE end_date_sales IS NOT NULL;
That said, I don't think the query is correct as you have written it. I would assume that you want something like this:
SELECT r.*
FROM (SELECT r.*,
LEAD(end_date, 1) OVER (PARTITION BY ? ORDER BY date ASC) AS next_end_date
FROM revenue r
) r
WHERE next_end_date IS NOT NULL;
Where ? is a column such as the customer id.
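The "compute the window function in an inner query, filter in the outer one" pattern can be sketched as follows. This assumes Python's built-in sqlite3 module with SQLite 3.25 or newer; the revenue rows are invented, and the aggregation is split into its own CTE to keep each step explicit.

```python
import sqlite3


def sales_with_previous(rows):
    """Daily sales per department, keeping only rows that have a previous day."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE revenue (department TEXT, date TEXT, sales INTEGER)")
    con.executemany("INSERT INTO revenue VALUES (?, ?, ?)", rows)
    result = con.execute(
        """
        WITH daily AS (            -- step 1: aggregate
            SELECT department, date, SUM(sales) AS sales
            FROM revenue
            GROUP BY department, date
        ),
        lagged AS (                -- step 2: window function over the aggregate
            SELECT department, date, sales,
                   LAG(sales, 1) OVER (
                       PARTITION BY department ORDER BY date
                   ) AS prev_sales
            FROM daily
        )
        SELECT department, date, sales, prev_sales
        FROM lagged
        WHERE prev_sales IS NOT NULL   -- step 3: filter on the window result
        ORDER BY department, date
        """
    ).fetchall()
    con.close()
    return result


# Hypothetical sample data.
sample = [
    ("toys", "2024-01-01", 10),
    ("toys", "2024-01-01", 5),
    ("toys", "2024-01-02", 7),
    ("books", "2024-01-01", 3),
    ("books", "2024-01-03", 4),
]
print(sales_with_previous(sample))
```

The first date of each department drops out because LAG has nothing to look back at there.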

Try this
select * from (select distinct *,SUM(sales) OVER (PARTITION BY dept) from test)t
where t.date in(select max(date) from test group by dept)
order by date,dept;
And a simpler way without a subquery:
SELECT distinct dept,MAX(date) OVER (PARTITION BY dept),
SUM(sales) OVER (PARTITION BY dept)
FROM test;

how can i write the sql query to get the result set, get the result set from the max time

Name Longititue latutute Time
tharindu 79.94148 6.9748404 00:15:47
shane 79.8630765 6.8910388 13:23:24
shane 79.862815 6.8909349 14:41:29
shane 79.8628665 6.8911084 09:39:33
shane 79.8626956 6.890992 11:00:07
shane 79.8628831 6.89099 11:43:00
I want to get the result set below:
shane 79.862815 6.8909349 14:41:29
tharindu 79.94148 6.9748404 00:15:47
How can I write a SQL query that returns, for each name, the row with the max time?

You can try to use ROW_NUMBER window function.
SELECT Name,Longititue,latutute,[Time]
FROM (
SELECT *,ROW_NUMBER() OVER(PARTITION BY Name ORDER BY [Time] DESC) rn
FROM T
)t1
WHERE rn = 1

You can also try using a correlated subquery:
select * from tablename a
where Time in (select max(Time) from tablename b where a.name=b.name)

You can use a correlated subquery:
select t.* from table_name t
where t.[Time]=( select max([Time]) from table_name t1 where t1.Name=t.Name)
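The correlated subquery can be checked against the sample data from the question. This is a minimal sketch, assuming Python's built-in sqlite3 module; the table name t is a placeholder, and the square brackets are dropped because SQLite does not need them for these column names.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    ("tharindu", 79.94148, 6.9748404, "00:15:47"),
    ("shane", 79.8630765, 6.8910388, "13:23:24"),
    ("shane", 79.862815, 6.8909349, "14:41:29"),
    ("shane", 79.8628665, 6.8911084, "09:39:33"),
    ("shane", 79.8626956, 6.890992, "11:00:07"),
    ("shane", 79.8628831, 6.89099, "11:43:00"),
]


def latest_per_name(rows):
    """Return the row with the greatest Time for each Name."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE t (Name TEXT, Longititue REAL, latutute REAL, Time TEXT)")
    con.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", rows)
    result = con.execute(
        """
        SELECT a.Name, a.Longititue, a.latutute, a.Time
        FROM t a
        WHERE a.Time = (SELECT MAX(b.Time) FROM t b WHERE b.Name = a.Name)
        ORDER BY a.Name
        """
    ).fetchall()
    con.close()
    return result


for row in latest_per_name(ROWS):
    print(row)
```

Comparing HH:MM:SS strings works here only because they are zero-padded, so text order matches time order.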

Here is a way to use ROW_NUMBER without a formal subquery:
SELECT TOP 1 WITH TIES
Name,
Longititue,
latutute,
Time
FROM yourTable
ORDER BY
ROW_NUMBER() OVER (PARTITION BY Name ORDER BY [Time] DESC);

You can achieve this using a CTE (Common Table Expression) and a ranking function.
SQL Query:
WITH CTE AS
(
SELECT Name,Longititue,latutute,Time,DENSE_RANK() OVER(PARTITION BY Name ORDER BY time desc) as RN
FROM MaxTime
)
SELECT * FROM CTE
WHERE RN = 1
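One difference between the ranking functions used in these answers shows up only when two rows tie on the max time: ROW_NUMBER() keeps exactly one row per name, while RANK() and DENSE_RANK() keep every tied row. A minimal sketch, assuming Python's built-in sqlite3 module with SQLite 3.25 or newer and a made-up tie for shane:

```python
import sqlite3

ALLOWED = {"ROW_NUMBER", "RANK", "DENSE_RANK"}


def top_rows(rank_function, rows):
    """Return the rows ranked first per Name under the given ranking function."""
    if rank_function not in ALLOWED:
        raise ValueError(f"unsupported ranking function: {rank_function}")
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE MaxTime (Name TEXT, Time TEXT)")
    con.executemany("INSERT INTO MaxTime VALUES (?, ?)", rows)
    result = con.execute(
        f"""
        WITH CTE AS (
            SELECT Name, Time,
                   {rank_function}() OVER (PARTITION BY Name ORDER BY Time DESC) AS RN
            FROM MaxTime
        )
        SELECT Name, Time FROM CTE WHERE RN = 1 ORDER BY Name
        """
    ).fetchall()
    con.close()
    return result


# Hypothetical data: shane has two rows sharing the latest time.
tied = [
    ("shane", "14:41:29"),
    ("shane", "14:41:29"),
    ("shane", "09:39:33"),
    ("tharindu", "00:15:47"),
]
print(len(top_rows("ROW_NUMBER", tied)))   # one row per name
print(len(top_rows("DENSE_RANK", tied)))   # tied rows are all kept
```

Which behaviour is wanted depends on whether duplicates of the latest time should all be returned.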

SQL - Get max date from dd/mm/yyyy formated column

I have a db table with a column named STATUSDATE. The column's type is varchar2 and it already holds data in dd/mm/yyyy format. I want to get the most recent date (max date). I used max() for this, but it does not give the correct result.
For example, consider the following dates:
31/08/2014
01/09/2016
After using max(STATUSDATE) the result is 31/08/2014. I'm using Oracle.
I'm trying to use the following query, but because of the problem above it gives incorrect results:
SELECT * FROM MY_DB.MY_TABLE t
inner join (
select CLIENTNAME, max(STATUSDATE) as MaxDate
FROM MY_DB.MY_TABLE
group by CLIENTNAME
) tm on t.CLIENTNAME = tm.CLIENTNAME and t.STATUSDATE = tm.MaxDate
Can anyone suggest the proper way to do this?
Thank You

Moral: Don't store dates as strings. Databases have built-in types for a reason.
So, convert to a proper date and take the max, but you don't need a JOIN for this:
select t.*
from (select t.*,
rank() over (partition by clientname
order by to_date(statusdate, 'DD/MM/YYYY') desc
) as seqnum
from my_db.my_table t
) t
where seqnum = 1;
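Why max() picks 31/08/2014 can be reproduced with the two dates from the question: as text, '31...' sorts after '01...', whatever the year. Below is a minimal sketch, assuming Python's built-in sqlite3 module. SQLite has no TO_DATE, so a substr() rearrangement into yyyy-mm-dd stands in for the Oracle conversion; the table is reduced to a single column.

```python
import sqlite3

# Rearranges a dd/mm/yyyy string into yyyy-mm-dd, so text order matches date order.
ISO_DATE = (
    "substr(statusdate, 7, 4) || '-' || "
    "substr(statusdate, 4, 2) || '-' || "
    "substr(statusdate, 1, 2)"
)


def max_status_dates(dates):
    """Return (max as plain string, max after converting to a sortable date)."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE my_table (statusdate TEXT)")
    con.executemany("INSERT INTO my_table VALUES (?)", [(d,) for d in dates])
    string_max = con.execute("SELECT MAX(statusdate) FROM my_table").fetchone()[0]
    date_max = con.execute(
        f"SELECT statusdate FROM my_table ORDER BY {ISO_DATE} DESC LIMIT 1"
    ).fetchone()[0]
    con.close()
    return string_max, date_max


print(max_status_dates(["31/08/2014", "01/09/2016"]))
```

The first value is the lexicographic maximum the question ran into; the second is the chronologically latest date.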

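There is no need for an inner join. You can simply rank the rows in a derived table and keep the first one per client:
select CLIENTNAME, STATUSDATE
from (
select CLIENTNAME, STATUSDATE,
dense_rank() over (partition by CLIENTNAME order by to_date(STATUSDATE, 'DD/MM/YYYY') desc) as rnk
FROM MY_DB.MY_TABLE
)
where rnk = 1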

Column is invalid error when using derived table

I'm using ROW_NUMBER() and a derived table to fetch data from the derived table result.
However, I get the error message telling me I don't have the appropriate columns in the GROUP BY clause.
Here's the error:
Column 'tblCompetition.objID' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause
What column am I missing? Or am I doing something else wrong? Below are the query that is not working and the (simpler) query that is working.
SQL Server 2008.
Query that isn't working:
SELECT
objID,
objTypeID,
userID,
datAdded,
count,
sno
FROM
(
SELECT scc.objID,scc.objTypeID,scc.userID,scc.datAdded,
COUNT(sci.favID) as count,
ROW_NUMBER() OVER(PARTITION BY scc.userID ORDER BY scc.unqID DESC) as sno
FROM tblCompetition scc
LEFT JOIN tblFavourites sci
ON sci.favID = scc.objID
AND sci.datTimeStamp BETWEEN #datStart AND #datEnd
) as t
WHERE sno <= 2 AND objTypeID = #objTypeID
AND datAdded BETWEEN #datStart AND #datEnd
GROUP BY objID,objTypeID,userID,datAdded,count,sno
Simple query that is working:
SELECT objId,objTypeID,userId,datAdded FROM
(
SELECT objId,objTypeID,userId,datAdded,
ROW_NUMBER() OVER(PARTITION BY userId ORDER BY unqid DESC) as sno
FROM tblRdbCompetition
) as t
WHERE sno<=2 AND objtypeid=#objTypeID
AND datAdded BETWEEN #datStart AND #datEnd
Thank you!

You need the GROUP BY in your subquery, since that's where the aggregate is:
SELECT
objID,
objTypeID,
userID,
datAdded,
count,
sno
FROM
(
SELECT scc.objID,scc.objTypeID,scc.userID,scc.datAdded,
COUNT(sci.favID) as count,
ROW_NUMBER() OVER(PARTITION BY scc.userID ORDER BY scc.unqID DESC) as sno
FROM tblCompetition scc
LEFT JOIN tblFavourites sci
ON sci.favID = scc.objID
AND sci.datTimeStamp BETWEEN #datStart AND #datEnd
-- unqID is grouped as well because ROW_NUMBER() orders by it
GROUP BY scc.objID,scc.objTypeID,scc.userID,scc.datAdded,scc.unqID) as t
WHERE sno <= 2 AND objTypeID = #objTypeID
AND datAdded BETWEEN #datStart AND #datEnd

You cannot have count in a GROUP BY clause. In fact, the count is derived from the other fields you group by. Remove count from your GROUP BY.

In the innermost query you are using
COUNT(sci.favID) as count,
which is an aggregate, and you select other non-aggregating columns along with it.
I believe you wanted an analytic COUNT instead:
SELECT objID,
objTypeID,
userID,
datAdded,
count,
sno
FROM (
SELECT scc.objID,scc.objTypeID,scc.userID,scc.datAdded,
COUNT(sci.favID) OVER (PARTITION BY scc.userID ) AS count,
ROW_NUMBER() OVER (PARTITION BY scc.userID ORDER BY scc.unqID DESC) as sno
FROM tblCompetition scc
LEFT JOIN
tblFavourites sci
ON sci.favID = scc.objID
AND sci.datTimeStamp BETWEEN #datStart AND #datEnd
) as t
WHERE sno = 1
AND objTypeID = #objTypeID
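The analytic COUNT(...) OVER form needs no GROUP BY because it keeps every joined row and only attaches the per-partition count to it. Here is a minimal sketch, assuming Python's built-in sqlite3 module with SQLite 3.25 or newer. The two tables are cut down to the columns that matter, the date filters are left out, and the rows are invented.

```python
import sqlite3


def newest_with_favourite_count(competitions, favourites):
    """Newest competition row per user plus the user's total favourite count."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE tblCompetition (unqID INTEGER, objID INTEGER, userID TEXT)")
    con.execute("CREATE TABLE tblFavourites (favID INTEGER)")
    con.executemany("INSERT INTO tblCompetition VALUES (?, ?, ?)", competitions)
    con.executemany("INSERT INTO tblFavourites VALUES (?)", [(f,) for f in favourites])
    result = con.execute(
        """
        SELECT objID, userID, cnt, sno
        FROM (
            SELECT scc.objID, scc.userID,
                   -- counts non-NULL favIDs across all of the user's joined rows
                   COUNT(sci.favID) OVER (PARTITION BY scc.userID) AS cnt,
                   ROW_NUMBER() OVER (
                       PARTITION BY scc.userID ORDER BY scc.unqID DESC
                   ) AS sno
            FROM tblCompetition scc
            LEFT JOIN tblFavourites sci ON sci.favID = scc.objID
        ) AS t
        WHERE sno = 1
        ORDER BY userID
        """
    ).fetchall()
    con.close()
    return result


# Hypothetical data: object 100 was favourited twice, 102 once, 101 never.
competitions = [(1, 100, "u1"), (2, 101, "u1"), (3, 102, "u2")]
favourites = [100, 100, 102]
print(newest_with_favourite_count(competitions, favourites))
```

Note that the count here is per user, not per competition row as in the GROUP BY version, so the two answers are not interchangeable.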
