Row_number() and group by together not working - sql

I wrote the query below and need to know what I am doing wrong. After adding ROW_NUMBER(), it always fails with this error:
ORA-00979: not a GROUP BY expression
00979. 00000 - "not a GROUP BY expression"
*Cause:
*Action:
Error at Line: 22 Column: 32
The SQL Developer tooltip suggested adding ROW_NUMBER() to the GROUP BY together with Is_Data_Picked. But as far as I know, ROW_NUMBER() is an analytic function that numbers each row, and it can't be used in a GROUP BY clause.
When I use ROW_NUMBER() inside GROUP BY, it shows the error below:
ORA-30484: missing window specification for this function
30484. 00000 - "missing window specification for this function"
*Cause: All window functions should be followed by window specification, like () OVER ()
*Action:
Error at Line: 26 Column: 26
I want to use both GROUP BY and ROW_NUMBER() in my query.
Please help me fix this issue and suggest a solution.
Query:
SELECT *
FROM
(SELECT
COUNT(DISTINCT Emp_Code) totalEmployees,
SUM(CASE WHEN pay_code = 999 THEN AMOUNT ELSE '0' END) net_salary,
SUM(CASE WHEN pay_code = 997 THEN AMOUNT ELSE '0' END) gross_earning,
SUM(CASE WHEN pay_code = 998 THEN AMOUNT ELSE '0' END) gross_deduction,
Is_Data_Picked,
ROW_NUMBER() OVER (ORDER BY (Emp_Code)) AS ROW_NUM
FROM
Xxmpcd_Salary_Detail_Table
WHERE
Prayas_Erp_Org_Id LIKE '302-%'
AND Yyyymm = '201805'
GROUP BY
Is_Data_Picked, ROW_NUMBER()) mytbl
WHERE
ROW_NUM < 600 AND ROW_NUM > 0

This is the relevant part of your subquery:
SELECT . . .
ROW_NUMBER() OVER (ORDER BY (Emp_Code)) AS ROW_NUM
FROM Xxmpcd_Salary_Detail_Table
WHERE Prayas_Erp_Org_Id LIKE '302-%' AND Yyyymm = '201805'
GROUP BY Is_Data_Picked, ROW_NUMBER()
You have an error in the first ROW_NUMBER() because Emp_Code is not in the GROUP BY. You have an error in the second because ROW_NUMBER() is an analytic function: it requires an OVER clause, and analytic functions are not allowed in GROUP BY in any case.
I could speculate that you intend:
SELECT . . .
ROW_NUMBER() OVER (ORDER BY Emp_Code) AS ROW_NUM
FROM Xxmpcd_Salary_Detail_Table
WHERE Prayas_Erp_Org_Id LIKE '302-%' AND Yyyymm = '201805'
GROUP BY Is_Data_Picked, Emp_Code
If you don't want to aggregate by Emp_Code, then you might intend:
SELECT . . .
ROW_NUMBER() OVER (ORDER BY MIN(Emp_Code)) AS ROW_NUM
FROM Xxmpcd_Salary_Detail_Table
WHERE Prayas_Erp_Org_Id LIKE '302-%' AND Yyyymm = '201805'
GROUP BY Is_Data_Picked
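As a minimal sketch of the second variant, the query below runs against a few made-up rows (the salary_detail CTE and its values are hypothetical stand-ins for Xxmpcd_Salary_Detail_Table, written for Oracle). It shows that ROW_NUMBER() is evaluated after GROUP BY, so its ORDER BY may reference only grouped columns or aggregates:

```sql
-- Hypothetical sample rows standing in for Xxmpcd_Salary_Detail_Table.
WITH salary_detail (emp_code, pay_code, amount, is_data_picked) AS (
  SELECT 101, 999, 5000, 'Y' FROM dual UNION ALL
  SELECT 101, 997, 6000, 'Y' FROM dual UNION ALL
  SELECT 102, 999, 4000, 'N' FROM dual UNION ALL
  SELECT 103, 999, 4500, 'Y' FROM dual
)
SELECT is_data_picked,
       COUNT(DISTINCT emp_code) AS total_employees,
       SUM(CASE WHEN pay_code = 999 THEN amount ELSE 0 END) AS net_salary,
       -- Evaluated after GROUP BY: the ORDER BY here can only use
       -- grouped columns or aggregates such as MIN(emp_code).
       ROW_NUMBER() OVER (ORDER BY MIN(emp_code)) AS row_num
FROM salary_detail
GROUP BY is_data_picked;
```

The result has one numbered row per Is_Data_Picked value. Note that ROW_NUMBER() does not appear in the GROUP BY at all.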

Related

Assistance with PERCENTILE_CONT function and GROUP By error

All,
I am having problems with the below query. I am trying to get stat data from our database for the last 3 years but I keep getting the error message:
***Column 'OC_VDATA.DATA1' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.***
I know it has something to do with the DATA1 column but I am not familiar enough using the PERCENTILE_CONT function to know what the solution is.
Anyone have any ideas?
WITH Q AS
(
SELECT stagingPLM.dbo.ITEM_CODES.ITEM_CODE,
AVG(OC_VDATA.DATA1) AS Mean,
STDEVP(OC_VDATA.DATA1) AS StandardDev,
PERCENTILE_CONT(0.5)
WITHIN GROUP (ORDER BY OC_VDATA.DATA1)
OVER (PARTITION BY stagingPLM.dbo.ITEM_CODES.ITEM_CODE) AS Median
FROM OC_VDATA INNER JOIN
OC_VDAT_AUX ON OC_VDATA.PARTNO = OC_VDAT_AUX.PARTNOAUX
AND OC_VDATA.DATETIME = OC_VDAT_AUX.DATETIMEAUX INNER JOIN
stagingPLM.dbo.ITEM_CODES ON LEFT(OC_VDATA.PARTNO, 12) = stagingPLM.dbo.ITEM_CODES.SPEC_NO
AND LEFT(OC_VDAT_AUX.PARTNOAUX, 12) = stagingPLM.dbo.ITEM_CODES.SPEC_NO
WHERE (OC_VDAT_AUX.UDL28 LIKE '%PLASTIC%')
AND (RIGHT(OC_VDATA.PARTNO, 6) = '036150')
AND (CAST(OC_VDAT_AUX.UDL40 AS DATETIME)
BETWEEN CONVERT(datetime, '2019-05-18 00:00:00', 102) AND CONVERT(datetime, '2022-05-18 00:00:00', 102))
GROUP BY stagingPLM.dbo.ITEM_CODES.ITEM_CODE
)
SELECT * FROM Q
The error comes from WITHIN GROUP (ORDER BY OC_VDATA.DATA1). In SQL Server, PERCENTILE_CONT is available only as a window function, not as an aggregate, so OC_VDATA.DATA1 is referenced outside any aggregate in a query that groups by ITEM_CODE.
It is better to calculate AVG, STDEVP and PERCENTILE_CONT all as window functions, instead of computing some through GROUP BY and the rest through a window function.
Reduced to the minimum columns needed to reproduce the issue, the query can be rewritten as below to get the desired output.
SELECT DISTINCT item_codes.item_code,
Avg(oc_vdata.data1)
over(
PARTITION BY item_codes.item_code) AS Mean,
Stdevp(oc_vdata.data1)
over(
PARTITION BY item_codes.item_code) AS StandardDev,
Percentile_cont(0.5)
within GROUP (ORDER BY oc_vdata.data1) over (
PARTITION BY item_codes.item_code) AS Median
FROM oc_vdata
inner join item_codes
ON Left(oc_vdata.partno, 12) = item_codes.spec_no
Minimum steps to reproduce the error:
SELECT item_codes.item_code,
Avg(oc_vdata.data1) AS Mean,
Stdevp(oc_vdata.data1) AS StandardDev
FROM oc_vdata
INNER JOIN item_codes
ON LEFT(oc_vdata.partno, 12) = item_codes.spec_no
GROUP BY item_codes.item_code
ORDER BY oc_vdata.data1 -- This will cause the error
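If you would rather keep GROUP BY for the mean and standard deviation, one possible alternative is to compute the median in a separate step and join it back. This is only a sketch for SQL Server: the readings CTE and its values are made up for illustration and stand in for the joined OC_VDATA / ITEM_CODES rows.

```sql
-- Hypothetical sample data: one row per measurement.
WITH readings AS (
    SELECT v.item_code, v.data1
    FROM (VALUES ('A', 1.0), ('A', 2.0), ('A', 6.0),
                 ('B', 10.0), ('B', 20.0)) AS v(item_code, data1)
),
medians AS (
    -- PERCENTILE_CONT is a window function here, so it returns the same
    -- median on every row of a partition; DISTINCT collapses the repeats.
    SELECT DISTINCT item_code,
           PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY data1)
               OVER (PARTITION BY item_code) AS Median
    FROM readings
)
SELECT r.item_code,
       AVG(r.data1)    AS Mean,
       STDEVP(r.data1) AS StandardDev,
       m.Median
FROM readings AS r
JOIN medians AS m
  ON m.item_code = r.item_code
-- Median is constant per item_code, so adding it to GROUP BY
-- does not change the grouping.
GROUP BY r.item_code, m.Median;
```

The all-window-function version above is shorter; this form may be useful when the grouped query already exists and only the median needs to be added.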

SQL calculation with previous row + current row

I want to make a calculation based on the Excel file. I managed to obtain the first two records with LAG (as shown in the second screenshot). I'm out of ideas on how to proceed and need help. I just need the Calculation column to use its own previous value, calculated automatically over all the dates. I also tried applying LAG to the calculation manually, but the result was only one more row of data instead of NULL. This is a headache.
LAG(Data ingested, 1) OVER ( ORDER BY DATE ASC ) AS LAG
You seem to want cumulative sums:
select t.*,
(sum(reconciliation + aves - microa) over (order by date) -
first_value(aves - microa) over (order by date)
) as calculation
from CalcTable t;
EDIT:
Based on your comment, you just need to define a group:
select t.*,
(sum(reconciliation + aves - microa) over (partition by grp order by date) -
first_value(aves - microa) over (partition by grp order by date)
) as calculation
from (select t.*,
count(nullif(reconciliation, 0)) over (order by date) as grp
from CalcTable t
) t
order by date;
In my opinion this could be solved using a "gaps and islands" approach. When Reconciliation > 0, mark a gap. SUM(gap) OVER converts the gaps into island groupings. In the outer query the 'sum_over' column (which corresponds to 'Calculation') is a cumulative sum partitioned by the island groupings.
with
gap_cte as (
select *, case when [Reconciliation]>0 then 1 else 0 end gap
from CalcTable),
grp_cte as (
select *, sum(gap) over (order by [Date]) grp
from gap_cte)
select *, sum([Reconciliation]+
(case when gap=1 then 0 else Aves end)-
(case when gap=1 then 0 else Microa end))
over (partition by grp order by [Date]) sum_over
from grp_cte;
[EDIT]
The CASE expression could be moved into a CROSS APPLY instead:
with
grp_cte as (
select c.*, v.gap, sum(v.gap) over (order by [Date]) grp
from #CalcTable c
cross apply (values (case when [Reconciliation]>0 then 1 else 0 end)) v(gap))
select *, sum([Reconciliation]+
(case when gap=1 then 0 else Aves end)-
(case when gap=1 then 0 else Microa end))
over (partition by grp order by [Date]) sum_over
from grp_cte;
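To see how the grouping behaves, here is a small self-contained sketch for SQL Server. The rows in the CalcTable CTE are invented for illustration (the original data was only available as a screenshot), and the query follows the COUNT(NULLIF(...)) grouping from the first answer:

```sql
-- Hypothetical sample rows; a non-zero Reconciliation starts a new island.
WITH CalcTable AS (
    SELECT v.[Date], v.Reconciliation, v.Aves, v.Microa
    FROM (VALUES
        (CAST('2021-01-01' AS date), 100, 10, 4),
        ('2021-01-02',   0,  5, 2),
        ('2021-01-03',   0,  7, 1),
        ('2021-01-04',  50,  3, 3),
        ('2021-01-05',   0,  6, 2)
    ) AS v([Date], Reconciliation, Aves, Microa)
),
grp_cte AS (
    -- NULLIF turns 0 into NULL, so the running COUNT only increases
    -- on rows where Reconciliation is non-zero.
    SELECT *,
           COUNT(NULLIF(Reconciliation, 0)) OVER (ORDER BY [Date]) AS grp
    FROM CalcTable
)
SELECT [Date], Reconciliation, Aves, Microa, grp,
       SUM(Reconciliation + Aves - Microa)
           OVER (PARTITION BY grp ORDER BY [Date])
     - FIRST_VALUE(Aves - Microa)
           OVER (PARTITION BY grp ORDER BY [Date]) AS calculation
FROM grp_cte
ORDER BY [Date];
-- With these rows, grp is 1, 1, 1, 2, 2 and
-- calculation is 100, 103, 109, 50, 54.
```

Each island restarts at its Reconciliation value and then accumulates Aves - Microa for the following dates.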

SQL Server: create a SortID column to sort the CASE categories I just created

I'd like to create a SortID column to sort the results by case category. Each case category represents a revenue range. I grouped the results by the CASE expression to get the number of orders in each category. Now I am stuck on creating the SortID. My current query is below. Please suggest where I can put the SortID expression in my query. Many thanks in advance!
select SalesAmountCategory, count(*) as Orders
from
(Select case
when ((SalesAmount-TaxAmt-Freight)>=100000) then '>$100000'
when ((SalesAmount-TaxAmt-Freight)>=50000) then '$50000-$100000'
when ((SalesAmount-TaxAmt-Freight)>=10000) then '$10000-$50000'
when ((SalesAmount-TaxAmt-Freight)>=5000) then '$5000-$10000'
when ((SalesAmount-TaxAmt-Freight)>=2500) then '$2500-$5000'
when ((SalesAmount-TaxAmt-Freight)>=1000) then '$1000-$2500'
when ((SalesAmount-TaxAmt-Freight)>=500) then '$500-$1000'
when ((SalesAmount-TaxAmt-Freight)>=100) then '$100-$500'
when ((SalesAmount-TaxAmt-Freight)<100) then '$0-$100'
end as SalesAmountCategory
From dbo.FactResellerSales
where OrderDate BETWEEN '2010-01-01 00:00:00.000' AND '2010-12-31 23:59:59.999'
) as t
group by SalesAmountCategory
order by SalesAmountCategory;
I would use APPLY instead of a subquery, and you can use ROW_NUMBER() to create the SortID:
select row_number() over (order by min(SalesAmount - TaxAmt - Freight)) as sortid,
SalesAmountCategory, count(*) as Orders
from dbo.FactResellerSales frs cross apply
( values (case when (SalesAmount - TaxAmt - Freight) >= 100000 then '>$100000'
when (SalesAmount - TaxAmt - Freight) >= 50000 then '$50000-$100000'
. . .
when (SalesAmount - TaxAmt - Freight) < 100 then '$0-$100'
end)
) frss(SalesAmountCategory)
where OrderDate BETWEEN '2010-01-01 00:00:00.000' AND '2010-12-31 23:59:59.999'
group by SalesAmountCategory;
Instead of ordering by the category, order by the minimum amount in each category:
select row_number() over (order by min(SalesAmount-TaxAmt-Freight)) as sortid,
. . .
order by min(SalesAmount-TaxAmt-Freight)
Of course, you will need to select these columns as well in the subquery (or do the calculation in the subquery).
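A compact sketch of this idea for SQL Server follows. The sales CTE, its amounts, and the three simplified ranges are hypothetical; net_amount stands in for SalesAmount - TaxAmt - Freight:

```sql
-- Hypothetical net amounts, already reduced to a single column.
WITH sales AS (
    SELECT v.net_amount
    FROM (VALUES (50), (80), (250), (700), (120000)) AS v(net_amount)
),
categorized AS (
    SELECT net_amount,
           CASE WHEN net_amount >= 1000 THEN '>$1000'
                WHEN net_amount >= 100  THEN '$100-$1000'
                ELSE '$0-$100'
           END AS SalesAmountCategory
    FROM sales
)
SELECT ROW_NUMBER() OVER (ORDER BY MIN(net_amount)) AS SortID,
       SalesAmountCategory,
       COUNT(*) AS Orders
FROM categorized
GROUP BY SalesAmountCategory
-- Sorting by SortID gives numeric range order rather than the
-- alphabetical order of the category labels.
ORDER BY SortID;
```

Because the ranges do not overlap, the minimum amount in each category sorts the categories in the same order as the ranges themselves.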

SQL LAG IN CASE STATEMENT

I would appreciate any pointers on what is wrong with my CASE expression: if the current CLUSTERn equals the previous CLUSTERn, then add the previous PRODCAT to the current line as PREVCAT...
ORA-30484: missing window specification for this function
30484. 00000 - "missing window specification for this function"
*Cause: All window functions should be followed by window specification,
like <function>(<argument list>) OVER (<window specification>)
*Action:
Error at Line: 11 Column: 30
SELECT CLUSTERn,
MEMBERn,
COUNT(*) OVER ( PARTITION BY CLUSTERn ORDER BY MEMBERn, PRODCAT, STARTd, ENDd ) AS NEWRANK,
CASE WHEN CLUSTERn = LAG(CLUSTERn) THEN LAG(PRODCAT) ELSE 'New' END AS PREVCAT,
STATUS,
PRODCAT,
JOINTYPE,
JOINRANK,
CSP,
PROGID,
PROMNAME,
PROMOID,
COHORT,
FWEEK,
STARTd,
ENDd,
SOURCE
FROM(
I'm not sure what the confusion is. You have:
(CASE WHEN CLUSTERn = LAG(CLUSTERn)
THEN LAG(PRODCAT)
ELSE 'New'
END) AS PREVCAT,
You are missing the OVER clause -- pretty fundamental for all window functions.
Without sample data it is pretty hard to figure out what you really want. Perhaps:
(CASE WHEN CLUSTERn = LAG(CLUSTERn) OVER (ORDER BY MEMBERn, PRODCAT, STARTd, ENDd)
THEN LAG(PRODCAT) OVER (ORDER BY MEMBERn, PRODCAT, STARTd, ENDd)
ELSE 'New'
END) AS PREVCAT,
It is also possible that no CASE is required. LAG() has a three-argument form that lets you specify a default value, and partitioning by CLUSTERn restricts the lookup to rows in the same cluster:
LAG(PRODCAT, 1, 'New') OVER (PARTITION BY CLUSTERn ORDER BY STARTd, ENDd)
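Here is a minimal, self-contained sketch of the three-argument form in Oracle. The CTE t and its rows are invented for illustration; only the column names come from the question:

```sql
-- Hypothetical sample rows: two products in cluster 1, one in cluster 2.
WITH t (clustern, prodcat, startd) AS (
  SELECT 1, 'Mobile',    DATE '2020-01-01' FROM dual UNION ALL
  SELECT 1, 'Broadband', DATE '2020-02-01' FROM dual UNION ALL
  SELECT 2, 'TV',        DATE '2020-01-15' FROM dual
)
SELECT clustern,
       prodcat,
       startd,
       -- PARTITION BY keeps the lookup inside one cluster; the third
       -- argument is returned when a cluster has no previous row.
       LAG(prodcat, 1, 'New')
           OVER (PARTITION BY clustern ORDER BY startd) AS prevcat
FROM t
ORDER BY clustern, startd;
```

The first row of each cluster gets 'New', and the second row of cluster 1 gets the category of the row before it, which is what the CASE expression was trying to express.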

SQL Query to get percentages of two selects?

I currently use two separate queries to get the list of total runs and the list of errors, and then use Excel to divide these numbers to get percentages.
The problem is that I use a subselect to get the errors, because I group the first select and therefore cannot use the conditions in it.
My query to get all runs is:
Select
Count(*) as All, year([US-Date]) as year, month([US-Date]) as month, day([US-Date]) as day
FROM
(Select
ROW_NUMBER() OVER (PARTITION BY Int_No ORDER BY Time desc) AS RowNumber, [US-Date]
FROM
dbo.Mydatabase
Where
[US-Date] between '2017-10-01' and '2018-03-01') AS a
WHERE
a.RowNumber = 1
GROUP BY
year([US-Date]), month([US-Date]), day([US-Date])
ORDER BY
year([US-Date]), month([US-Date]), day([US-Date])
which gives me a list of all test runs for each day.
Then I use this query to get the errors:
Select
Count(*) as fejlende, year([US-Date]) as år,
month([US-Date]) as måned, day([US-Date]) as dag
From
(Select
ROW_NUMBER() OVER (PARTITION BY Int_No ORDER BY Time desc) AS RowNumber, [US-Date]
From
dbo.Mydatabase
Where
[US-Date] between '2017-10-01' and '2018-03-01'
and ErrorCode in
(Select
ErrorCode from dbo.Mydatabase
Where
(ErrorCode like '2374' or ErrorCode like '2373' or ErrorCode like '2061'))) AS a
WHERE
a.RowNumber = 1
GROUP BY
year([US-Date]), month([US-Date]), day([US-Date])
ORDER BY
year([US-Date]), month([US-Date]), day([US-Date])
So my question is: can I write one query that produces both lists and divides them, so I don't have to put them into Excel?
You can use a CASE expression for this (I simplified the ErrorCode check). Note that the subquery must also return ErrorCode, and that ALL is a reserved word, so the alias needs brackets:
Select COUNT(*) as [All]
, COUNT(CASE WHEN ErrorCode IN ('2374', '2373', '2061') THEN 1 END) AS fejlende
, YEAR([US-Date]) as year
, MONTH([US-Date]) as month
, DAY([US-Date]) as day
from (
Select ROW_NUMBER() OVER (PARTITION BY Int_No ORDER BY Time desc) AS RowNumber, [US-Date], ErrorCode
From dbo.Mydatabase
Where [US-Date] between '2017-10-01' and '2018-03-01') AS a
where a.RowNumber = 1
GROUP BY year([US-Date]), month([US-Date]), day([US-Date])
ORDER BY year([US-Date]), month([US-Date]), day([US-Date])
Something like this?
SELECT
Count(*) as [Total],
SUM(CASE WHEN (ErrorCode like '2374' or ErrorCode like '2373' or ErrorCode like '2061') THEN 1 ELSE 0 END) AS Errors,
year([US-Date]) as [Year],
month([US-Date]) as [Month],
day([US-Date]) as [Day]
FROM (SELECT ROW_NUMBER() OVER (PARTITION BY Int_No ORDER BY Time desc) AS RowNumber, [US-Date], ErrorCode
FROM dbo.Mydatabase
WHERE [US-Date] between '2017-10-01' and '2018-03-01') AS a
WHERE a.RowNumber = 1
GROUP BY year([US-Date]), month([US-Date]), day([US-Date])
ORDER BY year([US-Date]), month([US-Date]), day([US-Date])
A window function such as ROW_NUMBER() cannot appear directly in a WHERE clause, so the row filter stays in the derived table from your original query. I'm not really sure what your ROW_NUMBER is used for, but hopefully you get the idea and can adapt the SUM(CASE WHEN ...) method to your needs.
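Neither query above performs the division that was asked for. As a rough sketch for SQL Server, the percentage can be computed in the same SELECT. The runs CTE and its rows are made up for illustration (Time is simplified to an integer), and the query assumes a NULL ErrorCode means the run had no error:

```sql
-- Hypothetical sample rows standing in for dbo.Mydatabase.
WITH runs AS (
    SELECT v.Int_No, v.[Time], v.[US-Date], v.ErrorCode
    FROM (VALUES
        (1, 1, CAST('2017-10-01' AS date), '2374'),
        (1, 2, '2017-10-01', NULL),    -- latest row for Int_No 1: no error
        (2, 1, '2017-10-01', '2061'),
        (3, 1, '2017-10-02', NULL)
    ) AS v(Int_No, [Time], [US-Date], ErrorCode)
),
latest AS (
    -- Keep only the most recent row per Int_No, as in the question.
    SELECT [US-Date], ErrorCode,
           ROW_NUMBER() OVER (PARTITION BY Int_No ORDER BY [Time] DESC) AS RowNumber
    FROM runs
)
SELECT [US-Date],
       COUNT(*) AS Total,
       COUNT(CASE WHEN ErrorCode IN ('2374', '2373', '2061') THEN 1 END) AS Errors,
       -- Multiplying by 100.0 first avoids integer division.
       100.0 * COUNT(CASE WHEN ErrorCode IN ('2374', '2373', '2061') THEN 1 END)
             / COUNT(*) AS ErrorPercent
FROM latest
WHERE RowNumber = 1
GROUP BY [US-Date]
ORDER BY [US-Date];
```

COUNT(*) is at least 1 for every group that exists, so the division cannot hit a zero denominator here. With these rows, 2017-10-01 has 2 runs with 1 error and 2017-10-02 has 1 run with no errors.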