Divide derived rank, count columns from subquery to find percentile

Divide derived rank, count columns from subquery to find percentile - sql-server-2005

I'm having trouble dividing two columns from a subquery. The only answer that's returned is 0. I've tried multiplying the two columns just to see if it works and it does. I cannot figure out what the problem is.
SELECT cert, repdte, NAMEFULL, Rnk, Cnt, (Cnt - Rnk) / Cnt as 'Perc'
FROM
(
SELECT STRU.cert, STRU.repdte, STRU.NAMEFULL,
CASE
WHEN ISNULL(BAL.DEPI5,0) = 0 THEN NULL
ELSE (ISNULL(INC.EINTEXPA,0) / ISNULL(BAL.DEPI5,0))*100
END AS 'CoF', RANK() OVER (Partition by STRU.repdte ORDER BY
CASE
WHEN ISNULL(BAL.DEPI5,0) = 0 THEN NULL
ELSE (ISNULL(INC.EINTEXPA,0) / ISNULL(BAL.DEPI5,0))*100
END DESC) AS 'Rnk', COUNT(*) OVER (PARTITION BY STRU.repdte) as 'Cnt'
FROM MODEL_RIS_RMS_FDIC.dbo.STRU as STRU
JOIN MODEL_RIS_FDIC.dbo.CDI_RC_BAL as BAL
ON STRU.cert = BAL.cert AND STRU.callYMD = BAL.callYMD
JOIN MODEL_RIS_FDIC.dbo.CDI_RI_INC as INC
ON STRU.cert = INC.cert and STRU.callYMD = INC.callYMD
WHERE
CASE
WHEN ISNULL(BAL.DEPI5,0) = 0 THEN NULL
ELSE (ISNULL(INC.EINTEXPA,0) / ISNULL(BAL.DEPI5,0))*100
END IS NOT NULL AND STRU.callYMD >= '2008-03-31'
) A
WHERE Perc < .11

I think the datatypes of Cnt and Rnk are INT, so SQL Server performs integer division and truncates the fractional part. Because (Cnt - Rnk) is always smaller than Cnt, the result is always 0. Try casting both operands to FLOAT before dividing:
CAST(Cnt - Rnk AS FLOAT) / CAST(Cnt AS FLOAT)
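A minimal sketch of the difference, assuming T-SQL and a few made-up Rnk/Cnt pairs (the derived table S and its values are hypothetical, not the asker's data). It also wraps the calculation in one more derived table, because a column alias such as Perc cannot be referenced in the WHERE clause of the same SELECT that defines it:

```sql
-- Integer operands: the quotient is truncated.
SELECT (10 - 3) / 10 AS IntDivision;                        -- 0

-- FLOAT operands: the fraction is kept.
SELECT CAST(10 - 3 AS FLOAT) / CAST(10 AS FLOAT) AS Perc;   -- 0.7

-- Filter on the computed alias from an outer query.
SELECT Rnk, Cnt, Perc
FROM (
    SELECT Rnk, Cnt,
           CAST(Cnt - Rnk AS FLOAT) / CAST(Cnt AS FLOAT) AS Perc
    FROM (
        SELECT 1 AS Rnk, 10 AS Cnt   -- hypothetical sample rows
        UNION ALL
        SELECT 9, 10
        UNION ALL
        SELECT 10, 10
    ) AS S
) AS A
WHERE Perc < 0.11;   -- keeps the rows with Rnk = 9 and Rnk = 10
```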

Related

Why error: 01428. 00000 - "argument '%s' is out of range" SQl Developer

I have the following SQL script,
Select * From
(Select To_Char(Bmret.Pricedate, 'dd-mm-yyyy') As Pricedate, Bmret.Bmval, Bmret.id
, Cast(Exp(Sum(Ln(Cast(Bmret.Bmval As number))) Over (Partition By bmret.id)) As Number) As Twr
, RANK() OVER (PARTITION BY bmret.id ORDER BY bmret.pricedate asc) AS rank
From Tab_A Bmret
Where 1=1
) B
Where 1=1
And B.Rank=1
;
, which provides me with the desired result of a column, twr, that contains the product of the elements in column Bmval across pricedates, grouped by id.
However, I obtain the following error: 01428. 00000 - "argument '%s' is out of range".
I am aware that the error stems from the part Cast(Exp(Sum(Ln(Cast(Bmret.Bmval As number))) Over (Partition By bmret.id)) As Number) of the code and in particular that the "parameter passed into the function was not a valid value". Hence, my question is, is there any way to identify the id with values that are not valid?
I am not allowed to share the sample data. I am sorry.
Thank you in advance.
Best regards,

Please check the value of Cast(Bmret.Bmval As number). It must be greater than 0.
For further reading:
https://www.techonthenet.com/oracle/functions/ln.php
Oracle / PLSQL: LN Function
This Oracle tutorial explains how to use the Oracle/PLSQL LN function with syntax and examples.
Description: The Oracle/PLSQL LN function returns the natural logarithm of a number.
Syntax: LN( number )
Parameters or Arguments: number is the numeric value used to calculate the natural logarithm. It must be greater than 0.
You need to decide what Ln(Cast(Bmret.Bmval As number)) should be when Bmret.Bmval <= 0. If you define it as 0 (which might not be correct for the calculation), then your query would be:
Select * From
(Select To_Char(Bmret.Pricedate, 'dd-mm-yyyy') As Pricedate, Bmret.Bmval, Bmret.id
, Cast(Exp(Sum(case when Cast(Bmret.Bmval As number)>0 then Ln(Cast(Bmret.Bmval As number)) else 0 end) Over (Partition By bmret.id)) As Number) As Twr
, RANK() OVER (PARTITION BY bmret.id ORDER BY bmret.pricedate asc) AS rank
From Tab_A Bmret
Where 1=1
) B
Where 1=1
And B.Rank=1;

As @Kazi said, and as earlier answers had already mentioned, the issue is with using ln() with a negative number or zero. The documentation says:
LN returns the natural logarithm of n, where n is greater than 0.
so you can identify the IDs with out-of-range values with:
select id from tab_a where bmval <= 0
As you want the product of several numbers, you probably still want to include those values; but then having a zero amongst them should make the result zero, one negative number should make the result negative, two should make it positive, etc.
You can use the absolute value of your numbers for the calculation, and at the same time count how many negative values there are - then if that count of negatives is an odd number, multiply the whole result by -1.
Adapting the answer to your previous question, and changing the table and column names to match this question, that would be:
select to_char(a1.pricedate, 'dd-mm-yyyy') as pricedate, b1.bm, a1.bmval,
round(cast(exp(sum(ln(cast(abs(a1.bmval) as binary_double))) over (partition by b1.bmik)) as number))
*
case
when mod(count(case when a1.bmval < 0 then pricedate end) over (partition by b1.bmik), 2) = 0
then 1
else -1
end as product
from tab_a a1
inner join benchmarkdefs b1 on (a1.id = b1.bmik);
db<>fiddle with a group that has two negatives (which cancel out), one negative (which is applied), and one with a zero - where the product ends up as zero, as you'd hopefully expect.
The point of the cast() calls was to improve performance, as noted in the old question I linked to, by performing the exp/ln part as binary_double; there is no point casting a number to number. If you don't want the binary_double part then you can take the casts out completely; but then you do also have to deal with zeros as well as negative values, e.g. keeping track of whether you have any of those too:
select to_char(a1.pricedate, 'dd-mm-yyyy') as pricedate, b1.bm, a1.bmval,
round(exp(sum(ln(abs(nullif(a1.bmval, 0)))) over (partition by b1.bmik)))
*
case when min(abs(a1.bmval)) over (partition by b1.bmik) = 0 then 0 else 1 end
*
case
when mod(count(case when a1.bmval < 0 then pricedate end) over (partition by b1.bmik), 2) = 0
then 1
else -1
end as product
from tab_a a1
inner join benchmarkdefs b1 on (a1.id = b1.bmik);
db<>fiddle
For this query, which just gets values for the first date and product across all dates, that would translate (with casting) to:
select * from
(
select to_char(bmret.pricedate, 'dd-mm-yyyy') as pricedate, bmret.bmval, bmret.id
, round(exp(sum(ln(abs(nullif(bmret.bmval, 0)))) over (partition by bmret.id)))
*
case when min(abs(bmret.bmval)) over (partition by bmret.id) = 0 then 0 else 1 end
*
case
when mod(count(case when bmret.bmval < 0 then pricedate end) over (partition by bmret.id), 2) = 0
then 1
else -1
end as twr
, rank() over (partition by bmret.id order by bmret.pricedate asc) as rank
from tab_a bmret
) b
where b.rank=1
PRICEDATE   BMVAL  ID  TWR       RANK
11-08-2021  1      1   120       1
11-08-2021  12     2   524160    1
11-08-2021  22     3   -7893600  1
11-08-2021  1      4   0         1
db<>fiddle
As you were told in an old answer, if you don't want to see the (not very interesting) rank column then change select * from to select pricedate, bmval, id, twr from in the outer query.
You could also use aggregation with keep to avoid needing an inline view:
select to_char(min(pricedate), 'dd-mm-yyyy') as pricedate
, min(bmret.bmval) keep (dense_rank first order by pricedate) as bmval
, min(bmret.id) keep (dense_rank first order by pricedate) as id
, round(exp(sum(ln(abs(nullif(bmret.bmval, 0))))))
*
case when min(abs(bmret.bmval)) = 0 then 0 else 1 end
*
case
when mod(count(case when bmret.bmval < 0 then pricedate end), 2) = 0
then 1
else -1
end as twr
from tab_a bmret
group by bmret.id
PRICEDATE   BMVAL  ID  TWR
11-08-2021  1      1   120
11-08-2021  12     2   524160
11-08-2021  22     3   -7893600
11-08-2021  1      4   0
db<>fiddle
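For experimenting with the sign and zero handling without access to the real table, here is a small self-contained sketch in Oracle SQL. The inline vals data is made up for illustration; only the exp/sum/ln, nullif and parity pieces come from the queries above. As in those queries, round() is only appropriate if the values are whole numbers.

```sql
with vals as (
    -- hypothetical data: id 1 has one negative value, id 2 contains a zero
    select 1 as id, 2 as bmval from dual
    union all select 1, -3 from dual
    union all select 1, 4 from dual
    union all select 2, 5 from dual
    union all select 2, 0 from dual
)
select id,
       -- product of the absolute values; zeros become null and are skipped by sum()
       round(exp(sum(ln(abs(nullif(bmval, 0))))))
       -- force the result to zero if any value in the group is zero
       * case when min(abs(bmval)) = 0 then 0 else 1 end
       -- flip the sign if the group has an odd number of negative values
       * case when mod(count(case when bmval < 0 then 1 end), 2) = 0 then 1 else -1 end
       as product
from vals
group by id
order by id;
```

With this data the query should return -24 for id 1 (2 * -3 * 4) and 0 for id 2.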

Aggregating values in SQL

I'm trying to aggregate a customer's bankruptcy status, which can be yes (Y), no (N) or no data (N/D), using a window function in the subquery below. In edge cases where one record classifies the customer as not bankrupt (N) and another record on the same CDate classifies it as no data (N/D), the final aggregated value should be N/D. Instead I get N, because I order the customer's records by IsBankrupt ascending (asc) within each partition. The logic that is supposed to be implemented:
Y and Y = Y;
Y and N = Y;
N and N = N;
Y and N/D = Y;
N and N/D = N/D
with sample as (
select date('2020-12-31') as CDate, 123 as CustomerID, 'N/D' as IsBankrupt
union all
select date('2020-12-31') as CDate, 123 as CustomerID, 'N' as IsBankrupt)
select CDate, CustomerID, IsBankrupt, case when CustomerID = 123 then 'N/D' end as ExpectedResult
from
(
select CDate, CustomerID, IsBankrupt,
row_number() over (partition by CustomerID, CDate order by IsBankrupt asc) as flag
from sample
) subsample
where flag = 1
output:
CDate       CustomerID  IsBankrupt  ExpectedResult
2020-12-31  123         N           N/D
All the other cases of the logic above work. So the question is: how could I change the row_number() over (partition by ...) clause so that the logic doesn't break down?

I would suggest aggregation:
select cdate, customerid,
(case when sum(case when IsBankrupt = 'Y' then 1 else 0 end) > 0
then 'Y'
when sum(case when IsBankrupt = 'N/D' then 1 else 0 end) > 0
then 'N/D'
else 'N'
end) as new_flag
from t
group by cdate, customerid;
If you don't like the nested case expressions, you can actually do this based on the ordering of the values:
select cdate, customerid,
max(IsBankrupt) as new_flag
from t
group by cdate, customerid;
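The max() version works because, under ordinary string comparison, 'N' sorts before 'N/D', which sorts before 'Y'. That ordering depends on the collation in use. If you would rather keep the row_number() approach from the question, a sketch that spells the priority out explicitly could look like this. The sample rows and the alias ranked are made up, and the FROM-less selects assume a dialect that allows them:

```sql
with t as (
    -- hypothetical sample rows covering three of the cases
    select '2020-12-31' as cdate, 123 as customerid, 'N/D' as isbankrupt
    union all select '2020-12-31', 123, 'N'
    union all select '2020-12-31', 456, 'Y'
    union all select '2020-12-31', 456, 'N/D'
    union all select '2020-12-31', 789, 'N'
    union all select '2020-12-31', 789, 'N'
)
select cdate, customerid, isbankrupt as new_flag
from (
    select cdate, customerid, isbankrupt,
           row_number() over (
               partition by customerid, cdate
               -- explicit priority: Y beats N/D, N/D beats N
               order by case isbankrupt when 'Y' then 1 when 'N/D' then 2 else 3 end
           ) as flag
    from t
) ranked
where flag = 1
order by customerid;
```

With these rows, customer 123 should come out as N/D, 456 as Y and 789 as N.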

Oracle SQL -- Pick Max Value from RN Function but update all fields with that value

My query is as follows
SELECT HEADER_TABLE.SEGMENT1,
LINES_TABLE.LINE_NUM,
CASE
WHEN ( HEADER_TABLE.REVISION_NUM = '0'
AND HEADER_TABLE.PRINT_COUNT = '0')
THEN
'Unavailable'
ELSE
NVL (ACK_TABLE.ACK_TYPE, 'Absent')
END
AS X_ACK_TYPE,
ACK_TABLE.GXS_DATE
FROM HEADER_TABLE,
LINES_TABLE,
(SELECT po_number,
po_line_number,
gxs_date,
po_ack_filename,
ack_type
FROM (SELECT po_number,
po_line_number,
gxs_date,
po_ack_filename,
ack_type,
ROW_NUMBER ()
OVER (PARTITION BY po_number ORDER BY gxs_date DESC)
rn
FROM xxcmst_po_ack_from_gxs_stg)
WHERE rn = 1) ACK_TABLE,
(SELECT PO_NUMBER FROM XXCMST.XXCMST_ACTION_TABLE_ACKNOWLEDGEMENT) ACTION_TABLE
WHERE HEADER_TABLE.PO_HEADER_ID = LINES_TABLE.PO_HEADER_ID
AND HEADER_TABLE.SEGMENT1 = ACK_TABLE.PO_NUMBER(+)
AND HEADER_TABLE.SEGMENT1 = ACTION_TABLE.PO_NUMBER(+)
AND LINES_TABLE.LINE_NUM = ACK_TABLE.PO_LINE_NUMBER(+)
AND HEADER_TABLE.SEGMENT1 = '100';
This is giving me 6 records with 1 GXS_DATE and X_ACK_TYPE = 'Absent'. The RN function is needed here to pull 1 record only from the subquery but the requirement is to have all the 6 records have the same date and ACK_TYPE which is not happening. How can I achieve this? Please refer to the below screenshot and I need X_ACK_TYPE = AK for all the 6 LINE_NUMs and GXS_DATE = 3/6/2020 for all these 6 records.
My current data screenshot here

Instead of
ACK_TABLE.GXS_DATE
in SELECT clause use the LAG function as follows:
CASE WHEN ACK_TABLE.GXS_DATE IS NOT NULL
THEN ACK_TABLE.GXS_DATE
ELSE LAG(ACK_TABLE.GXS_DATE IGNORE NULLS)
OVER (PARTITION BY HEADER_TABLE.SEGMENT1 ORDER BY LINES_TABLE.LINE_NUM )
END AS GXS_DATE
Or, if there will always be exactly one ACK_TABLE.GXS_DATE value per HEADER_TABLE.SEGMENT1, you can simply write it as:
MIN(ACK_TABLE.GXS_DATE)
OVER (PARTITION BY HEADER_TABLE.SEGMENT1) AS GXS_DATE
-- Update --
For ACK_TYPE, you need to apply the same logic in the ELSE branch of the CASE expression from the original query, as follows:
Replace this:
ELSE
NVL (ACK_TABLE.ACK_TYPE, 'Absent')
END
With this:
ELSE
NVL (MIN(ACK_TABLE.ACK_TYPE)
OVER (PARTITION BY HEADER_TABLE.SEGMENT1), 'Absent')
END
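To see how the analytic MIN spreads the single non-null value across every line of a PO, here is a small sketch in Oracle SQL. The inline po_lines data is hypothetical and stands in for the outer-joined result, where only one line carries the acknowledgement:

```sql
with po_lines as (
    -- hypothetical joined result: only line 1 matched the acknowledgement row
    select '100' as po_number, 1 as line_num,
           date '2020-03-06' as gxs_date, 'AK' as ack_type
    from dual
    union all select '100', 2, null, null from dual
    union all select '100', 3, null, null from dual
)
select po_number,
       line_num,
       -- min() ignores nulls, so every row in the partition gets the one real value
       nvl(min(ack_type) over (partition by po_number), 'Absent') as x_ack_type,
       min(gxs_date) over (partition by po_number) as gxs_date
from po_lines
order by line_num;
```

All three lines should then show AK and the same date. If a PO could have several distinct acknowledgement values, MIN would pick the smallest one rather than the latest, so this relies on the one-value-per-PO assumption stated above.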

SQL Server 2008 equivalent for FETCH OFFSET with WHERE clause

I have a program in which my users can look up all the data traffic that happened in the last 7 days. I use a stored procedure to get that data, 250 records at a time (the user can page through it). The problem was that the users got a lot of timeouts when they wanted to see that data.
Here is the stored procedure before I tried to optimize it.
@MaxRecCount INT,
@PageOffset INT,
@IncludeData BIT
SELECT [Client], [Schema], [Version], [Records], [Fetched], [Receipted], [ProvidedAt], [FetchedAt], [ReceiptedAt],[PacketIds], [Record] FROM (
SELECT TOP(@MaxRecCount) MAX(bai_ExportPendingArchive.[UserName]) AS Client,
MAX(bai_ExportPendingArchive.Category) AS [Schema],
MAX(bai_ExportPendingArchive.ContractVersion) AS [Version],
COUNT(*) AS [Records],
SUM (CASE WHEN bai_ExportPendingAckArchive.ExportPendingId IS NULL THEN 0 ELSE 1 END) as [Fetched],
SUM (CASE WHEN bai_ExportPendingAckArchive.Receipted IS NULL THEN 0 ELSE 1 END) as [Receipted],
MAX(bai_ExportArchive.Inserted) AS [ProvidedAt],
MAX(CASE WHEN bai_ExportPendingAckArchive.ExportPendingId IS NULL THEN NULL ELSE bai_ExportPendingAckArchive.Inserted END) AS [FetchedAt],
MAX(CASE WHEN bai_ExportPendingAckArchive.Receipted IS NULL THEN NULL ELSE bai_ExportPendingAckArchive.Receipted END) AS [ReceiptedAt],
bai_ExportArchive.PacketIds AS [PacketIds],
NULL AS [Record],
ROW_NUMBER() Over (Order By MAX(bai_ExportArchive.Inserted) desc) as [RowNumber]
FROM bai_ExportArchive
INNER JOIN bai_ExportPendingArchive ON bai_ExportArchive.Id = bai_ExportPendingArchive.ExportId
LEFT OUTER JOIN bai_ExportPendingAckArchive ON bai_ExportPendingAckArchive.ExportPendingId = bai_ExportPendingArchive.Id
GROUP BY bai_ExportPendingArchive.[UserName], bai_ExportArchive.PacketIds, bai_ExportPendingArchive.Category
) AS InnerTable WHERE RowNumber > (@PageOffset * @MaxRecCount) and RowNumber <= (@PageOffset * @MaxRecCount + @MaxRecCount)
ORDER BY RowNumber
@MaxRecCount, @PageOffset and @IncludeData are parameters that come from my C# method.
This version needed about 1:35 min to get the data I wanted. To make the stored procedure faster, I inserted a WHERE clause that filters on the Inserted column (I also created an index on this column) and used OFFSET FETCH.
The stored procedure after the optimization:
@MaxRecCount INT,
@PageOffset INT,
@IncludeData BIT
Declare @pageStart int
Declare @pageEnd int
SET @pageStart = @PageOffset * @MaxRecCount
SET @pageEnd = @pageStart + @MaxRecCount + 50
IF @IncludeData = 0
BEGIN
SELECT [Client], [Schema], [Version], [Records], [Fetched], [Receipted], [ProvidedAt], [FetchedAt], [ReceiptedAt],[PacketIds], [Record] FROM (
SELECT TOP(@MaxRecCount) bai_ExportPendingArchive.[UserName] AS Client,
bai_ExportPendingArchive.Category AS [Schema],
MAX(bai_ExportPendingArchive.ContractVersion) AS [Version],
COUNT(*) AS [Records],
SUM (CASE WHEN bai_ExportPendingAckArchive.ExportPendingId IS NULL THEN 0 ELSE 1 END) as [Fetched],
SUM (CASE WHEN bai_ExportPendingAckArchive.Receipted IS NULL THEN 0 ELSE 1 END) as [Receipted],
MAX(bai_ExportArchive.Inserted) AS [ProvidedAt],
MAX(CASE WHEN bai_ExportPendingAckArchive.ExportPendingId IS NULL THEN NULL ELSE bai_ExportPendingAckArchive.Inserted END) AS [FetchedAt],
MAX(CASE WHEN bai_ExportPendingAckArchive.Receipted IS NULL THEN NULL ELSE bai_ExportPendingAckArchive.Receipted END) AS [ReceiptedAt],
bai_ExportArchive.PacketIds AS [PacketIds],
NULL AS [Record],
ROW_NUMBER() Over (Order By MAX(bai_ExportArchive.Inserted) desc) as [RowNumber]
FROM bai_ExportArchive
INNER JOIN bai_ExportPendingArchive ON bai_ExportArchive.Id = bai_ExportPendingArchive.ExportId
LEFT OUTER JOIN bai_ExportPendingAckArchive ON bai_ExportPendingAckArchive.ExportPendingId = bai_ExportPendingArchive.Id
Where bai_ExportArchive.Inserted <= (Select bai_ExportArchive.Inserted from bai_ExportArchive Order by bai_ExportArchive.Inserted DESC Offset @pageStart ROWS FETCH NEXT 1 ROWS Only)
And bai_ExportArchive.Inserted > (Select bai_ExportArchive.Inserted from bai_ExportArchive Order by bai_ExportArchive.Inserted DESC Offset @pageEnd ROWS FETCH NEXT 1 ROWS Only)
GROUP BY bai_ExportPendingArchive.[UserName], bai_ExportArchive.PacketIds, bai_ExportPendingArchive.Category
) AS InnerTable
ORDER BY RowNumber
This version gives me the data in about 2 s. The only problem is that I work on Microsoft SQL Server 2014, but my users use SQL Server 2008 or later. OFFSET FETCH doesn't work in SQL Server 2008, and now I'm clueless about how to optimize my stored procedure so that it is fast and also works on SQL Server 2008.
I'm thankful for any help :)

Try this method to handle the pagination in SQL Server 2005/2008.
First use a CTE for your select query with a ROW_NUMBER() column to identify the record number/count. After that you can select a range of records from this CTE using your PAGE_NUMBER and PAGE_COUNT. An example is below:
DECLARE @P_PAGE_NUM INT = 0
,@P_PAGE_SIZE INT = 20
;WITH CTE
AS
( /*SELECT ROW_NUMBER() OVER (ORDER BY COL_to_SORT DESC) AS [ROW_NO]
,...
WHERE ....
*/ -- You can replace your select query here, but column [ROW_NO] should be there in your select list.
--ie ROW_NUMBER() OVER (ORDER BY put_column-to-sort-here DESC) AS [ROW_NO]
)
SELECT *
--,( SELECT COUNT(*) FROM CTE) AS [TOTAL_ROW_COUNT]
FROM CTE
WHERE (
ISNULL(@P_PAGE_NUM,0) = 0 OR
[ROW_NO] BETWEEN ( @P_PAGE_NUM - 1) * @P_PAGE_SIZE + 1
AND @P_PAGE_NUM * @P_PAGE_SIZE
)
ORDER BY [ROW_NO]
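A filled-in sketch of the same pattern, assuming T-SQL and using the system catalog view sys.all_objects purely as a stand-in for your own query (the names @PageNum, @PageSize, Numbered and RowNo are hypothetical). It assigns the variables with SET, because DECLARE with an inline initializer is only available from SQL Server 2008 onward. Page numbers here start at 1, unlike the zero-based page offset in the question.

```sql
DECLARE @PageNum INT, @PageSize INT;
SET @PageNum = 2;    -- 1-based page number
SET @PageSize = 5;   -- rows per page

;WITH Numbered AS
(
    -- Replace this SELECT with your own query; keep the ROW_NUMBER() column.
    SELECT name,
           ROW_NUMBER() OVER (ORDER BY name) AS RowNo
    FROM sys.all_objects
)
SELECT name, RowNo
FROM Numbered
WHERE RowNo BETWEEN (@PageNum - 1) * @PageSize + 1
                AND @PageNum * @PageSize   -- rows 6 to 10 for page 2
ORDER BY RowNo;
```

Whether this is fast enough for the archive tables in the question depends on the query plan. The date filter on the indexed Inserted column from the optimized procedure may still be needed inside the CTE.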

SQL order by sum column, when tie need to set order by subquery

I have a SQL Server 2008 query that groups a calculated column "points". When the "points" tie I need to look to another field to determine the correct order.
SELECT
p.DriverID,
p.DriverName,
p.CarNum,
SUM(CASE WHEN r.RaceType = 10 THEN (200 - ((p.CarPosition - 1) * 2)) ELSE 0 END) AS Points
FROM
RaceParticipants AS p
INNER JOIN Race AS r ON p.RaceID = r.RaceID
GROUP BY
r.RaceDateID, p.DriverID, p.DriverName, p.CarNum
HAVING
(r.RaceDateID IN (255, 256))
ORDER BY
Points DESC
The column I would need to look to would be p.CarPosition WHERE r.RaceType = 60
so it would have to be some sort of sub query?

Something like:
SELECT DriverID, DriverName,CarNum,Points
FROM (SELECT
p.DriverID,
p.DriverName,
p.CarNum,
SUM(CASE WHEN r.RaceType = 10 THEN (200 - ((p.CarPosition - 1) * 2)) ELSE 0 END) AS Points,
MIN(CASE WHEN r.RaceType = 60 THEN p.CarPosition ELSE 999999 END) AS OrderField
FROM
RaceParticipants AS p
INNER JOIN Race AS r ON p.RaceID = r.RaceID
WHERE r.RaceDateID IN (255, 256)
GROUP BY
r.RaceDateID, p.DriverID, p.DriverName, p.CarNum
)sub
ORDER BY
Points DESC, OrderField
Depending on how you want the second order field to be handled, you can alter the ELSE. With no ELSE, a driver without a RaceType 60 row gets NULL, which SQL Server sorts before other values in ascending order.
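The tie-break can be tried out on made-up data. This is a sketch in T-SQL with hypothetical rows in the derived table R, where drivers 1 and 2 tie on points and the RaceType 60 position decides between them. It uses MIN with no ELSE, so rows from other race types contribute NULL and are ignored by the aggregate:

```sql
SELECT DriverID, Points, TieBreak
FROM (
    SELECT DriverID,
           SUM(CASE WHEN RaceType = 10 THEN 200 - (CarPosition - 1) * 2 ELSE 0 END) AS Points,
           -- position in the RaceType 60 race; NULL if the driver has none
           MIN(CASE WHEN RaceType = 60 THEN CarPosition END) AS TieBreak
    FROM (
        SELECT 1 AS DriverID, 10 AS RaceType, 2 AS CarPosition   -- hypothetical rows
        UNION ALL SELECT 1, 60, 5
        UNION ALL SELECT 2, 10, 2
        UNION ALL SELECT 2, 60, 3
        UNION ALL SELECT 3, 10, 1
        UNION ALL SELECT 3, 60, 9
    ) AS R
    GROUP BY DriverID
) AS sub
ORDER BY Points DESC, TieBreak;
```

Driver 3 should come first with 200 points, followed by driver 2 and then driver 1, who both have 198 points and are ordered by their RaceType 60 positions (3 and 5).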
