Retrieve the Median from a decimal column using PERCENTILE_CONT in SQL

I have a table Prices like:
ID PurchasePriceCalc
0146301 0.002875161
00006L00 0.00396
00087G03 NULL
00001G04 0.0020004
00006S 0.003689818
01580h01 NULL
00082EE00 0.002462687
00038R05 0.002237565
01666R01 0.002666667
I would like to get the median of PurchasePriceCalc and then subtract it from each PurchasePriceCalc value. In other words, the formula should be: (PurchasePriceCalc - Median(PurchasePriceCalc)).
I'm using the query below, but it is not working:
SELECT ID,PurchasePriceCalc, PurchasePriceCalc - PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY PurchasePriceCalc)
OVER (PARTITION BY ID) AS MediaCalc FROM Prices
This is how the output should look (yellow column).
Any assistance or help would be really appreciated!

I guess the problem is with OVER (PARTITION BY ID). If ID is unique, each partition consists of only one row, so the median is the row's own value and every difference comes out as 0 (or NULL).
You should remove the PARTITION BY part.
SELECT ID,PurchasePriceCalc,
PurchasePriceCalc - PERCENTILE_CONT(0.5)
WITHIN GROUP(ORDER BY PurchasePriceCalc) OVER () AS MediaCalc
FROM Prices;
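To see why the partition matters, here is a minimal sketch for SQL Server that inlines the question's sample rows as a CTE (the CTE and the MedianPerID / MedianOverall aliases are only for illustration). It computes the median both ways side by side. PERCENTILE_CONT ignores NULLs, so with the seven non-NULL sample prices the overall median should be the middle value, 0.002666667, while the per-ID median simply echoes each row's own price.

```sql
-- Sample rows from the question, inlined so the query runs on its own (SQL Server syntax).
WITH Prices (ID, PurchasePriceCalc) AS (
    SELECT v.ID, v.PurchasePriceCalc
    FROM (VALUES
        ('0146301',   0.002875161),
        ('00006L00',  0.00396),
        ('00087G03',  NULL),
        ('00001G04',  0.0020004),
        ('00006S',    0.003689818),
        ('01580h01',  NULL),
        ('00082EE00', 0.002462687),
        ('00038R05',  0.002237565),
        ('01666R01',  0.002666667)
    ) AS v (ID, PurchasePriceCalc)
)
SELECT ID,
       PurchasePriceCalc,
       -- One row per partition: the "median" is the row's own value (or NULL).
       PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY PurchasePriceCalc)
           OVER (PARTITION BY ID) AS MedianPerID,
       -- Empty OVER (): a single median computed across the whole table.
       PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY PurchasePriceCalc)
           OVER () AS MedianOverall
FROM Prices;
```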

Related

ROW_NUMBER function does not start from 1

I would like to ask about strange behaviour in SQL Server when using the ROW_NUMBER() function. Normally it starts from 1 and numbers the rows by the column selected in the ORDER BY clause, and in most scenarios it works for me just as it is supposed to. However, I have a particular case where I use a basic SELECT statement:
SELECT
ROW_NUMBER() OVER (ORDER BY VIN) AS RN,
*
FROM dbo.RawData
and I get this result:
RN VIN
6301 JTEBR3FJ00K096082
6302 JTEBR3FJ00K096132
6303 JTEBR3FJ00K096146
6304 JTEBR3FJ00K096163
6305 JTEBR3FJ00K096180
6306 JTEBR3FJ00K096275
1801 5TDDZRFHX0S820530
1802 5TDDZRFHX0S824111
1803 5TDDZRFHX0S824500
1804 5TDDZRFHX0S825971
1805 5TDDZRFHX0S826456
These are the first rows of the returned table. The ROW_NUMBER values appear in seemingly random order: after the run from 6301 to 6306, a run from 1801 to 1940 starts, and so on.
The VIN column (the one I sort the data by) is nvarchar(17).
Could you please help me solve this issue?
I would be grateful for any tips on what might be wrong.
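The ORDER BY inside OVER() only determines how the rows are numbered, not the order in which they are returned. Without an ORDER BY on the outer query, the order of the result is not guaranteed. Add one to get the rows in the desired order: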
SELECT ROW_NUMBER() OVER (ORDER BY VIN) AS RN
,*
FROM dbo.RawData
ORDER BY RN;
As the row number is calculated in the SELECT list, you can use its alias in the ORDER BY clause without needing a nested query.

How to select unique records from a result in oracle SQL?

I am running a SQL query on an Oracle database.
SELECT DISTINCT flow_id , COMPOSITE_NAME FROM CUBE_INSTANCE where flow_id IN(200148,
200162);
I am getting the following results:
200162 ABCWS1
200148 ABCWS3
200162 ABCWS2
200148 OutputLog
200162 OutputLog
In this result, 200162 appears three times because the composite name is different in each row. My requirement is to get only one row for 200162, namely the first one. If the result contains the same flow_id multiple times, only the first row for that flow_id should be displayed and the second and third should be ignored.
EXPECTED OUTPUT -
200162 ABCWS1
200148 ABCWS3
Could you please help me modify the query?
Thank you in advance !!!
It appears that you want to take the lexicographically first composite name for each flow_id:
WITH cte AS (
SELECT t.*, ROW_NUMBER() OVER (PARTITION BY flow_id ORDER BY COMPOSITE_NAME) rn
FROM CUBE_INSTANCE t
WHERE flow_id IN (200148, 200162)
)
SELECT flow_id, COMPOSITE_NAME
FROM cte
WHERE rn = 1;
There is no such thing as a "first" row, unless a column specifies that information.
But you can easily use aggregation for this purpose:
select ci.flow_id, min(ci.composite_name)
from cube_instance ci
where flow_id in (200148, 200162)
group by ci.flow_id;
If you do have a column that specifies the ordering, you can still use aggregation. The equivalent of the "first" function in Oracle is:
select ci.flow_id,
min(ci.composite_name) keep (dense_rank first order by <ordering col>)
from cube_instance ci
where flow_id in (200148, 200162)
group by ci.flow_id;
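As a rough illustration of the KEEP (DENSE_RANK FIRST ...) form, the sketch below inlines the question's five rows for Oracle and adds a seq column. That column is hypothetical: the question does not show any ordering column, so seq merely records the order in which the rows were listed. Under that assumption the query should return ABCWS3 for 200148 and ABCWS1 for 200162.

```sql
-- Sample rows from the question; seq is a HYPOTHETICAL ordering column added for illustration.
WITH cube_instance (seq, flow_id, composite_name) AS (
    SELECT 1, 200162, 'ABCWS1'    FROM dual UNION ALL
    SELECT 2, 200148, 'ABCWS3'    FROM dual UNION ALL
    SELECT 3, 200162, 'ABCWS2'    FROM dual UNION ALL
    SELECT 4, 200148, 'OutputLog' FROM dual UNION ALL
    SELECT 5, 200162, 'OutputLog' FROM dual
)
SELECT ci.flow_id,
       -- For each flow_id, keep only the row(s) with the lowest seq, then take
       -- the MIN of composite_name among them (a single row here).
       MIN(ci.composite_name) KEEP (DENSE_RANK FIRST ORDER BY ci.seq) AS composite_name
FROM cube_instance ci
WHERE ci.flow_id IN (200148, 200162)
GROUP BY ci.flow_id;
```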

Sql Islands and Gaps Merge Contiguous records if relevant fields hold same values

I have created a test case for my problem here: https://rextester.com/ZRXSQ14415
It is much easier to show the problem than to explain what I am trying to achieve.
I have a list of records across time and I wish to merge contiguous records into a single record.
Each record has a period date, risk levels and a couple of flags. When these risks and flags are the same, the records should be merged; when they are different, they should be separate rows.
In the Rextester example I have almost achieved my goal, but look at rows 3 + 4 of the result.
What I want to achieve is that rows 3 + 4 are combined into a single row 3:
StartDate End Date Name ... ...
17.03.2019 20.03.2019 CPWJ40-A ... ...
since all flags and risk levels are the same.
Change the SEQ expression to
..
ROW_NUMBER() OVER (ORDER BY PeriodDate) - ROW_NUMBER() OVER (Partition BY ImplicitRisk,QCReadyRisk,IsQualityControlReady, ActivePeriod ORDER BY PeriodDate) AS SEQ
..
This way you'll get the proper grouping of islands of ImplicitRisk,QCReadyRisk,IsQualityControlReady, ActivePeriod.
This answer is purely to complement Serg's answer with the full query.
SELECT MIN(d.PeriodDate) AS StartDate,
MAX(d.PeriodDate) AS EndDate,
ImplicitRisk,
QcReadyRisk,
IsQualityControlReady,
ActivePeriod,
LocationEventName
FROM
(
SELECT c.*,
ROW_NUMBER() OVER (ORDER BY PeriodDate) - ROW_NUMBER() OVER (Partition BY LocationEventId, ImplicitRisk, QCReadyRisk, IsQualityControlReady, ActivePeriod ORDER BY PeriodDate) AS grp
FROM tab c
--order by PeriodDate
) d
group by ImplicitRisk, QcReadyRisk, IsQualityControlReady, ActivePeriod, LocationEventName, grp
order by 1
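To make the row-number difference used above more concrete, here is a small sketch on made-up data (the five dates and risk values are hypothetical, and only one of the question's grouping columns is kept). Within a run of consecutive rows that share the same ImplicitRisk, both row numbers advance together, so their difference stays constant; when the value changes, the difference changes too. The difference alone is not unique across values, which is why the GROUP BY has to include the grouping columns as well as grp.

```sql
-- HYPOTHETICAL data: risk 1 for two days, risk 2 for two days, then risk 1 again (SQL Server syntax).
WITH tab (PeriodDate, ImplicitRisk) AS (
    SELECT v.PeriodDate, v.ImplicitRisk
    FROM (VALUES
        (CAST('2019-03-15' AS date), 1),
        (CAST('2019-03-16' AS date), 1),
        (CAST('2019-03-17' AS date), 2),
        (CAST('2019-03-18' AS date), 2),
        (CAST('2019-03-19' AS date), 1)
    ) AS v (PeriodDate, ImplicitRisk)
),
numbered AS (
    SELECT PeriodDate,
           ImplicitRisk,
           -- Overall position minus position within the same risk value:
           -- constant inside an island, different for the next island of that value.
           ROW_NUMBER() OVER (ORDER BY PeriodDate)
         - ROW_NUMBER() OVER (PARTITION BY ImplicitRisk ORDER BY PeriodDate) AS grp
    FROM tab
)
SELECT MIN(PeriodDate) AS StartDate,
       MAX(PeriodDate) AS EndDate,
       ImplicitRisk
FROM numbered
GROUP BY ImplicitRisk, grp   -- grp alone is not enough: two islands can share the same difference
ORDER BY StartDate;
```

With this data the query should produce three islands: 15 to 16 March (risk 1), 17 to 18 March (risk 2), and 19 March on its own (risk 1).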

SQL - Group Values by Percentile/Merge Rankings

I have multiple tables that contain the name of a company/attribute and a ranking.
I would like to write a piece of code that places a range of scores into specific groups based on the percentile of each score relative to the table's score total. I provided a very simple use case to demonstrate what I am looking for, splitting a group of 10 companies into 5 groups, but I would like to scale this so that the 5 groups apply to data sets with many rows WITHOUT having to specify values in a CASE statement.
You can use NTILE to divide the data into 5 buckets based on score. However, if the rows can't be divided evenly into 5 buckets, the first buckets will each have one more member than the rest.
SELECT t.*, NTILE(5) OVER(ORDER BY score) as grp
FROM tablename t
NTILE(5) OVER(ORDER BY score) might actually put rows with the same value into different quantiles (this is probably not what you want; at least I never liked that).
It's quite similar to
5 * (row_number() over (order by score) - 1) / count(*) over ()
but if the number of rows can't be evenly divided, NTILE adds the remainder rows to the first quantiles, whereas the ROW_NUMBER expression spreads them across the quantiles according to the integer division.
To assign all the rows with the same value to the same quantile you need to do your own calculation:
5 * (rank() over (order by score) - 1) / count(*) over ()
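The difference is easiest to see on data with ties. The sketch below uses ten made-up companies (names and scores are hypothetical), three of which share the score 20, and shows both expressions side by side. It assumes a dialect where dividing two integers yields an integer, such as SQL Server or PostgreSQL; where `/` returns a decimal you would need to wrap the expression in FLOOR. Note also that the rank-based expression numbers its groups from 0, not from 1.

```sql
-- HYPOTHETICAL sample: ten companies, three of them tied on score 20.
WITH scores (company, score) AS (
    SELECT v.company, v.score
    FROM (VALUES
        ('A', 10), ('B', 20), ('C', 20), ('D', 20), ('E', 30),
        ('F', 40), ('G', 50), ('H', 60), ('I', 70), ('J', 80)
    ) AS v (company, score)
)
SELECT company,
       score,
       -- NTILE splits purely by row count, so the tied rows can straddle two buckets.
       NTILE(5) OVER (ORDER BY score) AS ntile_grp,
       -- Tied rows share one rank, so they always fall into the same (0-based) group.
       5 * (RANK() OVER (ORDER BY score) - 1) / COUNT(*) OVER () AS rank_grp
FROM scores
ORDER BY score, company;
```

Because ties are kept together, the rank-based groups can be uneven in size, and some group numbers may not be used at all.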
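You can try using ROW_NUMBER() and CEILING():
SELECT t.name,t.score,
CEILING(ROW_NUMBER() OVER(ORDER BY t.score)/2.0) as grp
FROM YourTable t
This places each consecutive pair of rows into a single group, using the ROW_NUMBER() result. Dividing by 2.0 rather than 2 avoids integer division, which would otherwise truncate before CEILING is applied, and the alias is grp because GROUP is a reserved word.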

Calculate Rank Pattern without any order value

My data looks like this:
There are three columns to look at: jil_equipment_id, req_group, operand.
Based on these three columns, I have to generate a new "Pattern" column.
The pattern starts from 2 and increases by 1 for each repeated combination of jil_equipment_id, req_group, operand.
The final data will look like this:
Please suggest any possible approach. I am not able to use the RANK()/DENSE_RANK() functions for this.
You can use row_number(). You want to use the partition by as well:
select t.*,
(1 + row_number() over (partition by jil_equipment_id, req_group, operand
order by content_id
)
) as pattern
from t;
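As a quick check of the numbering, here is a sketch on four made-up rows (all values are hypothetical; content_id is assumed to be a column that defines the row order, as in the query above). The VALUES-based CTE works in SQL Server and PostgreSQL. Rows 1, 2 and 4 share one combination and should get 2, 3 and 4, while row 3 starts its own combination at 2.

```sql
-- HYPOTHETICAL rows: content_id 1, 2 and 4 repeat the same combination; 3 is different.
WITH t (content_id, jil_equipment_id, req_group, operand) AS (
    SELECT v.content_id, v.jil_equipment_id, v.req_group, v.operand
    FROM (VALUES
        (1, 501, 1, 'AND'),
        (2, 501, 1, 'AND'),
        (3, 501, 2, 'OR'),
        (4, 501, 1, 'AND')
    ) AS v (content_id, jil_equipment_id, req_group, operand)
)
SELECT t.*,
       -- ROW_NUMBER restarts at 1 for each combination; adding 1 makes the pattern start at 2.
       1 + ROW_NUMBER() OVER (PARTITION BY jil_equipment_id, req_group, operand
                              ORDER BY content_id) AS pattern
FROM t
ORDER BY content_id;
```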
select *,Row_Number() over(partition by jil_equipment_id,req_group,operand order by jil_equipment_id,req_group,operand) + 1 as pattern
from tab
You can use the row_number() function for this.