SQL Grouping even and odd - sql

I have to group certain data so that it comes out in two sets.
The attached image has details of the actual data, the expected result, and the data returned by the query I used.
I am sure I am missing something in the GROUP BY or MAX option. Please help.
select agrmnt_id ,location_name, slab_no,target_start,target_end, tier_perc ,mod(RANK, 2) col from
(select agrmnt_id ,location_name, slab_no, target as target_start ,LAG(target) OVER (PARTITION BY location_name ORDER BY slab_no DESC)-1 as target_end ,PAY_PREC|| '%' as tier_perc,
DENSE_RANK() over(partition by agrmnt_id order by location_name) RANK
from plb_addnl_slab_details
where agrmnt_id='PLBCAI140262' order by location_name,slab_no
)) group by agrmnt_id,location_name ,slab_no
order by location_name1 ,slab_no1, location_name2 ,slab_no2

If I understand what you want, which is more than a little doubtful, it seems like you are able to generate a list of all the values you want, but you can't get them aligned in two sets? If so I think you need to treat your initial list as a base view and left outer join it to itself, using your col value to decide which is in first set and which in the second.
The criteria for joining seem a bit vague. If I add another ranking to stop the same values appearing twice in the second columns, I can get your expected result with this:
with t as (
select agrmnt_id, location_name, slab_no, target_start, target_end,
tier_perc , mod(col_rnk, 2) col, rnk
from (
select agrmnt_id, location_name, slab_no, target as target_start,
LAG(target) OVER (PARTITION BY location_name
ORDER BY slab_no DESC)-1 as target_end,
SLAB_PERC|| '%' as tier_perc,
DENSE_RANK() over(partition by agrmnt_id order by location_name) col_rnk,
RANK() over(partition by agrmnt_id, slab_no order by location_name) rnk
from plb_addnl_slab_details
where agrmnt_id='PLBCAI140262'
)
)
select t1.agrmnt_id as agrmnt_id_1, t1.location_name as location_name_1,
t1.slab_no as slab_no_1, t1.target_start as target_start_1,
t1.target_end as target_end_1,
t2.agrmnt_id as agrmnt_id_2, t2.location_name as location_name_2,
t2.slab_no as slab_no_2, t2.target_start as target_start_2,
t2.target_end as target_end_2
from t t1
left join t t2 on t2.agrmnt_id = t1.agrmnt_id
and t2.slab_no = t1.slab_no
and t2.rnk = t1.rnk + 1
and t2.col = 0
where t1.col = 1
order by t1.agrmnt_id, t1.location_name, t1.slab_no;
SQL Fiddle. I'm not convinced those join conditions (or the new rank) are quite right but can't really tell without more data, or more information about the logic you want to use. Hopefully this gives you something you can adapt though.
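To make the pairing idea easier to follow, here is a minimal Python sketch of the same logic outside the database. The rows and location names are invented for illustration, and the join is simplified: it matches each odd-ranked location to the location with the next dense rank for the same slab_no, which is an assumption and not exactly the rnk condition used in the query above.

```python
# Hypothetical rows: (location_name, slab_no, target_start)
rows = [
    ("Delhi", 1, 100), ("Delhi", 2, 200),
    ("Mumbai", 1, 150), ("Mumbai", 2, 250),
    ("Pune", 1, 120),
]

# Equivalent of DENSE_RANK() OVER (ORDER BY location_name)
locations = sorted({r[0] for r in rows})
dense_rank = {name: i + 1 for i, name in enumerate(locations)}

# mod(rank, 2) = 1 goes to the first set; the partner comes from the next rank
left = [r for r in rows if dense_rank[r[0]] % 2 == 1]
by_key = {(dense_rank[r[0]], r[1]): r for r in rows}

# Left outer join: keep every left row, with None when no partner exists
paired = []
for r in sorted(left):
    partner = by_key.get((dense_rank[r[0]] + 1, r[1]))
    paired.append((r, partner))

for first, second in paired:
    print(first, second)
```

Pune has no even-ranked partner in this made-up data, so its second set is empty, which is what the left outer join is there to preserve.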

Related

sorting and comparing columns in Big Query SQL

I have a requirement to sort and compare column values in a table that has 6 columns.
Sorting needs to be done for A_Length, A_breadth and A_Width, and the same sorting needs to be done for B_length, B_breadth and B_width.
After sorting, the A_* and B_* columns need to be compared in their sorted order.
The comparison needs to produce true or false as output:
(3<11 = true and 22<22 = false and 23<32 = true), so the overall result is false
(5<11 = true and 11<22 = true and 17<32 = true), so the overall result is true
(17<11 = false and 23<22 = false and 27<32 = true), so the overall result is false
In BigQuery I can use GREATEST and LEAST, but I am not sure how to get the third value (the one that is neither the greatest nor the least).
Let me know if anyone can suggest a logic. It's a big table with multiple columns, and the above 6 columns are part of it.
Some more info below:
Sorting is smallest to largest. Suppose I have 6 columns with values A_Len=3, A_Bred=15, A_Wid=10, B_Len=20, B_Bred=11, B_Wid=7. First the A_* columns need to be sorted (3, 10, 15), then the B_* columns (7, 11, 20). Then a less-than comparison needs to be done in that order between the sorted A_* and B_* values (3<7 = true, 10<11 = true, 15<20 = true), and I need the output of the comparison as true or false. I need a suggestion for how this can be done in GCP BigQuery; all 6 columns are part of one table.
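The rule described in the question can be sketched in a few lines of Python, using only the numbers given there. The function names are hypothetical; the middle-value trick (sum minus greatest minus least) is one possible way to get the value that GREATEST and LEAST leave out, shown here as an assumption about what would be convenient.

```python
def middle_of_three(x, y, z):
    # The value that is neither the greatest nor the least
    return x + y + z - max(x, y, z) - min(x, y, z)

def sorted_less_than(a_dims, b_dims):
    # Sort each side independently, then compare position by position
    a_sorted = sorted(a_dims)
    b_sorted = sorted(b_dims)
    return all(a < b for a, b in zip(a_sorted, b_sorted))

# Example from the question: A = (3, 15, 10), B = (20, 11, 7)
result = sorted_less_than((3, 15, 10), (20, 11, 7))
print(result)                      # (3<7, 10<11, 15<20)
print(middle_of_three(3, 15, 10))  # middle value of the A side
```

The overall result is true only when every pairwise comparison is true, so a single tie such as 22<22 makes the row false.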
It seems like you want to do a comparison with something that is not quite the table you currently have: you want to sort each column individually, without affecting the others, and then compare.
You could try something like this (changing all the instances of TABLE for your table name):
SELECT
sorted_A_LENGTH.A_LENGTH,
sorted_A_BREADTH.A_BREADTH,
sorted_A_WIDTH.A_WIDTH,
sorted_B_LENGTH.B_LENGTH,
sorted_B_BREADTH.B_BREADTH,
sorted_B_WIDTH.B_WIDTH,
sorted_A_LENGTH.A_LENGTH < sorted_B_LENGTH.B_LENGTH,
sorted_A_BREADTH.A_BREADTH < sorted_B_BREADTH.B_BREADTH,
sorted_A_WIDTH.A_WIDTH < sorted_B_WIDTH.B_WIDTH
FROM
-- each subquery numbers the rows of one column in ascending order
(SELECT A_LENGTH, row_number() OVER (ORDER BY A_LENGTH) AS row_num FROM TABLE) as sorted_A_LENGTH
LEFT JOIN
(SELECT A_BREADTH, row_number() OVER (ORDER BY A_BREADTH) AS row_num FROM TABLE) as sorted_A_BREADTH
ON sorted_A_LENGTH.row_num = sorted_A_BREADTH.row_num
LEFT JOIN
(SELECT A_WIDTH, row_number() OVER (ORDER BY A_WIDTH) AS row_num FROM TABLE) as sorted_A_WIDTH
ON sorted_A_LENGTH.row_num = sorted_A_WIDTH.row_num
LEFT JOIN
(SELECT B_LENGTH, row_number() OVER (ORDER BY B_LENGTH) AS row_num FROM TABLE) as sorted_B_LENGTH
ON sorted_A_LENGTH.row_num = sorted_B_LENGTH.row_num
LEFT JOIN
(SELECT B_BREADTH, row_number() OVER (ORDER BY B_BREADTH) AS row_num FROM TABLE) as sorted_B_BREADTH
ON sorted_A_LENGTH.row_num = sorted_B_BREADTH.row_num
LEFT JOIN
(SELECT B_WIDTH, row_number() OVER (ORDER BY B_WIDTH) AS row_num FROM TABLE) as sorted_B_WIDTH
ON sorted_A_LENGTH.row_num = sorted_B_WIDTH.row_num
Consider the query below:
select *,
a_length < b_length as length_a_less_b,
a_breadth < b_breadth as breadth_a_less_b,
a_width < b_width as width_a_less_b
from (
select * from(select * from your_table limit 0) union all
select a_arr[offset(0)], a_arr[offset(1)], a_arr[offset(2)],
b_arr[offset(0)], b_arr[offset(1)], b_arr[offset(2)]
from your_table,
unnest([struct((
select array_agg(a order by a) from unnest([a_length, a_breadth, a_width]) a
) as a_arr)]),
unnest([struct((
select array_agg(b order by b) from unnest([b_length, b_breadth, b_width]) b
) as b_arr)])
)
if applied to sample data in your question
output is

ORA-00937: not a single-group group function for sum function

I am trying to sum up the COUNT(IHID.RSID_PROD_N) by IHID.CS_ID but facing a problem. How to solve it?
SELECT
IHID.CS_ID ,IHID.RSID_PROD_N,COUNT(IHID.RSID_PROD_N),
RSPF.RSPF_PROD_N,COUNT(RSPF.RSPF_PROD_N),sum(COUNT(IHID.RSID_PROD_N))
from IHIH
JOIN IHID
ON ihih.rsih_invoice_n = ihid.rsih_invoice_n AND ihih.cs_id = ihid.cs_id
JOIN RSPF
ON ihih.cs_id = rspf.cs_id AND ihid.rsid_prod_n=rspf.rspf_prod_n
WHERE rspf_desc LIKE '%SCISSOR LIFT'
GROUP BY IHID.CS_ID, IHID.RSID_PROD_N,RSPF.RSPF_PROD_N,IHID.CS_ID;
The table is something like this
16 SJIII4626 1 SJIII4626 1
16 SJIII4632 1 SJIII4632 1
I want 1+1=2 for 16
I think you need analytic functions rather than aggregates here. Something like:
SELECT
IHID.CS_ID
,IHID.RSID_PROD_N
,row_number() over (partition by IHID.CS_ID order by IHID.RSID_PROD_N) as IHID_RSID_PROD_N
,RSPF.RSPF_PROD_N
,row_number() over (partition by IHID.CS_ID order by RSPF.RSPF_PROD_N) as RSPF_RSPF_PROD_N
,COUNT(IHID.RSID_PROD_N) over (partition by IHID.CS_ID) as sum_count
from IHIH
JOIN IHID
ON ihih.rsih_invoice_n = ihid.rsih_invoice_n AND ihih.cs_id = ihid.cs_id
JOIN RSPF
ON ihih.cs_id = rspf.cs_id AND ihid.rsid_prod_n=rspf.rspf_prod_n
WHERE rspf_desc LIKE '%SCISSOR LIFT'
;
Not entirely sure because your question lacks a complete test case.
If this answer isn't quite what you want please edit your question to provide table structures and sample input data together with required output derived from that data.
It's not grouping as you would like because of the unique values in IHID.RSID_PROD_N and RSPF.RSPF_PROD_N. Remove those columns and it will group as expected.
One option is to use your current query (almost unchanged) as a CTE, and then apply SUM to a COUNT which you couldn't have done in a nested manner. Something like this:
with your_current_query as
-- removed nested SUM(COUNT)
(select
ihid.cs_id,
ihid.rsid_prod_n,
rspf.rspf_prod_n,
count(ihid.rsid_prod_n) cnt_rsid,
count(rspf.rspf_prod_n) cnt_rspf
from ihih join ihid on ihih.rsih_invoice_n = ihid.rsih_invoice_n
and ihih.cs_id = ihid.cs_id
join rspf on ihih.cs_id = rspf.cs_id
and ihid.rsid_prod_n=rspf.rspf_prod_n
where rspf_desc like '%SCISSOR LIFT'
group by ihid.cs_id,
ihid.rsid_prod_n,
rspf.rspf_prod_n
)
select cs_id,
rsid_prod_n,
rspf_prod_n,
cnt_rsid,
cnt_rspf,
sum(cnt_rsid) sum_cnt_rsid --> this represents nested SUM(COUNT)
from your_current_query
group by cs_id,
rsid_prod_n,
rspf_prod_n,
cnt_rsid,
cnt_rspf;
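As a rough way to experiment with the "count per product, total per CS_ID" idea, the following sketch runs against an in-memory SQLite database from Python. The table invoice_lines, its columns, and the extra row for CS_ID 17 are made up for illustration; the real query joins IHIH, IHID and RSPF. It assumes a SQLite build with window function support (3.25 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the joined IHIH/IHID/RSPF result
conn.execute("CREATE TABLE invoice_lines (cs_id INTEGER, prod_n TEXT)")
conn.executemany(
    "INSERT INTO invoice_lines VALUES (?, ?)",
    [(16, "SJIII4626"), (16, "SJIII4632"), (17, "SJIII4626")],
)

query = """
WITH per_product AS (
    -- plain aggregate: one row per cs_id and product
    SELECT cs_id, prod_n, COUNT(prod_n) AS cnt
    FROM invoice_lines
    GROUP BY cs_id, prod_n
)
SELECT cs_id, prod_n, cnt,
       -- analytic sum: total of the counts within each cs_id
       SUM(cnt) OVER (PARTITION BY cs_id) AS cs_total
FROM per_product
ORDER BY cs_id, prod_n
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Keeping the aggregate in the CTE and the analytic SUM in the outer query avoids nesting SUM(COUNT(...)) in a single grouped select, which is what raises ORA-00937 in the original.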

Modify my SQL Server query -- returns too many rows sometimes

I need to update the following query so that it only returns one child record (remittance) per parent (claim).
Table Remit_To_Activate contains exactly one date/timestamp per claim, which is what I wanted.
But when I join the full Remittance table to it, since some claims have multiple remittances with the same date/timestamps, the outermost query returns more than 1 row per claim for those claim IDs.
SELECT * FROM REMITTANCE
WHERE BILLED_AMOUNT>0 AND ACTIVE=0
AND REMITTANCE_UUID IN (
SELECT REMITTANCE_UUID FROM Claims_Group2 G2
INNER JOIN Remit_To_Activate t ON (
(t.ClaimID = G2.CLAIM_ID) AND
(t.DATE_OF_LATEST_REGULAR_REMIT = G2.CREATE_DATETIME)
)
where ACTIVE=0 and BILLED_AMOUNT>0
)
I believe the problem would be resolved if I included REMITTANCE_UUID as a column in Remit_To_Activate. That's the REAL issue. This is how I created the Remit_To_Activate table (trying to get the most recent remittance for a claim):
SELECT MAX(create_datetime) as DATE_OF_LATEST_REMIT,
MAX(claim_id) AS ClaimID
INTO Latest_Remit_To_Activate
FROM Claims_Group2
WHERE BILLED_AMOUNT>0
GROUP BY Claim_ID
ORDER BY Claim_ID
Claims_Group2 contains these fields:
REMITTANCE_UUID,
CLAIM_ID,
BILLED_AMOUNT,
CREATE_DATETIME
The 2 rows that are currently giving me the problem (shown in a screenshot in the original post) are both remits for the SAME CLAIM, with the SAME TIMESTAMP. I only want one of them in the Remits_To_Activate table, so only ONE remittance will be "activated" per claim.
You can change your query like this:
SELECT
p.*, latest_remit.DATE_OF_LATEST_REMIT
FROM
Remittance AS p inner join
(SELECT MAX(create_datetime) as DATE_OF_LATEST_REMIT,
claim_id
FROM Claims_Group2
WHERE BILLED_AMOUNT>0
GROUP BY Claim_ID) as latest_remit
on latest_remit.claim_id = p.claim_id;
This will give you only one row. Untested (so please run and make changes).
Without having more information on the structure of your database -- especially the structure of Claims_Group2 and REMITTANCE, and the relationship between them, it's not really possible to advise you on how to introduce a remittance UUID into DATE_OF_LATEST_REMIT.
Since you are using SQL Server, however, it is possible to use a window function to introduce a synthetic means to choose among remittances having the same timestamp. For example, it looks like you could approach the problem something like this:
select *
from (
select
r.*,
row_number() over (partition by cg2.claim_id order by cg2.create_datetime desc) as rn
from
remittance r
join claims_group2 cg2
on r.remittance_uuid = cg2.remittance_uuid
where
r.active = 0
and r.billed_amount > 0
and cg2.active = 0
and cg2.billed_amount > 0
) t
where t.rn = 1
Note that this does not depend on your DATE_OF_LATEST_REMIT table at all, it having been subsumed into the inline view. Note also that this will introduce one extra column into your results, though you could avoid that by enumerating the columns of table remittance in the outer select clause.
It also seems odd to be filtering on two sets of active and billed_amount columns, but that appears to follow from what you were doing in your original queries. In that vein, I urge you to check the results carefully, as lifting the filter conditions on cg2 columns up to the level of the join to remittance yields a result that may return rows that the original query did not (but never more than one per claim_id).
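For anyone who wants to try the row_number() idea without a SQL Server instance, here is a small sketch using SQLite from Python. The sample rows and UUID strings are invented, and adding remittance_uuid as a second ORDER BY key is an assumption of this sketch: it makes the choice between identical timestamps repeatable, which the query above does not guarantee. It assumes SQLite 3.25 or later for window functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE claims_group2 ("
    "remittance_uuid TEXT, claim_id INTEGER, "
    "billed_amount REAL, create_datetime TEXT)"
)
# Hypothetical data: claim 1 has two remittances with the same timestamp
conn.executemany(
    "INSERT INTO claims_group2 VALUES (?, ?, ?, ?)",
    [
        ("uuid-a", 1, 50.0, "2018-08-15 13:07:50.933"),
        ("uuid-b", 1, 50.0, "2018-08-15 13:07:50.933"),
        ("uuid-c", 1, 50.0, "2018-01-01 09:00:00.000"),
        ("uuid-d", 2, 75.0, "2017-12-31 10:00:00.000"),
    ],
)

query = """
SELECT remittance_uuid, claim_id, create_datetime
FROM (
    SELECT cg2.*,
           ROW_NUMBER() OVER (
               PARTITION BY claim_id
               -- newest first; the uuid only breaks ties deterministically
               ORDER BY create_datetime DESC, remittance_uuid
           ) AS rn
    FROM claims_group2 AS cg2
    WHERE billed_amount > 0
) AS t
WHERE rn = 1
ORDER BY claim_id
"""
latest = conn.execute(query).fetchall()
for row in latest:
    print(row)
```

Each claim_id appears exactly once in the result, even though claim 1 has two candidates sharing the latest timestamp.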
A co-worker offered me this elegant demonstration of a solution. I'd never used "over" or "partition" before. Works great! Thank you John and Gaurasvsa for your input.
if OBJECT_ID('tempdb..#t') is not null
drop table #t
select *, ROW_NUMBER() over (partition by CLAIM_ID order by CLAIM_ID) as ROW_NUM
into #t
from
(
select '2018-08-15 13:07:50.933' as CREATE_DATE, 1 as CLAIM_ID, NEWID() as REMIT_UUID
union select '2018-08-15 13:07:50.933', 1, NEWID()
union select '2017-12-31 10:00:00.000', 2, NEWID()
) x
select *
from #t
order by CLAIM_ID, ROW_NUM
select CREATE_DATE, MAX(CLAIM_ID), MAX(REMIT_UUID)
from #t
where ROW_NUM = 1
group by CREATE_DATE

Percentage difference between numbers in two columns

My SQL experience is fairly minimal so please go easy on me here. I have a table tblForEx and I'm trying to create a query that looks at one particular column LastSalesRateChangeDate and also ForExRate.
Basically what I want to do is for the query to check that LastSalesRateChangeDate and then pull the ForExRate that is on the same line (obviously in the ForExRate column), then I need to check to see if there is a +/- 5% change since the last time the LastSalesRateChangeDate changed. I hope this makes sense, I tried to explain it as clearly as possible.
I believe I would need to create a 'subquery' to look at the LastSalesRateChangeDate and pull the ForEx rate from that date, but I just don't know how to go about this.
I should add this is being done in Access (SQL)
Sample data, here is what the table looks like:
| BaseCur | ForCur | ForExRate | LastSalesRateChangeDate
| USD | BRL | 1.718 | 12/9/2008
| USD | BRL | 1.65 | 11/8/2008
So I would need a query to look at the LastSalesRateChangeDate column, check to see if the date has changed, if so take the ForExRate value and then give a percentage difference of that ForExRate value since the last record.
So the final result would likely look like
"BaseCur" "ForCur" "Percentage Change since Last Sales Rate Change"
USD BRL X%
Gordon's answer pointed in the right direction:
SELECT t2.*, (SELECT top 1 t.ForExRate
FROM tblForEx t
where t.BaseCur=t2.BaseCur AND t.ForCur=t2.ForCur and t.LastSalesRateChangeDate<t2.LastSalesRateChangeDate
order by t.LastSalesRateChangeDate DESC, t.ForExRate DESC
) AS PreviousRate, [ForExRate]/[PreviousRate]-1 AS ChangeRatio
FROM tblForEx AS t2;
Access gives errors where the TOP 1 in the subquery causes "ties". We broke the ties and therefore removed the error by adding an extra item to the ORDER BY clause. To get the ratio to display as a percentage, switch to the design view and change the properties of that column accordingly.
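The previous-rate lookup and the +/- 5% check can be tried out with a small sketch like the one below, which uses SQLite from Python in place of Access. It is only an approximation of the Access query: LIMIT 1 stands in for TOP 1, the dates from the sample table are rewritten as ISO strings on the assumption that they are month/day/year, and the 5% threshold is applied in Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tblForEx ("
    "BaseCur TEXT, ForCur TEXT, ForExRate REAL, LastSalesRateChangeDate TEXT)"
)
# The two sample rows from the question, with dates converted to ISO format
conn.executemany(
    "INSERT INTO tblForEx VALUES (?, ?, ?, ?)",
    [("USD", "BRL", 1.718, "2008-12-09"), ("USD", "BRL", 1.65, "2008-11-08")],
)

query = """
SELECT cur.BaseCur, cur.ForCur, cur.ForExRate,
       (SELECT prev.ForExRate
        FROM tblForEx AS prev
        WHERE prev.BaseCur = cur.BaseCur
          AND prev.ForCur = cur.ForCur
          AND prev.LastSalesRateChangeDate < cur.LastSalesRateChangeDate
        -- second sort key breaks ties, as in the Access version
        ORDER BY prev.LastSalesRateChangeDate DESC, prev.ForExRate DESC
        LIMIT 1) AS PreviousRate
FROM tblForEx AS cur
ORDER BY cur.LastSalesRateChangeDate
"""
rows = conn.execute(query).fetchall()

changes = []
for base, foreign, rate, previous in rows:
    if previous is None:
        continue  # earliest record has nothing to compare against
    ratio = rate / previous - 1
    changes.append((base, foreign, ratio, abs(ratio) >= 0.05))

for base, foreign, ratio, flagged in changes:
    print(base, foreign, f"{ratio:.2%}", "over 5%" if flagged else "within 5%")
```

With the sample data the change from 1.65 to 1.718 is a little over 4%, so it falls inside the 5% band.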
If I understand correctly, you want the previous value. In MS Access, you can use a correlated subquery:
select t.*,
(select top (1) t2.LastSalesRateChangeDate
from tblForEx as t2
where t2.BaseCur = t.BaseCur and t2.ForCur = t.ForCur and
t2.LastSalesRateChangeDate < t.LastSalesRateChangeDate
order by t2.LastSalesRateChangeDate desc
) as prev_LastSalesRateChangeDate
from tblForEx as t;
Now, with this as a subquery, you can get the previous exchange rate using a join:
select t.*, ( (t.ForExRate / tprev.ForExRate) - 1) as change_ratio
from (select t.*,
(select top (1) t2.LastSalesRateChangeDate
from tblForEx as t2
where t2.BaseCur = t.BaseCur and t2.ForCur = t.ForCur and
t2.LastSalesRateChangeDate < t.LastSalesRateChangeDate
order by t2.LastSalesRateChangeDate desc
) as prev_LastSalesRateChangeDate
from tblForEx as t
) as t inner join
tblForEx as tprev
on tprev.BaseCur = t.BaseCur and tprev.ForCur = t.ForCur and
tprev.LastSalesRateChangeDate = t.prev_LastSalesRateChangeDate;
As I understand it, you can use the LEAD function to get the rate from the previous change date in a new column, using the query below:
WITH CTE AS (
SELECT *, LEAD(ForExRate, 1) OVER(PARTITION BY BaseCur, ForCur ORDER BY LastChangeDate DESC) LastValue
FROM #TT
)
SELECT BaseCur, ForCur, ForExRate, LastChangeDate , CAST( ((ForExRate - ISNULL(LastValue, 0))/LastValue)*100 AS float)
FROM CTE
The problems here are:
For the last row in each group (the earliest date), the calculated column produced by the LEAD function will be NULL.
If there is only a single row for a particular BaseCur and ForCur, that row will also have NULL in the column.
Resolution:
If you are sure that there will be at least two rows for each BaseCur and ForCur, you can use a WHERE clause to remove the NULL values from the final result.
WITH CTE AS (
SELECT *, LEAD(ForExRate, 1) OVER(PARTITION BY BaseCur, ForCur ORDER BY LastChangeDate DESC) LastValue
FROM #TT
)
SELECT BaseCur, ForCur, ForExRate, LastChangeDate , CAST( ((ForExRate - ISNULL(LastValue, 0))/LastValue)*100 AS float) Percentage
FROM CTE
WHERE LastValue IS NOT NULL
SELECT basetbl.BaseCur, basetbl.ForCur, basetbl.NewDate, basetbl.OldDate, num2.ForExRate/num1.ForExRate*100 AS PercentChange FROM
(((SELECT t.BaseCur, t.ForCur, MAX(t.LastSalesRateChangeDate) AS NewDate, summary.Last_Date AS OldDate
FROM (tblForEx AS t
LEFT JOIN (SELECT TOP 2 BaseCur, ForCur, MAX(LastSalesRateChangeDate) AS Last_Date FROM tblForEx AS t1
WHERE LastSalesRateChangeDate <>
(SELECT MAX(LastSalesRateChangeDate) FROM tblForEx t2 WHERE t2.BaseCur = t1.BaseCur AND t2.ForCur = t1.ForCur)
GROUP BY BaseCur, ForCur) AS summary
ON summary.ForCur = t.ForCur AND summary.BaseCur = t.BaseCur)
GROUP BY t.BaseCur, t.ForCur, summary.Last_Date) basetbl
LEFT JOIN tblForEx num1 ON num1.BaseCur=basetbl.BaseCur AND num1.ForCur = basetbl.ForCur AND num1.LastSalesRateChangeDate = basetbl.OldDate))
LEFT JOIN tblForEx num2 ON num2.BaseCur=basetbl.BaseCur AND num2.ForCur = basetbl.ForCur AND num2.LastSalesRateChangeDate = basetbl.NewDate;
This uses a series of subqueries. First, you are selecting the most recent date for the BaseCur and ForCur. Then, you are joining onto that the previous date. I do that by using another subquery to select the top two dates, and exclude the one that is equal to the previously established most recent date. This is the "summary" subquery.
Then, you get the BaseCur, ForCur, NewDate, and OldDate in the "basetbl" subquery. After that, it is two simple joins of the original table back onto those dates to get the rate that was applicable then.
Finally, you are selecting your BaseCur, ForCur, and whatever formula you want to use to calculate the rate change. I used a simple ratio in that one, but it is easy to change. You can remove the dates in the first line if you want, they are there solely as a reference point.
It doesn't look pretty, but complicated Access SQL queries never do.

Optimization of multiple aggregate sorting in SQL

I have a Postgres query written for the Spree Commerce store that sorts all of its products in the following order: In stock (then first available), Backorder (then first available), Sold out (then first available).
In order to chain it with rails scopes I had to put it in the order by clause as opposed to anywhere else. The query itself works, and is fairly performant, but complex. I was curious if anyone with a bit more knowledge could discuss a better way to do it? I'm interested in performance, but also different ways to approach the problem.
ORDER BY (
SELECT
CASE
WHEN tt.count_on_hand > 0
THEN 2
WHEN zz.backorderable = true
THEN 1
ELSE 0
END
FROM (
SELECT
row_number() OVER (dpartition),
z.id,
bool_or(backorderable) OVER (dpartition) as backorderable
FROM (
SELECT DISTINCT ON (spree_variants.id) spree_products.id, spree_stock_items.backorderable as backorderable
FROM spree_products
JOIN "spree_variants" ON "spree_variants"."product_id" = "spree_products"."id" AND "spree_variants"."deleted_at" IS NULL
JOIN "spree_stock_items" ON "spree_stock_items"."variant_id" = "spree_variants"."id" AND "spree_stock_items"."deleted_at" IS NULL
JOIN "spree_stock_locations" ON spree_stock_locations.id=spree_stock_items.stock_location_id
WHERE spree_stock_locations.active = true
) z window dpartition as (PARTITION by id)
) zz
JOIN (
SELECT
row_number() OVER (dpartition),
t.id,
sum(count_on_hand) OVER (dpartition) as count_on_hand
FROM (
SELECT DISTINCT ON (spree_variants.id) spree_products.id, spree_stock_items.count_on_hand as count_on_hand
FROM spree_products
JOIN "spree_variants" ON "spree_variants"."product_id" = "spree_products"."id" AND "spree_variants"."deleted_at" IS NULL
JOIN "spree_stock_items" ON "spree_stock_items"."variant_id" = "spree_variants"."id" AND "spree_stock_items"."deleted_at" IS NULL
) t window dpartition as (PARTITION by id)
) tt ON tt.row_number = 1 AND tt.id = spree_products.id
WHERE zz.row_number = 1 AND zz.id=spree_products.id
) DESC, available_on DESC
The FROM shown above determines whether or not a product is backorderable, and the JOIN shown above determines the stock in inventory. Note that these are very similar queries, except that I need to determine whether something is backorderable based on a location's ability to support backorders and its state, WHERE spree_stock_locations.active=true.
Thanks for any advice!
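To make the intended ordering concrete, here is a minimal Python sketch of the sort key the CASE expression computes. The product dictionaries and field names are invented for illustration; in the real query the on-hand total is summed over variants and the backorderable flag is a bool_or over active stock locations.

```python
from datetime import date

def stock_rank(total_on_hand, any_backorderable):
    # Mirrors the CASE: in stock = 2, backorderable = 1, sold out = 0
    if total_on_hand > 0:
        return 2
    if any_backorderable:
        return 1
    return 0

# Hypothetical products, already aggregated per product
products = [
    {"name": "sold_out_new", "on_hand": 0, "backorderable": False,
     "available_on": date(2024, 3, 1)},
    {"name": "in_stock_old", "on_hand": 5, "backorderable": False,
     "available_on": date(2023, 1, 1)},
    {"name": "backorder", "on_hand": 0, "backorderable": True,
     "available_on": date(2024, 1, 1)},
    {"name": "in_stock_new", "on_hand": 2, "backorderable": True,
     "available_on": date(2024, 2, 1)},
]

# ORDER BY rank DESC, available_on DESC
ordered = sorted(
    products,
    key=lambda p: (stock_rank(p["on_hand"], p["backorderable"]),
                   p["available_on"]),
    reverse=True,
)
names = [p["name"] for p in ordered]
print(names)
```

A product that is both in stock and backorderable still ranks as in stock, because the on-hand check comes first, just as in the CASE expression.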