Get Max date by another column


I am trying to write a simple query to get the MAX DEMAND_DATE for each INV_CART_ID. Here is my existing query:
SELECT BUSINESS_UNIT, INV_CART_ID, INV_ITEM_ID, CART_COUNT_QTY, DEMAND_DATE
FROM PS_CART_CT_INF_INV A
WHERE A.INV_ITEM_ID = 1
AND A.BUSINESS_UNIT = '11MMS'
AND A.CART_COUNT_QTY <> 0
ORDER BY DEMAND_DATE DESC
Current Output:
Desired Output:
BUSINESS_UNIT  INV_CART_ID     INV_ITEM_ID  CART_COUNT_QTY  DEMAND_DATE
11MMS          405             1            5.0000          2018-05-29
11MMS          OUTPT_INFUSION  1            4.0000          2018-05-29
11MMS          938             1            15.0000         2018-05-31
11MMS          286             1            1.0000          2018-05-07
11MMS          708             1            4.0000          2018-04-05
This is what I have tried doing so far:
SELECT MAX(DEMAND_DATE) AS DEMAND_DATE, INV_CART_ID, BUSINESS_UNIT,
INV_ITEM_ID, CART_COUNT_QTY
FROM PS_CART_CT_INF_INV A
WHERE A.INV_ITEM_ID = 1
AND A.BUSINESS_UNIT = '11MMS'
AND A.CART_COUNT_QTY <> 0
AND A.DEMAND_DATE IN (SELECT MAX (DEMAND_DATE) FROM PS_CART_CT_INF_INV B
WHERE A.INV_ITEM_ID = B.INV_ITEM_ID GROUP BY INV_CART_ID)
GROUP BY INV_CART_ID, BUSINESS_UNIT, INV_ITEM_ID, CART_COUNT_QTY
However, it doesn't return all INV_CART_ID values, and it retrieves the wrong row (wrong DEMAND_DATE).

Use ROW_NUMBER:
WITH cte AS (
SELECT BUSINESS_UNIT, INV_CART_ID, INV_ITEM_ID, CART_COUNT_QTY, DEMAND_DATE,
ROW_NUMBER() OVER (PARTITION BY INV_CART_ID ORDER BY DEMAND_DATE DESC) rn
FROM PS_CART_CT_INF_INV
WHERE
INV_ITEM_ID = 1 AND
BUSINESS_UNIT = '11MMS' AND
CART_COUNT_QTY <> 0
)
SELECT
BUSINESS_UNIT, INV_CART_ID, INV_ITEM_ID, CART_COUNT_QTY, DEMAND_DATE
FROM cte
WHERE rn = 1
ORDER BY DEMAND_DATE DESC;
If you don't want to use analytic functions, then I still would not use your current approach. Instead, I would join to a subquery, like this:
SELECT
t1.BUSINESS_UNIT,
t1.INV_CART_ID,
t1.INV_ITEM_ID,
t1.CART_COUNT_QTY,
t1.DEMAND_DATE
FROM PS_CART_CT_INF_INV t1
INNER JOIN
(
SELECT INV_CART_ID, MAX(DEMAND_DATE) AS MAX_DEMAND_DATE
FROM PS_CART_CT_INF_INV
WHERE INV_ITEM_ID = 1 AND BUSINESS_UNIT = '11MMS' AND CART_COUNT_QTY <> 0
GROUP BY INV_CART_ID
) t2
ON t1.INV_CART_ID = t2.INV_CART_ID AND t1.DEMAND_DATE = t2.MAX_DEMAND_DATE
WHERE
t1.INV_ITEM_ID = 1 AND
t1.BUSINESS_UNIT = '11MMS' AND
t1.CART_COUNT_QTY <> 0;
The issue with your current query, even once corrected, is that it uses a correlated subquery in the WHERE clause. Depending on the optimizer, a correlated subquery may be re-evaluated for each outer row, which can hurt performance, so it is worth avoiding when a join or a window function does the job.
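To see the ROW_NUMBER approach at work, here is a minimal sketch that uses Python's sqlite3 module as a stand-in database. This is an assumption: the question does not name its engine, and window functions need SQLite 3.25 or later. The sample rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite stands in for the real database (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE PS_CART_CT_INF_INV (
        BUSINESS_UNIT TEXT, INV_CART_ID TEXT, INV_ITEM_ID INTEGER,
        CART_COUNT_QTY REAL, DEMAND_DATE TEXT)
""")
# Made-up rows: two counts for cart 405 and two for cart 938.
conn.executemany(
    "INSERT INTO PS_CART_CT_INF_INV VALUES (?, ?, ?, ?, ?)",
    [
        ("11MMS", "405", 1, 3.0, "2018-05-01"),
        ("11MMS", "405", 1, 5.0, "2018-05-29"),
        ("11MMS", "938", 1, 9.0, "2018-04-30"),
        ("11MMS", "938", 1, 15.0, "2018-05-31"),
    ],
)

# Number the rows of each cart from newest to oldest, then keep row 1.
latest_rows = conn.execute("""
    WITH cte AS (
        SELECT INV_CART_ID, CART_COUNT_QTY, DEMAND_DATE,
               ROW_NUMBER() OVER (PARTITION BY INV_CART_ID
                                  ORDER BY DEMAND_DATE DESC) AS rn
        FROM PS_CART_CT_INF_INV
        WHERE INV_ITEM_ID = 1
          AND BUSINESS_UNIT = '11MMS'
          AND CART_COUNT_QTY <> 0
    )
    SELECT INV_CART_ID, CART_COUNT_QTY, DEMAND_DATE
    FROM cte
    WHERE rn = 1
    ORDER BY DEMAND_DATE DESC
""").fetchall()

for row in latest_rows:
    print(row)
```

Each cart appears once, carrying the quantity from the same row as its most recent DEMAND_DATE.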

I think you want:
SELECT BUSINESS_UNIT, INV_CART_ID, INV_ITEM_ID, CART_COUNT_QTY, DEMAND_DATE
FROM PS_CART_CT_INF_INV AS a
WHERE INV_ITEM_ID = 1 AND BUSINESS_UNIT = '11MMS' AND
CART_COUNT_QTY <> 0 AND
DEMAND_DATE = (SELECT MAX(b.DEMAND_DATE)
FROM PS_CART_CT_INF_INV as b
WHERE a.INV_CART_ID = b.INV_CART_ID
);
However, this would give you duplicate records when several rows for a cart share the maximum date. To avoid duplicates, you can use an identity column or primary key in the WHERE clause instead:
. . .
WHERE pk = (SELECT TOP (1) b.pk
FROM PS_CART_CT_INF_INV as b
WHERE a.INV_CART_ID = b.INV_CART_ID
ORDER BY b.DEMAND_DATE DESC
);
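The difference between the two filters can be checked with a small sketch. It assumes SQLite (via Python's sqlite3) as a stand-in, so LIMIT 1 replaces TOP (1); the table name cart_counts, the pk column, and the rows are hypothetical. The extra b.pk DESC in the ORDER BY is an added tie-breaker that makes the chosen row deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE cart_counts (
        pk INTEGER PRIMARY KEY, INV_CART_ID TEXT,
        CART_COUNT_QTY REAL, DEMAND_DATE TEXT)
""")
conn.executemany(
    "INSERT INTO cart_counts VALUES (?, ?, ?, ?)",
    [
        (1, "405", 5.0, "2018-05-29"),
        (2, "405", 7.0, "2018-05-29"),  # same max date as pk 1: a tie
        (3, "938", 15.0, "2018-05-31"),
    ],
)

# Filtering on the max date keeps every row that shares that date.
by_max_date = conn.execute("""
    SELECT a.pk, a.INV_CART_ID
    FROM cart_counts AS a
    WHERE a.DEMAND_DATE = (SELECT MAX(b.DEMAND_DATE)
                           FROM cart_counts AS b
                           WHERE b.INV_CART_ID = a.INV_CART_ID)
    ORDER BY a.pk
""").fetchall()

# Filtering on a unique key keeps exactly one row per cart.
by_pk = conn.execute("""
    SELECT a.pk, a.INV_CART_ID
    FROM cart_counts AS a
    WHERE a.pk = (SELECT b.pk
                  FROM cart_counts AS b
                  WHERE b.INV_CART_ID = a.INV_CART_ID
                  ORDER BY b.DEMAND_DATE DESC, b.pk DESC
                  LIMIT 1)
    ORDER BY a.pk
""").fetchall()

print(by_max_date)
print(by_pk)
```

Cart 405 comes back twice from the first query and once from the second.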

Related

How to rewrite nested subqueries so that hive can run them

select
cd_gender,
cd_marital_status,
cd_education_status,
count(*) cnt1,
cd_purchase_estimate,
count(*) cnt2,
cd_credit_rating,
count(*) cnt3,
cd_dep_count,
count(*) cnt4,
cd_dep_employed_count,
count(*) cnt5,
cd_dep_college_count,
count(*) cnt6
from
customer c,customer_address ca,customer_demographics
where
c.c_current_addr_sk = ca.ca_address_sk and
ca_county in ('Greer County','Boone County','Cumberland County','Tyler County','Marion County') and
cd_demo_sk = c.c_current_cdemo_sk and
exists (select *
from store_sales,date_dim
where c.c_customer_sk = ss_customer_sk and
ss_sold_date_sk = d_date_sk and
d_year = 1999 and
d_moy between 1 and 1+3) and
(exists (select *
from web_sales,date_dim
where c.c_customer_sk = ws_bill_customer_sk and
ws_sold_date_sk = d_date_sk and
d_year = 1999 and
d_moy between 1 ANd 1+3) or
exists (select *
from catalog_sales,date_dim
where c.c_customer_sk = cs_ship_customer_sk and
cs_sold_date_sk = d_date_sk and
d_year = 1999 and
d_moy between 1 and 1+3))
group by cd_gender,
cd_marital_status,
cd_education_status,
cd_purchase_estimate,
cd_credit_rating,
cd_dep_count,
cd_dep_employed_count,
cd_dep_college_count
order by cd_gender,
cd_marital_status,
cd_education_status,
cd_purchase_estimate,
cd_credit_rating,
cd_dep_count,
cd_dep_employed_count,
cd_dep_college_count
limit 100;
When I run this query on Hive, it returns this error:
"FAILED: SemanticException [Error 10249]: org.apache.hadoop.hive.ql.optimizer.calcite.CalciteSubquerySemanticException: Line 23:2 Unsupported SubQuery Expression '3': Only SubQuery expressions that are top level conjuncts are allowed"
This error occurs because of the second EXISTS condition, which contains a nested subquery.
Any ideas on how I can rewrite this query so that it works on Hive?

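You might try reordering the conditions so that the OR-ed EXISTS tests are no longer wrapped in their own parentheses, as below. Be aware that conditions are not simply evaluated in written order: AND binds more tightly than OR in SQL, so this version accepts rows that pass the web_sales test alone, or rows that pass the catalog_sales test together with all of the remaining conditions. That is not equivalent to the original query.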
where
exists (select *
from web_sales,date_dim
where c.c_customer_sk = ws_bill_customer_sk and
ws_sold_date_sk = d_date_sk and
d_year = 1999 and
d_moy between 1 ANd 1+3) or
exists (select *
from catalog_sales,date_dim
where c.c_customer_sk = cs_ship_customer_sk and
cs_sold_date_sk = d_date_sk and
d_year = 1999 and
d_moy between 1 and 1+3) and
c.c_current_addr_sk = ca.ca_address_sk and
ca_county in ('Greer County','Boone County','Cumberland County','Tyler County','Marion County') and
cd_demo_sk = c.c_current_cdemo_sk and
exists (select *
from store_sales,date_dim
where c.c_customer_sk = ss_customer_sk and
ss_sold_date_sk = d_date_sk and
d_year = 1999 and
d_moy between 1 and 1+3)
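Another way to express "bought in store AND (bought on the web OR from the catalog)" without OR-ed EXISTS subqueries is to join against de-duplicated key sets and test for NULL. The sketch below is only an illustration: it runs on SQLite through Python's sqlite3, drops date_dim and the demographic tables for brevity, uses made-up rows, and has not been checked against Hive.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (c_customer_sk INTEGER);
    CREATE TABLE store_sales (ss_customer_sk INTEGER);
    CREATE TABLE web_sales (ws_bill_customer_sk INTEGER);
    CREATE TABLE catalog_sales (cs_ship_customer_sk INTEGER);
    INSERT INTO customer VALUES (1), (2), (3), (4);
    INSERT INTO store_sales VALUES (1), (1), (2), (3);
    INSERT INTO web_sales VALUES (1), (4);
    INSERT INTO catalog_sales VALUES (2);
""")

# Original shape: one EXISTS AND-ed with a parenthesized OR of two EXISTS.
with_exists = [r[0] for r in conn.execute("""
    SELECT c.c_customer_sk
    FROM customer c
    WHERE EXISTS (SELECT 1 FROM store_sales
                  WHERE ss_customer_sk = c.c_customer_sk)
      AND (EXISTS (SELECT 1 FROM web_sales
                   WHERE ws_bill_customer_sk = c.c_customer_sk)
           OR EXISTS (SELECT 1 FROM catalog_sales
                      WHERE cs_ship_customer_sk = c.c_customer_sk))
    ORDER BY c.c_customer_sk
""")]

# Join-based shape: DISTINCT keeps one row per customer, the inner join
# plays the role of the AND-ed EXISTS, and the NULL checks replace the OR.
with_joins = [r[0] for r in conn.execute("""
    SELECT c.c_customer_sk
    FROM customer c
    JOIN (SELECT DISTINCT ss_customer_sk FROM store_sales) s
      ON s.ss_customer_sk = c.c_customer_sk
    LEFT JOIN (SELECT DISTINCT ws_bill_customer_sk FROM web_sales) w
      ON w.ws_bill_customer_sk = c.c_customer_sk
    LEFT JOIN (SELECT DISTINCT cs_ship_customer_sk FROM catalog_sales) k
      ON k.cs_ship_customer_sk = c.c_customer_sk
    WHERE w.ws_bill_customer_sk IS NOT NULL
       OR k.cs_ship_customer_sk IS NOT NULL
    ORDER BY c.c_customer_sk
""")]

print(with_exists, with_joins)
```

In the full query, the date filters would move inside each derived table, next to the join with date_dim.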

Select from left join with multiple cte blocks

The odr CTE counts rows per num where code is 1, and the adv CTE does the same where code is 2. The comm and medi CTEs do the same where code is 3 and where code is 4 or 5, respectively. The total CTE counts all rows per num regardless of the code.
with odr (num, odr_count) as
(
SELECT num, count(*)
FROM pay t
where
code = '1'
group by num
),
adv (num, adv_count) as
(
SELECT num, count(*)
FROM pay t
where
code = '2'
group by num
),comm (num, comm_count) as
(
SELECT num, count(*)
FROM pay t
where
code = '3'
group by num
),medi (num, medi_count) as
(
SELECT num, count(*)
FROM pay t
where
code = '4'
or
code = '5'
group by num
),
total (num, tot_count) as
(
SELECT num, count(*)
FROM pay t
group by num
)
select t.num, tot_count, o.num,
o.odr_count, adv.num, adv_count,
c.num, comm_count,
medi.num, medi_count
FROM total t
left join odr o
on t.num = o.num
left join adv
on o.num = adv.num
left join comm c
on medicaid.num = c.num
left join medi
on c.num = medi.num
This is the output of this query:
Num tot_count num odr_count adv.num adv_count c.num comm_count medi.num medi_count
14476 10
15082 156
Why do all the other columns not have data?
I would expect a result like this
Num tot_count num odr_count adv.num adv_count c.num comm_count medi.num medi_count
14476 10 14476 35345 14476 234 14476 343246 14476 8
15082 156 15082 4354 NULL NULL 15082 3432
I am expecting NULL and empty columns in the second row because 15082 does not have all the codes from 1 through 4.
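One thing that stands out in the query is that each join keys off the previous CTE (o.num, then c.num) rather than off total, so a NULL in one CTE propagates to every join after it. The sketch below compares that chained pattern with joining every CTE back to total. It is only an illustration under assumptions: SQLite via Python's sqlite3, made-up rows, three codes instead of five, and it may not be the only problem in the original query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pay (num INTEGER, code TEXT)")
conn.executemany(
    "INSERT INTO pay VALUES (?, ?)",
    [
        (14476, "1"), (14476, "2"), (14476, "3"),
        (15082, "1"), (15082, "3"),  # 15082 has no code '2' rows
    ],
)

ctes = """
    WITH odr (num, odr_count) AS
           (SELECT num, COUNT(*) FROM pay WHERE code = '1' GROUP BY num),
         adv (num, adv_count) AS
           (SELECT num, COUNT(*) FROM pay WHERE code = '2' GROUP BY num),
         comm (num, comm_count) AS
           (SELECT num, COUNT(*) FROM pay WHERE code = '3' GROUP BY num),
         total (num, tot_count) AS
           (SELECT num, COUNT(*) FROM pay GROUP BY num)
"""

# Chained: each join depends on the CTE joined just before it.
chained = conn.execute(ctes + """
    SELECT t.num, tot_count, odr_count, adv_count, comm_count
    FROM total t
    LEFT JOIN odr o ON t.num = o.num
    LEFT JOIN adv ON o.num = adv.num
    LEFT JOIN comm c ON adv.num = c.num
    ORDER BY t.num
""").fetchall()

# Hub: every CTE is joined directly to total.
hub = conn.execute(ctes + """
    SELECT t.num, tot_count, odr_count, adv_count, comm_count
    FROM total t
    LEFT JOIN odr o ON t.num = o.num
    LEFT JOIN adv ON t.num = adv.num
    LEFT JOIN comm c ON t.num = c.num
    ORDER BY t.num
""").fetchall()

print(chained)
print(hub)
```

With the chained joins, 15082 loses its comm_count because adv has no row for it; with the joins keyed on total, only adv_count is NULL.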

how to use case with group by?

The query works well, but when I add GROUP BY it gives me [Error] ORA-01427. Here is the main query:
SELECT DISTINCT Contract_number,
area_number,
area_name,
ADVANCE_PAY,
Postponed_Amount,
extract_number,
total
FROM (SELECT xxr.Contract_num Contract_number,
xxr.p_area_no area_number,
xxr.p_area_name area_name,
xxr.ADVANCE_PAY ADVANCE_PAY,
xxr.DEFERRED_BOOST Postponed_Amount,
xxr.release_num extract_number,
And here is the CASE expression:
(SELECT DISTINCT
CASE
WHEN :p_item_code IS NOT NULL
THEN
TOTAL_AMOUNT
WHEN :p_item_code IS NULL
THEN
( (SELECT NVL (SUM (TOTAL_AMOUNT), 0)
FROM XXEXTRACT.XXNATGAS_REALSES_LINES
WHERE XXEXTRACT.XXNATGAS_REALSES.release_id =
XXEXTRACT.XXNATGAS_REALSES_LINES.release_id))
ELSE
NULL
END
FROM XXEXTRACT.XXNATGAS_REALSES_LINES xxrl,
XXEXTRACT.XXNATGAS_REALSES
WHERE 1 = 1
AND xxrl.release_id =
XXEXTRACT.XXNATGAS_REALSES.release_id)
AS total
And here is the FROM part:
FROM XXEXTRACT.XXNATGAS_REALSES_LINES xxrl,
XXEXTRACT.XXNATGAS_REALSES xxr
WHERE 1 = 1
AND xxrl.release_id = xxr.release_id
AND xxr.release_date >= NVL (:p_date_from, xxr.release_date)
AND xxr.release_date <= NVL (:p_date_to, xxr.release_date)
AND xxr.Contract_num = NVL (:p_cont_num, xxr.Contract_num)
AND xxr.vendor_id = NVL (:p_ven_id, xxr.vendor_id)
AND xxr.vendor_site_id = NVL (:p_site_id, xxr.vendor_site_id)
)
And here is the GROUP BY:
GROUP BY Contract_number,
area_number,
area_name,
ADVANCE_PAY,
Postponed_Amount,
extract_number,
total;
That is the full query. Any help would be appreciated.

To be honest, I could not fully understand your query; the post could be clearer next time.
As an answer, I think you should wrap your SELECT statement in a subquery and apply the GROUP BY to that subquery. It is not the best approach, but it may work.
select *
from (
select distinct Contract_number
,area_number
,area_name
,ADVANCE_PAY
,Postponed_Amount
,extract_number
,total
from (
select xxr.Contract_num Contract_number
,xxr.p_area_no area_number
,xxr.p_area_name area_name
,xxr.ADVANCE_PAY ADVANCE_PAY
,xxr.DEFERRED_BOOST Postponed_Amount
,xxr.release_num extract_number
,(
select distinct case
when :p_item_code is not null
then TOTAL_AMOUNT
when :p_item_code is null
then (
(
select NVL(SUM(TOTAL_AMOUNT), 0)
from XXEXTRACT.XXNATGAS_REALSES_LINES
where XXEXTRACT.XXNATGAS_REALSES.release_id = XXEXTRACT.XXNATGAS_REALSES_LINES.release_id
)
)
else null
end
from XXEXTRACT.XXNATGAS_REALSES_LINES xxrl
,XXEXTRACT.XXNATGAS_REALSES
where 1 = 1
and xxrl.release_id = XXEXTRACT.XXNATGAS_REALSES.release_id
) as total
from XXEXTRACT.XXNATGAS_REALSES_LINES xxrl
,XXEXTRACT.XXNATGAS_REALSES xxr
where 1 = 1
and xxrl.release_id = xxr.release_id
and xxr.release_date >= NVL(:p_date_from, xxr.release_date)
and xxr.release_date <= NVL(:p_date_to, xxr.release_date)
and xxr.Contract_num = NVL(:p_cont_num, xxr.Contract_num)
and xxr.vendor_id = NVL(:p_ven_id, xxr.vendor_id)
and xxr.vendor_site_id = NVL(:p_site_id, xxr.vendor_site_id)
)
) TBL1
group by TBL1.Contract_number
,TBL1.area_number
,TBL1.area_name
,TBL1.ADVANCE_PAY
,TBL1.Postponed_Amount
,TBL1.extract_number
,TBL1.total;

I need to pick highest NPA_TIME_ZONE_COUNT

I am using this query in Oracle.
SELECT /*+parallel (reject,4) */
distinct n.rowid as npanxx_row_id, r.rating_orignum_used, n.npa, n.nxx, npanxx_effdate, n.line_range_from_number, n.line_range_to_number, n.city, n.state, n.country, n.country_code, n.ocn, n.lata, n.clli_code, n.stepcode, n.juris, n.time_zone as current_time_zone--, x.time_zone as npanxx_timezone, x2.time_zone as npa_timezone, case when x.time_zone >= '1' then x.time_zone else x2.time_zone end new_time_zone, count(x2.time_zone) as npa_time_zone_count
from npanxx n
left join npanxx x
on n.npa = x.npa and (substr(n.nxx, 1,1) = substr(x.nxx,1,1))
and x.time_zone is not null and x.time_zone <> '0'
left join npanxx x2
on n.npa = x2.npa
and x2.time_zone is not null and x2.time_zone <> '0'
inner join reject r
on substr(r.rating_orignum_used,1,3) = n.npa and substr(r.rating_orignum_used,4,3) = n.nxx and substr(r.rating_orignum_used, 7,1) = substr(n.line_range_from_number,1,1)
where
n.npanxx_effdate = (select max(sub.npanxx_effdate) from npanxx sub where n.npa=sub.npa and n.nxx = sub.nxx and n.line_range_from_number = sub.line_range_from_number)
and r.carrier = 'LEVEL3' and r.error_code = '309' and r.rowid in ('AAQBSyAKKAABZ7yAAJ')and trunc(r.processdate) >= trunc(sysdate-90)
group by n.rowid, r.rating_orignum_used, n.npa, n.nxx, n.npanxx_effdate, n.line_range_from_number, n.line_range_to_number, n.city, n.state, n.country, n.country_code, n.ocn, n.lata, n.clli_code, n.stepcode, n.juris, n.time_zone, x.time_zone , x2.time_zone
By running this query I get the result
NPANXX_ROW_ID ..... npa_time_zone_count
AABWcFABmAAAxMrAAy 3780
AABWcFABmAAAxMrAAy 10
and I need one row with the highest count, so that it comes out as
NPANXX_ROW_ID ..... npa_time_zone_count
AABWcFABmAAAxMrAAy 3780
I tried a HAVING clause, but it just gives me an error:
ORA-01427: single-row subquery returns more than one row
HAVING
COUNT(*) = (
SELECT
MAX(count(x2.time_zone))
FROM
npanxx inner
WHERE
inner.time_zone IS NOT NULL AND
inner.time_zone <> 0 AND
npa = inner.npa
and x2.NXX = INNER.NXX
GROUP BY
inner.state,
inner.country,
inner.time_zone)

You can use an analytic function such as ROW_NUMBER or DENSE_RANK to number the rows and pick the highest, in this way:
WITH subquery AS
(
/* your complex query goes here */
SELECT 'AABWcFABmAAAxMrAAy' as NPANXX_ROW_ID, '....' As "....", 3780 As npa_time_zone_count
FROM dual
Union All
SELECT 'AABWcFABmAAAxMrAAy' as NPANXX_ROW_ID, '....' As "....", 10 As npa_time_zone_count
FROM dual
)
SELECT *
FROM (
SELECT t.*,
row_number() over (partition by NPANXX_ROW_ID
ORDER BY npa_time_zone_count DESC ) rn
FROM subquery t
)
WHERE rn = 1;
======================================================
NPANXX_ROW_ID .... NPA_TIME_ZONE_COUNT RN
------------------ ---- ------------------- ----------
AABWcFABmAAAxMrAAy .... 3780 1
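The two functions mentioned above differ when counts tie: ROW_NUMBER keeps exactly one row per group, while DENSE_RANK keeps every row that shares the top count. Here is a small sketch of that difference, assuming SQLite (3.25 or later) through Python's sqlite3 as a stand-in for Oracle; the table name tz_counts and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE tz_counts (
        npanxx_row_id TEXT, time_zone TEXT, npa_time_zone_count INTEGER)
""")
conn.executemany(
    "INSERT INTO tz_counts VALUES (?, ?, ?)",
    [
        ("A", "5", 3780), ("A", "6", 10),  # clear winner for A
        ("B", "7", 50), ("B", "8", 50),    # tie for B
    ],
)

def top_rows(window_function):
    # window_function is either ROW_NUMBER or DENSE_RANK.
    sql = f"""
        SELECT npanxx_row_id, npa_time_zone_count
        FROM (
            SELECT t.*,
                   {window_function}() OVER (
                       PARTITION BY npanxx_row_id
                       ORDER BY npa_time_zone_count DESC) AS rn
            FROM tz_counts t
        )
        WHERE rn = 1
        ORDER BY npanxx_row_id
    """
    return conn.execute(sql).fetchall()

with_row_number = top_rows("ROW_NUMBER")
with_dense_rank = top_rows("DENSE_RANK")

print(with_row_number)
print(with_dense_rank)
```

If exactly one row per NPANXX_ROW_ID is required, ROW_NUMBER is the safer choice; adding a second ORDER BY column makes the pick among tied rows deterministic.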

How to get alternate row using SQL query?

I have a requirement to produce a result that returns 1 and 0 alternately.
SELECT *
FROM
(SELECT
id
,itemNo
,convert(int,tStationsType_id) as tStationsType_id
,tSpecSetpoint_descriptions_id
,SetpointValue
,rowEvenOdd
FROM
TEST S
INNER JOIN
(SELECT
itemNo AS IM, tStationsType_id as ST,
ROW_NUMBER() OVER(PARTITION BY itemNo ORDER BY itemNo) % 2 AS rowEvenOdd
FROM TEST
GROUP BY itemNo, tStationsType_id) A ON S.itemNo = A.IM
AND S.tStationsType_id = A.ST) t
WHERE
itemno = '1000911752202'
ORDER BY
tStationsType_id
The result I get is something like below.
I would like rowEvenOdd to alternate between 1 and 0. However, I noticed that it does not alternate when I order by tStationsType_id.
So basically, what I want is when the
StationsType_id = 2, then rowEvenOdd = 0
StationsType_id = 3, then rowEvenOdd = 1
StationsType_id = 6, then rowEvenOdd = 0
StationsType_id = 8, then rowEvenOdd = 1
StationsType_id = 10, then rowEvenOdd = 0
Can someone help me with this query?
Thanks.

If you just need alternating 0 and 1 in the result set, use SEQUENCE like this:
CREATE SEQUENCE EvenOdd
START WITH 0
INCREMENT BY 1
MAXVALUE 1
MINVALUE 0
CYCLE;
GO
SELECT SalesId, NEXT VALUE FOR EvenOdd as EvenOddColumn FROM Sales
DROP SEQUENCE EvenOdd
To learn more, go to the MSDN page on sequences here: https://msdn.microsoft.com/en-us/library/ff878091.aspx

You can use CASE WHEN ... THEN:
SELECT
case
when exists (SELECT 1 FROM StationsTypeMaster
M WHERE M.StationsType_id=A.StationsType_id)
then 1
else 0 end as rowEvenOdd
,*
FROM
(
select id
,itemNo
,convert(int,tStationsType_id) as tStationsType_id
,tSpecSetpoint_descriptions_id
,SetpointValue
,rowEvenOdd from TEST S
inner join
(select itemNo AS IM,tStationsType_id as ST,
ROW_NUMBER() OVER(PARTITION BY itemNo ORDER BY itemNo)%2 AS rowEvenOdd
from TEST
group by itemNo,tStationsType_id
)A
on S.itemNo = A.IM
and S.tStationsType_id = A.ST) t
where itemno = '1000911752202'
order by tStationsType_id

Use DENSE_RANK instead of ROW_NUMBER:
SELECT * FROM
(
SELECT
id,
itemNo,
convert(int,tStationsType_id) as tStationsType_id,
tSpecSetpoint_descriptions_id,
SetpointValue,
(DENSE_RANK() OVER(ORDER BY CAST(A.ST as INT)) - 1)%2 AS rowEvenOdd
FROM
TEST S
JOIN
(
SELECT
itemNo AS IM,
tStationsType_id as ST
FROM
TEST
GROUP BY
itemNo,tStationsType_id
)A
ON
S.itemNo = A.IM
and S.tStationsType_id = A.ST
) t
WHERE
itemno = '1000911752202'
ORDER BY
tStationsType_id
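The parity trick can be tried on a few rows. In this sketch, DENSE_RANK gives every distinct station type the next rank, so (rank - 1) % 2 flips between 0 and 1 per station type rather than per row. It assumes SQLite (3.25 or later) via Python's sqlite3 instead of SQL Server, uses made-up rows with only three of the columns, and adds PARTITION BY itemNo so that station types belonging to other items cannot shift the ranks; that partition is an addition, not part of the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE TEST (id INTEGER, itemNo TEXT, tStationsType_id TEXT)
""")
conn.executemany(
    "INSERT INTO TEST VALUES (?, ?, ?)",
    [
        (1, "1000911752202", "2"), (2, "1000911752202", "2"),
        (3, "1000911752202", "3"), (4, "1000911752202", "6"),
        (5, "1000911752202", "8"), (6, "1000911752202", "8"),
        (7, "1000911752202", "10"),
        (8, "OTHER_ITEM", "4"),  # a different item, ranked separately
    ],
)

# CAST makes '10' sort after '8' instead of before '2'.
even_odd = conn.execute("""
    SELECT CAST(tStationsType_id AS INTEGER) AS station,
           (DENSE_RANK() OVER (
                PARTITION BY itemNo
                ORDER BY CAST(tStationsType_id AS INTEGER)) - 1) % 2
               AS rowEvenOdd
    FROM TEST
    WHERE itemNo = '1000911752202'
    ORDER BY station, id
""").fetchall()

for row in even_odd:
    print(row)
```

Rows that share a station type get the same flag, which matches the mapping requested in the question (2 gives 0, 3 gives 1, 6 gives 0, 8 gives 1, 10 gives 0).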
