I have a table with details of sold cars. Some of these cars have been resold within the last 1, 2 or 3 years. The table looks like this:
Car_Type || Car_Reg_No || Sold_Date || Listing_No
Hatch || 23789 || 2017-02-03 11:26 || X6529
Coupe || 16723 || 2016-11-07 09:40 || N8156
Sedan || 35216 || 2016-05-23 10:34 || M8164
Hatch || 23789 || 2016-09-16 04:30 || O7361
Now I need to query the cars that were resold within 1 year of their latest sold date, along with how many times they were sold. My output would look like this:
Car_Type || Car_Reg_No || Sold_Count || Latest_Sold_Date
Hatch || 23789 || 2 || 2017-02-03 11:26
In essence, how do I check for resold records within a specific time frame of their latest sold date?
You can do this by finding the max, and joining based on your conditions.
create table #TableA (Car_Type varchar(64)
,Car_Reg_No int
,Sold_Date datetime
,Listing_No varchar(6))
insert into #TableA
values
('Hatch',23789,'2017-02-03 11:26','X6529'),
('Coupe',16723,'2017-11-07 09:40','N8156'),
('Sedan',35216,'2017-05-23 10:34','M8164'),
('Hatch',23789,'2016-09-16 04:30','O7361'),
('Coupe',16723,'2014-11-07 09:40','N8156')
;with cte as(
select
Car_Type
,Car_Reg_No
,Latest_Sold_Date = max(Sold_Date)
from
#TableA
group by
Car_Type
,Car_Reg_No)
select
a.Car_Type
,a.Car_Reg_No
,Sold_Count = count(b.Listing_No) + 1
,a.Latest_Sold_Date
from cte a
inner join
#TableA b on
b.Car_Reg_No = a.Car_Reg_No
and b.Sold_Date != a.Latest_Sold_Date
and datediff(day,b.Sold_Date,a.Latest_Sold_Date) < 366
--if you want only cars which were sold within last year too, uncomment this
--and datediff(day,a.Latest_Sold_Date,getdate()) < 366
group by
a.Car_Type
,a.Car_Reg_No
,a.Latest_Sold_Date
As I understand it, you can rank each car's sales and join the latest sale back to the table:
select sd1.Car_Type, sd1.Car_Reg_No,
count(sd1.Car_Reg_No) + 1 'no of sales in last one year', --1 is added because, see the last condition
sd1.Sold_Date 'Last sold date'
from(
select *,ROW_NUMBER() over(partition by Car_Reg_No order by sold_date desc) as rn from #Table) as sd1
join
(select * from #Table) as sd2
on sd1.Car_Reg_No = sd2.Car_Reg_No -- match the same car, not merely the same car type
and DATEDIFF(dd,sd2.Sold_Date,sd1.Sold_Date) < 366
and sd1.rn = 1
and sd1.Sold_Date <> sd2.Sold_Date -- here last sold is eliminated. so count is added by one.
group by sd1.Car_Type,sd1.Sold_Date, sd1.Car_Reg_No
order by sd1.Car_Reg_No
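For readers who want to sanity-check the logic outside the database, here is a minimal Python sketch of the same idea: find each car's latest sale, then count the sales that fall within a window before it. The function name `resold_within` and its `window` parameter are made up for illustration, the rows are the sample rows from the question, and the 365-day window only approximates the `datediff(day, ...) < 366` test used in the SQL.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Sample rows from the question: (Car_Type, Car_Reg_No, Sold_Date, Listing_No)
sales = [
    ("Hatch", 23789, "2017-02-03 11:26", "X6529"),
    ("Coupe", 16723, "2016-11-07 09:40", "N8156"),
    ("Sedan", 35216, "2016-05-23 10:34", "M8164"),
    ("Hatch", 23789, "2016-09-16 04:30", "O7361"),
]


def resold_within(rows, window=timedelta(days=365)):
    """Return (type, reg_no, sold_count, latest_sold_date) for every car
    sold more than once within `window` of its latest sale."""
    by_car = defaultdict(list)
    for car_type, reg_no, sold_date, _listing in rows:
        parsed = datetime.strptime(sold_date, "%Y-%m-%d %H:%M")
        by_car[(car_type, reg_no)].append(parsed)

    result = []
    for (car_type, reg_no), dates in by_car.items():
        latest = max(dates)
        # Count every sale inside the window, including the latest one.
        count = sum(1 for d in dates if latest - d <= window)
        if count > 1:
            result.append(
                (car_type, reg_no, count, latest.strftime("%Y-%m-%d %H:%M"))
            )
    return result


print(resold_within(sales))
```

With the question's data only the Hatch qualifies, because its two sales are less than a year apart.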
My goal is to choose, for each PTID, the row with the most recent date and time. I tried to use the MAX() function for this but received a syntax error (see the double asterisks on line 9 of my code). Is there a statement to do that, or would it be easier to do it in Python? I appreciate all the help!
Table 1
PTID  RESULT_DATE1  RESULT_TIME  DIAGNOSIS_CD
54    2020-01-06    10:03:02     W34
54    2020-01-01    09:18:05     S38
54    2020-01-01    03:08:45     V98
54    2020-04-04    02:09:08     V98
54    2020-04-04    12:12:34     V89
My Goal:
PTID  RESULT_DATE1  RESULT_TIME  DIAGNOSIS_CD
54    2020-04-04    12:12:34     V98
54    2020-01-06    10:03:02     W34
54    2020-01-01    09:18:05     S38
My Code:
CREATE TABLE covid AS
SELECT t1.*, t2.*
FROM lab9 t1 JOIN diagnosis9 t2 ON t2.PTID = t1.PTID
AND t1.RESULT_DATE1 || ' ' || t1.RESULT_TIME
BETWEEN
date(t2.diagdate1, '-7 day') || ' ' || t2.DIAG_TIME
AND
t2.diagdate1 || ' ' || t2.DIAG_TIME
**WHERE RESULT_DATE1 = MAX(RESULT_DATE1)**
GROUP BY t1.PTID || DIAGNOSIS_CD
ORDER BY t1.PTID;
First, you should not group by the concatenation of 2 columns because this may lead to unexpected results.
You should group by the 2 columns.
Also, you can't use an aggregate function like MAX() in the WHERE clause of a query.
What you need is the max value of the expression t1.RESULT_DATE1 || ' ' || t1.RESULT_TIME which you can finally split to date and time with the functions date() and time():
CREATE TABLE covid AS
SELECT t1.PTID,
date(MAX(t1.RESULT_DATE1 || ' ' || t1.RESULT_TIME)) RESULT_DATE1,
time(MAX(t1.RESULT_DATE1 || ' ' || t1.RESULT_TIME)) RESULT_TIME,
t2.*
FROM lab9 t1 JOIN diagnosis9 t2
ON t2.PTID = t1.PTID
AND t1.RESULT_DATE1 || ' ' || t1.RESULT_TIME
BETWEEN
date(t2.diagdate1, '-7 day') || ' ' || t2.DIAG_TIME AND t2.diagdate1 || ' ' || t2.DIAG_TIME
GROUP BY t1.PTID, t2.DIAGNOSIS_CD
ORDER BY t1.PTID;
The above query will return the rows with the max datetime for each combination of PTID and DIAGNOSIS_CD, using SQLite's bare-columns feature.
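Since the question asks about Python, here is a small sketch that runs the core of this approach through Python's built-in sqlite3 module. It is deliberately simplified and rests on assumptions: it uses a single in-memory table holding the rows from "Table 1" (the join to diagnosis9 is left out), and the aliases `latest_date` and `latest_time` are names chosen for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE lab9 (PTID INTEGER, RESULT_DATE1 TEXT, "
    "RESULT_TIME TEXT, DIAGNOSIS_CD TEXT)"
)
# Rows copied from "Table 1" in the question.
conn.executemany(
    "INSERT INTO lab9 VALUES (?, ?, ?, ?)",
    [
        (54, "2020-01-06", "10:03:02", "W34"),
        (54, "2020-01-01", "09:18:05", "S38"),
        (54, "2020-01-01", "03:08:45", "V98"),
        (54, "2020-04-04", "02:09:08", "V98"),
        (54, "2020-04-04", "12:12:34", "V89"),
    ],
)

# ISO-formatted 'YYYY-MM-DD HH:MM:SS' strings sort chronologically, so MAX()
# over the concatenation picks the latest datetime in each group.
query = """
SELECT PTID,
       DIAGNOSIS_CD,
       date(MAX(RESULT_DATE1 || ' ' || RESULT_TIME)) AS latest_date,
       time(MAX(RESULT_DATE1 || ' ' || RESULT_TIME)) AS latest_time
FROM lab9
GROUP BY PTID, DIAGNOSIS_CD
ORDER BY latest_date DESC, latest_time DESC
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Each (PTID, DIAGNOSIS_CD) pair yields one row, and the aggregate is placed in the SELECT list rather than in WHERE.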
In the data set, every shop is selling some books and every shop has its own price for each book. In the data, I have the price information for each book. With the query in Amazon Athena, I want to calculate the median price for each shop and each product in a specific time period.
But honestly, I have no idea how to do it. Here is my query so far:
SELECT product_id,
shop_id,
XXX AS median_price
FROM data_f
WHERE site_id = 10
AND year || month || day || hour >= '2020022500'
AND year || month || day || hour < '2020022600'
GROUP BY product_id, shop_id
Thanks!
Unfortunately, AWS doesn't support a median() aggregation function or the percentile() functions. Perhaps the simplest method is to use ntile(2) in a subquery and then take the maximum of the first tile (or the minimum of the second tile):
SELECT product_id, shop_id,
MAX(CASE WHEN tile2 = 1 THEN price END) as median
FROM (SELECT d.*, NTILE(2) OVER (PARTITION BY product_id, shop_id ORDER BY price) as tile2
FROM data_f d
WHERE site_id = 10 AND
action NOT IN ('base', 'delete') AND
year || month || day || hour >= '2020022500' AND
year || month || day || hour < '2020022600'
) d
GROUP BY product_id, shop_id;
Note: This is undoubtedly good enough for any practical purpose. However, "median" is usually defined as the average of the two middle values when the total number of rows is even. If you want to be pedantic:
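SELECT product_id, shop_id,
(CASE WHEN COUNT(*) % 2 = 0
THEN (MAX(CASE WHEN tile2 = 1 THEN price END) +
MIN(CASE WHEN tile2 = 2 THEN price END)
) / 2.0
ELSE MAX(CASE WHEN tile2 = 1 THEN price END)
END) as median
FROM (SELECT d.*, NTILE(2) OVER (PARTITION BY product_id, shop_id ORDER BY price) as tile2
FROM data_f d
WHERE site_id = 10 AND action NOT IN ('base', 'delete') AND
year || month || day || hour >= '2020022500' AND year || month || day || hour < '2020022600'
) d
GROUP BY product_id, shop_id;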
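To see why taking the maximum of the first tile works, here is a rough Python sketch that mimics NTILE(2) on a plain list of prices. The function `ntile2_median` and its `exact` flag are hypothetical names for this illustration; the one fact it relies on is that NTILE gives the extra row to the first tile when the row count is odd.

```python
def ntile2_median(prices, exact=False):
    """Median computed the way the NTILE(2) queries do it.

    With exact=False this returns the largest value of the first tile
    (the lower median for even counts). With exact=True it averages the
    two middle values when the count is even.
    """
    ordered = sorted(prices)
    n = len(ordered)
    if n == 0:
        return None
    # NTILE(2) puts the extra row in the first tile when n is odd.
    first_tile_size = (n + 1) // 2
    tile1 = ordered[:first_tile_size]
    tile2 = ordered[first_tile_size:]
    if exact and n % 2 == 0:
        return (max(tile1) + min(tile2)) / 2.0
    return max(tile1)


print(ntile2_median([3, 4, 2, 6, 7]))
print(ntile2_median([2, 3, 4, 6, 7, 8]))
print(ntile2_median([2, 3, 4, 6, 7, 8], exact=True))
```

For an odd count both variants agree; they only differ when the count is even.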
The median value is the one in the middle when all are listed in order, so let's create that order with a dense_rank()
with q1 as
(
SELECT product_id,
shop_id,
price,
dense_rank() over (partition by product_id, shop_id order by price) as price_rank
FROM data_f
WHERE site_id = 10
AND action <> 'base'
AND action <> 'delete'
AND year || month || day || hour >= '2020022500'
AND year || month || day || hour < '2020022600'
)
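, q2 as
(
select product_id, shop_id, max(price_rank) as mp
from q1
group by product_id, shop_id
)
select q1.*
from q1
join q2 on q2.product_id = q1.product_id and q2.shop_id = q1.shop_id
where q1.price_rank = (q2.mp + 1) / 2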
Window functions are documented as part of the Presto functions documentation.
You can use approx_percentile
select approx_percentile(column_name, 0.5) from table
Solution from Philipp Johannis, in "Calculate Median for each group in AWS Athena table":
SELECT product_id,
shop_id,
approx_percentile(price, 0.5) AS median_price
FROM data_f
WHERE site_id = 10
AND year || month || day || hour >= '2020022500'
AND year || month || day || hour < '2020022600'
GROUP BY product_id, shop_id
The query below calculates the median:
with res1 as
(select id,ROW_NUMBER() over (order by id) "median_row_num" from test ),
res2 as
(select count(median_row_num) as i from res1)
select id as "median" from res1 where res1.median_row_num = (select case when i%2 = 0 then i/2 else i/2+1 end from res2)
Note: the median is the middle element of a sorted list of numbers.
If a = [3,4,2,6,7], the sorted list is [2,3,4,6,7].
The count of elements is 5, so the median is 4.
But if a = [2,3,4,6,7,8], the count of elements is 6, which is even, so there are two middle elements, 4 and 6.
The median is then 5, since (4 + 6) / 2 = 5.
So the query above is correct for odd counts; for even counts it always returns the lower of the two middle elements.
I am stuck with a query to fetch sum of transaction amount financial year wise.
My table is
TXN_DATE TXN_AMOUNT
12/01/2014 100
12/08/2014 200
12/01/2015 300
12/04/2015 400
12/04/2013 500
I want the result to be displayed as
FinYear Amount
2013-2014 600
2014-2015 500
2015-2016 400
The financial year runs from 1st April of one year to 31st March of the next.
Database: DB2
The date is stored as a timestamp in the database; the date format used in the example is dd/MM/yyyy.
Following query should do (tested in DB2 LUW):
with temp (finyear, txn_amount) as (
select case
when month(TXN_DATE) < 4 then (year(TXN_DATE)-1) || '-' || year(TXN_DATE)
else year(TXN_DATE) || '-' || (year(TXN_DATE)+1)
end as finyear,
txn_amount
from test4)
select finyear,
sum(txn_amount) as sum_amount
from temp
group by finyear;
SELECT FinYear, cast(sum(Amount) as decimal(16,2)) AS Amount FROM
(SELECT
CASE
WHEN TXN_DATE >= to_date(year(TXN_DATE) ||'-04-01','YYYY-MM-DD') THEN CONCAT(CONCAT(year(TXN_DATE),'-'),year(TXN_DATE)+1)
ELSE CONCAT(CONCAT(year(TXN_DATE)-1,'-'),year(TXN_DATE))
END as FinYear,
TXN_AMOUNT AS Amount
FROM tab1)
GROUP BY FinYear
You can use the following query:
SELECT FinYear, sum(Amount) AS Amount FROM
(SELECT FinYear =
CASE
WHEN TXN_DATE > char(year(TXN_DATE)) +'0331' THEN char(year(TXN_DATE))+'-'+char(year(TXN_DATE)+1)
WHEN TXN_DATE < char(year(TXN_DATE)) +'0401' THEN char(year(TXN_DATE)-1)+'-'+char(year(TXN_DATE))
END,
TXN_AMOUNT AS Amount
FROM DB2.Transactions) AS Table1
GROUP BY FinYear
Make sure you replace DB2.Transactions with your own DatabaseName.TableName.
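The rule all of these queries encode is small enough to check by hand. As a sketch, assuming the dd/MM/yyyy sample data from the question and an April-to-March financial year, the following Python reproduces the expected totals; `fin_year` is a helper name invented for this example.

```python
from collections import defaultdict
from datetime import datetime


def fin_year(d):
    """Label such as '2013-2014' for a financial year running
    from 1 April to 31 March."""
    start = d.year if d.month >= 4 else d.year - 1
    return f"{start}-{start + 1}"


# Sample rows from the question, dates in dd/MM/yyyy.
transactions = [
    ("12/01/2014", 100),
    ("12/08/2014", 200),
    ("12/01/2015", 300),
    ("12/04/2015", 400),
    ("12/04/2013", 500),
]

totals = defaultdict(int)
for txn_date, amount in transactions:
    parsed = datetime.strptime(txn_date, "%d/%m/%Y")
    totals[fin_year(parsed)] += amount

for year in sorted(totals):
    print(year, totals[year])
```

January to March dates belong to the financial year that started the previous April, which is the only branch the CASE expression needs.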
From multiple tables I'm currently selecting a product, value, and contract period. I want to group the results by product and shipment period while summing the value.
My contract period can either be arrival based or shipment based. So, currently to determine which contract period to use, I'm looking to see if one of the period descriptions is null then populating the period end and begin dates as either ship or arrival based on that. Specifically, I'm using the following.
DECODE((P.SHIP_PERIOD_DESCR), NULL,
'ARRIVE' || ' ' ||P.ARRIVAL_PERIOD_BEGIN || ' - ' || P.ARRIVAL_PERIOD_END,
'SHIP' || ' ' ||P.SHIP_PERIOD_BEGIN || ' - ' || P.SHIP_PERIOD_END)
My results are as such:
PRODUCT VALUE CONTRACT_PERIOD
APPLES $600 SHIP 01-FEB-16 - 15-MAR-16
APPLES $700 SHIP 01-MAR-16 - 15-APR-16
LEMONS $200 SHIP 15-JAN-16 - 31-JAN-16
LEMONS $150 SHIP 01-FEB-16 - 15-FEB-16
LEMONS $200 ARRIVE 15-FEB-16 - 28-FEB-16
LEMONS $250 ARRIVE 01-MAR-16 - 15-MAR-16
What I would like to see is the min ship or arrival date and max ship or arrival date per product as such:
PRODUCT VALUE CONTRACT_PERIOD
APPLES $1,300 SHIP 01-FEB-16 - 15-APR-16
LEMONS $350 SHIP 15-JAN-16 - 15-FEB-16
LEMONS $450 ARRIVE 15-FEB-16 - 15-MAR-16
Any suggestions on a way to determine which contract is valid, then group the results using the min and max dates while not interchanging a ship date for an arrival date would be greatly appreciated.
Oracle Setup:
CREATE TABLE table_name ( product, value, ship_period_descr, arrival_period_begin, arrival_period_end, ship_period_begin, ship_period_end ) AS
SELECT 'Apples', 600, 'X', NULL, NULL, DATE '2016-02-01', DATE '2016-03-15' FROM DUAL UNION ALL
SELECT 'Apples', 700, 'X', NULL, NULL, DATE '2016-03-01', DATE '2016-04-16' FROM DUAL UNION ALL
SELECT 'Lemons', 200, 'X', NULL, NULL, DATE '2016-01-15', DATE '2016-01-31' FROM DUAL UNION ALL
SELECT 'Lemons', 150, 'X', NULL, NULL, DATE '2016-02-01', DATE '2016-02-15' FROM DUAL UNION ALL
SELECT 'Lemons', 200, NULL, DATE '2016-02-15', DATE '2016-02-28', NULL, NULL FROM DUAL UNION ALL
SELECT 'Lemons', 250, NULL, DATE '2016-03-01', DATE '2016-03-15', NULL, NULL FROM DUAL;
Query:
SELECT Product,
SUM( Value ) AS Value,
DECODE(
DECODE( P.SHIP_PERIOD_DESCR, NULL, 1, 0 ),
1, 'ARRIVE ' || MIN( P.ARRIVAL_PERIOD_BEGIN ) || ' - ' || MAX( P.ARRIVAL_PERIOD_END ),
'SHIP ' || MIN( P.SHIP_PERIOD_BEGIN ) || ' - ' || MAX( P.SHIP_PERIOD_END )
) AS Contract_Period
FROM table_name p
GROUP BY Product,
DECODE( P.SHIP_PERIOD_DESCR, NULL, 1, 0 );
Results:
PRODUCT VALUE CONTRACT_PERIOD
------- ---------- ----------------------------------------------
Apples 1300 SHIP 01-FEB-16 - 16-APR-16
Lemons 350 SHIP 15-JAN-16 - 15-FEB-16
Lemons 450 ARRIVE 15-FEB-16 - 15-MAR-16
The basic idea is not to combine the different columns into a single concatenated column. Then use intelligent aggregation:
with t as (
<basically your query here, but with each column individually>
)
select product, ship_period_desc,
min(case when ship_period_desc = 'ARRIVAL' then ARRIVAL_PERIOD_BEGIN
else SHIP_PERIOD_BEGIN
end) as PERIOD_BEGIN,
max(case when ship_period_desc = 'ARRIVAL' then ARRIVAL_PERIOD_END
else SHIP_PERIOD_END
end) as PERIOD_END
from t
where ship_period_desc in ('ARRIVAL', 'SHIP')
group by product, ship_period_desc;
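The grouping both answers rely on can be sketched in Python as follows: decide per row whether it is a ship or an arrival contract, then aggregate per (product, kind) with the minimum begin date, the maximum end date and the summed value. The rows mirror the Oracle setup above; `summarize` is a hypothetical helper name, and treating a NULL ship_period_descr as "arrival based" follows the question's DECODE.

```python
from datetime import date

# (product, value, ship_period_descr, arrival_begin, arrival_end,
#  ship_begin, ship_end), mirroring the Oracle setup rows.
rows = [
    ("Apples", 600, "X", None, None, date(2016, 2, 1), date(2016, 3, 15)),
    ("Apples", 700, "X", None, None, date(2016, 3, 1), date(2016, 4, 16)),
    ("Lemons", 200, "X", None, None, date(2016, 1, 15), date(2016, 1, 31)),
    ("Lemons", 150, "X", None, None, date(2016, 2, 1), date(2016, 2, 15)),
    ("Lemons", 200, None, date(2016, 2, 15), date(2016, 2, 28), None, None),
    ("Lemons", 250, None, date(2016, 3, 1), date(2016, 3, 15), None, None),
]


def summarize(rows):
    """Map (product, kind) to (total value, earliest begin, latest end)."""
    groups = {}
    for product, value, ship_descr, arr_b, arr_e, ship_b, ship_e in rows:
        # A missing ship description means the contract is arrival based.
        if ship_descr is None:
            kind, begin, end = "ARRIVE", arr_b, arr_e
        else:
            kind, begin, end = "SHIP", ship_b, ship_e
        key = (product, kind)
        if key in groups:
            total, first, last = groups[key]
            groups[key] = (total + value, min(first, begin), max(last, end))
        else:
            groups[key] = (value, begin, end)
    return groups


for key, summary in summarize(rows).items():
    print(key, summary)
```

Because the kind is part of the grouping key, ship dates and arrival dates are never mixed within one group.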
I am a novice trying to work through this here with no luck so far, any help is greatly appreciated!!!
Select Distinct
(AB.agency_no || '-' || ab.branch_no) AS "AGENCY-BRANCH",
count (AB.agency_no || '-' || ab.branch_no) AS Occurences,
A.AGY_NAME AS AGENCY,
Sum(AB.annual_premium) as Premium
From Agency_Book_View AB, Agency A, Branch B
Where AB.agency_no = A.Agency_No
AND B.EXPIRATION_DATE = TO_DATE('12-31-2078', 'MM-DD-YYYY')
AND B.EFFECTIVE_DATE <= sysdate and b.effective_date >=sysdate - 364
Group by AB.agency_no || '-' || ab.branch_no, A.Agy_Name, ab.annual_premium
Order by AB.agency_no || '-' || ab.branch_no
So I am trying to return the total annual premium per "agency-branch", but I am getting multiple occurrences of each agency-branch. I am trying to get one line per agency-branch. I hope this is clear. I tried to include a result set but wasn't allowed to include a picture in my post.
Thanks very much!
Brad
Try this:
SELECT (AB.agency_no || '-' || AB.branch_no) AS "AGENCY-BRANCH",
COUNT(AB.agency_no || '-' || AB.branch_no) AS Occurences,
A.AGY_NAME AS AGENCY,
SUM(AB.annual_premium) AS Premium
FROM Agency_Book_View AB, Agency A, Branch B
WHERE AB.agency_no = A.Agency_No AND AB.branch_no = B.branch_no
AND B.EXPIRATION_DATE = TO_DATE('12-31-2078', 'MM-DD-YYYY')
AND B.EFFECTIVE_DATE <= SYSDATE AND B.effective_date >= SYSDATE - 364
GROUP BY AB.agency_no || '-' || AB.branch_no, A.Agy_Name
ORDER BY AB.agency_no || '-' || AB.branch_no
I joined the B and AB tables, removed the DISTINCT, and removed ab.annual_premium from the GROUP BY.
I think you need to remove ab.annual_premium from the group by clause.
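The effect of grouping by the column being summed is easy to reproduce. The sketch below uses Python's sqlite3 module with a made-up table `agency_book` and invented rows (none of this is the poster's data) to compare the two GROUP BY clauses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for Agency_Book_View with invented rows.
conn.execute(
    "CREATE TABLE agency_book "
    "(agency_no INTEGER, branch_no INTEGER, annual_premium REAL)"
)
conn.executemany(
    "INSERT INTO agency_book VALUES (?, ?, ?)",
    [(1, 10, 100.0), (1, 10, 250.0), (1, 10, 100.0), (2, 20, 75.0)],
)

select_part = """
SELECT agency_no || '-' || branch_no AS agency_branch,
       COUNT(*) AS occurrences,
       SUM(annual_premium) AS premium
FROM agency_book
"""

# Grouping by the premium as well yields one row per distinct premium value.
too_fine = conn.execute(
    select_part
    + "GROUP BY agency_no || '-' || branch_no, annual_premium "
    + "ORDER BY agency_branch, premium"
).fetchall()

# Grouping by the key alone yields one row per agency-branch.
per_branch = conn.execute(
    select_part
    + "GROUP BY agency_no || '-' || branch_no "
    + "ORDER BY agency_branch"
).fetchall()

print(too_fine)
print(per_branch)
```

The first result repeats agency-branch 1-10 once per distinct premium, while the second collapses it to a single line with the full total.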