Select to search column on group by query - sql

I have a table called prices that references the products table through the product_id column. I want a query that groups prices by product_id, picks the row with the max finish date, and returns the start_date of that same price row.
I tried the following query, but I am getting the wrong start date. Strangely, the subquery returns more than one row even though I filter by the price id in the where clause. That is why I added the limit, but the result is still wrong.
select prices.product_id, prices.id,
MAX(CASE WHEN prices.finish_date IS NULL THEN COALESCE(prices.finish_date,'9999-12-31') ELSE prices.finish_date END) as finish_date,
(select start_date from prices where prices.id = prices.id limit 1)
as start_date from prices group by prices.product_id, prices.id
How can I get the start date that belongs to the price id in my grouped row? I am using PostgreSQL.
An example of what I want from my query:
DataSet:
ID | PRODUCT_ID | START_DATE | FINISH_DATE
1 | 1689 | 2018-01-19 02:00:00 | 2019-11-19 23:59:59
2 | 1689 | 2019-10-11 03:00:00 | 2019-10-15 23:59:59
3 | 1689 | 2019-01-11 03:00:00 | 2019-05-15 23:59:59
4 | 1690 | 2019-11-11 03:00:00 | 2019-12-15 23:59:59
5 | 1690 | 2019-05-11 03:00:00 | 2025-12-15 23:59:59
6 | 1691 | 2019-05-11 03:00:00 | null
I want this result:
ID | PRODUCT_ID | START_DATE | FINISH_DATE
1 | 1689 | 2018-01-19 02:00:00 | 2019-11-19 23:59:59
5 | 1690 | 2019-05-11 03:00:00 | 2025-12-15 23:59:59
6 | 1691 | 2019-05-11 03:00:00 | 9999-12-31 23:59:59
The start date should be the value from the original row, as it was before the group by.

I would recommend DISTINCT ON in Postgres:
select distinct on (p.product_id) p.*
from prices p
order by p.product_id,
p.finish_date desc nulls first;
NULL values are treated as larger than any other value, so a descending sort puts them first. However, I've included nulls first just to be explicit.
DISTINCT ON is a very handy Postgres extension, which you can learn more about in the documentation.
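To check the row-picking logic outside Postgres, here is a minimal sketch in Python using the standard-library sqlite3 module and the sample rows from the question. SQLite has no DISTINCT ON, so the sketch swaps in row_number() over a per-product partition, which selects the same row; the COALESCE to '9999-12-31 23:59:59' plays the role of nulls first and also produces the placeholder date the question asks for. The table layout and the helper name latest_prices are assumptions for illustration, and window functions need SQLite 3.25 or later.

```python
import sqlite3

# Sample rows from the question: (id, product_id, start_date, finish_date).
ROWS = [
    (1, 1689, "2018-01-19 02:00:00", "2019-11-19 23:59:59"),
    (2, 1689, "2019-10-11 03:00:00", "2019-10-15 23:59:59"),
    (3, 1689, "2019-01-11 03:00:00", "2019-05-15 23:59:59"),
    (4, 1690, "2019-11-11 03:00:00", "2019-12-15 23:59:59"),
    (5, 1690, "2019-05-11 03:00:00", "2025-12-15 23:59:59"),
    (6, 1691, "2019-05-11 03:00:00", None),
]

# row_number() stands in for Postgres's DISTINCT ON (product_id):
# rank the rows of each product by finish date, newest first, keep rank 1.
QUERY = """
select id, product_id, start_date,
       coalesce(finish_date, '9999-12-31 23:59:59') as finish_date
from (select p.*,
             row_number() over (
                 partition by product_id
                 order by coalesce(finish_date, '9999-12-31 23:59:59') desc
             ) as rn
      from prices p)
where rn = 1
order by product_id
"""


def latest_prices(rows):
    """Return one row per product: the price with the latest finish date."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table prices "
        "(id integer, product_id integer, start_date text, finish_date text)"
    )
    conn.executemany("insert into prices values (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in latest_prices(ROWS):
        print(row)
```

Because the whole row is ranked rather than aggregated, start_date comes from the same row as the winning finish_date, which is what the correlated subquery in the question was trying to achieve.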

Try this
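with data as (
select product_id, max(coalesce(finish_date, '9999-12-31')) as finish_date
from prices group by product_id)
select p.id, d.product_id, p.start_date, d.finish_date
from data d join prices p on p.product_id = d.product_id
and coalesce(p.finish_date, '9999-12-31') = d.finish_date;
Grouping by product_id alone gives one max finish date per product (grouping by id as well would return every row), and joining back on that date recovers the id and start_date of the matching row. It surely isn't the most elegant solution, but it should work.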

Related

How to average values in one table based on the condition involving another table in SQL?

I have two tables. One defines time intervals (beginning and end). Time intervals are not equal in length. Another contains product ID, start and end date of the product.
TableOne:
Interval StartDateTime EndDateTime
202020201 2020-01-01 00:00:00 2020-02-10 00:00:00
202020202 2020-02-10 00:00:00 2020-02-20 00:00:00
TableTwo
ProductID ProductStartDateTime ProductEndDateTime
ASSDWE1 2018-01-04 00:12:00 2020-04-10 20:00:30
ADFGHER 2020-01-05 00:11:30 2020-01-19 00:00:00
ASDFVBN 2017-10-10 00:12:10 2020-02-23 00:23:23
I need to compute the average length of the products from TableTwo that existed during the time intervals defined in TableOne. If a product existed throughout a time interval from TableOne, its length for that interval is measured from its start date to the end of the interval.
I tried the following
select
a.*,
(select
AVG(datediff(day, b.ProductStartDateTime, IIF (b.ProductEndDateTime> a.EndDateTime, a.EndDateTime
,b.ProductEndDateTime))) --compute average length of the products
FROM #TableTwo b
WHERE ( not (b.ProductEndDateTime <= a.StartDateTime ) and not (b.ProductStartDateTime >= a.EndDateTime) )
-- select products that existed during interval from #TableOne
) as AverageProductLength
from #TableOne a
I get the error "Multiple columns are specified in an aggregated expression containing an outer reference. If an expression being aggregated contains an outer reference, then that outer reference must be the only column referenced in the expression."
The result I want:
Interval StartDateTime EndDateTime AverageProductLength
202020201 2020-01-01 00:00:00 2020-02-10 00:00:00 23
202020202 2020-02-10 00:00:00 2020-02-20 00:00:00 34.5
Is there a way I can do the averaging?

How can I extract the values of the last aggregation date in sql

I have the following table.
id user time_stamp
1 Mike 2020-02-13 00:00:00 UTC
2 John 2020-02-13 00:00:00 UTC
3 Levy 2020-02-12 00:00:00 UTC
4 Sam 2020-02-12 00:00:00 UTC
5 Frodo 2020-02-11 00:00:00 UTC
Let's say 2020-02-13 00:00:00 UTC is the last day. How can I query this table to display only the last day's results? I want to create a view in BigQuery so that I always get only the last day's results.
In the end I want something like this (for the last day, which is 2020-02-13 00:00:00 UTC):
id user time_stamp
1 Mike 2020-02-13 00:00:00 UTC
2 John 2020-02-13 00:00:00 UTC
You can use window functions:
select t.* except (seqnum)
from (select t.*,
dense_rank() over (order by time_stamp desc) as seqnum
from t
) t
where seqnum = 1;
This may not work well on a large amount of data -- because of the way that BQ implements window functions with no partitioning. So, you might find that this works better (especially if the above runs out of resources):
select t.*
from t join
(select max(time_stamp) as max_time_stamp
from t
) tt
on t.time_stamp = max_time_stamp;
Also, if the timestamps actually have time components, then you will want to convert to a date or remove the time component somehow.
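As a rough check of the join-to-max approach, here is a small sketch in Python with the standard-library sqlite3 module rather than BigQuery. It compares on the calendar day via SQLite's date() function, so rows still match when the timestamps carry a time of day. The table name t follows the answer; the helper name last_day_rows and the extra row are hypothetical and not part of the question.

```python
import sqlite3

# Sample rows from the question (the " UTC" suffix is dropped so that
# SQLite's date() can parse the values).
ROWS = [
    (1, "Mike", "2020-02-13 00:00:00"),
    (2, "John", "2020-02-13 00:00:00"),
    (3, "Levy", "2020-02-12 00:00:00"),
    (4, "Sam", "2020-02-12 00:00:00"),
    (5, "Frodo", "2020-02-11 00:00:00"),
]
# Hypothetical extra row, not in the question: same last day, later time.
EXTRA = (6, "Pippin", "2020-02-13 08:30:00")

# Find the latest calendar day once, then join every row of that day to it.
QUERY = """
select t.id, t.user, t.time_stamp
from t
join (select max(date(time_stamp)) as max_day from t) tt
  on date(t.time_stamp) = tt.max_day
order by t.id
"""


def last_day_rows(rows):
    """Return all rows whose calendar day equals the latest day in the table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table t (id integer, user text, time_stamp text)")
    conn.executemany("insert into t values (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in last_day_rows(ROWS + [EXTRA]):
        print(row)
```

Comparing raw timestamps instead of days would return only the 08:30:00 row once the extra row is present, which is why the truncation matters.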

Group by clause - output is not as expected

Below is my sql query to get the list of dates from a table.
select t1.counter_date as myDates from table1 t1;
output:
myDates
2014-03-14 00:00:00
2014-05-11 00:00:00
2014-11-03 00:00:00
2014-12-23 00:00:00
2015-01-12 00:00:00
2015-08-08 00:00:00
2016-03-14 00:00:00
2017-03-14 00:00:00
2017-03-19 00:00:00
Below is the solution:
select min(t1.counter_date) as oldDate,max(t1.counter_date) as latestDate from table1 t1;
In the following demo you can see that your query is giving the correct results. The problem must be in your data.
EDIT: after the edit it is clear where the problem is. Once you perform the following query:
SELECT min(date), max(date)
FROM tab
GROUP BY date
then min(date) has to be equal to max(date), since each group contains just one date.
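The effect is easy to reproduce. The sketch below uses Python's standard-library sqlite3 module with the dates listed in the question; the table name tab follows the answer, while the column name counter_date in this toy table and the helper name min_max are assumptions for illustration.

```python
import sqlite3

# The dates listed in the question.
DATES = [
    "2014-03-14", "2014-05-11", "2014-11-03", "2014-12-23", "2015-01-12",
    "2015-08-08", "2016-03-14", "2017-03-14", "2017-03-19",
]


def min_max(dates, group_by_date):
    """Run min/max over the dates, optionally grouping by the date itself."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table tab (counter_date text)")
    conn.executemany("insert into tab values (?)", [(d,) for d in dates])
    sql = "select min(counter_date), max(counter_date) from tab"
    if group_by_date:
        # Each group now holds a single distinct date, so min equals max.
        sql += " group by counter_date"
    result = conn.execute(sql).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print(min_max(DATES, group_by_date=False))  # one row for the whole table
    print(min_max(DATES, group_by_date=True))   # one row per distinct date
```

Without the GROUP BY the query returns a single row with the oldest and latest date; with it, there is one row per distinct date and both columns are identical.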

Oracle SQL query about Date

I have a database table named availableTimeslot with fields pk, startDate, endDate, e.g.
PK startDate endDate
1. 2017-03-07 09:00:00 2017-03-07 18:00:00
2. 2017-03-07 18:00:00 2017-03-07 21:00:00
3. 2017-03-08 09:00:00 2017-03-08 18:00:00
Records running from 09:00:00 to 18:00:00 are morning time slots, while those from 18:00:00 to 23:00:00 are afternoon time slots.
The table stores the available time slot dates (e.g. 2017-03-06, 2017-03-08) from which the customer chooses one.
Can I use one query to get exactly 10 available time slots dates starting on the day after the order date?
e.g. if I order a product on 2017-03-07, then the query returns
2017-03-08 09:00:00
2017-03-08 18:00:00
2017-03-09 09:00:00
2017-03-09 18:00:00
2017-03-10 ...
2017-03-11 ...
2017-03-13 ...
as 12 is a public holiday and not in the table.
In short, it returns 10 dates (5 days with each day having am and pm sessions)
remark: the available time slot dates are in order, but may not be consecutive
select available_date
from ( select available_date, row_number() over (order by available_date) as rn
from your_table
where available_date > :order_date
)
where rn <= 5;
:order_date is a bind variable - the date entered by the user/customer through the interface.
Do you want 5 for a single customer?
select ts.*
from (select ts.*
from customer c join
timeslots ts
on ts.date > c.orderdate
where c.customerid = v_customerid
order by ts.date asc
) ts
where rownum <= 5

Select min/max from group defined by one column as subgroup of another - SQL, HPVertica

I'm trying to find the min and max date within a subgroup of another group. Here's example 'data'
ID Type Date
1 A 7/1/2015
1 B 1/1/2015
1 A 8/5/2014
22 B 3/1/2015
22 B 9/1/2014
333 A 8/1/2015
333 B 4/1/2015
333 B 3/29/2014
333 B 2/28/2013
333 C 1/1/2013
What I'd like to identify is - within an ID, what is the min/max Date for each block of similar Type? So for ID # 333 I want the below info:
A: min & max = 8/1/2015
B: min = 2/28/2013
max = 4/1/2015
C: min & max = 1/1/2013
I'm having trouble figuring out how to identify only uninterrupted groupings of Type within a grouping of ID. For ID #1, I need to keep the two 'A' Types with separate min/max dates because they were split by a Type 'B', so I can't just pull the min date of all Type A's for ID #1, it has to be two separate instances.
What I've tried is something like the below two lines, but neither of these accurately captures the case mentioned above for ID #1 where Type B interrupts Type A.
Max(Date) OVER (Partition By ID, Type)
or this:
Row_Number() OVER (Partition By ID, Type ORDER BY Date DESC)
,then selecting Row #1 for max date, and date ASC w/ row #1 for min date
Thank you for any insight you can provide!
If I understand right, you want the min/max values for an id/type grouped using a descending date sort, but the catch is that you want them based on clusters within the id by time.
What you can do is use CONDITIONAL_CHANGE_EVENT to tag the rows on change of type, then use that in your GROUP BY on a standard min/max aggregation.
This would be the intermediate step towards getting to what you want:
select ID, Type, Date,
CONDITIONAL_CHANGE_EVENT(Type) OVER( PARTITION BY ID ORDER BY Date desc) cce
from mytable
group by ID, Type, Date
order by ID, Date desc, Type
ID Type Date cce
1 A 2015-07-01 00:00:00 0
1 B 2015-01-01 00:00:00 1
1 A 2014-08-05 00:00:00 2
22 B 2015-03-01 00:00:00 0
22 B 2014-09-01 00:00:00 0
333 A 2015-08-01 00:00:00 0
333 B 2015-04-01 00:00:00 1
333 B 2014-03-29 00:00:00 1
333 B 2013-02-28 00:00:00 1
333 C 2013-01-01 00:00:00 2
Once you have them grouped using CCE, you can aggregate over this, grouping on cce, to get the min/max you are looking for. You can play with the order by at the bottom; this ordering seems to make the most sense to me.
select id, type, min(date), max(date)
from (
select ID, Type, Date,
CONDITIONAL_CHANGE_EVENT(Type) OVER( PARTITION BY ID ORDER BY Date desc) cce
from mytable
group by ID, Type, Date
) x
group by id, type, cce
order by id, 3 desc, 4 desc;
id type min max
1 A 2015-07-01 00:00:00 2015-07-01 00:00:00
1 B 2015-01-01 00:00:00 2015-01-01 00:00:00
1 A 2014-08-05 00:00:00 2014-08-05 00:00:00
22 B 2014-09-01 00:00:00 2015-03-01 00:00:00
333 A 2015-08-01 00:00:00 2015-08-01 00:00:00
333 B 2013-02-28 00:00:00 2015-04-01 00:00:00
333 C 2013-01-01 00:00:00 2013-01-01 00:00:00
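CONDITIONAL_CHANGE_EVENT is specific to Vertica. For engines without it, the same block numbering can be approximated with lag() and a running sum, the usual "gaps and islands" pattern. The following is a minimal sketch of that substitute technique in Python with the standard-library sqlite3 module (SQLite 3.25 or later for window functions), using the question's data with dates rewritten in ISO format so they sort correctly as text. The helper name block_ranges and the column aliases are assumptions for illustration.

```python
import sqlite3

# Question data as (id, type, date), dates rewritten as ISO strings.
ROWS = [
    (1, "A", "2015-07-01"), (1, "B", "2015-01-01"), (1, "A", "2014-08-05"),
    (22, "B", "2015-03-01"), (22, "B", "2014-09-01"),
    (333, "A", "2015-08-01"), (333, "B", "2015-04-01"),
    (333, "B", "2014-03-29"), (333, "B", "2013-02-28"),
    (333, "C", "2013-01-01"),
]

QUERY = """
with flagged as (
    -- 1 when the type differs from the previous row of the same id
    -- (or there is no previous row), 0 otherwise
    select id, type, date,
           case when type = lag(type) over (partition by id order by date desc)
                then 0 else 1 end as changed
    from mytable
),
tagged as (
    -- the running sum of change flags numbers each uninterrupted block
    select id, type, date,
           sum(changed) over (partition by id order by date desc) as block
    from flagged
)
select id, type, min(date) as min_date, max(date) as max_date
from tagged
group by id, type, block
order by id, max_date desc
"""


def block_ranges(rows):
    """Return (id, type, min_date, max_date) per uninterrupted block of type."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table mytable (id integer, type text, date text)")
    conn.executemany("insert into mytable values (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in block_ranges(ROWS):
        print(row)
```

The block numbers start at 1 here instead of 0, but only their grouping matters: ID 1 still yields two separate 'A' blocks because the 'B' row falls between them.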