Group results by number of days appearing - sql

I want to get the number of days someone logs on in a month. Using this query:
select id,
to_char(date_on, 'MM-DD') as mon_dd
from
logs
group by
id, to_char(date_on, 'MM-DD')
I get a table that looks like this:
id | mon_dd
0 | 01-27
3 | 02-23
1 | 01-05
0 | 01-31
2 | 02-01
3 | 02-05
1 | 02-09
I want to get a result that groups the id by the number of days they appear in a month like this:
id | month | days_appeared
0 | jan | 2
0 | feb | 0
1 | jan | 1
1 | feb | 1
2 | jan | 0
2 | feb | 1
3 | jan | 0
3 | feb | 2

You can generate a Cartesian product of the distinct months and ids in the table, and then bring in the logs table with a left join:
select
i.id,
d.date_month,
count(distinct trunc(l.date_on)) days_appeared
from (select distinct trunc(date_on, 'month') date_month from logs) d
cross join (select distinct id from logs) i
left join logs l
on l.date_on >= d.date_month
and l.date_on < add_months(d.date_month, 1)
and l.id = i.id
group by i.id, d.date_month
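The same cross-join-then-left-join idea can be tried out locally. The following is a minimal sketch that assumes SQLite (through Python's sqlite3 module) instead of Oracle, so strftime and date stand in for trunc and add_months. The year 2023 and the repeated login row are assumptions added for illustration; the question only shows MM-DD values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (id INTEGER, date_on TEXT)")
conn.executemany(
    "INSERT INTO logs VALUES (?, ?)",
    [
        (0, "2023-01-27"), (3, "2023-02-23"), (1, "2023-01-05"),
        (0, "2023-01-31"), (2, "2023-02-01"), (3, "2023-02-05"),
        (1, "2023-02-09"),
        (0, "2023-01-27"),  # assumed repeat login on the same day, counted once
    ],
)

# Every (id, month) pair comes from the cross join; the left join adds
# matching log rows, and pairs without any match count as 0.
query = """
SELECT i.id, m.mon, COUNT(DISTINCT date(l.date_on)) AS days_appeared
FROM (SELECT DISTINCT id FROM logs) AS i
CROSS JOIN (SELECT DISTINCT strftime('%Y-%m', date_on) AS mon FROM logs) AS m
LEFT JOIN logs AS l
       ON l.id = i.id
      AND strftime('%Y-%m', l.date_on) = m.mon
GROUP BY i.id, m.mon
ORDER BY i.id, m.mon
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

COUNT ignores the NULLs produced by the left join, which is why months without any login show 0 instead of disappearing.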

If you want to get all months, even those with zeros, then:
select i.id, m.mon, count(distinct trunc(l.date_on)) as num_days
from (select distinct id from logs) i cross join
(select distinct trunc(date_on, 'month') as mon from logs) m left join
logs l
on l.id = i.id and trunc(l.date_on, 'month') = m.mon
group by i.id, m.mon;
Note: You might have more efficient sources of the months and ids than using select distinct on the logs table.

I like the use of the WITH clause to build your query in a modular way.
with vals as ( -- all information needed is here
select id
, to_char(date_on, 'mon') as month
, to_char(date_on, 'mm') as mm
, trunc(date_on) as log_day -- calendar day, so repeat logins count once
from logs
),
months as ( -- the distinct months,
select distinct month, mm -- including the month numbers
from vals -- for ordering the main query
),
ids as ( -- the distinct ids
select distinct id
from vals)
select i.id, m.month, (select count(distinct log_day) from vals -- for every combination of id
where month = m.month -- and month
and id = i.id) as days_appeared -- count the distinct days
from ids i cross join months m
order by i.id, m.mm;

Related

Redshift: Add Row for each hour in a day

I have a table containing item-wise quantity at different hours of a date. I am trying to add a row for each hour (24 entries per day), carrying forward the quantity from the previous available hour. For example, for hours 2 up to (but not including) 10, the quantity will be 5.
I created a table with hour entries (1-24) and full joined it with the table shown below.
How can I fill in the previous available entry? Any suggestions are welcome.
item_id| date | hour| quantity
101 | 2022-04-25 | 2 | 5
101 | 2022-04-25 | 10 | 13
101 | 2022-04-25 | 18 | 67
101 | 2022-04-25 | 23 | 27
You can use generate_series to generate the hour numbers and cross join them with the table to build the base set of item/date/hour rows.
Then use a correlated subquery to fetch the quantity from the latest hour at or before each generated hour:
SELECT t1.*,
(SELECT quantity
FROM T tt
WHERE t1.item_id = tt.item_id
AND t1.date = tt.date
AND t1.hour >= tt.hour
ORDER BY tt.hour desc
LIMIT 1) quantity
FROM (
SELECT DISTINCT item_id,date,v.hour
FROM generate_series(1,24) v(hour)
CROSS JOIN T
) t1
ORDER BY t1.hour
Provided the table of integers 1 .. 24 is all24(hour), you can use lead to find where each quantity stops applying and join on that range:
select t.item_id, t.date, all24.hour, t.quantity
from all24
join (
select *,
lead(hour, 1, 25) over(partition by item_id, date order by hour) - 1 nxt_h
from tbl
) t on all24.hour between t.hour and t.nxt_h
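The lead-and-join approach can be checked with a small sketch. It assumes SQLite 3.25 or later (for window functions) driven from Python, and it builds all24 with a recursive CTE, which is an assumption standing in for whatever hours table you already have.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl (item_id INTEGER, date TEXT, hour INTEGER, quantity INTEGER)"
)
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?, ?)",
    [
        (101, "2022-04-25", 2, 5),
        (101, "2022-04-25", 10, 13),
        (101, "2022-04-25", 18, 67),
        (101, "2022-04-25", 23, 27),
    ],
)

query = """
WITH RECURSIVE all24(hour) AS (      -- hours 1..24
    SELECT 1
    UNION ALL
    SELECT hour + 1 FROM all24 WHERE hour < 24
),
spans AS (                           -- each row is valid until the next row's hour - 1
    SELECT item_id, date, hour, quantity,
           LEAD(hour, 1, 25) OVER (PARTITION BY item_id, date ORDER BY hour) - 1 AS nxt_h
    FROM tbl
)
SELECT s.item_id, s.date, a.hour, s.quantity
FROM all24 AS a
JOIN spans AS s ON a.hour BETWEEN s.hour AND s.nxt_h
ORDER BY s.item_id, s.date, a.hour
"""
filled = conn.execute(query).fetchall()
by_hour = {hour: qty for _, _, hour, qty in filled}
print(by_hour)
```

Because this is an inner join, hours before the first recorded entry (hour 1 here) produce no row at all; a left join from the hours table would be needed if those should appear with a NULL quantity.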

Select first rows where condition [duplicate]

Here's what I'm trying to do. Let's say I have this table t:
key_id | id | record_date | other_cols
1 | 18 | 2011-04-03 | x
2 | 18 | 2012-05-19 | y
3 | 18 | 2012-08-09 | z
4 | 19 | 2009-06-01 | a
5 | 19 | 2011-04-03 | b
6 | 19 | 2011-10-25 | c
7 | 19 | 2012-08-09 | d
For each id, I want to select the row containing the minimum record_date. So I'd get:
key_id | id | record_date | other_cols
1 | 18 | 2011-04-03 | x
4 | 19 | 2009-06-01 | a
The only solutions I've seen to this problem assume that all record_date entries are distinct, but that is not the case in my data. Using a subquery and an inner join with two conditions would give me duplicate rows for some ids, which I don't want:
key_id | id | record_date | other_cols
1 | 18 | 2011-04-03 | x
5 | 19 | 2011-04-03 | b
4 | 19 | 2009-06-01 | a
How about something like:
SELECT mt.*
FROM MyTable mt INNER JOIN
(
SELECT id, MIN(record_date) AS MinDate
FROM MyTable
GROUP BY id
) t ON mt.id = t.id AND mt.record_date = t.MinDate
This gets the minimum date per ID, and then gets the values based on those values. The only time you would have duplicates is if there are duplicate minimum record_dates for the same ID.
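I could get to your expected result just by doing this in MySQL (it relies on MySQL accepting a non-aggregated column alongside GROUP BY, which requires ONLY_FULL_GROUP_BY to be disabled, and the other_cols value is then not guaranteed to come from the row with the minimum date):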
SELECT id, min(record_date), other_cols
FROM mytable
GROUP BY id
Does this work for you?
To get the cheapest product in each category, you use the MIN() function in a correlated subquery as follows:
SELECT categoryid,
productid,
productName,
unitprice
FROM products a WHERE unitprice = (
SELECT MIN(unitprice)
FROM products b
WHERE b.categoryid = a.categoryid)
The outer query scans all rows in the products table and returns the products whose unit price matches the lowest price in their category, as returned by the correlated subquery.
I would like to add to some of the other answers here: if you don't need the first item but, say, the second one, you can use ROW_NUMBER() in a subquery and base your result set on that.
SELECT * FROM
(
SELECT
ROW_NUMBER() OVER (PARTITION BY id ORDER BY record_date, other_cols) AS rn,
P.*
FROM MyTable P
) ranked
WHERE rn = 2
This also lets you order by multiple columns in the subquery, which may help if two record_dates have identical values. You can also partition by multiple columns if needed by separating them with commas.
This does it simply:
select t2.id,t2.record_date,t2.other_cols
from (select ROW_NUMBER() over(partition by id order by record_date)as rownum,id,record_date,other_cols from MyTable)t2
where t2.rownum = 1
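A minimal sketch of the ROW_NUMBER approach, assuming SQLite 3.25 or later run from Python. Row 8 is an assumed extra row that ties with row 4 on the minimum date, and key_id is added to the ORDER BY as an assumed tie-breaker so the chosen row is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MyTable (key_id INTEGER PRIMARY KEY, id INTEGER, "
    "record_date TEXT, other_cols TEXT)"
)
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?, ?)",
    [
        (1, 18, "2011-04-03", "x"),
        (2, 18, "2012-05-19", "y"),
        (3, 18, "2012-08-09", "z"),
        (4, 19, "2009-06-01", "a"),
        (5, 19, "2011-04-03", "b"),
        (8, 19, "2009-06-01", "e"),  # assumed tie with key_id 4
    ],
)

# Number the rows inside each id from earliest date onward; key_id breaks ties.
query = """
SELECT key_id, id, record_date, other_cols
FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY record_date, key_id) AS rn
    FROM MyTable AS t
) AS ranked
WHERE rn = 1
ORDER BY id
"""
first_rows = conn.execute(query).fetchall()
print(first_rows)
```

Without the second ORDER BY column, either of the tied rows could be numbered 1, so the result would still have one row per id but not a predictable one.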
If record_date has no duplicates within a group:
Think of it as filtering: simply keep (WHERE) the one row with MIN(record_date) in the current group:
SELECT * FROM t t1 WHERE record_date = (
select MIN(record_date)
from t t2 where t2.group_id = t1.group_id)
If there could be 2+ min record_date within a group:
filter out non-min rows (see above)
then (AND) pick only one from the 2+ min record_date rows, within the given group_id. E.g. pick the one with the min unique key:
AND key_id = (select MIN(key_id)
from t t3 where t3.record_date = t1.record_date
and t3.group_id = t1.group_id)
so
key_id | group_id | record_date | other_cols
1 | 18 | 2011-04-03 | x
4 | 19 | 2009-06-01 | a
8 | 19 | 2009-06-01 | e
will select key_ids: #1 and #4
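Putting the two filters together gives one complete query. The sketch below assumes SQLite run from Python and uses the three sample rows listed in this answer; the table name t and column group_id follow the answer's own naming.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (key_id INTEGER PRIMARY KEY, group_id INTEGER, "
    "record_date TEXT, other_cols TEXT)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        (1, 18, "2011-04-03", "x"),
        (4, 19, "2009-06-01", "a"),
        (8, 19, "2009-06-01", "e"),
    ],
)

# First filter: keep rows holding the group's minimum date.
# Second filter: among those, keep the row with the smallest key_id.
query = """
SELECT key_id
FROM t AS t1
WHERE record_date = (SELECT MIN(record_date)
                     FROM t AS t2
                     WHERE t2.group_id = t1.group_id)
  AND key_id = (SELECT MIN(key_id)
                FROM t AS t3
                WHERE t3.record_date = t1.record_date
                  AND t3.group_id = t1.group_id)
ORDER BY key_id
"""
selected = [row[0] for row in conn.execute(query)]
print(selected)
```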
SELECT p.* FROM tbl p
INNER JOIN(
SELECT t.id, MIN(record_date) AS MinDate
FROM tbl t
GROUP BY t.id
) t ON p.id = t.id AND p.record_date = t.MinDate
GROUP BY p.id
The final GROUP BY p.id collapses rows that share the same id and the same minimum record_date into a single row.
If you want the duplicates, remove the last line, GROUP BY p.id.
This is an old question, but this may be useful for someone.
In my case I can't use a subquery because I have a big query and need to apply min() to its result; with a subquery the database would have to re-execute the big query. I'm using MySQL.
select t.*
from (select m.*, @g := 0
from MyTable m -- here I have a big query
order by id, record_date) t
where (1 = case when @g = 0 or @g <> id then 1 else 0 end )
and (@g := id) IS NOT NULL
Basically I ordered the result and then put a variable in order to get only the first record in each group.
The query below takes the first date for each work order (from a table showing all status changes):
SELECT
WORKORDERNUM,
MIN(DATE)
FROM
WORKORDERS
WHERE
DATE >= to_date('2015-01-01','YYYY-MM-DD')
GROUP BY
WORKORDERNUM
select
department,
min_salary,
(select s1.last_name from staff s1 where s1.salary = s3.min_salary and s1.department = s3.department) lastname
from
(select department, min (salary) min_salary from staff s2 group by s2.department) s3

Joining Table A and B to get elements of both

I have two tables:
Table 'bookings':
id | date | hours
--------------------------
1 | 06/01/2016 | 2
1 | 06/02/2016 | 1
2 | 06/03/2016 | 2
3 | 06/03/2016 | 4
Table 'lookupCalendar':
date
-----
06/01/2016
06/02/2016
06/03/2016
I want to join them together so that I have a date for each booking so that the results look like this:
Table 'results':
id | date | hours
--------------------------
1 | 06/01/2016 | 2
1 | 06/02/2016 | 1
1 | 06/03/2016 | 0 <-- Added by query
2 | 06/01/2016 | 0 <-- Added by query
2 | 06/02/2016 | 0 <-- Added by query
2 | 06/03/2016 | 2
3 | 06/01/2016 | 0 <-- Added by query
3 | 06/02/2016 | 0 <-- Added by query
3 | 06/03/2016 | 4
I have tried doing a cross-apply, but that doesn't get me there, neither does a full join. The FULL JOIN just gives me nulls in the id column and the cross-apply gives me too much data.
Is there a query that can give me the results table above?
More Information
It might be beneficial to note that I am doing this so that I can calculate an average hours booked over a period of time, not just the number of records in the table.
Ideally, I'd be able to do
SELECT AVG(hours) AS my_average, id
FROM bookings
GROUP BY id
But since AVG would divide by the number of booking records rather than the number of days, I want to cross apply it with the dates. Then I think I can just run the query above against the results table.
select i.id, c.date, coalesce(b.hours, 0) as hours
from lookupCalendar c
cross join (select distinct id from bookings) i
left join bookings b
on b.id = i.id
and b.date = c.date
order by i.id, c.date
Try this:
select c.date, b.id, isnull(b.hours, 0)
from lookupCalendar c
left join bookings b on b.date = c.date
LookupCalendar is your main table because you want the bookings against each date, irrespective of whether there was a booking on that date or not, so a left join is required.
I am not sure if you need to include b.id to solve your actual problem though. Wouldn't you just want to get the total number of hours booked against each date like this, to then calculate the average?:
select c.date, sum(isnull(b.hours, 0))
from lookupCalendar c
left join bookings b on b.date = c.date
group by c.date
You can try joining all the combinations of IDs and dates and then left joining the data:
WITH Booking AS (SELECT *
FROM (VALUES
( 1 , '06/01/2016', 2 )
, ( 1 , '06/02/2016', 1 )
, ( 2 , '06/03/2016', 2 )
, ( 3 , '06/03/2016', 4 )
) x (id, date, hours)
)
, lookupid AS (
SELECT DISTINCT id FROM Booking
)
, lookupCalender AS (
SELECT DISTINCT date FROM Booking
)
SELECT ID.id, Cal.Date, ISNULL(B.Hours,0) AS hours
FROM lookupid id
INNER JOIN lookupCalender Cal
ON 1 = 1
LEFT JOIN Booking B
ON id.id = B.id
AND Cal.date = B.Date
ORDER BY ID.id, Cal.Date
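To connect this back to the stated goal (average hours per calendar day, not per booking record), here is a minimal sketch that assumes SQLite run from Python. The dates are rewritten in ISO format as an assumption for clean text comparison, and COALESCE replaces the SQL Server specific ISNULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE bookings (id INTEGER, date TEXT, hours INTEGER);
CREATE TABLE lookupCalendar (date TEXT);
INSERT INTO bookings VALUES
    (1, '2016-06-01', 2), (1, '2016-06-02', 1),
    (2, '2016-06-03', 2), (3, '2016-06-03', 4);
INSERT INTO lookupCalendar VALUES
    ('2016-06-01'), ('2016-06-02'), ('2016-06-03');
""")

# Every id is paired with every calendar date; days without a booking
# contribute 0 hours, so AVG divides by the number of days.
query = """
SELECT i.id, AVG(COALESCE(b.hours, 0)) AS my_average
FROM lookupCalendar AS c
CROSS JOIN (SELECT DISTINCT id FROM bookings) AS i
LEFT JOIN bookings AS b
       ON b.id = i.id
      AND b.date = c.date
GROUP BY i.id
ORDER BY i.id
"""
averages = dict(conn.execute(query).fetchall())
print(averages)
```

The COALESCE matters here: AVG skips NULLs, so without it the unbooked days would be ignored and the result would fall back to the per-record average.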

Sorting over a sum from two tables in mySQL

I can't figure out a query that will add and compare across tables. I have three tables:
house
id
-----
1
2
month
id | house_id | btus
---------------------
3 | 1 | 100
4 | 2 | 200
car
id | month_id | btu
--------------------------
5 | 3 | 10
6 | 4 | 20
7 | 3 | 15
I need a query that will return the house ids sorted by total btus for the month and car.
So for the above example it would return 2,1 as (200+20) > (100 + 10 + 15)
SELECT h.*
FROM house h
ORDER BY
(
SELECT COALESCE(SUM(c.btu), 0)
FROM month m
JOIN car c
ON c.month_id = m.id
WHERE m.house_id = h.id
) +
(
SELECT COALESCE(SUM(m.btus), 0)
FROM month m
WHERE m.house_id = h.id
)
DESC
Or this one (probably slightly more efficient):
SELECT h.*
FROM house h
ORDER BY
(
SELECT SUM
(
btus +
COALESCE((
SELECT SUM(btu)
FROM car c
WHERE c.month_id = m.id
), 0)
)
FROM month m
WHERE m.house_id = h.id
)
DESC
Probably not the most performant use of joins, but:
SELECT house.id
FROM house
JOIN month ON month.house_id = house.id
LEFT JOIN (SELECT month_id, SUM(btu) AS car_btu FROM car GROUP BY month_id) c
ON c.month_id = month.id
GROUP BY house.id
ORDER BY SUM(month.btus + COALESCE(c.car_btu, 0)) DESC;
Joining car directly would repeat each month row once per car (try it with SELECT * and without the group/order clauses), so month.btus would be counted more than once. Summing the cars per month first keeps one row per month; group by then flattens the months to one row per house, and the sum() does the relevant math.
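How the join interacts with SUM is easy to verify on the sample data. The sketch below assumes SQLite run from Python and compares a direct three-table join with a version that sums the cars per month before joining; the query names naive and pre_aggregated are labels chosen for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE house (id INTEGER);
CREATE TABLE month (id INTEGER, house_id INTEGER, btus INTEGER);
CREATE TABLE car (id INTEGER, month_id INTEGER, btu INTEGER);
INSERT INTO house VALUES (1), (2);
INSERT INTO month VALUES (3, 1, 100), (4, 2, 200);
INSERT INTO car VALUES (5, 3, 10), (6, 4, 20), (7, 3, 15);
""")

# Direct join: month 3 appears twice (once per car), so its btus count twice.
naive = """
SELECT h.id, SUM(c.btu) + SUM(m.btus) AS total
FROM house AS h
JOIN month AS m ON m.house_id = h.id
JOIN car AS c ON c.month_id = m.id
GROUP BY h.id
ORDER BY total DESC
"""

# Cars are summed per month first, so each month row appears exactly once.
pre_aggregated = """
SELECT h.id, SUM(m.btus + COALESCE(c.car_btu, 0)) AS total
FROM house AS h
JOIN month AS m ON m.house_id = h.id
LEFT JOIN (SELECT month_id, SUM(btu) AS car_btu
           FROM car
           GROUP BY month_id) AS c
       ON c.month_id = m.id
GROUP BY h.id
ORDER BY total DESC
"""
naive_rows = conn.execute(naive).fetchall()
correct_rows = conn.execute(pre_aggregated).fetchall()
print(naive_rows)
print(correct_rows)
```

On this data the direct join ranks house 1 first with an inflated total, while the pre-aggregated version returns the order 2, 1 that the question expects.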