MySQL Combine multiple rows - sql

I have a table similar to the following (of course with more rows and fields):
category_id | client_id | date | step
1 1 2009-12-15 first_step
1 1 2010-02-03 last_step
1 2 2009-04-05 first_step
1 2 2009-08-07 last_step
2 3 2009-11-22 first_step
3 4 2009-11-14 first_step
3 4 2010-05-09 last_step
I would like to transform this so that I can calculate the time between the first and last steps and eventually find the average time between first and last steps, etc. Basically, I'm stumped at how to transform the above table into something like:
category_id | first_date | last_date
1 2009-12-15 2010-02-03
1 2009-04-05 2009-08-07
2 2009-11-22 NULL
3 2009-11-14 2010-05-09
Any help would be appreciated.

Updated based on question update/clarification:
SELECT t.category_id,
MIN(t.date) AS first_date,
CASE
WHEN MAX(t.date) = MIN(t.date) THEN NULL
ELSE MAX(t.date)
END AS last_date
FROM table_name t
GROUP BY t.category_id, t.client_id

A simple GROUP BY should do the trick:
SELECT category_id
, MIN(date) AS first_date
, MAX(date) AS last_date
FROM table_name
GROUP BY category_id

Try:
select
category_id
, min(date) as first_date
, max(date) as last_date
from
table_name
group by
category_id
, client_id

You need two subqueries, joined together. Something like the following:
select
a.category_id,
a.first_date,
b.last_date
from
(select
category_id,
client_id,
date as first_date
from
table_name
where step = 'first_step') a
left join (select
category_id,
client_id,
date as last_date
from
table_name
where step = 'last_step') b
on (a.category_id = b.category_id
and a.client_id = b.client_id)
Enjoy!

Simple answer:
SELECT first.category_id, first.date AS first_date, last.date AS last_date
FROM tableName first
LEFT JOIN tableName last
ON first.category_id = last.category_id
AND first.client_id = last.client_id
AND last.step = 'last_step'
WHERE first.step = 'first_step'
You could also do the calculations in the query instead of just returning the two date values.
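As a rough sketch of doing the calculation in the query, the self-join can be tried with Python's built-in sqlite3 module. SQLite is used here only as a stand-in for MySQL, and the table name `steps` is an assumption for illustration. The elapsed days come from SQLite's julianday(); in MySQL, DATEDIFF would play that role.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE steps (category_id INT, client_id INT, date TEXT, step TEXT)"
)
conn.executemany(
    "INSERT INTO steps VALUES (?, ?, ?, ?)",
    [
        (1, 1, "2009-12-15", "first_step"),
        (1, 1, "2010-02-03", "last_step"),
        (1, 2, "2009-04-05", "first_step"),
        (1, 2, "2009-08-07", "last_step"),
        (2, 3, "2009-11-22", "first_step"),
        (3, 4, "2009-11-14", "first_step"),
        (3, 4, "2010-05-09", "last_step"),
    ],
)

# Pair each first_step row with the last_step row of the same category and
# client; the LEFT JOIN keeps clients that have no last_step yet.
rows = conn.execute(
    """
    SELECT f.category_id,
           f.date AS first_date,
           l.date AS last_date,
           CAST(julianday(l.date) - julianday(f.date) AS INTEGER) AS days_between
    FROM steps f
    LEFT JOIN steps l
           ON l.category_id = f.category_id
          AND l.client_id = f.client_id
          AND l.step = 'last_step'
    WHERE f.step = 'first_step'
    ORDER BY f.category_id, f.client_id
    """
).fetchall()

# Average over the clients that actually reached the last step.
finished = [days for (_, _, _, days) in rows if days is not None]
average_days = sum(finished) / len(finished)

for row in rows:
    print(row)
print("average days:", average_days)
```

Clients without a last step come back with NULL for last_date and days_between, so they are left out of the average rather than counted as zero.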

Related

Getting count of last records of 2 columns SQL

I am looking for a solution for the scenario below.
My table structure is like this (table name: energy_readings):
equipment_id | meter_id | readings | reading_date
1 1 100 01/01/2022
1 1 200 02/01/2022
1 1 null 03/01/2022
1 2 100 01/01/2022
1 2 null 04/01/2022
2 1 null 04/01/2022
2 1 399 05/01/2022
2 2 null 02/01/2022
From this, I want to get the number of nulls among the last records of each equipment_id and meter_id pair. (Only a null in the last record of a pair should be counted.)
Example: here, the last reading for equipment 1 and meter 1 is null, so it should be counted. The last reading (latest date) for equipment 1 and meter 2 is also null, so it should be counted too. But although equipment 2 and meter 1 has a null, it is not the last record (latest date), so it should not be counted.
Thus, this should be the result:
equipment_id | Count
1 2
2 1
Hope I was clear with the question.
Thank you!
You can use a CTE like the one below. The CTE LatestRecord gets the latest reading_date for each equipment_id and meter_id pair. Joining it back to the table and filtering with WHERE keeps only the latest records whose readings value is null.
;WITH LatestRecord AS (
SELECT equipment_id, meter_id, MAX(reading_date) AS reading_date
FROM energy_readings
GROUP BY equipment_id, meter_id
)
SELECT er.equipment_id, COUNT(1) AS [Count]
FROM energy_readings er
JOIN LatestRecord lr
ON lr.equipment_id = er.equipment_id
AND lr.meter_id = er.meter_id
AND lr.reading_date = er.reading_date
WHERE er.readings IS NULL
GROUP BY er.equipment_id
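The latest-record CTE idea can be checked against the sample data with a small sketch using Python's built-in sqlite3 module. This is an illustration, not the original poster's environment: SQLite stands in for SQL Server, the dates are assumed to be day-first and are stored as ISO strings so they sort correctly, and the result is grouped by equipment_id because that is what the expected output shows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE energy_readings "
    "(equipment_id INT, meter_id INT, readings INT, reading_date TEXT)"
)
# Sample rows from the question; dates rewritten as ISO strings (assumed day-first).
conn.executemany(
    "INSERT INTO energy_readings VALUES (?, ?, ?, ?)",
    [
        (1, 1, 100, "2022-01-01"),
        (1, 1, 200, "2022-01-02"),
        (1, 1, None, "2022-01-03"),
        (1, 2, 100, "2022-01-01"),
        (1, 2, None, "2022-01-04"),
        (2, 1, None, "2022-01-04"),
        (2, 1, 399, "2022-01-05"),
        (2, 2, None, "2022-01-02"),
    ],
)

# Find the latest date per (equipment, meter), join back to fetch that row,
# keep only rows whose reading is NULL, then count per equipment.
null_counts = conn.execute(
    """
    WITH latest AS (
        SELECT equipment_id, meter_id, MAX(reading_date) AS reading_date
        FROM energy_readings
        GROUP BY equipment_id, meter_id
    )
    SELECT er.equipment_id, COUNT(*) AS null_count
    FROM energy_readings er
    JOIN latest lr
      ON lr.equipment_id = er.equipment_id
     AND lr.meter_id = er.meter_id
     AND lr.reading_date = er.reading_date
    WHERE er.readings IS NULL
    GROUP BY er.equipment_id
    ORDER BY er.equipment_id
    """
).fetchall()

print(null_counts)
```

An equipment whose latest readings are all non-null does not appear in the result at all; a LEFT JOIN from a list of equipment ids would be needed to report it with a count of zero.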
Alternatively, rank the readings with a window function:
with records as (
select equipment_id, meter_id, reading_date, readings,
RANK() OVER (PARTITION BY equipment_id, meter_id
order by reading_date desc) as rnk
from energy_readings
)
select equipment_id, count(*) as null_count
from records
where rnk = 1 and readings IS NULL
group by equipment_id
Explanation:
records ranks the rows of each equipment_id and meter_id pair by reading_date, newest first, so the latest reading gets rank 1.
The outer query keeps only the rank 1 rows whose readings value is null.
It then counts those rows per equipment_id.

Select rows from a particular row to latest row if that particular row type exist

I want to meet these two requirements with a single query. Currently I use two queries in the program and do the processing in C#, something like this:
Pseudocode
select top 1 id from table where type=b
if result.row.count > 0 {var typeBid = row["id"]}
select * from table where id >= {typeBid}
else
select * from table
Req1: If records with type=b exist, the result should be the latest row with type=b and all rows added after it.
Table
--------------------
id type date
--------------------
1 b 2021-10-15
2 a 2021-11-16
3 b 2021-11-19
4 a 2021-12-02
5 c 2021-12-12
6 a 2021-12-16
Result
--------------------
id type date
--------------------
3 b 2021-11-19
4 a 2021-12-02
5 c 2021-12-12
6 a 2021-12-16
Req2: If NO record with type=b exists, the query should select all the records in the table.
Table
---------------------
id type date
---------------------
1 a 2021-10-15
2 a 2021-11-16
3 a 2021-11-19
4 a 2021-12-02
5 c 2021-12-12
6 a 2021-12-16
Result
--------------------
id type date
--------------------
1 a 2021-10-15
2 a 2021-11-16
3 a 2021-11-19
4 a 2021-12-02
5 c 2021-12-12
6 a 2021-12-16
with max_b_date as (select max(date) as date
from table1 where type = 'b')
select t1.*
from table1 t1
cross join max_b_date
where t1.date >= max_b_date.date
or max_b_date.date is null
(table is a SQL reserved word, https://en.wikipedia.org/wiki/SQL_reserved_words, so I used table1 as table name instead.)
https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=bd05543a9712e27f01528708f10b209f
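Both requirements can be exercised with the cross-join query above in a small sketch using Python's built-in sqlite3 module. SQLite stands in for SQL Server here, and the helper name `select_from_latest_b` is hypothetical. The key point is that MAX over an empty set yields a single NULL row, which the `IS NULL` branch turns into "select everything".

```python
import sqlite3

QUERY = """
WITH max_b_date AS (
    SELECT MAX(date) AS date FROM table1 WHERE type = 'b'
)
SELECT t1.id, t1.type, t1.date
FROM table1 t1
CROSS JOIN max_b_date
WHERE t1.date >= max_b_date.date
   OR max_b_date.date IS NULL
ORDER BY t1.id
"""


def select_from_latest_b(rows):
    """Load rows into a fresh in-memory table and run the single query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (id INT, type TEXT, date TEXT)")
    conn.executemany("INSERT INTO table1 VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


# Req1 data: two rows of type b, the latest one has id 3.
with_b = [
    (1, "b", "2021-10-15"),
    (2, "a", "2021-11-16"),
    (3, "b", "2021-11-19"),
    (4, "a", "2021-12-02"),
    (5, "c", "2021-12-12"),
    (6, "a", "2021-12-16"),
]
# Req2 data: the same rows with every b replaced by a.
without_b = [(i, "a" if t == "b" else t, d) for (i, t, d) in with_b]

req1 = select_from_latest_b(with_b)
req2 = select_from_latest_b(without_b)
print([row[0] for row in req1])
print([row[0] for row in req2])
```

Note that this compares on date rather than id, so it assumes rows added later never carry an earlier date than the latest type b row.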
Please try this (it's somewhat involved, but it might be exactly what you are looking for):
select ab.* from
((select top 1 id, type, date from test where type = 'b' order by id desc)
union
select * from test where type != 'b') as ab
where ab.id >= (select COALESCE((select top 1 id from test where type = 'b' order by id desc), 0))
order by ab.id;
https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=739eb6bfee787e5079e616bbf4e933b1
Looks like you can use an OR condition here:
SELECT
*
FROM
(
SELECT
*,
BCount = COUNT(CASE type WHEN 'b' THEN 1 ELSE NULL END) OVER () -- count of rows with type b across the whole table
FROM table1
) Q
WHERE
(
BCount > 0 AND id >= (SELECT TOP 1 id FROM table1 WHERE type = 'b' ORDER BY id DESC) -- rows with type b exist: Req 1
)
OR
(
BCount = 0 -- no rows with type b: select all (Req 2)
)

Postgresql query to filter latest data based on 2 columns

Table structure first.
users table
id
1
2
3
sites table
id
1
2
site_memberships table
site_id | user_id | created_on
1 1 1
1 1 2
1 1 3
2 1 1
2 1 2
1 2 2
1 2 3
Assume that the higher the created_on number, the more recent the record.
Expected output
site_id | user_id | created_on
1 1 3
2 1 2
1 2 3
I need the latest record for each user in each site membership.
I tried the following query, but it does not seem to work.
select * from users inner join
(
SELECT ROW_NUMBER () OVER (
PARTITION BY sm.user_id,
sm.created_on
), sm.*
from site_memberships sm
inner join sites s on sm.site_id=s.id
) site_memberships
ON site_memberships.user_id = users.user_id where row_number=1
I think you have overcomplicated the problem you want to solve.
You seem to want aggregation:
select site_id, user_id, max(created_on)
from site_memberships sm
group by site_id, user_id;
If you had additional columns that you wanted, you could use distinct on instead:
select distinct on (site_id, user_id) sm.*
from site_memberships sm
order by site_id, user_id, created_on desc;
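The aggregation can be checked against the sample data with a short sketch using Python's built-in sqlite3 module. SQLite stands in for PostgreSQL here and has no DISTINCT ON, so the second query uses ROW_NUMBER partitioned by site_id and user_id as an assumed equivalent (it needs SQLite 3.25 or newer). That is also the partitioning the attempted query in the question was missing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE site_memberships (site_id INT, user_id INT, created_on INT)"
)
conn.executemany(
    "INSERT INTO site_memberships VALUES (?, ?, ?)",
    [(1, 1, 1), (1, 1, 2), (1, 1, 3), (2, 1, 1), (2, 1, 2), (1, 2, 2), (1, 2, 3)],
)

# Plain aggregation: enough when only the grouping keys and the max are needed.
latest_by_group = conn.execute(
    """
    SELECT site_id, user_id, MAX(created_on) AS created_on
    FROM site_memberships
    GROUP BY site_id, user_id
    ORDER BY site_id, user_id
    """
).fetchall()

# Window-function variant: keeps the whole latest row, so extra columns survive.
latest_by_row_number = conn.execute(
    """
    SELECT site_id, user_id, created_on
    FROM (
        SELECT sm.*,
               ROW_NUMBER() OVER (
                   PARTITION BY site_id, user_id
                   ORDER BY created_on DESC
               ) AS rn
        FROM site_memberships sm
    ) AS ranked
    WHERE rn = 1
    ORDER BY site_id, user_id
    """
).fetchall()

print(latest_by_group)
print(latest_by_row_number)
```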

How to join two tables without performing Cartesian product in SQL

I have index_date information for IDs and I want to extract the baseline (the information between index_date minus 6 months and index_date). I want to do this without using a Cartesian product.
Total Table
ID index_date detail
1 01Jan2012 xyz
1 01Dec2011 pqr
1 01Nov2010 pqr
2 26Feb2013 abc
3 02Mar2013 abc
3 02Feb2013 ert
3 02Jan2013 tyu
4 07May2015 rts
I have a table A extracted from Total which has the index_dates:
ID index_date index_detail
1 01Jan2012 xyz
2 26Feb2013 abc
3 02Mar2013 abc
4 07May2015 rts
I want to extract the baseline-period data for the IDs in A from the Total table.
Table want:
ID date index_date detail index_detail
1 01Jan2012 01Jan2012 xyz xyz
1 01Dec2011 01Jan2012 pqr xyz
2 26Feb2013 26Feb2013 abc abc
3 02Mar2013 02Mar2013 abc abc
3 02Feb2013 02Mar2013 ert abc
3 02Jan2013 02Mar2013 tyu abc
4 07May2015 07May2015 rts rts
code used :
create table want as
select a.* , b.date,b.detail
from table_a as a
right join
Total as b
on a.id = b.id where
a.index_date > b.date
AND b.date >= add_months( a.index_date, -6)
;
But this requires a Cartesian product. Is there a way to do this without one?
DBMS - Hive
Sorry, I don't know Hive.
Here is a solution in plain SQL for MySQL 8+; maybe you can find a way to convert it to Hive syntax.
SELECT id,
index_date date,
FIRST_VALUE(index_date) OVER (PARTITION BY ID ORDER BY STR_TO_DATE(index_date, '%d%b%Y') DESC) index_date,
detail,
FIRST_VALUE(detail) OVER (PARTITION BY ID ORDER BY STR_TO_DATE(index_date, '%d%b%Y') DESC) index_detail
FROM test
ORDER BY 1 ASC, 2 DESC
I would recommend three steps:
Convert the date to a number.
Find the minimum date in a six month period.
Get the first value in that group.
This looks like:
select t.*, t2.index_date, t2.detail
from (select t.*,
min(index_date) over (partition by id
order by months
range between 6 preceding and current row
) as sixmonth_date
from (select t.*,
year(index_date) * 12 + month(index_date) as months
from total t
) t
) t left join
total t2
on t2.id = t.id and t2.index_date = t.sixmonth_date;
This is marginally simpler if first_value() accepts range window frames -- but I'm not sure if it does. It is worth trying, though:
select t.*,
min(index_date) over (partition by id
order by months
range between 6 preceding and current row
) as sixmonth_date,
first_value(detail) over (partition by id
order by months
range between 6 preceding and current row
) as sixmonth_value
from (select t.*,
year(index_date) * 12 + month(index_date) as months
from total t
) t
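The first two steps of the three-step approach (turn the date into a month number, then take the minimum date in a six-month RANGE frame) can be tried out with Python's built-in sqlite3 module. This is only a sketch under several assumptions: SQLite (3.28 or newer for RANGE frames with an offset) stands in for Hive, the dates are stored as ISO strings, and strftime replaces year() and month(), which SQLite does not have.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE total (id INT, index_date TEXT, detail TEXT)")
# Rows of the Total table from the question, with dates rewritten as ISO strings.
conn.executemany(
    "INSERT INTO total VALUES (?, ?, ?)",
    [
        (1, "2012-01-01", "xyz"),
        (1, "2011-12-01", "pqr"),
        (1, "2010-11-01", "pqr"),
        (2, "2013-02-26", "abc"),
        (3, "2013-03-02", "abc"),
        (3, "2013-02-02", "ert"),
        (3, "2013-01-02", "tyu"),
        (4, "2015-05-07", "rts"),
    ],
)

# Step 1: months = year * 12 + month, an integer the RANGE frame can work with.
# Step 2: the earliest date among rows at most 6 months before the current row.
window_rows = conn.execute(
    """
    SELECT id, index_date, detail,
           MIN(index_date) OVER (
               PARTITION BY id
               ORDER BY months
               RANGE BETWEEN 6 PRECEDING AND CURRENT ROW
           ) AS sixmonth_date
    FROM (
        SELECT t.*,
               CAST(strftime('%Y', index_date) AS INTEGER) * 12
                 + CAST(strftime('%m', index_date) AS INTEGER) AS months
        FROM total t
    ) AS m
    ORDER BY id, index_date
    """
).fetchall()

for row in window_rows:
    print(row)
```

Because the frame is measured in whole months, two dates that are six calendar months apart by month number fall in the same window regardless of the day of the month.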

How to get MAX Hike in Min month?

Below is the table:
Name | Hike% | Month
------------------------
A 7 1
A 6 2
A 8 3
b 4 1
b 7 2
b 7 3
Result should be:
Name | Hike% | Month
------------------------
A 8 3
b 7 2
Here is one way of doing this:
SELECT Name, [Hike%], Month
FROM
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY Name ORDER BY [Hike%] DESC, Month) rn
FROM yourTable
) t
WHERE rn = 1
ORDER BY Name;
If you instead want to return multiple records per name, in the case where two or more records might be tied for having the greatest hike%, then replace ROW_NUMBER with RANK.
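The ROW_NUMBER approach can be run against the sample data with Python's built-in sqlite3 module. This is a sketch in which SQLite (3.25 or newer) stands in for SQL Server, the column is named `hike` instead of `[Hike%]`, and `your_table` is the placeholder name from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (name TEXT, hike INT, month INT)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?)",
    [("A", 7, 1), ("A", 6, 2), ("A", 8, 3), ("b", 4, 1), ("b", 7, 2), ("b", 7, 3)],
)

# Within each name, sort by highest hike first and earliest month second,
# then keep only the first row of each partition.
top_rows = conn.execute(
    """
    SELECT name, hike, month
    FROM (
        SELECT t.*,
               ROW_NUMBER() OVER (
                   PARTITION BY name
                   ORDER BY hike DESC, month
               ) AS rn
        FROM your_table t
    ) AS ranked
    WHERE rn = 1
    ORDER BY name
    """
).fetchall()

print(top_rows)
```

The second sort key is what breaks the tie for b: both month 2 and month 3 have a hike of 7, and ordering by month ascending picks month 2.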
Use a correlated subquery:
select Name,min(Hike) as Hike,min(Month) as Month
from
(
select * from tablename a
where Hike in (select max(Hike) from tablename b where a.name=b.name)
)A group by Name
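You can use something similar to the below:
SELECT t.Name, t.Hike, MIN(t.Month) AS Month
FROM tablename t
JOIN (SELECT Name, MAX(Hike) AS Hike FROM tablename GROUP BY Name) m
ON m.Name = t.Name AND m.Hike = t.Hike
GROUP BY t.Name, t.Hike
Hope this helps :)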