SELECT cell value and apply for all rows based on ID - sql

How can I take the value from one cell and display it for all rows that belong to the same id?
Example of table:
id | event | parameter
1111111 | session_start | value_1
1111111 | page_view | null
1111111 | page_view | null
2222222 | session_start | value_2
2222222 | page_view | null
2222222 | page_view | null
3333333 | session_start | value_3
3333333 | page_view | null
3333333 | page_view | null
Output:
id | event | parameter
1111111 | session_start | value_1
1111111 | page_view | value_1
1111111 | page_view | value_1
2222222 | session_start | value_2
2222222 | page_view | value_2
2222222 | page_view | value_2
3333333 | session_start | value_3
3333333 | page_view | value_3
3333333 | page_view | value_3

I wouldn't use a CTE or subquery here (both are far too complicated for this use case) and I also don't think using an aggregate function like MIN is correct here. This would also replace other values if present, not only NULL values.
In my opinion, this is a perfect use case for FIRST_VALUE with COALESCE.
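You didn't tag your DBMS, but most current DBMSs will execute the following query correctly:
SELECT id, event,
       COALESCE(parameter, FIRST_VALUE(parameter)
           OVER (PARTITION BY id ORDER BY id)) AS parameter
FROM mytable;
This gets the first value for each id and uses it only where the original value is NULL; non-NULL values are left unchanged. Note that ORDER BY id does not define an order within a partition, so which row counts as "first" is up to the DBMS. If a NULL row can come before the session_start row, order the window by a column or expression that puts the populated row first.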
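As a minimal sketch of the COALESCE plus FIRST_VALUE approach, the following Python script runs the query against an in-memory SQLite database (SQLite 3.25 or later is needed for window functions). The function name fill_parameter and the ORDER BY expression are assumptions for illustration: the sample table has no ordering column, so the sketch sorts the session_start row first to make the "first" value deterministic.

```python
import sqlite3


def fill_parameter(rows):
    """Fill NULL parameters with the session_start value of the same id."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytable (id INTEGER, event TEXT, parameter TEXT)")
    conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", rows)
    query = """
        SELECT id, event,
               COALESCE(parameter,
                        FIRST_VALUE(parameter) OVER (
                            PARTITION BY id
                            -- assumption: the session_start row carries the value
                            ORDER BY CASE WHEN event = 'session_start' THEN 0 ELSE 1 END
                        )) AS parameter
        FROM mytable
        ORDER BY id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    sample = [
        (1111111, "session_start", "value_1"),
        (1111111, "page_view", None),
        (2222222, "session_start", "value_2"),
        (2222222, "page_view", None),
    ]
    for row in fill_parameter(sample):
        print(row)
```

Because COALESCE only falls back to the window value when parameter is NULL, a row that already has its own value keeps it.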

Related

Highest value for each GROUP

I have a table with 2 columns (there are more but these are the important ones) timestamp and analysisId. There is no constraint on either but in practice timestamp will be unique. Many rows have the same analysisId and different timestamps. I need a query that returns only the highest timestamp for each analysisId
So for example the data may look something like
timestamp | analysisId
1234 | 1
1236 | 1
1300 | 2
1337 | 3
1400 | 3
And the result I would want would be
timestamp | analysisId
1236 | 1
1300 | 2
1400 | 3
Currently, I have
SELECT "timestamp", analysisId FROM myData GROUP BY (analysisId, "timestamp") ORDER BY "timestamp" DESC LIMIT 1;
However, this of course gives me only one result, whereas I want one result per analysisId.
This is a simple aggregate using max: group by analysisId alone and take the highest timestamp in each group.
select analysisId, max("timestamp") as "timestamp"
from myData
group by analysisId;
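To check the aggregate against the sample data from the question, here is a small sketch that runs it in an in-memory SQLite database. The function name latest_per_analysis is hypothetical, and SQLite stands in for whatever DBMS the question actually targets.

```python
import sqlite3


def latest_per_analysis(rows):
    """Return (analysisId, highest timestamp) pairs, one per analysisId."""
    conn = sqlite3.connect(":memory:")
    # "timestamp" is quoted, as in the question, to avoid clashing with the type name.
    conn.execute('CREATE TABLE myData ("timestamp" INTEGER, analysisId INTEGER)')
    conn.executemany("INSERT INTO myData VALUES (?, ?)", rows)
    result = conn.execute(
        'SELECT analysisId, MAX("timestamp") AS "timestamp" '
        "FROM myData GROUP BY analysisId ORDER BY analysisId"
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    sample = [(1234, 1), (1236, 1), (1300, 2), (1337, 3), (1400, 3)]
    print(latest_per_analysis(sample))
```

Grouping by analysisId only is what makes this work; the original attempt grouped by both columns, which puts every row in its own group.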

SQL: Group multiple records by ID and get initial and end date based on date field

I have the following table:
+-----------+------------+
| client_id | reg_date |
+-----------+------------+
| 1 | 01-01-2021 |
+-----------+------------+
| 1 | 01-06-2021 |
+-----------+------------+
| 2 | 01-01-2021 |
+-----------+------------+
I need a new table, with the init_reg_date and end_reg_date per client_id, something like this:
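+-----------+---------------+--------------+
| client_id | init_reg_date | end_reg_date |
+-----------+---------------+--------------+
| 1         | 01-01-2021    | 01-06-2021   |
+-----------+---------------+--------------+
| 2         | 01-01-2021    |              |
+-----------+---------------+--------------+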
Is there any way to do this with SQL?
Thanks to all!
Do a GROUP BY. Use a case expression to return the latest date for a client only if it's not the same date as the client's first date.
select client_id,
       min(reg_date) as init_reg_date,
       case when max(reg_date) > min(reg_date) then max(reg_date) end as end_reg_date
from tablename
group by client_id
You can simplify this a bit with NULLIF: if the two operands are equal, NULLIF returns NULL; otherwise it returns the first operand.
select client_id,
       min(reg_date) as init_reg_date,
       NULLIF(max(reg_date), min(reg_date)) as end_reg_date
from tablename
group by client_id
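The NULLIF variant can be tried out with the sketch below, which runs it in an in-memory SQLite database. The function name registration_range is hypothetical, and the dates are stored as ISO-formatted strings (an assumption made so that MIN and MAX compare them chronologically; a real DATE column would not need this).

```python
import sqlite3


def registration_range(rows):
    """Return (client_id, init_reg_date, end_reg_date) per client.

    end_reg_date is None when a client has only one distinct date.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tablename (client_id INTEGER, reg_date TEXT)")
    conn.executemany("INSERT INTO tablename VALUES (?, ?)", rows)
    result = conn.execute("""
        SELECT client_id,
               MIN(reg_date) AS init_reg_date,
               NULLIF(MAX(reg_date), MIN(reg_date)) AS end_reg_date
        FROM tablename
        GROUP BY client_id
        ORDER BY client_id
    """).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    sample = [(1, "2021-01-01"), (1, "2021-06-01"), (2, "2021-01-01")]
    for row in registration_range(sample):
        print(row)
```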

How to associate date events together by date

I'm working on writing a query to organize install and removal dates for car part numbers. I want to find a record of all car part installs, and removals of the same part if they have been removed from a vehicle, identified by its VIN. I'm having trouble associating these events together because the only thing tying them together is the dates. Removals must occur after installs and another install cannot occur on the same part unless it has been removed first.
I have been able to summarize the data into separate rows by event type (e.g. each install has its own row and each removal has its own row).
What I've tried is using DECODE() by event type, but it keeps the records in separate rows. Maybe there's something COALESCE() can do here, but I'm not sure.
Here's a summary of how the data looks:
part_no | serial_no | car_vin | event_type | event_date
12345 | a1b2c3 | 9876543 | INSTALL | 01-JAN-2019
12345 | a1b2c3 | 9876543 | REMOVE | 01-AUG-2019
54321 | t3c4a8 | 9876543 | INSTALL | 01-MAR-2019
12345 | a1b2c3 | 3456789 | INSTALL | 01-SEP-2019
And here's what the expected outcome is:
part_no | serial_no | car_vin | install_date | remove_date
12345 | a1b2c3 | 9876543 | 01-JAN-2019 | 01-AUG-2019
12345 | a1b2c3 | 3456789 | 01-SEP-2019 |
54321 | t3c4a8 | 9876543 | 01-MAR-2019 |
We can use pivoting logic here:
SELECT
    part_no,
    serial_no,
    car_vin,
    MAX(CASE WHEN event_type = 'INSTALL' THEN event_date END) AS install_date,
    MAX(CASE WHEN event_type = 'REMOVE' THEN event_date END) AS remove_date
FROM yourTable
GROUP BY
    part_no,
    serial_no,
    car_vin
ORDER BY
    part_no;
This approach is a typical way to transform a key value store table, which is basically what your table is, into the output you want to see.
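As a rough sketch of the pivoting logic, the script below runs the conditional aggregation on the four sample events in an in-memory SQLite database. The function name pivot_events is hypothetical, the dates are stored as ISO strings (an assumption so that MAX compares them chronologically), and install_date is added to the ORDER BY only to make the row order predictable.

```python
import sqlite3


def pivot_events(rows):
    """Turn one-row-per-event data into one row per (part, serial, VIN)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE yourTable (part_no INTEGER, serial_no TEXT, "
        "car_vin INTEGER, event_type TEXT, event_date TEXT)"
    )
    conn.executemany("INSERT INTO yourTable VALUES (?, ?, ?, ?, ?)", rows)
    result = conn.execute("""
        SELECT part_no, serial_no, car_vin,
               MAX(CASE WHEN event_type = 'INSTALL' THEN event_date END) AS install_date,
               MAX(CASE WHEN event_type = 'REMOVE' THEN event_date END) AS remove_date
        FROM yourTable
        GROUP BY part_no, serial_no, car_vin
        ORDER BY part_no, install_date
    """).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    sample = [
        (12345, "a1b2c3", 9876543, "INSTALL", "2019-01-01"),
        (12345, "a1b2c3", 9876543, "REMOVE", "2019-08-01"),
        (54321, "t3c4a8", 9876543, "INSTALL", "2019-03-01"),
        (12345, "a1b2c3", 3456789, "INSTALL", "2019-09-01"),
    ]
    for row in pivot_events(sample):
        print(row)
```

One caveat of this design: because rows are grouped by part, serial number and VIN, a part that is installed on the same vehicle twice collapses into a single row holding only the latest install and removal dates.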
You can use the SQL for Pattern Matching (MATCH_RECOGNIZE):
WITH t(part_no, serial_no, car_vin, event_type, event_date) AS
   (SELECT 12345, 'a1b2c3', 9876543, 'INSTALL', DATE '2019-01-01' FROM dual
    UNION ALL SELECT 12345, 'a1b2c3', 9876543, 'REMOVE', DATE '2019-08-01' FROM dual
    UNION ALL SELECT 54321, 't3c4a8', 9876543, 'INSTALL', DATE '2019-03-01' FROM dual
    UNION ALL SELECT 12345, 'a1b2c3', 3456789, 'INSTALL', DATE '2019-09-01' FROM dual)
SELECT part_no, serial_no, car_vin, INSTALL_DATE, REMOVE_DATE
FROM t
MATCH_RECOGNIZE (
    PARTITION BY part_no, serial_no, car_vin
    ORDER BY event_date
    MEASURES
        FINAL MAX(REMOVE.event_date) AS REMOVE_DATE,
        FINAL MAX(INSTALL.event_date) AS INSTALL_DATE
    PATTERN ( INSTALL REMOVE? )
    DEFINE
        REMOVE AS event_type = 'REMOVE',
        INSTALL AS event_type = 'INSTALL'
)
ORDER BY part_no, INSTALL_DATE, REMOVE_DATE;
+--------------------------------------------------+
|PART_NO|SERIAL_NO|CAR_VIN|INSTALL_DATE|REMOVE_DATE|
+--------------------------------------------------+
|12345 |a1b2c3 |9876543|01.01.2019 |01.08.2019 |
|12345 |a1b2c3 |3456789|01.09.2019 | |
|54321 |t3c4a8 |9876543|01.03.2019 | |
+--------------------------------------------------+
The key clause here is PATTERN ( INSTALL REMOVE? ). It means that each match is exactly one INSTALL event followed by zero or one REMOVE event.
If you can have more than one INSTALL event, use PATTERN ( INSTALL+ REMOVE? ).
If you can have more than one INSTALL event and optionally more than one REMOVE event, use PATTERN ( INSTALL+ REMOVE* ).
You can add more events in the same way, e.g. ORDER, DISPOSAL, etc.

How to use function 'LISTAGG' in Netezza

My data
B_STAFF_CODE PERIOD_COLL
----------------------------------
1111111 201901
2222222 201901
1111111 201902
3333333 201903
----------------------------------
I tried to use the function 'LISTAGG' in a SQL statement in Netezza
and I get the error below:
ERROR: Function 'LISTAGG' is not an analytic aggregate but is called with a window spec
SELECT B_STAFF_CODE,
LISTAGG(PERIOD_COLL, ' , ') WITHIN GROUP (ORDER BY PERIOD_COLL) as CONCAT_PERIOD
FROM F_STAFF_MASTER
GROUP BY B_STAFF_CODE;
Expected result:
B_STAFF_CODE CONCAT_PERIOD
----------------------------------
1111111 201901, 201902
2222222 201901
3333333 201903
----------------------------------
You can use GROUP_CONCAT()
SELECT B_STAFF_CODE,
GROUP_CONCAT(PERIOD_COLL, ' , ') as CONCAT_PERIOD
FROM F_STAFF_MASTER
GROUP BY B_STAFF_CODE
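The GROUP_CONCAT call cannot be verified here against Netezza itself, so the sketch below uses SQLite's built-in group_concat(X, Y) as a stand-in to show the shape of the result; the Netezza function may differ in signature and availability. The function name concat_periods is hypothetical. SQLite does not promise a concatenation order, so the sketch sorts the periods in Python rather than in SQL.

```python
import sqlite3


def concat_periods(rows):
    """Return (staff code, 'period, period, ...') with periods sorted ascending."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE F_STAFF_MASTER (B_STAFF_CODE INTEGER, PERIOD_COLL INTEGER)"
    )
    conn.executemany("INSERT INTO F_STAFF_MASTER VALUES (?, ?)", rows)
    grouped = conn.execute("""
        SELECT B_STAFF_CODE,
               GROUP_CONCAT(PERIOD_COLL, ',') AS CONCAT_PERIOD
        FROM F_STAFF_MASTER
        GROUP BY B_STAFF_CODE
        ORDER BY B_STAFF_CODE
    """).fetchall()
    conn.close()
    # Sort inside each group here, since the SQL above does not fix the order.
    return [(code, ", ".join(sorted(periods.split(",")))) for code, periods in grouped]


if __name__ == "__main__":
    sample = [(1111111, 201901), (2222222, 201901), (1111111, 201902), (3333333, 201903)]
    for row in concat_periods(sample):
        print(row)
```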

Display value for each group of records only once

In SAP HANA database I have a table which returns duplicated values for each ID:
ID | NUMBER| VALUE
101| 123 | 0.25
101| 124 | 0.25
102| 125 | 0.7
102| 126 | 0.7
102| 127 | 0.7
In the output I would like VALUE displayed only once for each ID, with NULL for the other rows, like this:
ID | NUMBER| VALUE
101| 123 | 0.25
101| 124 | NULL
102| 125 | 0.7
102| 126 | NULL
102| 127 | NULL
To achieve that I used ROW_NUMBER() function, and displayed VALUE only for records having row number = 1:
SELECT
    CASE WHEN
        ROW_NUMBER() OVER (PARTITION BY "ID") = 1
    THEN
        "VALUE"
    ELSE
        NULL
    END AS "VALUE_2"
FROM
    "MY_TABLE"
Is there any better (more straightforward) way to achieve that result?
Since "straightforward" is subjective, here is how I would approach this requirement:
select id,
       number,
       value,
       NULLIF (value,
               lag(value) over (partition by id
                                order by number asc)
              ) VAL_OR_NULL
from vals
order by id, number;
To me, this "reads" closer to how you describe the desired effect: "display NULL when the same value has just been displayed for the current group".
The EXPLAIN PLAN and the PlanViz results for both approaches are equal, so there is no benefit/disadvantage concerning runtime or memory usage with either of them.
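The NULLIF plus LAG idea can be sketched outside HANA as well. The script below runs it in an in-memory SQLite database (3.25 or later for window functions) on the sample rows from the question; the function name blank_repeats is hypothetical, and SQLite is only a stand-in for HANA here.

```python
import sqlite3


def blank_repeats(rows):
    """Show value only when it differs from the previous row of the same id."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE vals (id INTEGER, number INTEGER, value REAL)")
    conn.executemany("INSERT INTO vals VALUES (?, ?, ?)", rows)
    result = conn.execute("""
        SELECT id, number,
               NULLIF(value,
                      LAG(value) OVER (PARTITION BY id ORDER BY number)
               ) AS val_or_null
        FROM vals
        ORDER BY id, number
    """).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    sample = [(101, 123, 0.25), (101, 124, 0.25),
              (102, 125, 0.7), (102, 126, 0.7), (102, 127, 0.7)]
    for row in blank_repeats(sample):
        print(row)
```

The first row of each id has no predecessor, so LAG returns NULL and NULLIF passes the value through. Unlike the ROW_NUMBER version, this one would also show a value again if it changed part-way through an id.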
I originally thought you were looking for lag(... ignore nulls):
select v.*,
       coalesce(value,
                lag(value ignore nulls) over (partition by id order by number)
               ) as imputed_value
from vals v
order by v.id, v.number;
I don't think HANA supports this, although you can implement it using window functions. However, you are asking for the same value across an entire id. For that, use min() or max():
select v.*,
       max(value) over (partition by id) as imputed_value
from vals v
order by v.id, v.number;
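A minimal sketch of the windowed max() idea, run in an in-memory SQLite database rather than HANA: every row of an id receives the single non-NULL value found anywhere in that id. The function name impute_per_id and the sample rows containing NULLs are assumptions for illustration.

```python
import sqlite3


def impute_per_id(rows):
    """Give every row of an id the maximum non-NULL value of that id."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE vals (id INTEGER, number INTEGER, value REAL)")
    conn.executemany("INSERT INTO vals VALUES (?, ?, ?)", rows)
    result = conn.execute("""
        SELECT v.id, v.number,
               MAX(v.value) OVER (PARTITION BY v.id) AS imputed_value
        FROM vals v
        ORDER BY v.id, v.number
    """).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    sample = [(101, 123, 0.25), (101, 124, None), (102, 125, None), (102, 126, 0.7)]
    for row in impute_per_id(sample):
        print(row)
```

MAX ignores NULLs, so this works regardless of where the populated row sits within the id; it is only appropriate when each id has at most one distinct non-NULL value.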