In SQL we need to transform a table in the following way:
Table1:
+-----+---------+-----------+
| ID | insured | DOD |
+-----+---------+-----------+
| 123 | Pam | 6/18/2013 |
| 123 | Nam | 2/12/2010 |
| 123 | Tam | 2/10/2013 |
| 456 | Jessi | 4/6/2003 |
| 457 | Ron | 4/10/2010 |
| 457 | Tom | 5/5/2008 |
+-----+---------+-----------+
Desired output table:
+-----+---------+-----------+-----------+-----------+
| ID | insured | DOD1 | DOD2 | DOD3 |
+-----+---------+-----------+-----------+-----------+
| 123 | Pam | 6/18/2013 | 2/12/2010 | 2/10/2013 |
| 456 | Jessi | 4/6/2003 | null | null |
| 457 | Ron | 4/10/2010 | 5/5/2008 | null |
+-----+---------+-----------+-----------+-----------+
I have seen somewhere that we can use pivot and unpivot, but I am not sure how to use them here.
Your help is much appreciated.
Assuming that you really want -- or can accept -- the dates in descending order, then you can use conditional aggregation for this:
select id,
max(case when seqnum = 1 then insured end) as insured,
max(case when seqnum = 1 then dod end) as dod_1,
max(case when seqnum = 2 then dod end) as dod_2,
max(case when seqnum = 3 then dod end) as dod_3
from (select t.*,
row_number() over (partition by id order by dod desc) as seqnum
from t
) t
group by id;
If you want to preserve the original ordering, then your question does not have enough information. If you have a column with the ordering, that can be used for row_number().
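To try the conditional aggregation above without a database server, here is a minimal sketch using Python's built-in sqlite3 module. It assumes an SQLite build with window-function support (3.25 or later) and that the dates are stored as ISO-8601 text so that text ordering matches date ordering; neither assumption comes from the question.

```python
import sqlite3

# Assumption: DOD is stored as ISO-8601 text, so ORDER BY on the text is chronological.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, insured TEXT, dod TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (123, "Pam", "2013-06-18"),
        (123, "Nam", "2010-02-12"),
        (123, "Tam", "2013-02-10"),
        (456, "Jessi", "2003-04-06"),
        (457, "Ron", "2010-04-10"),
        (457, "Tom", "2008-05-05"),
    ],
)

query = """
SELECT id,
       MAX(CASE WHEN seqnum = 1 THEN insured END) AS insured,
       MAX(CASE WHEN seqnum = 1 THEN dod END) AS dod_1,
       MAX(CASE WHEN seqnum = 2 THEN dod END) AS dod_2,
       MAX(CASE WHEN seqnum = 3 THEN dod END) AS dod_3
FROM (SELECT t.*,
             ROW_NUMBER() OVER (PARTITION BY id ORDER BY dod DESC) AS seqnum
      FROM t) AS numbered
GROUP BY id
ORDER BY id
"""
pivot_rows = conn.execute(query).fetchall()
for row in pivot_rows:
    print(row)
```

Because the rows are numbered by descending date, id 123 comes out as 2013-06-18, 2013-02-10, 2010-02-12, which is not the order shown in the desired output; reproducing that order would need an explicit ordering column.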
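I would do it like this, guessing that your id is a number. For turning rows into columns, PIVOT is a good option, but it has to pivot on a per-id row number rather than on the id values themselves. In Oracle syntax, leaving out insured for brevity:
select *
from (select id, dod, row_number() over (partition by id order by dod desc) as seqnum
      from yourtable)
pivot (max(dod) for seqnum in (1 as dod1, 2 as dod2, 3 as dod3));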
I have a (mssql) table like this:
+----+----------+---------+--------+--------+
| id | username | date | scoreA | scoreB |
+----+----------+---------+--------+--------+
| 1 | jim | 01/2020 | 100 | 0 |
| 2 | max | 01/2020 | 0 | 200 |
| 3 | jim | 01/2020 | 0 | 150 |
| 4 | max | 02/2020 | 150 | 0 |
| 5 | jim | 02/2020 | 0 | 300 |
| 6 | lee | 02/2020 | 100 | 0 |
| 7 | max | 02/2020 | 0 | 200 |
+----+----------+---------+--------+--------+
What I need is to get the best "combined" score per date. (With "combined" score I mean the best scores per user and per date summarized)
The result should look like this:
+----------+---------+--------------------------------------------+
| username | date | combined_score (max(scoreA) + max(scoreB)) |
+----------+---------+--------------------------------------------+
| jim | 01/2020 | 250 |
| max | 02/2020 | 350 |
+----------+---------+--------------------------------------------+
I came this far:
I can group the scores by user like this:
SELECT
username, (max(scoreA) + max(scoreB)) AS combined_score
FROM score_table
GROUP BY username
ORDER BY combined_score DESC
And I can get the best score per date with PARTITION BY like this:
SELECT *
FROM
(SELECT t.*, row_number() OVER (PARTITION BY date ORDER BY scoreA DESC) rn
FROM score_table t) as tmp
WHERE tmp.rn = 1
ORDER BY date
Is there a proper way to combine these statements and get the result I need? Thank you!
Btw. Don't care about possible ties!
You can combine window functions and aggregation functions like this:
SELECT s.*
FROM (SELECT username, date, (max(scoreA) + max(scoreB)) AS combined_score,
             ROW_NUMBER() OVER (PARTITION BY date ORDER BY max(scoreA) + max(scoreB) DESC) as seqnum
      FROM score_table
      GROUP BY username, date
     ) s
WHERE s.seqnum = 1
ORDER BY combined_score DESC;
Note that date needs to be part of the GROUP BY. The outer filter on seqnum = 1 keeps only the best combined score for each date.
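A quick way to check the "best combined score per date" logic against the sample data is the sketch below, again with Python's sqlite3 (window functions need SQLite 3.25 or later). To keep it unambiguous it aggregates in one CTE and ranks in a second one, which is a restructuring of the idea rather than the exact query from the answer; the CTE names per_user and ranked are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE score_table "
    "(id INTEGER, username TEXT, date TEXT, scoreA INTEGER, scoreB INTEGER)"
)
conn.executemany(
    "INSERT INTO score_table VALUES (?, ?, ?, ?, ?)",
    [
        (1, "jim", "01/2020", 100, 0),
        (2, "max", "01/2020", 0, 200),
        (3, "jim", "01/2020", 0, 150),
        (4, "max", "02/2020", 150, 0),
        (5, "jim", "02/2020", 0, 300),
        (6, "lee", "02/2020", 100, 0),
        (7, "max", "02/2020", 0, 200),
    ],
)

query = """
WITH per_user AS (
    -- Step 1: best scoreA plus best scoreB per user and date.
    SELECT username, date, MAX(scoreA) + MAX(scoreB) AS combined_score
    FROM score_table
    GROUP BY username, date
),
ranked AS (
    -- Step 2: number the users within each date, best score first.
    SELECT per_user.*,
           ROW_NUMBER() OVER (PARTITION BY date ORDER BY combined_score DESC) AS seqnum
    FROM per_user
)
SELECT username, date, combined_score
FROM ranked
WHERE seqnum = 1
ORDER BY date
"""
best_per_date = conn.execute(query).fetchall()
for row in best_per_date:
    print(row)
```

With ties, ROW_NUMBER() picks one of the tied users arbitrarily, which matches the "don't care about ties" remark in the question.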
Question
Say I have a table with such rows:
id | country | place | last_action | second_to_last_action
----------------------------------------------------------
1 | US | 2 | reply |
1 | US | 2 | | comment
4 | DE | 5 | reply |
4 | | | | comment
What I want to do is to combine these by id, country and place so that the last_action and second_to_last_action would be on the same row
id | country | place | last_action | second_to_last_action
----------------------------------------------------------
1 | US | 2 | reply | comment
4 | DE | 5 | reply | comment
How would I approach this? I guess I would need an aggregate here but my mind is hitting completely blank on which one should I use.
It can be expected that there will always be a matching pair.
Background:
Note: this table has been derived from something like this:
id | country | place | action | time
----------------------------------------------------------
1 | US | 2 | reply | 16:15
1 | US | 2 | comment | 15:16
1 | US | 2 | view | 13:16
4 | DE | 5 | reply | 17:15
4 | DE | 5 | comment | 16:16
4 | DE | 5 | view | 14:12
Code used to partition was:
row_number() over (partition by id order by time desc) as event_no
And then I got the last and second_to_last action by taking event_no 1 and 2. So if there's a more efficient way to get the last two actions in two distinct columns, I would be happy to hear it.
You can fix your first data by using aggregation:
select id, country, place,
       max(last_action) as last_action,
       max(second_to_last_action) as second_to_last_action
from derived
group by id, country, place;
You can do this from the original table using conditional aggregation:
select id, country, place,
max(case when seqnum = 1 then action end) as last_action,
max(case when seqnum = 2 then action end) as second_to_last_action
from (select t.*,
row_number() over (partition by id order by time desc) as seqnum
from t
) t
group by id, country, place;
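The first query works because MAX() ignores NULLs, so each group collapses to its single non-NULL value per column. Here is a small sketch of that with Python's sqlite3; it assumes the blank cells in the derived table are NULL (not empty strings) and that country and place are filled in on both rows of a pair, which the sample data for id 4 does not clearly show.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE derived (id INTEGER, country TEXT, place INTEGER, "
    "last_action TEXT, second_to_last_action TEXT)"
)
# Assumption: blanks are NULL (None), and the key columns are populated on every row.
conn.executemany(
    "INSERT INTO derived VALUES (?, ?, ?, ?, ?)",
    [
        (1, "US", 2, "reply", None),
        (1, "US", 2, None, "comment"),
        (4, "DE", 5, "reply", None),
        (4, "DE", 5, None, "comment"),
    ],
)

query = """
SELECT id, country, place,
       MAX(last_action) AS last_action,                      -- NULLs are skipped
       MAX(second_to_last_action) AS second_to_last_action   -- NULLs are skipped
FROM derived
GROUP BY id, country, place
ORDER BY id
"""
merged_rows = conn.execute(query).fetchall()
for row in merged_rows:
    print(row)
```

If the blanks were empty strings instead, MAX() would still return the non-empty value, since any non-empty string sorts after the empty string. Rows whose country or place is missing would end up in a separate group, though.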
I use MariaDB 10.2.21
I have not seen this exact case elsewhere, hence my request for assistance.
I have a History table containing one record per change on any of the fields in a JIRA issues:
+----------+---------------+----------+-----------------+---------------------+
| IssueKey | OriginalValue | NewValue | Field | ChangeDate |
+----------+---------------+----------+-----------------+---------------------+
| HRSK-184 | (NULL) | 2 | Risk Detection | 2019-10-24 10:57:27 |
| HRSK-184 | (NULL) | 2 | Risk Occurrence | 2019-10-24 10:57:27 |
| HRSK-184 | (NULL) | 2 | Risk Severity | 2019-10-24 10:57:27 |
| HRSK-184 | 2 | 4 | Risk Detection | 2019-10-25 11:54:07 |
| HRSK-184 | 2 | 6 | Risk Detection | 2019-10-25 11:54:07 |
| HRSK-184 | 2 | 3 | Risk Severity | 2019-10-24 11:54:07 |
| HRSK-184 | 6 | 5 | Risk Detection | 2019-10-26 09:11:01 |
+----------+---------------+----------+-----------------+---------------------+
Every record contains the old and new value and the fieldtype that has changed ('Field') and, of course, the corresponding timestamp of that change.
I want to query the point-in-time status, i.e. the combination of the most recent values of each of the fields 'Risk Severity', 'Risk Occurrence' and 'Risk Detection'.
The result should be like this:
+----------+----------------+-------------------+------------------+----------------------+
| IssueKey | Risk Severity | Risk Occurrence | Risk Detection | ChangeDate |
+----------+----------------+-------------------+------------------+----------------------+
| HRSK-184 | 3 | 2 | 5 | 2019-10-26 09:11:01 |
+----------+----------------+-------------------+------------------+----------------------+
Any ideas? I'm stuck...
Thanks in advance for your effort!
You could use a couple of inline (correlated) subqueries:
select
IssueKey,
(
select t1.NewValue
from mytable t1
where t1.IssueKey = t.IssueKey and t1.Field = 'Risk Severity'
order by ChangeDate desc limit 1
) `Risk Severity`,
(
select t1.NewValue
from mytable t1
where t1.IssueKey = t.IssueKey and t1.Field = 'Risk Occurrence'
order by ChangeDate desc limit 1
) `Risk Occurrence`,
(
select t1.NewValue
from mytable t1
where t1.IssueKey = t.IssueKey and t1.Field = 'Risk Detection'
order by ChangeDate desc limit 1
) `Risk Detection`,
max(ChangeDate) ChangeDate
from mytable t
group by IssueKey
With an index on (IssueKey, Field, ChangeDate, NewValue), this should be an efficient option.
Demo on DB Fiddle:
IssueKey | Risk Severity | Risk Occurrence | Risk Detection | ChangeDate
:------- | ------------: | --------------: | ------------: | :------------------
HRSK-184 | 3 | 2 | 5 | 2019-10-26 09:11:01
MariaDB 10.2 introduced window functions for analytical queries.
One of them is the RANK() OVER (PARTITION BY ... ORDER BY ...) function.
You can first rank the changes per issue and field, keep only the most recent one, and then pivot through conditional aggregation:
SELECT IssueKey,
MAX(CASE WHEN Field = 'Risk Severity' THEN NewValue END ) AS RiskSeverity,
MAX(CASE WHEN Field = 'Risk Occurrence' THEN NewValue END ) AS RiskOccurrence,
MAX(CASE WHEN Field = 'Risk Detection' THEN NewValue END ) AS RiskDetection,
MAX(ChangeDate) AS ChangeDate
FROM
(
SELECT RANK() OVER (PARTITION BY IssueKey, Field ORDER BY ChangeDate Desc) rnk,
t.*
FROM mytable t
) t
WHERE rnk = 1
GROUP BY IssueKey;
IssueKey | RiskSeverity | RiskOccurrence | RiskDetection | ChangeDate
-------- + --------------+-----------------+----------------+--------------------
HRSK-184 | 3 | 2 | 5 | 2019-10-26 09:11:01
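The rank-then-pivot approach can be checked against the sample history with the sketch below, using Python's sqlite3 as a stand-in for MariaDB (SQLite 3.25 or later for window functions). It swaps RANK() for ROW_NUMBER(), so that two changes sharing the latest timestamp for a field would still yield exactly one row; that swap is my choice, not part of the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (IssueKey TEXT, OriginalValue INTEGER, "
    "NewValue INTEGER, Field TEXT, ChangeDate TEXT)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?, ?)",
    [
        ("HRSK-184", None, 2, "Risk Detection", "2019-10-24 10:57:27"),
        ("HRSK-184", None, 2, "Risk Occurrence", "2019-10-24 10:57:27"),
        ("HRSK-184", None, 2, "Risk Severity", "2019-10-24 10:57:27"),
        ("HRSK-184", 2, 4, "Risk Detection", "2019-10-25 11:54:07"),
        ("HRSK-184", 2, 6, "Risk Detection", "2019-10-25 11:54:07"),
        ("HRSK-184", 2, 3, "Risk Severity", "2019-10-24 11:54:07"),
        ("HRSK-184", 6, 5, "Risk Detection", "2019-10-26 09:11:01"),
    ],
)

query = """
SELECT IssueKey,
       MAX(CASE WHEN Field = 'Risk Severity'   THEN NewValue END) AS RiskSeverity,
       MAX(CASE WHEN Field = 'Risk Occurrence' THEN NewValue END) AS RiskOccurrence,
       MAX(CASE WHEN Field = 'Risk Detection'  THEN NewValue END) AS RiskDetection,
       MAX(ChangeDate) AS ChangeDate
FROM (SELECT h.*,
             -- rn = 1 marks the most recent change per issue and field
             ROW_NUMBER() OVER (PARTITION BY IssueKey, Field
                                ORDER BY ChangeDate DESC) AS rn
      FROM mytable AS h) AS latest
WHERE rn = 1
GROUP BY IssueKey
"""
status_rows = conn.execute(query).fetchall()
for row in status_rows:
    print(row)
```

With ROW_NUMBER(), a tie on the latest timestamp is broken arbitrarily; adding a second ORDER BY key (an auto-increment id, if the table has one) would make the choice deterministic.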
I have a table:
+------+-------+--------+--------+
| id   | name  | code   | desc   |
+------+-------+--------+--------+
| 1    | aa    | 032016 | grape  |
| 1    | aa    | 012016 | apple  |
| 1    | aa    | 032016 | grape  |
| 1    | aa    | 022016 | orange |
| 1    | aa    | 012016 | apple  |
| 1    | aa    | 032016 | grape  |
+------+-------+--------+--------+
I tried this query:
SELECT id, name, code, desc, COUNT(code) as view
FROM mytable
GROUP BY id, name, code, desc
and the result is:
+------+-------+--------+--------+------+
| id   | name  | code   | desc   | view |
+------+-------+--------+--------+------+
| 1    | aa    | 012016 | apple  | 2    |
| 1    | aa    | 022016 | orange | 1    |
| 1    | aa    | 032016 | grape  | 3    |
+------+-------+--------+--------+------+
What I expected is this:
+------+-------+----------------------+--------------------+-------+
| id   | name  | code                 | desc               | view  |
+------+-------+----------------------+--------------------+-------+
| 1    | aa    | 012016,022016,032016 | apple,orange,grape | 2,1,3 |
+------+-------+----------------------+--------------------+-------+
Can anyone help me aggregate the result this way?
Thanks in advance.
Your table design has me a bit worried. Is it coincidence that one fruit always has the same code in the table? Then why store it redundantly? There should be a fruit table holding each fruit and its code only once. You know why this is called a relational database system, don't you?
However, with your query you are almost where you wanted to get. You have the counts per id, name, code, and desc. Now you want to aggregate even further. So in the next step group by id and name, because you want one result row per id and name it seems. Use LISTAGG to concatenate the strings in the group:
SELECT
id,
name,
listagg(code, ',') within group(order by code) as codes,
listagg(desc, ',') within group(order by code) as descs,
listagg(view, ',') within group(order by code) as views
FROM
(
SELECT id, name, code, desc, COUNT(*) as view
FROM mytable
GROUP BY id, name, code, desc
)
GROUP BY id, name
ORDER BY id, name;
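For readers without an Oracle instance, the two-step idea (count per code, then concatenate per id and name) can be sketched in Python's sqlite3. SQLite has no LISTAGG, so this uses group_concat() over an ordered window as a stand-in and collapses the repeated rows with DISTINCT; that substitution is mine, not part of the answer. The column names desc and view are quoted because they are SQL keywords.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE mytable (id INTEGER, name TEXT, code TEXT, "desc" TEXT)')
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?)",
    [
        (1, "aa", "032016", "grape"),
        (1, "aa", "012016", "apple"),
        (1, "aa", "032016", "grape"),
        (1, "aa", "022016", "orange"),
        (1, "aa", "012016", "apple"),
        (1, "aa", "032016", "grape"),
    ],
)

query = """
WITH counts AS (
    -- Step 1: one row per code with its number of occurrences.
    SELECT id, name, code, "desc", COUNT(*) AS "view"
    FROM mytable
    GROUP BY id, name, code, "desc"
)
-- Step 2: concatenate over the whole partition, in code order.
SELECT DISTINCT id, name,
       group_concat(code, ',')   OVER w AS codes,
       group_concat("desc", ',') OVER w AS descs,
       group_concat("view", ',') OVER w AS views
FROM counts
WINDOW w AS (PARTITION BY id, name ORDER BY code
             ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
"""
listed_rows = conn.execute(query).fetchall()
for row in listed_rows:
    print(row)
```

The explicit window frame matters here: without it, the default frame would produce a running concatenation and DISTINCT would no longer reduce each group to a single row.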
So I have a table that has data such as this:
| SCHD_ID | INST_ID |
|---------|---------|
| 1001 | Mike |
| 1001 | Ted |
| 1001 | Chris |
| 1002 | Jill |
| 1002 | Jamie |
| 1003 | Brad |
| 1003 | Carl |
| 1003 | Drew |
| 1003 | Nick |
I need to come up with a query to display the data like below:
|SCHD_ID | INST 1 | INST 2 | INST 3 |
|---------|--------|--------|--------|
| 1001 | Mike | Ted | Chris |
| 1002 | Jill | Jamie | Null |
| 1003 | Brad | Carl | Drew |
I have tried looking into all the pivot descriptions and some case examples, but everything seems to use a common repeated value to pivot around. This is one of those cases where the columns need to be dynamic, but only to a point. I can drop any data after the third instructor. In the example above I did not put in a column for INST 4 for SCHD_ID 1003 even though it exists in my example data set. Can adding a constraint like this make it possible to come up with a non-dynamic solution for the pivot/case statement?
Thanks for the help,
Dwayne
You can do this using row_number() and conditional aggregation. However, your data doesn't have an ordering column, so you cannot guarantee which three instructors you will get:
select schd_id,
max(case when seqnum = 1 then inst_id end) as inst1,
max(case when seqnum = 2 then inst_id end) as inst2,
max(case when seqnum = 3 then inst_id end) as inst3
from (select t.*,
row_number() over (partition by schd_id order by schd_id) as seqnum
from t
) t
group by SCHD_ID ;
If you have a priority ordering for choosing the instructors, then put the logic in the order by clause.
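As an illustration of putting a priority in the order by clause, here is a sketch with Python's sqlite3 (SQLite 3.25 or later). The priority column and the table name sched are hypothetical; the question's table has no such column, so treat this as one possible shape rather than a drop-in query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical: a "priority" column that says which instructor comes first.
conn.execute("CREATE TABLE sched (schd_id INTEGER, inst_id TEXT, priority INTEGER)")
conn.executemany(
    "INSERT INTO sched VALUES (?, ?, ?)",
    [
        (1001, "Mike", 1), (1001, "Ted", 2), (1001, "Chris", 3),
        (1002, "Jill", 1), (1002, "Jamie", 2),
        (1003, "Brad", 1), (1003, "Carl", 2), (1003, "Drew", 3), (1003, "Nick", 4),
    ],
)

query = """
SELECT schd_id,
       MAX(CASE WHEN seqnum = 1 THEN inst_id END) AS inst1,
       MAX(CASE WHEN seqnum = 2 THEN inst_id END) AS inst2,
       MAX(CASE WHEN seqnum = 3 THEN inst_id END) AS inst3
FROM (SELECT s.*,
             ROW_NUMBER() OVER (PARTITION BY schd_id ORDER BY priority) AS seqnum
      FROM sched AS s) AS numbered
GROUP BY schd_id
ORDER BY schd_id
"""
instructor_rows = conn.execute(query).fetchall()
for row in instructor_rows:
    print(row)
```

The fourth instructor for 1003 gets seqnum = 4 and simply matches none of the three CASE expressions, which is what makes a fixed, non-dynamic column list sufficient.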