MySQL: Select penult values - sql

There are 2 tables:
table1:
id |phone| order|
---|-----|------|
1 | 122 | 6 |
2 | 122 | 4 |
3 | 122 | 3 |
4 | 123 | 6 |
5 | 123 | 5 |
6 | 123 | 3 |
7 | 124 | 6 |
8 | 124 | 5 |
9 | 125 | 6 |
10| 125 | 5 |
table2:
|phone |
|------|
|122 |
|123 |
|124 |
I have to select the phone and its last order according to the following conditions:
If the order of the row with the max id for a phone does not equal 3, take that row.
If it equals 3, take the row with the pre-max (second-highest) id for that phone.
The phone must be in table2.
So result should be:
|phone | order|
|------ |------|
|122 | 4 |
|123 | 5 |
|124 | 5 |
MySQL version: Ver 15.1 Distrib 5.5.64-MariaDB

Basically you want to look at the last two records; if the last record has order 3, then use the previous one.
That would have been a simple query with window functions and/or lateral joins, but your old MySQL version does not support these features. User variables are an option, as demonstrated by nbk, but they are tricky to use, and MySQL 8.0 announced that this feature will be deprecated in a future version.
I am going to recommend correlated subqueries and a little logic:
select t2.id,
coalesce(
nullif((select ord from table1 t1 where t1.id = t2.id order by odering_id desc limit 1), 3),
(select ord from table1 t1 where t1.id = t2.id order by odering_id desc limit 1, 1)
) as ord
from table2 t2
The first subquery gets the latest value; nullif() checks the returned value and returns null if it is 3; this tells coalesce() to return the result of the second subquery, which gets the previous value.
order is a language keyword, so I used ord instead.
Demo in MySQL 5.5:
id | ord
--: | --:
122 | 4
123 | 5
124 | 5
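For anyone who wants to try the coalesce()/nullif() idea without a MySQL server, here is a minimal sketch run through Python's built-in sqlite3 module. The schema is an assumption on my part: I kept the question's column layout (id, phone, and ord instead of order), so the subqueries correlate on phone and sort by id. SQLite is only a stand-in here, and LIMIT 1 OFFSET 1 is the portable spelling of LIMIT 1, 1.

```python
import sqlite3

# Hypothetical schema mirroring the question: table1(id, phone, ord), table2(phone).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, phone INTEGER, ord INTEGER);
    CREATE TABLE table2 (phone INTEGER);
    INSERT INTO table1 (id, phone, ord) VALUES
        (1, 122, 6), (2, 122, 4), (3, 122, 3),
        (4, 123, 6), (5, 123, 5), (6, 123, 3),
        (7, 124, 6), (8, 124, 5),
        (9, 125, 6), (10, 125, 5);
    INSERT INTO table2 (phone) VALUES (122), (123), (124);
""")

QUERY = """
SELECT t2.phone,
       COALESCE(
           -- latest row per phone; NULLIF turns an ord of 3 into NULL
           NULLIF((SELECT t1.ord FROM table1 t1
                   WHERE t1.phone = t2.phone
                   ORDER BY t1.id DESC LIMIT 1), 3),
           -- fallback: the row just before the latest one
           (SELECT t1.ord FROM table1 t1
            WHERE t1.phone = t2.phone
            ORDER BY t1.id DESC LIMIT 1 OFFSET 1)
       ) AS ord
FROM table2 t2
ORDER BY t2.phone
"""

penultimate_rows = conn.execute(QUERY).fetchall()
print(penultimate_rows)
```

Phone 125 is absent from the result because it is not listed in table2.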

Your MariaDB version is a little old.
This query numbers the rows per id, sorted by the order column descending, and selects only the second one.
The LIMIT in the subquery is needed because MariaDB follows the standard and would otherwise not sort the subselect.
CREATE TABLE Table1
(`id` int, `order` int)
;
INSERT INTO Table1
(`id`, `order`)
VALUES
(122, 6),
(122, 4),
(122, 3),
(123, 6),
(123, 5),
(123, 3),
(124, 6),
(124, 5),
(125, 6),
(125, 5)
;
CREATE TABLE Table2
(`id` int)
;
INSERT INTO Table2
(`id`)
VALUES
(122),
(123),
(124)
;
SELECT id,`order`
FROM (SELECT
t1.`order`
, IF ( @id = t1.id ,@rn := @rn +1, @rn:= 1) AS rownum
, @id := t1.`id` as id
FROM Table1 t1 INNER JOIN Table2 t2 ON t1.id = t2.id,(SELECT @id := 0,@rn := 0) t3
ORDER BY t1.id,t1.`order` DESC LIMIT 18446744073709551615) t4
WHERE rownum = 2
id | order
--: | ----:
122 | 4
123 | 5
124 | 5
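On a server that does have window functions (MariaDB 10.2+ or MySQL 8.0), the row numbering no longer needs user variables. The following is only a sketch of that alternative, run through Python's sqlite3 module (assuming the bundled SQLite is 3.25 or newer). The table layout (id, phone, ord) is my assumption taken from the question, and unlike the query above it ranks by id rather than by the order value.

```python
import sqlite3

# Hypothetical schema taken from the question: table1(id, phone, ord), table2(phone).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, phone INTEGER, ord INTEGER);
    CREATE TABLE table2 (phone INTEGER);
    INSERT INTO table1 (id, phone, ord) VALUES
        (1, 122, 6), (2, 122, 4), (3, 122, 3),
        (4, 123, 6), (5, 123, 5), (6, 123, 3),
        (7, 124, 6), (8, 124, 5),
        (9, 125, 6), (10, 125, 5);
    INSERT INTO table2 (phone) VALUES (122), (123), (124);
""")

QUERY = """
WITH ranked AS (
    SELECT t1.phone,
           t1.ord,
           -- 1 = newest row of the phone, 2 = the one before it
           ROW_NUMBER() OVER (PARTITION BY t1.phone ORDER BY t1.id DESC) AS rn,
           -- ord of the newest row, repeated on every row of the phone
           FIRST_VALUE(t1.ord) OVER (PARTITION BY t1.phone ORDER BY t1.id DESC) AS last_ord
    FROM table1 t1
    JOIN table2 t2 ON t2.phone = t1.phone
)
SELECT phone, ord
FROM ranked
-- take the previous row only when the newest one has ord = 3
WHERE rn = CASE WHEN last_ord = 3 THEN 2 ELSE 1 END
ORDER BY phone
"""

windowed_rows = conn.execute(QUERY).fetchall()
print(windowed_rows)
```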

Related

Compare two tables and show the value of another table if exist, if not exist just show status in SQL

I have two tables to compare in SQL. When the id from one exists in the other, the result I want is the value of data from the second table; when it doesn't exist it will show "Data not Exist" in the 'value' field name.
Example
Table 1
| id|
-----
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
| 6 |
| 7 |
| 8 |
| 9 |
| 10|
Table 2
|id | value
---------
| 1 | 10|
| 2 | 9 |
| 3 | 7 |
| 4 | 8 |
| 5 | 6 |
I've tried the query below:
select a.id,
CASE when exists(select a.id from table2 b where a.id = b.id)
THEN value
else 'Data Not Exist'
END as Result_Value
from table1 a inner join table2 b
on a.id=b.id
order by a.id;
The Result is:
|id | Result_Value
---------
| 1 | 10|
| 2 | 9 |
| 3 | 7 |
| 4 | 8 |
| 5 | 6 |
That result is not what I wanted. My expected result is below:
|id | Result_Value
---------
| 1 | 10 |
| 2 | 9 |
| 3 | 7 |
| 4 | 8 |
| 5 | 6 |
| 6 | Data Not Exist |
| 7 | Data Not Exist |
| 8 | Data Not Exist |
| 9 | Data Not Exist |
| 10| Data Not Exist |
Note: This is a simplified version of my query; the real one is more complex and inner joins another table. I don't know exactly where I'm going wrong with the EXISTS subquery.
Just use a LEFT JOIN, and COALESCE any NULL values to Data not Exist:
SELECT a.id, COALESCE(b.value, 'Data not exist') AS value
FROM a
LEFT JOIN b ON b.id = a.id
Output:
id value
1 10
2 9
3 7
4 8
5 6
6 Data not exist
7 Data not exist
8 Data not exist
9 Data not exist
10 Data not exist
I found 2 issues here:
Don't use an inner join if you want the rows of table1 that have no match in table2 to show up; use a left join.
The data types of the values returned by your CASE branches should be the same.
Here's your query, corrected:
select a.id,
case
when b.id is not null
then cast(b.value as varchar(50))
else
'Data Not Exist'
END as Result_Value
from table1 a
left join table2 b on a.id=b.id
order by a.id;
Alternatively, use a LEFT JOIN between Table1 and Table2 and ISNULL to check for NULL; if the value is NULL, replace it with Data not Exist:
SELECT a.id, ISNULL(CAST(b.value AS VARCHAR(50)),'Data not Exist') AS value FROM dbo.Table1 a
LEFT JOIN dbo.Table2 b ON a.id=b.id
You can get the desired results by using a LEFT JOIN alongside one of:
COALESCE() expression.
ISNULL() function.
CASE expression.
IIF() function.
As follows:
SELECT T1.Id,
COALESCE(CAST(T2.Value AS VARCHAR(10)), 'Data Not Exist') ByCoalesce,
ISNULL(CAST(T2.Value AS VARCHAR(10)), 'Data Not Exist') ByIsNull,
CASE WHEN T2.Value IS NULL
THEN 'Data Not Exist'
ELSE CAST(T2.Value AS VARCHAR(10))
END ByCaseExpression,
IIF(T2.Value IS NULL, 'Data Not Exist', CAST(T2.Value AS VARCHAR(10))) ByIifFunction
FROM
(
VALUES
(1),
(2),
(3),
(4),
(5),
(6),
(7),
(8),
(9),
(10)
) T1(Id) LEFT JOIN
(
VALUES
(1, 10),
(2, 9 ),
(3, 7 ),
(4, 8 ),
(5, 6 )
) T2(Id, Value)
ON T1.Id = T2.Id;
Note that you need to CAST() / CONVERT() the INT values to VARCHAR(n): INT has a higher data type precedence than VARCHAR, so without the cast SQL Server would try to convert 'Data Not Exist' to INT and fail.
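To see the two points from the answers side by side (inner join drops the unmatched ids, left join keeps them, and the value is cast to text so both branches have one type), here is a small sketch using Python's built-in sqlite3 module. SQLite is only a stand-in for the real server and is more forgiving about mixed types than SQL Server, so the CAST is kept purely to mirror the queries above; the table and column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER);
    CREATE TABLE table2 (id INTEGER, value INTEGER);
    INSERT INTO table1 (id) VALUES (1), (2), (3), (4), (5), (6), (7), (8), (9), (10);
    INSERT INTO table2 (id, value) VALUES (1, 10), (2, 9), (3, 7), (4, 8), (5, 6);
""")

# The inner join keeps only ids present in both tables.
inner_rows = conn.execute("""
    SELECT a.id, b.value
    FROM table1 a
    INNER JOIN table2 b ON a.id = b.id
    ORDER BY a.id
""").fetchall()

# The left join keeps every id from table1; unmatched rows carry NULL,
# which COALESCE replaces with the placeholder text.
left_rows = conn.execute("""
    SELECT a.id,
           COALESCE(CAST(b.value AS TEXT), 'Data Not Exist') AS result_value
    FROM table1 a
    LEFT JOIN table2 b ON a.id = b.id
    ORDER BY a.id
""").fetchall()

print(len(inner_rows), len(left_rows))
for row in left_rows:
    print(row)
```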

Select MAX date using data from several columns SQL

I know this is a much-asked question and I've had a look through what's already available, but I believe my case is slightly unique (and if it's not, please point me in the right direction).
I am trying to find the latest occurrence of a row associated with a user, currently across two tables and several columns.
table: statusUpdate
+-------+-----------+-----------+-------------------+
| id | name | status | date_change |
+-------+-----------+-----------+-------------------+
| 1 | Matt | 0 | 01-01-2001 |
| 2 | Jeff | 1 | 01-01-2001 |
| 3 | Jeff | 2 | 01-01-2002 |
| 4 | Bill | 2 | 01-01-2001 |
| 5 | Bill | 3 | 01-01-2004 |
+-------+-----------+-----------+-------------------+
table: relationship
+-------+-----------+--------------+
| id | userID |statusUpdateID|
+-------+-----------+--------------+
| 1 | 22 | 1 |
| 2 | 33 | 2 |
| 3 | 33 | 3 |
| 4 | 44 | 4 |
| 5 | 44 | 5 |
+-------+-----------+--------------+
There is a third table which links userID to its own table but these sample tables should be good enough to get my question over.
I am looking to get the latest status change by date. The problem currently is that it returns all instances of a status change.
Current results:
+-------+---------+-----------+-------------------+
|userID |statusID | status | date_change |
+-------+---------+-----------+-------------------+
| 33 | 2 | 1 | 01-01-2001 |
| 33 | 3 | 2 | 01-01-2002 |
| 44 | 4 | 2 | 01-01-2001 |
| 44 | 5 | 3 | 01-01-2004 |
+-------+---------+-----------+-------------------+
Expected results:
+-------+-----------+-----------+-------------------+
|userID |statusID | status | date_change |
+-------+-----------+-----------+-------------------+
| 33 | 3 | 2 | 01-01-2002 |
| 44 | 5 | 3 | 01-01-2004 |
+-------+-----------+-----------+-------------------+
I hope this all makes sense, please ask for more information otherwise.
Just to reiterate, I just want to return the latest instance of a user's status change by date.
Sample code of one of my attempts:
select
st.ID, st.status, st.date_change, r.userID
from statusUpdate st
inner join Relationship r on st.ID = r.statusUpdateID
inner join (select ID, max(date_change) as recent from statusUpdate
group by ID) as y on r.statusUpdateID = y.ID and st.date_change = y.recent
Hope someone can point me in the right direction.
Use row_number() to get the last row by user:
select *
from
(
select st.ID, st.status, st.date_change, r.userID,
rn = row_number() over (partition by r.userID order by st.date_change desc)
from statusUpdate st
inner join Relationship r on st.ID = r.statusUpdateID
) as d
where rn = 1
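Here is a runnable sketch of the row_number() approach against the sample data, using Python's sqlite3 module (it assumes the bundled SQLite is 3.25 or newer, which is when window functions arrived). The dates are stored as ISO strings, which is my assumption so that they sort correctly as text, and SQLite needs the alias written as "AS rn" rather than "rn =".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE statusUpdate (id INTEGER, name TEXT, status INTEGER, date_change TEXT);
    CREATE TABLE relationship (id INTEGER, userID INTEGER, statusUpdateID INTEGER);
    INSERT INTO statusUpdate VALUES
        (1, 'Matt', 0, '2001-01-01'),
        (2, 'Jeff', 1, '2001-01-01'),
        (3, 'Jeff', 2, '2002-01-01'),
        (4, 'Bill', 2, '2001-01-01'),
        (5, 'Bill', 3, '2004-01-01');
    INSERT INTO relationship VALUES
        (1, 22, 1), (2, 33, 2), (3, 33, 3), (4, 44, 4), (5, 44, 5);
""")

QUERY = """
SELECT userID, statusID, status, date_change
FROM (
    SELECT r.userID,
           st.id AS statusID,
           st.status,
           st.date_change,
           -- number each user's rows from newest to oldest
           ROW_NUMBER() OVER (PARTITION BY r.userID
                              ORDER BY st.date_change DESC) AS rn
    FROM statusUpdate st
    INNER JOIN relationship r ON st.id = r.statusUpdateID
) AS d
WHERE rn = 1
ORDER BY userID
"""

latest_rows = conn.execute(QUERY).fetchall()
for row in latest_rows:
    print(row)
```

User 22 shows up as well, since it has exactly one status row; add a filter if users with a single status should be left out, as in the expected results.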
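I added MAX to your query and grouped by userID. Note that each MAX is computed independently per column, so the result is a real row only when ID, status and date_change all increase together, as they do in the sample data: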
CREATE TABLE #Table1
([id] int, [name] varchar(4), [status] int, [date_change] datetime)
;
INSERT INTO #Table1
([id], [name], [status], [date_change])
VALUES
(1, 'Matt', 0, '2001-01-01 00:00:00'),
(2, 'Jeff', 1, '2001-01-01 00:00:00'),
(3, 'Jeff', 2, '2002-01-01 00:00:00'),
(4, 'Bill', 2, '2001-01-01 00:00:00'),
(5, 'Bill', 3, '2004-01-01 00:00:00')
;
CREATE TABLE #Table2
([id] int, [userID] int, [stautsUpdateID] int)
;
INSERT INTO #Table2
([id], [userID], [stautsUpdateID])
VALUES
(1, 22, 1),
(2, 33, 2),
(3, 33, 3),
(4, 44, 4),
(5, 44, 5)
select
max(st.ID) id , max(st.status) status , max(st.date_change) date_change, r.userID
from #Table1 st
inner join #Table2 r on st.ID = r.stautsUpdateID
inner join (select ID, max(date_change) as recent from #Table1
group by ID) as y on r.stautsUpdateID = y.ID and st.date_change =
y.recent
group by r.userID
output
id status date_change userID
1 0 2001-01-01 00:00:00.000 22
3 2 2002-01-01 00:00:00.000 33
5 3 2004-01-01 00:00:00.000 44

SQL difference between Multiple Rows having the same ID

SQL Server 2012
Raw Data
ID VAL Time
+---+----+---------------------+
| 2 | 1 | 2015-05-09 12:54:39 |
| 3 | 10 | 2015-05-09 12:54:39 |
| 2 | 1 | 2015-05-09 12:56:39 |
| 3 | 10 | 2015-05-09 12:56:39 |
| 2 | 5 | 2015-05-09 13:48:30 |
| 3 | 16 | 2015-05-09 13:48:30 |
| 2 | 7 | 2015-05-09 15:01:09 |
| 3 | 20 | 2015-05-09 15:01:09 |
+---+----+---------------------+
I have a table where VAL is increasing forever in time. I want to manipulate the data to show how much VAL is increasing for each ID over time. So Val at Time2 - Val at Time1
Ideal Result:
ID VALI Time
+---+----+---------------------+
| 2 | 0 | 2015-05-09 12:56:39 |
| 3 | 0 | 2015-05-09 12:56:39 |
| 2 | 4 | 2015-05-09 13:48:30 |
| 3 | 6 | 2015-05-09 13:48:30 |
| 2 | 2 | 2015-05-09 15:01:09 |
| 3 | 4 | 2015-05-09 15:01:09 |
+---+----+---------------------+
Code so far:
select
t1.Time,t1.[ID],t2.[VAL]-t1.[VAL] AS [ValI]
from #tempTable t1
inner join #tempTable t2 ON t1.[ID]=t2.[ID]
AND t1.[Time]<t2.[Time]
I need to calculate the difference between the current timestamp and ONLY the Time right before current timestamp not all timestamps before the current timestamp. As of now I get a lot of repeating values when VAL did not change.
You can use this.
DECLARE @MyTable TABLE (ID INT, VAL INT, [Time] DATETIME)
INSERT INTO @MyTable VALUES
(2, 1 ,'2015-05-09 12:54:39'),
(3, 10 ,'2015-05-09 12:54:39'),
(2, 1 ,'2015-05-09 12:56:39'),
(3, 10 ,'2015-05-09 12:56:39'),
(2, 5 ,'2015-05-09 13:48:30'),
(3, 16 ,'2015-05-09 13:48:30'),
(2, 7 ,'2015-05-09 15:01:09'),
(3, 20 ,'2015-05-09 15:01:09')
;WITH CTE AS (
SELECT *, ROW_NUMBER() OVER(PARTITION BY ID ORDER BY [Time]) RN FROM @MyTable
)
SELECT T1.ID, T2.VAL - T1.VAL AS VALI, T2.Time FROM CTE T1
INNER JOIN CTE T2 ON T1.ID = T2.ID AND T1.RN = T2.RN - 1
ORDER BY T1.[Time], T1.ID
Result:
ID VALI Time
----------- ----------- -----------------------
2 0 2015-05-09 12:56:39.000
3 0 2015-05-09 12:56:39.000
2 4 2015-05-09 13:48:30.000
3 6 2015-05-09 13:48:30.000
2 2 2015-05-09 15:01:09.000
3 4 2015-05-09 15:01:09.000
Here this should work:
select id, time, val-prevval val1 from (
select * , lag(val, 1, 0) over(partition by id order by val, time) prevVal from #Temp)A
order by time
You could first put a Rank on your #tempTable ordered by Time descending, and partitioned by ID.
Then your join becomes this:
select
t1.Time,
t1.[ID],
t1.[VAL] - t2.[VAL] AS [ValI]
from #tempTable t1
inner join #tempTable t2 ON t1.[ID] = t2.[ID]
AND t2.Rank = (t1.Rank + 1)
LAG() became available in SQL Server 2012. It lets you take the current row's val and subtract the val from the previous row, partitioned by the id and sorted by Time. That will return NULL for the first two rows, since they don't have a previous record to compare to. You can exclude them by placing the query in a sub-select and then applying WHERE valDiff IS NOT NULL, or you can supply a default through the third argument of LAG(): LAG(Val,1,0) makes the first row of each ID subtract 0, so its difference equals its own VAL.
MS SQL Server 2017 Schema Setup:
CREATE TABLE t1 ( ID int, VAL int, [Time] datetime) ;
INSERT INTO t1 ( ID, Val, [Time] )
VALUES
( 2, 1 , '2015-05-09 12:54:39')
, ( 3, 10, '2015-05-09 12:54:39')
, ( 2, 1 , '2015-05-09 12:56:39')
, ( 3, 10, '2015-05-09 12:56:39')
, ( 2, 5 , '2015-05-09 13:48:30')
, ( 3, 16, '2015-05-09 13:48:30')
, ( 2, 7 , '2015-05-09 15:01:09')
, ( 3, 20, '2015-05-09 15:01:09')
;
Query 1:
SELECT s1.ID
, s1.ValDiff
, FORMAT(s1.[Time], 'yyyy-MM-dd HH:mm:ss') AS fTime
FROM (
SELECT ID
, Val - LAG(Val,1) OVER ( PARTITION BY ID ORDER BY [Time],ID ) AS ValDiff
, [Time]
FROM t1
) s1
WHERE s1.valDiff IS NOT NULL
ORDER BY s1.[Time],s1.ID
Results:
| ID | ValDiff | fTime |
|----|---------|---------------------|
| 2 | 0 | 2015-05-09 12:56:39 |
| 3 | 0 | 2015-05-09 12:56:39 |
| 2 | 4 | 2015-05-09 13:48:30 |
| 3 | 6 | 2015-05-09 13:48:30 |
| 2 | 2 | 2015-05-09 15:01:09 |
| 3 | 4 | 2015-05-09 15:01:09 |
If you have LAG
SELECT
id
, val - LAG(val, 1) OVER (PARTITION BY id ORDER BY time ASC) AS VALI
, time
FROM #TempTable
ORDER BY time ASC, ID ASC
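The LAG() variants above can be checked quickly with Python's sqlite3 module (a sketch that assumes the bundled SQLite is 3.25 or newer, where LAG() exists). The table name readings and the column name ts are placeholders I picked; the data is the question's. The first reading of each ID has no predecessor, so its difference is NULL and is filtered out in the outer query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE readings (id INTEGER, val INTEGER, ts TEXT);
    INSERT INTO readings (id, val, ts) VALUES
        (2, 1,  '2015-05-09 12:54:39'),
        (3, 10, '2015-05-09 12:54:39'),
        (2, 1,  '2015-05-09 12:56:39'),
        (3, 10, '2015-05-09 12:56:39'),
        (2, 5,  '2015-05-09 13:48:30'),
        (3, 16, '2015-05-09 13:48:30'),
        (2, 7,  '2015-05-09 15:01:09'),
        (3, 20, '2015-05-09 15:01:09');
""")

QUERY = """
SELECT id, vali, ts
FROM (
    SELECT id,
           -- current value minus the value of the previous reading of the same id
           val - LAG(val) OVER (PARTITION BY id ORDER BY ts) AS vali,
           ts
    FROM readings
) AS diffs
WHERE vali IS NOT NULL   -- drop the first reading of each id
ORDER BY ts, id
"""

increments = conn.execute(QUERY).fetchall()
for row in increments:
    print(row)
```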

How can I select for the latest row based on two different attributes?

I need to select the latest activity code (A, V, W, J) for the following transactions (109, 154, 982, 745) for my employees. That is, for each employee I need the last transaction, but only if it is one of the listed transactions and has one of those activity codes. There are 2 tables involved, joined on the employee ID.
Table 1:
|Emp_id | STUFF
| 1 | stuff
| 2 | stuff
| 3 | stuff
Table 2:
|Emp_id | date | act_code | trans
| 1 | 1/1/17 | A | 109
| 1 | 3/4/12 | X | 203
| 1 | 2/14/09 | A | 154
| 2 | 1/1/17 | A | 110
| 2 | 6/6/13 | V | 109
| 3 | 12/13/16 | J | 982
| 3 | 11/23/14 | W | 745
| 4 | 12/13/16 | X | 154
| 4 | 11/23/14 | W | 745
What I’d like to return is:
|Emp_id | STUFF | date | act_code | trans
| 1 | stuff | 1/1/17 | A | 109
| 3 | stuff | 12/13/16 | J | 982
Emp 2 would not be selected because the latest trans is not one of the correct values. Emp 4 would not be selected because the latest act_code is not one of the correct values. Anyone have an idea as to how to go about this? Thanks in advance.
Here is one way.
Use ROW_NUMBER() to partition the rows by emp_id and order the rows by date:
SELECT t2.emp_id, t1.stuff, t2.date, t2.act_code, t2.trans,
ROW_NUMBER() over (PARTITION BY t2.emp_id ORDER BY t2.date DESC) RN
FROM Table1 t1
JOIN Table2 t2 on t1.emp_id = t2.emp_id;
Then filter this to only the most recent records (RN = 1) that have values in your lists with an outer select:
SELECT emp_id, stuff, date, act_code, trans
FROM (
SELECT t2.emp_id, t1.stuff, t2.date, t2.act_code, t2.trans,
ROW_NUMBER() over (PARTITION BY t2.emp_id ORDER BY t2.date DESC) RN
FROM Table1 t1
JOIN Table2 t2 on t1.emp_id = t2.emp_id
) A
WHERE RN = 1
AND trans IN (109, 154, 982, 745)
AND act_code IN ('A', 'V', 'W', 'J');
Use first_value to get the latest values of act_code and trans, then check whether they are in the specified lists.
select * from (
select distinct t1.emp_id,t1.stuff,
max(t2.date) over(partition by t2.emp_id) as latest_date,
first_value(t2.act_code) over(partition by t2.emp_id order by t2.date desc) as latest_act_code,
first_value(t2.trans) over(partition by t2.emp_id order by t2.date desc) as latest_trans
from tbl1 t1
join tbl2 t2 on t1.emp_id=t2.emp_id
) t
where latest_act_code in ('A','V','W','J') and latest_trans in (109, 154, 982, 745)
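A quick way to check the "latest row first, filter second" logic is to run it on the sample data. This sketch uses Python's sqlite3 module (assuming SQLite 3.25+ for ROW_NUMBER()). The column name act_date and the ISO date strings are my assumptions; the question's m/d/yy text would not sort correctly as strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (emp_id INTEGER, stuff TEXT);
    CREATE TABLE table2 (emp_id INTEGER, act_date TEXT, act_code TEXT, trans INTEGER);
    INSERT INTO table1 VALUES (1, 'stuff'), (2, 'stuff'), (3, 'stuff');
    INSERT INTO table2 VALUES
        (1, '2017-01-01', 'A', 109),
        (1, '2012-03-04', 'X', 203),
        (1, '2009-02-14', 'A', 154),
        (2, '2017-01-01', 'A', 110),
        (2, '2013-06-06', 'V', 109),
        (3, '2016-12-13', 'J', 982),
        (3, '2014-11-23', 'W', 745),
        (4, '2016-12-13', 'X', 154),
        (4, '2014-11-23', 'W', 745);
""")

QUERY = """
SELECT emp_id, stuff, act_date, act_code, trans
FROM (
    SELECT t2.emp_id, t1.stuff, t2.act_date, t2.act_code, t2.trans,
           -- rank every row of an employee, newest first, BEFORE filtering
           ROW_NUMBER() OVER (PARTITION BY t2.emp_id
                              ORDER BY t2.act_date DESC) AS rn
    FROM table1 t1
    JOIN table2 t2 ON t1.emp_id = t2.emp_id
) AS ranked
WHERE rn = 1
  AND trans IN (109, 154, 982, 745)
  AND act_code IN ('A', 'V', 'W', 'J')
ORDER BY emp_id
"""

latest_valid = conn.execute(QUERY).fetchall()
for row in latest_valid:
    print(row)
```

The filters sit in the outer query on purpose: putting them inside the subquery would rank only the matching rows and could return an older transaction for employee 2.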

Recursive SQL - count number of descendants in hierarchical structure

Consider a database table with the following columns:
mathematician_id
name
advisor1
advisor2
The database represents data from the Math Genealogy Project, where each mathematician usually has one single advisor, but there are situations when there are two advisors.
Visual aid to make things clearer: [image of the genealogy tree]
How do I count the number of descendants for each of the mathematicians?
I should probably use Common Table Expressions (WITH RECURSIVE), but I am pretty much stuck at the moment. All the similar examples I found deal with hierarchies having only one parent, not two.
Update:
I adapted the solution for SQL Server provided by Vladimir Baranov to also work in PostgreSQL:
WITH RECURSIVE cte AS (
SELECT m.id as start_id,
m.id,
m.name,
m.advisor1,
m.advisor2,
1 AS level
FROM public.mathematicians AS m
UNION ALL
SELECT cte.start_id,
m.id,
m.name,
m.advisor1,
m.advisor2,
cte.level + 1 AS level
FROM public.mathematicians AS m
INNER JOIN cte ON cte.id = m.advisor1
OR cte.id = m.advisor2
),
cte_distinct AS (
SELECT DISTINCT start_id, id
FROM cte
)
SELECT cte_distinct.start_id,
m.name,
COUNT(*)-1 AS descendants_count
FROM cte_distinct
INNER JOIN public.mathematicians AS m ON m.id = cte_distinct.start_id
GROUP BY cte_distinct.start_id, m.name
ORDER BY cte_distinct.start_id
You didn't say what DBMS you use. I'll use SQL Server for this example, but it will work in other databases that support recursive queries as well.
Sample data
I entered only the right part of your tree, starting from Euler.
The most interesting part is the multiple paths between Lagrange and Dirichlet.
DECLARE @T TABLE (ID int, name nvarchar(50), Advisor1ID int, Advisor2ID int);
INSERT INTO @T (ID, name, Advisor1ID, Advisor2ID) VALUES
(1, 'Euler', NULL, NULL),
(2, 'Lagrange', 1, NULL),
(3, 'Laplace', NULL, NULL),
(4, 'Fourier', 2, NULL),
(5, 'Poisson', 2, 3),
(6, 'Dirichlet', 4, 5),
(7, 'Lipschitz', 6, NULL),
(8, 'Klein', NULL, 7),
(9, 'Lindemann', 8, NULL),
(10, 'Furtwangler', 8, NULL),
(11, 'Hilbert', 9, NULL),
(12, 'Taussky-Todd', 10, NULL);
This is what it looks like:
SELECT * FROM @T;
+----+--------------+------------+------------+
| ID | name | Advisor1ID | Advisor2ID |
+----+--------------+------------+------------+
| 1 | Euler | NULL | NULL |
| 2 | Lagrange | 1 | NULL |
| 3 | Laplace | NULL | NULL |
| 4 | Fourier | 2 | NULL |
| 5 | Poisson | 2 | 3 |
| 6 | Dirichlet | 4 | 5 |
| 7 | Lipschitz | 6 | NULL |
| 8 | Klein | NULL | 7 |
| 9 | Lindemann | 8 | NULL |
| 10 | Furtwangler | 8 | NULL |
| 11 | Hilbert | 9 | NULL |
| 12 | Taussky-Todd | 10 | NULL |
+----+--------------+------------+------------+
Query
It is a classic recursive query with two interesting points.
1) The recursive part of the CTE joins to the anchor part using both Advisor1ID and Advisor2ID:
INNER JOIN CTE
ON CTE.ID = T.Advisor1ID
OR CTE.ID = T.Advisor2ID
2) Since there can be multiple paths to a descendant, the recursive query may output the same node several times. To eliminate these duplicates I used DISTINCT in CTE_Distinct. It may be possible to solve it more efficiently.
To understand better how the query works run each CTE separately and examine intermediate results.
WITH
CTE
AS
(
SELECT
T.ID AS StartID
,T.ID
,T.name
,T.Advisor1ID
,T.Advisor2ID
,1 AS Lvl
FROM @T AS T
UNION ALL
SELECT
CTE.StartID
,T.ID
,T.name
,T.Advisor1ID
,T.Advisor2ID
,CTE.Lvl + 1 AS Lvl
FROM
@T AS T
INNER JOIN CTE
ON CTE.ID = T.Advisor1ID
OR CTE.ID = T.Advisor2ID
)
,CTE_Distinct
AS
(
SELECT DISTINCT
StartID
,ID
FROM CTE
)
SELECT
CTE_Distinct.StartID
,T.name
,COUNT(*) AS DescendantCount
FROM
CTE_Distinct
INNER JOIN @T AS T ON T.ID = CTE_Distinct.StartID
GROUP BY
CTE_Distinct.StartID
,T.name
ORDER BY CTE_Distinct.StartID;
Result
+---------+--------------+-----------------+
| StartID | name | DescendantCount |
+---------+--------------+-----------------+
| 1 | Euler | 11 |
| 2 | Lagrange | 10 |
| 3 | Laplace | 9 |
| 4 | Fourier | 8 |
| 5 | Poisson | 8 |
| 6 | Dirichlet | 7 |
| 7 | Lipschitz | 6 |
| 8 | Klein | 5 |
| 9 | Lindemann | 2 |
| 10 | Furtwangler | 2 |
| 11 | Hilbert | 1 |
| 12 | Taussky-Todd | 1 |
+---------+--------------+-----------------+
Here DescendantCount counts the node itself as a descendant. You can subtract 1 from this result if you want to see 0 instead of 1 for the leaf nodes.
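As a possible variation on the duplicate handling, engines that allow UNION (instead of UNION ALL) inside a recursive CTE, such as PostgreSQL and SQLite, can drop repeated (start, node) pairs during the recursion itself. SQL Server does not allow this. Below is a minimal sketch of that idea, run through Python's sqlite3 module on the same sample data. The CTE name reach and the column names are placeholders of mine, and the final count subtracts 1 so that leaf nodes show 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mathematicians (
        id INTEGER PRIMARY KEY, name TEXT, advisor1 INTEGER, advisor2 INTEGER);
    INSERT INTO mathematicians (id, name, advisor1, advisor2) VALUES
        (1, 'Euler', NULL, NULL),
        (2, 'Lagrange', 1, NULL),
        (3, 'Laplace', NULL, NULL),
        (4, 'Fourier', 2, NULL),
        (5, 'Poisson', 2, 3),
        (6, 'Dirichlet', 4, 5),
        (7, 'Lipschitz', 6, NULL),
        (8, 'Klein', NULL, 7),
        (9, 'Lindemann', 8, NULL),
        (10, 'Furtwangler', 8, NULL),
        (11, 'Hilbert', 9, NULL),
        (12, 'Taussky-Todd', 10, NULL);
""")

QUERY = """
WITH RECURSIVE reach(start_id, id) AS (
    -- anchor: every mathematician reaches themselves
    SELECT id, id FROM mathematicians
    UNION          -- not UNION ALL: a pair already produced is not added again
    SELECT reach.start_id, m.id
    FROM mathematicians AS m
    JOIN reach ON reach.id = m.advisor1
               OR reach.id = m.advisor2
)
SELECT reach.start_id, m.name, COUNT(*) - 1 AS descendants
FROM reach
JOIN mathematicians AS m ON m.id = reach.start_id
GROUP BY reach.start_id, m.name
ORDER BY reach.start_id
"""

descendant_counts = conn.execute(QUERY).fetchall()
for row in descendant_counts:
    print(row)
```

Only the (start_id, id) pair is carried through the recursion; adding the level column would make rows reached over paths of different length distinct again and bring the duplicates back.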