How to partition and find the latest value in SQL - sql

I have a table as follows:
ID | col1 | Date Time
1 | WA | 2/11/20
1 | CI | 1/11/20
2 | CI | 2/11/20
2 | WA | 3/11/20
3 | WA | 2/10/20
3 | WA | 1/11/20
3 | WA | 2/11/20
4 | WA | 1/10/20
4 | CI | 2/10/20
4 | SA | 3/10/20
I want to find all ID values for which col1 has some other value in addition to WA, and whose latest value in col1 is 'WA'. That is, from the sample data above, only ID values 1 and 2 should be returned, because both have an additional value (CI) besides WA, yet their latest value is still WA.
How do I get that?
FYI, there could be some IDs that don't have a WA value at all; I want to eliminate them. I also want to eliminate those that only have the WA value.
Thanks for the help.

You can use window functions for this:
select distinct id
from (
select
t.*,
last_value(col1) over(partition by id order by datetime rows between unbounded preceding and unbounded following) last_col1,
min(col1) over(partition by id) min_col1,
max(col1) over(partition by id) max_col1
from mytable t
) t
where last_col1 = 'WA' and min_col1 <> max_col1
The inner query uses last_value() to recover the last value of col1 for the given id, and computes the min and max values in the same partition.
Then, the outer query filters on ids whose last value is 'WA' and that have at least two distinct values (which is phrased as the inequality of the min and max value).
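To check the window-function approach against the sample data, here is a minimal sketch that runs it in SQLite through Python's sqlite3 module. The choice of SQLite (3.25 or later, for window functions), the ISO date strings (reading the sample dates as day/month/year), and the use of first_value() over a descending order are assumptions made for illustration, not part of the original answer.

```python
import sqlite3

# Sample rows from the question; dates rewritten as ISO strings (assumed d/m/yy).
rows = [
    (1, "WA", "2020-11-02"), (1, "CI", "2020-11-01"),
    (2, "CI", "2020-11-02"), (2, "WA", "2020-11-03"),
    (3, "WA", "2020-10-02"), (3, "WA", "2020-11-01"), (3, "WA", "2020-11-02"),
    (4, "WA", "2020-10-01"), (4, "CI", "2020-10-02"), (4, "SA", "2020-10-03"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, col1 TEXT, datetime TEXT)")
conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", rows)

query = """
SELECT DISTINCT id
FROM (
    SELECT id,
           -- first row when sorted newest-first = latest value per id
           first_value(col1) OVER (PARTITION BY id ORDER BY datetime DESC) AS last_col1,
           min(col1) OVER (PARTITION BY id) AS min_col1,
           max(col1) OVER (PARTITION BY id) AS max_col1
    FROM mytable
) AS t
WHERE last_col1 = 'WA' AND min_col1 <> max_col1
ORDER BY id
"""
window_ids = [r[0] for r in conn.execute(query)]
print(window_ids)  # [1, 2]
```

ID 3 drops out because it only ever has WA, and ID 4 drops out because its latest value is SA.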

You can do this with aggregation:
select id
from t
group by id
having min(col1) <> max(col1) and -- at least two different values
max(case when col1 = 'WA' then datetime end) = max(datetime) -- last is WA
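The aggregation query can be tried the same way. This is a small sketch under the assumption that SQLite (via Python's sqlite3) is acceptable as a stand-in database and that the sample dates are day/month/year; the table name t matches the answer.

```python
import sqlite3

# Sample rows from the question; dates rewritten as ISO strings (assumed d/m/yy).
rows = [
    (1, "WA", "2020-11-02"), (1, "CI", "2020-11-01"),
    (2, "CI", "2020-11-02"), (2, "WA", "2020-11-03"),
    (3, "WA", "2020-10-02"), (3, "WA", "2020-11-01"), (3, "WA", "2020-11-02"),
    (4, "WA", "2020-10-01"), (4, "CI", "2020-10-02"), (4, "SA", "2020-10-03"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, col1 TEXT, datetime TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)

query = """
SELECT id
FROM t
GROUP BY id
HAVING min(col1) <> max(col1)                                          -- at least two different values
   AND max(CASE WHEN col1 = 'WA' THEN datetime END) = max(datetime)    -- last is WA
ORDER BY id
"""
agg_ids = [r[0] for r in conn.execute(query)]
print(agg_ids)  # [1, 2]
```

An ID with no WA row at all is excluded too: the CASE expression yields only NULLs for it, so the comparison in HAVING is never true.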

Related

Select all values (all rows) in one row Oracle

I get multiple rows after executing the select-query.
But I need to get all the values of these rows in one row.
̶C̶o̶u̶n̶t̶ ̶o̶f̶ ̶r̶o̶w̶s̶ ̶i̶s̶ ̶u̶n̶k̶n̶o̶w̶n̶ ̶(̶b̶e̶f̶o̶r̶e̶ ̶t̶h̶e̶ ̶̶̶s̶e̶l̶e̶c̶t̶̶̶-̶q̶u̶e̶r̶y̶ ̶i̶s̶ ̶e̶x̶e̶c̶u̶t̶e̶d̶)̶
For example:
|----------|-----------|
| **Name** | **Value** |
|----------|-----------|
| Alex | 150 |
|----------|-----------|
| Peter | 220 |
|----------|-----------|
| Katty | 34 |
|----------|-----------|
I want to get:
|-----------|-----------|-----------|-----------|-----------|-----------|
| **Col_1** | **Col_2** | **Col_3** | **Col_4** | **Col_5** | **Col_6** |
|-----------|-----------|-----------|-----------|-----------|-----------|
| Alex | 150 | Peter | 220 | Katty | 34 |
|-----------|-----------|-----------|-----------|-----------|-----------|
Oracle 11g.
UPDATE: I realized that with an unknown number of rows, the task is difficult, so I can assume that the number of rows will be known.
To pivot over a fixed number of columns, one option uses row_number() and conditional aggregation:
select
max(case when rn = 1 then name end) name1,
max(case when rn = 1 then value end) value1,
max(case when rn = 2 then name end) name2,
max(case when rn = 2 then value end) value2,
...
from (
select t.*, row_number() over(order by id) rn
from mytable t
) t
You need a column that defines the ordering of the rows in the original dataset (and of the columns in the resultset): I assumed id.
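As an illustration, the sketch below spells out all six columns for the three sample rows and runs the query in SQLite through Python's sqlite3 (an assumption; the question is about Oracle 11g, where the same SQL shape applies). The id column is the ordering column assumed by the answer and is not in the question's table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# id is a hypothetical ordering column, as assumed in the answer above.
conn.execute("CREATE TABLE mytable (id INTEGER, name TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [(1, "Alex", 150), (2, "Peter", 220), (3, "Katty", 34)],
)

query = """
SELECT max(CASE WHEN rn = 1 THEN name  END) AS name1,
       max(CASE WHEN rn = 1 THEN value END) AS value1,
       max(CASE WHEN rn = 2 THEN name  END) AS name2,
       max(CASE WHEN rn = 2 THEN value END) AS value2,
       max(CASE WHEN rn = 3 THEN name  END) AS name3,
       max(CASE WHEN rn = 3 THEN value END) AS value3
FROM (
    SELECT t.*, row_number() OVER (ORDER BY id) AS rn  -- number the rows 1, 2, 3
    FROM mytable AS t
) AS t
"""
pivoted = conn.execute(query).fetchone()
print(pivoted)  # ('Alex', 150, 'Peter', 220, 'Katty', 34)
```

Each max(CASE ...) picks the single non-NULL value for its row number, so the whole table collapses into one row.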
You might be better off putting the values into a string or JSON column. For instance, you can aggregate the names and values into separate strings:
select listagg(name, ',') within group (order by name) as names,
listagg(value, ',') within group (order by name) as vals
from t;
Or into a single string:
select listagg(name || ':' || value, ',') within group (order by name) as name_values
from t;
Note: In Oracle 11g, listagg() returns a VARCHAR2 of at most 4000 bytes and raises an error beyond that, so this only works on a small amount of data.

in SQL, how to remove distinct column values (not rows, as usually done)

I have a production case, for a supply chain. We have devices that are moved around in warehouses, and I need to find the previous warehouse locations.
I have a table like this:
+--------+------------+--------+--------+--------+
| device | current_WH | prev_1 | prev_2 | prev_3 |
+--------+------------+--------+--------+--------+
| 1 | AB | KK | KK | KK |
| 2 | DE | DE | DE | NQ |
| 3 | FF | MM | ST | ST |
+--------+------------+--------+--------+--------+
I need to find the distinct values of current_WH and the "prev" columns. So I'm not flattening rows, but narrowing columns. I need to get this:
+--------+------------+--------+--------+--------+
| device | current_WH | prev_1 | prev_2 | prev_3 |
+--------+------------+--------+--------+--------+
| 1 | AB | KK | blank | blank |
| 2 | DE | NQ | blank | blank |
| 3 | FF | MM | ST | blank |
+--------+------------+--------+--------+--------+
I'll figure out nulls or blanks later. But for now I need one row for each device that shows the current WH and previous locations. There could be any number - not always the same.
If I do "distinct" that flattens rows. Doing a distinct and group by doesn't achieve the requirement.
Any help is appreciated. Thanks!
One approach is to unpivot the columns into rows, which makes each value easier to compare with the one before it, and then pivot the rows back into the original layout.
Unpivot the columns into rows with UNION ALL, and add a grp column that records which column each value came from, so the original layout can be rebuilt later.
Use the LAG function to fetch the previous value, so it can be compared with the current one.
Use SUM with CASE WHEN as a window function to keep a running count of the rows whose previous value equals the current one.
If that running count is greater than 0, the value was repeated.
It looks like this:
with cteUnion as(
SELECT device,current_WH,0 grp
FROM T
UNION ALL
SELECT device,prev_1,1 grp
FROM T
UNION ALL
SELECT device,prev_2,2 grp
FROM T
UNION ALL
SELECT device,prev_3,3 grp
FROM T
),cte1 as(
SELECT *,
LAG(current_WH) over(partition by current_WH order by grp) perviosVal
from cteUnion
),cteResult as (
SELECT *,
(CASE WHEN sum(CASE WHEN perviosVal = current_WH then 1 else 0 end) over(partition by device order by grp) > 0 THEN 'Block' else current_WH end) val
FROM cte1
)
select device,
MAX(CASE WHEN grp = 0 then val end) current_WH ,
MAX(CASE WHEN grp = 1 then val end) prev_1,
MAX(CASE WHEN grp = 2 then val end) prev_2,
MAX(CASE WHEN grp = 3 then val end) prev_3
from cteResult
GROUP BY device
Note: the grp numbers follow the order of your columns (0 for current_WH, 1 for prev_1, and so on).
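For comparison, here is a sketch of a different way to do the same unpivot and pivot, not the LAG-based query above: keep only the first occurrence of each warehouse per device, renumber what is left, and pivot back. It runs in SQLite via Python's sqlite3 (an assumption; any database with ROW_NUMBER() should accept similar SQL), and the CTE names unpivoted, first_seen and ranked are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE T (device INTEGER, current_WH TEXT, prev_1 TEXT, prev_2 TEXT, prev_3 TEXT)"
)
conn.executemany(
    "INSERT INTO T VALUES (?, ?, ?, ?, ?)",
    [(1, "AB", "KK", "KK", "KK"), (2, "DE", "DE", "DE", "NQ"), (3, "FF", "MM", "ST", "ST")],
)

query = """
WITH unpivoted AS (   -- one row per (device, column); grp records the column order
    SELECT device, current_WH AS wh, 0 AS grp FROM T
    UNION ALL SELECT device, prev_1, 1 FROM T
    UNION ALL SELECT device, prev_2, 2 FROM T
    UNION ALL SELECT device, prev_3, 3 FROM T
),
first_seen AS (       -- keep only the first occurrence of each warehouse per device
    SELECT device, wh, MIN(grp) AS grp
    FROM unpivoted
    WHERE wh IS NOT NULL
    GROUP BY device, wh
),
ranked AS (           -- renumber the survivors 0, 1, 2, ... to close the gaps
    SELECT device, wh,
           ROW_NUMBER() OVER (PARTITION BY device ORDER BY grp) - 1 AS pos
    FROM first_seen
)
SELECT device,
       MAX(CASE WHEN pos = 0 THEN wh END) AS current_WH,
       MAX(CASE WHEN pos = 1 THEN wh END) AS prev_1,
       MAX(CASE WHEN pos = 2 THEN wh END) AS prev_2,
       MAX(CASE WHEN pos = 3 THEN wh END) AS prev_3
FROM ranked
GROUP BY device
ORDER BY device
"""
deduped = conn.execute(query).fetchall()
for row in deduped:
    print(row)
# (1, 'AB', 'KK', None, None)
# (2, 'DE', 'NQ', None, None)
# (3, 'FF', 'MM', 'ST', None)
```

The NULLs correspond to the "blank" cells in the desired output. Note that this variant also drops a warehouse that reappears later after a different one, which may or may not be what the business case needs.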

Get row which matched in each group

I am trying to write a SQL query. I got the results below from two tables, and they look right to me. Now I want only the values that are present in every group (that is, for every ID). For example, A and B are present for every ID, so I want only A and B in the result. I also want the query to be dynamic. Could anyone help?
| ID | Value |
|----|-------|
| 1 | A |
| 1 | B |
| 1 | C |
| 1 | D |
| 2 | A |
| 2 | B |
| 2 | C |
| 3 | A |
| 3 | B |
In the following query, I have placed your current query into a CTE for further use. We can try selecting those values for which every ID in your current result appears. This would imply that such values are associated with every ID.
WITH cte AS (
-- your current query
)
SELECT Value
FROM cte
GROUP BY Value
HAVING COUNT(DISTINCT ID) = (SELECT COUNT(DISTINCT ID) FROM cte);
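To see the query work on the sample rows, here is a minimal sketch in SQLite via Python's sqlite3. The table name results is hypothetical and stands in for "your current query" inside the CTE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# 'results' is a hypothetical table holding the output of the asker's current query.
conn.execute("CREATE TABLE results (ID INTEGER, Value TEXT)")
conn.executemany(
    "INSERT INTO results VALUES (?, ?)",
    [(1, "A"), (1, "B"), (1, "C"), (1, "D"),
     (2, "A"), (2, "B"), (2, "C"),
     (3, "A"), (3, "B")],
)

query = """
WITH cte AS (
    SELECT ID, Value FROM results   -- stands in for the current query
)
SELECT Value
FROM cte
GROUP BY Value
HAVING COUNT(DISTINCT ID) = (SELECT COUNT(DISTINCT ID) FROM cte)
ORDER BY Value
"""
common_values = [r[0] for r in conn.execute(query)]
print(common_values)  # ['A', 'B']
```

C appears for only two of the three IDs and D for one, so neither reaches the total of three distinct IDs.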
You can do this in at least two ways. Group by the letters (Value) and aggregate the IDs with SUM or COUNT, then keep the letters whose aggregate equals the SUM or COUNT over the distinct IDs of the whole table. Both queries assume that each (ID, Value) pair appears only once, and the SUM variant also assumes that the IDs are positive.
select Value from MyTable group by Value
having SUM(ID) = (SELECT SUM(DISTINCT ID) from MyTable);
select Value from MyTable group by Value
having COUNT(ID) = (SELECT COUNT(DISTINCT ID) from MyTable);
Use this:
WITH CTE
AS
(
SELECT
Value,
Cnt = COUNT(DISTINCT ID)
FROM T1
GROUP BY Value
)
SELECT
Value
FROM CTE
WHERE Cnt = (SELECT COUNT(DISTINCT ID) FROM T1)

Get values on 1 line based on unique identifier

I'm trying to write a query to get the values of a table placed onto a single line based on a specific key.
table.ID | table.ACCOUNT |
==================================
12345 | 456789 |
12345 | ABCDEF |
12345 | HIJKLM |
For example, I want to get all the ACCOUNTs for ID 12345 (above) onto one line so it looks like what is below.
table.ID | table.ACCOUNT1 | table.ACCOUNT2 | table.ACCOUNT3 |
====================================================================
12345 | 456789 | ABCDEF | HIJKLM |
I think I want to join the table to itself but I keep getting the same values in the 2nd and 3rd ACCOUNT fields (i.e. 456789 shows up in all 3).
If you don't really need separate columns for all of the accounts, consider using an aggregate function such as string_agg in Postgres:
SELECT id, string_agg(account, ',') FROM my_table GROUP BY id
This produces a result with two columns: the id and a string containing all of the accounts for that id, separated by commas. Here my_table is a placeholder for your table name, since table itself is a reserved word.
If you know there are at most three accounts per id, you can pivot the data. In most databases, you can use row_number() and conditional aggregation:
select id,
max(case when seqnum = 1 then account end) as account_1,
max(case when seqnum = 2 then account end) as account_2,
max(case when seqnum = 3 then account end) as account_3
from (select t.*,
row_number() over (partition by id order by account) as seqnum -- ordering by account keeps the numbering deterministic
from t
) t
group by id;
Unlike a string aggregation method, this puts the values into separate columns.

SQL Select First column and for each row select unique ID and the last date

I have a problem this morning: I have tried many solutions and none of them gave me the expected result.
I have a table that looks like this:
+----+----------+-------+
| ID | COL2 | DATE |
+----+----------+-------+
| 1 | 1 | 2001 |
| 1 | 2 | 2002 |
| 1 | 3 | 2003 |
| 1 | 4 | 2004 |
| 2 | 1 | 2001 |
| 2 | 2 | 2002 |
| 2 | 3 | 2003 |
| 2 | 4 | 2004 |
+----+----------+-------+
And I would like a query that returns a result like this:
for each unique ID, I want the row with the last date for that ID.
+----+----------+-------+
| ID | COL2 | DATE |
+----+----------+-------+
| 1 | 4 | 2004 |
| 2 | 4 | 2004 |
+----+----------+-------+
But I don't have any idea how I can do that.
I tried Join , CROSS APPLY ..
If you have some idea ,
Thank you
Clement FAYARD
-- table variable (@t) filled with the sample data from the question
declare @t table (ID INT, Col2 INT, Date INT)
insert into @t (ID, Col2, Date) values
(1,1,2001), (1,2,2002), (1,3,2003), (1,4,2004),
(2,1,2001), (2,2,2002), (2,3,2003), (2,4,2004)
;with cte as(
select
*,
rn = row_number() over(partition by ID order by Date desc) -- rn = 1 marks the latest row per ID
from @t
)
select
ID,
Col2,
Date
from cte
where
rn = 1
SELECT ID,MAX(Col2),MAX(Date) FROM tableName GROUP BY ID
If COL2 and DATE always reach their highest values in the same row, then you can try:
SELECT ID, MAX(COL2), MAX(DATE)
FROM Table1
GROUP BY ID
But that is not really a good solution.
The alternative is a correlated subquery with CROSS APPLY:
SELECT DISTINCT t.ID, sub1.COL2, sub1.DATE
FROM yourtable t
CROSS APPLY -- a plain INNER JOIN cannot reference t inside the derived table
(SELECT TOP 1 sub2.COL2, sub2.DATE
FROM yourtable sub2
WHERE sub2.ID = t.ID
ORDER BY sub2.DATE DESC, sub2.COL2 DESC) sub1
You didn't tell what's the name of your table so I'll assume below it is tbl:
SELECT m.ID, m.COL2, m.DATE
FROM tbl m
LEFT JOIN tbl o ON m.ID = o.ID AND m.DATE < o.DATE
WHERE o.DATE is NULL
ORDER BY m.ID ASC
Explanation:
The query left joins the table tbl aliased as m (for "max") against itself (alias o, for "others") using the column ID; the condition m.DATE < o.DATE will combine all the rows from m with rows from o having a greater value in DATE. The row having the maximum value of DATE for a given value of ID from m has no pair in o (there is no value greater than the maximum value). Because of the LEFT JOIN this row will be combined with a row of NULLs. The WHERE clause selects only these rows that have NULL for o.DATE (i.e. they have the maximum value of m.DATE).
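The self-join can be checked against the question's data with a short sketch. It uses SQLite through Python's sqlite3, which is an assumption, since the thread does not name a database for this answer; the table name tbl follows the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (ID INTEGER, COL2 INTEGER, DATE INTEGER)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?)",
    [(1, 1, 2001), (1, 2, 2002), (1, 3, 2003), (1, 4, 2004),
     (2, 1, 2001), (2, 2, 2002), (2, 3, 2003), (2, 4, 2004)],
)

query = """
SELECT m.ID, m.COL2, m.DATE
FROM tbl m
LEFT JOIN tbl o ON m.ID = o.ID AND m.DATE < o.DATE  -- pair each row with any later row of the same ID
WHERE o.DATE IS NULL                                -- keep rows that have no later row
ORDER BY m.ID ASC
"""
latest_rows = conn.execute(query).fetchall()
print(latest_rows)  # [(1, 4, 2004), (2, 4, 2004)]
```

If two rows of the same ID share the maximum DATE, both survive the filter, so this pattern returns ties rather than picking one.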
Check the SQL Antipatterns: Avoiding the Pitfalls of Database Programming book for other SQL tips.
To do this you must exclude COL2. Your query should look like this:
SELECT ID, MAX(DATE)
FROM table_name
GROUP BY ID
The above query produces the maximum date for each ID.
Including COL2 in that query does not make sense, unless you want the maximum date for each combination of ID and COL2.
In that case you can run:
SELECT ID, COL2, MAX(DATE)
FROM table_name
GROUP BY ID, COL2;
When you use aggregate functions (like max()), you must group by all the other, non-aggregated columns in the select list.
I think you are facing this problem because the design of the table has some fundamental flaws. Usually ID should be a primary key (which is unique), but in this table the IDs are repeated. I do not understand the business logic behind the table, but the design seems flawed to me.