I have a table with four columns: Type, SubT, Info and Timestamp.
Type and SubT are the type and subtype of the entry, Info is a string which is different for each row, and Timestamp is the day/time when the entry happened.
I'm looking for a SQL statement which gives me, in one query, the most recent (with respect to Timestamp) row for each Type and SubT, including all other columns.
So, with this data:
Type | SubT | Info | Timestamp
-----+------+-------------------+--------------
1 | 2 | Hello | 20190223T1300
1 | 3 | Fuuuu | 20190223T1301
1 | 3 | Baaar | 20190223T1400
3 | 2 | Something | 20190222T1300
I would like to get that result:
Type | SubT | Info | Timestamp
-----+------+-------------------+--------------
1 | 2 | Hello | 20190223T1300
1 | 3 | Baaar | 20190223T1400
3 | 2 | Something | 20190222T1300
In other words:
All columns for each existing combination of Type and SubT, and if there are several rows for one Type/SubT, the one with the most recent timestamp.
To generalize: there could be more than two "distinct" columns like Type and SubT, and there could be more than one column like Info which is taken from the row with the most current timestamp.
Any suggestions?
You can use the query below.
select t1.*
from table_name t1
inner join (
    select type, subt, max(timestamp) as max_ts
    from table_name
    group by type, subt
) t2
    on t1.type = t2.type
   and t1.subt = t2.subt
   and t1.timestamp = t2.max_ts;
A simple method is a correlated subquery in the where clause:
select t.*
from t
where t.timestamp = (select max(t2.timestamp)
from t t2
where t2.type = t.type and t2.subt = t.subt
);
In many databases, this will be a highly performant solution, particularly with an index on (type, subt, timestamp desc).
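For example, a sketch of such an index, using the table and column names from the snippet above (the index name is made up):
-- covering index: the correlated MAX(timestamp) per (type, subt) can be answered from the index
create index idx_t_type_subt_ts on t (type, subt, timestamp desc);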
Use a correlated subquery:
select t1.* from table t1
where t1.Timestamp = (select max(Timestamp) from table t2
                      where t1.SubT = t2.SubT and t1.type = t2.type
                     )
Or ROW_NUMBER(), which most DBMSs support:
select *
from (
    select *, row_number() over (partition by SubT, Type order by Timestamp desc) rn
    from table
) t
where t.rn = 1
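To cover the generalization from the question (more key columns like Type/SubT, and more than one payload column like Info), the ROW_NUMBER() version only needs the extra key columns added to PARTITION BY; selecting t.* already carries every payload column along. A sketch, assuming the table is named my_table and using hypothetical extra columns SubT2 and Info2:
select *
from (
    select t.*,   -- brings Info, Info2 and any other payload columns along
           row_number() over (partition by Type, SubT, SubT2
                              order by Timestamp desc) rn
    from my_table t
) x
where x.rn = 1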
Hi, I'm new to SQL. How can I implement this process in SQL, or perhaps with Python if needed:
First, from table1, I randomly selected two results:
SELECT TOP 2 id, date
FROM table1
WHERE date >= 2 AND date <= 6
ORDER BY RAND(CHECKSUM(*) * RAND())
+-----------+
| table1 |
+-----------+
| id | date |
+----+------+
| x | 3 |
| y | 4 |
+----+------+
I need to use the values x and y as conditions to produce another table. For instance, using x, I can:
SELECT id, date
FROM table1
WHERE date >= 2 AND date <= 6 AND id = 'x'
ORDER BY date ASC
+-----------+
| table2 |
+-----------+
| id | date |
+----+------+
| x | 3 |
| x | 4 |
| x | 5 |
| x | 6 |
| x | 6 |
+----+------+
What I need is the length of table2 without duplicates on date. For instance, table2 has 5 rows, but the last two duplicate the date, so the final answer is 4 rows.
For id = y, I have to do the same thing (say table3) and compare the lengths of table3 and table2 to see if they are consistent.
If yes, then return the length (say, 4 rows); if no, then go back to table1 and select another two ids (say, z and y).
I was thinking of using Python to select values or create variables, then use the Python variables in SQL, but that is too much for a new learner. I'd really appreciate it if someone could help me with this process.
You can use subqueries with an IN clause.
Here is also a version with two dimensions; maybe this will help as well.
CREATE TABLE table1 ([id] varchar(2),[date] int)
GO
SELECT id, date FROM table1
where date >= 2 and date <= 6
and id IN (
SELECT TOP 2 id FROM table1
WHERE date >= 2 and date <= 6
ORDER BY RAND(CHECKSUM(*) * RAND())
)
ORDER BY date ASC
GO
SELECT id, date FROM table1
WHERE EXISTS (SELECT 1
FROM (
SELECT TOP 2 id,[date] FROM table1
WHERE date >= 2 and date <= 6
ORDER BY RAND(CHECKSUM(*) * RAND())) AS table2
WHERE table1.[id] = table2.[id]
AND table1.[date] = table2.[date])
GO
db<>fiddle here
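The comparison step from the question (the "length" of table2 vs. table3 without duplicate dates) can also stay in SQL by counting distinct dates per chosen id. A sketch, assuming SQL Server as above; the #sample temp table name is made up for illustration:
-- draw the two random ids once, so both counts come from the same sample
-- (like the queries above, this can pick the same id twice if table1 repeats ids)
SELECT TOP 2 id
INTO #sample
FROM table1
WHERE date >= 2 and date <= 6
ORDER BY RAND(CHECKSUM(*) * RAND())
GO
-- one row per chosen id with its number of distinct dates;
-- if both rows show the same count, the two "lengths" are consistent
SELECT t.id, COUNT(DISTINCT t.[date]) AS distinct_dates
FROM table1 t
INNER JOIN #sample s ON s.id = t.id
WHERE t.date >= 2 and t.date <= 6
GROUP BY t.id
GO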
Our Oracle 11g database has a table that tracks changes to a document, similar to the following:
ID | DATE | STATUS_ID
203 | 10-02-2017 | 2
203 | 10-04-2017 | 3
168 | 08-15-2017 | 2
203 | 11-01-2017 | 4
I'd like to somehow make 2 new columns with records based on the DATE/STATUS_ID of the above table. The data above would be converted to the following (status 2 = Open, status 4 = Closed; I don't need any other statuses):
ID | OPEN_DATE | CLOSED_DATE
203 | 10-02-2017 | 11-01-2017
168 | 08-15-2017 | NULL
I think that this is similar to a pivot table or crosstab query from Excel or Access but I'm not sure how to approach this. I'd like to ensure that each document ID only has 1 row and the applicable status dates are put in the right columns.
Any and all guidance would be greatly appreciated.
There is a simple option that avoids the cost of joins: conditional aggregation.
SELECT
    ID,
    MIN(CASE WHEN status_id = 2 THEN date END) AS OPEN_DATE,
    MAX(CASE WHEN status_id = 4 THEN date END) AS CLOSED_DATE
FROM
    yourTable
WHERE
    status_id IN (2, 4)
GROUP BY
    ID
Try this.
select t1.id, t1.date AS fromdate, t2.date AS todate
from
(SELECT id, date FROM newtracker WHERE status_id = 2) t1
left join
(SELECT id, date FROM newtracker WHERE status_id = 4) t2
on
t1.id = t2.id
Try this out:
select a.*,b.end_date
from
(select id,date as start_date
from test
where status_id = 2 ) a
left join
(select id,date as end_date
from test
where status_id = 4 ) b
on a.id = b.id;
Let me know if you have any questions or need an explanation.
Platform: Oracle 10g
I have a table (let's call it t1) like this:
ID | FK_ID | SOME_VALUE | SOME_DATE
----+-------+------------+-----------
1 | 101 | 10 | 1-JAN-2013
2 | 101 | 20 | 1-JAN-2014
3 | 101 | 30 | 1-JAN-2015
4 | 102 | 150 | 1-JAN-2013
5 | 102 | 250 | 1-JAN-2014
6 | 102 | 350 | 1-JAN-2015
For each FK_ID I wish to show a single result showing the two most recent SOME_VALUEs. That is:
FK_ID | CURRENT | PREVIOUS
------+---------+---------
101 | 30 | 20
102 | 350 | 250
There is another table (let's call it t2) for the FK_ID, and it is here that there is a reference
saying which is the 'CURRENT' record. So, a table like:
ID | FK_CURRENT | OTHER_FIELDS
----+------------+-------------
101 | 3 | ...
102 | 6 | ...
I was attempting this with a flawed subquery join along the lines of:
SELECT id, curr.some_value as current, prev.some_value as previous FROM t2
JOIN t1 curr ON t2.fk_current = t1.id
JOIN t1 prev ON t1.id = (
SELECT * FROM (
SELECT id FROM (
SELECT id, ROW_NUMBER() OVER (ORDER BY SOME_DATE DESC) as rno FROM t1
WHERE t1.fk_id = t2.id
) WHERE rno = 2
)
)
However, the t1.fk_id = t2.id correlation is flawed (i.e. won't run), as (I now know) you can't reference a parent
field value in a subquery more than one level deep.
Then I started wondering if Common Table Expressions (CTEs) are the tool for this, but I have no
experience using them (so I'd like to know I'm not going down the wrong track attempting to use them, if that is the right tool).
So I guess the key complexity that is tripping me up is:
Determining the previous value by ordering, while limiting it to a single record (and not the whole table). (Hence the somewhat convoluted subquery attempt.)
Otherwise, I can just write some code to first execute a query to get the 'current' value, and then
execute a second query to get the 'previous' - but I'd love to know how to solve this with a single
SQL query as it seems this would be a common enough thing to do (sure is with the DB I need to work
with).
Thanks!
Try an approach with the LAG function:
SELECT FK_ID ,
SOME_VALUE as "CURRENT",
PREV_VALUE as Previous
FROM (
SELECT t1.*,
lag( some_value ) over (partition by fk_id order by some_date ) prev_value
FROM t1
) x
JOIN t2 on t2.id = x.fk_id
and t2.fk_current = x.id
Demo: http://sqlfiddle.com/#!4/d3e640/15
Try this:
select t1.FK_ID, t1.SOME_VALUE as "CURRENT",
       (select p2.SOME_VALUE from t1 p2 where p2.id = p1.id2 and p2.fk_id = p1.fk_id) as PREVIOUS
from t1
inner join
(
    select fk_id, max(id) as id1, max(id) - 1 as id2
    from t1
    group by fk_id
) p1 on t1.id = p1.id1
I have a problem this morning; I have tried many solutions and nothing gave me the expected result.
I have a table that looks like this:
+----+----------+-------+
| ID | COL2 | DATE |
+----+----------+-------+
| 1 | 1 | 2001 |
| 1 | 2 | 2002 |
| 1 | 3 | 2003 |
| 1 | 4 | 2004 |
| 2 | 1 | 2001 |
| 2 | 2 | 2002 |
| 2 | 3 | 2003 |
| 2 | 4 | 2004 |
+----+----------+-------+
I need a query that returns a result like this:
I want the unique IDs, and for each ID I want to take its last date.
+----+----------+-------+
| ID | COL2 | DATE |
+----+----------+-------+
| 1 | 4 | 2004 |
| 2 | 4 | 2004 |
+----+----------+-------+
But I don't have any idea how I can do that.
I tried JOIN, CROSS APPLY...
If you have any ideas, thank you.
Clement FAYARD
declare #t table (ID INT,Col2 INT,Date INT)
insert into #t(ID,Col2,Date)values (1,1,2001)
insert into #t(ID,Col2,Date)values (1,2,2002)
insert into #t(ID,Col2,Date)values (1,3,2003)
insert into #t(ID,Col2,Date)values (1,4,2004)
insert into #t(ID,Col2,Date)values (2,1,2001)
insert into #t(ID,Col2,Date)values (2,2,2002)
insert into #t(ID,Col2,Date)values (2,3,2003)
insert into #t(ID,Col2,Date)values (2,4,2004)
;with cte as(
select
*,
rn = row_number() over(partition by ID order by Date desc)
from #t
)
select
ID,
Col2,
Date
from cte
where
rn = 1
SELECT ID,MAX(Col2),MAX(Date) FROM tableName GROUP BY ID
If Col2 and Date always have their highest values on the same row, then you can try
SELECT ID, MAX(COL2), MAX(DATE)
FROM Table1
GROUP BY ID
But it is not really good.
The alternative is a correlated subquery:
SELECT yourtable.ID, sub1.COL2, sub1.DATE
FROM yourtable
INNER JOIN -- try with CROSS APPLY for performance AND without ON 1=1
(SELECT TOP 1 COL2, DATE
 FROM yourtable sub2
 WHERE sub2.ID = yourtable.ID
 ORDER BY DATE DESC, COL2 DESC) sub1 ON 1=1
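As the comment says, the correlation only becomes legal with CROSS APPLY (a joined derived table cannot reference yourtable). A sketch of that variant, assuming SQL Server and ordering by DATE descending to get the latest row per ID:
SELECT ids.ID, sub1.COL2, sub1.DATE
FROM (SELECT DISTINCT ID FROM yourtable) ids   -- one row per ID
CROSS APPLY
    (SELECT TOP 1 sub2.COL2, sub2.DATE
     FROM yourtable sub2
     WHERE sub2.ID = ids.ID
     ORDER BY sub2.DATE DESC) sub1             -- latest row for that ID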
You didn't tell what's the name of your table so I'll assume below it is tbl:
SELECT m.ID, m.COL2, m.DATE
FROM tbl m
LEFT JOIN tbl o ON m.ID = o.ID AND m.DATE < o.DATE
WHERE o.DATE is NULL
ORDER BY m.ID ASC
Explanation:
The query left joins the table tbl aliased as m (for "max") against itself (alias o, for "others") using the column ID; the condition m.DATE < o.DATE will combine all the rows from m with rows from o having a greater value in DATE. The row having the maximum value of DATE for a given value of ID from m has no pair in o (there is no value greater than the maximum value). Because of the LEFT JOIN this row will be combined with a row of NULLs. The WHERE clause selects only these rows that have NULL for o.DATE (i.e. they have the maximum value of m.DATE).
Check the SQL Antipatterns: Avoiding the Pitfalls of Database Programming book for other SQL tips.
In order to do this you MUST exclude COL2. Your query should look like this:
SELECT ID, MAX(DATE)
FROM table_name
GROUP BY ID
The above query produces the Maximum Date for each ID.
Having COL2 in that query does not make sense, unless you want the maximum date for each combination of ID and COL2.
In that case you can run:
SELECT ID, COL2, MAX(DATE)
FROM table_name
GROUP BY ID, COL2;
When you use aggregate functions (like MAX()), you must always group by all the other columns in the select statement.
I think you are facing this problem because you have some fundamental flaws in the design of the table. Usually ID should be a primary key (which is unique). In this table you have repeated IDs. I do not understand the business logic behind the table, but it seems to have some flaws to me.
Each row in my table belongs to some category, has some value and other data.
I would like to select each category with the most common value for it (doesn't matter which one if there are multiple), ordered by category.
some_table:
+--------+-------+----
|category| value | ...
+--------+-------+----
|   1    |   a   |
|   1    |   a   |
|   1    |   b   |
|   2    |   a   |
|   2    |   b   |
|   2    |   c   |
|   2    |   b   |
|   3    |   a   |
|   3    |   a   |
|   3    |   b   |
|   3    |   b   |
+--------+-------+----
expected result:
+--------+-------+
|category| value |
+--------+-------+
|   1    |   a   |
|   2    |   b   |
|   3    |   a   |  (or b)
+--------+-------+
I have a solution (posting it as an answer) but it seems suboptimal to me. So I'm looking for better solutions.
My table will have up to 10000 rows (possibly, but not likely, beyond that).
I'm planning to use SQLite but I'm not tied to it, so I may reconsider if SQLite can't do this with reasonable performance.
I would be inclined to do this using a correlated subquery:
select distinct category,
(select value
from some_table t2
where t2.category = t.category
group by value
order by count(*) desc
limit 1
) as mode_value
from some_table t;
The name for the most common value is "mode" in statistics.
And, if you had a categories table, this would be written as:
select category,
(select value
from some_table t2
where t2.category = c.category
group by value
order by count(*) desc
limit 1
) as mode_value
from categories c;
Here is one option, but I think it's slow...
SELECT DISTINCT t.`category`, t.`value`
FROM `some_table` t
WHERE t.`value` = (
    SELECT `value`
    FROM `some_table` t2
    WHERE t2.`category` = t.`category`
    GROUP BY `value`
    ORDER BY COUNT(`value`) DESC LIMIT 1)
ORDER BY t.`category`;
If the table has a unique/primary key column, you can replace part of this with WHERE `id` = (SELECT `id` ...); then the LIMIT 1 is not needed.
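For clarity, that id-based variant might look like the following (a sketch, assuming a unique column named `id` in some_table; LIMIT 1 is kept for explicitness, and DISTINCT is no longer needed because at most one row per category can match the key):
SELECT t.`category`, t.`value`
FROM `some_table` t
WHERE t.`id` = (
    SELECT `id`
    FROM `some_table` t2
    WHERE t2.`category` = t.`category`
    GROUP BY `value`
    ORDER BY COUNT(`value`) DESC LIMIT 1)
ORDER BY t.`category`;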
select category, value, count(*) value_count
from some_table t
group by category, value
order by category, value_count DESC;
This returns the count of each value in each category.
select category, value
from (
    select category, value, count(*) value_count
    from some_table t
    group by category, value
    order by category, value_count DESC) sub
group by category
Actually we need the first value for each category because the subquery is sorted.
I am not sure SQLite keeps the first one per group and I can't test it, but IMHO it should work.
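If relying on the subquery's sort order feels fragile, SQLite's documented bare-column behaviour with a single MAX() aggregate is an alternative: the other selected columns are taken from the row that supplies the maximum. A SQLite-specific sketch:
-- value is taken from the row holding MAX(value_count) within each category
select category, value, max(value_count) as value_count
from (
    select category, value, count(*) as value_count
    from some_table
    group by category, value
)
group by category
order by category;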