For example, I have 2 time tables:
T1
id time
1 18:12:02
2 18:46:57
3 17:49:44
4 12:19:24
5 11:00:01
6 17:12:45
and T2
id time
1 18:13:02
2 17:46:57
I need to get the times from T1 that are closest to the times in T2. There is no relationship between these tables.
It should be something like this:
select T1.calldatetime
from T1, T2
where T1.calldatetime between
T2.calldatetime-(
select MIN(ABS(T2.calldatetime-T1.calldatetime))
from T2, T1)
and
T2.calldatetime+(
select MIN(ABS(T2.calldatetime-T1.calldatetime))
from T2, T1)
But I can't get it. Any suggestions?
You only need a single Cartesian join to solve your problem, unlike the other solutions, which use several. I assume time is stored as a VARCHAR2. If it is stored as a DATE (which I would highly recommend), you can remove the TO_DATE functions, but you will have to remove the date portions so that only the times are compared.
I've made it slightly verbose so it's obvious what's going on.
select *
from ( select id, tm
, rank() over ( partition by t2id order by difference asc ) as rnk
from ( select t1.*, t2.id as t2id
, abs( to_date(t1.tm, 'hh24:mi:ss')
- to_date(t2.tm, 'hh24:mi:ss')) as difference
from t1
cross join t2
) a
)
where rnk = 1
Basically, this works out the absolute difference between every time in T1 and T2 then picks the smallest difference by T2 ID; returning the data from T1.
Here it is in SQL Fiddle format.
The less pretty (but shorter) format is:
select *
from ( select t1.*
, rank() over ( partition by t2.id
order by abs(to_date(t1.tm, 'hh24:mi:ss')
- to_date(t2.tm, 'hh24:mi:ss'))
) as rnk
from t1
cross join t2
) a
where rnk = 1
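To try the ranking idea without an Oracle instance, here is a minimal sketch using Python's built-in sqlite3 module. It assumes SQLite 3.25 or later (for window functions), stores the times as text in a column named tm as above, and swaps TO_DATE for SQLite's strftime('%s', ...), so the difference comes out in seconds. The aliases a and ranked are just illustrative names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER, tm TEXT);
    CREATE TABLE t2 (id INTEGER, tm TEXT);
    INSERT INTO t1 VALUES (1, '18:12:02'), (2, '18:46:57'), (3, '17:49:44'),
                          (4, '12:19:24'), (5, '11:00:01'), (6, '17:12:45');
    INSERT INTO t2 VALUES (1, '18:13:02'), (2, '17:46:57');
""")

# One cross join computes every T1/T2 difference in seconds;
# RANK() then keeps the smallest difference per T2 id.
closest = conn.execute("""
    SELECT t2id, id, tm, difference
    FROM (
        SELECT id, tm, t2id, difference,
               RANK() OVER (PARTITION BY t2id ORDER BY difference ASC) AS rnk
        FROM (
            SELECT t1.id, t1.tm, t2.id AS t2id,
                   ABS(CAST(strftime('%s', t1.tm) AS INTEGER)
                     - CAST(strftime('%s', t2.tm) AS INTEGER)) AS difference
            FROM t1 CROSS JOIN t2
        ) AS a
    ) AS ranked
    WHERE rnk = 1
    ORDER BY t2id
""").fetchall()

for row in closest:
    print(row)
```

Because RANK() is used rather than ROW_NUMBER(), two T1 rows that are equally close to the same T2 row would both be returned.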
I believe this is the query you are looking for:
CREATE TABLE t1(id INTEGER, time DATE);
CREATE TABLE t2(id INTEGER, time DATE);
INSERT INTO t1 VALUES (1, TO_DATE ('18:12:02', 'HH24:MI:SS'));
INSERT INTO t1 VALUES (2, TO_DATE ('18:46:57', 'HH24:MI:SS'));
INSERT INTO t1 VALUES (3, TO_DATE ('17:49:44', 'HH24:MI:SS'));
INSERT INTO t1 VALUES (4, TO_DATE ('12:19:24', 'HH24:MI:SS'));
INSERT INTO t1 VALUES (5, TO_DATE ('11:00:01', 'HH24:MI:SS'));
INSERT INTO t1 VALUES (6, TO_DATE ('17:12:45', 'HH24:MI:SS'));
INSERT INTO t2 VALUES (1, TO_DATE ('18:13:02', 'HH24:MI:SS'));
INSERT INTO t2 VALUES (2, TO_DATE ('17:46:57', 'HH24:MI:SS'));
SELECT t1.*, t2.*
FROM t1, t2,
( SELECT t2.id, MIN (ABS (t2.time - t1.time)) diff
FROM t1, t2
GROUP BY t2.id) b
WHERE ABS (t2.time - t1.time) = b.diff;
Make sure that the time columns have the same date part, because the t2.time - t1.time part won't work otherwise.
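To illustrate why the date part matters, here is a small Python sketch (not Oracle; the dates are made up for the example). Two values whose clock times are one minute apart look more than a day apart if their date parts differ, and the gap shrinks to 60 seconds once both are moved onto the same anchor date.

```python
from datetime import date, datetime


def seconds_apart(a, b):
    """Absolute gap between two datetimes, in seconds."""
    return abs((a - b).total_seconds())


# Hypothetical values: same clock times as the first rows of T1 and T2,
# but stored with different date parts.
t1_value = datetime(2024, 1, 1, 18, 12, 2)
t2_value = datetime(2024, 1, 2, 18, 13, 2)

raw_gap = seconds_apart(t2_value, t1_value)

# Put both clock times on one arbitrary anchor date before subtracting.
anchor = date(2000, 1, 1)
same_day_gap = seconds_apart(
    datetime.combine(anchor, t2_value.time()),
    datetime.combine(anchor, t1_value.time()),
)

print(raw_gap, same_day_gap)
```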
EDIT: Thanks for the accept, but Ben's answer below is better. It uses Oracle analytic functions and will perform much better.
This one selects the row(s) from T1 that have the smallest distance to any row in T2:
select T1.id, T1.calldatetime from T1, T2
where ABS(T2.calldatetime-T1.calldatetime)
=( select MIN(ABS(T2.calldatetime-T1.calldatetime))from T1, T2);
(Tested with MySQL; I hope you don't get an ORA error from it.)
Edit: according to the last comment, it should look like this:
drop table t1;
drop table t2;
create table t1(id int, t time);
create table t2(id int, t time);
insert into t1 values (1, '18:12:02');
insert into t1 values (2, '18:46:57');
insert into t1 values (3, '17:49:44');
insert into t1 values (4, '12:19:24');
insert into t1 values (5, '11:00:01');
insert into t1 values (6, '17:12:45');
insert into t2 values (1, '18:13:02');
insert into t2 values (2, '17:46:57');
select ot2.id, ot2.t, ot1.id, ot1.t from t2 ot2, t1 ot1
where ABS(ot2.t-ot1.t)=
(select min(abs(t2.t-t1.t)) from t1, t2 where t2.id=ot2.id)
Produces:
id t id t
1 18:13:02 1 18:12:02
2 17:46:57 3 17:49:44
Another way of using analytic functions.
It may look strange :)
select id, time,
case
when to_date(time, 'hh24:mi:ss') - to_date(lag_time, 'hh24:mi:ss') < to_date(lead_time, 'hh24:mi:ss') - to_date(time, 'hh24:mi:ss')
then lag_time
else lead_time
end closest_time
from (
select id, tbl,
LAG(time, 1, null) OVER (ORDER BY time) lag_time,
time,
LEAD(time, 1, null) OVER (ORDER BY time) lead_time
from
(
select id, time, 1 tbl from t1
union all
select id, time, 2 tbl from t2
)
)
where tbl = 2
To SQLFiddle... and beyond!
Try this query. It's a little lengthy; I will try to optimize it.
select * from t1
where id in (
select id1 from
(select id1,id2,
rank() over (partition by id2 order by diff) rnk
from
(select distinct t1.id id1,t2.id id2,
round(min(abs(to_date(t1.time,'HH24:MI:SS') - to_date(t2.time,'HH24:MI:SS'))),2) diff
from
t1,t2
group by t1.id,t2.id) )
where rnk = 1);
I have 2 tables as follows:
Table1:
ID Date
1 2022-01-01
2 2022-02-01
3 2022-02-05
Table2
ID Date Amount
1 2021-08-01 15
1 2022-02-10 15
2 2022-02-15 20
2 2021-01-01 15
2 2022-02-20 20
1 2022-03-01 15
I want to select only the rows in Table2 whose Date is past the Date in Table1 for the same ID, then calculate the sum of the amounts and the max(date) of those rows, grouped by ID.
So the result would look like
ID Date Amount
1 2022-03-01 30
2 2022-02-20 40
SQL newbie here... I tried an inner join, but wasn't able to pass the date filter along...
Tried query:
with table1 as (select * from table1)
,table2 as (select * from table2)
select * from table1 a
inner join table2 b on (a.id=b.id)
Thanks!
Much like Paul, I would use a JOIN, but I would put the filter conditions in the ON clause, so that if you join to more tables, it's cleaner for the SQL optimizer to see what the intent is on a per table/join basis. I would also alias the tables and qualify every column with its alias, so there is no room for confusion about where a value is coming from. As a habit, this makes life easier when composing more complex SQL or cutting and pasting into bigger blocks of code.
So, with some CTEs for the data:
WITH table1(id, date) AS (
SELECT * FROM VALUES
(1, '2022-01-01'),
(2 , '2022-02-01'),
(3 , '2022-02-05')
), table2(id, date, amount) AS (
SELECT * FROM VALUES
(1, '2021-08-01'::date, 15),
(1, '2022-02-10'::date, 15),
(2, '2022-02-15'::date, 20),
(2, '2021-01-01'::date, 15),
(2, '2022-02-20'::date, 20),
(1, '2022-03-01'::date, 15)
)
The following SQL:
SELECT a.id,
max(b.date) as max_date,
sum(b.amount) as sum_amount
FROM table1 AS a
JOIN table2 AS b
ON a.id = b.id AND a.date <= b.date
GROUP BY 1
ORDER BY 1;
ID MAX_DATE SUM_AMOUNT
1 2022-03-01 30
2 2022-02-20 40
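One case where the choice between ON and WHERE actually changes the result is an outer join. The sketch below goes a step beyond the answer above: it is written for Python's sqlite3 module rather than Snowflake, stores the dates as ISO text, and uses a LEFT JOIN that the original query does not have. With the date filter in ON, ID 3 is kept with NULLs; with the same filter in WHERE, ID 3 disappears.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, date TEXT);
    CREATE TABLE table2 (id INTEGER, date TEXT, amount INTEGER);
    INSERT INTO table1 VALUES (1, '2022-01-01'), (2, '2022-02-01'), (3, '2022-02-05');
    INSERT INTO table2 VALUES
        (1, '2021-08-01', 15), (1, '2022-02-10', 15), (2, '2022-02-15', 20),
        (2, '2021-01-01', 15), (2, '2022-02-20', 20), (1, '2022-03-01', 15);
""")

# Date filter inside ON: unmatched table1 rows survive with NULL aggregates.
filter_in_on = conn.execute("""
    SELECT a.id, MAX(b.date) AS max_date, SUM(b.amount) AS sum_amount
    FROM table1 AS a
    LEFT JOIN table2 AS b
      ON a.id = b.id AND a.date <= b.date
    GROUP BY a.id
    ORDER BY a.id
""").fetchall()

# Same filter in WHERE: it runs after the join and discards the NULL rows.
filter_in_where = conn.execute("""
    SELECT a.id, MAX(b.date) AS max_date, SUM(b.amount) AS sum_amount
    FROM table1 AS a
    LEFT JOIN table2 AS b
      ON a.id = b.id
    WHERE a.date <= b.date
    GROUP BY a.id
    ORDER BY a.id
""").fetchall()

print(filter_in_on)
print(filter_in_where)
```

For a plain inner JOIN, as in the query above, both placements give the same rows, so there it is a matter of readability.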
Here is how I would do this with Snowflake:
--create the tables and load data
--table1
CREATE TABLE TABLE1 (ID NUMBER, DATE DATE);
INSERT INTO TABLE1 VALUES (1, '2022-01-01');
INSERT INTO TABLE1 VALUES (2 , '2022-02-01');
INSERT INTO TABLE1 VALUES (3 , '2022-02-05');
--table 2
CREATE TABLE TABLE2 (ID NUMBER, DATE DATE, AMOUNT NUMBER);
INSERT INTO TABLE2 VALUES(1, '2021-08-01', 15);
INSERT INTO TABLE2 VALUES(1, '2022-02-10', 15);
INSERT INTO TABLE2 VALUES(2, '2022-02-15', 20);
INSERT INTO TABLE2 VALUES(2, '2021-01-01', 15);
INSERT INTO TABLE2 VALUES(2, '2022-02-20', 20);
INSERT INTO TABLE2 VALUES(1, '2022-03-01', 15);
Now obtain the data using a select
SELECT TABLE1.ID, MAX(TABLE2.DATE), SUM(AMOUNT)
FROM TABLE1, TABLE2
WHERE TABLE1.ID = TABLE2.ID
AND TABLE1.DATE < TABLE2.DATE
GROUP BY TABLE1.ID
Results
ID MAX(TABLE2.DATE) SUM(AMOUNT)
1 2022-03-01 30
2 2022-02-20 40
Not personally familiar with Snowflake but a standard SQL query that should work would be:
select id, Max(date) Date, Sum(Amount) Amount
from Table2 t2
where exists (
select * from Table1 t1
where t1.Id = t2.Id and t1.Date < t2.Date
)
group by Id;
Note that because you only require data from Table2, EXISTS is preferable to an inner join; in almost all cases it will perform better than a join, and at worst the same.
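Apart from performance, EXISTS and a join can also give different answers when Table1 has more than one matching row per ID. The sketch below uses Python's sqlite3 module and assumes, purely for illustration, a second Table1 row for ID 1 that is not in the question's data. The join then counts each Table2 row once per match, while EXISTS counts it once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, date TEXT);
    CREATE TABLE table2 (id INTEGER, date TEXT, amount INTEGER);
    -- The (1, '2022-01-15') row is hypothetical, added to create a duplicate match.
    INSERT INTO table1 VALUES (1, '2022-01-01'), (1, '2022-01-15'),
                              (2, '2022-02-01'), (3, '2022-02-05');
    INSERT INTO table2 VALUES
        (1, '2021-08-01', 15), (1, '2022-02-10', 15), (2, '2022-02-15', 20),
        (2, '2021-01-01', 15), (2, '2022-02-20', 20), (1, '2022-03-01', 15);
""")

# EXISTS only asks whether a match is present, so each table2 row counts once.
with_exists = conn.execute("""
    SELECT t2.id, MAX(t2.date), SUM(t2.amount)
    FROM table2 AS t2
    WHERE EXISTS (SELECT 1 FROM table1 AS t1
                  WHERE t1.id = t2.id AND t1.date < t2.date)
    GROUP BY t2.id
    ORDER BY t2.id
""").fetchall()

# The join emits one row per (table1, table2) match, doubling ID 1's amounts.
with_join = conn.execute("""
    SELECT t2.id, MAX(t2.date), SUM(t2.amount)
    FROM table2 AS t2
    JOIN table1 AS t1
      ON t1.id = t2.id AND t1.date < t2.date
    GROUP BY t2.id
    ORDER BY t2.id
""").fetchall()

print(with_exists)
print(with_join)
```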
I have a query like the below
SELECT
t1.Supplier,
t2.Product
FROM
t1
INNER JOIN
t2 ON t1.ProductCode = t2.ProductCode
GROUP BY
t1.Supplier, t2.Product
On table t1, there are also columns called 'Timestamp' and 'Price' - I want to get the most recent price, i.e. SELECT Price ORDER BY Timestamp DESC. Can I do this with any aggregate functions, or would it have to be a subquery?
One standard way of doing this is to use ROW_NUMBER() to create an additional column in the source data, allowing you to identify which row is "first" within each "partition".
WITH
supplier_sorted AS
(
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY supplier, ProductCode
ORDER BY timestamp DESC
)
AS recency_id
FROM
supplier
)
SELECT
s.Supplier,
p.Product,
COUNT(*)
FROM
supplier_sorted AS s
INNER JOIN
product AS p
ON s.ProductCode = p.ProductCode
WHERE
s.recency_id = 1
GROUP BY
s.Supplier,
p.Product
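As a rough check of the ROW_NUMBER() pattern against the question's own t1 and t2, here is a sketch using Python's sqlite3 module (SQLite 3.25 or later assumed). The timestamp column is called ts and the timestamp values are made up for the example; the final SELECT returns the price instead of a count.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (Supplier TEXT, ProductCode INTEGER, Price REAL, ts TEXT);
    CREATE TABLE t2 (ProductCode INTEGER, Product TEXT);
    -- Hypothetical timestamps, chosen so each supplier has one clear latest row.
    INSERT INTO t1 VALUES ('A', 1, 100.00, '2024-01-01 09:00:00'),
                          ('A', 1,  80.00, '2024-01-02 09:00:00'),
                          ('b', 2, 190.00, '2024-01-01 09:00:00'),
                          ('b', 2, 500.00, '2024-01-03 09:00:00');
    INSERT INTO t2 VALUES (1, 'Pro1'), (2, 'Pro2'), (3, 'Pro3');
""")

# recency_id = 1 marks the newest row per (Supplier, ProductCode).
latest_prices = conn.execute("""
    WITH supplier_sorted AS (
        SELECT t1.*,
               ROW_NUMBER() OVER (PARTITION BY Supplier, ProductCode
                                  ORDER BY ts DESC) AS recency_id
        FROM t1
    )
    SELECT s.Supplier, p.Product, s.Price
    FROM supplier_sorted AS s
    INNER JOIN t2 AS p
            ON s.ProductCode = p.ProductCode
    WHERE s.recency_id = 1
    ORDER BY s.Supplier
""").fetchall()

print(latest_prices)
```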
You can use cross apply:
SELECT t2.*, t1.*
FROM t2 CROSS APPLY
(SELECT TOP (1) t1.*
FROM t1
WHERE t1.ProductCode = t2.ProductCode
ORDER BY t1.TimeStamp DESC
) t1;
So, GROUP BY is not necessary.
You can use ROW_NUMBER() over a partition of ProductCode and Supplier, ordered by Timestamp descending, to get the latest record in each partition. You can then filter on it in the same query, without aggregation, to get the desired result.
For questions like this, window functions are a better fit than GROUP BY.
SELECT
A.Supplier
,A.Product
,A.Price
FROM
(
SELECT
t1.Supplier,
t2.Product,
T1.Price,
ROW_NUMBER () OVER ( PARTITION BY t1.Supplier,t2.Product ORDER BY T1.[Timestamp] DESC ) AS row_num
FROM t1
INNER JOIN t2
ON t1.ProductCode = t2.ProductCode
) AS A WHERE A.row_num = 1
Tested using the data below.
CREATE TABLE t1
( Supplier varchar(100)
,ProductCode int
, Price Decimal (10,2)
, [TimeStamp] datetime
)
CREATE TABLE t2
(
ProductCode int
,Product varchar(100)
)
insert into t1 values ('A', 1, 100.00, GetDate())
insert into t1 values ('A', 1, 80.00, GetDate())
insert into t1 values ('b', 2, 190.00, GetDate())
insert into t1 values ('b', 2, 500.00, GetDate())
insert into t2 values (1, 'Pro1')
insert into t2 values (2, 'Pro2')
insert into t2 values (3, 'Pro3')
I created tables T1 and T2. I managed to get the sum of each, but I can't seem to add the sums of T1 and T2 together (10+12 = 22) by adding a sum() at the beginning of the code.
CREATE TABLE T1(kW int)
CREATE TABLE T2(kW int)
SELECT T1C1, T2C1
FROM
( select SUM(Kw) T1C1 FROM T1 ) A
CROSS JOIN
( select SUM(Kw) T2C1 FROM T2 ) B
BEGIN
INSERT INTO T1 VALUES ('4');
INSERT INTO T1 VALUES ('1');
INSERT INTO T1 VALUES ('5');
INSERT INTO T2 VALUES ('7');
INSERT INTO T2 VALUES ('2');
INSERT INTO T2 VALUES ('3');
END
You should use union all to create a "virtual" column from the columns in the two tables:
SELECT SUM(kw)
FROM (SELECT kw FROM t1
UNION ALL
SELECT kw FROM t2) t
Try using a stored procedure. That way you can store the sum of each table in a separate variable and then return the sum of those two variables.
You can also do a UNION ALL and SUM the column you want. Note that you should use UNION ALL, not UNION, to avoid eliminating duplicate values.
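The difference between UNION ALL and UNION only shows up once the same value occurs more than once. The sketch below uses Python's sqlite3 module with the question's data, then adds one extra row (a second 4, which is not in the question) to make the effect visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (kw INTEGER);
    CREATE TABLE t2 (kw INTEGER);
    INSERT INTO t1 VALUES (4), (1), (5);
    INSERT INTO t2 VALUES (7), (2), (3);
""")

SUM_UNION_ALL = "SELECT SUM(kw) FROM (SELECT kw FROM t1 UNION ALL SELECT kw FROM t2) AS t"
SUM_UNION = "SELECT SUM(kw) FROM (SELECT kw FROM t1 UNION SELECT kw FROM t2) AS t"

# With the question's data every value is unique, so this is 10 + 12.
total_original = conn.execute(SUM_UNION_ALL).fetchone()[0]

# Hypothetical extra row: 4 now appears in both tables.
conn.execute("INSERT INTO t2 VALUES (4)")

total_union_all = conn.execute(SUM_UNION_ALL).fetchone()[0]  # keeps both 4s
total_union = conn.execute(SUM_UNION).fetchone()[0]          # drops one 4

print(total_original, total_union_all, total_union)
```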
Another approach is to add the results of the two subqueries directly, using the built-in dummy table dual as the main driving table:
select ( select SUM(Kw) FROM T1 )
+ ( select SUM(Kw) FROM T2 ) as total
from dual;
TOTAL
----------
22
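One thing to watch with adding two scalar subqueries: SUM over an empty table is NULL, and NULL plus a number is NULL. The sketch below shows this with Python's sqlite3 module (SQLite needs no dual table) and an intentionally empty T2, which is an assumption made for the example. Wrapping each sum in COALESCE(..., 0) is one way to guard against it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (kw INTEGER);
    CREATE TABLE t2 (kw INTEGER);   -- left empty on purpose
    INSERT INTO t1 VALUES (4), (1), (5);
""")

# SUM over zero rows is NULL, so the whole addition becomes NULL (None in Python).
plain_total = conn.execute("""
    SELECT (SELECT SUM(kw) FROM t1) + (SELECT SUM(kw) FROM t2) AS total
""").fetchone()[0]

# COALESCE replaces a NULL sum with 0 before adding.
guarded_total = conn.execute("""
    SELECT COALESCE((SELECT SUM(kw) FROM t1), 0)
         + COALESCE((SELECT SUM(kw) FROM t2), 0) AS total
""").fetchone()[0]

print(plain_total, guarded_total)
```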
I have the following table
CREATE TABLE Test
(`Id` int, `value` varchar(20), `adate` varchar(20))
;
INSERT INTO Test
(`Id`, `value`, `adate`)
VALUES
(1, 100, '2014-01-01'),
(1, 200, '2014-01-02'),
(1, 300, '2014-01-03'),
(2, 200, '2014-01-01'),
(2, 400, '2014-01-02'),
(2, 30 , '2014-01-04'),
(3, 800, '2014-01-01'),
(3, 300, '2014-01-02'),
(3, 60 , '2014-01-04')
;
I want a result that selects, for each Id, only the row with the max value of date, i.e.
Id ,value ,adate
1, 300,'2014-01-03'
2, 30 ,'2014-01-04'
3, 60 ,'2014-01-04'
How can I achieve this using group by? I tried the following, but it is not working.
Select Id,value,adate
from Test
group by Id,value,adate
having adate = MAX(adate)
Can someone help with the query?
Select the maximum dates for each id.
select id, max(adate) max_date
from test
group by id
Join on that to get the rest of the columns.
select t1.*
from test t1
inner join (select id, max(adate) max_date
from test
group by id) t2
on t1.id = t2.id and t1.adate = t2.max_date;
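Here is a sketch of both queries side by side in Python's sqlite3 module. It assumes value is stored as an integer and adate as ISO-formatted text, so comparing the strings orders them as dates. The question's attempt groups by Id, value and adate together, so every group holds a single row and MAX(adate) always equals that row's own adate; nothing gets filtered out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE test (id INTEGER, value INTEGER, adate TEXT);
    INSERT INTO test VALUES
        (1, 100, '2014-01-01'), (1, 200, '2014-01-02'), (1, 300, '2014-01-03'),
        (2, 200, '2014-01-01'), (2, 400, '2014-01-02'), (2,  30, '2014-01-04'),
        (3, 800, '2014-01-01'), (3, 300, '2014-01-02'), (3,  60, '2014-01-04');
""")

# The attempt from the question: one group per row, so all rows pass the HAVING.
attempt = conn.execute("""
    SELECT id, value, adate
    FROM test
    GROUP BY id, value, adate
    HAVING adate = MAX(adate)
""").fetchall()

# Group by id alone to find each max date, then join back for the other columns.
latest = conn.execute("""
    SELECT t1.id, t1.value, t1.adate
    FROM test AS t1
    INNER JOIN (SELECT id, MAX(adate) AS max_date
                FROM test
                GROUP BY id) AS t2
            ON t1.id = t2.id AND t1.adate = t2.max_date
    ORDER BY t1.id
""").fetchall()

print(len(attempt))
print(latest)
```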
Please try:
select
*
from
Test a
where
a.adate=(select MAX(adate) from Test b where b.Id=a.Id)
If you are using a DBMS that has analytical functions you can use ROW_NUMBER:
SELECT Id, Value, ADate
FROM ( SELECT ID,
Value,
ADate,
ROW_NUMBER() OVER(PARTITION BY ID ORDER BY Adate DESC) AS RowNum
FROM Test
) AS T
WHERE RowNum = 1;
Otherwise, you will need to join to the aggregated max date per Id, to filter the results from Test to only those rows where the date matches the maximum date for that Id:
SELECT Test.Id, Test.Value, Test.ADate
FROM Test
INNER JOIN
( SELECT ID, MAX(ADate) AS ADate
FROM Test
GROUP BY ID
) AS MaxT
ON MaxT.ID = Test.ID
AND MaxT.ADate = Test.ADate;
I would try something like this:
Select t1.Id, t1.value, t1.adate
from Test as t1
where t1.adate = (select max(t2.adate)
from Test as t2
where t2.id = t1.id)
I have a temp table named "#Test" which has columns "T1", "T2", "T3" with data.
I have a database table named "TestTbl" which has the same columns.
I want to insert data from the #Test table into TestTbl with distinct records of the T1 column.
Do you have any idea how to insert distinct records into the TestTbl table?
You can try it like this:
INSERT INTO TestTbl (T1,T2,T3) SELECT T1,T2,T3 from
(
Select Row_Number() over(Partition By T1 order By T1) as row,* from #Test
) a
where a.row=1;
INSERT INTO TestTbl (T1,T2,T3)
SELECT Distinct(T1), T2, T3 FROM #Test
EDIT: After further explanation:
INSERT INTO TestTbl
( T1 ,
T2 ,
T3
)
SELECT T1 ,
T2 ,
T3
FROM ( SELECT T1 ,
T2 ,
T3 ,
Row_Number() OVER ( PARTITION BY T1 ORDER BY T1) AS record
-- you need to select the relevant clause here for the order
-- do you want first or latest record?
FROM #Test
) tmp
WHERE tmp.record = 1 ;
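To see why the ROW_NUMBER() version is needed, here is a sketch in Python's sqlite3 module (SQLite 3.25 or later assumed). The table names staging and test_tbl stand in for #Test and TestTbl, and the rows are made up. DISTINCT de-duplicates whole rows, so a T1 value can still appear more than once; ROW_NUMBER() partitioned by T1 keeps exactly one row per T1. Ordering by T2 here is an arbitrary choice for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staging  (t1 TEXT, t2 INTEGER, t3 TEXT);  -- stands in for #Test
    CREATE TABLE test_tbl (t1 TEXT, t2 INTEGER, t3 TEXT);  -- stands in for TestTbl
    -- Hypothetical rows: 'a' has two different rows, 'b' has an exact duplicate.
    INSERT INTO staging VALUES ('a', 1, 'x'), ('a', 2, 'y'),
                               ('b', 3, 'z'), ('b', 3, 'z');
""")

# DISTINCT compares all three columns, so 'a' still shows up twice.
distinct_rows = conn.execute("""
    SELECT DISTINCT t1, t2, t3 FROM staging ORDER BY t1, t2
""").fetchall()

# Number the rows within each t1 and insert only the first of each.
conn.execute("""
    INSERT INTO test_tbl (t1, t2, t3)
    SELECT t1, t2, t3
    FROM (SELECT t1, t2, t3,
                 ROW_NUMBER() OVER (PARTITION BY t1 ORDER BY t2) AS record
          FROM staging) AS tmp
    WHERE tmp.record = 1
""")
inserted = conn.execute("SELECT t1, t2, t3 FROM test_tbl ORDER BY t1").fetchall()

print(distinct_rows)
print(inserted)
```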
Get distinct records
SELECT DISTINCT column_name,column_name
FROM table_name
Insert records
INSERT INTO table_name (column1,column2,column3,...)
VALUES (value1,value2,value3,...)