Distinct value of a column in SQL Server 2008

Hello all, I have written a query using left outer joins that returns something like the table below:
| 00-00-00-00-00 | 1 | a.txt |
| 00-00-00-00-00 | 2 | b.txt |
| 00-00-00-00-00 | 1 | c.txt |
| 11-11-11-11-11 | 2 | d.txt |
What I want is the distinct values of the MAC column. Below is an SQL Fiddle to make this clearer.
SQLFIDDLE
Thanks
EDIT
Rows 2 and 3 are useless or redundant data, whereas rows 1 and 4 are useful: they show the current file on each MAC.
Output:
| 00-00-00-00-00 | 1 | a.txt |
| 11-11-11-11-11 | 2 | d.txt |

It is not possible to answer exactly what you ask. However, people who ask this question usually mean something like 'I want all the columns for a sample of rows containing only distinct MacAddress values'. That question has many answers, because the result is non-deterministic. A trivial solution is to pick the first (for whatever definition of 'first') row for each MacAddress:
with cte as (
select row_number() over (partition by MacAddress order by CounterNo) as rn, *
from Heartbeats
)
select * from cte where rn = 1;
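To see what the row_number() query returns on the question's sample rows, here is a minimal sketch using Python's bundled sqlite3 module as a stand-in for SQL Server (window functions need SQLite 3.25 or later). The column names CounterNo and FileName are assumptions, and FileName is added to the ORDER BY only so that the pick is repeatable:

```python
import sqlite3

# Hypothetical sample data mirroring the question's table; the column
# names MacAddress, CounterNo and FileName are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Heartbeats (MacAddress TEXT, CounterNo INTEGER, FileName TEXT)"
)
conn.executemany(
    "INSERT INTO Heartbeats VALUES (?, ?, ?)",
    [
        ("00-00-00-00-00", 1, "a.txt"),
        ("00-00-00-00-00", 2, "b.txt"),
        ("00-00-00-00-00", 1, "c.txt"),
        ("11-11-11-11-11", 2, "d.txt"),
    ],
)

# Number the rows inside each MacAddress group and keep only row 1.
# FileName acts as a tie-breaker so the result is deterministic.
first_rows = conn.execute(
    """
    WITH cte AS (
        SELECT MacAddress, CounterNo, FileName,
               ROW_NUMBER() OVER (
                   PARTITION BY MacAddress
                   ORDER BY CounterNo, FileName
               ) AS rn
        FROM Heartbeats
    )
    SELECT MacAddress, CounterNo, FileName
    FROM cte
    WHERE rn = 1
    ORDER BY MacAddress
    """
).fetchall()

print(first_rows)
```

With these rows the query keeps a.txt for the first MAC and d.txt for the second, which matches the output requested in the question. Without a tie-breaker, either a.txt or c.txt could come back, since both have CounterNo 1.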

If you want to get only the distinct macaddresses, you can do:
SELECT DISTINCT macaddress FROM heartbeats
If you want all the columns alongside the distinct macaddress, you need to create a rule to get them. The query below gives you the ones with the highest id for each macaddress:
SELECT t1.*
FROM heartbeats t1
LEFT JOIN heartbeats t2
ON (t1.macaddress = t2.macaddress AND t1.id < t2.id)
WHERE t2.id IS NULL
sqlfiddle demo
EDIT:
Since the original query has no ID column, the query above was refined as follows:
with cte as (
select ROW_NUMBER() OVER(ORDER BY (Select 0)) AS ID,* from heartBeats
)
SELECT t1.*
FROM cte t1
LEFT JOIN cte t2
ON (t1.macaddress = t2.macaddress AND t1.id < t2.id)
WHERE t2.id IS NULL
SQL Fiddle

SELECT hb1.* FROM [heartbeats] as hb1
LEFT OUTER JOIN [heartbeats] as hb2
ON (hb1.macaddress = hb2.macaddress AND hb1.id > hb2.id)
WHERE hb2.id IS NULL;

You have to ignore the file name. See http://sqlfiddle.com/#!3/a75e47/13

Related

Compare columns from 2 different tables with only last inserted values in table_2 in SQL Server

If I have two different tables in a SQL Server 2019 database as follows:
Table1
|id | name |
+-----+--------+
| 1 | rose |
| 2 | peter |
| 3 | ann |
| 4 | rose |
| 5 | ann |
Table2
| name2 |
+--------+
|rose |
|ann |
I would like to retrieve only the last two ids from table1 (in this case 4 and 5) that match name2 in table2. In other words, each name should match only once, on its most recently added row in table1; furthermore, the ids (4, 5) are to be inserted into table2.
How to do that using SQL?
Thank you
You can use row_number()
select name,id from
(
select *, row_number() over(partition by t.name order by id desc) as rn
from table1 t join table2 t1 on t.name=t1.name2
)A where rn=1
Your question is vague, so there could be many answers here. My first thought is that you simply want an inner join. This will fetch ONLY the data that both tables share.
SELECT Table1.*
FROM Table1
INNER JOIN Table2 on Table1.name = Table2.name2
You seem to be describing:
select . . . -- whatever columns you want
from (select top (2) t1.*
from table1 t1
order by t1.id desc
) t1 join
table2 t2
on t2.name2 = t1.name;
This doesn't seem particularly useful for the data you have provided, but it does what you describe.
EDIT:
If you want only the most recent rows that match, use row_number():
select . . . -- whatever columns you want
from (select t1.*,
row_number() over (partition by name order by id desc) as seqnum
from table1 t1
) t1 join
table2 t2
on t2.name2 = t1.name and t1.seqnum = 1;
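As a quick check of the row_number() variant against the sample data in the question, here is a small sketch run through Python's sqlite3 module instead of SQL Server (an assumption made only so that it is self-contained; it needs SQLite 3.25 or later):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE table2 (name2 TEXT);
    INSERT INTO table1 VALUES
        (1, 'rose'), (2, 'peter'), (3, 'ann'), (4, 'rose'), (5, 'ann');
    INSERT INTO table2 VALUES ('rose'), ('ann');
    """
)

# seqnum = 1 marks the highest id for each name; the join then keeps
# only the names that also appear in table2.
latest_matches = conn.execute(
    """
    SELECT t1.id, t1.name
    FROM (
        SELECT id, name,
               ROW_NUMBER() OVER (PARTITION BY name ORDER BY id DESC) AS seqnum
        FROM table1
    ) AS t1
    JOIN table2 AS t2
      ON t2.name2 = t1.name AND t1.seqnum = 1
    ORDER BY t1.id
    """
).fetchall()

print(latest_matches)
```

On this data it returns ids 4 and 5, the rows the question asks for; 'peter' is dropped because it has no match in table2.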

Assistance with SQL Query in Oracle

I need your help with the following:
I have a table like this:
Table_Values
ID | Value | Date
1 | ASD | 01-Jan-2019
2 | ZXC | 10-Jan-2019
3 | ASD | 01-Jan-2019
4 | QWE | 05-Jan-2019
5 | RTY | 15-Jan-2019
6 | QWE | 29-Jan-2019
What I need is to get the values that are duplicated and have different dates. For example, the value "QWE" is duplicated and has different dates:
ID | Value | Date
4 | QWE | 05-Jan-2019
6 | QWE | 29-Jan-2019
With EXISTS:
select * from Table_Values t
where exists (
select 1 from Table_Values
where value = t.value and date <> t.date
)
Using Join:
select
t1.*
from
Table_Values t1
join
Table_Values t2
on t1.Value = t2.Value
and t1.Date <> t2.Date
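However, the EXISTS approach is better: it returns each qualifying row once, whereas the join repeats a row for every other row that has the same value and a different date.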
You want all rows where there is more than one date per value. You can use COUNT OVER for this.
One method (featured as of Oracle 12c):
select id, value, date
from mytable
order by case when count(distinct date) over (partition by value) > 1 then 1 else 2 end
fetch first row with ties
But you'll have to put this into a subquery (derived table / cte), if you want the result sorted.
And another method without FETCH FIRST clause (valid as of Oracle 8i):
select id, value, date
from
(
select id, value, date, count(distinct date) over (partition by value) as cnt
from mytable
)
where cnt > 1
order by id, value, date;
forpas' solution with EXISTS may be faster, though. Well, pick whichever method you like better :-)
EXISTS uses a correlated subquery, so I don't think it is better than a JOIN.
However, the Oracle optimizer may rewrite the EXISTS as a join.
I like to use a JOIN in the classic way :)
SELECT t1.*
FROM table_values t1, table_values t2
WHERE t1.f_value = t2.f_value
AND t1.f_date <> t2.f_date
ORDER BY 1;
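For anyone who wants to try the EXISTS variant on the sample rows, here is a rough sketch using Python's sqlite3 module rather than Oracle. The column names val and dt are assumptions (chosen to avoid the reserved-word questions around VALUE and DATE), and the dates are stored as ISO strings:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table_values (id INTEGER PRIMARY KEY, val TEXT, dt TEXT);
    INSERT INTO table_values VALUES
        (1, 'ASD', '2019-01-01'),
        (2, 'ZXC', '2019-01-10'),
        (3, 'ASD', '2019-01-01'),
        (4, 'QWE', '2019-01-05'),
        (5, 'RTY', '2019-01-15'),
        (6, 'QWE', '2019-01-29');
    """
)

# Keep a row only if some other row has the same value but a different date.
dup_rows = conn.execute(
    """
    SELECT t.id, t.val, t.dt
    FROM table_values AS t
    WHERE EXISTS (
        SELECT 1
        FROM table_values AS d
        WHERE d.val = t.val
          AND d.dt <> t.dt
    )
    ORDER BY t.id
    """
).fetchall()

print(dup_rows)
```

'ASD' is duplicated too, but both of its rows share the same date, so only the two 'QWE' rows come back.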

Oracle Efficiently joining tables with subquery in FROM

Table 1:
| account_no | **other columns**...
+------------+-----------------------
| 1 |
| 2 |
| 3 |
| 4 |
Table 2:
| account_no | TX_No | Balance | History |
+------------+-------+---------+------------+
| 1 | 123 | 123 | 12.01.2011 |
| 1 | 234 | 2312 | 01.03.2011 |
| 3 | 232 | 212 | 19.02.2011 |
| 4 | 117 | 234 | 24.01.2011 |
I have a query with multiple joins. One of the tables in it (Table 2) is problematic: it is a view that computes many other things, so every query against it is costly. From Table 2, for each account_no in Table 1, I need the whole row with the greatest TX_NO. This is how I do it:
SELECT * FROM TABLE1 A LEFT JOIN
( SELECT
X.ACCOUNT_NO,
HISTORY,
X.BALANCE
FROM TABLE2 X INNER JOIN
(SELECT
ACCOUNT_NO,
MAX(TX_NO) AS TX_NO
FROM TABLE2
GROUP BY ACCOUNT_NO) Y ON X.ACCOUNT_NO = Y.ACCOUNT_NO) B
ON B.ACCOUNT_NO = A.ACCOUNT_NO
As I understand it, this first performs the inner join over all the rows in Table2 and only then left joins the needed account_no values from Table1, which is what I would like to avoid.
My question: is there a way to find the max(TX_NO) only for those accounts that are in Table1, instead of going through all of them? I think that would speed up the query.
I think you are on the right track, but I don't think that you need to, and would not myself, nest the subqueries the way you have done. Instead, if you want to get each record from table 1 and the matching max record from table 2, you can try the following:
SELECT * FROM TABLE1 t1
LEFT JOIN
(
SELECT t.*,
ROW_NUMBER() OVER (PARTITION BY account_no ORDER BY TX_No DESC) rn
FROM TABLE2 t
) t2
ON t1.account_no = t2.account_no AND
t2.rn = 1
If you want to continue with your original approach, this is how I would do it:
SELECT *
FROM TABLE1 t1
LEFT JOIN TABLE2 t2
ON t1.account_no = t2.account_no
INNER JOIN
(
SELECT account_no, MAX(TX_No) AS max_tx_no
FROM TABLE2
GROUP BY account_no
) t3
ON t2.account_no = t3.account_no AND
t2.TX_No = t3.max_tx_no
Instead of using a window function to find the greatest record per account in TABLE2, we use a second join to a subquery. I would expect the window function approach to perform better than this double join approach, and once you get used to it, it can even be easier to read.
If table1 is comparatively less expensive, then you could do the left outer join first, which would considerably decrease the result set, and from that pick only the records with the latest transaction id:
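To illustrate the window function version on the sample data, here is a minimal sketch run through Python's sqlite3 module instead of Oracle (an assumption made for self-containment; SQLite 3.25 or later is needed). It says nothing about performance against the real view; it only shows which rows come back:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table1 (account_no INTEGER PRIMARY KEY);
    CREATE TABLE table2 (
        account_no INTEGER, tx_no INTEGER, balance INTEGER, history TEXT
    );
    INSERT INTO table1 VALUES (1), (2), (3), (4);
    INSERT INTO table2 VALUES
        (1, 123, 123,  '12.01.2011'),
        (1, 234, 2312, '01.03.2011'),
        (3, 232, 212,  '19.02.2011'),
        (4, 117, 234,  '24.01.2011');
    """
)

# rn = 1 marks the row with the greatest tx_no for each account.
# The LEFT JOIN keeps accounts that have no transactions at all.
latest_tx = conn.execute(
    """
    SELECT t1.account_no, t2.tx_no, t2.balance, t2.history
    FROM table1 AS t1
    LEFT JOIN (
        SELECT t.*,
               ROW_NUMBER() OVER (
                   PARTITION BY account_no ORDER BY tx_no DESC
               ) AS rn
        FROM table2 AS t
    ) AS t2
      ON t1.account_no = t2.account_no AND t2.rn = 1
    ORDER BY t1.account_no
    """
).fetchall()

for row in latest_tx:
    print(row)
```

Account 1 gets its row with TX_No 234, and account 2, which has no rows in Table 2, is still returned with NULLs because the rn = 1 filter sits in the ON clause rather than in a WHERE clause.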
select <required columns> from
(
select f.*, row_number() over (partition by account_no order by tx_no desc) as rn
from
(
select a.*, b.tx_no, b.balance, b.history
from table1 a left outer join table2 b
on a.account_no = b.account_no
) f
) g where g.rn = 1

How to get a single result with columns from multiple records in a single table?

Platform: Oracle 10g
I have a table (let's call it t1) like this:
ID | FK_ID | SOME_VALUE | SOME_DATE
----+-------+------------+-----------
1 | 101 | 10 | 1-JAN-2013
2 | 101 | 20 | 1-JAN-2014
3 | 101 | 30 | 1-JAN-2015
4 | 102 | 150 | 1-JAN-2013
5 | 102 | 250 | 1-JAN-2014
6 | 102 | 350 | 1-JAN-2015
For each FK_ID I wish to show a single result showing the two most recent SOME_VALUEs. That is:
FK_ID | CURRENT | PREVIOUS
------+---------+---------
101 | 30 | 20
102 | 350 | 250
There is another table (let's call it t2) for the FK_ID, and it is here that there is a reference
saying which is the 'CURRENT' record. So a table like:
ID | FK_CURRENT | OTHER_FIELDS
----+------------+-------------
101 | 3 | ...
102 | 6 | ...
I was attempting this with a flawed sub query join along the lines of:
SELECT id, curr.some_value as current, prev.some_value as previous FROM t2
JOIN t1 curr ON t2.fk_current = t1.id
JOIN t1 prev ON t1.id = (
SELECT * FROM (
SELECT id FROM (
SELECT id, ROW_NUMBER() OVER (ORDER BY SOME_DATE DESC) as rno FROM t1
WHERE t1.fk_id = t2.id
) WHERE rno = 2
)
)
However, the t1.fk_id = t2.id is flawed (i.e. won't run), as (I now know) you can't pass a parent
field value into a sub query more than one level deep.
Then I started wondering if Common Table Expressions (CTE) are the tool for this, but then I've no
experience using these (so would like to know I'm not going down the wrong track attempting to use them - if that is the tool).
So I guess the key complexity that is tripping me up is:
Determining the previous value by ordering, but while limiting it to the first record (and not the whole table). (Hence the somewhat convoluted sub query attempt.)
Otherwise, I can just write some code to first execute a query to get the 'current' value, and then
execute a second query to get the 'previous' - but I'd love to know how to solve this with a single
SQL query as it seems this would be a common enough thing to do (sure is with the DB I need to work
with).
Thanks!
Try an approach with LAG function:
SELECT FK_ID ,
SOME_VALUE as "CURRENT",
PREV_VALUE as Previous
FROM (
SELECT t1.*,
lag( some_value ) over (partition by fk_id order by some_date ) prev_value
FROM t1
) x
JOIN t2 on t2.id = x.fk_id
and t2.fk_current = x.id
Demo: http://sqlfiddle.com/#!4/d3e640/15
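In case the fiddle link goes stale, here is a rough equivalent of the LAG approach run through Python's sqlite3 module rather than Oracle 10g (an assumption; SQLite 3.25 or later is needed). The output aliases current_value and previous_value are my own names, chosen to sidestep the CURRENT keyword, and the dates are stored as ISO strings:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE t1 (
        id INTEGER PRIMARY KEY, fk_id INTEGER,
        some_value INTEGER, some_date TEXT
    );
    CREATE TABLE t2 (id INTEGER PRIMARY KEY, fk_current INTEGER);
    INSERT INTO t1 VALUES
        (1, 101, 10,  '2013-01-01'),
        (2, 101, 20,  '2014-01-01'),
        (3, 101, 30,  '2015-01-01'),
        (4, 102, 150, '2013-01-01'),
        (5, 102, 250, '2014-01-01'),
        (6, 102, 350, '2015-01-01');
    INSERT INTO t2 VALUES (101, 3), (102, 6);
    """
)

# LAG looks one row back within each fk_id, ordered by date, so every
# row carries the value that preceded it. The join to t2 then keeps
# only the row flagged as current.
current_previous = conn.execute(
    """
    SELECT x.fk_id,
           x.some_value AS current_value,
           x.prev_value AS previous_value
    FROM (
        SELECT t1.*,
               LAG(some_value) OVER (
                   PARTITION BY fk_id ORDER BY some_date
               ) AS prev_value
        FROM t1
    ) AS x
    JOIN t2 ON t2.id = x.fk_id AND t2.fk_current = x.id
    ORDER BY x.fk_id
    """
).fetchall()

print(current_previous)
```

The window function is evaluated in the derived table before the join filters anything, which is why the current row still sees its predecessor.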
Try out this:
select t1.FK_ID, t1.SOME_VALUE as "CURRENT",
-- assumes the previous row's id is exactly max(id) - 1
(select SOME_VALUE from t1 where p1.id2=t1.id and t1.fk_id=p1.fk_id) as PREVIOUS
from t1 inner join
(
select t1.fk_id, max(t1.id) as id1, max(t1.id)-1 as id2 from t1 group by t1.FK_ID
) p1 on t1.id=p1.id1

Only select first row of repeating value in a column in SQL

I have a table with a column that may contain the same value in bursts, like this:
+----+---------+
| id | Col1 |
+----+---------+
| 1 | 6050000 |
+----+---------+
| 2 | 6050000 |
+----+---------+
| 3 | 6050000 |
+----+---------+
| 4 | 6060000 |
+----+---------+
| 5 | 6060000 |
+----+---------+
| 6 | 6060000 |
+----+---------+
| 7 | 6060000 |
+----+---------+
| 8 | 6060000 |
+----+---------+
| 9 | 6050000 |
+----+---------+
| 10 | 6000000 |
+----+---------+
| 11 | 6000000 |
+----+---------+
Now I want to prune rows where the value of Col1 is repeated and only select the first occurrence.
For the above table the result should be:
+----+---------+
| id | Col1 |
+----+---------+
| 1 | 6050000 |
+----+---------+
| 4 | 6060000 |
+----+---------+
| 9 | 6050000 |
+----+---------+
| 10 | 6000000 |
+----+---------+
How can I do this in SQL?
Note that only burst rows should be removed and values can be repeated in non-burst rows! id=1 & id=9 are repeated in sample result.
EDIT:
I achieved it using this:
select id,col1 from data as d1
where not exists (
Select id from data as d2
where d2.id=d1.id-1 and d1.col1=d2.col1 order by id limit 1)
But this only works when ids are sequential. With gaps between ids (deleted ones) the query breaks. How can I fix this?
You can use an EXISTS semi-join to identify candidates:
Select wanted rows:
SELECT * FROM tbl t
WHERE NOT EXISTS (
SELECT *
FROM tbl
WHERE col1 = t.col1
AND id = t.id - 1
)
ORDER BY id;
Get rid of unwanted rows:
DELETE FROM tbl AS t
-- SELECT * FROM tbl t -- check first?
WHERE EXISTS (
SELECT *
FROM tbl
WHERE col1 = t.col1
AND id = t.id - 1
);
This effectively deletes every row, where the preceding row has the same value in col1, thereby arriving at your set goal: only the first row of every burst survives.
I left the commented SELECT statement because you should always check what is going to be deleted before you do the deed.
Solution for non-sequential IDs:
If your RDBMS supports CTEs and window functions (like PostgreSQL, Oracle, SQL Server, ... but not SQLite prior to v3.25, MS Access or MySQL prior to v8.0.1), there is an elegant way:
WITH cte AS (
SELECT *, row_number() OVER (ORDER BY id) AS rn
FROM tbl
)
SELECT id, col1
FROM cte c
WHERE NOT EXISTS (
SELECT *
FROM cte
WHERE col1 = c.col1
AND rn = c.rn - 1
)
ORDER BY id;
Another way doing the job without those niceties (should work for you):
SELECT id, col1
FROM tbl t
WHERE (
SELECT col1 = t.col1
FROM tbl
WHERE id < t.id
ORDER BY id DESC
LIMIT 1) IS NOT TRUE
ORDER BY id;
select min(id), Col1 from tableName group by Col1
If your RDBMS supports Window Aggregate functions and/or LEAD() and LAG() functions you can leverage them to accomplish what you are trying to report. The following SQL will help get you started down the right path:
SELECT id
, Col1 AS CurCol
, MAX(Col1)
OVER(ORDER BY id ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) AS PrevCol
, MIN(Col1)
OVER(ORDER BY id ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING) AS NextCol
FROM MyTable
From there you can put that SQL in a derived table with some CASE logic so that if NextCol or PrevCol is the same as CurCol, CurCol is set to NULL. Then you can eliminate all the rows where CurCol IS NULL.
If you don't have the ability to use window aggregates or LEAD/LAG functions your task is a little more complex.
Hope this helps.
Since id is always sequential, with no gaps or repetitions, as per your comment, you could use the following method:
SELECT t1.*
FROM atable t1
LEFT JOIN atable t2 ON t1.id = t2.id + 1 AND t1.Col1 = t2.Col1
WHERE t2.id IS NULL
The table is (outer-)joined to itself on the condition that the left side's id is one greater than the right side's and their Col1 values are identical. In other words, the condition is ‘the previous row contains the same Col1 value as the current row’. If there's no match on the right, then the current record should be selected.
UPDATE
To account for non-sequential ids (which, however, are assumed to be unique and defining the order of changes of Col1), you could also try the following query:
SELECT t1.*
FROM atable t1
LEFT JOIN atable t2 ON t1.id > t2.id
LEFT JOIN atable t3 ON t1.id > t3.id AND t3.id > t2.id
WHERE t3.id IS NULL
AND (t2.id IS NULL OR t2.Col1 <> t1.Col1)
The third self-join is there to ensure that the second one yields the row directly preceding that of t1. That is, if there's no match for t3, then either t2 contains the preceding row or it's got no match either, the latter meaning that t1's current row is the top one.
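For completeness, on engines that have LAG() the non-sequential case can also be handled without self-joins. This is a different technique from the triple join above, sketched here with Python's sqlite3 module (SQLite 3.25 or later). The table name tbl and the sample ids with gaps are made up for illustration, and the sketch assumes col1 is never NULL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tbl (id INTEGER PRIMARY KEY, col1 INTEGER);
    -- Hypothetical data: same burst pattern as the question, but with
    -- gaps in id to mimic deleted rows.
    INSERT INTO tbl VALUES
        (1, 6050000), (2, 6050000), (5, 6050000),
        (7, 6060000), (8, 6060000), (12, 6060000),
        (20, 6050000),
        (21, 6000000), (30, 6000000);
    """
)

# LAG fetches col1 from the previous row in id order, whatever the gap.
# A row starts a burst when it has no previous row or the value changed.
burst_starts = conn.execute(
    """
    SELECT id, col1
    FROM (
        SELECT id, col1,
               LAG(col1) OVER (ORDER BY id) AS prev_col1
        FROM tbl
    ) AS w
    WHERE prev_col1 IS NULL OR prev_col1 <> col1
    ORDER BY id
    """
).fetchall()

print(burst_starts)
```

The value 6050000 shows up twice in the result (ids 1 and 20) because the two runs are separated by a different value, which is the behaviour the question asks for. If col1 can be NULL, the prev_col1 IS NULL test would need to be replaced by a check on a row number instead.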