SQL join only if there is no match

I have a (Postgres) SQL table, Hosts, with the following contents:
ip_address | mac_address | hostname | device | physical_port
----------------+----------------+----------+--------+---------------
111.111.111.111 | aaaa.aaaa.aaaa | hosta | swh-a | Gi1/1
111.111.111.112 | bbbb.bbbb.bbbb | hostb | swh-b | Gi2/1
111.111.111.113 | cccc.cccc.cccc | hostc | swh-c | Gi3/1
I have another table (Peers) that contains point-to-point links between devices in the above table.
device | physical_port | peer_device | peer_physical_port
-------+---------------+-------------+--------------------
swh-a | Gi1/20 | swh-b | Gi2/1
swh-b | Gi2/1 | swh-a | Gi1/20
swh-b | Gi2/1 | swh-c | Gi3/1
swh-c | Gi3/1 | swh-b | Gi2/1
Basically, I would like to exclude entries from the Hosts table that are contained within the Peers table, so that I only get:
ip_address | mac_address | hostname | device | physical_port
----------------+----------------+----------+--------+---------------
111.111.111.111 | aaaa.aaaa.aaaa | hosta | swh-a | Gi1/1
(given that device=swh-b physical_port=Gi2/1 and device=swh-c physical_port=Gi3/1 exist within the Peers table).

You can use NOT EXISTS for a self-explanatory query that reads almost as if it were in English:
SELECT *
FROM Hosts h
WHERE NOT EXISTS (
SELECT * FROM Peers p
WHERE p.peer_device = h.device AND p.peer_physical_port = h.physical_port
)
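As a quick sanity check, here is a minimal sketch that runs the NOT EXISTS anti-join against the sample rows from the question. It assumes an in-memory SQLite database as a stand-in for Postgres (the query itself is plain SQL), and the variable names `ANTI_JOIN` and `unmatched_hosts` are only illustrative.

```python
import sqlite3

# In-memory SQLite stand-in for the Postgres tables in the question (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Hosts (ip_address TEXT, mac_address TEXT, hostname TEXT,
                    device TEXT, physical_port TEXT);
CREATE TABLE Peers (device TEXT, physical_port TEXT,
                    peer_device TEXT, peer_physical_port TEXT);
INSERT INTO Hosts VALUES
  ('111.111.111.111', 'aaaa.aaaa.aaaa', 'hosta', 'swh-a', 'Gi1/1'),
  ('111.111.111.112', 'bbbb.bbbb.bbbb', 'hostb', 'swh-b', 'Gi2/1'),
  ('111.111.111.113', 'cccc.cccc.cccc', 'hostc', 'swh-c', 'Gi3/1');
INSERT INTO Peers VALUES
  ('swh-a', 'Gi1/20', 'swh-b', 'Gi2/1'),
  ('swh-b', 'Gi2/1',  'swh-a', 'Gi1/20'),
  ('swh-b', 'Gi2/1',  'swh-c', 'Gi3/1'),
  ('swh-c', 'Gi3/1',  'swh-b', 'Gi2/1');
""")

# Keep only the hosts for which no Peers row has the same (device, port) pair.
ANTI_JOIN = """
SELECT h.hostname, h.device, h.physical_port
FROM Hosts h
WHERE NOT EXISTS (
    SELECT 1 FROM Peers p
    WHERE p.peer_device = h.device
      AND p.peer_physical_port = h.physical_port
)
"""
unmatched_hosts = conn.execute(ANTI_JOIN).fetchall()
print(unmatched_hosts)
```

With the sample data only hosta should survive, since hostb and hostc both sit on a port that appears as a peer in Peers.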

Does this work for you?
SELECT * FROM Hosts
WHERE physical_port NOT IN (
SELECT peer_physical_port FROM Peers
)
This selects only the Hosts entries whose physical_port does not appear in the second table. Note that it compares the port alone, not the (device, port) pair.

You need something like this:
SELECT h.*
FROM Hosts h
LEFT JOIN Peers p ON p.device = h.device AND p.physical_port = h.physical_port
WHERE p.device IS NULL

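Try this, comparing device and port as a pair (checking the two columns separately would also drop hosta, because swh-a appears in Peers.device):
SELECT *
FROM Hosts
WHERE (device, physical_port) NOT IN (SELECT device, physical_port FROM Peers)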

Related

SQL - Given sequence of data, how do I query the origin?

Let's assume we have the following data.
| UUID | SEENTIME | LAST_SEENTIME |
------------------------------------------------------
| UUID1 | 2020-11-10T05:00:00 | |
| UUID2 | 2020-11-10T05:01:00 | 2020-11-10T05:00:00 |
| UUID3 | 2020-11-10T05:03:00 | 2020-11-10T05:01:00 |
| UUID4 | 2020-11-10T05:04:00 | 2020-11-10T05:03:00 |
| UUID5 | 2020-11-10T05:07:00 | 2020-11-10T05:04:00 |
| UUID6 | 2020-11-10T05:08:00 | 2020-11-10T05:07:00 |
Each row is linked to the previous one via LAST_SEENTIME.
In such a case, is there a way to use SQL to identify these connected rows as a single event? I want to find its start and end so that I can calculate the duration of the event.

You can use a recursive CTE. The exact syntax varies by database, but something like this:
with recursive cte as (
select uuid as orig_uuid, uuid, seentime
from t
where last_seentime is null
union all
select cte.orig_uuid, t.uuid, t.seentime
from cte join
t
on cte.seentime = t.last_seentime
)
select orig_uuid,
max(seentime) - min(seentime) -- or whatever your database uses
from cte
group by orig_uuid;
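To see the recursive CTE in action, here is a small sketch run against the sample rows. It assumes SQLite (via Python's `sqlite3`) and a table named `t` as in the answer; the duration is computed with SQLite's `strftime('%s', ...)`, which is one database-specific choice among several, and `CHAIN_QUERY` and `events` are just illustrative names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (uuid TEXT, seentime TEXT, last_seentime TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", [
    ("UUID1", "2020-11-10T05:00:00", None),
    ("UUID2", "2020-11-10T05:01:00", "2020-11-10T05:00:00"),
    ("UUID3", "2020-11-10T05:03:00", "2020-11-10T05:01:00"),
    ("UUID4", "2020-11-10T05:04:00", "2020-11-10T05:03:00"),
    ("UUID5", "2020-11-10T05:07:00", "2020-11-10T05:04:00"),
    ("UUID6", "2020-11-10T05:08:00", "2020-11-10T05:07:00"),
])

CHAIN_QUERY = """
WITH RECURSIVE cte(orig_uuid, uuid, seentime) AS (
    -- Anchor: rows with no predecessor start a chain.
    SELECT uuid, uuid, seentime
    FROM t
    WHERE last_seentime IS NULL
    UNION ALL
    -- Step: attach the row whose last_seentime points at the current row.
    SELECT cte.orig_uuid, t.uuid, t.seentime
    FROM cte JOIN t ON cte.seentime = t.last_seentime
)
SELECT orig_uuid,
       MIN(seentime) AS started,
       MAX(seentime) AS ended,
       CAST(strftime('%s', MAX(seentime)) AS INTEGER)
         - CAST(strftime('%s', MIN(seentime)) AS INTEGER) AS duration_seconds
FROM cte
GROUP BY orig_uuid
"""
events = conn.execute(CHAIN_QUERY).fetchall()
print(events)
```

All six sample rows collapse into one event that starts at UUID1. The sketch assumes the chains contain no cycles and that SEENTIME values are unique enough to act as links.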

Multiple-level mapping (or Tree Hierarchy) with SQL

My log table has data like this
====================
| src_ip | dest_ip |
====================
| ip01_1 | ip01_2 |
| ip01_1 | ip01_3 |
| ip01_2 | ip01_4 |
| ip01_4 | ip01_5 |
| ip02_1 | ip02_2 |
| ip02_2 | ip02_3 |
====================
My required output is a table which contains dest_ip and the first requesting ip.
For example,
* ip01_4 (dest_ip) has ip01_1 as its first_src_ip (ip01_1 -> ip01_2 -> ip01_4)
* ip01_5 (dest_ip) has ip01_1 as its first_src_ip (ip01_1 -> ip01_2 -> ip01_4 -> ip01_5)
Is there any way to use a SQL query to create a table like the one below?
==========================
| first_src_ip | dest_ip |
==========================
| ip01_1 | ip01_2 |
| ip01_1 | ip01_3 |
| ip01_1 | ip01_4 |
| ip01_1 | ip01_5 |
| ip02_1 | ip02_2 |
| ip02_1 | ip02_3 |
==========================
I'm thinking of using a self-join, but the number of joins needed is not fixed.

Here is an example that supports three levels of separation between nodes, e.g. ip1 -> ip2, ip2 -> ip3, ip3 -> ip4:
WITH IPs AS (
SELECT 'ip01_1' AS src_ip, 'ip01_2' AS dest_ip UNION ALL
SELECT 'ip01_1', 'ip01_3' UNION ALL
SELECT 'ip01_2', 'ip01_4' UNION ALL
SELECT 'ip01_4', 'ip01_5' UNION ALL
SELECT 'ip02_1', 'ip02_2' UNION ALL
SELECT 'ip02_2', 'ip02_3'
), Hop1 AS (
SELECT
COALESCE(
(SELECT MIN(ip2.src_ip) FROM IPs AS ip2
WHERE ip.src_ip = ip2.dest_ip),
src_ip
) AS src_ip,
dest_ip
FROM IPs AS ip
), Hop2 AS (
SELECT
COALESCE(
(SELECT MIN(ip2.src_ip) FROM IPs AS ip2
WHERE ip.src_ip = ip2.dest_ip),
src_ip
) AS src_ip,
dest_ip
FROM Hop1 AS ip
)
SELECT *
FROM Hop2
ORDER BY src_ip;
Each of the CTEs looks for an association between the current src_ip and another dest_ip in the original IP address mappings.
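If the depth is not known in advance, a recursive CTE is one way to avoid stacking a fixed number of hop CTEs. The following is only a sketch: it assumes a database that supports WITH RECURSIVE (SQLite here, through Python's `sqlite3`), a table named `log` as in the question, and data with no cycles. The names `chain`, `ROOT_QUERY` and `rows` are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log (src_ip TEXT, dest_ip TEXT)")
conn.executemany("INSERT INTO log VALUES (?, ?)", [
    ("ip01_1", "ip01_2"),
    ("ip01_1", "ip01_3"),
    ("ip01_2", "ip01_4"),
    ("ip01_4", "ip01_5"),
    ("ip02_1", "ip02_2"),
    ("ip02_2", "ip02_3"),
])

ROOT_QUERY = """
WITH RECURSIVE chain(first_src_ip, dest_ip) AS (
    -- Anchor: edges whose source never appears as a destination (the roots).
    SELECT src_ip, dest_ip
    FROM log
    WHERE src_ip NOT IN (SELECT dest_ip FROM log)
    UNION ALL
    -- Step: follow each destination reached so far one hop further,
    -- carrying the original root along.
    SELECT chain.first_src_ip, log.dest_ip
    FROM chain JOIN log ON log.src_ip = chain.dest_ip
)
SELECT first_src_ip, dest_ip
FROM chain
ORDER BY first_src_ip, dest_ip
"""
rows = conn.execute(ROOT_QUERY).fetchall()
for first_src_ip, dest_ip in rows:
    print(first_src_ip, dest_ip)
```

On the sample data this reproduces the required output table, including ip01_5, which is three hops away from ip01_1.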

Select multiple columns based on the max of another column

I have an Oracle 11g database with the following three tables (simplified):
IP table, containing an IP identifier, the IP, an IP status and an FQDN. IPs might be repeated.
+-------+-------------+-----------+-----------+
| ID_IP | IP | IP_STATUS | FQDN |
+-------+-------------+-----------+-----------+
| 1 | 192.168.1.1 | 1 | test.com |
| 2 | 192.168.1.1 | 2 | test.com |
| 3 | 192.168.1.1 | 3 | test.com |
| 4 | 10.10.45.12 | 2 | test2.com |
+-------+-------------+-----------+-----------+
VLAN table, containing a VLAN identifier and the VLAN number:
+---------+-------------+
| VLAN_ID | VLAN_NUMBER |
+---------+-------------+
| 1 | 3 |
| 2 | 5 |
| 3 | 7 |
+---------+-------------+
A Table correlating VLANs and IPs:
+-------+---------+
| IP_ID | VLAN_ID |
+-------+---------+
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |
| 4 | 2 |
+-------+---------+
In the actual IP table, the primary key is the tuple (IP, IP_STATUS). My goal is to create a new table that eliminates IP_STATUS. To do that, I want to aggregate by IP and get the ID_IP and FQDN of the row whose VLAN_NUMBER is highest. The result of the SELECT query would be something like this:
+-------+-------------+-----------+
| ID_IP | IP | FQDN |
+-------+-------------+-----------+
| 3 | 192.168.1.1 | test.com |
| 4 | 10.10.45.12 | test2.com |
+-------+-------------+-----------+
I can get the IP using the following query:
SELECT i.IP, max(v.VLAN_ID)
FROM IPS i
LEFT JOIN VLAN_IP_REL v_i ON i.ID_IP=v_i.ID_IP
LEFT JOIN VLANS v ON v_i.ID_VLAN=v.ID_INSTANCIA
GROUP BY i.IP;
What I don't know is how to get the other columns. I tried with a subquery like the following:
SELECT i.ID_IP, i.IP, i.FQDN
FROM IPS i
WHERE i.IP IN (
SELECT i.IP, max(v.VLAN_ID)
FROM IPS i
LEFT JOIN VLAN_IP_REL v_i ON i.ID_IP=v_i.ID_IP
LEFT JOIN VLANS v ON v_i.ID_VLAN=v.ID_INSTANCIA
GROUP BY i.IP;
)
But it doesn't work, since the subquery returns two columns, and I need the max(v.VLAN_ID) to do the aggregation.
How could I get the right ID_IP?
Thank you!

You can use an analytic function to partition by IP and order by VLAN_NUMBER, then filter to retain only the first row in each group:
SELECT ID_IP, IP, FQDN
FROM (
SELECT ROW_NUMBER() OVER (PARTITION BY i.IP ORDER BY v.VLAN_NUMBER DESC) AS NB,
i.ID_IP, i.IP, i.FQDN
FROM IPS i
LEFT JOIN VLAN_IP_REL v_i ON i.ID_IP = v_i.ID_IP
LEFT JOIN VLANS v ON v_i.VLAN_ID = v.VLAN_ID
) t_a
WHERE NB = 1
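Here is a small sketch of the same ROW_NUMBER() pattern run against the simplified tables from the question. It is an assumption-laden stand-in: SQLite (3.25 or later, for window functions) instead of Oracle, the table names IPS, VLANS and VLAN_IP_REL taken from the queries, and the column names taken from the simplified tables rather than the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE IPS (ID_IP INTEGER, IP TEXT, IP_STATUS INTEGER, FQDN TEXT);
CREATE TABLE VLANS (VLAN_ID INTEGER, VLAN_NUMBER INTEGER);
CREATE TABLE VLAN_IP_REL (IP_ID INTEGER, VLAN_ID INTEGER);
INSERT INTO IPS VALUES
  (1, '192.168.1.1', 1, 'test.com'),
  (2, '192.168.1.1', 2, 'test.com'),
  (3, '192.168.1.1', 3, 'test.com'),
  (4, '10.10.45.12', 2, 'test2.com');
INSERT INTO VLANS VALUES (1, 3), (2, 5), (3, 7);
INSERT INTO VLAN_IP_REL VALUES (1, 1), (2, 2), (3, 3), (4, 2);
""")

# Number the rows of each IP from highest to lowest VLAN_NUMBER,
# then keep only the top-ranked row per IP.
TOP_PER_IP = """
SELECT ID_IP, IP, FQDN
FROM (
    SELECT ROW_NUMBER() OVER (PARTITION BY i.IP
                              ORDER BY v.VLAN_NUMBER DESC) AS NB,
           i.ID_IP, i.IP, i.FQDN
    FROM IPS i
    LEFT JOIN VLAN_IP_REL v_i ON i.ID_IP = v_i.IP_ID
    LEFT JOIN VLANS v ON v_i.VLAN_ID = v.VLAN_ID
) t_a
WHERE NB = 1
ORDER BY ID_IP
"""
top_rows = conn.execute(TOP_PER_IP).fetchall()
print(top_rows)
```

For 192.168.1.1 the row with ID_IP 3 wins because it is linked to VLAN_NUMBER 7, the highest of its three VLANs. If two rows of the same IP tie on VLAN_NUMBER, ROW_NUMBER() picks one of them arbitrarily unless a tie-breaker is added to the ORDER BY.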

You may want to try a WITH clause. Roughly:
WITH IPWITHMAXVLANID(IP, MAXVLAN) AS (
SELECT i.IP, max(v.VLAN_ID)
FROM IPS i
LEFT JOIN VLAN_IP_REL v_i ON i.ID_IP=v_i.ID_IP
LEFT JOIN VLANS v ON v_i.ID_VLAN=v.ID_INSTANCIA
GROUP BY i.IP
)
SELECT i.ID_IP, i.IP, i.FQDN, iml.MAXVLAN
FROM IPS i
INNER JOIN IPWITHMAXVLANID iml ON i.IP = iml.IP
Hope this helps.

Join table condition for between 2 rows

Is it possible to join these tables:
Log table:
+--------+---------------+------------+
| name | ip | created |
+--------+---------------+------------+
| 408901 | 178.22.51.168 | 1390887682 |
| 408901 | 178.22.51.168 | 1390927059 |
| 408901 | 178.22.51.168 | 1390957854 |
+--------+---------------+------------+
Orders table:
+---------+------------+
| id | created |
+---------+------------+
| 8563863 | 1390887692 |
| 8563865 | 1390897682 |
| 8563859 | 1390917059 |
| 8563860 | 1390937059 |
| 8563879 | 1390947854 |
+---------+------------+
Result table would be:
+---------+--------------+---------+---------------+------------+
|orders.id|orders.created|logs.name| logs.ip |logs.created|
+---------+--------------+---------+---------------+------------+
| 8563863 | 1390887692 | 408901 | 178.22.51.168 | 1390887682 |
| 8563865 | 1390897682 | 408901 | 178.22.51.168 | 1390887682 |
| 8563859 | 1390917059 | 408901 | 178.22.51.168 | 1390887682 |
| 8563860 | 1390937059 | 408901 | 178.22.51.168 | 1390927059 |
| 8563879 | 1390947854 | 408901 | 178.22.51.168 | 1390927059 |
+---------+--------------+---------+---------------+------------+
Is it possible?
Especially if the first table is the result of some query.
UPDATE
Sorry for the mistake. I want to find in the log who made each order. So the orders table relates to the logs table by the created field, i.e. each order matches
the most recent log row satisfying the condition (orders.created >= log.created)

This results in a non-equi join with horrible performance:
SELECT *
FROM t2 JOIN t1
ON t1.created =
(
SELECT MAX(t1.created)
FROM t1 WHERE t1.created <= t2.created
)
You might do better with a cursor based on a UNION like this (you will probably need to add some type casts to get a working UNION):
SELECT *
FROM
(
SELECT NULL AS name, NULL AS ip, NULL AS created2, t2.*
FROM t2
UNION ALL
SELECT t1.*, NULL AS id, NULL AS created
FROM t1
) AS dt
ORDER BY COALESCE(created, created2)
Now you can process the rows in the right order and remember the values from the most recent t1 row.
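The "process in order and remember the last t1 row" idea can be sketched outside the database as well. The example below is only an illustration under assumptions: the sample rows are hard-coded, `match_orders_to_logs` is a made-up helper name, and SQLite (via Python's `sqlite3`) is used to run the correlated MAX() join from above so the two approaches can be compared.

```python
import sqlite3

logs = [  # (name, ip, created), i.e. t1
    ("408901", "178.22.51.168", 1390887682),
    ("408901", "178.22.51.168", 1390927059),
    ("408901", "178.22.51.168", 1390957854),
]
orders = [  # (id, created), i.e. t2
    (8563863, 1390887692),
    (8563865, 1390897682),
    (8563859, 1390917059),
    (8563860, 1390937059),
    (8563879, 1390947854),
]

def match_orders_to_logs(logs, orders):
    """Single ordered pass: pair each order with the latest log row at or before it."""
    # Tag rows by source; on equal timestamps logs (0) sort before orders (1),
    # which mirrors the condition log.created <= orders.created.
    events = [(row[2], 0, row) for row in logs] + [(row[1], 1, row) for row in orders]
    events.sort(key=lambda e: (e[0], e[1]))
    last_log = None
    matched = []
    for _, kind, row in events:
        if kind == 0:
            last_log = row                  # remember the most recent log row
        elif last_log is not None:
            matched.append(row + last_log)  # order columns, then log columns
    return matched

matched = match_orders_to_logs(logs, orders)

# The same result via the correlated MAX() join, for comparison.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (name TEXT, ip TEXT, created INTEGER)")
conn.execute("CREATE TABLE t2 (id INTEGER, created INTEGER)")
conn.executemany("INSERT INTO t1 VALUES (?, ?, ?)", logs)
conn.executemany("INSERT INTO t2 VALUES (?, ?)", orders)
joined = conn.execute("""
SELECT t2.id, t2.created, t1.name, t1.ip, t1.created
FROM t2 JOIN t1
  ON t1.created = (SELECT MAX(l.created) FROM t1 AS l WHERE l.created <= t2.created)
ORDER BY t2.created
""").fetchall()
print(matched == joined)
```

The single pass touches each row once after sorting, whereas the join re-evaluates the subquery per order row. Orders that precede every log row are dropped by both approaches.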

There is nothing to bind these 2 together.
No ID or other column exists in both tables.
If this were the case, you could join these 2 tables in a stored procedure.
At the moment you ask the first query, store the data in a newly created table, use it in the join to get your results and delete it afterwards.
Kind regards

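You can simply use a UNION, padding the missing columns with NULL so that both sides have the same number of columns:
select null as name, null as ip, id, created from table_2
union all
select name, ip, null, created from table_1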

SQL select query too slow on webhost, fine on localhost

SELECT c.customers_lastname,
cg.customers_group_name,
dctc.coupons_id AS coupId,
dcto.coupons_id AS coupIdUsed,
dc.coupons_date_start AS coupStart,
count(DISTINCT o.orders_id) AS totalorders,
sum(op.products_quantity * op.final_price) AS ordersum
from
customers c LEFT JOIN customers_groups cg ON cg.customers_group_id = c.customers_group_id
LEFT JOIN (discount_coupons_to_customers dctc
LEFT JOIN discount_coupons dc ON dc.coupons_id = dctc.coupons_id
LEFT JOIN discount_coupons_to_orders dcto ON dcto.coupons_id = dctc.coupons_id
) ON c.customers_id = dctc.customers_id, orders_products op, orders o
WHERE c.customers_id = o.customers_id
AND c.customers_promotions = '0'
AND o.orders_id = op.orders_id
GROUP BY c.customers_id
ORDER BY ordersum DESC LIMIT 0, 10
The above query returns all customers that ever bought anything in our webshop (and some extra data), sorted by total order amount. It runs fine on localhost (a few seconds) but takes up to a minute on the remote server. To make matters worse, the query can be modified via a form to add a HAVING clause after the GROUP BY, like:
HAVING (sum(op.products_quantity * op.final_price) >= 1000
AND/OR count(DISTINCT o.orders_id) > 2)
which doesn't exactly speed things up. There are about 5000 customers and 3000 orders at present. I added a time constraint (WHERE the order is not older than one year), but things didn't speed up after that.
I compared my local server and the online one:
localhost Linux kernel is 3.2, online 2.6,
localhost PHP 5.4.4, online 5.3.26,
localhost MySQL 5.5, online 5.1,
localhost PHP memory limit 128M, online 126M.
Is there an obvious bottleneck? I sent an email to my webhost but didn't get a response. If I need to swap hosts I will, but I would like to know what to look out for. Cheers,
Edit
Running EXPLAIN on the query (I'm not sure how to format this, and have no idea what it means) returns:
+----+-------------+-------+--------+-----------------------------+--------------+---------+---------------------+-------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+-----------------------------+--------------+---------+---------------------+-------+----------------------------------------------+
| 1 | SIMPLE | c | ALL | PRIMARY | NULL | NULL | NULL | 5541 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | cp | eq_ref | PRIMARY | PRIMARY | 4 | rpc.c.customers_id | 1 | |
| 1 | SIMPLE | dctc | eq_ref | customers_id,customers_id_2 | customers_id | 4 | rpc.c.customers_id | 1 | |
| 1 | SIMPLE | dcto | ref | PRIMARY | PRIMARY | 34 | rpc.dctc.coupons_id | 0 | Using index |
| 1 | SIMPLE | dc | ALL | PRIMARY | NULL | NULL | NULL | 1 | |
| 1 | SIMPLE | cg | ALL | PRIMARY | NULL | NULL | NULL | 5 | Using where; Using join buffer |
| 1 | SIMPLE | o | ALL | PRIMARY | NULL | NULL | NULL | 5010 | Using where; Using join buffer |
| 1 | SIMPLE | op | ALL | NULL | NULL | NULL | NULL | 10675 | Using where; Using join buffer |
+----+-------------+-------+--------+-----------------------------+--------------+---------+---------------------+-------+----------------------------------------------+
