How to find unmatched records in a single table? - sql

I'm scraping a log file for transaction records that I am inserting into a table that will be used for several mining tasks. Each record has (among other things) an ID and a transaction type, either request or response. A request/response pair will have the same ID.
One of my tasks is to find all of the requests that do not have a corresponding response. I thought about joining the table to itself, where A.ID = B.ID AND A.type = 'req' and B.type = 'res', but that gives me the opposite of what I need.
Since the IDs will always occur either once or twice, is there a query that would select ID where there is only one occurrence of that ID in the table?

This is a very common type of query. You can try aggregating over the ID values in your table using GROUP BY, then retaining those ID which appear only once.
SELECT ID
FROM yourTable
GROUP BY ID
HAVING COUNT(*) = 1
If you also want to return the entire records for those ID occurring only once, you could try this:
SELECT t1.*
FROM yourTable t1
INNER JOIN
(
SELECT ID FROM yourTable GROUP BY ID HAVING COUNT(*) = 1
) t2
ON t1.ID = t2.ID
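To try the two queries above without a database server, here is a minimal sketch using Python's built-in sqlite3 module. The in-memory database and the sample rows are made up for illustration; only the SQL comes from the answer (with an ORDER BY added so the output is stable).

```python
import sqlite3

# In-memory SQLite database with hypothetical sample data:
# IDs 1 and 3 have a request/response pair, ID 2 has only a request.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (ID INTEGER, type TEXT)")
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?)",
    [(1, "req"), (1, "res"), (2, "req"), (3, "req"), (3, "res")],
)

# IDs that occur exactly once in the table.
single_ids = [
    row[0]
    for row in conn.execute(
        "SELECT ID FROM yourTable GROUP BY ID HAVING COUNT(*) = 1 ORDER BY ID"
    )
]

# Full records for those IDs, via a join against the aggregated subquery.
full_rows = conn.execute(
    """
    SELECT t1.*
    FROM yourTable t1
    INNER JOIN (
        SELECT ID FROM yourTable GROUP BY ID HAVING COUNT(*) = 1
    ) t2 ON t1.ID = t2.ID
    """
).fetchall()

print(single_ids, full_rows)
```

Note that counting occurrences only identifies unmatched requests if a response never appears without its request.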

The straightforward way is NOT IN:
select *
from mytable
where type = 'req'
and id not in (select id from mytable where type = 'res');
You can write about the same with NOT EXISTS, but the query becomes slightly less readable:
select *
from mytable req
where type = 'req'
and not exists (select * from mytable res where type = 'res' and res.id = req.id);
And then there are forms of aggregation you can use, e.g.:
select *
from mytable
where type = 'req'
and id in
(
select id
from mytable
group by id
having count(case when type = 'res' then 1 end) = 0
);

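This will give you the ones that have a request but no response. The type filter on A belongs in the WHERE clause; inside the ON clause it would let unmatched response rows through as well.
SELECT A.*
FROM your_table A LEFT OUTER JOIN
your_table B ON A.ID = B.ID
AND B.type = 'res'
WHERE A.type = 'req'
AND B.ID IS NULL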
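As a rough check that the anti-join styles in this thread agree, the sketch below runs NOT IN, NOT EXISTS and a LEFT JOIN (with the request filter in the WHERE clause) against the same made-up data in SQLite. The table contents, including a stray response with ID 4 that has no request, are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, type TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    # Hypothetical rows: ID 1 is a complete pair, ID 2 is a request with no
    # response, ID 4 is a response with no request.
    [(1, "req"), (1, "res"), (2, "req"), (4, "res")],
)

not_in = conn.execute(
    """
    SELECT id, type FROM mytable
    WHERE type = 'req'
      AND id NOT IN (SELECT id FROM mytable WHERE type = 'res')
    """
).fetchall()

not_exists = conn.execute(
    """
    SELECT id, type FROM mytable req
    WHERE type = 'req'
      AND NOT EXISTS (
          SELECT 1 FROM mytable res
          WHERE res.type = 'res' AND res.id = req.id
      )
    """
).fetchall()

left_join = conn.execute(
    """
    SELECT a.id, a.type
    FROM mytable a
    LEFT OUTER JOIN mytable b ON a.id = b.id AND b.type = 'res'
    WHERE a.type = 'req' AND b.id IS NULL
    """
).fetchall()

print(not_in, not_exists, left_join)
```

All three return only the unmatched request; the stray response is excluded because each query restricts the outer rows to type 'req'.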

Related

SQL: Unable to use 'with as' to save selected result

I have an inner join result that I want to save it by using with as but received an error. I'm using snowflake.
My code:
with t as (select *
from
(select ID, PRICE from DB.TABLE1
WHERE PRICE IS NOT NULL and ID = '1111') A
inner join
(select ID, BID, ACCEPTED from DB.TABLE2
WHERE BID IS NOT NULL and ID = '1111') B
ON A.ID = B.ID);
Error: SQL compilation error: syntax error line 8 at position 25 unexpected ';'.
If I only run the inner join
select *
from
(select ID, PRICE from DB.TABLE1
WHERE PRICE IS NOT NULL and ID = '1111') A
inner join
(select ID, BID, ACCEPTED from DB.TABLE2
WHERE BID IS NOT NULL and ID = '1111') B
ON A.ID = B.ID
I got this result
ID, PRICE,ID,BIDS,ACCEPTED
1111,180,1111,200,FALSE
1111,180,1111,180,FALSE
1111,180,1111,180,FALSE
1111,180,1111,100,TRUE
Any idea why I got the error message?
You use with to essentially create an alias (called a common table expression) for the query that can then be used in that specific query. All you've done is create the alias without using it. You need something like:
with t as (select *
from
(select ID, PRICE from DB.TABLE1
WHERE PRICE IS NOT NULL and ID = '1111') A
inner join
(select ID, BID, ACCEPTED from DB.TABLE2
WHERE BID IS NOT NULL and ID = '1111') B
ON A.ID = B.ID)
select * from t
Although obviously you'd usually do more complex work than that, or else you'd just write the base query without using with.
WITH is the syntax used to introduce a common table expression (CTE). This is an expression used within a single query. It is a lot like a subquery in the FROM clause, except that it can be referenced more than once.
So a correct usage would be:
with t as (
select . . .
)
select count(*)
from t;
In other words, you need to follow the with with something that uses the CTE. Otherwise, you want to store the results in a real table -- temporary or otherwise.
Another way to use CTEs here is to define each filtered table as its own CTE and then join them in the final SELECT:
with t as
(select ID, PRICE from DB.TABLE1
WHERE PRICE IS NOT NULL and ID = '1111') ,
t1 as
(select ID, BID, ACCEPTED from DB.TABLE2
WHERE BID IS NOT NULL and ID = '1111')
select *
from t
inner join
t1
on t.ID = t1.ID;
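The point made by the answers above, that a CTE only exists as part of the statement that consumes it, can be tried out with the following sketch. It uses SQLite through Python rather than Snowflake, drops the DB. schema prefix, and stores ACCEPTED as 0/1; those are assumptions for illustration. The sample rows mirror the result shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TABLE1 (ID TEXT, PRICE INTEGER)")
conn.execute("CREATE TABLE TABLE2 (ID TEXT, BID INTEGER, ACCEPTED INTEGER)")
conn.execute("INSERT INTO TABLE1 VALUES ('1111', 180)")
conn.executemany(
    "INSERT INTO TABLE2 VALUES (?, ?, ?)",
    [("1111", 200, 0), ("1111", 180, 0), ("1111", 180, 0), ("1111", 100, 1)],
)

# The WITH clause names the join result; the trailing SELECT is what
# actually uses it, and both belong to one statement.
row_count, max_bid = conn.execute(
    """
    WITH t AS (
        SELECT A.ID, A.PRICE, B.BID, B.ACCEPTED
        FROM (SELECT ID, PRICE FROM TABLE1
              WHERE PRICE IS NOT NULL AND ID = '1111') A
        INNER JOIN (SELECT ID, BID, ACCEPTED FROM TABLE2
                    WHERE BID IS NOT NULL AND ID = '1111') B
        ON A.ID = B.ID
    )
    SELECT COUNT(*), MAX(BID) FROM t
    """
).fetchone()

print(row_count, max_bid)
```

The inner SELECT lists its columns explicitly so that the CTE does not end up with two columns both named ID.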

Query a table that have 2 cols with multiple criteria

I have a table with the following structure and Example data:
Now I want to query the records that have values equal to @ and #.
For example, according to the above image, it should return 1 and 2:
id
-----
1
2
Also, if the parameters were @, # and $, it should give us 1, because only the records with id 1 have all the given values.
id
-----
1
You can use GROUP BY and HAVING to get the distinct Ids whose count of distinct matching values equals the number of items you're looking for:
SELECT Id
FROM Table
WHERE Value IN ('#','$')
GROUP BY Id
HAVING COUNT(DISTINCT Value) = 2
For three values:
SELECT Id
FROM Table
WHERE Value IN ('#','$','@')
GROUP BY Id
HAVING COUNT(DISTINCT Value) = 3
There are several ways to do this.
The subquery method:
SELECT DISTINCT Id
FROM Table
WHERE Id IN (SELECT Id FROM Table WHERE Value = '@')
AND Id IN (SELECT Id FROM Table WHERE Value = '#');
The correlated subquery method:
SELECT DISTINCT t.Id
FROM Table t
WHERE EXISTS (SELECT 1 FROM Table a WHERE a.Id = t.Id and a.Value = '@')
AND EXISTS (SELECT 1 FROM Table b WHERE b.Id = t.Id and b.Value = '#');
And the INTERSECT method:
SELECT Id FROM Table WHERE Value = '@'
INTERSECT
SELECT Id FROM Table WHERE Value = '#';
Best performance will depend on RDBMS vendor, size of table, and indexes. Not all RDBMS vendors support all methods.
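A small sketch of the GROUP BY / HAVING COUNT(DISTINCT ...) and INTERSECT approaches, run in SQLite through Python. The table name items, the helper ids_with_all and the sample rows are hypothetical (the question's image is not available), chosen so that id 1 has all of @, # and $ while id 2 has only @ and #.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (Id INTEGER, Value TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [(1, "@"), (1, "#"), (1, "$"), (2, "@"), (2, "#"), (3, "$")],
)


def ids_with_all(conn, values):
    """Return the Ids that have every one of the given values."""
    wanted = sorted(set(values))  # de-duplicate so the count is exact
    placeholders = ", ".join("?" for _ in wanted)
    sql = (
        f"SELECT Id FROM items WHERE Value IN ({placeholders}) "
        "GROUP BY Id HAVING COUNT(DISTINCT Value) = ? ORDER BY Id"
    )
    return [row[0] for row in conn.execute(sql, (*wanted, len(wanted)))]


two_values = ids_with_all(conn, ["@", "#"])
three_values = ids_with_all(conn, ["@", "#", "$"])

# The INTERSECT form for the two-value case.
intersected = [
    row[0]
    for row in conn.execute(
        "SELECT Id FROM items WHERE Value = '@' "
        "INTERSECT "
        "SELECT Id FROM items WHERE Value = '#' "
        "ORDER BY Id"
    )
]

print(two_values, three_values, intersected)
```

The grouped form scales to any number of values with one query, whereas the subquery, EXISTS and INTERSECT forms need one extra branch per value.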
Maybe a multiple self join like this?
select
distinct t1.id
from
table t1
join table t2 on (t1.id=t2.id)
join table t3 on (t1.id=t3.id)
...
where
t1.value='@' and
t2.value='#' and
t3.value='$' and
...

SQL statement to conditionally select related records

I have a table with fields id (primary key) and fid. I want to get the record where id matches a particular value, as well as all related records that have its same fid value.
I can do this:
SELECT * FROM mytable
WHERE fid = (SELECT TOP 1 fid FROM mytable WHERE id = 'somevalue')
But I don't want the related records if the fid is a particular value (in my case an empty guid value).
Is there a way to do this in a single SQL statement? I am using SQL Server 2008 R2.
UPDATE:
Looking at the answers so far I think I may not have asked my question clearly. id and fid will never be equal. LEFT JOIN may be what I need, but I'm a bit SQL ignorant. What I'm hoping for is the following two queries as a single statement:
SELECT * FROM mytable WHERE id = 'somevalue'
SELECT * FROM mytable WHERE fid =
(SELECT TOP 1 fid FROM mytable
WHERE id = 'somevalue' AND fid != '00000000-0000-0000-0000-000000000000')
Based on your revision, the problem seems to be: "select the row where id is 'somevalue', plus all other rows that share its fid, unless that fid is the empty GUID".
The following captures this logic:
SELECT t.*
FROM mytable t left outer join
(SELECT TOP 1 fid
FROM mytable
WHERE id = 'somevalue' AND fid <> '00000000-0000-0000-0000-000000000000'
) t1
on t.fid = t1.fid
WHERE id = 'somevalue' or t1.fid is not null;
Because id is a primary key, the t1 subquery will return 0 or 1 rows. When it returns 0 rows, you will only get the original row matching 'somevalue'.
I'm not certain I understand your question, but I'll take a stab at it. What I think you're asking is if you can select all records from one table where either the id or fid fields equal a particular value, but you don't want the related fields if the particular value you're searching on equals an empty guid value. If so, here's how you can do it:
SELECT
*
FROM
mytable t1
LEFT JOIN
mytable t2 ON (t1.id = t2.fid) AND (t2.fid IS NOT NULL);
Is this what you were looking for?
I think this is what you are trying to do:
SELECT *
FROM mytable a
JOIN mytable b ON a.id = b.fid
WHERE a.id = 'somevalue';
This should return all records in a (joined with all records in b where a.id = b.fid) then filtered to show only records that have a.id = 'somevalue';
You could just add another clause to your sql statement like this:
SELECT * FROM mytable
WHERE fid = (SELECT TOP 1 fid FROM mytable WHERE id = 'somevalue'
AND fid != '00000000-0000-0000-0000-000000000000')
If you want more than one row, try a join as suggested by @zigdawgydawg.
Maybe this is what you are after:
select * from mytable
where id = 'somevalue'
or id = (select fid from mytable where id = 'somevalue')
Almost like zigdawgydawg's contribution, but slightly different:
SELECT * FROM mytable WHERE fid IN
(SELECT fid FROM mytable WHERE id = 'somevalue' )
AND NOT guid is null;

Select on records based on two column criterias

I would like to do an SQL query to select from the following table:
id type num
1 a 3
1 b 4
2 a 5
2 c 6
I want only the rows whose 'id' has both a type 'a' row and a type 'b' row, so the result would look something like this:
id type num
1 a 3
1 b 4
Does anyone have any idea how that can be accomplished?
SELECT table1.*
FROM table1,
(
SELECT COUNT(*) as cnt, id
FROM (
SELECT *
FROM table1
WHERE type = 'a' OR type = 'b'
) sub1
GROUP BY id
HAVING cnt > 1
)sub2
WHERE table1.id = sub2.id
Tested here: http://sqlfiddle.com/#!2/4a031/1 seems to work fine.
Method 1:
select a.*
from some_table t
join some_table a on a.id = t.id and a.type = 'a'
join some_table b on b.id = t.id and b.type = 'b'
Method 2:
select *
from some_table t
where exists ( select *
from some_table x
where x.id = t.id
and x.type = 'a'
)
and exists ( select *
from some_table x
where x.id = t.id
and x.type = 'b'
)
The first technique offers the possibilities of duplicate rows in the results set, depending on the cardinality of id and type. The latter is guaranteed to provide a proper subset of the table.
Either query, assuming you have reasonable indices defined on the table, should provide roughly equivalent performance.
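The difference between the two methods can be seen on the question's four sample rows. The sketch below runs both in SQLite through Python; the ORDER BY in the second query is an addition for stable output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE some_table (id INTEGER, type TEXT, num INTEGER)")
conn.executemany(
    "INSERT INTO some_table VALUES (?, ?, ?)",
    [(1, "a", 3), (1, "b", 4), (2, "a", 5), (2, "c", 6)],  # rows from the question
)

# Method 1: joins. Each row of t with id 1 pairs with the same 'a' row,
# so selecting a.* repeats that row.
method1 = conn.execute(
    """
    SELECT a.*
    FROM some_table t
    JOIN some_table a ON a.id = t.id AND a.type = 'a'
    JOIN some_table b ON b.id = t.id AND b.type = 'b'
    """
).fetchall()

# Method 2: EXISTS. Each qualifying row of the table appears once.
method2 = conn.execute(
    """
    SELECT *
    FROM some_table t
    WHERE EXISTS (SELECT 1 FROM some_table x WHERE x.id = t.id AND x.type = 'a')
      AND EXISTS (SELECT 1 FROM some_table x WHERE x.id = t.id AND x.type = 'b')
    ORDER BY t.type
    """
).fetchall()

print(method1)
print(method2)
```

On this data Method 1 returns the (1, 'a', 3) row twice, while Method 2 returns the two rows shown in the question's expected result.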

How do I compare 2 rows from the same table (SQL Server)?

I need to create a background job that processes a table looking for rows matching on a particular id with different statuses. It will store the row data in a string to compare the data against a row with a matching id.
I know the syntax to get the row data, but I have never tried comparing 2 rows from the same table before. How is it done? Would I need to use variables to store the data from each? Or some other way?
(Using SQL Server 2008)
You can join a table to itself as many times as you require; this is called a self join.
An alias is assigned to each instance of the table (as in the example below) to differentiate one from another.
SELECT a.SelfJoinTableID
FROM dbo.SelfJoinTable a
INNER JOIN dbo.SelfJoinTable b
ON a.SelfJoinTableID = b.SelfJoinTableID
INNER JOIN dbo.SelfJoinTable c
ON a.SelfJoinTableID = c.SelfJoinTableID
WHERE a.Status = 'Status to filter a'
AND b.Status = 'Status to filter b'
AND c.Status = 'Status to filter c'
OK, after 2 years it's finally time to correct the syntax:
SELECT t1.value, t2.value
FROM MyTable t1
JOIN MyTable t2
ON t1.id = t2.id
WHERE t1.id = @id
AND t1.status = @status1
AND t2.status = @status2
Some people find the following alternative syntax easier to see what is going on:
select t1.value,t2.value
from MyTable t1
inner join MyTable t2 on
t1.id = t2.id
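where t1.id = @id
SELECT COUNT(*) FROM (SELECT * FROM tbl WHERE id=1 UNION SELECT * FROM tbl WHERE id=2) a
If you get two rows, they are different; if one, they are the same. Select only the columns you want to compare, though: with SELECT * the differing id values always produce two rows.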
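Here is a minimal sketch of the UNION trick in SQLite through Python, restricted to the non-key columns so that differing ids do not make every pair look different. The columns name and status, the helper rows_differ and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INTEGER PRIMARY KEY, name TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?)",
    [(1, "x", "open"), (2, "x", "open"), (3, "x", "closed")],
)


def rows_differ(conn, id_a, id_b):
    """Compare two rows on name and status; UNION collapses identical rows."""
    (count,) = conn.execute(
        """
        SELECT COUNT(*) FROM (
            SELECT name, status FROM tbl WHERE id = ?
            UNION
            SELECT name, status FROM tbl WHERE id = ?
        ) a
        """,
        (id_a, id_b),
    ).fetchone()
    return count == 2


same_pair = rows_differ(conn, 1, 2)       # identical name and status
different_pair = rows_differ(conn, 1, 3)  # status differs

print(same_pair, different_pair)
```

UNION treats NULLs in the same column as equal when removing duplicates, which is usually what a row comparison wants and is harder to get from a plain = comparison.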
SELECT * FROM A AS b INNER JOIN A AS c ON b.a = c.a
WHERE b.a = 'some column value'
I had a situation where I needed to compare each row of a table with the row after it. "Next" is relative to the problem; in this example it is defined by the ORDER BY clause inside the row_number() function.
So I wrote this:
DECLARE @T TABLE (col1 nvarchar(50));
insert into @T VALUES ('A'),('B'),('C'),('D'),('E')
select I1.col1 Instance_One_Col, I2.col1 Instance_Two_Col from (
select col1,row_number() over (order by col1) as row_num
FROM @T
) AS I1
left join (
select col1,row_number() over (order by col1) as row_num
FROM @T
) AS I2 on I1.row_num = I2.row_num - 1
After that, I can compare each row to the next one as needed.