SQL need help building a query - sql

I need your help building a query.
I have two tables:
The first table (table1) holds the status history, i.e. every status my product has passed through, and the second table (table2) holds the product's current status.
Both tables share the same id and status columns.
I want a query that counts the products whose current status in table2 is D, E or F but which never passed through status C in table1, for example going from status B straight to D, E or F.
I tried running this query:
select count(id), status
from table1 e
where status not in (C) EXISTS (SELECT *
FROM table2 c
WHERE e.id = c.id
AND status IN (D,E,F))
group by status
The query didn't return with the expected results. Can you help?

As the other responders noted, you have some syntax errors. Basically, you're just missing a few words.
select count(id)
, status
from table1 t1
where status not in ('C')
and -- this is the keyword that was missing
EXISTS (
SELECT *
FROM table2 t2
WHERE t2.id = t1.id
and status in ('D','E','F')
)
group by status
;
Alternatively, you could try solving it this way. Full disclosure - this is probably not as efficient (see In vs Exists).
select count(id)
, status
from table1
where status not in ('C')
and id in
(
select id
from table2
where status in ('D','E','F')
)
group by status
;

select count(a.id), a.status
from table1 a, table2 b
where a.id = b.id
and a.status not in ('C')
and b.status in ('D','E','F')
group by a.status
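Note that a row-level filter such as status not in ('C') only drops the C rows themselves; the other history rows of a product that did pass through C are still counted. If the goal is to count products that are currently D, E or F and never had status C at all, one option is to start from table2 and use NOT EXISTS against the history. The following is a minimal sketch run against SQLite through Python's sqlite3 module; the sample rows are hypothetical and the choice of SQLite is an assumption made only so the example can be executed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, status TEXT);  -- status history
    CREATE TABLE table2 (id INTEGER, status TEXT);  -- current status
    -- Hypothetical data: product 1 skipped C, product 2 passed through C,
    -- product 3 is still at B.
    INSERT INTO table1 VALUES (1,'A'),(1,'B'),(1,'D'),
                              (2,'B'),(2,'C'),(2,'E'),
                              (3,'A'),(3,'B');
    INSERT INTO table2 VALUES (1,'D'),(2,'E'),(3,'B');
""")

# Count products per current status, keeping only those whose history
# contains no 'C' row at all.
skipped_c = conn.execute("""
    SELECT t2.status, COUNT(*) AS products
    FROM table2 t2
    WHERE t2.status IN ('D','E','F')
      AND NOT EXISTS (SELECT 1
                      FROM table1 t1
                      WHERE t1.id = t2.id
                        AND t1.status = 'C')
    GROUP BY t2.status
    ORDER BY t2.status
""").fetchall()
print(skipped_c)  # [('D', 1)]
```

Only product 1 is counted here: product 2 is excluded because its history contains C, and product 3 because its current status is B.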

Related

Best way to group records with MAX revision

I have a source table like this:
table_a:
id | revision | status
1 | 0 | APPROVED
1 | 1 | PENDING
I am trying to get distinct records from table_a having the latest revision and show the latest approved revision for each one of them.
result table_b:
id | latest_rev | latest_approved_rev
1 | 1 | 0
I have written the following query :
SELECT a.id,
a.revision AS latest_rev,
b.latest_approved_rev
FROM table_a a
LEFT JOIN (SELECT id,
MAX(revision) AS latest_approved_rev
FROM table_a
WHERE status = 'APPROVED'
GROUP BY id) b ON b.id = a.id
WHERE a.revision = (SELECT MAX(revision)
FROM table_a
WHERE id = a.id)
My query seems to work fine, but I am wondering if I was missing something and/or if there is another way to make the query better/faster.
Seems you could achieve this with some (conditional) aggregation:
SELECT id,
MAX(revision) AS latest_rev,
MAX(CASE status WHEN 'APPROVED' THEN revision END) AS latest_approved_rev
FROM (VALUES(1,0,'APPROVED'),
(1,1,'PENDING'))V(id,revision,status)
GROUP BY id;
You have a correct query (if I understand your requirement). You are very close to an ideal query. But your outer WHERE clause contains a correlated (dependent) subquery, and those don't always perform well.
You can think of this as the JOIN of two subqueries. The one you have.
SELECT id,
MAX(revision) AS revision
FROM table_a
WHERE status = 'APPROVED'
GROUP BY id
and this one.
SELECT id,
MAX(revision) AS revision
FROM table_a
GROUP BY id
You FULL JOIN them together. Like this.
SELECT latest.id,
latest.revision as latest_rev,
approved.revision as approved_rev
FROM (
SELECT id,
MAX(revision) AS revision
FROM table_a
GROUP BY id
) latest
FULL JOIN (
SELECT id,
MAX(revision) AS revision
FROM table_a
WHERE status = 'APPROVED'
GROUP BY id
) approved on latest.id = approved.id
In this case you can actually use LEFT JOIN because you know every id in the approved subquery is also present in the latest subquery.
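To compare the two approaches side by side, here is a small sketch that runs the conditional aggregation and the LEFT JOIN of two grouped subqueries against the sample rows from the question. It uses SQLite through Python's sqlite3 module, which is an assumption made for runnability; the row for id 2 is hypothetical and is there only to show what happens when an id has no approved revision.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (id INTEGER, revision INTEGER, status TEXT);
    INSERT INTO table_a VALUES (1, 0, 'APPROVED'), (1, 1, 'PENDING'),
                               (2, 0, 'PENDING');  -- id 2 is hypothetical
""")

# Approach 1: one pass over the table with conditional aggregation.
aggregated = conn.execute("""
    SELECT id,
           MAX(revision) AS latest_rev,
           MAX(CASE WHEN status = 'APPROVED' THEN revision END) AS latest_approved_rev
    FROM table_a
    GROUP BY id
    ORDER BY id
""").fetchall()

# Approach 2: join "latest per id" with "latest approved per id".
joined = conn.execute("""
    SELECT latest.id, latest.revision, approved.revision
    FROM (SELECT id, MAX(revision) AS revision
          FROM table_a GROUP BY id) latest
    LEFT JOIN (SELECT id, MAX(revision) AS revision
               FROM table_a WHERE status = 'APPROVED' GROUP BY id) approved
           ON approved.id = latest.id
    ORDER BY latest.id
""").fetchall()

print(aggregated)  # [(1, 1, 0), (2, 0, None)]
print(joined == aggregated)  # True
```

Both queries return the same rows on this data; an id without any approved revision gets NULL (None in Python) in the last column.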

SQL Server - Exclude Records from other tables

I used the search function which brought me to the following solution.
Starting Point is the following: I have one table A which stores all data.
From that table I select a certain amount of records and store it in table B.
In a new statement I want to select new records from table A that do not appear in table B and store them in table C. I tried to solve this with an AND ... NOT IN condition.
But I still receive records in table C that are in table B.
Important: I can only work with select statements, each statement needs to start with select as well.
Does anybody have an idea where the problem in the following statement could be:
Select *
From
(Select TOP 10000 *
FROM [table_A]
WHERE Email like '%#domain_A%'
AND Id NOT IN (SELECT Id
FROM [table_B]))
Union
(Select TOP 7500 *
FROM [table_A]
WHERE Email like '%#domain_B%'
AND Id NOT IN (SELECT Id
FROM [table_B]))
Union
(SELECT TOP 5000 *
FROM [table_A]
WHERE Email like '%#domain_C%'
AND Id NOT IN (SELECT Id
FROM [table_B]))
Try NOT EXISTS instead of NOT IN
SELECT
*
FROM TableA A
WHERE NOT EXISTS
(
SELECT 1 FROM TableB WHERE Id = A.Id
)
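One commonly cited reason to prefer NOT EXISTS is how NOT IN treats NULLs: if the subquery returns even a single NULL, the NOT IN predicate is never true and no rows come back. This may well not be the cause of the problem in the question, but it is worth knowing. The sketch below shows the difference using SQLite through Python's sqlite3 module; the table contents, including the NULL id, are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (id INTEGER);
    CREATE TABLE table_b (id INTEGER);
    INSERT INTO table_a VALUES (1), (2), (3);
    INSERT INTO table_b VALUES (1), (NULL);  -- hypothetical NULL id
""")

# NOT IN: comparing against a list that contains NULL yields UNKNOWN,
# so every candidate row is filtered out.
not_in = conn.execute("""
    SELECT id FROM table_a
    WHERE id NOT IN (SELECT id FROM table_b)
""").fetchall()

# NOT EXISTS: only checks whether a matching row exists, so the NULL
# in table_b has no effect.
not_exists = conn.execute("""
    SELECT a.id FROM table_a a
    WHERE NOT EXISTS (SELECT 1 FROM table_b b WHERE b.id = a.id)
    ORDER BY a.id
""").fetchall()

print(not_in)      # []
print(not_exists)  # [(2,), (3,)]
```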
So basically the idea here is to select everything from table A that doesn't exist in table B and insert all of that into table C?
INSERT INTO Table_C
SELECT a.column1, a.column2,......
FROM [table_A] a
LEFT JOIN [table_B] b ON a.id = b.ID
WHERE a.Email like '%#domain_A%' AND b.id IS NULL
Thank you all for your feedback, from which I learned a lot.
I was able to fix the statement with your help. Below is the statement, which now works with the desired results:
Select Id
From
(Select TOP 10000 * FROM Table_A
WHERE Email like '%#domain_a%'
AND Id NOT IN (SELECT Id
FROM Table_B)
order by No desc) t1
Union
Select Id
From
(Select TOP 7500 * FROM Table_A
WHERE Email like '%#domain_b%'
AND Id NOT IN (SELECT Id
FROM Table_B)
order by No desc) t2
Union
Select Id
From
(SELECT TOP 5000 * FROM Table_A
WHERE Email like '%#domain_c%'
AND Id NOT IN (SELECT Id
FROM Table_B)
order by No desc) t3

SQL - How to "filter out" people who has more than 1 status

I tried to find this question here but I probably didn't know the exact term to search for.
Here is the problem:
I have this set of customers (see image). I need to keep only those with status "user_paused" or "interval_paused". The same customer_id may have more than one status, and sometimes one of them is "active". If so, that customer should not appear in my final result.
See customer 809: he shouldn't appear in my final result since he has an "active" status. All the others are fine, because they only have paused statuses.
I still couldn't figure out how to proceed from here.
Thank you so much.
SELECT DISTINCT customer_id FROM TABLE
WHERE status IN ( 'user_paused','interval_paused')
EXCEPT
SELECT DISTINCT customer_id FROM TABLE
WHERE status = 'active'
One method uses group by and having:
select customer_id
from t
group by customer_id
having sum(case when status not in ('user_paused', 'interval_paused') then 1 else 0 end) = 0;
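As a quick check of the GROUP BY / HAVING method, the sketch below runs it on a few rows using SQLite through Python's sqlite3 module. Customer 809 with an "active" row comes from the question; its second status and the other two customers (101 and 102) are hypothetical, as is the choice of SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (customer_id INTEGER, status TEXT);
    INSERT INTO t VALUES (809, 'user_paused'), (809, 'active'),
                         (101, 'user_paused'), (101, 'interval_paused'),
                         (102, 'interval_paused');
""")

# Keep a customer only if the number of rows with a non-paused status is zero.
only_paused = [row[0] for row in conn.execute("""
    SELECT customer_id
    FROM t
    GROUP BY customer_id
    HAVING SUM(CASE WHEN status NOT IN ('user_paused', 'interval_paused')
                    THEN 1 ELSE 0 END) = 0
    ORDER BY customer_id
""")]
print(only_paused)  # [101, 102]
```

Note that this version excludes a customer for any status other than the two paused ones, not just "active", which is slightly stricter than the EXCEPT query.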
select * from table
where customer_id in
(select customer_id from table
where status in ('interval_paused','user_paused') )
and customer_id not in
(select customer_id from table
where status = 'active')
You can find all customers with a status of 'active' quite easily:
SELECT customerid FROM table WHERE status = 'active'
If you want to exclude any customer from your results if they have an active row, you can do this in a subquery:
SELECT * FROM table WHERE /* your other query restrictions */
AND customerID NOT IN
(
SELECT customerid FROM table WHERE status = 'active'
)
This will let you eliminate any row with a customerid that has any 'active' row.
Please note that subqueries are not always the most efficient solution - there could be cases where a subquery would make your query very slow.
To exclude any customer with 'active' in either column use the following:
select * from customers
where paused_statuses != 'active'
and status != 'active';
Not sure if you need distinct or not, but here are 2 approaches. I think both will work in Impala, but this way you have an option just in case. The first uses a "left excluding join" (make the join, then exclude the matched rows), which lets us ignore the customers with an active status. The second uses the more traditional "not exists" approach to remove customer_ids that have an active status.
select /* distinct */ t1.customer_id
from table t1
left join table t2 on t1.customer_id = t2.customer_id and t2.status = 'active'
where t2.customer_id IS NULL
and t1.status in ('interval_paused','user_paused')
;
select /* distinct */ t1.customer_id
from table t1
where t1.status in ('interval_paused','user_paused')
and NOT EXISTS (
select null
from table t2
where t1.customer_id = t2.customer_id
and t2.status = 'active'
)
;
If your existing query is complex, then to simplify these additions, use a WITH clause like this:
WITH MyCTE AS (
-- place the whole existing query here
)
select /* distinct */ t1.customer_id
from MyCTE t1
left join MyCTE t2 on t1.customer_id = t2.customer_id and t2.status = 'active'
where t2.customer_id IS NULL
and t1.status in ('interval_paused','user_paused')
;
Notice that the name you give it ("MyCTE") can be reused in the subsequent query, which is a very useful feature.
In general, the structures created by WITH are called common table expressions (CTEs), in case you are wondering why I use "MyCTE" as a name.
SELECT customer_id, paused_statuses, status
FROM Customer
WHERE customer_id NOT IN (SELECT customer_id
FROM Customer
WHERE status = 'active')
AND status IN ('user_paused', 'interval_paused')
OR
SELECT c.customer_id, c.paused_statuses, c.status
FROM Customer c
WHERE c.status IN ('user_paused', 'interval_paused')
AND NOT EXISTS (SELECT 1
FROM Customer x
WHERE x.customer_id = c.customer_id
AND x.status = 'active')

How to find unmatched records in a single table?

I'm scraping a log file for transaction records that I am inserting into a table that will be used for several mining tasks. Each record has (among other things) an ID and a transaction type, either request or response. A request/response pair will have the same ID.
One of my tasks is to find all of the requests that do not have a corresponding response. I thought about joining the table to itself, where A.ID = B.ID AND A.type = 'req' and B.type = 'res', but that gives me the opposite of what I need.
Since the IDs will always occur either once or twice, is there a query that would select ID where there is only one occurrence of that ID in the table?
This is a very common type of query. You can try aggregating over the ID values in your table using GROUP BY, then retaining those ID which appear only once.
SELECT ID
FROM yourTable
GROUP BY ID
HAVING COUNT(*) = 1
If you also want to return the entire records for those ID occurring only once, you could try this:
SELECT t1.*
FROM yourTable t1
INNER JOIN
(
SELECT ID FROM yourTable GROUP BY ID HAVING COUNT(*) = 1
) t2
ON t1.ID = t2.ID
The straight-forward way is NOT IN:
select *
from mytable
where type = 'req'
and id not in (select id from mytable where type = 'res');
You can write about the same with NOT EXISTS, but the query becomes slightly less readable:
select *
from mytable req
where type = 'req'
and not exists (select * from mytable res where type = 'res' and res.id = req.id);
And then there are forms of aggregation you can use, e.g.:
select *
from mytable
where type = 'req'
and id in
(
select id
from mytable
group by id
having count(case when type = 'res' then 1 end) = 0
);
This will give you the ones that have a request but no response:
SELECT A.*
FROM your_table A LEFT OUTER JOIN
your_table B ON A.ID = B.ID
AND B.type = 'res'
WHERE A.type = 'req'
AND B.ID IS NULL
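The approaches above are not quite interchangeable. Counting occurrences finds any ID that appears once, whatever its type, while the NOT EXISTS form finds only requests without a response. A minimal sketch using SQLite through Python's sqlite3 module illustrates this; the rows are hypothetical, and ID 4 (a response without a request) is included only to show the difference, even though the question suggests it should not normally occur.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (id INTEGER, type TEXT);
    INSERT INTO mytable VALUES (1, 'req'), (1, 'res'),
                               (2, 'req'),             -- unanswered request
                               (3, 'req'), (3, 'res'),
                               (4, 'res');             -- hypothetical orphan response
""")

# IDs that occur exactly once, regardless of type.
by_count = [r[0] for r in conn.execute(
    "SELECT id FROM mytable GROUP BY id HAVING COUNT(*) = 1 ORDER BY id")]

# Requests for which no response row with the same ID exists.
by_not_exists = [r[0] for r in conn.execute("""
    SELECT req.id
    FROM mytable req
    WHERE req.type = 'req'
      AND NOT EXISTS (SELECT 1 FROM mytable res
                      WHERE res.type = 'res' AND res.id = req.id)
    ORDER BY req.id
""")]

print(by_count)       # [2, 4]
print(by_not_exists)  # [2]
```

If the data is guaranteed to contain no orphan responses, both queries return the same IDs.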

TSQL/SQL 2005/2008 Return Row from one table that is not in other table

I have to compare the rows of one table against another table and flag the ones that are not in it.
TableA
ID
1
2
3
4
NULL
TableB
ID
1
4
When comparing TableA with TableB, the expected output is the following (NULL can be ignored):
ID STATUS
1 FOUND
2 NOT FOUND
3 NOT FOUND
4 FOUND
I tried with
SELECT
case T.ID when isnull(T.ID,0)=0 then 'NOT FOUND'
case T.ID when isnull(T.ID,0)<>0 then 'FOUND'
end,T.ID
FROM
TableA T
LEFT JOIN
TableB N
ON T.ID=N.ID
It ended with "incorrect syntax near '='"; moreover, I have no idea whether the query is otherwise correct.
Try this:
SELECT a.ID,
CASE WHEN b.ID IS NULL THEN 'NOT FOUND' ELSE 'FOUND' END AS Status
FROM TableA a
LEFT JOIN TableB b ON a.ID = b.ID
Note the difference in the structure of the CASE statement - that was your problem.
To generate the result as shown in the question:
SELECT ID,
CASE WHEN EXISTS (SELECT * FROM TableB WHERE ID = TableA.ID)
THEN 'FOUND'
ELSE 'NOT FOUND'
END AS STATUS
FROM TableA
But if you are only interested in the missing records:
SELECT ID
FROM TableA
WHERE NOT EXISTS (SELECT * FROM TableB WHERE ID = TableA.ID)
SELECT
T.ID
FROM TableA T WHERE NOT EXISTS ( SELECT X.ID FROM TableB X WHERE X.ID = T.ID)
If you want the 'Found' or 'Not Found' answer go for what AdaTheDev posted
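For reference, here is the LEFT JOIN / CASE WHEN query run against the sample data from the question, as a minimal sketch using SQLite through Python's sqlite3 module. SQLite stands in for SQL Server here purely so the example can be executed, and the WHERE a.ID IS NOT NULL filter is an added assumption based on the remark that the NULL row can be ignored.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (ID INTEGER);
    CREATE TABLE TableB (ID INTEGER);
    INSERT INTO TableA VALUES (1), (2), (3), (4), (NULL);
    INSERT INTO TableB VALUES (1), (4);
""")

# Rows of TableA with no partner in TableB get b.ID = NULL from the LEFT JOIN.
rows = conn.execute("""
    SELECT a.ID,
           CASE WHEN b.ID IS NULL THEN 'NOT FOUND' ELSE 'FOUND' END AS Status
    FROM TableA a
    LEFT JOIN TableB b ON a.ID = b.ID
    WHERE a.ID IS NOT NULL  -- skip the NULL row, as the question allows
    ORDER BY a.ID
""").fetchall()

for row in rows:
    print(row)
# (1, 'FOUND')
# (2, 'NOT FOUND')
# (3, 'NOT FOUND')
# (4, 'FOUND')
```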