How do I get the middle 45% of a table in SQL? - sql

I already have two tables: one containing the top 5% (#Top5) and another containing the bottom 50% (#Bottom50). I also have a table containing all 100% (#Total). Is there a way I can use these to make a third table with the remaining 45%? Each row has its own specific userid, so that is what any join would be on.

Assuming each row has a unique id, you can use not exists:
select t.*
from #Total t
where not exists (select 1 from #Bottom50 b where b.userid = t.userid) and
not exists (select 1 from #Top5 tt where tt.userid = t.userid)
This is standard SQL and should work in any database.
You can add a CREATE TABLE AS or SELECT ... INTO clause to actually create the new table. The specific syntax depends on the database.
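As a hedged illustration of the NOT EXISTS approach, here is a minimal sketch using Python's built-in sqlite3 module. The table names total, top5 and bottom50 stand in for #Total, #Top5 and #Bottom50 (SQLite does not use the # temp-table naming), and the ten sample userids are invented for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE total    (userid INTEGER PRIMARY KEY);
    CREATE TABLE top5     (userid INTEGER PRIMARY KEY);
    CREATE TABLE bottom50 (userid INTEGER PRIMARY KEY);
""")
# Made-up data: ten users, one in the "top" table, five in the "bottom" table.
conn.executemany("INSERT INTO total VALUES (?)", [(i,) for i in range(1, 11)])
conn.executemany("INSERT INTO top5 VALUES (?)", [(1,)])
conn.executemany("INSERT INTO bottom50 VALUES (?)", [(i,) for i in range(6, 11)])

# CREATE TABLE ... AS is SQLite's spelling; SQL Server would use SELECT ... INTO.
conn.execute("""
    CREATE TABLE middle AS
    SELECT t.*
    FROM total t
    WHERE NOT EXISTS (SELECT 1 FROM bottom50 b WHERE b.userid = t.userid)
      AND NOT EXISTS (SELECT 1 FROM top5 tt WHERE tt.userid = t.userid)
""")

middle_ids = [row[0] for row in conn.execute("SELECT userid FROM middle ORDER BY userid")]
print(middle_ids)
```

Only the users that appear in neither of the two smaller tables end up in the new table.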

Depending on what SQL dialect you use, there should be set-based operators such as UNION, INTERSECT and EXCEPT/MINUS. If you don't care about retaining duplicate values in the table, the last operator can be used to get what you want:
SELECT * FROM #Total
EXCEPT SELECT * FROM #Top5
EXCEPT SELECT * FROM #Bottom50
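The duplicate caveat can be seen in a small sketch, again using Python's sqlite3 module with made-up table contents (total, top5 and bottom50 stand in for the # tables, and the score column is hypothetical). EXCEPT returns distinct rows, so a row that occurs twice in the full table comes back only once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE total    (userid INTEGER, score INTEGER);
    CREATE TABLE top5     (userid INTEGER, score INTEGER);
    CREATE TABLE bottom50 (userid INTEGER, score INTEGER);
    -- Note the duplicated (2, 70) row in the full table.
    INSERT INTO total VALUES (1, 90), (2, 70), (2, 70), (3, 60), (4, 10);
    INSERT INTO top5 VALUES (1, 90);
    INSERT INTO bottom50 VALUES (4, 10);
""")

# Sort in Python so the printed order does not depend on the engine.
except_rows = sorted(conn.execute("""
    SELECT * FROM total
    EXCEPT SELECT * FROM top5
    EXCEPT SELECT * FROM bottom50
""").fetchall())
print(except_rows)
```

If the duplicates matter, one of the userid-based approaches (NOT EXISTS or LEFT JOIN ... IS NULL) is the safer choice.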

A simple combination of LEFT JOIN and IS NULL in the WHERE clause should do the trick and should work on many platforms:
SELECT T.*
FROM #Total T
LEFT JOIN #Top5 T5 ON T.userid = T5.userid
LEFT JOIN #Bottom50 B50 ON T.userid = B50.userid
WHERE T5.userid IS NULL
AND B50.userid IS NULL

SELECT *
FROM #Total
WHERE NOT EXISTS (SELECT TOP 1 1 FROM #Top5 t5 WHERE t5.PKey = #total.PKey)
AND NOT EXISTS (SELECT TOP 1 1 FROM #Bottom50 t50 WHERE t50.PKey = #total.PKey)

Related

How to get all non matching rows from both tables in one query?

I have two tables with similar columns and I would like to know the difference between these tables. So if all values (column-wise) of a row exist in both tables, that is fine (I do not want to see it), whereas I want to see all rows that do not match.
I have tried this:
select m.*, t.*
from test.test1 m
full outer join test.test2 t
on row(m) = row(t)
where m.date = '2022-11-01'
but I am getting all rows only from the first table. Note. I want only one query (no subqueries)
You need to add a null check on your key columns in the WHERE clause:
select m.*, t.*
from test.test1 m
full outer join test.test2 t
on row(m) = row(t)
where m.KEY is null or (t.KEY is null and m.date = '2022-11-01')
You can use the EXCEPT/EXCEPT ALL set operators to compare tables that have the same column layout (the data types and, if using SELECT *, the order of the columns must match).
SELECT 'IN TEST1 but not in TEST2' as SRC, EA.*
FROM (
SELECT *
FROM test.test1 m
where m.date='2022-11-01'
EXCEPT ALL
SELECT *
FROM test.test2
) EA
union all
SELECT 'IN TEST2 but not in TEST1' as SRC, EA.*
FROM (
SELECT *
FROM test.test2
EXCEPT ALL
SELECT *
FROM test.test1 m
where m.date='2022-11-01'
) EA
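The two-directional EXCEPT pattern can be tried out with a minimal sketch in Python's sqlite3 module. The tables test1 and test2 and their contents are invented, and plain EXCEPT is used instead of EXCEPT ALL because SQLite does not support the ALL variant, so duplicate rows would collapse here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE test1 (id INTEGER, val TEXT);
    CREATE TABLE test2 (id INTEGER, val TEXT);
    INSERT INTO test1 VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO test2 VALUES (2, 'b'), (3, 'changed'), (4, 'd');
""")

# Rows only in test1, then rows only in test2, each labelled with its source.
diff_rows = sorted(conn.execute("""
    SELECT 'in test1 only' AS src, ea.*
    FROM (SELECT * FROM test1 EXCEPT SELECT * FROM test2) ea
    UNION ALL
    SELECT 'in test2 only' AS src, eb.*
    FROM (SELECT * FROM test2 EXCEPT SELECT * FROM test1) eb
""").fetchall())

for row in diff_rows:
    print(row)
```

The row with id 3 shows up on both sides because its val differs, while the identical row with id 2 is filtered out.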

Get the latest entry time using SQL if your result returns two different times: should I use cross or outer apply?

So I want to use datediff for two tables that I'm doing a join on. The problem is if I filter by a unique value, it returns two rows of result. For example:
select *
from [internalaudit]..ReprocessTracker with (nolock)
where packageID = '1983446'
It returns two rows, because it was repackaged twice, by two different workers.
User RepackageTime
KimVilder 2021-06-10
DanielaS 2021-06-05
I want to use the latest repackagetime of that unique packageID and then do a datediff with another time record when I do a join with a different table.
Is there a way to filter so I can get the latest time entry of RepackageTime?
There are numerous ways you can accomplish this, if I understand your goal - proper example data and tables would be a help here.
One way is using apply and selecting the max date for each packageId
select DateDiff(datepart, t.datecolumn, r.RepackageTime)...
from othertable t
cross apply (
select Max(RepackageTime) as RepackageTime
from internalaudit.dbo.ReprocessTracker r
where r.packageId=t.packageId
)r
select *
from Othertable t1
join (
select *
from [internalaudit]..ReprocessTracker t2
where packageID = '1983446'
order by RepackageTime desc
limit 1
) t2
on t1.id = t2.id
If you are using SQL Server, use top 1 instead of limit 1.
Also, unless you have a solid reason to use the nolock hint, avoid it.
To generalize the query above:
select *
from Othertable t1
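-- lateral lets the subquery reference t1; SQL Server uses cross apply with top 1 instead
cross join lateral (
select *
from [internalaudit]..ReprocessTracker t2
where t1.packageID = t2.packageID
order by RepackageTime desc
limit 1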
) t2
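A portable alternative to APPLY or LATERAL is to aggregate the tracker table first and join to the result. The sketch below, in Python's sqlite3 module, assumes a hypothetical second table called Shipments with a ShippedTime column, stores dates as ISO-8601 text (so MAX picks the latest), and uses SQLite's julianday() where SQL Server would use DATEDIFF.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ReprocessTracker (packageID TEXT, Worker TEXT, RepackageTime TEXT);
    -- Shipments is a hypothetical stand-in for "the other table" in the question.
    CREATE TABLE Shipments (packageID TEXT, ShippedTime TEXT);
    INSERT INTO ReprocessTracker VALUES
        ('1983446', 'KimVilder', '2021-06-10'),
        ('1983446', 'DanielaS',  '2021-06-05');
    INSERT INTO Shipments VALUES ('1983446', '2021-06-01');
""")

# One row per packageID with its latest RepackageTime, joined to the other table.
latest_rows = conn.execute("""
    SELECT s.packageID,
           r.LatestRepackage,
           CAST(julianday(r.LatestRepackage) - julianday(s.ShippedTime) AS INTEGER) AS DaysBetween
    FROM Shipments s
    JOIN (SELECT packageID, MAX(RepackageTime) AS LatestRepackage
          FROM ReprocessTracker
          GROUP BY packageID) r
      ON r.packageID = s.packageID
""").fetchall()
print(latest_rows)
```

Because the derived table has exactly one row per packageID, the join cannot multiply rows the way the original two-row result did.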

sql - ignore duplicates while joining

I have two tables.
Table1 is 1591 rows. Table2 is 270 rows.
I want to fetch specific column data from Table2 based on some condition between them and also exclude duplicates which are in Table2. That is, I want to join the tables but get only one value from Table2 even if the condition matches more than once. The result should be exactly 1591 rows.
I tried LEFT, RIGHT and INNER joins, but the result has either more or fewer than 1591 rows.
Example
Table1
type,address,name
40,blabla,Adam
20,blablabla,Joe
Table2
type,currency
40,usd
40,gbp
40,omr
Joining on 'type'
Result
type,address,name,currency
40,blabla,Adam,usd
20,blablabla,Joe,null
Try this; the left join keeps the Table1 rows that have no match in Table2:
select *
from Table1 h
left join (
select type, currency,
ROW_NUMBER() over (partition by type order by currency) as rn
from Table2
) sr
on sr.type = h.type
and sr.rn = 1
Try this. Apart from the square-bracket quoting, it is standard SQL, so it should work on most RDBMSs.
select * from Table1 AS t
LEFT OUTER JOIN Table2 AS y ON t.[type] = y.[type] and y.currency = (SELECT MAX(currency) FROM Table2 AS m WHERE m.[type] = t.[type])
If you want to control which currency is joined, consider altering Table2 by adding a new active/inactive column and modifying the JOIN clause accordingly.
You can use outer apply if it's supported.
select a.type, a.address, a.name, b.currency
from Table1 a
outer apply (
select top 1 currency
from Table2
where Table2.type = a.type
) b
A typical way to do this uses a correlated subquery. This guarantees that all rows in the first table are kept. Without a row-limiting clause, most databases raise an error if the subquery returns more than one row, which is why the subquery below fetches only one.
So:
select t1.*,
(select t2.currency
from table2 t2
where t2.type = t1.type
fetch first 1 row only
) as currency
from table1 t1;
You don't specify what database you are using, so this uses standard syntax for returning one row. Some databases use limit or top instead.
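The correlated-subquery idea can be checked against the sample data from the question with a short sketch in Python's sqlite3 module. As an assumption for the demo, MIN(currency) replaces the row-limiting clause so that the chosen currency is deterministic; any other rule for picking one value would work the same way.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (type INTEGER, address TEXT, name TEXT);
    CREATE TABLE Table2 (type INTEGER, currency TEXT);
    INSERT INTO Table1 VALUES (40, 'blabla', 'Adam'), (20, 'blablabla', 'Joe');
    INSERT INTO Table2 VALUES (40, 'usd'), (40, 'gbp'), (40, 'omr');
""")

# The scalar subquery yields exactly one value (or NULL) per Table1 row,
# so the result always has as many rows as Table1.
joined = conn.execute("""
    SELECT t1.type, t1.address, t1.name,
           (SELECT MIN(t2.currency) FROM Table2 t2 WHERE t2.type = t1.type) AS currency
    FROM Table1 t1
    ORDER BY t1.type DESC
""").fetchall()
print(joined)
```

The row with type 20 is kept with a NULL currency (None in Python), which is the behaviour the question asks for.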

SQL Server--Is it possible to work around using a temporary table for a query that filters based on an alias case column?

I am trying to alter a base query that selects data from several joined tables, and filters out rows based on the CASE WHEN below. The result set is to be returned as follows:
If all of the rows return 0 in the CASE column, return one line with '0' in the OVERDUE column (the "return one line" portion is taken care of by DISTINCT.)
If any of the rows return 1 for the CASE column, return one line with '1' in the OVERDUE column.
The base is as follows:
SELECT DISTINCT t1.*,
CASE WHEN t3.MTemp > t3.MTempLimit
then 1
when t3.TotHours > t3.THoursLimit
then 1
else 0
end [Overdue]
from table_1 t1
LEFT JOIN table_2 t2 on t1.ResNo = t2.ResNo and t1.PCode = t2.PCode
LEFT JOIN table_3 t3 on t2.RepJobNo = t3.RepJobNo
LEFT JOIN table_4 t4 on t4.TypeID = t2.RepType
WHERE t2.RepStat = 1
The catch is, I've already created a working version of this by using a temp table and doing an IF EXISTS/ELSE query on the temp table's OVERDUE column. However, I've been informed that this solution may not be usable (due to having to go through certain front-end software).
Is it possible to do a workaround for this that does not involve using a temporary table? I've been making attempts at using both a derived table and CTEs, neither of which have yielded anything usable, due to the fact that one cannot use IF/ELSE clauses after those (which was what I was counting on).
I'm still getting the hang of T-SQL, so any help would be greatly appreciated.
Sounds like a simple ROW_NUMBER() and a couple of CTEs will work:
;WITH RS1 as (
SELECT t1.*,
CASE WHEN t3.MTemp > t3.MTempLimit
then 1
when t3.TotHours > t3.THoursLimit
then 1
else 0
end [Overdue]
from table_1 t1
LEFT JOIN table_2 t2 on t1.ResNo = t2.ResNo and t1.PCode = t2.PCode
LEFT JOIN table_3 t3 on t2.RepJobNo = t3.RepJobNo
LEFT JOIN table_4 t4 on t4.TypeID = t2.RepType
WHERE t2.RepStat = 1
), RS2 as (
select *,ROW_NUMBER() OVER (ORDER BY Overdue DESC) rn
from RS1
)
select * from RS2 where rn = 1
(There's no need for a DISTINCT now that we're only returning one row)
In general, any temporary table referenced in another query can simply be replaced by a derived table, so that this:
insert #temp
select -- definition of temptable
;
select ...
from #temp
join ...
becomes
select ...
from (
-- definition of temptable
) temp
join ...
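The temp-table-to-derived-table substitution can be sketched as follows, using Python's sqlite3 module. The orders table, its columns and the amount > 100 filter are hypothetical; the point is only that both forms return the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER, customer TEXT, amount INTEGER);
    INSERT INTO orders VALUES (1, 'ann', 50), (2, 'bob', 150), (3, 'ann', 200);
""")

# Version 1: materialize the intermediate result in a temp table, then join to it.
conn.execute("""
    CREATE TEMP TABLE big_orders AS
    SELECT id, customer FROM orders WHERE amount > 100
""")
via_temp = conn.execute("""
    SELECT o.id, o.amount
    FROM big_orders b
    JOIN orders o ON o.id = b.id
    ORDER BY o.id
""").fetchall()

# Version 2: inline the same definition as a derived table in the FROM clause.
via_derived = conn.execute("""
    SELECT o.id, o.amount
    FROM (SELECT id, customer FROM orders WHERE amount > 100) b
    JOIN orders o ON o.id = b.id
    ORDER BY o.id
""").fetchall()

print(via_temp, via_derived)
```

The second form is a single statement, which matters when the calling software cannot run a multi-statement batch.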

An issue possibly related to Cursor/Join

Here is my situation:
Table one contains a set of data that uses an id as a unique identifier. This table has a one-to-many relationship with about 6 other tables, such that:
Given Table 1 with Id of 001:
Table 2 might have 3 rows with foreign key: 001
Table 3 might have 12 rows with foreign key: 001
Table 4 might have 0 rows with foreign key: 001
Table 5 might have 28 rows with foreign key: 001
I need to write a report that lists all of the rows from Table 1 for a specified time frame followed by all of the data contained in the handful of tables that reference it.
My current approach in pseudo code would look like this:
select * from table 1
foreach(result) {
print result;
select * from table 2 where id = result.id;
foreach(result2) {
print result2;
}
select * from table 3 where id = result.id
foreach(result3) {
print result3;
}
//continued for each table
}
This means that a single report can run in the neighborhood of 1000 queries. I know this is excessive; however, my sql-fu is a little weak and I could use some help.
LEFT OUTER JOIN Tables 2 through N onto Table1:
SELECT Table1.*, Table2.*, Table3.*, Table4.*, Table5.*
FROM Table1
LEFT OUTER JOIN Table2 ON Table1.ID = Table2.ID
LEFT OUTER JOIN Table3 ON Table1.ID = Table3.ID
LEFT OUTER JOIN Table4 ON Table1.ID = Table4.ID
LEFT OUTER JOIN Table5 ON Table1.ID = Table5.ID
WHERE (CRITERIA)
Join doesn't do it for me. I hate having to de-tangle the data on the client side. All those nulls from left-joining.
Here's a set-based solution that doesn't use Joins.
INSERT INTO #LocalCollection (theKey)
SELECT id
FROM Table1
WHERE ...
SELECT * FROM Table1 WHERE id in (SELECT theKey FROM #LocalCollection)
SELECT * FROM Table2 WHERE id in (SELECT theKey FROM #LocalCollection)
SELECT * FROM Table3 WHERE id in (SELECT theKey FROM #LocalCollection)
SELECT * FROM Table4 WHERE id in (SELECT theKey FROM #LocalCollection)
SELECT * FROM Table5 WHERE id in (SELECT theKey FROM #LocalCollection)
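A minimal sketch of this key-collection approach, using Python's sqlite3 module: the keys are collected once, and then one query runs per table instead of one per parent row. The column names (created, detail), the dates and the three-table layout are assumptions for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (id INTEGER PRIMARY KEY, created TEXT);
    CREATE TABLE Table2 (id INTEGER, detail TEXT);
    CREATE TABLE Table3 (id INTEGER, detail TEXT);
    INSERT INTO Table1 VALUES (1, '2008-12-30'), (2, '2008-12-31'), (3, '2009-02-01');
    INSERT INTO Table2 VALUES (1, 'a'), (1, 'b'), (3, 'c');
    INSERT INTO Table3 VALUES (2, 'x'), (3, 'y');
    CREATE TEMP TABLE LocalCollection (theKey INTEGER PRIMARY KEY);
""")

# Step 1: collect the keys of the parent rows in the requested time frame.
conn.execute(
    "INSERT INTO LocalCollection (theKey) "
    "SELECT id FROM Table1 WHERE created BETWEEN ? AND ?",
    ("2008-12-30", "2008-12-31"),
)

# Step 2: one query per table, no matter how many parent rows matched.
results = {}
for table in ("Table1", "Table2", "Table3"):
    # Table names come from a fixed tuple, so formatting them into the SQL is safe here.
    results[table] = conn.execute(
        f"SELECT * FROM {table} WHERE id IN (SELECT theKey FROM LocalCollection) ORDER BY 1, 2"
    ).fetchall()

for table, rows in results.items():
    print(table, rows)
```

With six child tables this is seven queries in total, compared with roughly one per parent row per table in the loop-based version.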
Ah! Procedural! My SQL would look like this, if you needed to order the results from the other tables after the results from the first table.
Insert Into #rows Select id from Table1 where date between '12/30' and '12/31'
Select * from Table1 t join #rows r on t.id = r.id
Select * from Table2 t join #rows r on t.id = r.id
--etc
If you wanted to group the results by the initial ID, use a Left Outer Join, as mentioned previously.
You may be best off using a reporting tool like Crystal or Jasper, or even XSL-FO if you are feeling bold. They have features built in to handle specifically this. This is not something that would work well in raw SQL.
If the format of all of the rows (the headers as well as all of the details) is the same, it would also be pretty easy to do it as a stored procedure.
What I would do: Do it as a join, so you will have the header data on every row, then use a reporting tool to do the grouping.
SELECT * FROM table1 t1
INNER JOIN table2 t2 ON t1.id = t2.resultid -- this could be a left join if the table is not guaranteed to have entries for t1.id
INNER JOIN table3 t3 ON t1.id = t3.resultid -- etc
Or, if the data is all in the same format, you could do:
SELECT cola,colb FROM table1 WHERE id = @id
UNION ALL
SELECT cola,colb FROM table2 WHERE resultid = @id
UNION ALL
SELECT cola,colb FROM table3 WHERE resultid = @id
It really depends on the format you require the data in for output to the report.
If you can give a sample of how you would like the output I could probably help more.
Join all of the tables together.
select * from table_1 left join table_2 using(id) left join table_3 using(id);
Then, you'll want to roll up the columns in code to format your report as you see fit.
What I would do is open up cursors on the following queries:
SELECT * from table1 order by id
SELECT * from table1 r, table2 t where t.table1_id = r.id order by r.id
SELECT * from table1 r, table3 t where t.table1_id = r.id order by r.id
And then I would walk those cursors in parallel, printing your results. You can do this because all appear in the same order. (Note that I would suggest that while the primary ID for table1 might be named id, it won't have that name in the other tables.)
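Walking the cursors in parallel might look like the sketch below, written in Python with the sqlite3 module. The table contents, the note column and the DetailStream helper class are all hypothetical; the only property relied on is that every query is ordered by the table1 id.

```python
import sqlite3
from itertools import groupby

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE table2 (table1_id INTEGER, note TEXT);
    CREATE TABLE table3 (table1_id INTEGER, note TEXT);
    INSERT INTO table1 VALUES (1, 'first'), (2, 'second');
    INSERT INTO table2 VALUES (1, 't2-a'), (1, 't2-b');
    INSERT INTO table3 VALUES (2, 't3-a');
""")

def grouped(sql):
    """Yield (table1 id, detail rows) pairs from a query ordered by that id."""
    for key, rows in groupby(conn.execute(sql), key=lambda row: row[0]):
        yield key, list(rows)

class DetailStream:
    """Hypothetical helper: hands out the detail rows for each parent id in order."""
    def __init__(self, sql):
        self._groups = grouped(sql)
        self._pending = next(self._groups, None)

    def take(self, wanted_id):
        # Skip any groups that belong to ids we have already passed.
        while self._pending is not None and self._pending[0] < wanted_id:
            self._pending = next(self._groups, None)
        if self._pending is not None and self._pending[0] == wanted_id:
            rows = self._pending[1]
            self._pending = next(self._groups, None)
            return rows
        return []

streams = [
    DetailStream("SELECT t.table1_id, t.note FROM table1 r, table2 t "
                 "WHERE t.table1_id = r.id ORDER BY r.id, t.note"),
    DetailStream("SELECT t.table1_id, t.note FROM table1 r, table3 t "
                 "WHERE t.table1_id = r.id ORDER BY r.id, t.note"),
]

report = []
for parent in conn.execute("SELECT * FROM table1 ORDER BY id"):
    report.append(parent)                 # header row
    for stream in streams:
        report.extend(stream.take(parent[0]))  # its detail rows, table by table

for line in report:
    print(line)
```

Each query is executed once and read once, so the total number of queries equals the number of tables rather than growing with the number of parent rows.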
Do all the tables have the same format? If not, then you have to have a report that can display the n different types of rows. If you are only interested in the same columns, then it is easier.
Most databases have some form of dynamic SQL. In that case you can do the following:
create temporary table from
select * from table1 where rows within time frame
x integer
sql varchar(something)
x = 1
while x <= numresults {
sql = 'SELECT * from table' + CAST(x as varchar) + ' where id in (select id from temporary table)'
execute sql
x = x + 1
}
But I mean basically here you are running one query on your main table to get the rows that you need, then running one query for each sub table to get rows that match your main table.
If the report requires the same 2 or 3 columns for each table you could change the select * from tablex to be an insert into and get a single result set at the end...