I haven't been able to find an existing question that answers this. What is the best way to apply multiple conditions to a query in order to exclude certain rows: CASE or Boolean logic?
Example code:
SELECT test,
testTypeID,
visitType,
submitted
FROM vReport
WHERE (vReport.submitted = 0 OR vReport.submitted IS NULL)
AND vReport.test IN ('Test 1','Test 2','Test 3')
How should I write it so that it returns all rows for Tests 1, 2, and 3 while excluding rows for certain visit types (i.e. exclude a row ONLY if it is Test 3 AND the visit is Week 26 AND it has a certain testTypeID)?
I'm not sure what your column names and data types are (I'm assuming a visitWeek column of type INT) or which testTypeID values you want to filter by, but here is the logic for it:
SELECT test,
testTypeID,
visitType,
submitted
FROM vReport
WHERE (vReport.submitted = 0 OR vReport.submitted IS NULL)
AND vReport.test IN ('Test 1','Test 2','Test 3')
AND NOT (vReport.test = 'Test 3' AND vReport.testTypeID IN (some value) AND vReport.visitWeek = 26)
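The "exclude a row only when all three conditions hold" rule can be tried out with a small sketch. It runs against SQLite through Python's sqlite3 module so that no server is needed. The sample rows, the testTypeID value 5, and the use of visitType = 'Week 26' are made up for illustration; the real columns and values may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE vReport (test TEXT, testTypeID INTEGER, visitType TEXT, submitted INTEGER)"
)
conn.executemany(
    "INSERT INTO vReport VALUES (?, ?, ?, ?)",
    [
        ("Test 1", 5, "Week 26", 0),     # kept: not Test 3
        ("Test 3", 5, "Week 26", None),  # excluded: all three conditions match
        ("Test 3", 5, "Week 12", 0),     # kept: different visit
        ("Test 3", 7, "Week 26", 0),     # kept: different testTypeID
        ("Test 2", 5, "Week 26", 1),     # dropped: already submitted
    ],
)

rows = conn.execute(
    """
    SELECT test, testTypeID, visitType
    FROM vReport
    WHERE (submitted = 0 OR submitted IS NULL)
      AND test IN ('Test 1', 'Test 2', 'Test 3')
      -- NOT (A AND B AND C) removes a row only when A, B and C all hold
      AND NOT (test = 'Test 3' AND testTypeID = 5 AND visitType = 'Week 26')
    ORDER BY test, testTypeID, visitType
    """
).fetchall()
print(rows)
```

If testTypeID or visitType can be NULL, the NOT (...) expression can evaluate to unknown and silently drop the row, so nullable columns may need COALESCE or an explicit IS NULL branch.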
If you can define your exclusions homogeneously, you can store them in another table. Something like:
ExcludedTest
excludedTestId
test
visitType
testTypeId
and your query can be done like this:
SELECT test,
testTypeID,
visitType,
submitted
FROM vReport VR
WHERE (VR.submitted = 0 OR VR.submitted IS NULL)
AND VR.test IN ('Test 1','Test 2','Test 3')
AND NOT EXISTS ( SELECT 1 FROM ExcludedTest ET
WHERE ET.testTypeID = VR.testTypeID
AND ET.visitType = VR.visitType
AND ET.test = VR.test)
Also, performance should improve if you can get rid of that OR. One way to do this is to declare submitted as NOT NULL with DEFAULT(0); then the single condition vReport.submitted = 0 is enough.
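A minimal sketch of the exclusion-table idea, again using SQLite via Python's sqlite3 module. The sample data is invented, and submitted is declared NOT NULL DEFAULT 0 so that the OR is not needed; treat the schema as an assumption rather than the real one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE vReport (test TEXT, testTypeID INTEGER, visitType TEXT,
                          submitted INTEGER NOT NULL DEFAULT 0);
    CREATE TABLE ExcludedTest (excludedTestId INTEGER PRIMARY KEY,
                               test TEXT, visitType TEXT, testTypeID INTEGER);
    INSERT INTO vReport (test, testTypeID, visitType) VALUES
        ('Test 1', 5, 'Week 26'),
        ('Test 3', 5, 'Week 26'),
        ('Test 3', 5, 'Week 12');
    -- one row per combination that must be hidden from the report
    INSERT INTO ExcludedTest (test, visitType, testTypeID) VALUES
        ('Test 3', 'Week 26', 5);
    """
)

rows = conn.execute(
    """
    SELECT VR.test, VR.testTypeID, VR.visitType
    FROM vReport AS VR
    WHERE VR.submitted = 0
      AND VR.test IN ('Test 1', 'Test 2', 'Test 3')
      AND NOT EXISTS (SELECT 1
                      FROM ExcludedTest AS ET
                      WHERE ET.testTypeID = VR.testTypeID
                        AND ET.visitType = VR.visitType
                        AND ET.test = VR.test)
    ORDER BY VR.test, VR.visitType
    """
).fetchall()
print(rows)
```

Adding or removing an exclusion then becomes a data change in ExcludedTest rather than a change to the query text.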
I have a database containing events which have a "time" (an integer) plus some other attributes.
E.g.
CREATE TABLE events (time, attr1, attr2);
INSERT INTO events VALUES (1, 'a', 'foo');
INSERT INTO events VALUES (2, 'b', 'bar');
INSERT INTO events VALUES (4, 'a', 'baz');
INSERT INTO events VALUES (9, 'b', 'quux');
INSERT INTO events VALUES (10, 'c', 'foobar');
Now I want to do a somewhat complicated query: I want to find all events which have the property that the next event in the table satisfies some condition. For instance, I might want to find all events that satisfy all these conditions:
attr1 == 'a'
the next event (as determined by the time field) has attr2 == 'bar'
This should return the event at time 1, but not the event at time 4. Or a more complicated example would be: find all events that satisfy
attr1 == 'a'
the next event for which attr1 == 'c' has attr2 == 'foobar'
This would return both the events at times 1 and 4.
It seems like this ought to be possible via some sort of complicated nested select, but I haven't managed to work out how.
Other notes:
I'm using sqlite.
Events are irregularly spaced, so strategies that involve computing the position of the 'next' event won't work.
I know these queries are going to be murder on the query optimizer, that's okay.
I know how to do this by doing multiple selects + non-SQL logic, but I'd much rather do it using pure SQL, because this is embedded in a larger query generation system. I need to be able to generate queries of this form in general, conjoined with other constraints, etc., it's not just a single query I'll write once and be done with.
You can find a record that is the next after some specific time by combining ORDER BY and LIMIT:
SELECT *
FROM events
WHERE time > 1
ORDER BY time
LIMIT 1
By using this in a subquery, you can look up values from the next record.
Your first query can be implemented like this:
SELECT *
FROM events AS e2
WHERE attr1 = 'a'
AND (SELECT attr2
FROM events
WHERE time > e2.time
ORDER BY time
LIMIT 1) = 'bar'
Your second query can be implemented like this (the additional condition belongs in the WHERE clause of the subquery):
SELECT *
FROM events AS e2
WHERE attr1 = 'a'
AND (SELECT attr2
FROM events
WHERE attr1 = 'c'
AND time > e2.time
ORDER BY time
LIMIT 1) = 'foobar'
The subquery lookups can be made faster with an index on the time column.
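Since the question mentions a query generation system, here is a sketch of how the two subquery forms might be wrapped in one parameterized helper using Python's sqlite3 module. The helper name events_followed_by, its parameters, and the index name events_time are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE events (time, attr1, attr2);
    INSERT INTO events VALUES (1, 'a', 'foo');
    INSERT INTO events VALUES (2, 'b', 'bar');
    INSERT INTO events VALUES (4, 'a', 'baz');
    INSERT INTO events VALUES (9, 'b', 'quux');
    INSERT INTO events VALUES (10, 'c', 'foobar');
    CREATE INDEX events_time ON events (time);
    """
)

def events_followed_by(conn, attr1, next_attr2, next_attr1=None):
    """Times of events with the given attr1 whose next event (optionally:
    the next event having attr1 = next_attr1) has attr2 = next_attr2."""
    sql = """
        SELECT e.time
        FROM events AS e
        WHERE e.attr1 = :attr1
          AND (SELECT n.attr2
               FROM events AS n
               WHERE n.time > e.time
                 AND (:next_attr1 IS NULL OR n.attr1 = :next_attr1)
               ORDER BY n.time
               LIMIT 1) = :next_attr2
        ORDER BY e.time
    """
    params = {"attr1": attr1, "next_attr1": next_attr1, "next_attr2": next_attr2}
    return [row[0] for row in conn.execute(sql, params)]

first = events_followed_by(conn, "a", "bar")
second = events_followed_by(conn, "a", "foobar", next_attr1="c")
print(first, second)
```

When there is no later event, the scalar subquery yields NULL, the comparison is unknown, and the row is simply not returned.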
select * from events a
where exists
(
select * from events c where c.time =
(select min(b.time) from events b where b.time > a.time)--next_event
and c.attr2 = 'bar'
)
and a.attr1 = 'a'
should be your first query. It returns time 1.
http://sqlfiddle.com/#!2/63baf/12
the second could be :
select * from events a
where exists
(
select * from events c where c.time =
(select min(b.time) from events b where b.time > a.time and attr1 = 'c')
and c.attr2 = 'foobar'
)
and a.attr1 = 'a'
It returns times 1 and 4, since both of these rows satisfy your conditions.
http://sqlfiddle.com/#!2/63baf/15
hope this helps
Nicolas
Given an example table such as:
CREATE TABLE TESTING_Order(
[Order] INT,
Name VARCHAR(5)
)
INSERT INTO TESTING_Order
VALUES
(0, 'Zero'),
(1, 'One'),
(2, 'Two'),
(3, 'Three'),
(4, 'Four'),
(5, 'Five'),
(6, 'Six'),
(7, 'Seven')
I'd like to know how to implement a sort of 'Stack' to show, for instance, the turn order in a game where the first player moves 'left' each turn. I can accomplish this using two update statements:
UPDATE TESTING_Order
SET [Order] = [Order] + 1
UPDATE TESTING_Order
SET [Order] = (SELECT MIN([Order]) - 1 FROM TESTING_Order)
WHERE [Order] = (SELECT MAX([Order]) FROM TESTING_Order)
I was wondering if there is a cleaner/more proper way of doing this, and especially if that other way can be done using a single UPDATE statement.
In other words, I think what I'm after is a better implementation of a LIFO Stack that performs a push on every pop--excuse me for possibly butchering any terminology.
Just so that there is an answer here:
To cycle through the list I've decided to use the following (the column is named nOrder here; ORDER is a reserved word):
UPDATE TESTING_Order
SET nOrder =
CASE
WHEN (nOrder - 1) < 0 THEN (SELECT Count(*) FROM TESTING_Order) - 1
WHEN (nOrder - 1) >= 0 THEN (nOrder - 1) % (SELECT MAX(nOrder) + 1 FROM TESTING_Order)
END
This allows me to move the list in the correct way (which I realize the code in my question probably did not) and has the benefit of doing what I requested in just one statement. Thanks for pointing me in the right direction, Wiseguy.
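A simplified variant of the single-statement rotation, sketched with SQLite via Python's sqlite3 module. It swaps the CASE for plain modular arithmetic and assumes the positions are contiguous from 0 to n-1. COUNT(*) does not change while the UPDATE runs, so the result should not depend on when the subquery is evaluated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TESTING_Order (nOrder INTEGER, Name TEXT)")
names = ["Zero", "One", "Two", "Three", "Four", "Five", "Six", "Seven"]
conn.executemany("INSERT INTO TESTING_Order VALUES (?, ?)", list(enumerate(names)))

# Every position moves down by one; position 0 wraps around to n - 1.
ROTATE_LEFT = """
    UPDATE TESTING_Order
    SET nOrder = (nOrder + (SELECT COUNT(*) FROM TESTING_Order) - 1)
                 % (SELECT COUNT(*) FROM TESTING_Order)
"""

conn.execute(ROTATE_LEFT)
turn_order = [
    name for (name,) in conn.execute("SELECT Name FROM TESTING_Order ORDER BY nOrder")
]
print(turn_order)
```

Adding n before taking the remainder keeps the left operand non-negative, which avoids relying on how a particular database handles the sign of a negative modulo.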
Why does filtering for NULL in an IN list not work?
I hoped to get the correct result by adding NULL to the list of allowed values, for example:
SELECT ERP_ServiceProcess.fiStatusNew, RMA.IdRMA
FROM ERP_ServiceProcess RIGHT OUTER JOIN
RMA ON ERP_ServiceProcess.fiRMA = RMA.IdRMA
WHERE (ERP_ServiceProcess.fiStatusNew IN (NULL, 1, 7, 8))
order by ERP_ServiceProcess.fiStatusNew
This gives the incorrect result because all records in RMA that have no records in sub-table ERP_ServiceProcess(where ERP_ServiceProcess.fiStatusNew IS NULL) are dropped.
I must use this (slow) query to get the correct result:
SELECT ERP_ServiceProcess.fiStatusNew, RMA.IdRMA
FROM ERP_ServiceProcess RIGHT OUTER JOIN
RMA ON ERP_ServiceProcess.fiRMA = RMA.IdRMA
WHERE (ERP_ServiceProcess.fiStatusNew IS NULL)
OR (ERP_ServiceProcess.fiStatusNew IN (1, 7, 8))
order by ERP_ServiceProcess.fiStatusNew
Why do I have to use the second, slow query although I used RIGHT OUTER JOIN and added NULL to the IN list?
Thank you in advance.
It doesn't work as you expect because the IN list gets expanded to a series of equality comparisons
fiStatusNew = NULL OR fiStatusNew = 1 OR fiStatusNew = 7 OR fiStatusNew = 8
and anything = NULL is unknown.
Given this expansion, there's no particular reason to think that adding an extra OR ... IS NULL predicate would make things slower on its own (although the additional predicate might change the query plan to use a different access path if the statistics lead the optimizer to believe that the number of matching rows warrants it).
You see the same behaviour in the CASE operation
SELECT CASE NULL WHEN NULL THEN 'Yes' ELSE 'No' END /*Returns "No"*/
This is one reason why you should take particular care with the inverse operation NOT IN. If the list contains any NULL values you will always get an empty result set.
fiStatusNew NOT IN (NULL, 1,2)
Would expand to
fiStatusNew<> NULL and fiStatusNew<> 1 and fiStatusNew<> 2
or
Unknown AND True/False/Unknown AND True/False/Unknown
which can only evaluate to Unknown or False under three-valued logic, never to True.
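The three-valued behaviour can be observed directly. This sketch uses SQLite through Python's sqlite3 module (not SQL Server), where SQL NULL comes back as Python None; the table t and its values are invented for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def scalar(expr):
    """Evaluate a single SQL expression; SQL NULL is returned as None."""
    return conn.execute("SELECT " + expr).fetchone()[0]

in_hit = scalar("1 IN (NULL, 1, 7, 8)")        # a real match is still true
in_null = scalar("NULL IN (NULL, 1, 7, 8)")    # NULL = NULL is unknown
not_in_miss = scalar("2 NOT IN (NULL, 1)")     # unknown, not true
not_in_hit = scalar("1 NOT IN (NULL, 1)")      # definitely false
case_null = scalar("CASE NULL WHEN NULL THEN 'Yes' ELSE 'No' END")

conn.execute("CREATE TABLE t (status INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(None,), (1,), (2,)])

def statuses(where):
    """Return the status values of the rows that pass the WHERE clause."""
    sql = "SELECT status FROM t WHERE " + where + " ORDER BY status"
    return [row[0] for row in conn.execute(sql)]

with_in = statuses("status IN (NULL, 1)")
with_not_in = statuses("status NOT IN (NULL, 1)")
with_is_null = statuses("status IS NULL OR status IN (1)")
print(with_in, with_not_in, with_is_null)
```

Only the explicit IS NULL branch brings the NULL row back, which is why the second query in the question is the one that returns the expected rows.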
Could you try using
ISNULL(ERP_ServiceProcess.fiStatusNew,0) IN (0, 1, 7, 8)
Untested, but it might be quicker than the second query. Note that it assumes 0 is never a real fiStatusNew value.
'ERP_ServiceProcess.fiStatusNew IN (NULL)' evaluates to 'ERP_ServiceProcess.fiStatusNew = NULL', and that is never true. NULL is defined in SQL Server as 'unknown', not as 'no value'. That's why NULL = NULL or NULL = @var (*) evaluates to unknown, which a WHERE clause treats like false. If you have two unknowns, you cannot check whether they are equal. Only IS NULL works.
(*) Well, for SQL Server, you can set ANSI_NULLS to off, but that's not really recommended as it is not standard SQL behaviour.
MySQL provides a string function named FIELD() which accepts a variable number of arguments. The return value is the location of the first argument in the list of the remaining ones. In other words:
FIELD('d', 'a', 'b', 'c', 'd', 'e', 'f')
would return 4 since 'd' is the fourth argument following the first.
This function provides the capability to sort a query's results based on a very specific ordering. For my current application there are four statuses that I need to manage: active, approved, rejected, and submitted. However, if I simply order by the status column, I feel the usability of the resulting list is lessened, since rejected and active items are more important than submitted and approved ones.
In MySQL I could do this:
SELECT <stuff> FROM <table> WHERE <conditions> ORDER BY FIELD(status, 'rejected', 'active','submitted', 'approved')
and the results would be ordered such that rejected items were first, followed by active ones, and so on. Thus, the results were ordered in decreasing levels of importance to the visitor.
I could create a separate table which enumerates this importance level for the statuses and then order the query by that in descending order, but this has come up for me a few times since switching to MS SQL Server so I thought I'd inquire as to whether or not I could avoid the extra table and the somewhat more complex queries using a built-in function similar to MySQL's FIELD().
Thank you,
David Kees
Use a CASE expression (SQL Server 2005+):
ORDER BY CASE status
WHEN 'rejected' THEN 1
WHEN 'active' THEN 2
WHEN 'submitted' THEN 3
WHEN 'approved' THEN 4
ELSE 5
END
You can use this syntax for more complex evaluation (including combinations, or if you need to use LIKE)
ORDER BY CASE
WHEN status LIKE 'rejected' THEN 1
WHEN status LIKE 'active' THEN 2
WHEN status LIKE 'submitted' THEN 3
WHEN status LIKE 'approved' THEN 4
ELSE 5
END
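A runnable sketch of CASE-based ordering, using SQLite via Python's sqlite3 module and the priority order from the question (rejected, active, submitted, approved). The items table and the extra 'archived' status are hypothetical and only show where unlisted values end up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany(
    "INSERT INTO items (status) VALUES (?)",
    [("approved",), ("active",), ("submitted",), ("rejected",), ("archived",)],
)

ordered = [
    status
    for (status,) in conn.execute(
        """
        SELECT status
        FROM items
        ORDER BY CASE status
                     WHEN 'rejected'  THEN 1
                     WHEN 'active'    THEN 2
                     WHEN 'submitted' THEN 3
                     WHEN 'approved'  THEN 4
                     ELSE 5            -- anything unlisted sorts last
                 END,
                 id                    -- stable tie-breaker within a status
        """
    )
]
print(ordered)
```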
For your particular example you could:
ORDER BY CHARINDEX(
',' + status + ',',
',rejected,active,submitted,approved,'
)
Note that FIELD is supposed to return 0, 1, 2, 3, 4, whereas the above will return 0, 1, 10, 17 and 27, so this trick is only useful inside the ORDER BY clause.
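The positions can be checked with SQLite's instr function, which plays the role of CHARINDEX here but takes its arguments in the opposite order (haystack first). This is a sketch via Python's sqlite3 module; the 'inactive' status is a made-up value used only to show why the surrounding commas matter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
STATUS_LIST = ",rejected,active,submitted,approved,"

def position(status, status_list=STATUS_LIST):
    """1-based position of ',status,' inside the list, or 0 when absent."""
    sql = "SELECT instr(?, ',' || ? || ',')"
    return conn.execute(sql, (status_list, status)).fetchone()[0]

positions = {
    s: position(s)
    for s in ["rejected", "active", "submitted", "approved", "unknown"]
}
print(positions)

# Without the delimiters, 'active' is found inside 'inactive'.
bare = conn.execute("SELECT instr('inactive,active', 'active')").fetchone()[0]
delimited = position("active", ",inactive,active,")
print(bare, delimited)
```

A status that is missing from the list gets 0 and therefore sorts before everything else, which may or may not be the desired placement.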
A set-based approach would be to outer join with a table value constructor (SQL Server 2008+):
LEFT JOIN (VALUES
('rejected', 1),
('active', 2),
('submitted', 3),
('approved', 4)
) AS lu(status, sort_order)
...
ORDER BY lu.sort_order
I recommend a CTE (SQL Server 2005+).
There is no need to repeat the status codes or to create a separate table.
WITH cte(status, RN) AS ( -- CTE to create ordered list and define where clause
SELECT 'rejected', 1
UNION SELECT 'active', 2
UNION SELECT 'submitted', 3
UNION SELECT 'approved', 4
)
SELECT <field1>, <field2>
FROM <table> tbl
INNER JOIN cte ON cte.status = tbl.status -- do the join
ORDER BY cte.RN -- use the ordering defined in the cte
Good luck,
Jason
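The join-based variants can be sketched in SQLite as well, here with the lookup rows supplied by VALUES inside a CTE and run through Python's sqlite3 module. The items table and the 'archived' status are hypothetical. A LEFT JOIN is used so that rows with an unlisted status are kept; an INNER JOIN would drop them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany(
    "INSERT INTO items (status) VALUES (?)",
    [("approved",), ("active",), ("submitted",), ("rejected",), ("archived",)],
)

ordered = [
    status
    for (status,) in conn.execute(
        """
        WITH lu(status, sort_order) AS (
            VALUES ('rejected', 1), ('active', 2), ('submitted', 3), ('approved', 4)
        )
        SELECT i.status
        FROM items AS i
        LEFT JOIN lu ON lu.status = i.status
        -- unmatched rows have a NULL sort_order; push them to the end
        ORDER BY lu.sort_order IS NULL, lu.sort_order, i.id
        """
    )
]
print(ordered)
```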
ORDER BY CHARINDEX(','+convert(varchar,status)+',' ,
',rejected,active,submitted,approved,')
Put a comma before and after the string you are searching in (the second parameter of CHARINDEX), and surround the first parameter with commas as well, so that only whole status names are matched.
I have two tables, both with start time and end time fields. I need to find, for each row in the first table, all of the rows in the second table where the time intervals intersect.
For example:
<-----row 1 interval------->
<---find this--> <--and this--> <--and this-->
Please phrase your answer in the form of a SQL WHERE-clause, AND consider the case where the end time in the second table may be NULL.
Target platform is SQL Server 2005, but solutions from other platforms may be of interest also.
SELECT *
FROM table1,table2
WHERE table2.start <= table1.[end]
AND (table2.[end] IS NULL OR table2.[end] >= table1.start)
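A small sketch that exercises this predicate, including open-ended rows, in SQLite via Python's sqlite3 module. The columns are named start_t and end_t to stay clear of reserved words, and integer times are used for brevity; both choices are assumptions made for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table1 (id INTEGER, start_t INTEGER, end_t INTEGER);
    CREATE TABLE table2 (id INTEGER, start_t INTEGER, end_t INTEGER);
    INSERT INTO table1 VALUES (1, 10, 20);
    INSERT INTO table2 VALUES
        (1,  5, 12),    -- overlaps the start of row 1
        (2, 12, 15),    -- lies inside row 1
        (3, 18, 25),    -- overlaps the end of row 1
        (4,  1,  9),    -- entirely before
        (5, 21, 30),    -- entirely after
        (6, 15, NULL),  -- open-ended, starts inside
        (7, 25, NULL),  -- open-ended, starts after
        (8,  5, NULL);  -- open-ended, started before
    """
)

matches = [
    row[0]
    for row in conn.execute(
        """
        SELECT table2.id
        FROM table1
        JOIN table2
          ON table2.start_t <= table1.end_t
         AND (table2.end_t IS NULL OR table2.end_t >= table1.start_t)
        ORDER BY table2.id
        """
    )
]
print(matches)
```

A NULL end is treated as "still running", so an open-ended row matches whenever it starts no later than the end of the first interval.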
It sounds very complicated until you start working backwards.
Below I illustrate ONLY THE GOOD CASES (no overlap). Two ranges do not overlap if CondA OR CondB is TRUE, so to find the overlaps we negate that:
NOT CondA AND NOT CondB, which in our case just means flipping the comparisons (> becomes <=).
/*
|--------|                        A  \___ CondA: b.ddS > a.ddE
           |=========|            B  /  \____ CondB: a.ddS > b.ddE
                       |+++++++++| A    /
*/
--DROP TABLE ran
create table ran ( mem_nbr int, ID int, ddS date, ddE date)
insert ran values
(100, 1, '2012-1-1','2012-12-30'), ----\ ovl
(100, 11, '2012-12-12','2012-12-24'), ----/
(100, 2, '2012-12-31','2014-1-1'),
(100, 3, '2014-5-1','2014-12-14') ,
(220, 1, '2015-5-5','2015-12-14') , ---\ovl
(220, 22, '2014-4-1','2015-5-25') , ---/
(220, 3, '2016-6-1','2016-12-16')
select DISTINCT a.mem_nbr , a.* , '-' [ ], b.dds, b.dde, b.id
FROM ran a
join ran b on a.mem_nbr = b.mem_nbr -- match by mem#
AND a.ID <> b.ID -- itself
AND b.ddS <= a.ddE -- NOT b.ddS > a.ddE
AND a.ddS <= b.ddE -- NOT a.ddS > b.ddE
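The negation step can be sanity-checked by brute force. This Python sketch enumerates small closed integer intervals and compares the join condition with both the negated "no overlap" test and a direct "the two ranges share at least one point" test; the function names are illustrative only.

```python
from itertools import product

def disjoint(a, b):
    """CondA OR CondB: one range starts after the other has ended."""
    (a_s, a_e), (b_s, b_e) = a, b
    return b_s > a_e or a_s > b_e

def overlaps(a, b):
    """NOT CondA AND NOT CondB, written with the comparisons flipped."""
    (a_s, a_e), (b_s, b_e) = a, b
    return b_s <= a_e and a_s <= b_e

def share_a_point(a, b):
    """Independent check: latest start is not after earliest end."""
    return max(a[0], b[0]) <= min(a[1], b[1])

# All closed intervals [s, e] with 0 <= s <= e <= 5.
intervals = [(s, e) for s, e in product(range(6), repeat=2) if s <= e]

mismatches = [
    (a, b)
    for a, b in product(intervals, repeat=2)
    if overlaps(a, b) != (not disjoint(a, b)) or overlaps(a, b) != share_a_point(a, b)
]
print(len(intervals), mismatches)
```

The check treats ranges as closed, so two ranges that merely touch at an endpoint count as overlapping, matching the <= comparisons in the join.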
"solutions from other platforms may be of interest also."
The SQL standard defines an OVERLAPS predicate:
Specify a test for an overlap between two events.
<overlaps predicate> ::=
<row value constructor 1> OVERLAPS <row value constructor 2>
Example (PostgreSQL syntax; SQL Server does not support OVERLAPS):
SELECT 1
WHERE ('2020-03-01'::DATE, '2020-04-15'::DATE) OVERLAPS
('2020-02-01'::DATE, '2020-03-15'::DATE)
-- 1
select * from table_1
right join
table_2 on
(
table_1.start between table_2.start and table_2.[end]
or
table_1.[end] between table_2.start and table_2.[end]
or
(table_1.[end] > table_2.start and table_2.[end] is null)
)
EDIT: OK, don't go for my solution, it performs very poorly. The "where" solution is 14x faster. Oops... It also misses table_2 intervals that lie entirely inside a table_1 interval.
Some statistics: running on a db with ~ 65000 records for both table 1 and 2 (no indexing), having intervals of 2 days between start and end for each row, running for 2 minutes in SQLSMSE (don't have the patience to wait)
Using join: 8356 rows in 2 minutes
Using where: 115436 rows in 2 minutes
And what if you want to analyse such overlaps at minute precision with 70m+ rows?
The only solution I could come up with myself was a time dimension table for the join;
otherwise the duplicate handling became a headache and the processing costs were astronomical.