Update Statement SQL doesn't update every record - sql

Question
I've been trying to solve this question for quite a while now. But I'm not getting any closer to fixing it. I select a group of people eligible for getting a contract renewal. Now I want to update everyone with a specific code, but some of the records are staying blank.
What I've tried
These are the queries I'm using. First for selecting the records:
INSERT INTO SELECTION (CLIENDTID, CREATED_DT, FIRSTNAME, MIDDLENAME, LASTNAME, EMAIL, CONTRACTEND_DATE, PRODUCT, MOBILE,TELEPHONE, STREET, HOUSENUMBER, ADDITIVE, POSTALCODE, CITY)
SELECT CLIENDTID, GETDATE(),FIRSTNAME, MIDDLENAME, LASTNAME, EMAIL, CONTRACTEND_DATE, PRODUCT, MOBILE,TELEPHONE, STREET, HOUSENUMBER, ADDITIVE, POSTALCODE, CITY
FROM CONTRACTS C (NOLOCK)
INNER JOIN OPTINS O (NOLOCK) ON O.CLIENTID = C.CLIENTID
INNER JOIN HISTORY HIS(NOLOCK) ON HIS.CLIENTID = C.CLIENTID
WHERE
(
((DATEDIFF(DD,CURRENT_TIMESTAMP, CONTRACTEND_DATE) BETWEEN 26 AND 28)
AND
(O.MAIL=1 OR O.SMS=1 OR O.DM=1 OR 0.TELEPHONE=1 AND HIS.HISTORY IS NULL))
OR
((DATEDIFF(DD,CURRENT_TIMESTAMP, CONTRACTEND_DATE) BETWEEN 19 AND 21)
AND
(HIS.HISTORY<10 OR HIS.HISTORY IS NULL)
AND
O.SMS=1 AND C.MOBILE IS NOT NULL)
OR
((DATEDIFF(DD,CURRENT_TIMESTAMP, CONTRACTEND_DATE) BETWEEN 19 AND 21)
AND
(HIS.HISTORY<100 OR HIS.HISTORY IS NULL)
AND
(O.SMS=0 OR C.MOBILE IS NULL)
AND
O.CALL=1
AND
(C.MOBILE IS NOT NULL OR C.TELEPHONE IS NOT NULL))
OR
((DATEDIFF(DD,CURRENT_TIMESTAMP,CONTRACTEND_DATE) BETWEEN 12 AND 14)
AND
(HIS.HISTORY<100 OR HIS.HISTORY IS NULL)
AND
O.TELEPHONE=1
AND
(C.MOBILE IS NOT NULL OR C.TELEPHONE IS NOT NULL))
)
And then I use this query to update the records:
UPDATE S
SET CODE = CASE
WHEN ( DATEDIFF(DD, CURRENT_TIMESTAMP, C.CONTRACTEND_DATE) BETWEEN 26 AND 28) AND HIS.HISTORY IS NULL AND O.MAIL = 1 AND C.MAIL IS NOT NULL THEN 'MAIL'
WHEN ( DATEDIFF(DD, CURRENT_TIMESTAMP, C.CONTRACTEND_DATE) BETWEEN 26 AND 28) AND HIS.HISTORY IS NULL AND O.DM = 1 AND (O.MAIL=0 OR C.MAIL IS NULL) THEN 'DM'
WHEN ( DATEDIFF(DD, CURRENT_TIMESTAMP, C.CONTRACTEND_DATE) BETWEEN 26 AND 28) AND HIS.HISTORY IS NULL AND O.DM = 0 AND (O.MAIL=0 OR C.MAIL IS NULL) AND O.SMS=1 AND C.MOBILE IS NOT NULL THEN 'SMS'
WHEN ( DATEDIFF(DD, CURRENT_TIMESTAMP, C.CONTRACTEND_DATE) BETWEEN 26 AND 28) AND HIS.HISTORY IS NULL AND O.DM = 0 AND (O.MAIL=0 OR C.MAIL IS NULL) AND (O.SMS=0 OR C.MOBILE IS NULL) AND
O.TELEPHONE=1 AND (C.MOBILE IS NOT NULL OR C.TELEPHONE IS NOT NULL) THEN 'EXPORT'
WHEN ( DATEDIFF(DD, CURRENT_TIMESTAMP, C.CONTRACTEND_DATE) BETWEEN 19 AND 21) AND (HIS.HISTORY<10 OR HIS.HISTORY IS NULL)
AND O.SMS=1 AND C.MOBILE IS NOT NULL THEN 'SMS'
WHEN ( DATEDIFF(DD, CURRENT_TIMESTAMP, C.CONTRACTEND_DATE) BETWEEN 19 AND 21) AND (HIS.HISTORY<100 OR HIS.HISTORY IS NULL)
AND (O.SMS=0 OR C.MOBILE IS NULL) AND O.TELEPHONE=1 AND (C.MOBILE IS NOT NULL OR C.TELEPHONE IS NOT NULL) THEN 'EXPORT'
WHEN ( DATEDIFF(DD, CURRENT_TIMESTAMP, C.CONTRACTEND_DATE) BETWEEN 12 AND 14) AND (HIS.HISTORY<100 OR HIS.HISTORY IS NULL)
AND O.TELEPHONE=1 AND (C.MOBILE IS NOT NULL OR C.TELEPHONE IS NOT NULL) THEN 'EXPORT'
ELSE NULL
END
FROM SELECTION S(NOLOCK)
INNER JOIN CONTRACTS C (NOLOCK) ON C.CLIENTID = S.CLIENTID
INNER JOIN OPTINS O (NOLOCK) ON O.CLIENTID = C.CLIENTID
INNER JOIN HISTORY HIS(NOLOCK) ON HIS.CLIENTID = C.CLIENTID
WHERE S.CREATED_DT>DATEADD(hh,-4,GETDATE())
So basically it's the same selection I'm using to extract the records. But after the update quite a few stay blank, and when I check the blank records they should have been given a code.
Maybe a CASE WHEN statement is not the way to go about it, but I don't know how else to pull this off.

Assuming that the use of NOLOCK isn't introducing data oddities by allowing "dirty reads", I see several possibilities for why not all the data in your SELECTION table is being updated.
The UPDATE has the clause S.CREATED_DT>DATEADD(hh,-4,GETDATE()). If the INSERT was run more than 4 hours before the UPDATE, then rows created by that INSERT won't be updated.
Your UPDATE for EXPORT (with BETWEEN 19 AND 21 days) has a condition of O.TELEPHONE = 1 while the INSERT uses O.CALL = 1. I'm guessing the latter is correct and you need to amend the UPDATE code accordingly.
The first part of the WHERE clause (for BETWEEN 26 AND 28) in your INSERT has some odd-looking logic around the HISTORY check. I think the relevant code should be as I've given below. Because of the order of operations (AND takes precedence over OR), my code is not equivalent to yours: in your version HIS.HISTORY IS NULL applies only to the TELEPHONE=1 branch.
There may be other ways in which the BETWEEN 26 AND 28 set of records is introducing issues, as the INSERT and UPDATE code there is not at all equivalent and seems to rely on business logic rather than logical equivalence.
Revised 26-28 Code for the INSERT
((DATEDIFF(DD,CURRENT_TIMESTAMP, CONTRACTEND_DATE) BETWEEN 26 AND 28)
AND
HIS.HISTORY IS NULL
AND
(O.MAIL=1 OR O.SMS=1 OR O.DM=1 OR O.TELEPHONE=1))
Note: I have assumed that 0.TELEPHONE is a typo and should be O.TELEPHONE.
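To see why the parentheses matter, here is a minimal sketch. It runs the SQL through Python's sqlite3 module purely so it can be executed anywhere; the table t and its three rows are made up for illustration, and SQLite is only a stand-in for SQL Server (AND binds tighter than OR in both).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, cut-down stand-in for the OPTINS/HISTORY join.
conn.execute("CREATE TABLE t (id INTEGER, mail INTEGER, telephone INTEGER, history INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        (1, 1, 0, 5),     # opted in to mail, but already has history
        (2, 0, 1, None),  # telephone only, no history
        (3, 1, 0, None),  # mail, no history
    ],
)

# Original shape: AND binds tighter than OR, so the history check
# only guards the telephone branch.
unparenthesised = [r[0] for r in conn.execute(
    "SELECT id FROM t WHERE mail = 1 OR telephone = 1 AND history IS NULL ORDER BY id"
)]

# Revised shape: the history check applies to every channel.
parenthesised = [r[0] for r in conn.execute(
    "SELECT id FROM t WHERE (mail = 1 OR telephone = 1) AND history IS NULL ORDER BY id"
)]

print(unparenthesised)  # [1, 2, 3]
print(parenthesised)    # [2, 3]
```

Row 1 is the kind of record that gets selected by the INSERT but can never match a 26-28 branch of the UPDATE's CASE, because every one of those branches requires HIS.HISTORY IS NULL.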
A Different Approach
If you are confident that one of the two pieces of code is correct, I'd suggest you use exactly the same code in all the relevant spots. Here's a simplified version of how to do that:
INSERT INTO SELECTION
SELECT *
FROM SOURCETABLE t
WHERE
(
CASE
WHEN t.A=1 THEN 'A'
WHEN t.B=1 THEN 'B'
ELSE NULL
END
) IS NOT NULL
UPDATE s
SET s.Target =
CASE
WHEN t.A=1 THEN 'A'
WHEN t.B=1 THEN 'B'
ELSE NULL
END
FROM
SELECTION s
JOIN SOURCETABLE t ON s.ID = t.ID
WHERE
(
CASE
WHEN t.A=1 THEN 'A'
WHEN t.B=1 THEN 'B'
ELSE NULL
END
) IS NOT NULL
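As a rough, runnable illustration of the idea (not your real schema), the sketch below keeps the CASE expression in one place and reuses it for both statements. It uses Python's sqlite3 module as a stand-in for SQL Server; the CODE_EXPR name and the sample rows are assumptions, and the UPDATE uses a correlated subquery instead of UPDATE ... FROM only to stay portable across SQLite versions.

```python
import sqlite3

# The single source of truth for the code; reused by INSERT and UPDATE.
CODE_EXPR = """
CASE
    WHEN t.A = 1 THEN 'A'
    WHEN t.B = 1 THEN 'B'
    ELSE NULL
END
"""

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SOURCETABLE (ID INTEGER PRIMARY KEY, A INTEGER, B INTEGER);
CREATE TABLE SELECTION (ID INTEGER PRIMARY KEY, Target TEXT);
INSERT INTO SOURCETABLE VALUES (1, 1, 0), (2, 0, 1), (3, 0, 0), (4, 1, 1);
""")

# Select only rows for which the CASE expression yields a code.
conn.execute(
    f"INSERT INTO SELECTION (ID) SELECT t.ID FROM SOURCETABLE t WHERE ({CODE_EXPR}) IS NOT NULL"
)

# Assign the code with exactly the same expression.
conn.execute(
    f"UPDATE SELECTION SET Target = (SELECT {CODE_EXPR} FROM SOURCETABLE t WHERE t.ID = SELECTION.ID)"
)

selection = conn.execute("SELECT ID, Target FROM SELECTION ORDER BY ID").fetchall()
blank = conn.execute("SELECT COUNT(*) FROM SELECTION WHERE Target IS NULL").fetchone()[0]
print(selection)  # [(1, 'A'), (2, 'B'), (4, 'A')]
print(blank)      # 0
```

Because the row filter and the code assignment come from the same expression, a selected row cannot end up without a code unless the source data changes between the two statements.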

Related

TSQL Select only the highest credential

I have a query that is returning multiple lines for a single service because an individual may have multiple credentials. In the medical field you retain several credentials, but for simplicity's sake I will use just the standard credentials PhD, MA, MS, BA, BS, AS.
I need to know the simplest way to ignore rows where Z_ServiceLedger.clientvisit_id has any Credentials.credentials lower in the hierarchy. So if an employee does a service and he has a PhD and an MA, only return the lines for PhD, and if he has a PhD, an MA and a BA, only return the lines for PhD. We have around 50 credentials, so if I use CASE for each credential you can see how messy that will get, and I'm hoping there is a better way to avoid that.
Here is my current query:
SELECT DISTINCT
SUM(CASE WHEN v.non_billable = 0 THEN v.duration ELSE 0 END) / 60 AS billable_hours,
SUM(CASE WHEN (v.non_billable = 0 AND Z_ServiceLedger.payer_id = 63) THEN v.duration ELSE 0 END) / 60 AS billable_mro_hours,
Credentials.credentials
FROM
Z_ServiceLedger
INNER JOIN
ClientVisit v ON Z_ServiceLedger.clientvisit_id = v.clientvisit_id
LEFT JOIN
Employees ON v.emp_id = Employees.emp_id
LEFT JOIN
EmployeeCredential ON Employees.emp_id = EmployeeCredential.emp_id
LEFT JOIN
Credentials ON Credentials.credential_id = EmployeeCredential.credential_id
WHERE
v.rev_timein <= CASE
WHEN EmployeeCredential.end_date IS NOT NULL
THEN EmployeeCredential.end_date
ELSE GETDATE()
END
AND v.rev_timein >= @param1
AND v.rev_timein < DateAdd(d, 1, @param2)
AND Z_ServiceLedger.amount > 0
AND v.splitprimary_clientvisit_id IS NULL
AND v.gcode_primary_clientvisit_id IS NULL
AND v.non_billable = 0
AND v.non_billable = 'FALSE'
AND v.duration / 60 > 0
AND Z_ServiceLedger.action_type NOT IN ('SERVICE RATE CHANGE', 'CLIENT STATEMENT')
AND (EmployeeCredential.is_primary IS NULL OR EmployeeCredential.is_primary != 'False')
AND v.client_id != '331771 '
GROUP BY
Credentials.credentials,
v.non_billable
ORDER BY
Credentials.credentials
Some aliases and formatting really shed some light on some major logical flaws here. You have at least two predicates in your where clause that logically turn a left join into an inner join. This is a total shot in the dark, since in both of your questions today we don't have anything to actually work with for tables or sample data.
The biggest concern though is that your where clause is trying to get rows where v.non_billable = 0 and where it equals 'FALSE'. It can't be both.
Select sum(Case When v.non_billable = 0 Then v.duration Else 0 End) / 60 As billable_hours
, sum(Case When (v.non_billable = 0 And sl.payer_id = 63) Then v.duration Else 0 End) / 60 As billable_mro_hours
, c.credentials
From Z_ServiceLedger sl
Inner Join ClientVisit v On sl.clientvisit_id = v.clientvisit_id
Left Join Employees e On v.emp_id = e.emp_id
Left Join EmployeeCredential ec On e.emp_id = ec.emp_id
--if you leave these predicates in the where clause you have turned your left join into an inner join.
AND v.rev_timein <= isnull(ec.end_date, GetDate())
and (ec.is_primary Is Null Or ec.is_primary != 'False')
Left Join Credentials c On c.credential_id = ec.credential_id
Where v.rev_timein >= @param1
And v.rev_timein < DateAdd(day, 1, @param2)
And v.splitprimary_clientvisit_id Is Null
And v.gcode_primary_clientvisit_id Is Null
--you need to pick one value for v.non_billable. It can't be both 0 and 'FALSE' at the same time.
And v.non_billable = 0
And v.non_billable = 'FALSE'
--And v.duration / 60 > 0
and v.duration >= 60 --same result as the integer division above, and this form is SARGable
And sl.amount > 0
And sl.action_type NOT IN ('SERVICE RATE CHANGE', 'CLIENT STATEMENT')
And v.client_id != '331771 '
Group By c.credentials
, v.non_billable
Order By c.credentials
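To make the left-join point concrete, here is a small sketch with made-up rows. It uses Python's sqlite3 module only so that it can be run anywhere (SQLite standing in for SQL Server), and the predicate is deliberately simpler than the ones in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employees (emp_id INTEGER, emp_name TEXT);
CREATE TABLE EmployeeCredential (emp_id INTEGER, credential_id INTEGER, is_primary INTEGER);
INSERT INTO Employees VALUES (1, 'Jay'), (2, 'Bob');
INSERT INTO EmployeeCredential VALUES (1, 3, 1), (2, 2, 0);
""")

# Predicate on the outer table in WHERE: Bob's only credential row fails
# the test, so Bob disappears, just as with an inner join.
filter_in_where = conn.execute("""
SELECT e.emp_name, ec.credential_id
FROM Employees e
LEFT JOIN EmployeeCredential ec ON e.emp_id = ec.emp_id
WHERE ec.is_primary = 1
ORDER BY e.emp_id
""").fetchall()

# Same predicate in the ON clause: Bob is kept, with NULL credential columns.
filter_in_on = conn.execute("""
SELECT e.emp_name, ec.credential_id
FROM Employees e
LEFT JOIN EmployeeCredential ec ON e.emp_id = ec.emp_id AND ec.is_primary = 1
ORDER BY e.emp_id
""").fetchall()

print(filter_in_where)  # [('Jay', 3)]
print(filter_in_on)     # [('Jay', 3), ('Bob', None)]
```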
EDIT: Modified query to add a CTE to calculate the credential_rank, using a FROM (VALUES (...)) table-value-constructor syntax. This works in SQL 2008+. (https://learn.microsoft.com/en-us/sql/t-sql/queries/table-value-constructor-transact-sql?view=sql-server-2017)
First, I'll build out a very simple piece of data.
SETUP:
CREATE TABLE Employees ( emp_id int, emp_name varchar(20) ) ;
INSERT INTO Employees (emp_id, emp_name)
VALUES (1,'Jay'),(2,'Bob')
;
CREATE TABLE Credentials ( credential_id int, credentials varchar(20), credential_rank int ) ;
INSERT INTO Credentials (credential_id, credentials, credential_rank)
VALUES (1,'BA',3),(2,'MA',2),(3,'PhD',1)
;
CREATE TABLE EmployeeCredential (emp_id int, credential_id int, is_primary bit, end_date date )
INSERT INTO EmployeeCredential (emp_id, credential_id, is_primary, end_date)
VALUES
( 1,2,null,'20200101' )
, ( 1,3,0,'20200101' ) /* NON-PRIMARY */
, ( 1,1,1,'20100101' ) /* EXPIRED CRED */
, ( 2,3,null,'20200101' )
, ( 2,3,1,'20200101' )
;
CREATE TABLE z_ServiceLedger ( payer_id int, clientvisit_id int, amount int, action_type varchar(50) ) ;
INSERT INTO z_ServiceLedger ( payer_id, clientvisit_id, amount, action_type )
VALUES (63,1,10,'XXXXX'),(63,2,20,'XXXXX'),(63,3,10,'XXXXX'),(63,4,30,'XXXXX')
;
CREATE TABLE ClientVisit ( clientvisit_id int, client_id int, non_billable bit, duration int, emp_id int , rev_timein date, splitprimary_clientvisit_id int, gcode_primary_clientvisit_id int ) ;
INSERT INTO ClientVisit ( clientvisit_id, client_id, non_billable, duration, emp_id, rev_timein, splitprimary_clientvisit_id, gcode_primary_clientvisit_id )
VALUES
(1, 1234, 0, 110, 1, getDate(), null, null )
, (2, 1234, null, 120, 1, getDate(), null, null )
, (3, 1234, 1, 110, 2, getDate(), null, null )
, (4, 1234, 0, 130, 2, getDate(), null, null )
;
MAIN QUERY:
; WITH creds AS (
SELECT c.credential_id, c.credentials, r.credential_rank
FROM Credentials c
LEFT OUTER JOIN (VALUES (1,3),(2,2),(3,1) ) r(credential_id, credential_rank)
ON c.credential_id = r.credential_id
)
SELECT DISTINCT
SUM(CASE WHEN ISNULL(v.non_billable,1) = 0 THEN v.duration ELSE 0 END)*1.0 / 60 AS billable_hours,
SUM(CASE WHEN (ISNULL(v.non_billable,1) = 0 AND zsl.payer_id = 63) THEN v.duration ELSE 0 END)*1.0 / 60 AS billable_mro_hours,
s2.credentials
FROM Z_ServiceLedger zsl
INNER JOIN ClientVisit v ON zsl.clientvisit_id = v.clientvisit_id
AND v.rev_timein >= @param1
AND v.rev_timein < DateAdd(d, 1, @param2)
AND v.splitprimary_clientvisit_id IS NULL
AND v.gcode_primary_clientvisit_id IS NULL
AND ISNULL(v.non_billable,1) = 0
AND v.duration*1.0 / 60 > 0
AND v.client_id <> 331771
INNER JOIN (
SELECT s1.emp_id, s1.emp_name, s1.credential_id, s1.credentials, s1.endDate
FROM (
SELECT e.emp_id, e.emp_name, c.credential_id, c.credentials, ISNULL(ec.end_date,GETDATE()) AS endDate
, ROW_NUMBER() OVER (PARTITION BY e.emp_id ORDER BY c.credential_rank) AS rn
FROM Employees e
LEFT OUTER JOIN EmployeeCredential ec ON e.emp_id = ec.emp_id
AND ISNULL(ec.is_primary,1) <> 0 /* I don't think a NULL is_primary should be TRUE */
LEFT OUTER JOIN creds c ON ec.credential_id = c.credential_id
) s1
WHERE s1.rn = 1
) s2 ON v.emp_id = s2.emp_id
AND v.rev_timein <= s2.endDate /* Credential not expired at rev_timein */
WHERE zsl.amount > 0
AND zsl.action_type NOT IN ('SERVICE RATE CHANGE', 'CLIENT STATEMENT')
GROUP BY s2.credentials
ORDER BY s2.credentials
Results:
| billable_hours | billable_mro_hours | credentials |
|----------------|--------------------|-------------|
| 1.833333 | 1.833333 | MA |
| 2.166666 | 2.166666 | PhD |
A couple of things to watch for:
1) Integer division : duration/60 returns an integer when both operands are integers. So if you had duration=70, then 70/60 = 1, and you lose the extra 10 minutes. Probably not what you intended. The easiest solution is to multiply duration by 1.0 so that it is forced into a decimal datatype and the division is no longer done on integers.
2) EmployeeCredential.is_primary != 'False' : Rather than comparing against the strings "True"/"False", you should use an actual bit value (1/0). A NULL should mean the value is unknown (neither TRUE nor FALSE) rather than implying TRUE. Also, != works in SQL Server to mean NOT EQUAL TO, but <> is the ISO-standard operator, so prefer it.
3) v.non_billable = 0 AND v.non_billable = 'FALSE' : This can be shortened to ISNULL(v.non_billable,1)=0, which covers both checks, especially since non_billable can be NULL. You also avoid the implicit type conversion when comparing a bit column to the string 'False'.
4) v.client_id != '331771 ' : Change to v.client_id<>331771. First, there's the != to <> change I mentioned earlier. Then '331771 ' is a string that gets implicitly converted to a number; you should avoid implicit conversions.
5) You originally had v.non_billable in your GROUP BY. Since you aren't including it in your SELECT, grouping by it only risks splitting a credential into rows that look like duplicates. Also, you're already filtering out everything other than non_billable=0, so you'd never have more than one value to GROUP BY anyway. Just exclude it.
6) CASE WHEN EmployeeCredential.end_date IS NOT NULL THEN EmployeeCredential.end_date ELSE GETDATE() END : This is the same as saying ISNULL(EmployeeCredential.end_date,GETDATE()).
7) Unless you actually need to filter out specific records for a specific reason, move your JOIN conditions into the JOIN rather than putting them in the WHERE clause. This will help you be more efficient with the data your initial query returns before it is filtered or reduced. Also, when using a WHERE filter with a LEFT JOIN, you may end up with unexpected results.
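Points 1 and 3 are easy to check for yourself. The sketch below is only an illustration: it runs through Python's sqlite3 module (SQLite standing in for SQL Server, which truncates integer division the same way), the three visits are made up, and COALESCE is used in place of ISNULL because SQLite has no two-argument ISNULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ClientVisit (clientvisit_id INTEGER, non_billable INTEGER, duration INTEGER);
INSERT INTO ClientVisit VALUES (1, 0, 70), (2, NULL, 120), (3, 1, 110);
""")

# Point 1: integer / integer truncates; multiplying by 1.0 first does not.
int_hours, dec_hours = conn.execute(
    "SELECT duration / 60, duration * 1.0 / 60 FROM ClientVisit WHERE clientvisit_id = 1"
).fetchone()

# Point 3: a NULL non_billable is treated as non-billable (1),
# so only visit 1 counts as billable.
billable_ids = [r[0] for r in conn.execute(
    "SELECT clientvisit_id FROM ClientVisit WHERE COALESCE(non_billable, 1) = 0"
)]

print(int_hours)           # 1
print(round(dec_hours, 4)) # 1.1667
print(billable_ids)        # [1]
```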

SQL - most efficient way to find if a pair of row does NOT exist

I can't seem to find a similar situation to mine online. I have a table for 'orders' called Order, and a table for details on those orders, called 'order detail'. The definition of a certain type of order is if it has 1 of two pairs of order details (Value-Unit pairs). So, my order detail table might look like this:
order_id | detail
---------|-------
1 | X
1 | Y
1 | Z
2 | X
2 | Z
2 | B
3 | A
3 | Z
3 | B
The two pairs that go together are (X & Y) and (A & B). What is an efficient way of retrieving only those order_ids that DO NOT contain either one of these pairs? e.g. For the above table, I need to receive only the order_id 2.
The only solution I can come up with is essentially to use two queries and perform a self join:
select distinct o.order_id
from orders o
where o.order_id not in (
select distinct order_id
from order_detail od1 where od1.detail=X
join order_detail od2 on od2.order_id = od1.order_id and od2.detail=Y
)
and o.order_id not in (
select distinct order_id
from order_detail od1 where od1.detail=A
join order_detail od2 on od2.order_id = od1.order_id and od2.detail=B
)
The problem is that performance is an issue, my order_detail table is HUGE, and I am quite inexperienced in query languages. Is there a faster way to do this with a lower cardinality? I also have zero control over the schema of the tables, so I can't change anything there.
First and foremost I'd like to emphasise that finding the most efficient query is a combination of a good query and a good index. Far too often I see questions here where people look for magic to happen in only one or the other.
E.g. Of a variety of solutions, yours is the slowest (after fixing syntax errors) when there are no indexes, but is quite a bit better with an index on (detail, order_id)
Please also note that you have the actual data and table structures. You'll need to experiment with various combinations of queries and indexes to find what works best; not least because you haven't indicated what platform you're using and results are likely to vary between platforms.
[/rant-off]
Query
Without further ado, Gordon Linoff has provided some good suggestions. There's another option likely to offer similar performance. You said you can't control the schema; but you can use a sub-query to transform the data into a 'friendlier structure'.
Specifically, if you:
pivot the data so you have a row per order_id
and columns for each detail you want to check
and each cell holds a count of how many times that order has that detail...
Then your query is simply: where (x=0 or y=0) and (a=0 or b=0). The following uses SQL Server's temporary tables to demonstrate with sample data. The queries below work regardless of duplicate id, val pairs.
/*Set up sample data*/
create table #t (
id int,
val char(1)
)
insert #t(id, val)
values (1, 'x'), (1, 'y'), (1, 'z'),
(2, 'x'), (2, 'z'), (2, 'b'),
(3, 'a'), (3, 'z'), (3, 'b')
/*Option 1 manual pivoting*/
select t.id
from (
select o.id,
sum(case when o.val = 'a' then 1 else 0 end) as a,
sum(case when o.val = 'b' then 1 else 0 end) as b,
sum(case when o.val = 'x' then 1 else 0 end) as x,
sum(case when o.val = 'y' then 1 else 0 end) as y
from #t o
group by o.id
) t
where (x = 0 or y = 0) and (a = 0 or b = 0)
/*Option 2 using Sql Server PIVOT feature*/
select t.id
from (
select id ,[a],[b],[x],[y]
from (select id, val from #t) src
pivot (count(val) for val in ([a],[b],[x],[y])) pvt
) t
where (x = 0 or y = 0) and (a = 0 or b = 0)
It's interesting to note that the query plans for options 1 and 2 above are slightly different. This suggests the possibility of different performance characteristics over large data sets.
Indexes
Note that the above will likely process the whole table. So there is little to be gained from indexes. However, if the table has "long rows", an index on only the 2 columns you're working with means that less data needs to be read from disk.
The query structure you provided is likely to benefit from an index such as (detail, order_id). This is because the server can more efficiently check the NOT IN sub-query conditions. How beneficial will depend on the distribution of data in your table.
As a side note I tested various query options including a fixed version of yours and Gordon's. (Only a small data size though.)
Without the above index, your query was slowest in the batch.
With the above index, Gordon's second query was slowest.
Alternative Queries
Your query (fixed):
select distinct o.id
from #t o
where o.id not in (
select od1.id
from #t od1
inner join #t od2 on
od2.id = od1.id
and od2.val='Y'
where od1.val= 'X'
)
and o.id not in (
select od1.id
from #t od1
inner join #t od2 on
od2.id = od1.id
and od2.val='a'
where od1.val= 'b'
)
Mixture between Gordon's first and second query. Fixes the duplicate issue in the first and the performance in the second:
select id
from #t od
group by id
having ( sum(case when val in ('X') then 1 else 0 end) = 0
or sum(case when val in ('Y') then 1 else 0 end) = 0
)
and( sum(case when val in ('A') then 1 else 0 end) = 0
or sum(case when val in ('B') then 1 else 0 end) = 0
)
Using INTERSECT and EXCEPT:
select id
from #t
except
(
select id
from #t
where val = 'a'
intersect
select id
from #t
where val = 'b'
)
except
(
select id
from #t
where val = 'x'
intersect
select id
from #t
where val = 'y'
)
I would use aggregation and having:
select order_id
from order_detail od
group by order_id
having sum(case when detail in ('X', 'Y') then 1 else 0 end) < 2 and
sum(case when detail in ('A', 'B') then 1 else 0 end) < 2;
This assumes that orders do not have duplicate rows with the same detail. If that is possible:
select order_id
from order_detail od
group by order_id
having count(distinct case when detail in ('X', 'Y') then detail end) < 2 and
count(distinct case when detail in ('A', 'B') then detail end) < 2;
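The difference between the two versions only shows up when a detail is repeated. As a quick check, the sketch below loads the sample table from the question plus one duplicated row (order 2 gets a second X, which is my addition) and runs both queries. It uses Python's sqlite3 module so it can be run anywhere; SQLite is only a stand-in for whatever platform you are on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE order_detail (order_id INTEGER, detail TEXT)")
conn.executemany(
    "INSERT INTO order_detail VALUES (?, ?)",
    [
        (1, "X"), (1, "Y"), (1, "Z"),
        (2, "X"), (2, "Z"), (2, "B"),
        (3, "A"), (3, "Z"), (3, "B"),
        (2, "X"),  # duplicate detail, added to show the difference
    ],
)

# Plain SUM: the duplicated X makes order 2 look as if it holds the X/Y pair.
with_sum = [r[0] for r in conn.execute("""
SELECT order_id
FROM order_detail
GROUP BY order_id
HAVING SUM(CASE WHEN detail IN ('X', 'Y') THEN 1 ELSE 0 END) < 2
   AND SUM(CASE WHEN detail IN ('A', 'B') THEN 1 ELSE 0 END) < 2
ORDER BY order_id
""")]

# COUNT(DISTINCT ...): duplicates collapse, so order 2 is returned.
with_count_distinct = [r[0] for r in conn.execute("""
SELECT order_id
FROM order_detail
GROUP BY order_id
HAVING COUNT(DISTINCT CASE WHEN detail IN ('X', 'Y') THEN detail END) < 2
   AND COUNT(DISTINCT CASE WHEN detail IN ('A', 'B') THEN detail END) < 2
ORDER BY order_id
""")]

print(with_sum)             # []
print(with_count_distinct)  # [2]
```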

SQL Query Best techniques to get MAX data from a foreign key linked table

I have written two queries however feel they are inefficient.
I have two queries: one prepares the data (it originally came from an old FoxPro db in which the dates etc. were nvarchars, so I convert them to dates), and the second collates all of the data ready to be exported to a CSV, which is eventually sent to a web service.
So the first query...
I have a table of people and a table of placements (placements being a job that they have had) the placements table will have lots of different rows for a single person and I need only the latest (based on start and end date), is the below the most efficient way of doing this?
PersonCode = unique id for the person, Code = unique id for the placement
SELECT * FROM Person c
LEFT JOIN
(
SELECT MAX(StartDate) AS StartDate, MAX(EndDate) AS EndDate, MAX(Code) AS Code, PersonCode
FROM PersonPlacement
GROUP BY PersonCode
) cp ON c.PersonCode = cp.PersonCode
LEFT JOIN PersonPlacement cp2 ON cp.Code = cp2.Code
So my second query is below...
The second query reads from the first query and needs to do the following:
Get only unique candidates based on last contact date (the original data had dupes)
Get the latest placement
Get Resume data
Only get people that are not currently in a job based on start and end date of placement
If they are in a job that is ending soon then show them
See query below...
SELECT *
FROM Pre_PersonView c
INNER JOIN (
SELECT PersonCode, Code, row_number() over(partition by PersonCode order by StartDate desc) as rn
FROM Pre_PersonView
) pj ON c.PersonCode = pj.PersonCode AND pj.rn = 1
LEFT JOIN Pre_PersonView cp ON pj.Code = cp.Code
INNER JOIN (
SELECT PersonCode, row_number() over(partition by PersonCode order by LastContactDate desc) as rn
FROM Person
) uc ON c.PersonCode = uc.PersonCode AND uc.rn = 1
LEFT JOIN [PersonResumeText] ct ON c.PersonId = ct.PersonId
WHERE c.PersonCode NOT IN
(
SELECT pcv.PersonCode
FROM Pre_PersonView pcv
WHERE pcv.Department IN ('x','y','z')
AND pcv.StartDate <= GETDATE()
AND (CASE WHEN pcv.EndDate = '1899-12-30' THEN GETDATE() + 1 ELSE pcv.EndDate END) > GETDATE()
)
AND DATEDIFF(DAY, ISNULL((CASE WHEN cp.StartDate = '0216-07-22' THEN '2016-07-22' ELSE cp.StartDate END), GETDATE() -365), ISNULL((CASE WHEN cp.EndDate = '1899-12-30' THEN NULL ELSE cp.EndDate END), GETDATE() + 1))
>
(CASE WHEN cp.Department IN ('x','y','z') THEN 365 ELSE 2 END)
Again, my question here is: is this the most efficient way to be doing this?

How to search for column name in range or column is null

I am writing a query against a table containing a list of ages; this column can have NULLs. I have been trying to get a result containing users aged 18+, and if a user's age is NULL I would like that row as well. Here is a simplified version of what I'm trying to do, with the results.
SELECT * FROM TABLE WHERE (Table.Column >= 18 OR Table.Column IS NULL)
-Returns users of all ages and nulls
SELECT * FROM TABLE WHERE ((Table.Column >= 18) OR (Table.Column IS NULL))
-Returns users of all ages and nulls
SELECT * FROM TABLE WHERE (Table.Column >= 18 NOT BETWEEN 1 AND 17)
-Returns only users 18+, does not return NULLS
SELECT * FROM TABLE WHERE (Table.Column >= 18)
-Returns only users 18+, does not return NULLS
Any insight into what may be happening would be a blessing. This is part of a 1300-line query and this is the best I am able to simplify it. Keep in mind there may be other things going on which I'm unable to explain, so maybe a hacky workaround is in order.
To go into further detail, in pseudocode the entire query is as below.
SELECT ColumnA, ColumnB,
CASE
WHEN (Condition 1),
WHEN (Condition 2)
ELSE 'N/A' END AS [Complete],
CASE
WHEN (Condition 1),
WHEN (Condition 2)
ELSE Column END AS [Column],
ColumnC, ColumnD
FROM
LEFT OUTER JOIN Table A on A.Column = B.Column
LEFT OUTER JOIN Table C on A.Column = B.Column
LEFT OUTER JOIN Table D on A.Column = C.Column
WHERE (
Condition1,
and Condition2,
and (Table.Column >= 18 OR Table.Column IS NULL)
)UNION
SELECT
MAX([column]) AS [column],
MAX([MyColumn]) AS [My Column]
FROM (
SELECT
[column],
MyColumn,
CASE
WHEN (Condition 1),
WHEN (Condition 2)
else 'N/A' end as [Complete]
CASE
WHEN (Condition 1),
WHEN (Condition 2)
ELSE Column END AS [Column],
Column3
WHERE (Condition1)
AND Condition2
)
and (Table.Column >= 18 OR Table.Column IS NULL)
GROUP BY [ColumnName]
UNION
SELECT * FROM TABLE
In my case I had to wrap each of the three tables being unioned together in its own subquery, and then put my WHERE clause on each one:
SELECT A.* FROM (
Select * FROM TABLE1
) A
WHERE (A.ColumnOne >= 18 OR ColumnOne IS NULL)
UNION
SELECT B.* FROM (
Select * FROM TABLE2
) B
WHERE (B.ColumnOne >= 18 OR ColumnOne IS NULL)
UNION
SELECT C.* FROM (
Select * FROM TABLE3
) C
WHERE (C.ColumnOne >= 18 OR ColumnOne IS NULL)
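For reference, this is how the predicate behaves on its own. The sketch below uses a made-up users table and Python's sqlite3 module (SQLite as a stand-in for SQL Server; NULL comparison works the same way in both): a bare >= never matches NULL, and adding OR ... IS NULL brings back the NULL rows without letting under-18s through.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, age INTEGER)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [(1, 25), (2, 17), (3, None), (4, 18)],
)

# NULL >= 18 evaluates to unknown, so the NULL row is filtered out.
adults_only = [r[0] for r in conn.execute(
    "SELECT id FROM users WHERE age >= 18 ORDER BY id"
)]

# Adding the IS NULL branch keeps the NULL row as well.
adults_or_unknown = [r[0] for r in conn.execute(
    "SELECT id FROM users WHERE (age >= 18 OR age IS NULL) ORDER BY id"
)]

print(adults_only)        # [1, 4]
print(adults_or_unknown)  # [1, 3, 4]
```

If under-18 rows still appear in the full query, they are most likely coming from a UNION branch that does not carry the predicate, which is consistent with the fix of filtering each branch separately.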

SQL Server query : selecting a total figure, based on a sub-query

I am trying to select total figures from my database table, using aggregate functions.
The trouble is: one of the columns I need requires that I run a sub-query within the aggregate. Which SQL does not allow.
Here is the error I am getting :
Cannot perform an aggregate function on an expression containing an aggregate or a subquery.
Here is the initial query :
select
method,
sum(payment_id) as payment_id,
sum(status) as status,
sum(allowEmailContact) as allowEmailContact,
sum(allowPhoneContact) as allowPhoneContact,
sum(totalReservations) as totalReservations
from
(SELECT
RES.method, count(*) as payment_id,
'' as status, '' as complete_data,
'' as allowEmailContact, '' as allowPhoneContact,
'' as totalReservations
FROM
Customer CUS
INNER JOIN
Reservation RES ON CUS.id = RES.customerId
WHERE
(RES.created > '2015-05-31 23:59' and RES.created <= '2015-06-15 23:59')
AND RES.payment_id IS NOT NULL
AND scope_id = 1
GROUP BY
RES.method
UNION ALL
etc
etc
) AS results
GROUP BY
method
(I used : "etc, etc, etc" to replace a large part of the query; I assume there is no need to write the entire code, as it is very long. But, the gist is clear)
This query worked just fine.
However, I need an extra field -- a field for those customers whose data are "clean" --- meaning : trimmed, purged of garbage characters (like : */?"#%), etc.
I have a query that does that. But, the problem is: how to insert this query into my already existing query, so I can create that extra column?
This is the query I am using to "clean" customer data :
select *
from dbo.Customer
where
Len(LTRIM(RTRIM(streetAddress))) > 5 and
Len(LTRIM(RTRIM(streetAddress))) <> '' and
(Len(LTRIM(RTRIM(streetAddress))) is not null and
Len(LTRIM(RTRIM(postalCode))) = 5 and postalCode <> '00000' and
postalCode <> '' and Len(LTRIM(RTRIM(postalCode))) is not null and
Len(LTRIM(RTRIM(postalOffice))) > 2 and
phone <> '' and Len(LTRIM(RTRIM(email))) > 5 and
Len(LTRIM(RTRIM(email))) like '#' and
Len(LTRIM(RTRIM(firstName))) > 2 and Len(LTRIM(RTRIM(lastName))) > 2) and
Len(LTRIM(RTRIM(firstName))) <> '-' and Len(LTRIM(RTRIM(lastName))) <> '-' and
Len(LTRIM(RTRIM(firstName))) is not null and
Len(LTRIM(RTRIM(lastName))) is not null
etc, etc
This query works fine on its own.
But, how to INSERT it into the initial query, to create a separate field, where I can get the TOTAL of those customers who meet this "clean" criteria?
I tried it like this :
select
method,
sum(payment_id) as payment_id,
sum(status) as status,
SUM((select *
from dbo.Customer
where
Len(LTRIM(RTRIM(streetAddress))) > 5 and
Len(LTRIM(RTRIM(streetAddress))) <> '' and
(Len(LTRIM(RTRIM(streetAddress))) is not null and
Len(LTRIM(RTRIM(postalCode))) = 5 and
postalCode <> '00000' and postalCode <> '' and
Len(LTRIM(RTRIM(postalCode))) is not null and
Len(LTRIM(RTRIM(postalOffice))) > 2 and phone <> '' and
Len(LTRIM(RTRIM(email))) > 5 and
Len(LTRIM(RTRIM(email))) like '#' and
Len(LTRIM(RTRIM(firstName))) > 2 and
Len(LTRIM(RTRIM(lastName))) > 2) and
Len(LTRIM(RTRIM(firstName))) <> '-' and
Len(LTRIM(RTRIM(lastName))) <> '-' and
Len(LTRIM(RTRIM(firstName))) is not null and
Len(LTRIM(RTRIM(lastName))) is not null) ) as clean_data,
sum(allowEmailContact) as allowEmailContact, sum(allowPhoneContact) as allowPhoneContact,
sum(totalReservations) as totalReservations
from
(SELECT
RES.method, count(*) as payment_id, '' as status,
'' as complete_data, '' as allowEmailContact,
'' as allowPhoneContact, '' as totalReservations
FROM Customer CUS
INNER JOIN Reservation RES ON CUS.id = RES.customerId
WHERE (RES.created > '2015-05-31 23:59' and RES.created <= '2015-06-15 23:59')
AND RES.payment_id is not null and scope_id = 1
GROUP BY RES.method
UNION ALL
etc
etc
etc
and it gave me that "aggregate" error.
Use SELECT COUNT(*) instead of SUM(). Also, the WHERE clause to clean the data is awful; there has to be a better way. Maybe mark the rows as clean when they're updated, or as a batch job?
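One common way to get a per-method total without putting a subquery inside an aggregate is conditional aggregation: wrap the "clean" test in a CASE and sum the 1s. This is a swapped-in technique, not something spelled out above, and the sketch below is heavily simplified: the clean rule is cut down to two columns, the rows are made up, and it runs through Python's sqlite3 module (so LENGTH/TRIM stand in for Len/LTRIM/RTRIM).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (id INTEGER, streetAddress TEXT, postalCode TEXT);
CREATE TABLE Reservation (customerId INTEGER, method TEXT);
INSERT INTO Customer VALUES
    (1, 'Main Street 1', '12345'),
    (2, '  x ', '00000'),
    (3, 'Harbour Road 22', '54321'),
    (4, NULL, '11111');
INSERT INTO Reservation VALUES
    (1, 'card'), (2, 'card'), (3, 'invoice'), (4, 'invoice'), (1, 'invoice');
""")

# COUNT(*) gives the total per method; the CASE turns the (simplified)
# clean test into 1/0 so SUM only ever sees a plain expression.
totals = conn.execute("""
SELECT RES.method,
       COUNT(*) AS payment_id,
       SUM(CASE WHEN LENGTH(TRIM(CUS.streetAddress)) > 5
                 AND LENGTH(TRIM(CUS.postalCode)) = 5
                 AND CUS.postalCode <> '00000'
                THEN 1 ELSE 0 END) AS clean_data
FROM Customer CUS
INNER JOIN Reservation RES ON CUS.id = RES.customerId
GROUP BY RES.method
ORDER BY RES.method
""").fetchall()

print(totals)  # [('card', 2, 1), ('invoice', 3, 2)]
```

A NULL address makes the CASE condition unknown, so that customer falls into the ELSE branch and is counted as not clean.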