SQL Server 2008
I have a query with several local variables that does some easy math in the result set. When I copy and paste the query to try to save it as a view, it fails telling me there's incorrect syntax. (in this case it's near the declare statement of the variables.) If needed I'll post the query, just wondering if there's a reason for this to work one way and not the other.
declare @totalpop float,
@totalMales float,
@totalFemales float,
@percentMales float,
@percentFemales float;
select @totalmales=sum(case when sex='m' then 1 else 0 end),
@totalfemales = sum(case when sex='f' then 1 else 0 end),
@totalpop=count(*)
from tblVisits
select @percentmales = round(100 * @totalmales/@totalpop,2),
@percentFemales = round(100*@totalfemales/@totalpop,2)
select @totalmales,@percentmales,@totalfemales, @percentfemales, @totalpop
You don't need any of the declared variables, you can do this in plain-old sql with a nested select:
SELECT totalmales, round(1e2*totalmales/totalpop, 2) percentmales,
totalfemales, round(1e2*totalfemales/totalpop, 2) percentfemales,
totalpop
FROM (SELECT sum(case when sex='m' then 1 else 0 end) totalmales,
sum(case when sex='f' then 1 else 0 end) totalfemales,
count(*) totalpop
FROM tblVisits) innerquery
That should be usable on almost any database that supports views and subselects (which is basically all of them).
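As a rough check that the variable-free query can be stored as a view, here is a minimal sketch that runs it in SQLite through Python's sqlite3 module. SQLite is only a stand-in for SQL Server here, and the view name vVisitStats and the four sample rows are made up for illustration.

```python
import sqlite3

# Hypothetical sample data; only the table and column names come from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblVisits (sex TEXT)")
conn.executemany("INSERT INTO tblVisits VALUES (?)", [("m",), ("m",), ("m",), ("f",)])

# The nested select uses no variables, so it can be saved as a view.
conn.execute("""
    CREATE VIEW vVisitStats AS
    SELECT totalmales,   round(1e2 * totalmales / totalpop, 2)   AS percentmales,
           totalfemales, round(1e2 * totalfemales / totalpop, 2) AS percentfemales,
           totalpop
    FROM (SELECT sum(CASE WHEN sex = 'm' THEN 1 ELSE 0 END) AS totalmales,
                 sum(CASE WHEN sex = 'f' THEN 1 ELSE 0 END) AS totalfemales,
                 count(*) AS totalpop
          FROM tblVisits) AS innerquery
""")

stats = conn.execute("SELECT * FROM vVisitStats").fetchone()
print(stats)
```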
You cannot use variables inside views. You can, however, turn it into a stored procedure (SP) or a table-valued function.
Edit: Or do what TokenMacGuy told you :P.
You are using SQL Server 2008, so another way to do this is with PIVOT:
create view V as
select
m as totalmales,
round(1e2*m/(m+f),2) as pctmales,
f as totalfemales,
round(1e2*f/(m+f),2) as pctfemales,
m+f as totalpop
from tblVisits as T
pivot (count(sex) for sex in ([m],[f])) as P;
If you do this, be sure to keep the 1e2 or use 100.0 instead of 100 inside the round() expression. Otherwise the arithmetic is done on integers: 100*m/(m+f) is truncated to a whole number, and the divisions m/(m+f) and f/(m+f) on their own both yield zero.
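The integer-division point is easy to see with a couple of one-line queries. This is a small sketch in SQLite via Python (an assumption on my part: SQLite also truncates when both operands are integers, so it stands in for SQL Server here); the counts m=2 and f=1 are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
m, f = 2, 1  # hypothetical male/female counts

# Both operands are integers, so the ratio truncates to zero.
integer_ratio = conn.execute("SELECT ? / (? + ?)", (m, m, f)).fetchone()[0]

# Multiplying by the integer 100 first still truncates the fractional part.
integer_percent = conn.execute("SELECT 100 * ? / (? + ?)", (m, m, f)).fetchone()[0]

# Starting from the float literal 1e2 keeps the whole expression in floating point.
float_percent = conn.execute("SELECT round(1e2 * ? / (? + ?), 2)", (m, m, f)).fetchone()[0]

print(integer_ratio, integer_percent, float_percent)
```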
Can someone please explain to me why I am dumb? I am trying to use this result set to do a running deduction from the variable I declared below. Granted, I am newer to CTEs, but I figured this would have been far easier.
DECLARE @monies AS money = 35600.00;
WITH RFRoll AS
(
SELECT
col
, value
, Amt = @monies
FROM #rfTmp
UNION ALL
SELECT
col
, value
, CASE
WHEN @monies>value then @monies-value
WHEN value<@monies then @monies-value
WHEN value>@monies then @monies-@monies
END Amt
FROM #rfTmp
WHERE
CASE
WHEN @monies>value then @monies-value
WHEN value<@monies then @monies-value
WHEN value>@monies then @monies-@monies
END >0
)
SELECT *
FROM RFRoll
This is the result set I get. It looks as if it's just displaying the current line's calculation instead of the running deduction I'm trying to get at.
I've tried various different ways and I keep running into a brick wall. I keep reverting to this current state, which obviously isn't correct because I'm not actually using a running total variable.
So I guess my question is: how do I do this using a variable? Am I actually able to reset the value recursively? I don't think so. Any insight would be greatly appreciated.
The desired output would be this (not counting the extra fourth column)
Sorry for adding the 4th column. It occurred to me that maybe I haven't truly understood the issue at hand and maybe I ought to rethink my approach. Basically I want a cascading deduction up until a value is equal to 0. I have to do all the comparisons because a value can never be negative and I can't go over the value of the variable.
I am going to try and rework this in the meantime.
In SQL Server 2008, you can use apply or a correlated subquery. The basic structure is:
select r.*, t2.running_value
from #rftmp r outer apply
(select sum(r2.value) as running_value
from #rftmp r2
where r2.col <= r.col
) t2;
I'm not sure if this is what you want to accomplish.
Of course, in SQL Server 2012 and later, this is built in: use SUM() as a window function with ORDER BY in the OVER clause.
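To make the running-total idea concrete, here is a minimal sketch of the correlated-subquery form, run in SQLite through Python rather than SQL Server. The table rf and its four rows are hypothetical stand-ins for #rfTmp.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rf (col INTEGER PRIMARY KEY, value INTEGER)")
conn.executemany("INSERT INTO rf VALUES (?, ?)", [(1, 40), (2, 50), (3, 30), (4, 20)])

# For each row, sum every value at or before it in col order.
running = conn.execute("""
    SELECT r.col, r.value,
           (SELECT sum(r2.value) FROM rf AS r2 WHERE r2.col <= r.col) AS running_value
    FROM rf AS r
    ORDER BY r.col
""").fetchall()

print(running)
```

The subquery rescans the table for every outer row, so on large tables the window-function form is usually the better choice where it is available.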
I figured it out, if anyone cares
declare @monies as money = 35600.00;
WITH RFRoll as
(
SELECT
#rfTmp.*
, CASE
WHEN @monies>=value then value
WHEN value<@monies then value
WHEN value>@monies then @monies
END AS Amt
, CASE
WHEN @monies>=value then value
WHEN value<@monies then value
WHEN value>@monies then @monies
END AS ValueUsed
FROM #rfTmp WHERE [No] = 1
UNION ALL
SELECT
v.*
, RFRoll.Amt + CASE
WHEN v.value>@monies-RFRoll.Amt then @monies-RFRoll.Amt
WHEN v.value<=@monies-RFRoll.Amt then v.value
WHEN @monies-RFRoll.Amt>v.value then v.value
END
, CASE
WHEN v.value>@monies-RFRoll.Amt then @monies-RFRoll.Amt
WHEN v.value<=@monies-RFRoll.Amt then v.value
WHEN @monies-RFRoll.Amt>v.value then v.value
END
FROM #rfTmp v INNER JOIN RFRoll ON v.[No] = RFRoll.[No] + 1
WHERE RFRoll.Amt<@monies
)
SELECT * FROM RFRoll
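For anyone who wants to try the cascading deduction without SQL Server, here is a rough sketch of the same recursive idea in SQLite via Python. It is not the poster's exact query: the CASE ladders are replaced with SQLite's two-argument min(), the table rf, the column seq (standing in for [No]) and the budget of 100 are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rf (seq INTEGER PRIMARY KEY, value INTEGER)")
conn.executemany("INSERT INTO rf VALUES (?, ?)", [(1, 40), (2, 50), (3, 30), (4, 20)])

# amt is the running amount consumed so far; value_used is this row's share,
# capped so the running amount never exceeds the budget (:monies).
rows = conn.execute("""
    WITH RECURSIVE roll(seq, value, amt, value_used) AS (
        SELECT seq, value, min(value, :monies), min(value, :monies)
        FROM rf WHERE seq = 1
        UNION ALL
        SELECT v.seq, v.value,
               r.amt + min(v.value, :monies - r.amt),
               min(v.value, :monies - r.amt)
        FROM rf AS v JOIN roll AS r ON v.seq = r.seq + 1
        WHERE r.amt < :monies
    )
    SELECT seq, value, amt, value_used FROM roll ORDER BY seq
""", {"monies": 100}).fetchall()

print(rows)
```

With these sample rows the third row only uses 10 of its 30, and the fourth row is never reached because the budget is already spent.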
I'm trying to write a simple function that will allow me to get the sum based on the value of a column.
CREATE FUNCTION [GetSumOfColumnByCase](@column varchar(50), @case int)
RETURNS INT
AS
BEGIN
declare @return int
set @return = SUM(CASE WHEN @column = @case THEN 1 ELSE 0 END)
-- Return the result of the function
return @return
END
GO
I call this function like this:
SELECT HouseDescription,
[dbo].[GetSumOfColumnByCase]([HouseTypeId], 1) AS "houseType1",
[dbo].[GetSumOfColumnByCase]([HouseTypeId], 2) AS "houseType2"
Doing things this way forces me to GROUP BY both the HouseDescription and the HouseTypeId columns, but I just want to GROUP BY HouseDescription.
If I do things this way:
SELECT HouseDescription,
SUM(CASE WHEN HouseTypeId = 1 THEN 1 ELSE 0 END) AS "houseType1",
SUM(CASE WHEN HouseTypeId = 2 THEN 1 ELSE 0 END) AS "houseType2"
It's fine; it doesn't force me to GROUP BY HouseTypeId.
Can anyone explain why this is?
When you are using a GROUP BY clause, every column needs to either be in the GROUP BY, or it needs to be aggregated.
In your second example, you are fulfilling these requirements by wrapping the CASE expression on HouseTypeId in SUM. In your first example, since the function call itself isn't wrapped in an aggregate (SUM, MAX, MIN, etc.), you must place it in the GROUP BY clause in order to not trigger an error.
https://msdn.microsoft.com/en-us/library/ms177673.aspx
I agree with Gordon though, you may want to rethink your strategy for this.
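Here is a small sketch of the rule in action, run in SQLite through Python as a stand-in for SQL Server. The table name Houses and the sample rows are made up; the point is only that HouseTypeId appears solely inside SUM, so grouping by HouseDescription alone is enough.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Houses (HouseDescription TEXT, HouseTypeId INTEGER)")
conn.executemany(
    "INSERT INTO Houses VALUES (?, ?)",
    [("Cottage", 1), ("Cottage", 1), ("Cottage", 2), ("Villa", 2)],
)

# HouseTypeId is only referenced inside the aggregate, so it does not
# need to be listed in GROUP BY.
counts = conn.execute("""
    SELECT HouseDescription,
           SUM(CASE WHEN HouseTypeId = 1 THEN 1 ELSE 0 END) AS houseType1,
           SUM(CASE WHEN HouseTypeId = 2 THEN 1 ELSE 0 END) AS houseType2
    FROM Houses
    GROUP BY HouseDescription
    ORDER BY HouseDescription
""").fetchall()

print(counts)
```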
You cannot do what you want with a function, because SQL Server does not support dynamic SQL (readily) in functions. And to handle any column, you would need dynamic SQL.
But, you don't need that anyway. If you want the sum on each row of the original data, you want window functions:
SELECT SUM(CASE WHEN HouseTypeId = 1 THEN 1 ELSE 0 END) OVER () AS houseType1,
SUM(CASE WHEN HouseTypeId = 2 THEN 1 ELSE 0 END) OVER () AS houseType2
. . .
Aggregation is not needed for this query.
Trying to do some calculations via SQL on my iSeries and have the following conundrum: I need to count the number of times a certain value appears in a column. My select statement is as follows:
Select
MOTRAN.ORDNO, MOTRAN.OPSEQ, MOROUT.WKCTR, MOTRAN.TDATE,
MOTRAN.LBTIM, MOROUT.SRLHU, MOROUT.RLHTD, MOROUT.ACODT,
MOROUT.SCODT, MOROUT.ASTDT, MOMAST.SSTDT, MOMAST.FITWH,
MOMAST.FITEM,
CONCAT(MOTRAN.ORDNO, MOTRAN.OPSEQ) As CON,
count (Concat(MOTRAN.ORDNO, MOTRAN.OPSEQ) )As CountIF,
MOROUT.SRLHU / (count (Concat(MOTRAN.ORDNO, MOTRAN.OPSEQ))) as calc
*(snip)*
With this information, I'm trying to count the number of times a value in CON appears. I will need this to do some math with so it's kinda important. My count statement doesn't work properly as it reports a certain value as occurring once when I see it appears 8 times.
Try putting a CASE statement inside a SUM().
SUM(CASE WHEN value = 'something' THEN 1 ELSE 0 END)
This will count the number of rows where value = 'something'.
Similarly...
SUM(CASE WHEN t1.val = CONCAT(t2.val, t3.val) THEN 1 ELSE 0 END)
If you're on a supported version of the OS, i.e. 6.1 or higher...
You might be able to make use of "grouping set" functionality. Particularly the ROLLUP clause.
I can't say for sure without more understanding of your data.
Otherwise, you're going to need to do something like
with Cnt as (select ORDNO, OPSEQ, count(*) as NbrOccur
from MOTRAN
group by ORDNO, OPSEQ
)
Select
MOTRAN.ORDNO, MOTRAN.OPSEQ, MOROUT.WKCTR, MOTRAN.TDATE,
MOTRAN.LBTIM, MOROUT.SRLHU, MOROUT.RLHTD, MOROUT.ACODT,
MOROUT.SCODT, MOROUT.ASTDT, MOMAST.SSTDT, MOMAST.FITWH,
MOMAST.FITEM,
CONCAT(MOTRAN.ORDNO, MOTRAN.OPSEQ) As CON,
Cnt.NbrOccur,
MOROUT.SRLHU / Cnt.NbrOccur as calc
from
motran join Cnt on motran.ordno = cnt.ordno and motran.opseq = cnt.opseq
*(snip)*
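A cut-down sketch of the count-then-join approach, run in SQLite via Python instead of DB2 for i. To keep it short I have assumed a single table and put a hypothetical srlhu column directly on motran (in the question it lives on MOROUT); the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE motran (ordno TEXT, opseq INTEGER, srlhu REAL)")
conn.executemany(
    "INSERT INTO motran VALUES (?, ?, ?)",
    [("M100", 10, 9.0), ("M100", 10, 9.0), ("M100", 10, 9.0), ("M200", 20, 4.0)],
)

# Count occurrences per (ordno, opseq) once, then join the count back
# onto every detail row so it can be used in row-level math.
detail = conn.execute("""
    WITH cnt AS (
        SELECT ordno, opseq, count(*) AS nbr_occur
        FROM motran
        GROUP BY ordno, opseq
    )
    SELECT m.ordno, m.opseq, c.nbr_occur, m.srlhu / c.nbr_occur AS calc
    FROM motran AS m
    JOIN cnt AS c ON m.ordno = c.ordno AND m.opseq = c.opseq
    ORDER BY m.ordno, m.opseq
""").fetchall()

print(detail)
```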
I have the following sql statement and I want to update a field on the rows returned from the select statement. Is this possible with my select? The things I have tried are not giving me the desired results:
SELECT
Flows_Flows.FlowID,
Flows_Flows.Active,
Flows_Flows.BeatID,
Flows_Flows.FlowTitle,
Flows_Flows.FlowFileName,
Flows_Flows.FlowFilePath,
Flows_Users.UserName,
Flows_Users.DisplayName,
Flows_Users.ImageName,
Flows_Flows.Created,
SUM(CASE WHEN [Like] = 1 THEN 1 ELSE 0 END) AS Likes,
SUM(CASE WHEN [Dislike] = 1 THEN 1 ELSE 0 END) AS Dislikes
FROM Flows_Flows
INNER JOIN Flows_Users ON Flows_Users.UserID = Flows_Flows.UserID
LEFT JOIN Flows_Flows_Likes_Dislikes ON
Flows_Flows.FlowID=Flows_Flows_Likes_Dislikes.FlowID
WHERE Flows_Flows.Active = '1' AND Flows_Flows.Created < DATEADD(day, -60, GETDATE())
Group By Flows_Flows.FlowID, Flows_Flows.Active, Flows_Flows.BeatID,
Flows_Flows.FlowTitle, Flows_Flows.FlowFileName, Flows_Flows.FlowFilePath,
Flows_Users.UserName, Flows_Users.DisplayName, Flows_Users.ImageName,
Flows_Flows.Created
Having SUM(CASE WHEN [Like] = 1 THEN 1 ELSE 0 END) = '0' AND SUM(CASE WHEN [Dislike] = 1
THEN 1 ELSE 0 END) >= '0'
This select statement returns exactly what I need but I want to change the Active field from 1 to 0.
Yes. The general structure might look like this (note that you haven't said what your primary key is):
UPDATE mytable
set myCol = 1
where myPrimaryKey in (
select myPrimaryKey from mytable where interesting bits happen here )
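Filled in for a simplified version of the question, that pattern might look like the sketch below. It runs in SQLite through Python rather than SQL Server, the tables flows and likes and their columns are hypothetical reductions of the real schema, and a fixed cutoff date replaces DATEADD/GETDATE so the result is repeatable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flows (flow_id INTEGER PRIMARY KEY, active INTEGER, created TEXT)")
conn.execute("CREATE TABLE likes (flow_id INTEGER, liked INTEGER)")
conn.executemany("INSERT INTO flows VALUES (?, ?, ?)", [
    (1, 1, "2020-01-01"),  # old, but has a like
    (2, 1, "2020-01-01"),  # old, only a non-like row
    (3, 1, "2024-06-01"),  # newer than the cutoff
    (4, 1, "2020-01-01"),  # old, no rows in likes at all
])
conn.executemany("INSERT INTO likes VALUES (?, ?)", [(1, 1), (2, 0)])

# The grouped SELECT picks the keys; the outer UPDATE flips the flag on them.
conn.execute("""
    UPDATE flows
    SET active = 0
    WHERE flow_id IN (
        SELECT f.flow_id
        FROM flows AS f
        LEFT JOIN likes AS l ON l.flow_id = f.flow_id
        WHERE f.active = 1 AND f.created < :cutoff
        GROUP BY f.flow_id
        HAVING SUM(CASE WHEN l.liked = 1 THEN 1 ELSE 0 END) = 0
    )
""", {"cutoff": "2024-01-01"})

after = conn.execute("SELECT flow_id, active FROM flows ORDER BY flow_id").fetchall()
print(after)
```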
Because you haven't made clear what result you want to achieve, I'll answer based on my own assumptions.
Assumption
You have a select statement that gives you some rows, and it works as desired. What you want is to make it return results and update those selected rows on the fly, basically like saying "find X, tell me about X and make it Y".
Answer
If my assumption is correct, unfortunately I don't think there is any way you can do that. A select does not alter the table, it can only fetch information. Similarly, an update does not provide more detail than the number of rows updated.
But don't give up yet, depending on the result you want to achieve, you have alternatives.
Alternatives
If you just want to update the rows that you have selected, you can
simply write an UPDATE statement to do that, and @Randy has provided
a good example of how it would be written.
If you want to reduce calls to the server, meaning you want to make just
one call to the server and get the result as well as update the
rows, you can write a stored procedure to do that.
Stored procedures are like the functions you write in programming languages. A stored procedure essentially defines a set of SQL operations and gives them a name. Each time you call that stored procedure, the set of operations gets executed with the supplied inputs, if any.
So if you want to learn more about stored procedures you can take a look at:
http://www.mysqltutorial.org/introduction-to-sql-stored-procedures.aspx
If I understand correctly you are looking for a syntax to be able to select the value of Active to be 0 if it is 1. The syntax for something like that is
SELECT
Active= CASE WHEN Active=1 THEN 0 ELSE Active END
FROM
<Tables>
WHERE
<JOIN Conditions>
I would like to know if there is a way to match people between two separate systems, using (mostly) SQL.
We have two separate Oracle databases where people are stored. There is no link between the two (i.e. cannot join on person_id); this is intentional. I would like to create a query that checks to see if a given group of people from system A exists in system B.
I am able to create tables if that makes it easier. I can also run queries and do some data manipulation in Excel when creating my final report. I am not very familiar with PL/SQL.
In system A, we have information about people (name, DOB, soc, sex, etc.). In system B we have the same types of information about people. There could be data entry errors (person enters an incorrect spelling), but I am not going to worry about this too much, other than maybe just comparing the first 4 letters. This question deals with that problem more specifically.
The way I thought about doing this is through correlated subqueries. So, roughly,
select a.lastname, a.firstname, a.soc, a.dob, a.gender,
case
when exists (select 1 from b where b.lastname = a.lastname) then 'Y' else 'N'
end last_name,
case
when exists (select 1 from b where b.firstname = a.firstname) then 'Y' else 'N'
end first_name,
case [etc.]
from a
This gives me what I want, I think...I can export the results to Excel and then find records that have 3 or more matches. I believe that this shows that a given field from A was found in B. However, I ran this query with just three of these fields and it took over 3 hours to run (I'm looking in 2 years of data). I would like to be able to match on up to 5 criteria (lastname, firstname, gender, date of birth, soc). Additionally, while soc number is the best choice for matching, it is also the piece of data that tends to be missing the most often. What is the best way to do this? Thanks.
You definitely want to weigh the different matches. If an SSN matches, that's a pretty good indication. If a firstName matches, that's basically worthless.
You could try a scoring method based on weights for the matches, combined with the phonetic string matching algorithms you linked to. Here's an example I whipped up in T-SQL. It would have to be ported to Oracle for your issue.
--Score Threshold to be returned
DECLARE @Threshold DECIMAL(5,5) = 0.60
--Weights to apply to each column match (0.00 - 1.00)
DECLARE @Weight_FirstName DECIMAL(5,5) = 0.10
DECLARE @Weight_LastName DECIMAL(5,5) = 0.40
DECLARE @Weight_SSN DECIMAL(5,5) = 0.40
DECLARE @Weight_Gender DECIMAL(5,5) = 0.10
DECLARE @NewStuff TABLE (ID INT IDENTITY PRIMARY KEY, FirstName VARCHAR(MAX), LastName VARCHAR(MAX), SSN VARCHAR(11), Gender VARCHAR(1))
INSERT INTO @NewStuff
( FirstName, LastName, SSN, Gender )
VALUES
( 'Ben','Sanders','234-62-3442','M' )
DECLARE @OldStuff TABLE (ID INT IDENTITY PRIMARY KEY, FirstName VARCHAR(MAX), LastName VARCHAR(MAX), SSN VARCHAR(11), Gender VARCHAR(1))
INSERT INTO @OldStuff
( FirstName, LastName, SSN, Gender )
VALUES
( 'Ben','Stickler','234-62-3442','M' ), --3/4 Match
( 'Albert','Sanders','523-42-3441','M' ), --2/4 Match
( 'Benne','Sanders','234-53-2334','F' ), --2/4 Match
( 'Ben','Sanders','234623442','M' ), --SSN has no dashes
( 'Ben','Sanders','234-62-3442','M' ) --perfect match
SELECT
'NewID' = ns.ID,
'OldID' = os.ID,
'Weighted Score' =
(CASE WHEN ns.FirstName = os.FirstName THEN @Weight_FirstName ELSE 0 END)
+
(CASE WHEN ns.LastName = os.LastName THEN @Weight_LastName ELSE 0 END)
+
(CASE WHEN ns.SSN = os.SSN THEN @Weight_SSN ELSE 0 END)
+
(CASE WHEN ns.Gender = os.Gender THEN @Weight_Gender ELSE 0 END)
,
'RAW Score' = CAST(
((CASE WHEN ns.FirstName = os.FirstName THEN 1 ELSE 0 END)
+
(CASE WHEN ns.LastName = os.LastName THEN 1 ELSE 0 END)
+
(CASE WHEN ns.SSN = os.SSN THEN 1 ELSE 0 END)
+
(CASE WHEN ns.Gender = os.Gender THEN 1 ELSE 0 END) ) AS varchar(MAX))
+
' / 4',
os.FirstName ,
os.LastName ,
os.SSN ,
os.Gender
FROM @NewStuff ns
--make sure that at least one item matches exactly
INNER JOIN @OldStuff os ON
os.FirstName = ns.FirstName OR
os.LastName = ns.LastName OR
os.SSN = ns.SSN OR
os.Gender = ns.Gender
where
(CASE WHEN ns.FirstName = os.FirstName THEN @Weight_FirstName ELSE 0 END)
+
(CASE WHEN ns.LastName = os.LastName THEN @Weight_LastName ELSE 0 END)
+
(CASE WHEN ns.SSN = os.SSN THEN @Weight_SSN ELSE 0 END)
+
(CASE WHEN ns.Gender = os.Gender THEN @Weight_Gender ELSE 0 END)
>= @Threshold
ORDER BY ns.ID, 'Weighted Score' DESC
And then, here's the output.
NewID  OldID  Weighted  Raw    First  Last      SSN          Gender
1      5      1.00000   4 / 4  Ben    Sanders   234-62-3442  M
1      1      0.60000   3 / 4  Ben    Stickler  234-62-3442  M
1      4      0.60000   3 / 4  Ben    Sanders   234623442    M
Then, you would have to do some post processing to evaluate the validity of each possible match. If you ever get a 1.00 for weighted score, you can assume that it's the right match, unless you get two of them. If you get a last name and SSN (a combined weight of 0.8 in my example), you can be reasonably certain that it's correct.
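Since the original poster is not on SQL Server, here is a compact sketch of the same weighted scoring that runs anywhere Python does, using SQLite as a neutral stand-in for Oracle. The table and column names are hypothetical, the sample rows mirror the ones above, and the weights are written as whole percentage points (10/40/40/10, threshold 60) purely to keep the comparison free of floating-point rounding.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cols = "(id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT, ssn TEXT, gender TEXT)"
conn.execute("CREATE TABLE new_people " + cols)
conn.execute("CREATE TABLE old_people " + cols)
conn.execute("INSERT INTO new_people VALUES (1, 'Ben', 'Sanders', '234-62-3442', 'M')")
conn.executemany("INSERT INTO old_people VALUES (?, ?, ?, ?, ?)", [
    (1, "Ben", "Stickler", "234-62-3442", "M"),
    (2, "Albert", "Sanders", "523-42-3441", "M"),
    (3, "Benne", "Sanders", "234-53-2334", "F"),
    (4, "Ben", "Sanders", "234623442", "M"),
    (5, "Ben", "Sanders", "234-62-3442", "M"),
])

# Integer weights in percentage points; the values mirror the example above.
params = {"w_first": 10, "w_last": 40, "w_ssn": 40, "w_gender": 10, "threshold": 60}

matches = conn.execute("""
    SELECT new_id, old_id, score
    FROM (SELECT n.id AS new_id, o.id AS old_id,
                 CASE WHEN n.first_name = o.first_name THEN :w_first ELSE 0 END
               + CASE WHEN n.last_name = o.last_name THEN :w_last ELSE 0 END
               + CASE WHEN n.ssn = o.ssn THEN :w_ssn ELSE 0 END
               + CASE WHEN n.gender = o.gender THEN :w_gender ELSE 0 END AS score
          FROM new_people AS n CROSS JOIN old_people AS o) AS scored
    WHERE score >= :threshold
    ORDER BY new_id, score DESC, old_id
""", params).fetchall()

print(matches)
```

The cross join compares every pair, which is fine for a sketch but would need a blocking condition (like the OR join above) on two years of real data.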
Example of HLGEM's JOIN suggestion:
SELECT a.lastname,
a.firstname,
a.soc,
a.dob,
a.gender
FROM a
JOIN b ON SOUNDEX(b.lastname) = SOUNDEX(a.lastname)
AND SOUNDEX(b.firstname) = SOUNDEX(a.firstname)
AND b.soc = a.soc
AND b.dob = a.dob
AND b.gender = a.gender
Reference: SOUNDEX
I would probably use joins instead of correlated subqueries, but you will have to join on all the fields, so I'm not sure how much that might improve things. But since correlated subqueries often have to be evaluated row by row and joins don't, it could improve things a good bit if you have good indexing. As with all performance tuning, only trying the technique will let you know for sure.
I did a similar task looking for duplicates in our SQL Server system and I broke it out into steps. So first I found everyone where the names and city/state were an exact match. Then I looked for additional possible matches (phone number, SSN, inexact name match, etc.). As I found a possible match between two profiles, I added it to a staging table with a code for what type of match found it. Then I assigned a confidence amount to each type of match and added up the confidence for each potential match. So if the SOC matches, you might want a high confidence; the same if the name is exact and the gender is exact and the DOB is exact. Less so if the last name is exact and the first name is not exact, etc. By adding a confidence, I was much better able to see which possible matches were more likely to be the same person. SQL Server also has a SOUNDEX function which can help with names that are slightly different. I'll bet Oracle has something similar.
After I did this, I learned how to do fuzzy grouping in SSIS and was able to generate more matches with a higher confidence level. I don't know if Oracle's ETL tools have a way to do fuzzy logic, but if they do, it can really help with this type of task. If you happen to also have SQL Server, SSIS can be run connecting to Oracle, so you could use fuzzy grouping yourself. It can take a long time to run, though.
I will warn you that name, dob and gender are not likely to ensure they are the same person especially for common names.
Are there indexes on all of the columns in table b in the WHERE clause? If not, that will force a full scan of the table for each row in table a.
You can use SOUNDEX, but you can also use UTL_MATCH for fuzzy string comparison. UTL_MATCH makes it possible to define a threshold: http://www.psoug.org/reference/utl_match.html