IIF Function returning incorrect calculated values - SQL Server - sql

I am writing a query to show returns of placing each way bets on horse races
There is an issue with the PlaceProfit result - This should show a return if the horses finishing position is between 1-4 and a loss if the position is => 5
It does show the correct return if the horses finishing position is below 9th, but 10th place and above is being counted as a win.
I include my code below along with the output.
ALTER VIEW EachWayBetting
AS
SELECT a.ID,
RaceDate,
runners,
track.NAME AS Track,
horse.NAME as HorseName,
IndustrySP,
Place AS 'FinishingPosition',
-- // calculates returns on the win & place parts of an each way bet with 1/5 place terms //
IIF(A.Place = '1', 1.0 * (A.IndustrySP-1), '-1') AS WinProfit,
IIF(A.Place <='4', 1.0 * (A.IndustrySP-1)/5, '-1') AS PlaceProfit
FROM dbo.NewRaceResult a
LEFT OUTER JOIN track ON track.ID = A.TrackID
LEFT OUTER JOIN horse ON horse.ID = A.HorseID
WHERE a.Runners > 22
This returns:

As I mention in the comments, the problem is your choice of data type for place, it's varchar. The ordering for a string data type is completely different to that of a numerical data type. Strings are sorted by character from left to right, in the order the characters are ordered in the collation you are using. Numerical data types, however, are ordered from the lowest to highest.
This means that, for a numerical data type, the value 2 has a lower value than 10, however, for a varchar the value '2' has a higher value than '10'. For the varchar that's because the ordering is completed on the first character first. '2' has a higher value than '1' and so '2' has a higher value than '10'.
The solution here is simple, fix your design; store numerical data in a numerical data type (int seems appropriate here). You're also breaking Normal Form rules, as you're storing other data in the column; mainly the reason a horse failed to be classified. Such data isn't a "Place" but information on why the horse didn't place, and so should be in a separate column.
You can therefore fix this by firstly adding a new column, then updating it's value to be the values that aren't numerical and making place only contain numerical data, and then finally altering your place column.
ALTER TABLE dbo.YourTable ADD UnClassifiedReason varchar(5) NULL; --Obviously use an appropriate length.
GO
UPDATE dbo.YourTable
SET Place = TRY_CONVERT(int,Place),
UnClassifiedReason = CASE WHEN TRY_CONVERT(int,Place) IS NULL THEN Place END;
GO
ALTER TABLE dbo.YourTable ALTER COLUMN Place int NULL;
GO
If Place does not allow NULL values, you will need to ALTER the column first to allow them.

In addition to fixing the data as Larnu suggests, you should also fix the query:
SELECT nrr.ID, nrr.RaceDate, nrr.runners,
t.NAME AS Track, t.NAME as HorseName, nrr.IndustrySP,
Place AS FinishingPosition,
-- // calculates returns on the win & place parts of an each way bet with 1/5 place terms //
(CASE WHEN nrr.Place = 1 THEN (nrr..IndustrySP - 1.0) ELSE -1 END) AS WinProfit,
(CASE WHEN nrr.Place <= 4 THEN (nrr.IndustrySP - 1.0) / 5 THEN -1 END) AS PlaceProfit
FROM dbo.NewRaceResult nrr LEFT JOIN
track t
ON t.ID = nrr.TrackID LEFT JOIN
horse h
ON h.ID = nrr.HorseID
WHERE nrr.Runners > 22;
The important changes are removing single quotes from numbers and column names. It seems you need to understand the differences among strings, numbers, and identifiers.
Other changes are:
Meaningful table aliases, rather than meaningless letters such as a.
Qualifying all column references, so it is clear where columns are coming from.
Switching from IFF() to CASE. IFF() is bespoke SQL Server; CASE is standard SQL for conditional expressions (both work fine).
Being sure that the types returned by all branches of the conditional expressions are consistent.
Note: This version will work even if you don't change the type of Place. The strings will be converted to numbers in the appropriate places. I don't advocate relying on such silent conversion, so I recommend fixing the data.
If place can have non-numeric values, then you need to convert them:
(CASE WHEN TRY_CONVERT(int, nrr.Place) = 1 THEN (nrr..IndustrySP - 1.0) ELSE -1 END) AS WinProfit,
(CASE WHEN TRY_CONVERT(int, nrr.Place) <= 4 THEN (nrr.IndustrySP - 1.0) / 5 THEN -1 END) AS PlaceProfit
But the important point is to fix the data.

Related

How to query column with letters on SQL?

I'm new to this.
I have a column: (chocolate_weight) On the table : (Chocolate) which has g at the end of every number, so 30x , 2x5g,10g etc.
I want to remove the letter at the end and then query it to show any that weigh greater than 35.
So far I have done
Select *
From Chocolate
Where chocolate_weight IN
(SELECT
REPLACE(chocolote_weight,'x','') From Chocolate) > 35
It is coming back with 0 , even though there are many that weigh more than 35.
Any help is appreciated
Thanks
If 'g' is always the suffix then your current query is along the right lines, but you don't need the IN you can do the replace in the where clause:
SELECT *
FROM Chocolate
WHERE CAST(REPLACE(chocolate_weight,'g','') AS DECIMAL(10, 2)) > 35;
N.B. This works in both the tagged DBMS SQL-Server and MySQL
This will fail (although only silently in MySQL) if you have anything that contains units other than grams though, so what I would strongly suggest is that you fix your design if it is not too late, store the weight as an numeric type and lose the 'g' completely if you only ever store in grams. If you use multiple different units then you may wish to standardise this so all are as grams, or alternatively store the two things in separate columns, one as a decimal/int for the numeric value and a separate column for the weight, e.g.
Weight
Unit
10
g
150
g
1000
lb
The issue you will have here though is that you will have start doing conversions in your queries to ensure you get all results. It is easier to do the conversion once when the data is saved and use a standard measure for all records.

Alter a existing SQL statement, to give an additional column of data, but to not affect performance, so best approach

In this query, I want to add a new column, which gives the SUM of a.VolumetricCharge, but only where PremiseProviderBillings.BillingCategory = 'Water'. But i don't want to add it in the obvious place since that would limit the rows returned, I only want it to get the new column value
SELECT b.customerbillid,
-- Here i need SUM(a.VolumetricCharge) but where a.BillingCategory is equal to 'Water'
Sum(a.volumetriccharge) AS Volumetric,
Sum(a.fixedcharge) AS Fixed,
Sum(a.vat) AS VAT,
Sum(a.discount) + Sum(deferral) AS Discount,
Sum(Isnull(a.estimatedconsumption, 0)) AS Consumption,
Count_big(*) AS Records
FROM dbo.premiseproviderbillings AS a WITH (nolock)
LEFT JOIN dbo.premiseproviderbills AS b WITH (nolock)
ON a.premiseproviderbillid = b.premiseproviderbillid
-- Cannot add a where here since that would limit the results and change the output
GROUP BY b.customerbillid;
Bit of a tricky one, as what you're asking for will definitely affect performance (your asking SQL Server to do more work after all!).
However, we can add a column to your results which performs a conditional sum so that it does not affect the result of the other columns.
The answer lies in using a CASE expression!
Sum(
CASE
WHEN PremiseProviderBillings.BillingCategory = 'Water' THEN
a.volumetriccharge
ELSE
0
END
) AS WaterVolumetric

SQL Query for Search Page

I am working on a small project for an online databases course and i was wondering if you could help me out with a problem I am having.
I have a web page that is searching a movie database and retrieving specific columns using a movie initial input field, a number input field, and a code field. These will all be converted to strings and used as user input for the query.
Below is what i tried before:
select A.CD, A.INIT, A.NBR, A.STN, A.ST, A.CRET_ID, A.CMNT, A.DT
from MOVIE_ONE A
where A.INIT = :init
AND A.CD = :cd
AND A.NBR = :num
The way the page must search is in three different cases:
(initial and number)
(code)
(initial and number and code)
The cases have to be independent so if certain field are empty, but fulfill a certain case, the search goes through. It also must be in one query. I am stuck on how to implement the cases.
The parameters in the query are taken from the Java parameters in the method found in an SQLJ file.
If you could possibly provide some aid on how i can go about this problem, I'd greatly appreciate it!
Consider wrapping the equality expressions in NVL (synonymous to COALESCE) so if parameter inputs are blank, corresponding column is checked against itself. Also, be sure to kick the a-b-c table aliasing habit.
SELECT m.CD, m.INIT, m.NBR, m.STN, m.ST, m.CRET_ID, m.CMNT, m.DT
FROM MOVIE_ONE m
WHERE m.INIT = NVL(:init, m.INIT)
AND m.CD = NVL(:cd, m.CD)
AND m.NBR = COALESCE(:num, m.NBR)
To demonstrate, consider below DB2 fiddles where each case can be checked by adjusting value CTE parameters all running on same exact data.
Case 1
WITH
i(init) AS (VALUES('db2')),
c(cd) AS (VALUES(NULL)),
n(num) AS (VALUES(53)),
cte AS
...
Case 2
WITH
i(init) AS (VALUES(NULL)),
c(cd) AS (VALUES(2018)),
n(num) AS (VALUES(NULL)),
cte AS
...
Case 3
WITH
i(init) AS (VALUES('db2')),
c(cd) AS (VALUES(2018)),
n(num) AS (VALUES(53)),
cte AS
...
However, do be aware the fiddle runs a different SQL due to nature of data (i.e., double and dates). But query does reflect same concept with NVL matching expressions on both sides.
SELECT *
FROM cte, i, c, n
WHERE cte.mytype = NVL(i.init, cte.mytype)
AND YEAR(CAST(cte.mydate AS date)) = NVL(c.cd, YEAR(CAST(cte.mydate AS date)))
AND ROUND(cte.mynum, 0) = NVL(n.num, ROUND(cte.mynum, 0));

DQS - matching empty domains

Is it a way to configure matching rules in the DQS Data Quality Project to ignore matching the empty domains?
I find it very strange if two empty domain values are considered as match for 100%. I always can, of course, write the newid() in all the empty domains inside the underlying sql data source (view), but this is overkill and maybe there are "right" way to do this...
Sadly this feature is not currently supported so only the workarounds you list above exist.
"Null values in the corresponding fields of two records will be considered a match"
sources:
http://technet.microsoft.com/en-us/library/hh213071.aspx
http://social.msdn.microsoft.com/Forums/sqlserver/en-US/7b52419c-0bb8-4e56-b920-e68ff551bd76/can-i-set-a-matching-rule-to-only-match-if-the-value-is-not-null?forum=sqldataqualityservices
A note when implementing any workaround - the performance will be negatively impacted. You will notice this more with larger data sets.
FWIW The workaround I implemented:
save matching results to a table, include a column isMatchingScoreAdjusted default 0
find all records that have a null match and calculate adjusted value
update results
example proc
DECLARE #GIVEN_NAME FLOAT = 22;
WITH adjustedscore
AS (
SELECT c.MatchingScore
+ case when p.GIVEN_NAME is null and c.GIVEN_NAME is null then -#GIVEN_NAME else 0 end
as [AdjustedMatchingScore]
,c.RecordId
FROM [dbo].[dqs_matches] p
INNER JOIN [dbo].[dqs_matches] c ON c.SiblingId = p.RecordId
WHERE c.IsPivot = 0
AND p.GIVEN_NAME IS NULL
AND c.GIVEN_NAME IS NULL
)
UPDATE m SET MatchingScore = a.AdjustedMatchingScore, isMatchingScoreAdjusted = 1
FROM adjustedscore a
INNER JOIN [dbo].[dqs_matches] m ON m.RecordId = a.RecordId
where m.isMatchingScoreAdjusted = 0
I found acceptable solution.
Empty fields inside composite domain are not considered as matched.
By the way if all the composite domain fields of two records are blank, then these domains are considered as matched on 100%. But I am quite satisfied this that.

Splitting text in SQL Server stored procedure

I'm working with a database, where one of the fields I extract is something like:
1-117 3-134 3-133
Each of these number sets represents a different set of data in another table. Taking 1-117 as an example, 1 = equipment ID, and 117 = equipment settings.
I have another table from which I need to extract data based on the previous field. It has two columns that split equipment ID and settings. Essentially, I need a way to go from the queried column 1-117 and run a query to extract data from another table where 1 and 117 are two separate corresponding columns.
So, is there anyway to split this number to run this query?
Also, how would I split those three numbers (1-117 3-134 3-133) into three different query sets?
The tricky part here is that this column can have any number of sets here (such as 1-117 3-133 or 1-117 3-134 3-133 2-131).
I'm creating these queries in a stored procedure as part of a larger document to display the extracted data.
Thanks for any help.
Since you didn't provide the DB vendor, here's two posts that answer this question for SQL Server and Oracle respectively...
T-SQL: Opposite to string concatenation - how to split string into multiple records
Splitting comma separated string in a PL/SQL stored proc
And if you're using some other DBMS, go search for "splitting text ". I can almost guarantee you're not the first one to ask, and there's answers for every DBMS flavor out there.
As you said the format is constant though, you could also do something simpler using a SUBSTRING function.
EDIT in response to OP comment...
Since you're using SQL Server, and you said that these values are always in a consistent format, you can do something as simple as using SUBSTRING to get each part of the value and assign them to T-SQL variables, where you can then use them to do whatever you want, like using them in the predicate of a query.
Assuming that what you said is true about the format always being #-### (exactly 1 digit, a dash, and 3 digits) this is fairly easy.
WITH EquipmentSettings AS (
SELECT
S.*,
Convert(int, Substring(S.AwfulMultivalue, V.Value * 6 - 5, 1) EquipmentID,
Convert(int, Substring(S.AwfulMultivalue, V.Value * 6 - 3, 3) Settings
FROM
SourceTable S
INNER JOIN master.dbo.spt_values V
ON V.Value BETWEEN 1 AND Len(S.AwfulMultivalue) / 6
WHERE
V.type = 'P'
)
SELECT
E.Whatever,
D.Whatever
FROM
EquipmentSettings E
INNER JOIN DestinationTable D
ON E.EquipmentID = D.EquipmentID
AND E.Settings = D.Settings
In SQL Server 2005+ this query will support 1365 values in the string.
If the length of the digits can vary, then it's a little harder. Let me know.
Incase if the sets does not increase by more than 4 then you can use Parsename to retrieve the result
Declare #Num varchar(20)
Set #Num='1-117 3-134 3-133'
select parsename(replace (#Num,' ','.'),3)
Result :- 1-117
Now again use parsename on the same resultset
Select parsename(replace(parsename(replace (#Num,' ','.'),3),'-','.'),1)
Result :- 117
If the there are more than 4 values then use split functions