I have a rugby database with a player table. In the player table I have a performance column, and I want to represent the performance as
0 = low
1 = medium
2 = high
I don't know what datatype the column should be, or what formula or function converts the stored value back to its label.
Please help
You can define your column like this:
performance tinyint not null check (performance in (0, 1, 2))
tinyint takes only 1 byte for a value and values can range from 0 to 255.
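For context, a minimal sketch of the full table definition (the table and the other column names are assumptions, not from the question):

CREATE TABLE player (
    player_id int NOT NULL PRIMARY KEY,
    player_name varchar(100) NOT NULL,
    -- 0 = low, 1 = medium, 2 = high
    performance tinyint NOT NULL CHECK (performance IN (0, 1, 2))
);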
If you store the values as 1 - Low, 2 - Medium, 3 - High and are using SQL Server 2012+, then you can simply use the CHOOSE function to convert the value to text in your SELECT, like this:
select choose(performance,'Low','Medium','High')
. . .
If you really want to store the values as 0, 1, 2, use:
select choose(performance+1,'Low','Medium','High')
. . .
If you are using an earlier version of SQL Server, you can use CASE like this:
case performance
when 0 then 'Low'
when 1 then 'Medium'
when 2 then 'High'
end
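Wrapped in a full query, that might look like this (a sketch; player_name is an assumed column):

SELECT player_name,
       CASE performance
           WHEN 0 THEN 'Low'
           WHEN 1 THEN 'Medium'
           WHEN 2 THEN 'High'
       END AS performance_label
FROM player;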
1- The column datatype should be int.
2- Where you send the data, map the performance to a number first, like:
if (performance = low)
    perVar = 0
then send perVar into the database
There are a number of ways you can handle this. One way would be to represent the performance using an int column, which would take on values 0, 1, 2, .... To get the labels for those performances, you could create a separate table which would map those numbers to descriptive strings, e.g.
id | text
0 | low
1 | medium
2 | high
You would then join to this table whenever you needed the full text description. Note that this is probably the only option which will scale as the number of performance types starts to get large.
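A minimal sketch of that mapping table and the join (the names performance_label and player_name are assumptions):

CREATE TABLE performance_label (
    id int NOT NULL PRIMARY KEY,
    [text] varchar(20) NOT NULL
);

INSERT INTO performance_label (id, [text])
VALUES (0, 'low'), (1, 'medium'), (2, 'high');

-- join whenever the full text description is needed
SELECT p.player_name, pl.[text] AS performance
FROM player AS p
JOIN performance_label AS pl ON pl.id = p.performance;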
If you don't want a separate table, you could also use a CASE expression to generate labels when querying, e.g.
CASE WHEN id = 0 THEN 'low'
WHEN id = 1 THEN 'medium'
WHEN id = 2 THEN 'high'
END
I would use a TINYINT datatype for the performance column to conserve space, then add a FOREIGN KEY constraint referencing a second table which holds the descriptions. The constraint would force the entry of 0, 1, or 2 in the performance column while providing a normalized solution that could grow to include additional performance metrics.
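A sketch of that arrangement (the table and constraint names are assumptions):

CREATE TABLE performance_type (
    id tinyint NOT NULL PRIMARY KEY,
    description varchar(20) NOT NULL
);

INSERT INTO performance_type (id, description)
VALUES (0, 'low'), (1, 'medium'), (2, 'high');

-- the foreign key now restricts player.performance to 0, 1 or 2
ALTER TABLE player
ADD CONSTRAINT FK_player_performance
FOREIGN KEY (performance) REFERENCES performance_type (id);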
Related
I have created a view in my SQL Server database which gives me a number of columns.
One of the column heading is Priority and the values in this column are Low, Medium, High and Immediate.
When I execute this view, the result is returned perfectly like below. I want to change or assign values for these priorities. For example: instead of Low I should get 4, instead of Medium I should get 3, for High it should be 2 and for Immediate it should be 1.
What should I do to achieve this?
Ticket#  Priority
123      Low
1254     Low
5478     Medium
4585     High
etc., etc.,
Use CASE:
Instead of Low I should get 4, instead of Medium I should get 3, for
High it should be 2 and for Immediate it should be 1
SELECT
[Ticket#],
[Priority] = CASE Priority
WHEN 'Low' THEN 4
WHEN 'Medium' THEN 3
WHEN 'High' THEN 2
WHEN 'Immediate' THEN 1
ELSE NULL
END
FROM table_name;
EDIT:
If you use a dictionary table as in George Botros's solution, you need to remember to:
1) Maintain and store the dictionary table
2) Add a UNIQUE index to Priority.Name (sketched after this list) to avoid duplicates like:
Priority table
--------------------
Id | Name | Value
--------------------
1 | Low | 4
2 | Low | 4
...
3) Use a LEFT JOIN instead of an INNER JOIN, so that you defensively get all results even if there is no corresponding value in the dictionary table.
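Sketches of points 2) and 3), using the Priority table above and assuming the ticket view/table is called Ticket:

-- 2) allow each priority name only once
CREATE UNIQUE INDEX UX_Priority_Name ON Priority ([Name]);

-- 3) LEFT JOIN keeps tickets whose priority has no dictionary entry
SELECT t.[Ticket#], p.[Value]
FROM Ticket AS t
LEFT JOIN Priority AS p ON p.[Name] = t.[Priority];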
I have an alternative solution for your problem: create a new Priority table (Id, Name, Value).
By joining to this table you will be able to select the Value column:
SELECT Ticket.*, Priority.Value
FROM Ticket INNER JOIN Priority
ON Priority.Name = Ticket.Priority
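For completeness, a sketch of that Priority table, populated with the values from the question:

CREATE TABLE Priority (
    Id int NOT NULL PRIMARY KEY,
    [Name] varchar(20) NOT NULL,
    [Value] int NOT NULL
);

INSERT INTO Priority (Id, [Name], [Value])
VALUES (1, 'Low', 4),
       (2, 'Medium', 3),
       (3, 'High', 2),
       (4, 'Immediate', 1);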
Note: although using the CASE keyword is the most straightforward solution for this problem, this solution may be useful if you need the priority value in many places in your system.
Beloved SO Cronies,
I'm trying to custom sort bandwidth data using ORDER BY or any performance-focused solution likely involving a temp table. I've scoured SO and Google and have only turned up parts of functions that I can use, so I've arrived at posting here as a final stop.
Data (example)
VALUE
---------
10 Kbps
5 Kbps
1 Mbps
10 Mbps
100 Mbps
10 Gbps
1 Gbps
SQL fiddle with the below. Can you hear it playing in the background?
Bandwidth Sorting Start (SQL Fiddle)
select * from Bandwidth
order by (
case
when Value like '%kbps%' then 1
when Value like '%mbps%' then 2
when Value like '%gbps%' then 3
else 4
end)
My thinking is to split the number out of the string Value, run a CASE on the metric type (e.g. Kbps, Mbps), apply a multiplier to the number based on that, and put the result in a temp table so I can sort on an int without showing the column in the results!
Thanks in advance. I tried to post on DBA StackExchange but existing work location presently blocks the login creation there.
Just use the space delimiter to separate out the numbers and convert them to integers as a secondary sort key:
order by
(
case
when Value like '%Kbps%' then 1
when Value like '%Mbps%' then 2
when Value like '%Gbps%' then 3
else 4
end) ,
CONVERT(INT, SUBSTRING(Value, 1, CHARINDEX(' ', Value) - 1))
FIDDLE
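The multiplier idea from the question also works as a single sort key that normalizes every row to Kbps (a sketch, assuming every Value is a number, a space, and a unit):

select * from Bandwidth
order by
    convert(bigint, substring(Value, 1, charindex(' ', Value) - 1))
    * case
          when Value like '%Kbps%' then 1
          when Value like '%Mbps%' then 1000       -- 1 Mbps = 1,000 Kbps
          when Value like '%Gbps%' then 1000000    -- 1 Gbps = 1,000,000 Kbps
          else 0
      end;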
I have two dimensions, DimFlag and DimPNL, and a fact table, FactAmount. I am looking to do the following:
When the PNL is stat (Is Stat = 1): sum(Actual x FlagId)
That is, for such a PNL I multiply the amounts by the FlagId field, so closed rows (FlagId = 0) contribute 0 x Actual = 0 ...
DimFlag
FlagId FlagLabel
-----------------
1 NotClosed
0 IsClosed
DimPNL
PNLId PNLName Is Stat
1 a 1
2 test 1
3 test2 0
FactAmount
id PNLId FlagId Actual
1 1 1 100
2 2 1 10
3 3 0 120
I tried the following MDX, but it didn't work. Any ideas?
Scope (
[Dim PNL].[PNL].members,[Measures].members
);
this = iif([Dim PNL].[PNL].CurrentMember.Properties("Is Stat") =1
,
aggregate([Dim PNL].[PNL].currentmember,[Measures].currentmember)* iif([Dim Flag].[Flag Label].[Flag Label].currentmember = 0, 0, 1),
aggregate([Dim PNL].[PNL].currentmember,[Measures].currentmember)
);
While this type of calculation can be done in MDX, the MDX gets complex and performs badly. I would suggest doing the calculation explicitly, e.g. in the DSV or in a view on the fact table that you then use instead of the fact table directly in the DSV. The result of the calculation would then be another column on which you can base a standard measure.
To do it in the DSV, assuming you use a relational table as the base for the fact table, add a named calculation to it, define the column name however you like, and use the expression Actual * FlagID. For the other calculation, you may need a subselect, or you could hard-code the stat PNLs, i.e. the expression would be Actual * case when PNLId in (1, 2) then 1 else 0 end. You can use any SQL that works as a column expression in a select list as the expression for a named calculation.
Implementing the same in a view on FactAmount, you could implement the second expression better, as you could then join the DimPNL table in the view definition and thus use its Is Stat column in the calculation. You would then replace the table FactAmount by the view, which has the two additional measure columns.
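A sketch of such a view (table and column names taken from the question; I assume [Is Stat] is stored as 0/1):

CREATE VIEW dbo.vFactAmount AS
SELECT f.id,
       f.PNLId,
       f.FlagId,
       f.Actual,
       f.Actual * f.FlagId AS ActualNotClosed,                   -- 0 when FlagId = 0 (IsClosed)
       f.Actual * f.FlagId * p.[Is Stat] AS ActualStatNotClosed  -- only counted for stat PNLs
FROM dbo.FactAmount AS f
JOIN dbo.DimPNL AS p ON p.PNLId = f.PNLId;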
In either case, just define two measures on the two new columns in the cube, and you are done.
As a rule, calculations that are done at record level in the fact table before any aggregation should be done at data loading time, i.e. as described above.
I am working on a tag recommendation system that takes metadata strings (e.g. text descriptions) of an object, and splits it into 1-, 2- and 3-grams.
The data for this system is kept in 3 tables:
The "object" table (e.g. what is being described),
The "token" table, filled with all 1-, 2- and 3-grams found (examples below), and
The "mapping" table, which maintains associations between (1) and (2), as well as a frequency count for these occurrences.
I am therefore able to construct a table via a LEFT JOIN, that looks somewhat like this:
SELECT mapping.object_id, mapping.token_id, mapping.freq, token.token_size, token.token
FROM mapping LEFT JOIN
token
ON (mapping.token_id = token.id)
WHERE mapping.object_id = 1;
object_id | token_id | freq | token_size | token
----------+----------+------+------------+--------------
        1 |        1 |    1 |          2 | 'a big'
        1 |        2 |    1 |          1 | 'a'
        1 |        3 |    1 |          1 | 'big'
        1 |        4 |    2 |          3 | 'a big slice'
        1 |        5 |    1 |          1 | 'slice'
        1 |        6 |    3 |          2 | 'big slice'
Now I'd like to be able to get the relative probability of each term within the context of a single object ID, so that I can sort them by probability and see which terms are most probable (e.g. ORDER BY rel_prob DESC LIMIT 25).
For each row, I'm envisioning the addition of a column which gives the result of freq/sum of all freqs for that given token_size. In the case of 'a big', for instance, that would be 1/(1+3) = 0.25. For 'a', that's 1/3 = 0.333, etc.
I can't, for the life of me, figure out how to do this. Any help is greatly appreciated!
If I understood your problem, here's the query you need
select
m.object_id, m.token_id, m.freq,
t.token_size, t.token,
cast(m.freq as decimal(29, 10)) / sum(m.freq) over (partition by t.token_size, m.object_id) as rel_prob
from mapping as m
left outer join token as t on m.token_id = t.id
where m.object_id = 1;
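To pull the top terms the way the question describes (a sketch; note that SQL Server spells LIMIT as TOP):

select top (25)
    m.object_id, m.token_id, t.token, t.token_size,
    cast(m.freq as decimal(29, 10))
        / sum(m.freq) over (partition by t.token_size, m.object_id) as rel_prob
from mapping as m
left outer join token as t on m.token_id = t.id
where m.object_id = 1
order by rel_prob desc;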
sql fiddle example
hope that helps
I've got a table ItemValue full of data on a SQL 2005 Server running in 2000 compatibility mode that looks something like (it's a User-Defined values table):
ID ItemCode FieldID Value
-- ---------- ------- ------
1 abc123 1 D
2 abc123 2 287.23
4 xyz789 1 A
5 xyz789 2 3782.23
6 xyz789 3 23
7 mno456 1 W
9 mno456 3 45
... and so on.
FieldID comes from the ItemField table:
ID FieldNumber DataFormatID Description ...
-- ----------- ------------ -----------
1 1 1 Weight class
2 2 4 Cost
3 3 3 Another made up description
. . x xxx
. . x xxx
. . x xxx
x 91 (we have 91 user-defined fields)
Because I can't PIVOT in 2000 mode, we're stuck building an ugly query using CASEs and GROUP BY to get the data to look how it should for some legacy apps, which is:
ItemNumber Field1 Field2 Field3 .... Field51
---------- ------ ------- ------
abc123 D 287.23 NULL
xyz789 A 3782.23 23
mno456 W NULL 45
You can see we only need this table to show values up to the 51st UDF. Here's the query:
SELECT
iv.ItemNumber
,MAX(CASE WHEN f.FieldNumber = 1 THEN iv.[Value] ELSE NULL END) [Field1]
,MAX(CASE WHEN f.FieldNumber = 2 THEN iv.[Value] ELSE NULL END) [Field2]
,MAX(CASE WHEN f.FieldNumber = 3 THEN iv.[Value] ELSE NULL END) [Field3]
...
,MAX(CASE WHEN f.FieldNumber = 51 THEN iv.[Value] ELSE NULL END) [Field51]
FROM ItemField f
LEFT JOIN ItemValue iv ON f.ID = iv.FieldID
WHERE f.FieldNumber <= 51
GROUP BY iv.ItemNumber
When the FieldNumber constraint is <= 51, the execute plan goes something like:
SELECT <== Compute Scalar <== Stream Aggregate <== Sort (Cost: 70%) <== Hash Match <== (Clustered Index Seek && Table Scan)
and it's fast! I can pull back 100,000+ records in about a second, which suits our needs.
However, if we had more UDFs and I change the constraint to anything above 66 (yes, I tested them one by one) or if I remove it completely, I lose the Sort in the Execution plan, and it gets replaced with a whole bunch of Parallelism blocks that gather, repartition, and distribute streams, and the entire thing is slow (30 seconds for even just 1 record).
FieldNumber has a clustered, unique index, and is part of composite primary key with the ID column (non-clustered index) in the ItemField table. The ItemValue table's ID and ItemNumber columns make a PK, and there is an extra non-clustered index on the ItemNumber column.
What is the reasoning behind this? Why does changing my simple integer constraint change the entire execution plan?
And if you're up to it... what would you do differently? There's a SQL upgrade planned for a couple months from now but I need to get this problem fixed before that.
SQL Server is smart enough to take CHECK constraints into account when optimizing the queries.
Your f.FieldNumber <= 51 is optimized out, and the optimizer sees that the two tables should be joined in their entirety (which is best done with a HASH JOIN).
If you don't have the constraint, the engine needs to check the condition and most probably uses index traversal to do this. This may be slower.
Could you please post the whole plans for the queries? Just run SET SHOWPLAN_TEXT ON and then run the queries.
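That is, something like:

SET SHOWPLAN_TEXT ON;
GO
-- the query is only compiled; the estimated plan is returned instead of the results
SELECT iv.ItemNumber  -- ... rest of the query ...
FROM ItemField f
LEFT JOIN ItemValue iv ON f.ID = iv.FieldID
WHERE f.FieldNumber <= 51
GROUP BY iv.ItemNumber;
GO
SET SHOWPLAN_TEXT OFF;
GO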
Update:
What is the reasoning behind this? Why does changing my simple integer constraint change the entire execution plan?
If by a constraint you mean the WHERE condition, then this is probably something else.
Set operations (that's what SQL does) have no single most efficient algorithm: efficiency of each algorithm depends heavily on the data distribution in the sets.
Say, for taking a subset (that's what the WHERE clause does) you can either find the range of records in the index and use the index record pointers to locate the data rows in the table, or just scan all records in the table and filter them with the WHERE condition.
Efficiency of the former operation is m × const, that of the latter is n, where m is the number of record satisfying the condition, n is the total number of records in the table and const > 1.
This means that for larger values of m the full scan is more efficient.
SQL Server is aware of that and changes execution plans according to the constants that affect the data distribution in the set operations.
To do this, SQL Server maintains statistics: aggregated histograms of the data distribution in each indexed column, and uses them to build the query plans.
So changing the integer in the WHERE condition in fact affects the size and the data distribution of the underlying sets, and makes SQL Server reconsider the algorithms best fitted to sets of that size and layout.
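You can inspect those histograms yourself with DBCC SHOW_STATISTICS (a sketch; the statistics name IX_ItemField_FieldNumber is an assumption):

DBCC SHOW_STATISTICS ('dbo.ItemField', 'IX_ItemField_FieldNumber');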
it gets replaced with a whole bunch of Parallelism blocks
Try this:
SELECT
iv.ItemNumber
,MAX(CASE WHEN f.FieldNumber = 1 THEN iv.[Value] ELSE NULL END) [Field1]
,MAX(CASE WHEN f.FieldNumber = 2 THEN iv.[Value] ELSE NULL END) [Field2]
,MAX(CASE WHEN f.FieldNumber = 3 THEN iv.[Value] ELSE NULL END) [Field3]
...
,MAX(CASE WHEN f.FieldNumber = 51 THEN iv.[Value] ELSE NULL END) [Field51]
FROM ItemField f
LEFT JOIN ItemValue iv ON f.ID = iv.FieldID
WHERE f.FieldNumber <= 51
GROUP BY iv.ItemNumber
OPTION (Maxdop 1)
By using OPTION (MAXDOP 1), you should prevent the parallelism in the execution plan.
At 66 you are hitting some internal cost-estimate threshold that decides it is better to use one plan than the other. What that threshold is and why it happens is not really important. Note that your queries differ with each FieldNumber value, as you are not only changing the WHERE clause: you also change the pseudo-'pivot' projected fields.
Now I don't know all the details of your tables, your queries, and your insert/update/delete pattern, but for the particular query you posted the proper clustered index structure for the ItemValue table is this:
CREATE CLUSTERED INDEX [cdxItemValue] ON ItemValue (FieldID, ItemNumber);
This structure eliminates the need for an intermediate sort of the results for this 'pivot' query.