Compound Operators with T-SQL - sql

USE Saleslogix
DECLARE @AssumedGrowth int
SET @AssumedGrowth = 28
SELECT
account,
employees as NumberIn2013,
@AssumedGrowth += employees as NumberIn2014
FROM sysdba.account
WHERE employees <> 'NULL'
and account like 'Shaw%'
It's telling me that += is invalid and that only + works. Can someone help me get this example to work with a compound operator? I don't know if it makes much difference, but I am using 2005 Management Studio.
Also, if it's not a huge pain, could you add the same example with @AssumedGrowth being a percentage?

What you are trying to do is this:
SELECT account, employees as NumberIn2013,
(@AssumedGrowth = @AssumedGrowth + employees) as NumberIn2014
FROM sysdba.account
WHERE employees IS NOT NULL and account like 'Shaw%';
But that will not work: a SELECT that assigns a value to a variable cannot also return columns, and compound operators such as += were only introduced in SQL Server 2008. I would instead suggest using the built-in capabilities, in particular, row_number():
SELECT account, employees as NumberIn2013,
employees * power(1 + @AssumedGrowth/100.0, row_number() over (order by <field>) - 1) as NumberIn2014
FROM sysdba.account
WHERE employees IS NOT NULL and account like 'Shaw%';
Note that you need to specify the ordering for the results. Presumably, there is some sort of id or datetime column that has the appropriate order. Tables represent unordered sets, so there is no "first" row.
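For the percentage variant asked about in the question, the arithmetic needs nothing more than plain + and *. Here is a minimal sketch, run through Python's built-in sqlite3 module rather than SQL Server, so the table contents and the parameter names growth and growth_pct are assumptions for illustration only. It shows a fixed increase and a percentage increase side by side.

```python
import sqlite3

# Hypothetical stand-in for sysdba.account; the rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (account TEXT, employees INTEGER)")
conn.executemany(
    "INSERT INTO account VALUES (?, ?)",
    [("Shaw Industries", 100), ("Shaw Media", None), ("Acme", 50)],
)

# Plain + and * are enough; no compound operator is needed in a SELECT list.
rows = conn.execute(
    """
    SELECT account,
           employees                             AS NumberIn2013,
           employees + :growth                   AS FixedGrowth,
           employees * (1 + :growth_pct / 100.0) AS PercentGrowth
    FROM account
    WHERE employees IS NOT NULL   -- a real NULL test, not a comparison with the string 'NULL'
      AND account LIKE 'Shaw%'
    """,
    {"growth": 28, "growth_pct": 28},
).fetchall()

print(rows)
```

Dividing by 100.0 rather than 100 keeps the percentage from being truncated by integer division, which matters in T-SQL as well.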

Related

SQL Server : case when statement in query using LEFT(xxx,15)

I have a query I'm working on in Microsoft SQL Server Management Studio and I'm not sure how to accomplish something.
Here's the current query:
SELECT DISTINCT
PRONOTES.CPK,
REPLACE(PRONOTES.SUBJECT, ',','') AS SUBJECT,
PRONOTES.CREATOR,
PRONOTES.DATE_CREATED
FROM
PRONOTES
WHERE
DATE_CREATED BETWEEN '2020-01-01' AND '2020-01-31'
My issue is that the software creates a SUBJECT that includes a prescription number when an order is discontinued. So I get results in the SUBJECT column that look like this:
Discontinued RX #2341241341
Discontinued RX #23455859900
All other possible SUBJECTS are locked because users have to select them from a dropdown, it's just this instance that causes that unique value. I'm trying to measure productivity of different users by how many notes they create and what types of notes they create.
I'd like the results to just show "Discontinued RX" instead of including the number, so that when this gets shipped off to excel and a pivot table is created there won't be a million lines because of the uniqueness of that prescription number.
I can't do it with a simple:
LEFT(REPLACE(PRONOTES.SUBJECT, ',', ''), 15)
because then I'll lose too much data from other subjects, so I was wondering how to do this with a CASE WHEN, or whether there's some better way. I thought maybe a modification so that only subjects that start with the words "Discontinued RX" get chopped off.
Right now every discontinued order produces its own SUBJECT value, such as "Discontinued RX #2341241341", but I'd like each of them to read just "Discontinued RX".
You can use a CASE expression on SUBJECT so that when it starts with Discontinued RX that is all you show:
SELECT DISTINCT
PRONOTES.CPK,
CASE WHEN LEFT(PRONOTES.SUBJECT, 15) = 'Discontinued RX' THEN 'Discontinued RX'
ELSE REPLACE(PRONOTES.SUBJECT, ',','')
END AS SUBJECT,
PRONOTES.CREATOR,
PRONOTES.DATE_CREATED
FROM PRONOTES
WHERE DATE_CREATED BETWEEN '2020-01-01' AND '2020-01-31'
The following is a simplistic pattern that you can then adapt into a CASE expression in your SELECT statement as required.
DECLARE @test varchar(100) = 'Discontinued RX #2341241341 Discontinued RX #23455859900'
SELECT CASE WHEN PATINDEX('Discontinued RX #%', @test) > 0 THEN 'Replace' ELSE 'Keep' END
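To see the effect of the CASE expression on a few rows, here is a small sketch using Python's built-in sqlite3 module instead of SQL Server. The sample rows are invented, and substr(subject, 1, 15) stands in for T-SQL's LEFT(subject, 15), so treat it as an illustration of the idea rather than the exact production query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pronotes (cpk INTEGER, subject TEXT)")
conn.executemany(
    "INSERT INTO pronotes VALUES (?, ?)",
    [
        (1, "Discontinued RX #2341241341"),
        (2, "Discontinued RX #23455859900"),
        (3, "Refill request, urgent"),  # hypothetical dropdown subject
    ],
)

# Only subjects that start with 'Discontinued RX' (15 characters) are collapsed;
# every other subject just has its commas removed.
rows = conn.execute(
    """
    SELECT cpk,
           CASE WHEN substr(subject, 1, 15) = 'Discontinued RX'
                THEN 'Discontinued RX'
                ELSE replace(subject, ',', '')
           END AS subject
    FROM pronotes
    ORDER BY cpk
    """
).fetchall()

print(rows)
```

Because both discontinued rows now share one SUBJECT value, a pivot table built on this output groups them together.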

Oracle SQL Developer Create View Assignment

I've got an assignment with the following instructions:
Create a view named A11T1 (that's A-One-One-T-One, not A-L-L-T-L) that will display the concatenated name, JobTitle and Salary of the people who have a Cat value of N and whose salary is at least 30 percent higher than the average salary of all people who have a Cat value of N. The three column headings should be Name, JobTitle and Salary. The rows should be sorted in traditional phonebook order.
Note 1: As always, concatenated names must appear with one space between the first and last names.
Note 2: The concatenated names and job titles must be displayed in proper case (e.g., Mary Ellen Smith, Assistant Manager) for this task.
Note 3: Remember, the Person11 data is messy. Be sure to look for N and n when you are identifying the people with a Cat value of N.
What I have so far is:
CREATE VIEW A11T1 AS
SELECT INITCAP(FNAME||' '||LNAME) AS "Name", INITCAP(JobTitle), Salary
FROM PERSON11
WHERE UPPER(CAT) = 'N'
GROUP by INITCAP(FNAME||' '||LNAME), INITCAP(JobTitle), Salary
HAVING SALARY >= 1.3 * ROUND(AVG(SALARY),0)
Order by LNAME, FNAME
The error I'm currently getting is:
Error at Command Line:7 Column:10 Error report: SQL Error: ORA-00979: not a GROUP BY expression 00979. 00000 - "not a GROUP BY expression"
No matter how much I edit my code it just won't create into a view and I've been stuck on this for hours! I appreciate any responses, even a point in the right direction.
Why do you need to "group by" concatenated name, job title and salary? Do you have more than one row per name?
Perhaps it's because you need to compute the average salary and that requires aggregation? You can't do everything in a single SELECT statement in SQL (at least not with simple tools - you seem to be in the early stages of learning and not looking to use window functions).
The "avg salary" needs to come from a subquery. Where you have >= 1.3 * round(...) you should have instead:
... >= 1.3 * (select avg(salary) from person11 where cat = 'N')
Note that the subquery must be enclosed in parentheses. In your code I see you use upper(cat) - is there a concern that cat may be upper or lower case? In that case it may be better to write
cat in ('n', 'N')
Avoid wrapping column values inside functions whenever possible (that often leads to worse performance). Also, I see no need to round the average salary in your requirements - and in any case, what's the point to rounding to zero decimal places if you then multiply by 1.3? Rounding may actually lead to incorrect output.
EDIT: Sorry, to clarify: I think you are well on your way already. Use the subquery for the average salary, remove the group by (which doesn't hurt anything but is really unneeded), and if you care to, change the upper(cat) as I suggested; I think your query will work with these changes.
Good luck!
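The subquery approach can be tried on toy data. This is only a sketch: it uses Python's built-in sqlite3 module rather than Oracle, the rows are invented, and the INITCAP formatting and the CREATE VIEW wrapper are left out because SQLite has no INITCAP. It just shows the scalar subquery doing the filtering.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person11 (fname TEXT, lname TEXT, cat TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO person11 VALUES (?, ?, ?, ?)",
    [
        ("Ann", "Baker", "N", 100),
        ("Bob", "Clark", "n", 100),   # messy data: lower-case n
        ("Cy", "Dunn", "N", 100),
        ("Di", "Evans", "N", 200),
        ("Ed", "Ford", "Y", 1000),    # different category, must be ignored
    ],
)

# The average is computed once by the scalar subquery, over Cat = N/n only.
rows = conn.execute(
    """
    SELECT fname || ' ' || lname AS name, salary
    FROM person11
    WHERE cat IN ('n', 'N')
      AND salary >= 1.3 * (SELECT AVG(salary)
                           FROM person11
                           WHERE cat IN ('n', 'N'))
    ORDER BY lname, fname
    """
).fetchall()

print(rows)
```

With these rows the average for category N is 125, so the cutoff is 162.5 and only one person qualifies. Note that the subquery filters on both 'n' and 'N', just like the outer query.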
I think the easiest way uses window functions:
CREATE VIEW A11T1 AS
SELECT INITCAP(FNAME || ' ' || LNAME) AS "Name", INITCAP(JobTitle) AS "JobTitle", Salary AS "Salary"
FROM (SELECT p.*, AVG(SALARY) OVER () as avg_salary
FROM PERSON11 p
WHERE UPPER(CAT) = 'N'
) p
WHERE SALARY >= 1.3 * avg_salary
ORDER BY LNAME, FNAME ;

How to get most popular name by year in SQL Server

I am practicing SQL in Microsoft SQL Server 2012 (not a homework question), and have a table Names. The table shows baby names by year, with columns Sex (gender of name), N (number of babies having that name), Yr (year), and Name (the name itself).
I need to write a query using only one SELECT statement that returns the most popular baby name by year, with gender, the year, and the number of babies named. So far I have;
SELECT *
From Names
ORDER By N DESC;
Which gives the highest values of N in DESC order, repeating years. I need to limit it to only the highest value in each year, and everything I have tried to do so has thrown errors. Any advice you can give me for this would be appreciated.
Off the top of my head, something like the following would normally let you do it in (technically) one SELECT statement. That statement includes sub-SELECTs, but I'm not immediately seeing an alternative that wouldn't.
When there are joint top-ranking names, both queries bring back all of the joint top results, so there may be more than one row per year. If you then just need a single representative row from those results, look at using select top 1, perhaps adding order by to get the first alphabetically.
Most popular by year regardless of gender:
-- ONE PER YEAR:
SELECT n.Yr, n.Name, n.Sex, n.N FROM Names n
WHERE NOT EXISTS (
SELECT 1 FROM Names n2
WHERE n2.Yr = n.Yr
AND n2.N > n.N
)
Most popular by year for each gender:
-- ONE PER GENDER PER YEAR:
SELECT n.Yr, n.Name, n.Sex, n.N FROM Names n
WHERE NOT EXISTS (
SELECT 1 FROM Names n2
WHERE n2.Yr = n.Yr
AND n2.Sex = n.Sex
AND n2.N > n.N
)
Despite the verbosity of the SQL, performance with this pattern is usually on a par with the alternatives (and often better).
There are other approaches, including ones that use GROUP BY, but personally I find this one more readable and more portable across DBMSs.
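Here is a small runnable sketch of the NOT EXISTS pattern, using Python's built-in sqlite3 module instead of SQL Server. It uses the column names from the question (Sex, N, Yr, Name); the rows and the aliases cur and other are made up for illustration. The 2001 data contains a deliberate tie to show that joint winners are all returned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Names (Sex TEXT, N INTEGER, Yr INTEGER, Name TEXT)")
conn.executemany(
    "INSERT INTO Names VALUES (?, ?, ?, ?)",
    [
        ("F", 10, 2000, "Emma"),
        ("M", 12, 2000, "Jack"),
        ("F", 7, 2000, "Ava"),
        ("F", 9, 2001, "Emma"),
        ("M", 9, 2001, "Liam"),   # ties with Emma in 2001
        ("M", 3, 2001, "Jack"),
    ],
)

# A row survives only if no other row in the same year has a larger N.
per_year = conn.execute(
    """
    SELECT cur.Yr, cur.Name, cur.Sex, cur.N
    FROM Names cur
    WHERE NOT EXISTS (SELECT 1 FROM Names other
                      WHERE other.Yr = cur.Yr AND other.N > cur.N)
    ORDER BY cur.Yr, cur.Name
    """
).fetchall()

# Same idea, but the comparison is restricted to rows of the same sex.
per_year_and_sex = conn.execute(
    """
    SELECT cur.Yr, cur.Name, cur.Sex, cur.N
    FROM Names cur
    WHERE NOT EXISTS (SELECT 1 FROM Names other
                      WHERE other.Yr = cur.Yr
                        AND other.Sex = cur.Sex
                        AND other.N > cur.N)
    ORDER BY cur.Yr, cur.Sex
    """
).fetchall()

print(per_year)
print(per_year_and_sex)
```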

Sorting SQL by first two characters of fields

I'm trying to sort some data by sales person initials, and the sales rep field is 3 chars long, and is Firstname, Lastname and Account type. So, Bob Smith would be BS* and I just need to sort by the first two characters.
How can I pull all data for a certain rep, where the first two characters of the field equals BS?
In some databases you can actually do
select * from SalesRep order by substring(SalesRepID, 1, 2)
Others require you to
select *, Substring(SalesRepID, 1, 2) as foo from SalesRep order by foo
And in still others, you can't do it at all (but will have to sort your output in program code after you get it from the database).
Addition: If you actually want just the data for one sales rep, do as the others suggest. Otherwise, either you want to sort by the thing or maybe group by the thing.
What about this
SELECT * FROM SalesTable WHERE SalesRepField LIKE 'BS_'
I hope that you never end up with two sales reps who happen to have the same initials.
Also, sorting and filtering are two completely different things. You talk about sorting in the question title and first paragraph, but your question is about filtering. Since you can just ORDER BY on the field and it will use the first two characters anyway, I'll give you an answer for the filtering part.
You don't mention your RDBMS, but this will work in any product:
SELECT
my_columns
FROM
My_Table
WHERE
sales_rep LIKE 'BS%'
If you're using a variable/parameter then:
SELECT
my_columns
FROM
My_Table
WHERE
sales_rep LIKE @my_param + '%'
You can also use:
LEFT(sales_rep, 2) = 'BS'
I would stay away from:
SUBSTRING(sales_rep, 1, 2) = 'BS'
Depending on your SQL engine, it might not be smart enough to realize that it can use an index on the last one.
You haven't said what DBMS you are using. The following would work in Oracle, and something like them in most other DBMSs
1) where sales_rep like 'BS%'
2) where substr(sales_rep,1,2) = 'BS'
SELECT * FROM SalesRep
WHERE SUBSTRING(SalesRepID, 1, 2) = 'BS'
You didn't say what database you were using, this works in MS SQL Server.
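Since the answers distinguish filtering from sorting, here is a short sketch of both, run through Python's built-in sqlite3 module. The table contents and the Amount column are hypothetical, and SQLite's substr and || take the place of the engine-specific SUBSTRING and + shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SalesRep (SalesRepID TEXT, Amount INTEGER)")
conn.executemany(
    "INSERT INTO SalesRep VALUES (?, ?)",
    [("BSA", 10), ("JDA", 20), ("BSB", 30), ("BTA", 40)],
)

initials = "BS"

# Filtering: a prefix LIKE whose pattern is built from a bound parameter.
by_like = conn.execute(
    "SELECT SalesRepID, Amount FROM SalesRep "
    "WHERE SalesRepID LIKE ? || '%' ORDER BY SalesRepID",
    (initials,),
).fetchall()

# Sorting: order every row by its first two characters.
by_prefix = conn.execute(
    "SELECT SalesRepID FROM SalesRep "
    "ORDER BY substr(SalesRepID, 1, 2), SalesRepID"
).fetchall()

print(by_like)
print(by_prefix)
```

One caveat: SQLite's LIKE is case-insensitive for ASCII letters by default, whereas in other engines this depends on the collation.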

Aggregate SQL Function to grab only the first from each group

I have 2 tables - an Account table and a Users table. Each account can have multiple users. I have a scenario where I want to execute a single query/join against these two tables, but I want all the Account data (Account.*) and only the first set of user data (specifically their name).
Instead of doing a "min" or "max" on my aggregated group, I wanted to do a "first". But, apparently, there is no "First" aggregate function in TSQL.
Any suggestions on how to go about getting this query? Obviously, it is easy to get the cartesian product of Account x Users:
SELECT User.Name, Account.* FROM Account, User
WHERE Account.ID = User.Account_ID
But how might I go about getting only the first user from the product, based on the order of their User.ID?
Rather than grouping, go about it like this...
select
*
from account a
join (
select
account_id,
row_number() over (order by account_id, id) -
rank() over (order by account_id) as row_num
from [user]
) first on first.account_id = a.id and first.row_num = 0
I know my answer is a bit late, but that might help others. There is a way to achieve a First() and Last() in SQL Server, and here it is :
Stuff(Min(Convert(Varchar, DATE_FIELD, 126) + Convert(Varchar, DESIRED_FIELD)), 1, 23, '')
Use Min() for First() and Max() for Last(). The DATE_FIELD should be the date that determines if it is the first or last record. The DESIRED_FIELD is the field you want the first or the last value. What it does is :
Add the date in ISO format at the start of the string (23 characters long)
Append the DESIRED_FIELD to that string
Get the MIN/MAX value for that field (since it start with the date, you will get the first or last record)
Stuff that concatenated string to remove the first 23 characters (the date part)
Here you go!
EDIT: I ran into a problem with the first formula: when the DATE_FIELD has .000 as milliseconds, SQL Server returns the date as a string with no milliseconds at all, so the STUFF removes the first 4 characters of the DESIRED_FIELD. I simply changed the format to "20" (without milliseconds) and it works fine. The only downside is that if two records were created within the same second, the ordering between them is undefined, in which case you can revert to "126" for the format.
Stuff(Max(Convert(Varchar, DATE_FIELD, 20) + Convert(Varchar, DESIRED_FIELD)), 1, 19, '')
EDIT 2: My original intent was to return the last (or first) NON NULL row. I was asked how to return the last or first row, whether it is null or not. Simply add an ISNULL to the DESIRED_FIELD. When you concatenate two strings with the + operator and one of them is NULL, the result is NULL. So use the following:
Stuff(Max(Convert(Varchar, DATE_FIELD, 20) + IsNull(Convert(Varchar, DESIRED_FIELD), '')), 1, 19, '')
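The prefix-then-strip idea behind this answer is not specific to SQL Server. As a rough sketch, here it is in SQLite through Python's built-in sqlite3 module, with || instead of + and substr(..., 20) instead of STUFF(..., 1, 19, ''). The users table, its columns, and its rows are hypothetical, and the dates are stored as 19-character 'YYYY-MM-DD HH:MM:SS' strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (account_id INTEGER, created TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [
        (1, "2020-01-02 10:00:00", "Zed"),
        (1, "2020-01-01 09:00:00", "Amy"),
        (1, "2020-01-03 08:00:00", "Bob"),
        (2, "2021-05-05 12:00:00", "Cal"),
    ],
)

# Prefix each name with its fixed-width date, take MIN/MAX of the combined
# string, then drop the 19-character date prefix again.
rows = conn.execute(
    """
    SELECT account_id,
           substr(MIN(created || name), 20) AS first_name,
           substr(MAX(created || name), 20) AS last_name
    FROM users
    GROUP BY account_id
    ORDER BY account_id
    """
).fetchall()

print(rows)
```

The trick only works because the date prefix has a fixed width and sorts correctly as text, which is exactly why the 23 versus 19 character issue in the answer matters.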
Select *
From Accounts a
Left Join (
Select u.*,
row_number() over (Partition By u.AccountKey Order By u.UserKey) as Ranking
From Users u
) as UsersRanked
on UsersRanked.AccountKey = a.AccountKey and UsersRanked.Ranking = 1
This can be simplified by using the Partition By clause. In the above, if an account has three users, the subquery numbers them 1, 2, and 3, and the numbering resets for each different AccountKey. This means that for each unique AccountKey there will always be a 1, and potentially 2, 3, 4, etc.
Thus you filter on Ranking=1 to grab the first from each group.
This will give you one row per account, and if there is at least one user for that account, then it will give you the user with the lowest key(because I use a left join, you will always get an account listing even if no user exists). Replace Order By u.UserKey with another field if you prefer that the first user be chosen alphabetically or some other criteria.
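A quick way to check the behaviour described here, including the account with no users, is to run the same shape of query on toy data. This sketch uses Python's built-in sqlite3 module and needs a SQLite build with window functions (3.25 or later). The AccountName and UserName columns and all rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Accounts (AccountKey INTEGER, AccountName TEXT)")
conn.execute(
    "CREATE TABLE Users (UserKey INTEGER, AccountKey INTEGER, UserName TEXT)"
)
conn.executemany(
    "INSERT INTO Accounts VALUES (?, ?)",
    [(1, "Alpha"), (2, "Beta"), (3, "Gamma")],   # Gamma has no users
)
conn.executemany(
    "INSERT INTO Users VALUES (?, ?, ?)",
    [(10, 1, "Zed"), (11, 1, "Amy"), (12, 2, "Bob")],
)

# Number the users within each account by UserKey, then keep only number 1.
rows = conn.execute(
    """
    SELECT a.AccountKey, a.AccountName, r.UserName
    FROM Accounts a
    LEFT JOIN (
        SELECT u.*,
               ROW_NUMBER() OVER (PARTITION BY u.AccountKey
                                  ORDER BY u.UserKey) AS Ranking
        FROM Users u
    ) AS r
      ON r.AccountKey = a.AccountKey AND r.Ranking = 1
    ORDER BY a.AccountKey
    """
).fetchall()

print(rows)
```

Account 1 returns Zed rather than Amy because the ordering is by UserKey, not by name. Putting the Ranking = 1 test in the ON clause rather than a WHERE clause is what keeps the account without users in the output.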
I've benchmarked all the methods; the simplest and fastest way to achieve this is by using OUTER/CROSS APPLY:
SELECT u.Name, Account.* FROM Account
OUTER APPLY (SELECT TOP 1 * FROM [User] WHERE Account.ID = [User].Account_ID ORDER BY [User].ID) as u
CROSS APPLY works just like INNER JOIN and fetches the rows where both tables are related, while OUTER APPLY works like LEFT OUTER JOIN and fetches all rows from the left table (Account here)
You can use OUTER APPLY, see documentation.
SELECT User1.Name, Account.* FROM Account
OUTER APPLY
(SELECT TOP 1 Name
FROM [User]
WHERE Account.ID = [User].Account_ID
ORDER BY Name ASC) User1
SELECT (SELECT TOP 1 Name
FROM [User]
WHERE Account_ID = a.AccountID
ORDER BY UserID) [Name],
a.*
FROM Account a
The STUFF response from Dominic Goulet is slick. But, if your DATE_FIELD is SMALLDATETIME (instead of DATETIME), then the ISO 8601 length will be 19 instead of 23 (because SMALLDATETIME has no milliseconds) - so adjust the STUFF parameter accordingly or the return value from the STUFF function will be incorrect (missing the first four characters).
First and Last do not exist in SQL Server 2005 or 2008, but SQL Server 2012 has the First_Value and Last_Value functions. I tried to implement the First and Last aggregates for SQL Server 2005 and ran into the obstacle that SQL Server does not guarantee that an aggregate is calculated in a defined order. (See the SqlUserDefinedAggregateAttribute.IsInvariantToOrder property, which is not implemented.) This might be because the query analyser tries to execute the calculation of the aggregate on multiple threads and combine the results, which speeds up the execution but does not guarantee the order in which elements are aggregated.
Define "First". What you think of as first is a coincidence that normally has to do with clustered index order but should not be relied on (you can contrive examples that break it).
You are right not to use MAX() or MIN(). While tempting, consider the scenario where the first name and last name are in separate fields. You might get names from different records.
Since it sounds like all you really care about is getting exactly one arbitrary record for each group, what you can do is just MIN or MAX an ID field for that record, and then join the table into the query on that ID.
There are a number of ways of doing this; here is a quick and dirty one.
Select (SELECT TOP 1 U.Name FROM Users U WHERE U.Account_ID = A.ID) AS "Name",
A.*
FROM Account A
(Slightly off-topic, but) I often run aggregate queries to list exception summaries, and then I want to know WHY a customer is in the results, so I use MIN and MAX to give 2 semi-random samples that I can look at in detail, e.g.
SELECT Customer.Id, COUNT(*) AS ProblemCount
, MIN(Invoice.Id) AS MinInv, MAX(Invoice.Id) AS MaxInv
FROM Customer
INNER JOIN Invoice on Invoice.CustomerId = Customer.Id
WHERE Invoice.SomethingHasGoneWrong=1
GROUP BY Customer.Id
Create and join with a subselect 'firstUser' that returns the first user for each account:
SELECT [User].Name, Account.*
FROM Account, [User],
(select min(id) id, account_id from [User] group by account_id) as firstUser
WHERE Account.ID = [User].Account_ID
and [User].id = firstUser.id and Account.ID = firstUser.account_id