Average amount of characters in varchar MS SQL server


I am trying to select several columns (note_id, notes_date_time, comments, and DATALENGTH(comments) AS 'note_length') from a table named job_notes.
However, I only want to display rows where the length of the comment is greater than the average comment length. (The data type of comments is VARCHAR of some length.)
I also want to order the results by comment length, descending.
This is my code:
SELECT note_id, notes_date_time, comments, DATALENGTH(comments) AS 'note_length'
FROM job_notes
WHERE DATALENGTH(comments) > AVG(DATALENGTH(comments))
ORDER BY DATALENGTH(comments) DESC;
Upon execution, I am met with the following error message:
An aggregate may not appear in the WHERE clause unless it is in a subquery contained in a HAVING clause or a select list, and the column being aggregated is an outer reference.
Any help would be greatly appreciated. Thanks!

You can try the below -
SELECT note_id, notes_date_time, comments, DATALENGTH(comments) AS 'note_length'
FROM job_notes
WHERE DATALENGTH(comments) > (select AVG(DATALENGTH(comments)) from job_notes)
ORDER BY DATALENGTH(comments) DESC

As the error states, you can't use aggregate functions in the WHERE, and the HAVING won't help you here. Personally, I would suggest using a CTE and a Windowed Aggregate function:
WITH CTE AS
    (SELECT note_id,
            notes_date_time,
            comments,
            DATALENGTH(comments) AS Note_Length, --Don't use single quotes for aliases, they are meant for literal strings
            AVG(DATALENGTH(comments)) OVER () AS Avg_Note_Length
     FROM dbo.job_notes)
SELECT note_id,
       notes_date_time,
       comments,
       Note_Length --Don't forget to divide by 2 if you want characters and this is an nvarchar
FROM CTE
WHERE Note_Length > Avg_Note_Length
ORDER BY Note_Length DESC; --The question asked for longest comments first
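To compare the two answers side by side, here is a minimal sketch that runs both shapes of query against an in-memory SQLite database from Python. It is an illustration under assumptions, not SQL Server code: the sample rows are invented, LENGTH stands in for DATALENGTH, and the windowed version needs SQLite 3.25 or later.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE job_notes (note_id INTEGER, comments TEXT)")
conn.executemany(
    "INSERT INTO job_notes VALUES (?, ?)",
    [(1, "ok"), (2, "called customer"), (3, "replaced the faulty valve on site")],
)

# Scalar subquery: the average is computed once and compared against each row.
subquery_sql = """
    SELECT note_id, LENGTH(comments) AS note_length
    FROM job_notes
    WHERE LENGTH(comments) > (SELECT AVG(LENGTH(comments)) FROM job_notes)
    ORDER BY note_length DESC
"""

# Windowed aggregate: AVG(...) OVER () attaches the same average to every row,
# so the outer query can filter on it like an ordinary column.
window_sql = """
    WITH cte AS (
        SELECT note_id,
               LENGTH(comments) AS note_length,
               AVG(LENGTH(comments)) OVER () AS avg_note_length
        FROM job_notes
    )
    SELECT note_id, note_length
    FROM cte
    WHERE note_length > avg_note_length
    ORDER BY note_length DESC
"""

subquery_rows = conn.execute(subquery_sql).fetchall()
window_rows = conn.execute(window_sql).fetchall()
print(subquery_rows)
print(window_rows)
```

Both queries should return the same rows; which one performs better on a real SQL Server table depends on the data and the plan, so it is worth measuring.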

Related

convert access group by query to sql server query

I'm unable to convert an MS Access query to a SQL Server query without changing the GROUP BY columns, which would affect the final result. The purpose of this query is to calculate the creditor and debtor balances of project accounts.
I tried rewriting it with a CTE but couldn't get a good result. I hope someone can help. Thanks in advance.
This is the query I want to convert:
SELECT Sum(ZABC.M) AS M, Sum(ZABC.D) AS D, ZABC.ACC_NUMBER, ZABC.PROJECT_NUMBER, [M]-[D] AS RM, [D]-[M] AS RD
FROM ZABC
GROUP BY ZABC.ACC_NUMBER, ZABC.PROJECT_NUMBER
ORDER BY ZABC.PROJECT_NUMBER;

The problem with the query is the use of [M] and [D] in the select clause: these columns should either be repeated in the group by clause or wrapped in an aggregate function. Your current group by clause gives you one row per (acc_number, project_number) tuple, so you need to choose which computation you want for D and M, which may have several different values per group.
You did not explain the purpose of the original query. Maybe you meant:
SELECT
Sum(ZABC.M) AS M,
Sum(ZABC.D) AS D,
ZABC.ACC_NUMBER,
ZABC.PROJECT_NUMBER,
Sum(ZABC.M) - SUM(ZABC.D) AS RM,
SUM(ZABC.D) - SUM(ZABC.M) AS RD
FROM ZABC
GROUP BY ZABC.ACC_NUMBER, ZABC.PROJECT_NUMBER
ORDER BY ZABC.PROJECT_NUMBER;
There is a vast variety of aggregate functions available for you to pick from, such as MIN(), MAX(), AVG(), and so on.
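As a quick check of the rewritten query, the following sketch runs it against an in-memory SQLite database from Python. The three sample rows are invented for illustration, and SQLite is only a stand-in for SQL Server here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ZABC (ACC_NUMBER TEXT, PROJECT_NUMBER TEXT, M INTEGER, D INTEGER)"
)
# Hypothetical rows: account A1 has two entries in project P1, A2 has one.
conn.executemany(
    "INSERT INTO ZABC VALUES (?, ?, ?, ?)",
    [("A1", "P1", 100, 40), ("A1", "P1", 50, 10), ("A2", "P1", 0, 30)],
)

# Every non-grouped column is wrapped in SUM, including inside RM and RD.
balances = conn.execute("""
    SELECT ACC_NUMBER, PROJECT_NUMBER,
           SUM(M) AS M,
           SUM(D) AS D,
           SUM(M) - SUM(D) AS RM,
           SUM(D) - SUM(M) AS RD
    FROM ZABC
    GROUP BY ACC_NUMBER, PROJECT_NUMBER
    ORDER BY PROJECT_NUMBER, ACC_NUMBER
""").fetchall()

for row in balances:
    print(row)
```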

Redshift LISTAGG frame clause

I am trying to aggregate strings, but limited to only the preceding rows, not the whole partition. Does anyone know how to do this in Redshift?
What I am trying to achieve is the appended_event_namespace column below.
This is what I've tried so far.
LISTAGG(event_namespace, '/')
WITHIN GROUP (ORDER BY tstamp_true)
OVER (PARTITION BY acct_id) AS appended_event_namespace
This results in the full ApplicationLaunch/CategoryBrowse/NotificationCenter/UserProfile aggregation on every single row instead of what is in the desired screenshot.
The difficulty is in getting it to only append up to the current row since there doesn't seem to be a frame-clause for Redshift's LISTAGG(). Thanks for any ideas that may help.

You can hack this together with another query. Start with your appended_event_namespace as the result of your original LISTAGG
SELECT event_namespace,
SUBSTRING(appended_event_namespace,
1,
POSITION(event_namespace IN appended_event_namespace) + LEN(event_namespace) - 1
) as appended_event_namespace_cum
FROM your_table;
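Basically, you take the aggregated, ordered string and keep the first N characters, where N is (the position at which the event appears in the aggregated string) + (the event's length) - 1. That cuts off everything after the item and gives you a cumulative namespace. Note that this relies on each event_namespace appearing only once per partition and not being a substring of an earlier value; otherwise the position lookup finds the wrong occurrence.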
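The string arithmetic behind this trick can be sketched in Python. The helper name cumulative_prefix is hypothetical, and the second call uses an invented path to show what happens when an event name occurs more than once in the aggregated string.

```python
def cumulative_prefix(full_path: str, event: str) -> str:
    """Return full_path cut off right after the first occurrence of event."""
    start = full_path.find(event)  # 0-based, whereas SQL POSITION is 1-based
    if start == -1:
        raise ValueError(f"{event!r} not found in {full_path!r}")
    return full_path[: start + len(event)]


full = "ApplicationLaunch/CategoryBrowse/NotificationCenter/UserProfile"
print(cumulative_prefix(full, "CategoryBrowse"))

# Caveat: with a repeated event name the cut lands on the first occurrence,
# so the second "UserProfile" row would get a path that is too short.
repeated = "ApplicationLaunch/UserProfile/CategoryBrowse/UserProfile"
print(cumulative_prefix(repeated, "UserProfile"))
```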

LISTAGG with a frame clause is not supported in Redshift yet. If you have some columns that you can use for partitioning and ordering, you can use a self join (not very performant, but it accomplishes what you want):
SELECT
    t1.id
    ,t1.tstamp_true
    ,t1.event_namespace
    ,LISTAGG(t2.event_namespace,'/') WITHIN GROUP (ORDER BY t2.tstamp_true) AS appended_event_namespace
FROM your_table t1
JOIN your_table t2
    ON t1.id=t2.id
    AND t1.tstamp_true>=t2.tstamp_true
GROUP BY 1,2,3
Alternatively, if you want to avoid the self join, you can build a JSON string with the following structure using LISTAGG:
[{tstamp_true_1,event_namespace_1},{tstamp_true_N,event_namespace_N},...]
and write a Python UDF that takes this JSON for the given group of rows plus the tstamp_true of the given row and returns the path (the function would need to keep the entries whose tstamp_true_N is not later than the second parameter and concatenate their event_namespace_N values for the output).
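A minimal sketch of the function body such a UDF might wrap is shown below. Several things here are assumptions rather than part of the answer: the name path_up_to, encoding the list as a JSON array of [timestamp, event] pairs (the brace notation above is not valid JSON as written), ISO-formatted timestamp strings that sort lexicographically, and including the current row itself in the path. Registering it in Redshift with CREATE FUNCTION is not shown.

```python
import json


def path_up_to(events_json: str, current_tstamp: str) -> str:
    """Join the event names whose timestamp is not later than current_tstamp.

    events_json is assumed to look like
    '[["2024-01-01 10:00:00", "ApplicationLaunch"], ...]'.
    """
    events = json.loads(events_json)
    # Keep rows up to and including the current one, then order by timestamp.
    kept = sorted((ts, name) for ts, name in events if ts <= current_tstamp)
    return "/".join(name for _, name in kept)


sample = json.dumps([
    ["2024-01-01 10:02:00", "NotificationCenter"],
    ["2024-01-01 10:00:00", "ApplicationLaunch"],
    ["2024-01-01 10:01:00", "CategoryBrowse"],
])
print(path_up_to(sample, "2024-01-01 10:01:00"))
```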

How to GROUP BY on Oracle?

I need help with Oracle SQL. My GROUP BY doesn't work, and I'm working in a shell, so I don't get any hints.
Can someone tell me how to group the following query by noArticle?
SELECT Article.noArticle, quantite
FROM Article LEFT JOIN LigneCommande ON Article.noArticle = LigneCommande.noArticle
GROUP BY Article.noArticle
/
Thank you

To tie things up, this is the correct SQL.
SELECT Article.noArticle, sum(quantite)
FROM Article LEFT JOIN LigneCommande ON Article.noArticle = LigneCommande.noArticle
GROUP BY Article.noArticle

You are grouping by a column and then trying to select the quantite field, which is record-level rather than group-level. GROUP BY is aggregation, so the select clause may only contain the columns you are grouping by or aggregate functions over other columns (such as sum, avg, count, max or min). You need to aggregate your record-level fields before you can use them in your projection (select clause). As an analogy, your attempt was like asking for the hair color of American women: there are many American women with different hair colors, so a single value for the whole set is not defined. Your fixed query is as follows:
SELECT Article.noArticle, sum(quantite)
FROM Article LEFT JOIN LigneCommande ON Article.noArticle = LigneCommande.noArticle
GROUP BY Article.noArticle
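To see what the fixed query returns, here is a small sketch that runs it in an in-memory SQLite database from Python (a stand-in for Oracle; the rows are invented, and an ORDER BY is added only to make the output order predictable). One detail worth noticing is that an article with no matching LigneCommande rows still appears because of the LEFT JOIN, with a NULL sum.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Article (noArticle INTEGER)")
conn.execute("CREATE TABLE LigneCommande (noArticle INTEGER, quantite INTEGER)")
# Hypothetical data: article 30 has never been ordered.
conn.executemany("INSERT INTO Article VALUES (?)", [(10,), (20,), (30,)])
conn.executemany(
    "INSERT INTO LigneCommande VALUES (?, ?)", [(10, 2), (10, 3), (20, 5)]
)

totals = conn.execute("""
    SELECT Article.noArticle, SUM(quantite)
    FROM Article LEFT JOIN LigneCommande
         ON Article.noArticle = LigneCommande.noArticle
    GROUP BY Article.noArticle
    ORDER BY Article.noArticle
""").fetchall()
print(totals)
```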

In my situation I need the sum of quantite, so to make it work I added SUM(quantite) and then grouped by noArticle.

...oracle group by syntax for beginners

What is the problem in this please?
select inst.id
, inst.type as "TypeOfInstall"
, count(inst.id) as "NoOfInstall"
from dm_bsl_ho.installment inst
group by inst.type

You're not allowed to mix a plain, non-aggregated column (inst.id) with a group function such as count unless that column also appears in the GROUP BY clause.
Remove inst.id from the select list and keep only the grouped column and the aggregate:
select inst.type as "TypeOfInstall"
, count(inst.id) as "NoOfInstall"
from dm_bsl_ho.installment inst
GROUP BY inst.type;

When you do a GROUP BY in most RDBMSs, your selection is limited to the following two things:
Columns mentioned in the GROUP BY - in your case, that's inst.type
Aggregate functions - for example, count(inst.id)
However, the inst.id at the top is neither one of these. You need to remove it for the statement to work:
SELECT
type as "TypeOfInstall"
, COUNT(id) as "NoOfInstall"
FROM dm_bsl_ho.installment
GROUP BY type

Aggregate SQL Function to grab only the first from each group

I have 2 tables - an Account table and a Users table. Each account can have multiple users. I have a scenario where I want to execute a single query/join against these two tables, but I want all the Account data (Account.*) and only the first set of user data (specifically their name).
Instead of doing a "min" or "max" on my aggregated group, I wanted to do a "first". But, apparently, there is no "First" aggregate function in TSQL.
Any suggestions on how to go about getting this query? Obviously, it is easy to get the cartesian product of Account x Users:
SELECT User.Name, Account.* FROM Account, User
WHERE Account.ID = User.Account_ID
But how might I go about getting only the first user from the product, based on the order of their User.ID?

Rather than grouping, go about it like this...
select
*
from account a
join (
select
account_id,
row_number() over (order by account_id, id) -
rank() over (order by account_id) as row_num from user
) first on first.account_id = a.id and first.row_num = 0

I know my answer is a bit late, but that might help others. There is a way to achieve a First() and Last() in SQL Server, and here it is :
Stuff(Min(Convert(Varchar, DATE_FIELD, 126) + Convert(Varchar, DESIRED_FIELD)), 1, 23, '')
Use Min() for First() and Max() for Last(). The DATE_FIELD should be the date that determines if it is the first or last record. The DESIRED_FIELD is the field you want the first or the last value. What it does is :
Add the date in ISO format at the start of the string (23 characters long)
Append the DESIRED_FIELD to that string
Get the MIN/MAX value for that field (since it start with the date, you will get the first or last record)
Stuff that concatenated string to remove the first 23 characters (the date part)
Here you go!
EDIT: I ran into a problem with the first formula: when DATE_FIELD has .000 as its milliseconds, SQL Server returns the date string with no milliseconds at all, so STUFF also removes the first 4 characters of DESIRED_FIELD. I changed the format to 20 (without milliseconds) and it works well. The only downside is that two records created within the same second may sort unpredictably, in which case you can revert to format 126.
Stuff(Max(Convert(Varchar, DATE_FIELD, 20) + Convert(Varchar, DESIRED_FIELD)), 1, 19, '')
EDIT 2: My original intent was to return the last (or first) non-NULL row. I was asked how to return the last or first row whether it is NULL or not. Simply wrap DESIRED_FIELD in ISNULL: when you concatenate two strings with the + operator and one of them is NULL, the result is NULL. So use the following:
Stuff(Max(Convert(Varchar, DATE_FIELD, 20) + IsNull(Convert(Varchar, DESIRED_FIELD), '')), 1, 19, '')
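The idea behind the formula (prefix each value with a fixed-width sortable date, take MIN or MAX of the combined strings, then strip the prefix) can be sketched outside SQL. The Python below is only an illustration with invented rows; the 19-character format mirrors CONVERT style 20.

```python
from datetime import datetime

# Hypothetical (DATE_FIELD, DESIRED_FIELD) pairs, deliberately out of order.
rows = [
    (datetime(2024, 1, 5, 9, 30, 0), "second"),
    (datetime(2024, 1, 2, 8, 0, 0), "first"),
    (datetime(2024, 1, 9, 17, 45, 0), "last"),
]

PREFIX_FORMAT = "%Y-%m-%d %H:%M:%S"  # always 19 characters, like CONVERT style 20
PREFIX_LENGTH = 19

# Step 1 and 2: date prefix + desired value, one string per row.
keyed = [ts.strftime(PREFIX_FORMAT) + value for ts, value in rows]

# Step 3 and 4: MIN/MAX sorts by the date prefix; slicing plays the role of STUFF.
first_value = min(keyed)[PREFIX_LENGTH:]
last_value = max(keyed)[PREFIX_LENGTH:]
print(first_value, last_value)
```

The trick only works while the prefix has a constant width, which is exactly why the milliseconds issue described in the first edit corrupts the result.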

Select *
From Accounts a
Left Join (
Select u.*,
row_number() over (Partition By u.AccountKey Order By u.UserKey) as Ranking
From Users u
) as UsersRanked
on UsersRanked.AccountKey = a.AccountKey and UsersRanked.Ranking = 1
This can be simplified by using the Partition By clause. In the above, if an account has three users, then the subquery numbers them 1, 2, and 3, and for a different AccountKey it resets the numbering. This means that for each unique AccountKey there will always be a 1, and potentially 2, 3, 4, etc.
Thus you filter on Ranking=1 to grab the first from each group.
This will give you one row per account, and if there is at least one user for that account, then it will give you the user with the lowest key(because I use a left join, you will always get an account listing even if no user exists). Replace Order By u.UserKey with another field if you prefer that the first user be chosen alphabetically or some other criteria.
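Here is a minimal sketch of this pattern running against an in-memory SQLite database from Python (window functions need SQLite 3.25 or later). The AccountName and Name columns and all rows are invented for illustration; the third account has no users, to show the effect of the left join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Accounts (AccountKey INTEGER, AccountName TEXT)")
conn.execute("CREATE TABLE Users (UserKey INTEGER, AccountKey INTEGER, Name TEXT)")
conn.executemany(
    "INSERT INTO Accounts VALUES (?, ?)",
    [(1, "Acme"), (2, "Globex"), (3, "Initech")],
)
# Inserted out of key order on purpose; the window ORDER BY decides who is first.
conn.executemany(
    "INSERT INTO Users VALUES (?, ?, ?)",
    [(11, 1, "Bob"), (10, 1, "Ann"), (12, 2, "Cy")],
)

first_users = conn.execute("""
    SELECT a.AccountKey, a.AccountName, r.Name
    FROM Accounts a
    LEFT JOIN (
        SELECT u.*,
               ROW_NUMBER() OVER (PARTITION BY u.AccountKey
                                  ORDER BY u.UserKey) AS Ranking
        FROM Users u
    ) AS r
      ON r.AccountKey = a.AccountKey AND r.Ranking = 1
    ORDER BY a.AccountKey
""").fetchall()
print(first_users)
```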

I've benchmarked all the methods; the simplest and fastest way to achieve this is OUTER/CROSS APPLY:
SELECT u.Name, Account.* FROM Account
OUTER APPLY (SELECT TOP 1 * FROM [User] AS usr WHERE usr.Account_ID = Account.ID ORDER BY usr.ID) AS u
CROSS APPLY works like INNER JOIN and fetches only the rows where both tables are related, while OUTER APPLY works like LEFT OUTER JOIN and fetches all rows from the left table (Account here). The ORDER BY makes TOP 1 return the user with the lowest ID rather than an arbitrary one, and the brackets are needed because User is a reserved word in T-SQL.

You can use OUTER APPLY, see documentation.
SELECT User1.Name, Account.* FROM Account
OUTER APPLY
(SELECT TOP 1 Name
FROM [User]
WHERE Account.ID = [User].Account_ID
ORDER BY Name ASC) User1

SELECT (SELECT TOP 1 Name
FROM User
WHERE Account_ID = a.AccountID
ORDER BY UserID) [Name],
a.*
FROM Account a

The STUFF response from Dominic Goulet is slick. But, if your DATE_FIELD is SMALLDATETIME (instead of DATETIME), then the ISO 8601 length will be 19 instead of 23 (because SMALLDATETIME has no milliseconds) - so adjust the STUFF parameter accordingly or the return value from the STUFF function will be incorrect (missing the first four characters).

First and Last do not exist in SQL Server 2005 or 2008, but SQL Server 2012 has the FIRST_VALUE and LAST_VALUE functions. I tried to implement First and Last aggregates for SQL Server 2005 and hit the obstacle that SQL Server does not guarantee that an aggregate is calculated in a defined order. (See the SqlUserDefinedAggregateAttribute.IsInvariantToOrder property, which is not implemented.) This might be because the query analyser tries to execute the calculation of the aggregate on multiple threads and combine the results, which speeds up execution but does not guarantee the order in which elements are aggregated.

Define "First". What you think of as first is a coincidence that normally has to do with clustered index order but should not be relied on (you can contrive examples that break it).
You are right not to use MAX() or MIN() on the name itself. While tempting, consider the scenario where the first name and last name are in separate fields: you might get names from different records.
Since it sounds like all you really care about is getting exactly one arbitrary record for each group, you can instead take the MIN or MAX of an ID field for that record, and then join the table back into the query on that ID.
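Both points can be illustrated with a small sketch using an in-memory SQLite database from Python. The table name Users, the FirstName/LastName split, and the two rows are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Account (ID INTEGER)")
conn.execute(
    "CREATE TABLE Users (ID INTEGER, Account_ID INTEGER, FirstName TEXT, LastName TEXT)"
)
conn.execute("INSERT INTO Account VALUES (1)")
conn.executemany(
    "INSERT INTO Users VALUES (?, ?, ?, ?)",
    [(5, 1, "Zoe", "Adams"), (6, 1, "Al", "Young")],
)

# MIN on each name column separately mixes values from different records.
mixed = conn.execute("""
    SELECT Account_ID, MIN(FirstName), MIN(LastName)
    FROM Users
    GROUP BY Account_ID
""").fetchall()

# MIN on the ID, then join back on that ID: every column comes from one record.
first_users = conn.execute("""
    SELECT a.ID, u.FirstName, u.LastName
    FROM Account a
    JOIN (SELECT Account_ID, MIN(ID) AS FirstUserID
          FROM Users
          GROUP BY Account_ID) f ON f.Account_ID = a.ID
    JOIN Users u ON u.ID = f.FirstUserID
    ORDER BY a.ID
""").fetchall()

print(mixed)
print(first_users)
```

The first result pairs Al's first name with Adams's last name, a person who does not exist in the table, while the second returns the complete record with the lowest ID.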

There are a number of ways of doing this; here is a quick and dirty one.
SELECT (SELECT TOP 1 U.Name FROM Users U WHERE U.Account_ID = A.ID ORDER BY U.ID) AS [Name],
A.*
FROM Account A

(Slightly off-topic, but) I often run aggregate queries to list exception summaries, and then I want to know WHY a customer is in the results, so I use MIN and MAX to give two semi-random samples that I can look at in detail, e.g.:
SELECT Customer.Id, COUNT(*) AS ProblemCount
, MIN(Invoice.Id) AS MinInv, MAX(Invoice.Id) AS MaxInv
FROM Customer
INNER JOIN Invoice on Invoice.CustomerId = Customer.Id
WHERE Invoice.SomethingHasGoneWrong=1
GROUP BY Customer.Id

Create and join with a subselect 'FirstUser' that returns the first user for each account
SELECT User.Name, Account.*
FROM Account, User,
(select min(user.id) id,account_id from User group by user.account_id) as firstUser
WHERE Account.ID = User.Account_ID
and User.id = firstUser.id and Account.ID = firstUser.account_id
