SUM and GROUP BY a substring in Splice (NoSql) - sql

I am trying to run a query like the one below. The goal is to get the total activity count for every user_key, but because user_key has a complex structure and I need only the part after the '|' symbol, I had to use a substring function. However, when I run the query, I get this error:
SQL Error [42Y36]: Column reference 'USER_KEY' is invalid, or is part of an invalid expression. For a SELECT list with a GROUP BY, the columns and expressions being selected may only contain valid grouping expressions and valid aggregate expressions.
The substring function works fine outside this query. Are there any workarounds for this problem? I am using Splice Machine (NoSql).
SELECT
substr(user_key, instr(user_key,'|') + 1) AS new_user_key,
SUM(
CAST(
activity_count AS INTEGER
)
) AS Total
FROM
schema_name.table_name
GROUP BY
substr(user_key, instr(user_key,'|') + 1)

Your GROUP BY expression needs to match the expression in the SELECT list:
SELECT
substr(user_key, instr(user_key,'|') + 1) AS new_user_key,
SUM(
CAST(
activity_count AS INTEGER
)
) AS Total
FROM
schema_name.table_name
GROUP BY
substr(user_key, instr(user_key,'|') + 1) AS new_user_key

I found the answer myself. I used a table subquery:
SELECT new_table.new_user_key, sum(new_table.total)
from
(
SELECT
substr(user_key, instr(user_key,'|') + 1) AS new_user_key,
CAST(activity_count AS INTEGER) AS Total
FROM schema_name.table_name
)
as new_table
GROUP BY
new_table.new_user_key
I hope someone finds this post useful and that it saves them some time.
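The derived-table workaround above can be tried out locally. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for Splice Machine (an assumption; SQLite would also accept the original query, so this only illustrates the shape of the workaround). The sample rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite stands in for Splice Machine here (an assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (user_key TEXT, activity_count TEXT)")
conn.executemany(
    "INSERT INTO table_name VALUES (?, ?)",
    [("a|u1", "2"), ("b|u1", "3"), ("a|u2", "5")],  # made-up sample rows
)

# Compute the substring in a derived table, then group by its alias in the outer query.
rows = conn.execute(
    """
    SELECT new_table.new_user_key, SUM(new_table.total) AS total
    FROM (
        SELECT substr(user_key, instr(user_key, '|') + 1) AS new_user_key,
               CAST(activity_count AS INTEGER) AS total
        FROM table_name
    ) AS new_table
    GROUP BY new_table.new_user_key
    ORDER BY new_table.new_user_key
    """
).fetchall()
print(rows)
```

Because the outer query only sees the plain column new_user_key, the GROUP BY no longer has to repeat the substring expression.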

Related

GROUPING multiple LIKE string

Data:
2015478 warning occurred at 20201403021545
2020179 error occurred at 20201303021545
2025480 timeout occurred at 20201203021545
2025481 timeout occurred at 20201103021545
2020482 error occurred at 20201473021545
2020157 timeout occurred at 20201403781545
2020154 warning occurred at 20201407851545
2027845 warning occurred at 20201403458745
In the data above, there are three kinds of strings I am interested in: warning, error and timeout.
Can we have a single query that groups by these strings and gives the count of occurrences, as below?
Output:
timeout 3
warning 3
error 2
I know I can write separate queries to find each count individually, but I am interested in a single query.
Thanks
You can use filtered aggregation for that:
select count(*) filter (where the_column like '%timeout%') as timeout_count,
count(*) filter (where the_column like '%error%') as error_count,
count(*) filter (where the_column like '%warning%') as warning_count
from the_table;
This returns the counts in three columns rather than in three rows as you indicated.
If you do need this in separate rows, you can use regexp_replace() to clean up the string, then group by that:
select regexp_replace(the_column, '(.*)(warning|error|timeout)(.*)', '\2') as what,
count(*)
from the_table
group by what;
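On engines that lack the FILTER clause, the same three counts can be obtained with conditional aggregation (SUM over a CASE expression). This is a swapped-in, more portable technique, not the syntax used in the answer above. A minimal sketch with Python's sqlite3 module and the sample data from the question, reusing the_table and the_column as placeholder names:

```python
import sqlite3

lines = [
    "2015478 warning occurred at 20201403021545",
    "2020179 error occurred at 20201303021545",
    "2025480 timeout occurred at 20201203021545",
    "2025481 timeout occurred at 20201103021545",
    "2020482 error occurred at 20201473021545",
    "2020157 timeout occurred at 20201403781545",
    "2020154 warning occurred at 20201407851545",
    "2027845 warning occurred at 20201403458745",
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE the_table (the_column TEXT)")
conn.executemany("INSERT INTO the_table VALUES (?)", [(s,) for s in lines])

# Each SUM adds 1 for every row matching its pattern and 0 otherwise.
counts = conn.execute(
    """
    SELECT SUM(CASE WHEN the_column LIKE '%timeout%' THEN 1 ELSE 0 END) AS timeout_count,
           SUM(CASE WHEN the_column LIKE '%error%'   THEN 1 ELSE 0 END) AS error_count,
           SUM(CASE WHEN the_column LIKE '%warning%' THEN 1 ELSE 0 END) AS warning_count
    FROM the_table
    """
).fetchone()
print(counts)
```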
Please use the query below, which uses POSITION to extract the text between the first and last space instead of hard-coding the values:
select val, count(1) from
(select substring(column_name ,position(' ' in (column_name))+1,
length(column_name) - position(reverse(' ') in reverse(column_name)) -
position(' ' in (column_name))) as val from matching) qry
group by val; -- Provide the proper column name
If you want this on separate rows you can also use a lateral join:
select which, count(*)
from t cross join lateral
(values (case when col like '%error%' then 'error' end),
(case when col like '%warning%' then 'warning' end),
(case when col like '%timeout%' then 'timeout' end)
) v(which)
where which is not null
group by which;
On the other hand, if you simply want the second word -- but don't want to hardcode the values -- then you can use:
select split_part(col, ' ', 2) as which, count(*)
from t
group by which;
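split_part is not available in every engine. As a rough sketch of the same "group by the second word" idea, the version below uses only substr and instr in SQLite through Python's sqlite3 module. The table name t and column name col follow the answer above; the choice of SQLite is an assumption made so the example runs anywhere.

```python
import sqlite3

lines = [
    "2015478 warning occurred at 20201403021545",
    "2020179 error occurred at 20201303021545",
    "2025480 timeout occurred at 20201203021545",
    "2025481 timeout occurred at 20201103021545",
    "2020482 error occurred at 20201473021545",
    "2020157 timeout occurred at 20201403781545",
    "2020154 warning occurred at 20201407851545",
    "2027845 warning occurred at 20201403458745",
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col TEXT)")
conn.executemany("INSERT INTO t VALUES (?)", [(s,) for s in lines])

# Inner query drops the first word; outer query keeps text up to the next space.
rows = conn.execute(
    """
    SELECT substr(rest, 1, instr(rest, ' ') - 1) AS which, COUNT(*) AS n
    FROM (SELECT substr(col, instr(col, ' ') + 1) AS rest FROM t) AS s
    GROUP BY which
    ORDER BY which
    """
).fetchall()
print(rows)
```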
Here is a db<>fiddle.

Reference an ALIAS in a SUM function - SQL Server

Background:
I am trying to calculate the profit margin inside of the query and I am running into errors. When I try to use the select statement in the SUM function, I trigger an error:
Cannot perform an aggregate function on an expression containing an aggregate or a subquery.
I understand that this is caused by having a SELECT query inside of the SUM function. From there, I tried to reference the alias of the COGS column. I receive an error when I do that as well:
Invalid column name 'COGS'.
After experimenting with the query some more, I figured it might be due to the fact that I'm trying to do all of this inside of a SUM function, so I removed the SUM and ran the query. It returned several errors:
Column 'tbl_invoice.subTotal' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Column 'tbl_invoice.tradeinAmount' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Column 'tbl_invoice.subTotal' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Column 'tbl_invoice.tradeinAmount' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Column 'tbl_invoice.subTotal' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Column 'tbl_invoice.tradeinAmount' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Is there another way to use or reference the value I need in the SUM function?
Query:
--Main query
SELECT
custID,
COUNT(custID) AS InvoiceNum,
--This is the column that has an alias
(SELECT cogs FROM #tempMarketing where #tempMarketing.custID = tbl_invoice.custID) as COGS,
--This is where I am trying to calculate the profit margin
SUM(((((subTotal + (-1 * tradeinAmount) - (SELECT cogs FROM #tempMarketing where #tempMarketing.custID = tbl_invoice.custID)))
/ (NULLIF(subTotal + (-1 * tradeinAmount),0))) *100)) as Profitmargin,
FROM tbl_invoice
group by custID
order by InvoiceNum desc;
--Join #tempMarketing instead of using a correlated subquery
SELECT
a.custID,
COUNT(a.custID) AS InvoiceNum,
case when a.custid=b.custid then b.cogs else 0 end as COGS,
SUM(((((subTotal + (-1 * tradeinAmount) - case when a.custid=b.custid then b.cogs else 0 end))
/ (NULLIF(subTotal + (-1 * tradeinAmount),0))) *100)) as Profitmargin
FROM tbl_invoice a
left join #tempMarketing b on a.custID =b.custid
--The grouped expression must match the COGS expression in the select list exactly
group by a.custID,
case when a.custid=b.custid then b.cogs else 0 end
order by InvoiceNum desc;
You can try the following query; I have created a common table expression for the cogs column and joined it to tbl_invoice:
WITH cte_base AS(
SELECT custID, cogs FROM #tempMarketing
)
SELECT
tbl_invoice.custID,
COUNT(tbl_invoice.custID) AS InvoiceNum,
--This is the column that has an alias
cte_base.cogs as COGS,
--This is where the profit margin is calculated
SUM(((((subTotal + (-1 * tradeinAmount) - (cte_base.cogs)))
/ (NULLIF(subTotal + (-1 * tradeinAmount),0))) *100)) as Profitmargin
FROM tbl_invoice
LEFT JOIN cte_base ON cte_base.custID = tbl_invoice.custID
group by tbl_invoice.custID, cte_base.cogs
order by InvoiceNum desc;
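The join-then-group pattern from the answers above can be checked on a small data set. The sketch below uses Python's sqlite3 module rather than SQL Server (an assumption), replaces the temp table #tempMarketing with an ordinary table called marketing (a hypothetical name), and uses made-up invoice values. It also assumes one cogs row per customer, as the original scalar subquery does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tbl_invoice (custID INTEGER, subTotal REAL, tradeinAmount REAL);
    CREATE TABLE marketing (custID INTEGER, cogs REAL);  -- stands in for #tempMarketing
    INSERT INTO tbl_invoice VALUES (1, 100, 0), (1, 200, 0), (2, 50, 10);
    INSERT INTO marketing VALUES (1, 50), (2, 20);
    """
)

# Joining brings cogs into scope as a plain column, so it can appear inside SUM
# and in GROUP BY without a subquery inside the aggregate.
rows = conn.execute(
    """
    SELECT i.custID,
           COUNT(i.custID) AS InvoiceNum,
           m.cogs AS COGS,
           SUM((i.subTotal - i.tradeinAmount - m.cogs)
               / NULLIF(i.subTotal - i.tradeinAmount, 0) * 100) AS Profitmargin
    FROM tbl_invoice AS i
    LEFT JOIN marketing AS m ON m.custID = i.custID
    GROUP BY i.custID, m.cogs
    ORDER BY InvoiceNum DESC
    """
).fetchall()
print(rows)
```

Note that, as in the original query, this sums the per-invoice percentages; whether that is the intended definition of profit margin is left to the asker.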

SQL Server: using SUBSTRING with the LIKE operator returns no results

I created this CTE that returns first and last names from 2 different tables. I would like to use the CTE to identify all of the records that have the same last names and the first name of one column starts with the same first letter of another column.
This is an example of the results of the CTE. I want the SELECT using the CTE to return only the highlighted results:
;WITH CTE AS
(
SELECT AD.FirstName AS AD_FirstName, AD.LastName AS AD_LastName, NotInAD.FirstName As NotInAD_FirstName, NotInAD.LastName As NotInAD_LastName
FROM PagingToolActiveDirectoryUsers AD JOIN
(
SELECT FirstName, LastName
FROM #PagingUsersParseName
EXCEPT
SELECT D.FirstName, D.LastName
FROM PagingToolActiveDirectoryUsers D
WHERE D.FirstName <> D.LastName AND D.LastName <> D.LoginName
AND D.LoginName LIKE '%[0-9]%[0-9]%'
) AS NotInAD ON NotInAD.LastName = AD.LastName
)
SELECT *
FROM CTE
WHERE (AD_LastName = NotInAD_LastName) AND (AD_FirstName LIKE ('''' + SUBSTRING(NotInAD_FirstName, 1, 1) + '%'''))
ORDER BY AD_LastName, AD_FirstName;
The result of this query returns no rows.
What am I doing wrong?
Thanks.
You're enclosing the string to be searched for with single-quotes, but it doesn't appear that the data in AD_FirstName has those single-quotes embedded in it. I suggest you replace the first line of the WHERE clause with
WHERE (AD_LastName = NotInAD_LastName) AND (AD_FirstName LIKE (SUBSTRING(NotInAD_FirstName, 1, 1) + '%'))
Best of luck.
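The effect of the extra quotes can be reproduced in isolation. This is a small sketch using Python's sqlite3 module; SQLite concatenates strings with || where SQL Server uses +, and the name values are made up, so treat it as an illustration of the pattern problem rather than of SQL Server itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# First expression: the pattern is 'A%' with literal quote characters around it,
# so it only matches values that really start with a quote character.
# Second expression: the pattern is A%, which matches any value starting with A.
quoted, unquoted = conn.execute(
    """
    SELECT 'Alice' LIKE ('''' || substr('Al', 1, 1) || '%'''),
           'Alice' LIKE (substr('Al', 1, 1) || '%')
    """
).fetchone()
print(quoted, unquoted)
```

SQLite returns 1 for a match and 0 for no match, so the first value shows why the original WHERE clause filtered out every row.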

IfNULL function in yii does not work

In my model I wrote:
$criteria->select = ' ( select avg( IfNULL( TR.stars_rating_type_id , 0) ) as stars_rating_type_id from '.$tablePrefix.'tour_review as TR where TR.tour_id = T.id and TR.status = \'A\' ) as reviews_avg_rating ';
And I get this error:
Active record "Tour" is trying to select an invalid column "( select avg( IfNULL( TR.stars_rating_type_id". Note, the column must exist in the table or be an expression with alias.
The cause is the "IfNULL( ..., 0)" function that I added to the subquery to avoid "null" in the result set.
Without it I have to do an additional check and set 0 in case of null.
If I test the raw SQL with " SELECT ( select avg( IfNULL( TR.stars_rating_type_id, 0) ) as stars_rating_type_id..." it works fine,
so the problem is on the Yii side. How can I fix it?
Yii 1.1.14
Thanks!
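Take a look at the manual:
http://www.yiiframework.com/doc/api/1.1/CDbCriteria
The select param receives the columns to be selected, not the whole query. It accepts either a string of column names separated by commas or an array of column names. The column name in your error message is cut off at the comma inside IfNULL( ..., 0), which suggests the string is being split on commas, so passing select as an array of expressions is worth trying.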

SQL SUM question

Hi, I have a question about SUM in SQL.
I have a query that looks like this:
SELECT
SUM ( table_one.field + table_two.field ) as total_field
SUM ( total_field + table_one.anotherfield )
FROM
table_one
JOIN
table_two ON table_one.id = table_two.id
WHERE
table_one = 1
But this doesn't work (don't mind possible typing errors in the JOIN statement; only the second SUM is the problem, and the query works perfectly without it).
Is there another way to do it? I need total_field within my application. I can of course add those numbers within the application, but I prefer to do it in SQL.
You cannot use the column alias inside an aggregate to reference the value; just SUM again:
SELECT
SUM ( table_one.field + table_two.field ) as total_field, -- you were also missing a comma here
SUM ( table_one.field + table_two.field + table_one.anotherfield )
FROM
table_one
JOIN
table_two ON table_one.id = table_two.id
WHERE
table_one = 1
SUM is an aggregate function. This means you can aggregate data from a field over several tuples and sum it up into a single tuple.
What you want to do is this:
SELECT
table_one.field + table_two.field,
table_one.field + table_two.field + table_one.anotherfield
or maybe this:
SELECT
SUM(table_one.field) + SUM(table_two.field),
SUM(table_one.field) + SUM(table_two.field) + SUM(table_one.anotherfield)
Try replacing "total_field" with "table_one.field + table_two.field" in the second SUM().
The name "total_field" is an alias and as such cannot be used inside an aggregate function in the same SELECT list.
The easiest and quickest way is to simply repeat the expression for total_field in the second calculation.
SELECT
SUM ( ISNULL(table_one.field,0) + ISNULL(table_two.field,0) ) as total_field,
SUM ( ISNULL(table_one.field,0) + ISNULL(table_two.field,0) + ISNULL(table_one.anotherfield,0) )
from
table_one
JOIN
table_two ON table_one.id = table_two.id
As your code doesn't cater for NULL values in the fields, you may get warnings when summing the values, and any row where one operand is NULL contributes nothing to the sum. I would suggest using ISNULL as above so that a NULL value is simply treated as 0.
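To see why the NULL handling matters, here is a minimal sketch using Python's sqlite3 module. It uses COALESCE, the standard SQL counterpart of SQL Server's two-argument ISNULL, and a single hypothetical table t with made-up rows instead of the two joined tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (field INTEGER, anotherfield INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, None), (2, 3)])

# field + anotherfield is NULL for the first row, and SUM skips NULL inputs,
# so the plain sum silently loses the 1 from that row.
plain, guarded = conn.execute(
    """
    SELECT SUM(field + anotherfield),
           SUM(COALESCE(field, 0) + COALESCE(anotherfield, 0))
    FROM t
    """
).fetchone()
print(plain, guarded)
```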
You could use a subquery like this:
SELECT
total_field,
total_field + sum_anotherfield
FROM (
SELECT
SUM(table_one.field + table_two.field) AS total_field,
SUM(table_one.anotherfield) AS sum_anotherfield
FROM
table_one
JOIN
table_two ON table_one.id = table_two.id
WHERE
table_one.somefield = 1
) x
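The subquery approach can be verified on sample data. The following sketch runs the same query shape in SQLite through Python's sqlite3 module; the rows are made up, and somefield is the filter column name used in the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table_one (id INTEGER, field INTEGER, anotherfield INTEGER, somefield INTEGER);
    CREATE TABLE table_two (id INTEGER, field INTEGER);
    INSERT INTO table_one VALUES (1, 10, 1, 1), (2, 20, 2, 1), (3, 99, 9, 0);
    INSERT INTO table_two VALUES (1, 5), (2, 7), (3, 1);
    """
)

# The inner query does the aggregation; the outer query can then reuse
# the alias total_field like any ordinary column.
row = conn.execute(
    """
    SELECT total_field, total_field + sum_anotherfield
    FROM (
        SELECT SUM(table_one.field + table_two.field) AS total_field,
               SUM(table_one.anotherfield) AS sum_anotherfield
        FROM table_one
        JOIN table_two ON table_one.id = table_two.id
        WHERE table_one.somefield = 1
    ) x
    """
).fetchone()
print(row)
```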