SQL: Calculate Percentage in new column using another column

SQL: Calculate Percentage in new column using another column - sql

I found it hard to describe what I wanted to do in the title, but I will be more specific here.
I have a reasonably long query:
SELECT
/*Amount earned with validation to remove outlying figures*/
Case When SUM(t2.[ActualSalesValue])>=0.01 OR SUM(t2.[ActualSalesValue])<0 Then SUM(t2.[ActualSalesValue]) ELSE 0 END AS 'Amount',
/*Profit earned (is already calculated then input into db, this just pulls that figure*/
SUM(t2.[Profit]) AS 'Profit',
/*Product Type - pulls the product type so that we can sort by product*/
t1.[ucIIProductType] AS 'Product Type',
/*Profit Percentage - This is to calculate the percentage of profit based on the sales price which uses 2 different columns - Case ensures that there are no wild values appearing in the reports as previously experienced*/
Case When SUM(t2.[ActualSalesValue])>=0.01 OR SUM(t2.[ActualSalesValue])<0 THEN (SUM(t2.[Profit])/SUM(t2.[ActualSalesValue])) ELSE 0 END AS 'Profit Percentage',
/*Percentage of Turnover*/
*SUM(t2.[ActualSalesValue])/(Select SUM(t2.[ActualSalesValue]) OVER() FROM [_bvSTTransactionsFull]) AS 'PoT'
/*The join is connect the product type with the profit and the amount*/
FROM [dbo].[StkItem] AS t1
INNER JOIN [dbo].[_bvSTTransactionsFull] AS t2
/*There attirbutes are the links between the tables*/
ON t1.[StockLink]=t2.[AccountLink]
WHERE t2.[TxDate] BETWEEN '1/Aug/2014' AND '31/Aug/2014' AND ISNUMERIC(t2.[Account]) = 1
Group By t1.[ucIIProductType]
The 'Percentage of Turnover' part I am having trouble with - I am trying to calculate the percentage of the Amount based on the total amount - using the same column. So eg: I want to take the Amount value in row 1, then divide it by the total amount of the entire column and then have that value listed in a new column. But I keep getting errors or I Keep getting 1 (because it wants to divide the value by the same value. CAN anyone please advise me on proper syntax for solving this:
/*Percentage of Turnover*/
*SUM(t2.[ActualSalesValue])/(Select SUM(t2.[ActualSalesValue]) OVER() FROM [_bvSTTransactionsFull]) AS 'PoT'

I think you want one of the following:
SUM(t2.[ActualSalesValue])/(Select SUM(t.[ActualSalesValue]) FROM [_bvSTTransactionsFull] t) AS PoT
or:
SUM(t2.[ActualSalesValue])/(SUM(SUM(t2.[ActualSalesValue])) OVER() ) AS PoT
Note: you should use single quotes only for string and date constants, not for column and table names. If you need to escape names, use square braces.

Related

Question about divide result from the same column in SQL Server

I am trying to write statement in SQL Server. What I am trying to do is to get the result of count records in columns end with "R" divide the count of all the records. So it is basically the statement of a column with a statement " count (invoice) where Invoice like "%R" / count( Invoice)"
Here is my code without the divide calculation. I only come up with statement without the divide calculation.
SELECT
Invoice,
COUNT(ART_CURRENT__TRANSACTION.Invoice) AS Number_Revisions,
MAX(ART_CURRENT__TRANSACTION.[Customer]) AS "Customer",
MAX(ARM_MASTER__CUSTOMER.Name) AS "Name",
MAX(ART_CURRENT__TRANSACTION.[Job]) AS Job
FROM
ART_CURRENT__TRANSACTION
LEFT OUTER JOIN
ARM_MASTER__CUSTOMER ON ARM_MASTER__CUSTOMER.Customer = ART_CURRENT__TRANSACTION.Customer
WHERE
Invoice LIKE '%R'
GROUP BY
Invoice;
What I am trying to ask is how can I add a column that calculate the number of invoice end with "R"/ NUMBER OF INVOICE.
Thank you guys!

What I am trying to do is to get the result of count records in Columns end with "R" divide the count of all the records.
You seem to want this calculation:
SELECT AVG(CASE WHEN t.Invoice LIKE '%R' THEN 1.0 ELSE 0 END)
FROM ART_CURRENT__TRANSACTION t;
This assumes that invoice is in the transaction table. I don't think a join is necessary for what you want to do.

How to calculate a bank's deposit growth from one call report to the next, as a percentage?

I downloaded the entire FDIC bank call reports dataset, and uploaded it to BigQuery.
The table I currently have looks like this:
What I am trying to accomplish is adding a column showing the deposit growth rate since the last quarter for each bank:
Note:The first reporting date for each bank (e.g. 19921231) will not have a "Quarterly Deposit Growth". Hence the two empty cells for the two banks.
I would like to know if a bank is increasing or decreasing its deposits each quarter/call report (viewed as a percentage).
e.g. "On their last call report (19921231)First National Bank had deposits of 456789 (in 1000's). In their next call report (19930331)First National bank had deposits of 567890 (in 1000's). What is the percentage increase (or decrease) in deposits"?
This "_%_Change_in_Deposits" column would be displayed as a new column.
This is the code I have written so far:
select
SFRNLL.repdte, SFRNLL.cert, SFRNLL.name, SFRNLL.city, SFRNLL.county, SFRNLL.stalp, SFRNLL.specgrp AS `Loan_Specialization`, SFRNLL.lnreres as `_1_to_4_Residential_Loans`, AL.dep as `Deposits`, AL.lnlsnet as `loans_and_leases`,
IEEE_DIVIDE(SFRNLL.lnreres, AL.lnlsnet) as SFR2TotalLoanRatio
FROM usa_fdic_call_reports_1992.All_Reports_19921231_1_4_Family_Residential_Net_Loans_and_Leases as SFRNLL
JOIN usa_fdic_call_reports_1992.All_Reports_19921231_Assets_and_Liabilities as AL
ON SFRNLL.cert = AL.cert
where SFRNLL.specgrp = 4 and IEEE_DIVIDE(SFRNLL.lnreres, AL.lnlsnet) <= 0.10
UNION ALL
select
SFRNLL.repdte, SFRNLL.cert, SFRNLL.name, SFRNLL.city, SFRNLL.county, SFRNLL.stalp, SFRNLL.specgrp AS `Loan_Specialization`, SFRNLL.lnreres as `_1_to_4_Residential_Loans`, AL.dep as `Deposits`, AL.lnlsnet as `loans_and_leases`,
IEEE_DIVIDE(SFRNLL.lnreres, AL.lnlsnet) as SFR2TotalLoanRatio
FROM usa_fdic_call_reports_1993.All_Reports_19930331_1_4_Family_Residential_Net_Loans_and_Leases as SFRNLL
JOIN usa_fdic_call_reports_1993.All_Reports_19930331_Assets_and_Liabilities as AL
ON SFRNLL.cert = AL.cert
where SFRNLL.specgrp = 4 and IEEE_DIVIDE(SFRNLL.lnreres, AL.lnlsnet) <= 0.10
The table looks like this:
Additional notes:
I would also like to view the last column (SFR2TotalLoansRatio) as a percentage.
This code runs correctly, however, previously I was getting a "division by zero" error when attempting to run 50,000 rows (1992 to the present).

Addressing each of your question individually.
First) Retrieving SFR2TotalLoanRatio as percentage, I assume you want to see 9.988% instead of 0.0988 in your results. Currently, in BigQuery you can achieve this by casting the field into a STRING then, concatenating the % sign. Below there is an example with sample data:
WITH data as (
SELECT 0.0123 as percentage UNION ALL
SELECT 0.0999 as percentage UNION ALL
SELECT 0.3456 as percentage
)
SELECT CONCAT(CAST(percentage*100 as String),"%") as formatted_percentage FROM data
And the output,
Row formatted_percentage
1 1.23%
2 9.99%
3 34.56%
Second) Regarding your question about the division by zero error. I am assuming IEEE_DIVIDE(arg1,arg2) is a function to perform the division, in which arg1 is the divisor and arg2 is the dividend. Therefore, I would adivse your to explore your data in order to figured out which records have divisor equals to zero. After gathering these results, you can determine what to do with them. In case you decide to discard them you can simply add within your WHERE statement in each of your JOINs: AL.lnlsnet = 0. On the other hand, you can also modify the records where lnlsnet = 0 using a CASE WHEN or IF statements.
UPDATE:
In order to add this piece of code your query, you u have to wrap your code within a temporary table. Then, I will make two adjustments, first a temporary function in order to calculate the percentage and format it with the % sign. Second, retrieving the previous number of deposits to calculate the desired percentage. I am also assuming that cert is the individual id for each of the bank's clients. The modifications will be as follows:
#the following function MUST be the first thing within your query
CREATE TEMP FUNCTION percent(dep INT64, prev_dep INT64) AS (
Concat(Cast((dep-prev_dep)/prev_dep*100 AS STRING), "%")
);
#followed by the query you have created so far as a temporary table, notice the the comma I added after the last parentheses
WITH data AS(
#your query
),
#within this second part you need to select all the columns from data, and LAG function will be used to retrieve the previous number of deposits for each client
data_2 as (
SELECT repdte, cert, name, city, county, stalp, Loan_Specialization, _1_to_4_Residential_Loans,Deposits, loans_and_leases, SFR2TotalLoanRatio,
CASE WHEN cert = lag(cert) OVER (PARTITION BY id ORDER BY d) THEN lag(Deposits) OVER (PARTITION BY id ORDER BY id) ELSE NULL END AS prev_dep FROM data
)
SELECT repdte, cert, name, city, county, stalp, Loan_Specialization, _1_to_4_Residential_Loans,Deposits, loans_and_leases, SFR2TotalLoanRatio, percent(Deposits,prev_dep) as dept_growth_rate FROM data_2
Note that the built-in function LAG is used together with CASE WHEN in order to retrieve the previous amount of deposits per client.

SQL - count amount of occurences for items in different price diapasons

I have a question about SQL, and I honestly tried to search methods before asking. I will give an abstract (but precise) description below, and will greatly appreciate your example of solution (SQL query).
What I have:
Table A with category ids of the items and prices (in USD) for each item. category id has int type of value, price is string and looks like "USD 200000000" (real value is multiplied by 10^7). Tables also has a kind column with int type of value.
Table B with relation of category id and name.
What I need:
Get a table with price diapasons (like 0-100 | 100-200 | ...) as column names and count amount of items for each category id (as lines names) in all of the price diapasons. All results must be filtered by kind parameter (from table A) with value 3.
Questions, that I encountered (and which caused to ask for an example of SQL query):
Cut "USD from price string value, divide it by 10^7 and convert to float.
Gather diapasons of price values (0-100 | 100-200 | ...), with given step in the given interval (max price is considered as unknown at the start). Example: step 100 on 0-500 interval, and step 200 for values >500.
Put diapasons of price values into column names of the result table.
For each diapason, count amount of items in each category (category_id). Left limit of diapason shall not be considered (e.g. on 1000-1200 diapason, items with price 1000 shall not be considered).
Using B table, display name instead of category id.
Response is appreciated, ignorance will be understood.

If you only need category ids, then you do not need B. What you are looking for is conditional aggregation, something like:
select category_id,
sum(case when cast(substring(price, 4, 100) as int)/10000000 < 100 then 1 else 0 end) as price_000_100
sum(case when cast(substring(price, 4, 100) as int)/10000000 >= 100 and cast(substring(price, 4, 100) as int)/10000000 < 200
then 1 else 0
end) as price_100_200,
. . .
from a
group by category_id

There is no standard way to do what you describe.
That is because to do (3) you need a pivot aka crosstab, and this is not in ANSI SQL. Each DBMS has it's own implementation. Plus dynamic columns in a pivot table are an additional complication.
For example, Postgres calls it a "crosstab" and requires the tablefunc module to be installed. See this SO question and the documentation. Compare to SQL Server, which uses the PIVOT command.
You can get close using reasonably standard SQL.
Here is an example based on SQLite. A little bit of conversion would provide a solution for other systems, e.g. SUBSTR would be substring(string [from int] [for int]) in postgre.
Assuming a data table of format:
and a category name table of:
then the following code will produce:
WITH dataCTE AS
(SELECT product_id AS 'ID', CAST(SUBSTR(price, 5) AS INT)/1000000 AS 'USD',
CASE WHEN (CAST(SUBSTR(price, 5) AS INT)/1000000) <= 500 THEN
100 ELSE 200
END AS 'Interval'
FROM data
WHERE kind = 3),
groupCTE AS
(SELECT dataCTE.ID AS 'ID', dataCTE.USD AS 'USD', dataCTE.Interval AS 'Interval',
CASE WHEN dataCTE.Interval = 100 THEN
CAST(dataCTE.USD AS INT)/100
ELSE
(CAST(dataCTE.USD-500 AS INT)/200)+5
END AS 'GroupID'
FROM dataCTE),
cleanCTE AS
(SELECT *, CASE WHEN groupCTE.Interval = 100 THEN
CAST(groupCTE.GroupID *100 AS VARCHAR)
|| '-' ||
CAST((groupCTE.GroupID *100)+99 AS VARCHAR)
ELSE
CAST(((groupCTE.GroupID-5)*200)+500 AS VARCHAR)
|| '-' ||
CAST(((groupCTE.GroupID-5)*200)+500+199 AS VARCHAR)
END AS 'diapason'
FROM groupCTE
INNER JOIN cat_name AS cn ON groupCTE.ID = cn.cat_id)
SELECT *
FROM cleanCTE;
If you modify the last SELECT to:
SELECT name, diapason, COUNT(diapason)
FROM cleanCTE
GROUP BY name, diapason;
then you get a grouped output:
This is as close as you will get without specifying the exact system; even then you will have a problem with dynamically creating the column names.

SAP BO - how to get 1/0 distinct values per week in each row

the problem I am trying to solve is having a SAP Business Objects query calculate a variable for me because calculating it in a large excel file crashes the process.
I am having a bunch of columns with daily/weekly data. I would like to get a "1" for the first instance of Name/Person/Certain Identificator within a single week and "0" for all the rest.
So for example if item "Glass" was sold 5 times in week 4 in this variable/column first sale will get "1" and next 4 sales will get "0". This will allow me to have the number of distinct items being sold in a particular week.
I am aware there are Count and Count distinct functions in Business Objects, however I would prefer to have this 1/0 system for the entire raw table of data because I am using it as a source for a whole dashboard and there are lots of metrics where distinct will be part/slicer for.
The way I doing it previously is with excel formula: =IF(SUMPRODUCT(($A$2:$A5000=$A2)*($G$2:$G5000=$G2))>1,0,1)
This does the trick and gives a "1" for the first instance of value in column G appearing in certain value range in column A ( column A is the week ) and gives "0" when the same value reappears for the same week value in column A. It will give "1" again when the week value change.
Since it is comparing 2 cells in each row for the entire columns of data as the data gets bigger this tends to crash.
I was so far unable to emulate this in Business Objects and I think I exhausted my abilities and googling.
Could anyone share their two cents on this please?

Assuming you have an object in the query that uniquely identifies a row, you can do this in a couple of simple steps.
Let's assume your query contains the following objects:
Sale ID
Name
Person
Sale Date
Week #
Price
etc.
You want to show a 1 for the first occurrence of each Name/Week #.
Step 1: Create a variable with the following definition. Let's call it [FirstOne]
=Min([Sale ID]) In ([Name];[Week #])
Step 2: In the report block, add a column with the following formula:
=If [FirstOne] = [Sale ID] Then 1 Else 0
This should produce a 1 in the row that represents the first occurrence of Name within a Week #. If you then wanted to show a 1 one the first occurrence of Name/Person/Week #, you could just modify the [FirstOne] variable accordingly:
=Min([Sale ID]) In ([Name];[Person];[Week #])

I think you want logic around row_number():
select t.*,
(case when 1 = row_number() over (partition by name, person, week, identifier
order by ??
)
then 1 else 0
end) as new_indicator
from t;
Note the ??. SQL tables represent unordered sets. There is no "first" row in a table or group of rows, unless a column specifies that ordering. The ?? is for such a column (perhaps a date/time column, perhaps an id).
If you only want one row to be marked, you can put anything there, such as order by (select null) or order by week.

Missing Expression Error when I try to convert the currency type

I want to display the output of total price from my table as 'USD 20.00' or 'EURO 40' for US and Euro respectively.
I tired the following code and got the missing expression error.
How to convert it to the specified format?
The data type of the column 'price' is number.
My code goes here..
select convert(varchar(50),convert(money, coalesce(sum(nvl(CASE
WHEN (typ=27 and l.tyt ='USD') THEN 'USD' + PRICE
WHEN (typ=27 and l.tyt='EURO') THEN 'EURO'+ PRICE
END,0)),0)),1) as Total from transactions;
Thanks in advance!

There are several things wrong with the expression. 1) adding string value ('USD'/'EURO') to numeric value (PRICE), 2) NVL() on different data type, 3) SUM() on a string data. Anyone of these could be the cause of the error.
If the table stores differently denominated prices, you cannot sum them into on row - which one do you want to be in the result (EURO or USD). If you want to have both in two rows, the the following should work (note that in your original statement, the alias l is undefined so this is not likely to work):
SELECT 'USD', Sum(price) as Total
FROM transactions
WHERE typ=27 and tyt ='USD'
UNION ALL
SELECT 'EURO', Sum(price) as Total
FROM transactions
WHERE typ=27 and tyt ='EURO'
or simply
SELECT tyt, Sum(price) as Total
FROM transactions
WHERE typ=27
GROUP BY tyt

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas