Writing equations in SQL using multiple variables - sql

I'm trying to use data that is labeled by year (2012 - 2016) to calculate CAGR. The data was originally in one column holding the total population, with another column holding the year. I've isolated the 2012 and 2016 data into two separate columns and am trying to use SQL to calculate the CAGR: ((data from 2016)/(data from 2012))^(1/4) - 1.
Is this the correct way to calculate CAGR/cumulative growth? I've tried simply using the two columns of data, but because their rows don't line up and they contain nulls, it doesn't work. Please let me know if you have any ideas.

Compound Annual Growth Rate (CAGR) doesn't really lend itself to what you're trying to do.
Usually this is used when you, say, invest $1000 in a fund and calculate the annual growth based on the ending value.
Example - if you invest $1000 and in 5 years it's worth $5000:
(5,000 / 1,000)^(1/5) - 1 = 0.37973 = 37.97%
If I were to write that in SQL Server it would be:
SELECT POWER(5000.0/1000.0, 1.0/5.0) - 1.0
You can replace 5000 and 1000 with the specific columns you want to compare, or with aggregates over the range of data you need to compare.
If you elaborate on your question, I will update this answer.
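For the layout described in the question (one year column, one population column), conditional aggregation is one way to put the 2012 and 2016 values on the same row without the mismatched nulls. This is only a sketch for SQL Server: the table name population_data and the columns region, data_year and total_population are hypothetical, so substitute your own.

```sql
-- Hypothetical table: population_data(region, data_year, total_population)
-- One output row per region; MAX(CASE ...) picks out the value for each year
-- and ignores the NULLs produced for the other year's rows.
SELECT region,
       POWER(
           CAST(MAX(CASE WHEN data_year = 2016 THEN total_population END) AS float)
           / NULLIF(MAX(CASE WHEN data_year = 2012 THEN total_population END), 0),
           1.0 / 4.0          -- 2012 to 2016 is 4 annual periods
       ) - 1.0 AS cagr
FROM population_data
WHERE data_year IN (2012, 2016)
GROUP BY region;
```

A region that is missing either year, or whose 2012 value is zero, comes back with a NULL cagr instead of raising an error.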

Related

How to calculate the avg time a tool stays on hold? oracle sql developer

I'm trying to calculate the average time a tool stays on loan. The time a tool stays on loan is the number of days between loan_status_change_date and tool_out_date (table columns). Both columns are dates, e.g. 01-SEP-17.
What's the best way to approach this?
We can do arithmetic with Oracle dates. It's not clear from the column names which one is the start of the loan and which the end; in the following example I've assumed loan_status_change_date is when the tool is returned.
select tool
, avg(loan_status_change_date - tool_out_date) as avg_loan_days
from your_table
group by tool
/
AVG() is an aggregate function, so it handles the division by n for us. The subtraction calculates the length of a particular loan, which is the value you want to average. Subtracting one Oracle date from another already yields a number of days, so no further transformation is necessary. If your columns have a time element, the result might not be an integer.
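To see what the date subtraction returns on its own, here is a small illustration against Oracle's dual table. The dates are made-up values, not data from the question.

```sql
-- Both values are at midnight, so the difference is a whole number of days.
select date '2017-09-08' - date '2017-09-01' as loan_days
from dual;
-- loan_days = 7

-- With a time element the difference is fractional (12 hours = 0.5 day).
select to_date('2017-09-08 12:00', 'YYYY-MM-DD HH24:MI') - date '2017-09-01' as loan_days
from dual;
-- loan_days = 7.5
```

If you would rather report whole days, wrapping the average in ROUND() is one option.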

SQL Statement - want daily dates rolled up and displayed as Year

I have two years' worth of data that I'm summing up, for instance:
Date | Ingredient_cost_Amount| Cost_Share_amount |
I'm looking at two years' worth of data, for 2012 and 2013.
I want to roll up all the totals so I have only two rows, one row for 2012 and one row for 2013. How do I write a SQL statement that will look at the dates but display only the 4-digit year instead of the 8-digit daily date? I suspect the sum piece of it will be taken care of by summing those columns with the calculations, so I'm really looking for help with how to convert a daily date to a 4-digit year.
Help is greatly appreciated.
select DATEPART(year,[Date]) [Year]
, sum(Ingredient_cost_Amount) Total
from #table
group by DATEPART(year,[Date])
Define a range/grouping table.
Something similar to the following should work in most RDBMSs:
SELECT Grouping.id, SUM(Ingredient.ingredient_cost_amount) AS Ingredient_Cost_Amount,
SUM(Ingredient.cost_share_amount) AS Cost_Share_Amount
FROM (VALUES (2013, DATE('2013-01-01'), DATE('2014-01-01')),
(2012, DATE('2012-01-01'), DATE('2013-01-01'))) Grouping(id, gStart, gEnd)
JOIN Ingredient
ON Ingredient.date >= Grouping.gStart
AND Ingredient.date < Grouping.gEnd
GROUP BY Grouping.id
(DATE() and related conversion functions are heavily DB-dependent. Some RDBMSs don't support using VALUES this way, although there are other ways to create the virtual grouping table.)
See this blog post for why I used an exclusive upper bound for the range.
Using a range table this way will potentially allow the db to use indices to help with the aggregation. How much this helps depends on a bunch of other factors, like the specific RDBMS used.
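For an RDBMS that doesn't accept VALUES in the FROM clause, one of the "other ways" is to build the same range table from SELECT ... UNION ALL in a common table expression. This is a sketch, not a tested drop-in: the alias year_range is my own, the CAST syntax should work in SQL Server, PostgreSQL and MySQL 8 but not necessarily elsewhere, and older Oracle versions need FROM dual on each SELECT.

```sql
-- Virtual range table: one row per year, with an exclusive upper bound.
WITH year_range (id, gStart, gEnd) AS (
    SELECT 2012, CAST('2012-01-01' AS DATE), CAST('2013-01-01' AS DATE)
    UNION ALL
    SELECT 2013, CAST('2013-01-01' AS DATE), CAST('2014-01-01' AS DATE)
)
SELECT year_range.id AS data_year,
       SUM(Ingredient.ingredient_cost_amount) AS Ingredient_Cost_Amount,
       SUM(Ingredient.cost_share_amount) AS Cost_Share_Amount
FROM year_range
JOIN Ingredient
  ON Ingredient.date >= year_range.gStart   -- inclusive lower bound
 AND Ingredient.date <  year_range.gEnd     -- exclusive upper bound
GROUP BY year_range.id
ORDER BY year_range.id;
```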

Finding Outliers In SQL

I am very new to SQL and have my data in an Access database (~50k rows) with the following structure:
State Year Date Price
CA 2012 1/2/13 5.00
NY 2013 1/2/13 6.00
NY 2013 1/7/13 7.00
A (State, Year) pair, though held in different columns here, represents a vintage (like a wine). So we talk about how the price of "CA 2012" moves throughout the year.
Because some of our data is entered manually into this database, there is opportunity for error. We would like to write a query that flags any suspicious entries for further review.
I have read many different questions and threads on the subject but have not found anything that addresses my main concern: how to find local outliers. The price can move up and down, so a price that is fine for one date range may be an outlier earlier in the year.
Update: I chunked my data into buckets of months so finding local outliers might be easier as a result of that. I'm still looking for good outlier detection methods I can implement in SQL.
Sometimes simple is best; there's no need for an intro to statistics yet. I would recommend starting with simple grouping. Within a grouped query you can get the average, the minimum, the maximum and other useful bits of data. Here are a couple of examples to get you started:
SELECT Table1.State, Table1.Yr, Count(Table1.Price) AS CountOfPrice, Min(Table1.Price) AS MinOfPrice, Max(Table1.Price) AS MaxOfPrice, Avg(Table1.Price) AS AvgOfPrice
FROM Table1
GROUP BY Table1.State, Table1.Yr;
Or (in case you want month data included)
SELECT Table1.State, Table1.Yr, Month([Dt]) AS Mnth, Count(Table1.Price) AS CountOfPrice, Min(Table1.Price) AS MinOfPrice, Max(Table1.Price) AS MaxOfPrice
FROM Table1
GROUP BY Table1.State, Table1.Yr, Month([Dt]);
Obviously you'll need to modify the table and field names. (Just so you know, though: 'Year' and 'Date' are both reserved words and are best not used for field names.)
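Building on the monthly grouping, one simple way to flag candidates for review is to compare each price with the average for its (State, Yr, month) bucket and list rows that sit more than some number of standard deviations away. This is only a sketch using the same assumed names (Table1, Yr, Dt). The threshold of 2 is an arbitrary starting point, and as far as I know Access's query editor does not accept comments, so remove the comment lines before running it.

```sql
-- s holds the average and sample standard deviation per State/Yr/month bucket.
SELECT t.State, t.Yr, t.Dt, t.Price, s.AvgOfPrice, s.StDevOfPrice
FROM Table1 AS t
INNER JOIN (
    SELECT State, Yr, Month([Dt]) AS Mnth,
           Avg(Price) AS AvgOfPrice,
           StDev(Price) AS StDevOfPrice
    FROM Table1
    GROUP BY State, Yr, Month([Dt])
) AS s
ON (t.State = s.State) AND (t.Yr = s.Yr)
-- Keep only rows from the matching month that are far from that month's average.
WHERE Month(t.[Dt]) = s.Mnth
  AND s.StDevOfPrice > 0
  AND Abs(t.Price - s.AvgOfPrice) > 2 * s.StDevOfPrice;
```

Buckets with a single row have no standard deviation, so they never produce a flag; those may need a separate check.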

is it possible to find out how much of the db data is older than some N years in SQL Server?

I have two databases in SQL Server. I want to find out how much of the data is older than (let's say) 3 years.
I know the database creation date. Currently I have around 550 GB of data (across both databases) spanning 7 years, and I want to know how much of that 550 GB is older than 3 years (or 5 years).
I was going through this link but couldn't get the expected data:
SQL SERVER – Query to find number Rows, Columns, ByteSize for each table in the current database – Find Biggest Table in Database
One solution that comes to mind is to find the total number of rows for all 7 years (easy to get) and the total number of rows for the first 5 years, counting from the creation date (I don't know how to get this number).
Then, if row_count_7_years accounts for 550 GB of data, how much does row_count_5_years account for? That would give me the approximate size.
Please help.
For such purposes you should keep a datetime field, as marc mentioned. I suppose you don't have one.
In your suggested solution you can get the total row count for your table (for all 7 years, I suppose), but you won't be able to get the row count for 5 years, because there is no date.
You can take the total number of records for 7 years and divide it by the number of years and, ONLY IF your database has been filling at a roughly constant rate, query for the top (number of rows in one year) * 5 rows ordered by row_number(). The result is the set of rows you should delete. But I wouldn't recommend this solution.
I would recommend altering your tables to add a datetime column to each of them. Before that, you should back up all the data and copy the backup somewhere. After 3 years you would be able to do your cleanup.
As mentioned above, you should have a date column. However, if you don't, then depending on the relationships between your tables you might be able to estimate the number of rows by joining to some other table that does have a datetime column. Otherwise, if you have an old backup (unlikely, but still), you can restore it to identify the delta.
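Once a table does have a usable date column, the share of old rows can be counted directly. The following is a rough sketch for SQL Server. The table dbo.Orders and the column created_at are hypothetical, and turning the row percentage into gigabytes assumes rows are of roughly similar size over time.

```sql
-- Everything dated before this point counts as "old" (3 years here).
DECLARE @cutoff datetime = DATEADD(year, -3, GETDATE());

SELECT COUNT(*) AS total_rows,
       SUM(CASE WHEN created_at < @cutoff THEN 1 ELSE 0 END) AS rows_older_than_cutoff,
       100.0 * SUM(CASE WHEN created_at < @cutoff THEN 1 ELSE 0 END)
             / NULLIF(COUNT(*), 0) AS pct_older_than_cutoff
FROM dbo.Orders;

-- Reports the reserved and data size for the same table; multiplying it by
-- the percentage above gives an approximate size of the old data.
EXEC sp_spaceused N'dbo.Orders';
```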

Sql Queries for finding the sales trend

Suppose I have a table which holds all the billing records. Now I want to see the sales trend over a user-given time period, grouped into 3-day intervals. What should the SQL query for this be?
Please help, otherwise I'm stuck.
I can only give a vague suggestion given how little the question specifies. You may want a derived column holding a standardised date number (as in the MS date format, just one number per day) and integer-divide it by 3 (or, equivalently, subtract its value modulo 3) so that all days in the same 3-day period get the same value. You can then group and aggregate over this column to get the totals for each 3-day period. To display the date nicely, you would have to multiply back by 3 and convert the number back to a date.
Again, I'm not sure of the specifics, but this general idea should get you a result. It may well not be the best way, so it would help to add more detail to the question.
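As a rough illustration of that idea in SQL Server, the sketch below numbers each day relative to the start of the requested period, integer-divides by 3 to get a bucket, and converts the bucket back to a date for display. The table billing and the columns bill_date and amount are hypothetical, as are the example dates.

```sql
-- User-supplied period: start is inclusive, end is exclusive.
DECLARE @start date = '2013-01-01';
DECLARE @end   date = '2013-02-01';

SELECT DATEADD(day, p.bucket * 3, @start) AS period_start,  -- first day of the 3-day period
       SUM(p.amount) AS total_sales
FROM (
    -- Days 0-2 after @start fall in bucket 0, days 3-5 in bucket 1, and so on
    -- (integer division, because DATEDIFF returns an int).
    SELECT DATEDIFF(day, @start, bill_date) / 3 AS bucket,
           amount
    FROM billing
    WHERE bill_date >= @start
      AND bill_date <  @end
) AS p
GROUP BY p.bucket
ORDER BY p.bucket;
```

Periods with no billing rows simply don't appear in the result; joining against a generated list of periods would be needed to show them as zero.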