SQL YTD for previous years and this year - sql

Wondering if anyone can help with the code for this.
I want to query the data and get 2 entries, one for YTD previous year and one for this year YTD.
The only way I know how to do this is as 2 separate queries with WHERE clauses. I would prefer not to have to run the query twice.
Ideally there would be one column called DatePeriod, populated with 2011 YTD and 2012 YTD. It would be even better if I could get 2011 YTD, 2012 YTD, 2011 Total and 2012 Total, though I'm guessing this is 4 queries.
Thanks
EDIT:
To help clear a few things up:
This is being coded in MS SQL.
The data looks like so: (very basic example)
Date | Call_Volume
1/1/2012 | 4
What I would like is to have the Call_Volume summed up. I have queries that group it by week, and others that do it by month. I could pull all the dailies in and do this in Excel, but the table has millions of rows, so it is always best to reduce the size of my output.
I currently group by Week/Month and Year and UNION ALL the results so it's 1 output. That means I have 3 queries accessing the same table, which is a large pain, very slow and not efficient. That is fine, but now I also need a YTD, so it's either 1 more query or, if I could find a way to add it to the yearly query, that would be ideal:
So
DatePeriod | Sum_Calls
2011 Total | 40
2011 YTD | 12
2012 Total | 45
2012 YTD | 15
Hope this makes sense.

SQL is built to do operations on rows, not columns (you select columns, of course, but aggregate operations are all on rows).
The most standard approach to this is something like:
SELECT SUM(your_table.sales), YEAR(your_table.sale_date)
FROM your_table
GROUP BY YEAR(your_table.sale_date)
Now you'll get one row for each year on record, with no limit to how many years you can process. If you're already grouping by another field, that's fine; you'll then get one row for each year in each of those groups.
Your program can then iterate over the rows and organize/render them however you like.
If you absolutely, positively must have columns instead, you'll be stuck with something like this:
SELECT SUM(IF(YEAR(sale_date) = 2011, sales, 0)) AS total_2011,
SUM(IF(YEAR(sale_date) = 2012, sales, 0)) AS total_2012
FROM your_table
If you're building the query programmatically you can add as many of those column criteria as you need, but I wouldn't count on this running very efficiently.
(These examples are written with some MySQL-specific functions, such as IF(). Corresponding functions exist for other engines, but the syntax is a little different; in MS SQL, for example, a CASE expression takes the place of IF().)
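The same conditional-sum idea can produce the Total and YTD rows from the question in a single pass over the table. Below is a minimal sketch, run through Python's built-in sqlite3 module so it can be tried without a server. The table name calls, the columns call_date and call_volume, the sample rows, and the helper ytd_and_totals are all assumptions made for illustration, and "YTD" is taken to mean "on or before the same month and day as the cutoff date". The CASE expression itself is standard SQL; in MS SQL the date handling would use functions such as YEAR() rather than substr().

```python
import sqlite3

# Hypothetical table and sample data (names and values are assumptions).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calls (call_date TEXT, call_volume INTEGER)")
conn.executemany(
    "INSERT INTO calls VALUES (?, ?)",
    [
        ("2011-01-15", 5),
        ("2011-03-01", 7),
        ("2011-09-10", 28),
        ("2012-01-01", 4),
        ("2012-02-20", 11),
        ("2012-11-05", 30),
    ],
)


def ytd_and_totals(conn, as_of):
    """Return (DatePeriod, Sum_Calls) rows for every year in one table scan.

    as_of is an ISO date string; a row counts toward YTD when its
    month-day part is on or before the month-day part of as_of.
    """
    cutoff = as_of[5:]  # 'MM-DD'
    sql = """
        SELECT substr(call_date, 1, 4) AS yr,
               SUM(call_volume) AS total,
               SUM(CASE WHEN substr(call_date, 6) <= ?
                        THEN call_volume ELSE 0 END) AS ytd
        FROM calls
        GROUP BY yr
        ORDER BY yr
    """
    rows = []
    for yr, total, ytd in conn.execute(sql, (cutoff,)):
        # Turn the two aggregate columns into the two labelled rows.
        rows.append((f"{yr} Total", total))
        rows.append((f"{yr} YTD", ytd))
    return rows


result = ytd_and_totals(conn, "2012-03-31")
for period, calls in result:
    print(period, calls)
```

Here the query returns one row per year with two aggregates, and the calling code splits them into the DatePeriod / Sum_Calls layout. If the split has to happen in SQL, the aggregated result can be unioned with itself, which still touches the big table only once.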

Related

Stacking column data in excel SQL query

I'm trying to pull data from an Access database into an Excel table with an SQL query. The problem is that my Access database has columns with similar data that I want to combine into one single column. This should give me duplicates of the data in the other columns for each entry. I'm not great with SQL but I think I have the basics down.
Database structure that I have:
Date | Product | Hours 1 | Reason 1 | Hours 2 | Reason 2
2019 | A | 3 | "xxx" | 5 | "yyy"
Excel table that I want:
Date | Product | Hours | Reason
2019 | A | 3 | "xxx"
2019 | A | 5 | "yyy"
Also, I'm not sure if it's possible, but it would be great to see the source column of each row:
Date | Product | Hours | Reason | Source
2019 | A | 3 | "xxx" | "Hours 1"
2019 | A | 5 | "yyy" | "Hours 2"
I've tried UNION ALL and got duplicates of the data, but not merged into one column. I'm about to try INSERT INTO but I'm sort of lost on how to get each one into the same column.
Try this
SELECT [Date], Product, Hours, Reason, Source
FROM (
SELECT [Date], Product, [Hours 1] AS Hours, [Reason 1] AS Reason, "Hours 1" AS Source
FROM [Table]
UNION
SELECT [Date], Product, [Hours 2], [Reason 2], "Hours 2"
FROM [Table]
) AS t
It looks like you have a bad data structure in the table. By that I mean it's a "flat" table with multiple hours in one row for a record. This is generally a PITA when it comes to reviewing data in many-to-one situations. Normally there would be a table where records get logged separately for each hour involved. I understand you probably didn't build it, but it's worth pointing out for your own information.
Fundamentally, this issue would be easier to approach once you understand how that less-than-desirable structure affects what your task is. This is essentially, in my mind, a pivot problem. PIVOT in SQL is essentially switching rows and columns. There are many ways to pivot data with code, so pick your favorite. Most people actually use the PIVOT function, whereas I tend to teeter between CTEs (common table expressions) and PIVOT. IMO CTEs are easier to read once you understand them. Because Access SQL doesn't support PIVOT or CTEs, we just have to treat the body of what a CTE would've been as a subquery in the FROM clause (a derived table).
SELECT x.*
FROM
(
SELECT
[Date],
Product,
[Hours 1] AS Hours,
[Reason 1] AS Reason,
"Hours 1" AS Source
FROM yourTableName
UNION
SELECT
[Date],
Product,
[Hours 2] AS Hours,
[Reason 2] AS Reason,
"Hours 2" AS Source
FROM yourTableName
) x
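To make the stacking concrete, here is a small sketch that runs the same shape of query against an in-memory SQLite database from Python. It is not Access: SQLite quotes identifiers with double quotes instead of square brackets, and the table name downtime is an assumption. It also uses UNION ALL instead of UNION, on the assumption that two genuinely identical source rows should both be kept; UNION would collapse them into one.

```python
import sqlite3

# Hypothetical table mirroring the structure in the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE downtime ("Date" TEXT, Product TEXT, '
    '"Hours 1" INTEGER, "Reason 1" TEXT, '
    '"Hours 2" INTEGER, "Reason 2" TEXT)'
)
conn.execute("INSERT INTO downtime VALUES ('2019', 'A', 3, 'xxx', 5, 'yyy')")

# Each branch renames one Hours/Reason pair to the shared column names
# and tags the row with a literal naming the column it came from.
sql = """
    SELECT "Date", Product,
           "Hours 1" AS Hours, "Reason 1" AS Reason, 'Hours 1' AS Source
    FROM downtime
    UNION ALL
    SELECT "Date", Product, "Hours 2", "Reason 2", 'Hours 2'
    FROM downtime
    ORDER BY Source
"""
stacked = conn.execute(sql).fetchall()
for row in stacked:
    print(row)
```

The column names of the combined result come from the first SELECT, which is why only that branch needs the aliases.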

Most efficient way to subset a query to a list of dates (across two columns) in PostgreSQL?

I am querying a table in a PG database which contains a period (character varying(255)) and a value (integer), which looks something like:
|period|value|
|Months|3 |
|Months|6 |
|Weeks |1 |
|Years |5 |
After a few joins, I'm looking to subset my result set to only include a subset of these period / value combinations, for example I may only want 3 Months, 6 Months and 5 Years (so not 1 Weeks).
I'd usually reach to a WHERE IN(..) but don't think I can do this across two columns. Instead I've tried to make a composite column by:
CURRENT_DATE + CAST(CONCAT(tbl.value, tbl.period) AS INTERVAL)
Producing a column of timestamps which I can then subset with an IN('2019-05-18', '2019-08-18', '2024-02-18').
This works but isn't particularly pretty or efficient. Is there a better way?
I'm free to change my query (so I can subset by dates as I currently am, or by 3 and Months) but importantly I do not know ahead of time whether 2 Years will be stored as 24 Months (nor do I have control of the table).
Thanks!
You can say
WHERE (period, value) IN (('months', 5), ...)
and use an index over both columns.
I hope I got the syntax right; there might be a ROW missing somewhere.
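For anyone who wants to try the row comparison without a PostgreSQL server, here is a rough sketch using Python's built-in sqlite3 module. The table name terms and the wanted list are assumptions. SQLite (3.15 or later) needs a subquery on the right-hand side of a row-value IN, hence the extra VALUES keyword; the parenthesised list in the answer above is the PostgreSQL spelling.

```python
import sqlite3

# Hypothetical table holding the period/value pairs from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE terms (period TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO terms VALUES (?, ?)",
    [("Months", 3), ("Months", 6), ("Weeks", 1), ("Years", 5)],
)

# The combinations to keep; each pair becomes one "(?, ?)" row.
wanted = [("Months", 3), ("Months", 6), ("Years", 5)]
placeholders = ", ".join("(?, ?)" for _ in wanted)
sql = f"""
    SELECT period, value
    FROM terms
    WHERE (period, value) IN (VALUES {placeholders})
    ORDER BY period, value
"""
# Flatten the pairs into one parameter list matching the placeholders.
params = [item for pair in wanted for item in pair]
matched = conn.execute(sql, params).fetchall()
print(matched)
```

Note that the string comparison is case-sensitive here, so the pairs in the list have to match the stored spelling ("Months", not "months").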

SQL Statement - want daily dates rolled up and displayed as Year

I have two years' worth of data that I'm summing up, for instance:
Date | Ingredient_cost_Amount | Cost_Share_amount |
I'm looking at two years' worth of data, for 2012 and 2013.
I want to roll up all the totals so I have only two rows, one row for 2012 and one row for 2013. How do I write a SQL statement that will look at the dates but display only the 4-digit year instead of the 8-digit daily date? I suspect the sum piece of it will be taken care of by summing those columns with the calculations, so I'm really looking for help with how to convert a daily date to a 4-digit year.
Help is greatly appreciated.
select DATEPART(year,[Date]) [Year]
, sum(Ingredient_cost_Amount) Total
from #table
group by DATEPART(year,[Date])
Define a range/grouping table.
Something similar to the following should work in most RDBMSs:
SELECT Grouping.id, SUM(Ingredient.ingredient_cost_amount) AS Ingredient_Cost_Amount,
SUM(Ingredient.cost_share_amount) AS Cost_Share_Amount
FROM (VALUES (2013, DATE('2013-01-01'), DATE('2014-01-01')),
(2012, DATE('2012-01-01'), DATE('2013-01-01'))) Grouping(id, gStart, gEnd)
JOIN Ingredient
ON Ingredient.date >= Grouping.gStart
AND Ingredient.date < Grouping.gEnd
GROUP BY Grouping.id
(DATE() and related conversion functions are heavily DB dependent. Some RDBMSs don't support using VALUES this way, although there are other ways to create the virtual grouping table)
See this blog post for why I used an exclusive upper bound for the range.
Using a range table this way will potentially allow the db to use indices to help with the aggregation. How much this helps depends on a bunch of other factors, like the specific RDBMS used.
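As a rough illustration of the range-table join, the sketch below builds the grouping table as a real (tiny) table in an in-memory SQLite database and joins it to some made-up rows. The table name year_range, its columns, and the sample amounts are assumptions; dates are stored as ISO strings, which compare correctly as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical data table with a few made-up rows.
conn.execute(
    "CREATE TABLE ingredient "
    "(date TEXT, ingredient_cost_amount REAL, cost_share_amount REAL)"
)
conn.executemany(
    "INSERT INTO ingredient VALUES (?, ?, ?)",
    [
        ("2012-03-01", 10.0, 2.0),
        ("2012-12-31", 5.0, 1.0),
        ("2013-01-01", 7.0, 3.0),
        ("2013-06-15", 8.0, 4.0),
    ],
)

# The range table: one row per group, with an exclusive upper bound.
conn.execute("CREATE TABLE year_range (id INTEGER, g_start TEXT, g_end TEXT)")
conn.executemany(
    "INSERT INTO year_range VALUES (?, ?, ?)",
    [(2012, "2012-01-01", "2013-01-01"), (2013, "2013-01-01", "2014-01-01")],
)

sql = """
    SELECT r.id,
           SUM(i.ingredient_cost_amount) AS ingredient_cost_amount,
           SUM(i.cost_share_amount) AS cost_share_amount
    FROM year_range AS r
    JOIN ingredient AS i
      ON i.date >= r.g_start
     AND i.date < r.g_end
    GROUP BY r.id
    ORDER BY r.id
"""
totals = conn.execute(sql).fetchall()
print(totals)
```

Because the join compares the raw date column against constants instead of wrapping it in a function, the row for 2012-12-31 lands in 2012 and the one for 2013-01-01 in 2013 without any date arithmetic on the data side.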

Rank in powerpivot

In Powerpivot, I have a problem in ranking in Table 1, based on Sales and Year. I want to have the result like that:
Year Store Sales **Rank**
2013 A 200 3
2013 B 250 2
2013 C 300 1
2014 A 350 2
2014 B 300 3
2014 C 400 1
Which rank function could I use to have this rank result?
Thanks in advance.
Tran,
Probably the smartest way to go is to use the 'X' functions. They can be a bit tricky and non intuitive, yet are extremely powerful.
First, create a simple measure to calculate the total sales:
TotalSales:=SUM(Stores[Sales])
Then, use this formula below to calculate the rank (per store per year):
Rank:=RANKX(ALL(Stores[Store]), [TotalSales])
That should do what you are looking for. Once those two measures are ready, create a new PowerPivot table, drag Year and Store onto the rows pane and add the required values.
The ALL function overrides the Store filter applied by the rows, so each store is ranked against all stores, while the Year filter stays in place. That is what makes the rank restart for each year.
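If the filter logic is hard to picture, the ranking can be sketched outside PowerPivot. The Python below is only an illustration of the idea, not DAX: for each row it counts how many stores in the same year have strictly higher sales and adds one. The function name rank_within_year is made up for this sketch, and the rows are the ones from the question.

```python
from collections import defaultdict

# (Year, Store, Sales) rows taken from the question.
rows = [
    (2013, "A", 200),
    (2013, "B", 250),
    (2013, "C", 300),
    (2014, "A", 350),
    (2014, "B", 300),
    (2014, "C", 400),
]


def rank_within_year(rows):
    """Rank each store against the other stores of the same year.

    The highest sales value gets rank 1; stores with equal sales
    share a rank.
    """
    sales_by_year = defaultdict(list)
    for year, _store, sales in rows:
        sales_by_year[year].append(sales)

    ranked = []
    for year, store, sales in rows:
        # Rank = 1 + number of same-year stores with strictly higher sales.
        rank = 1 + sum(1 for other in sales_by_year[year] if other > sales)
        ranked.append((year, store, sales, rank))
    return ranked


ranked = rank_within_year(rows)
for row in ranked:
    print(row)
```

Grouping by year plays the role of the Year filter that stays active, and looping over every store in that year plays the role of ALL(Stores[Store]).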
The result should match the ranking shown in the question.
Hope this helps.

Is this SQL the most efficient way

We have a table that converts SAT scores into ACT scores using a year. If the data changes in the future, we would add the new scores along with the year the scores change. We need to pass in a year and SAT score and return the correct ACT score.
Sample data with three rows would be:
act sat year
28 1010 1998
29 1010 2012
30 1010 2015
If I pass in a SAT score of 1010 and a year of 2014, I should get an ACT score of 29 back.
I wrote the following SQL statement that works.
select act,
RANK() OVER(ORDER BY year DESC)
from keessattbl
where sat = 1010 and INT(year) <= 2014
FETCH FIRST ROW ONLY
Is this the most efficient way to handle this?
Thanks in advance, Doug
Another option would be to use the following:
select k1.*
from keessattbl k1
where k1.sat = 1010
and k1.year = (select max(k2.year)
from keessattbl k2
where k2.sat = k1.sat
and k2.year <= 2014)
You will need to check which one is more efficient. If year (and possibly sat) is indexed, then both are probably quite fast.
But you will need to look at the execution plan (or simply time the statements) to find out.
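To compare the two shapes of query side by side, here is a minimal sketch against an in-memory SQLite database. It is not DB2: LIMIT 1 stands in for FETCH FIRST ROW ONLY, the year column is assumed to be an integer, and the two helper function names are invented for the example. Timings on a three-row table mean nothing, so this only checks that both queries agree.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE keessattbl (act INTEGER, sat INTEGER, year INTEGER)")
conn.executemany(
    "INSERT INTO keessattbl VALUES (?, ?, ?)",
    [(28, 1010, 1998), (29, 1010, 2012), (30, 1010, 2015)],
)


def act_by_order(conn, sat, year):
    """Sort the candidate rows newest first and keep only the first one."""
    row = conn.execute(
        "SELECT act FROM keessattbl "
        "WHERE sat = ? AND year <= ? "
        "ORDER BY year DESC LIMIT 1",
        (sat, year),
    ).fetchone()
    return row[0] if row else None


def act_by_max_subquery(conn, sat, year):
    """Pick the row whose year equals the latest qualifying year."""
    row = conn.execute(
        "SELECT k1.act FROM keessattbl k1 "
        "WHERE k1.sat = ? "
        "AND k1.year = (SELECT MAX(k2.year) FROM keessattbl k2 "
        "               WHERE k2.sat = k1.sat AND k2.year <= ?)",
        (sat, year),
    ).fetchone()
    return row[0] if row else None


print(act_by_order(conn, 1010, 2014), act_by_max_subquery(conn, 1010, 2014))
```

The explicit ORDER BY in the first helper is what guarantees that the single row kept is the most recent one; without it, which row comes back first is up to the engine.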
I would say "Sure." Is it not performing well?
Also, most DBMSs have some way to get the first row of a result set, so you don't need to rely on DB2-specific syntax unless you want to.
If you are not sure whether it's the most efficient way to write it, you can check by doing an EXPLAIN on the query. Write the query another way, do an EXPLAIN on that too, and compare the costs. IBM provides the IBM Data Studio product for free. You can just right-click on your SQL and select Visual Explain to get the results in the GUI.