MERGE with CASE in the same table - sql

I am using SQL Server 2014.
My table is called TF and this is what I have so far.
+-----------+------------+--------+--------------+
| IdProduct | Month | Sales | Accumulation |
+-----------+------------+--------+--------------+
| DSN101 | 01/01/2014 | 100 | ((1)) |
| DSN101 | 01/02/2014 | 50 | 50 |
| DSN101 | 01/03/2014 | 250 | 300 |
+-----------+------------+--------+--------------+
IdProduct is a string
Month is a Date
Sales and accumulation are float
The accumulation column was initially null and what I did next didn't work so I put the default value to 1.
This is how I update the table and fill it :
GO
MERGE INTO dbo.TF as A
USING dbo.TF as P
ON (A.IdProduct = P.IdProduct and MONTH(P.Month)=MONTH(A.Month)-1 and YEAR(P.Month)=YEAR(A.Month))
WHEN MATCHED THEN
UPDATE SET A.Accumulation = CASE
WHEN P.Accumulation Is not null then P.Accumulation+A.Sales-1
WHEN MONTH(A.Month)=1 and not exists(select P.Sales) then A.Sales
END;
So at first none of this would work obviously because of the first null that leads to a second null and then to third..
Now the first case works fine, the second doesn't and I just don't get why.
I tried many combinations with no success. What I need is that for the first month in every year the accumulation column equals simply the sales column.
I understand my code makes every line look for the previous one but I don't know how to make it stop when it's January.
Please help me !

It seems that you want Accumulation to contain the cumulative sum of the values in the Sales column. You don't need to store this in the table; you can just get it from a query:
select tf.*, sum(sales) over (order by month) - sales
from tf;
The - sales is because the accumulation appears to not include the current month's sales. If you only want it for the current year (which the merge statement suggests), then add a partition by:
select tf.*, sum(sales) over (partition by year(month) order by month) - sales
from tf;
And, if you really want to store this in the table, a CTE like this one is updatable in SQL Server:
with toupdate as (
select tf.*, sum(sales) over (order by month) - sales as newval
from tf
)
update toupdate
set accumulation = newval;
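A minimal sketch of the running total, run through Python's sqlite3 module rather than SQL Server so it can be tried anywhere (window functions need SQLite 3.25 or later). It assumes, based on the question's requirement that January's accumulation equals January's sales, that the total should include the current month; it also partitions by IdProduct and year. The 2015 row is made up to show the yearly reset.

```python
import sqlite3

# In-memory SQLite stand-in for the TF table (the question uses SQL Server 2014).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TF (IdProduct TEXT, Month TEXT, Sales REAL, Accumulation REAL)"
)
conn.executemany(
    "INSERT INTO TF (IdProduct, Month, Sales) VALUES (?, ?, ?)",
    [
        ("DSN101", "2014-01-01", 100.0),
        ("DSN101", "2014-02-01", 50.0),
        ("DSN101", "2014-03-01", 250.0),
        ("DSN101", "2015-01-01", 70.0),  # made-up row to show the yearly reset
    ],
)

# Running total per product and per year, including the current month.
running_totals = conn.execute(
    """
    SELECT IdProduct, Month, Sales,
           SUM(Sales) OVER (
               PARTITION BY IdProduct, strftime('%Y', Month)
               ORDER BY Month
           ) AS Accumulation
    FROM TF
    ORDER BY IdProduct, Month
    """
).fetchall()

for row in running_totals:
    print(row)
```

In SQL Server the partition would use YEAR(Month) instead of strftime; subtract Sales from the sum if the current month should be excluded.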

Related

Dividing sum results

I'm really sorry as this was probably answered before, but I couldn't find something that solved the problem.
In this case, I'm trying to get the result of dividing two sums in the same column.
| Id | month | budget | sales |
| -- | ----- | ------ | ----- |
| 1 | jan | 1000 | 800 |
| 2 | jan | 1000 | 850 |
| 1 | feb | 1200 | 800 |
| 2 | feb | 1100 | 850 |
What I want is to get the % of completion for each id and month (example: get 0,8 or 80% in a fifth column for id 1 in jan)
I have something like
sel
id,
month,
sum (daily_budget) as budget,
sum (daily_sales) as sales,
budget/sales over (partition by 1,2) as efectivenes
from sales
group by 1,2
I know I'm doing this wrong, but I'm kinda new to SQL and can't find the way :|
Thanks!
This should do it
CAST(ROUND(SUM(daily_sales) * 100.00 / SUM(daily_budget), 1) AS DECIMAL(5,2)) AS Effectiveness
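A small sketch of that expression inside a full grouped query, run with Python's sqlite3 module so it is easy to try. The table layout (one row per day with daily_budget and daily_sales) is an assumption taken from the question's query, and the daily rows are made up so that they add up to the monthly figures in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (id INTEGER, month TEXT, daily_budget REAL, daily_sales REAL)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        (1, "jan", 500, 400),   # two made-up daily rows that sum to 1000 / 800
        (1, "jan", 500, 400),
        (2, "jan", 1000, 850),
        (1, "feb", 1200, 800),
        (2, "feb", 1100, 850),
    ],
)

# Aggregate first, then divide the two sums; no window function is needed.
rows = conn.execute(
    """
    SELECT id, month,
           SUM(daily_budget) AS budget,
           SUM(daily_sales)  AS sales,
           ROUND(SUM(daily_sales) * 100.0 / SUM(daily_budget), 1) AS effectiveness
    FROM sales
    GROUP BY id, month
    """
).fetchall()

# Map (id, month) to the percentage of the budget that was reached.
effectiveness = {(r[0], r[1]): r[4] for r in rows}
print(effectiveness[(1, "jan")])
```

Multiplying by 100.0 before dividing forces decimal arithmetic, which matters on engines that would otherwise do integer division.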
I'm new at SQL too but maybe I can help. Try this? A window function isn't needed here; divide the two sums directly and group by both id and month:
sel
id,
month,
sum (daily_budget) as budget,
sum (daily_sales) as sales,
sum (daily_sales) * 1.0 / sum (daily_budget) as efectivenes
from sales
group by id, month
If you want to ALTER your table so that it contains a fifth column where the result of sales / budget is automatically calculated, all you need to do is add the formula to a generated column. The example I am about to show is based on MySQL Workbench.
Open MySQL Workbench
Find the table you wish to modify in the Navigator pane, right-click on it and select "Alter Table"
Add a new column to your table (a new row in the column grid). Make sure you select the NN (Not Null) and G (Generated Column) check boxes
In the Default/Expression column, simply enter the expression sales / budget.
Once you run your next query, you should see your column generated and populated with the calculated results. If you simply want the SQL statement to do the same from the console, it will be something like this: ALTER TABLE YOUR_TABLE_NAME ADD result FLOAT AS (sales / budget);

SQL Server query - keeping first and last unique records of a group

We are trying to remove and rank data in tables that is provided in a daily feed to our system. The example data of course isn't the actual product, but it clearly represents the concept.
Daily inserts:
data is imported daily into tables that continually updates the status of the products
the daily status updates tell us when products were listed, are they currently listed and then the last date it was listed
after a period of {X} time, we can normalize the data
Cleanup & ranking:
we are now trying to remove duplicate records for values in a group that fall in-between the first and last values
we also want to set identifiers for the records that represent the first and last occurrence of those unique values in that group
Sample data:
I've found that the photo is the easiest way to show the data, show what's needed and not needed - I hope this makes it easier and not obtuse.
In the sample data:
"ridgerapp" we want to keep the records for 03/12/17 & 06/12/17.
"ridgerapp" we want to delete the records that fall between the dates above.
"ridgerapp" we want to also set/update the records for 03/12/17 & 06/12/17 as the first and last occurrence - something like -
update table set 03/12/17 = 0 (first), 06/12/17 = 1 (last)
"sierra" is just another expanded data sample, and we want to keep the records for 12/06/16 and 12/11/16.
"sierra" delete the records that fall between 12/06/16 and 12/11/16.
"sierra" update the status/rank for the 12/06/16 and 12/11/16 records as the first and last occurrence.
update table set 12/06/16 = 0 (first), 12/11/16 = 1 (last).
Conclusion:
Using pseudo code, this is the overall objective:
select distinct records in table (using id,name,color,value as unique identifiers)
for the records in each group look at the history and find the top and bottom dates
delete records between top and bottom dates for each group
update the history with a status/rank (field name is rank) of 0 and 1 for values in each group
using the sample data, the results would end up
Updated table values:
23 ridgerapp blue 25 03/12/17 0
23 ridgerapp blue 25 06/12/17 1
57 sierra red 15 12/06/16 0
57 sierra red 15 12/11/16 1
I'd use a CTE with the row_number() window function to find the first and last rows for each group, and then update it.
You didn't specify what makes a group a group, so I based this only on the ID. If you want the group to be a set of columns, e.g. ID, Color and Value, then just add those columns to the partition by list. For the sample data the result would be the same, but different sample data could have different outcomes.
Notice I didn't include the exact rows for the sierra group because I wanted to show you how it'd handle duplicate history dates.
declare @table table (id int, [name] varchar(64), color varchar(16), [value] int, history date)
insert into @table
values
(23,'ridgerapp','blue',25,'20170312'),
(23,'ridgerapp','blue',25,'20170325'),
(23,'ridgerapp','blue',25,'20170410'),
(23,'ridgerapp','blue',25,'20170610'),
(23,'ridgerapp','blue',25,'20170612'),
(57,'sierra','red',15,'20161206'),
(57,'sierra','red',15,'20161208'),
(57,'sierra','red',15,'20161210'),
(57,'sierra','red',15,'20161210') --notice this is a duplicate row
;with cte as(
select
*
,fst = row_number() over (partition by id order by history asc)
,lst = row_number() over (partition by id order by history desc)
from @table
)
--remove every row that is neither the first nor the last of its group
delete from cte
where fst !=1 and lst !=1
--flag the earliest remaining row as 0 (first) and the other one as 1 (last)
select
*
,flag = case when row_number() over (partition by id order by history asc) = 1 then 0 else 1 end
from @table
RETURNS
+----+-----------+-------+-------+------------+------+
| id | name | color | value | history | flag |
+----+-----------+-------+-------+------------+------+
| 23 | ridgerapp | blue | 25 | 2017-03-12 | 0 |
| 23 | ridgerapp | blue | 25 | 2017-06-12 | 1 |
| 57 | sierra | red | 15 | 2016-12-06 | 0 |
| 57 | sierra | red | 15 | 2016-12-10 | 1 |
+----+-----------+-------+-------+------------+------+
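The same delete-then-flag idea can be sketched against a permanent table. This version runs on SQLite through Python's sqlite3 module (not SQL Server), groups on id, name, color and value as the question's pseudo code suggests, and writes the 0/1 marker into the rank column instead of only selecting it. The table name products and the in-between dates are assumptions; only the first and last dates come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE products (id INTEGER, name TEXT, color TEXT, value INTEGER, '
    'history TEXT, "rank" INTEGER)'
)
conn.executemany(
    "INSERT INTO products (id, name, color, value, history) VALUES (?, ?, ?, ?, ?)",
    [
        (23, "ridgerapp", "blue", 25, "2017-03-12"),
        (23, "ridgerapp", "blue", 25, "2017-03-25"),
        (23, "ridgerapp", "blue", 25, "2017-04-10"),
        (23, "ridgerapp", "blue", 25, "2017-06-10"),
        (23, "ridgerapp", "blue", 25, "2017-06-12"),
        (57, "sierra", "red", 15, "2016-12-06"),
        (57, "sierra", "red", 15, "2016-12-08"),
        (57, "sierra", "red", 15, "2016-12-10"),
        (57, "sierra", "red", 15, "2016-12-11"),
    ],
)

# Step 1: delete rows that are neither the first nor the last of their group.
# rowid breaks ties so duplicate dates are handled deterministically.
conn.execute(
    """
    DELETE FROM products
    WHERE rowid IN (
        SELECT rid FROM (
            SELECT rowid AS rid,
                   ROW_NUMBER() OVER (PARTITION BY id, name, color, value
                                      ORDER BY history ASC, rowid ASC) AS fst,
                   ROW_NUMBER() OVER (PARTITION BY id, name, color, value
                                      ORDER BY history DESC, rowid DESC) AS lst
            FROM products
        )
        WHERE fst <> 1 AND lst <> 1
    )
    """
)

# Step 2: mark the earliest remaining row of each group 0 and the other one 1.
conn.execute(
    """
    UPDATE products
    SET "rank" = CASE WHEN history = (
            SELECT MIN(p.history) FROM products AS p
            WHERE p.id = products.id AND p.name = products.name
              AND p.color = products.color AND p.value = products.value
        ) THEN 0 ELSE 1 END
    """
)

kept = conn.execute(
    'SELECT id, name, color, value, history, "rank" FROM products ORDER BY id, history'
).fetchall()
for row in kept:
    print(row)
```

A group that only ever had one row keeps that row with rank 0, since it is both the first and the last occurrence.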

SQL Server, complex query

I have an Azure SQL Database table which is filled by importing XML-files.
The order of the files is random so I could get something like this:
ID | Name | DateFile | IsCorrection | Period | Other data
1 | Mr. A | March, 1 | false | 3 | Foo
20 | Mr. A | March, 1 | true | 2 | Foo
13 | Mr. A | Apr, 3 | true | 2 | Foo
4 | Mr. B | Feb, 1 | false | 2 | Foo
This table is joined with another table, which is also joined with a 3rd table.
I need to get the join of these 3 tables for the person with the newest data, based on Period, DateFile and Correction.
In my above example, Id=1 is the original data for Period 3, I need this record.
But in the same file was also a correction for Period 2 (Id=20) and in the file of April, the data was corrected again (Id=13).
So for Period 3, I need Id=1, for Period 2 I need Id=13 because it has the last corrected data and I need Id=4 because it is another person.
I would like to do this in a view, but using a stored procedure would not be a problem.
I have no idea how to solve this. Any pointers will be much appreciated.
EDIT:
My datamodel is of course much more complex than this sample. DateFile and Period are DateTime types in the table. Actually Period is two DateTime columns: StartPeriod and EndPeriod.
Well, looking at your data I believe we can disregard the IsCorrection column and just pick the latest row for each user/period.
Let's start by ordering the rows, placing the latest on top (ImportedData is a placeholder for your table name):
SELECT ROW_NUMBER() OVER (PARTITION BY Period, Name ORDER by DateFile DESC), * FROM ImportedData
And from this result you select all rows with row number 1:
;with numberedRows as (
SELECT ROW_NUMBER() OVER (PARTITION BY Period, Name ORDER by DateFile DESC) as rowIndex, *
FROM ImportedData -- placeholder for your table name
)
select * from numberedRows where rowIndex=1
The PARTITION BY tells ROW_NUMBER() to restart the counter for each distinct combination of Period and Name. The ORDER BY tells ROW_NUMBER() that we want the newest row to be number 1, with older rows after it. We only need the latest row.
The WITH declares a "common table expression" which is a kind of subquery or temporary table.
Not knowing your exact data, I might recommend something wrong, but you should be able to join the last query with the other tables to get your desired result.
Something like:
;with numberedRows as (
SELECT ROW_NUMBER() OVER (PARTITION BY Period, Name ORDER by DateFile DESC) as rowIndex, *
FROM ImportedData -- placeholder for your table name
)
select * from numberedRows a
JOIN periods b on b.empId = a.Id
JOIN msg c on b.msgId = c.Id
where a.rowIndex=1
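To check the ROW_NUMBER() approach against the sample rows from the question, here is a rough sketch using Python's sqlite3 module (SQLite 3.25+ rather than Azure SQL). The table name imported and the year in the dates are assumptions; the question only gives day and month.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE imported (Id INTEGER, Name TEXT, DateFile TEXT, "
    "IsCorrection INTEGER, Period INTEGER)"
)
conn.executemany(
    "INSERT INTO imported VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Mr. A", "2018-03-01", 0, 3),   # original data for period 3
        (20, "Mr. A", "2018-03-01", 1, 2),  # first correction for period 2
        (13, "Mr. A", "2018-04-03", 1, 2),  # later correction for period 2
        (4, "Mr. B", "2018-02-01", 0, 2),   # another person
    ],
)

# Number the rows of each (Period, Name) group from newest to oldest file,
# then keep only row number 1 of every group.
latest_ids = [
    row[0]
    for row in conn.execute(
        """
        WITH numberedRows AS (
            SELECT ROW_NUMBER() OVER (
                       PARTITION BY Period, Name ORDER BY DateFile DESC
                   ) AS rowIndex, *
            FROM imported
        )
        SELECT Id FROM numberedRows WHERE rowIndex = 1 ORDER BY Id
        """
    )
]
print(latest_ids)
```

With the real model, the partition would presumably use StartPeriod and EndPeriod in place of Period.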

SQL to find max of sum of data in one table, with extra columns

Apologies if this has been asked elsewhere. I have been looking on Stack Overflow all day and haven't found an answer yet. I am struggling to write the query to find the highest month's sales for each state from this example data.
The data looks like this:
| order_id | month | cust_id | state | prod_id | order_total |
+-----------+--------+----------+--------+----------+--------------+
| 67212 | June | 10001 | ca | 909 | 13 |
| 69090 | June | 10011 | fl | 44 | 76 |
... etc ...
My query
SELECT `month`, `state`, SUM(order_total) AS sales
FROM orders GROUP BY `month`, `state`
ORDER BY sales;
| month | state | sales |
+------------+--------+--------+
| September | wy | 435 |
| January | wy | 631 |
... etc ...
returns a few hundred rows: the sum of sales for each month for each state. I want it to only return the month with the highest sum of sales, but for each state. It might be a different month for different states.
This query
SELECT `state`, MAX(order_sum) as topmonth
FROM (SELECT `state`, SUM(order_total) order_sum FROM orders GROUP BY `month`,`state`)
GROUP BY `state`;
| state | topmonth |
+--------+-----------+
| ca | 119586 |
| ga | 30140 |
returns the correct number of rows with the correct data. BUT I would also like the query to give me the month column. Whatever I try with GROUP BY, I cannot find a way to limit the results to one record per state. I have tried PartitionBy without success, and have also tried unsuccessfully to do a join.
TL;DR: one query gives me the correct columns but too many rows; the other query gives me the correct number of rows (and the correct data) but insufficient columns.
Any suggestions to make this work would be most gratefully received.
I am using Apache Drill, which is apparently ANSI-SQL compliant. Hopefully that doesn't make much difference - I am assuming that the solution would be similar across all SQL engines.
This one should do the trick
SELECT t1.`month`, t1.`state`, t1.`sales`
FROM (
/* this one selects month, state and sales*/
SELECT `month`, `state`, SUM(order_total) AS sales
FROM orders
GROUP BY `month`, `state`
) AS t1
JOIN (
/* this one selects the best value for each state */
SELECT `state`, MAX(sales) AS best_month
FROM (
SELECT `month`, `state`, SUM(order_total) AS sales
FROM orders
GROUP BY `month`, `state`
)
GROUP BY `state`
) AS t2
ON t1.`state` = t2.`state` AND
t1.`sales` = t2.`best_month`
It's basically the combination of the two queries you wrote.
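An alternative that avoids the self-join is to rank the monthly sums inside each state with ROW_NUMBER() and keep the top one. The sketch below is not the query above; it runs on SQLite via Python's sqlite3 module (the question uses Apache Drill, which also offers window functions), and the order rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (month TEXT, state TEXT, order_total INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("June", "ca", 13), ("June", "ca", 20), ("July", "ca", 50),
        ("June", "fl", 76), ("July", "fl", 10), ("July", "fl", 5),
    ],
)

top_months = conn.execute(
    """
    SELECT state, month, sales
    FROM (
        -- rank each state's months by total sales, best first
        SELECT state, month, sales,
               ROW_NUMBER() OVER (
                   PARTITION BY state ORDER BY sales DESC, month
               ) AS rn
        FROM (
            -- one row per month and state with the summed sales
            SELECT month, state, SUM(order_total) AS sales
            FROM orders
            GROUP BY month, state
        ) AS monthly
    ) AS ranked
    WHERE rn = 1
    ORDER BY state
    """
).fetchall()

for row in top_months:
    print(row)
```

ROW_NUMBER() returns exactly one row per state even when two months tie; swapping it for RANK() would return every tied month, which is also what the join version does.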
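Try this (TOP is SQL Server syntax, and this returns every state's sales for the single best month overall rather than the best month per state):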
SELECT `month`, `state`, SUM(order_total) FROM orders WHERE `month` IN
( SELECT TOP 1 t.month FROM ( SELECT `month` AS month, SUM(order_total) order_sum FROM orders GROUP BY `month`
ORDER BY order_sum DESC) t)
GROUP BY `month`, state ;

SQL to find the date when the price last changed

Input:
Date Price
12/27 5
12/21 5
12/20 4
12/19 4
12/15 5
Required Output:
The earliest date when the price was set in comparison to the current price.
For e.g., price has been 5 since 12/21.
The answer cannot be 12/15 as we are interested in finding the earliest date where the price was the same as the current price without changing in value(on 12/20, the price has been changed to 4)
This should be about right. You didn't provide table structures or names, so...
DECLARE @CurrentPrice MONEY
SELECT TOP 1 @CurrentPrice=Price FROM Table ORDER BY Date DESC
SELECT MIN(Date) FROM Table WHERE Price=@CurrentPrice AND Date>(
SELECT MAX(Date) FROM Table WHERE Price<>@CurrentPrice
)
In one query:
SELECT MIN(Date)
FROM Table
WHERE Date >
( SELECT MAX(Date)
FROM Table
WHERE Price <>
( SELECT TOP 1 Price
FROM Table
ORDER BY Date DESC
)
)
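The single-query version can be checked against the input from the question. This is a sketch on SQLite through Python's sqlite3 module, so LIMIT 1 replaces TOP 1; the table name prices, the column name price_date and the year 2011 are assumptions, since the question gives none of them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prices (price_date TEXT, price INTEGER)")
conn.executemany(
    "INSERT INTO prices VALUES (?, ?)",
    [
        ("2011-12-27", 5),
        ("2011-12-21", 5),
        ("2011-12-20", 4),
        ("2011-12-19", 4),
        ("2011-12-15", 5),
    ],
)

# Innermost query: the current (most recent) price.
# Middle query: the last date on which the price was different.
# Outer query: the first date after that, i.e. when the current price was set.
last_change = conn.execute(
    """
    SELECT MIN(price_date)
    FROM prices
    WHERE price_date > (
        SELECT MAX(price_date)
        FROM prices
        WHERE price <> (
            SELECT price FROM prices ORDER BY price_date DESC LIMIT 1
        )
    )
    """
).fetchone()[0]
print(last_change)
```

If the price has never changed, the middle query yields NULL and the whole query returns NULL, so that case needs separate handling.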
This question kind of makes no sense, so I'm not 100% sure what you are after.
Create four columns: old_price, new_price, old_date, new_date.
If old_price = new_price, simply print the old_date.
What database server are you using? If it was Oracle, I would use their windowing function. Anyway, here is a quick version that works in mysql:
Here is the sample data:
+------------+------------+---------------+
| date | product_id | price_on_date |
+------------+------------+---------------+
| 2011-01-01 | 1 | 5 |
| 2011-01-03 | 1 | 4 |
| 2011-01-05 | 1 | 6 |
+------------+------------+---------------+
Here is the query (it only works if you have 1 product - will have to add a "and product_id = ..." condition on the where clause if otherwise).
SELECT p.date as last_price_change_date
FROM test.prices p
left join test.prices p2 on p.product_id = p2.product_id and p.date < p2.date
where p.price_on_date - p2.price_on_date <> 0
order by p.date desc
limit 1
In this case, it will return "2011-01-03".
Not a perfect solution, but I believe it works. Have not tested on a larger dataset, though.
Make sure to create indexes on date and product_id, as it will otherwise bring your database server to its knees and beg for mercy.
Bernardo.