Using Update in a Common Table Expression - sql

Below is a CTE that is part of a larger query. I am trying to do something simple: update or replace the values of all '2016-11' and '2016-12' records with the '2016-10' values.
The query runs into an error at UPDATE. Is there an alternative here that would make this query work?
with q (month, cobrand, members) as
(select date_trunc('month',optimized_transaction_date), cobrand_id,
count(distinct unique_mem_id)
from yi_fourmpanel.card_panel
Where (cobrand_id = '10001372' or cobrand_id = '10005640' or cobrand_id = '10005244')
group by 1,2)
UPDATE q
SET members = dc
FROM (SELECT cobrand, members dc
FROM q
WHERE month = '2016-10') x
WHERE q.cobrand = x.cobrand
AND month IN ('2016-11', '2016-12')

If you don't want to change table data, don't use UPDATE.
To achieve what you want to do, write the main query similar to this:
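WITH q(month, cobrand, members) AS (...)
SELECT
month,
cobrand,
CASE WHEN month IN ('2016-11', '2016-12')
-- correlate on cobrand so the subquery returns a single row
THEN (SELECT q1.members
FROM q q1
WHERE q1.month = '2016-10'
AND q1.cobrand = q.cobrand)
ELSE members
END AS members
FROM q;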
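The CASE approach above can be tried out in a small, self-contained sketch. It uses Python's sqlite3 module only so that it runs anywhere. The table monthly_members and its sample rows are hypothetical stand-ins for the output of the CTE, and the months are stored as plain text here rather than as date_trunc results.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table standing in for the aggregated CTE output.
conn.executescript("""
    CREATE TABLE monthly_members (month TEXT, cobrand TEXT, members INTEGER);
    INSERT INTO monthly_members VALUES
        ('2016-09', 'A', 90),  ('2016-10', 'A', 100),
        ('2016-11', 'A', 110), ('2016-12', 'A', 120),
        ('2016-10', 'B', 200), ('2016-11', 'B', 210);
""")

rows = conn.execute("""
    WITH q(month, cobrand, members) AS (
        SELECT month, cobrand, members FROM monthly_members
    )
    SELECT q.month,
           q.cobrand,
           CASE WHEN q.month IN ('2016-11', '2016-12')
                -- take the October value of the same cobrand
                THEN (SELECT q1.members
                      FROM q AS q1
                      WHERE q1.month = '2016-10'
                        AND q1.cobrand = q.cobrand)
                ELSE q.members
           END AS members
    FROM q
    ORDER BY q.cobrand, q.month
""").fetchall()

for row in rows:
    print(row)
```

No table data is modified: the replacement happens only in the result set of the SELECT.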

Related

Avoiding aggregation when selecting values from tables

I have the following code, which selects values from table2 when 'Some string' occurs more than once in 1990:
SELECT a.value, COUNT(*) AS test
FROM table1 c
JOIN table2 a
ON c.value2 = a.value_2
JOIN table3 o
ON c.value3 = o.value_3
AND o.value4 = 1990
WHERE c.string = 'Some string'
GROUP BY a.value
HAVING COUNT(*) > 1
This works fine, but I am attempting to write a query that produces a similar result without using aggregation. I just need to select the values that have more than one matching c.string row, rather than counting and selecting the count as well. I thought about searching for pairs of 'Some string' occurring in 1990 for a value, but I am unsure how to execute this. A pointer in the right direction would be appreciated, as I am struggling to find any documentation on this. Thank you!
Use the window function ROW_NUMBER() to assign a sequence number to the rows of each table2.value, and the window function FIRST_VALUE() to get the largest row number for each table2.value. Use DISTINCT to remove the duplicates:
select distinct value, first_value(rn) over (partition by value order by rn desc) as count
from
(
SELECT a.value , row_number() over (partition by a.value order by null) rn
FROM table1 c
JOIN table2 a
ON c.value2 = a.value_2
JOIN table3 o
ON c.value3 = o.value_3
AND o.value4 = 1990
WHERE c.string = 'Some string' ) t
where rn > 1;
To check for duplicates, you can use WHERE EXISTS. As a starting point, you could read this:
https://www.w3schools.com/sql/sql_exists.asp
This will give you quite a long, cumbersome piece of code compared to using aggregation. But I expect that's the point of the task - to show how useful aggregation is.
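A minimal sketch of the WHERE EXISTS idea, run through Python's sqlite3 module so it is self-contained. The table matches is hypothetical: it stands in for the rows produced by the three-way join, and the sketch assumes each of those rows carries a unique key (id), which the original question does not state.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: one row per joined match, with a unique id.
conn.executescript("""
    CREATE TABLE matches (id INTEGER PRIMARY KEY, value TEXT);
    INSERT INTO matches (id, value) VALUES
        (1, 'x'), (2, 'x'), (3, 'y'), (4, 'z'), (5, 'z'), (6, 'z');
""")

rows = conn.execute("""
    SELECT DISTINCT m.value
    FROM matches AS m
    WHERE EXISTS (
        -- a second, different row with the same value
        SELECT 1
        FROM matches AS other
        WHERE other.value = m.value
          AND other.id <> m.id
    )
    ORDER BY m.value
""").fetchall()

values = [r[0] for r in rows]
print(values)
```

The query keeps a value only if some other row shares it, so no COUNT or GROUP BY is needed.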

Combine 2 complex queries into 1

I am trying to figure out if there's a way to combine these 2 queries into a single one. I've run into the limits of what I know and can't figure out if this is possible or not.
This is the 1st query that gets last year sales for each day per location (for one month):
if object_id('tempdb..#LY_Data') is not null drop table #LY_Data
select
[LocationId] = ri.LocationId,
[LY_Date] = convert(date, ri.ReceiptDate),
[LY_Trans] = count(distinct ri.SalesReceiptId),
[LY_SoldQty] = convert(money, sum(ri.Qty)),
[LY_RetailAmount] = convert(money, sum(ri.ExtendedPrice)),
[LY_NetSalesAmount] = convert(money, sum(ri.ExtendedAmount))
into #LY_Data
from rpt.SalesReceiptItem ri
join #Location l
on ri.LocationId = l.Id
where ri.Ignored = 0
and ri.LineType = 1 /*Item*/
and ri.ReceiptDate between @_LYDateFrom and @_LYDateTo
group by
ri.LocationId,
ri.ReceiptDate
Then the 2nd query computes a ratio based on the total sales for that month for each day (to be used later):
if object_id('tempdb..#LY_Data2') is not null drop table #LY_Data2
select
[LocationId] = ly.LocationId,
[LY_Date] = ly.LY_Date,
[LY_Trans] = ly.LY_Trans,
[LY_RetailAmount] = ly.LY_RetailAmount,
[LY_NetSalesAmount] = ly.LY_NetSalesAmount,
[Ratio] = ly.LY_NetSalesAmount / t.MonthlySales
into #LY_Data2
from (
select
[LocationId] = ly.LocationId,
[MonthlySales] = sum(ly.LY_NetSalesAmount)
from #LY_Data ly
group by
ly.LocationId
) t
join #LY_Data ly
on t.LocationId = ly.LocationId
I've tried using the first query as a subquery in the from clause of the grouped subquery in the 2nd query, but that won't let me select those columns in the outermost select statement (the multi-part identifier could not be bound).
I ran into the same issue when putting the first query into the join clause at the end of the 2nd query.
There's probably something I'm missing, but I'm still pretty new to SQL, so any help or just a pointer in the right direction would be greatly appreciated! :)
You can try using a Common Table Expression (CTE) and window function:
if object_id('tempdb..#LY_Data') is not null drop table #LY_Data
;with
cte AS
(
select
[LocationId] = ri.LocationId,
[LY_Date] = convert(date, ri.ReceiptDate),
[LY_Trans] = count(distinct ri.SalesReceiptId),
[LY_SoldQty] = convert(money, sum(ri.Qty)),
[LY_RetailAmount] = convert(money, sum(ri.ExtendedPrice)),
[LY_NetSalesAmount] = convert(money, sum(ri.ExtendedAmount))
from rpt.SalesReceiptItem ri
join #Location l
on ri.LocationId = l.Id
where ri.Ignored = 0
and ri.LineType = 1 /*Item*/
and ri.ReceiptDate between @_LYDateFrom and @_LYDateTo
group by
ri.LocationId,
ri.ReceiptDate
)
select
[LocationId] = cte.LocationId,
[LY_Date] = cte.LY_Date,
...
[Ratio] = cte.LY_NetSalesAmount / sum(cte.LY_NetSalesAmount) over (partition by cte.LocationId)
into #LY_Data
from cte
sum(cte.LY_NetSalesAmount) over (partition by cte.LocationId) gives you the sum for each LocationId. The code assumes that this sum is always non-zero; otherwise, a divide-by-zero error will occur.
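One possible way to guard against a zero total is to wrap the windowed sum in NULLIF, so the ratio becomes NULL instead of raising an error. The sketch below uses Python's sqlite3 module (which needs SQLite 3.25 or later for window functions) only to be runnable. The table daily_sales and its rows are hypothetical, but NULLIF and SUM ... OVER (PARTITION BY ...) are also available in SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical per-day sales, standing in for the CTE output.
conn.executescript("""
    CREATE TABLE daily_sales (location_id INTEGER, day TEXT, net REAL);
    INSERT INTO daily_sales VALUES
        (1, '2016-01-01', 10.0), (1, '2016-01-02', 30.0),
        (2, '2016-01-01', 0.0),  (2, '2016-01-02', 0.0);
""")

rows = conn.execute("""
    SELECT location_id,
           day,
           -- NULLIF turns a zero total into NULL, so the division yields NULL
           net / NULLIF(SUM(net) OVER (PARTITION BY location_id), 0) AS ratio
    FROM daily_sales
    ORDER BY location_id, day
""").fetchall()

for row in rows:
    print(row)
```

Location 2 has a total of zero, so its ratios come back as NULL (None in Python) rather than failing the whole query.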
It seems like all you need to do is calculate the ratio in the first query.
You can do this with a correlated subquery that applies the same filters as the outer query:
SELECT
...
convert(money, sum(ri.ExtendedAmount)/(SELECT sum(ri2.ExtendedAmount)
FROM rpt.SalesReceiptItem ri2
WHERE ri2.LocationId=ri.LocationId
AND ri2.Ignored = 0
AND ri2.LineType = 1 /*Item*/
AND ri2.ReceiptDate between @_LYDateFrom and @_LYDateTo
)
) AS ratio --extended amount/total extended amount for this location

Use of MAX function in SQL query to filter data

The code below joins two tables, and I need to extract only the latest date per account, even though the tables hold multiple accounts and history records. I wanted to use the MAX function, but I am not sure how to incorporate it in this case. I am using My SQL server.
I appreciate any help!
select
PROP.FileName,PROP.InsName, PROP.Status,
PROP.FileTime, PROP.SubmissionNo, PROP.PolNo,
PROP.EffDate,PROP.ExpDate, PROP.Region,
PROP.Underwriter, PROP_DATA.Data , PROP_DATA.Label
from
Property.dbo.PROP
inner join
Property.dbo.PROP_DATA on Property.dbo.PROP.FileID = Actuarial.dbo.PROP_DATA.FileID
where
(PROP_DATA.Label in ('Occupancy' , 'OccupancyTIV'))
and (PROP.EffDate >= '42278' and PROP.EffDate <= '42643')
and (PROP.Status = 'Bound')
and (Prop.FileTime = Max(Prop.FileTime))
order by
PROP.EffDate DESC
Assuming your DBMS supports windowing functions and the with clause, a max windowing function would work:
with all_data as (
select
PROP.FileName,PROP.InsName, PROP.Status,
PROP.FileTime, PROP.SubmissionNo, PROP.PolNo,
PROP.EffDate,PROP.ExpDate, PROP.Region,
PROP.Underwriter, PROP_DATA.Data , PROP_DATA.Label,
max (PROP.EffDate) over (partition by PROP.PolNo) as max_date
from Actuarial.dbo.PROP
inner join Actuarial.dbo.PROP_DATA
on Actuarial.dbo.PROP.FileID = Actuarial.dbo.PROP_DATA.FileID
where (PROP_DATA.Label in ('Occupancy' , 'OccupancyTIV'))
and (PROP.EffDate >= '42278' and PROP.EffDate <= '42643')
and (PROP.Status = 'Bound')
)
select
FileName, InsName, Status, FileTime, SubmissionNo,
PolNo, EffDate, ExpDate, Region, UnderWriter, Data, Label
from all_data
where EffDate = max_date
ORDER BY EffDate DESC
This also presupposes that any given account would not have two records on the same EffDate. If it can, and there is no other objective means to determine the latest record, you could also use row_number to pick a somewhat arbitrary record in the case of a tie.
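A rough sketch of the row_number tie-break mentioned above, run through Python's sqlite3 module (SQLite 3.25 or later) so it is self-contained. The table prop, its lowercase column names, and the choice of the highest file_id as the tie-breaker are all assumptions made for illustration; the original post does not say which record should win a tie.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified version of the PROP table.
conn.executescript("""
    CREATE TABLE prop (file_id INTEGER, pol_no TEXT, eff_date INTEGER);
    INSERT INTO prop VALUES
        (1, 'P1', 42300),
        (2, 'P1', 42400),
        (3, 'P1', 42400),  -- same eff_date as file_id 2: a tie
        (4, 'P2', 42500);
""")

rows = conn.execute("""
    WITH ranked AS (
        SELECT file_id, pol_no, eff_date,
               ROW_NUMBER() OVER (
                   PARTITION BY pol_no
                   -- latest date first; file_id breaks ties (an assumption)
                   ORDER BY eff_date DESC, file_id DESC
               ) AS rn
        FROM prop
    )
    SELECT pol_no, file_id, eff_date
    FROM ranked
    WHERE rn = 1
    ORDER BY pol_no
""").fetchall()

for row in rows:
    print(row)
```

Unlike the max comparison, this always returns exactly one row per pol_no, even when two records share the latest date.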
Using straight SQL, you can use a self-join in a subquery in your where clause to eliminate values smaller than the max, or smaller than the top n largest, and so on. Just set the number in <= 1 to the number of top values you want per group.
Something like the following might do the trick, for example:
select
p.FileName
, p.InsName
, p.Status
, p.FileTime
, p.SubmissionNo
, p.PolNo
, p.EffDate
, p.ExpDate
, p.Region
, p.Underwriter
, pd.Data
, pd.Label
from Actuarial.dbo.PROP p
inner join Actuarial.dbo.PROP_DATA pd
on p.FileID = pd.FileID
where (
select count(*)
from Actuarial.dbo.PROP p2
where p2.FileID = p.FileID
and p2.EffDate >= p.EffDate
) <= 1
and (
pd.Label in ('Occupancy' , 'OccupancyTIV')
and p.Status = 'Bound'
)
ORDER BY p.EffDate DESC
Have a look at this stackoverflow question for a full working example.
Not tested
with temp1 as
(
select foo
from bar
where xy = (select MAX(xy) from bar)
)
select PROP.FileName,PROP.InsName, PROP.Status,
PROP.FileTime, PROP.SubmissionNo, PROP.PolNo,
PROP.EffDate,PROP.ExpDate, PROP.Region,
PROP.Underwriter, PROP_DATA.Data , PROP_DATA.Label
from Actuarial.dbo.PROP
inner join temp1 t
on Actuarial.dbo.PROP.FileID = t.dbo.PROP_DATA.FileID
ORDER BY PROP.EffDate DESC

Adding a new computed variable back to main dataset in SQL

I am trying to compute a variable (say last_week) and add it back to my main dataset (say new_j). I managed to join it to new_j. However, if I want to use that variable (last_week) now for further calculations, it does not recognise it. Here's my code:
SELECT [Weekkey] AS weekkey
,[article / colour] as prod_id
,[Current MP Department No/Desc] as prod_dept
,[Total Stock] as total_stock
INTO #new_j
FROM [J_20160831] --(that’s the db in server and I created a temp db #new_j)
SELECT prod_id, max(weekkey) as last_week
into #lastweeksales
FROM #new_j
group by prod_id
select *
from #new_j
left join #lastweeksales
on #lastweeksales.prod_id = #new_j.prod_id
So, I joined both successfully and if I run this code, I see column last_week. Now what I want to do is this:
select *
,case
when last_week = max(weekkey) then total_stock
else 0
end as last_stock_position
from #new_j
But it says last_week is not found in new_j. I also tried #lastweeksales.last_week instead of just last_week in the last bit of code, but that didn't work either. What's the best way out here? Moreover, is there a better way to do it? The output I am looking to have at the end is a table with these variables: WeekKey, prod_dept, prod_id, total_stock, last_week, last_stock_position
Thanks for the help! Much appreciated.
This is the normal behaviour of joins.
By selecting this:
select * from #new_j left join #lastweeksales
on #lastweeksales.prod_id = #new_j.prod_id
all the columns of #new_j and #lastweeksales are displayed in order (first the #new_j columns, then the #lastweeksales columns), so 'last_week' is the last column, and it comes from #lastweeksales. The join only produces a result set; it does not add the column to #new_j.
Secondly,
select *,
case when last_week = max(weekkey) then total_stock
else 0
end as last_stock_position
from #new_j
In the above query, you are selecting the 'last_week' column, which belongs to the table #lastweeksales, but the query only reads from #new_j.
Be careful when selecting columns.
I guess you are expecting:
select a.WeekKey, a.prod_dept, a.prod_id, a.total_stock, b.last_week,
case
when b.last_week = max(a.weekkey) then total_stock
else 0
end as last_stock_position
from #new_j as a
left join #lastweeksales as b
on b.prod_id = a.prod_id
group by a.weekkey,a.prod_dept,a.prod_id,a.total_stock,b.last_week
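As an alternative sketch, the same output can be produced without the second temp table or the join by using a windowed MAX. This swaps in a different technique from the answer above. It is run through Python's sqlite3 module (SQLite 3.25 or later) to stay self-contained, and the sample rows in new_j are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data shaped like #new_j.
conn.executescript("""
    CREATE TABLE new_j (weekkey INTEGER, prod_id TEXT,
                        prod_dept TEXT, total_stock INTEGER);
    INSERT INTO new_j VALUES
        (201601, 'a', 'd1', 5),
        (201602, 'a', 'd1', 7),
        (201601, 'b', 'd2', 3);
""")

rows = conn.execute("""
    SELECT weekkey, prod_dept, prod_id, total_stock,
           -- latest week per product, repeated on every row of that product
           MAX(weekkey) OVER (PARTITION BY prod_id) AS last_week,
           CASE WHEN weekkey = MAX(weekkey) OVER (PARTITION BY prod_id)
                THEN total_stock
                ELSE 0
           END AS last_stock_position
    FROM new_j
    ORDER BY prod_id, weekkey
""").fetchall()

for row in rows:
    print(row)
```

MAX(...) OVER (PARTITION BY ...) is also supported by SQL Server, so the same SELECT shape should carry over to the temp table in the question.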

Group By & Having vs. SubQuery (Where Count is Greater Than 1)

I'm struggling here trying to write a script that finds where an order was returned multiple times by the same associate (count greater than 1). I'm guessing my syntax with the subquery is incorrect. When I run the script, I get a message back that the "SELECT failed.. [3669] More than one value was returned by the subquery."
I'm not tied to the subquery, and have tried using just the group by and having statements, but I get an error regarding a non-aggregate value. What's the best way to proceed here and how do I fix this?
Thank you in advance - code below:
SEL s.saletran
, s.saletran_dt SALE_DATE
, r.saletran_id RET_TRAN
, r.saletran_dt RET_DATE
, ra.user_id RET_ASSOC
FROM salestrans s
JOIN salestrans_refund r
ON r.orig_saletran_id = s.saletran_id
AND r.orig_saletran_dt = s.saletran_dt
AND r.orig_loc_id = s.loc_id
AND r.saletran_dt between s.saletran_dt and s.saletran_dt + 30
JOIN saletran rt
ON rt.saletran_id = r.saletran_id
AND rt.saletran_dt = r.saletran_dt
AND rt.loc_id = r.loc_id
JOIN assoc ra --Return Associate
ON ra.assoc_prty_id = rt.sls_assoc_prty_id
WHERE
(SELECT count(*)
FROM saletran_refund
GROUP BY ORIG_SLTRN_ID
) > 1
AND s.saletran_dt between '2015-01-01' and current_date - 1
Based on what you've got so far, I think you want to use this instead:
where r.ORIG_SLTRN_ID in
(select
ORIG_SLTRN_ID
from
saletran_refund
group by ORIG_SLTRN_ID
having count (*) > 1)
That will give you the ORIG_SLTRN_IDs that have more than one row.
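The IN plus GROUP BY ... HAVING pattern can be checked on a toy table. The sketch below uses Python's sqlite3 module rather than Teradata, purely so it runs anywhere, and the two-column saletran_refund table with its rows is a hypothetical simplification of the real one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified refund table: one row per return transaction.
conn.executescript("""
    CREATE TABLE saletran_refund (saletran_id INTEGER, orig_sltrn_id INTEGER);
    INSERT INTO saletran_refund VALUES
        (10, 1), (11, 1),  -- original sale 1 was returned twice
        (12, 2);           -- original sale 2 was returned once
""")

rows = conn.execute("""
    SELECT r.saletran_id, r.orig_sltrn_id
    FROM saletran_refund AS r
    WHERE r.orig_sltrn_id IN (
        -- original sales that have more than one refund row
        SELECT orig_sltrn_id
        FROM saletran_refund
        GROUP BY orig_sltrn_id
        HAVING COUNT(*) > 1
    )
    ORDER BY r.saletran_id
""").fetchall()

for row in rows:
    print(row)
```

The subquery may return any number of ids, which is fine for IN; the original comparison with > 1 failed because it needs exactly one value on the left-hand side.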
You don't give enough for a full answer, but this is a start:
group by s.saletran
, s.saletran_dt
, r.saletran_id
, r.saletran_dt
, ra.user_id
having count(distinct(ORIG_SLTRN_ID)) > 0
Your subquery does return more than one row (one count per ORIG_SLTRN_ID).
Run it on its own to see:
SELECT count(*)
FROM saletran_refund
GROUP BY ORIG_SLTRN_ID