how to prevent duplicate columns in inner join with multiple select clauses - sql

select * from
(select date, gen_city_id, min(temp) as min_temp, max(temp) as max_temp from current_weather group by date, gen_city_id order by date) cw
inner join
(select gen_city_id, forecast_date, array_agg(temp) from forecast where forecast_date < current_date group by gen_city_id, forecast_date) f
on cw.gen_city_id = f.gen_city_id and cw.date = f.forecast_date;
The above query works, but the gen_city_id and date/forecast_date columns are selected from both tables. How do I prevent these duplicate columns from appearing in my result set?
If I try removing the columns from the select clause of either subquery, the query errors out.

Change the query in this way. You can specify which fields you want to obtain in the resultset:
select cw.*,f.temp from
(select date, gen_city_id, min(temp) as min_temp, max(temp) as max_temp from current_weather group by date, gen_city_id order by date) cw
inner join
(select gen_city_id, forecast_date, array_agg(temp) temp from forecast where forecast_date < current_date group by gen_city_id, forecast_date) f
on cw.gen_city_id = f.gen_city_id and cw.date = f.forecast_date;

You can use the using clause:
select *
from (select date, gen_city_id, min(temp) as min_temp, max(temp) as max_temp
from current_weather
group by date, gen_city_id order by date
) cw join
(select gen_city_id, forecast_date as date, array_agg(temp)
from forecast
where forecast_date < current_date
group by gen_city_id, forecast_date
) f
using (gen_city_id, date) ;
This removes the duplicate columns included in the using clause.
In general, though, I recommend listing out all the columns separately.
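To see the effect of the using clause on the column list, here is a minimal sketch that runs both join styles in SQLite through Python's sqlite3 module. The sample rows are invented, and COUNT(*) stands in for array_agg because SQLite has no array type, so treat it as an illustration rather than the exact PostgreSQL query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE current_weather (date TEXT, gen_city_id INTEGER, temp REAL);
    CREATE TABLE forecast (forecast_date TEXT, gen_city_id INTEGER, temp REAL);
    -- Invented sample rows, one city on one day.
    INSERT INTO current_weather VALUES ('2020-01-01', 1, 3.0), ('2020-01-01', 1, 9.0);
    INSERT INTO forecast VALUES ('2020-01-01', 1, 5.0), ('2020-01-01', 1, 7.0);
""")

CW = """(SELECT date, gen_city_id, MIN(temp) AS min_temp, MAX(temp) AS max_temp
         FROM current_weather GROUP BY date, gen_city_id) cw"""

# Join with ON: each side keeps its own copy of the key columns.
on_sql = f"""
    SELECT * FROM {CW}
    JOIN (SELECT gen_city_id, forecast_date, COUNT(*) AS n_forecasts
          FROM forecast GROUP BY gen_city_id, forecast_date) f
      ON cw.gen_city_id = f.gen_city_id AND cw.date = f.forecast_date
"""

# Join with USING: the key columns are returned only once.
# forecast_date is aliased to date so both sides share the column name.
using_sql = f"""
    SELECT * FROM {CW}
    JOIN (SELECT gen_city_id, forecast_date AS date, COUNT(*) AS n_forecasts
          FROM forecast GROUP BY gen_city_id, forecast_date) f
    USING (gen_city_id, date)
"""

def column_names(sql):
    """Return the result-set column names of a query."""
    return [col[0] for col in conn.execute(sql).description]

on_columns = column_names(on_sql)
using_columns = column_names(using_sql)
using_rows = conn.execute(using_sql).fetchall()
print(len(on_columns), on_columns)
print(len(using_columns), using_columns)
```

The ON version returns seven columns (the two keys appear twice), while the USING version returns five.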

Related

Compose a SQL query that produces monthly revenue by channel and the previous month's revenue

Hey everyone I have two tables with output like this:
Month_Table
Transaction_Table
I need to calculate the monthly revenue by channel along with the previous month's revenue. I wrote this query, but it is incomplete:
Select date_created, channel, sum(revenue) as monthly_revenue
from transaction_table
GROUP BY date_created,channel
The result should show each month's revenue and the previous month's revenue.
How can I do that?
Try this code:
with resultTable as(
select RT.channel,RT.sumRevenue,LT.[month-start_date],LT.month_end_date,LT.year_month
from (select t.channel,sum(revenue) as sumRevenue,M.month_index from Month_Table M,Transaction_Table T
where t.date_created BETWEEN m.[month-start_date] AND m.month_end_date
group by m.month_index,t.channel) RT Join Month_Table LT on RT.month_index = LT.month_index
)
select * from resultTable
output:
Or use this query, which adds the previous month's revenue with LAG, ordered by month within each channel (assuming month_index increases chronologically):
with resultTable as(
select RT.channel,RT.sumRevenue,LT.month_index,LT.[month-start_date],LT.month_end_date,LT.year_month
from (select t.channel,sum(revenue) as sumRevenue,M.month_index from Month_Table M,Transaction_Table T
where t.date_created BETWEEN m.[month-start_date] AND m.month_end_date
group by m.month_index,t.channel) RT Join Month_Table LT on RT.month_index = LT.month_index
)
select *,LAG(sumRevenue,1) OVER (PARTITION BY channel ORDER BY month_index) previous_month_sales from resultTable
output:
You could try using a join between your tables:
Select a.month_index, a.year_month, b.channel, sum(b.revenue) as monthly_revenue
from Month_Table a
inner join transaction_table b ON b.date_created between a.month_start_date and a.month_end_date
and month(b.date_created) between month(curdate()) - 1 and month(curdate())
GROUP BY a.month_index, a.year_month, b.channel
order by a.year_month desc
Try this:
Select t1.date_created, t1.channel, sum(t1.revenue) as monthly_revenue ,sum(t2.revenue) prev_month_revenue
from transaction_table t1 left join transaction_table t2 on t1.channel = t2.channel and to_char(t1.date_created,'MM') = to_char(add_months(t2.date_created,-1),'MM')
GROUP BY t1.date_created,t1.channel;
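As a rough sketch of the "aggregate per month, then look back one row" idea from the answers above, the following runs in SQLite via Python's sqlite3 module. The sample rows are invented, and strftime is used to derive the month instead of joining to Month_Table, which is an assumption made to keep the example self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE transaction_table (date_created TEXT, channel TEXT, revenue INTEGER);
    -- Invented sample rows for two channels over two months.
    INSERT INTO transaction_table VALUES
        ('2021-01-05', 'web', 100), ('2021-01-20', 'web', 50),
        ('2021-02-03', 'web', 70),
        ('2021-01-10', 'store', 30),
        ('2021-02-11', 'store', 40), ('2021-02-25', 'store', 10);
""")

sql = """
    WITH monthly AS (
        -- Step 1: one row per (month, channel) with the summed revenue.
        SELECT strftime('%Y-%m', date_created) AS year_month,
               channel,
               SUM(revenue) AS monthly_revenue
        FROM transaction_table
        GROUP BY strftime('%Y-%m', date_created), channel
    )
    -- Step 2: LAG picks the previous row of the same channel, ordered by month.
    SELECT channel, year_month, monthly_revenue,
           LAG(monthly_revenue) OVER (PARTITION BY channel ORDER BY year_month)
               AS prev_month_revenue
    FROM monthly
    ORDER BY channel, year_month
"""
rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
```

Note that LAG returns the previous row, not necessarily the previous calendar month: if a channel has no transactions in some month, joining to a calendar table such as Month_Table first would be needed to keep the months contiguous.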

How to sum from multiple columns and segregate into separate column if result is positive and negative

I am using postgresql and need to write a query to sum values from separate columns of two different tables and then segregate into separate columns if positive or negative.
For Example,
Below is the source table
Below is the resulting table, which needs to be created and is also used as input while populating it
I have written the query below to aggregate the sums, and I am able to populate the TOT_CREDIT and TOT_DEBIT columns. Is there a more optimized query to achieve this?
select t.account_id,
t.transaction_date,
SUM(t.transaction_amt) filter (where t.transaction_amt >= 0) as tot_debit,
SUM(t.transaction_amt) filter (where t.transaction_amt < 0) as tot_credit,
case
when
(
SUM(t.transaction_amt) +
SUM(COALESCE(b.credit_balance,0)) +
SUM(COALESCE(b.debit_balance,0))
) < 0
then
(
SUM(t.transaction_amt) +
SUM(COALESCE(b.credit_balance,0)) +
SUM(COALESCE(b.debit_balance,0))
)
end as credit_balance,
case
when
(
SUM(t.transaction_amt) +
SUM(COALESCE(b.credit_balance,0)) +
SUM(COALESCE(b.debit_balance,0))
) > 0
then
(
SUM(t.transaction_amt) +
SUM(COALESCE(b.credit_balance,0)) +
SUM(COALESCE(b.debit_balance,0))
)
end as debit_balance
from
transaction t
LEFT OUTER JOIN balance b ON (t.account_id = b.account_id
and t.transaction_date = b.transaction_date
and b.transaction_date=t.transaction_date- INTERVAL '1 DAYS')
group by
t.account_id,
t.transaction_date
Please provide some pointers.
EDIT 1: This query is not working as expected.
One way is to break your logic into small queries and join them at the end:
select tw.account_id, tw.t_date,tw.t_c,th.T_D,fo.C_B,fi.d_B from
(select account_id, Transaction_date as t_date, sum(Transaction_AMT) as t_C from TransactionTABLE
where Transaction_AMT<0 group by account_id, Transaction_date ) as tw inner join
(select account_id, Transaction_date as t_date, sum(Transaction_AMT) as t_d from TransactionTABLE
where Transaction_AMT>0 group by account_id, Transaction_date ) as th on tw.account_id=th.account_id and tw.t_date=th.t_date left join
(select account_id, Transaction_date as t_date, sum(Transaction_AMT) as C_B from TransactionTABLE
group by account_id, Transaction_date having sum(Transaction_AMT)<0 ) as fo on th.account_id=fo.account_id and th.t_date=fo.t_date left join
(select account_id, Transaction_date as t_date, sum(Transaction_AMT) as d_B from TransactionTABLE
group by account_id, Transaction_date having sum(Transaction_AMT)>0 ) as fi on fi.account_id=th.account_id and fi.t_date=th.t_date;
Or else
You could try something like the following, which calculates a running total of transaction_amt per account_id, ordered by Transaction_date, and splits it into two columns by sign:
select account_id,
transaction_date,
SUM(transaction_amt) filter (where transaction_amt >= 0) as tot_debit,
SUM(transaction_amt) filter (where transaction_amt < 0) as tot_credit,
case when sum(sum(transaction_amt)) over w < 0 then sum(sum(transaction_amt)) over w end as credit_balance,
case when sum(sum(transaction_amt)) over w >= 0 then sum(sum(transaction_amt)) over w end as debit_balance
from TransactionTABLE group by account_id, Transaction_date
window w as (partition by account_id order by transaction_date)
order by 1,2;
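Here is a small sketch of the same idea (daily totals, then a running balance split into a credit and a debit column by sign), run in SQLite through Python's sqlite3 module. The table name transactions, the column amt, and the sample rows are all made up for the illustration, and CASE is used in place of FILTER so that it also runs on older SQLite builds.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE transactions (account_id INTEGER, transaction_date TEXT, amt INTEGER);
    -- Invented sample rows for a single account.
    INSERT INTO transactions VALUES
        (1, '2021-01-01', 100), (1, '2021-01-01', -30),
        (1, '2021-01-02', -120),
        (1, '2021-01-03', 80);
""")

sql = """
    WITH daily AS (
        -- Per account and day: positive total, negative total, and net amount.
        SELECT account_id, transaction_date,
               SUM(CASE WHEN amt >= 0 THEN amt END) AS tot_debit,
               SUM(CASE WHEN amt < 0 THEN amt END)  AS tot_credit,
               SUM(amt) AS net
        FROM transactions
        GROUP BY account_id, transaction_date
    ),
    running AS (
        -- Running balance: cumulative net amount per account, by date.
        SELECT daily.*,
               SUM(net) OVER (PARTITION BY account_id ORDER BY transaction_date) AS balance
        FROM daily
    )
    -- Put the balance in one of two columns depending on its sign.
    SELECT account_id, transaction_date, tot_debit, tot_credit,
           CASE WHEN balance < 0 THEN balance END AS credit_balance,
           CASE WHEN balance > 0 THEN balance END AS debit_balance
    FROM running
    ORDER BY account_id, transaction_date
"""
rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
```

The running balances here are 70, -50 and 30, so the second day lands in credit_balance and the other two in debit_balance.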

Summing Two Columns from Two Tables with Two Dates

I work for a CPG company and need to create a report that compares the previous month's delivered units to the next month's forecast. (Simply, our forecasting tool screws up occasionally and this will help identify when the forecast is off.)
My issue is my SQL query is summing forecast sales correctly, but the sum of total delivered is not respecting the dates I have in my WHERE clause -- it's summing total delivered for as far back as the query can reach.
Here is my query:
SELECT
DelUnits.Customer, DelUnits.ObsText01,
FinalFcst.SKU, FinalFcst.Customer,
SUM(DelUnits.Value) AS TotalDelivered,
SUM(FinalFcst.FinalFcst) AS ForecastSales
FROM
DelUnits
LEFT JOIN
FinalFcst ON DelUnits.Customer = FinalFcst.Customer
WHERE
(FinalFcst.DT >= '2018-01-01' and FinalFcst.DT <= '2018-01-31')
AND (DelUnits.Date >= '2017-12-01' and DelUnits.Date <= '2017-12-31')
AND DelUnits.ObsText01 = '10_LB'
AND FinalFcst.SKU = '10_LB'
GROUP BY
DelUnits.Customer, DelUnits.ObsText01, FinalFcst.SKU, FinalFcst.Customer
Again, the query seems to work correctly for the final forecast (summing the forecast between 1/1/18 - 1/31/18) but sums the entire delivery history for a customer. I don't understand why it won't sum the delivery history for just 12/1/17 - 12/31/17.
Thank you for your help!
Presumably, there is only one row for FinalFcst. So, either include it in the GROUP BY clause or use MAX() instead of SUM():
max(FinalFcst.FinalFcst) as ForecastSales
One way to achieve this is to calculate TotalDelivered and ForecastSales in 2 different queries and then join them together.
Try this:
SELECT DelUnits.customer,
DelUnits.obstext01,
FinalFcst.sku,
FinalFcst.customer,
totaldelivered,
forecastsales
FROM (SELECT customer,
obstext01,
Sum(value) AS TotalDelivered
FROM delunits
WHERE date >= '2017-12-01'
AND date <= '2017-12-31'
AND obstext01 = '10_LB'
GROUP BY customer,
obstext01) DelUnits
LEFT JOIN (SELECT customer,
sku,
Sum(finalfcst) AS ForecastSales
FROM finalfcst
WHERE dt >= '2018-01-01'
AND dt <= '2018-01-31'
AND sku = '10_LB'
GROUP BY customer,
sku) FinalFcst ON DelUnits.customer = FinalFcst.customer
You have a many to many relationship between the tables. Ultimately you need to SUM() one table before joining to the other to create a one to many relationship, or you end up duplicating records.
My favorite approach is a derived table:
SELECT C.Customer,
C.ObsText01,
FC.SKU,
C.TotalDelivered,
SUM(FC.FinalFcst) ForecastSales
FROM (SELECT SUM(Value) TotalDelivered, Customer, ObsText01
FROM DelUnits
WHERE Date >= '2017-12-01' AND Date <= '2017-12-31'
AND ObsText01 = '10_LB'
GROUP BY Customer, ObsText01) C
LEFT JOIN FinalFcst FC ON C.Customer = FC.Customer
AND FC.DT >= '2018-01-01'
AND FC.DT <= '2018-01-31'
AND FC.SKU = '10_LB'
GROUP BY C.Customer, C.ObsText01, FC.SKU, C.TotalDelivered
A couple things: Added your forecast table filters to the join predicate, since having those in the WHERE will create an INNER JOIN out of your LEFT JOIN. Also removed FC.Customer from the select and the group since it is redundant with C.Customer.
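To make the row duplication concrete, here is a minimal sketch in SQLite (via Python's sqlite3 module) that compares joining the raw tables with aggregating each table first. The table layouts are simplified and the rows are invented, so the numbers only illustrate the effect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE delunits (customer TEXT, date TEXT, value INTEGER);
    CREATE TABLE finalfcst (customer TEXT, dt TEXT, finalfcst INTEGER);
    -- Invented rows: two December deliveries, three January forecast rows.
    INSERT INTO delunits VALUES
        ('A', '2017-11-15', 100), ('A', '2017-12-05', 10), ('A', '2017-12-20', 20);
    INSERT INTO finalfcst VALUES
        ('A', '2018-01-03', 5), ('A', '2018-01-10', 5), ('A', '2018-01-17', 5);
""")

# Joining the raw rows pairs every delivery with every forecast row (2 x 3),
# so both sums are inflated.
naive_sql = """
    SELECT d.customer, SUM(d.value), SUM(f.finalfcst)
    FROM delunits d
    LEFT JOIN finalfcst f ON f.customer = d.customer
    WHERE d.date BETWEEN '2017-12-01' AND '2017-12-31'
      AND f.dt BETWEEN '2018-01-01' AND '2018-01-31'
    GROUP BY d.customer
"""

# Aggregating each side to one row per customer first keeps the join one-to-one.
fixed_sql = """
    SELECT d.customer, d.total_delivered, f.forecast_sales
    FROM (SELECT customer, SUM(value) AS total_delivered
          FROM delunits
          WHERE date BETWEEN '2017-12-01' AND '2017-12-31'
          GROUP BY customer) d
    LEFT JOIN (SELECT customer, SUM(finalfcst) AS forecast_sales
               FROM finalfcst
               WHERE dt BETWEEN '2018-01-01' AND '2018-01-31'
               GROUP BY customer) f ON f.customer = d.customer
"""

naive_rows = conn.execute(naive_sql).fetchall()
fixed_rows = conn.execute(fixed_sql).fetchall()
print(naive_rows)
print(fixed_rows)
```

With this data the naive join reports 90 delivered and 30 forecast, while the pre-aggregated version reports the intended 30 and 15.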
Maybe you could use a common table expression (CTE) to calculate the delivery history first. I am not sure of the exact SQL Server syntax, but something like this:
WITH DEL_HIST AS
(SELECT DelUnits.Customer,
DelUnits.ObsText01,
sum(DelUnits.Value) as TotalDelivered
FROM DelUnits
Where (DelUnits.Date >= '2017-12-01' and DelUnits.Date <= '2017-12-31')
and DelUnits.ObsText01 = '10_LB'
Group By DelUnits.Customer, DelUnits.ObsText01)
SELECT
DEL_HIST.Customer,
DEL_HIST.ObsText01,
FinalFcst.SKU,
FinalFcst.Customer,
DEL_HIST.TotalDelivered,
sum(FinalFcst.FinalFcst) as ForecastSales
FROM DEL_HIST
left join FinalFcst ON DEL_HIST.Customer = FinalFcst.Customer
and (FinalFcst.DT >= '2018-01-01' and FinalFcst.DT <= '2018-01-31')
and FinalFcst.SKU = '10_LB'
Group By DEL_HIST.Customer, DEL_HIST.ObsText01, FinalFcst.SKU, FinalFcst.Customer, DEL_HIST.TotalDelivered

SQL - values from two rows into new two rows

I have a query that returns the summed quantity of items on working days. On weekends and holidays the quantity and item values are empty.
On those empty days I would like to show the last known quantity and item.
My query is:
select a.dt,b.zaliha as quantity,b.artikal as item
from
(select to_date('01-01-2017', 'DD-MM-YYYY') + rownum -1 dt
from dual
connect by level <= to_date(sysdate) - to_date('01-01-2017', 'DD-MM-YYYY') + 1
order by 1)a
LEFT OUTER JOIN
(select kolicina,sum(kolicina)over(partition by artikal order by datum_do) as zaliha,datum_do,artikal
from
(select sum(vv.kolicinaulaz-vv.kolicinaizlaz)kolicina,vz.datum as datum_do,vv.artikal
from vlpzaglavlja vz, vlpvarijante vv
where vz.id=vv.vlpzaglavlje
and vz.orgjed='01006'
and vv.skladiste='01006'
and vv.artikal in (3069,6402)
group by vz.datum,vv.artikal
order by vv.artikal,vz.datum asc)
order by artikal,datum_do asc)b
on a.dt=b.datum_do
where a.dt between to_date('12102017','ddmmyyyy') and to_date('16102017','ddmmyyyy')
order by a.dt
and my output is like this:
and I want this:
In short, if quantity is null use lag(... ignore nulls) and coalesce or nvl:
select dt, item,
nvl(quantity, lag(quantity ignore nulls) over (partition by item order by dt))
from t
order by dt, item
Here is the full query, I cannot test it, but it is something like:
with t as (
select a.dt, b.zaliha as quantity, b.artikal as item
from (
select date '2017-10-10' + rownum - 1 dt
from dual
connect by date '2017-10-10' + rownum - 1 <= date '2017-10-16' ) a
left join (
select kolicina, datum_do, artikal,
sum(kolicina) over(partition by artikal order by datum_do) as zaliha
from (
select sum(vv.kolicinaulaz-vv.kolicinaizlaz) kolicina,
vz.datum as datum_do, vv.artikal
from vlpzaglavlja vz
join vlpvarijante vv on vz.id = vv.vlpzaglavlje
where vz.orgjed = '01006' and vv.skladiste='01006'
and vv.artikal in (3069,6402)
group by vz.datum, vv.artikal)) b
on a.dt = b.datum_do)
select *
from (
select dt, item,
nvl(quantity, lag(quantity ignore nulls)
over (partition by item order by dt)) qty
from t)
where dt >= date '2017-10-12'
order by dt, item
There are several issues in your query, major and minor:
in the date generator (subquery a) you select dates from a long period, January to September, then join them with the main tables, sum the data, and only then select a small part. Why not filter the dates first?
to_date(sysdate) is unnecessary, because sysdate is already a date,
use ANSI joins,
do not use order by in subqueries; it has no effect, only the final ordering matters,
use date literals when defining dates; they are more readable.
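For engines without lag(... ignore nulls), the same "carry the last known value forward" result can be obtained with a counting trick. The sketch below uses SQLite through Python's sqlite3 module, with an invented table t holding a single item; it is an alternative technique, not the Oracle query above. With several items you would add partition by item to both window definitions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (dt TEXT, quantity INTEGER);
    -- Invented rows: the 14th and 15th have no quantity (weekend).
    INSERT INTO t VALUES
        ('2017-10-12', 5), ('2017-10-13', 7),
        ('2017-10-14', NULL), ('2017-10-15', NULL),
        ('2017-10-16', 9);
""")

sql = """
    WITH grouped AS (
        -- COUNT(quantity) skips NULLs, so the running count only increases on
        -- rows that have a value; each NULL row shares the group number of the
        -- last non-NULL row before it.
        SELECT dt, quantity,
               COUNT(quantity) OVER (ORDER BY dt) AS grp
        FROM t
    )
    -- Each group contains exactly one non-NULL quantity; MAX picks it.
    SELECT dt, MAX(quantity) OVER (PARTITION BY grp) AS qty
    FROM grouped
    ORDER BY dt
"""
rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
```

Both empty days end up with the quantity 7 from the 13th.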

writing a sql query in MySQL with subquery on the same table

I have a table svn1:
id | date       | startdate
23 | 2002-12-04 | 2000-11-11
23 | 2004-08-19 | 2005-09-10
23 | 2002-09-09 | 2004-08-23
select id,startdate from svn1 where startdate>=(select max(date) from svn1 where id=svn1.id);
Now the problem is: how do I tell the subquery to match its id with the id in the outer query? Obviously id=svn1.id won't work. Thanks!
If you have the time to read more:
This is really a simplified version of what I am trying to do. My actual query is something like this:
select
id, count(distinct archdetails.compname)
from
svn1,svn3,archdetails
where
svn1.name='ant'
and svn3.name='ant'
and archdetails.name='ant'
and type='Bug'
and svn1.revno=svn3.revno
and svn3.compname=archdetails.compname
and
(
(startdate>=sdate and startdate<=edate)
or
(
sdate<=(select max(date) from svn1 where type='Bug' and id=svn1.id)
and
edate>=(select max(date) from svn1 where type='Bug' and id=svn1.id)
)
or
(
sdate>=startdate
and
edate<=(select max(date) from svn1 where type='Bug' and id=svn1.id)
)
)
group by id LIMIT 0,40;
As you can see, select max(date) from svn1 where type='Bug' and id=svn1.id has to be calculated many times.
Can I calculate it just once, store it using AS, and then reuse that value later? The main problem is to correct id=svn1.id so that it refers to the id in the outer table.
I'm not sure you can eliminate the repetition of the subquery, but the subquery can reference the main query if you use a table alias, as in the following:
select id,
count(distinct archdetails.compname)
from svn1 s1,
svn3 s3,
archdetails a
where s1.name='ant' and
s3.name='ant' and
a.name='ant' and
type='Bug' and
s1.revno=s3.revno and
s3.compname = a.compname and
( (startdate >= sdate and startdate<=edate) or
(sdate <= (select max(date)
from svn1
where type='Bug' and
id=s1.id) and
edate>=(select max(date)
from svn1
where type='Bug' and
id=s1.id)) or
(sdate >= startdate and edate<=(select max(date)
from svn1
where type='Bug' and
id=s1.id)) )
group by id LIMIT 0,40;
Share and enjoy.
You should be able to left join to a sub-select so you only run the query once. Then you can do a join condition to pull out the maximum for the ID on each record as shown below:
SELECT id,
COUNT(DISTINCT archdetails.compname)
FROM svn3,
archdetails,
svn1
LEFT JOIN (
SELECT id, MAX(date) AS MaximumDate
FROM svn1
WHERE TYPE = 'Bug'
GROUP BY id
) AS MaxDate ON MaxDate.id = svn1.id
WHERE svn1.name = 'ant'
AND svn3.name = 'ant'
AND archdetails.name = 'ant'
AND TYPE = 'Bug'
AND svn1.revno = svn3.revno
AND svn3.compname = archdetails.compname
AND (
(startdate >= sdate AND startdate <= edate)
OR (
sdate <= MaxDate.MaximumDate
AND edate >= MaxDate.MaximumDate
)
OR (
sdate >= startdate
AND edate <= MaxDate.MaximumDate
)
)
GROUP BY id
LIMIT 0, 40;
Try using aliases; something like this should work:
select s.id,s.startdate from svn1 s where s.startdate>=(select max(s2.date) from svn1 s2 where s.id=s2.id);
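Here is a quick way to check the aliased version against the sample rows from the question. It runs in SQLite through Python's sqlite3 module rather than MySQL, and the extra row for id 7 is invented to show that each id is compared against its own maximum date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE svn1 (id INTEGER, date TEXT, startdate TEXT);
    INSERT INTO svn1 VALUES
        (23, '2002-12-04', '2000-11-11'),
        (23, '2004-08-19', '2005-09-10'),
        (23, '2002-09-09', '2004-08-23'),
        (7,  '2010-01-01', '2009-05-05');  -- invented row for a second id
""")

# The outer table is aliased as s and the inner one as s2, so s.id inside the
# subquery unambiguously refers to the row of the outer query.
sql = """
    SELECT s.id, s.startdate
    FROM svn1 s
    WHERE s.startdate >= (SELECT MAX(s2.date) FROM svn1 s2 WHERE s2.id = s.id)
    ORDER BY s.startdate
"""
rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
```

For id 23 the maximum date is 2004-08-19, so only the two rows with a later startdate are returned; the row for id 7 is filtered out by its own maximum.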