Showing particular row data as column - sql

My table is as follows:
Date        Code       Price  MA5    MA20
2022-01-01  APPLE      1000   1080   1090
2022-01-02  APPLE      1100   1084   1100
2022-01-03  APPLE      1200   1090   1100
2022-01-01  MICROSOFT  7      9      10
2022-01-02  MICROSOFT  7.5    8      9.5
2022-01-03  MICROSOFT  8      8.5    9
...         ...        ...    ...    ...
2022-01-01  NASDAQ     14400  15600  16700
2022-01-02  NASDAQ     14500  15200  16100
2022-01-03  NASDAQ     14600  15000  16000
I'm currently storing NASDAQ index values and stock data in the same table in MariaDB.
However, I want to show NASDAQ's MA values as two new columns, NASDAQ_MA5 and NASDAQ_MA20, on the rows for all the other codes.
My question is: how do I select NASDAQ's MA5 and MA20 values and attach them to the rows with matching dates? My desired output is as follows:
Date        Code       Price  MA5   MA20  NASDAQ_MA5  NASDAQ_MA20
2022-01-01  APPLE      1000   1080  1090  15600       16700
2022-01-02  APPLE      1100   1084  1100  15200       16100
2022-01-03  APPLE      1200   1090  1100  15000       16000
2022-01-01  MICROSOFT  7      9     10    15600       16700
2022-01-02  MICROSOFT  7.5    8     9.5   15200       16100
2022-01-03  MICROSOFT  8      8.5   9     15000       16000
I've been trying the following:
SELECT *,
(PARTITION BY DATE, case when (code='NASDAQ') then MA5 else NULL end) as 'NASDAQ_MA5',
(PARTITION BY DATE, case when (code='NASDAQ') then MA20 else NULL end) as 'NASDAQ_MA20'
FROM TABLE
Your help will be very appreciated.

You need to join the two distinct sets of data. It probably makes sense to define a CTE to keep the definitions clear, then join with your main table - either an inner join if there's always a corresponding date or left join if there might not be.
This assumes there's only a single nasdaq code for each date, if that's not the case you can aggregate in the CTE as required.
with nasdaq as (
select date, MA5 NASDAQ_MA5, MA20 NASDAQ_MA20
from t
where code = 'NASDAQ'
)
select t.*, n.NASDAQ_MA5, n.NASDAQ_MA20
from t
left join nasdaq n on n.date=t.date
where t.code != 'NASDAQ';

I'm sure there's a more efficient way of doing this, but here's a way to solve your issue using two correlated subqueries:
SELECT t1.*, (SELECT t2.MA5
FROM tableA t2
WHERE t1.date = t2.date
AND code = 'NASDAQ') as NASDAQ_MA5,
(SELECT t2.MA20
FROM tableA t2
WHERE t1.date = t2.date
AND code = 'NASDAQ') as NASDAQ_MA20
FROM tableA t1
WHERE code != 'NASDAQ'

You can do a self join on the date column, separating out the NASDAQ rows with code = 'NASDAQ'. Here is one way of doing this:
select a.*, b.ma5 as NASDAQ_MA5, b.ma20 as NASDAQ_MA20 from table1 a
left outer join (select date, ma5,ma20 from table1 where code = 'NASDAQ') b
on a.date = b.date where a.code <> 'NASDAQ'
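
As a rough, runnable illustration of the self-join idea from the answers above, here is a minimal sketch using Python's built-in sqlite3 module in place of MariaDB. The table name quotes and the lowercase column names are assumptions made for the example; the sample rows are taken from the question.

```python
import sqlite3

# In-memory SQLite database standing in for the MariaDB table.
# The table name "quotes" is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE quotes (date TEXT, code TEXT, price REAL, ma5 REAL, ma20 REAL)"
)
rows = [
    ("2022-01-01", "APPLE", 1000, 1080, 1090),
    ("2022-01-02", "APPLE", 1100, 1084, 1100),
    ("2022-01-03", "APPLE", 1200, 1090, 1100),
    ("2022-01-01", "MICROSOFT", 7, 9, 10),
    ("2022-01-02", "MICROSOFT", 7.5, 8, 9.5),
    ("2022-01-03", "MICROSOFT", 8, 8.5, 9),
    ("2022-01-01", "NASDAQ", 14400, 15600, 16700),
    ("2022-01-02", "NASDAQ", 14500, 15200, 16100),
    ("2022-01-03", "NASDAQ", 14600, 15000, 16000),
]
conn.executemany("INSERT INTO quotes VALUES (?, ?, ?, ?, ?)", rows)

# Join every non-NASDAQ row to the NASDAQ row with the same date.
# LEFT JOIN keeps stock rows even when no NASDAQ row exists for that date.
query = """
SELECT s.date, s.code, s.price, s.ma5, s.ma20,
       n.ma5  AS nasdaq_ma5,
       n.ma20 AS nasdaq_ma20
FROM quotes AS s
LEFT JOIN quotes AS n
       ON n.date = s.date AND n.code = 'NASDAQ'
WHERE s.code <> 'NASDAQ'
ORDER BY s.code, s.date
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

This assumes at most one NASDAQ row per date; with duplicates, the join would repeat each stock row once per matching NASDAQ row.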

Related

Is there a way in SQL to select rows until a column reaches a specific value?

I have a dataset of customer profiles where I am trying to capture how much revenue each customer generated until they cancelled their subscription. The issue I am having is that after a customer cancels, the customer profile still exists in the database and registers as being charged 0. I am trying to create a visualization that shows each customer's lifespan in a table up until the month that they cancel.
Here is the data I have:
customer name  customer id  cancelled  charge date  charged amount
gary           012          no         1/1/2022     199
gary           012          no         2/1/2022     199
gary           012          no         3/1/2022     199
gary           012          yes        4/1/2022     199
gary           012          no         5/1/2022     199
gary           012          no         6/1/2022     199
My desired output would select the first 4 lines above and get rid of the last two.
I can pull up the data, but not sure where to go from there. So far I have:
select
t.customer_name,
t.customer_id,
t.cancel_flag,
t.revenue_date,
a.revenue,
a.customer_id
from metrics t
inner join drp.mrr a
on t.customer_id= a.customer_id
Any ideas are much appreciated!!
You can use a window function to distinguish the rows before the cancellation and after it. For example:
select *
from (
  select
    t.customer_name,
    t.customer_id,
    t.cancel_flag,
    t.revenue_date,
    a.revenue,
    -- 'yes' sorts after 'no', so mc becomes 'yes' once any earlier row was a cancellation
    max(t.cancel_flag) over(
      partition by t.customer_id
      order by t.revenue_date
      rows between unbounded preceding and 1 preceding
    ) as mc
  from metrics t
  inner join drp.mrr a on t.customer_id = a.customer_id
) x
where mc = 'no' or mc is null
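
To see the window-function filter in action, here is a small sketch that runs the same idea against the sample rows from the question, using Python's sqlite3 module (window functions need SQLite 3.25 or later). The single table charges and its column names are assumptions for the example, and the dates are rewritten in ISO format so they sort correctly as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical single table holding the sample data from the question.
conn.execute(
    """CREATE TABLE charges (
           customer_name TEXT, customer_id TEXT, cancelled TEXT,
           charge_date TEXT, charged_amount INTEGER)"""
)
rows = [
    ("gary", "012", "no", "2022-01-01", 199),
    ("gary", "012", "no", "2022-02-01", 199),
    ("gary", "012", "no", "2022-03-01", 199),
    ("gary", "012", "yes", "2022-04-01", 199),
    ("gary", "012", "no", "2022-05-01", 199),
    ("gary", "012", "no", "2022-06-01", 199),
]
conn.executemany("INSERT INTO charges VALUES (?, ?, ?, ?, ?)", rows)

# cancelled_before looks only at rows strictly before the current one.
# It is NULL on the first row and 'yes' once a cancellation has been seen,
# because 'yes' compares greater than 'no' as text.
query = """
SELECT customer_name, customer_id, cancelled, charge_date, charged_amount
FROM (
    SELECT c.*,
           MAX(cancelled) OVER (
               PARTITION BY customer_id
               ORDER BY charge_date
               ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
           ) AS cancelled_before
    FROM charges AS c
) AS x
WHERE cancelled_before IS NULL OR cancelled_before = 'no'
ORDER BY customer_id, charge_date
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Because the frame stops one row before the current row, the cancellation month itself is kept and only the later rows are dropped.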
We can add a running total to see how much each customer paid so far.
select *
,sum(charged_amount) over(partition by customer_id order by charge_date) as running_total
from
(
select customer_name
,customer_id
,cancelled
,charge_date
,case when count(case when cancelled = 'yes' then 1 end) over(partition by customer_id order by charge_date) = 0 then charged_amount end as charged_amount
from t
) t
customer_name  customer_id  cancelled  charge_date  charged_amount  running_total
gary           12           no         2022-01-01   199             199
gary           12           no         2022-02-01   199             398
gary           12           no         2022-03-01   199             597
gary           12           yes        2022-04-01   null            597
gary           12           no         2022-05-01   null            597
gary           12           no         2022-06-01   null            597

T-SQL get values for specific group

I have a table EmployeeContract similar to this:
ContractId  EmployeeId  ValidFrom   ValidTo     Salary
12          5           2018-02-01  2019-06-31  x
25          8           2015-01-01  2099-12-31  x
50          5           2019-07-01  2021-05-31  x
52          6           2011-08-01  2021-12-31  x
72          8           2010-08-01  2014-12-31  x
52          6           2011-08-01  2021-12-31  x
The table holds the contract history for each employee in the company. I need to get the date when each employee started work and the end date of their last contract. Sometimes records are duplicated.
For example, based on data from above:
EmployeeId  ValidFrom   ValidTo
5           2018-02-01  2021-05-31
8           2010-08-01  2099-12-31
6           2011-08-01  2021-12-31
Based on this article: https://www.techcoil.com/blog/sql-statement-for-selecting-the-latest-record-in-each-group/
I prepared query like this:
select minv.*, maxv.maxvalidto from
(select distinct con.[EmployeeId], mvt.maxvalidto
from [EmployeeContract] con
join (select [EmployeeId], max(validto) as maxvalidto
FROM [EmployeeContract]
group by [EmployeeId]) mvt
on con.[EmployeeId] = mvt.[EmployeeId] and mvt.maxvalidto = con.validto) maxv
join
(select distinct con.[EmployeeId], mvf.minvalidfrom
from [EmployeeContract] con
join (select [EmployeeId], min(validfrom) as minvalidfrom
FROM [EmployeeContract]
group by [EmployeeId]) mvf
on con.[EmployeeId] = mvf.[EmployeeId] and mvf.minvalidfrom = con.validfrom) minv
on minv.[EmployeeId] = maxv.[EmployeeId]
order by 1
But I'm not satisfied: I think it's not easy to read, and it is probably poorly optimized. How can I do it better?
I think you want group by:
select employeeid, min(validfrom), max(validto)
from employeecontract
group by employeeid
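
A quick way to check the GROUP BY approach is to run it against the sample rows. The sketch below uses Python's sqlite3 module rather than SQL Server, stores the dates as ISO text (which sorts chronologically), and replaces the sample value 2019-06-31 with 2019-06-30 since June has 30 days; those choices are assumptions for the example only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE EmployeeContract (
           ContractId INTEGER, EmployeeId INTEGER,
           ValidFrom TEXT, ValidTo TEXT)"""
)
rows = [
    (12, 5, "2018-02-01", "2019-06-30"),  # adjusted from 2019-06-31 in the sample
    (25, 8, "2015-01-01", "2099-12-31"),
    (50, 5, "2019-07-01", "2021-05-31"),
    (52, 6, "2011-08-01", "2021-12-31"),
    (72, 8, "2010-08-01", "2014-12-31"),
    (52, 6, "2011-08-01", "2021-12-31"),  # duplicate record, as in the question
]
conn.executemany("INSERT INTO EmployeeContract VALUES (?, ?, ?, ?)", rows)

# One row per employee: earliest start date and latest end date.
# Duplicate contract rows do not affect MIN or MAX.
query = """
SELECT EmployeeId,
       MIN(ValidFrom) AS ValidFrom,
       MAX(ValidTo)   AS ValidTo
FROM EmployeeContract
GROUP BY EmployeeId
ORDER BY EmployeeId
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```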

Rolling Sum (4 Months)

I have been struggling to build a query in Access that calculates a "rolling 4 months" of sales data. I have been experimenting with DSUM, but I only seem to be able to get the subtotal or running total for a specific group (not a moving total). I have tried to illustrate what I am trying to do below.
Date Product Value Rolling_4_Month_Sum
January A 100 100
February A 200 300
March A 300 600
April A 300 900
May A 200 1000
June A 400 1200
July A 500 1400
August A 700 1800
Is it possible to make a running total for 4 rows/months only?
SELECT
a.Date,
a.Product,
a.Value,
SUM(b.Value) AS Rolling_4_Month_Sum
FROM
Table a
INNER JOIN Table b ON a.Product=b.Product
AND b.Date <= a.Date
AND b.Date >= DateAdd("m", -3, a.Date)
GROUP BY
a.Date, a.Product, a.Value
This should work in my opinion.
Table a is your "single month" row data.
Table b is a self join that retrieves the current month and the three preceding months. It is done by adding b.Date >= DateAdd("m", -3, a.Date) as a self-join criterion.
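
The self-join window can be checked against the numbers in the question. The following sketch uses Python's sqlite3 module instead of Access, assumes the window is the current month plus the three months before it, and assumes each month is stored as a first-of-month date; the table name sales, the column names, and the year 2023 are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; each month is stored as the first day of that month.
conn.execute("CREATE TABLE sales (month_start TEXT, product TEXT, value INTEGER)")
values = [100, 200, 300, 300, 200, 400, 500, 700]  # January..August from the question
rows = [(f"2023-{m:02d}-01", "A", v) for m, v in enumerate(values, start=1)]
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)

# For each row a, sum the rows b of the same product whose month lies
# between three months before a's month and a's month, inclusive.
query = """
SELECT a.month_start, a.product, a.value,
       SUM(b.value) AS rolling_4_month_sum
FROM sales AS a
JOIN sales AS b
  ON b.product = a.product
 AND b.month_start <= a.month_start
 AND b.month_start >= date(a.month_start, '-3 months')
GROUP BY a.month_start, a.product, a.value
ORDER BY a.month_start
"""
result = conn.execute(query).fetchall()
rolling = [row[3] for row in result]
print(rolling)
```

The join is by date range rather than by row count, so a missing month simply contributes nothing to the sum instead of pulling in an older month.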
Here is a nice example of how these kinds of things work.
Data:
OrderDetailID OrderID ProductID Price
1 1234 1 $5.00
2 1234 2 ($2.00)
3 1234 3 $4.00
4 1235 1 $5.00
5 1235 3 $4.00
6 1235 5 $12.00
7 1235 2 ($2.00)
SQL:
SELECT OD.OrderDetailID, OD.OrderID, OD.ProductID, OD.Price, (SELECT Sum(Price) FROM tblOrderDetails
WHERE OrderDetailID <= OD.OrderDetailID) AS RunningSum
FROM tblOrderDetails AS OD;

cross reference nearest date data

I have three tables: ElecUser, ElecUsage, and ElecEmissionFactor.
ElecUser:
UserID UserName
1 Main Building
2 Staff Quarter
ElecUsage:
UserID Time Amount
1 1/7/2010 23230
1 8/10/2011 34340
1 8/1/2011 34300
1 2/3/2012 43430
1 4/2/2013 43560
1 3/2/2014 44540
2 3/6/2014 44000
ElecEmissionFactor:
Time CO2Emission
1/1/2010 0.5
1/1/2011 0.55
1/1/2012 0.56
1/1/2013 0.57
And intended outcome:
UserName Time CO2
1 2010 11615
1 2011 37752 (34340*0.55 + 34300*0.55)
1 2012 24320.8
1 2013 24829.2
1 2014 25387.8
2 2014 25080
The logic is ElecUsage.Amount * ElecEmissionFactor.
If same user and same year, add them up for the record of that year.
My query is:
SELECT ElecUser.UserName, Year([ElecUsage].[Time]), SUM((ElecEmissionFactor.CO2Emission*ElecUsage.Amount)) As CO2
FROM ElecEmissionFactor, ElecUser INNER JOIN ElecUsage ON ElecUser.UserID = ElecUsage.UserID
WHERE (((Year([ElecUsage].[Time]))>=Year([ElecEmissionFactor].[Time])))
GROUP BY ElecUser.UserName, Year([ElecUsage].[Time])
HAVING Year([ElecUsage].[Time]) = Max(Year(ElecEmissionFactor.Time));
However, this only shows the years that have an emission factor.
The challenge is to map a year without an emission factor to the latest earlier year that has one.
A subquery may be one solution, but I have not managed to write one.
I have been stuck for a while. Hope to see your reply.
Thanks
Try something like this:
-- not tested
select T1.UserID, year(T1.time) as Time, sum(T1.amount*T2.co2emission) as CO2
from ElecUsage T1
left outer join ElecEmissionFactor T2 on (year(T1.time) = year(T2.time))
Group by year(T1.time), T1.UserID
Use a subquery to get the corresponding factor, like this:
select T1.UserID,
year(T1.time) as Time,
sum(T1.amount*
(
select top 1 CO2Emission from ElecEmissionFactor T2
where year(T2.time) <= year(T1.time) order by T2.time desc
)
) as CO2
from ElecUsage T1
Group by year(T1.time), T1.UserID
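
Here is a runnable sketch of the "latest factor on or before the usage year" lookup, written for SQLite via Python's sqlite3 module, so it uses LIMIT 1 and strftime instead of TOP 1 and Year(). As a variation on the answer above, the factor is looked up per usage row in a derived table and aggregated afterwards. The snake_case table and column names are assumptions, and the sample dates are read as day/month/year, which only affects the day and month, not the year.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE elec_usage (user_id INTEGER, usage_date TEXT, amount REAL);
    CREATE TABLE elec_emission_factor (factor_date TEXT, co2_emission REAL);
    """
)
usage = [
    (1, "2010-07-01", 23230), (1, "2011-10-08", 34340), (1, "2011-01-08", 34300),
    (1, "2012-03-02", 43430), (1, "2013-02-04", 43560), (1, "2014-02-03", 44540),
    (2, "2014-06-03", 44000),
]
factors = [
    ("2010-01-01", 0.5), ("2011-01-01", 0.55),
    ("2012-01-01", 0.56), ("2013-01-01", 0.57),
]
conn.executemany("INSERT INTO elec_usage VALUES (?, ?, ?)", usage)
conn.executemany("INSERT INTO elec_emission_factor VALUES (?, ?)", factors)

# Inner query: for each usage row, pick the most recent factor whose year
# is not later than the usage year. Outer query: sum per user and year.
query = """
SELECT user_id, usage_year, ROUND(SUM(amount * factor), 2) AS co2
FROM (
    SELECT u.user_id,
           CAST(strftime('%Y', u.usage_date) AS INTEGER) AS usage_year,
           u.amount,
           (SELECT f.co2_emission
            FROM elec_emission_factor AS f
            WHERE strftime('%Y', f.factor_date) <= strftime('%Y', u.usage_date)
            ORDER BY f.factor_date DESC
            LIMIT 1) AS factor
    FROM elec_usage AS u
) AS per_row
GROUP BY user_id, usage_year
ORDER BY user_id, usage_year
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

The 2014 rows have no factor of their own, so they fall back to the 2013 factor of 0.57.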

SQL query to find the avg salary based on the nearest client dob's

I have a requirement for the table below.
Conditions:
I have to take the average salary over clients whose dates of birth fall on three consecutive days.
If a client does not have other clients with dates of birth on the two preceding days, that client does not need to be considered.
Example:
In the table below,
client 17 has previous client IDs with consecutive DOBs (1-day gaps) -> in this case I'll take the salary average for 17 from the salaries of 15, 16 and 17.
client 18 has previous client IDs with consecutive DOBs -> in this case I'll take the salary average for 18 from the salaries of 16, 17 and 18.
Table:
JobType ClientID ClinetDOB's Slaries
.net 1 2012-03-14 300
.net 2 2012-04-11 400
.net 3 2012-04-12 200
.net 4 2012-07-29 400
.net 5 2012-08-17 1200
.net 6 2012-08-18 1400
.net 7 2012-08-19 1400
java 8 2012-04-10 400
java 9 2012-07-29 400
java 10 2012-07-30 600
java 11 2012-08-14 1200
java 12 2012-08-15 1800
java 13 2012-08-16 1100
java 14 2012-09-17 1200
java 15 2012-08-18 2400
java 16 2012-08-19 2400
java 17 2012-08-20 2400
java 18 2012-08-21 1500
The result should look like this:
JobType ClientID ClinetDOB's AVG(Slaries)
.net 7 2012-08-19 1333 --This is the avg of client IDs 5,6,7 (because they have 3 consecutive DOBs)
Java 13 2012-08-16 1366 --This is the avg of client IDs 11,12,13 (because they have 3 consecutive DOBs)
Java 17 2012-08-20 2400 --This is the avg of client IDs 15,16,17 (because they have 3 consecutive DOBs)
Java 18 2012-08-21 2100 --This is the avg of client IDs 16,17,18 (because they have 3 consecutive DOBs)
The query below gives some messed-up results.
select t1.ClientID,
t1.ClinetDOBs,
(t1.Slaries + sum (t2.Slaries)) / (count (*) + 1) Avg_Slaries
from table1 t1
inner join table1 t2
on (t1.ClinetDOBs = dateadd(day, 3, t2.ClinetDOBs) and t1.jobtype = t2.jobtype)
group by t1.ClientID,
t1.ClinetDOBs,
t1.Slaries
Please help.
Thank You In advance!
You might try this. The difference is that t2 supplies the rows from the previous three days, which include the current row being tested, so no double-summing is needed. Also, `having` removes rows that reference only themselves.
select t1.ClientID,
t1.ClinetDOBs,
avg(t2.Slaries) Avg_Slaries
from table1 t1
inner join table1 t2
on t1.ClinetDOBs >= t2.ClinetDOBs
and t1.ClinetDOBs <= dateadd(day, 3, t2.ClinetDOBs)
and t1.jobtype = t2.jobtype
group by t1.ClientID,
t1.ClinetDOBs
having count(*) > 1
The following query joins in each of the two preceding records. The joins both bring in the data and act as a filter to ensure that there are three consecutive records:
select tmain.ClientID, tmain.ClinetDOBs,
sum(tmain.slaries + t1.slaries + t2.slaries)/3.0 as avg_slaries
from table1 tmain join
table1 t1
on t1.ClinetDOBs = dateadd(day, -1, tmain.ClinetDOBs) and
t1.jobtype = tmain.jobtype join
table1 t2
on t2.ClinetDOBs = dateadd(day, -2, tmain.ClinetDOBs) and
t2.jobtype = tmain.jobtype
group by tmain.ClientID, tmain.ClinetDOBs, tmain.Slaries
Your question seems odd. Why do the dates have to be sequential, and why do they all have to be there? What happens if there are multiple people with the same date and job type?
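
The two-join filter for three consecutive days can be tried against the sample data. This is a sketch for SQLite via Python's sqlite3 module, so it uses date(x, '-1 day') instead of dateadd; the table name clients and the corrected column spellings are assumptions for the example, and it assumes at most one client per job type and date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table mirroring the sample data in the question.
conn.execute(
    "CREATE TABLE clients (job_type TEXT, client_id INTEGER, dob TEXT, salary INTEGER)"
)
rows = [
    (".net", 1, "2012-03-14", 300), (".net", 2, "2012-04-11", 400),
    (".net", 3, "2012-04-12", 200), (".net", 4, "2012-07-29", 400),
    (".net", 5, "2012-08-17", 1200), (".net", 6, "2012-08-18", 1400),
    (".net", 7, "2012-08-19", 1400),
    ("java", 8, "2012-04-10", 400), ("java", 9, "2012-07-29", 400),
    ("java", 10, "2012-07-30", 600), ("java", 11, "2012-08-14", 1200),
    ("java", 12, "2012-08-15", 1800), ("java", 13, "2012-08-16", 1100),
    ("java", 14, "2012-09-17", 1200), ("java", 15, "2012-08-18", 2400),
    ("java", 16, "2012-08-19", 2400), ("java", 17, "2012-08-20", 2400),
    ("java", 18, "2012-08-21", 1500),
]
conn.executemany("INSERT INTO clients VALUES (?, ?, ?, ?)", rows)

# A client qualifies only if the same job type has clients born exactly
# one day and two days earlier; the inner joins act as that filter.
query = """
SELECT m.job_type, m.client_id, m.dob,
       (m.salary + p1.salary + p2.salary) / 3.0 AS avg_salary
FROM clients AS m
JOIN clients AS p1
  ON p1.job_type = m.job_type AND p1.dob = date(m.dob, '-1 day')
JOIN clients AS p2
  ON p2.job_type = m.job_type AND p2.dob = date(m.dob, '-2 day')
ORDER BY m.client_id
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

If several clients could share a date within a job type, each combination would produce its own row, so the query would need an extra aggregation step.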
Try
select t1.ClientID,
t1.ClinetDOBs,
avg(t2.Slaries)
from table1 t1
inner join table1 t2
on t2.ClinetDOBs >= t1.ClinetDOBs
and t2.ClinetDOBs <= dateadd(day, 3, t1.ClinetDOBs)
and t1.jobtype = t2.jobtype
group by t1.ClientID,
t1.ClinetDOBs