How to subtract two query results in Hive

I tried this query in SQL and it works fine, but it does not work in Hive:
select((select sum(price) from apart where construction_year=2020) - (select sum(price) from apart where construction_year=1990)) as difference_between_1990_and_2020;

You need to move the two aggregates into subqueries in the FROM clause and join them:
select p20 - p19 as difference_between_1990_and_2020
from (select sum(price) p20 from apart where construction_year=2020) rs20
join (select sum(price) p19 from apart where construction_year=1990) rs19 on 1=1;
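A minimal sketch of the same idea, run against SQLite through Python's sqlite3 module as a stand-in for Hive (an assumption, as are the sample rows). It runs the derived-table join from the answer and, for comparison, a single-scan variant using conditional aggregation, which is not part of the original answer.

```python
import sqlite3

# In-memory SQLite database standing in for Hive; the rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE apart (price INTEGER, construction_year INTEGER)")
conn.executemany(
    "INSERT INTO apart VALUES (?, ?)",
    [(100, 1990), (150, 1990), (300, 2020), (250, 2020), (999, 2005)],
)

# Each derived table yields exactly one row, so joining on 1=1 gives one row.
join_sql = """
SELECT p20 - p19 AS difference_between_1990_and_2020
FROM (SELECT SUM(price) AS p20 FROM apart WHERE construction_year = 2020) rs20
JOIN (SELECT SUM(price) AS p19 FROM apart WHERE construction_year = 1990) rs19
  ON 1 = 1
"""

# Alternative (not from the answer): read the table once and split by year.
case_sql = """
SELECT SUM(CASE WHEN construction_year = 2020 THEN price ELSE 0 END)
     - SUM(CASE WHEN construction_year = 1990 THEN price ELSE 0 END)
       AS difference_between_1990_and_2020
FROM apart
WHERE construction_year IN (1990, 2020)
"""

diff_join = conn.execute(join_sql).fetchone()[0]
diff_case = conn.execute(case_sql).fetchone()[0]
print(diff_join, diff_case)
```

The two variants differ when one of the years has no rows: the join version yields NULL, while the CASE version treats the missing year as 0.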

Related

Get sum of previous 6 values including the group

I need to sum up the values for the last 7 days, so it should be the current day plus the previous 6. This should happen for each row, i.e. in each row the column value would be the current value plus the previous 6.
The case:
(Note: I will calculate the hours by summing up the seconds.)
I tried using the query below:
select SUM([drivingTime]) OVER(PARTITION BY driverid ORDER BY date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW)
from [f.DriverHseCan]
The problem I face is that I have to group by driver and asset for each date.
In the above case, the driving time should be summed up first, and then its previous 6 rows should be taken.
I can't do this using rank() because I need these rows as well, since I have to show them in the report.
I tried doing this in both SSRS and SQL.
In short, it should add up the total driving time for the current day plus the 6 previous days.

Try the following query
SELECT
s.date
, s.driverid
, s.assetid
, s.drivingtime
, SUM(s2.drivingtime) AS total_drivingtime
FROM f.DriverHseCan s
JOIN (
SELECT date,driverid, SUM(drivingtime) drivingtime
FROM f.DriverHseCan
GROUP BY date,driverid
) AS s2
ON s.driverid = s2.driverid AND s2.date BETWEEN DATEADD(d,-6,s.date) AND s.date
GROUP BY
s.date
, s.driverid
, s.assetid
, s.drivingtime
If you have week start/end dates, there may be better-performing alternatives, e.g. using the week number in SSRS expressions rather than doing the self join in SQL Server.

I think aggregation does what you want:
select sum(sum([drivingTime])) over (partition by driverid
order by date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
)
from [f.DriverHseCan]
group by driverid, date
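To see what the aggregate-then-window approach computes, here is a small sketch using SQLite through Python's sqlite3 module as a stand-in for SQL Server (an assumption; the table contents are invented). The grouping step is written as a CTE named daily (a hypothetical name) so the two stages are visible separately; it is intended to compute the same thing as the nested sum(sum(...)) form.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server; data is made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE DriverHseCan "
    "(date TEXT, driverid INTEGER, assetid INTEGER, drivingTime INTEGER)"
)
rows = []
for day in range(1, 9):  # 2024-01-01 .. 2024-01-08
    d = f"2024-01-{day:02d}"
    rows.append((d, 1, 10, 100))  # driver 1, asset 10
    rows.append((d, 1, 11, 50))   # driver 1, asset 11
conn.executemany("INSERT INTO DriverHseCan VALUES (?, ?, ?, ?)", rows)

sql = """
WITH daily AS (
    -- Step 1: collapse to one row per driver and date.
    SELECT driverid, date, SUM(drivingTime) AS day_total
    FROM DriverHseCan
    GROUP BY driverid, date
)
-- Step 2: running total over the current row and the 6 rows before it.
SELECT driverid, date, day_total,
       SUM(day_total) OVER (
           PARTITION BY driverid
           ORDER BY date
           ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
       ) AS last7_total
FROM daily
ORDER BY driverid, date
"""
result = conn.execute(sql).fetchall()
last7 = [r[3] for r in result]
print(last7)
```

Note that ROWS BETWEEN 6 PRECEDING counts rows, not calendar days, so if a driver has no rows on some date the frame reaches further back than 7 days.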

I guess you need to use CROSS APPLY.
Something like the following:
SELECT driverID,
date,
CA.Last6DayDrivingTime
FROM YourTable YT
CROSS APPLY
(
SELECT SUM(drivingTime) AS Last6DayDrivingTime
FROM YourTable YT2
WHERE YT2.driverID = YT.driverID
AND YT2.date BETWEEN DATEADD(DAY,-6,YT.date) AND YT.date
) CA
Edit:
As you commented that CROSS APPLY slows down performance, another option is to precalculate the weekly values in a temp table or a CTE and then use them in your main query.

SQL Server get customer with 7 consecutive transactions

I am trying to write a query that would get the customers with 7 consecutive transactions given a list of CustomerKeys.
I am currently doing a self join on Customer fact table that has 700 Million records in SQL Server 2008.
This is what I came up with, but it is taking a long time to run. I have a clustered index on (CustomerKey, TranDateKey).
SELECT
ct1.CustomerKey,ct1.TranDateKey
FROM
CustomerTransactionFact ct1
INNER JOIN
#CRTCustomerList dl ON ct1.CustomerKey = dl.CustomerKey --temp table with customer list
INNER JOIN
dbo.CustomerTransactionFact ct2 ON ct1.CustomerKey = ct2.CustomerKey -- Same Customer
AND ct2.TranDateKey >= ct1.TranDateKey
AND ct2.TranDateKey <= CONVERT(VARCHAR(8), dateadd(d, 6, ct1.TranDateTime), 112) -- Consecutive Transactions in the last 7 days
WHERE
ct1.LogID >= 82800000
AND ct2.LogID >= 82800000
AND ct1.TranDateKey between dl.BeginTranDateKey and dl.EndTranDateKey
AND ct2.TranDateKey between dl.BeginTranDateKey and dl.EndTranDateKey
GROUP BY
ct1.CustomerKey,ct1.TranDateKey
HAVING
COUNT(*) = 7
Please help make it more efficient. Is there a better way to write this query in 2008?

You can do this using window functions, which should be much faster. Assuming that TranDateKey is a number and you can subtract a sequential number from it, the difference is constant for consecutive days.
You can put this in a query like this:
SELECT CustomerKey, MIN(TranDateKey), MAX(TranDateKey)
FROM (SELECT ct.CustomerKey, ct.TranDateKey,
(ct.TranDateKey -
DENSE_RANK() OVER (PARTITION BY ct.CustomerKey ORDER BY ct.TranDateKey)
) as grp
FROM CustomerTransactionFact ct INNER JOIN
#CRTCustomerList dl
ON ct.CustomerKey = dl.CustomerKey
) t
GROUP BY CustomerKey, grp
HAVING COUNT(*) = 7;
If your date key is something else, there is probably a way to modify the query to handle that, but you might have to join to the dimension table.

This would be a perfect task for a COUNT(*) OVER (RANGE ...), but SQL Server 2008 supports only a limited syntax for Windowed Aggregate Functions.
SELECT CustomerKey, MIN(TranDateKey), COUNT(*)
FROM
(
SELECT CustomerKey, TranDateKey,
dateadd(d,-ROW_NUMBER()
OVER (PARTITION BY CustomerKey
ORDER BY TranDateKey),TranDateTime) AS dummyDate
FROM CustomerTransactionFact
) AS dt
GROUP BY CustomerKey, dummyDate
HAVING COUNT(*) >= 7
The dateadd calculates the difference between the current TranDateTime and a Row_Number over all dates per customer. The resulting dummyDate has no actual meaning, but it is the same meaningless date for consecutive dates.
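The "date minus row number" trick can be tried out on a small data set. The sketch below uses SQLite through Python's sqlite3 module as a stand-in for SQL Server (an assumption), with a simplified table tx and made-up rows. The DISTINCT step, the CTE names, and the use of julianday() in place of dateadd are illustrative choices, not part of the answers above.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server; data is made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tx (CustomerKey INTEGER, TranDate TEXT)")
rows = [(1, f"2024-01-{d:02d}") for d in range(1, 8)]         # 7 days in a row
rows.append((1, "2024-01-03"))                                # second tx same day
rows += [(2, f"2024-01-{d:02d}") for d in (1, 2, 3, 5, 6, 7, 8)]  # gap on the 4th
conn.executemany("INSERT INTO tx VALUES (?, ?)", rows)

sql = """
WITH days AS (
    -- One row per customer and day, so repeat transactions do not inflate counts.
    SELECT DISTINCT CustomerKey, TranDate FROM tx
),
keyed AS (
    -- Day number minus row number is constant within a run of consecutive days.
    SELECT CustomerKey, TranDate,
           CAST(julianday(TranDate) AS INTEGER)
           - ROW_NUMBER() OVER (PARTITION BY CustomerKey ORDER BY TranDate)
             AS island
    FROM days
)
SELECT CustomerKey, MIN(TranDate), MAX(TranDate), COUNT(*)
FROM keyed
GROUP BY CustomerKey, island
HAVING COUNT(*) >= 7
ORDER BY CustomerKey
"""
streaks = conn.execute(sql).fetchall()
print(streaks)
```

Customer 2 has seven transaction days as well, but the gap splits them into runs of 3 and 4, so only customer 1 is returned.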

Adding Case in two sub queries returning null on subtract

I have two tables, both with a quantity column, and I want to subtract issued_donated_items from donated_items. It works fine unless there is no record in issued_donated_items, in which case my query returns null:
SELECT
(
(SELECT Sum(quantity) AS tQty FROM donated_items WHERE item_id=4)
-
(SELECT Sum(quantity_issued) AS issueQty FROM issued_donated_items WHERE item_id=4)
)AS total

I would suggest moving the subqueries to the from clause and using coalesce():
SELECT (COALESCE(di.tQty, 0) - COALESCE(idi.issueQty, 0)
) AS total
FROM (SELECT Sum(quantity) AS tQty FROM donated_items WHERE item_id = 4) di CROSS JOIN
(SELECT Sum(quantity_issued) AS issueQty FROM issued_donated_items WHERE item_id = 4) idi;
This makes it easy to re-use the values if you, for instance, want to see the two numbers as well as their difference.

Use isnull() around each sum like this:
SELECT
(
(SELECT isnull(Sum(quantity),0) AS tQty FROM donated_items WHERE item_id=4)
-
(SELECT isnull(Sum(quantity_issued),0) AS issueQty FROM issued_donated_items WHERE item_id=4)
)AS total
isnull() has to wrap Sum() itself, because Sum() over zero rows returns null. For ANSI standard SQL, use coalesce() instead of isnull().
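The null result comes from SUM over an empty set, which can be reproduced quickly. This is a small sketch using SQLite through Python's sqlite3 module as a stand-in database (an assumption; the rows are invented), comparing the plain subtraction with the COALESCE version.

```python
import sqlite3

# In-memory SQLite database as a stand-in; data is made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE donated_items (item_id INTEGER, quantity INTEGER)")
conn.execute(
    "CREATE TABLE issued_donated_items (item_id INTEGER, quantity_issued INTEGER)"
)
conn.executemany("INSERT INTO donated_items VALUES (?, ?)", [(4, 10), (4, 5)])
# issued_donated_items is left empty on purpose.

# Plain subtraction: SUM over zero rows is NULL, and 15 - NULL is NULL.
raw_sql = """
SELECT (SELECT SUM(quantity) FROM donated_items WHERE item_id = 4)
     - (SELECT SUM(quantity_issued) FROM issued_donated_items WHERE item_id = 4)
"""

# COALESCE applied to each aggregate result turns the missing side into 0.
safe_sql = """
SELECT COALESCE(di.tQty, 0) - COALESCE(idi.issueQty, 0) AS total
FROM (SELECT SUM(quantity) AS tQty
      FROM donated_items WHERE item_id = 4) di
CROSS JOIN
     (SELECT SUM(quantity_issued) AS issueQty
      FROM issued_donated_items WHERE item_id = 4) idi
"""

raw_total = conn.execute(raw_sql).fetchone()[0]
safe_total = conn.execute(safe_sql).fetchone()[0]
print(raw_total, safe_total)
```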

Show %s in Access 2010 Crosstab query instead of just counts

I have the following two queries that feed into the third query. My goal is to have a crosstab query with [MCOs] down the left and the possible responses/values for [DrpDwn] across the top, with the values shown as percentages of the total for each [MCOs] value (so % of row total).
What I have works, but I want to know if I can do it all in one query.
SELECT tblMCOs.MCOs, tblMCOs.DrpDwn, Count(tblMCOs.ID) AS CountOfID
FROM tblMCOs
GROUP BY tblMCOs.MCOs, tblMCOs.DrpDwn;
SELECT tblMCOs.MCOs, Count(tblMCOs.DrpDwn) AS CountOfDrpDwn
FROM tblMCOs
GROUP BY tblMCOs.MCOs;
TRANSFORM Sum(Round([qryMCODrpDwnCt]![CountOfID]/[qryMCOCtDrpDwn]![CountOfDrpDwn],4)*100) AS PCT
SELECT qryMCODrpDwnCt.MCOs
FROM qryMCODrpDwnCt INNER JOIN qryMCOCtDrpDwn ON qryMCODrpDwnCt.MCOs = qryMCOCtDrpDwn.MCOs
GROUP BY qryMCODrpDwnCt.MCOs
PIVOT qryMCODrpDwnCt.DrpDwn;
Thanks in advance for your help.

What I have works, but I want to know if I can do it all in one query.
Crosstab queries can be a bit fussy, but simply inserting the SQL code as subqueries should work:
TRANSFORM Sum(Round([sqMCODrpDwnCt]![CountOfID]/[sqMCOCtDrpDwn]![CountOfDrpDwn],4)*100) AS PCT
SELECT sqMCODrpDwnCt.MCOs
FROM
(
SELECT tblMCOs.MCOs, tblMCOs.DrpDwn, Count(tblMCOs.ID) AS CountOfID
FROM tblMCOs
GROUP BY tblMCOs.MCOs, tblMCOs.DrpDwn
) AS sqMCODrpDwnCt
INNER JOIN
(
SELECT tblMCOs.MCOs, Count(tblMCOs.DrpDwn) AS CountOfDrpDwn
FROM tblMCOs
GROUP BY tblMCOs.MCOs
) AS sqMCOCtDrpDwn
ON sqMCODrpDwnCt.MCOs = sqMCOCtDrpDwn.MCOs
GROUP BY sqMCODrpDwnCt.MCOs
PIVOT sqMCODrpDwnCt.DrpDwn

Oracle SQL Sum dollars into quarters

I would like the output to be:
VENDOR_ID  FY13Q1  FY13Q2  FY13Q3  FY13Q4  ...
ABC123     5000    NULL    NULL    10000
DEF321     10000   8000    15000   2000
From the table:
VENDOR_ID VARCHAR
GROSS_AMT NUMERIC
INVOICE_DT DATE
This query works BUT I need to find a more efficient way (if possible):
SELECT T1.VENDOR_ID, FY13Q1, FY13Q2, FY13Q3, FY13Q4, FY14Q1, FY14Q2, FY14Q3, FY14Q4
FROM
(
SELECT VENDOR_ID, SUM(GROSS_AMT) AS FY13Q1
FROM PS_VOUCHER
WHERE INVOICE_DT BETWEEN '01-JUL-12' AND '30-Sep-12'
GROUP BY VENDOR_ID
) T1
FULL JOIN
(
SELECT VENDOR_ID, SUM(GROSS_AMT) AS FY13Q2
FROM PS_VOUCHER
WHERE INVOICE_DT BETWEEN '1-Oct-12' AND '31-Dec-12'
GROUP BY VENDOR_ID
) T2
ON T1.VENDOR_ID LIKE T2.VENDOR_ID
...
FY13Q3 through FY14Q4 looks the same as above except the dates are changed to match the quarter. Any ideas on how to simplify this using a CASE statement or GROUP BY?

The original query is inefficient because it makes Oracle read the table multiple times. Almost all problems of this kind can be solved by reading the table once.
You can use pivot to simplify the query if you are using Oracle 11g or above.
select * from (
select vendor_id, to_char(invoice_dt, 'yyyy-q') yyq, sum(gross_amt) amt
from ps_voucher
group by vendor_id, to_char(invoice_dt, 'yyyy-q')
)
pivot (
sum(amt)
for yyq in ('2013-1', '2013-2', '2013-3', '2013-4', '2014-1', '2014-2', '2014-3', '2014-4')
)
order by vendor_id;
If you are using 10g or below, you should use the decode function or a case expression instead. You may want to read this: http://oracletuts.net/sql/three-ways-to-transpose-rows-into-columns-in-oracle-sql/
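A rough sketch of the case-based alternative, using SQLite through Python's sqlite3 module as a stand-in for Oracle (an assumption; dates are stored as ISO text and the rows are made up). It uses the fiscal-quarter date ranges from the question, where FY13Q1 runs from July to September 2012, rather than the calendar quarters that the 'yyyy-q' format produces. Only two quarters are shown; the others would follow the same pattern.

```python
import sqlite3

# In-memory SQLite database standing in for Oracle; data is made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PS_VOUCHER (VENDOR_ID TEXT, GROSS_AMT INTEGER, INVOICE_DT TEXT)"
)
conn.executemany(
    "INSERT INTO PS_VOUCHER VALUES (?, ?, ?)",
    [
        ("ABC123", 3000, "2012-07-15"),
        ("ABC123", 2000, "2012-08-01"),
        ("DEF321", 10000, "2012-09-30"),
        ("DEF321", 8000, "2012-10-01"),
    ],
)

# One pass over the table: each CASE routes an amount to its quarter column.
# A CASE without ELSE yields NULL, so a quarter with no invoices stays NULL.
sql = """
SELECT VENDOR_ID,
       SUM(CASE WHEN INVOICE_DT BETWEEN '2012-07-01' AND '2012-09-30'
                THEN GROSS_AMT END) AS FY13Q1,
       SUM(CASE WHEN INVOICE_DT BETWEEN '2012-10-01' AND '2012-12-31'
                THEN GROSS_AMT END) AS FY13Q2
FROM PS_VOUCHER
GROUP BY VENDOR_ID
ORDER BY VENDOR_ID
"""
quarters = conn.execute(sql).fetchall()
print(quarters)
```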
