Showing the closest date and not the rest - sql

Thanks for looking.
I am trying to get the smallest time between visits per customer. With the below code I am getting the amount of time between each visit, but it is comparing every date. I just need the smallest margin and then to get rid of/hide all the others.
select a.account_id, a.transaction_id, b.transaction_date , a.transaction_date, round(a.transaction_date - b.transaction_date, 0) as Time_between
from time_between_trans a, time_between_trans b
where a.transaction_date > b.transaction_date
and a.account_ID like '717724'
and a.account_id = b.account_id
For each date in transaction_date, find the closest date to it in the
transactions.
They must be of the same account_id, and the second date must be later
than the first.

You need to add a group by to your query:
select a.account_id, a.transaction_id, min(a.time_between) as min_time_between
from (select a.account_id, a.transaction_id,
             -- alias the second date so the subquery has no duplicate column names
             b.transaction_date as prev_transaction_date, a.transaction_date,
             round(a.transaction_date - b.transaction_date, 0) as time_between
      from time_between_trans a join
           time_between_trans b
           on a.transaction_date > b.transaction_date and
              a.account_id = '717724' and
              a.account_id = b.account_id
     ) a
group by a.account_id, a.transaction_id
I also rewrote the implicit comma join as an explicit join, and changed the "like" to an "=", since you have no wildcards.
There may be other methods to express this, but this is standard SQL. You should specify the database you are using in your question.
If you are using Oracle, then just do the following:
select a.*,
       (case when nexttd - transaction_date < transaction_date - prevtd
             then nexttd - transaction_date
             else transaction_date - prevtd
        end) as mintime
from (select a.*,
             lead(transaction_date, 1) over (partition by account_id order by transaction_date) as nexttd,
             lag(transaction_date, 1) over (partition by account_id order by transaction_date) as prevtd
      from time_between_trans a
     ) a
Actually, this isn't quite correct, because you have to take NULLs into account for nexttd and prevtd. But the idea is simple. The lead and lag functions let you take the previous or next transaction, and then you can do whatever you want to find the minimum, or get whatever other information you would like from the records.
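The NULL handling mentioned above can be sketched as follows. This is only an illustration run through Python's sqlite3 module rather than Oracle (it assumes SQLite 3.25 or later for window functions); the sample rows are invented, and transaction_date is stored as a plain day number to keep the arithmetic simple.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE time_between_trans "
    "(account_id TEXT, transaction_id INTEGER, transaction_date INTEGER)"
)
# Hypothetical sample data: transaction_date is a day number, not a real date.
conn.executemany(
    "INSERT INTO time_between_trans VALUES (?, ?, ?)",
    [("717724", 1, 10), ("717724", 2, 13), ("717724", 3, 20), ("999", 4, 5)],
)

query = """
SELECT account_id, transaction_id,
       CASE
           -- first visit of an account: only the next visit exists
           WHEN prevtd IS NULL THEN nexttd - transaction_date
           -- last visit of an account: only the previous visit exists
           WHEN nexttd IS NULL THEN transaction_date - prevtd
           WHEN nexttd - transaction_date < transaction_date - prevtd
               THEN nexttd - transaction_date
           ELSE transaction_date - prevtd
       END AS mintime
FROM (SELECT t.*,
             LEAD(transaction_date) OVER (PARTITION BY account_id
                                          ORDER BY transaction_date) AS nexttd,
             LAG(transaction_date)  OVER (PARTITION BY account_id
                                          ORDER BY transaction_date) AS prevtd
      FROM time_between_trans t)
ORDER BY account_id, transaction_date
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

An account with a single visit has neither neighbour, so its mintime stays NULL (None in Python).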


How to sum from multiple columns and segregate into separate column if result is positive and negative

I am using PostgreSQL and need to write a query that sums values from separate columns of two different tables and then splits the result into separate columns depending on whether it is positive or negative.
For example:
Below is the source table.
Below is the resulting table that needs to be created; it is also used while populating it.
I have written the query below to aggregate the sums, and I am able to populate the TOT_CREDIT and TOT_DEBIT columns. Is there a more optimized query to achieve this?
select t.account_id,
t.transaction_date,
SUM(t.transaction_amt) filter (where t.transaction_amt >= 0) as tot_debit,
SUM(t.transaction_amt) filter (where t.transaction_amt < 0) as tot_credit,
case
when
(
SUM(t.transaction_amt) +
SUM(COALESCE(b.credit_balance,0)) +
SUM(COALESCE(b.debit_balance,0))
) < 0
then
(
SUM(t.transaction_amt) +
SUM(COALESCE(b.credit_balance,0)) +
SUM(COALESCE(b.debit_balance,0))
)
end as credit_balance,
case
when
(
SUM(t.transaction_amt) +
SUM(COALESCE(b.credit_balance,0)) +
SUM(COALESCE(b.debit_balance,0))
) > 0
then
(
SUM(t.transaction_amt) +
SUM(COALESCE(b.credit_balance,0)) +
SUM(COALESCE(b.debit_balance,0))
)
end as debit_balance
from
transaction t
LEFT OUTER JOIN balance b ON (t.account_id = b.account_id
and t.transaction_date = b.transaction_date
and b.transaction_date=t.transaction_date- INTERVAL '1 DAYS')
group by
t.account_id,
t.transaction_date
Please provide some pointers.
EDIT 1: This query is not working in the expected manner.
One way is to break your logic into small queries and join them at the end:
-- left joins keep dates that have only credits or only debits;
-- conditions on an aggregate go in HAVING, not WHERE
select d.account_id, d.t_date, tw.t_c, th.t_d, fo.c_b, fi.d_b
from (select distinct account_id, Transaction_date as t_date from TransactionTABLE) as d
left join (select account_id, Transaction_date as t_date, sum(Transaction_AMT) as t_c from TransactionTABLE
           where Transaction_AMT < 0 group by account_id, Transaction_date) as tw
       on tw.account_id = d.account_id and tw.t_date = d.t_date
left join (select account_id, Transaction_date as t_date, sum(Transaction_AMT) as t_d from TransactionTABLE
           where Transaction_AMT > 0 group by account_id, Transaction_date) as th
       on th.account_id = d.account_id and th.t_date = d.t_date
left join (select account_id, Transaction_date as t_date, sum(Transaction_AMT) as c_b from TransactionTABLE
           group by account_id, Transaction_date having sum(Transaction_AMT) < 0) as fo
       on fo.account_id = d.account_id and fo.t_date = d.t_date
left join (select account_id, Transaction_date as t_date, sum(Transaction_AMT) as d_b from TransactionTABLE
           group by account_id, Transaction_date having sum(Transaction_AMT) > 0) as fi
       on fi.account_id = d.account_id and fi.t_date = d.t_date;
Or else, you could try something like the following, which calculates a running total per account_id ordered by transaction_date and puts it in credit_balance or debit_balance depending on its sign:
select account_id,
       transaction_date,
       sum(transaction_amt) filter (where transaction_amt >= 0) as tot_debit,
       sum(transaction_amt) filter (where transaction_amt < 0) as tot_credit,
       -- sum(sum(...)) over w is the running total of the daily sums
       case when sum(sum(transaction_amt)) over w < 0
            then sum(sum(transaction_amt)) over w end as credit_balance,
       case when sum(sum(transaction_amt)) over w >= 0
            then sum(sum(transaction_amt)) over w end as debit_balance
from TransactionTABLE
group by account_id, transaction_date
window w as (partition by account_id order by transaction_date)
order by 1, 2;
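The sign-based split itself can be tried out on a small scale. The sketch below uses Python's sqlite3 module instead of PostgreSQL, a made-up transactions table with invented rows, and SUM(CASE ...) in place of FILTER so that it runs on older SQLite versions. It only covers the per-day totals and the sign of the daily net; the previous-day balance table from the question is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions "
    "(account_id INTEGER, transaction_date TEXT, transaction_amt INTEGER)"
)
# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?)",
    [
        (1, "2020-01-01", 100),
        (1, "2020-01-01", -30),
        (1, "2020-01-02", -200),
        (2, "2020-01-01", 50),
    ],
)

query = """
SELECT account_id, transaction_date,
       -- per-row conditions go inside the aggregate
       SUM(CASE WHEN transaction_amt >= 0 THEN transaction_amt END) AS tot_debit,
       SUM(CASE WHEN transaction_amt <  0 THEN transaction_amt END) AS tot_credit,
       -- conditions on the aggregated value wrap the aggregate
       CASE WHEN SUM(transaction_amt) <  0 THEN SUM(transaction_amt) END AS credit_balance,
       CASE WHEN SUM(transaction_amt) >= 0 THEN SUM(transaction_amt) END AS debit_balance
FROM transactions
GROUP BY account_id, transaction_date
ORDER BY account_id, transaction_date
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Each group fills exactly one of credit_balance and debit_balance; the other stays NULL.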

Trying to create a SQL query

I am trying to create a query that retrieves only the ten companies with the highest number of pickups over the six-month period. This means pickup occasions, not the number of items picked up.
I have done this:
SELECT *
FROM customer
JOIN (SELECT manifest.pickup_customer_ref reference,
DENSE_RANK() OVER (PARTITION BY manifest.pickup_customer_ref ORDER BY COUNT(manifest.trip_id) DESC) rnk
FROM manifest
INNER JOIN trip ON manifest.trip_id = trip.trip_id
WHERE trip.departure_date > TRUNC(SYSDATE) - interval '6' month
GROUP BY manifest.pickup_customer_ref) cm ON customer.reference = cm.reference
WHERE cm.rnk < 11;
This uses dense_rank to determine the order of customers, with the highest number of trips first.
Hmm, well, I don't have Oracle, so I can't test it 100%, but I believe you're looking for something like the following (written in SQL Server syntax):
Keep in mind that when you use group by, every non-aggregated column in the select must also appear in the group by. Hope this at least gives you an idea of what to look at.
select TOP 10
    c.company_name,
    m.pickup_customer_ref,
    count(*) as pickup_count
from customer c
inner join manifest m on m.pickup_customer_ref = c.reference
inner join trip t on t.trip_id = m.trip_id
-- keep trips from the last six months
where t.departure_date > DATEADD(month, -6, GETDATE())
group by c.company_name, m.pickup_customer_ref
-- highest pickup count first
order by pickup_count desc, c.company_name, m.pickup_customer_ref
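One detail in the question's query is worth illustrating: partitioning the rank by the same column the query groups by leaves exactly one row per partition, so every customer gets rank 1. The sketch below shows this with Python's sqlite3 module (assuming SQLite 3.25 or later) and a cut-down, invented manifest table; the date filter and the customer join are omitted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE manifest (trip_id INTEGER, pickup_customer_ref INTEGER);
-- hypothetical data: customer 10 has 3 pickups, 20 has 2, 30 has 1
INSERT INTO manifest VALUES (1, 10), (2, 10), (3, 10), (4, 20), (5, 20), (6, 30);
""")

counts = """
(SELECT pickup_customer_ref, COUNT(*) AS pickups
 FROM manifest
 GROUP BY pickup_customer_ref)
"""

# Ranking inside a partition that holds a single row: always rank 1.
partitioned = conn.execute(f"""
SELECT pickup_customer_ref,
       DENSE_RANK() OVER (PARTITION BY pickup_customer_ref
                          ORDER BY pickups DESC) AS rnk
FROM {counts}
ORDER BY pickup_customer_ref
""").fetchall()

# Ranking across all customers: the ranks become meaningful.
ranked = conn.execute(f"""
SELECT pickup_customer_ref, pickups,
       DENSE_RANK() OVER (ORDER BY pickups DESC) AS rnk
FROM {counts}
ORDER BY rnk, pickup_customer_ref
""").fetchall()

top2 = [row for row in ranked if row[2] <= 2]
print(partitioned)
print(top2)
```

Dropping the PARTITION BY (or ordering by the count and limiting the row count) is what makes a "top ten" filter such as rnk < 11 select anything useful.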

sql to select first n unique lines on sorted result

I have a query that returns one column of strings, for example:
NAME:
-----
SOF
OTP
OTP
OTP
SOF
VIL
OTP
SOF
GGG
I want to get SOF, OTP, VIL: the first 3 unique values from the top.
I tried using DISTINCT and GROUP BY, but that does not work; the sort order is lost.
The query building this result is:
SELECT DISTINCT d.adst
FROM (SELECT a.date adate,
b.date bdate,
a.price + b.price total,
( b.date - a.date ) days,
a.dst adst
FROM flights a
JOIN flights b
ON a.dst = b.dst
ORDER BY total) d
I have "flights" table with details, and I need to get the 3 (=n) cheapest destinations.
Thanks
This can easily be done using window functions:
select *
from (
SELECT a.date as adate,
b.date as bdate,
a.price + b.price as total,
dense_rank() over (order by a.price + b.price) as rnk,
b.date - a.date as days,
a.dst as adst
FROM flights a
JOIN flights b ON a.dst = b.dst
) t
where rnk <= 3
order by rnk;
More details on window functions can be found in the manual:
http://www.postgresql.org/docs/current/static/tutorial-window.html
Found a way to do it.
I am selecting the DST and the PRICE, grouping by DST with the MIN function on price, and limiting to 3.
Is there a better way to do it?
SELECT d.adst , min(d.total) mttl
FROM (SELECT a.date adate,
b.date bdate,
a.price + b.price total,
( b.date - a.date ) days,
a.dst adst
FROM flights a
JOIN flights b
ON a.dst = b.dst) d
group by adst order by mttl limit 3;
select
name
from
testname
where
name in (
select distinct(name) from testname)
group by name order by min(ctid) limit 3
You can tweak your query to return the correct result, by adding where days > 0 and limit 3 in the outer query like this:
select *
from
(
select
a.date adate,
b.date bdate,
(a.price + b.price) total,
(b.date - a.date) days ,
a.dst adst
from flights a
join flights b on a.dst = b.dst
order by total
) d
where days > 0
limit 3;
This assumes that the second entry is the return flight, with a date later than the first entry, so that you get a positive days difference.
Note that your query without days > 0 gives you a cross join between the table and itself: for each flight you will get 4 rows, two paired with itself with days = 0 and another row with negative days, so I used days > 0 to get the correct row.
I recommend that you add a new column, Flight_Id, as a primary key, and a foreign key such as From_Flight_Id. The primary flight would then have a null From_Flight_Id, and the returning flight would have a From_Flight_Id equal to the Flight_Id of the primary flight. This way you can join them properly instead.
SELECT DISTINCT(`EnteredOn`) FROM `rm_pr_patients` Group By `EnteredOn`
SELECT DISTINCT ON (column_name) column_name FROM table_name ORDER BY column_name LIMIT 3;
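The "group by destination, order by the cheapest total, limit n" idea from this thread can be checked on the sample names from the question. This is a small sketch using Python's sqlite3 module; the priced table and its totals are invented stand-ins for the result of the flights self-join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE priced (dst TEXT, total INTEGER)")
# Hypothetical totals, already in ascending order like the question's output.
conn.executemany(
    "INSERT INTO priced VALUES (?, ?)",
    [
        ("SOF", 100), ("OTP", 110), ("OTP", 120), ("OTP", 130), ("SOF", 140),
        ("VIL", 150), ("OTP", 160), ("SOF", 170), ("GGG", 180),
    ],
)

# One row per destination, ordered by that destination's cheapest total.
query = """
SELECT dst
FROM priced
GROUP BY dst
ORDER BY MIN(total)
LIMIT 3
"""
cheapest = [row[0] for row in conn.execute(query)]
print(cheapest)
```

Ordering by the aggregate keeps the sort meaningful after the duplicates collapse, which is what a plain DISTINCT cannot do.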

Selecting only if at least one row matches condition

I have a select statement and want to return all rows only if at least one of them has a date at least 60 days before today.
The problem is that I have an outer apply which returns the column I want to compare to, and the values come from different tables (one belongs to cash items, and the other to card items).
Considering I have the following:
OUTER APPLY (
SELECT COUNT(*) AS quantity, MIN(date) AS item_date
FROM dbo.get_cash_items(loans.id_cash) AS cash_item
HAVING loans.id_product_type = 1 --Cash
UNION
SELECT COUNT(*) AS quantity, MIN(date) AS item_date
FROM dbo.get_card_items(loans.id_card) AS card_item
HAVING loans.id_product_type = 2 --Card
) AS items
I want to return all the rows only when DATEDIFF(DAY, MIN(items.item_date), GETDATE()) >= 60, but I want them all even if only one matches this condition.
What would be the best approach to do this?
EDIT
To make it clearer, I'll explain the use case:
I need to show the items of every loan, but only if the client is more than 60 days past the due date on any of them.
I am also not sure what you expect, but how about this:
WITH items
AS (SELECT Count(*) AS quantity,
Min(date) AS item_date
FROM dbo.Get_cash_items(loans.id_cash) AS cash_item
HAVING loans.id_product_type = 1
UNION
SELECT Count(*) AS quantity,
Min(date) AS item_date
FROM dbo.Get_card_items(loans.id_card) AS card_item
HAVING loans.id_product_type = 2)
SELECT a.*
FROM items AS a,
(SELECT TOP 1 *
FROM items AS b
WHERE Datediff(day, b.item_date, Getdate()) >= 60) AS c
It's a sort of CROSS JOIN, where table c will have one or zero rows depending on whether the condition is met; it will then join to every row in the other table.
Have you tried something like this?
SELECT a.quantity, a.item_date
FROM
(SELECT COUNT(*) AS quantity, MIN(date) AS item_date
FROM dbo.get_cash_items(loans.id_cash) AS cash_item
HAVING loans.id_product_type = 1
UNION
SELECT COUNT(*) AS quantity, MIN(date) AS item_date
FROM dbo.get_card_items(loans.id_card) AS card_item
HAVING loans.id_product_type = 2) a
WHERE DATEDIFF(day, a.item_date, GETDATE()) >= 60
Typically I do this using a CTE to select the key for the records I want to select and then join on that. Below is an attempt at an example:
with LateClients as
(
    SELECT LoanId FROM Payment Where /*payment date later than 60 days*/
)
SELECT p.LoanId,
       p.UserId
FROM Payment as p
INNER JOIN LateClients as lc
    ON p.LoanId = lc.LoanId
ORDER BY p.LoanId, p.UserId
I know it's a bit different from the code you posted, but this is a simplified example that should explain the concept. Good luck!
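The "all rows if at least one qualifies" pattern can also be written with EXISTS. The sketch below is a simplified illustration in Python's sqlite3 module, not the SQL Server setup from the question: the items table, its columns, and the precomputed days_late value are all assumptions made to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items (loan_id INTEGER, item_id INTEGER, days_late INTEGER);
-- hypothetical data: loan 1 has one item 75 days late, loan 2 has none over 60
INSERT INTO items VALUES (1, 1, 5), (1, 2, 75), (2, 3, 10), (2, 4, 20);
""")

# Return every item of a loan, but only when some item of that loan is 60+ days late.
query = """
SELECT i.loan_id, i.item_id
FROM items i
WHERE EXISTS (SELECT 1
              FROM items late
              WHERE late.loan_id = i.loan_id
                AND late.days_late >= 60)
ORDER BY i.item_id
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Both items of loan 1 come back, including the one that is only 5 days late, while loan 2 is dropped entirely.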

Sql Server - Joining subqueries using calculated fields

I am trying to calculate the percentage change in price between days. As the days are not consecutive, I build into the query a calculated field that tells me what relative day it is (day 1, day 2, etc.). In order to compare today with yesterday, I offset the calculated day number by 1 in a subquery. What I want to do is join the inner and outer query on the calculated relative day. The code I came up with is:
SELECT TOP 11
P.Date,
(AVG(P.SettlementPri) - PriceY) / PriceY as PriceChange,
P.Symbol,
(RANK() OVER (ORDER BY P.Date desc)) as dayrank_Today
FROM OTE P
JOIN (SELECT TOP 11
C.Date,
AVG(SettlementPri) as PriceY,
(RANK() OVER (ORDER BY C.Date desc))+1 as dayrank_Yest
FROM OTE C
WHERE C.ComCode = 'C-'
GROUP BY c.Date) C ON dayrank_Today = C.dayrank_Yest
WHERE P.ComCode = 'C-'
GROUP BY P.Symbol, P.Date
If I try to execute the query, I get an error message indicating dayrank_Today is an invalid column. I have tried renaming it, qualifying it, and yelling obscenities at it, and I get squat. Still an error.
You can't select a calculated column and then use it in a join at the same query level. You can use CTEs, which I'm not so familiar with, or you can just use derived tables like so:
SELECT
    P.Date,
    (P.AvgPrice - C.PriceY) / C.PriceY as PriceChange,
    P.Symbol,
    P.dayrank_Today
FROM (SELECT TOP 11
          Date,
          AVG(SettlementPri) as AvgPrice,
          Symbol,
          RANK() OVER (ORDER BY Date desc) as dayrank_Today
      FROM OTE
      WHERE ComCode = 'C-'
      GROUP BY Symbol, Date
      ORDER BY Date desc) P
JOIN (SELECT TOP 11
          C.Date,
          AVG(SettlementPri) as PriceY,
          -- ranks descend by date, so the previous day has the next higher rank;
          -- subtracting 1 lines it up with dayrank_Today
          RANK() OVER (ORDER BY C.Date desc) - 1 as dayrank_Yest
      FROM OTE C
      WHERE C.ComCode = 'C-'
      GROUP BY C.Date
      ORDER BY C.Date desc) C ON P.dayrank_Today = C.dayrank_Yest
If possible consider using a CTE as it makes it very easy. Something like this:
With Raw as
(
SELECT TOP 11 C.Date,
Avg(SettlementPri) As PriceY,
Rank() OVER (ORDER BY C.Date desc) as dayrank
FROM OTE C WHERE C.Comcode = 'C-'
Group by C.Date
)
select today.pricey as todayprice,
       yesterday.pricey as yesterdayprice,
       (today.pricey - yesterday.pricey) / yesterday.pricey * 100 as percentchange
from Raw today
-- dayrank descends by date, so the previous day has dayrank + 1
left outer join Raw yesterday on yesterday.dayrank = today.dayrank + 1
Obviously this doesn't include the symbol, but that can be included pretty easily.
If the 'With' syntax doesn't suit you, you can also use calculated fields with Outer Apply http://technet.microsoft.com/en-us/library/ms175156.aspx
Although the CTE means that you only need to write your price calculation once, which is a lot cleaner.
Cheers
I had the same problem and found this thread and found a solution so I thought I'd post it here.
Instead of using the column name as the parameter for ON, copy the statement that gave you the column name in the first place:
replace:
ON dayrank_Today = C.dayrank_Yest
with:
ON (RANK() OVER (ORDER BY Date desc)) = C.dayrank_Yest
Granted, you're displeasing the Programming Gods by violating DRY, but you could be pragmatic and mention the duplication in the comments, which should appease their wrath to a mild grumbling.
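For comparison, here is a small runnable sketch of the CTE route, where the rank is computed once and then joined by name. It uses Python's sqlite3 module (assuming SQLite 3.25 or later) rather than SQL Server, and the ote table, its column names, and the prices are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ote (date TEXT, settlement_pri REAL);
-- hypothetical prices on non-consecutive days
INSERT INTO ote VALUES ('2020-01-02', 100.0), ('2020-01-03', 110.0), ('2020-01-07', 99.0);
""")

query = """
WITH daily AS (
    SELECT date, AVG(settlement_pri) AS price
    FROM ote
    GROUP BY date
),
ranked AS (
    -- rank 1 is the most recent day
    SELECT date, price, RANK() OVER (ORDER BY date DESC) AS dayrank
    FROM daily
)
SELECT today.date,
       (today.price - yesterday.price) / yesterday.price AS price_change
FROM ranked today
-- the previous day is one rank further down the descending order
JOIN ranked yesterday ON yesterday.dayrank = today.dayrank + 1
ORDER BY today.date
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Because dayrank is an ordinary column of the ranked CTE by the time the join runs, it can be referenced in ON without repeating the RANK() expression.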