Distinct query for SQL Server needed

I am pulling data with this query in SQL Server
SELECT DISTINCT
DOC.TPID,
DOC.TYPE,
DOC.DOCNO,
O211.PONO,
H210.INVDATE,
H210.INVNO,
H210.EQPMTINIT,
H210.EQPMTNO,
D214.DESTIMATED,
D214.DACTUAL,
DOC.CDATETIME
FROM [databasename].[dbo].[DOC]
JOIN [databasename].[dbo].[IN_211_HDR] H211 ON DOC.[TRANNO] = H211.TRANNO
JOIN [databasename].[dbo].[IN_211_ORD] O211 ON H211.TRANNO = O211.TRANNO
JOIN [databasename].[dbo].[IN_210_HDR] H210 ON DOCNO = H210.BOLNO
JOIN [databasename].[dbo].[IN_214_HDR] H214 ON H211.BOLNO = H214.SHPID
JOIN [databasename].[dbo].[IN_214_DTL] D214 ON H214.TRANNO = D214.TRANNO
WHERE
[TPID] = 'DSV' AND doc.[STATUSERP] = ''
ORDER BY
CDATETIME DESC
This will return the following result set.
TPID TYPE DOCNO PONO INVDATE INVNO EQPMTINIT EQPMTNO DESTIMATED DACTUAL CDATETIME
DSV 211 STAD8204126 106824 2014-05-27 00:00:00.000 US01271338 CCLU 4481776 2014-04-20 00:00:00.000 NULL 2014-04-10 15:00:10.000
DSV 211 STAD8204126 106824 2014-05-27 00:00:00.000 US01271338 CCLU 4481776 2014-05-02 00:00:00.000 NULL 2014-04-10 15:00:10.000
DSV 211 STAD8204126 106824 2014-05-27 00:00:00.000 US01271338 CCLU 4481776 2014-05-03 00:00:00.000 NULL 2014-04-10 15:00:10.000
DSV 211 STAD8204126 106824 2014-05-27 00:00:00.000 US01271338 CCLU 4481776 2014-05-18 00:00:00.000 NULL 2014-04-10 15:00:10.000
DSV 211 STAD8203444 106843 2014-05-21 00:00:00.000 US01267372 TGHU 4732265 2014-04-17 00:00:00.000 NULL 2014-04-10 08:03:14.000
DSV 211 STAD8203444 106843 2014-05-21 00:00:00.000 US01267372 TGHU 4732265 2014-05-05 00:00:00.000 NULL 2014-04-10 08:03:14.000
DSV 211 STAD8203444 106847 2014-05-21 00:00:00.000 US01267372 TGHU 4732265 2014-04-17 00:00:00.000 NULL 2014-04-10 08:03:14.000
DSV 211 STAD8203444 106847 2014-05-21 00:00:00.000 US01267372 TGHU 4732265 2014-05-05 00:00:00.000 NULL 2014-04-10 08:03:14.000
DSV 211 STAD8203444 108380 2014-05-21 00:00:00.000 US01267372 TGHU 4732265 2014-04-17 00:00:00.000 NULL 2014-04-10 08:03:14.000
DSV 211 STAD8203444 108380 2014-05-21 00:00:00.000 US01267372 TGHU 4732265 2014-05-05 00:00:00.000 NULL 2014-04-10 08:03:14.000
I need to have it so that it only returns rows with a unique O211.PONO. The only difference between those rows is the date but I need to only return one row for each unique O211.PONO number. It should take the one with the latest date in the D214.DESTIMATED field.

The easiest way is with row_number():
with t as (
<your query here without the order by>
)
select t.*
from (select t.*,
row_number() over (partition by PONO order by DESTIMATED desc) as seqnum
from t
) t
where seqnum = 1;
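To check the row_number() idea against the sample rows, here is a minimal sketch. It assumes Python's built-in sqlite3 module (SQLite 3.25 or later) as a stand-in for SQL Server, and a cut-down table `t` holding only DOCNO, PONO and DESTIMATED; the window-function syntax is the same in both engines.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for SQL Server; table t is a
# cut-down version of the joined result with only three of its columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (DOCNO TEXT, PONO TEXT, DESTIMATED TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("STAD8204126", "106824", "2014-04-20"),
        ("STAD8204126", "106824", "2014-05-02"),
        ("STAD8204126", "106824", "2014-05-03"),
        ("STAD8204126", "106824", "2014-05-18"),
        ("STAD8203444", "106843", "2014-04-17"),
        ("STAD8203444", "106843", "2014-05-05"),
        ("STAD8203444", "106847", "2014-04-17"),
        ("STAD8203444", "106847", "2014-05-05"),
        ("STAD8203444", "108380", "2014-04-17"),
        ("STAD8203444", "108380", "2014-05-05"),
    ],
)

# Number the rows inside each PONO from newest to oldest DESTIMATED,
# then keep only the first row of each group.
latest_per_pono = conn.execute(
    """
    SELECT DOCNO, PONO, DESTIMATED
    FROM (
        SELECT t.*,
               ROW_NUMBER() OVER (PARTITION BY PONO
                                  ORDER BY DESTIMATED DESC) AS seqnum
        FROM t
    ) ranked
    WHERE seqnum = 1
    ORDER BY PONO
    """
).fetchall()

for row in latest_per_pono:
    print(row)
```

Each PONO comes back once, with the latest DESTIMATED of its group.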

Related

How do I join a sparse table and fill rows between in SQL Server

How can I apply weights from one table to another [Port] where the weight table has sparse dates?
[Port] table
utcDT UsdPnl
-----------------------------------------------
2012-03-09 00:00:00.000 -0.00581815226439161
2012-03-11 00:00:00.000 -0.000535272460588547
2012-03-12 00:00:00.000 -0.00353079778650661
2012-03-13 00:00:00.000 0.00232882689252497
2012-03-14 00:00:00.000 -0.0102592811199384
2012-03-15 00:00:00.000 0.00254451559598693
2012-03-16 00:00:00.000 0.0146718613139845
2012-03-18 00:00:00.000 0.000425144543842752
2012-03-19 00:00:00.000 -0.00388548271428044
2012-03-20 00:00:00.000 -0.00662423680184768
2012-03-21 00:00:00.000 0.00405506208635343
2012-03-22 00:00:00.000 -0.000814822806982203
2012-03-23 00:00:00.000 -0.00289523953346103
2012-03-25 00:00:00.000 0.00204150859774465
2012-03-26 00:00:00.000 -0.00641635182718787
2012-03-27 00:00:00.000 -0.00107168420738448
2012-03-28 00:00:00.000 0.00131000520696153
2012-03-29 00:00:00.000 0.0008223678402638
2012-03-30 00:00:00.000 -0.00255345945390133
2012-04-01 00:00:00.000 -0.00337792814650089
[Weights] table
utcDT Weight
--------------------------------
2012-03-09 00:00:00.000 1
2012-03-20 00:00:00.000 3
2012-03-29 00:00:00.000 7
So, I want to use the weights as if I had a full table like this below. i.e. change to new weight on first day it appears in [Weights] table:
utcDT UsedWeight
----------------------------------
2012-03-09 00:00:00.000 1
2012-03-11 00:00:00.000 1
2012-03-12 00:00:00.000 1
2012-03-13 00:00:00.000 1
2012-03-14 00:00:00.000 1
2012-03-15 00:00:00.000 1
2012-03-16 00:00:00.000 1
2012-03-18 00:00:00.000 1
2012-03-19 00:00:00.000 1
2012-03-20 00:00:00.000 3
2012-03-21 00:00:00.000 3
2012-03-22 00:00:00.000 3
2012-03-23 00:00:00.000 3
2012-03-25 00:00:00.000 3
2012-03-26 00:00:00.000 3
2012-03-27 00:00:00.000 3
2012-03-28 00:00:00.000 3
2012-03-29 00:00:00.000 7
2012-03-30 00:00:00.000 7
2012-04-01 00:00:00.000 7
You can use apply:
select p.*, w.*
from port p outer apply
(select top (1) w.*
from weights w
where w.utcDT <= p.utcDT
order by w.utcDT desc
) w;
outer apply is usually pretty efficient if you have the right indexes. In this case, the right index is on weights(utcDT desc).
You can use lead() in a subquery to associate the next date a weight changes to each weights record, and then join with port using an inequality condition on the dates:
select p.utcDt, w.weight
from port p
inner join (
select utcDt, weight, lead(utcDt) over(order by utcDt) lead_utcDt from weights
) w
on p.utcDt >= w.utcDt
and (w.lead_utcDt is null or p.utcDt < w.lead_utcDt)
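A small runnable check of the lead() variant, assuming Python's sqlite3 (SQLite 3.25 or later) in place of SQL Server and a `port` table reduced to a few of the dates from the question (the UsdPnl column is left out):

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; dates are stored as ISO text,
# which compares correctly with <, >= for this format.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE port (utcDT TEXT)")
conn.execute("CREATE TABLE weights (utcDT TEXT, Weight INTEGER)")
conn.executemany(
    "INSERT INTO port VALUES (?)",
    [("2012-03-09",), ("2012-03-11",), ("2012-03-19",), ("2012-03-20",),
     ("2012-03-28",), ("2012-03-29",), ("2012-04-01",)],
)
conn.executemany(
    "INSERT INTO weights VALUES (?, ?)",
    [("2012-03-09", 1), ("2012-03-20", 3), ("2012-03-29", 7)],
)

# Each weight is valid from its own date up to (not including) the next
# weight's date; the last weight has no next date and stays open-ended.
used_weights = conn.execute(
    """
    SELECT p.utcDT, w.Weight
    FROM port p
    JOIN (
        SELECT utcDT, Weight,
               LEAD(utcDT) OVER (ORDER BY utcDT) AS next_utcDT
        FROM weights
    ) w
      ON p.utcDT >= w.utcDT
     AND (w.next_utcDT IS NULL OR p.utcDT < w.next_utcDT)
    ORDER BY p.utcDT
    """
).fetchall()

for row in used_weights:
    print(row)
```

The weight switches on the first day it appears in weights, which matches the UsedWeight table in the question.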

get end dates from list of start dates in sql

I have a sql query that brings back a list of references (products) that were at a specific status and an effective date. Unfortunately when one product moves to a different status the system doesn't put an end date in, so I am wanting to generate the end date, based on the effective date and sequence number. Is this possible?
Product Status EffectiveDate Enddate SeqNo
10 *UC 2017-10-02 00:00:00.000 NULL 8590
584 UC 2017-02-28 00:00:00.000 NULL 8380
584 APA 2017-07-07 00:00:00.000 NULL 8620
584 APA3 2017-08-10 00:00:00.000 NULL 8630
902 *UC 2017-10-13 00:00:00.000 NULL 8590
902 APA 2017-10-13 00:00:00.000 NULL 8620
1017 *UC 2017-09-01 00:00:00.000 NULL 8590
1017 APA 2017-10-10 00:00:00.000 NULL 8620
So I would want to return the following...
Product Status EffectiveDate EndDate SeqNo
10 *UC 2017-10-02 00:00:00.000 NULL 8590
584 UC 2017-02-28 00:00:00.000 2017-07-07 00:00:00.000 8380
584 APA 2017-07-07 00:00:00.000 2017-08-10 00:00:00.000 8620
584 APA3 2017-08-10 00:00:00.000 NULL 8630
902 *UC 2017-10-13 00:00:00.000 2017-10-13 00:00:00.000 8590
902 APA 2017-10-13 00:00:00.000 NULL 8620
1017 *UC 2017-09-01 00:00:00.000 2017-10-10 00:00:00.000 8590
1017 APA 2017-10-10 00:00:00.000 NULL 8620
Many thanks.
You can use lead() :
select t.*, lead(EffectiveDate) over (partition by product order by SeqNo) as EndDate
from table t;
However, lead() is only available from SQL Server 2012 onward, so on older versions you can use apply instead:
select t.*, t1.EffectiveDate as EndDate
from table t outer apply
(select top (1) t1.*
from table t1
where t1.product = t.product and t1.SeqNo > t.SeqNo
order by t1.SeqNo
) t1;
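To see the lead() version produce the wanted end dates, here is a sketch on the sample rows. It assumes Python's sqlite3 (SQLite 3.25 or later) instead of SQL Server, and a hypothetical table name `product_status`:

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; product_status is a
# made-up name for the table behind the question's query.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product_status "
    "(Product INTEGER, Status TEXT, EffectiveDate TEXT, SeqNo INTEGER)"
)
conn.executemany(
    "INSERT INTO product_status VALUES (?, ?, ?, ?)",
    [
        (10, "*UC", "2017-10-02", 8590),
        (584, "UC", "2017-02-28", 8380),
        (584, "APA", "2017-07-07", 8620),
        (584, "APA3", "2017-08-10", 8630),
        (902, "*UC", "2017-10-13", 8590),
        (902, "APA", "2017-10-13", 8620),
        (1017, "*UC", "2017-09-01", 8590),
        (1017, "APA", "2017-10-10", 8620),
    ],
)

# The end date of a status is the effective date of the next status
# (by SeqNo) for the same product; the last status gets NULL.
end_dates = conn.execute(
    """
    SELECT Product, Status, EffectiveDate,
           LEAD(EffectiveDate) OVER (PARTITION BY Product
                                     ORDER BY SeqNo) AS EndDate
    FROM product_status
    ORDER BY Product, SeqNo
    """
).fetchall()

for row in end_dates:
    print(row)
```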

SQL Server : compare rows, exclude from results when some values are the same

I have the following SQL Server query problem.
If there is a row whose ISSUE_DATE equals the MATURITY_DATE of another row, and both rows have the same ID and AMOUNT_USD, then neither of these rows should be displayed.
Here is a simplified version of my table:
ID ISSUE_DATE MATURITY_DATE AMOUNT_USD
1 2010-01-01 00:00:00.000 2015-12-01 00:00:00.000 5000
1 2010-01-01 00:00:00.000 2001-09-19 00:00:00.000 700
2 2014-04-09 00:00:00.000 2019-04-09 00:00:00.000 400
1 2015-12-01 00:00:00.000 2016-12-31 00:00:00.000 5000
5 2015-02-24 00:00:00.000 2015-02-24 00:00:00.000 8000
4 2012-11-29 00:00:00.000 2015-11-29 00:00:00.000 10000
3 2015-01-21 00:00:00.000 2018-01-21 00:00:00.000 17500
2 2015-02-02 00:00:00.000 2015-12-05 00:00:00.000 12000
1 2015-01-12 00:00:00.000 2018-01-12 00:00:00.000 18000
2 2015-12-05 00:00:00.000 2016-01-10 00:00:00.000 12000
Result should be:
ID ISSUE_DATE MATURITY_DATE AMOUNT_USD
1 2010-01-01 00:00:00.000 2001-09-19 00:00:00.000 700
2 2014-04-09 00:00:00.000 2019-04-09 00:00:00.000 400
5 2015-02-24 00:00:00.000 2015-02-24 00:00:00.000 8000
4 2012-11-29 00:00:00.000 2015-11-29 00:00:00.000 10000
3 2015-01-21 00:00:00.000 2018-01-21 00:00:00.000 17500
1 2015-01-12 00:00:00.000 2018-01-12 00:00:00.000 18000
I tried a self join, but I do not get the right result.
Thanks in advance!
Can you try something like this? 'not exists' is the way of doing it.
select * from table t1 where not exists (select 'x' from table t2 where t1.issue_date = t2.maturity_date and t1.amount_usd=t2.amount_usd and t1.id = t2.id)
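One caveat: the query above checks only one direction (t1.issue_date against t2.maturity_date) and lets a row match itself, so on the sample data it seems to keep the earlier row of each matched pair and drop the ID 5 row, whose two dates are equal. A sketch that checks both directions and skips the self-match is below. It assumes Python's sqlite3 in place of SQL Server, a hypothetical table name `bonds`, and, since the sample has no key column, that two different rows never share the same ID, AMOUNT_USD and ISSUE_DATE.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; bonds is a made-up table name.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bonds "
    "(ID INTEGER, ISSUE_DATE TEXT, MATURITY_DATE TEXT, AMOUNT_USD INTEGER)"
)
conn.executemany(
    "INSERT INTO bonds VALUES (?, ?, ?, ?)",
    [
        (1, "2010-01-01", "2015-12-01", 5000),
        (1, "2010-01-01", "2001-09-19", 700),
        (2, "2014-04-09", "2019-04-09", 400),
        (1, "2015-12-01", "2016-12-31", 5000),
        (5, "2015-02-24", "2015-02-24", 8000),
        (4, "2012-11-29", "2015-11-29", 10000),
        (3, "2015-01-21", "2018-01-21", 17500),
        (2, "2015-02-02", "2015-12-05", 12000),
        (1, "2015-01-12", "2018-01-12", 18000),
        (2, "2015-12-05", "2016-01-10", 12000),
    ],
)

# Drop a row when ANOTHER row with the same ID and amount either matures on
# this row's issue date or is issued on this row's maturity date.
# The ISSUE_DATE <> test keeps a row from being compared with itself.
kept_rows = conn.execute(
    """
    SELECT ID, ISSUE_DATE, MATURITY_DATE, AMOUNT_USD
    FROM bonds t1
    WHERE NOT EXISTS (
        SELECT 1
        FROM bonds t2
        WHERE t2.ID = t1.ID
          AND t2.AMOUNT_USD = t1.AMOUNT_USD
          AND t2.ISSUE_DATE <> t1.ISSUE_DATE
          AND (t2.MATURITY_DATE = t1.ISSUE_DATE
               OR t2.ISSUE_DATE = t1.MATURITY_DATE)
    )
    ORDER BY ID, ISSUE_DATE
    """
).fetchall()

for row in kept_rows:
    print(row)
```

The six remaining rows are the same ones listed in the expected result of the question, here sorted by ID and ISSUE_DATE.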
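I'd think about making a subquery of all the dupes and then eliminating them from the first table like so:
select t1.ID
, t1.ISSUE_DATE
, t1.MATURITY_DATE
, t1.AMOUNT_USD
FROM
t1
LEFT JOIN
(select a.ID
, a.ISSUE_DATE
, a.MATURITY_DATE
, a.AMOUNT_USD
FROM
t1 a
INNER JOIN
t1 b
ON a.ID = b.ID
AND a.AMOUNT_USD = b.AMOUNT_USD
AND (a.ISSUE_DATE = b.MATURITY_DATE OR a.MATURITY_DATE = b.ISSUE_DATE)
AND a.ISSUE_DATE <> b.ISSUE_DATE -- do not pair a row with itself
) dupes
on
t1.ID = dupes.ID
AND t1.ISSUE_DATE = dupes.ISSUE_DATE
AND t1.MATURITY_DATE = dupes.MATURITY_DATE
AND t1.AMOUNT_USD = dupes.AMOUNT_USD
WHERE dupes.ID IS NULL;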

SQL Server query join several tables

I have a query that I don't think should be that hard to make, however, I've spent a lot of time on it now and still can't get it the way I want, so I hope someone here can help me.
Basically, I need to create a report that will give a value for each month, for each area. However, not all areas deliver data each month; in that case the view should return NULL for that month and area. So, the view needs to look something like this:
Month Area Value
2012-08-01 Area1 2
2012-08-01 Area2 3
2012-09-01 Area1 3
2012-09-01 Area2 NULL
My data table looks something like this
Date Area Value
2012-08-01 Area1 2
2012-08-01 Area2 3
2012-09-01 Area1 3 -- Notice that Area2 is not present for September here
I have a table with all the available areas
Furthermore, I have created a table-valued function that returns all dates from a given date until now.
For example this statement
SELECT * FROM Periods_Months('2012-01-01')
would return 8 records like:
DateValue Year Month YearMonth
2012-01-01 00:00:00.000 2012 1 20121
2012-02-01 00:00:00.000 2012 2 20122
2012-03-01 00:00:00.000 2012 3 20123
2012-04-01 00:00:00.000 2012 4 20124
2012-05-01 00:00:00.000 2012 5 20125
2012-06-01 00:00:00.000 2012 6 20126
2012-07-01 00:00:00.000 2012 7 20127
2012-08-01 00:00:00.000 2012 8 20128
Based on the suggestions, my query now looks like this:
WITH months AS (
SELECT DateValue, YearMonth FROM Periods_Months('2011-01-01')
)
select m.DateValue
,CAST(DATEADD(s,-1,DATEADD(mm, DATEDIFF(m,0,m.DateValue)+1,0)) AS Date) AS DateReported -- Get last day in month
,ResponseTime AS Value
,g.ExternalId
from GISDB.dbo.GisObjects g
CROSS JOIN months m
LEFT OUTER JOIN
( -- SELECT data from data table, grouped by area and month
SELECT dbo.YearMonth(CloseDate) AS YearMonth
,MAX(CloseDate) AS LastDate
,GisObjectId
,SUM(DATEDIFF(HH,RegDate,CloseDate)) AS ResponseTime -- calculate response time between start and end data (the value we need)
FROM DataTable
WHERE CloseDate IS NOT NULL
AND GisObjectId IS NOT NULL
GROUP BY GisObjectId, dbo.YearMonth(CloseDate) -- group by area and month
) c
ON g.ObjectId = c.GisObjectId AND c.YearMonth = m.YearMonth
WHERE g.CompanyId = 3 AND g.ObjectTypeId = 1 -- reduce the GIS objects that we compare to
ORDER BY m.DateValue, g.ObjectId
But the result is this (Value is always NULL):
DateValue DateReported Value ExternalId
2011-01-01 00:00:00.000 31-01-2011 NULL 9994
2011-01-01 00:00:00.000 31-01-2011 NULL 9993
2011-01-01 00:00:00.000 31-01-2011 NULL 9992
2011-01-01 00:00:00.000 31-01-2011 NULL 9991
2011-01-01 00:00:00.000 31-01-2011 NULL 2339
2011-01-01 00:00:00.000 31-01-2011 NULL 2338
2011-01-01 00:00:00.000 31-01-2011 NULL 2337
2011-01-01 00:00:00.000 31-01-2011 NULL 2336
2011-01-01 00:00:00.000 31-01-2011 NULL 2335
2011-01-01 00:00:00.000 31-01-2011 NULL 2334
2011-01-01 00:00:00.000 31-01-2011 NULL 2327
2011-01-01 00:00:00.000 31-01-2011 NULL 2326
2011-01-01 00:00:00.000 31-01-2011 NULL 2325
2011-01-01 00:00:00.000 31-01-2011 NULL 2324
2011-01-01 00:00:00.000 31-01-2011 NULL 2323
2011-01-01 00:00:00.000 31-01-2011 NULL 2322
etc.
I suppose you have a table with all your areas, which I call area_table.
WITH month_table AS (
SELECT dateValue FROM Periods_Months('2012-01-01')
)
select * from area_table
CROSS JOIN month_table
LEFT OUTER JOIN myValueTable
ON area_table.name = myValueTable.area
AND myValueTable.date = left(convert(varchar(30),month_table.dateValue,120),10)
ORDER BY month_table.dateValue, area_table.name
Suppose Areas is your table for all available areas, t - is your data table:
SELECT pm.dateValue,Ar.Area, t.value
FROM Periods_Months('2012-01-01') pm CROSS JOIN Areas ar
left join t on (pm.dateValue=t.Date) and (ar.Area=t.Area)
order by pm.DateValue,ar.Area
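Both answers rest on the same pattern: cross join every month with every area, then left join the data so missing combinations come back as NULL. Here is a sketch on the sample rows from the question, assuming Python's sqlite3 in place of SQL Server and a plain `months` table standing in for the Periods_Months() function:

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; the months table replaces
# the table-valued function Periods_Months() from the question.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE months (DateValue TEXT);
    CREATE TABLE Areas (Area TEXT);
    CREATE TABLE t (Date TEXT, Area TEXT, Value INTEGER);

    INSERT INTO months VALUES ('2012-08-01'), ('2012-09-01');
    INSERT INTO Areas VALUES ('Area1'), ('Area2');
    INSERT INTO t VALUES
        ('2012-08-01', 'Area1', 2),
        ('2012-08-01', 'Area2', 3),
        ('2012-09-01', 'Area1', 3);  -- Area2 has no September row
    """
)

# CROSS JOIN builds the full month x area grid; LEFT JOIN then attaches
# a value where one exists and leaves NULL where it does not.
report = conn.execute(
    """
    SELECT m.DateValue, a.Area, t.Value
    FROM months m
    CROSS JOIN Areas a
    LEFT JOIN t
           ON t.Date = m.DateValue
          AND t.Area = a.Area
    ORDER BY m.DateValue, a.Area
    """
).fetchall()

for row in report:
    print(row)
```

Ordering by the grid columns (month and area) rather than by the left-joined table keeps the NULL rows in their natural place.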

joining monthly values with daily values in sql

I have daily values in one table and monthly values in another table. I need to use the values of the monthly table and calculate them on a daily basis.
basically, monthly factor * daily factor -- for each day
thanks!
I have a table like this:
2010-12-31 00:00:00.000 28.3
2010-09-30 00:00:00.000 64.1
2010-06-30 00:00:00.000 66.15
2010-03-31 00:00:00.000 12.54
and a table like this :
2010-12-31 00:00:00.000 98.1
2010-12-30 00:00:00.000 97.61
2010-12-29 00:00:00.000 99.03
2010-12-28 00:00:00.000 97.7
2010-12-27 00:00:00.000 96.87
2010-12-23 00:00:00.000 97.44
2010-12-22 00:00:00.000 97.76
2010-12-21 00:00:00.000 96.63
2010-12-20 00:00:00.000 95.47
2010-12-17 00:00:00.000 95.2
2010-12-16 00:00:00.000 94.84
2010-12-15 00:00:00.000 94.8
2010-12-14 00:00:00.000 94.1
2010-12-13 00:00:00.000 93.88
2010-12-10 00:00:00.000 93.04
2010-12-09 00:00:00.000 91.07
2010-12-08 00:00:00.000 90.89
2010-12-07 00:00:00.000 92.72
2010-12-06 00:00:00.000 93.05
2010-12-03 00:00:00.000 91.74
2010-12-02 00:00:00.000 90.74
2010-12-01 00:00:00.000 90.25
I need to take the value for the quarter and multiply it by the daily value for every day in that quarter.
You could try:
SELECT dt.day, dt.factor*mt.factor AS daily_factor
FROM daily_table dt INNER JOIN month_table mt
ON YEAR(dt.day) = YEAR(mt.day)
AND FLOOR((MONTH(dt.day)-1)/3) = FLOOR((MONTH(mt.day)-1)/3)
ORDER BY dt.day
or (as suggested by @Andriy)
SELECT dt.day, dt.factor*mt.factor AS daily_factor
FROM daily_table dt INNER JOIN month_table mt
ON YEAR(dt.day) = YEAR(mt.day)
AND DATEPART(QUARTER, dt.day) = DATEPART(QUARTER, mt.day)
ORDER BY dt.day
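A runnable sketch of the quarter join on a few of the sample rows. It assumes Python's sqlite3 in place of SQL Server; SQLite has no YEAR(), MONTH() or DATEPART(), so strftime() is swapped in and the quarter is computed as (month - 1) / 3 with integer division, as in the first query. The table and column names are the ones used in the answer.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; strftime() replaces
# YEAR()/MONTH(), and integer division gives the zero-based quarter.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE month_table (day TEXT, factor REAL)")
conn.execute("CREATE TABLE daily_table (day TEXT, factor REAL)")
conn.executemany(
    "INSERT INTO month_table VALUES (?, ?)",
    [("2010-12-31", 28.3), ("2010-09-30", 64.1),
     ("2010-06-30", 66.15), ("2010-03-31", 12.54)],
)
conn.executemany(
    "INSERT INTO daily_table VALUES (?, ?)",
    [("2010-12-31", 98.1), ("2010-12-30", 97.61), ("2010-12-01", 90.25)],
)

# Match each daily row to the quarterly row of the same year and quarter,
# then multiply the two factors.
daily_factors = conn.execute(
    """
    SELECT dt.day, dt.factor * mt.factor AS daily_factor
    FROM daily_table dt
    JOIN month_table mt
      ON strftime('%Y', dt.day) = strftime('%Y', mt.day)
     AND (CAST(strftime('%m', dt.day) AS INTEGER) - 1) / 3
       = (CAST(strftime('%m', mt.day) AS INTEGER) - 1) / 3
    ORDER BY dt.day
    """
).fetchall()

for day, value in daily_factors:
    print(day, round(value, 3))
```

All three daily rows fall in the fourth quarter, so each is multiplied by 28.3 and the other quarterly rows do not match anything.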