sql in but with multiple column - sql

I have final sql query like this
UPDATE Booking
set BookingType='Booked'
where BookingType='Defaulter' and BookingId in(SELECT BookingID
FROM ScheduledDues WHERE projectID=#ProjectId and DueFrom <= GETDATE()
GROUP BY BookingID HAVING MAX(DueTill) =0)
Now what i want is that select column to contains ScheduleDues with order by ScheduleDues desc. but cannot do it because it contains in. how can i do it?
SELECT BookingID,Max(DueTill),ScheduledDueID
FROM ScheduledDues WHERE projectID=30 and DueFrom <= GETDATE()
GROUP BY BookingID ,ScheduledDueID order by ScheduledDueID desc

If I understand your question correctly, is this what you try to achieve?
SELECT
ScheduledDues.BookingID
, ScheduledDues.DueTill
-- or you could type , InnerTable.MaxDueTill
, ScheduledDues.ScheduledDueID
FROM (
SELECT BookingID, Max(DueTill) AS MaxDueTill
FROM ScheduledDues
WHERE projectID=30 and DueFrom <= GETDATE()
GROUP BY BookingID ) InnerTable
JOIN ScheduledDues
ON InnerTable.BookingID = ScheduledDues.BookingID
AND InnerTable.MaxDueTill = ScheduledDues.DueTill
ORDER BY ScheduledDueID DESC
Thus for each booking (that can have multiple Schedule-Ids) you want the schedule-id associated with the maximum due-date. Meaning that for each booking, you get one schedule-id. And you wish to order that schedule-id. (And with the schedule-id being in the original group-by, you got all schedule-ids and not just the one you were looking for.)

Related

Delete the records repeated by date, and keep the oldest

I have this query, and it returns the following result, I need to delete the records repeated by date, and keep the oldest, how could I do this?
select
a.EMP_ID, a.EMP_DATE,
from
EMPLOYES a
inner join
TABLE2 b on a.table2ID = b.table2ID and b.ID_TYPE = 'E'
where
a.ID = 'VJAHAJHSJHDAJHSJDH'
and year(a.DATE) = 2021
and month(a.DATE) = 1
and a.ID <> 31
order by
a.DATE;
Additionally, I would like to fill in the missing days of the month ... and put them empty if I don't have that data, can this be done?
I would appreciate if you could guide me to solve this problem
Thank you!
The other answers miss some of the requirement..
Initial step - do this once only. Make a calendar table. This will come in handy for all sorts of things over the time:
DECLARE #Year INT = '2000';
DECLARE #YearCnt INT = 50 ;
DECLARE #StartDate DATE = DATEFROMPARTS(#Year, '01','01')
DECLARE #EndDate DATE = DATEADD(DAY, -1, DATEADD(YEAR, #YearCnt, #StartDate));
;WITH Cal(n) AS
(
SELECT 0 UNION ALL SELECT n + 1 FROM Cal
WHERE n < DATEDIFF(DAY, #StartDate, #EndDate)
),
FnlDt(d, n) AS
(
SELECT DATEADD(DAY, n, #StartDate), n FROM Cal
),
FinalCte AS
(
SELECT
[D] = CONVERT(DATE,d),
[Dy] = DATEPART(DAY, d),
[Mo] = DATENAME(MONTH, d),
[Yr] = DATEPART(YEAR, d),
[DN] = DATENAME(WEEKDAY, d),
[N] = n
FROM FnlDt
)
SELECT * INTO Cal FROM finalCte
ORDER BY [Date]
OPTION (MAXRECURSION 0);
credit: mostly this site
Now we can write some simple query to stick your data (with one small addition) onto it:
--your query, minus the date bits in the WHERE, and with a ROW_NUMBER
WITH yourQuery AS(
SELECT a.emp_id, a.emp_date,
ROW_NUMBER() OVER(PARTITION BY CAST(a.emp_date AS DATE) ORDER BY a.emp_date) rn
FROM EMPLOYES a
INNER JOIN TABLE2 b on a.table2ID = b.table2ID
WHERE a.emp_id = 'VJAHAJHSJHDAJHSJDH' AND a.id <> 31 AND b.id_type = 'E'
)
--your query, left joined onto the cal table so that you get a row for every day even if there is no emp data for that day
SELECT c.d, yq.*
FROM
Cal c
LEFT JOIN yourQuery yq
ON
c.d = CAST(yq.emp_date AS DATE) AND --cut the time off
yq.rn = 1 --keep only the earliest time per day
WHERE
c.d BETWEEN '2021-01-01' AND EOMONTH('2021-01-01')
We add a rownumbering to your table, it restarts every time the date changes and counts up in order of time. We make this into a CTE (or a subquery, CTE is cleaner) then we simply left join it to the calendar table. This means that for any date you don't have data, you still have the calendar date. For any days you do have data, the rownumber rn being a condition of the join means that only the first datetime from each day is present in the results
Note: something is wonky about your question . You said you SELECT a.emp_id and your results show 'VJAHAJHSJHDAJHSJDH' is the emp id, but your where clause says a.id twice, once as a string and once as a number - this can't be right, so I've guessed at fixing it but I suspect you have translated your query into something for SO, perhaps to hide real column names.. Also your SELECT has a dangling comma that is a syntax error.
If you have translated/obscured your real query, make absolutely sure you understand any answer here when translating it back. It's very frustrating when someone is coming back and saying "hi your query doesn't work" then it turns out that they damaged it trying to translate it back to their own db, because they hid the real column names in the question..
FInally, do not use functions on table data in a where clause; it generally kills indexing. Always try and find a way of leaving table data alone. Want all of january? Do like I did, and say table.datecolumn BETWEEN firstofjan AND endofjan etc - SQLserver at least stands a chance of using an index for this, rather than calling a function on every date in the table, every time the query is run
You can use ROW_NUMBER
WITH CTE AS
(
SELECT a.EMP_ID, a.EMP_DATE,
RN = ROW_NUMBER() OVER (PARTITION BY a.EMP_ID, CAST(a.DATE as Date) ORDER BY a.DATE ASC)
from EMPLOYES a INNER JOIN TABLE2 b
on a.table2ID = b.table2ID
and b.ID_TYPE = 'E'
where a.ID = 'VJAHAJHSJHDAJHSJDH'
and year(a.DATE) = 2021
and MONTH(a.DATE) = 1
and a.ID <> 31
)
SELECT * FROM CTE
WHERE RN = 1
Try with an aggregate function MAX or MIN
create table #tmp(dt datetime, val numeric(4,2))
insert into #tmp values ('2021-01-01 10:30:35', 1)
insert into #tmp values ('2021-01-02 10:30:35', 2)
insert into #tmp values ('2021-01-02 11:30:35', 3)
insert into #tmp values ('2021-01-03 10:35:35', 4)
select * from #tmp
select tmp.*
from #tmp tmp
inner join
(select max(dt) as dt, cast(dt as date) as dt_aux from #tmp group by cast(dt as date)) compressed_rows on
tmp.dt = compressed_rows.dt
drop table #tmp
results:

How to use DISTINCT in one column when SELECTING from many columns

I'm trying to select columns from two different views but I only want to use the DISTINCT statement on one specific column. I thought using the GROUP BY statement would work but it's throwing an error.
SELECT DISTINCT
[Act].[ClientId]
, [Ref].[Agency]
, [Act].[FundCode]
, [Act].[VService]
, [Act].[Service]
, [Act].[Attended]
, [Act].[StartDate]
FROM [dbo].[FS_v_CrossReference_ALL] AS [Ref]
INNER JOIN [dbo].[FS_v_Activities] AS [Act] ON [Ref].[VendorId] = [Act].[VendorId]
WHERE [Act].[StartDate] BETWEEN '1/1/2015' AND '12/31/2015'
GROUP BY [Act].[ClientId]
I want to use the DISTINCT statement on [Act].[ClientId]. Is there any way to do this?
Presumably, you want row_number():
SELECT ar.*
FROM (SELECT Act.*, Reg.Agency,
ROW_NUMBER() OVER (PARTITION BY Act.ClientId ORDER BY ACT.StartDate DESC) as seqnum
FROM [dbo].[FS_v_CrossReference_ALL] [Ref] JOIN
[dbo].[FS_v_Activities] Act
ON [Ref].[VendorId] = [Act].[VendorId]
WHERE [Act].[StartDate] >= '2015-01-01' AND
[Act].[StartDate] < '2016-01-01'
) ar
WHERE seqnum = 1;
Particularly note the changes to the date comparisons:
The dates are in standard format (YYYY-MM-DD or YYYYMMDD).
BETWEEN is replaced by two inequalities. This makes the code robust if the date is really a date/time with a time component.

Counting New Unique Values in Growing Time Window

I have a large table of users (as a guid), some associated values, and a time stamp of when each row was inserted. A user might be associated with many rows in this table.
guid | <other columns> | insertdate
I want to count for each month: how many unique new users were inserted. It's easy to do manually:
select count(distinct guid)
from table
where insertdate >= '20060201' and insertdate < '20060301'
and guid not in (select guid from table where
insertdate >= '20060101' and insertdate < '20060201')
How could this be done for each successive month in sql?
I thought to use a rank function to associate clearly each guid with a month:
select guid,
,dense_rank() over ( order by datepart(YYYY, insertdate),
datepart(m, t.TransactionDateTime)) as MonthRank
from table
and then iterate upon each rank value:
declare #no_times int
declare #counter int = 1
set #no_times = select count(distinct concat(datepart(year, t.TransactionDateTime),
datepart(month, t.TransactionDateTime))) from table
while #no_times > 0 do
(
select count(*), #counter
where guid not in (select guid from table where rank = #counter)
and rank = #int + 1
#counter += 1
#no_times -= 1
union all
)
end
I know this strategy is probably the wrong way to go about things.
Ideally, I would like a result set to look like this:
MonthRank | NoNewUsers
I would be extremely interested and grateful if a sql wizard could point me in the right direction.
SELECT
DATEPART(year,t.insertdate) AS YearNum
,DATEPART(mm,t.insertdate) as MonthNum
,COUNT(DISTINCT guid) AS NoNewUsers
,DENSE_RANK() OVER (ORDER BY COUNT(DISTINCT t.guid) DESC) AS MonthRank
FROM
table t
LEFT JOIN table t2
ON t.guid = t2.guid
AND t.insertdate > t2.insertdate
WHERE
t2.guid IS NULL
GROUP BY
DATEPART(year,t.insertdate)
,DATEPART(mm,t.insertdate)
Use a left join to see if the table ever existed as a prior insert date and if they didn't then count them using aggregation like you normally would. If you want to add a rank to see which month(s) have the highest number of new users then you can use your DENSE_RANK() function but because you are already grouping by want you want you do not need a partition clause.
If you want the first time that a guid entered, then your query doesn't exactly work. You can get the first time with two aggregations:
select year(first_insertdate), month(first_insertdate), count(*)
from (select t.guid, min(insertdate) as first_insertdate
from t
group by t.guid
) t
group by year(first_insertdate), month(first_insertdate)
order by year(first_insertdate), month(first_insertdate);
If you are looking for counting guids each time they skip a month, then you can use lag():
select year(insertdate), month(insertdate), count(*)
from (select t.*,
lag(insertdate) over (partition by guid order by insertdate) as prev_insertdate
from t
) t
where prev_insertdate is null or
datediff(month, prev_insertdate, insertdate) >= 2
group by year(insertdate), month(insertdate)
order by year(insertdate), month(insertdate);
I solved it with the terrible while loop, then a friend helped me to solve it more efficiently in another way.
The loop version:
--ranked by month
select t.TransactionID
,t.BuyerUserID
,concat(datepart(year, t.InsertDate), datepart(month,
t.InsertDate)) MonthRankName
,dense_rank() over ( order by datepart(YYYY, t.InsertDate),
datepart(m, t.InsertDate)) as MonthRank
into #ranked
from table t;
--iteratate
declare #counter int = 1
declare #no_times int
select #no_times = count(distinct concat(datepart(year, t.InsertDate),
datepart(month, t.InsertDate))) from table t;
select count(distinct r.guid) as NewUnique, r.Monthrank into #results
from #ranked r
where r.MonthRank = 1 group by r.MonthRank;
while #no_times > 1
begin
insert into #results
select count(distinct rt.guid) as NewUnique, #counter + 1 as MonthRank
from #ranked r
where rt.guid not in
(
select rt2.guid from #ranked rt2
where rt2.MonthRank = #counter
)
and rt.MonthRank = #counter + 1
set #counter = #counter+1
set #no_times = #no_times-1
end
select * from #results r
This turned out to run pretty slowly (as you might expect)
What turned out to be faster by a factor of 10 was this method:
select t.guid,
cast (concat(datepart(year, min(t.InsertDate)),
case when datepart(month, min(t.InsertDate)) < 10 then
'0'+cast( datepart(month, min(t.InsertDate)) as varchar(10))
else cast (datepart(month, min(t.InsertDate)) as varchar(10)) end
) as int) as MonthRankName
into #NewUnique
from table t
group by t.guid;
select count(1) as NewUniques, t.MonthRankName from #NewUnique t
group by t.MonthRankName
order by t.MonthRankName
Simply identifying the very first month each guid appears, then counting the number of these occurring each month. With a bit of a hack to get YearMonth formatted nicely (this seems to be more efficient than format([date], 'yyyyMM') but need to experiment more on that.

SQL Query to show order of work orders

First off sorry for the poor subject line.
EDIT: The Query here duplicates OrderNumbers I am needing the query to NOT duplicate OrderNumbers
EDIT: Shortened the question and provided a much cleaner question
I have a table that has a record of all of the work orders that have been performed. there are two types of orders. Installs and Trouble Calls. My query is to find all of the trouble calls that have taken place within 30 days of an install and match that trouble call (TC) to the proper Install (IN). So the Trouble Call date has to happen after the install but no more than 30 days after. Additionally if there are two installs and two trouble calls for the same account all within 30 days and they happen in order the results have to reflect that. The problem I am having is I am getting an Install order matching to two different Trouble Calls (TC) and a Trouble Call(TC) that is matching to two different Installs(IN)
In the example on SQL Fiddle pay close attention to the install order number 1234567810 and the Trouble Call order number 1234567890 and you will see the issue I am having.
http://sqlfiddle.com/#!3/811df/8
select b.accountnumber,
MAX(b.scheduleddate) as OriginalDate,
b.workordernumber as OriginalOrder,
b.jobtype as OriginalType,
MIN(a.scheduleddate) as NewDate,
a.workordernumber as NewOrder,
a.jobtype as NewType
from (
select workordernumber,accountnumber,jobtype,scheduleddate
from workorders
where jobtype = 'TC'
) a join
(
select workordernumber,accountnumber,jobtype,scheduleddate
from workorders
where jobtype = 'IN'
) b
on a.accountnumber = b.accountnumber
group by b.accountnumber,
b.scheduleddate,
b.workordernumber,
b.jobtype,
a.accountnumber,
a.scheduleddate,
a.workordernumber,
a.jobtype
having MIN(a.scheduleddate) > MAX(b.scheduleddate) and
DATEDIFF(day,MAX(b.scheduleddate),MIN(a.scheduleddate)) < 31
Example of what I am looking for the results to look like.
Thank you for any assistance you can provide in setting me on the correct path.
You were actually very close. I realized that what you really want is the MIN() TC date that is greater than each install date for that account number so long as they are 30 days or less apart.
So really you need to group by the install dates from your result set excluding WorkOrderNumbers still. Something like:
SELECT a.AccountNumber, MIN(a.scheduleddate) TCDate, b.scheduleddate INDate
FROM
(
SELECT WorkOrderNumber, ScheduledDate, JobType, AccountNumber
FROM workorders
WHERE JobType = 'TC'
) a
INNER JOIN
(
SELECT WorkOrderNumber, ScheduledDate, JobType, AccountNumber
FROM workorders
WHERE JobType = 'IN'
) b
ON a.AccountNumber = b.AccountNumber
WHERE b.ScheduledDate < a.ScheduledDate
AND DATEDIFF(DAY, b.ScheduledDate, a.ScheduledDate) <= 30
GROUP BY a.AccountNumber, b.AccountNumber, b.ScheduledDate
This takes care of the dates and AccountNumbers, but you still need the WorkOrderNumbers, so I joined the workorders table back twice, once for each type.
NOTE: I assume that each workorder has a unique date for each account number. So, if you have workorder 1 ('TC') for account 1 done on '1/1/2015' and you also have workorder 2 ('TC') for account 1 done on '1/1/2015' then I can't guarantee that you will have the correct WorkOrderNumber in your result set.
My final query looked like this:
SELECT
aggdata.AccountNumber, inst.workordernumber OriginalWorkOrderNumber, inst.JobType OriginalJobType, inst.ScheduledDate OriginalScheduledDate,
tc.WorkOrderNumber NewWorkOrderNumber, tc.JobType NewJobType, tc.ScheduledDate NewScheduledDate
FROM (
SELECT a.AccountNumber, MIN(a.scheduleddate) TCDate, b.scheduleddate INDate
FROM
(
SELECT WorkOrderNumber, ScheduledDate, JobType, AccountNumber
FROM workorders
WHERE JobType = 'TC'
) a
INNER JOIN
(
SELECT WorkOrderNumber, ScheduledDate, JobType, AccountNumber
FROM workorders
WHERE JobType = 'IN'
) b
ON a.AccountNumber = b.AccountNumber
WHERE b.ScheduledDate < a.ScheduledDate
AND DATEDIFF(DAY, b.ScheduledDate, a.ScheduledDate) <= 30
GROUP BY a.AccountNumber, b.AccountNumber, b.ScheduledDate
) aggdata
LEFT OUTER JOIN workorders tc
ON aggdata.TCDate = tc.ScheduledDate
AND aggdata.AccountNumber = tc.AccountNumber
AND tc.JobType = 'TC'
LEFT OUTER JOIN workorders inst
ON aggdata.INDate = inst.ScheduledDate
AND aggdata.AccountNumber = inst.AccountNumber
AND inst.JobType = 'IN'
select in1.accountnumber,
in1.scheduleddate as OriginalDate,
in1.workordernumber as OriginalOrder,
'IN' as OriginalType,
tc.scheduleddate as NewDate,
tc.workordernumber as NewOrder,
'TC' as NewType
from
workorders in1
out apply (Select min(in2.scheduleddate) as scheduleddate from workorders in2 Where in2.jobtype = 'IN' and in1.accountnumber=in2.accountnumber and in2.scheduleddate>in1.scheduleddate) ins
join workorders tc on tc.jobtype = 'TC' and tc.accountnumber=in1.accountnumber and tc.scheduleddate>in1.scheduleddate and (ins.scheduleddate is null or tc.scheduleddate<ins.scheduleddate) and DATEDIFF(day,in1.scheduleddate,tc.scheduleddate) < 31
Where in1.jobtype = 'IN'

No column name was specified for column 2 of 't' T_SQL error

PLease i want to return only debtorid from this T-SQL . It gives the error "
No column name was specified for column 2 of 't'." My query is below.
with t as
(SELECT debtorid,MAX(dbo.FollowUp.followupdate) FROM dbo.FollowUp WITH (NOLOCK) WHERE
( dbo.FollowUp.FollowUpDate >= '01-01-2011 00:00:00:000' and dbo.FollowUp.FollowUpDate <= '01-13-2014 23:59:59.000'
and Status = 'PTP') GROUP BY [Status], DebtorId)
SELECT t.debtorid FROM t;
WIth my little knowledge , i thought the above should work fine for me. however, it didn't.Any help would be appreciated.
You need to add a name to your second column, e.g.:
with t as
(SELECT debtorid, MAX(dbo.FollowUp.followupdate) maxFollowup
FROM dbo.FollowUp WITH (NOLOCK)
WHERE dbo.FollowUp.FollowUpDate >= '01-01-2011 00:00:00:000'
and dbo.FollowUp.FollowUpDate <= '01-13-2014 23:59:59.000'
and Status = 'PTP'
GROUP BY [Status], DebtorId)
SELECT t.debtorid FROM t;
You are using Common Table Expressions or CTE, in CTE you must have column name as per msdn. You can also refer for CTE best practices:
http://technet.microsoft.com/en-us/library/ms190766(v=sql.105).aspx
http://blog.sqlauthority.com/2011/05/10/sql-server-common-table-expression-cte-and-few-observation/
;with t as
(
SELECT FWP.debtorid
,MAX(FWP.followupdate) followupdate
FROM dbo.FollowUp FWP WITH (NOLOCK)
WHERE
(
FWP.FollowUpDate >= '01-01-2011 00:00:00:000'
and FWP.FollowUpDate <= '01-13-2014 23:59:59.000'
and FWP.Status = 'PTP'
) GROUP BY FWP.[Status], FWP.DebtorId
)
SELECT t.debtorid FROM t;