SQL Server - Group by - Additional column - sql

I have a problem that I can hardly put in words and thus was not able to search for a solution before creating this post. Please forgive me if this has been asked before. Let me illustrate input and desired output:
Order Description Operation OperationDescription SubTarget
12 Order12 Op1 Order12, Op1 ABA
12 Order12 Op2 Order12, Op2 ABB
18 Order18 Op1 Order18, Op1 XYA
18 Order18 Op2 Order18, Op2 XYB
19 Order19 Op1 Order19, Op1 KLA
20 Order20 Op1 Order20, Op1 Truck123
20 Order20 Op2 Order20, Op2 Truck456
20 Order20 Op3 Order20, Op3 Truck789
20 Order20 Op4 Order20, Op4 Truck123
When I query the table above and group by Order and Description, I'd like to get all characters from SubTarget (from left to right) as long as they match, and discard the rest:
Order Description SubTarget
12 Order12 AB
18 Order18 XY
19 Order19 KLA
20 Order20 Truck
I once found some neat code on the net to concatenate different values from a column not in the group by clause, using STUFF and FOR XML PATH. Not sure if that approach could be helpful here as well.
Thank you all in advance!
Regards,
Toby
Additional notes, based on comments and answer from @junketsu:
There is a column Target in the background, which is not accessible. Its content is always a leading part of SubTarget; put the other way around, SubTarget adds detail to Target by appending more characters to the end of the string. Neither value is limited to two or three characters; if they were, I could simply use the substring function.
The third example (Order# 19) might be confusing. I included this sample to show that it would be fine to return the whole string as a result if there were only one single operation in the order.
Another example may be: Order 5 with Operation Op1, Op2, Op3 and SubTarget Truck123, Truck456, Truck789 and Truck123. This should produce "Truck" as outcome. The repetition of Truck123 is no error.
Hope this makes it clearer.
In the end I want to approach the actual content of column Target as it cannot be included in the query.
Thanks again,
Toby

I could not understand your additional notes or the third example (Order# 19). I just worked toward your expected answer:
create table #group (Orders int,Description varchar (20),Operation varchar (20)
,OperationDescription varchar (20),SubTarget varchar (20)
)
insert into #group values
(20,'Order20','Op1','Order20, Op1','Truck123')
,(20,'Order20','Op2','Order20, Op2','Truck456')
,(20,'Order20','Op3','Order20, Op3','Truck789')
,(20,'Order20','Op4','Order20, Op4','Truck123')
,(12,'Order12','Op1','Order12 Op1','ABA')
,(12,'Order12','Op2','Order12 Op2','ABB')
,(18,'Order18','Op1','Order18 Op1','XYA')
,(18,'Order18','Op2','Order18 Op2','XYB')
,(19,'Order19','Op1','Order19 Op1','KLA')
select distinct
gor.Orders, gor.Description, iif (g.c = 1, gor.SubTarget
, left (gor.SubTarget, 2)) subtarget
from (
select distinct
orders, Description
, count (*) c
from #group group by orders, Description
) g join #group gor on g.Orders = gor.Orders
And I got:
Orders Description subtarget
12 Order12 AB
18 Order18 XY
19 Order19 KLA
20 Order20 Tr
Let me know if the query needs updates.
Update 1: here is the updated query.
select distinct
orders, Description, Operation, OperationDescription
, iif (count (*) over (partition by orders, Description ) = 1, subtarget,
left (subtarget, 2)
) subtarget
from #group
Update 2
1). cte: First of all I generate every leading substring of each SubTarget, down to a length of 2.
e.g. Truck123 -> Truck12 -> Truck1 -> ... -> Tr.
2). countlen: I count how often each substring occurs in cte and get its maximum length, because the base string occurs many times.
e.g. Truck occurs more often than Truck123, Truck456, Truck789, Truck123.
And Truck is longer than Tr, Tru, Truc.
3). maxcount: I get the maximum of the counts returned by countlen.
4). Finally I join the CTEs above on everything except SubTarget, and then take the SubTarget from cte.
;with cte as (
select Orders, Description, SubTarget, len (SubTarget) len from #group
union all
select Orders, Description, left (subtarget, len (SubTarget) - 1)
, LEN (SubTarget) - 1 from cte where len > 2
), countlen as (
select
Orders, Description, SubTarget
, count (len) over (partition by Orders, Description, SubTarget order by len) count
, max (len) over (partition by Orders, Description, SubTarget order by len) maxlen
from cte
), maxcount as (
select Orders, Description, max (count) maxcount from countlen group by Orders, Description
) select distinct
o.Orders, o.Description, c.SubTarget
from (
select
cc.Orders, cc.Description, max (cc.maxlen) maxofmax
from countlen cc
join maxcount m
on cc.Orders = m.Orders and cc.Description = m.Description
where m.maxcount = cc.count
group by cc.Orders, cc.Description
) o
join cte c
on o.Orders = c.Orders and o.Description = c.Description and len (c.SubTarget) = o.maxofmax
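For comparison, the requirement stated in the question (keep the leading characters that every SubTarget of an order shares) can be sketched outside SQL. This is a minimal Python illustration under that reading of the requirement, not a translation of the query above; the names common_prefix and prefixes_by_order are made up for the example.

```python
from itertools import takewhile


def common_prefix(values):
    """Return the longest leading run of characters shared by all strings."""
    # zip(*values) yields one tuple of characters per position; keep the
    # positions where every string has the same character.
    shared = takewhile(lambda column: len(set(column)) == 1, zip(*values))
    return "".join(column[0] for column in shared)


def prefixes_by_order(rows):
    """rows: (order, description, sub_target) tuples, grouped like GROUP BY."""
    groups = {}
    for order, description, sub_target in rows:
        groups.setdefault((order, description), []).append(sub_target)
    return {key: common_prefix(values) for key, values in groups.items()}


rows = [
    (12, "Order12", "ABA"), (12, "Order12", "ABB"),
    (18, "Order18", "XYA"), (18, "Order18", "XYB"),
    (19, "Order19", "KLA"),
    (20, "Order20", "Truck123"), (20, "Order20", "Truck456"),
    (20, "Order20", "Truck789"), (20, "Order20", "Truck123"),
]
prefixes = prefixes_by_order(rows)
```

An order with a single operation keeps its whole SubTarget, which matches the Order 19 case in the question.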

Here you go good sir
create table #temp_1
( [order] int null
,Description varchar(15) null
,Operation varchar(30) null
,OperationDescription varchar(30) null
,SubTarget varchar(30) null
)
insert into #temp_1 values
(12 ,'Order12','Op1', 'Order12, Op1' ,'ABA')
,(12 ,'Order12','Op2', 'Order12, Op2' ,'ABB')
,(18 ,'Order18','Op1', 'Order18, Op1' ,'XYA')
,(18 ,'Order18','Op2', 'Order18, Op2' ,'XYB')
,(19 ,'Order19','Op1', 'Order19, Op1' ,'KLA')
select *
from (
select *
,Rank_1 = Row_number() over(partition by SubTarget_1 order by [Order] asc)
from (
select [order],[Description]
--,SubTarget = substring(SubTarget,0,3)
,SubTarget_1 = case when SubTarget like 'a%b%' then 'AB'
when SubTarget like 'x%y%'then 'XY' else SubTarget end
from #temp_1
) a
) b
where Rank_1 = 1
order by [Order] asc

Related

writing a query using advanced group by

I have a single table database consists of the following fields:
ID, Seniority (years), outcome and some other less important fields.
Table row example:
ID:36 Seniority(years):1.79 outcome:9627
I need to write a query (sql server) in relatively simple code that returns the average outcome, grouped by the Seniority field, with leaps of five years (0-5 years, 6-10 etc...) with the condition that the average will be shown only if the group has more than 3 rows.
Result row example:
range:0-5 average:xxxx
Thank you very much
Use CASE statement to create different age groups. Try this
select case when Seniority between 0 and 5 then '0-5'
when Seniority between 6 and 10 then '6-10'
..
End,
Avg(outcome)
From yourtable
Group by case when Seniority between 0 and 5 then '0-5'
when Seniority between 6 and 10 then '6-10'
..
End
Having count(1)>=3
Since you have decimal places: if you want 5.4 to count toward the 0-5 group and 5.6 toward the 6-10 group, use Round(Seniority,0) instead of Seniority in the CASE statement. Otherwise values between 5 and 6 fall into neither group.
P.s.
0-5 contains 6 values while 6-10 contains 5.
select 'range:'
+ cast (isnull(nullif(floor((abs(seniority-1))/5)*5+1,1),0) as varchar)
+ '-'
+ cast ((floor((abs(seniority-1))/5)+1)*5 as varchar) as seniority_group
,avg(outcome)
from t
group by floor((abs(seniority-1))/5)
having count(*) >= 3
;
This would be something like:
select floor(seniority / 5), avg(outcome)
from t
group by floor(seniority / 5)
having count(*) >= 3;
Note: This breaks the seniority into equal sized groups which is 0-4, 5-9, and so on. This seems more reasonable than having unequal groups.
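The equal-width grouping described in the note can be sketched in Python as follows. This is only an illustration of the idea; the function name average_by_bucket and the sample rows other than the one from the question are assumptions, and min_rows=3 mirrors the HAVING count(*) >= 3 used in the query.

```python
from collections import defaultdict
from math import floor


def average_by_bucket(rows, width=5, min_rows=3):
    """rows: (seniority, outcome) pairs. Returns {range label: average outcome}."""
    buckets = defaultdict(list)
    for seniority, outcome in rows:
        # Same grouping key as floor(seniority / 5) in the query.
        buckets[floor(seniority / width)].append(outcome)
    return {
        f"{b * width}-{b * width + width - 1}": sum(values) / len(values)
        for b, values in sorted(buckets.items())
        if len(values) >= min_rows  # mirrors HAVING count(*) >= 3
    }


# The first row is the example from the question; the others are made up.
sample = [(1.79, 9627), (0.5, 100), (4.9, 200), (6.0, 50)]
averages = average_by_bucket(sample)
```

If "more than 3 rows" is meant strictly, pass min_rows=4.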
You can follow Gordon's answer (though you should edit it a little), but I would do this with an additional table holding all possible intervals. You can then add an appropriate index to speed it up.
create table intervals
(
id int identity(1, 1),
start int,
[end] int -- end is a reserved word, so it needs brackets
)
insert into intervals values
(0, 5),
(6, 10)
...
select i.id, avg(t.outcome) as outcome
from intervals i
join tablename t on t.seniority between i.start and i.[end]
group by i.id
having count(*) >=3
If creating new tables is not an option you can always use a CTE:
;with intervals as(
select * from
(values
(0, 5),
(6, 10)
--...
) t(start, [end])
)
select i.start, i.[end], avg(t.outcome) as outcome -- the CTE has no id column
from intervals i
join tablename t on t.seniority between i.start and i.[end]
group by i.start, i.[end]
having count(*) >=3

Treating null values of two records as not equal in SQL query

I am working on a SQL query which performs some calculations and returns the difference of two columns that belong to two different rows of a single table when certain values in the other columns are not equal.
For Example I have the following data in a table
id Market Grade Term Bid Offer CP
1 Heavy ABC Jun14 -19.5 -17 BA
2 Heavy ABC Jul14 -20 -17.5 BB
3 Sour XYZ Jun14 -30 -17 NULL
4 Sour XYZ Jul14 -32 -27 NULL
5 Sweet XY Jun14 -30 -17 PV
6 Sweet XY Jul14 -32 -27 PV
Now, I want the following results
As Market and Grade are the same and the CPs differ for Id=1,2, it should calculate:
Bid of Id=1 - Offer of Id=2
Offer of Id=1 - Bid of Id=2
Market and Grade are also the same for Id=3,4 and their CPs are both NULL, so logically they match, but I still want to calculate as in the previous case:
Bid of Id=3 - Offer of Id=4
Offer of Id=3 - Bid of Id=4
Finally, I don't want to calculate anything for the records with Ids 5 and 6, as their CPs are the same.
The result should look something like the following:
Market Term Bid Offer
Heavy/ABC Jun14/Jul14 (-19.5-(-17.5))=-2 (-17-(-20))=3
Sour/XYZ Jun14/Jul14 (-30-(-27))=-3 (-17-(-32))=15
I was able to figure out most of this except the case when the CPs of the two records are both NULL, as the query does not treat them as different:
;with numbered as
(
select id, market, grade, term, bid, offer, cp, row_number() OVER (Partition BY Market, Grade ORDER BY Bid desc) i
from things
)
--select * from numbered
select r1.market + '/' + r1.grade as Market, r1.term + '/' + r2.term as Term, r1.Bid - r2.Offer [Bid], r1.Offer - r2.Bid [Offer]
from numbered r1
join numbered r2 on r1.market = r2.market and r1.grade = r2.grade and r1.i < r2.i and r1.CP!=r2.CP
How can I treat both NULLs as not equal?
Can you not just change:
and r1.CP!=r2.CP
to:
and ISNULL(r1.CP, 'X') != ISNULL(r2.CP, 'Y')
Edit. If you want to be really safe and live a little dangerously you could even do this:
and ISNULL(r1.CP, CONVERT(VARCHAR(36), NEWID())) != ISNULL(r2.CP, CONVERT(VARCHAR(36), NEWID()))
I'm not quite throwing this out as an answer because it is an awful solution, but you could replace NULLS with a stand-in value with the ISNULL function.
;with numbered as
(
select id, market, grade, term, bid, offer, cp, row_number()
OVER (Partition BY Market, Grade ORDER BY Bid desc) i
from things
)
--select * from numbered
select r1.market + '/' + r1.grade as Market, r1.term + '/' + r2.term as Term,
r1.Bid - r2.Offer [Bid], r1.Offer - r2.Bid [Offer]
from numbered r1
join numbered r2 on r1.market = r2.market and r1.grade = r2.grade and r1.i < r2.i
and ISNULL(r1.CP, 1) != ISNULL(r2.CP,2)
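You couldn't treat both NULLs as equal, even if you wanted to. NULL is the absence of a value; comparing nothing with nothing is neither true nor false, so both r1.CP = r2.CP and r1.CP != r2.CP evaluate to UNKNOWN when either side is NULL, and the join drops those rows.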
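The three-valued comparison can be mimicked in Python to show why the NULL/NULL pair disappears from the join and what the question actually asks for. This is a sketch only: None stands in for NULL and UNKNOWN, and both function names are invented for the illustration.

```python
def sql_not_equal(a, b):
    """a != b under SQL three-valued logic; None stands for NULL / UNKNOWN."""
    if a is None or b is None:
        return None  # UNKNOWN: a join or WHERE clause treats this like false
    return a != b


def differs_with_nulls_distinct(a, b):
    """The behaviour the question asks for: a NULL never equals anything."""
    if a is None or b is None:
        return True
    return a != b


# CP values of the three row pairs from the question.
pairs = [("BA", "BB"), (None, None), ("PV", "PV")]
plain = [sql_not_equal(a, b) for a, b in pairs]
wanted = [differs_with_nulls_distinct(a, b) for a, b in pairs]
```

The ISNULL(r1.CP, 'X') != ISNULL(r2.CP, 'Y') rewrite gives the second behaviour only as long as 'X' and 'Y' never occur as real CP values.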

how to have column output zero when it equals null?

I have a query where i get the average of some ratings
which works correct.
WITH row_avg_table(avg_rating,employee, approveddate) AS
(SELECT
(SELECT AVG(rating)
FROM (
VALUES (CAST(c.rating1 AS float)), (CAST(c.rating2 AS float)), (CAST(c.rating3 AS float)),
(CAST(c.rating4 AS float)), (CAST(c.rating5 AS float)) ) AS v (rating)
WHERE v.rating > 0) avg_rating,
employee,approveddate
FROM CSEReduxResponses c)
SELECT employee,
avg(avg_rating) as average_rating
FROM row_avg_table
where month(approveddate)=6
AND year(approveddate)=2014
GROUP BY employee;
The problem I'm having is when ratings 1-5 are all 0.
Right now it gives me NULL; I would like it to show 0 in this special case.
For example, I have the data below:
create table CSEReduxResponses (rating1 int, rating2 int, rating3 int, rating4 int, rating5 int,
approveddate datetime,employee int)
insert into CSEReduxResponses (rating1 , rating2 ,rating3 , rating4 , rating5 ,
approveddate, employee )
values
(5,4,5,1,4,'2014-06-18',1),
(5,4,5,1,0,'2014-06-18',1),
(0,0,0,0,0,'2014-06-19',3);
So for employee=3 average_rating =0
What if you use ISNULL function like below
SELECT ISNULL(AVG(rating),some_default_value)
(OR) CASE condition like below
SELECT AVG(case when rating = 0 then 1 else rating end)
The problem you have is the WHERE v.rating > 0 clause. If the ratings are 0, then this will return no values - and the average of nothing cannot be calculated, and thus returns null. Simply remove this clause, or if you want to filter negative values, change it to WHERE v.rating >= 0.
If you use 0 to store values that you don't want to impact the rating (like it appears that you might be doing from the second of the examples you provide) this won't work - you'll have to instead go through the output and replace null values with 0. You might copy the results from your first query into a temp table/table variable and do replace such as
select employee, rating = isnull(t.rating, 0)
from #temptable t
Or something of the like.
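The behaviour described above (an average over no values is NULL, which then has to be replaced with 0) can be sketched in Python with the sample data from the question. None plays the role of NULL; row_average and employee_averages are names chosen for this illustration only.

```python
def row_average(ratings):
    """Average of the non-zero ratings in one row, or None if there are none.

    Mirrors AVG(rating) ... WHERE v.rating > 0, which yields NULL for no rows.
    """
    counted = [r for r in ratings if r > 0]
    return sum(counted) / len(counted) if counted else None


def employee_averages(rows):
    """rows: (employee, ratings) pairs. AVG skips NULLs; ISNULL(..., 0) fills in."""
    per_employee = {}
    for employee, ratings in rows:
        per_employee.setdefault(employee, []).append(row_average(ratings))
    result = {}
    for employee, averages in per_employee.items():
        known = [a for a in averages if a is not None]
        result[employee] = sum(known) / len(known) if known else 0
    return result


rows = [
    (1, (5, 4, 5, 1, 4)),
    (1, (5, 4, 5, 1, 0)),
    (3, (0, 0, 0, 0, 0)),
]
averages = employee_averages(rows)
```

Here zeros are still excluded from the averages, so only the all-zero employee ends up with 0.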
Use
isnull(Rating1, 0)
or
coalesce(Rating1, 0)
If Rating1 is null then it will be replaced with the default value 0. Similarly, you can check the other fields. The COALESCE and ISNULL T-SQL functions both return the first non-null expression among their arguments, but there are a few differences between them.
There is no advantage in using a CTE for this. You are unpivoting the non-normalized table using cross apply and values. To arrive at the equivalent of the unpivot command you use WHERE ... IS NOT NULL but presumably you are ensuring all 5 values are present to avoid bias in the calculations, so I would suggest this:
SELECT
employee
, AVG(isnull(rating,0)) AS avg_rating
FROM CSEReduxResponses c
CROSS APPLY (
VALUES (CAST(c.rating1 AS float))
, (CAST(c.rating2 AS float))
, (CAST(c.rating3 AS float))
, (CAST(c.rating4 AS float))
, (CAST(c.rating5 AS float))
) AS ca1 (rating)
WHERE approveddate >= '20140601'
AND approveddate < '20140701'
GROUP BY
employee
;
results from your sample data:
| EMPLOYEE | AVG_RATING |
|----------|------------|
| 1 | 3.4 |
| 3 | 0 |
see: http://sqlfiddle.com/#!3/f91d9/2
Also: I would encourage you NOT to use functions on the data to facilitate a where condition. Avoiding functions on data allows the use of indexes. Here all you need is a simple date range (I have used YYYYMMDD as it is the safest format for SQL Server).

Sort string as number in sql server

I have a column that contains data like this. Dashes indicate multiple copies of the same invoice, and these have to be sorted in ascending order:
790711
790109-1
790109-11
790109-2
i have to sort it in increasing order by this number but since this is a varchar field it sorts in alphabetical order like this
790109-1
790109-11
790109-2
790711
in order to fix this i tried replacing the -(dash) with empty and then casting it as a number and then sorting on that
select cast(replace(invoiceid,'-','') as decimal) as invoiceSort...............order by invoiceSort asc
while this is better and sorts like this
invoiceSort
790711 (790711) <-----this is wrong now as it should come later than 790109
790109-1 (7901091)
790109-2 (7901092)
790109-11 (79010911)
Someone suggested that I split the invoice id on the - (dash) and order by the 2 split parts
like=====> order by split1 asc,split2 asc (790109,1)
which would work I think, but how would I split the column?
The various split functions on the internet return a table, while in this case I would need a scalar function.
Are there any other approaches that can be used? The data is shown in a grid view and grid view doesn't support sorting on 2 columns by default (I can implement it though :) ) so if there are any simpler approaches, that would be very nice.
EDIT: thanks for all the answers. While every answer is correct, I have chosen the answer which allowed me to incorporate these columns in the GridView sorting with minimum refactoring of the SQL queries.
Judicious use of REVERSE, CHARINDEX, and SUBSTRING, can get us what we want. I have used hopefully-explanatory columns names in my code below to illustrate what's going on.
Set up sample data:
DECLARE @Invoice TABLE (
InvoiceNumber nvarchar(10)
);
INSERT @Invoice VALUES
('790711')
,('790709-1')
,('790709-11')
,('790709-21')
,('790709-212')
,('790709-2')
SELECT * FROM @Invoice
Sample data:
InvoiceNumber
-------------
790711
790709-1
790709-11
790709-21
790709-212
790709-2
And here's the code. I have a nagging feeling the final expressions could be simplified.
SELECT
InvoiceNumber
,REVERSE(InvoiceNumber)
AS Reversed
,CHARINDEX('-',REVERSE(InvoiceNumber))
AS HyphenIndexWithinReversed
,SUBSTRING(REVERSE(InvoiceNumber),1+CHARINDEX('-',REVERSE(InvoiceNumber)),LEN(InvoiceNumber))
AS ReversedWithoutAffix
,SUBSTRING(InvoiceNumber,1+LEN(SUBSTRING(REVERSE(InvoiceNumber),1+CHARINDEX('-',REVERSE(InvoiceNumber)),LEN(InvoiceNumber))),LEN(InvoiceNumber))
AS AffixIncludingHyphen
,SUBSTRING(InvoiceNumber,2+LEN(SUBSTRING(REVERSE(InvoiceNumber),1+CHARINDEX('-',REVERSE(InvoiceNumber)),LEN(InvoiceNumber))),LEN(InvoiceNumber))
AS AffixExcludingHyphen
,CAST(
SUBSTRING(InvoiceNumber,2+LEN(SUBSTRING(REVERSE(InvoiceNumber),1+CHARINDEX('-',REVERSE(InvoiceNumber)),LEN(InvoiceNumber))),LEN(InvoiceNumber))
AS int)
AS AffixAsInt
,REVERSE(SUBSTRING(REVERSE(InvoiceNumber),1+CHARINDEX('-',REVERSE(InvoiceNumber)),LEN(InvoiceNumber)))
AS WithoutAffix
FROM @Invoice
ORDER BY
-- WithoutAffix
REVERSE(SUBSTRING(REVERSE(InvoiceNumber),1+CHARINDEX('-',REVERSE(InvoiceNumber)),LEN(InvoiceNumber)))
-- AffixAsInt
,CAST(
SUBSTRING(InvoiceNumber,2+LEN(SUBSTRING(REVERSE(InvoiceNumber),1+CHARINDEX('-',REVERSE(InvoiceNumber)),LEN(InvoiceNumber))),LEN(InvoiceNumber))
AS int)
Output:
InvoiceNumber Reversed   HyphenIndexWithinReversed ReversedWithoutAffix AffixIncludingHyphen AffixExcludingHyphen AffixAsInt  WithoutAffix
------------- ---------- ------------------------- -------------------- -------------------- -------------------- ----------- ------------
790709-1      1-907097   2                         907097               -1                   1                    1           790709
790709-2      2-907097   2                         907097               -2                   2                    2           790709
790709-11     11-907097  3                         907097               -11                  11                   11          790709
790709-21     12-907097  3                         907097               -21                  21                   21          790709
790709-212    212-907097 4                         907097               -212                 212                  212         790709
790711        117097     0                         117097                                                         0           790711
Note that all you actually need is the ORDER BY clause, the rest is just to show my working, which goes like this:
Reverse the string, find the hyphen, get the substring after the hyphen, reverse that part: This is the number without any affix
The length of (the number without any affix) tells us how many characters to drop from the start in order to get the affix including the hyphen. Drop an additional character to get just the numeric part, and convert this to int. Fortunately we get a break from SQL Server in that this conversion gives zero for an empty string.
Finally, having got these two pieces, we simply ORDER BY (the number without any affix) and then by (the numeric value of the affix). This is the final order we seek.
The code would be more concise if SQL Server allowed us to say SUBSTRING(value, start) to get the string starting at that point, but it doesn't, so we have to say SUBSTRING(value, start, LEN(value)) a lot.
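The two-part ordering worked out above can be sketched in Python as a sort key. This is an illustration rather than a port of the T-SQL: the name invoice_sort_key is made up, and unlike the query it also converts the part before the hyphen to a number.

```python
def invoice_sort_key(invoice_number):
    """Split on the first hyphen and compare (number, affix) numerically."""
    base, _, affix = invoice_number.partition("-")
    # A missing affix counts as 0, like CAST('' AS int) in SQL Server.
    return (int(base), int(affix) if affix else 0)


invoices = ["790711", "790709-1", "790709-11", "790709-21", "790709-212", "790709-2"]
ordered = sorted(invoices, key=invoice_sort_key)
```

Sorting on a tuple is the Python counterpart of listing two expressions in ORDER BY.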
Try this one -
Query:
DECLARE @Invoice TABLE (InvoiceNumber VARCHAR(10))
INSERT @Invoice
VALUES
('790711')
, ('790709-1')
, ('790709-21')
, ('790709-11')
, ('790709-211')
, ('790709-2')
;WITH cte AS
(
SELECT
InvoiceNumber
, lenght = LEN(InvoiceNumber)
, delimeter = CHARINDEX('-', InvoiceNumber)
FROM @Invoice
)
SELECT InvoiceNumber
FROM cte
CROSS JOIN (
SELECT repl = MAX(lenght - delimeter)
FROM cte
WHERE delimeter != 0
) mx
ORDER BY
SUBSTRING(InvoiceNumber, 1, ISNULL(NULLIF(delimeter - 1, -1), lenght))
, RIGHT(REPLICATE('0', repl) + SUBSTRING(InvoiceNumber, delimeter + 1, lenght), repl)
Output:
InvoiceNumber
-------------
790709-1
790709-2
790709-11
790709-21
790709-211
790711
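The idea behind this query, padding every suffix with leading zeros to the width of the longest one so that a plain string comparison sorts them numerically, can be sketched in Python. The function name sort_with_padded_suffix is an assumption for the example; the SQL achieves the padding with REPLICATE and RIGHT.

```python
def sort_with_padded_suffix(invoice_numbers):
    """Sort by the text before the dash, then by the zero-padded suffix."""
    # Width of the longest suffix, like MAX(lenght - delimeter) in the query.
    width = max((len(n.partition("-")[2]) for n in invoice_numbers), default=0)

    def key(invoice_number):
        base, _, suffix = invoice_number.partition("-")
        return (base, suffix.rjust(width, "0"))

    return sorted(invoice_numbers, key=key)


invoices = ["790711", "790709-1", "790709-21", "790709-11", "790709-211", "790709-2"]
ordered = sort_with_padded_suffix(invoices)
```

Because the first key is compared as text, this relies on the part before the dash having a fixed length, as it does in the sample data.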
Try this
SELECT invoiceid FROM Invoice
ORDER BY
CASE WHEN PatIndex('%[-]%',invoiceid) > 0
THEN LEFT(invoiceid,PatIndex('%[-]%',invoiceid)-1)
ELSE invoiceid END * 1
,CASE WHEN PatIndex('%[-]%',REVERSE(invoiceid)) > 0
THEN RIGHT(invoiceid,PatIndex('%[-]%',REVERSE(invoiceid))-1)
ELSE NULL END * 1
SQLFiddle Demo
The query above uses two CASE expressions:
The first sorts on the part of invoiceid before the dash (e.g. 790109 from 790109-1)
The second sorts on the part of invoiceid after the dash (e.g. 1 from 790109-1)
For detailed understanding check the below SQLfiddle
SQLFiddle Detailed Demo
OR use 'CHARINDEX'
SELECT invoiceid FROM Invoice
ORDER BY
CASE WHEN CHARINDEX('-', invoiceid) > 0
THEN LEFT(invoiceid, CHARINDEX('-', invoiceid)-1)
ELSE invoiceid END * 1
,CASE WHEN CHARINDEX('-', REVERSE(invoiceid)) > 0
THEN RIGHT(invoiceid, CHARINDEX('-', REVERSE(invoiceid))-1)
ELSE NULL END * 1
Ordering by each part separately is the simplest and most reliable way to go, so why look for other approaches? Take a look at this simple query.
select *
from Invoice
order by Convert(int, SUBSTRING(invoiceid, 0, CHARINDEX('-',invoiceid+'-'))) asc,
Convert(int, SUBSTRING(invoiceid, CHARINDEX('-',invoiceid)+1, LEN(invoiceid)-CHARINDEX('-',invoiceid))) asc
Plenty of good answers here, but I think this one might be the most compact order by clause that is effective:
SELECT *
FROM Invoice
ORDER BY LEFT(InvoiceId,CHARINDEX('-',InvoiceId+'-'))
,CAST(RIGHT(InvoiceId,CHARINDEX('-',REVERSE(InvoiceId)+'-'))AS INT)DESC
Demo: - SQL Fiddle
Note, I added the '790709' version to my test, since some of the methods listed here aren't treating the no-suffix version as lesser than the with-suffix versions.
If your invoiceID varies in length, before the '-' that is, then you'd need:
SELECT *
FROM Invoice
ORDER BY CAST(LEFT(InvoiceId,CHARINDEX('-',InvoiceId+'-')-1)AS INT)
,CAST(RIGHT(InvoiceId,CHARINDEX('-',REVERSE(InvoiceId)+'-'))AS INT)DESC
Demo with varying lengths before the dash: SQL Fiddle
My version:
declare @Len int
-- number of digits in the longest suffix; assumes every invoiceid contains a dash
select @Len = (select max (len (invoiceid) - charindex ( '-', invoiceid)) from MyTable)
select
invoiceid ,
cast (SUBSTRING (invoiceid ,1,charindex ( '-', invoiceid )-1) as int) * POWER (10,@Len) +
cast (right(invoiceid, len (invoiceid) - charindex ( '-', invoiceid) ) as int )
from MyTable
You can implement this as a new column to your table:
ALTER TABLE MyTable ADD invoice_numeric_id int null
GO
declare @Len int
-- number of digits in the longest suffix; assumes every invoiceid contains a dash
select @Len = (select max (len (invoiceid) - charindex ( '-', invoiceid)) from MyTable)
UPDATE MyTable
SET invoice_numeric_id = cast (SUBSTRING (invoiceid ,1,charindex ( '-', invoiceid )-1) as int) * POWER (10,@Len) +
cast (right(invoiceid, len (invoiceid) - charindex ( '-', invoiceid) ) as int )
One way is to split InvoiceId into its parts, and then sort on the parts. Here I use a derived table, but it could be done with a CTE or a temporary table as well.
select InvoiceId, InvoiceId1, InvoiceId2
from
(
select
InvoiceId,
substring(InvoiceId, 0, charindex('-', InvoiceId, 0)) as InvoiceId1,
substring(InvoiceId, charindex('-', InvoiceId, 0)+1, len(InvoiceId)) as InvoiceId2
FROM Invoice
) tmp
order by
cast((case when len(InvoiceId1) > 0 then InvoiceId1 else InvoiceId2 end) as int),
cast((case when len(InvoiceId1) > 0 then InvoiceId2 else '0' end) as int)
In the above, InvoiceId1 and InvoiceId2 are the component parts of InvoiceId. The outer select includes the parts, but only for demonstration purposes - you do not need to do this in your select.
The derived table (the inner select) grabs the InvoiceId as well as the component parts. The way it works is this:
When there is a dash in InvoiceId, InvoiceId1 will contain the first part of the number and InvoiceId2 will contain the second.
When there is not a dash, InvoiceId1 will be empty and InvoiceId2 will contain the entire number.
The second case above (no dash) is not optimal because ideally InvoiceId1 would contain the number and InvoiceId2 would be empty. To make the inner select work optimally would decrease the readability of the select. I chose the non-optimal, more readable, approach since it is good enough to allow for sorting.
This is why the ORDER BY clause tests for the length - it needs to handle the two cases above.
Demo at SQL Fiddle
Break the sort into two sections:
SQL Fiddle
MS SQL Server 2008 Schema Setup:
CREATE TABLE TestData
(
data varchar(20)
)
INSERT TestData
SELECT '790711' as data
UNION
SELECT '790109-1'
UNION
SELECT '790109-11'
UNION
SELECT '790109-2'
Query 1:
SELECT *
FROM TestData
ORDER BY
FLOOR(CAST(REPLACE(data, '-', '.') AS FLOAT)),
CASE WHEN CHARINDEX('-', data) > 0
THEN CAST(RIGHT(data, len(data) - CHARINDEX('-', data)) AS INT)
ELSE 0
END
Results:
| DATA |
-------------
| 790109-1 |
| 790109-2 |
| 790109-11 |
| 790711 |
Try:
select invoiceid ... order by Convert(decimal(18, 2), REPLACE(invoiceid, '-', '.'))

SQL if breaking number pattern, mark record?

I have the following query:
SELECT AccountNumber, RptPeriod
FROM dbo.Report
ORDER BY AccountNumber, RptPeriod
I get the following results:
123 200801
123 200802
123 200803
234 200801
344 200801
344 200803
I need to mark the records where the RptPeriod doesn't run consecutively for the account. For example, 344 200803 would have an X next to it since it goes from 200801 to 200803.
This is for about 19321 rows and I want it on a per-company basis: between different companies I don't care what the numbers are, I just want to see where the same company has breaks in the number pattern.
Any Ideas??
Thanks!
OK, this is kind of ugly (double join + anti-join) but it gets the work done, AND is pure portable SQL:
SELECT *
FROM dbo.Report R1
, dbo.Report R2
WHERE R1.AccountNumber = R2.AccountNumber
AND R2.RptPeriod - R1.RptPeriod > 1
-- subsequent NOT EXISTS ensures that R1,R2 rows found are "next to each other",
-- e.g. no row exists between them in the ordering above
AND NOT EXISTS
(SELECT 1 FROM dbo.Report R3
WHERE R1.AccountNumber = R3.AccountNumber
AND R2.AccountNumber = R3.AccountNumber
AND R1.RptPeriod < R3.RptPeriod
AND R3.RptPeriod < R2.RptPeriod
)
Something like this should do it:
-- cte lists all items by AccountNumber and RptPeriod, assigning an ascending integer
-- to each RptPeriod and restarting at 1 for each new AccountNumber
;WITH cte (AccountNumber, RptPeriod, Ranking)
as (select
AccountNumber
,RptPeriod
,row_number() over (partition by AccountNumber order by AccountNumber, RptPeriod) Ranking
from dbo.Report)
-- and then we join each row with each preceding row based on that "Ranking" number
select
This.AccountNumber
,This.RptPeriod
,case
when Prior.RptPeriod is null then '' -- Catches the first row in a set
when Prior.RptPeriod = This.RptPeriod - 1 then '' -- Preceding row's RptPeriod is one less than This row's RptPeriod
else 'x' -- Preceding row's RptPeriod is not exactly one less than This row's RptPeriod
end UhOh
from cte This
left outer join cte Prior
on Prior.AccountNumber = This.AccountNumber
and Prior.Ranking = This.Ranking - 1
(Edited to add comments)
WITH T
AS (SELECT *,
/*Each island of contiguous data will have
a unique AccountNumber,Grp combination*/
RptPeriod - ROW_NUMBER() OVER (PARTITION BY AccountNumber
ORDER BY RptPeriod ) Grp,
/*RowNumber will be used to identify first record
per company, this should not be given an 'X'. */
ROW_NUMBER() OVER (PARTITION BY AccountNumber
ORDER BY RptPeriod ) AS RN
FROM Report)
SELECT AccountNumber,
RptPeriod,
/*Check whether first in group but not first over all*/
CASE
WHEN ROW_NUMBER() OVER (PARTITION BY AccountNumber, Grp
ORDER BY RptPeriod) = 1
AND RN > 1 THEN 'X'
END AS Flag
FROM T
SELECT DISTINCT r.*
FROM report r
LEFT JOIN report r2
ON r2.accountnumber = r.accountnumber
AND r2.rptperiod = r.rptperiod + 1 -- assumes integer periods; adjust for real dates
JOIN report r3
ON r3.accountnumber = r.accountnumber
AND r3.rptperiod > r.rptperiod
WHERE r2.rptperiod IS NULL
I'm not sure of SQL Server's date logic syntax, so the period comparison above assumes integer periods, but hopefully you get the idea. r will be all the records where the next rptPeriod is missing (r2) and there exists at least one greater rptPeriod (r3). The query isn't super straightforward I guess, but if you have an index on the two columns, it'll probably be the most efficient way to get your data.
Basically, you number rows within every account, then, using the row numbers, compare the RptPeriod values for the neighbouring rows.
It is assumed here that RptPeriod is the year and month encoded, for which case the year transition check has been added.
;WITH Report_sorted AS (
SELECT
AccountNumber,
RptPeriod,
rownum = ROW_NUMBER() OVER (PARTITION BY AccountNumber ORDER BY RptPeriod)
FROM dbo.Report
)
SELECT
r1.AccountNumber,
r1.RptPeriod,
-- 200812 -> 200901 differs by 89, so subtract 88 across a year boundary
CASE ISNULL(CASE WHEN r1.RptPeriod / 100 > r2.RptPeriod / 100 THEN -88 ELSE 0 END
+ r1.RptPeriod - r2.RptPeriod, 1)
WHEN 1 THEN ''
ELSE 'X'
END AS Chk
FROM Report_sorted r1
LEFT JOIN Report_sorted r2
ON r1.AccountNumber = r2.AccountNumber AND r1.rownum = r2.rownum + 1
It could be complicated further with an additional check for gaps spanning a year and more, if you need that.
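The neighbouring-row comparison can be sketched in Python to make the year transition explicit. This assumes, as the answer does, that RptPeriod encodes year and month as YYYYMM; the helper names month_index and flag_gaps are invented for the illustration.

```python
def month_index(period):
    """Convert a YYYYMM integer into a running month count."""
    return (period // 100) * 12 + period % 100


def flag_gaps(rows):
    """rows: (account, period) pairs. Returns (account, period, flag) tuples,
    with 'X' where the period does not directly follow the previous one."""
    flagged = []
    previous = {}
    for account, period in sorted(rows):
        last = previous.get(account)
        gap = last is not None and month_index(period) - month_index(last) != 1
        flagged.append((account, period, "X" if gap else ""))
        previous[account] = period
    return flagged


rows = [
    (123, 200801), (123, 200802), (123, 200803),
    (234, 200801),
    (344, 200801), (344, 200803),
]
result = flag_gaps(rows)
```

Converting to a month count first means that December to January counts as consecutive and that gaps of a year or more are flagged without a separate check.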