Calculate total and subtotal without loop? - abap

I am starting to learn the new ABAP syntax, but I have a problem. I want to produce the output below without using the "LOOP" and "AT" statements.
I have internal table like:
Category Amount
AAA 10
AAA 20
BBB 30
CCC 40
CCC 50
CCC 60
I need to display output as:
Category Amount
AAA 10
AAA 20
SUBTOTAL 30
BBB 30
SUBTOTAL 30
CCC 40
CCC 50
CCC 60
SUBTOTAL 150
TOTAL 210
Can anyone help with this?

If your question is about how to build an internal table in memory with constructor expressions (ABAP >= 7.40), rather than rendering it on the screen or in a spool file (totals and subtotals are well integrated in ALV and easy to use there), then here's one way to do it (the ASSERT is there to show that the final value is as expected):
TYPES : BEGIN OF ty_line,
category TYPE string,
amount TYPE decfloat16,
END OF ty_line,
ty_lines TYPE STANDARD TABLE OF ty_line WITH DEFAULT KEY.
DATA(gt_main) = VALUE ty_lines( ( category = 'AAA' amount = 10 )
( category = 'AAA' amount = 20 )
( category = 'BBB' amount = 30 )
( category = 'CCC' amount = 40 )
( category = 'CCC' amount = 50 )
( category = 'CCC' amount = 60 ) ).
DATA(lt_display) = VALUE ty_lines(
( LINES OF VALUE #(
FOR GROUPS <g> OF <line> IN gt_main
GROUP BY ( category = <line>-category )
( LINES OF VALUE #( FOR <line2> IN GROUP <g> ( <line2> ) ) )
( category = 'SUBTOTAL'
amount = REDUCE #( INIT subtotal TYPE ty_line-amount
FOR <line2> IN GROUP <g>
NEXT subtotal = subtotal + <line2>-amount ) ) ) )
( category = 'TOTAL'
amount = REDUCE #( INIT total TYPE ty_line-amount
FOR <line> IN gt_main
NEXT total = total + <line>-amount ) ) ).
ASSERT lt_display = VALUE ty_lines( ( category = 'AAA' amount = 10 )
( category = 'AAA' amount = 20 )
( category = 'SUBTOTAL' amount = 30 )
( category = 'BBB' amount = 30 )
( category = 'SUBTOTAL' amount = 30 )
( category = 'CCC' amount = 40 )
( category = 'CCC' amount = 50 )
( category = 'CCC' amount = 60 )
( category = 'SUBTOTAL' amount = 150 )
( category = 'TOTAL' amount = 210 ) ).
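For readers who want to check the expected result outside an SAP system, here is a minimal Python sketch of the same idea: emit each group's lines, then a SUBTOTAL line, then one TOTAL line at the end. The function name `with_subtotals` is made up for this illustration. Note one difference from the ABAP above: `itertools.groupby` only groups consecutive rows, so this sketch assumes the input is already ordered by category.

```python
from itertools import groupby

rows = [("AAA", 10), ("AAA", 20), ("BBB", 30),
        ("CCC", 40), ("CCC", 50), ("CCC", 60)]


def with_subtotals(rows):
    """Return the rows with a SUBTOTAL line after each category and a final TOTAL line."""
    out = []
    # groupby only merges consecutive equal keys: input must be ordered by category.
    for _category, group in groupby(rows, key=lambda row: row[0]):
        members = list(group)
        out.extend(members)
        out.append(("SUBTOTAL", sum(amount for _, amount in members)))
    out.append(("TOTAL", sum(amount for _, amount in rows)))
    return out


display = with_subtotals(rows)
for category, amount in display:
    print(category, amount)
```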

I wrote the code below.
TYPES: LTY_DISPLAY TYPE STANDARD TABLE OF TY_DISPLAY WITH EMPTY KEY.
LT_DISPLAY = REDUCE LTY_DISPLAY
( INIT LIST = VALUE LTY_DISPLAY( )
SUBTOTAL = VALUE LTY_DISPLAY( )
TOTAL = VALUE LTY_DISPLAY( )
LV_TEXT TYPE STRING
FOR GROUPS <GROUP_KEY> OF <WA> IN GT_MAIN GROUP BY ( CATEGORY = <WA>-CATEGORY ) ASCENDING
NEXT LV_TEXT = <GROUP_KEY>-CATEGORY
LIST = VALUE LTY_DISPLAY( BASE SUBTOTAL FOR <WA1> IN GROUP <GROUP_KEY> ( <WA1> ) )
SUBTOTAL = VALUE LTY_DISPLAY( BASE LIST ( CATEGORY = 'SUBTOTAL' && LV_TEXT
AMOUNT = REDUCE #( INIT SUM TYPE P
FOR M IN GROUP <GROUP_KEY>
NEXT SUM = SUM + M-AMOUNT ) ) )
TOTAL = VALUE LTY_DISPLAY( BASE SUBTOTAL ( CATEGORY = 'TOTAL'
AMOUNT = REDUCE #( INIT SUM TYPE P
FOR M IN GT_MAIN
NEXT SUM = SUM + M-AMOUNT ) ) ) ).

If you don't want to loop through the internal table using loop at, then you can always use a function module that will go through the internal table for you and print the totals and subtotals.
REUSE_ALV_GRID_DISPLAY is one such function module.
You can check the following link for a tutorial: http://www.saphub.com/abap-tutorial/abap-alv-total-subtotal/
The above is implemented in a procedural programming paradigm and is newer than the original way of making reports in SAP (where you would use print statements).
However, an even newer way would be to use the object-oriented implementation of ALV reports. Look into CL_GUI_ALV_GRID.
Here is an introductory article on CL_GUI_ALV_GRID:
https://wiki.scn.sap.com/wiki/display/ABAP/OBJECT+ORIENTED+ALV+Guide

Related

Apache Pig Group by and Filter if multiple values exist?

I am trying to group multiple rows with the same IDs, and then check for each tuple in the group if it contains both values, for example:
(10461 , 55 )
(10435 , 17 )
(10435 , 11 )
(10435 , 72 )
(10437 , 11 )
(10830 , 72 )
After I group it via: groupedData = group dataPoints by data_id;
I get :
(10461 ,{(10461 , 55)})
(10435 ,{(10435 , 17),(10435 , 11),(10435 , 72)})
I want to filter and get the value of 10435 if it contains 17 and 11.
You can use a nested FOREACH to filter the bags, and then check for empty bags. Note I'm not sure what you've called fields with the numbers (55, 17, 11 etc.) so this is value in the code below - replace as needed!
filteredBags = FOREACH groupedData {
seventeen = FILTER dataPoints BY value == 17;
eleven = FILTER dataPoints BY value == 11;
GENERATE
group AS data_id,
seventeen,
eleven;
}
nonNullBags = FILTER filteredBags BY NOT IsEmpty(seventeen) AND NOT IsEmpty(eleven);
finalIds = FOREACH nonNullBags GENERATE data_id;
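The same logic can be checked quickly outside Pig. Below is a minimal Python sketch, using the sample tuples from the question, that collects the values seen for each ID and keeps only the IDs whose set contains every required value. The function name `ids_with_all` is an assumption made for this illustration.

```python
from collections import defaultdict

data_points = [(10461, 55), (10435, 17), (10435, 11),
               (10435, 72), (10437, 11), (10830, 72)]


def ids_with_all(points, required):
    """Return the sorted IDs whose grouped values include every value in `required`."""
    values_by_id = defaultdict(set)
    for data_id, value in points:
        values_by_id[data_id].add(value)
    # `required <= values` is the subset test: all required values are present.
    return sorted(i for i, values in values_by_id.items() if required <= values)


final_ids = ids_with_all(data_points, {17, 11})
print(final_ids)
```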

SQL Server : split row into multiple rows based on a column value

I have a question regarding splitting rows based on column value
My example data set is :
id ExpenseType Price
------------------------
1 Car 100
2 Hotel 50
I want to split rows that have certain expense types, such as Car, into two rows. Other rows should remain as a single row.
First row: Price * 0.7
Second row: Price * 0.3
Returned dataset should be
id ExpenseType Price
-----------------------
1 Car 70
1 Car 30
2 Hotel 50
Thanks for your answers in advance
If you want to split more expense types than car you could use:
WITH r AS (
SELECT 'Car' AS ExpenseType, 0.7 AS Ratio
UNION SELECT 'Car' AS ExpenseType, 0.3 AS Ratio
-- add more ExpenseTypes/Ratios here
)
SELECT
t.id,
t.ExpenseType,
t.Price * ISNULL(r.Ratio, 1.0) AS Price
FROM
your_table t
LEFT OUTER JOIN r ON t.ExpenseType = r.ExpenseType
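To make the ratio-table idea concrete, here is a minimal Python sketch of the same join: each expense type maps to a list of ratios, and a type with no entry falls back to a single ratio of 1, like the ISNULL in the query. The names `RATIOS` and `split_rows` are assumptions for this illustration.

```python
from decimal import Decimal

# Ratio "table": one output row per ratio. Types not listed stay as one row.
RATIOS = {"Car": [Decimal("0.7"), Decimal("0.3")]}


def split_rows(rows, ratios=RATIOS):
    """Expand (id, expense_type, price) rows into one row per configured ratio."""
    out = []
    for row_id, expense_type, price in rows:
        for ratio in ratios.get(expense_type, [Decimal("1")]):
            out.append((row_id, expense_type, price * ratio))
    return out


result = split_rows([(1, "Car", Decimal("100")), (2, "Hotel", Decimal("50"))])
for row in result:
    print(row)
```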
A simple way uses union all:
select id, expensetype, price
from t
where expensetype <> 'Car'
union all
select id, expensetype, price * 0.7
from t
where expensetype = 'Car'
union all
select id, expensetype, price * 0.3
from t
where expensetype = 'Car';
This is not the most efficient method. For that, a cross apply with filtering logic is better:
select t.id, t.expensetype, v.price
from t cross apply
(values (NULL, t.price), ('Car', t.price * 0.3), ('Car', t.price * 0.7)
) v(expensetype, price)
where v.expensetype = t.expensetype or
(v.expensetype is null and t.expensetype <> 'Car');
A less simple way is to use an OUTER APPLY
CREATE TABLE YourSampleData
(
Id INT IDENTITY(1,1) PRIMARY KEY,
ExpenseType VARCHAR(30) NOT NULL,
Price INT NOT NULL DEFAULT 0
);
INSERT INTO YourSampleData
(ExpenseType, Price) VALUES
('Car', 100)
,('Hotel', 50)
,('Gold', 1)
;
SELECT Id, ExpenseType
, COALESCE(a.Price, t.Price) AS Price
FROM YourSampleData t
OUTER APPLY
(
SELECT Price * Perc AS Price
FROM (VALUES
('Car',0.3E0), ('Car',0.7E0)
,('Gold',1.618E0)
) AS v(ExpType, Perc)
WHERE t.ExpenseType = v.ExpType
) a
GO
Id | ExpenseType | Price
-: | :---------- | ----:
1 | Car | 30
1 | Car | 70
2 | Hotel | 50
3 | Gold | 1.618
I ran into a similar need, here is my solution.
Problem statement:
My organization is switching from a system built in-house to a third-party system. Numerical values in the original system exceed the value size that the destination system can handle. The third-party system will not allow us to increase the field size, so we need to split the data into values that do not exceed the field size limit.
Details:
Destination system can only support values under 1 billion (can include a negative sign)
Example:
DROP TABLE IF EXISTS #MyDemoData /* Generate some fake data for the demo */
SELECT item_no = 1, item_description = 'zero asset', amount = 0 INTO #MyDemoData
UNION SELECT item_no = 2, item_description = 'small asset', amount = 5100000
UNION SELECT item_no = 3, item_description = 'mid asset', amount = 510000000
UNION SELECT item_no = 4, item_description = 'large asset', amount = 5100000000
UNION SELECT item_no = 5, item_description = 'large debt', amount = -2999999999.99
SELECT * FROM #MyDemoData
DECLARE @limit_size INT = 1000000000
DROP TABLE IF EXISTS #groupings;
WITH
max_groups AS
(
SELECT max_groups=100
)
,groups AS
(
SELECT 1 AS [group]
UNION ALL
SELECT [group]+1
FROM groups
JOIN max_groups ON 1=1
WHERE [group]+1<=max_groups
)
,group_rows AS
(
SELECT 0 AS [row]
UNION ALL
SELECT [row]+1
FROM group_rows
JOIN max_groups ON 1=1
WHERE [row]+1<=max_groups
)
,groupings AS
(
SELECT [group],[row]
FROM group_rows
CROSS JOIN groups
WHERE [row] <= [group]
)
SELECT * INTO #groupings FROM groupings;
WITH /* Split out items that are under the limit and over the limit */
t1 AS /* Identify rows that are over the limit and by how many multiples */
(
SELECT
item_no
, item_description
, amount
, over_limit = FLOOR(ABS(amount/@limit_size))
FROM #MyDemoData
)
SELECT /* select the items that are under the limit and do not need to be manipulated */
item_no
, item_description
, amount = CAST(amount AS DECIMAL(16,2))
FROM t1
WHERE ABS([amount]) < @limit_size
UNION ALL /* select the items that are over the limit, join on the groupings table and calculate the split amounts */
SELECT
item_no
, item_description
, [Amount] = CAST(
CASE
WHEN bg.[row] != 0 THEN (@limit_size-1) * ([amount]/ABS([amount]))
ELSE (ABS([amount]) - (t1.over_limit * @limit_size) + t1.over_limit) * ([amount]/ABS([amount]))
END AS DECIMAL(16,2))
FROM t1
JOIN #groupings bg ON t1.over_limit = bg.[group]
WHERE ABS([amount]) >= @limit_size
ORDER BY item_no
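The splitting rule can also be sketched in a few lines of Python. This is not the closed-form calculation used in the query above but a plain greedy loop, swapped in because it is easier to reason about: keep peeling off pieces of (limit - 1) until what is left is under the limit, and carry the sign of the original amount on every piece. The names `LIMIT` and `split_amount` are assumptions for this illustration.

```python
from decimal import Decimal

LIMIT = Decimal("1000000000")  # destination field only accepts |value| < LIMIT


def split_amount(amount, limit=LIMIT):
    """Split `amount` into same-signed pieces whose absolute values are all below `limit`."""
    amount = Decimal(amount)
    rest = abs(amount)
    sign = -1 if amount < 0 else 1
    chunk = limit - 1
    pieces = []
    # Greedy: take full-size chunks until the remainder fits under the limit.
    while rest >= limit:
        pieces.append(sign * chunk)
        rest -= chunk
    pieces.append(sign * rest)
    return pieces


for demo in ("0", "5100000", "5100000000", "-2999999999.99"):
    print(demo, split_amount(demo))
```

Whatever rule is used, two properties are worth asserting in a test: the pieces sum back to the original amount, and every piece stays strictly under the limit.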

Select column value that matches a combination of other columns values on the same table

I have a table called Ads and another Table called AdDetails to store the details of each Ad in a Property / Value style, Here is a simplified example with dummy code:
[AdDetailID], [AdID], [PropertyName], [PropertyValue]
2 28 Color Red
3 28 Speed 100
4 27 Color Red
5 28 Fuel Petrol
6 27 Speed 70
How to select Ads that matches many combinations of PropertyName and PropertyValue, for example :
where PropertyName='Color' and PropertyValue='Red'
And
where PropertyName='Speed' and CAST(PropertyValue AS INT) > 60
You are probably going to do stuff like this a lot so I would start out by making a view that collapses all of the properties to a single row.
create view vDetails
as
select AdID,
max(case PropertyName
when 'Color' then PropertyValue end) as Color,
cast(max(case PropertyName
when 'Speed' then PropertyValue end) as Int) as Speed,
max(case PropertyName
when 'Fuel' then PropertyValue end) as Fuel
from AdDetails
group by AdID
This approach also solves the problem with casting Speed to an int.
Then you can select * from vDetails and get one row per ad.
This makes it easy to deal with when joined to the parent table. You said you needed a variable number of "matches"; note the where clause below. @MatchesNeeded would be the count of the variables that are not null.
select *
from Ads a
inner join vDetails v
on a.AdID = v.AdID
where case when v.Color = @Color then 1 else 0 end +
case when v.Speed > @Speed then 1 else 0 end +
case when v.Fuel = @Fuel then 1 else 0 end = @MatchesNeeded
I think you have two main problems to solve here.
1) You need to be able to CAST varchar values to integers where some values won't be integers.
If you were using SQL 2012, you could use TRY_CAST() ( sql server - check to see if cast is possible ). Since you are using SQL 2008, you will need a combination of CASE and ISNUMERIC().
2) You need an efficient way to check for the existence of multiple properties.
I often see a combination of joins and where clauses for this, but I think this can quickly get messy as the number of properties that you check gets over... say one. Instead, using an EXISTS clause tends to be neater and I think it provides better clues to the SQL Optimizer instead.
SELECT AdID
FROM Ads
WHERE 1 = 1
AND EXISTS (
SELECT 1
FROM AdDetails
WHERE AdID = Ads.AdID
AND ( PropertyName='Color' and PropertyValue='Red' )
)
AND EXISTS (
SELECT 1
FROM AdDetails
WHERE AdID = Ads.AdID
AND PropertyName='Speed'
AND
(
CASE
WHEN ISNUMERIC(PropertyValue) = 1
THEN CAST(PropertyValue AS INT)
ELSE 0
END
)
> 60
)
You can add as many EXISTS clauses as you need without the query getting particularly difficult to read.
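As a cross-check of the logic, here is a minimal Python sketch that applies one predicate per property, mirroring one EXISTS clause per property. It uses the sample rows from the question. The helper names `to_int` and `matching_ads` are assumptions for this illustration; `to_int` plays the role of the CASE/ISNUMERIC guard by falling back to 0 for non-numeric text.

```python
ad_details = [
    (2, 28, "Color", "Red"),
    (3, 28, "Speed", "100"),
    (4, 27, "Color", "Red"),
    (5, 28, "Fuel", "Petrol"),
    (6, 27, "Speed", "70"),
]


def to_int(text, default=0):
    """Convert text to int, returning `default` when the text is not an integer."""
    try:
        return int(text)
    except ValueError:
        return default


def matching_ads(details, predicates):
    """Return sorted AdIDs that have, for every named property, a value passing its test."""
    props_by_ad = {}
    for _detail_id, ad_id, name, value in details:
        props_by_ad.setdefault(ad_id, {})[name] = value
    return sorted(
        ad_id
        for ad_id, props in props_by_ad.items()
        if all(name in props and test(props[name]) for name, test in predicates.items())
    )


fast_red = matching_ads(ad_details, {
    "Color": lambda v: v == "Red",
    "Speed": lambda v: to_int(v) > 60,
})
print(fast_red)
```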
Something like this might work for 2 conditions, you would have to adapt depending on the number of conditions
select a.*
from ads as a
join addetails as d1 on d1.adid = a.id
join addetails as d2 on d2.adid = a.id
where (d1.PropertyName='Color' and d1.PropertyValue='Red')
and (d2.PropertyName='Speed' and CAST(d2.PropertyValue AS INT) > 60)
DECLARE @AdDetails TABLE
(
AdDetailID INT,
AdID INT,
PropertyName VARCHAR(20),
PropertyValue VARCHAR(20)
)
INSERT INTO @AdDetails
( AdDetailID, AdID, PropertyName, PropertyValue )
VALUES
(2, 28, 'Color', 'Red'),
(3, 28, 'Speed', '100'),
(4, 27, 'Color', 'Red'),
(5, 28, 'Fuel', 'Petrol'),
(6, 27, 'Speed', '70');
--Col1
DECLARE @ColorValue VARCHAR(20) = 'Red'
--Col2
DECLARE @SpeedValue INT = 90
DECLARE @SpeedType VARCHAR(2) = '>'
--Col3
DECLARE @FuelValue VARCHAR(20) = null
SELECT DISTINCT a.AdID FROM @AdDetails a
INNER JOIN
(
SELECT *
FROM @AdDetails
WHERE @ColorValue IS NULL
OR (PropertyName = 'Color' AND PropertyValue = @ColorValue)
) Color
ON Color.AdID = a.AdID
INNER JOIN
(
SELECT *
FROM @AdDetails
WHERE @SpeedType IS NULL
UNION
SELECT *
FROM @AdDetails
WHERE PropertyName = 'Speed'
AND ((@SpeedType = '>' AND CONVERT(INT, PropertyValue) > @SpeedValue)
OR (@SpeedType = '<' AND CONVERT(INT, PropertyValue) < @SpeedValue)
OR (@SpeedType = '=' AND CONVERT(INT, PropertyValue) = @SpeedValue))
) AS Speed
ON Speed.AdID = a.AdID
INNER JOIN
(
SELECT *
FROM @AdDetails
WHERE @FuelValue IS NULL
OR (PropertyName = 'Fuel' AND PropertyValue = @FuelValue)
) AS Fuel
ON Fuel.AdID = a.AdID
I add one inner join per property type (with some overrides). Your SQL query would pass all of the possible property filters in one go, setting to NULL whichever ones are not wanted. The code gets very ugly as it grows, though.

How to distribute assumed data across a population?

My database contains Properties. Some of them have been surveyed and some haven't. Based on the survey we can calculate the costs that will be incurred for a surveyed property.
Then, when a property has not been surveyed, we want to assume the costs for that property will be the same as for a similar property that has been surveyed.
So we look for matching properties in order to choose a "clone".
If the property is in a block then we look for surveyed properties in the same block, if we don't find any then we look in the same postcode area then we look in the same street etc.
If there is more than one matching property in the block we don't want to use the same property to clone all the unsurveyed properties so we rotate surveyed properties as clones.
For example, say we have 5 properties in a block and P1 and P2 have been surveyed. P3 should use P1 as a clone, P4 should use P2 as a clone and P5 should use P1 as a clone.
So the total cost for the block will be 3 * P1.GetCost() + 2 * P2.GetCost()
I have written code that identifies a clone on this basis for a single property. But I need to produce a report that will summarise the costs potentially over several thousand properties. So I think that I will need to create a view in the database to optimise this.
My problem is that I can't figure out how to work out how many times each surveyed property will be cloned across the entire population. Can anyone suggest a technique I can apply?
EDIT
Test sql based on the answer from anon. This gets me the count of matching properties for each unsurveyed property, but I want the count of unsurveyed properties that I need to add to each surveyed property to get the cost multiplier:
IF EXISTS (SELECT
*
FROM sys.objects
WHERE object_id = OBJECT_ID(N'dbo.PropertyTest') AND type IN (N'U'))
DROP TABLE dbo.propertytest
GO
IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[D]') AND type in (N'FN', N'IF', N'TF', N'FS', N'FT'))
DROP FUNCTION [dbo].[D]
GO
CREATE TABLE dbo.PropertyTest
(
ID int NOT NULL,
BlockID int NULL,
PostCode nvarchar(50) NULL,
StreetName nvarchar(50) NULL,
IsSurveyed bit NOT NULL,
Cost decimal(18, 0) NULL
)
GO
ALTER TABLE dbo.PropertyTest ADD CONSTRAINT
PK_PropertyTest PRIMARY KEY CLUSTERED
(
ID
)
GO
CREATE function D(@surveyedid int, @unsurveyedid int)
returns table as
return
(
select case when
(SELECT u.blockid FROM propertytest u WHERE id = @unsurveyedid) = (SELECT s.blockid FROM propertytest s where id = @surveyedid)
then 1
when
(SELECT u.postcode FROM propertytest u WHERE id = @unsurveyedid) = (SELECT s.postcode FROM propertytest s where id = @surveyedid)
then 2
else
null
end as Distance
)
GO
INSERT INTO propertytest (id
, blockid
, postcode
, issurveyed
, cost
, StreetName)
SELECT
1, 1,'G20 6DJ', 1,20, 'Doune Gardens'
UNION
SELECT 2, 1, 'G20 6DJ', 1,30 , 'Doune Gardens'
UNION
SELECT 3, 1, 'G20 6DJ', 0, NULL , 'Doune Gardens'
UNION
SELECT 4, 1, 'G20 6DJ', 0, NULL , 'Doune Gardens'
UNION
SELECT 5, 1, 'G20 6DJ', 0, NULL , 'Doune Gardens'
UNION
SELECT 6, null, 'G20 6DJ', 0, NULL, 'Doune Gardens'
UNION
SELECT 7, null, 'G20 6BS', 0, NULL, 'Wilton Street'
UNION
SELECT 8, 1, 'G20 6BT', 0, NULL, 'Wilton Street'
SELECT
* INTO #s
FROM propertytest
WHERE issurveyed = 1
SELECT
* INTO #u
FROM propertytest
WHERE issurveyed = 0
--This is close to anon's suggestion
--with the current function it returns the count of surveyed properties that match an unsurveyed property
SELECT
#u.id,
COUNT(*)
FROM #s
CROSS JOIN #u
CROSS APPLY D(#S.ID,#U.ID) AS D
GROUP BY #u.id, D.Distance
HAVING D.Distance = MIN(D.Distance)
--I think this is closer to what I want
--with the current function it returns the total number
--of unsurveyed properties that match a surveyed property
--so P1 and P2 both match 3 in the same block
--Now I need P1 to act as proxy for 2 of them and P2 to act as proxy for 1 of them
SELECT
#s.id, D.Distance, COUNT(*)
FROM #s
CROSS JOIN #u
CROSS APPLY D(#S.ID,#U.ID) AS D
GROUP BY #s.id, D.Distance
HAVING D.Distance = MIN(D.Distance)
DROP TABLE #s
DROP TABLE #u
This is a simplified version of my Linq-to-entities code that does the matching. The GetMatch method is where I rotate the matching properties using the modulus. So in the above example we have 2 matchingProperties and 3 unallocated. If the unsurveyed property is at index 3 in unallocated, then its clone is at index 1 in matchingProperties. But I can't see this working across an entire population, so I am seeking inspiration for a different approach.
public class Property
{
public int ID {get; set;}
public int? BlockID {get; set;}
public Block Block { get; set; }
public string PostCode { get; set; }
public bool IsSurveyed { get; set; }
public decimal? GetCost()
{
//code to sum costs
}
}
private static Property GetMatch(Property property,
Func<Property, bool> matchFunction,
IQueryable<Property> surveyed, IQueryable<Property> unsurveyed)
{
var matchingProperties = surveyed.Where(matchFunction).OrderBy(p => p.ID);
int count = matchingProperties.Count();
Property match;
if (count == 1)
{
match = matchingProperties.First();
}
else if (count > 1)
{
//there is more than one property to match
//unallocated is the number of unsurveyed properties
//that match the criteria and they are ordered by id
//to ensure consistent allocation
var unallocated = unsurveyed.Where(matchFunction)
.OrderBy(p => p.ID)
.ToList();
//we want to match the first unallocated with the first matched,
//second with second but we must rotate through the matches,
//so use modulus
int index = unallocated.IndexOf(property) % count;
if (index < 0)
throw new InvalidOperationException
(@"The unsurveyed properties must include
the property we want to clone");
match = matchingProperties.ElementAt(index);
//the property to index is a
}
else
match = null;
return match;
}
private Property GetClone(Property property, out string cloneStatus)
{
IQueryable<Property> surveyed = _Uow.PropertyRepository.All.Where(p => p.IsSurveyed);
IQueryable<Property> unsurveyed = _Uow.PropertyRepository.All.Where(p => !p.IsSurveyed);
if (property.Block != null)
{
Property match = GetMatch(property,
c => c.BlockID == property.Block.ID,
surveyed, unsurveyed);
if (match != null)
{
cloneStatus = "Cloned from same block: "
+ match.GetFullAddress(" ", false);
return match;
}
}
if (!String.IsNullOrEmpty(property.PostCode))
{
Property match = GetMatch(property,
c => c.PostCode == property.PostCode, surveyed, unsurveyed);
if (match != null)
{
cloneStatus = "Cloned from same postcode: "
+ match.GetFullAddress(" ", false);
return match;
}
}
//no surveyed property matched at any level
cloneStatus = null;
return null;
}
My approach is to use row numbers to match un-surveyed to surveyed properties, so for instance I would match the 1st un-surveyed row with the 1st surveyed row. I use a mod of the number of survey rows so that for instance the 4th un-surveyed row would match the 1st surveyed row if there were only 3 surveyed rows.
My query has the advantage of being able to modify it slightly to return the number of times a surveyed property has been matched.
EDITED FOR Streets too:
Here's the main query:
;with SurveyedByBlock
as
(
select Id, BlockID, Cost,
ROW_NUMBER() OVER (PARTITION BY BlockId ORDER BY ID) AS RN,
(SELECT COUNT(*)
FROM PropertyTest P2
WHERE P1.BlockID = P2.BlockID AND P2.IsSurveyed = 1
) AS MaxNumberOfRows
from PropertyTest P1
where issurveyed = 1 AND BlockID IS NOT NULL
),
SurveyedByPostCode
as
(
select Id, PostCode, Cost,
ROW_NUMBER() OVER (PARTITION BY PostCode ORDER BY ID) AS RN,
(SELECT COUNT(*)
FROM PropertyTest P2
WHERE P1.PostCode = P2.PostCode AND P2.IsSurveyed = 1
) AS MaxNumberOfRows
from PropertyTest P1
where issurveyed = 1 AND PostCode IS NOT NULL
),
SurveyedByStreet
AS
(
select Id, StreetName, Cost,
ROW_NUMBER() OVER (PARTITION BY StreetName ORDER BY ID) AS RN,
(SELECT COUNT(*)
FROM PropertyTest P2
WHERE P1.StreetName = P2.StreetName AND P2.IsSurveyed = 1
) AS MaxNumberOfRows
from PropertyTest P1
where issurveyed = 1 AND StreetName IS NOT NULL
),
UnSurveyed
AS
(
SELECT ID, BlockID, PostCode, StreetName, Cost,
ROW_NUMBER() OVER (PARTITION BY BlockId ORDER BY ID) AS BlockRN,
ROW_NUMBER() OVER (PARTITION BY PostCode ORDER BY ID) AS PostCodeRN,
ROW_NUMBER() OVER (PARTITION BY StreetName ORDER BY ID) AS StreetNameRN
FROM PropertyTest
WHERE IsSurveyed = 0
)
SELECT UnSurveyed.Id, UnSurveyed.BlockID, UnSurveyed.PostCode, UnSurveyed.StreetName,
COALESCE(SurveyedByBlock.Cost, SurveyedByPostCode.Cost, SurveyedByStreet.Cost) AS Cost,
COALESCE(SurveyedByBlock.ID, SurveyedByPostCode.ID, SurveyedByStreet.Id) AS SurveyedId
FROM UnSurveyed
LEFT JOIN SurveyedByBlock
ON SurveyedByBlock.BlockID = UnSurveyed.BlockID
AND
((UnSurveyed.BlockRN % SurveyedByBlock.MaxNumberOfRows = SurveyedByBlock.RN )
OR -- unsurveyed row number matches left over row number
-- e.g. if we have 3 surveyed properties that match and this is the 4th row
-- in the unsurveyed properties it will match with the 1st surveyed row
-- 4 mod 3 = 1
(UnSurveyed.BlockRN % SurveyedByBlock.MaxNumberOfRows = 0
AND SurveyedByBlock.RN = SurveyedByBlock.MaxNumberOfRows)
)
LEFT JOIN SurveyedByPostCode
ON SurveyedByPostCode.PostCode = UnSurveyed.PostCode
AND ((UnSurveyed.PostCodeRN % SurveyedByPostCode.MaxNumberOfRows = SurveyedByPostCode.RN )
OR
(UnSurveyed.PostCodeRN % SurveyedByPostCode.MaxNumberOfRows = 0
AND SurveyedByPostCode.RN = SurveyedByPostCode.MaxNumberOfRows)
)
LEFT JOIN SurveyedByStreet
ON SurveyedByStreet.StreetName = UnSurveyed.StreetName
AND ((UnSurveyed.StreetNameRN % SurveyedByStreet.MaxNumberOfRows = SurveyedByStreet.RN )
OR
(UnSurveyed.StreetNameRN % SurveyedByStreet.MaxNumberOfRows = 0
AND SurveyedByStreet.RN = SurveyedByStreet.MaxNumberOfRows)
)
If you wanted to get the number of times that each surveyed property is matched then change the last select statement to:
...
SELECT COALESCE(SurveyedByBlock.ID, SurveyedByPostCode.ID, SurveyedByStreet.Id) AS SurveyedId, COUNT(*)
...
GROUP BY COALESCE(SurveyedByBlock.ID, SurveyedByPostCode.ID, SurveyedByStreet.Id)
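The modulus rotation and the resulting counts can be illustrated with a minimal Python sketch for a single matching level (one block, say). It uses the five-property example from the question: P1 and P2 surveyed, P3 to P5 not. The function names `assign_clones` and `cost_multipliers` are assumptions for this illustration, and the sketch does not attempt the block / postcode / street fallback.

```python
from collections import Counter


def assign_clones(surveyed_ids, unsurveyed_ids):
    """Map each unsurveyed ID to a surveyed ID, rotating through the surveyed ones in ID order."""
    surveyed = sorted(surveyed_ids)
    if not surveyed:
        return {}
    return {
        unsurveyed: surveyed[position % len(surveyed)]
        for position, unsurveyed in enumerate(sorted(unsurveyed_ids))
    }


def cost_multipliers(surveyed_ids, unsurveyed_ids):
    """Return, per surveyed ID, 1 (itself) plus the number of properties cloned from it."""
    clone_counts = Counter(assign_clones(surveyed_ids, unsurveyed_ids).values())
    return {s: 1 + clone_counts[s] for s in sorted(surveyed_ids)}


clones = assign_clones([1, 2], [3, 4, 5])
multipliers = cost_multipliers([1, 2], [3, 4, 5])
print(clones)
print(multipliers)
```

With these multipliers the block total is 3 * P1.GetCost() + 2 * P2.GetCost(), which matches the figure given in the question.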
Two sets: S (surveyed properties) and U (unsurveyed)
Formula D calculates distance from each member of U to S. This tells you how suitable S would be to act as a proxy for U. Shorter distance is better.
For each U, how many members of S are at the minimum distance?
SELECT U,COUNT(S)
FROM S
CROSS JOIN U
CROSS APPLY D(S,U) AS D
GROUP BY U
HAVING D = MIN(D)
--Example distance function
CREATE FUNCTION dbo.D(@s int, @u int)
RETURNS TABLE AS
RETURN
SELECT CASE
WHEN COUNT(DISTINCT block_id ) = 1 THEN 1
WHEN COUNT(DISTINCT postcode ) = 1 THEN 2
WHEN COUNT(DISTINCT street_id) = 1 THEN 3
END AS d
FROM propertytest
WHERE id IN (@s, @u)
GO

Batched output in SQL Server

I have tables like the following:
CREATE TABLE Company
(
Id INT
)
CREATE TABLE CompanyNumbers
(
CompanyId INT,
NumberText VARCHAR (255)
)
What I want as an output is this in pseudo code
Give me all the numbers for company A as a single comma separated string, if the string contains more than 150 numbers output another row with next 150 until complete.
What is the best way to achieve this? basically output batches of 150 numbers like this:
CompanyId | Batch
1 | 3344,444,5555,6444, 444, 44, 44555, 5555... > 150 of them
2 | 33343,33, 2233,3 (second row if more than 150)
I want this to be done within a stored procedure.
WITH cb AS
(
SELECT CompanyId, NumberText, ROW_NUMBER() OVER (PARTITION BY CompanyID ORDER BY NumberText) AS rn
FROM CompanyNumbers
)
SELECT CompanyID, batch,
(
SELECT CASE WHEN rn % 150 = 1 THEN '' ELSE ', ' END + NumberText AS [text()]
FROM cb
WHERE cb.CompanyID = cbd.CompanyID
AND rn BETWEEN cbd.batch * 150 + 1 AND cbd.batch * 150 + 150
ORDER BY rn
FOR XML PATH('')
)
FROM (
SELECT DISTINCT CompanyID, FLOOR((rn - 1) / 150) AS batch
FROM cb
) AS cbd
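The batching arithmetic is easy to verify with a minimal Python sketch: order each company's numbers, cut the list into slices of 150, and join each slice with commas. The function name `batch_rows` is an assumption for this illustration. The numbers are sorted as text, as ORDER BY NumberText would do on a VARCHAR column.

```python
from collections import defaultdict


def batch_rows(rows, size=150):
    """Turn (company_id, number_text) rows into (company_id, batch, joined_text) rows."""
    numbers_by_company = defaultdict(list)
    for company_id, number_text in rows:
        numbers_by_company[company_id].append(number_text)
    out = []
    for company_id in sorted(numbers_by_company):
        ordered = sorted(numbers_by_company[company_id])  # text ordering
        # Batch 0 holds items 1..size, batch 1 the next `size`, and so on.
        for batch, start in enumerate(range(0, len(ordered), size)):
            out.append((company_id, batch, ", ".join(ordered[start:start + size])))
    return out


rows = [(1, str(n)) for n in range(320)] + [(2, "33")]
batches = batch_rows(rows)
for company_id, batch, text in batches:
    print(company_id, batch, len(text.split(", ")))
```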