SQL server count and Sum Query? - sql

I have a Question table with the fields (QuestionID, QuestionMarks), and the data looks like this:
QuestionID QuestionMarks
1 1
2 4
5 1
9 1
12 2
This means that at the moment the Question table has 5 questions worth a total of 9 marks. I want to know whether a combination of 4 questions worth 8 marks is possible, and to fetch that combination (in general: is a combination of "x" questions worth "y" marks possible?).
I was thinking of using a CTE, but I am afraid the query will take a long time to execute if I have tens of thousands of questions.
Please suggest how to get this data. I am using SQL Server 2008.

This is a start. It's going to have poor performance:
declare @Qs table (QuestionID int not null, QuestionMarks int not null)
insert into @Qs (QuestionID,QuestionMarks) values
(1,1), (2,4), (5,1), (9,1), (12,2)
declare @TargetMarks int = 8
declare @TargetCount int = 4
;with Build as (
select QuestionID as MinID,QuestionID as MaxID,QuestionMarks as Total,1 as Cnt
,'/' + CONVERT(varchar(max),QuestionID) + '/' as QPath
from @Qs
union all
select MinID,q.QuestionID,Total+q.QuestionMarks,Cnt+1,QPath + CONVERT(varchar(max),q.QuestionID) + '/'
from
Build b
inner join
@Qs q
on
b.MaxID < q.QuestionID and
b.Total + q.QuestionMarks <= @TargetMarks and
b.Cnt < @TargetCount
)
select * from Build where Cnt = @TargetCount and Total = @TargetMarks
Result set:
MinID MaxID Total Cnt QPath
--------------------------------------------------------------------------------
2 12 8 4 /2/5/9/12/
1 12 8 4 /1/2/9/12/
1 12 8 4 /1/2/5/12/
The tricky part is that the QPath value isn't exactly the greatest way of storing ID values.
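As a cross-check on the result set above, the same search can be written as a brute-force enumeration outside the database. This is only a sketch in Python, not part of the original answer; the names `QUESTIONS` and `find_combinations` are made up for illustration, and like the CTE it will scale poorly on large tables.

```python
from itertools import combinations

# Sample data from the question: QuestionID -> QuestionMarks
QUESTIONS = {1: 1, 2: 4, 5: 1, 9: 1, 12: 2}


def find_combinations(questions, target_count, target_marks):
    """Return every tuple of `target_count` question IDs whose marks sum to `target_marks`."""
    ids = sorted(questions)
    return [
        combo
        for combo in combinations(ids, target_count)
        if sum(questions[q] for q in combo) == target_marks
    ]


if __name__ == "__main__":
    # Print each match in the same /id/id/.../ form as the QPath column
    for combo in find_combinations(QUESTIONS, 4, 8):
        print("/" + "/".join(map(str, combo)) + "/")
```

For the sample data this finds the same three ID sets as the QPath column: 1/2/5/12, 1/2/9/12 and 2/5/9/12.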

I think you're right that tens of thousands of questions can slow down execution, so I'd start by limiting the potential rows being queried. You already know for certain that even with millions of rows, you never need more than four with identical QuestionMarks, and you can reduce this further, e.g. (untested against SQL Server, so treat the syntax as a sketch)
DECLARE @DesiredTotalQuestionMarks int = 8
DECLARE @TotalNoOfQuestions int = 4
;WITH Ranked AS
(SELECT m1.QuestionID, m1.QuestionMarks,
-- running sum and count over rows with the same marks value
(SELECT SUM(m2.QuestionMarks)
FROM MyTable m2
WHERE m1.QuestionMarks = m2.QuestionMarks
AND m1.QuestionID <= m2.QuestionID) AS CurrentMarks,
(SELECT COUNT(*)
FROM MyTable m3
WHERE m1.QuestionMarks = m3.QuestionMarks
AND m1.QuestionID <= m3.QuestionID) AS TotalQuestions
FROM MyTable m1
WHERE m1.QuestionMarks <= @DesiredTotalQuestionMarks - @TotalNoOfQuestions + 1),
LimitPotentialRows AS
-- the computed columns cannot be filtered in the same SELECT, hence the second CTE
(SELECT QuestionID, QuestionMarks
FROM Ranked
WHERE CurrentMarks <= @DesiredTotalQuestionMarks
AND TotalQuestions <= @TotalNoOfQuestions)
SELECT * FROM LimitPotentialRows
Desiring 4 questions with a total of 8 marks, the result of this CTE will leave you with only
QuestionMarks NumberOfQuestions
1 4
2 4
3 2
4 1
5 1
Having limited the number of rows from tens of thousands to a maximum of 12, you're unlikely to have performance problems in your further calculations.
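The per-value limits in the table above can be reproduced with a small calculation. The sketch below is in Python and rests on two assumptions that are not stated in the answer: every question is worth at least 1 mark, and k questions worth m marks are only useful if the remaining x - k questions can still fit at 1 mark each, i.e. k*m + (x - k) <= y. The function names are hypothetical.

```python
def max_useful_per_mark(target_count, target_marks):
    """Map each marks value m to the largest k such that k questions worth m marks
    can still appear together in a selection of target_count questions."""
    caps = {}
    # A single question can be worth at most target_marks - target_count + 1
    for m in range(1, target_marks - target_count + 2):
        k = 0
        # Try one more question of value m; the rest need at least 1 mark each
        while k < target_count and (k + 1) * m + (target_count - (k + 1)) <= target_marks:
            k += 1
        if k:
            caps[m] = k
    return caps


def prune(questions, target_count, target_marks):
    """Keep only as many questions per marks value as could ever be used."""
    caps = max_useful_per_mark(target_count, target_marks)
    kept, used = {}, {}
    for qid in sorted(questions):
        m = questions[qid]
        if used.get(m, 0) < caps.get(m, 0):
            kept[qid] = m
            used[m] = used.get(m, 0) + 1
    return kept


if __name__ == "__main__":
    print(max_useful_per_mark(4, 8))
```

Pruning this way keeps enough rows to decide whether some combination exists, but it discards interchangeable questions, so it does not list every possible combination.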

Related

How do I aggregate numbers from a string column in SQL

I am dealing with a poorly designed database column which has values like this
ID cid Score
1 1 3 out of 3
2 1 1 out of 5
3 2 3 out of 6
4 3 7 out of 10
I want the aggregate sum and percentage of Score column grouped on cid like this
cid sum percentage
1 4 out of 8 50
2 3 out of 6 50
3 7 out of 10 70
How do I do this?
You can try it this way:
select
t.cid
, cast(sum(s.a) as varchar(5)) +
' out of ' +
cast(sum(s.b) as varchar(5)) as sum
, ((cast(sum(s.a) as decimal))/sum(s.b))*100 as percentage
from MyTable t
inner join
(select
id
, cast(left(score, charindex(' ', score) - 1) as int) a -- everything before the first space, so multi-digit scores work
, cast(substring(score,charindex('out of', score)+7,len(score)) as int) b
from MyTable
) s on s.id = t.id
group by t.cid
[SQLFiddle Demo]
Redesign the table, but on-the-fly as a CTE. Here's a solution that's not as short as you could make it, but that takes advantage of the handy SQL Server function PARSENAME. You may need to tweak the percentage calculation if you want to truncate rather than round, or if you want it to be a decimal value, not an int.
In this or most any solution, you have to count on the column values for Score to be in the very specific format you show. If you have the slightest doubt, you should run some other checks so you don't miss or misinterpret anything.
with
P(ID, cid, Score2Parse) as (
select
ID,
cid,
replace(Score,space(1),'.')
from scores
),
S(ID,cid,pts,tot) as (
select
ID,
cid,
cast(parsename(Score2Parse,4) as int),
cast(parsename(Score2Parse,1) as int)
from P
)
select
cid, cast(round(100e0*sum(pts)/sum(tot),0) as int) as percentage
from S
group by cid;
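If the parsing has to happen outside the database, the same idea (split the string, sum both parts, then divide) can be sketched in Python. This is an illustration only: `SCORE_RE` and `aggregate_scores` are hypothetical names, and the sketch assumes every value matches the exact "N out of M" format, raising an error otherwise.

```python
import re
from collections import defaultdict

# Matches strings such as "3 out of 3" and captures both numbers
SCORE_RE = re.compile(r"^\s*(\d+)\s+out\s+of\s+(\d+)\s*$")


def aggregate_scores(rows):
    """rows: iterable of (cid, score_text). Returns {cid: (sum_text, percentage)}."""
    totals = defaultdict(lambda: [0, 0])
    for cid, score in rows:
        match = SCORE_RE.match(score)
        if match is None:
            # Fail loudly instead of silently misreading a malformed value
            raise ValueError(f"unexpected score format: {score!r}")
        totals[cid][0] += int(match.group(1))
        totals[cid][1] += int(match.group(2))
    # Note: a cid whose totals add up to 0 would raise ZeroDivisionError here
    return {
        cid: (f"{pts} out of {tot}", round(100 * pts / tot))
        for cid, (pts, tot) in sorted(totals.items())
    }


if __name__ == "__main__":
    data = [(1, "3 out of 3"), (1, "1 out of 5"), (2, "3 out of 6"), (3, "7 out of 10")]
    for cid, (text, pct) in aggregate_scores(data).items():
        print(cid, text, pct)
```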

SQL Server: select rows whose sum matches a value

Here is table T:
id num
-------
1 50
2 20
3 90
4 40
5 10
6 60
7 30
8 100
9 70
10 80
and the following is a fictional SQL query:
select *
from T
where sum(num) = '150'
The expected result is:
(A)
id num
-------
1 50
8 100
(B)
id num
-------
2 20
7 30
8 100
(C)
id num
-------
4 40
5 10
8 100
Case 'A' is the most preferred!
I know this is related to combinations.
In the real world, a client gets items from a shop and, because of an agreement between him and the shop, he pays every Friday. The payment amount is not the exact total of the items.
For example: he gets 5 books at 50 € each (= 250 €), and on Friday he brings 150 €, so the first 3 books are a perfect match: 3 * 50 = 150. I need to find the IDs of those 3 books!
Any help would be appreciated!
You can use a recursive query in SQL Server to solve this.
SQLFiddle demo
The first recursive query builds a tree of items with a cumulative sum <= 150. The second recursive query takes the leaves with a cumulative sum = 150 and outputs all such paths back to their roots. The final results are ordered by ItemsCount, so the preferred groups (those with the fewest items) come first.
WITH CTE as
( SELECT id,num,
id as Grp,
0 as parent,
num as CSum,
1 as cnt,
CAST(id as Varchar(MAX)) as path
from T where num<=150
UNION all
SELECT t.id,t.num,
CTE.Grp as Grp,
CTE.id as parent,
T.num+CTE.CSum as CSum,
CTE.cnt+1 as cnt,
CTE.path+','+CAST(t.id as Varchar(MAX)) as path
from T
JOIN CTE on T.num+CTE.CSum<=150
and CTE.id<T.id
),
BACK_CTE as
(select CTE.id,CTE.num,CTE.grp,
CTE.path ,CTE.cnt as cnt,
CTE.parent,CSum
from CTE where CTE.CSum=150
union all
select CTE.id,CTE.num,CTE.grp,
BACK_CTE.path,BACK_CTE.cnt,
CTE.parent,CTE.CSum
from CTE
JOIN BACK_CTE on CTE.id=BACK_CTE.parent
and CTE.Grp=BACK_CTE.Grp
and BACK_CTE.CSum-BACK_CTE.num=CTE.CSum
)
select id,NUM,path, cnt as ItemsCount from BACK_CTE order by cnt,path,Id
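To check what the recursive query should return for the sample table, the same "fewest items first" search can be sketched as a brute-force enumeration in Python. This is not the answer's method, just an independent cross-check; `T` mirrors the question's table and `smallest_matching_subsets` is a hypothetical helper. It is exponential in the number of rows, so it only suits small inputs.

```python
from itertools import combinations

# Table T from the question: id -> num
T = {1: 50, 2: 20, 3: 90, 4: 40, 5: 10, 6: 60, 7: 30, 8: 100, 9: 70, 10: 80}


def smallest_matching_subsets(items, target):
    """Return all ID tuples of the smallest size whose values sum to `target`."""
    ids = sorted(items)
    for size in range(1, len(ids) + 1):
        found = [
            combo
            for combo in combinations(ids, size)
            if sum(items[i] for i in combo) == target
        ]
        if found:
            # Stop at the first size that has any match: fewest items preferred
            return found
    return []


if __name__ == "__main__":
    for combo in smallest_matching_subsets(T, 150):
        print(combo, [T[i] for i in combo])
```

For a target of 150 the smallest groups have two items, and the preferred case (A), ids 1 and 8, is among them.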
If you restrict your problem to "which two numbers add up to a value", the solution is as follows:
SELECT t1.id, t1.num, t2.id,t2.num
FROM T t1
INNER JOIN T t2
ON t1.id < t2.id
WHERE t1.num + t2.num = 150
If you also want the result for three and more numbers you can achieve that by using the above query as a base for recursive SQL. Don't forget to specify a maximum recursion depth!
To find the IDs of the books that the client is paying for, you would need a table with your clients and another one that stores each client's orders and the products he bought.
Otherwise it would be impossible to know what product the payment refers to.

SQL random number that doesn't repeat within a group

Suppose I have a table:
HH SLOT RN
--------------
1 1 null
1 2 null
1 3 null
--------------
2 1 null
2 2 null
2 3 null
I want to set RN to be a random number between 1 and 10. It's ok for the number to repeat across the entire table, but it's bad to repeat the number within any given HH. E.g.,:
HH SLOT RN_GOOD RN_BAD
--------------------------
1 1 9 3
1 2 4 8
1 3 7 3 <--!!!
--------------------------
2 1 2 1
2 2 4 6
2 3 9 4
This is on Netezza if it makes any difference. This one's being a real headscratcher for me. Thanks in advance!
To get a random number between 1 and the number of rows in the hh, you can use:
select hh, slot, row_number() over (partition by hh order by random()) as rn
from t;
The larger range of values is a bit more challenging. The following calculates a table (called randoms) with numbers and a random position in the same range. It then uses slot to index into the position and pull the random number from the randoms table:
with nums as (
select 1 as n union all select 2 union all select 3 union all select 4 union all select 5 union all
select 6 union all select 7 union all select 8 union all select 9 union all select 10
),
randoms as (
select n, row_number() over (order by random()) as pos
from nums
)
select t.hh, t.slot, hnum.n
from (select hh, randoms.n, randoms.pos
from (select distinct hh
from t
) t cross join
randoms
) hnum join
t
on t.hh = hnum.hh and
t.slot = hnum.pos;
Here is a SQLFiddle that demonstrates this in Postgres, which I assume is close enough to Netezza to have matching syntax.
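The underlying idea, drawing without replacement inside each HH, can be sketched outside SQL as well. The Python below is an illustration with a hypothetical function name; unlike the query above, which shares one shuffled order across all households, it draws an independent sample per household. The `seed` parameter is only there to make runs repeatable.

```python
import random


def assign_random_numbers(rows, low=1, high=10, seed=None):
    """rows: iterable of (hh, slot). Returns {(hh, slot): rn} with no repeats inside an hh."""
    rng = random.Random(seed)
    slots_by_hh = {}
    for hh, slot in rows:
        slots_by_hh.setdefault(hh, []).append(slot)

    result = {}
    for hh, slots in slots_by_hh.items():
        # sample() draws without replacement, so values cannot repeat within this hh.
        # It raises ValueError if an hh has more slots than available numbers.
        picks = rng.sample(range(low, high + 1), len(slots))
        for slot, rn in zip(slots, picks):
            result[(hh, slot)] = rn
    return result


if __name__ == "__main__":
    table = [(1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (2, 3)]
    for (hh, slot), rn in sorted(assign_random_numbers(table).items()):
        print(hh, slot, rn)
```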
I am not an expert on SQL, but you could probably do something like this:
1. Initialize a counter CNT=1
2. Create a table such that you sample 1 row randomly from each group and a count of null RN, say C_NULL_RN.
3. With probability C_NULL_RN/(10-CNT+1) for each row, assign CNT as RN
4. Increment CNT and go to step 2
Well, I couldn't get a slick solution, so I did a hack:
1. Created a new integer field called rand_inst.
2. Assign a random number to each empty slot.
3. Update rand_inst to be the instance number of that random number within this household. E.g., if I get two 3's, then the second 3 will have rand_inst set to 2.
4. Update the table to assign a different random number anywhere that rand_inst>1.
5. Repeat assignment and update until we converge on a solution.
Here's what it looks like. Too lazy to anonymise it, so the names are a little different from my original post:
/* Iterative hack to fill 6 slots with a random number between 1 and 13.
A random number *must not* repeat within a household_id.
*/
update c3_lalfinal a
set a.rand_inst = b.rnum
from (
select household_id
,slot_nbr
,row_number() over (partition by household_id,rnd order by null) as rnum
from c3_lalfinal
) b
where a.household_id = b.household_id
and a.slot_nbr = b.slot_nbr
;
update c3_lalfinal
set rnd = CAST(0.5 + random() * (13-1+1) as INT)
where rand_inst>1
;
/* Repeat until this query returns 0: */
select count(*) from (
select household_id from c3_lalfinal group by 1 having count(distinct(rnd)) <> 6
) x
;

Repeating Record Sequence using SQL

This could easily be done using code, but I wondered if it could be done at the database level using SQL Server (2008).
I have a table similar to below:
CROP_ID YEAR_ PRODUCTION
1 1 0
1 2 300
1 3 500
2 1 100
2 2 700
I want to be able to run a query to repeat this for n number of years, per crop type e.g.
CROP_ID YEAR_ PRODUCTION
1 1 0
1 2 300
1 3 500
1 4 0
1 5 300
1 6 500
etc.
I'm not sure of the best approach, I presume I would need a SP and pass in a year variable, and use a loop construct? However the exact syntax escapes me. Any help appreciated.
Update
Sorry for not providing all the information in my original post. The table will allow for multiple crop types, and for Production values to be updated, so CASE statements with fixed values are not really suitable. Apologies for not being clearer.
Update
With the TVF answer I used the following modified SQL to select by CropType for 20 years.
select top 20 b.CROP_ID,
YEAR_ = n.num * (select count(*) from MyBaseTable where CROP_ID = 3) + b.YEAR_,
b.PRODUCTION from MyBaseTable b, dbo.fnMakeNRows(20) n
where CROP_ID = 3
You can do this in standard SQL without creating a stored procedure or using temp tables. The example below will do this for 12 years. You can extend it out to any number of years:
insert into CropYield
(CropID, Year_, Production)
Select 1, a.a + (10 * b.a),
case (a.a + (10 * b.a)) % 3
when 0 then 500
when 1 then 0
when 2 then 300
end
from (Select 0 as a union select 1 union select 2 union select 3 union select 4 union select 5 union select 6 union select 7 union select 8 union select 9) as a
cross join (Select 0 as a union select 1 union select 2 union select 3 union select 4 union select 5 union select 6 union select 7 union select 8 union select 9) as b
where a.a + (10 * b.a) between 1 and 12
You could use a table-valued-function instead of a stored proc, which gives a little more flexibility for what you do with the result (as it can be selected from directly, inserted into another table, joined to other tables, etc).
You could also make this more generic by having a TVF generate N rows (with a number from 0 to N-1 on each row) and then use some simple expressions to generate the columns you need from this. I have found such a TVF to be useful in a variety of situations.
If you need to generate more complicated data than you can with 0..N-1 and simple expressions, then you should create a TVF dedicated to your specific needs.
The following example shows how a generic TVF could be used to generate the data you ask for:
create function fnMakeNRows (@num as integer)
returns @result table (num integer not null) as
begin
if @num is null or @num = 0
begin
return
end
declare @n as integer
set @n = 0
while @n < @num
begin
insert into @result values (@n)
set @n = @n + 1
end
return
end
go
select
CROP_ID = 1,
YEAR_ = num + 1, -- num is 0-based; years in the question start at 1
PRODUCTION = case num % 3 when 0 then 0 when 1 then 300 else 500 end
from dbo.fnMakeNRows(100000)
You can also use this to duplicate rows in an existing table (which I think is more like what you want). For example, assuming base_table contains the three rows at the beginning of your question, you can turn the 3 rows into 60 rows using the following:
select
b.CROP_ID,
YEAR_ = n.num * (select count(*) from base_table) + b.YEAR_,
b.PRODUCTION
from base_table b, dbo.fnMakeNRows(20) n
This (hopefully) shows the utility of a generic fnMakeNRows function.
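The arithmetic behind the duplication (new year = repeat index * cycle length + original year) can also be sketched per crop in Python, which may help when checking the SQL output. The helper name `repeat_cycle` is hypothetical, and the sketch assumes each crop's base YEAR_ values run 1..n without gaps.

```python
def repeat_cycle(base_rows, years):
    """base_rows: iterable of (crop_id, year, production).
    Returns rows for years 1..`years` per crop, repeating each crop's own cycle."""
    by_crop = {}
    for crop_id, year, production in base_rows:
        by_crop.setdefault(crop_id, []).append((year, production))

    out = []
    for crop_id in sorted(by_crop):
        # Production values in year order form this crop's cycle
        cycle = [production for _, production in sorted(by_crop[crop_id])]
        for year in range(1, years + 1):
            out.append((crop_id, year, cycle[(year - 1) % len(cycle)]))
    return out


if __name__ == "__main__":
    base = [(1, 1, 0), (1, 2, 300), (1, 3, 500), (2, 1, 100), (2, 2, 700)]
    for row in repeat_cycle(base, 6):
        print(row)
```

Because the cycle length is taken per crop, crops with different numbers of base rows (3 for crop 1, 2 for crop 2 in the question) each repeat correctly.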
A common trick to produce this kind of data without the need of a stored procedure is with the use of a table of constants. Because such a table can be of generic use, it can be created with say all the integers between 1 and 100 or even 1 and 1,000 depending on usage.
For example:
CREATE TABLE tblConstNums
( I INT )
INSERT INTO tblConstNums VALUES (1)
INSERT INTO tblConstNums VALUES (2)
INSERT INTO tblConstNums VALUES (3)
INSERT INTO tblConstNums VALUES (4)
INSERT INTO tblConstNums VALUES (5)
-- ...
INSERT INTO tblConstNums VALUES (1000)
Then the solution can be written declaratively (without requiring a stored procedure or, more generally, procedural statements):
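-- 3 is the length of the base cycle; I = 1 keeps the original years, I = 2 shifts them by 3, etc.
SELECT CROP_ID, YEAR_ + 3 * (I - 1) AS REPEATED_YEAR, PRODUCTION
FROM myCropTable T
JOIN tblConstNums C on 1=1
WHERE I in (1, 2, 3)
order by CROP_ID, REPEATED_YEAR, PRODUCTION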
Note that the table of constants may include several columns for commonly used cases. For example, even though many of these can be expressed as mathematical expressions of numbers in a basic 0 to n sequence, one can have a column with only even numbers, another with odd numbers, another with multiples of 5, etc. Also, if it is small enough, no indexes are needed on a table of constants, but they may become useful on a bigger one.
Use this one:
WITH tn (n) as
(
SELECT 0
UNION ALL
SELECT n+1
FROM tn
WHERE tn.n < 10
)
SELECT DISTINCT t.CROP_ID, t.YEAR_ + (3*tn.n), t.PRODUCTION
FROM table t, tn
/*
WHERE tn.n < 10 ==> you will get 1 -> (10*3) + 3 = 33
*/

SQL return multiple rows from one record

This is the opposite of reducing repeating records.
SQL query to create physical inventory checklists
If widget-xyz has a qty of 1 item return 1 row, but if it has 5, return 5 rows etc.
For all widgets in a particular warehouse.
Previously this was handled with a macro working through a range in excel, checking the qty column. Is there a way to make a single query instead?
The tables are FoxPro dbf files generated by an application and I am outputting this into html
Instead of generating an xml string and using xml parsing functions to generate a counter as Nestor has suggested, you might consider joining on a recursive CTE as a counter, as LukLed has hinted to:
WITH Counter AS
(
SELECT 0 i
UNION ALL
SELECT i + 1
FROM Counter
WHERE i < 100
),
Data AS
(
SELECT 'A' sku, 1 qty
UNION
SELECT 'B', 2
UNION
SELECT 'C', 3
)
SELECT *
FROM Data
INNER JOIN Counter ON i < qty
According to query analyzer, this query is much faster than the xml pseudo-table. This approach also gives you a recordset with a natural key (sku, i).
There is a default recursion limit of 100 in MSSQL that will restrict your counter. If you have quantities > 100, you can either increase this limit, use nested counters, or create a physical table for counting.
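For anyone who wants to try the counter join without a SQL Server instance, the sketch below runs the same shape of query against an in-memory SQLite database from Python. It is an adaptation, not the original query: SQLite needs the `RECURSIVE` keyword, and the CTE names and the `expand_rows` helper are chosen here for illustration.

```python
import sqlite3

# Same structure as the query above: a recursive counter joined on i < qty
QUERY = """
WITH RECURSIVE
counter(i) AS (
    SELECT 0
    UNION ALL
    SELECT i + 1 FROM counter WHERE i < 100
),
data(sku, qty) AS (
    SELECT 'A', 1
    UNION ALL SELECT 'B', 2
    UNION ALL SELECT 'C', 3
)
SELECT sku, i
FROM data
JOIN counter ON i < qty
ORDER BY sku, i
"""


def expand_rows():
    """Run the counter join in an in-memory SQLite database and return (sku, i) rows."""
    conn = sqlite3.connect(":memory:")
    try:
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for sku, i in expand_rows():
        print(sku, i)
```

Each sku appears once per unit of qty, and (sku, i) works as the natural key mentioned above.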
For SQL 2005/2008, take a look at
CROSS APPLY
What I would do is CROSS APPLY each row with a sub table with as many rows as qty has. A secondary question is how to create that sub table (I'd suggest to create an xml string and then parse it with the xml operators)
I hope this gives you a starting pointer....
Starting with
declare @table table (sku int, qty int);
insert into @table values (1, 5), (2,4), (3,2);
select * from @table;
sku qty
----------- -----------
1 5
2 4
3 2
You can generate:
with MainT as (
select *, convert(xml,'<table>'+REPLICATE('<r></r>',qty)+'</table>') as pseudo_table
from @table
)
select p.sku, p.qty
from MainT p
CROSS APPLY
(
select p.sku from p.pseudo_table.nodes('/table/r') T(row)
) crossT
sku qty
----------- -----------
1 5
1 5
1 5
1 5
1 5
2 4
2 4
2 4
2 4
3 2
3 2
Is that what you want?
Seriously dude... next time put more effort writing your question. It's impossible to know exactly what you are looking for.
You can use a table of numbers from 1 to max(quantity) and join your table to it on number <= quantity. You can do this in many ways, but it depends on the SQL engine.
You can do this using dynamic sql.