coverage percentage using a complex sql query...? - sql

Ok, i've been trying to solve this for about 2 hours now... Please advise:
Tables:
PROFILE [id (int), name (varchar), ...]
SKILL [id (int), id_profile (int), id_app (int), lvl (int), ...]
APP [id (int), ...]
The lvl can basically go from 0 to 3.
I'm trying to get this particular stat:
"What is the percentage of apps that is covered by at least two people having a skill of 2 or higher?"
Thanks a lot

SELECT AVG(covered)
FROM (
SELECT CASE WHEN COUNT(*) >= 2 THEN 1 ELSE 0 END AS covered
FROM app a
LEFT JOIN skill s ON (s.id_app = a.id AND s.lvl >= 2)
GROUP BY a.id
)
More efficient way for MySQL:
SELECT AVG
(
IFNULL
(
(
SELECT 1
FROM skill s
WHERE s.id_app = a.id
AND s.lvl >= 2
LIMIT 1, 1
), 0
)
)
FROM app a
This will stop counting as soon as it finds the second skilled person for each app.
Efficient if you have a few app's but lots of person's.

Untested
select convert(float,count(*)) / (select count(*) from app) as percentage
from (
select count(*) as number
from skill
where lvl >= 2
group by id_app ) t
where t.number >= 2

The logic is: percentage = 100 * ( number of apps of interest ) / ( total number of apps )
select 'percentage' =
-- 100 times
( cast( 100 as float ) *
-- number of apps of interest
( select count(id_app)
from ( select id_app, count(*) as skilled_count
from skill
where lvl >= 2
group by id_app
having count(*) >= 2 ) app_counts )
-- divided by total number of apps
/ ( select count(*) from app )
The convert to float is needed so sql doesn't just do integer arithmetic.

SELECT SUM( CASE lvl WHEN 3 THEN 1 WHEN 2 THEN 1 ELSE 0 END ) / SUM(1) FROM SKILL
If your database has an if/then function instead of CASE, use that. For example, in MySQL:
SELECT SUM( IF( lvl >= 2, 1, 0 ) ) / SUM(1) FROM SKILL

I'm not sure if this is any better or worse than tvanfosson's answer, but here it is anyway:
SELECT convert(float, count(*)) / (Select COUNT(id) FROM APP) AS percentage
FROM APP INNER JOIN SKILL ON APP.id = SKILL.id
WHERE (
SELECT COUNT(id)
FROM SKILL AS Skill2 WHERE Skill2.id_app = APP.id and lvl >= 2
) >= 2

Related

Cross verifying sums in 12 different views

I want to create a control that verifies that the sum of a number between 12 different views, within three companies in each view is the same.
From time to time, something goes wrong and there's a small difference in the sums in two or more of the views in one or two of the companies.
Right now, I'm working on a SQL that gets very very long, since I'm calculating the sum of table A, and joining the sum from table B and selecting those companies where there's a difference. Then I'm making a UNION ALL from a new SQL where I'm comparing the sum in table A with the sum in table C. Since I have 12 different views that all has to be compared to each other, the SQL just get so long and the risk of an error increases.
Is there a smart and nicer way to go about this, where I don't have to manually cross-compare all 12 view with other in all possible combinations?
Certainly. When I'm talking about difference I'm not talking about small differences in the third decimal. This could be thousands or even a couple of millions because someone made a transaction error in the main-program that feeds into the tables.
In the code below I'm comparing the sums in TableA to the sums in TableB and TableC. If I manually have to add all the possible comparisons between the 12 different view, the code will end up being VERY long and I'd like to avoid that if possible.
> SELECT a.Companyid ,
> ( a.[View] + ' vs. ' + b.[View] ) AS Views ,
> ( a.[Mv.Facts.TableA] - b.[Mv.Facts.TableB] ) AS Difference FROM ( SELECT Companyid ,
> 'Facts.TableA' AS [View] ,
> SUM(MV) AS [Mv.Facts.TableA]
> FROM TABLE a
> WHERE Companyid IN ( 3, 5, 36 )
> GROUP BY Companyid ,
> ) a
> LEFT OUTER JOIN ( SELECT Companyid ,
> 'Facts.TableB' AS [View] ,
> SUM(Mv) AS [Mv.Facts.TableB]
> FROM TABLE B
> WHERE Companyid IN ( 3, 5, 36 )
> GROUP BY Companyid) b ON a.Companyid = b.Companyid WHERE ROUND(( a.[Mv.Facts.TableA] - b.[Mv.Facts.TableB]
> ), 0) <> 0
>
>
> UNION ALL SELECT Companyid ,
> ( a.[View] + ' vs. ' + c.[View] ) AS Views ,
> ( a.[Mv.Facts.TableA] - c.[Mv.Facts.TableC] ) AS Difference FROM ( SELECT Companyid ,
> 'Facts.TableA' AS [View] ,
> SUM(Mv) AS [Mv.Facts.TableA]
> FROM TABLE a
> WHERE Companyid IN ( 3, 5, 36 )
> GROUP BY Companyid ) a LEFT OUTER JOIN (SELECT Companyid ,
> 'Facts.TableC' AS [View] ,
> SUM(Mv) AS [Mv.Facts.TableC]
> FROM TABLE C
> WHERE Companyid IN ( 3, 5, 36 )
> GROUP BY Companyid) c ON a.Companyid = c.Companyid WHERE ROUND((a.[Mv.Facts.TableA] - c.[Mv.Facts.TableC]),0) <> 0

SQL query top 2 columns of joined table?

I am having no luck attempting to get the top (x number) of rows from a joined table. I want the top 2 resources (ordered by name) which in this case should be Katie and Simon and regardless of what I've tried, I can't seem to get it right. You can see below what I've commented out - and what looks like it should work (but doesn't). I cannot use a union. Any ideas?
select distinct
RTRESOURCE.RNAME as Resource,
RTTASK.TASK as taskname, SUM(distinct SOTRAN.QTY2BILL) AS quantitytobill from SOTRAN AS SOTRAN INNER JOIN RTTASK AS RTTASK ON sotran.taskid = rttask.taskid
left outer JOIN RTRESOURCE AS RTRESOURCE ON rtresource.keyno=sotran.resid
WHERE sotran.phantom<>'y' and sotran.pgroup = 'L' and sotran.timesheet = 'y' and sotran.taskid >0 AND RTRESOURCE.KEYNO in ('193','159','200') AND ( SOTRAN.ADDDATE>='8/15/2015 12:00:00 AM' AND SOTRAN.ADDDATE<'9/3/2015 11:59:59 PM' )
//and RTRESOURCE.RNAME in ( select distinct top 2 RTRESOURCE.RNAME from RTRESOURCE order by RTRESOURCE.RNAME)
//and ( select count(*) from RTRESOURCE RTRESOURCE2 where RTRESOURCE2.RNAME = RTRESOURCE.RNAME ) <= 2
GROUP BY RTRESOURCE.rname,RTTASK.task,RTTASK.taskid,RTTASK.mdsstring ORDER BY Resource,taskname
You should provide a schema.
But lets assume your query work. You create a CTE.
WITH youQuery as (
SELECT *
FROM < you big join query>
), maxBill as (
SELECT Resource, Max(quantitytobill) as Bill
FROM yourQuery
)
SELECT top 2 *
FROM maxBill
ORDER BY Bill
IF you want top 2 alphabetical
WITH youQuery as (
SELECT *
FROM < you big join query>
), Names as (
SELECT distinct Resource
FROM yourQuery
Order by Resource
)
SELECT top 2 *
FROM Names

Trying to Get SELECT TOP to work with Parameter in ACCESS

This is building on some code I got the other day (thanks to peterm). I am now trying to select the TOP X number of results after calculations on the query. The X can range from 1 to 8 depending on the number of results per player.
This is the code I have but I get a syntax error when I try to run it.
SELECT
PlayerID
, RoundID
, PlayedTo
, (SELECT Count(PlayerID) FROM PlayedToCalcs) AS C
, iif(
C <= 6
, 1
, iif(
C <= 8
, 2
, (
iif(
C <= 10
, 3
, (
iif(
C <= 12
, 4
, (
iif(
C <= 14
, 5
, (
iif(
C <= 16
, 6
, (
iif(
C <= 18
, 7
, (iif(C <= 20, 8, 999))
)
)
)
)
)
)
)
)
)
)
)
) AS X
FROM PlayedToCalcs AS s
WHERE PlayedTo IN (
SELECT TOP (X) PlayedTo
FROM PlayedToCalcs
WHERE PlayerID = s.PlayerID
ORDER BY PlayedTo DESC, RoundID DESC
)
ORDER BY PlayerID, PlayedTo DESC, RoundID DESC;
Here is a link http://sqlfiddle.com/#!3/a726c/4 with a small sample of the data I'm trying to use it on.
The Access db engine does not allow you to use a parameter for SELECT TOP. You must include a literal value in the SQL statement.
For example this query works correctly.
SELECT TOP 2 *
FROM tblFoo
ORDER BY id DESC;
But attempting to substitute a parameter, how_many, triggers error 3141, "The SELECT statement includes a reserved word or an argument name that is misspelled or missing, or the punctuation is incorrect."
SELECT TOP how_many *
FROM tblFoo
ORDER BY id DESC;
The reason being in SQL Server (the simulator you used in SQL Fiddle), you cannot use IIF. Try using CASE.
And there is a limitation of using 7 nested IIF in Access.

How to get the deepest levels of a hierarchical sql query

I'm using SQLServer 2008.
Say I have a recursive hierarchy table, SalesRegion, whit SalesRegionId and ParentSalesRegionId. What I need is, given a specific SalesRegion (anywhere in the hierarchy), retrieve ALL the records at the BOTTOM level.
I.E.:
SalesRegion, ParentSalesRegionId
1, null
1-1, 1
1-2, 1
1-1-1, 1-1
1-1-2, 1-1
1-2-1, 1-2
1-2-2, 1-2
1-1-1-1, 1-1-1
1-1-1-2, 1-1-1
1-1-2-1, 1-1-2
1-2-1-1, 1-2-1
(in my table I have sequencial numbers, this dashed numbers are only to be clear)
So, if the user enters 1-1, I need to retrieve al records with SalesRegion 1-1-1-1 or 1-1-1-2 or 1-1-2-1 (and NOT 1-2-2). Similarly, if the user enters 1-1-2-1, I need to retrieve just 1-1-2-1
I have a CTE query that retrieves everything below 1-1, but that includes rows that I don't want:
WITH SaleLocale_CTE AS (
SELECT SL.SaleLocaleId, SL.SaleLocaleName, SL.AccountingLocationID, SL.LocaleTypeId, SL.ParentSaleLocaleId, 1 AS Level /*Added as a workaround*/
FROM SaleLocale SL
WHERE SL.Deleted = 0
AND (#SaleLocaleId IS NULL OR SaleLocaleId = #SaleLocaleId)
UNION ALL
SELECT SL.SaleLocaleId, SL.SaleLocaleName, SL.AccountingLocationID, SL.LocaleTypeId, SL.ParentSaleLocaleId, Level + 1 AS Level
FROM SaleLocale SL
INNER JOIN SaleLocale_CTE SLCTE ON SLCTE.SaleLocaleId = SL.ParentSaleLocaleId
WHERE SL.Deleted = 0
)
SELECT *
FROM SaleLocale_CTE
Thanks in advance!
Alejandro.
I found a quick way to do this, but I'd rather the answer to be in a single query. So if you can think of one, please share! If I like it better, I'll vote for it as the best answer.
I added a "Level" column in my previous query (I'll edit the question so this answer is clear), and used it to get the last level and then delete the ones I don't need.
INSERT INTO #SaleLocales
SELECT *
FROM SaleLocale_GetChilds(#SaleLocaleId)
SELECT #LowestLevel = MAX(Level)
FROM #SaleLocales
DELETE #SaleLocales
WHERE Level <> #LowestLevel
Building off your post:
; WITH CTE AS
(
SELECT *
FROM SaleLocale_GetChilds(#SaleLocaleId)
)
SELECT
FROM CTE a
JOIN
(
SELECT MAX(level) AS level
FROM CTE
) b
ON a.level = b.level
There were a few edits in there. Kept hitting post...
Are you looking for something like this:
declare #SalesRegion as table ( SalesRegion int, ParentSalesRegionId int )
insert into #SalesRegion ( SalesRegion, ParentSalesRegionId ) values
( 1, NULL ), ( 2, 1 ), ( 3, 1 ),
( 4, 3 ), ( 5, 3 ),
( 6, 5 )
; with CTE as (
-- Get the root(s).
select SalesRegion, CAST( SalesRegion as varchar(1024) ) as Path
from #SalesRegion
where ParentSalesRegionId is NULL
union all
-- Add the children one level at a time.
select SR.SalesRegion, CAST( CTE.Path + '-' + cast( SR.SalesRegion as varchar(10) ) as varchar(1024) )
from CTE inner join
#SalesRegion as SR on SR.ParentSalesRegionId = CTE.SalesRegion
)
select *
from CTE
where Path like '1-3%'
I haven't tried this on a serious dataset, so I'm not sure how it'll perform, but I believe it solves your problem:
WITH SaleLocale_CTE AS (
SELECT SL.SaleLocaleId, SL.SaleLocaleName, SL.AccountingLocationID, SL.LocaleTypeId, SL.ParentSaleLocaleId, CASE WHEN EXISTS (SELECT 1 FROM SaleLocal SL2 WHERE SL2.ParentSaleLocaleId = SL.SaleLocaleID) THEN 1 ELSE 0 END as HasChildren
FROM SaleLocale SL
WHERE SL.Deleted = 0
AND (#SaleLocaleId IS NULL OR SaleLocaleId = #SaleLocaleId)
UNION ALL
SELECT SL.SaleLocaleId, SL.SaleLocaleName, SL.AccountingLocationID, SL.LocaleTypeId, SL.ParentSaleLocaleId, CASE WHEN EXISTS (SELECT 1 FROM SaleLocal SL2 WHERE SL2.ParentSaleLocaleId = SL.SaleLocaleID) THEN 1 ELSE 0 END as HasChildren
FROM SaleLocale SL
INNER JOIN SaleLocale_CTE SLCTE ON SLCTE.SaleLocaleId = SL.ParentSaleLocaleId
WHERE SL.Deleted = 0
)
SELECT *
FROM SaleLocale_CTE
WHERE HasChildren = 0

T-sql problem with running sum

I am trying to write T-sql script which will find "open" records for one table
Structure of data is following
Id (int PK) Ts (datetime) Art_id (int) Amount (float)
1 '2009-01-01' 1 1
2 '2009-01-05' 1 -1
3 '2009-01-10' 1 1
4 '2009-01-11' 1 -1
5 '2009-01-13' 1 1
6 '2009-01-14' 1 1
7 '2009-01-15' 2 1
8 '2009-01-17' 2 -1
9 '2009-01-18' 2 1
According to my needs I am trying to show only records after last sum for every one articles where 0 sorting by date of last running sum of zero value. So I am trying to abstract (show) records 5 and 6 for Art_id=1 and record 9 for art_id=2. I am using MSSQL2005 and my table has around 30K records with 6000 distinct values of ART_ID.
In this solution I simply want to find all the rows where there isn't a subsequent row for that Art_id where the running sum was 0. I am assuming we can use the ID as a better tiebreaker than TS, since two rows can come in with the same timestamp but they will get sequential identity values.
;WITH base AS
(
SELECT
ID, Art_id, TS, Amount,
RunningSum = Amount + COALESCE
(
(
SELECT SUM(Amount)
FROM dbo.foo
WHERE Art_id = f.Art_id
AND ID < f.ID
)
, 0
)
FROM dbo.[table name] AS f
)
SELECT ID, Art_id, TS, Amount
FROM base AS b1
WHERE NOT EXISTS
(
SELECT 1
FROM base AS b2
WHERE Art_id = b1.Art_id
AND ID >= b1.ID
AND RunningSum = 0
)
ORDER BY ID;
Complete working query:
SELECT
*
FROM TABLE_NAME E
JOIN
(SELECT
C.ART_ID,
MAX(TS) MAX_TS
FROM
(SELECT
ART_ID,
TS,
COALESCE((SELECT SUM(AMOUNT) FROM TABLE_NAME B WHERE (B.Art_id = A.Art_id) AND (B.Ts < A.Ts)),0) ROW_SUM
FROM TABLE_NAME A) C
WHERE C.ROW_SUM = 0
GROUP BY C.ART_ID) D
ON
(D.ART_ID = E.ART_ID) AND
(E.TS >= D.MAX_TS)
First we calculate running sums for every row:
SELECT
ART_ID,
TS,
COALESCE((SELECT SUM(AMOUNT) FROM TABLE_NAME B WHERE (B.Art_id = A.Art_id) AND (B.Ts < A.Ts)),0) ROW_SUM
FROM TABLE_NAME A
Then we look for last article with 0:
SELECT
C.ART_ID,
MAX(TS) MAX_TS
FROM
(SELECT
ART_ID,
TS,
COALESCE((SELECT SUM(AMOUNT) FROM TABLE_NAME B WHERE (B.Art_id = A.Art_id) AND (B.Ts < A.Ts)),0) ROW_SUM
FROM TABLE_NAME A) C
WHERE C.ROW_SUM = 0
GROUP BY C.ART_ID
You can find all rows where the running sum is zero with:
select cur.id, cur.art_id
from #articles cur
left join #articles prev
on prev.art_id = cur.art_id
and prev.id <= cur.id
group by cur.id, cur.art_id
having sum(prev.amount) = 0
Then you can query all rows that come after the rows with a zero running sum:
select a.*
from #articles a
left join (
select cur.id, cur.art_id, running = sum(prev.amount)
from #articles cur
left join #articles prev
on prev.art_id = cur.art_id
and prev.ts <= cur.ts
group by cur.id, cur.art_id
having sum(prev.amount) = 0
) later_zero_running on
a.art_id = later_zero_running.art_id
and a.id <= later_zero_running.id
where later_zero_running.id is null
The LEFT JOIN in combination with the WHERE says: there can not be a row after this row, where the running sum is zero.