SQL - Combining incomplete - sql

I'm using Oracle 10g. I have a table with a number of fields of varying types. The fields contain observations that have been made about a particular thing on a particular date by a particular site.
So:
ItemID, Date, Observation1, Observation2, Observation3...
There are about 40 Observations in each record. The table structure cannot be changed at this point in time.
Unfortunately not all the Observations have been populated (either accidentally or because the site is incapable of making that recording). I need to combine all the records about a particular item into a single record in a query, making it as complete as possible.
A simple way to do this would be something like
SELECT
ItemID,
MAX(Date),
MAX(Observation1),
MAX(Observation2)
etc.
FROM
Table
GROUP BY
ItemID
But ideally I would like it to pick the most recent observation available, not the max/min value. I could do this by writing sub queries in the form
SELECT
ItemID,
ObservationX,
ROW_NUMBER() OVER (PARTITION BY ItemID ORDER BY Date DESC) ROWNUMBER
FROM
Table
WHERE
ObservationX IS NOT NULL
and then joining the ROWNUMBER = 1 rows together for each ItemID, but because of the number of fields this would require 40 subqueries.
My question is whether there's a more concise way of doing this that I'm missing.

Create the table and the sample data
SQL> create table observation(
item_id number,
dt date,
val1 number,
val2 number );
Table created.
SQL> insert into observation values( 1, date '2011-12-01', 1, null );
1 row created.
SQL> insert into observation values( 1, date '2011-12-02', null, 2 );
1 row created.
SQL> insert into observation values( 1, date '2011-12-03', 3, null );
1 row created.
SQL> insert into observation values( 2, date '2011-12-01', 4, null );
1 row created.
SQL> insert into observation values( 2, date '2011-12-02', 5, 6 );
1 row created.
And then use the KEEP clause on the MAX aggregate function with an ORDER BY that sorts the rows with NULL observations first, so that DENSE_RANK LAST picks the most recent non-NULL row. Whatever date you use in the CASE expression needs to be earlier than the earliest real observation in the table.
SQL> select item_id,
max(val1) keep( dense_rank last
order by (case when val1 is not null
then dt
else date '1900-01-01'
end) ) val1,
max(val2) keep( dense_rank last
order by (case when val2 is not null
then dt
else date '1900-01-01'
end) ) val2
from observation
group by item_id;
ITEM_ID VAL1 VAL2
---------- ---------- ----------
1 3 2
2 5 6
I suspect that there is a more elegant way to ignore the NULL values than adding the CASE expression to the ORDER BY, but the CASE gets the job done.
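The same idea (sort the NULL rows out of the way, then take the most recent remaining row) can also be expressed with an analytic function. The following is only a sketch, not part of the answer above: it loads the sample data into an in-memory SQLite database from Python, because SQLite has no KEEP clause, and it assumes SQLite 3.25 or later for window function support. Storing dt as ISO text is also an assumption made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE observation (item_id INTEGER, dt TEXT, val1 INTEGER, val2 INTEGER);
    INSERT INTO observation VALUES
        (1, '2011-12-01', 1,    NULL),
        (1, '2011-12-02', NULL, 2),
        (1, '2011-12-03', 3,    NULL),
        (2, '2011-12-01', 4,    NULL),
        (2, '2011-12-02', 5,    6);
""")

# For each column: order non-NULL rows first, newest first within them,
# and take the first value of that ordering for every item.
query = """
    SELECT DISTINCT
           item_id,
           FIRST_VALUE(val1) OVER (
               PARTITION BY item_id
               ORDER BY CASE WHEN val1 IS NULL THEN 1 ELSE 0 END, dt DESC
           ) AS val1,
           FIRST_VALUE(val2) OVER (
               PARTITION BY item_id
               ORDER BY CASE WHEN val2 IS NULL THEN 1 ELSE 0 END, dt DESC
           ) AS val2
      FROM observation
     ORDER BY item_id
"""
rows = conn.execute(query).fetchall()
print(rows)
```

With 40 observation columns this still needs one window expression per column, but no subqueries or joins, and no sentinel date.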

I don't know the Oracle-specific commands, but in generic SQL you could do something like the following.
First, use a pivot table that contains consecutive numbers 0, 1, 2, ...
I'm not sure, but I believe the Oracle equivalent of the "isnull" function is "NVL".
select itemgroup.ItemId,
max(case when itemsimplified.i = 0 then itemsimplified.observation1 else '' end) as observation1,
max(case when itemsimplified.i = 1 then itemsimplified.observation2 else '' end) as observation2,
max(case when itemsimplified.i = 2 then itemsimplified.observation3 else '' end) as observation3,
...
max(case when itemsimplified.i = 39 then itemsimplified.observation40 else '' end) as observation40
from (
select items.ItemId
from table as items
where items.ItemId = _paramerter_for_retrive_only_one_item /* filter here to select one or more items */
group by items.ItemId) itemgroup
left join
(
select
items.ItemId,
p.i,
isnull( max ( case when p.i = 0 then itemcombinations.observation1 else '' end ), '' ) as observation1,
isnull( max ( case when p.i = 1 then itemcombinations.observation2 else '' end ), '' ) as observation2,
isnull( max ( case when p.i = 2 then itemcombinations.observation3 else '' end ), '' ) as observation3,
...
isnull( max ( case when p.i = 39 then itemcombinations.observation40 else '' end ), '' ) as observation40
from
(select i from pivot where i < 40 /* your number of observation columns; each column gets one index */
)
as p
cross join table as items
left join table as itemcombinations
on items.ItemId = itemcombinations.ItemId
where items.ItemId = _paramerter_for_retrive_only_one_item /* filter here to select one or more items */
and ( (p.i = 0 and itemcombinations.observation1 is not null) /* column 1 */
or (p.i = 1 and itemcombinations.observation2 is not null) /* column 2 */
or (p.i = 2 and itemcombinations.observation3 is not null) /* column 3 */
....
or (p.i = 39 and itemcombinations.observation40 is not null) ) /* column 40 */
group by p.i, items.ItemId
) as itemsimplified
on itemsimplified.ItemId = itemgroup.ItemId
group by itemgroup.ItemId
About the pivot table
Create a pivot table with the following schema:
name: pivot, columns: {i : datatype int}
How to populate it
Create a helper table foo with the following schema:
name: foo, column: value, datatype varchar
insert into foo
values ('0'),
('1'),
('2'),
('3'),
('4'),
('5'),
('6'),
('7'),
('8'),
('9');
/* insert 100 values (00 to 99) */
insert into pivot
select concat(a.value, b.value) /* MySQL */
/* a.value + b.value in SQL Server */
/* a.value || b.value in Oracle */
from foo a, foo b
/* insert 1000 values (000 to 999) */
insert into pivot
select concat(a.value, b.value, c.value) /* MySQL */
/* a.value + b.value + c.value in SQL Server */
/* a.value || b.value || c.value in Oracle */
from foo a, foo b, foo c
The idea of a pivot table is described in "Transact-SQL Cookbook" by Jonathan Gennick and Ales Spetic.
I have to admit that the above solution (by Justin Cave) is simpler and easier to understand, but this is another option.
In the end, as you said, you have solved it.

Related

Query to find ranges of consecutive rows

I have a file that contains a dump of a SQL table with 2 columns: int ID (auto-increment identity field) and bit Flag. Flag = 0 means a record is good and Flag = 1 means a record is bad (contains an error). The goal is to find all blocks of consecutive bad records (with a flag value of 1) containing 1,000 or more rows. The solution shouldn't use cursors or while loops; it should use set-based queries only (selects, joins etc).
We would like to see the actual queries used and the results in the following format:
StartID – EndID NumberOfErrorsInTheBlock
StartID – EndID NumberOfErrorsInTheBlock
……………………….
StartID – EndID NumberOfErrorsInTheBlock
For example, if our data were only 30 records and we were looking for blocks with 5 or more records, then the results would look as follows (see the screenshot below; the error blocks that met the criteria are highlighted):
[ID Range].....[Number of errors in the block]
11-15..... 5
19-25..... 7
sql file containing sample rows, dropbox
T-SQL Solution for SQL Server 2012 and Above
IF OBJECT_ID('tempdb..#tbl_ranges') IS NOT NULL
DROP TABLE #tbl_ranges;
CREATE TABLE #tbl_ranges
(
row_num INT PRIMARY KEY,
ID INT,
Flag BIT,
Label TINYINT
);
WITH cte_yourTable
AS
(
SELECT Id,
Flag,
CASE
--label min
WHEN Flag != LAG(flag,1) OVER (ORDER BY ID) THEN 1
--inner
WHEN Flag = LAG(flag,1) OVER (ORDER BY ID) AND Flag = LEAD(flag,1) OVER (ORDER BY ID) THEN 2
--end
WHEN Flag = LAG(flag,1) OVER (ORDER BY ID) AND Flag != LEAD(flag,1) OVER (ORDER BY ID) THEN 3
END label
FROM yourTable
)
INSERT INTO #tbl_ranges
SELECT ROW_NUMBER() OVER (ORDER BY ID) row_num,
ID,
Flag,
label
FROM cte_yourTable
WHERE label != 2;
SELECT A.ID ID_start,
B.ID ID_end,
B.ID - A.ID range_cnt
FROM #tbl_ranges A
INNER JOIN #tbl_ranges B
ON A.row_num = B.row_num - 1
AND A.Flag = B.Flag;
IF OBJECT_ID('tempdb..#tbl_ranges') IS NOT NULL
DROP TABLE #tbl_ranges;
Abbreviated results (note that range_cnt is ID_end - ID_start, which is one less than the number of rows in the block):
ID_start ID_end range_cnt
----------- ----------- -----------
2 3 1
5 8 3
9 10 1
11 35 24
36 356 320
357 358 1
359 406 47
...
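Without using a temp table, this can be solved with two chained CTEs, where the second CTE reads from the first. Because ID - ROW_NUMBER() stays constant within a run of consecutive IDs, grouping by that difference yields one row per block: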
With Evaluation (ID,Flag,Evaluate)
as
(select ID,Flag,Evaluate = ID-row_number() over (order by Flag,ID)
from [dbo].[SqltestRecordsNew]
where Flag = 1
),
Evaluation_Final (StartingRecordID,EndRecordID,Flag,cnt)
as
(
select min(ID) as StartingRecordID,max(ID) as EndRecordID,
Flag, cnt = count(*)
from Evaluation
group by Evaluate, Flag
)
select Concat(StartingRecordID,' - ', EndRecordID) as 'StartingRecordID - EndRecordId',
cnt as GroupItemCnt from Evaluation_Final
where cnt > 999
order by Concat(StartingRecordID,' - ', EndRecordID)
-- Test results Case 1
Select ID,Flag,
Case when Flag=1 then 'Success'
else 'Defect Data'
End as TestResults
from SqltestRecordsNew
where ID between 1494363 and 1495559
-- Test results Case 2
Select ID,Flag,
Case when Flag=1 then 'Success'
else 'Defect Data'
End as TestResults from SqltestRecordsNew
where ID between 1498409 and 1503899
-- Test results Case 3
Select ID,Flag,
Case when Flag=1 then 'Success'
else 'Defect Data'
End as TestResults from SqltestRecordsNew
where ID between 1548257 and 1550489
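To see the ID minus ROW_NUMBER() grouping at work on something small, here is a sketch that rebuilds the 30-record example from the question in an in-memory SQLite database (assuming SQLite 3.25 or later for window functions). The table name records and the exact placement of the short error runs (ids 3-4 and 28) are made up for the illustration; only the two qualifying blocks, 11-15 and 19-25, come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, flag INTEGER)")

# Hypothetical data: bad rows (flag = 1) at ids 3-4, 11-15, 19-25 and 28.
bad_ids = {3, 4, 28} | set(range(11, 16)) | set(range(19, 26))
conn.executemany(
    "INSERT INTO records (id, flag) VALUES (?, ?)",
    [(i, 1 if i in bad_ids else 0) for i in range(1, 31)],
)

# Within a run of consecutive bad ids, id and ROW_NUMBER() both grow by 1,
# so their difference is the same for every row of the run.
query = """
    WITH bad AS (
        SELECT id,
               id - ROW_NUMBER() OVER (ORDER BY id) AS grp
          FROM records
         WHERE flag = 1
    )
    SELECT MIN(id) AS start_id, MAX(id) AS end_id, COUNT(*) AS errors
      FROM bad
     GROUP BY grp
    HAVING COUNT(*) >= ?
     ORDER BY start_id
"""
blocks = conn.execute(query, (5,)).fetchall()  # blocks of 5 or more bad rows
for start_id, end_id, errors in blocks:
    print(f"{start_id}-{end_id}..... {errors}")
```

Changing the parameter from 5 to 1000 gives the threshold asked for in the question.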

Select rows until condition met

I would like to write an Oracle query which returns a specific set of information. Using the table below, if given an id, it will return the id and value of B. Also, if B=T, it will return the next row as well. If that next row has B=T, it will return the row after that too, and so on until an F is encountered.
So, given 3 it would just return one row: (3,F). Given 4 it would return 3 rows: ((4,T),(5,T),(6,F))
id B
1 F
2 F
3 F
4 T
5 T
6 F
7 T
8 F
Thank you in advance!
Use a sub-query to find out at what point you should stop, then return all rows from your starting point to the calculated stop point.
SELECT
*
FROM
yourTable
WHERE
id >= 4
AND id <= (SELECT MIN(id) FROM yourTable WHERE b = 'F' AND id >= 4)
Note, this assumes that the last record is always an 'F'. You can deal with the last record being a 'T' using a COALESCE.
SELECT
*
FROM
yourTable
WHERE
id >= 4
AND id <= COALESCE(
(SELECT MIN(id) FROM yourTable WHERE b = 'F' AND id >= 4),
(SELECT MAX(id) FROM yourTable )
)
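As a quick check of the COALESCE version, the sketch below loads the table from the question into an in-memory SQLite database and wraps the query in a small helper. The helper name rows_from and the :start parameter are introduced here for illustration only; the SQL itself is the query above with the literal 4 replaced by a parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (id INTEGER PRIMARY KEY, b TEXT)")
# ids 1..8 with the B values from the question: F F F T T F T F
conn.executemany("INSERT INTO yourTable VALUES (?, ?)",
                 list(enumerate("FFFTTFTF", start=1)))

QUERY = """
    SELECT id, b
      FROM yourTable
     WHERE id >= :start
       AND id <= COALESCE(
               (SELECT MIN(id) FROM yourTable WHERE b = 'F' AND id >= :start),
               (SELECT MAX(id) FROM yourTable))
     ORDER BY id
"""

def rows_from(start):
    """Return the rows from `start` up to and including the next 'F'."""
    return conn.execute(QUERY, {"start": start}).fetchall()

print(rows_from(3))
print(rows_from(4))
```

If a trailing 'T' row is appended to the table, the inner MIN(id) finds no 'F' and the COALESCE falls back to MAX(id), so the run simply ends at the last row.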

How do I determine if a group of data exists in a table, given the data that should appear in the group's rows?

I am writing data to a table and allocating a "group-id" for each batch of data that is written. To illustrate, consider the following table.
GroupId Value
------- -----
1 a
1 b
1 c
2 a
2 b
3 a
3 b
3 c
3 d
In this example, there are three groups of data, each with similar but varying values.
How do I query this table to find a group that contains a given set of values? For instance, if I query for (a,b,c) the result should be group 1. Similarly, a query for (b,a) should result in group 2, and a query for (a, b, c, e) should result in the empty set.
I can write a stored procedure that performs the following steps:
select distinct GroupId from Groups -- and store locally
for each distinct GroupId: perform a set-difference (except) between the input and table values (for the group), and vice versa
return the GroupId if both set-difference operations produced empty sets
This seems a bit excessive, and I am hoping to leverage some other commands in SQL to simplify it. Is there a simpler way to perform a set comparison in this context, or to select the group ID that contains exactly the input values for the query?
This is a set-within-sets query. I like to solve it using group by and having:
select groupid
from GroupValues gv
group by groupid
having sum(case when value = 'a' then 1 else 0 end) > 0 and
sum(case when value = 'b' then 1 else 0 end) > 0 and
sum(case when value = 'c' then 1 else 0 end) > 0 and
sum(case when value not in ('a', 'b', 'c') then 1 else 0 end) = 0;
The first three conditions in the having clause check that each element exists. The last condition checks that there are no other values. This method is quite flexible for various inclusion and exclusion conditions on the values you are looking for.
EDIT:
If you want to pass in a list, you can use:
with thelist as (
select 'a' as value union all
select 'b' union all
select 'c'
)
select groupid
from GroupValues gv left outer join
thelist
on gv.value = thelist.value
group by groupid
having count(distinct gv.value) = (select count(*) from thelist) and
count(distinct (case when gv.value = thelist.value then gv.value end)) = count(distinct gv.value);
Here the having clause counts the number of matching values and makes sure that this is the same size as the list.
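When the list of values comes from application code, the two checks (every wanted value is present, and nothing else is) can be generated from the list. The sketch below is one possible way to do that against an in-memory SQLite copy of the sample table; the function name find_groups is hypothetical, and the HAVING clause is a variant of the queries above rather than a copy of either.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE GroupValues (groupid INTEGER, value TEXT)")
conn.executemany("INSERT INTO GroupValues VALUES (?, ?)", [
    (1, "a"), (1, "b"), (1, "c"),
    (2, "a"), (2, "b"),
    (3, "a"), (3, "b"), (3, "c"), (3, "d"),
])

def find_groups(values):
    """Return the ids of groups whose set of values equals `values` exactly."""
    wanted = sorted(set(values))
    placeholders = ", ".join("?" for _ in wanted)
    query = f"""
        SELECT groupid
          FROM GroupValues
         GROUP BY groupid
        HAVING COUNT(DISTINCT value) = ?
           AND SUM(CASE WHEN value IN ({placeholders}) THEN 0 ELSE 1 END) = 0
         ORDER BY groupid
    """
    # A group matches when it has no value outside the list and as many
    # distinct values as the list itself.
    return [row[0] for row in conn.execute(query, [len(wanted), *wanted])]

print(find_groups(["a", "b", "c"]))
print(find_groups(["b", "a"]))
print(find_groups(["a", "b", "c", "e"]))
```

Only the placeholders are interpolated into the SQL text; the values themselves are passed as bound parameters.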
EDIT:
The query failed to compile because a table alias was missing. Updated with the right table alias.
This is kind of ugly, but it works. On larger datasets I'm not sure what performance would look like, but the nested instances of #GroupValues key off GroupID in the main table so I think as long as you have a good index on GroupID it probably wouldn't be too horrible.
If Object_ID('tempdb..#GroupValues') Is Not Null Drop Table #GroupValues
Create Table #GroupValues (GroupID Int, Val Varchar(10));
Insert #GroupValues (GroupID, Val)
Values (1,'a'),(1,'b'),(1,'c'),(2,'a'),(2,'b'),(3,'a'),(3,'b'),(3,'c'),(3,'d');
If Object_ID('tempdb..#FindValues') Is Not Null Drop Table #FindValues
Create Table #FindValues (Val Varchar(10));
Insert #FindValues (Val)
Values ('a'),('b'),('c');
Select Distinct gv.GroupID
From (Select Distinct GroupID
From #GroupValues) gv
Where Not Exists (Select 1
From #FindValues fv2
Where Not Exists (Select 1
From #GroupValues gv2
Where gv.GroupID = gv2.GroupID
And fv2.Val = gv2.Val))
And Not Exists (Select 1
From #GroupValues gv3
Where gv3.GroupID = gv.GroupID
And Not Exists (Select 1
From #FindValues fv3
Where gv3.Val = fv3.Val))

How to get and populate Zero '0' if no record found in SSRS Report 2005

I am using an SSRS report to show results by category, where the category value is translated to Yes, No or Not Met (1 = Yes, 2 = No, 3 = Not Met).
The query is,
SELECT (
CASE
WHEN A = '99' or B = '99' THEN 3
WHEN C + D >= 10 THEN 1
ELSE 2
END) as Category,
and the result is,
N % N % N %
Not Met 10 11% 5 7% 45 20%
Yes 4 5% 30 4% 8 6%
No 10 11% 5 7% 45 20%
If, for example, no results are found for "Not Met", I want the result to show zeros instead, like this:
N % N % N %
Not Met 0 0% 0 0% 0 0%
Yes 4 5% 30 4% 8 6%
No 10 11% 5 7% 45 20%
I have tried a LEFT JOIN in the query, but it brings back extra records. I am stuck and don't know how to get a zero result when no records are found for Yes, No or Not Met.
My query is,
SELECT (
CASE
WHEN A = '99' or B = '99' THEN 3
WHEN C + D >= 10 THEN 1
ELSE 2
END
) as Category, [Table A].*, [Table B].*
FROM [Table A] inner join [Table B] on [Table A].id = [Table B].id
WHERE
(
Condition and Field is not null
)
Please help, as this is my final project and I am stuck.
Thanks.
MY SAMPLE DATA
TABLE A
=======
ID    REFNO  BGDATE      SBRST  xx  xx ...
----  -----  ----------  -----  --  --
1209  23     09/09/1900  13     XX  XX
3453  12     14/02/1978  10     XX  XX
3476  56     02/03/1980  10     XX  XX
TABLE B
=======
ID    CITY        xx  xx ... xx
----  ----------  --  --
1209  Glasgow     xx  X
3453  Edinburgh   xx  X
3476  Manchester  xx  X
SELECT
(
CASE
WHEN BGDATE = '09/09/1900' THEN 3
--I tried this to get Value 3, if no condition met but no success
WHEN NOT EXISTS(BGDATE = '09/09/1900') THEN 3
WHEN SBRST IN ('11','12') THEN 1
ELSE 2
END
) as Category, [Table - A].*, [Table - B].*
FROM [Table - A] inner join [Table - B] on [Table - A].id = [Table - B].ID
-- For Ian: this is a sample of my WHERE clause
WHERE
(
(
SBRST IN ('4','5','6','7','11','12') AND SBRST<>'99'
AND NOT
(
EXTNT IN ('2','3','4','99') OR
HOR IN ('2','99') OR
(DATEDIFF(day,CHDATE1,CHENDATE1)>='42')
)
)
AND
(
SBRST IS NOT NULL AND
EXTNT IS NOT NULL AND
HOR IS NOT NULL
)
)
order by Category desc
RESULT
======
category  ID    REFNO  BGDATE      SBRST  xx
3         1209  23     09/09/1900  13     XX
2         3453  12     14/02/1978  10     XX
2         3476  56     02/03/1980  10     XX
Points to be considered:
1) The above is my sample data.
2) Table A and Table B are inner joined on ID to produce the results.
3) The Category is populated by the CASE expression in the SELECT above. The problem is that if no row matches a condition, that category does not appear at all, but I want the missing category as well. In this case it is "1".
I want something like this.
EXPECTED RESULT
======
category  ID    REFNO  BGDATE      SBRST  xx
3         1209  23     09/09/1900  13     XX
2         3453  12     14/02/1978  10     XX
2         3476  56     02/03/1980  10     XX
1         0     0      NULL        0      NULL
Another solution could be to create another table containing all 3 categories and then use a RIGHT OUTER JOIN to get the results, but I do not know how.
I had this issue once. As a quick fix, I created a temp table, inserted the results into it, checked whether any of the three categories were missing, and inserted a row with the desired default values for each missing one. Below is some example code (just to give you an idea):
SELECT (
CASE
WHEN A = '99' or B = '99' THEN 3
WHEN C + D >= 10 THEN 1
ELSE 2
END
) as Category, [Table A].*
INTO #Temp
FROM [Table A]
IF NOT EXISTS (SELECT 1 FROM #Temp WHERE Category=1)
BEGIN
INSERT INTO #Temp (Category,column1,column2,column3,etc...) VALUES ( 1,0,0,0,etc...
)
END
IF NOT EXISTS (SELECT 1 FROM #Temp WHERE Category=2)
BEGIN
INSERT INTO #Temp (Category,column1,column2,column3,etc...) VALUES ( 2,0,0,0,etc...
)
END
IF NOT EXISTS (SELECT 1 FROM #Temp WHERE Category=3)
BEGIN
INSERT INTO #Temp (Category,column1,column2,column3,etc...) VALUES ( 3,0,0,0,etc...
)
END
SELECT * FROM #Temp
DROP TABLE #temp
You can use a query something like this:
select cat.Category
, a.ID
, b.CITY
, a.BGDATE
, a.REFNO
, a.SBRST
from
(
select Category = 1
union all
select Category = 2
union all
select Category = 3
) cat
left join
(
[Table - A] a
inner join [Table - B] b on a.ID = b.ID
cross apply
(
SELECT Category = CASE WHEN a.BGDATE = '09/09/1900' THEN 3
WHEN a.SBRST IN ('11','12') THEN 1
ELSE 2
END
) c
) on c.Category = cat.Category
order by Category desc
I've created a SQL Fiddle which shows this gives the required results.
The key point to note is that I am using a subquery to create the required categories (the cat subquery), then using a left join to join this to the actual results - this makes sure all required categories are always included.
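For the report totals, the same pattern (a fixed list of categories on the left side of a LEFT JOIN) can produce the zero counts directly. The sketch below is an illustration only, run against an in-memory SQLite database rather than SQL Server: the table name table_a and its three columns are a cut-down, hypothetical version of Table A from the question, and the CASE expression is the one used in the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_a (id INTEGER, bgdate TEXT, sbrst TEXT)")
conn.executemany("INSERT INTO table_a VALUES (?, ?, ?)", [
    (1209, "09/09/1900", "13"),
    (3453, "14/02/1978", "10"),
    (3476, "02/03/1980", "10"),
])

query = """
    WITH cat(category) AS (VALUES (1), (2), (3)),
    classified AS (
        SELECT id,
               CASE WHEN bgdate = '09/09/1900' THEN 3
                    WHEN sbrst IN ('11', '12') THEN 1
                    ELSE 2
               END AS category
          FROM table_a
    )
    -- COUNT(classified.id) ignores the NULL produced by an unmatched
    -- category, so a category with no rows is reported as 0.
    SELECT cat.category, COUNT(classified.id) AS n
      FROM cat
      LEFT JOIN classified ON classified.category = cat.category
     GROUP BY cat.category
     ORDER BY cat.category DESC
"""
counts = conn.execute(query).fetchall()
print(counts)
```

Counting a column from the right-hand side, rather than COUNT(*), is what turns the missing category into a zero instead of a one.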

Can I get the minimum of 2 columns which is greater than a given value using only one scan of a table

This is my example data (there are no indexes and I do not want to create any):
CREATE TABLE tblTest ( a INT , b INT );
INSERT INTO tblTest ( a, b ) VALUES
( 1 , 2 ),
( 5 , 1 ),
( 1 , 4 ),
( 3 , 2 )
I want the minimum value across both column a and column b which is greater than a given value. E.g. if the given value is 3 then I want 4 to be returned.
This is my current solution:
SELECT MIN (subMin) FROM
(
SELECT MIN (a) as subMin FROM tblTest
WHERE a > 3 -- Returns 5
UNION
SELECT MIN (b) as subMin FROM tblTest
WHERE b > 3 -- Returns 4
)
This searches the table twice - once to get min(a) once to get min(b).
I believe it should be faster to do this with just one pass. Is this possible?
You want to use conditional aggregation for this:
select min(case when a > 3 then a end) as minA,
min(case when b > 3 then b end) as minB
from tblTest;
To get the minimum of both values, you can use a SQLite extension, which handles multiple values for min():
select min(min(case when a > 3 then a end),
min(case when b > 3 then b end)
)
from tblTest
The only issue is that the min will return NULL if either argument is NULL. You can fix this by doing:
select coalesce(min(min(case when a > 3 then a end),
min(case when b > 3 then b end)
),
min(case when a > 3 then a end),
min(case when b > 3 then b end)
)
from tblTest
This version will return the minimum value, subject to your conditions. If one of the conditions has no rows, it will still return the minimum of the other value.
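The queries above can be tried out from Python's built-in sqlite3 module. The sketch below computes both conditional minimums in a single-scan derived table and then combines them with the COALESCE fallback; the helper name smallest_above and the :threshold parameter are introduced for the example and are not part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblTest (a INT, b INT)")
conn.executemany("INSERT INTO tblTest VALUES (?, ?)",
                 [(1, 2), (5, 1), (1, 4), (3, 2)])

QUERY = """
    SELECT COALESCE(MIN(min_a, min_b), min_a, min_b)
      FROM (SELECT MIN(CASE WHEN a > :threshold THEN a END) AS min_a,
                   MIN(CASE WHEN b > :threshold THEN b END) AS min_b
              FROM tblTest)
"""

def smallest_above(threshold):
    """Smallest value in column a or b greater than `threshold`, or None."""
    return conn.execute(QUERY, {"threshold": threshold}).fetchone()[0]

print(smallest_above(3))
print(smallest_above(4))
print(smallest_above(5))
```

With a threshold of 4 only column a has a qualifying value, so the two-argument MIN returns NULL and the COALESCE falls through to min_a.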
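Off the top of my head, you could modify the table and add a column that stores the minimum of the two columns, then query that column.
Or you can do this:
select min(val)
from
(
select min(case when a > 3 then a else b end,
case when b > 3 then b else a end) as val
from tblTest
)
where
val > 3
The inner SELECT scans the table once and keeps, for each row, the smallest value above 3 where one exists; the outer SELECT reads only that derived result, not the table itself.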