Counting total rows and rows under condition - sql

I have table1 looking like:
docid val1 val2 val3 value
----------------------------------
001 1 1 null 10
001 null null null 5
001 1 null 1 20
001 1 null null 7
001 null null null 15
002 null null 1 30
002 null null null 2
I need as output:
Per docid
the total number of rows that exist for that docid
and the sum of value for those rows
the number of rows that hit on the condition: val1 = 1 or val2 = 1 or val3 = 1
and the sum of value for those rows
As follows:
docid total_rows total_rows_value rows_with_val val_rows_value
001 5 57 3 37
002 2 32 1 30
What I have until now:
select [docid],
count(1) as [rows_with_val],
sum([value]) as [val_rows_value]
from table1
where val1 = 1 or val2 = 1 or val3 = 1
group by [docid]
;
This only returns the matching rows, though. How can I get both sets of figures in one query? I understand that I have to drop the where-clause, but where does the condition go then? I have been reading about case expressions (in my select) but don't know how to apply them here.

You can use conditional aggregation:
select docid, count(*) as total_rows, sum(value) as total_rows_value,
sum(case when 1 in (val1, val2, val3) then 1 else 0 end) as rows_with_val,
sum(case when 1 in (val1, val2, val3) then value else 0 end) as val_rows_value
from table1
group by docid
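As a quick check of the conditional aggregation idea, here is a minimal sketch that loads the sample rows into an in-memory SQLite database through Python's sqlite3 module. SQLite is only an assumption made so the example is runnable; the question does not name it, and the bracketed identifiers there suggest SQL Server. The column aliases follow the requested output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table1 (docid TEXT, val1 INTEGER, val2 INTEGER, val3 INTEGER, value INTEGER)"
)
# Sample rows from the question; None is stored as NULL.
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?, ?, ?)",
    [
        ("001", 1, 1, None, 10),
        ("001", None, None, None, 5),
        ("001", 1, None, 1, 20),
        ("001", 1, None, None, 7),
        ("001", None, None, None, 15),
        ("002", None, None, 1, 30),
        ("002", None, None, None, 2),
    ],
)

query = """
SELECT docid,
       COUNT(*)   AS total_rows,
       SUM(value) AS total_rows_value,
       -- the CASE only contributes when one of the three columns equals 1
       SUM(CASE WHEN 1 IN (val1, val2, val3) THEN 1 ELSE 0 END)     AS rows_with_val,
       SUM(CASE WHEN 1 IN (val1, val2, val3) THEN value ELSE 0 END) AS val_rows_value
FROM table1
GROUP BY docid
ORDER BY docid
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Rows where all three columns are NULL fall into the ELSE branch, so they count toward the totals but not toward the conditional sums.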

Related

SQL -- Multiple rows, similar value in a row, need to not show specific values

Here is the issue:
Table name = a
1 2 3
123 1 A
123 1 A
123 2 A
332 1 A
332 1 A
321 2 B
321 2 A
321 1 A
So far what I have is this:
select distinct 1,2,3 from a where a.2='1' and a.3='B';
What it returns is each result (except for 321).
I only want to select values column 1 as long as that value is not in a row where there is a 2 in column 2 or a B in column 3. Is this possible?
"not in a row where there is a 2 in column 2 or a B in column 3" can be expressed as
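select distinct 1,2,3 from a where a.2!='2' and a.3!='B';
or
select distinct 1,2,3 from a where a.2 <> '2' and a.3 <> 'B';
Both conditions must hold for a row to pass, hence and rather than or. Note that this filters individual rows, so a column 1 value is still returned if any one of its rows passes.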
I would use group by and having:
select col1
from t
group by col1
having sum(case when col2 = 2 then 1 else 0 end) = 0 and
sum(case when col3 = 'B' then 1 else 0 end) = 0;
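A minimal sketch of the group by / having approach on the sample data, run in an in-memory SQLite database through Python's sqlite3 module. SQLite and the column names col1, col2, col3 are assumptions made for illustration; the question does not state the database or real column names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (col1 INTEGER, col2 INTEGER, col3 TEXT)")
conn.executemany(
    "INSERT INTO a VALUES (?, ?, ?)",
    [
        (123, 1, "A"), (123, 1, "A"), (123, 2, "A"),
        (332, 1, "A"), (332, 1, "A"),
        (321, 2, "B"), (321, 2, "A"), (321, 1, "A"),
    ],
)

query = """
SELECT col1
FROM a
GROUP BY col1
-- keep a col1 value only if none of its rows has col2 = 2 or col3 = 'B'
HAVING SUM(CASE WHEN col2 = 2 THEN 1 ELSE 0 END) = 0
   AND SUM(CASE WHEN col3 = 'B' THEN 1 ELSE 0 END) = 0
ORDER BY col1
"""
kept = [row[0] for row in conn.execute(query)]
print(kept)
```

Because the test is applied per group rather than per row, 123 and 321 are dropped entirely as soon as one of their rows violates the condition.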

SQL Concatenate Rows by Composite Group

I need to concatenate row values into a column based on which group the row belongs to, using two grouping values.
TBL1
cat1 cat2 cat3 value
---- ---- ---- -----
1 1 lvl1 100
1 2 lvl2 abc
1 3 lvl2 cba
2 1 lvl1 200
2 2 lvl2 abb
3 1 lvl1 100
3 2 lvl2 bbc
3 3 lvl2 acc
3 4 lvl1 400
3 5 lvl2 acc
4 1 lvl1 300
4 2 lvl2 aab
...
TBL2
cat1 cat2 value
---- ---- ---------
1 100 abc, cba
2 200 abb
3 100 bbc, acc
3 400 acc
4 300 aab
...
This is using static DB2 SQL. The actual table has over a thousand records.
At least some versions of DB2 support listagg(), so the tricky part is identifying the groups. You can do this by cumulatively counting the rows where the value is a number. The resulting query is something like this:
select cat1,
max(case when value >= '0' and value <= '999' then value end) as cat2,
listagg(case when not (value >= '0' and value <= '999') then value end, ', ') within group (order by cat2) as value
from (select t.*,
sum(case when value >= '0' and value <= '999' then 1 else 0 end) over (order by cat1, cat2) as grp
from t
) t
group by cat1, grp;
Checking for a number in DB2 can be tricky. The above uses a simple string range comparison that is sufficient for your sample data.
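The grouping idea (a running count that increases at every numeric row) can be sketched outside the database. The following is a plain Python illustration, not DB2 code: it uses str.isdigit as the number test, which is an assumption standing in for the range comparison, and it leaves out the cat3 column.

```python
from itertools import groupby

# (cat1, cat2, value) rows from TBL1
rows = [
    (1, 1, "100"), (1, 2, "abc"), (1, 3, "cba"),
    (2, 1, "200"), (2, 2, "abb"),
    (3, 1, "100"), (3, 2, "bbc"), (3, 3, "acc"), (3, 4, "400"), (3, 5, "acc"),
    (4, 1, "300"), (4, 2, "aab"),
]

def concat_by_group(rows):
    # Step 1: running count of numeric rows, in (cat1, cat2) order.
    grp = 0
    tagged = []
    for cat1, cat2, value in sorted(rows, key=lambda r: (r[0], r[1])):
        if value.isdigit():
            grp += 1
        tagged.append((cat1, grp, value))

    # Step 2: aggregate per (cat1, grp), like GROUP BY cat1, grp.
    result = []
    for (cat1, _grp), members in groupby(tagged, key=lambda r: (r[0], r[1])):
        values = [v for _, _, v in members]
        header = max((v for v in values if v.isdigit()), default=None)
        rest = ", ".join(v for v in values if not v.isdigit())
        result.append((cat1, header, rest))
    return result

tbl2 = concat_by_group(rows)
for row in tbl2:
    print(row)
```

Each numeric row opens a new group, so cat1 = 3 splits into two output rows, one headed by 100 and one by 400.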

SQL Server convert a table column values to names and assign 0 to the new columns if no value for a column

I would like to transform a table from rows to columns in SQL Server.
Given table1:
id value1 value2
1 name1 9
1 name1 26
1 name1 15
2 name2 20
2 name2 18
2 name2 61
I need a table like:
id name1 name2
1 9 0
1 26 0
1 15 0
2 0 20
2 0 18
2 0 61
Can pivot help here? An efficient conversion is preferred because the table is large.
I have tried:
select
id, name1, name2
from
(select
id, value1, value2
from table1) d
pivot
(
max(value2)
for value1 in (name1, name2)
) piv;
But it does not return all rows for the same ID, with 0s filled in for the other column.
Thanks
The 'secret' is to add a column to give uniqueness to each row within your 'nameX' groups. I've used ROW_NUMBER. Although PIVOT requires an aggregate, with our 'faked uniqueness' MAX, MIN etc will suffice. The final piece is to replace any NULLs with 0 in the outer select.
(BTW, we're on 2014 so I can't test this on 2008 - apologies)
SELECT * INTO #Demo FROM (VALUES
(1,'name1',9),
(1,'name1',26),
(1,'name1',15),
(2,'name2',20),
(2,'name2',18),
(2,'name2',61)) A (Id,Value1,Value2)
SELECT
Id
,ISNULL(Name1, 0) Name1
,ISNULL(Name2, 0) Name2
FROM
( SELECT
ROW_NUMBER() OVER ( PARTITION BY Id ORDER BY Id ) rn
,Id
,Value1
,Value2
FROM
#Demo ) A
PIVOT ( MAX(Value2) FOR Value1 IN ( [Name1], [Name2] ) ) AS P;
Id Name1 Name2
----------- ----------- -----------
1 9 0
1 26 0
1 15 0
2 0 20
2 0 18
2 0 61
You can do case-based aggregation with group by:
select id,
max(case when value1 ='name1' then value2 else 0 end) as name1,
max(case when value1 ='name2' then value2 else 0 end) as name2
from Table1
group by id, value1, value2
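A minimal sketch of the case-based query on the sample data, run in an in-memory SQLite database through Python's sqlite3 module (SQLite is an assumption made so the example is runnable; the question targets SQL Server). It also shows one caveat: because value2 is part of the group by, two identical input rows collapse into a single output row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER, value1 TEXT, value2 INTEGER)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [
        (1, "name1", 9), (1, "name1", 26), (1, "name1", 15),
        (2, "name2", 20), (2, "name2", 18), (2, "name2", 61),
    ],
)

query = """
SELECT id,
       MAX(CASE WHEN value1 = 'name1' THEN value2 ELSE 0 END) AS name1,
       MAX(CASE WHEN value1 = 'name2' THEN value2 ELSE 0 END) AS name2
FROM table1
GROUP BY id, value1, value2
ORDER BY id, value2
"""
pivoted = conn.execute(query).fetchall()
print(pivoted)

# Caveat: an exact duplicate of an existing row does not add an output row.
conn.execute("INSERT INTO table1 VALUES (1, 'name1', 9)")
pivoted_with_duplicate = conn.execute(query).fetchall()
print(len(pivoted_with_duplicate))
```

If exact duplicates can occur and must be preserved, the ROW_NUMBER variant shown earlier keeps them apart.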

T-Sql: Select Rows where at least two fields matches condition

I've got a table, let's call it values with a primary key and five integer fields, like this:
id val1 val2 val3 val4 val5
1 4 3 4 5 3
2 2 3 2 2 2
3 5 4 1 3 3
4 1 4 3 4 4
Now I need to select all rows where at least two of the five value fields have the value 4. So the result set should contain the first row (id=1) and the last row (id=4).
I started with a simple OR condition but there are too many combinations. Then I tried a sub-select with HAVING and COUNT but had no success.
Any ideas how to solve this?
You can use VALUES to construct an inline table containing your fields. Then query this table to get rows having at least two fields equal to 4:
SELECT *
FROM mytable
CROSS APPLY (
SELECT COUNT(*) AS cnt
FROM (VALUES (val1), (val2), (val3), (val4), (val5)) AS t(v)
WHERE t.v = 4) AS x
WHERE x.cnt >= 2
Although cross apply is fast, it might be marginally faster to simply use case:
select t.*
from t
where ((case when val1 = 4 then 1 else 0 end) +
(case when val2 = 4 then 1 else 0 end) +
(case when val3 = 4 then 1 else 0 end) +
(case when val4 = 4 then 1 else 0 end) +
(case when val5 = 4 then 1 else 0 end)
) >= 2;
I will also note that case is ANSI standard SQL and available in basically every database.
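Since the case version is portable, it can be tried on the sample data in any engine. Here is a minimal sketch using an in-memory SQLite database through Python's sqlite3 module; SQLite and the table name t are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER PRIMARY KEY, val1 INTEGER, val2 INTEGER, "
    "val3 INTEGER, val4 INTEGER, val5 INTEGER)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, 4, 3, 4, 5, 3),
        (2, 2, 3, 2, 2, 2),
        (3, 5, 4, 1, 3, 3),
        (4, 1, 4, 3, 4, 4),
    ],
)

query = """
SELECT id
FROM t
-- each CASE contributes 1 per matching column; the sum is the match count
WHERE (CASE WHEN val1 = 4 THEN 1 ELSE 0 END) +
      (CASE WHEN val2 = 4 THEN 1 ELSE 0 END) +
      (CASE WHEN val3 = 4 THEN 1 ELSE 0 END) +
      (CASE WHEN val4 = 4 THEN 1 ELSE 0 END) +
      (CASE WHEN val5 = 4 THEN 1 ELSE 0 END) >= 2
ORDER BY id
"""
matching_ids = [row[0] for row in conn.execute(query)]
print(matching_ids)
```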
This is trivial to solve if your data is normalized, so let's use UNPIVOT to normalize the data and then solve it:
declare @t table (id int not null, val1 int not null, val2 int not null,
val3 int not null, val4 int not null, val5 int not null)
insert into @t(id,val1,val2,val3,val4,val5) values
(1,4,3,4,5,3),
(2,2,3,2,2,2),
(3,5,4,1,3,3),
(4,1,4,3,4,4)
select
id
from
@t t
unpivot
(valness for colness in (val1,val2,val3,val4,val5)) r
group by id
having SUM(CASE WHEN valness=4 THEN 1 ELSE 0 END) >= 2
Results:
id
-------
1
4
Of course, you can probably come up with better names than valness and colness that describe what these pieces of data (the numbers being stored and the numbers embedded in the column names) actually are.
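The same normalize-then-aggregate idea can be sketched in engines without UNPIVOT by stacking the columns with UNION ALL. The example below does that in an in-memory SQLite database through Python's sqlite3 module; the UNION ALL substitution, the table name t, and the column alias v are assumptions, not part of the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER PRIMARY KEY, val1 INTEGER, val2 INTEGER, "
    "val3 INTEGER, val4 INTEGER, val5 INTEGER)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?, ?, ?)",
    [(1, 4, 3, 4, 5, 3), (2, 2, 3, 2, 2, 2), (3, 5, 4, 1, 3, 3), (4, 1, 4, 3, 4, 4)],
)

query = """
SELECT id
FROM (
    -- one output row per (id, column value): a hand-written unpivot
    SELECT id, val1 AS v FROM t
    UNION ALL SELECT id, val2 FROM t
    UNION ALL SELECT id, val3 FROM t
    UNION ALL SELECT id, val4 FROM t
    UNION ALL SELECT id, val5 FROM t
) AS unpivoted
GROUP BY id
HAVING SUM(CASE WHEN v = 4 THEN 1 ELSE 0 END) >= 2
ORDER BY id
"""
ids_with_two_fours = [row[0] for row in conn.execute(query)]
print(ids_with_two_fours)
```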

How to check Oracle column values are all the same for a specific ID?

I am trying to figure out the best way to determine, for a specific ID within an Oracle 11g table that has 5 columns and say 100 rows against this ID, if all the column values are the same for these five columns.
For example:
Table Name: TABLE_DATA
Columns:
TD_ID ID COL1 COL2 COL3 COL4 COL5
-----------------------------------------------------------------------
1 1 1 0 3 2 0
2 1 1 0 3 2 0
3 1 1 0 3 2 0
4 1 1 0 3 2 0
5 1 1 0 3 2 0
6 1 1 0 3 2 0
So based on the above example which is just showing 6 rows for now against the ID:1, I want to check that for all COL1, COL2, COL3, COL4 and COL5 values where ID = 1, tell me if all the values are the same from the very first row right down to the last – if so, then return ‘Y’ else return ‘N’.
Given the above example, the result would be ‘Y’ but for instance, if TD_ID = 5 and COL3 = 4 then the result would be ‘N’, as all the column values are not the same, i.e.:
TD_ID ID COL1 COL2 COL3 COL4 COL5
-----------------------------------------------------------------------
1 1 1 0 3 2 0
2 1 1 0 3 2 0
3 1 1 0 3 2 0
4 1 1 0 3 2 0
5 1 1 0 4 2 0
6 1 1 0 3 2 0
I’m just not sure what the fastest approach to determine this is, as the table I am looking at may have more than 2000 rows within the table for a specific ID.
You may also try this:
Select ID
, case when count(distinct COL1 || '|' || COL2 || '|' || COL3 || '|' || COL4 || '|' || COL5) > 1
then 'N'
else 'Y' end RESULT
From TABLE_DATA
Group by id;
This groups by id and counts how many distinct combinations of values there are. The '|' separator keeps different combinations from concatenating to the same string.
If there is only one, all the rows have the same set of values; otherwise they don't.
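When columns are concatenated for a distinct count, a separator between the values guards against two different rows producing the same string. A small sketch of that effect, using an in-memory SQLite database through Python's sqlite3 module; the two-column table demo and its values are hypothetical and chosen only to trigger the collision.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (id INTEGER, col1 INTEGER, col2 INTEGER)")
# Two different rows: (1, 23) and (12, 3) both concatenate to '123'.
conn.executemany("INSERT INTO demo VALUES (?, ?, ?)", [(1, 1, 23), (1, 12, 3)])

without_separator = conn.execute(
    "SELECT COUNT(DISTINCT col1 || col2) FROM demo WHERE id = 1"
).fetchone()[0]
with_separator = conn.execute(
    "SELECT COUNT(DISTINCT col1 || '|' || col2) FROM demo WHERE id = 1"
).fetchone()[0]

print(without_separator, with_separator)
```

NULL values need separate handling, since in Oracle a NULL concatenates as an empty string.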
See if the following is fast enough for you:
SELECT ID, CASE WHEN COUNT(*) > 1 THEN 'No' ELSE 'Yes' END As "Result"
FROM (SELECT DISTINCT ID, COL1, COL2, COL3, COL4, COL5
FROM Table_Data) dist
GROUP BY ID
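A minimal sketch of the DISTINCT subquery approach, run in an in-memory SQLite database through Python's sqlite3 module rather than Oracle (an assumption made so it is runnable). ID 1 holds the six identical rows from the question; ID 2 is a hypothetical second group with one differing COL3 value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_data (td_id INTEGER PRIMARY KEY, id INTEGER, "
    "col1 INTEGER, col2 INTEGER, col3 INTEGER, col4 INTEGER, col5 INTEGER)"
)
same = [(n, 1, 1, 0, 3, 2, 0) for n in range(1, 7)]       # ID 1: all rows equal
mixed = [(7, 2, 1, 0, 3, 2, 0), (8, 2, 1, 0, 4, 2, 0),    # ID 2: hypothetical,
         (9, 2, 1, 0, 3, 2, 0)]                            # one COL3 differs
conn.executemany("INSERT INTO table_data VALUES (?, ?, ?, ?, ?, ?, ?)", same + mixed)

query = """
SELECT id, CASE WHEN COUNT(*) > 1 THEN 'No' ELSE 'Yes' END AS result
FROM (SELECT DISTINCT id, col1, col2, col3, col4, col5
      FROM table_data) AS dist
GROUP BY id
ORDER BY id
"""
results = conn.execute(query).fetchall()
print(results)
```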
Here's a little query you might want to try (you may be able to come up with a better MINUS statement for your case):
SELECT
CASE
WHEN ( -- select count of records from a subquery
SELECT
COUNT(1)
FROM
( -- select all rows where id = 1
SELECT
td.col1
,td.col2
,td.col3
,td.col4
,td.col5
FROM
table_data td
WHERE
td.id = 1
MINUS -- subtract the first row of the table with id = 1
SELECT
td.col1
,td.col2
,td.col3
,td.col4
,td.col5
FROM
table_data td
WHERE
td.id = 1
AND ROWNUM = 1
)
) = 0 -- check if subquery's count equals 0
AND EXISTS ( -- and exists at least 1 row in the table with id = 1
SELECT
1
FROM
table_data td
WHERE
td.id = 1
AND ROWNUM = 1
) THEN 'Y'
ELSE 'N'
END AS equal
FROM
dual
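The set-difference check can be tried outside Oracle as well. The sketch below is an adaptation under stated assumptions: it uses SQLite through Python's sqlite3 module, with EXCEPT in place of MINUS and LIMIT 1 in place of ROWNUM = 1. The helper all_rows_equal and the data for ID 2 are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_data (td_id INTEGER PRIMARY KEY, id INTEGER, "
    "col1 INTEGER, col2 INTEGER, col3 INTEGER, col4 INTEGER, col5 INTEGER)"
)
rows = [(n, 1, 1, 0, 3, 2, 0) for n in range(1, 7)]           # ID 1: all equal
rows += [(7, 2, 1, 0, 3, 2, 0), (8, 2, 1, 0, 4, 2, 0)]        # ID 2: hypothetical
conn.executemany("INSERT INTO table_data VALUES (?, ?, ?, ?, ?, ?, ?)", rows)

CHECK_SQL = """
SELECT CASE
    WHEN (
        SELECT COUNT(*) FROM (
            -- all rows for the id ...
            SELECT col1, col2, col3, col4, col5 FROM table_data WHERE id = :id
            EXCEPT
            -- ... minus one arbitrary row for the same id
            SELECT col1, col2, col3, col4, col5
            FROM (SELECT * FROM table_data WHERE id = :id LIMIT 1)
        )
    ) = 0
    AND EXISTS (SELECT 1 FROM table_data WHERE id = :id)
    THEN 'Y' ELSE 'N'
END
"""

def all_rows_equal(conn, id_value):
    """Return 'Y' if every row for id_value has the same COL1..COL5, else 'N'."""
    return conn.execute(CHECK_SQL, {"id": id_value}).fetchone()[0]

print(all_rows_equal(conn, 1), all_rows_equal(conn, 2), all_rows_equal(conn, 3))
```

EXCEPT works on distinct rows, so anything left over after removing one row's values means at least one other row differs. An ID with no rows at all returns 'N' because of the EXISTS check.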