Getting rid of redundant rows in SQL DB2

I have the following data format in sql db2:
ID Test_no Result
-- ------- ------
01 1 A
01 2 B
01 3 B
02 1 A
03 1 B
03 2 C
04 1 A
where each person can take a maximum of 3 tests, although some take only the minimum of 1 test (the criteria are irrelevant). I have been asked to produce the table in, and I hate to use this phrase, "wide format", i.e.
ID Test1 Test2 Test3
-- ----- ----- -----
01 A B B
02 A NULL NULL
03 B C NULL
04 A NULL NULL
where each person has one record, which holds the result of each test they took (although I don't like working in this format!). I can do something like
select distinct ID,
case when Test_no = 1 then Result end as Test1,
case when Test_no = 2 then Result end as Test2,
case when Test_no = 3 then Result end as Test3
from my_table
however, of course, this generates a new row for each non-null test result, and I end up with:
ID Test1 Test2 Test3
-- ----- ----- -----
01 A NULL NULL
01 NULL B NULL
01 NULL NULL B
.
.
.
How do I remove the extra rows that are generated for each non-null test result, i.e. get the previous table?
Thanks very much.

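Try it this way, aggregating each CASE expression with MAX and grouping by ID so that each person collapses to a single row: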
SELECT ID,
MAX(case when Test_no = 1 then Result end) as Test1,
MAX(case when Test_no = 2 then Result end) as Test2,
MAX(case when Test_no = 3 then Result end) as Test3
FROM my_table
GROUP BY ID
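To see the effect of the GROUP BY on the sample data, here is a minimal sketch that runs the same query through Python's sqlite3 module. SQLite stands in for DB2 here, which is an assumption; the column types and the added ORDER BY are also mine, not part of the answer.

```python
import sqlite3

# In-memory SQLite database standing in for DB2 (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (ID TEXT, Test_no INTEGER, Result TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [("01", 1, "A"), ("01", 2, "B"), ("01", 3, "B"),
     ("02", 1, "A"), ("03", 1, "B"), ("03", 2, "C"), ("04", 1, "A")],
)

# MAX ignores NULLs, so each CASE column keeps the single non-null
# result (if any) among the rows that share an ID.
pivot_sql = """
SELECT ID,
       MAX(CASE WHEN Test_no = 1 THEN Result END) AS Test1,
       MAX(CASE WHEN Test_no = 2 THEN Result END) AS Test2,
       MAX(CASE WHEN Test_no = 3 THEN Result END) AS Test3
FROM my_table
GROUP BY ID
ORDER BY ID
"""
pivot_rows = conn.execute(pivot_sql).fetchall()
for row in pivot_rows:
    print(row)
```

Each ID comes back exactly once, with None (NULL) in the columns for tests that person did not take.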

Related

SQL: How to get result set containing single record for each user with existing value

The question can be misleading, I can explain the question here:
userId col1 col2
001 null null
001 1 null
002 1 1
002 null 1
002 null null
003 null 1
003 1 null
Final result of query
001 1 null
002 1 1
003 1 1
I have multiple records in a table for each user. Some columns contain values and some don't. I want the final result as shown above. If there exists a value for any of the columns in any row for a user, I want that value in the final result set.
I hope the example above makes it clear.
Use group by aggregation with max:
select userId, max(col1) as c1, max(col2) as c2
from userTbl
group by userId
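A minimal sketch of that aggregation on the sample rows, run through Python's sqlite3 module. The database engine, the column types, and the ORDER BY are assumptions made for the illustration; the question does not name them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE userTbl (userId TEXT, col1 INTEGER, col2 INTEGER)")
conn.executemany(
    "INSERT INTO userTbl VALUES (?, ?, ?)",
    [("001", None, None), ("001", 1, None),
     ("002", 1, 1), ("002", None, 1), ("002", None, None),
     ("003", None, 1), ("003", 1, None)],
)

# MAX skips NULLs; a column stays NULL only if every row for that
# user is NULL in that column.
collapsed = conn.execute(
    """
    SELECT userId, MAX(col1) AS c1, MAX(col2) AS c2
    FROM userTbl
    GROUP BY userId
    ORDER BY userId
    """
).fetchall()
print(collapsed)
```

Note that if a user had several different non-null values in one column, MAX would silently pick the largest, which may or may not be what is wanted.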

select statement to return value from a different row

I have a select query that returns these results:
Transaction Type Value
----------- ---- -----
1 A Null
1 A Null
1 B 1234
2 A Null
2 A Null
2 B 4321
How would I write it so that the Null values from type A get replaced by the value in type B? To get this result:
Transaction Type Value
----------- ---- -----
1 A 1234
1 A 1234
1 B 1234
2 A 4321
2 A 4321
2 B 4321
You can use window functions:
select transaction, type, value,
max(case when type = 'B' then value end) over (partition by transaction) as new_value
from t;
If you just want the non-NULL value for the transaction:
select transaction, type, value,
max(value) over (partition by transaction) as new_value
from t;
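As a rough check of the first query, here is a sketch using Python's sqlite3 module. It assumes an SQLite build with window-function support (3.25 or later), and it renames the transaction column to a hypothetical txn_id, because TRANSACTION is a keyword in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# txn_id is a hypothetical stand-in for the "transaction" column.
conn.execute("CREATE TABLE t (txn_id INTEGER, type TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, "A", None), (1, "A", None), (1, "B", 1234),
     (2, "A", None), (2, "A", None), (2, "B", 4321)],
)

# The window MAX is evaluated once per txn_id partition and repeated
# on every row of that partition, so no rows are collapsed.
filled = conn.execute(
    """
    SELECT txn_id, type,
           MAX(CASE WHEN type = 'B' THEN value END)
               OVER (PARTITION BY txn_id) AS new_value
    FROM t
    ORDER BY txn_id, type
    """
).fetchall()
for row in filled:
    print(row)
```

Unlike GROUP BY, the window form keeps all six input rows and only adds the filled-in column.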

Incrementally comparing multiple value sets or lists between two tables in Oracle

I am trying to compare two sets of values between 2 Oracle tables as below. I am trying to look for and match groups of data in table B with those in table A. The group number is common between the tables.
It's considered a match only if all groups and values under an id in Table A equal the group and value pairs in Table B. I have highlighted the 'matches' in green. Table A could have a variable number of group/value pairs under an ida value. There could be ids that have only one group/value pair and there could be some that have 3 group/value pairs.
Comparison Example
Ida GroupA Vala|GroupB Valb| Match?
------------------------------------------------------------------------
50 1 4 | 1 1 | No - Value doesn't match
56 1 5 | 1 1 | No - Value doesn't match
57 1 1 | 1 1 | Yes - Both Groups (1&2) and Values match
57 2 101 | 2 101 | Yes - Both Group (1&2)and Values match
94 1 1 | 1 1 | Yes - Group and Value match
96 1 1 | 1 1 | No - Only group 1 matches
96 2 102 | 2 101 | No - Only group 1 matches. Group 2 doesn't
Trial (and Error!)
I figured I would have to use some sort of count and tried using a partition by to count the groups in Table A. But I am not sure how to use this in a query to do a sequential/multi-value comparison. I looked up hierarchical functions but realized they may not fit here. What would be the best approach to deal with such a data comparison? Thanks for your help.
Happy Halloween! :)
select a.*,MAX(a.groupa) OVER (PARTITION BY a.ida ORDER BY a.groupa desc)
occurs
from tab_a a, tab_b b
where a.groupa=b.groupb and a.vala=b.valb
and a.groupa<=3
Tables
Tables A and B
create table tab_a
(
ida number,
groupa number,
vala number
);
create table tab_b
(
idb number,
groupb number,
valb number
);
Data
insert into tab_a values (50,1,4);
insert into tab_a values (56,1,5);
insert into tab_a values (57,1,1);
insert into tab_a values (57,2,101);
insert into tab_a values (58,1,1);
insert into tab_a values (58,2,104);
insert into tab_a values (60,2,102);
insert into tab_a values (94,1,1);
insert into tab_a values (95,1,1);
insert into tab_a values (95,2,101);
insert into tab_a values (96,1,1);
insert into tab_a values (96,2,102);
insert into tab_a values (97,1,1);
insert into tab_a values (97,2,101);
insert into tab_a values (97,3,201);
insert into tab_b values (752,1,1);
insert into tab_b values (752,2,101);
insert into tab_b values (752,3,201);
I don't think this is all the way there but might get you started. You can do:
select a.*, b.*,
count(case when a.groupa = b.groupb and a.vala = b.valb then a.ida end)
over (partition by a.ida) match_count,
count(distinct a.groupa||':'||a.vala)
over (partition by a.ida) val_count
from tab_a a
full outer join tab_b b on b.groupb = a.groupa and b.valb = a.vala
where a.groupa <= 3;
The distinct may not be needed, and the concatenation with the colon needs to use a character that isn't in any real value, I suppose, to avoid the potential for false matches.
That gets:
IDA GROUPA VALA IDB GROUPB VALB MATCH_COUNT VAL_COUNT
--- ------ ---- ---- ------ ---- ----------- ----------
50 1 4 0 1
56 1 5 0 1
57 1 1 752 1 1 2 2
57 2 101 752 2 101 2 2
58 1 1 752 1 1 1 2
58 2 104 1 2
60 2 102 0 1
94 1 1 752 1 1 1 1
95 1 1 752 1 1 2 2
95 2 101 752 2 101 2 2
96 1 1 752 1 1 1 2
96 2 102 1 2
97 1 1 752 1 1 3 3
97 2 101 752 2 101 3 3
97 3 201 752 3 201 3 3
And then use that as a CTE or inline view and decode the results:
with t as (
select a.ida, a.groupa, a.vala, b.groupb, b.valb,
count(case when a.groupa = b.groupb and a.vala = b.valb then a.ida end)
over (partition by a.ida) match_count,
count(distinct a.groupa||':'||a.vala)
over (partition by a.ida) val_count
from tab_a a
full outer join tab_b b on b.groupb = a.groupa and b.valb = a.vala
where a.groupa <= 3
)
select ida, groupa, vala, groupb, valb,
case
when match_count = 0 then 'No - Value doesn''t match'
when match_count = val_count and val_count = 1
then 'Yes - Group and Value match'
when match_count = val_count and val_count = 2
then 'Yes - Both Group (1&2) and Values match'
when match_count < val_count and val_count = 2 and valb is not null
then 'No - Only group 1 matches'
when match_count < val_count and val_count = 2 and valb is null
then 'No - Only group 1 matches. Group 2 doesn''t'
else 'Unknown scenario?'
end as "Match?"
from t;
Which gets:
IDA GROUPA VALA GROUPB VALB Match?
--- ------ ---- ------ ---- ------------------------------------------
50 1 4 No - Value doesn't match
56 1 5 No - Value doesn't match
57 1 1 1 1 Yes - Both Group (1&2) and Values match
57 2 101 2 101 Yes - Both Group (1&2) and Values match
58 1 1 1 1 No - Only group 1 matches
58 2 104 No - Only group 1 matches. Group 2 doesn't
60 2 102 No - Value doesn't match
94 1 1 1 1 Yes - Group and Value match
95 1 1 1 1 Yes - Both Group (1&2) and Values match
95 2 101 2 101 Yes - Both Group (1&2) and Values match
96 1 1 1 1 No - Only group 1 matches
96 2 102 No - Only group 1 matches. Group 2 doesn't
97 1 1 1 1 Yes - All Group (1&2&3) and Values match
97 2 101 2 101 Yes - All Group (1&2&3) and Values match
97 3 201 3 201 Yes - All Group (1&2&3) and Values match
I think that gets the match result you showed in your examples; not sure if the others you didn't show are what you want... ID 97 matches on three groups/values, and it's easy enough to do:
when match_count = val_count and val_count = 3
then 'Yes - All Group (1&2&3) and Values match'
for that exact match, but figuring out what to show if one or two of those three match is trickier. You could also capture the min and max B values that do match and work out from those which one(s) are missing; but then you might add a fourth group, and it doesn't scale.
This query should work:
select a.ida
from tab_a a
where a.groupa||a.vala in
(select b.groupb|| b.valb from tab_b b where b.groupb = a.groupa )
group by a.ida
having count(distinct a.groupa||a.vala) =
(select count(distinct a1.groupa||a1.vala)
from tab_a a1
where a1.ida = a.ida)
Bit of explanation:
1. The where clause keeps the rows from tab_a
whose group+val combo exists in tab_b.
- So let's say 2 (out of 2) rows in tab_a
match 2 (out of 3) rows in tab_b.
2. The left-hand side of the having clause counts
the distinct group+val combos among those matched rows.
- So here that count is 2.
3. The right-hand side of the having clause
provides the total number of distinct group+val combos
for that ida (regardless of any match with tab_b).
- Here we enforce that the matched count equals
the total count. So if in #1 above
only 1 row of tab_a matched (out of its 2 rows),
then #3 will exclude that ida.
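The counting idea above can also be sketched without string concatenation, since joining the two numbers with no separator can collide (1 followed by 11 and 11 followed by 1 both produce '111'). The variant below is not the answer's query: it swaps in a LEFT JOIN and compares the matched count with the total count per ida. It runs on SQLite through Python's sqlite3 module rather than Oracle, and it assumes the (groupb, valb) pairs in tab_b are unique, as they are in the sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab_a (ida INTEGER, groupa INTEGER, vala INTEGER)")
conn.execute("CREATE TABLE tab_b (idb INTEGER, groupb INTEGER, valb INTEGER)")
conn.executemany(
    "INSERT INTO tab_a VALUES (?, ?, ?)",
    [(50, 1, 4), (56, 1, 5), (57, 1, 1), (57, 2, 101), (58, 1, 1),
     (58, 2, 104), (60, 2, 102), (94, 1, 1), (95, 1, 1), (95, 2, 101),
     (96, 1, 1), (96, 2, 102), (97, 1, 1), (97, 2, 101), (97, 3, 201)],
)
conn.executemany(
    "INSERT INTO tab_b VALUES (?, ?, ?)",
    [(752, 1, 1), (752, 2, 101), (752, 3, 201)],
)

# Concatenation without a separator: both sides become the text '111'.
collides = conn.execute("SELECT (1 || 11) = (11 || 1)").fetchone()[0]

# COUNT(b.groupb) counts only the tab_a rows that found a partner in
# tab_b; COUNT(*) counts every tab_a row for the ida.
full_matches = [
    row[0]
    for row in conn.execute(
        """
        SELECT a.ida
        FROM tab_a a
        LEFT JOIN tab_b b
               ON b.groupb = a.groupa AND b.valb = a.vala
        GROUP BY a.ida
        HAVING COUNT(b.groupb) = COUNT(*)
        ORDER BY a.ida
        """
    )
]
print(collides, full_matches)
```

If tab_b could contain duplicate pairs, the join would multiply rows and the two counts would need DISTINCT handling, so treat this only as a sketch for data shaped like the sample.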
It's not perfect, but match_strength 2 means that both columns match and match_strength 1 means that only one column matches.
select * from (
select a.*, b.*, case when (a.vala = b.valb and a.groupa = b.groupb) then 2
when (a.vala = b.valb or a.groupa = b.groupb) then 1
else 0 end as match_strength,
row_number() over (partition by a.rowid order by
case when (a.vala = b.valb and a.groupa = b.groupb) then 2
when (a.vala = b.valb or a.groupa = b.groupb) then 1
else 0 end desc) r
from tab_a a, tab_b b)
where r = 1;
If you want to know exactly which column matches, you can play with the order by clause.
Assuming the requirement is to find every ida for which all of its (groupa, vala) pairs can be found in tab_b (with no further information on why the ones that failed, failed), you could use the query below. The inner query actually shows why the ones that failed, failed (if you select * instead of just the ida). There is only one unusual thing in this solution: I had heard of using the IN condition (and similar) with pairs, or tuples in general, instead of scalar values, but I hadn't used it until today. I just tested it on your data and it works perfectly fine.
This works in the following general sense: it is not necessary to assume that groupa is unique for each ida, or the same for tab_b; that is, (ida, groupa) does not have to be unique in the first table, nor does (idb, groupb) in the second table.
select distinct ida from tab_a where ida not in
(select ida from tab_a where (groupa, vala) not in (select groupb, valb from tab_b));
IDA
------
57
95
94
97
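The tuple form of NOT IN can be tried outside Oracle as well. The sketch below runs the same query on SQLite through Python's sqlite3 module, assuming a build with row-value support (3.15 or later); the ORDER BY is added only to make the output order predictable. One caveat that holds in general: NOT IN returns no rows if the subquery yields a NULL, so this relies on ida, groupb and valb being non-null, as in the sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab_a (ida INTEGER, groupa INTEGER, vala INTEGER)")
conn.execute("CREATE TABLE tab_b (idb INTEGER, groupb INTEGER, valb INTEGER)")
conn.executemany(
    "INSERT INTO tab_a VALUES (?, ?, ?)",
    [(50, 1, 4), (56, 1, 5), (57, 1, 1), (57, 2, 101), (58, 1, 1),
     (58, 2, 104), (60, 2, 102), (94, 1, 1), (95, 1, 1), (95, 2, 101),
     (96, 1, 1), (96, 2, 102), (97, 1, 1), (97, 2, 101), (97, 3, 201)],
)
conn.executemany(
    "INSERT INTO tab_b VALUES (?, ?, ?)",
    [(752, 1, 1), (752, 2, 101), (752, 3, 201)],
)

# Inner query: every ida that has at least one pair missing from tab_b.
# Outer query: every ida that is not in that "failed" set.
matching_ids = [
    row[0]
    for row in conn.execute(
        """
        SELECT DISTINCT ida
        FROM tab_a
        WHERE ida NOT IN (
            SELECT ida
            FROM tab_a
            WHERE (groupa, vala) NOT IN (SELECT groupb, valb FROM tab_b)
        )
        ORDER BY ida
        """
    )
]
print(matching_ids)
```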

How do I `group by` rows and columns in SQLite3?

SQLite database table table1
user command date location
---------- ---------- ---------- ----------
user1 cmd1 2015-01-01 xxxdeyyy
user2 cmd1 2015-01-01 zzzfrxxx
user3 cmd1 2015-01-01 yyyukzzz
user1 cmd1 2015-01-01 xxxdezzz
...
Expected output
Output for where command='cmd1':
month users_de users_fr users_es
-------- -------- -------- --------
01 1 0 5
02 2 0 0
03 0 2 1
04 5 0 15
05 1 0 4
06 11 1 2
07 9 0 3
08 1 0 5
09 0 0 5
10 0 0 0
11 1 0 0
12 1 4 5
It is grouped by month (from column date) and also grouped by a substring in location (from column location).
Actual output
I can achieve this (per location):
month users_de
-------- --------
01 1
02 2
03 0
...
12 1
using this query:
select strftime('%m',date) as month, count(distinct user) as users_de
from table1
where command='cmd1' and location like '%de%'
group by strftime('%m',date);
I then repeat this query for the other locations (where ... and location like '%fr%'):
month users_fr
-------- --------
01 0
02 0
03 2
...
12 4
and (where ... and location like '%es%'):
month users_es
-------- --------
01 5
02 0
03 1
...
12 5
Is there a way to have all the users_xx columns in one table (as output from SQLite and not through any external (downstream) processing)?
Am I thinking about this in the wrong way (grouping instead of subqueries in the top select)?
You can use a case expression to match each location and count the users when it matches:
select strftime('%m',date) as month,
CASE WHEN location like '%de%' THEN count(distinct user) END as users_de,
CASE WHEN location like '%fr%' THEN count(distinct user) END as users_fr,
CASE WHEN location like '%es%' THEN count(distinct user) END as users_es
from table1
where command='cmd1'
group by strftime('%m',date),location;
I think you want conditional aggregation:
select strftime('%m',date) as month,
count(distinct CASE WHEN location like '%de%' THEN user END) as users_de,
count(distinct CASE WHEN location like '%fr%' THEN user END) as users_fr,
count(distinct CASE WHEN location like '%es%' THEN user END) as users_es
from table1
where command = 'cmd1'
group by strftime('%m',date);
Two notes:
like possibly isn't safe in this context. You have the country code embedded in the string, but the characters "de", "es", or "fr" could appear elsewhere in the string. Your question is not clear on better logic for this.
You should include the year in the date string, but your question specifically includes only the month.
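Here is a small sketch of the conditional-aggregation query, run through Python's sqlite3 module. The first four rows come from the question; the last two rows (user4 and user5) are made up for the illustration, and the ORDER BY is added for a stable output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table1 (user TEXT, command TEXT, date TEXT, location TEXT)"
)
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?, ?)",
    [("user1", "cmd1", "2015-01-01", "xxxdeyyy"),
     ("user2", "cmd1", "2015-01-01", "zzzfrxxx"),
     ("user3", "cmd1", "2015-01-01", "yyyukzzz"),
     ("user1", "cmd1", "2015-01-01", "xxxdezzz"),
     ("user4", "cmd1", "2015-02-03", "aaaesbbb"),   # hypothetical row
     ("user5", "cmd2", "2015-02-04", "aaadebbb")],  # hypothetical row
)

# Each CASE yields the user only for matching locations and NULL
# otherwise; COUNT(DISTINCT ...) ignores the NULLs, giving 0 when
# nothing matches in that month.
monthly = conn.execute(
    """
    SELECT strftime('%m', date) AS month,
           COUNT(DISTINCT CASE WHEN location LIKE '%de%' THEN user END) AS users_de,
           COUNT(DISTINCT CASE WHEN location LIKE '%fr%' THEN user END) AS users_fr,
           COUNT(DISTINCT CASE WHEN location LIKE '%es%' THEN user END) AS users_es
    FROM table1
    WHERE command = 'cmd1'
    GROUP BY strftime('%m', date)
    ORDER BY month
    """
).fetchall()
for row in monthly:
    print(row)
```

user1 appears twice in January with a "de" location but is counted once, and the cmd2 row is filtered out before grouping. Months with no rows at all do not appear; producing them would need a separate list of months to join against.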
Use a query like this:
SELECT strftime('%m',date) AS month,
location,
count(distinct user) AS users_de,
count(distinct user) AS users_fr,
count(distinct user) AS users_es
FROM table1
WHERE command='cmd1' GROUP BY strftime('%m', date), location;

How to insert data into a table based on conditions in oracle/sql

I have a table1
Date Sec_ID Version value col5 col6 col7
20131111 001 1 100
20131112 002 2 99
I have a stored procedure that inserts new data into table1,
so if I insert these new data rows:
20131111 001 2 111
20130101 003 1 88
20131111 004 1 90
table1 will be something like:
Date Sec_ID Version value col5 col6 col7
20131111 001 2 111
20131112 002 2 99
20130101 003 1 88
20131111 004 1 90
Requirement: Date and Sec_ID form the primary key.
For data that has the same Date and the same Sec_ID as an existing row, update its version and other columns.
In this case, for:
20131111 001 1 100
when the new data:
20131111 001 2 111
is inserted,
it'll keep
20131111 001 2 111
only.
Thanks!
It looks like you want to MERGE. If your table was called t42 you could do something like this:
select * from t42;
DT SEC_ID VERSION VALUE
--------- ---------- ---------- ----------
11-NOV-13 1 1 100
12-NOV-13 2 2 99
merge into t42
using (
select date '2013-11-11' as dt, 1 as sec_id, 2 as version, 111 as value
from dual
union all select date '2013-01-01', 3, 1, 88 from dual
union all select date '2013-11-11', 4, 1, 90 from dual
) new_data
on (new_data.dt = t42.dt and new_data.sec_id = t42.sec_id)
when matched then
update set t42.version = new_data.version, t42.value = new_data.value
when not matched then
insert (t42.dt, t42.sec_id, t42.version, t42.value)
values (new_data.dt, new_data.sec_id, new_data.version, new_data.value);
3 rows merged.
select * from t42;
DT SEC_ID VERSION VALUE
--------- ---------- ---------- ----------
11-NOV-13 1 2 111
12-NOV-13 2 2 99
01-JAN-13 3 1 88
11-NOV-13 4 1 90
The new_data is coming from fixed values here, but could come from another table, or as a single row if you're passing values into your stored procedure. The merge itself can be standalone SQL or embedded in a PL/SQL block.
If the new_data fields dt (since date is a reserved word, and a bad name for a column) and sec_id match an existing record, that record is updated with the new version and value. If there is no match a new record is inserted.
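For readers who want to try the update-or-insert behaviour without an Oracle instance, here is a rough analogue using Python's sqlite3 module. It is not MERGE: SQLite has no MERGE statement, so this swaps in SQLite's INSERT ... ON CONFLICT DO UPDATE (available from SQLite 3.24). Storing dt as ISO-format text is also an assumption of the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The composite primary key is what the ON CONFLICT clause targets.
conn.execute(
    """
    CREATE TABLE t42 (
        dt TEXT, sec_id INTEGER, version INTEGER, value INTEGER,
        PRIMARY KEY (dt, sec_id)
    )
    """
)
conn.executemany(
    "INSERT INTO t42 VALUES (?, ?, ?, ?)",
    [("2013-11-11", 1, 1, 100), ("2013-11-12", 2, 2, 99)],
)

# "excluded" refers to the row that failed to insert, playing the
# role of new_data in the MERGE above.
upsert_sql = """
INSERT INTO t42 (dt, sec_id, version, value)
VALUES (?, ?, ?, ?)
ON CONFLICT (dt, sec_id) DO UPDATE
SET version = excluded.version, value = excluded.value
"""
conn.executemany(
    upsert_sql,
    [("2013-11-11", 1, 2, 111), ("2013-01-01", 3, 1, 88), ("2013-11-11", 4, 1, 90)],
)

final_rows = conn.execute(
    "SELECT dt, sec_id, version, value FROM t42 ORDER BY sec_id"
).fetchall()
for row in final_rows:
    print(row)
```

The row for (2013-11-11, 1) is updated in place and the other two new rows are inserted, matching the result table shown for the MERGE.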