collapse staggered records to a single row for repeating keys - sql

I want to collapse a table in SQL to eliminate the NULL values, but the table has repeating keys. For example, I want to collapse this:
key1 key2 v1 v2 v3
1 A a NULL NULL
1 A NULL NULL 9
1 A NULL x NULL
1 A b NULL NULL
1 A NULL NULL 8
1 A NULL x NULL
1 A a NULL NULL
1 A NULL NULL 7
1 A NULL y NULL
1 A b NULL NULL
1 A NULL NULL 6
1 A NULL y NULL
1 B a NULL NULL
1 B NULL NULL 5
1 B NULL z NULL
1 B b NULL NULL
1 B NULL NULL 4
1 B NULL z NULL
1 C a NULL NULL
1 C NULL NULL 10
1 C NULL z NULL
1 C b NULL NULL
1 C NULL NULL 11
1 C NULL z NULL
into this:
key1 key2 v1 v2 v3
1 A a x 9
1 A b x 8
1 A a y 7
1 A b y 6
1 B a z 5
1 B b z 4
1 C a z 10
1 C b z 11
Aggregate functions don't work and I haven't had success with self-join. Any idea?

You have a key/value table. This is something we usually avoid, but sometimes it cannot be avoided. Your original table looks something like this:
key1 key2 col value
1 A v1 a
1 A v1 a
1 A v1 a
1 A v1 a
1 A v1 b
1 A v1 b
1 A v1 b
1 A v1 b
1 A v2 x
1 A v2 x
1 A v2 y
1 A v2 y
...
I am showing the rows in a different order than your query result, but that doesn't matter, for a table has no inherent order; it contains the data as an unordered set. We can see that for the same key 1|A|v1 the table contains different values (four times 'a', four times 'b'). This is unexpected. Usually key/value tables hold one value per key.
So it may be that there is something wrong with your data model. Or the table has more columns, e.g. a date to show history data and also enable us to select the current value for 1|A|v1. Then you'd have to change your original query to take this into account. Or the data model is correct and 1|A|v1 does have four 'a' and four 'b', but then your expected query result makes no sense, for there is nothing to relate v1='a' to v2='x', for instance.
So something is wrong: the data model, the existing query, or the desired result. Find out which.

Have you tried "select distinct"?
Select distinct key1, key2, v1, v2, v3
From SomeTable
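To see why neither plain aggregation nor DISTINCT gets to the eight rows asked for, here is a minimal sketch that loads the sample data into an in-memory SQLite database through Python's sqlite3 module. SQLite and the table name some_table are assumptions made for illustration only; the question does not say which database is in use.

```python
import sqlite3

# Rebuild the staggered table from the question: each desired output row
# is spread over three input rows (v1 first, then v3, then v2).
wanted = {
    "A": [("a", "x", 9), ("b", "x", 8), ("a", "y", 7), ("b", "y", 6)],
    "B": [("a", "z", 5), ("b", "z", 4)],
    "C": [("a", "z", 10), ("b", "z", 11)],
}
rows = []
for key2, triples in wanted.items():
    for v1, v2, v3 in triples:
        rows += [(1, key2, v1, None, None),
                 (1, key2, None, None, v3),
                 (1, key2, None, v2, None)]

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE some_table "
    "(key1 INTEGER, key2 TEXT, v1 TEXT, v2 TEXT, v3 INTEGER)"
)
con.executemany("INSERT INTO some_table VALUES (?, ?, ?, ?, ?)", rows)

# Aggregating per key keeps only one value per column and key pair.
aggregated = con.execute(
    "SELECT key1, key2, MAX(v1), MAX(v2), MAX(v3) FROM some_table "
    "GROUP BY key1, key2 ORDER BY key1, key2"
).fetchall()

# DISTINCT removes exact duplicate rows but leaves the NULLs in place.
distinct_rows = con.execute(
    "SELECT DISTINCT key1, key2, v1, v2, v3 FROM some_table"
).fetchall()

print(aggregated)          # one row per (key1, key2)
print(len(distinct_rows))  # more rows than the eight that are wanted
```

Grouping by the keys gives three rows and DISTINCT gives eighteen, so neither matches the desired eight. Without an extra column that ties a v1 row to its v2 and v3 rows, the query has nothing to group on.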

Related

SQL Return rows with mix of nulls and non nulls in certain columns

If I have the following table
id a b c time
-----------------------------
0 1 4 "ca" 23
1 NULL NULL NULL 18
2 NULL 1 "pn" 13
3 6 NULL "ar" 27
4 1 2 NULL 24
I want to return all rows with at least one null and one non-null in columns a, b, and c. So I want to return:
id a b c time
-----------------------------
2 NULL 1 "pn" 13
3 6 NULL "ar" 27
4 1 2 NULL 24
I know I can write
select *
from table
where ((a is null and (b is not null or c is not null))
or (a is not null and (b is null or c is null)))
But what happens if I need to consider 4 columns or more? It becomes a mess. Note that the table could have 20 or more columns, of which I am only considering a small subset of columns for null/non-null analysis. Is there a concise way of doing this? Thanks
One method would be to unpivot your data, and COUNT the NULL and non-NULL values, and filter on that:
SELECT V.ID,
V.a,
V.b,
V.c,
V.time
FROM (VALUES(0,1,4,'"ca"',23),
(1,NULL,NULL,NULL,18),
(2,NULL,1,'"pn"',13),
(3,6,NULL,'"ar"',27),
(4,1,2,NULL,24))V(ID,a,b,c,time)
CROSS APPLY (SELECT COUNT(UP.V) AS NonNull,
COUNT(CASE WHEN UP.V IS NULL THEN 1 END) AS IsNull
FROM (VALUES(CONVERT(varchar(1),V.a)),
(CONVERT(varchar(1),V.b)),
(CONVERT(varchar(1),V.c)))UP(V))C
WHERE C.[IsNull] > 0
AND C.NonNull > 0;
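The query above relies on T-SQL's CROSS APPLY. As a rough sketch of the same count-and-filter idea on an engine without it, the following uses SQLite through Python's sqlite3 module (an assumption, not the asker's database) and a hypothetical table name t. It swaps the unpivot for a plain sum of IS NULL tests, which SQLite evaluates to 0 or 1.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    'CREATE TABLE t (id INTEGER, a INTEGER, b INTEGER, c TEXT, "time" INTEGER)'
)
con.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?, ?)",
    [(0, 1, 4, "ca", 23), (1, None, None, None, 18), (2, None, 1, "pn", 13),
     (3, 6, None, "ar", 27), (4, 1, 2, None, 24)],
)

# SQLite evaluates "x IS NULL" to 1 or 0, so the sum is the NULL count.
# With three columns checked, a mixed row has 1 or 2 NULLs.
mixed = con.execute(
    "SELECT id FROM t "
    "WHERE (a IS NULL) + (b IS NULL) + (c IS NULL) BETWEEN 1 AND 2 "
    "ORDER BY id"
).fetchall()

print(mixed)  # ids of the rows with at least one NULL and one non-NULL
```

For n columns the upper bound becomes n - 1, so adding a column means adding one term to the sum and bumping that number.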

Assigning string group ID in pandas

I have a data frame (data)
Col 1 Col 2 Combination
1 2 (1,2)
3 4 (3,4)
1 2 (1,2)
2 3 (2,3)
4 6 (4,6)
3 4 (3,4)
I want to assign a group ID based on Col 1 and Col 2 as a categorical variable, not a numerical one.
The output I need:
Col 1 Col 2 Combination GroupID
1 2 (1,2) A
3 4 (3,4) C
1 2 (1,2) A
2 3 (2,3) B
4 6 (4,6) D
3 4 (3,4) C
The GroupID needs to be a categorical data type, not a numerical one, and the labels can follow any order.
I have tried this code, but the GroupID column is treated as a numerical data type:
data['GroupID'] = data.groupby(['Col 1', 'Col 2']).ngroup()
data['GroupID'] = data['GroupID'].astype('category')
Can anyone suggest a proper way to deal with this issue?
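One possible approach, sketched here under the assumption that there are at most 26 distinct groups, is to map the integer produced by ngroup() to a letter before converting to a categorical. The letter mapping via string.ascii_uppercase is an illustrative choice, not something the question prescribes.

```python
import string

import pandas as pd

data = pd.DataFrame({
    "Col 1": [1, 3, 1, 2, 4, 3],
    "Col 2": [2, 4, 2, 3, 6, 4],
})

# ngroup() numbers the groups 0, 1, 2, ... in sorted key order.
codes = data.groupby(["Col 1", "Col 2"]).ngroup()

# Turn each number into a letter (assumes no more than 26 groups),
# then store the labels as a categorical column.
labels = codes.map(lambda i: string.ascii_uppercase[i])
data["GroupID"] = labels.astype("category")

print(data)
print(data["GroupID"].dtype)
```

With more than 26 groups the letters run out; in that case a string label such as "G" followed by the group number would serve the same purpose.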

SAS: Need to count number of instances per id

Let's assume I have table1:
id value1 value2 value3
1 z null null
1 z null null
1 null y null
1 null null x
2 null y null
2 z null null
3 null y null
3 null null null
3 z null null
and I have table2:
id
1
2
3
I want to count number of values in each column per id to have output like this. (ex. id 1 has 2 - z's, one y and one x)
id value1 value2 value3
1 2 1 1
2 1 1 0
3 1 1 0
Need to do this in SAS. There is an example of this in Oracle but not in SAS.
If I understand correctly, this is a simple query using proc sql. For all the ids in the first table:
proc sql;
select id, count(value1) as value1, count(value2) as value2, count(value3) as value3
from table1
group by id;
quit;
count() counts the number of non-NULL values in a column or expression.
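The NULL-skipping behaviour of count() is standard SQL, so it can be checked outside SAS as well. The sketch below runs the same grouping against the sample rows in an in-memory SQLite database via Python's sqlite3 module; SQLite is only a stand-in here, and SAS missing values are represented as NULL.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE table1 (id INTEGER, value1 TEXT, value2 TEXT, value3 TEXT)"
)
con.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?, ?)",
    [(1, "z", None, None), (1, "z", None, None), (1, None, "y", None),
     (1, None, None, "x"),
     (2, None, "y", None), (2, "z", None, None),
     (3, None, "y", None), (3, None, None, None), (3, "z", None, None)],
)

# COUNT(column) counts only the non-NULL values within each group.
counts = con.execute(
    "SELECT id, COUNT(value1), COUNT(value2), COUNT(value3) "
    "FROM table1 GROUP BY id ORDER BY id"
).fetchall()

print(counts)
```

The three result rows match the expected output listed in the question.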

SQL left join - and after on clause is not working

I have a LEFT JOIN scenario in SQL that does not produce the output I need. Below is a description in tabular form, together with the queries I tried.
Table A
A_ID // PK OF TABLE A
IS_ACTIVE // VALUE=1 OR 0
Table B
B_ID // PK OF TABLE B
A_ID // FK OF TABLE A IN TABLE B
Sample Records of Table A
A_ID IS_ACTIVE
1 1
2 0
3 1
4 0
5 0
Sample Records of Table B
B_ID A_ID
1 1
2 1
3 4
4 4
5 4
6 4
Select * from A left join B on A.A_ID=B.A_ID
A_ID IS_ACTIVE B_ID A_ID
1 1 1 1
1 1 2 1
2 0 NULL NULL
3 1 NULL NULL
4 0 3 4
4 0 4 4
4 0 5 4
4 0 6 4
5 0 NULL NULL
Select * from A left join B on A.A_ID=B.A_ID and A.IS_ACTIVE=0
The following is the actual output of the query above; adding AND IS_ACTIVE=0 to the ON clause has no effect on the records.
A_ID IS_ACTIVE B_ID A_ID
1 1 1 1
1 1 2 1
2 0 NULL NULL
3 1 NULL NULL
4 0 3 4
4 0 4 4
4 0 5 4
4 0 6 4
5 0 NULL NULL
The following is the required output that I need to solve my problem.
A_ID IS_ACTIVE B_ID A_ID
1 1 NULL NULL
1 1 NULL NULL
2 0 NULL NULL
3 1 NULL NULL
4 0 3 4
4 0 4 4
4 0 5 4
4 0 6 4
5 0 NULL NULL
I am having trouble getting exactly the records that are required.
I need all records from Table A and the matching records from Table B, but
only those Table B records that belong to Table A rows with is_active=0.
Note: the query should show all records of Table A.
Please help me achieve this with a LEFT JOIN in SQL.
I tried your examples as code. And I get the result you needed. What is the problem?
CREATE TABLE #a(a_id int, is_active bit)
CREATE TABLE #b(b_id int, a_id int)
INSERT INTO #a(a_id,is_active)
VALUES(1,1),(2,0),(3,1),(4,0),(5,0)
INSERT INTO #b(b_id,a_id)
VALUES(1,1),(2,1),(3,4),(4,4),(5,4),(6,4)
SELECT *
FROM #a as a
LEFT JOIN #b as b
ON a.a_id = b.a_id
AND a.is_active = 0
DROP TABLE #a
DROP TABLE #b
Have you tried:
Select * from A left join B on A.A_ID=B.A_ID
Where A.IS_ACTIVE=0
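The two suggestions above behave differently, which a small experiment makes visible. This is a sketch using an in-memory SQLite database through Python's sqlite3 module (an assumption; the answers above target SQL Server temp tables), with the sample rows from the question.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE a (a_id INTEGER, is_active INTEGER)")
con.execute("CREATE TABLE b (b_id INTEGER, a_id INTEGER)")
con.executemany("INSERT INTO a VALUES (?, ?)",
                [(1, 1), (2, 0), (3, 1), (4, 0), (5, 0)])
con.executemany("INSERT INTO b VALUES (?, ?)",
                [(1, 1), (2, 1), (3, 4), (4, 4), (5, 4), (6, 4)])

# Condition in ON: every row of a is kept; b rows attach only when
# the a row has is_active = 0.
in_on = con.execute(
    "SELECT a.a_id, a.is_active, b.b_id FROM a "
    "LEFT JOIN b ON a.a_id = b.a_id AND a.is_active = 0 "
    "ORDER BY a.a_id, b.b_id"
).fetchall()

# Condition in WHERE: rows of a with is_active = 1 are filtered out
# after the join.
in_where = con.execute(
    "SELECT a.a_id, a.is_active, b.b_id FROM a "
    "LEFT JOIN b ON a.a_id = b.a_id "
    "WHERE a.is_active = 0 "
    "ORDER BY a.a_id, b.b_id"
).fetchall()

print(in_on)
print(in_where)
```

With the condition in ON, all five A_ID values appear and only A_ID 4 carries B rows; A_ID 1 shows up once with NULLs rather than twice as in the listed required output. With the condition in WHERE, A_ID 1 and 3 disappear, which conflicts with the note that all records of Table A should be shown.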

how to combine Y or N or null values

I still have not found a solution for the case described in UPDATE 2.
Thanks for your help.
I'll try to explain my issue despite my poor English. I hope someone can solve my problem.
I have the following table:
A B
1 Y
2 null
3 Y
What result do I want?
Depending on the rank in column A, I want to combine column B.
The result in this example is ... no result.
The reason is that there is a NULL at rank 2 while the next and last rank (=3) has a value (=Y).
next example
A B
1 Y
2 null
3 null
result i want is
A B
1 Y
because the path after it is free: rank 2 and the last rank 3 are both NULL.
another example
A B
1 null
2 N
3 null
Again, no result is what I want in this case, because the first rank (=1) has a NULL value.
To summarize: if rank n (e.g. 2) of column B has the value Y or N, then the elements before it (here rank 1) must have the value Y or N.
Thank you very much. I tried different techniques without any success.
UPDATE 1
Thank you for the fast comment.
Some example data with expected results:
example 1
A B
1 Y
2 N
3 null
4 null
expected result
A B
2 N
example 2
A B
1 N
2 Y
3 N
4 null
expected result
A B
3 N
example 3
A B
1 null
2 Y
3 Y
4 null
expected result
no result
example 4
A B
1 Y
2 Y
3 null
4 Y
expected result
no result
UPDATE 2
I forgot the basic case:
A B
1 Y
2 N
3 Y
expected result
A B
3 N
Establish the highest value of A where B is Y or N, and the lowest value of A where B is null. Provided the first value is lower than the second value you have a valid result set.
select yt.A
, yt.B
from
( select max(case when B is not null then A else null end) as max_b_yn
, min(case when B is null then A else null end) as min_b_null
from your_table ) t1
, your_table yt
where ( t1.min_b_null is null
or t1.max_b_yn < t1.min_b_null )
and yt.A = t1.max_b_yn
/
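To check this query against the examples from UPDATE 1, here is a small harness, sketched with Python's sqlite3 module. SQLite stands in for the asker's database (the trailing "/" above suggests Oracle, which is an inference), and the helper name run is made up for illustration.

```python
import sqlite3

QUERY = """
select yt.A, yt.B
from (select max(case when B is not null then A else null end) as max_b_yn,
             min(case when B is null then A else null end) as min_b_null
      from your_table) t1,
     your_table yt
where (t1.min_b_null is null or t1.max_b_yn < t1.min_b_null)
  and yt.A = t1.max_b_yn
"""


def run(values):
    """Load the B values with A = 1, 2, 3, ... and run the query."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE your_table (A INTEGER, B TEXT)")
    con.executemany("INSERT INTO your_table VALUES (?, ?)",
                    list(enumerate(values, start=1)))
    return con.execute(QUERY).fetchall()


print(run(["Y", "N", None, None]))  # example 1
print(run(["N", "Y", "N", None]))   # example 2
print(run([None, "Y", "Y", None]))  # example 3
print(run(["Y", "Y", None, "Y"]))   # example 4
```

Examples 1 and 2 return the rows (2, N) and (3, N), and examples 3 and 4 return nothing, which agrees with the expected results listed in UPDATE 1.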
I think you want the last row before the first NULL. If this is correct, then the following gets what you want:
select t.*
from t join
(select min(A) as NullA from t where B is NULL) t2
on t.A = t2.NullA - 1
Ah, I see. To get the last row with an "N":
select t.*
from t
where t.A in (select max(A) as MaxA
from t join
(select min(A) as NullA from t where B is NULL) t2
on t.A < t2.NullA
where t.B = 'N'
)
OK, I think I have what you need. It grabs the last row that has a value in column B as long as there isn't a row with a NULL that precedes it.
select MAX(A), B
from table_a
where B IS NOT NULL
and table_a.A < (
select MIN(A)
FROM table_a
WHERE B IS NULL
)
Group by B
select top 1 A, B
from #Temp
where B is not null
order by A desc
or
select top 1 A, B
from #Temp
where B = 'N'
order by A desc