Identify uniques on combination of 2 columns in SQL

I am working with somewhat weird data. I want to identify unique records whose columns have a many-to-many relationship, as in the example below. Column B can be null; column A is non-null:
The output should be 2 unique values. The combinations of A and B (1,A), (1,B), (1,null), (2,B) all belong to a single unique entity, and the rest form the second unique entity.
EDIT:
My requirement is not just finding distinct values. Imagine it's like user sessions: user A can have multiple sessions, but they can log in with different logins (such as B) and have the same session id (1), and vice versa. I want to identify unique users based on session and login, where sessions and logins can be n:n.

You could do this using SQLite3:
SELECT DISTINCT ColumnA || ColumnB
FROM Table;
For MySQL I think it is:
SELECT DISTINCT CONCAT(ColumnA, ColumnB)
FROM Table;
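I'm not sure how you want null values to appear in the output. You have 2 options:
If one column is null and the other has a value, output just the value (e.g. B)
If one column is null and the other has a value, output the text 'null' followed by the value (e.g. nullB)
Note that both || in SQLite and CONCAT in MySQL return NULL when either argument is NULL, so the code above needs COALESCE around the nullable column: COALESCE(ColumnB, '') for Option 1, or COALESCE(ColumnB, 'null') for Option 2. That avoids having to update the table to replace null values with the text 'null'.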
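To make the null handling concrete, here is a small sketch using Python's built-in sqlite3 module with an in-memory database. The table name t and the sample rows are made up for illustration. It relies on the fact that in SQLite, || yields NULL whenever either operand is NULL, so the nullable column is wrapped in COALESCE for the second result.

```python
import sqlite3

def distinct_concat(rows):
    """Return (plain, padded) sets of distinct concatenated values.

    plain  uses ColumnA || ColumnB directly (NULL if ColumnB is NULL)
    padded uses COALESCE(ColumnB, 'null') so NULLs show up as text
    """
    conn = sqlite3.connect(":memory:")
    # Hypothetical table: ColumnA is non-null, ColumnB may be NULL.
    conn.execute("CREATE TABLE t (ColumnA INTEGER NOT NULL, ColumnB TEXT)")
    conn.executemany("INSERT INTO t VALUES (?, ?)", rows)
    plain = conn.execute(
        "SELECT DISTINCT ColumnA || ColumnB FROM t"
    ).fetchall()
    padded = conn.execute(
        "SELECT DISTINCT ColumnA || COALESCE(ColumnB, 'null') FROM t"
    ).fetchall()
    conn.close()
    return {r[0] for r in plain}, {r[0] for r in padded}
```

Calling distinct_concat([(1, 'A'), (1, None), (1, 'A')]) gives one set that contains a None entry for the row with a null B, and one set where that row appears as the text 1null.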

Is this what you want?
select distinct a, b
from t
where not exists (select 1
from t t2
where t2.id = t.id and
t2.b not in ('A', 'B')
);
This selects ids that have only 'A', 'B', or NULL values in the second column.
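Since the edit describes sessions and logins that can be linked n:n, one possible reading of the requirement is counting connected components: two rows belong to the same entity if they share an A value or a B value, directly or through a chain of other rows. Below is a minimal union-find sketch of that reading in Python. It assumes the rows have already been fetched as (A, B) tuples; the function name count_entities and any sample rows beyond the four pairs named in the question are hypothetical.

```python
def count_entities(rows):
    """Count groups of rows linked by a shared A or a shared B value."""
    parent = {}

    def find(x):
        # Register unseen nodes as their own root, then walk up to the root
        # while shortening the path (path halving).
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(x, y):
        rx, ry = find(x), find(y)
        if rx != ry:
            parent[rx] = ry

    for a, b in rows:
        a_node = ("A", a)          # tag values so A=1 and B=1 never collide
        find(a_node)               # make sure A is registered even if B is null
        if b is not None:
            union(a_node, ("B", b))

    return len({find(node) for node in parent})
```

With the pairs from the question plus two made-up rows for the second entity, count_entities([(1, 'A'), (1, 'B'), (1, None), (2, 'B'), (3, 'C'), (4, 'C')]) returns 2.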

Related

How to retrieve rows from two SQL tables

Let's say we have two tables as follows:
Table A: [ID, CarName] with one row (1, 'Fiat')
Table B: [ID, FirstName] with one row (1, 'Super Man')
My question is: what kind of query can I run to return the rows from both tables A and B when there is no connection between them?
The returned rows will be:
#1 (1,'Fiat')
#2 (1,'Super Man')
Thanks
Works if both of the tables have the same number of columns:
select
*
from table1
union all
select
*
from table2
;
If the number of columns differs, you need to explicitly select the missing columns as NULL in the table with fewer columns (select A, B, null from table1;)
But it would probably be better if you get the tables to have a common key to join them on.
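As a rough illustration of the NULL-padding idea, the sketch below runs a UNION ALL in an in-memory SQLite database through Python's sqlite3 module. The table names table_a and table_b follow the question; the extra city column and its value are invented here only to give the two tables a different number of columns.

```python
import sqlite3

def union_all_rows():
    """Combine rows from two unrelated tables with different column counts."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE table_a (id INTEGER, car_name TEXT);
        -- city is a hypothetical extra column, not from the question
        CREATE TABLE table_b (id INTEGER, first_name TEXT, city TEXT);
        INSERT INTO table_a VALUES (1, 'Fiat');
        INSERT INTO table_b VALUES (1, 'Super Man', 'Metropolis');
    """)
    rows = conn.execute("""
        SELECT id, car_name, NULL AS city FROM table_a   -- pad the missing column
        UNION ALL
        SELECT id, first_name, city FROM table_b
    """).fetchall()
    conn.close()
    return rows
```

The result has one row per source row, with None in the padded position for the row that came from table_a.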

Insert columnA values of Table 1 into another table if match occur

I have two tables. Table A has 4 columns and table B has two columns. I want to fill one column of table A from one column of table B where the id matches.
How can I do this? For example, if [Movieid] in the 1st table = [IMDBid] in the second table, then set [count] of table 1 = [CB] of table 2.
I want to do it once for the full table.
[column] -> these are columns
I'm using SQL Server.
Table 1: Movieid, count
Table 2: IMDBid, CB
Result I want: insert the values of the CB column into count where Movieid = IMDBid
Table 1:
(Nick Id,MovieId,Rating,MovId)-> (1,4972,6.25,?)(1,24216,7.25,?)
Table 2:
(Imdbid,Title,ImdbPyId,Id)-> (4972,hello,32450,1)(24216,hi,62450,2)
Insert/fill the value of MovId (table 1) using the values of Id (table 2) where MovieId (table 1) = Imdbid (table 2)
You can do it like,
UPDATE tabl1
SET count = (SELECT CB FROM tabl2 WHERE IMDBid = Movieid)
But it will fail with an error if the subquery returns more than one value.
So make sure to use an appropriate function to get a single value from that subquery, whichever meets your requirements.
If your tables have a 1:1 relationship then it should be fine. But if it is 1:n then you need to use either an aggregate function or the TOP 1 clause in that subquery.
The solution is:
UPDATE tabl1
SET count = (SELECT CB FROM tabl2 WHERE IMDBid = Movieid)
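The correlated-subquery update can be tried out with the sample rows from the question. The sketch below uses Python's sqlite3 module with an in-memory SQLite database as a stand-in for SQL Server, so it is only an approximation: the table names table1 and table2 are assumptions, and SQLite does not raise an error on a multi-row scalar subquery the way SQL Server does. MIN is used as one example of an aggregate that guarantees a single value, and the WHERE EXISTS clause leaves unmatched rows untouched.

```python
import sqlite3

def fill_mov_id():
    """Fill table1.MovId from table2.Id where MovieId matches Imdbid."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE table1 (NickId INTEGER, MovieId INTEGER, Rating REAL, MovId INTEGER);
        CREATE TABLE table2 (Imdbid INTEGER, Title TEXT, ImdbPyId INTEGER, Id INTEGER);
        INSERT INTO table1 VALUES (1, 4972, 6.25, NULL), (1, 24216, 7.25, NULL);
        INSERT INTO table2 VALUES (4972, 'hello', 32450, 1), (24216, 'hi', 62450, 2);
    """)
    conn.execute("""
        UPDATE table1
        SET MovId = (SELECT MIN(t2.Id)              -- aggregate keeps it single-valued
                     FROM table2 AS t2
                     WHERE t2.Imdbid = table1.MovieId)
        WHERE EXISTS (SELECT 1 FROM table2 AS t2    -- skip rows with no match
                      WHERE t2.Imdbid = table1.MovieId)
    """)
    rows = conn.execute(
        "SELECT MovieId, MovId FROM table1 ORDER BY MovieId"
    ).fetchall()
    conn.close()
    return rows
```

After the update, the two sample rows carry MovId values 1 and 2 respectively.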

SQL unpivot & insert

Sorry for the lack of info -- SQL Server 2008.
I'm struggling to get a couple of column values from table A into a new row in table B for each row in A where a column isn't null.
Table A's structure is as:
UserID | ClientUserID | ClientSessionID | [and a load of other irrelevant columns]
Table B:
UserID | Name | Value
I want to create rows in table B for each non-null ClientUserID or ClientSessionID in A - using the column name as B's "Name", and column value as "B's Value".
I'm struggling to write my "unpivot" statement - just getting the syntax correct! I'm trying to follow along with some samples but can't
Here's my SQL query so far - any further help would be appreciated (just getting this SELECT is frustrating me, let alone doing the insert!)
SELECT UserID, ClientUserID, ClientSessionID FROM websiteuser WHERE ClientSessionID IS NOT null
This gives me the rows that I need to perform actions upon -- but I just can't get the syntax correct for UNPIVOTing this data and turning it into my insert.
You can unpivot records in this fashion by using UNION to get each new row:
INSERT INTO TableB (UserID, Name, Value)
SELECT UserID, 'ClientUserID' AS Name, ClientUserID AS Value
FROM TableA
WHERE ClientUserID IS NOT NULL
UNION ALL
SELECT UserID, 'ClientSessionID' AS Name, ClientSessionID AS Value
FROM TableA
WHERE ClientSessionID IS NOT NULL
I am using UNION ALL in this case because UNION implies a DISTINCT operation across the entire set, which should normally be unnecessary when unpivoting unique records.
If your ClientUserID and ClientSessionID columns are not the same datatype, you may have to cast one or both to the same.
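The UNION ALL unpivot can be checked end to end with a small in-memory database. This sketch uses Python's sqlite3 module rather than SQL Server 2008, and the three sample users are invented for illustration; the INSERT ... SELECT statement itself follows the answer's shape.

```python
import sqlite3

def unpivot_users():
    """Turn non-null ClientUserID / ClientSessionID columns into name/value rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE TableA (UserID INTEGER, ClientUserID TEXT, ClientSessionID TEXT);
        CREATE TABLE TableB (UserID INTEGER, Name TEXT, Value TEXT);
        -- hypothetical sample data: user 3 has nothing to unpivot
        INSERT INTO TableA VALUES (1, 'cu-1', 'cs-1'), (2, NULL, 'cs-2'), (3, NULL, NULL);

        INSERT INTO TableB (UserID, Name, Value)
        SELECT UserID, 'ClientUserID', ClientUserID
        FROM TableA WHERE ClientUserID IS NOT NULL
        UNION ALL
        SELECT UserID, 'ClientSessionID', ClientSessionID
        FROM TableA WHERE ClientSessionID IS NOT NULL;
    """)
    rows = conn.execute(
        "SELECT UserID, Name, Value FROM TableB ORDER BY UserID, Name"
    ).fetchall()
    conn.close()
    return rows
```

User 1 produces two rows, user 2 produces one, and user 3 produces none.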

Update a table based on a results of a group by

I've got a tricky update problem I'm trying to solve. There are two tables that contain the same three columns plus additional varied columns, looking like this:
Table1 {pers_id, loc_id, pos, ... }
Table2 {pers_id, loc_id, pos, ... }
None of the fields are unique. The first two fields collectively identify the records in a table (or tables) as belonging to the same entity. Table1 could have 15 records belonging to an entity, and table2 could have 4 records belonging to the same entity. The third column 'pos' is an index from 0 to whatever, and this is the column that I'm trying to update.
In Table1 and in Table2, the pos column begins at 0, and increments based on user selection, so that in the example (15 records in table1 and 4 records in table2), table1 contains 'pos' values of 0 - 14, and Table2 contains 'pos' values of 0-3.
I want to increment the pos field in Table1 with the results of the count of similar entities in Table2. This is the sql statement that correctly gives me the results from table2:
select table2.pers_id, table2.loc_id, count(*) as pos_increment from table2 group by table2.pers_id, table2.loc_id;
The end result of the update, in the example (15 records in table1 and 4 records in table2), would be all records in Table1 of the same entity being incremented by 4 (the result of the specific entity group by). 0 would be changed to 4, 14 to 18, etc.
Is this achievable in a single statement?
Since you only need to increment the pos field the solution is really simple:
update table1 t1
set t1.pos = t1.pos +
(select count(1)
from table2 t2
where t2.pers_id = t1.pers_id
and t2.loc_id = t1.loc_id)
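Here is a small runnable check of the correlated-count update, using Python's sqlite3 module with made-up rows. It is a sketch under a few assumptions: SQLite does not accept a qualified column such as t1.pos in the SET clause, so the statement refers to table1 by name instead of an alias, and entities with no rows in table2 simply get 0 added.

```python
import sqlite3

def increment_pos():
    """Add the per-entity row count from table2 to every pos in table1."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE table1 (pers_id INTEGER, loc_id INTEGER, pos INTEGER);
        CREATE TABLE table2 (pers_id INTEGER, loc_id INTEGER, pos INTEGER);
        -- hypothetical data: entity (1,1) has 2 rows in table2, entity (2,1) has none
        INSERT INTO table1 VALUES (1, 1, 0), (1, 1, 1), (1, 1, 2), (2, 1, 0);
        INSERT INTO table2 VALUES (1, 1, 0), (1, 1, 1);
    """)
    conn.execute("""
        UPDATE table1
        SET pos = pos + (SELECT COUNT(*)
                         FROM table2 AS t2
                         WHERE t2.pers_id = table1.pers_id
                           AND t2.loc_id = table1.loc_id)
    """)
    rows = conn.execute(
        "SELECT pers_id, loc_id, pos FROM table1 ORDER BY pers_id, pos"
    ).fetchall()
    conn.close()
    return rows
```

The three rows of entity (1, 1) move from 0, 1, 2 to 2, 3, 4, while the row of entity (2, 1) stays at 0.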
Yes, this is possible. You can use MERGE for some of these updates, and there are ways to relate values between the update and the subselect. I have done this in the past, but it's tricky and I don't have an existing example.
You can find several examples on this site, some for Oracle and some for other databases, that will work with slight modifications.

SQL Query - Ensure a row exists for each value in ()

Currently struggling to find a way to validate 2 tables (efficiently, since Table A has lots of rows)
I have two tables
Table A
ID
A
B
C
Table matched
ID Number
A 1
A 2
A 9
B 1
B 9
C 2
I am trying to write a SQL Server query that basically checks that for every value in Table A there exists a row for each of a variable set of values (1, 2, 9)
The example above is incorrect because it should have, for every record in A, a corresponding record in Table matched for each value (1, 2, 9). The end goal is:
Table matched
ID Number
A 1
A 2
A 9
B 1
B 2
B 9
C 1
C 2
C 9
I know it's confusing, but in general for every X in (some set) there should be a corresponding record in Table matched. I have obviously simplified things.
Please let me know if you all need clarification.
Use:
SELECT a.id
FROM TABLE_A a
JOIN TABLE_B b ON b.id = a.id
WHERE b.number IN (1, 2, 9)
GROUP BY a.id
HAVING COUNT(DISTINCT b.number) = 3
The DISTINCT in the COUNT prevents duplicates (i.e. A having two records in TABLE_B with the value "2") from being falsely considered a correct record. It can be omitted if the number column has either a unique or a primary key constraint on it.
The HAVING COUNT(...) must equal the number of values provided in the IN clause.
Create a temp table of values you want. You can do this dynamically if the values 1, 2 and 9 are in some table you can query from.
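Then cross join TableA with the temp table and keep the (ID, Number) pairs that are missing from TableMatched (assuming the temp table has a single Number column):
SELECT a.ID, t.Number FROM TableA a CROSS JOIN tempTable t
WHERE NOT EXISTS (SELECT 1 FROM TableMatched m WHERE m.ID = a.ID AND m.Number = t.Number)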
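Both approaches so far can be compared on the sample data from the question. The sketch below uses Python's sqlite3 module with an in-memory database rather than SQL Server. The Required table is a hypothetical stand-in for the temp table of wanted values, and the HAVING clause compares against its row count instead of a hard-coded 3.

```python
import sqlite3

REQUIRED = (1, 2, 9)  # the variable set of values every ID must have

def check_matched():
    """Return (IDs with every required number, missing (ID, Number) pairs)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE TableA (ID TEXT);
        CREATE TABLE TableMatched (ID TEXT, Number INTEGER);
        CREATE TABLE Required (Number INTEGER);  -- hypothetical temp table
        INSERT INTO TableA VALUES ('A'), ('B'), ('C');
        INSERT INTO TableMatched VALUES
            ('A', 1), ('A', 2), ('A', 9), ('B', 1), ('B', 9), ('C', 2);
    """)
    conn.executemany("INSERT INTO Required VALUES (?)", [(n,) for n in REQUIRED])
    # IDs that have a row for every required number
    complete = conn.execute("""
        SELECT a.ID
        FROM TableA a
        JOIN TableMatched m ON m.ID = a.ID
        JOIN Required r ON r.Number = m.Number
        GROUP BY a.ID
        HAVING COUNT(DISTINCT m.Number) = (SELECT COUNT(*) FROM Required)
        ORDER BY a.ID
    """).fetchall()
    # (ID, Number) pairs that should exist but do not
    missing = conn.execute("""
        SELECT a.ID, r.Number
        FROM TableA a CROSS JOIN Required r
        WHERE NOT EXISTS (SELECT 1 FROM TableMatched m
                          WHERE m.ID = a.ID AND m.Number = r.Number)
        ORDER BY a.ID, r.Number
    """).fetchall()
    conn.close()
    return [row[0] for row in complete], missing
```

On the question's data only A is complete, and the missing pairs are exactly the three rows the end-goal table adds for B and C.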
I had this situation one time. My solution was as follows.
In addition to TableA and TableMatched, there was a table that defined the rows that should exist in TableMatched for each row in TableA. Let’s call it TableMatchedDomain.
The application then accessed TableMatched through a view that controlled the returned rows, like this:
create view TableMatchedView as
select a.ID,
d.Number,
m.OtherValues
from TableA a
cross join TableMatchedDomain d
left join TableMatched m on m.ID = a.ID and m.Number = d.Number
This way, the rows returned were always correct. If there were missing rows from TableMatched, then the Numbers were still returned but with OtherValues as null. If there were extra values in TableMatched, then they were not returned at all, as though they didn't exist. By changing the rows in TableMatchedDomain, this behavior could be controlled very easily. If a value were removed from TableMatchedDomain, then it would disappear from the view. If it were added back again in the future, then the corresponding OtherValues would appear again as they were before.
The reason I designed it this way was that I felt that establishing an invariant on the row configuration in TableMatched was too brittle and, even worse, introduced redundancy. So I removed the restriction from groups of rows (in TableMatched) and instead made the entire contents of another table (TableMatchedDomain) define the correct form of the data.
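The behavior of such a view can be sketched with an in-memory SQLite database through Python's sqlite3 module. The sample rows are invented, and the join to TableMatchedDomain is written as an explicit CROSS JOIN, which is an assumption about what the original view intended.

```python
import sqlite3

def view_rows():
    """Show that the view returns exactly one row per (ID, domain Number)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE TableA (ID TEXT);
        CREATE TABLE TableMatchedDomain (Number INTEGER);
        CREATE TABLE TableMatched (ID TEXT, Number INTEGER, OtherValues TEXT);
        INSERT INTO TableA VALUES ('A'), ('B');
        INSERT INTO TableMatchedDomain VALUES (1), (2);
        -- ('A', 3) is an extra row outside the domain; ('A', 2) and ('B', 1) are missing
        INSERT INTO TableMatched VALUES ('A', 1, 'x'), ('A', 3, 'extra'), ('B', 2, 'y');

        CREATE VIEW TableMatchedView AS
        SELECT a.ID, d.Number, m.OtherValues
        FROM TableA a
        CROSS JOIN TableMatchedDomain d
        LEFT JOIN TableMatched m ON m.ID = a.ID AND m.Number = d.Number;
    """)
    rows = conn.execute(
        "SELECT ID, Number, OtherValues FROM TableMatchedView ORDER BY ID, Number"
    ).fetchall()
    conn.close()
    return rows
```

Missing combinations come back with None for OtherValues, and the extra ('A', 3) row does not appear at all.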