How to get pairs from sqlite - sql

I am new to databases and am just playing with sqlite3 for a group project.
I have two tables that look something like this:
Table 1:
tv_show_id, tv_show_rating
Table 2:
tv_show_id, cast_id
Each tv_show has one unique id, but Table 2 contains multiple cast_ids for each tv_show.
So we have something like this:
Table 1:
1234, 90
5678, 88
Table 2:
1234, "person 1"
1234, "person 2"
5678, "person 1"
5678, "person 3"
I want the following results: (person a, person b, # of shows together)
(person 1, person 2, 1)
(person 1, person 3, 1)
(person 2, person 1, 1)
(person 2, person 3, 0)
(person 3, person 1, 1)
(person 3, person 2, 0)
How can I use JOINs to get these results?

You can try this. Fiddle
select z.id,count(w.show_id) from
(
select distinct concat(x.cid,',',x.did) as id
from
(
select tt.cast_id as cid,ttt.cast_id as did
from t2 tt,t2 ttt
where tt.cast_id <> ttt.cast_id
) x
left join t2 on x.cid = t2.cast_id
) z
left join (select show_id, group_concat(cast_id order by cast_id) cid
from t2
group by show_id
union all
select show_id, group_concat(cast_id order by cast_id desc) cid
from t2
group by show_id
) w
on z.id = w.cid
group by z.id;

One way to do this (which might not be the most clever) would be to use a cross join to create the set of all possible pairs, and then use a join and a left join to determine whether each pair has worked together.
In this query your source table is called table2:
select
cast1,
cast2,
sum(case when t2.cast_id is null then 0 else 1 end) as worked_together
from (
select distinct
t1.cast_id cast1,
t2.cast_id cast2
from table2 t1,table2 t2
where t1.cast_id != t2.cast_id
) p
join table2 t1 on p.cast1 = t1.cast_id
left join table2 t2 on t1.show_id = t2.show_id
and p.cast2 = t2.cast_id
and t1.cast_id != t2.cast_id
group by cast1, cast2;
Sample SQL Fiddle (tested with SQL.js). The query uses only ANSI SQL and is portable to other databases.
In some other databases you could have done this with a full outer join, but SQLite did not support that construct before version 3.39.0.
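For anyone who wants to try the cross-join idea locally, here is a minimal sketch that runs it against an in-memory SQLite database from Python. It is a variation rather than the exact query above: it counts matches with COUNT over the left-joined column instead of SUM(CASE ...). The function name `pair_counts` and the table layout `table2(show_id, cast_id)` are assumptions for illustration, and it assumes each (show_id, cast_id) row appears only once.

```python
import sqlite3

def pair_counts(rows):
    """Return (cast1, cast2, shows_together) for every ordered pair of distinct cast members."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE table2 (show_id INTEGER, cast_id TEXT)")
    con.executemany("INSERT INTO table2 VALUES (?, ?)", rows)
    query = """
        SELECT p.cast1, p.cast2, COUNT(b.cast_id) AS worked_together
        FROM (
            -- every ordered pair of distinct cast members
            SELECT DISTINCT x.cast_id AS cast1, y.cast_id AS cast2
            FROM table2 x CROSS JOIN table2 y
            WHERE x.cast_id <> y.cast_id
        ) p
        -- one row per show that cast1 appears in
        JOIN table2 a ON a.cast_id = p.cast1
        -- matches only when cast2 is in that same show; otherwise NULL (not counted)
        LEFT JOIN table2 b ON b.show_id = a.show_id AND b.cast_id = p.cast2
        GROUP BY p.cast1, p.cast2
        ORDER BY p.cast1, p.cast2
    """
    result = con.execute(query).fetchall()
    con.close()
    return result

sample = [
    (1234, "person 1"), (1234, "person 2"),
    (5678, "person 1"), (5678, "person 3"),
]
for row in pair_counts(sample):
    print(row)
```

COUNT(b.cast_id) ignores the NULLs produced by the left join, so pairs that never shared a show still appear with a count of 0.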

Related

How to compare two tables and if the values match, update one of the tables with the values from a third table?

I have 3 tables, the MAIN_TABLE, the SUB_TABLE and the ID_TABLE.
I need to compare the CODE in the MAIN_TABLE with the CODE in the SUB_TABLE, and if they match, search for the SUB_ID in the ID_TABLE and update the ID in the MAIN_TABLE with that ID.
In the example shown below, the query should update the MAIN_TABLE with the ID = 2071.
MAIN_TABLE:
CODE | ID
0290380007800 | 994526
SUB_TABLE:
CODE | SUB_ID
029038078 | 106603
ID_TABLE:
ID | SUB_ID
2071 | 106603
To match the code from the MAIN_TABLE with the code from the SUB_TABLE, I need to select it like this:
SELECT
SUBSTRING(CODE, 1, 6) + SUBSTRING(CODE, 9, 3)
FROM
MAIN_TABLE
How can I achieve this?
Here's the dbfiddle with more data in each table: https://dbfiddle.uk/6H_mnPDR?hide=28
Just join your tables together as part of an update statement. Note that this gives you duplicates, but you already had duplicate IDs, so I guess that's expected (although unusual).
UPDATE mt SET
id = it.id
FROM MAIN_TABLE mt
INNER JOIN SUB_TABLE st ON st.code = SUBSTRING(mt.CODE, 1, 6) + SUBSTRING(mt.CODE, 9, 3)
INNER JOIN ID_TABLE it ON it.SUB_ID = st.SUB_ID;
You can also solve this with a subquery. Try the code below:
update MAIN_TABLE set
MAIN_TABLE.ID = Final.ID
from (
select distinct ID_TABLE.ID, MAIN_TABLE.Code
from (
select SUBSTRING(CODE, 1, 6) + SUBSTRING(CODE, 9, 3) as CodeNew, Code
from MAIN_TABLE
) as MAIN_TABLE
inner join SUB_TABLE on MAIN_TABLE.CodeNew = SUB_TABLE.Code
inner join ID_TABLE on ID_TABLE.SUB_ID = SUB_TABLE.SUB_ID
) Final
where MAIN_TABLE.code = Final.Code
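Both answers above are written for SQL Server. As a rough cross-check of the matching logic, here is a sketch of the same update in SQLite, run from Python. It swaps in `substr` and `||` for `SUBSTRING` and `+`, and uses a correlated subquery instead of UPDATE ... FROM. The function name `run_update` and the second, non-matching MAIN_TABLE row are made up for illustration; `LIMIT 1` is an assumption about how to handle duplicate matches.

```python
import sqlite3

def run_update():
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE MAIN_TABLE (CODE TEXT, ID INTEGER);
        CREATE TABLE SUB_TABLE (CODE TEXT, SUB_ID INTEGER);
        CREATE TABLE ID_TABLE (ID INTEGER, SUB_ID INTEGER);
        INSERT INTO MAIN_TABLE VALUES ('0290380007800', 994526);
        -- hypothetical row with no match, to show it is left untouched
        INSERT INTO MAIN_TABLE VALUES ('9999990009990', 5);
        INSERT INTO SUB_TABLE VALUES ('029038078', 106603);
        INSERT INTO ID_TABLE VALUES (2071, 106603);
    """)
    con.execute("""
        UPDATE MAIN_TABLE
        SET ID = (
            SELECT it.ID
            FROM SUB_TABLE st
            JOIN ID_TABLE it ON it.SUB_ID = st.SUB_ID
            WHERE st.CODE = substr(MAIN_TABLE.CODE, 1, 6) || substr(MAIN_TABLE.CODE, 9, 3)
            LIMIT 1
        )
        -- only touch rows that actually have a match
        WHERE EXISTS (
            SELECT 1
            FROM SUB_TABLE st
            JOIN ID_TABLE it ON it.SUB_ID = st.SUB_ID
            WHERE st.CODE = substr(MAIN_TABLE.CODE, 1, 6) || substr(MAIN_TABLE.CODE, 9, 3)
        )
    """)
    rows = con.execute("SELECT CODE, ID FROM MAIN_TABLE ORDER BY CODE").fetchall()
    con.close()
    return rows

print(run_update())
```

The WHERE EXISTS guard matters: without it, rows with no match would have their ID set to NULL.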

How to accumulate matches when querying multiple tables in SQLite?

Table1
ID | Name | CourseID
1 | Course 1 | 4002
2 | Course 2 | 2342
3 | Course 3 | 2410
Table2
CourseID | ProfName
4002 | John
2342 | bob
2410 | Bill
4002 | Hannah
2342 | Cyrus
When I try
SELECT ID, Name, CourseID, ProfName
FROM Table1, Table2
WHERE Table1.CourseID = Table2.CourseID
I get multiple rows returned for the same CourseID, so that when I print the results
"1, Course 1, 4002, John" and
"1, Course 1, 4002, Hannah" are two different outputs.
I would like them to be of the form
"1, Course 1, 4002, John and Hannah"
How can I alter my SQL query to make this happen?
Use string aggregation:
select t1.id, t1.name, t1.courseid, group_concat(t2.profname, ' and ') profnames
from table1 t1
inner join table2 t2 on t1.courseid = t2.courseid
group by t1.id, t1.name, t1.courseid
Note that this uses standard join syntax (join ... on ...) rather than an implicit join (with a comma in the from clause): the old-school implicit syntax should not be used in new code.
You could also use a subquery:
select t1.*,
(
select group_concat(t2.profname, ' and ')
from table2 t2
where t2.courseid = t1.courseid
) profnames
from table1 t1
Unrelated note: ' and ' does not seem like a great choice for a list separator: if there are more than two values, the result is not correct English. A more usual choice is, for example, the comma (,), which is the default separator in SQLite and most other databases.
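To see the aggregation on the sample data from the question, here is a small sketch that runs the group_concat query in an in-memory SQLite database from Python. The function name `professors_per_course` and its `separator` parameter are assumptions for illustration. SQLite makes no promise about the order in which group_concat joins the names, so the example does not rely on it.

```python
import sqlite3

def professors_per_course(separator=", "):
    """Return one row per course with all professor names joined by `separator`."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE table1 (id INTEGER, name TEXT, courseid INTEGER);
        CREATE TABLE table2 (courseid INTEGER, profname TEXT);
        INSERT INTO table1 VALUES (1, 'Course 1', 4002), (2, 'Course 2', 2342), (3, 'Course 3', 2410);
        INSERT INTO table2 VALUES (4002, 'John'), (2342, 'bob'), (2410, 'Bill'),
                                  (4002, 'Hannah'), (2342, 'Cyrus');
    """)
    rows = con.execute("""
        SELECT t1.id, t1.name, t1.courseid, group_concat(t2.profname, ?) AS profnames
        FROM table1 t1
        INNER JOIN table2 t2 ON t1.courseid = t2.courseid
        GROUP BY t1.id, t1.name, t1.courseid
        ORDER BY t1.id
    """, (separator,)).fetchall()
    con.close()
    return rows

for row in professors_per_course(" and "):
    print(row)
```

Each course now comes back as a single row, with the professor names collapsed into the last column.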
I am guessing you want commas between most of the names and 'and' only before the last one. That would be:
select t1.*, t2.profnames
from table1 t1 join
(select t2.courseid,
(coalesce(group_concat(case when seqnum > 1 then profname end, ', ') || ', and ', '') ||
max(case when seqnum = 1 then profname end)
) as profnames
from (select t2.*,
row_number() over (partition by courseid order by profname desc) as seqnum
from table2 t2
) t2
group by t2.courseid
) t2
on t1.courseid = t2.courseid;

How to combine two rows under specific condition

I have following table:
CREATE table table1 (id int , cd date, ct TIME, co text);
INSERT INTO table1
VALUES (0, '1/1/2018', '12:00:00', 'B'),
(1, '1/1/2018', '12:30:00', 'BC'),
(2, '1/12/2018', '12:00:00', 'B'),
(3, '1/22/2018', '12:00:00', 'BC');
I need to combine the "co" values of a 'B' record and a 'BC' record when they have the same "cd" (their "ct" values may be the same or different), and display the values of the 'B' record for the combined 'B-BC' row.
For the above table1 records I need the following result:
"id" "cd" "ct" "co"
"0" "1/1/2018" "12:00:00 PM" "B-BC"
"2" "1/12/2018" "12:00:00 PM" "B"
"3" "1/22/2018" "12:00:00 PM" "BC"
What is the most effective way to do that in PostgreSQL 8.4?
I created two common table expressions (CTEs) to find the B and BC records that will be combined: cte_a holds the ids of the B records to keep, and cte_b holds the ids of the BC records to drop. The query then left joins table1 to cte_a and to cte_b and keeps only the rows where cte_b.id_del is null, which removes the ids found in cte_b. Lastly, a case expression substitutes the new co value (B-BC) for the ids found in cte_a. See demo here: http://sqlfiddle.com/#!15/62caa/49
with cte_a as (select a.id as id_keep
from table1 a
inner join table1 b on a.cd=b.cd
and a.co='B' and b.co='BC')
,cte_b as (select b.id as id_del
from table1 a
inner join table1 b on a.cd=b.cd
and a.co='B' and b.co='BC')
select t1.id,
t1.cd,
t1.ct,
case when cte_a.id_keep is null
then t1.co
else 'B-BC' end as co
from table1 t1
left join cte_a
on t1.id=cte_a.id_keep
left join cte_b
on t1.id = cte_b.id_del
where cte_b.id_del is null;
You could use string_agg (available from PostgreSQL 9.0 onwards):
SELECT cd, MIN(id) AS id, min(ct) AS ct, string_agg(co, '-' ORDER BY ct) AS co
FROM table1
GROUP BY cd;
DBFiddle Demo
EDIT:
Lovely answer! But ct for 'B' is not always the minimum value. I need the ct that belongs specifically to record 'B', which may be greater or less than the ct for 'BC'.
You could use a window function, ordering by co so that the 'B' record comes first (note that LATERAL requires PostgreSQL 9.3 or later):
WITH cte AS (
SELECT *, ROW_NUMBER() OVER(PARTITION BY cd ORDER BY co ASC) AS rn FROM table1
)
SELECT id, cd, ct,s.co
FROM cte
JOIN LATERAL(SELECT string_agg(co, '-' ORDER BY co) AS co
FROM cte c2 WHERE c2.cd=cte.cd)s ON TRUE
WHERE rn=1;
DBFiddle Demo2
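If string_agg and LATERAL are not available, the same result can be sketched with nothing more than CASE and EXISTS. The example below runs that idea against SQLite from Python, purely so that it is self-contained; that is an assumption on my part, since the question targets PostgreSQL. The function name `combine_b_and_bc` is made up, the dates are stored as plain text, and the query hardcodes the two values 'B' and 'BC' from the question.

```python
import sqlite3

def combine_b_and_bc():
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE table1 (id INTEGER, cd TEXT, ct TEXT, co TEXT);
        INSERT INTO table1 VALUES
            (0, '1/1/2018',  '12:00:00', 'B'),
            (1, '1/1/2018',  '12:30:00', 'BC'),
            (2, '1/12/2018', '12:00:00', 'B'),
            (3, '1/22/2018', '12:00:00', 'BC');
    """)
    rows = con.execute("""
        SELECT k.id, k.cd, k.ct,
               -- a 'B' row with a 'BC' partner on the same date becomes 'B-BC'
               CASE WHEN k.co = 'B' AND EXISTS (
                        SELECT 1 FROM table1 p WHERE p.cd = k.cd AND p.co = 'BC')
                    THEN 'B-BC' ELSE k.co END AS co
        FROM table1 k
        -- drop a 'BC' row when a 'B' row exists for the same date
        WHERE NOT (k.co = 'BC' AND EXISTS (
                        SELECT 1 FROM table1 b WHERE b.cd = k.cd AND b.co = 'B'))
        ORDER BY k.id
    """).fetchall()
    con.close()
    return rows

for row in combine_b_and_bc():
    print(row)
```

Because the surviving row is always the 'B' record when one exists, its id and ct are the ones displayed, regardless of whether the 'BC' record has an earlier or later ct.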

How to compare two tables in Hive based on counts

I have below hive tables
Table_1
ID
1
1
2
Table_2
ID
1
2
2
I am comparing the two tables based on the count of each ID in both tables. I need output like the following:
ID
1 - 2 records in Table 1 and 1 record in Table 2
2 - 1 record in Table 1 and 2 records in Table 2
Table_1 is the parent table.
I am using the queries below:
select count(*),ID from Table_1 group by ID;
select count(*),ID from Table_2 group by ID;
Just do a full outer join on your two queries with the on condition X.id = Y.id, and then select from the result, handling the nulls that appear on either side.
select coalesce(X.id, Y.id) as id,
concat(coalesce(X.cnt1, 0), ' entries in table 1, ', coalesce(Y.cnt2, 0), ' entries in table 2')
from (select count(*) as cnt1, id from table_1 group by id) X
full outer join (select count(*) as cnt2, id from table_2 group by id) Y
on X.id = Y.id;
Try this. You can use a CASE expression to choose between "record" and "records".
SELECT m.id,
CONCAT (COALESCE(a.ct, 0), ' record in table 1, ', COALESCE(b.ct, 0),
' record in table 2')
FROM (SELECT id
FROM table_1
UNION
SELECT id
FROM table_2) m
LEFT JOIN (SELECT Count(*) AS ct,
id
FROM table_1
GROUP BY id) a
ON m.id = a.id
LEFT JOIN (SELECT Count(*) AS ct,
id
FROM table_2
GROUP BY id) b
ON m.id = b.id;
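Here is a sketch of the UNION plus LEFT JOIN approach with the CASE expression for "record" versus "records" filled in. It runs against SQLite from Python rather than Hive (an assumption made only so the example is self-contained), so it uses the `||` operator where the Hive query uses CONCAT. The function name `compare_counts` is made up for illustration.

```python
import sqlite3

def compare_counts():
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE table_1 (id INTEGER);
        CREATE TABLE table_2 (id INTEGER);
        INSERT INTO table_1 VALUES (1), (1), (2);
        INSERT INTO table_2 VALUES (1), (2), (2);
    """)
    rows = con.execute("""
        SELECT m.id,
               COALESCE(a.ct, 0)
               || CASE WHEN COALESCE(a.ct, 0) = 1 THEN ' record' ELSE ' records' END
               || ' in table 1, '
               || COALESCE(b.ct, 0)
               || CASE WHEN COALESCE(b.ct, 0) = 1 THEN ' record' ELSE ' records' END
               || ' in table 2' AS summary
        -- every id that occurs in either table
        FROM (SELECT id FROM table_1 UNION SELECT id FROM table_2) m
        LEFT JOIN (SELECT COUNT(*) AS ct, id FROM table_1 GROUP BY id) a ON m.id = a.id
        LEFT JOIN (SELECT COUNT(*) AS ct, id FROM table_2 GROUP BY id) b ON m.id = b.id
        ORDER BY m.id
    """).fetchall()
    con.close()
    return rows

for row in compare_counts():
    print(row)
```

The COALESCE calls turn a missing count into 0, so an id that exists in only one table is still reported.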
You could use this Python program to do a full comparison of 2 Hive tables:
https://github.com/bolcom/hive_compared_bq
If you want a quick comparison just based on counts, then pass the "--just-count" option (you can also specify the group by column with "--group-by-column").
The script also allows you to visually see all the differences on all rows and all columns if you want a complete validation.

SQL - Is it possible to join a table to a resultset created by several select/union-alls?

I'm trying to find a workaround (hack) to some limitations preventing me from using a temporary table or a table variable in my SQL query.
I have a real table (technically it's a derived table that results from an UNPIVOT of a poorly designed table) which lacks several necessary fields. I need to hardcode these fields into the result until we can clean up the database issue.
Given a table like:
tblEntity
ID | Name
1 | One
2 | Two
I need to join several fields such as:
ID | Order
1 | 2
2 | 1
The join would result in:
ID | Name | Order
1 | One | 2
2 | Two | 1
My question is: can I join tblEntity to a resultset created like:
SELECT 1, 2
UNION ALL
SELECT 2, 1
Is it possible to join? If so, what is the syntax?
select te.*, t.Ord from tblEntity te
inner join (
SELECT 1 as Id, 2 as Ord
UNION ALL
SELECT 2, 1
) t on te.ID = t.Id
Making a few assumptions, this would do it:
SELECT en.ID, en.Name, xx.OrderBy
from tblEntity en
inner join (select 1 Id, 2 OrderBy
union all
select 2,1) xx
on xx.Id = en.ID
In SQL Server 2008 and later, it's also possible to use table value constructors:
CREATE TABLE #tblEntity
( ID INT
, Name CHAR(10)
) ;
INSERT INTO #tblEntity
(ID, Name)
VALUES
( 1, 'One' ) ,
( 2, 'Two' ) ;
SELECT
t.ID, t.Name, o.Ordr AS "Order"
FROM
#tblEntity AS t
JOIN
( VALUES
(1,2)
, (2,1)
) AS o(ID, Ordr)
ON o.ID = t.ID ;
You can test the above in: data.stackexchange.com
Many ways of doing this e.g.
WITH T1 (Id, "Order")
AS
(
SELECT 1, 2
UNION ALL
SELECT 2, 1
)
SELECT e.*, T1."Order"
FROM tblEntity e
JOIN T1
ON e.Id = T1.Id;
e.g. 2
SELECT e.*, T1."Order"
FROM tblEntity e
JOIN (
VALUES (1, 2),
(2, 1)
) AS T1 (Id, "Order")
ON e.Id = T1.Id;
e.g. 3
WITH T1
AS
(
SELECT *
FROM (
VALUES (1, 2),
(2, 1)
) AS T (Id, "Order")
)
SELECT e.*, T1."Order"
FROM tblEntity e
JOIN T1
ON e.Id = T1.Id;
...and so on.
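The same idea carries over to other engines. As a quick sanity check, here is a sketch that joins tblEntity to hardcoded rows in SQLite from Python, once through a UNION ALL derived table and once through a CTE built from VALUES. Using SQLite here is an assumption for the sake of a self-contained example (the answers above target SQL Server), and the function name `join_inline_rows` is made up. The column is called Ord to avoid quoting the reserved word Order.

```python
import sqlite3

def join_inline_rows():
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE tblEntity (ID INTEGER, Name TEXT);
        INSERT INTO tblEntity VALUES (1, 'One'), (2, 'Two');
    """)
    # Variant 1: derived table built from SELECT ... UNION ALL
    union_rows = con.execute("""
        SELECT e.ID, e.Name, t.Ord
        FROM tblEntity e
        JOIN (SELECT 1 AS Id, 2 AS Ord
              UNION ALL
              SELECT 2, 1) t ON e.ID = t.Id
        ORDER BY e.ID
    """).fetchall()
    # Variant 2: CTE with a column list, filled from a VALUES clause
    values_rows = con.execute("""
        WITH T1(Id, Ord) AS (VALUES (1, 2), (2, 1))
        SELECT e.ID, e.Name, T1.Ord
        FROM tblEntity e
        JOIN T1 ON e.ID = T1.Id
        ORDER BY e.ID
    """).fetchall()
    con.close()
    return union_rows, values_rows

union_rows, values_rows = join_inline_rows()
print(union_rows)
print(values_rows)
```

Both variants return the same rows, so the choice between them is mostly a matter of readability and of which syntax your database supports.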