SQL: create a view comparing different rows

I have data for movies that looks something like this:
cast_id | cast_name | movie_id
1 | A | 11
2 | B | 11
3 | C | 11
4 | D | 12
5 | E | 12
1 | A | 13
I want to create a view where I compare two different cast members so that I will start with something like this:
CREATE VIEW compare(cast_id_1, cast_id_2, num_movies);
SELECT * FROM compare LIMIT 1;
(1,2,2)
where I am looking at actor A and actor B, who have a total of 2 movies between the two of them.
I'm not sure how to compare the two different rows, and my searches so far have been unsuccessful. Any help is much appreciated!

That's a self-join:
create view myview as
select t1.cast_id cast_id_1, t2.cast_id cast_id_2, count(*) num_movies
from mytable t1
inner join mytable t2 on t2.movie_id = t1.movie_id and t1.cast_id < t2.cast_id
group by t1.cast_id, t2.cast_id
This generates all combinations of cast members that appeared together in at least one movie, along with the number of movies they share. The join condition t1.cast_id < t2.cast_id is there to avoid "mirror" records.
You can then query the view. If you want members that have two common movies (which is actually not showing in your sample data...):
select * from myview where num_movies = 2
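To check the self-join against the sample data from the question, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for the asker's database (that substitution is an assumption; the names mytable and myview come from the answer above). It suggests that A and B share only one movie in the sample, which matches the remark that a count of 2 does not show up there.

```python
import sqlite3

# In-memory SQLite database standing in for the real server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mytable (cast_id INTEGER, cast_name TEXT, movie_id INTEGER);
INSERT INTO mytable VALUES
  (1,'A',11),(2,'B',11),(3,'C',11),(4,'D',12),(5,'E',12),(1,'A',13);

CREATE VIEW myview AS
SELECT t1.cast_id AS cast_id_1, t2.cast_id AS cast_id_2, COUNT(*) AS num_movies
FROM mytable t1
INNER JOIN mytable t2
  ON t2.movie_id = t1.movie_id AND t1.cast_id < t2.cast_id
GROUP BY t1.cast_id, t2.cast_id;
""")

# One row per unordered pair of cast members that share at least one movie.
pairs = conn.execute(
    "SELECT * FROM myview ORDER BY cast_id_1, cast_id_2").fetchall()
print(pairs)
```

On the sample rows this yields the pairs (1,2), (1,3), (2,3) and (4,5), each with num_movies = 1.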

I'm thinking a stored procedure might be helpful. This one takes the two cast_ids and num_movies as input parameters. It selects the movie_ids of the movies in which both cast members appeared together. If that count is at least num_movies, it returns the list of those movies (name, release date, director); otherwise it returns a message such as 'Were not in 2 movies together'.
drop proc if exists TwoMovieActors;
go
create proc TwoMovieActors
    @cast_id_1 int,
    @cast_id_2 int,
    @num_movies int
as
set nocount on;
declare @m table(movie_id int unique not null,
                 rn int not null);
declare @rows int;
with
cast_cte as (
    select *, row_number() over (partition by movie_id order by cast_name) rn
    from movie_casts mc
    where cast_id in(@cast_id_1, @cast_id_2))
insert @m
select movie_id, row_number() over (order by movie_id) rn
from cast_cte
where rn=2;
select @rows=@@rowcount;
if @rows<@num_movies
    select concat('Were not in ', cast(@num_movies as varchar(11)), ' movies together');
else
    select m.movie_id, mv.movie_name, mv.release_date, mv.director
    from @m m
    join movies mv on m.movie_id=mv.movie_id;
To execute it would be something like
exec TwoMovieActors 1, 2, 2;

Related

Finding the id's which include multiple criteria in long format

Suppose I have a table like this,
id | tagId
1 | 1
1 | 2
1 | 5
2 | 1
2 | 5
3 | 2
3 | 4
3 | 5
3 | 8
I want to select the ids whose tagId values include both 2 and 5. For this fake data set, it should return 1 and 3.
I tried,
select id from [dbo].[mytable] where tagId IN(2,5)
But that matches ids that have either 2 or 5, not both. I also do not want to keep my table in wide format, since tagId is dynamic and could grow to any number of columns. I also considered filtering with two separate queries and (somehow) intersecting them. However, since in real life I may search for more than two tagId values, that sounds inefficient to me.
I am sure this has come up before in tag searching. What do you suggest? Changing the table format?
One option is to count the number of distinct tagIds (from the ones you're looking for) each id has:
SELECT id
FROM [dbo].[mytable]
WHERE tagId IN (2,5)
GROUP BY id
HAVING COUNT(DISTINCT tagId) = 2
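As a quick check of the counting approach, here is a small sketch that runs it against the question's sample rows using Python's sqlite3 module (SQLite stands in for SQL Server here, which is an assumption). The helper name ids_with_all_tags is hypothetical; it only generalises the query to any list of tags by comparing the distinct count with the number of tags searched for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, tagId INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(1, 1), (1, 2), (1, 5), (2, 1), (2, 5), (3, 2), (3, 4), (3, 5), (3, 8)],
)

def ids_with_all_tags(conn, tags):
    """Return the ids that carry every tag in `tags` (hypothetical helper)."""
    tags = sorted(set(tags))                 # duplicates would skew the count
    placeholders = ",".join("?" * len(tags))
    sql = f"""
        SELECT id
        FROM mytable
        WHERE tagId IN ({placeholders})
        GROUP BY id
        HAVING COUNT(DISTINCT tagId) = ?
        ORDER BY id
    """
    return [row[0] for row in conn.execute(sql, (*tags, len(tags)))]

print(ids_with_all_tags(conn, [2, 5]))
```

For tags 2 and 5 this returns ids 1 and 3, as the question expects; id 2 is excluded because it only has tag 5.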
This is actually a Relational Division With Remainder question.
First, you have to place your input into proper table format. I suggest you use a Table Valued Parameter if executing from client code. You can also use a temp table or table variable.
DECLARE @ids TABLE (tagId int PRIMARY KEY);
INSERT @ids VALUES (2), (5);
There are a number of different solutions to this type of question.
Classic double-negative EXISTS
SELECT DISTINCT
mt.Id
FROM mytable mt
WHERE NOT EXISTS (SELECT 1
FROM @ids i
WHERE NOT EXISTS (SELECT 1
FROM mytable mt2
WHERE mt2.id = mt.id
AND mt2.tagId = i.tagId)
);
This is not usually efficient though
Comparing to the total number of IDs to match
SELECT mt.id
FROM mytable mt
JOIN @ids i ON i.tagId = mt.tagId
GROUP BY mt.id
HAVING COUNT(*) = (SELECT COUNT(*) FROM @ids);
This is much more efficient. You can also do this with a window function; it may be more or less efficient, YMMV.
SELECT mt.Id
FROM mytable mt
JOIN (
SELECT *,
total = COUNT(*) OVER ()
FROM @ids i
) i ON i.tagId = mt.tagId
GROUP BY mt.id
HAVING COUNT(*) = MIN(i.total);
Another solution involves cross-joining everything and checking how many matches there are using conditional aggregation
SELECT mt.id
FROM (
SELECT
mt.id,
mt.tagId,
matches = SUM(CASE WHEN i.tagId = mt.tagId THEN 1 END),
total = COUNT(*)
FROM mytable mt
CROSS JOIN @ids i
GROUP BY
mt.id,
mt.tagId
) mt
GROUP BY mt.id
HAVING SUM(matches) = MIN(total)
AND MIN(matches) >= 0;
There are other solutions also, see High Performance Relational Division in SQL Server
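To compare two of the formulations above side by side, the following sketch runs the double-negative NOT EXISTS query and the count-comparison query on the question's sample data. It uses Python's sqlite3 module, with an ordinary table named ids standing in for the @ids table variable (both are assumptions made for the sake of a runnable example), and it assumes each (id, tagId) pair appears at most once in mytable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mytable (id INTEGER, tagId INTEGER);
INSERT INTO mytable VALUES
  (1,1),(1,2),(1,5),(2,1),(2,5),(3,2),(3,4),(3,5),(3,8);
CREATE TABLE ids (tagId INTEGER PRIMARY KEY);  -- stands in for @ids
INSERT INTO ids VALUES (2),(5);
""")

# "There is no wanted tag that this id is missing."
double_negative = """
SELECT DISTINCT mt.id
FROM mytable mt
WHERE NOT EXISTS (
    SELECT 1 FROM ids i
    WHERE NOT EXISTS (
        SELECT 1 FROM mytable mt2
        WHERE mt2.id = mt.id AND mt2.tagId = i.tagId))
ORDER BY mt.id
"""

# "The number of matching tags equals the number of wanted tags."
count_match = """
SELECT mt.id
FROM mytable mt
JOIN ids i ON i.tagId = mt.tagId
GROUP BY mt.id
HAVING COUNT(*) = (SELECT COUNT(*) FROM ids)
ORDER BY mt.id
"""

by_not_exists = [row[0] for row in conn.execute(double_negative)]
by_count = [row[0] for row in conn.execute(count_match)]
print(by_not_exists, by_count)
```

Both queries return ids 1 and 3 on this data. If mytable could contain duplicate (id, tagId) rows, the count version would need COUNT(DISTINCT mt.tagId) instead of COUNT(*).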

Need help writing an SQL query to count non duplicate rows (not a distinct count)

I have a table like below. I'm trying to do a count of IDs that are not duplicated. I don't mean a distinct count. A distinct count would return a result of 7 (a, b, c, d, e, f, g). I want it to return a count of 4 (a, c, d, f). These are the IDs that do not have multiple type codes. I've tried the following queries but got counts of 0 (the result should be a count in the millions).
select ID, count (ID) as number
from table
group by ID
having count (ID) = 1
Select count (distinct ID)
From table
Having count (ID) = 1
ID|type code
a|111
b|222
b|333
c|444
d|222
e|111
e|333
e|555
f|444
g|333
g|444
Thanks to @scaisEdge! The first query you provided gave me exactly what I was looking for in the question above. Now that that's figured out, my leaders have asked me to take it a step further and show, per type code, how many IDs have only that single type code. For example, we want to see:
type code|count
111|1
222|1
444|2
There are 2 IDs that have the single type code 444 (c, f), and one ID each with the single type code 111 (a) and 222 (d). I've tried modifying the query as follows, but I keep getting errors when I run it:
select count(admin_sys_tp_cd) as number
from (
select cont_id from
imdmadmp.contequiv
group by cont_id
having count(*) =1) t
group by admin_sys_tp_cd
If you want the count, it could be:
select count(*) from (
select id from
my_table
group by id
having count(*) =1
) t
If you want the ids:
select id from
my_table
group by id
having count(*) =1
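The two queries above, plus one possible answer to the follow-up about counts per type code, can be tried on the question's sample data with the sketch below. It uses Python's sqlite3 module as a stand-in database, and the column name type_code is a hypothetical rename of the question's "type code" column. The modified query in the question most likely fails because its inner query does not return the type code at all; one way around that, shown here as an assumption rather than the answerer's method, is to carry it out of the inner query with MIN(), which is safe because each surviving group has exactly one row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id TEXT, type_code INTEGER)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [("a", 111), ("b", 222), ("b", 333), ("c", 444), ("d", 222), ("e", 111),
     ("e", 333), ("e", 555), ("f", 444), ("g", 333), ("g", 444)],
)

# Number of ids that occur exactly once.
single_count = conn.execute("""
    SELECT COUNT(*) FROM (
        SELECT id FROM my_table GROUP BY id HAVING COUNT(*) = 1
    ) t
""").fetchone()[0]

# Follow-up: how many single-occurrence ids fall under each type code.
# MIN(type_code) just exposes the only type code of a one-row group.
per_type = conn.execute("""
    SELECT type_code, COUNT(*) AS number
    FROM (
        SELECT id, MIN(type_code) AS type_code
        FROM my_table
        GROUP BY id
        HAVING COUNT(*) = 1
    ) t
    GROUP BY type_code
    ORDER BY type_code
""").fetchall()

print(single_count, per_type)
```

On the sample data this gives a count of 4 (a, c, d, f) and the per-type rows 111|1, 222|1, 444|2, matching the output the question asks for.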
How about looping over a temporary table?
select
*
into #control
from tablename
declare @acum as int
declare @code as char(3)
declare @id as char(1)
declare @id2 as int
select @acum=0
while exists (select * from #control)
begin
    select @code = (select top 1 code from #control order by id)
    select @id = (select top 1 id from #control order by id)
    select @id2 = count(id) from #control where id in (select id from tablename where id = @id and code <> @code)
    if @id2=0
    begin
        select @acum = @acum+1
    end
    delete #control
    where id = @id --and code = @code
end
drop table #control
print @acum

Simple SQL: How to calculate unique, contiguous numbers for duplicates in a set?

Let's say I create a table with an int Page, int Section, and an int ID identity field, where the page field ranges from 1 to 8 and the section field ranges from 1 to 30 for each page. Now let's say that two records have duplicate page and section. How could I renumber those two records so that the sequence of page and section numbering is contiguous?
select page, section
from #fun
group by page, section having count(*) > 1
shows the duplicates:
page 1 section 3
page 2 section 3
page 1 section 4 and page 2 section 4 are missing. Is there a way without using a cursor to find and renumber the positions in SQL 2000 that doesn't support Row_Number()?
This rownum below of course produces exactly the same number as in section:
select page, section,
(select count(*) + 1
from #fun b
where b.page = a.page and b.section < a.section) as rownum
from #fun a
I could create a pivot table having values 1 through 100, but what would I join against?
What I want to do is something like this:
update p set section = (expression that gets 4)
from #fun p
where (expression that identifies duplicate sections by page)
I don't have a 2000 server to test this on, but I think it should work.
Create test tables/data:
CREATE TABLE #fun
(Id INT IDENTITY(100,1)
,page INT NOT NULL
,section INT NOT NULL
)
INSERT #fun (page, section)
SELECT 1,1
UNION ALL SELECT 1,3 UNION ALL SELECT 1,2
UNION ALL SELECT 1,3 UNION ALL SELECT 1,5
UNION ALL SELECT 2,1 UNION ALL SELECT 2,2
UNION ALL SELECT 2,3 UNION ALL SELECT 2,5
UNION ALL SELECT 2,3
Now the processing:
-- create a worktable
CREATE TABLE #fun2
(Id INT IDENTITY(1,1)
,funId INT
,page INT NOT NULL
,section INT NOT NULL
)
-- insert data into the second temp table ordered by the relevant columns
-- the identity column will form the basis of the revised section number
INSERT #fun2 (funId, page, section)
SELECT Id,page,section
FROM #fun
ORDER BY page,section,Id
-- write the calculated section value back where it is different
UPDATE p
SET section = y.calc_section
FROM #fun AS p
JOIN
(
SELECT f2.funId, f2.id - x.adjust calc_section
FROM #fun2 AS f2
JOIN (
-- this subquery is used to calculate an offset like
-- PARTITION BY in a 2005+ ROW_NUMBER() function
SELECT MIN(Id) - 1 adjust, page
FROM #fun2
GROUP BY page
) AS x
ON f2.page = x.page
) AS y
ON p.Id = y.funId
WHERE p.section <> y.calc_section
SELECT * FROM #fun order by page, section
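For comparison, the same contiguous renumbering can be sketched without an identity worktable, by extending the correlated COUNT(*) from the question with the Id column as a tie-breaker. This is a different technique from the worktable approach above, and the example runs on SQLite through Python's sqlite3 module rather than SQL Server 2000 (both are assumptions); the table is called fun instead of #fun because SQLite has no #temp tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fun (id INTEGER PRIMARY KEY, page INTEGER, section INTEGER)")
sample = [(1, 1), (1, 3), (1, 2), (1, 3), (1, 5),
          (2, 1), (2, 2), (2, 3), (2, 5), (2, 3)]
conn.executemany(
    "INSERT INTO fun (id, page, section) VALUES (?, ?, ?)",
    [(100 + i, page, section) for i, (page, section) in enumerate(sample)],
)

# Rank each row within its page by (section, id). The id tie-break is what
# gives two rows with the same section different numbers.
new_sections = conn.execute("""
    SELECT a.id,
           (SELECT COUNT(*)
            FROM fun b
            WHERE b.page = a.page
              AND (b.section < a.section
                   OR (b.section = a.section AND b.id <= a.id))) AS calc_section
    FROM fun a
""").fetchall()

# Apply the new numbers only after they have all been computed.
conn.executemany(
    "UPDATE fun SET section = ? WHERE id = ?",
    [(calc_section, row_id) for row_id, calc_section in new_sections],
)

result = conn.execute(
    "SELECT page, section FROM fun ORDER BY page, section").fetchall()
print(result)
```

After the update each page holds sections 1 through 5 with no duplicates. As with the worktable version, this renumbers every row on a page, so it also closes any gaps that were there before.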
Disclaimer: I don't have SQL Server to test.
If I understand you correctly, if you knew the ROW_NUMBER of your #fun records partitioned over (page, section) duplicates, you could use this relative ranking to increment the "section":
UPDATE p
SET section = section + (rownumber - 1)
FROM #fun AS p
INNER JOIN ( -- SELECT id, ROW_NUMBER() OVER (PARTITION BY page, section) ...
SELECT a.id, COUNT(1) AS rownumber
FROM #fun a
LEFT JOIN #fun b
ON a.page = b.page AND a.section = b.section AND a.id <= b.id
GROUP BY a.id, a.page, a.section) d
ON p.id = d.id
WHERE rownumber > 1
That won't handle the case where the number of duplicates pushes you past your upper limit of 30. It may also create new duplicates if higher-numbered sections already exist on the page (that is, one instance of (pg 1, sec 3) becomes (pg 1, sec 4), which already existed), but you can run the UPDATE repeatedly until no duplicates remain.
And then add a unique index on (page, section).

Make SQL Select same row multiple times

I need to test my mail server. How can I write a SELECT statement that selects, say, ID=5469 a thousand times?
If I get your meaning then a very simple way is to cross join on a derived query on a table with more than 1000 rows in it and put a top 1000 on that. This would duplicate your results 1000 times.
EDIT: As an example (This is MSSQL, I don't know if Access is much different)
SELECT
MyTable.*
FROM
MyTable
CROSS JOIN
(
SELECT TOP 1000
*
FROM
sysobjects
) [BigTable]
WHERE
MyTable.ID = 1234
You can use the UNION ALL statement.
Try something like:
SELECT * FROM tablename WHERE ID = 5469
UNION ALL
SELECT * FROM tablename WHERE ID = 5469
You'd have to repeat the SELECT statement a bunch of times but you could write a bit of VB code in Access to create a dynamic SQL statement and then execute it. Not pretty but it should work.
Create a helper table for this purpose:
JUST_NUMBER(NUM INT primary key)
Insert (with the help of some (VB) script) numbers from 1 to N. Then execute this unjoined query:
SELECT MYTABLE.*
FROM MYTABLE,
JUST_NUMBER
WHERE MYTABLE.ID = 5469
AND JUST_NUMBER.NUM <= 1000
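The helper-table idea can be tried end to end with the sketch below, which uses Python's sqlite3 module in place of Access or SQL Server (an assumption). The email column and its values are hypothetical sample data; the numbers table is filled from a Python loop, playing the role of the script mentioned above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mytable (id INTEGER PRIMARY KEY, email TEXT);  -- email is made up
INSERT INTO mytable VALUES (5469, 'test@example.com'), (5470, 'other@example.com');
CREATE TABLE just_number (num INTEGER PRIMARY KEY);
""")

# Fill the helper table with 1..1000.
conn.executemany("INSERT INTO just_number (num) VALUES (?)",
                 [(n,) for n in range(1, 1001)])

# Every helper row pairs with the single matching row of mytable,
# so that row comes back once per helper row.
repeated = conn.execute("""
    SELECT mytable.*
    FROM mytable
    CROSS JOIN just_number
    WHERE mytable.id = 5469
      AND just_number.num <= 1000
""").fetchall()

print(len(repeated), repeated[0])
```

The result holds 1000 copies of the row with id 5469. Changing the bound on just_number.num changes the number of copies, up to the size of the helper table.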
Here's a way of using a recursive common table expression to generate some empty rows, then to cross join them back onto your desired row:
declare @myData table (val int) ;
insert @myData values (666),(888),(777) --some dummy data
;with cte as
(
select 100 as a
union all
select a-1 from cte where a>0
--generate 101 rows (100 down to 0), within the default max recursion depth of 100
)
,someRows as
(
select top 1000 0 a from cte,cte x1,cte x2
--xjoin the 101 rows a few times
--to generate 1030301 rows, then select top n rows
)
select m.* from @myData m,someRows where m.val=666
Substitute your real table for @myData, and alter the final predicate to suit.
An easy way:
Only one row exists in the DB for this product:
sku = 52 , description = Skullcandy Inkd Green ,price = 50,00
Join it to another table that has no key relationship to the main table.
Original Query
SELECT Prod_SKU , Prod_Descr , Prod_Price FROM dbo.TB_Prod WHERE Prod_SKU = N'52'
The working query, which adds an unrelated table called 'dbo.TB_Labels' (replace 'times' with the number of repetitions; dbo.TB_Labels must hold at least that many rows):
SELECT TOP ('times') Prod_SKU , Prod_Descr , Prod_Price FROM dbo.TB_Prod,dbo.TB_Labels WHERE Prod_SKU = N'52'
In postgres there is a nice function called generate_series. So in postgreSQL it is as simple as:
select information from test_table, generate_series(1, 1000) where id = 5469
In this way, the row is returned 1000 times.
Example for postgreSQL:
CREATE EXTENSION IF NOT EXISTS "uuid-ossp"; --To be able to use function uuid_generate_v4()
--Create a test table
create table test_table (
id serial not null,
uid UUID NOT NULL,
CONSTRAINT uid_pk PRIMARY KEY(id));
-- Insert 10000 rows
insert into test_table (uid)
select uuid_generate_v4() from generate_series(1, 10000);
-- Read the data from id=5469 one thousand times
select id, uid, uuid_generate_v4() from test_table, generate_series(1, 1000) where id = 5469;
As you can see in the result below, the data from uid is read 1000 times as confirmed by the generation of a new uuid at every new row.
id |uid |uuid_generate_v4
----------------------------------------------------------------------------------------
5469|"10791df5-ab72-43b6-b0a5-6b128518e5ee"|"5630cd0d-ee47-4d92-9ee3-b373ec04756f"
5469|"10791df5-ab72-43b6-b0a5-6b128518e5ee"|"ed44b9cb-c57f-4a5b-ac9a-55bd57459c02"
5469|"10791df5-ab72-43b6-b0a5-6b128518e5ee"|"3428b3e3-3bb2-4e41-b2ca-baa3243024d9"
5469|"10791df5-ab72-43b6-b0a5-6b128518e5ee"|"7c8faf33-b30c-4bfa-96c8-1313a4f6ce7c"
5469|"10791df5-ab72-43b6-b0a5-6b128518e5ee"|"b589fd8a-fec2-4971-95e1-283a31443d73"
5469|"10791df5-ab72-43b6-b0a5-6b128518e5ee"|"8b9ab121-caa4-4015-83f5-0c2911a58640"
5469|"10791df5-ab72-43b6-b0a5-6b128518e5ee"|"7ef63128-b17c-4188-8056-c99035e16c11"
5469|"10791df5-ab72-43b6-b0a5-6b128518e5ee"|"5bdc7425-e14c-4c85-a25e-d99b27ae8b9f"
5469|"10791df5-ab72-43b6-b0a5-6b128518e5ee"|"9bbd260b-8b83-4fa5-9104-6fc3495f68f3"
5469|"10791df5-ab72-43b6-b0a5-6b128518e5ee"|"c1f759e1-c673-41ef-b009-51fed587353c"
5469|"10791df5-ab72-43b6-b0a5-6b128518e5ee"|"4a70bf2b-ddf5-4c42-9789-5e48e2aec441"
Of course other DBs won't necessarily have the same function but it could be done:
See here.
If you are doing this in SQL Server:
declare @cnt int
set @cnt = 0
while @cnt < 1000
begin
    select '12345'
    set @cnt = @cnt + 1
end
Here, select '12345' can be any expression.
Repeat rows based on column value of TestTable. First run the Create table and insert statement, then run the following query for the desired result.
This may be another solution:
CREATE TABLE TestTable
(
ID INT IDENTITY(1,1),
Col1 varchar(10),
Repeats INT
)
INSERT INTO TESTTABLE
VALUES ('A',2), ('B',4),('C',1),('D',0);
WITH x AS
(
SELECT TOP (SELECT MAX(Repeats)+1 FROM TestTable) rn = ROW_NUMBER()
OVER (ORDER BY [object_id])
FROM sys.all_columns
ORDER BY [object_id]
)
SELECT * FROM x
CROSS JOIN TestTable AS d
WHERE x.rn <= d.Repeats
ORDER BY Col1;
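Here is a runnable sketch of the same idea, repeating each row as many times as its Repeats column says. It runs on SQLite through Python's sqlite3 module (an assumption), and because SQLite has no sys.all_columns it swaps in a recursive CTE as the source of row numbers; the join and the filter x.rn <= d.Repeats are the same as above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TestTable (
    ID INTEGER PRIMARY KEY AUTOINCREMENT,
    Col1 TEXT,
    Repeats INTEGER
);
INSERT INTO TestTable (Col1, Repeats) VALUES ('A',2),('B',4),('C',1),('D',0);
""")

# Largest repeat count; the number generator only needs to go this far.
max_repeats = conn.execute("SELECT MAX(Repeats) FROM TestTable").fetchone()[0]

rows = conn.execute("""
    WITH RECURSIVE x(rn) AS (
        SELECT 1
        UNION ALL
        SELECT rn + 1 FROM x WHERE rn < ?
    )
    SELECT d.Col1, x.rn
    FROM x
    CROSS JOIN TestTable AS d
    WHERE x.rn <= d.Repeats      -- keep rn = 1..Repeats for each row
    ORDER BY d.Col1, x.rn
""", (max_repeats,)).fetchall()

print(rows)
```

'A' appears twice, 'B' four times, 'C' once, and 'D' not at all because its Repeats value is 0.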
This trick helped me with my requirement.
Here, PRODUCTDETAILS is my table
and orderid is my column.
declare @Req_Rows int = 12
;WITH cte AS
(
SELECT 1 AS Number
UNION ALL
SELECT Number + 1 FROM cte WHERE Number < @Req_Rows
)
SELECT PRODUCTDETAILS.*
FROM cte, PRODUCTDETAILS
WHERE PRODUCTDETAILS.orderid = 3
create table #tmp1 (id int, fld varchar(max))
insert into #tmp1 (id, fld)
values (1,'hello!'),(2,'world'),(3,'nice day!')
select * from #tmp1
go
select * from #tmp1 where id=3
go 1000
drop table #tmp1
In SQL Server, try:
print 'wow'
go 5
output:
Beginning execution loop
wow
wow
wow
wow
wow
Batch execution completed 5 times.
The easy way is to create a table with 1000 rows. Let's call it BigTable. Then you would query for the data you want and join it with the big table, like this:
SELECT MyTable.*
FROM MyTable, BigTable
WHERE MyTable.ID = 5469

Picking info using junction table(SQL SERVER 2005) [ SET BASED]

I have 3 tables
1) tblPurchaser having 2 columns:
PurchaserId PurchaserName
1 A1
2 A2
3 A3
2) tblCar having 2 columns:
CarId Carname
11 C1
12 C2
13 C3
14 C4
The last is a junction table, tblInformation, which records which purchasers have bought which cars.
PurchaserId CarId
1 11
1 12
2 11
2 13
Now I need to write a set-based query that returns, for each purchaser, the cars that purchaser has not bought.
Desired Output
PurchaserId CarId
1 13
1 14
2 12
2 14
3 11
3 12
3 13
3 14
Note: This is a real time problem which I am implementing in my project. Because of privacy of company, I have changed the tables and information. But my situation is something similar
Please help me
Edited
So far I have written this query:
SELECT 1 as purchaserid,carid from tblcar
where carid not in (select carid from tblinformation where purchaserid = 1)
union all
SELECT 2 as purchaserid,carid from tblcar
where carid not in (select carid from tblinformation where purchaserid = 2)
union all
SELECT 3 as purchaserid,carid from tblcar
where carid not in (select carid from tblinformation where purchaserid = 3)
But as you can see, I am hardcoding the purchaser ids, and in practice I will not know how many ids there will be, so everything has to be determined at runtime.
Please help
Clue: NOT EXISTS
You should really try to do some homework yourself... 3rd question today...
LEFT JOIN ... WHERE ... IS NULL to the rescue:
SELECT tblPurchaser.PurchaserId, tblCar.CarId
FROM tblPurchaser CROSS JOIN tblCar
LEFT JOIN tblInformation ON(
tblPurchaser.PurchaserId = tblInformation.PurchaserId
AND tblCar.CarId = tblInformation.CarId)
WHERE tblInformation.CarId IS NULL
Try this
select pur.PurchaserId, car.CarId
from tblPurchaser pur, tblCar car
where not exists (select 1 from tblInformation where PurchaserId = pur.PurchaserId and CarId = car.CarId)
order by pur.PurchaserId;
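The NOT EXISTS form and the LEFT JOIN ... IS NULL form can both be checked against the question's sample tables with the sketch below. It uses Python's sqlite3 module as a stand-in for SQL Server 2005 (an assumption); the table and column names are the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblPurchaser (PurchaserId INTEGER, PurchaserName TEXT);
INSERT INTO tblPurchaser VALUES (1,'A1'),(2,'A2'),(3,'A3');
CREATE TABLE tblCar (CarId INTEGER, Carname TEXT);
INSERT INTO tblCar VALUES (11,'C1'),(12,'C2'),(13,'C3'),(14,'C4');
CREATE TABLE tblInformation (PurchaserId INTEGER, CarId INTEGER);
INSERT INTO tblInformation VALUES (1,11),(1,12),(2,11),(2,13);
""")

# All purchaser/car pairs, minus the pairs recorded in the junction table.
not_exists = conn.execute("""
    SELECT pur.PurchaserId, car.CarId
    FROM tblPurchaser pur
    CROSS JOIN tblCar car
    WHERE NOT EXISTS (
        SELECT 1 FROM tblInformation i
        WHERE i.PurchaserId = pur.PurchaserId AND i.CarId = car.CarId)
    ORDER BY pur.PurchaserId, car.CarId
""").fetchall()

# Same result via an outer join that keeps only the unmatched pairs.
left_join = conn.execute("""
    SELECT p.PurchaserId, c.CarId
    FROM tblPurchaser p
    CROSS JOIN tblCar c
    LEFT JOIN tblInformation i
      ON p.PurchaserId = i.PurchaserId AND c.CarId = i.CarId
    WHERE i.CarId IS NULL
    ORDER BY p.PurchaserId, c.CarId
""").fetchall()

print(not_exists)
```

Both queries return the eight rows listed under "Desired Output" in the question, without hardcoding any purchaser id.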
Try this:
SELECT PurchaserID, CarID
FROM tblPurchaser
CROSS JOIN tblCar
EXCEPT
SELECT *
FROM tblInformation
Here is a SQL script that demonstrates that this technique works correctly:
declare @soPurchaser table(PurchaserId int, PurchaserName varchar(4));
insert @soPurchaser select 1,'A1'
insert @soPurchaser select 2,'A2'
insert @soPurchaser select 3,'A3'
Declare @SOtblCar table(CarId int, Carname varchar(4))
insert @SOtblCar select 11,'C1'
insert @SOtblCar select 12,'C2'
insert @SOtblCar select 13,'C3'
insert @SOtblCar select 14,'C4'
Declare @SOtblInfo table(PurchaserId int, CarId int)
insert @SOtblInfo select 1,11
insert @SOtblInfo select 1,12
insert @SOtblInfo select 2,11
insert @SOtblInfo select 2,13
SELECT PurchaserID, CarID
FROM @soPurchaser
CROSS JOIN @SOtblCar
EXCEPT
SELECT *
FROM @SOtblInfo
The SQL set operators (UNION, INTERSECT, and EXCEPT) all operate on two sets of rows. Note that they have no syntax for mapping the columns of one set to the other. In SQL, whenever columns must be mapped to each other and there is no syntax to do it explicitly, they are mapped by column order.
So in this case, if the column order of either side is wrong, the query will not work correctly.
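The point about column order can be made concrete with a small sketch, again using Python's sqlite3 module as a stand-in database (an assumption). It runs EXCEPT once with the junction table's columns listed in the matching order and once with them swapped; naming the columns explicitly instead of using SELECT * is a defensive choice of this sketch, not something the answer requires.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblPurchaser (PurchaserId INTEGER, PurchaserName TEXT);
INSERT INTO tblPurchaser VALUES (1,'A1'),(2,'A2'),(3,'A3');
CREATE TABLE tblCar (CarId INTEGER, Carname TEXT);
INSERT INTO tblCar VALUES (11,'C1'),(12,'C2'),(13,'C3'),(14,'C4');
CREATE TABLE tblInformation (PurchaserId INTEGER, CarId INTEGER);
INSERT INTO tblInformation VALUES (1,11),(1,12),(2,11),(2,13);
""")

# Columns line up positionally: (PurchaserId, CarId) on both sides.
correct = conn.execute("""
    SELECT PurchaserId, CarId FROM tblPurchaser CROSS JOIN tblCar
    EXCEPT
    SELECT PurchaserId, CarId FROM tblInformation
    ORDER BY 1, 2
""").fetchall()

# Columns swapped on the right-hand side: nothing matches, nothing is removed.
swapped = conn.execute("""
    SELECT PurchaserId, CarId FROM tblPurchaser CROSS JOIN tblCar
    EXCEPT
    SELECT CarId, PurchaserId FROM tblInformation
    ORDER BY 1, 2
""").fetchall()

print(len(correct), len(swapped))
```

With the columns in matching order the query returns the 8 unpurchased pairs; with them swapped it silently returns all 12 pairs, because a row such as (11, 1) never equals any (PurchaserId, CarId) pair.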