How to make a mutually exclusive select query in SQL? - sql

I'm new to SQL, and I need to write a query for a table that looks like this:
CREATE TABLE TESTS (
PATH_ID int PRIMARY KEY,
Day DATE NOT NULL,
Direction varchar(255) NOT NULL,
D_ID int NOT NULL,
FOREIGN KEY (D_ID) REFERENCES Drivers(D_ID)
);
INSERT INTO TESTS(PATH_ID,Day,Direction,D_ID)
VALUES (1,'2021-02-01' ,'Right',001),
(2,'2021-02-01' ,'Left',002),
(3,'2021-02-02','Right',002);
I need a query that finds the drivers (D_ID) who have ONLY ever gone Right (Direction) and, for each of them, shows the D_ID, the Day, and every time the driver went Right.

One method is not exists:
select t.*
from tests t
where not exists (select 1
from tests t2
where t2.d_id = t.d_id and t2.direction <> 'Right'
);

You can also use NOT IN:
select a.* from Tests a where D_ID not in (
select D_ID from Tests where direction <>'Right'
)
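
Both answers can be checked against the sample rows from the question. The following is a minimal sketch that assumes SQLite (through Python's built-in sqlite3 module) as a stand-in database. The foreign key to Drivers is left out, and the variable names are only illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tests (
        path_id   INTEGER PRIMARY KEY,
        day       TEXT NOT NULL,
        direction TEXT NOT NULL,
        d_id      INTEGER NOT NULL  -- foreign key to Drivers omitted in this sketch
    );
    INSERT INTO tests (path_id, day, direction, d_id) VALUES
        (1, '2021-02-01', 'Right', 1),
        (2, '2021-02-01', 'Left',  2),
        (3, '2021-02-02', 'Right', 2);
""")

# Keep a row only if its driver has no row with another direction.
NOT_EXISTS = """
    SELECT t.d_id, t.day, t.direction
    FROM tests t
    WHERE NOT EXISTS (SELECT 1 FROM tests t2
                      WHERE t2.d_id = t.d_id AND t2.direction <> 'Right')
    ORDER BY t.path_id
"""
# Same idea, written as an exclusion list of drivers.
NOT_IN = """
    SELECT a.d_id, a.day, a.direction
    FROM tests a
    WHERE a.d_id NOT IN (SELECT d_id FROM tests WHERE direction <> 'Right')
    ORDER BY a.path_id
"""

rows_not_exists = conn.execute(NOT_EXISTS).fetchall()
rows_not_in = conn.execute(NOT_IN).fetchall()
print(rows_not_exists)
print(rows_not_in)
```

Driver 2 drops out of both results because of the single Left row. The NOT IN form is safe here only because D_ID is declared NOT NULL; if the subquery could return a NULL, NOT IN would match no rows at all.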

Related

Redshift create list and search different table with it

I think there are a few ways to tackle this, but I'm not sure how to do any of them.
I have two tables. The first has IDs and Numbers. Both can be listed more than once, so I create a result table that lists the unique Numbers grouped by ID.
My second table has about 100 million rows, again with ID and Numbers. I need to search that table for any ID that has a Number that is not in the list of Numbers from the result table.
Can Redshift run a query that checks whether the ID matches and the Number exists in the list from the result table? Can this all be done in memory, in one statement?
DROP TABLE IF EXISTS `myTable`;
CREATE TABLE `myTable` (
`id` mediumint(8) unsigned NOT NULL auto_increment,
`ID` varchar(255),
`Numbers` mediumint default NULL,
PRIMARY KEY (`id`)
) AUTO_INCREMENT=1;
INSERT INTO `myTable` (`ID`,`Numbers`)
VALUES
("CRQ44MPX1SZ",1890),
("UHO21QQY3TW",4370),
("JTQ62CBP6ER",1825),
("RFD95MLC2MI",5014),
("URZ04HGG2YQ",2859),
("CRQ44MPX1SZ",1891),
("UHO21QQY3TW",4371),
("JTQ62CBP6ER",1826),
("RFD95MLC2MI",5015),
("URZ04HGG2YQ",2860),
("CRQ44MPX1SZ",1892),
("UHO21QQY3TW",4372),
("JTQ62CBP6ER",1827),
("RFD95MLC2MI",5016),
("URZ04HGG2YQ",2861);
CREATE TEMP TABLE result AS
SELECT ID, listagg(distinct Numbers,',') as Number_List, count(Numbers) as Numbers_Count
FROM myTable
GROUP BY ID;
DROP TABLE IF EXISTS `myTable2`;
CREATE TABLE `myTable2` (
`id` mediumint(8) unsigned NOT NULL auto_increment,
`ID` varchar(255),
`Numbers` mediumint default NULL,
PRIMARY KEY (`id`)
) AUTO_INCREMENT=1;
INSERT INTO `myTable2` (`ID`,`Numbers`)
VALUES
("CRQ44MPX1SZ",1870),
("UHO21QQY3TW",4350),
("JTQ62CBP6ER",1825),
("RFD95MLC2MI",5014),
("URZ04HGG2YQ",2859),
("CRQ44MPX1SZ",1891),
("UHO21QQY3TW",4371),
("JTQ62CBP6ER",1826),
("RFD95MLC2MI",5015),
("URZ04HGG2YQ",2860),
("CRQ44MPX1SZ",1882),
("UHO21QQY3TW",4372),
("JTQ62CBP6ER",1827),
("RFD95MLC2MI",5016),
("URZ04HGG2YQ",2861);
Pseudo Code
Select ID, listagg(distinct Numbers, ',') as Violation
From myTable2
Where Numbers NOT IN result.Number_List
or possibly: WHERE Numbers NOT LIKE '%' || result.Number_List || '%'
Desired Output
("CRQ44MPX1SZ", "1870,1882")
("UHO21QQY3TW", "4350")
EDIT
Going the JOIN route, I am not getting the right results...but I'm pretty sure my WHERE implementation is wrong.
SELECT mytable1.ID, listagg(distinct mytable2.Numbers, ',') as unauth_list, count(mytable2.Numbers) as unauth_count
FROM mytable1
LEFT JOIN mytable2 on mytable1.id = mytable2.id
WHERE (mytable1.id = mytable2.id)
AND (mytable1.Numbers <> mytable2.Numbers)
GROUP BY mytable1.id
Expected output:
("CRQ44MPX1SZ", "1870,1882", 2)
("UHO21QQY3TW", "4350", 1)
Just LEFT JOIN the two tables on both ID and Numbers, then use the WHERE clause to keep only the rows where no match was found. There shouldn't be any need for listagg() or complex comparisons. Or did I miss part of the question?
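
The anti-join described in this answer could look roughly like the sketch below. It assumes SQLite through Python's sqlite3 module rather than Redshift, uses the hypothetical table names allowed and observed in place of myTable and myTable2, drops the surrogate key column, and loads only part of the sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE allowed (id TEXT, numbers INTEGER);   -- plays the role of myTable
    CREATE TABLE observed (id TEXT, numbers INTEGER);  -- plays the role of myTable2
    INSERT INTO allowed VALUES
        ('CRQ44MPX1SZ', 1890), ('CRQ44MPX1SZ', 1891), ('CRQ44MPX1SZ', 1892),
        ('UHO21QQY3TW', 4370), ('UHO21QQY3TW', 4371), ('UHO21QQY3TW', 4372),
        ('JTQ62CBP6ER', 1825), ('JTQ62CBP6ER', 1826), ('JTQ62CBP6ER', 1827);
    INSERT INTO observed VALUES
        ('CRQ44MPX1SZ', 1870), ('CRQ44MPX1SZ', 1891), ('CRQ44MPX1SZ', 1882),
        ('UHO21QQY3TW', 4350), ('UHO21QQY3TW', 4371), ('UHO21QQY3TW', 4372),
        ('JTQ62CBP6ER', 1825), ('JTQ62CBP6ER', 1826), ('JTQ62CBP6ER', 1827);
""")

# Join on BOTH columns; rows of observed without a partner come back
# with NULLs on the allowed side, and those are the violations.
violations = conn.execute("""
    SELECT o.id, o.numbers
    FROM observed o
    LEFT JOIN allowed a
           ON a.id = o.id AND a.numbers = o.numbers
    WHERE a.id IS NULL
    ORDER BY o.id, o.numbers
""").fetchall()

# Group the violating numbers per id (done in Python in this sketch).
by_id = {}
for id_, number in violations:
    by_id.setdefault(id_, []).append(number)
print(by_id)
```

The grouping is done in Python only because SQLite has no LISTAGG. On Redshift the same anti-join could presumably be wrapped in a GROUP BY with listagg() and count() to produce the list and the count in one statement.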

Get data from one table with nested relations

I am new to databases. I have a table topics with a foreign key master_topic_id that references the id column of the same table.
Schema:
CREATE TABLE public.topics (
id bigserial NOT NULL,
created_at timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
published_at timestamp NULL,
master_topic_id int8 NULL,
CONSTRAINT t_pkey PRIMARY KEY (id),
CONSTRAINT t_master_topic_id_fkey FOREIGN KEY (master_topic_id) REFERENCES topics(id)
);
I wrote a query: SELECT * FROM topics WHERE id = 10. But if this record has a master_topic_id, I also need to get the record that master_topic_id points to.
I tried using a JOIN, but a join just concatenates the two records into one row, and I need the master topic's data as a separate row.
Any help?
I think you are describing:
select t.*
from topics t
where t.id = 10 or
exists (select 1
from topics t2
where t2.master_topic_id = t.id and t2.id = 10
);
However, you might just want:
where 10 in (id, master_topic_id)
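
The two variants above do not return the same rows, which a small experiment makes visible. This is a sketch that assumes SQLite through Python's sqlite3 module and three made-up topics; only the id and master_topic_id columns are kept.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE topics (
        id              INTEGER PRIMARY KEY,
        master_topic_id INTEGER REFERENCES topics(id)
    );
    INSERT INTO topics (id, master_topic_id) VALUES
        (1, NULL),   -- master of topic 10
        (10, 1),     -- the topic being looked up
        (11, 10);    -- a child of topic 10
""")

# Topic 10 plus the row that topic 10 points to as its master.
topic_and_master = conn.execute("""
    SELECT t.id
    FROM topics t
    WHERE t.id = 10
       OR EXISTS (SELECT 1 FROM topics t2
                  WHERE t2.master_topic_id = t.id AND t2.id = 10)
    ORDER BY t.id
""").fetchall()

# Topic 10 plus every row that names topic 10 as its master.
topic_and_children = conn.execute("""
    SELECT id FROM topics
    WHERE 10 IN (id, master_topic_id)
    ORDER BY id
""").fetchall()

print(topic_and_master)
print(topic_and_children)
```

The EXISTS form walks up to the master, which is what the question asks for, while the IN form walks down to the children.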
Use OR in your WHERE condition:
SELECT *
FROM topics
WHERE id = 10
or master_topic_id = 10
You can use UNION ALL as well:
SELECT *
FROM topics
WHERE id = 10
union all
SELECT *
FROM topics
WHERE master_topic_id = 10

PostgreSQL SELECT JOIN

I have a problem with making a proper SELECT for my exercise:
There are two tables that I have created:
1. Customer
2. Order
ad. 1
CREATE TABLE public."Customer"
(
id integer NOT NULL DEFAULT nextval('"Customer_id_seq"'::regclass),
name text NOT NULL,
surname text NOT NULL,
address text NOT NULL,
email text NOT NULL,
password text NOT NULL,
CONSTRAINT "Customer_pkey" PRIMARY KEY (id),
CONSTRAINT "Customer_email_key" UNIQUE (email)
)
ad.2
CREATE TABLE public."Order"
(
id integer NOT NULL DEFAULT nextval('"Order_id_seq"'::regclass),
customer_id integer NOT NULL,
item_list text,
order_date date,
execution_date date,
done boolean DEFAULT false,
confirm boolean DEFAULT false,
paid boolean DEFAULT false,
CONSTRAINT "Order_pkey" PRIMARY KEY (id),
CONSTRAINT "Order_customer_id_fkey" FOREIGN KEY (customer_id)
REFERENCES public."Customer" (id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION
)
Please do not mind how columns properties were set.
The problem I have is the following:
How do I write a SELECT query that returns the ids and emails of customers who have ordered something after '2017-09-15'?
I suppose this should be done with a JOIN, but none of the queries I tried have worked :/.
Thanks!
You should post the queries that you tried, but in the meantime try this. It's a simple join (the columns are qualified because both tables have an id column):
SELECT DISTINCT c.id
, c.email
FROM public."Customer" c
JOIN public."Order" o
ON c.id = o.customer_id
WHERE o.order_date > '2017-09-15'
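
To see what DISTINCT and the strict > comparison do, here is a small sketch of the same join. It assumes SQLite through Python's sqlite3 module, keeps only the columns the query touches, and uses invented customers and order dates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE "Customer" (id INTEGER PRIMARY KEY, email TEXT NOT NULL UNIQUE);
    CREATE TABLE "Order" (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES "Customer"(id),
        order_date  TEXT   -- ISO dates stored as text so they compare correctly
    );
    INSERT INTO "Customer" (id, email) VALUES
        (1, 'ann@example.com'), (2, 'bob@example.com'), (3, 'cy@example.com');
    INSERT INTO "Order" (customer_id, order_date) VALUES
        (1, '2017-09-20'), (1, '2017-10-01'),  -- two qualifying orders
        (2, '2017-09-15'),                      -- on the cutoff, not after it
        (3, '2017-08-30');
""")

# DISTINCT collapses the two matching orders of customer 1 into one row.
recent_customers = conn.execute("""
    SELECT DISTINCT c.id, c.email
    FROM "Customer" c
    JOIN "Order" o ON c.id = o.customer_id
    WHERE o.order_date > '2017-09-15'
    ORDER BY c.id
""").fetchall()
print(recent_customers)
```

Customer 1 appears once despite having two orders after the cutoff, and customer 2 is excluded because an order on '2017-09-15' is not after that date.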
In the "Order" table you just need the correct foreign-key constraint on the customer id:
customer_id integer REFERENCES "Customer" (id)
For more information, check this page:
https://www.postgresql.org/docs/9.2/static/ddl-constraints.html
So the query should look like this. Both tables were created with quoted names, so they have to be quoted in the query too (Order is also a reserved word):
SELECT DISTINCT "Customer".id, "Customer".email
FROM public."Customer"
INNER JOIN public."Order"
ON ("Order".customer_id = "Customer".id)
WHERE "Order".order_date > '2017-09-15'
Also, some useful docs you can check: https://www.postgresql.org/docs/current/static/tutorial-join.html

Join multiple tables, including one table twice, and sort by counting a group

I am an amateur just trying to finish the last question of my assignment (it is past due at this point, so I'm just looking for understanding). I have spent almost 5 hours on it across two days and have had no success.
I have looked through all the different types of joins, could never get grouping to work, and have had little luck with the sorting as well. I can do each of these things one at a time, but the difficulty here is getting them all to work together.
This is the question:
Write a SQL query to retrieve a list that has (source city, source code, destination city,
destination code, and number-of-flights) for all source-dest pairs with at least 2 flights. Order
by the number_of_flights. Note that the “dest”, and “source” attributes in the “flights” table
are both referenced to the “airportid” in the “airports” table.
Here are the tables I have to work with (also came with about 3000 lines of dummy entries)
create table airports (
airportid char(3) primary key,
city varchar(20)
);
create table airlines (
airlineid char(2) primary key,
name varchar(20),
hub char(3) references airports(airportid)
);
create table customers (
customerid char(10) primary key,
name varchar(25),
birthdate date,
frequentflieron char(2) references airlines(airlineid)
);
create table flights (
flightid char(6) primary key,
source char(3) references airports(airportid),
dest char(3) references airports(airportid),
airlineid char(2) references airlines(airlineid),
local_departing_time date,
local_arrival_time date
);
create table flown (
flightid char(6) references flights(flightid),
customerid char(10) references customers,
flightdate date
);
The first problem I ran into was outputting airports.city twice in the same query but with different results. On top of that, no matter what I tried when grouping, I would always get the same error:
Not a GROUP BY expression
Normally I have fun trying to piece these together, but this has been frustrating. Help!
select source.airportid as source_airportid,
source.city source_city,
dest.airportid as dest_airportid,
dest.city as dest_city,
count(*) as flights
from flights
inner join airports source on source.airportid = flights.source
inner join airports dest on dest.airportid = flights.dest
group by
source.airportid,
source.city,
dest.airportid,
dest.city
having count(*) >= 2
order by 5;
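
The two points that trip people up here are aliasing airports twice and listing every non-aggregated column in GROUP BY, which is what the "Not a GROUP BY expression" error complains about. A runnable sketch of the same query shape follows, assuming SQLite through Python's sqlite3 module, trimmed tables, and a handful of invented flights.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE airports (airportid TEXT PRIMARY KEY, city TEXT);
    CREATE TABLE flights (
        flightid TEXT PRIMARY KEY,
        source   TEXT REFERENCES airports(airportid),
        dest     TEXT REFERENCES airports(airportid)
    );
    INSERT INTO airports VALUES
        ('JFK', 'New York'), ('LAX', 'Los Angeles'), ('ORD', 'Chicago');
    INSERT INTO flights VALUES
        ('F1', 'JFK', 'LAX'), ('F2', 'JFK', 'LAX'), ('F3', 'JFK', 'LAX'),
        ('F4', 'LAX', 'ORD'), ('F5', 'LAX', 'ORD'),
        ('F6', 'ORD', 'JFK');  -- only one flight, so HAVING filters it out
""")

# airports is joined twice under two aliases, once per end of the flight.
# Every selected column that is not an aggregate also appears in GROUP BY.
pairs = conn.execute("""
    SELECT src.city, src.airportid, dst.city, dst.airportid,
           COUNT(*) AS number_of_flights
    FROM flights f
    JOIN airports src ON src.airportid = f.source
    JOIN airports dst ON dst.airportid = f.dest
    GROUP BY src.airportid, src.city, dst.airportid, dst.city
    HAVING COUNT(*) >= 2
    ORDER BY number_of_flights
""").fetchall()
for row in pairs:
    print(row)
```

The columns are selected in the order the assignment lists them (source city, source code, destination city, destination code, count), and ordering by the alias avoids depending on column position.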
Have you tried a subquery?
SELECT source_airports.city,
source_airports.airportid,
dest_airports.city,
dest_airports.airportid,
x.number_of_flights
FROM
(
SELECT source, dest, COUNT(*) as number_of_flights
FROM flights
GROUP BY source, dest
HAVING COUNT(*) > 1
) as x
INNER JOIN airports as dest_airports
ON dest_airports.airportid = x.dest
INNER JOIN airports as source_airports
ON source_airports.airportid = x.source
ORDER BY x.number_of_flights ASC

SQL Server 2005 query optimization with Max subquery

I've got a table that looks like this (I wasn't sure what all might be relevant, so I had Toad dump the whole structure)
CREATE TABLE [dbo].[TScore] (
[CustomerID] int NOT NULL,
[ApplNo] numeric(18, 0) NOT NULL,
[BScore] int NULL,
[OrigAmt] money NULL,
[MaxAmt] money NULL,
[DateCreated] datetime NULL,
[UserCreated] char(8) NULL,
[DateModified] datetime NULL,
[UserModified] char(8) NULL,
CONSTRAINT [PK_TScore]
PRIMARY KEY CLUSTERED ([CustomerID] ASC, [ApplNo] ASC)
);
When I run the following query (on a database with 3 million records in the TScore table), it takes about a second. Yet a plain Select BScore from CustomerDB..TScore WHERE CustomerID = 12345 is instant (and returns only 10 records). It seems there should be an efficient way to get the Max(ApplNo) effect in a single query, but I'm a relative noob to SQL Server. I'm thinking I may need a separate key for ApplNo, but I'm not sure how clustered keys work.
SELECT BScore
FROM CustomerDB..TScore (NOLOCK)
WHERE ApplNo = (SELECT Max(ApplNo)
FROM CustomerDB..TScore sc2 (NOLOCK)
WHERE sc2.CustomerID = 12345)
Thanks much for any tips (pointers on where to look for optimization of sql server stuff appreciated as well)
When you filter by ApplNo alone, you are using only part of the key, and not its leftmost column. This means the index has to be scanned (look at all rows) rather than seeked (drill down to a row) to find the values.
If you are looking for ApplNo values for the same CustomerID:
The quick way is to use the full clustered index:
SELECT BScore
FROM CustomerDB..TScore
WHERE ApplNo = (SELECT Max(ApplNo)
FROM CustomerDB..TScore sc2
WHERE sc2.CustomerID = 12345)
AND CustomerID = 12345
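
Besides letting the engine seek on the full key, the extra CustomerID predicate can change the result. The primary key is (CustomerID, ApplNo), so nothing in the schema stops two customers from sharing an ApplNo; whether that happens in the real data is not stated. The sketch below assumes SQLite through Python's sqlite3 module and invented scores, and it only illustrates the result difference, not SQL Server's query plans.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TScore (
        CustomerID INTEGER NOT NULL,
        ApplNo     INTEGER NOT NULL,
        BScore     INTEGER,
        PRIMARY KEY (CustomerID, ApplNo)
    );
    INSERT INTO TScore VALUES
        (12345, 1, 610), (12345, 2, 640), (12345, 3, 700),
        (99999, 3, 555);  -- another customer that happens to share ApplNo 3
""")

# Original shape: filters on ApplNo only, so other customers can leak in.
by_applno_only = conn.execute("""
    SELECT BScore FROM TScore
    WHERE ApplNo = (SELECT MAX(ApplNo) FROM TScore WHERE CustomerID = 12345)
    ORDER BY BScore
""").fetchall()

# Suggested shape: both key columns are constrained.
by_full_key = conn.execute("""
    SELECT BScore FROM TScore
    WHERE CustomerID = 12345
      AND ApplNo = (SELECT MAX(ApplNo) FROM TScore WHERE CustomerID = 12345)
""").fetchall()

print(by_applno_only)
print(by_full_key)
```

With the made-up rows, the ApplNo-only filter also returns the score of customer 99999, while the full-key filter returns just the latest score for customer 12345.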
This can be changed into a JOIN (the derived table needs a GROUP BY because it selects CustomerID next to the aggregate):
SELECT T1.BScore
FROM
CustomerDB..TScore T1
JOIN
(SELECT Max(ApplNo) AS MaxApplNo, CustomerID
FROM CustomerDB..TScore sc2
WHERE sc2.CustomerID = 12345
GROUP BY sc2.CustomerID
) T2 ON T1.CustomerID = T2.CustomerID AND T1.ApplNo = T2.MaxApplNo
If you are looking for ApplNo values independent of CustomerID, then I'd look at a separate index. This matches the intent of your current code:
CREATE INDEX IX_ApplNo ON TScore (ApplNo) INCLUDE (BScore);
Reversing the key order won't help, because then your WHERE sc2.CustomerID = 12345 will scan, not seek.
Note: using NOLOCK everywhere is bad practice, since it allows dirty reads of uncommitted data.