implementing a divide query in postgresql


Basically I have these relations: account(account_number, branch_name, balance) and city(branch_name, country). What I want is all the account_numbers of the people who have accounts in all the cities in the city table, but for some reason I'm getting an empty result and I don't get why.
PS: I've seen some confusion about what "divide" means. It is the division of two sets, as in relational algebra: I want to divide the account set by the city set, which should return all accounts that have a branch_name in every city in the city table.
Example:
account table
--------------|-----------|---------
account_number|branch_name|balance
--------------|-----------|---------
'A-000000' |'Downtown' | 3467
'A-000001' |'Downtown' | 1500
'A-000002' |'London' | 1500
'A-000826' |'Manchester'| 9999999
'A-000826' |'Downtown' | 33399
------------------------------------
city
--------------|-----------
branch_name | country
--------------|-----------
'Manchester' | 'UK'
'Downtown' | 'USA'
--------------------------
Account ÷ city
--------------|-----------|---------
account_number|branch_name|balance
--------------|-----------|---------
'A-000826' |'Manchester'| 9999999
'A-000826' |'Downtown' | 33399
-----------------------------------
'A-000826' is the only account in all of the cities in the city table
Program:
create table account(
account_number char(9),
branch_name varchar(80) not null,
balance numeric(16,4));
create table city(
branch_name varchar(80) not null,
country varchar(80) not null,
check(branch_name != ''));
INSERT INTO city(branch_name,country)
VALUES ('Manchester','UK'),
('Downtown','USA');
INSERT INTO account(account_number, branch_name, balance)
VALUES ('A-000000','Downtown',3467),
('A-000001','Downtown',1500),
('A-000002','London',1500),
('A-000826','Manchester',9999999),
('A-000826','Downtown',33399);
Query:
select *
from account as A
where not exists ( (select branch_name
from account)
except
(select C.branch_name
from city as C
where A.branch_name = C.branch_name)
)

I agree with Tim Biegeleisen's answer.
Your logic in the where not exists was backwards.
If you insist on sticking to relational algebra, then try this, instead:
select *
from account as A
where not exists (
select branch_name
from city
except
select branch_name
from account b
where b.account_number = a.account_number
);
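To check the corrected division against the sample data from the question, here is a minimal sketch that runs the same NOT EXISTS / EXCEPT shape through Python's built-in sqlite3 module. Using SQLite instead of PostgreSQL is an assumption made only so the snippet is self-contained; the column types are simplified, and DISTINCT is added so each qualifying account number is listed once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE account (account_number TEXT, branch_name TEXT NOT NULL, balance NUMERIC);
    CREATE TABLE city (branch_name TEXT NOT NULL, country TEXT NOT NULL);
    INSERT INTO city VALUES ('Manchester', 'UK'), ('Downtown', 'USA');
    INSERT INTO account VALUES
        ('A-000000', 'Downtown', 3467),
        ('A-000001', 'Downtown', 1500),
        ('A-000002', 'London', 1500),
        ('A-000826', 'Manchester', 9999999),
        ('A-000826', 'Downtown', 33399);
""")

# For each account: (all city branches) minus (branches this account has).
# If nothing is left over, the account covers every city.
division = """
    SELECT DISTINCT a.account_number
    FROM account AS a
    WHERE NOT EXISTS (
        SELECT branch_name FROM city
        EXCEPT
        SELECT b.branch_name FROM account AS b
        WHERE b.account_number = a.account_number
    )
"""
result = [row[0] for row in conn.execute(division)]
print(result)
```

Only 'A-000826' survives, because it is the only account number with a row for both 'Manchester' and 'Downtown'.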

I would use an aggregation approach:
SELECT a.account_number
FROM account a
INNER JOIN city c
ON c.branch_name = a.branch_name
GROUP BY a.account_number
HAVING COUNT(*) = (SELECT COUNT(*) FROM city);
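One caveat with the aggregation approach: COUNT(*) counts joined rows, so it relies on each (account_number, branch_name) pair appearing only once and on branch_name being unique in city. The sketch below uses made-up duplicate rows (not the question's data) and Python's sqlite3 module to show the difference between COUNT(*) and COUNT(DISTINCT a.branch_name).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE account (account_number TEXT, branch_name TEXT);
    CREATE TABLE city (branch_name TEXT, country TEXT);
    INSERT INTO city VALUES ('Manchester', 'UK'), ('Downtown', 'USA');
    -- Hypothetical data: A-000001 appears twice at the same branch
    INSERT INTO account VALUES
        ('A-000001', 'Downtown'), ('A-000001', 'Downtown'),
        ('A-000826', 'Manchester'), ('A-000826', 'Downtown');
""")

template = """
    SELECT a.account_number
    FROM account a
    INNER JOIN city c ON c.branch_name = a.branch_name
    GROUP BY a.account_number
    HAVING {counter} = (SELECT COUNT(*) FROM city)
    ORDER BY a.account_number
"""
# COUNT(*) sees two joined rows for A-000001 and wrongly accepts it
with_count_star = [r[0] for r in conn.execute(template.format(counter="COUNT(*)"))]
# COUNT(DISTINCT ...) counts each branch once per account
with_distinct = [r[0] for r in conn.execute(
    template.format(counter="COUNT(DISTINCT a.branch_name)"))]

print(with_count_star)
print(with_distinct)
```

If the table already guarantees one row per account and branch, the plain COUNT(*) version is fine.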

Related

Select all entries from one table which has two specific entries in another table

So, I have 2 tables defined like this:
CREATE TABLE tblPersons (
id INTEGER PRIMARY KEY AUTOINCREMENT,
name TEXT
);
CREATE TABLE tblHobbies (
person_id INTEGER REFERENCES tblPersons (id),
hobby TEXT
);
And for example I have 3 persons added to tblPersons:
1 | John
2 | Bob
3 | Eve
And next hobbies in tblHobbies:
1 | skiing
1 | surfing
1 | hiking
1 | gunsmithing
1 | driving
2 | table tennis
2 | driving
2 | hiking
3 | reading
3 | scuba diving
What I need is a query that returns the list of persons who have several specific hobbies.
The only thing I could come up with is this:
SELECT id, name FROM tblPersons
INNER JOIN tblHobbies as hobby1 ON hobby1.hobby = 'driving'
INNER JOIN tblHobbies as hobby2 ON hobby2.hobby = 'hiking'
WHERE tblPersons.id = hobby1.person_id and tblPersons.id = hobby2.person_id;
But it is rather slow. Isn't there any better solution?

First, you don't have a primary key on tblHobbies; this is one cause of the slow query (and of other problems). You should also consider creating an index on tblHobbies.hobby.
Second, I'd advise you to create a third table that makes the N:N cardinality in your model explicit and avoids redundant hobbies. Something like:
--Person
CREATE TABLE tblPersons (
id INTEGER PRIMARY KEY AUTOINCREMENT,
name TEXT
);
--Hobby
CREATE TABLE tblHobbies (
id INTEGER PRIMARY KEY AUTOINCREMENT,
hobby TEXT
);
--Associative table between Person and Hobby
CREATE TABLE tblPersonsHobbies (
person_id INTEGER REFERENCES tblPersons (id),
hobby_id INTEGER REFERENCES tblHobbies (id),
PRIMARY KEY (person_id, hobby_id)
);
Adds an extra table but it's worth it.
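--Query on your current model
SELECT tblPersons.id, tblPersons.name FROM tblPersons
INNER JOIN tblHobbies as hobby1 ON tblPersons.id = hobby1.person_id
WHERE hobby1.hobby IN ('driving', 'hiking')
GROUP BY tblPersons.id, tblPersons.name
HAVING COUNT(DISTINCT hobby1.hobby) = 2; --keep only persons who have both hobbies
--Query on suggested model
SELECT tblPersons.id, tblPersons.name FROM tblPersons
INNER JOIN tblPersonsHobbies as personsHobby ON tblPersons.id = personsHobby.person_id
INNER JOIN tblHobbies as hobby1 ON hobby1.id = personsHobby.hobby_id
WHERE hobby1.hobby IN ('driving', 'hiking')
GROUP BY tblPersons.id, tblPersons.name
HAVING COUNT(DISTINCT hobby1.hobby) = 2; --keep only persons who have both hobbies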

You can aggregate the hobbies table to get persons with both hobbies:
select person_id
from tblhobbies
group by person_id
having count(case when hobby = 'driving' then 1 end) > 0
and count(case when hobby = 'hiking' then 1 end) > 0
Or better with a WHERE clause restricting the records to read:
select person_id
from tblhobbies
where hobby in ('driving', 'hiking')
group by person_id
having count(distinct hobby) =2
(There should be a unique constraint on person + hobby in the table, though. Then you could remove the DISTINCT. And as I said in the comments section, it should even be person_id + hobby_id with a separate hobbies table. EDIT: Oops, I should have read the other answer. Michal suggested this data model three hours ago already :-) )
If you want the names, select from the persons table where you find the IDs in above query:
select id, name
from tblpersons
where id in
(
select person_id
from tblhobbies
where hobby in ('driving', 'hiking')
group by person_id
having count(distinct hobby) =2
);
With the better data model you'd replace
from tblhobbies
where hobby in ('driving', 'hiking')
group by person_id
having count(distinct hobby) =2
with
from tblPersonsHobbies
where hobby_id in (select id from tblhobbies where hobby in ('driving', 'hiking'))
group by person_id
having count(*) =2
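As a quick check of the GROUP BY / HAVING idea on the question's original two-table model, here is a small sketch using Python's sqlite3 module. The `wanted` list and the placeholder handling are my own additions for illustration: they show how the same query extends to any number of required hobbies by comparing the distinct count with the length of the list.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblPersons (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT);
    CREATE TABLE tblHobbies (person_id INTEGER REFERENCES tblPersons (id), hobby TEXT);
    INSERT INTO tblPersons (name) VALUES ('John'), ('Bob'), ('Eve');
    INSERT INTO tblHobbies VALUES
        (1, 'skiing'), (1, 'surfing'), (1, 'hiking'), (1, 'gunsmithing'), (1, 'driving'),
        (2, 'table tennis'), (2, 'driving'), (2, 'hiking'),
        (3, 'reading'), (3, 'scuba diving');
""")

wanted = ["driving", "hiking"]  # hobbies a person must have, all of them
placeholders = ", ".join("?" for _ in wanted)
query = f"""
    SELECT id, name
    FROM tblPersons
    WHERE id IN (
        SELECT person_id FROM tblHobbies
        WHERE hobby IN ({placeholders})
        GROUP BY person_id
        HAVING COUNT(DISTINCT hobby) = ?
    )
    ORDER BY id
"""
# Bind the hobby names, then the number of hobbies that must match
people = conn.execute(query, (*wanted, len(wanted))).fetchall()
print(people)
```

John and Bob come back; Eve has neither hobby and is filtered out.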

SQL Server : query to update record with latest entry

I have a table that maintains records of employers and employees' data. Something like this
EmployerName EmployerPhone EmployerAddress EmployeeName EmployeePhone EmployeeAddress Date
-------------------------------------------------------------------------------------------------------
John 12345 NewYork Harry 59786 NewYork 12-1-1991
Mac 22345 Bankok John 12345 Delhi 12-3-1991
Smith 54732 Arab Amar 59226 China 21-6-1991
Sarah 12345 Bhutan Mac 22345 NewYork 5-9-1991
Root 85674 NewYork Smith 54732 Japan 2-11-1991
I have another table that will have generic records on the basis of phone number (both employers and employees).
Table structure is as following
Phone Name Address
I want to copy the latest record for each phone number (according to Date) from Table1 into Table2.
Like this
Phone Name Address
-----------------------
59786 Harry NewYork
22345 Mac NewYork
59226 Amar China
12345 Sarah Bhutan
22345 Mac NewYork
85674 Root NewYork
54732 Smith Arab
I've written many queries, but none of them produced the required result.
Any kind of help will be appreciated.

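To initialize the table with the phones that are not in Table2 yet (note that UNION only removes fully identical rows, so a phone that occurs with several names or addresses still appears more than once):
INSERT INTO Table2 (Phone, Name, Address)
SELECT X.* FROM (
SELECT EmployeePhone AS Phone, EmployeeName AS Name, EmployeeAddress AS Address FROM Table1
UNION
SELECT EmployerPhone, EmployerName, EmployerAddress FROM Table1
) X
WHERE NOT EXISTS (SELECT Phone FROM Table2 WHERE Phone=X.Phone)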

I think this is what you are looking for if I understand your question correctly. Should work for a once-off
-- Table variable holding one row per (name, phone, role) with its latest date
DECLARE @restbl TABLE
(
Name varchar(100),
Phone varchar(20),
Addr varchar(100),
[Date] date,
RecType varchar(100)
)
INSERT INTO @restbl
SELECT EmployerName, EmployerPhone, NULL, MAX([Date]), 'Employer'
FROM #tbl
GROUP BY EmployerName, EmployerPhone
INSERT INTO @restbl
SELECT EmployeeName, EmployeePhone, NULL, MAX([Date]), 'Employee'
FROM #tbl
GROUP BY EmployeeName, EmployeePhone;
WITH LatestData (Name, Phone, [Date])
AS
(
SELECT Name, Phone, MAX([Date])
FROM @restbl
GROUP BY Name, Phone
)
INSERT INTO FinalTable (Name, Phone, [Address])
SELECT DISTINCT ld.Name, ld.Phone, ISNULL(tEmployer.EmployerAddress, tEmployee.EmployeeAddress) AS [Address]
FROM LatestData ld
LEFT JOIN #tbl tEmployer ON ld.Name = tEmployer.EmployerName AND ld.Phone = tEmployer.EmployerPhone AND ld.Date = tEmployer.Date
LEFT JOIN #tbl tEmployee ON ld.Name = tEmployee.EmployeeName AND ld.Phone = tEmployee.EmployeePhone AND ld.Date = tEmployee.Date
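For comparison, here is a minimal sketch of "latest row per phone" that stacks the employer and employee columns with UNION ALL and then keeps the rows that have no later row for the same phone. It runs on SQLite through Python's sqlite3 module rather than SQL Server, which is an assumption made only to keep it self-contained. The sample dates are read as day-month-year and stored as ISO strings so that string comparison follows date order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (
        EmployerName TEXT, EmployerPhone TEXT, EmployerAddress TEXT,
        EmployeeName TEXT, EmployeePhone TEXT, EmployeeAddress TEXT,
        Date TEXT  -- ISO yyyy-mm-dd (assumed day-month-year in the question)
    );
    INSERT INTO Table1 VALUES
        ('John',  '12345', 'NewYork', 'Harry', '59786', 'NewYork', '1991-01-12'),
        ('Mac',   '22345', 'Bankok',  'John',  '12345', 'Delhi',   '1991-03-12'),
        ('Smith', '54732', 'Arab',    'Amar',  '59226', 'China',   '1991-06-21'),
        ('Sarah', '12345', 'Bhutan',  'Mac',   '22345', 'NewYork', '1991-09-05'),
        ('Root',  '85674', 'NewYork', 'Smith', '54732', 'Japan',   '1991-11-02');
""")

latest_per_phone = """
    WITH contacts AS (
        SELECT EmployerPhone AS Phone, EmployerName AS Name,
               EmployerAddress AS Address, Date
        FROM Table1
        UNION ALL
        SELECT EmployeePhone, EmployeeName, EmployeeAddress, Date
        FROM Table1
    )
    SELECT c.Phone, c.Name, c.Address
    FROM contacts c
    WHERE NOT EXISTS (           -- no newer row for the same phone
        SELECT 1 FROM contacts later
        WHERE later.Phone = c.Phone AND later.Date > c.Date
    )
    ORDER BY c.Phone
"""
rows = conn.execute(latest_per_phone).fetchall()
for row in rows:
    print(row)
```

Under this reading of the dates, the latest row for phone 54732 carries the address Japan. If a phone can appear twice on the same date, this query returns both rows, so a tie-breaker would be needed in that case.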

SQL count with join

Total novice. Trying this again.
There are 2 tables, Biz and Users.
Biz has IdNum, created_at, account_type, business_name
Users has IdNum, country, first_name, last_name
Question: How many total businesses are from Japan?
I know I need to use inner join.

I made some assumptions and created this example.
First I created two tables, db_users and db_partners and
inserted some sample data. I am assuming "Users" are
sales managers for the "Partners" and that each partner
is assigned one user. Users can have multiple partners.
It is strange that "Country" is an attribute of Users and
not Partners, but that was how I interpreted the example.
MariaDB [test_time]> create table db_users (
UserID int unsigned not null auto_increment primary key,
UserName varchar(20),
Country varchar(8)
);
MariaDB [test_time]> create table db_partners (
PartnerID int unsigned not null auto_increment primary key,
PartnerName varchar(20),
Created datetime,
Size int unsigned,
UserID int unsigned
);
MariaDB [test_time]> insert into db_users
(UserName,Country)
values
('Abel','CA'),
('Baker','CA'),
('Charlie','JP'),
('Donald','JP'),
('Edgar','JP')
;
MariaDB [test_time]> insert into db_partners
(PartnerName,Created,Size,UserID)
values
('Kraft',now(),45,1),
('Ford',now(),66,2),
('Hortons',now(),22,1),
('Kroger',now(),15,4)
;
Then I selected the partners where the associated user was in CA:
MariaDB [test_time]> select
UserName,PartnerName
from
db_users join db_partners using (UserID)
where
Country='CA'
;
+----------+-------------+
| UserName | PartnerName |
+----------+-------------+
| Abel | Kraft |
| Abel | Hortons |
| Baker | Ford |
+----------+-------------+
3 rows in set (0.00 sec)
MariaDB [test_time]> select
count(Country)
from
db_users join db_partners using (UserID)
where
Country='CA'
;
+----------------+
| count(Country) |
+----------------+
| 3 |
+----------------+
1 row in set (0.00 sec)
I am not sure this is what you wanted. If not, please clarify your
question.

If IdNum is your foreign key (join condition between your tables) just use
SELECT count(*)
FROM Business, Users
WHERE Business.IdNum = Users.IdNum
AND Users.country = 'Japan';

If you wanted the total count of all businesses from Japan (assuming you can join Users to Biz on the IdNum field), counting each distinct business only once-
select count(distinct b.IdNum) --will only count each business once
from Biz b
inner join Users u
on b.IdNum = u.IdNum
where u.country = 'Japan'
If you wanted a count of all rows, instead of the unique rows-
select count(u.country) --will count all entries
from Biz b
inner join Users u
on b.IdNum = u.IdNum
where u.country = 'Japan'
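To make the difference between the two counts concrete, here is a small sketch on invented rows, run through Python's sqlite3 module. The sample names and the assumption that IdNum links Biz to Users are mine; the question does not show any data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users (IdNum INTEGER, country TEXT, first_name TEXT, last_name TEXT);
    CREATE TABLE Biz (IdNum INTEGER, created_at TEXT, account_type TEXT, business_name TEXT);
    -- Hypothetical sample rows; IdNum is assumed to be the join key
    INSERT INTO Users VALUES
        (1, 'Japan', 'Aiko', 'Sato'),
        (2, 'Japan', 'Ken', 'Ito'),
        (3, 'Canada', 'Lee', 'Roy');
    INSERT INTO Biz VALUES
        (1, '2020-01-01', 'basic', 'Sakura Foods'),
        (1, '2020-02-01', 'pro', 'Sakura Logistics'),
        (2, '2020-03-01', 'basic', 'Ito Tools'),
        (3, '2020-04-01', 'basic', 'Maple Co');
""")

counts = conn.execute("""
    SELECT COUNT(*)                AS joined_rows,   -- every Biz row linked to Japan
           COUNT(DISTINCT b.IdNum) AS distinct_ids   -- each IdNum counted once
    FROM Biz b
    INNER JOIN Users u ON b.IdNum = u.IdNum
    WHERE u.country = 'Japan'
""").fetchone()
joined_rows, distinct_ids = counts
print(joined_rows, distinct_ids)
```

Which of the two numbers answers "how many businesses are from Japan" depends on whether IdNum identifies a business or the user who owns it.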

Join logic from two separate tables in sql

We returned a list of cardID's after a query and those cardID's belong to two tables Student and Personnel. So how can I join those cardID's with Student and Personnel so I can return a table that shows name of Student and Personnel according to cardID's?
Personnel table:
PERSONNELID NUMBER(9,0)
PERSONNELNAME VARCHAR2(20)
PERSONNELSURNAME VARCHAR2(20)
PERSONNELJOB VARCHAR2(40)
PERSONNELCARDID NUMBER(4,0)
Student table:
STUDENTID NUMBER(9,0)
STUDENTNAME VARCHAR2(20)
STUDENTSURNAME VARCHAR2(20)
STUDENTDEPT VARCHAR2(40)
STUDENTFACULTY VARCHAR2(20)
STUDENTCARDID NUMBER(4,0)
CardID table
CARDID NUMBER(4,0)
USERTYPE VARCHAR2(20)
CHARGE NUMBER(3,2)
CREDIT NUMBER(4,2)
PaymentDevice table:
ORDERNO NUMBER
PAYDEVIP NUMBER(8,0)
PAYDEVDATE DATE
PAYDEVTIME VARCHAR2(8)
CHARGEDCARDID NUMBER(9,0)
MEALTYPE VARCHAR2(10)
I tried to return the names and surnames of the first 10 persons who ate at the cafeteria on 27/12/2012:
SELECT C.CARDID
FROM CARD C, PAYMENTDEVICE P
WHERE P.ORDERNO
BETWEEN (SELECT MIN(ORDERNO)
FROM PAYMENTDEVICE
WHERE PAYDEVDATE='27/12/2012') AND (SELECT MIN(ORDERNO)
FROM PAYMENTDEVICE
WHERE PAYDEVDATE='27/12/2012')+10 AND C.CARDID=P.CHARGEDCARDID;
Our orderNo isn't reset every day but keeps increasing, so we find the minimum orderNo of that day and add 10 to it; the persons between those two order numbers are the first 10 who ate on that day.
This is what the query returns:
CARDID
1005
1000
1002
1003
1009
2000
2001
1007
2002
1004
1006
Some of those cardIDs (starting with 1) are StudentCardIds and some (starting with 2) are PersonnelCardIds. So how can I match them and show the names accordingly?

SELECT *
FROM Personnel p INNER JOIN Student s
ON p.PersonnelCardId = s.StudentCardId
INNER JOIN ReturnedQuery rq
ON rq.CardId = p.PersonnelCardId
updated:
SELECT p.PersonnelName, rq.CardId
FROM Personnel p INNER JOIN ReturnedQuery rq
ON rq.CardId = p.PersonnelCardId
UNION
SELECT s.StudentName, rq.Cardid
FROM Student s INNER JOIN ReturnedQuery rq
ON s.StudentCardId = rq.Cardid
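A small sketch of the UNION idea, run through Python's sqlite3 module: each branch joins the card list to one of the two tables, and a literal column records which table the name came from. The ReturnedQuery table stands in for the result of the asker's card query, and all sample names are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Personnel (PersonnelName TEXT, PersonnelSurname TEXT, PersonnelCardId INTEGER);
    CREATE TABLE Student (StudentName TEXT, StudentSurname TEXT, StudentCardId INTEGER);
    CREATE TABLE ReturnedQuery (CardId INTEGER);  -- stand-in for the card id result
    -- Hypothetical sample rows
    INSERT INTO Personnel VALUES ('Ada', 'Stone', 2000), ('Ben', 'Hale', 2001);
    INSERT INTO Student VALUES ('Cem', 'Kaya', 1000), ('Dee', 'Moss', 1002), ('Eli', 'Nash', 1999);
    INSERT INTO ReturnedQuery VALUES (1000), (1002), (2000);
""")

names_by_card = conn.execute("""
    SELECT rq.CardId, 'Personnel' AS UserType,
           p.PersonnelName AS Name, p.PersonnelSurname AS Surname
    FROM ReturnedQuery rq
    JOIN Personnel p ON p.PersonnelCardId = rq.CardId
    UNION ALL
    SELECT rq.CardId, 'Student', s.StudentName, s.StudentSurname
    FROM ReturnedQuery rq
    JOIN Student s ON s.StudentCardId = rq.CardId
    ORDER BY 1
""").fetchall()
for row in names_by_card:
    print(row)
```

UNION ALL is enough here as long as a card id belongs to either a student or a staff member, never both; plain UNION would additionally remove identical rows.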

Your original query is actually pretty fragile. I'd rewrite it like so (and added the needed joins):
WITH First_Daily_Purchase as (SELECT chargedCardId,
MIN(payDevTime) as payDevTime,
MIN(orderNo) as orderNo
FROM PaymentDevice
WHERE payDevDate >=
TO_DATE('2012-12-27', 'YYYY-MM-DD')
AND payDevDate <
TO_DATE('2012-12-28', 'YYYY-MM-DD')
GROUP BY chargedCardId),
First_10_Daily_Purchasers as (SELECT chargedCardId
FROM (SELECT chargedCardId,
RANK() OVER(ORDER BY payDevTime,
orderNo) as rank
FROM First_Daily_Purchase) a
WHERE a.rank < 11)
SELECT a.chargedCardId, b.personnelName, b.personnelSurname
FROM First_10_Daily_Purchasers a
JOIN Personnel b
ON b.personnelCardId = a.chargedCardId
UNION ALL
SELECT a.chargedCardId, b.studentName, b.studentSurname
FROM First_10_Daily_Purchasers a
JOIN Student b
ON b.studentCardId = a.chargedCardId
(Have a working SQL Fiddle - generally bullet-proofing this took me a while.)
This should get you the first 10 people who made a purchase (not the first 11 purchases, which is what you were actually getting). This of course assumes that payDevTime is actually stored in a sortable format (if it isn't you have bigger problems than this query not working quite right).
That said, there's a number of troubling things about your schema design.

Add or delete repeated row

I have an output like this:
id name date school school1
1 john 11/11/2001 nyu ucla
1 john 11/11/2001 ucla nyu
2 paul 11/11/2011 uft mit
2 paul 11/11/2011 mit uft
I would like to achieve this:
id name date school school1
1 john 11/11/2001 nyu ucla
2 paul 11/11/2011 mit uft
I am using direct join as in:
select distinct
a.id, a.name,
b.date,
c.school,
a1.id, a1.name,
b1.date,
c1.school
from table a, table b, table c,table a1, table b1, table c1
where
a.id=b.id
and...
Any ideas?

We will need more information such as what your tables contain and what you are after.
One thing I noticed is that you have a school and then a school1 column. Normalization (first normal form) says you should never duplicate fields and append numbers to them to hold more information, even if you think the relationship will only ever have 1 or 2 additional items. You need to create a second table that associates a user with one to many schools.

I agree with everyone else that both your source table and your desired output are poor design. While you probably can't do anything about your source table, I recommend the following code and output:
Select id, name, date, school from MyTable
union
Select id, name, date, school1 from MyTable;
(repeat as necessary)
This will give you results in the format:
id name date school
1 john 11/11/2001 nyu
1 john 11/11/2001 ucla
2 paul 11/11/2011 mit
2 paul 11/11/2011 uft
(Note: in my version of SQL, union queries automatically select distinct records so the distinct flag isn't needed)
With this format, you could easily count the number of schools per student, number of students per school, etc.
If processing time and/or storage space is a factor here, you could then split this into 2 tables, 1 with the id,name & date, the other with the id & school (basically what JonH just said). But if you're just working up some simple statistics, this should suffice.
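Here is a minimal sketch of that UNION on the four rows from the question, using Python's sqlite3 module. The table name MyTable comes from the answer above; storing the output in a single table is an assumption, since the question does not show the real source tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MyTable (id INTEGER, name TEXT, date TEXT, school TEXT, school1 TEXT);
    INSERT INTO MyTable VALUES
        (1, 'john', '11/11/2001', 'nyu', 'ucla'),
        (1, 'john', '11/11/2001', 'ucla', 'nyu'),
        (2, 'paul', '11/11/2011', 'uft', 'mit'),
        (2, 'paul', '11/11/2011', 'mit', 'uft');
""")

# UNION (without ALL) drops duplicate rows, so each student/school pair appears once
rows = conn.execute("""
    SELECT id, name, date, school FROM MyTable
    UNION
    SELECT id, name, date, school1 FROM MyTable
    ORDER BY 1, 4
""").fetchall()
for row in rows:
    print(row)
```

The eight input values collapse to four rows, one per student and school.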

This problem was just too irresistible, so I took a guess at the data structures we are dealing with. The technology wasn't specified in the question. This is in Transact-SQL.
create table student
(
id int not null primary key identity,
name nvarchar(100) not null default '',
graduation_date date not null default getdate(),
)
go
create table school
(
id int not null primary key identity,
name nvarchar(100) not null default ''
)
go
create table student_school_asc
(
student_id int not null foreign key references student (id),
school_id int not null foreign key references school (id),
primary key (student_id, school_id)
)
go
insert into student (name, graduation_date) values ('john', '2001-11-11')
insert into student (name, graduation_date) values ('paul', '2011-11-11')
insert into school (name) values ('nyu')
insert into school (name) values ('ucla')
insert into school (name) values ('uft')
insert into school (name) values ('mit')
insert into student_school_asc (student_id, school_id) values (1,1)
insert into student_school_asc (student_id, school_id) values (1,2)
insert into student_school_asc (student_id, school_id) values (2,3)
insert into student_school_asc (student_id, school_id) values (2,4)
select
s.id,
s.name,
s.graduation_date as [date],
(select max(name) from
(select name,
RANK() over (order by name) as rank_num
from school sc
inner join student_school_asc ssa on ssa.school_id = sc.id
where ssa.student_id = s.id) s1 where s1.rank_num = 1) as school,
(select max(name) from
(select name,
RANK() over (order by name) as rank_num
from school sc
inner join student_school_asc ssa on ssa.school_id = sc.id
where ssa.student_id = s.id) s2 where s2.rank_num = 2) as school1
from
student s
Result:
id name date school school1
--- ----- ---------- ------- --------
1 john 2001-11-11 nyu ucla
2 paul 2011-11-11 mit uft
