SQL count with join - sql

Total novice. Trying this again.
Two tables: Biz (Business) and Users
Business has IdNum, created_at, account_type, business_name
Users has IdNum, country, first_name, last_name
Question: How many total businesses are from Japan?
I know I need to use inner join.

I made some assumptions and created this example.
First I created two tables, db_users and db_partners and
inserted some sample data. I am assuming "Users" are
sales managers for the "Partners" and that each partner
is assigned one user. Users can have multiple partners.
It is strange that "Country" is an attribute of Users and
not Partners, but that was how I interpreted the example.
MariaDB [test_time]> create table db_users (
UserID int unsigned not null auto_increment primary key,
UserName varchar(20),
Country varchar(8)
);
MariaDB [test_time]> create table db_partners (
PartnerID int unsigned not null auto_increment primary key,
PartnerName varchar(20),
Created datetime,
Size int unsigned,
UserID int unsigned
);
MariaDB [test_time]> insert into db_users
(UserName,Country)
values
('Abel','CA'),
('Baker','CA'),
('Charlie','JP'),
('Donald','JP'),
('Edgar','JP')
;
MariaDB [test_time]> insert into db_partners
(PartnerName,Created,Size,UserID)
values
('Kraft',now(),45,1),
('Ford',now(),66,2),
('Hortons',now(),22,1),
('Kroger',now(),15,4)
;
Then I selected the partners where the associated user was in CA:
MariaDB [test_time]> select
UserName,PartnerName
from
db_users join db_partners using (UserID)
where
Country='CA'
;
+----------+-------------+
| UserName | PartnerName |
+----------+-------------+
| Abel | Kraft |
| Abel | Hortons |
| Baker | Ford |
+----------+-------------+
3 rows in set (0.00 sec)
MariaDB [test_time]> select
count(Country)
from
db_users join db_partners using (UserID)
where
Country='CA'
;
+----------------+
| count(Country) |
+----------------+
| 3 |
+----------------+
1 row in set (0.00 sec)
I am not sure this is what you wanted. If not, please clarify your
question.
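To answer the Japan part of the question against the same sample data, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MariaDB (that swap is my assumption, as is the helper name count_partners). The table names and rows are the ones from the example above.

```python
import sqlite3

# In-memory SQLite database standing in for the MariaDB tables above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE db_users (
    UserID   INTEGER PRIMARY KEY AUTOINCREMENT,
    UserName TEXT,
    Country  TEXT
);
CREATE TABLE db_partners (
    PartnerID   INTEGER PRIMARY KEY AUTOINCREMENT,
    PartnerName TEXT,
    Created     TEXT,
    Size        INTEGER,
    UserID      INTEGER
);
INSERT INTO db_users (UserName, Country) VALUES
    ('Abel','CA'), ('Baker','CA'), ('Charlie','JP'), ('Donald','JP'), ('Edgar','JP');
INSERT INTO db_partners (PartnerName, Created, Size, UserID) VALUES
    ('Kraft',   datetime('now'), 45, 1),
    ('Ford',    datetime('now'), 66, 2),
    ('Hortons', datetime('now'), 22, 1),
    ('Kroger',  datetime('now'), 15, 4);
""")

def count_partners(country):
    """Count partners whose assigned user is in the given country (hypothetical helper)."""
    row = conn.execute(
        "SELECT COUNT(*) FROM db_users JOIN db_partners USING (UserID) WHERE Country = ?",
        (country,),
    ).fetchone()
    return row[0]

jp_count = count_partners("JP")  # only Kroger is assigned to a JP user (Donald)
ca_count = count_partners("CA")  # Kraft, Hortons and Ford
print(jp_count, ca_count)  # prints: 1 3
```

Charlie and Edgar are in JP but have no partners, so the inner join drops them and they do not inflate the count.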

If IdNum is your foreign key (the join condition between your tables), just use
SELECT count(*)
FROM Business, Users
WHERE Business.IdNum = Users.IdNum
AND Users.country = 'Japan';

If you want the total count of businesses from Japan (assuming you can join from the ID field in Users to the ID field in Biz), counting each distinct business only once:
select count(distinct b.ID) --counts each business only once
from Biz b
inner join Users u
on b.ID = u.ID
where u.country = 'Japan'
If you want a count of all joined rows instead of the unique ones:
select count(u.country) --will count all entries
from Biz b
inner join Users u
on b.ID = u.ID
where u.country = 'Japan'
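The difference between the counts only shows up when the join produces more than one row per business. A small sketch with Python's sqlite3 module, assuming made-up sample data (the business and user names are hypothetical) in which one business has two users in Japan:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Biz   (ID INTEGER, business_name TEXT);
CREATE TABLE Users (ID INTEGER, country TEXT, first_name TEXT);
-- Hypothetical rows: business 1 has two Japanese users, so it joins twice.
INSERT INTO Biz VALUES (1, 'Acme'), (2, 'Globex'), (3, 'Initech');
INSERT INTO Users VALUES (1, 'Japan', 'Aiko'), (1, 'Japan', 'Ken'),
                         (2, 'Japan', 'Yuki'), (3, 'Canada', 'Bob');
""")

# Shared FROM/WHERE part so only the aggregate differs between the queries.
base = "FROM Biz b INNER JOIN Users u ON b.ID = u.ID WHERE u.country = 'Japan'"

all_rows = conn.execute("SELECT COUNT(*) " + base).fetchone()[0]
distinct_businesses = conn.execute("SELECT COUNT(DISTINCT b.ID) " + base).fetchone()[0]
distinct_countries = conn.execute("SELECT COUNT(DISTINCT u.country) " + base).fetchone()[0]

print(all_rows, distinct_businesses, distinct_countries)  # prints: 3 2 1
```

Note that COUNT(DISTINCT u.country) counts distinct country values, which is always 1 once the WHERE clause has filtered to a single country, so the DISTINCT has to be applied to the business key.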


implementing a divide query in postgresql

Basically I have these relations: account(account_number, branch_name, balance) and city(branch_name, country). What I want is the account_numbers of everyone who has an account in every city in the city table, but for some reason my query returns an empty result and I don't understand why.
PS: I've seen some confusion about what "divide" means. It is the division of two sets, as in relational algebra: I want to divide the account set by the city set, which should return all accounts that have a branch_name in every city in the city table.
Example:
account table
--------------|-----------|---------
account_number|branch_name|balance
--------------|-----------|---------
'A-000000' |'Downtown' | 3467
'A-000001' |'Downtown' | 1500
'A-000002' |'London' | 1500
'A-000826' |'Manchester'| 9999999
'A-000826' |'Downtown' | 33399
------------------------------------
city
--------------|-----------
branch_name | country
--------------|-----------
'Manchester' | 'UK'
'Downtown' | 'USA'
--------------------------
account ÷ city
--------------|-----------|---------
account_number|branch_name|balance
--------------|-----------|---------
'A-000826' |'Manchester'| 9999999
'A-000826' |'Downtown' | 33399
-----------------------------------
'A-000826' is the only account in all of the cities in the city table
Program:
create table account(
account_number char(9),
branch_name varchar(80) not null,
balance numeric(16,4));
create table city(
branch_name varchar(80) not null,
country varchar(80) not null,
check(branch_name != ''));
INSERT INTO city(branch_name,country)
VALUES ('Manchester','UK'),
('Downtown','USA');
INSERT INTO account(account_number, branch_name, balance)
VALUES ('A-000000','Downtown',3467),
('A-000001','Downtown',1500),
('A-000002','London',1500),
('A-000826','Manchester',9999999),
('A-000826','Downtown',33399);
Query:
select *
from account as A
where not exists ( (select branch_name
from account)
except
(select C.branch_name
from city as C
where A.branch_name = C.branch_name)
)
I agree with Tim Biegeleisen's answer.
Your logic in the where not exists was backwards.
If you insist on sticking to relational algebra, then try this, instead:
select *
from account as A
where not exists (
select branch_name
from city
except
select branch_name
from account b
where b.account_number = a.account_number
);
I would use an aggregation approach:
SELECT a.account_number
FROM account a
INNER JOIN city c
ON c.branch_name = a.branch_name
GROUP BY a.account_number
HAVING COUNT(*) = (SELECT COUNT(*) FROM city);
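Here is a quick way to check the division against the sample data from the question. This is a sketch using Python's sqlite3 module rather than PostgreSQL (my assumption, to keep it self-contained). It runs the aggregation approach, with COUNT(DISTINCT ...) added as a guard against duplicate rows, and the classic double NOT EXISTS formulation of relational division, which I swapped in for the EXCEPT version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account (account_number TEXT, branch_name TEXT NOT NULL, balance NUMERIC);
CREATE TABLE city    (branch_name TEXT NOT NULL, country TEXT NOT NULL);
INSERT INTO city VALUES ('Manchester','UK'), ('Downtown','USA');
INSERT INTO account VALUES
    ('A-000000','Downtown',3467),
    ('A-000001','Downtown',1500),
    ('A-000002','London',1500),
    ('A-000826','Manchester',9999999),
    ('A-000826','Downtown',33399);
""")

# Aggregation: an account qualifies when it matches as many distinct
# branches as there are distinct branches in city.
aggregated = [r[0] for r in conn.execute("""
    SELECT a.account_number
    FROM account a
    INNER JOIN city c ON c.branch_name = a.branch_name
    GROUP BY a.account_number
    HAVING COUNT(DISTINCT a.branch_name) = (SELECT COUNT(DISTINCT branch_name) FROM city)
""")]

# Double negation: there is no city for which the account has no row.
double_negation = [r[0] for r in conn.execute("""
    SELECT DISTINCT a.account_number
    FROM account a
    WHERE NOT EXISTS (
        SELECT 1 FROM city c
        WHERE NOT EXISTS (
            SELECT 1 FROM account b
            WHERE b.account_number = a.account_number
              AND b.branch_name = c.branch_name))
""")]

print(aggregated, double_negation)  # prints: ['A-000826'] ['A-000826']
```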

How to delete values from first table by using name of the second sql

I have a table groups
group_id | name_group
1 ISI
2 IZI
And a table students
id | first_name | last_name | group_id
6 Bob Surname1 1
17 John Surname2 2
How can I delete all rows from the students table by using groups.name_group?
That is, I need a query that selects all students whose group_id corresponds to a given group name.
group_id 1 = 'ISI'
group_id 2 = 'IZI'
The query must delete by name, not by id.
You can use this query
Delete from Students where group_id=(Select group_id from groups where name_group='ISI');
This deletes all the records with a group_id of 1 (looked up via name_group = 'ISI').
There are different ways. A simple one is to look up the id of the group and delete the matching students. Example:
DECLARE @name AS nvarchar(20) = 'myName';
-- display the affected rows first, just as a check
SELECT s.*, g.group_id
FROM students s
INNER JOIN groups g ON g.group_id = s.group_id
WHERE g.name_group = @name;
-- look up the group id and delete the matching students
DELETE
FROM students
WHERE group_id IN (SELECT group_id FROM groups WHERE name_group = @name);
PS: The DELETE ... IN (subquery) part works in both MySQL and MSSQL; the DECLARE syntax shown here is T-SQL.
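The delete-by-name idea can be tried out with a bound parameter instead of a declared variable. A minimal sketch, assuming Python's sqlite3 module and a hypothetical helper delete_students_in_group; the table name is quoted because GROUPS is a keyword in some engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE "groups" (group_id INTEGER PRIMARY KEY, name_group TEXT);
CREATE TABLE students (id INTEGER PRIMARY KEY, first_name TEXT,
                       last_name TEXT, group_id INTEGER);
INSERT INTO "groups" VALUES (1, 'ISI'), (2, 'IZI');
INSERT INTO students VALUES (6, 'Bob', 'Surname1', 1), (17, 'John', 'Surname2', 2);
""")

def delete_students_in_group(name):
    """Delete every student whose group has the given name; return rows deleted."""
    cur = conn.execute(
        'DELETE FROM students '
        'WHERE group_id IN (SELECT group_id FROM "groups" WHERE name_group = ?)',
        (name,),
    )
    return cur.rowcount

deleted = delete_students_in_group("ISI")
remaining = conn.execute("SELECT id, first_name FROM students ORDER BY id").fetchall()
print(deleted, remaining)  # prints: 1 [(17, 'John')]
```

Using IN rather than = keeps the statement valid even if two groups happen to share a name.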

Validating a summary count column with the actual records

I have a column 'total_approved_sales' in the User table that contains the count of all sales with status 'approved'.
My total_approved_sales column might be off for some users, so I want to list all users whose total_approved_sales doesn't equal the count from the sales table,
i.e. select count(*) from sales where userId=@userId and status='approved'
Table layout looks like:
USER
- total_approved_sales
sales
- userId
- STATUS
How can I query for those users whose counts are off?
Joining to an aggregated derived table:
select
u.UserId
, u.total_approved_sales
, a.recount
from [user] u
left join (
select s.userid, recount = count(*)
from sales s
where s.status = 'approved'
group by s.userid
) a
on u.userid = a.userid
where u.total_approved_sales <> isnull(a.recount,0)
given the following test setup:
create table [user] (userid int, total_approved_sales int);
insert into [user] values (0,0),(1,1),(2,1)
create table sales (userid int, [status] varchar(32))
insert into sales values (1,'approved'),(1,'pending'),(2,'approved'),(2,'approved')
rextester demo: http://rextester.com/TPQZ17719
returns:
+--------+----------------------+---------+
| UserId | total_approved_sales | recount |
+--------+----------------------+---------+
| 2 | 1 | 2 |
+--------+----------------------+---------+
You can achieve this using the APPLY operator:
select *
from [user] u
outer apply (select count(*) from sales where userId=u.userid and status='approved') sales(cnt)
where u.total_approved_sales <> sales.cnt;
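For engines without T-SQL's ISNULL or APPLY, the same check can be written with COALESCE and a LEFT JOIN. A sketch using Python's sqlite3 module (my choice of engine, not the original poster's) and the test rows from the setup above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE "user" (userid INTEGER, total_approved_sales INTEGER);
INSERT INTO "user" VALUES (0, 0), (1, 1), (2, 1);
CREATE TABLE sales (userid INTEGER, status TEXT);
INSERT INTO sales VALUES (1, 'approved'), (1, 'pending'),
                         (2, 'approved'), (2, 'approved');
""")

# Recount approved sales per user, then keep only users whose stored
# total disagrees. COALESCE turns "no approved sales" (NULL) into 0.
mismatches = conn.execute("""
    SELECT u.userid, u.total_approved_sales, COALESCE(a.recount, 0) AS recount
    FROM "user" u
    LEFT JOIN (
        SELECT s.userid, COUNT(*) AS recount
        FROM sales s
        WHERE s.status = 'approved'
        GROUP BY s.userid
    ) a ON u.userid = a.userid
    WHERE u.total_approved_sales <> COALESCE(a.recount, 0)
    ORDER BY u.userid
""").fetchall()

print(mismatches)  # prints: [(2, 1, 2)]
```

The LEFT JOIN matters: with an inner join, a user who has a non-zero stored total but no approved sales at all would silently drop out of the result.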

Select all entries from one table which has two specific entries in another table

So, I have 2 tables defined like this:
CREATE TABLE tblPersons (
id INTEGER PRIMARY KEY AUTOINCREMENT,
name TEXT
);
CREATE TABLE tblHobbies (
person_id INTEGER REFERENCES tblPersons (id),
hobby TEXT
);
For example, I have 3 persons in tblPersons:
1 | John
2 | Bob
3 | Eve
And these hobbies in tblHobbies:
1 | skiing
1 | surfing
1 | hiking
1 | gunsmithing
1 | driving
2 | table tennis
2 | driving
2 | hiking
3 | reading
3 | scuba diving
What I need is a query that returns the list of persons who have several specific hobbies.
The only thing I could come up with is this:
SELECT id, name FROM tblPersons
INNER JOIN tblHobbies as hobby1 ON hobby1.hobby = 'driving'
INNER JOIN tblHobbies as hobby2 ON hobby2.hobby = 'hiking'
WHERE tblPersons.id = hobby1.person_id and tblPersons.id = hobby2.person_id;
But it is rather slow. Isn't there any better solution?
First, you don't have a primary key on tblHobbies; this is one cause of the slow query (and of other problems). You should also consider creating an index on tblHobbies.hobby.
Second, I'd advise you to create a third table to model the N:N relationship that exists in your data and to avoid storing the same hobby text redundantly. Something like:
--Person
CREATE TABLE tblPersons (
id INTEGER PRIMARY KEY AUTOINCREMENT,
name TEXT
);
--Hobby
CREATE TABLE tblHobbies (
id INTEGER PRIMARY KEY AUTOINCREMENT,
hobby TEXT
);
--Associative table between Person and Hobby
CREATE TABLE tblPersonsHobbies (
person_id INTEGER REFERENCES tblPersons (id),
hobby_id INTEGER REFERENCES tblHobbies (id),
PRIMARY KEY (person_id, hobby_id)
);
Adds an extra table but it's worth it.
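--Query on your current model (persons who have both hobbies)
SELECT tblPersons.id, tblPersons.name FROM tblPersons
INNER JOIN tblHobbies as hobby1 ON tblPersons.id = hobby1.person_id
WHERE hobby1.hobby IN ('driving', 'hiking')
GROUP BY tblPersons.id, tblPersons.name
HAVING COUNT(DISTINCT hobby1.hobby) = 2;
--Query on suggested model (id is qualified because both tables now have one)
SELECT tblPersons.id, tblPersons.name FROM tblPersons
INNER JOIN tblPersonsHobbies as personsHobby ON tblPersons.id = personsHobby.person_id
INNER JOIN tblHobbies as hobby1 ON hobby1.id = personsHobby.hobby_id
WHERE hobby1.hobby IN ('driving', 'hiking')
GROUP BY tblPersons.id, tblPersons.name
HAVING COUNT(DISTINCT hobby1.hobby) = 2;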
You can aggregate the hobbies table to get persons with both hobbies:
select person_id
from tblhobbies
group by person_id
having count(case when hobby = 'driving' then 1 end) > 0
and count(case when hobby = 'hiking' then 1 end) > 0
Or better with a WHERE clause restricting the records to read:
select person_id
from tblhobbies
where hobby in ('driving', 'hiking')
group by person_id
having count(distinct hobby) =2
There should be a unique constraint on person + hobby in the table, though; then you could remove the DISTINCT. And as I said in the comments section, it should even be person_id + hobby_id with a separate hobbies table. (EDIT: Oops, I should have read the other answer. Michal suggested this data model three hours ago already :-)
If you want the names, select from the persons table where you find the IDs in above query:
select id, name
from tblpersons
where id in
(
select person_id
from tblhobbies
where hobby in ('driving', 'hiking')
group by person_id
having count(distinct hobby) =2
);
With the better data model you'd replace
from tblhobbies
where hobby in ('driving', 'hiking')
group by person_id
having count(distinct hobby) =2
with
from tblpersonshobbies
where hobby_id in (select id from tblhobbies where hobby in ('driving', 'hiking'))
group by person_id
having count(*) =2
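To see the GROUP BY / HAVING approach work on the sample data from the question, here is a small sketch with Python's sqlite3 module. The helper persons_with_all is a hypothetical name of mine; it generalizes the literal 2 to however many hobbies are asked for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblPersons (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT);
CREATE TABLE tblHobbies (person_id INTEGER REFERENCES tblPersons (id), hobby TEXT);
INSERT INTO tblPersons (name) VALUES ('John'), ('Bob'), ('Eve');
INSERT INTO tblHobbies VALUES
    (1,'skiing'), (1,'surfing'), (1,'hiking'), (1,'gunsmithing'), (1,'driving'),
    (2,'table tennis'), (2,'driving'), (2,'hiking'),
    (3,'reading'), (3,'scuba diving');
""")

def persons_with_all(hobbies):
    """Return (id, name) of every person who has all of the given hobbies."""
    placeholders = ",".join("?" * len(hobbies))
    sql = f"""
        SELECT id, name
        FROM tblPersons
        WHERE id IN (
            SELECT person_id
            FROM tblHobbies
            WHERE hobby IN ({placeholders})
            GROUP BY person_id
            HAVING COUNT(DISTINCT hobby) = ?
        )
        ORDER BY id
    """
    # The last parameter is the number of distinct hobbies required.
    return conn.execute(sql, (*hobbies, len(set(hobbies)))).fetchall()

both = persons_with_all(["driving", "hiking"])
only_john = persons_with_all(["driving", "skiing"])
print(both)       # prints: [(1, 'John'), (2, 'Bob')]
print(only_john)  # prints: [(1, 'John')]
```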

Add or delete repeated row

I have an output like this:
id name date school school1
1 john 11/11/2001 nyu ucla
1 john 11/11/2001 ucla nyu
2 paul 11/11/2011 uft mit
2 paul 11/11/2011 mit uft
I would like to achieve this:
id name date school school1
1 john 11/11/2001 nyu ucla
2 paul 11/11/2011 mit uft
I am using direct join as in:
select distinct
a.id, a.name,
b.date,
c.school,
a1.id, a1.name,
b1.date,
c1.school
from table a, table b, table c,table a1, table b1, table c1
where
a.id=b.id
and...
Any ideas?
We will need more information such as what your tables contain and what you are after.
One thing I noticed is that you have a school column and then school1. Normalization (first normal form, which rules out repeating groups) says you should never duplicate fields and append numbers to them to hold more values, even if you think there will only be 1 or 2 additional items. You need to create a second table that associates a user with one to many schools.
I agree with everyone else that both your source table and your desired output are poor design. While you probably can't do anything about your source table, I recommend the following code and output:
Select id, name, date, school from MyTable
union
Select id, name, date, school1 from MyTable;
(repeat as necessary)
This will give you results in the format:
id name date school
1 john 11/11/2001 nyu
1 john 11/11/2001 ucla
2 paul 11/11/2011 mit
2 paul 11/11/2011 uft
(Note: UNION removes duplicate rows by default, so no DISTINCT keyword is needed; UNION ALL would keep them.)
With this format, you could easily count the number of schools per student, number of students per school, etc.
If processing time and/or storage space is a factor here, you could then split this into 2 tables, 1 with the id,name & date, the other with the id & school (basically what JonH just said). But if you're just working up some simple statistics, this should suffice.
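A quick check of the UNION idea against the rows from the question, sketched with Python's sqlite3 module (the table name MyTable and the TEXT date column are assumptions carried over from the answer above):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MyTable (id INTEGER, name TEXT, date TEXT, school TEXT, school1 TEXT);
INSERT INTO MyTable VALUES
    (1, 'john', '11/11/2001', 'nyu',  'ucla'),
    (1, 'john', '11/11/2001', 'ucla', 'nyu'),
    (2, 'paul', '11/11/2011', 'uft',  'mit'),
    (2, 'paul', '11/11/2011', 'mit',  'uft');
""")

# UNION stacks both school columns into one and drops duplicate rows.
deduplicated = conn.execute("""
    SELECT id, name, date, school  FROM MyTable
    UNION
    SELECT id, name, date, school1 FROM MyTable
    ORDER BY id, school
""").fetchall()

# UNION ALL keeps every row, for comparison.
with_duplicates = conn.execute("""
    SELECT id, name, date, school  FROM MyTable
    UNION ALL
    SELECT id, name, date, school1 FROM MyTable
""").fetchall()

for row in deduplicated:
    print(row)
print(len(with_duplicates))  # prints: 8
```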
This problem was just too irresistible, so I took a guess at the data structures we are dealing with. The technology wasn't specified in the question. This is in Transact-SQL.
create table student
(
id int not null primary key identity,
name nvarchar(100) not null default '',
graduation_date date not null default getdate(),
)
go
create table school
(
id int not null primary key identity,
name nvarchar(100) not null default ''
)
go
create table student_school_asc
(
student_id int not null foreign key references student (id),
school_id int not null foreign key references school (id),
primary key (student_id, school_id)
)
go
insert into student (name, graduation_date) values ('john', '2001-11-11')
insert into student (name, graduation_date) values ('paul', '2011-11-11')
insert into school (name) values ('nyu')
insert into school (name) values ('ucla')
insert into school (name) values ('uft')
insert into school (name) values ('mit')
insert into student_school_asc (student_id, school_id) values (1,1)
insert into student_school_asc (student_id, school_id) values (1,2)
insert into student_school_asc (student_id, school_id) values (2,3)
insert into student_school_asc (student_id, school_id) values (2,4)
select
s.id,
s.name,
s.graduation_date as [date],
(select max(name) from
(select name,
RANK() over (order by name) as rank_num
from school sc
inner join student_school_asc ssa on ssa.school_id = sc.id
where ssa.student_id = s.id) s1 where s1.rank_num = 1) as school,
(select max(name) from
(select name,
RANK() over (order by name) as rank_num
from school sc
inner join student_school_asc ssa on ssa.school_id = sc.id
where ssa.student_id = s.id) s2 where s2.rank_num = 2) as school1
from
student s
Result:
id name date school school1
--- ----- ---------- ------- --------
1 john 2001-11-11 nyu ucla
2 paul 2011-11-11 mit uft
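If every student has exactly two schools, as in this sample, a simpler pivot gives the same result without window functions: MIN and MAX of the school name per student. This is a swapped-in technique, not the RANK() query above, and it does not extend to three or more schools. A sketch using Python's sqlite3 module with the same table names and rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (id INTEGER PRIMARY KEY AUTOINCREMENT,
                      name TEXT NOT NULL, graduation_date TEXT NOT NULL);
CREATE TABLE school  (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT NOT NULL);
CREATE TABLE student_school_asc (
    student_id INTEGER NOT NULL REFERENCES student (id),
    school_id  INTEGER NOT NULL REFERENCES school (id),
    PRIMARY KEY (student_id, school_id)
);
INSERT INTO student (name, graduation_date) VALUES
    ('john', '2001-11-11'), ('paul', '2011-11-11');
INSERT INTO school (name) VALUES ('nyu'), ('ucla'), ('uft'), ('mit');
INSERT INTO student_school_asc VALUES (1, 1), (1, 2), (2, 3), (2, 4);
""")

# With exactly two schools per student, the alphabetically first name
# lands in "school" and the last in "school1".
pivoted = conn.execute("""
    SELECT s.id, s.name, s.graduation_date AS date,
           MIN(sc.name) AS school, MAX(sc.name) AS school1
    FROM student s
    JOIN student_school_asc ssa ON ssa.student_id = s.id
    JOIN school sc ON sc.id = ssa.school_id
    GROUP BY s.id, s.name, s.graduation_date
    ORDER BY s.id
""").fetchall()

for row in pivoted:
    print(row)
```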