SQL - Selecting data from two tables and removing duplicates

So I have two tables and I'm trying to display some data from both and remove the duplicates. Sorry, I'm new to SQL and databases. Here's my code:
Table 1
CREATE TABLE customer
(
customer_id VARCHAR2(5),
customer_name VARCHAR2(50) NOT NULL,
customer_address VARCHAR2(150) NOT NULL,
customer_phone VARCHAR2(11) NOT NULL,
PRIMARY KEY (customer_id)
);
Table 2
CREATE TABLE shop
(
shop_id VARCHAR2(7),
shop_address VARCHAR2(150) NOT NULL,
customer_id VARCHAR2(7),
PRIMARY KEY (shop_id),
FOREIGN KEY (customer_id) REFERENCES customer (customer_id)
);
I want to display everything from the SHOP table, and customer_id, customer_name from the CUSTOMER TABLE.
I've tried this so far, but it's displaying everything from both tables and I get two duplicate customer_id columns:
SELECT *
FROM shop
JOIN customer ON shop.customer_id = customer.customer_id
ORDER BY customer_name;
Anyone able to help?
Thanks

Because both tables have a customer_id column, you can select everything from the shop table and only the customer_name column from the customer table:
SELECT s.*, c.customer_name
FROM shop s
JOIN customer c ON s.customer_id = c.customer_id
ORDER BY c.customer_name;
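A minimal sketch of what that query returns, run from Python against an in-memory SQLite database rather than Oracle (so VARCHAR2 becomes TEXT). The sample rows are made up for illustration; the point is that customer_id appears only once in the result columns.

```python
import sqlite3

def shop_with_customer_name():
    """Return (column names, rows) for the s.*, c.customer_name query."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customer (
            customer_id      TEXT PRIMARY KEY,
            customer_name    TEXT NOT NULL,
            customer_address TEXT NOT NULL,
            customer_phone   TEXT NOT NULL
        );
        CREATE TABLE shop (
            shop_id      TEXT PRIMARY KEY,
            shop_address TEXT NOT NULL,
            customer_id  TEXT REFERENCES customer (customer_id)
        );
        -- Hypothetical sample data, not from the original question
        INSERT INTO customer VALUES ('C1', 'Bob',   '1 High St', '111');
        INSERT INTO customer VALUES ('C2', 'Alice', '2 Low St',  '222');
        INSERT INTO shop VALUES ('S1', '10 Market Rd', 'C1');
        INSERT INTO shop VALUES ('S2', '20 Market Rd', 'C2');
    """)
    cur = conn.execute("""
        SELECT s.*, c.customer_name
        FROM shop s
        JOIN customer c ON s.customer_id = c.customer_id
        ORDER BY c.customer_name
    """)
    # cursor.description holds one entry per result column; item 0 is its name
    columns = [d[0] for d in cur.description]
    rows = cur.fetchall()
    conn.close()
    return columns, rows

columns, rows = shop_with_customer_name()
print(columns)
for row in rows:
    print(row)
```

The customer_id that remains is the one from shop, which is equal to the customer's by the join condition anyway.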

select distinct c.customer_id, c.customer_name, s.*
from customer c
inner join shop s on c.customer_id = s.customer_id
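To remove duplicate rows, you need to use the DISTINCT keyword. Note that DISTINCT does not remove duplicate columns: because of s.*, this query still returns customer_id twice.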
https://www.w3schools.com/sql/sql_distinct.asp

You need to manually list the columns you want. Using * will pull in every column from every table. Standard SQL does not have any way of saying "select all columns except these...".
I hope you're only using * casually - it's a very bad idea to use SELECT * inside program code that then expects certain columns to exist in a particular order or with a certain name.
To save typing, you could use * for one of the tables and manually name the rest:
SELECT
customer.*,
shop.shop_id,
shop.shop_address
FROM
...

Related

What is the most efficient way of joining tables of different dimensions?

I have the following schema:
CREATE TABLE products (
id BIGSERIAL NOT NULL,
created_at_timestamp TIMESTAMP NOT NULL DEFAULT NOW(),
last_update_timestamp TIMESTAMP NOT NULL DEFAULT NOW(),
PRIMARY KEY (id)
);
CREATE TABLE product_names (
product_id BIGINT NOT NULL,
language TEXT NOT NULL,
name TEXT NOT NULL,
PRIMARY KEY (product_id, language),
FOREIGN KEY (product_id) REFERENCES products (id)
);
CREATE TABLE product_summaries (
product_id BIGINT NOT NULL,
language TEXT NOT NULL,
summary TEXT NOT NULL,
PRIMARY KEY (product_id, language),
FOREIGN KEY (product_id) REFERENCES products (id)
);
And I want to select all Products.
However as you can see a Product contains a list of names and summaries (per language).
I can retrieve all Products
SELECT * FROM products
And then iterate all the rows (in this case in Kotlin), and then request the names and summaries:
SELECT * FROM product_names WHERE product_id = $id
And
SELECT * FROM product_summaries WHERE product_id = $id
However, this seems inefficient, since I am making 3 separate queries to the database.
I thought of using JOINs to get all of this with one query, but then I get multiple repeated rows for each product_names and product_summaries entry.
So in the end, is there a better way of requesting all this data in one query?
You definitely don't want to do multiple queries and then iterate over them in the code. That's horribly inefficient. When you do the second JOIN, you need to include language in the JOIN. That should keep you from getting duplicate rows. This should give you one row for each unique combination of [products.id, product_names.language]
SELECT
products.id
,products.created_at_timestamp
,products.last_update_timestamp
,product_names.name
,product_summaries.summary
,product_names.language
FROM
products
INNER JOIN
product_names ON product_names.product_id = products.id
INNER JOIN
product_summaries ON product_summaries.product_id = products.id
AND product_summaries.language = product_names.language
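To see the effect of matching on language, here is a small sketch using Python's sqlite3 module instead of Postgres. The timestamp columns are left out and the sample rows are assumptions modelled on the data shown elsewhere in this thread (two names, three summaries for product 1).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Timestamp columns from the original schema are omitted for brevity.
    CREATE TABLE products (id INTEGER PRIMARY KEY);
    CREATE TABLE product_names (
        product_id INTEGER NOT NULL REFERENCES products (id),
        language   TEXT NOT NULL,
        name       TEXT NOT NULL,
        PRIMARY KEY (product_id, language)
    );
    CREATE TABLE product_summaries (
        product_id INTEGER NOT NULL REFERENCES products (id),
        language   TEXT NOT NULL,
        summary    TEXT NOT NULL,
        PRIMARY KEY (product_id, language)
    );
    INSERT INTO products VALUES (1);
    INSERT INTO product_names VALUES (1, 'EN', 'lol'), (1, 'DE', 'lel');
    INSERT INTO product_summaries VALUES
        (1, 'EN', 'deded'), (1, 'DE', 'rererere'), (1, 'FR', 'jejejeje');
""")

# Joining on product_id only pairs every name with every summary.
without_language = conn.execute("""
    SELECT p.id, n.language, n.name, s.summary
    FROM products p
    JOIN product_names n ON n.product_id = p.id
    JOIN product_summaries s ON s.product_id = p.id
""").fetchall()

# Adding the language condition leaves one row per (product, language).
with_language = conn.execute("""
    SELECT p.id, n.language, n.name, s.summary
    FROM products p
    JOIN product_names n ON n.product_id = p.id
    JOIN product_summaries s ON s.product_id = p.id
                            AND s.language = n.language
    ORDER BY n.language
""").fetchall()

print(len(without_language))  # 2 names x 3 summaries
print(with_language)
```

One thing to be aware of: with inner joins, a language that has a summary but no name (FR in this sample) drops out of the result. If those rows matter, an outer join would be needed.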
I've found a way of doing it:
SELECT * FROM products as p INNER JOIN
(SELECT json_agg(product_names) as names, product_id FROM product_names GROUP BY product_id) as tb_names ON tb_names.product_id = p.id
INNER JOIN
(SELECT json_agg(product_summaries) as summaries, product_id FROM product_summaries GROUP BY product_id) as tb_summaries ON tb_summaries.product_id = p.id
returns:
1 | 2018-07-20 09:36:21.56904 | 2018-07-20 09:36:21.56904 | [{"product_id":1,"language":"EN","name":"lol"},
{"product_id":1,"language":"DE","name":"lel"}] | 1 | [{"product_id":1,"language":"EN","summary":"deded"},
{"product_id":1,"language":"DE","summary":"rererere"},
{"product_id":1,"language":"FR","summary":"jejejeje"}] | 1
Basically I'm converting the multi-dimensional tables to JSON :)
Postgres is amazing!

SQL DB2 Find and Display Common

I am working on a query in which I need to find all pairs of distinct customers who bought at least one title in common and display them, with the customer with the higher id as the first customer A and customer B being the one with the lower id. The schema looks like
create table customer (
id smallint not null,
name varchar(20),
primary key (id));
create table purchase (
id smallint not null,
title varchar(25) not null,
primary key (id,title));
Here is the query I wrote, but it's not outputting the desired result:
Select
distinct A.name as customera,B.name as customerb
from customer A,customer B, purchase C
where A.id=C.id and B.id=C.id
But this yields the wrong result. I am a beginner in SQL and this is the database I have to work with.
My output should look like this, which it does, but it displays both customers as the same, which is wrong.
CUSTOMERA CUSTOMERB
-------------------- --------------------
Some customer with a higher id other customer
Any help on this, or on how I can fix it, would be appreciated.
First, never use commas in the from clause. Always use proper, explicit, standard join syntax.
Assuming that the id in purchase matches the id in customer, then you can just do:
select distinct p1.id, p2.id
from purchase p1 join
purchase p2
on p2.title = p1.title and p1.id > p2.id;
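The query above returns ids. As a sketch of how it could be extended to show names, as the question asks, the version below joins customer twice. It runs on SQLite through Python rather than DB2, and the sample customers and titles are invented for illustration.

```python
import sqlite3

def common_title_pairs():
    """Return (customera, customerb) name pairs sharing at least one title."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE purchase (
            id    INTEGER NOT NULL,
            title TEXT NOT NULL,
            PRIMARY KEY (id, title)
        );
        -- Hypothetical sample data
        INSERT INTO customer VALUES (1, 'Ann'), (2, 'Ben'), (3, 'Cy');
        INSERT INTO purchase VALUES
            (1, 'Dune'), (2, 'Dune'), (2, 'Emma'), (3, 'Emma'), (3, 'Ulysses');
    """)
    rows = conn.execute("""
        SELECT DISTINCT a.name AS customera, b.name AS customerb
        FROM purchase p1
        JOIN purchase p2 ON p2.title = p1.title AND p1.id > p2.id
        JOIN customer a ON a.id = p1.id   -- higher id
        JOIN customer b ON b.id = p2.id   -- lower id
        ORDER BY customera, customerb
    """).fetchall()
    conn.close()
    return rows

pairs = common_title_pairs()
print(pairs)
```

The condition p1.id > p2.id is what stops a customer from being paired with themselves and keeps each pair from appearing in both orders. Because DISTINCT is applied to names here, two different customers who share a name would collapse into one row; selecting the ids as well avoids that.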

Oracle APEX Join and Count

I have two tables created with SQL code:
CREATE TABLE
TicketSales(
purchase# Number(10),
client# Integer CONSTRAINT fk1 REFERENCES Customers,
PRIMARY KEY(purchase#));
CREATE TABLE Customers(
client# Integer,
name Char(30),
Primary Key(client#));
Basically, the TicketSales table holds ticket sales data, and client# is a foreign key referencing the Customers table. I would like to count the names that are in the TicketSales table. I tried the code below with no success:
select Count(name)
From Customers
Where Customers.Client#=TicketSales.Client#
Group by Name;
Any help appreciated.
Thanks,
If you want a count by each name, then include name in the select and group by clauses
select c.Name, Count(*)
From Customers c
INNER JOIN TicketSales t ON c.Client# =t.Client#
Group by c.Name;
If you want just the count of names, not tickets, then use
select Count(*)
From Customers c
;
Or, for a count of individuals who have tickets recorded against them:
select Count(DISTINCT t.Client#)
From TicketSales t
;
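To make the difference between the three counts concrete, here is a small sketch on SQLite via Python. The columns are renamed to client_no and purchase_no as an assumption, since the # in the Oracle names would need quoting in SQLite, and the rows are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (
        client_no INTEGER PRIMARY KEY,
        name      TEXT
    );
    CREATE TABLE TicketSales (
        purchase_no INTEGER PRIMARY KEY,
        client_no   INTEGER REFERENCES Customers (client_no)
    );
    -- Hypothetical sample data: Ann bought twice, Ben once, Cy never
    INSERT INTO Customers VALUES (1, 'Ann'), (2, 'Ben'), (3, 'Cy');
    INSERT INTO TicketSales VALUES (100, 1), (101, 1), (102, 2);
""")

# Number of ticket purchases per customer name
tickets_per_name = conn.execute("""
    SELECT c.name, COUNT(*)
    FROM Customers c
    INNER JOIN TicketSales t ON c.client_no = t.client_no
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()

# Number of customers, whether or not they bought anything
total_customers = conn.execute(
    "SELECT COUNT(*) FROM Customers"
).fetchone()[0]

# Number of distinct customers with at least one purchase
customers_with_tickets = conn.execute(
    "SELECT COUNT(DISTINCT client_no) FROM TicketSales"
).fetchone()[0]

print(tickets_per_name, total_customers, customers_with_tickets)
```

With this data the per-name query lists Ann and Ben only, because the inner join leaves out customers without purchases.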

SQL multiple natural inner joins

Why does this correctly return the Order ID of an order, the Customer ID of the person who made the order, and the Last Name of the employee in charge of the transaction
SELECT "OrderID", "CustomerID", "LastName"
FROM orders O
NATURAL INNER JOIN customers JOIN employees ON O."EmployeeID" = employees."EmployeeID";
while
SELECT "OrderID", "CustomerID", "LastName"
FROM orders O
NATURAL INNER JOIN customers NATURAL INNER JOIN employees;
returns 0 rows?
I am sure that they have common columns.
Table orders
OrderId
EmployeeID
CustomerID
...
Table employees
EmployeeID
...
Table customers
CustomerID
...
Without seeing your full, unedited schema it's hard to be sure, but I'd say there are more common columns than you intended.
E.g. as @ClockworkMuse suggested:
CREATE TABLE orders (
OrderId integer primary key,
EmployeeID integer not null,
CustomerID integer not null,
created_at timestamp not null default current_timestamp,
...
);
CREATE TABLE employees (
EmployeeID integer primary key,
created_at timestamp not null default current_timestamp,
...
);
then orders NATURAL JOIN employees will be equivalent to orders INNER JOIN employees USING (EmployeeID, created_at). Which surely isn't what you intended.
You should use INNER JOIN ... USING (colname) or INNER JOIN ... ON (condition).
NATURAL JOIN is a poorly thought out feature that should really be avoided except on quick and dirty ad-hoc queries, if even then. Even if it works now, if you later add an unrelated column to a table it might change the meaning of existing queries. That's ... well, avoid natural joins.
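The created_at scenario is easy to reproduce. The sketch below uses SQLite through Python, with a cut-down version of the hypothetical schema above (the LastName column and the sample rows are assumptions), and compares NATURAL JOIN with an explicit USING.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (
        OrderID    INTEGER PRIMARY KEY,
        EmployeeID INTEGER NOT NULL,
        created_at TEXT NOT NULL
    );
    CREATE TABLE employees (
        EmployeeID INTEGER PRIMARY KEY,
        LastName   TEXT NOT NULL,
        created_at TEXT NOT NULL
    );
    -- The order and the employee were created at different times
    INSERT INTO employees VALUES (7, 'Smith', '2017-05-01 09:00:00');
    INSERT INTO orders VALUES (1, 7, '2018-01-02 14:30:00');
""")

# NATURAL JOIN matches on every shared column name:
# here that is EmployeeID AND created_at.
natural_rows = conn.execute("""
    SELECT OrderID, LastName
    FROM orders NATURAL JOIN employees
""").fetchall()

# USING names the join column explicitly, so created_at is ignored.
using_rows = conn.execute("""
    SELECT OrderID, LastName
    FROM orders JOIN employees USING (EmployeeID)
""").fetchall()

print(natural_rows)
print(using_rows)
```

The natural join comes back empty because the two created_at values never match, while the USING version finds the order.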

Get results that have the same data in the table

I need to get the names of all customers whose preferences have the same MINPRICE and MAXPRICE.
Here's my schema:
CREATE TABLE CUSTOMER (
PHONE VARCHAR(25) NOT NULL,
NAME VARCHAR(25),
CONSTRAINT CUSTOMER_PKEY PRIMARY KEY (PHONE)
);
CREATE TABLE PREFERENCE (
PHONE VARCHAR(25) NOT NULL,
ITEM VARCHAR(25) NOT NULL,
MAXPRICE NUMBER(8,2),
MINPRICE NUMBER(8,2),
CONSTRAINT PREFERENCE_PKEY PRIMARY KEY (PHONE, ITEM),
CONSTRAINT PREFERENCE_FKEY FOREIGN KEY (PHONE) REFERENCES CUSTOMER (PHONE)
);
I think I need to compare rows against each other, or create another view to compare? Is there an easy way to do this?
It's one-to-many: a customer can have multiple preferences, so I need to query a list of customers that have the same minprice and maxprice, comparing between rows with minprice = minprice and maxprice = maxprice.
A self-join on preference would find rows with the same price preference, but a different phone number:
select distinct c1.name
, p1.minprice
, p1.maxprice
from preference p1
join preference p2
on p1.phone <> p2.phone
and p1.minprice = p2.minprice
and p1.maxprice = p2.maxprice
join customer c1
on c1.phone = p1.phone
join customer c2
on c2.phone = p2.phone
order by
p1.minprice
, p1.maxprice
, c1.name
It seems strange that you have minprice and maxprice in your preference table. Is that a table that you update after each transaction, such that each customer only has 1 active preference record? I mean, it reads like a customer could pay two different prices for the same item, which seems odd.
Assuming customer and preference are 1:1
SELECT c.*
FROM customer c INNER JOIN preference p ON c.phone = p.phone
WHERE p.minprice = p.maxprice
However, if a customer can have multiple preferences and you are looking for minprice = maxprice for ALL items ... then you could do this
SELECT c.*
FROM (SELECT phone, MIN(minprice) as allMin, MAX(maxprice) as allMax
FROM preference
GROUP BY phone) p INNER JOIN customer c on p.phone = c.phone
WHERE allMin = allMax
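This will show all the customer names that have the same price preferences. GROUP_CONCAT is MySQL syntax; on Oracle, which the NUMBER columns suggest, LISTAGG plays the same role.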
SELECT minprice, maxprice, GROUP_CONCAT(name) names
FROM preference
JOIN customer USING (phone)
GROUP BY minprice, maxprice
HAVING COUNT(*) > 1
The HAVING clause prevents it showing preferences that have no duplicates. If you want to see those single-customer preferences, remove that line.
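As a rough check of the grouping approach, the sketch below runs the same query on SQLite from Python (SQLite also provides group_concat). The customers and preferences are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (
        phone TEXT PRIMARY KEY,
        name  TEXT
    );
    CREATE TABLE preference (
        phone    TEXT NOT NULL REFERENCES customer (phone),
        item     TEXT NOT NULL,
        maxprice NUMERIC,
        minprice NUMERIC,
        PRIMARY KEY (phone, item)
    );
    -- Hypothetical sample data: Ann and Ben share the 50-100 range
    INSERT INTO customer VALUES ('111', 'Ann'), ('222', 'Ben'), ('333', 'Cy');
    INSERT INTO preference (phone, item, minprice, maxprice) VALUES
        ('111', 'tv',    50, 100),
        ('222', 'radio', 50, 100),
        ('333', 'tv',    10,  20);
""")

shared = conn.execute("""
    SELECT minprice, maxprice, GROUP_CONCAT(name) AS names
    FROM preference
    JOIN customer USING (phone)
    GROUP BY minprice, maxprice
    HAVING COUNT(*) > 1
""").fetchall()

for minprice, maxprice, names in shared:
    # The order of names inside the concatenated string is not guaranteed
    print(minprice, maxprice, sorted(names.split(",")))
```

Note that COUNT(*) counts preference rows, so a single customer with two items in the same price range would also pass the HAVING filter. Using COUNT(DISTINCT phone) instead would restrict the result to ranges shared by different customers.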