Ignore duplicate data id in SQL Query? - sql

How do I ignore duplicate ids in SQL query results?
In this case I am trying to combine several tables. This is the schema I made:
Transactions
----------------------------------------------------------------------------------------
id
user_id
type
amount
invoice_transaction (Relation to invoice)
created_at
updated_at
Users
----------------------------------------------------------------------------------------
id
name
email
phone
birth
address
picture
created_at
updated_at
Vouchers
----------------------------------------------------------------------------------------
id
code
amount
type
created_at
updated_at
Vouchers Transactions
----------------------------------------------------------------------------------------
id
user_id
voucher_id
created_at
updated_at
invoice
----------------------------------------------------------------------------------------
id
order_data
payment_id
last_total
status
created_at
updated_at
Payment
----------------------------------------------------------------------------------------
id
name
tax
created_at
updated_at
This is a query I made.
SELECT t.id, t.user_id, u1.name, u1.email, v.code, t.amount, t.type, t.created_at, t.invoice_transaction, i.status, p.name
FROM transactions AS t
INNER JOIN users AS u1 on u1.id = t.user_id
LEFT JOIN vouchers_transaction AS vt on vt.user_id = u1.id
LEFT JOIN vouchers AS v on v.id = vt.voucher_id
LEFT JOIN invoice AS i on i.order_data = t.invoice_transaction
LEFT JOIN payment AS p on p.id = i.payment_id
WHERE t.type = 'buy'
ORDER BY id ASC
In this case I managed to get the data I wanted. But the results of the query contained duplicate transaction id data such as:
Result
---------------------------------------------------------------------------------------------------------------------------
id  user_id  name      email          code   amount  type  invoice_transaction  status   payment_name
1   1        John Doe  John@mail.com  ycqs1  150     buy   SCS11DAS             success  bank
1   1        John Doe  John@mail.com  ycqs1  150     buy   SCS11DAS             success  bank
2   1        John Doe  John@mail.com  n1ksa  200     buy   SCS12DAS             success  bank
Update
It seems this happens because the voucher transactions table has no link to the transactions table.
Example:
Voucher Transaction
---------------------------------------------------------------------------------------------------------------------------
id user_id voucher_id
1 1 1
2 1 2
3 2 3
So each transaction is duplicated once for every voucher the user has in the voucher transactions table, whether or not the transaction itself used a voucher.
I know the best fix is to change the database schema. But can it still be done with the current schema or not?
Result
---------------------------------------------------------------------------------------------------------------------------
id  user_id  name      email          code   amount  type  invoice_transaction  status   payment_name
1   1        John Doe  John@mail.com  ycqs1  150     buy   SCS11DAS             success  bank
1   1        John Doe  John@mail.com  sa31a  150     buy   SCS11DAS             success  bank
2   1        John Doe  John@mail.com  n1ksa  200     buy   SCS12DAS             success  bank
How do I ignore the duplicated transaction id?

Try using GROUP BY:
SELECT t.id, t.user_id, u1.name, u1.email, v.code, t.amount, t.type, t.created_at, t.invoice_transaction, i.status, p.name
FROM transactions AS t
INNER JOIN users AS u1 on u1.id = t.user_id
LEFT JOIN vouchers_transaction AS vt on vt.user_id = u1.id
LEFT JOIN vouchers AS v on v.id = vt.voucher_id
LEFT JOIN invoice AS i on i.order_data = t.invoice_transaction
LEFT JOIN payment AS p on p.id = i.payment_id
WHERE t.type = 'buy'
GROUP BY t.id, t.user_id, u1.name, u1.email, v.code, t.amount, t.type, t.created_at, t.invoice_transaction, i.status, p.name
ORDER BY t.id ASC

You will always get duplicate IDs because the rows are different. The code values differ from row to row, and you haven't told SQL what to do with columns like that. The GROUP BY Mahesh showed will work if you change it to resolve the difference in the code and amount columns.
Which value do you want to see for ID 1? The lowest? The highest? The average? The sum?
Either you remove those columns from the query, or you provide an aggregate function that decides what to show.
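As a rough sketch of the aggregate approach, the snippet below rebuilds a cut-down version of the question's tables in an in-memory SQLite database (through Python's `sqlite3` module) and groups by transaction id, collapsing the voucher codes with `GROUP_CONCAT`. The sample rows and the choice of `GROUP_CONCAT` are assumptions for illustration only; other databases spell this aggregate differently (for example `STRING_AGG`).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Cut-down, hypothetical versions of the question's tables.
conn.executescript("""
CREATE TABLE transactions (id INTEGER PRIMARY KEY, user_id INTEGER, type TEXT, amount INTEGER);
CREATE TABLE vouchers (id INTEGER PRIMARY KEY, code TEXT);
CREATE TABLE vouchers_transaction (id INTEGER PRIMARY KEY, user_id INTEGER, voucher_id INTEGER);
INSERT INTO transactions VALUES (1, 1, 'buy', 150), (2, 1, 'buy', 200);
INSERT INTO vouchers VALUES (1, 'ycqs1'), (2, 'sa31a');
INSERT INTO vouchers_transaction VALUES (1, 1, 1), (2, 1, 2);
""")

# One row per transaction: the voucher codes are aggregated instead of
# producing one output row per (transaction, voucher) pair.
rows = conn.execute("""
SELECT t.id, t.amount,
       COUNT(v.id)          AS voucher_count,
       GROUP_CONCAT(v.code) AS codes
FROM transactions AS t
LEFT JOIN vouchers_transaction AS vt ON vt.user_id = t.user_id
LEFT JOIN vouchers AS v ON v.id = vt.voucher_id
WHERE t.type = 'buy'
GROUP BY t.id, t.amount
ORDER BY t.id
""").fetchall()

for row in rows:
    print(row)
```

Because the vouchers are linked only by user, every transaction of that user still lists all of the user's codes; the aggregate only removes the duplicated rows, it cannot tell which voucher belonged to which transaction.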

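SELECT DISTINCT column1, column2, ...
FROM table_name;
The SELECT DISTINCT statement is used to return only distinct (different) rows. (DISTINCT ON (...) is a separate, PostgreSQL-specific form that keeps the first row for each value of the listed expressions.)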

Related

How to sum up max values from another table with some filtering

I have 3 tables
User Table
id  Name
1   Mike
2   Sam
Score Table
id  UserId  CourseId  Score
1   1       1         5
2   1       1         10
3   1       2         5
Course Table
id  Name
1   Course 1
2   Course 2
What I'm trying to return is rows for each user to display user id and user name along with the sum of the maximum score per course for that user
In the example tables the output I'd like to see is
Result
User_Id  User_Name  Total_Score
1        Mike       15
2        Sam        0
The SQL I've tried so far is:
select TOP(3) u.Id as User_Id, u.UserName as User_Name, SUM(maxScores) as Total_Score
from Users as u,
(select MAX(s.Score) as maxScores
from Scores as s
inner join Courses as c
on s.CourseId = c.Id
group by s.UserId, c.Id
) x
group by u.Id, u.UserName
I want to use a HAVING clause to link the Users to Scores after the GROUP BY in the sub-query, but I get an exception saying:
The multi-part identifier "u.Id" could not be bound
It works if I hard-code a user id in the HAVING clause I want to add, but it needs to be dynamic and I'm stuck on how to do this.
What would be the correct way to structure the query?
You were close; you just needed to return s.UserId from the sub-query and correctly join the sub-query to your Users table (I've joined in the reverse order to yours because to me it's more logical to start with the base data and then join on more details as required). Take note of the scope of aliases, i.e. aliases inside your sub-query are not available in your outer query.
select u.Id as [User_Id], u.UserName as [User_Name]
, sum(maxScore) as Total_Score
from (
select s.UserId, max(s.Score) as maxScore
from Scores as s
inner join Courses as c on s.CourseId = c.Id
group by s.UserId, c.Id
) as x
inner join Users as u on u.Id = x.UserId
group by u.Id, u.UserName;
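One thing to watch: the query above uses an inner join, so a user with no scores (Sam in the sample data) would not appear at all, while the expected output lists Sam with 0. A minimal sketch of one way to keep such users is to start from Users, LEFT JOIN the per-course maxima, and wrap the sum in COALESCE. It runs against an in-memory SQLite database through Python's `sqlite3`; the `UserName` column name follows the answer's query and the Courses join is left out because it does not affect the totals.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Sample data taken from the question's tables.
conn.executescript("""
CREATE TABLE Users (Id INTEGER PRIMARY KEY, UserName TEXT);
CREATE TABLE Scores (Id INTEGER PRIMARY KEY, UserId INTEGER, CourseId INTEGER, Score INTEGER);
INSERT INTO Users VALUES (1, 'Mike'), (2, 'Sam');
INSERT INTO Scores VALUES (1, 1, 1, 5), (2, 1, 1, 10), (3, 1, 2, 5);
""")

totals = conn.execute("""
SELECT u.Id AS User_Id, u.UserName AS User_Name,
       COALESCE(SUM(x.maxScore), 0) AS Total_Score  -- NULL sum becomes 0
FROM Users AS u
LEFT JOIN (
    -- best score per user and course
    SELECT s.UserId, s.CourseId, MAX(s.Score) AS maxScore
    FROM Scores AS s
    GROUP BY s.UserId, s.CourseId
) AS x ON x.UserId = u.Id
GROUP BY u.Id, u.UserName
ORDER BY u.Id
""").fetchall()

print(totals)  # [(1, 'Mike', 15), (2, 'Sam', 0)]
```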

SELECT 100 last entries with maximum 3 entries per unique user id

I have the following query to get all artworks joined with their user info:
SELECT a.*, row_to_json(u.*) as users
FROM artworks a INNER JOIN users u USING(address)
WHERE (a.flag != 'ILLEGAL' OR a.flag IS NULL)
ORDER BY a.date DESC
LIMIT 100
How could I write the same query but include no more than 3 entries per user?
Each user has a unique id called "address".
I think DISTINCT ON only works for 1 row per user; maybe ROW_NUMBER?
Thank you in advance, I'm pretty new to DB queries.
You need an extra column that records, for each row, how many times that user has appeared so far. It will look something like this:
USER | N
user1 | 1
user1 | 2
user1 | 3
user2 | 1
user2 | 2
You can compute the extra column in a common table expression with the following code:
-- Define the CTE T, numbering each user's artworks from newest to oldest
WITH T AS (
SELECT
a.*,
row_to_json(u.*) AS users,
ROW_NUMBER() OVER (PARTITION BY u.address ORDER BY a.date DESC) AS N
FROM artworks a INNER JOIN users u USING (address)
WHERE (a.flag != 'ILLEGAL' OR a.flag IS NULL) )
-- Keep at most 3 rows per user, then take the 100 most recent
SELECT * FROM T
WHERE T.N <= 3
ORDER BY T.date DESC
LIMIT 100
Just an addition to your original query will do. Count the resulting records for each user and then filter by the counter value.
I am using users.address as the user id.
SELECT * from
(
SELECT a.*, row_to_json(u.*) as userinfo,
row_number() over (partition by u.address order by a.date desc) as ucount
FROM artworks a INNER JOIN users u ON a.address = u.address
WHERE a.flag != 'ILLEGAL' OR a.flag IS NULL
) t
WHERE ucount <= 3
ORDER BY date DESC
LIMIT 100;
A remark - you have users as a column alias and as a table name which may cause confusion. I have changed the alias to userinfo.
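To see the row-numbering idea in isolation, here is a small sketch that runs the same pattern on an in-memory SQLite database through Python's `sqlite3`. The table is reduced to two columns and the rows are made up for illustration; it also assumes the bundled SQLite is version 3.25 or newer, since that is when window functions were added.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE artworks (id INTEGER PRIMARY KEY, address TEXT, date INTEGER)")
# Hypothetical data: user 'a' has five artworks, user 'b' has two.
conn.executemany(
    "INSERT INTO artworks (address, date) VALUES (?, ?)",
    [("a", d) for d in range(1, 6)] + [("b", d) for d in (10, 11)],
)

rows = conn.execute("""
SELECT address, date
FROM (
    SELECT address, date,
           -- 1 = newest artwork of that user, 2 = second newest, ...
           ROW_NUMBER() OVER (PARTITION BY address ORDER BY date DESC) AS ucount
    FROM artworks
) AS t
WHERE ucount <= 3      -- at most three rows per user
ORDER BY date DESC
LIMIT 100
""").fetchall()

print(rows)  # [('b', 11), ('b', 10), ('a', 5), ('a', 4), ('a', 3)]
```

The per-user filter has to be applied before the LIMIT; limiting first and numbering afterwards could return fewer than 100 rows even when more qualifying rows exist.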

How to perform max on an inner join with 2 different counts on columns?

How to find the user with the most referrals that have at least three blue shoes using PostgreSQL?
table 1 - users
name (matches shoes.owner_name)
referred_by (foreign keyed to users.name)
table 2 - shoes
owner_name (matches persons.name)
shoe_name
shoe_color
What I have so far is separate queries returning parts of what I want above:
(SELECT count(*) as shoe_count
FROM shoes
GROUP BY owner_name
WHERE shoe_color = “blue”
AND shoe_count>3) most_shoes
INNER JOIN
(SELECT count(*) as referral_count
FROM users
GROUP BY referred_by
) most_referrals
ORDER BY referral_count DESC
LIMIT 1
Two subqueries seem like the way to go. They would look like:
SELECT s.owner_name, s.shoe_count, r.referral_count
FROM (SELECT owner_name, count(*) as shoe_count
FROM shoes
WHERE shoe_color = 'blue'
GROUP BY owner_name
HAVING count(*) >= 3
) s JOIN
(SELECT referred_by, count(*) as referral_count
FROM users
GROUP BY referred_by
) r
ON s.owner_name = r.referred_by
ORDER BY r.referral_count DESC
LIMIT 1 ;
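A quick way to check the two-subquery shape is to run it on a tiny data set. The sketch below does that with an in-memory SQLite database through Python's `sqlite3`; all names and rows are invented for the example, and it reads the question as "among users owning at least three blue shoes, who referred the most users".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: cat has the most referrals (3) but only one blue shoe;
# ann has 2 referrals and 3 blue shoes; bob has 1 referral and 3 blue shoes.
conn.executescript("""
CREATE TABLE users (name TEXT PRIMARY KEY, referred_by TEXT);
CREATE TABLE shoes (owner_name TEXT, shoe_name TEXT, shoe_color TEXT);
INSERT INTO users VALUES ('ann', NULL), ('bob', 'ann'), ('cat', 'ann'), ('dan', 'bob'),
                         ('eve', 'cat'), ('fay', 'cat'), ('gus', 'cat');
INSERT INTO shoes VALUES ('ann', 's1', 'blue'), ('ann', 's2', 'blue'), ('ann', 's3', 'blue'),
                         ('bob', 's4', 'blue'), ('bob', 's5', 'blue'), ('bob', 's6', 'blue'),
                         ('cat', 's7', 'blue'), ('cat', 's8', 'red');
""")

winner = conn.execute("""
SELECT s.owner_name, s.shoe_count, r.referral_count
FROM (SELECT owner_name, COUNT(*) AS shoe_count
      FROM shoes
      WHERE shoe_color = 'blue'
      GROUP BY owner_name
      HAVING COUNT(*) >= 3) AS s          -- owners with at least 3 blue shoes
JOIN (SELECT referred_by, COUNT(*) AS referral_count
      FROM users
      GROUP BY referred_by) AS r          -- referrals per referrer
  ON s.owner_name = r.referred_by
ORDER BY r.referral_count DESC
LIMIT 1
""").fetchone()

print(winner)  # ('ann', 3, 2)
```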

SQL query to generate the following output

Display user id, user name, total amount, and amount to be paid after discount, with the alias names User_ID, user_name, Total_amount, Paid_amount. Display the records in descending order by user id.
This is the code I've written, but it does not show the expected result:
SELECT
USERS.USER_ID AS User_ID,
NAME AS user_name,
(FARE * NO_SEATS) AS Total_amount
FROM
USERS
INNER JOIN
TICKETS ON USERS.USER_ID = TICKETS.USER_ID
INNER JOIN
PAYMENTS ON TICKETS.TICKET_ID = PAYMENTS.TICKET_ID
INNER JOIN
DISCOUNTS ON PAYMENTS.DISCOUNT_ID = DISCOUNTS.DISCOUNT_ID
GROUP BY
USERS.USER_ID, NAME, FARE, NO_SEATS
ORDER BY
USERS.USER_ID DESC;
This is the expected output:
USER_ID USER_NAME TOTAL_AMOUNT PAID_AMOUNT
----------------------------------------------
5 Krena 700 625
4 Johan 800 775
3 Ivan 3000 2900
1 John 4000 3950
1 John 4000 3950
1 John 2000 1900
This will work, try it
SELECT
USERS.USER_ID AS USER_ID,
USERS.NAME AS USER_NAME,
(TICKETS.FARE * TICKETS.NO_SEATS) AS TOTAL_AMOUNT,
(TICKETS.FARE * TICKETS.NO_SEATS - DISCOUNTS.DISCOUNT_AMOUNT) AS PAID_AMOUNT
FROM
USERS
INNER JOIN TICKETS
ON USERS.USER_ID=TICKETS.USER_ID
INNER JOIN PAYMENTS
ON TICKETS.TICKET_ID=PAYMENTS.TICKET_ID
INNER JOIN DISCOUNTS
ON PAYMENTS.DISCOUNT_ID=DISCOUNTS.DISCOUNT_ID
ORDER BY USERS.USER_ID DESC;
Try this code.
SELECT
USERS.USER_ID AS USER_ID,
USERS.NAME AS USER_NAME,
(TICKETS.FARE * TICKETS.NO_SEATS) AS TOTAL_AMOUNT,
(TICKETS.FARE * DISCOUNTS.DISCOUNT_AMOUNT) + TICKETS.FARE AS PAID_AMOUNT
FROM
USERS USERS
INNER JOIN TICKETS TICKETS
ON USERS.USER_ID=TICKETS.USER_ID
INNER JOIN PAYMENTS PAYMENTS
ON TICKETS.TICKET_ID=PAYMENTS.TICKET_ID
INNER JOIN DISCOUNTS DISCOUNTS
ON PAYMENTS.DISCOUNT_ID=DISCOUNTS.DISCOUNT_ID
ORDER BY USERS.USER_ID DESC;
I'm confused by your database structure: why are the amounts stored in a number datatype? You might want to change it to decimal with a precision of (11,2), because I assume the discount amount in your DISCOUNTS table is a percentage, right? Otherwise, paste a sample of your data here so I can update my code.
If I had enough reputation, I would have written this as a comment (sigh):
You haven't quite defined what your expected results are.
I assume the following:
Print the totals per user, both Total_amount (i.e. the official price) and Paid_amount (i.e. after discounts), so the result includes all of the user's tickets and their payments.
I assume that the discount is a positive number to be deducted from the fare and applies per seat.
Then this should do the trick:
SELECT
u.USER_ID AS USER_ID,
u.NAME AS USER_NAME,
sum(t.FARE * t.NO_SEATS) AS TOTAL_AMOUNT,
sum((t.FARE - d.DISCOUNT_AMOUNT) * t.NO_SEATS) AS PAID_AMOUNT
FROM USERS u
INNER JOIN TICKETS t
ON u.USER_ID=t.USER_ID
INNER JOIN PAYMENTS p
ON t.TICKET_ID=p.TICKET_ID
INNER JOIN DISCOUNTS d
ON p.DISCOUNT_ID=d.DISCOUNT_ID
GROUP BY u.USER_ID, u.NAME
ORDER BY u.USER_ID DESC;
This is the right answer for the question and gives the accurate output:
SELECT User_ID, (select name from users u where t.user_id=u.user_id) as
user_name, ticket_id as Ticket_id, (no_seats*fare) as Total_amount,
((no_seats*fare)-(select discount_amount from discounts d
where d.discount_id=(select discount_id from payments p where t.Ticket_id=p.Ticket_id))) as Paid_amount
FROM tickets t ORDER BY User_ID desc;
Write a query to display user id, user name, ticket id, total amount, and amount to be paid after discount, with the alias names User_ID, user_name, Ticket_Id, Total_amount, Paid_amount. Display the records in descending order by user id. Avoid duplicate records.
[Note: Total amount to be paid should be number of tickets * amount per ticket]
This code runs correctly.

In SQL how do I write a query to return 1 record from a 1 to many relationship?

Let's say I have a Person table and a Purchases table with a 1 to many relationship. I want to run a single query that returns this person and just their latest purchase. This seems easy but I just can't seem to get it.
select p.*, pp.*
from Person p
left outer join (
select PersonID, max(PurchaseDate) as MaxPurchaseDate
from Purchase
group by PersonID
) ppm on ppm.PersonID = p.PersonID
left outer join Purchase pp on ppm.PersonID = pp.PersonID
and ppm.MaxPurchaseDate = pp.PurchaseDate
where p.PersonID = 42
This query will also show the latest purchase for all users if you remove the WHERE clause.
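The "join each person to their maximum purchase date" pattern can be tried out on a small data set. The sketch below uses an in-memory SQLite database through Python's `sqlite3`; the `Item` column and the sample rows are assumptions made up for the example, and the WHERE clause is omitted so that every person is returned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data: person 42 has two purchases, person 43 has one.
conn.executescript("""
CREATE TABLE Person (PersonID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Purchase (PurchaseID INTEGER PRIMARY KEY, PersonID INTEGER,
                       PurchaseDate TEXT, Item TEXT);
INSERT INTO Person VALUES (42, 'Joe'), (43, 'Jane');
INSERT INTO Purchase VALUES (1, 42, '2010-11-07', 'Laptop'),
                            (2, 42, '2010-11-09', 'Printer'),
                            (3, 43, '2010-11-08', 'iPod');
""")

latest = conn.execute("""
SELECT p.PersonID, p.Name, pp.PurchaseDate, pp.Item
FROM Person AS p
LEFT JOIN (
    -- latest purchase date per person
    SELECT PersonID, MAX(PurchaseDate) AS MaxPurchaseDate
    FROM Purchase
    GROUP BY PersonID
) AS ppm ON ppm.PersonID = p.PersonID
LEFT JOIN Purchase AS pp
       ON pp.PersonID = ppm.PersonID
      AND pp.PurchaseDate = ppm.MaxPurchaseDate
ORDER BY p.PersonID
""").fetchall()

print(latest)  # [(42, 'Joe', '2010-11-09', 'Printer'), (43, 'Jane', '2010-11-08', 'iPod')]
```

If a person has two purchases with the same latest date, this pattern returns both rows; a tie-breaker such as the purchase id would be needed to force exactly one.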
Assuming you have something like a PurchaseDate column and want a particular person (SQL Server):
SELECT TOP 1 P.Name, P.PersonID, C.PurchaseDescription FROM Persons AS P
INNER JOIN Purchases AS C ON C.PersonID = P.PersonID
WHERE P.PersonID = @PersonID
ORDER BY C.PurchaseDate DESC
Different databases implement the "LIMIT or TOP" command in different ways. Here is a reference: http://troels.arvin.dk/db/rdbms/#select-limit and below are a few samples.
If using SQL Server
SELECT TOP 1
*
FROM Person p
INNER JOIN Purchases pc on pc.PersonID = P.PersonID
Order BY pc.PurchaseDate DESC
Should work on MySQL
SELECT
*
FROM Person p
INNER JOIN Purchases pc on pc.PersonID = P.PersonID
Order BY pc.PurchaseDate DESC
LIMIT 1
Strictly off the top of my head! If it's only one person, then:
SELECT TOP 1 *
FROM Person p
INNER JOIN Purchases pu
ON p.ID = pu.PersonId
WHERE p.ID = *thePersonYouWant*
ORDER BY pu.OrderDate DESC
Otherwise, for every person (CROSS APPLY is SQL Server syntax that lets the subquery refer to p):
SELECT *
FROM Person p
CROSS APPLY
(
SELECT TOP 1 pu.*
FROM Purchases pu
WHERE pu.PersonID = p.ID
ORDER BY pu.OrderDate DESC
) sq
I think! I haven't got access to a SQL box right now to test it on.
Without knowing your structure at all, or your dbms, you would order the results descending by the purchase date/time, and return only the first joined record.
Try TOP 1 With an order by desc on date. Ex:
CREATE TABLE #One
(
id int
)
CREATE TABLE #Many
(
id int,
[date] date,
value int
)
INSERT INTO #One (id)
SELECT 1 UNION ALL
SELECT 2 UNION ALL
SELECT 3
INSERT INTO #Many (id, [date], value)
SELECT 1, GETDATE(), 1 UNION ALL
SELECT 1, DATEADD(DD, 1 ,GETDATE()), 3 UNION ALL
SELECT 1, DATEADD(DD, -1 ,GETDATE()), 0
SELECT TOP 1 *
FROM #One O
JOIN #Many M ON O.id = M.id
ORDER BY [date] DESC
If you want to select the latest purchase date for each person, that would be:
SELECT PE.ID, PE.Name, MAX(PU.purchaseDate) FROM Persons AS PE JOIN PURCHASE AS PU ON PE.ID = PU.Person_ID GROUP BY PE.ID, PE.Name
If you also want the persons who have no purchases, you need to use LEFT JOIN.
I think you need one more table called Items for example.
The PERSONS table would uniquely define each person and all their attributes, while the ITEMS table would uniquely define each items and their attributes.
Assume the following:
Persons          | Purchases                    | Items
PerID  PerName   | PurID  PurDt   PerID  ItemID | ItemID  ItemDesc  ICost
101    Joe Smith | 201    101107  101    301    | 301     Laptop    500
                 | 202    101107  101    302    | 302     Desktop   699
102    Jane Doe  | 203    101108  102    303    | 303     iPod      199
103    Jason Tut | 204    101109  101    304    | 304     iPad      499
                 | 205    101109  101    305    | 305     Printer   99
One Person Parent may tie to none, one or many Purchase Child.
One Item Parent may tie to none, one or many Purchase Child.
One or more Purchases Children will tie to one Person Parent, and one Item Parent.
select per.PerName as Name
, pur.PurDt as Date
, itm.ItemDesc as Item
, itm.ICost as Cost
from Persons per
, Purchases pur
, Items itm
where pur.PerID = per.PerID -- For that Person
and pur.ItemID = itm.ItemID -- and that Item
and pur.PurDt = -- and the purchase date is
( Select max(lst.PurDt) -- the last date
from Purchases lst -- purchases
where lst.PerID = per.PerID ) -- for that person
This should return:
Name       Date    Item     Cost
Joe Smith  101109  iPad     499
Joe Smith  101109  Printer  99
Jane Doe   101108  iPod     199