HQL: DISTINCT Issue

Say I have a table of customers and the vendors they visited, with each row recording one visit by a customer to a vendor.
Row | Customer | Vendor
1 | 1 | 001
2 | 1 | 001
3 | 1 | 002
4 | 2 | 001
My question is: how can I write a query that shows each distinct customer/vendor visit? For the above table, I'd like to see this output:
Row | Customer | Vendor
1 | 1 | 001
2 | 1 | 002
3 | 2 | 001

You can simply use the DISTINCT clause, assuming that the Row column is only there for illustration and is not part of the actual table:
SELECT DISTINCT customer, vendor
FROM table

You can use group by:
select min(row) as row, Customer, Vendor
from table t
group by Customer, Vendor;
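
To compare the two answers above, here is a minimal sketch that loads the question's sample rows into an in-memory SQLite database (via Python's sqlite3 module) and runs both queries. The table name `visits` and the column name `row_id` are assumptions made for the demo; the original post does not name the table.

```python
import sqlite3

# In-memory database holding the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE visits (row_id INTEGER PRIMARY KEY, customer INTEGER, vendor TEXT)"
)
conn.executemany(
    "INSERT INTO visits (row_id, customer, vendor) VALUES (?, ?, ?)",
    [(1, 1, "001"), (2, 1, "001"), (3, 1, "002"), (4, 2, "001")],
)

# Answer 1: DISTINCT collapses duplicate (customer, vendor) pairs.
distinct_rows = conn.execute(
    "SELECT DISTINCT customer, vendor FROM visits ORDER BY customer, vendor"
).fetchall()

# Answer 2: GROUP BY does the same and can also carry an aggregate,
# here the first row id seen for each pair.
grouped_rows = conn.execute(
    """
    SELECT MIN(row_id) AS first_row, customer, vendor
    FROM visits
    GROUP BY customer, vendor
    ORDER BY customer, vendor
    """
).fetchall()

print(distinct_rows)
print(grouped_rows)
```

Both queries return three rows. The GROUP BY form is the one to reach for when you also need a per-pair value such as the first row id or a visit count.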

Related

SQL Select foo if all match condition, return foo

Long buildup, probably a simple answer.
I know this is going to require a subquery of some kind, but I am joining 3 tables and trying to get an output.
Table one, 'Status' (contains many rows per pk_tickNum):
id | pk_tickNum | Status | time
Table two, 'Order' (only one order):
id | pk_order_num | tickNum | taker
Table three, 'Transaction' (many transactions, many item_num, one location per item):
id | pk_transaction | tickNum | item_num | Location
I have a statement that says...
Select
ticket1.pk_tickNum,ticket1.status,ticket1.time,order.pk_order_num
From
Status ticket1 left join Status ticket2
ON
(ticket1.pk_tickNum = ticket2.pk_tickNum AND ticket1.ID < ticket2.ID)
Inner Join
order
ON
ticket1.pk_tickNum = order.tickNum
WHERE
(ticket2.ID IS NULL)
This gives me the most current status of the order, and it works perfectly.
However, we have bins, i.e. locations, and every order has multiple items.
As an item moves through the warehouse, every location is recorded. So for every order there are multiple items, and each item has a location, including the 'shipped' location that marks the end.
If I extend the above query with a left join to the third table, Transaction, I get as many rows as there are item_num values for a single ticket. I don't need that!
All I am looking for is a single output for the current status of a ticket if ALL items on a ticket are NOT in location='shipped'
Edit -
Content
Status
id | pk_tickNum | Status |
1 | 123456 | Green |
2 | 123457 | Blue |
3 | 123456 | Yellow |
4 | 123456 | Red |
5 | 123457 | Green |
Order
id | pk_order_num | tickNum |
1 | 987654 | 123456
2 | 987656 | 123457
Transaction
id | pk_transaction | tickNum | item_num | Location
1 | 5555555555 | 123456 | Some | Floor
2 | 5555555556 | 123456 | Thing | Floor
3 | 5555555557 | 123456 | Smart | Shipped
4 | 5555555558 | 123456 | or | Shipped
5 | 5555555559 | 123457 | Really | Shipped
6 | 5555555560 | 123457 | Noth | Shipped
7 | 5555555561 | 123457 | ing | Shipped
Output -
pk_order_num | pk_tickNum | Status |
987654 | 123456 | Red |
/*987656 | 123457 | Green |*/ This should not show!
Answer! Posted by #Used_By_Already, with sample code available at SQLFiddle.
Thank you!
I really do hope you don't have tables called "order" and "transaction"; if you do, make sure they are quoted with [] or "". For my sanity I added an "s" to the end of those names.
To achieve this result (available at SQLFiddle):
| pk_order_num | tickNum | Status |
|--------------|---------|--------|
| 987654 | 123456 | Red |
I have assumed that the "most recent" row in the status table is the one with the highest ID (this isn't a great way to do it, but those are the only columns available to work with). A "last updated" datetime value would be a better basis; perhaps that is the [time] column in that table, but no data was supplied for it.
SELECT
      o.pk_order_num
    , o.tickNum
    , s.Status
FROM [orders] o
INNER JOIN (
    select pk_tickNum, Status
         , row_number() over(partition by pk_tickNum
                             order by id desc) rn
    from status
) s ON o.ticknum = s.pk_tickNum and s.rn = 1
INNER JOIN (
    SELECT
        ticknum
    FROM [transactions]
    GROUP BY ticknum
    HAVING COUNT(*) <> SUM(CASE WHEN Location = 'shipped' THEN 1 ELSE 0 END)
) t ON s.pk_tickNum = t.ticknum
;
Also note that the final subquery using the having clause determines if all details in the transactions have been shipped or not. Only orders with unshipped transactions will be returned by that subquery.
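
The HAVING filter described above can be checked in isolation. This is a minimal sketch using Python's sqlite3 module and the question's sample Transaction rows; the table name `transactions` follows the answer's renaming. SQLite compares strings case-sensitively by default, so the literal is written as 'Shipped' to match the sample data (an adjustment for this demo, not part of the original answer).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions ("
    "id INTEGER PRIMARY KEY, tickNum INTEGER, item_num TEXT, Location TEXT)"
)
conn.executemany(
    "INSERT INTO transactions (id, tickNum, item_num, Location) VALUES (?, ?, ?, ?)",
    [
        (1, 123456, "Some", "Floor"),
        (2, 123456, "Thing", "Floor"),
        (3, 123456, "Smart", "Shipped"),
        (4, 123456, "or", "Shipped"),
        (5, 123457, "Really", "Shipped"),
        (6, 123457, "Noth", "Shipped"),
        (7, 123457, "ing", "Shipped"),
    ],
)

# A ticket is kept when its total row count differs from its count of
# shipped rows, i.e. when at least one item has not shipped yet.
unshipped_tickets = conn.execute(
    """
    SELECT tickNum
    FROM transactions
    GROUP BY tickNum
    HAVING COUNT(*) <> SUM(CASE WHEN Location = 'Shipped' THEN 1 ELSE 0 END)
    ORDER BY tickNum
    """
).fetchall()

print(unshipped_tickets)
```

Ticket 123456 still has two items on the floor, so it is the only ticket returned; ticket 123457 is fully shipped and drops out.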
Select
s.pk_tickNum, s.status, s.time, o.pk_order_num
From Status s
-- actually this join already multiplies rows: ticket 123456 has more than one record in Status table in your sample data
Inner Join order o ON s.pk_tickNum = o.tickNum
WHERE NOT EXISTS
(
-- why is it named `pk_tickNum` if this is not a PK?
SELECT 1 FROM Status ticket2
WHERE s.pk_tickNum = ticket2.pk_tickNum AND s.ID < ticket2.ID
)
AND NOT EXISTS
(
-- might catch "empty orders" if any
SELECT 1 FROM Transaction t
WHERE t.tickNum = s.pk_tickNum
and t.Location = 'shipped'
)
Note that the output from your sample data would be empty, because ticket 123456 has two items with location 'shipped', which violates the condition you described.
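
The difference between the two readings of the requirement can be shown side by side. This is a rough sketch in Python's sqlite3 using the sample data; the pluralised table name `transactions`, the alias `newer`, and the second "relaxed" query are assumptions added for illustration and are not part of either answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE status (id INTEGER PRIMARY KEY, pk_tickNum INTEGER, Status TEXT);
    CREATE TABLE transactions (
        id INTEGER PRIMARY KEY, tickNum INTEGER, item_num TEXT, Location TEXT);
    INSERT INTO status VALUES
        (1, 123456, 'Green'), (2, 123457, 'Blue'), (3, 123456, 'Yellow'),
        (4, 123456, 'Red'), (5, 123457, 'Green');
    INSERT INTO transactions VALUES
        (1, 123456, 'Some', 'Floor'), (2, 123456, 'Thing', 'Floor'),
        (3, 123456, 'Smart', 'Shipped'), (4, 123456, 'or', 'Shipped'),
        (5, 123457, 'Really', 'Shipped'), (6, 123457, 'Noth', 'Shipped'),
        (7, 123457, 'ing', 'Shipped');
    """
)

# Keep only the newest status row per ticket: no row with a higher id exists.
LATEST = (
    "NOT EXISTS (SELECT 1 FROM status newer "
    "WHERE newer.pk_tickNum = s.pk_tickNum AND newer.id > s.id)"
)

# Literal reading: no item on the ticket may be shipped.
literal_rows = conn.execute(
    f"""
    SELECT s.pk_tickNum, s.Status FROM status s
    WHERE {LATEST}
      AND NOT EXISTS (SELECT 1 FROM transactions t
                      WHERE t.tickNum = s.pk_tickNum AND t.Location = 'Shipped')
    ORDER BY s.pk_tickNum
    """
).fetchall()

# Relaxed reading: at least one item on the ticket has not shipped.
relaxed_rows = conn.execute(
    f"""
    SELECT s.pk_tickNum, s.Status FROM status s
    WHERE {LATEST}
      AND EXISTS (SELECT 1 FROM transactions t
                  WHERE t.tickNum = s.pk_tickNum AND t.Location <> 'Shipped')
    ORDER BY s.pk_tickNum
    """
).fetchall()

print(literal_rows)
print(relaxed_rows)
```

The literal reading returns nothing for the sample data, as noted above, while the relaxed reading returns ticket 123456 with status Red, which matches the output the question asks for.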

Rows that have same value in a column, sum all values in another column and display 1 row

Example Table user:
ID | USER_ID | SCORE |
1 | 555 | 50 |
2 | 555 | 10 |
3 | 555 | 20 |
4 | 123 | 5 |
5 | 123 | 5 |
6 | 999 | 30 |
The result set should be like
ID | USER_ID | SCORE | COUNT |
1 | 555 | 80 | 3 |
2 | 123 | 10 | 2 |
3 | 999 | 30 | 1 |
Is it possible to write a SQL query that returns the table above? So far I can only count the rows in which a given user_id appears, but I don't know how to sum the scores and show one row per user.
You've included a column called "ID" in both the source data and desired results, but I'm going to assume that these ID values are not related and simply represent the row or line number - otherwise the question doesn't make sense.
In which case, you can simply use:
SELECT
USER_ID,
SUM(SCORE) AS SCORE,
COUNT(USER_ID) AS COUNT
FROM
<Table>
GROUP BY
USER_ID
If you really want to generate the ID column as well, then how you do this depends on the database platform being used. For example, on Oracle you could use the ROWNUM pseudocolumn; on SQL Server you will need to use the ROW_NUMBER() function (which also works on Oracle).
SELECT USER_ID
,sum(SCORE)
,count(USER_ID)
FROM Table
GROUP BY
USER_ID
I think COUNT is the number of scores per user_id; if so, your SQL query should be:
SELECT
MIN(ID) AS ID, -- ID is not in GROUP BY, so it has to be aggregated
USER_ID,
SUM(SCORE) AS SCORE,
COUNT(SCORE) AS COUNT
FROM
TABLE
GROUP BY
USER_ID
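
As a quick check of the GROUP BY approach, here is a minimal sketch in Python's sqlite3 using the question's sample rows. The table name `scores` is an assumption (the post only calls it "user"), and the sequential ID is generated in Python with `enumerate` rather than with ROWNUM or ROW_NUMBER(), purely to keep the demo portable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (id INTEGER PRIMARY KEY, user_id INTEGER, score INTEGER)")
conn.executemany(
    "INSERT INTO scores (id, user_id, score) VALUES (?, ?, ?)",
    [(1, 555, 50), (2, 555, 10), (3, 555, 20), (4, 123, 5), (5, 123, 5), (6, 999, 30)],
)

# One row per user: total score and number of score rows.
# ORDER BY MIN(id) keeps users in order of first appearance.
totals = conn.execute(
    """
    SELECT user_id, SUM(score) AS score, COUNT(*) AS cnt
    FROM scores
    GROUP BY user_id
    ORDER BY MIN(id)
    """
).fetchall()

# Attach a 1-based line number to mimic the ID column in the desired output.
numbered = [(n, *row) for n, row in enumerate(totals, start=1)]

print(numbered)
```

The result matches the table requested in the question: user 555 with 80 over 3 rows, user 123 with 10 over 2 rows, and user 999 with 30 over 1 row.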

SQL query result from multiple tables without duplicates

I have a number of segment tables, each filtered from the full set of records and holding customer ID, Last Order Date, that order's Total ($), and Segment Name. Each filter is based on different criteria, but the same customer ID can belong to two different tables, i.e. two different segments. The same ID would then have different Last Order Date and Total values in each. The segment tables are named A, B, C, D.
I need to combine the records from all the segment tables so that there are no duplicate IDs in the result set, i.e. if an ID appears in more than one table (say ID 2 is in tables A and B), the result set must show that ID's columns from the first table, table A.
So I need all the records and their column values from the Segment A table; all the records from the Segment B table except those whose ID is in Segment A; and all the records from the Segment C table except those whose ID is in Segment A or B. I hope that makes sense.
I made it sound like a question from the 70-461 exam :D I've researched it quite thoroughly, but perhaps I don't know how to phrase the question. I wonder if anyone has an idea of how to build a query to get that result. Big thanks for any suggestions.
Thanks guys. I couldn't seem to post a screenshot. Let me try to type it via html. There are more segment tables but just typing two to give you an idea. Thanks guys!
Segment A
ID | Last Order Date | Total | Segment
--------------------------------------
1 | 01/01/2012 | $1 | A
2 | 01/01/2012 | $1 | A
3 | 01/01/2012 | $5 | A
6 | 01/01/2012 | $7 | A
8 | 01/01/2012 | $8 | A
Segment B
ID | Last Order Date | Total | Segment
--------------------------------------
4 | 01/01/2010 | $3 | B
2 | 01/01/2010 | $5 | B
1 | 01/01/2010 | $2 | B
3 | 01/01/2010 | $1 | B
5 | 01/01/2010 | $7 | B
Result Set
ID | Last Order Date | Total | Segment
--------------------------------------
1 | 01/01/2012 | $1 | A
2 | 01/01/2012 | $1 | A
3 | 01/01/2012 | $5 | A
4 | 01/01/2010 | $3 | B
5 | 01/01/2010 | $7 | B
Here's something to get you started:
SELECT ID, LastOrderDate, Total, Segment
FROM SegmentA
UNION ALL
SELECT ID, LastOrderDate, Total, Segment
FROM SegmentB
WHERE ID NOT IN (SELECT ID FROM SegmentA)
UNION ALL
SELECT ID, LastOrderDate, Total, Segment
FROM SegmentC
WHERE ID NOT IN (SELECT ID FROM SegmentA)
AND ID NOT IN (SELECT ID FROM SegmentB)
UNION ALL
SELECT ID, LastOrderDate, Total, Segment
FROM SegmentD
WHERE ID NOT IN (SELECT ID FROM SegmentA)
AND ID NOT IN (SELECT ID FROM SegmentB)
AND ID NOT IN (SELECT ID FROM SegmentC)
This is a very simplistic answer; more information is needed if you want to optimize it.
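
The priority rule in the answer above (take everything from A, then only the rows of B whose ID is not already in A) can be tried on the sample data. This sketch uses Python's sqlite3 with assumed table names `segment_a` and `segment_b`, ISO-formatted dates, and plain integer totals. It swaps NOT IN for NOT EXISTS, which behaves the same here but does not silently return nothing if the ID column ever contains a NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE segment_a (id INTEGER, last_order_date TEXT, total INTEGER, segment TEXT);
    CREATE TABLE segment_b (id INTEGER, last_order_date TEXT, total INTEGER, segment TEXT);
    INSERT INTO segment_a VALUES
        (1, '2012-01-01', 1, 'A'), (2, '2012-01-01', 1, 'A'), (3, '2012-01-01', 5, 'A'),
        (6, '2012-01-01', 7, 'A'), (8, '2012-01-01', 8, 'A');
    INSERT INTO segment_b VALUES
        (4, '2010-01-01', 3, 'B'), (2, '2010-01-01', 5, 'B'), (1, '2010-01-01', 2, 'B'),
        (3, '2010-01-01', 1, 'B'), (5, '2010-01-01', 7, 'B');
    """
)

# All of A, plus the rows of B whose id does not already appear in A.
combined = conn.execute(
    """
    SELECT id, last_order_date, total, segment
    FROM segment_a
    UNION ALL
    SELECT id, last_order_date, total, segment
    FROM segment_b AS b
    WHERE NOT EXISTS (SELECT 1 FROM segment_a AS a WHERE a.id = b.id)
    ORDER BY id
    """
).fetchall()

ids_and_segments = [(row[0], row[3]) for row in combined]
print(ids_and_segments)
```

IDs 1, 2 and 3 exist in both tables and come back only once, from segment A. Each further segment table would add one more UNION ALL branch with one NOT EXISTS test per higher-priority table.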

Selecting unique records from database

Running this query,
select * from table;
Returns the following
|branch | number |
-------------------
| 1 | 123 |
| 1 | 001 |
| 2 | 123 |
| 3 | 123 |
| 4 | 123 |
| 1 | 123 |
| 1 | 789 |
| 2 | 123 |
| 3 | 123 |
| 4 | 009 |
I want to find values that are unique to ONLY branch 1
| 1 | 001 |
| 1 | 789 |
Can this be done without the data being stored in separate tables? I've tried a few "select distinct" queries & don't seem to get the results I'm expecting.
SELECT branch, number
FROM table
WHERE branch = 1
GROUP BY branch, number
If you do not need any aggregates, you can use distinct instead of group by:
select distinct branch
, number
from YourTable
where branch = 1
I guess what I'm trying to say is that I want to find all numbers that are unique to ONLY branch 1. If they are found in any other branch, I don't want to see them.
I guess this is what you want.
SELECT distinct number
FROM MyTable
WHERE branch=1 and number not in
( SELECT distinct number
FROM MyTable
WHERE branch != 1 )
Try this:
SELECT branch, number
FROM table
GROUP BY branch, number
Here is a SQLFiddle for you to have a look at
If you want to limit it to only branch 1, then just add a where clause.
SELECT branch, number
FROM table
WHERE branch = 1
GROUP BY branch, number
To select all values that are unique in column number and have a branch value of 1 you can use the following code:
SELECT branch, number
FROM table1
WHERE number IN (
SELECT number
FROM table1
GROUP BY number
HAVING (COUNT(number ) = 1)
)
AND branch = 1
For a demo see http://sqlfiddle.com/#!2/97145/62
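
To see which of the approaches above returns numbers found in branch 1 and in no other branch, here is a small sketch using Python's sqlite3 and the question's sample rows. The table name `branch_numbers` is an assumption, numbers are stored as text to keep their leading zeros, and the second query (MIN/MAX in HAVING) is an alternative added for comparison rather than one of the posted answers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE branch_numbers (branch INTEGER, number TEXT)")
conn.executemany(
    "INSERT INTO branch_numbers (branch, number) VALUES (?, ?)",
    [
        (1, "123"), (1, "001"), (2, "123"), (3, "123"), (4, "123"),
        (1, "123"), (1, "789"), (2, "123"), (3, "123"), (4, "009"),
    ],
)

# Numbers in branch 1 that never occur in any other branch.
only_branch_1 = conn.execute(
    """
    SELECT DISTINCT number
    FROM branch_numbers
    WHERE branch = 1
      AND number NOT IN (SELECT number FROM branch_numbers WHERE branch <> 1)
    ORDER BY number
    """
).fetchall()

# Same result with one pass: every row for the number belongs to branch 1.
only_branch_1_grouped = conn.execute(
    """
    SELECT number
    FROM branch_numbers
    GROUP BY number
    HAVING MIN(branch) = 1 AND MAX(branch) = 1
    ORDER BY number
    """
).fetchall()

print(only_branch_1)
print(only_branch_1_grouped)
```

Both queries return 001 and 789. Unlike the COUNT(number) = 1 variant, they still work when a number appears several times within branch 1 alone.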

How to get a MAX and a COUNT from a three table join?

I got an interview question where there's a Car sale modeled in a DB. Each Car represents a physical car in a Car sale which refers to a Make and a Model table. A Sale table keeps track of each Car that is sold. A Sale only consists of one Car, so there's a record in Sale per every unique Car that had been sold.
The question was to find out the name of the most sold Model in the car sale. I answered with a 3-level nested query. The interviewer specifically asked for a solution using joins, but I only managed to join the tables without the aggregates.
How would you join 3 tables as below (Car, Make, Sale) while using two other aggregates?
Here's a rough sketch of the schema. The most sold Model here should return 'Corolla'
Car
| carid| modid | etc...
_________________
| 1 | 1 |
| 2 | 1 |
| 3 | 1 |
| 4 | 2 |
| 5 | 2 |
Make
| mkid | name |
_________________
| 1 | Toyota |
| 2 | Nissan |
| 3 | Chevy |
| 4 | Merc |
| 5 | Ford |
Model
| modid| name | mkid |
________________________
| 1 | Corolla| 1
| 2 | Sunny | 2
| 3 | Carina | 1
| 4 | Skyline| 2
| 5 | Focus | 5
Sale
| sid | carid | etc...
_________________
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |
| 4 | 4 |
| 5 | 5 |
Edit:
Using MS SQL Server 2008
Output needed:
Model Name | Count
_____________________
Corolla | 3
i.e. The model of the Car that has been sold the most.
Notice that only 3 Corollas and 2 Sunnys are in the Car table, while the Sale table has a matching row for each of them with other sale details. The 5 Sale records are actually Corolla, Corolla, Corolla, Sunny and Sunny.
Since you are using SQL Server 2008, make use of a Common Table Expression and a window function.
WITH recordList
AS
(
SELECT c.name, COUNT(*) [Count],
DENSE_RANK() OVER (ORDER BY COUNT(*) DESC) rn
FROM Sale a
INNER JOIN Car b
ON a.carid = b.carID
INNER JOIN Model c
ON b.modID = c.modID
GROUP BY c.Name
)
SELECT name, [Count]
FROM recordList
WHERE rn = 1
SQLFiddle Demo
When interviewers ask for this they usually want you to say that you'd use windowed functions. You could give each sale a unique ascending number partitioned by model and the highest sale number you'd get would be the max count.
http://www.postgresql.org/docs/9.1/static/tutorial-window.html
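
The numbering idea described above can be sketched as follows, using Python's sqlite3 and the question's sample data. Window functions need SQLite 3.25 or newer, which is an assumption about the bundled library; the column aliases `model_name` and `sale_no` are made up for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE car (carid INTEGER PRIMARY KEY, modid INTEGER);
    CREATE TABLE model (modid INTEGER PRIMARY KEY, name TEXT, mkid INTEGER);
    CREATE TABLE sale (sid INTEGER PRIMARY KEY, carid INTEGER);
    INSERT INTO car VALUES (1, 1), (2, 1), (3, 1), (4, 2), (5, 2);
    INSERT INTO model VALUES
        (1, 'Corolla', 1), (2, 'Sunny', 2), (3, 'Carina', 1),
        (4, 'Skyline', 2), (5, 'Focus', 5);
    INSERT INTO sale VALUES (1, 1), (2, 2), (3, 3), (4, 4), (5, 5);
    """
)

# Number each sale 1, 2, 3, ... within its model.
numbered_sales = """
    SELECT m.name AS model_name,
           ROW_NUMBER() OVER (PARTITION BY m.modid ORDER BY s.sid) AS sale_no
    FROM sale s
    JOIN car c ON c.carid = s.carid
    JOIN model m ON m.modid = c.modid
"""

# The highest number handed out is the sales count of the best-selling model.
top_model = conn.execute(
    f"""
    SELECT model_name, sale_no
    FROM ({numbered_sales}) AS numbered
    ORDER BY sale_no DESC
    LIMIT 1
    """
).fetchone()

print(top_model)
```

For the sample data the Corolla sales are numbered 1 to 3 and the Sunny sales 1 to 2, so the top row is Corolla with 3. With LIMIT 1 a tie between models would return only one of them.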
The following query works on Oracle 11g. Here's the fiddle link.
SELECT name FROM (
SELECT model.name AS name FROM car , sale , model
WHERE car.carid=sale.carid
AND car.modid=model.modid
GROUP BY model.name
ORDER BY count(*) DESC )
WHERE rownum = 1;
Or
SELECT name FROM (
SELECT model.name AS name FROM car natural join sale natural join model
GROUP BY model.name
ORDER BY count(*) DESC )
WHERE rownum = 1;
OUTPUT
| NAME |
-----------
| Corolla |
This is based on your newly added SQL Server 2008 tag. If you are using a different RDBMS, you'll probably need to use LIMIT instead of TOP and place it at the end of the top_sold_car subquery.
select Make.name as Make, Model.name as Model
from (
    -- modid must be selected here so the outer joins can use it;
    -- Sale is joined so that only cars that were actually sold are counted
    select top 1 Car.modid, count(*) as num_sold
    from Sale
    join Car
        on (Sale.carid = Car.carid)
    group by Car.modid
    order by num_sold desc) as top_sold_car
join Model
    on (top_sold_car.modid = Model.modid)
join Make
    on (Model.mkid = Make.mkid)
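
For readers on an engine without TOP, the same shape can be tried with LIMIT. This is a minimal sketch in Python's sqlite3 with the question's sample data; it is an adaptation for SQLite rather than the SQL Server query itself, and it adds `num_sold` to the select list so the count is visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE car (carid INTEGER PRIMARY KEY, modid INTEGER);
    CREATE TABLE make (mkid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE model (modid INTEGER PRIMARY KEY, name TEXT, mkid INTEGER);
    CREATE TABLE sale (sid INTEGER PRIMARY KEY, carid INTEGER);
    INSERT INTO car VALUES (1, 1), (2, 1), (3, 1), (4, 2), (5, 2);
    INSERT INTO make VALUES
        (1, 'Toyota'), (2, 'Nissan'), (3, 'Chevy'), (4, 'Merc'), (5, 'Ford');
    INSERT INTO model VALUES
        (1, 'Corolla', 1), (2, 'Sunny', 2), (3, 'Carina', 1),
        (4, 'Skyline', 2), (5, 'Focus', 5);
    INSERT INTO sale VALUES (1, 1), (2, 2), (3, 3), (4, 4), (5, 5);
    """
)

# Inner query: count sales per model and keep the single highest count.
# Outer query: look up the model and make names for that model id.
top_sold = conn.execute(
    """
    SELECT mk.name AS make, m.name AS model, top_sold_car.num_sold
    FROM (
        SELECT c.modid AS modid, COUNT(*) AS num_sold
        FROM sale s
        JOIN car c ON c.carid = s.carid
        GROUP BY c.modid
        ORDER BY num_sold DESC
        LIMIT 1
    ) AS top_sold_car
    JOIN model m ON m.modid = top_sold_car.modid
    JOIN make mk ON mk.mkid = m.mkid
    """
).fetchall()

print(top_sold)
```

Like TOP 1, LIMIT 1 keeps a single row even when two models tie for first place; the DENSE_RANK() approach shown earlier in the thread returns all tied models instead.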