Query to fetch non distinct records from a table - sql

I need a SQL query that fetches data as follows.
TableA
+-------------+--------------+
| CUSTOMER_ID | ACCOUNT_TYPE |
+-------------+--------------+
| 1           | SB           |
| 1           | SB           |
| 2           | SB           |
| 2           | CR           |
| 3           | CR           |
+-------------+--------------+
There is a requirement to fetch rows as follows
+-------------+--------------+
| CUSTOMER_ID | ACCOUNT_TYPE |
+-------------+--------------+
| 1           | SB           |
| 1           | SB           |
| 3           | CR           |
+-------------+--------------+
I need to exclude every customer_id that has two different account_type values and show only the customer_ids whose rows all have the same ACCOUNT_TYPE, or that have only one row.
Can someone help with an Oracle SQL query for this?
Thanks in advance.

I focused only on the following requirement you provided:
I need to exclude every customer_id that has two different
account_type values and show only the customer_ids whose rows all have
the same ACCOUNT_TYPE, or that have only one row.
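You need to eliminate the customer_id values that have two different account_type values:
SELECT CUSTOMER_ID, ACCOUNT_TYPE
FROM TableA
WHERE CUSTOMER_ID IN (
    SELECT CUSTOMER_ID FROM TableA
    GROUP BY CUSTOMER_ID
    HAVING COUNT(DISTINCT ACCOUNT_TYPE) = 1 -- aggregates belong in HAVING, not WHERE
)
ORDER BY CUSTOMER_ID, ACCOUNT_TYPE;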
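For anyone who wants to check the expected result quickly, here is a minimal sketch that runs one possible query against the question's sample data. It uses Python's built-in sqlite3 module as a stand-in for Oracle (an assumption made only so the example is self-contained); the GROUP BY / HAVING COUNT(DISTINCT ...) subquery itself is standard SQL.

```python
import sqlite3

# In-memory SQLite database standing in for the Oracle table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableA (CUSTOMER_ID INTEGER, ACCOUNT_TYPE TEXT)")
conn.executemany(
    "INSERT INTO TableA VALUES (?, ?)",
    [(1, "SB"), (1, "SB"), (2, "SB"), (2, "CR"), (3, "CR")],
)

# Keep only customers whose rows all share a single ACCOUNT_TYPE.
query = """
SELECT CUSTOMER_ID, ACCOUNT_TYPE
FROM TableA
WHERE CUSTOMER_ID IN (
    SELECT CUSTOMER_ID
    FROM TableA
    GROUP BY CUSTOMER_ID
    HAVING COUNT(DISTINCT ACCOUNT_TYPE) = 1
)
ORDER BY CUSTOMER_ID, ACCOUNT_TYPE
"""
rows_single_type = conn.execute(query).fetchall()
print(rows_single_type)  # [(1, 'SB'), (1, 'SB'), (3, 'CR')]
```

Customer 2 drops out because it has both SB and CR; customers 1 and 3 keep all of their rows, duplicates included.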
I don't understand your requirements; they are ambiguous. Please revise your question.

Related

Getting a distinct value from one column if all rows matches a certain criteria

I'm trying to find a performant and easy-to-read query to get a distinct value from one column, if all rows for that value match a certain criterion.
I have a table that tracks e-commerce orders and whether they're delivered on time, with contents and schema as follows:
> select * from orders;
+----+--------------------+-------------+
| id | delivered_on_time | customer_id |
+----+--------------------+-------------+
| 1 | 1 | 9 |
| 2 | 0 | 9 |
| 3 | 1 | 10 |
| 4 | 1 | 10 |
| 5 | 0 | 11 |
+----+--------------------+-------------+
I would like to get all distinct customer_id's which have had all their orders delivered on time. I.e. I would like an output like this:
+-------------+
| customer_id |
+-------------+
| 10 |
+-------------+
What's the best way to do this?
I've found a solution, but it's a bit hard to read and I doubt it's the most efficient way to do it (it uses two CTEs):
> with hits_all as (
    select customer_id, count(*) as count from orders group by customer_id
  ),
  hits_true as (
    select customer_id, count(*) as count from orders where delivered_on_time = 1 group by customer_id
  )
  select *
  from hits_true
  inner join hits_all
    on hits_all.customer_id = hits_true.customer_id
   and hits_all.count = hits_true.count;
+-------------+-------+-------------+-------+
| customer_id | count | customer_id | count |
+-------------+-------+-------------+-------+
| 10          | 2     | 10          | 2     |
+-------------+-------+-------------+-------+
You use group by and having as follows:
select customer_id
from orders
group by customer_id
having sum(delivered_on_time) = count(*)
This works because an on-time delivery is identified by delivered_on_time = 1 and a late one by 0. The sum of delivered_on_time therefore equals the number of rows for a customer only when every one of that customer's orders was on time.
You can use aggregation and having. With 0/1 values, every order is on time exactly when the smallest value is 1:
select customer_id
from orders
group by customer_id
having min(delivered_on_time) = 1;
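To see how different HAVING conditions behave on the sample data, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for the real database (the helper name customers is made up for illustration). It assumes delivered_on_time is always 0 or 1 and never NULL. Note that comparing MIN to MAX alone only checks that a customer's values are all equal, so it also matches a customer whose orders were all late.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER, delivered_on_time INTEGER, customer_id INTEGER)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 1, 9), (2, 0, 9), (3, 1, 10), (4, 1, 10), (5, 0, 11)],
)

def customers(having_clause):
    """Return the customer_ids whose group satisfies the given HAVING clause."""
    sql = (
        "SELECT customer_id FROM orders "
        f"GROUP BY customer_id HAVING {having_clause} "
        "ORDER BY customer_id"
    )
    return [row[0] for row in conn.execute(sql)]

# Every order on time: the sum of the 0/1 flags equals the row count.
all_on_time_sum = customers("SUM(delivered_on_time) = COUNT(*)")
# Every order on time: the smallest flag is 1.
all_on_time_min = customers("MIN(delivered_on_time) = 1")
# All flags equal, whether on time or late: customer 11 (a single late order) matches too.
all_equal = customers("MIN(delivered_on_time) = MAX(delivered_on_time)")

print(all_on_time_sum)  # [10]
print(all_on_time_min)  # [10]
print(all_equal)        # [10, 11]
```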

Join Lookup from 1 table to multiple columns

How do I link one table to multiple columns in another table without using multiple JOINs?
Below is my scenario:
I have table User with ID and Name
User
+---------+------------+
| Id | Name |
+---------+------------+
| 1 | John |
| 2 | Mike |
| 3 | Charles |
+---------+------------+
And table Product with multiple columns, but just focus on 2 columns CreateBy And ModifiedBy
+------------+-----------+-------------+
| product_id | CreateBy | ModifiedBy |
+------------+-----------+-------------+
| 1 | 1 | 3 |
| 2 | 1 | 3 |
| 3 | 2 | 3 |
| 4 | 2 | 1 |
| 5 | 2 | 3 |
+------------+-----------+-------------+
With a normal JOIN, I need to join twice:
SELECT p.Product_id,
    u1.Name AS CreateByName,
    u2.Name AS ModifiedByName
FROM Product p
JOIN [User] u1 ON p.CreateBy = u1.Id
JOIN [User] u2 ON p.ModifiedBy = u2.Id
to produce this result:
+------------+---------------+-----------------+
| product_id | CreateByName | ModifiedByName |
+------------+---------------+-----------------+
| 1 | John | Charles |
| 2 | John | Charles |
| 3 | Mike | Charles |
| 4 | Mike | John |
| 5 | Mike | Charles |
+------------+---------------+-----------------+
How do I avoid joining twice?
I'm using MS SQL, but I'm open to answers in any SQL dialect, out of curiosity.
Your current design/approach is acceptable, I think, and the need for two joins is a function of there being two user ID columns. Each of the two columns requires a separate join.
For fun, here is a table design which you may consider if you really want to have to perform only one join:
+------------+-----------+-------------+
| product_id | user_id | type |
+------------+-----------+-------------+
| 1 | 1 | created |
| 2 | 1 | created |
| 3 | 2 | created |
| 4 | 2 | created |
| 5 | 2 | created |
| 1 | 3 | modified |
| 2 | 3 | modified |
| 3 | 3 | modified |
| 4 | 1 | modified |
| 5 | 3 | modified |
+------------+-----------+-------------+
Now you can get away with just a single join followed by an aggregation:
SELECT
    p.product_id,
    MAX(CASE WHEN p.type = 'created' THEN u.Name END) AS CreateByName,
    MAX(CASE WHEN p.type = 'modified' THEN u.Name END) AS ModifiedByName
FROM Product p
INNER JOIN [User] u
    ON p.user_id = u.Id
GROUP BY
    p.product_id;
Note that I don't recommend this approach at all. It is much cleaner to use your current approach and use two joins. Joins can fairly easily be optimized using one or more indices. The above aggregation approach would probably not perform as well as what you already have.
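As a rough check that the two designs produce the same result, here is a minimal sketch using Python's built-in sqlite3 module rather than MS SQL. The table names users, product and product_user are made up for this example (product_user plays the role of the one-row-per-role design above); the data is the sample from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (Id INTEGER PRIMARY KEY, Name TEXT);
INSERT INTO users VALUES (1, 'John'), (2, 'Mike'), (3, 'Charles');

CREATE TABLE product (product_id INTEGER, CreateBy INTEGER, ModifiedBy INTEGER);
INSERT INTO product VALUES (1, 1, 3), (2, 1, 3), (3, 2, 3), (4, 2, 1), (5, 2, 3);

-- Hypothetical long-format table: one row per (product, user, role).
CREATE TABLE product_user (product_id INTEGER, user_id INTEGER, type TEXT);
INSERT INTO product_user
SELECT product_id, CreateBy, 'created' FROM product
UNION ALL
SELECT product_id, ModifiedBy, 'modified' FROM product;
""")

# Original design: one join per user-id column.
two_joins = conn.execute("""
SELECT p.product_id, u1.Name, u2.Name
FROM product p
JOIN users u1 ON p.CreateBy = u1.Id
JOIN users u2 ON p.ModifiedBy = u2.Id
ORDER BY p.product_id
""").fetchall()

# Alternative design: a single join, then pivot the roles back into columns.
one_join = conn.execute("""
SELECT pu.product_id,
       MAX(CASE WHEN pu.type = 'created' THEN u.Name END),
       MAX(CASE WHEN pu.type = 'modified' THEN u.Name END)
FROM product_user pu
JOIN users u ON pu.user_id = u.Id
GROUP BY pu.product_id
ORDER BY pu.product_id
""").fetchall()

print(two_joins == one_join)  # True
print(two_joins[3])           # (4, 'Mike', 'John')
```

The single join only moves the work into the GROUP BY, which is why the two-join version remains the simpler choice.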
If you use natural keys instead of surrogates, you won't need to join at all.
I don't know how you tell your products apart in the real world, but for this example I will assume you have a UPC:
CREATE TABLE [User]
(Name VARCHAR(20) PRIMARY KEY);
CREATE TABLE Product
(UPC CHAR(12) PRIMARY KEY,
CreatedBy VARCHAR(20) REFERENCES [User](Name),
ModifiedBy VARCHAR(20) REFERENCES [User](Name)
);
Now your query is a simple select, and you also enforce uniqueness of your user names as a bonus, and don't need additional indexes.
Try it...
HTH
A join is the best approach, but if you are looking for an alternative you can use an inline (scalar) subquery.
SELECT P.PRODUCT_ID,
    (SELECT [NAME] FROM #USER WHERE ID = P.CREATED_BY) AS CREATED_BY,
    (SELECT [NAME] FROM #USER WHERE ID = P.MODIFIED_BY) AS MODIFIED_BY
FROM #PRODUCT P
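One behavioural difference is worth knowing before swapping joins for scalar subqueries: a scalar subquery returns NULL when no user matches, while an inner join drops the row. The sketch below illustrates this with Python's built-in sqlite3 module; the table names and the dangling user id 99 are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (Id INTEGER PRIMARY KEY, Name TEXT);
INSERT INTO users VALUES (1, 'John'), (3, 'Charles');

CREATE TABLE product (product_id INTEGER, CreateBy INTEGER, ModifiedBy INTEGER);
-- User 99 does not exist: a made-up dangling reference.
INSERT INTO product VALUES (1, 1, 3), (2, 99, 3);
""")

# Scalar subqueries: every product row survives, missing names come back as NULL.
scalar_rows = conn.execute("""
SELECT p.product_id,
       (SELECT u.Name FROM users u WHERE u.Id = p.CreateBy),
       (SELECT u.Name FROM users u WHERE u.Id = p.ModifiedBy)
FROM product p
ORDER BY p.product_id
""").fetchall()

# Inner joins: a product with an unmatched user id is filtered out.
join_rows = conn.execute("""
SELECT p.product_id, u1.Name, u2.Name
FROM product p
JOIN users u1 ON p.CreateBy = u1.Id
JOIN users u2 ON p.ModifiedBy = u2.Id
ORDER BY p.product_id
""").fetchall()

print(scalar_rows)  # [(1, 'John', 'Charles'), (2, None, 'Charles')]
print(join_rows)    # [(1, 'John', 'Charles')]
```

A LEFT JOIN would give the same NULL-preserving behaviour as the scalar subqueries.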

Find number of rows identical on some columns, but different on another column

Say I have the following table:
CREATE TABLE data (
PROJECT_ID VARCHAR,
TASK_ID VARCHAR,
REF_ID VARCHAR,
REF_VALUE VARCHAR
);
I want to identify rows where
PROJECT_ID, REF_ID, REF_VALUE are the same
but TASK_ID are different.
The desired output is a list of TASK_ID_1, TASK_ID_2 and COUNT(*) of such conflicts. So, for example,
DATA
+------------+---------+--------+-----------+
| PROJECT_ID | TASK_ID | REF_ID | REF_VALUE |
+------------+---------+--------+-----------+
| 1 | 1 | 1 | 1 |
| 1 | 1 | 1 | 2 |
| 1 | 2 | 1 | 1 |
| 1 | 2 | 1 | 2 |
+------------+---------+--------+-----------+
OUTPUT
+-----------+-----------+----------+
| TASK_ID_1 | TASK_ID_2 | COUNT(*) |
+-----------+-----------+----------+
| 1 | 2 | 2 |
| 2 | 1 | 2 |
+-----------+-----------+----------+
would mean that there are two entries with TASK_ID == 1 and two entries with TASK_ID == 2 that share the same values for the other three columns. The inherent symmetry in the output is fine.
How would I go about finding this information? I've tried joining the table onto itself and grouping, but this turned up more results for a single task than the table had rows altogether, so it's clearly wrong.
The database used is PostgreSQL, though a solution that applies to most common SQL systems would be preferable.
You want a self join and aggregation:
select d1.task_id as task_id_1, d2.task_id as task_id_2, count(*)
from data d1 join
data d2
on d1.project_id = d2.project_id and
d1.ref_id = d2.ref_id and
d1.ref_value = d2.ref_value and
d1.task_id <> d2.task_id
group by d1.task_id, d2.task_id;
Notes:
Add the condition d1.task_id < d2.task_id if you want each pair to occur only once in the result set.
This does not handle NULL values, although that is easy enough to handle. Use is not distinct from instead of =.
You can also simplify this a bit with the using clause:
select d1.task_id as task_id_1, d2.task_id as task_id_2, count(*)
from data d1 join
data d2
using (project_id, ref_id, ref_value)
where d1.task_id <> d2.task_id
group by d1.task_id, d2.task_id;
You can get an idea of how many rows might be returned by using:
select d.project_id, d.ref_id, d.ref_value, count(distinct d.task_id), count(*)
from data d
group by d.project_id, d.ref_id, d.ref_value;
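Here is a small runnable sketch of the self join on the question's sample data. It uses Python's built-in sqlite3 module instead of PostgreSQL (an assumption made for portability), and the helper name conflicts is made up. It runs the join once with <> and once with <, to show the symmetric and the deduplicated form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE data (PROJECT_ID TEXT, TASK_ID TEXT, REF_ID TEXT, REF_VALUE TEXT)"
)
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?, ?)",
    [("1", "1", "1", "1"), ("1", "1", "1", "2"),
     ("1", "2", "1", "1"), ("1", "2", "1", "2")],
)

def conflicts(pair_condition):
    """Count rows that agree on project/ref columns for each pair of tasks."""
    sql = f"""
    SELECT d1.TASK_ID, d2.TASK_ID, COUNT(*)
    FROM data d1
    JOIN data d2
      ON d1.PROJECT_ID = d2.PROJECT_ID
     AND d1.REF_ID = d2.REF_ID
     AND d1.REF_VALUE = d2.REF_VALUE
     AND {pair_condition}
    GROUP BY d1.TASK_ID, d2.TASK_ID
    ORDER BY d1.TASK_ID, d2.TASK_ID
    """
    return conn.execute(sql).fetchall()

symmetric = conflicts("d1.TASK_ID <> d2.TASK_ID")  # each pair in both orders
once = conflicts("d1.TASK_ID < d2.TASK_ID")        # each pair only once

print(symmetric)  # [('1', '2', 2), ('2', '1', 2)]
print(once)       # [('1', '2', 2)]
```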
This is how I understand your question. It assumes there are only two tasks for the same combination.
SELECT "PROJECT_ID", "REF_ID", "REF_VALUE",
MIN("TASK_ID") as TASK_ID_1,
MAX("TASK_ID") as TASK_ID_2,
COUNT(*) as cnt
FROM Table1
GROUP BY "PROJECT_ID", "REF_ID", "REF_VALUE"
HAVING MIN("TASK_ID") != MAX("TASK_ID")
-- COUNT(DISTINCT "TASK_ID") > 1 should also work
OUTPUT
I added more columns to make clear which elements are the same:
| PROJECT_ID | REF_ID | REF_VALUE | task_id_1 | task_id_2 | cnt |
|------------|--------|-----------|-----------|-----------|-----|
| 1 | 1 | 2 | 1 | 2 | 2 |
| 1 | 1 | 1 | 1 | 2 | 2 |

SQL Select foo if all match condition, return foo

Long buildup prob simple answer...
I know this is going to require a subquery of some kind...
But I am joining 3 tables and trying to get an output...
table one 'Status'
Contains many pk_tickNum
id | pk_tickNum | Status | time
/*table two 'Order'
Only One Order*/
id | pk_order_num | tickNum | taker
/*table three 'Transaction'
Many Transactions, Many Item_num, One location p/item*/
id | pk_transaction | tickNum | item_num | Location
I have a statement that says...
Select
ticket1.pk_tickNum,ticket1.status,ticket1.time,order.pk_order_num
From
Status ticket1 left join Status ticket2
ON
(ticket1.pk_tickNum = ticket2.pk_tickNum AND ticket1.ID < ticket2.ID)
Inner Join
order
ON
ticket1.pk_tickNum = order.tickNum
WHERE
(ticket2.ID IS NULL)
This will give me the most current status of the order....
Works perfectly!!! However, we have Bins, ie: Locations. and every order has multiple items...
As the item moves through the warehouse, every location is recorded. So for every order, there are multiple items and each item has a location to include the 'shipped' location which marks the end.
If I run the above query to left join the third Transaction table I get as many entries as there are item_num on a single transaction. I don't need that!
All I am looking for is a single output for the current status of a ticket if ALL items on a ticket are NOT in location='shipped'
Edit -
Content
Status
id | pk_tickNum | Status |
1 | 123456 | Green |
2 | 123457 | Blue |
3 | 123456 | Yellow |
4 | 123456 | Red |
5 | 123457 | Green |
Order
id | pk_order_num | tickNum |
1 | 987654 | 123456
2 | 987656 | 123457
Transaction
id | pk_transaction | tickNum | item_num | Location
1 | 5555555555 | 123456 | Some | Floor
2 | 5555555556 | 123456 | Thing | Floor
3 | 5555555557 | 123456 | Smart | Shipped
4 | 5555555558 | 123456 | or | Shipped
5 | 5555555559 | 123457 | Really | Shipped
6 | 5555555560 | 123457 | Noth | Shipped
7 | 5555555561 | 123457 | ing | Shipped
Output -
pk_order_num | pk_tickNum | Status |
987654 | 123456 | Red |
/*987656 | 123457 | Green |*/ This should not show!
Answer! Posted by @Used_By_Already, with sample code available at SQLFiddle.
Thank you!
I really do hope you don't have tables called "order" and "transaction"; if you do, make sure they are enclosed in [] or "". For my sanity, I added an "s" to the end of those names.
To achieve this result (available at SQLFiddle):
| pk_order_num | tickNum | Status |
|--------------|---------|--------|
| 987654 | 123456 | Red |
I have assumed that the "most recent" row in the status table is determined by the reverse order of the ID column (this isn't a great way to do it, but that is the only column available to work with). A better basis would be a "last updated" datetime value; perhaps that is the [time] column in that table, but no data was supplied for it.
SELECT
o.pk_order_num
, o.tickNum
, s.Status
FROM [orders] o
INNER JOIN (
select pk_tickNum, Status
, row_number() over(partition by pk_tickNum
order by id desc) rn
from status
) s ON o.ticknum = s.pk_tickNum and s.rn = 1
INNER JOIN (
SELECT
ticknum
FROM [transactions]
GROUP BY ticknum
HAVING COUNT(*) <> SUM(CASE WHEN Location = 'shipped' THEN 1 ELSE 0 END)
) t ON s.pk_tickNum = t.ticknum
;
Also note that the final subquery using the having clause determines if all details in the transactions have been shipped or not. Only orders with unshipped transactions will be returned by that subquery.
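For readers without access to the fiddle, the sketch below replays the same idea on the question's sample data using Python's built-in sqlite3 module (window functions need SQLite 3.25 or newer). This is an approximation, not the original SQL Server setup: the pk_transaction column is left out for brevity, and because SQLite compares strings case-sensitively, the literal is written as 'Shipped' to match the data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE status (id INTEGER, pk_tickNum INTEGER, Status TEXT);
INSERT INTO status VALUES
    (1, 123456, 'Green'), (2, 123457, 'Blue'), (3, 123456, 'Yellow'),
    (4, 123456, 'Red'), (5, 123457, 'Green');

CREATE TABLE orders (id INTEGER, pk_order_num INTEGER, tickNum INTEGER);
INSERT INTO orders VALUES (1, 987654, 123456), (2, 987656, 123457);

CREATE TABLE transactions (id INTEGER, tickNum INTEGER, item_num TEXT, Location TEXT);
INSERT INTO transactions VALUES
    (1, 123456, 'Some', 'Floor'), (2, 123456, 'Thing', 'Floor'),
    (3, 123456, 'Smart', 'Shipped'), (4, 123456, 'or', 'Shipped'),
    (5, 123457, 'Really', 'Shipped'), (6, 123457, 'Noth', 'Shipped'),
    (7, 123457, 'ing', 'Shipped');
""")

open_tickets = conn.execute("""
SELECT o.pk_order_num, o.tickNum, s.Status
FROM orders o
JOIN (
    -- latest status per ticket: highest id gets rn = 1
    SELECT pk_tickNum, Status,
           ROW_NUMBER() OVER (PARTITION BY pk_tickNum ORDER BY id DESC) AS rn
    FROM status
) s ON o.tickNum = s.pk_tickNum AND s.rn = 1
JOIN (
    -- tickets with at least one item that has not shipped yet
    SELECT tickNum
    FROM transactions
    GROUP BY tickNum
    HAVING COUNT(*) <> SUM(CASE WHEN Location = 'Shipped' THEN 1 ELSE 0 END)
) t ON s.pk_tickNum = t.tickNum
""").fetchall()

print(open_tickets)  # [(987654, 123456, 'Red')]
```

Ticket 123457 is filtered out because all three of its items are in the Shipped location.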
Select
s.pk_tickNum, s.status, s.time, o.pk_order_num
From Status s
-- actually this join already multiplies rows: ticket 123456 has more than one record in Status table in your sample data
Inner Join [order] o ON s.pk_tickNum = o.tickNum
WHERE NOT EXISTS
(
-- why is it named `pk_tickNum` if this is not a PK?
SELECT 1 FROM Status ticket2
WHERE s.pk_tickNum = ticket2.pk_tickNum AND s.ID < ticket2.ID
)
AND NOT EXISTS
(
-- might catch "empty orders" if any
SELECT 1 FROM [Transaction] t
WHERE t.tickNum = s.pk_tickNum
and t.Location = 'shipped'
)
Note that the output from your sample data would be empty, because ticket 123456 has two items with location 'shipped', which violates the conditions you described.

Remove duplicate rows using SQL

I want to know if there is a way to remove duplicate values from a table. The DISTINCT keyword returns unique rows, but if the value in even one column differs, the rows are not treated as duplicates. Is there any way to achieve this? I hope the example below helps.
For example, in the table below there are two entries for Employee_ID 1234 with two different priorities. My output should contain only the higher-priority row. Is that possible?
My table
+-------------+----------+--------+
| Employee_ID | priority | gender |
+-------------+----------+--------+
| 1234        | 1        | F      |
| 1234        | 10       | F      |
| 5678        | 2        | M      |
| 5678        | 25       | M      |
| 9101        | 45       | F      |
+-------------+----------+--------+
Output
+-------------+----------+--------+
| Employee_ID | priority | gender |
+-------------+----------+--------+
| 1234        | 1        | F      |
| 5678        | 2        | M      |
| 9101        | 45       | F      |
+-------------+----------+--------+
DELETE
FROM Table t
WHERE EXISTS ( SELECT Employee_ID FROM Table WHERE Employee_ID = t.Employee_ID AND priority < t.Priority)
That is if you really want to remove them from the table. The Exists part can also be used in a select query to leave the values in the Original table.
SELECT *
FROM Table t
WHERE NOT EXISTS (SELECT Employee_ID FROM Table WHERE Employee_ID = t.Employee_ID AND priority < t.Priority)
select Employee_ID,min(priority) as priority,gender
from table
group by Employee_ID,gender
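To check these ideas against the expected output in the question, here is a minimal sketch using Python's built-in sqlite3 module. The table name employees is made up (Table is a reserved word), and it assumes, as the sample output suggests, that the row to keep is the one with the smallest priority number.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (Employee_ID INTEGER, priority INTEGER, gender TEXT)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(1234, 1, "F"), (1234, 10, "F"), (5678, 2, "M"), (5678, 25, "M"), (9101, 45, "F")],
)

# Non-destructive: select rows for which no lower-numbered priority exists.
kept = conn.execute("""
SELECT Employee_ID, priority, gender
FROM employees t
WHERE NOT EXISTS (
    SELECT 1 FROM employees e
    WHERE e.Employee_ID = t.Employee_ID AND e.priority < t.priority
)
ORDER BY Employee_ID
""").fetchall()

# Same result via aggregation (works here because gender is constant per employee).
grouped = conn.execute("""
SELECT Employee_ID, MIN(priority), gender
FROM employees
GROUP BY Employee_ID, gender
ORDER BY Employee_ID
""").fetchall()

# Destructive: delete every row that has a lower-numbered sibling.
conn.execute("""
DELETE FROM employees
WHERE EXISTS (
    SELECT 1 FROM employees e
    WHERE e.Employee_ID = employees.Employee_ID
      AND e.priority < employees.priority
)
""")
remaining = conn.execute(
    "SELECT Employee_ID, priority, gender FROM employees ORDER BY Employee_ID"
).fetchall()

print(kept)       # [(1234, 1, 'F'), (5678, 2, 'M'), (9101, 45, 'F')]
print(remaining)  # [(1234, 1, 'F'), (5678, 2, 'M'), (9101, 45, 'F')]
```

If two rows for the same employee share the lowest priority, all three variants keep or return both of them, so a tie-breaker column would be needed in that case.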