Select distinct values from union - sql

I have three tables (sales orders, invoices, purchase orders)
sales_order
------------
so_id (primary key)
item_id (foreign key)
entry_date
invoice
------------
invc_id (primary key)
item_id (foreign key)
entry_date
purchase_order
------------
po_id (primary key)
item_id (foreign key)
entry_date
and they all refer to one central table (item):
item_id (pk)
I am trying to write a sql query that will return all items with activity within a date range.
This is what I've come up with:
select distinct item_id from sales_order where entry_date between ? and ?
union
select distinct item_id from invoice where entry_date between ? and ?
union
select distinct item_id from purchase_order where entry_date between ? and ?
I think this is the correct solution, but I'm not sure how to test it.
Question 1:
Does the "distinct" keyword apply to all of the statements or only to each statement? i.e., will each query return a distinct set but when you "union" them together it can show duplicates?
Question 2:
Is there a way to return the total (unique) item count (as a separate query)? Like:
select count(
select distinct item_id from sales_order where entry_date between ? and ?
union
select distinct item_id from invoice where entry_date between ? and ?
union
select distinct item_id from purchase_order where entry_date between ? and ?
)
??

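The distinct is redundant, because union already removes duplicate rows from the combined result, so each item_id appears only once in the final set. I usually write such a query as: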
select item_id from sales_order where entry_date between ? and ?
union -- intentionally removing duplicates
select item_id from invoice where entry_date between ? and ?
union
select item_id from purchase_order where entry_date between ? and ?;
To return the total count, you can use a subquery:
select count(*)
from (select item_id from sales_order where entry_date between ? and ?
union -- intentionally removing duplicates
select item_id from invoice where entry_date between ? and ?
union
select item_id from purchase_order where entry_date between ? and ?
) i;
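To test this without touching real data, here is a minimal sketch that runs both queries against an in-memory SQLite database from Python. The sample rows and the date range are made up for illustration, and named parameters (:start, :end) stand in for the ? placeholders; only the table and column names come from the question.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales_order    (so_id   INTEGER PRIMARY KEY, item_id INTEGER, entry_date TEXT);
    CREATE TABLE invoice        (invc_id INTEGER PRIMARY KEY, item_id INTEGER, entry_date TEXT);
    CREATE TABLE purchase_order (po_id   INTEGER PRIMARY KEY, item_id INTEGER, entry_date TEXT);
    INSERT INTO sales_order    VALUES (1, 10, '2024-01-05'), (2, 10, '2024-01-06'), (3, 20, '2024-03-01');
    INSERT INTO invoice        VALUES (1, 10, '2024-01-07'), (2, 30, '2024-01-08');
    INSERT INTO purchase_order VALUES (1, 30, '2024-01-09'), (2, 40, '2023-12-31');
""")

UNION_SQL = """
    SELECT item_id FROM sales_order    WHERE entry_date BETWEEN :start AND :end
    UNION
    SELECT item_id FROM invoice        WHERE entry_date BETWEEN :start AND :end
    UNION
    SELECT item_id FROM purchase_order WHERE entry_date BETWEEN :start AND :end
"""
params = {"start": "2024-01-01", "end": "2024-01-31"}

# UNION removes duplicates across all three branches, not just within each one.
item_ids = sorted(row[0] for row in conn.execute(UNION_SQL, params))

# Wrapping the union in a derived table gives the unique item count.
item_count = conn.execute(f"SELECT COUNT(*) FROM ({UNION_SQL}) AS i", params).fetchone()[0]

# For comparison, UNION ALL keeps every matching row, duplicates included.
all_rows = conn.execute(UNION_SQL.replace("UNION", "UNION ALL"), params).fetchall()

print(item_ids, item_count, len(all_rows))
```

Comparing the UNION result with the UNION ALL row count is a quick way to confirm that the duplicates really are being collapsed.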

Related

SQL query to select single record when record may exist in another column

I just need help putting together a query. Below is an example of what I would like to achieve:
Table name: Current_Orders_tbl
Intended output query
We can get an order without a transaction_id, or an order that has an order_id that is the same as a transaction_id in the table.
When I query the table, I would like to only see:
The order_id that contains a transaction_id that matches an already used order_id.
The null transaction_id if that order_id hasn't been used before
Thanks in advance
SELECT COALESCE(table2.order_id, table1.order_id) AS order_id,
table1.order_name,
COALESCE(table2.transaction_id, table1.transaction_id) AS transaction_id
FROM Current_Orders_tbl AS table1 LEFT OUTER JOIN
Current_Orders_tbl AS table2 ON table1.order_id = table2.transaction_id
WHERE
(table1.transaction_id IS NULL AND table2.order_id IS NULL) OR
table2.order_id IS NOT NULL
We can use ROW_NUMBER here.
The partition will be done by order_name and we will sort by transaction_id:
WITH sub AS
(SELECT order_id, order_name, transaction_id,
ROW_NUMBER()
OVER (PARTITION BY order_name ORDER BY transaction_id DESC) AS rowNr
FROM Current_Orders_tbl)
SELECT order_id, order_name, transaction_id
FROM sub
WHERE rowNr = 1;
According to your sample, this will fetch only the row with the highest transaction_id for each order_name.
Sidenote: If the result of that query should be sorted, I think adding ORDER BY order_id, order_name would be best here.
Try EXISTS with a correlated subquery, as follows:
select order_id, order_name, transaction_id
from table_name T
where
( /* for the 1st requirement: The order_id that contains a transaction_id that matches an already used order_id.*/
transaction_id is not null and
exists(select 1 from table_name D where D.order_id = T.transaction_id)
)
or
( /* for the 2nd requirement: The null transaction_id if that order_id hasn't been used before. */
transaction_id is null and
not exists(select 1 from table_name D where D.transaction_id = T.order_id)
)
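Since the sample table from the question is not reproduced here, the following is only a sketch of how the EXISTS approach behaves, run against SQLite from Python with three made-up rows. Order 3 carries transaction_id 1, so order 1 counts as "already used" and should drop out, while order 2 has never been referenced and should stay.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Current_Orders_tbl (order_id INTEGER, order_name TEXT, transaction_id INTEGER);
    -- Hypothetical rows: order 3 points back at order 1 through transaction_id.
    INSERT INTO Current_Orders_tbl VALUES (1, 'A', NULL), (2, 'B', NULL), (3, 'A', 1);
""")

QUERY = """
    SELECT order_id, order_name, transaction_id
    FROM Current_Orders_tbl T
    WHERE
      ( /* transaction_id matches an already used order_id */
        transaction_id IS NOT NULL AND
        EXISTS (SELECT 1 FROM Current_Orders_tbl D WHERE D.order_id = T.transaction_id)
      )
      OR
      ( /* no transaction_id, and no other row refers to this order_id */
        transaction_id IS NULL AND
        NOT EXISTS (SELECT 1 FROM Current_Orders_tbl D WHERE D.transaction_id = T.order_id)
      )
    ORDER BY order_id
"""
kept = conn.execute(QUERY).fetchall()
print(kept)
```

With these rows, only orders 2 and 3 come back; order 1 is filtered out because order 3 refers to it.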

Select multiple columns with not all columns mentioned in Groupby - Postgres v12

I have a table that contains review_id, product_id, ratings, reviewer_id, and review_comments. The table I have is shown below.
What I need is simple, but I am having trouble figuring it out: for each product_id, I want the product_id, rating, reviewer_id, and review_comments of the row that has the highest review_id.
With below query, I am able to get product_id and review_id properly.
SELECT product_id,max(review_id) as review_id
FROM public.products Group by product_id;
But when I try to add ratings, reviewer_id, and review_comments, it raises an error saying those columns must be part of the GROUP BY. If I add them, the grouping breaks, since I need to group on product_id only.
Is there a way to solve this?
My expected result should contain the full rows with review_id 7, 5, and 8, since review_id 7 is the highest for product_id 1, review_id 5 is the highest for product_id 2, and review_id 8 is the highest for product_id 3.
Try PostgreSQL's DISTINCT ON:
SELECT DISTINCT ON (product_id)
product_id,
review_id,
rating,
reviewer_id,
review_comments
FROM products
ORDER BY product_id, review_id DESC;
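This will return the first row for each product_id according to the ORDER BY. Because rows are sorted by review_id DESC within each product_id, that first row is the one with the highest review_id.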
This can be done with NOT EXISTS:
select p.product_id, p.rating, p.reviewer_id, p.review_comments
from public.products p
where not exists (
select 1 from public.products
where product_id = p.product_id and review_id > p.review_id
)
You can use a correlated subquery:
select * from tablename a
where review_id =(select max(review_id) from tablename b where a.product_id=b.product_id)
Or use row_number():
select * from
(
select *, row_number() over(partition by product_id order by review_id desc) as rn
from tablename
)A where rn=1
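As a rough check that these approaches agree, here is a sketch that runs the correlated MAX subquery and the NOT EXISTS variant side by side in an in-memory SQLite database from Python. The rows are invented; they are only arranged so that the latest reviews are 7, 5, and 8 as in the question. DISTINCT ON is PostgreSQL-specific and so is not exercised here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (
        review_id INTEGER PRIMARY KEY,
        product_id INTEGER,
        rating INTEGER,
        reviewer_id INTEGER,
        review_comments TEXT
    );
    -- Hypothetical rows; only the latest review ids (7, 5, 8) mirror the question.
    INSERT INTO products VALUES
        (1, 1, 4, 101, 'ok'),
        (2, 2, 3, 102, 'average'),
        (3, 1, 5, 103, 'great'),
        (4, 3, 2, 104, 'poor'),
        (5, 2, 4, 105, 'good'),
        (6, 3, 3, 106, 'fine'),
        (7, 1, 1, 107, 'broke quickly'),
        (8, 3, 5, 108, 'excellent');
""")

# Variant 1: keep the row whose review_id equals the per-product maximum.
MAX_SUBQUERY = """
    SELECT a.product_id, a.review_id, a.rating, a.reviewer_id, a.review_comments
    FROM products a
    WHERE a.review_id = (SELECT MAX(b.review_id) FROM products b
                         WHERE b.product_id = a.product_id)
    ORDER BY a.product_id
"""

# Variant 2: keep the row for which no later review of the same product exists.
NOT_EXISTS = """
    SELECT p.product_id, p.review_id, p.rating, p.reviewer_id, p.review_comments
    FROM products p
    WHERE NOT EXISTS (
        SELECT 1 FROM products q
        WHERE q.product_id = p.product_id AND q.review_id > p.review_id
    )
    ORDER BY p.product_id
"""

latest_by_max = conn.execute(MAX_SUBQUERY).fetchall()
latest_by_not_exists = conn.execute(NOT_EXISTS).fetchall()
latest_ids = [(row[0], row[1]) for row in latest_by_max]
print(latest_ids)
```

Both variants return complete rows without any GROUP BY, which is why the "column must appear in the GROUP BY clause" error never comes up.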

Return only unique rows from a Table

I have a table with 4 columns and 7 rows.
This table contains 1 customer entered more than once with the same ID, the same LNAME, and the same FNAME.
The table also has 2 customers with the same ID but a different LNAME or FNAME.
That is the sales reps' input error. Ideally my table should have only 2 rows (Row with ID_pk 3 and 7)
I need to have the following result-sets from the above table:
All unique rows by all the four columns (Row with ID_pk 3 and 7). (excluding case # 3 listed below)
All duplicates by all the four columns (Row with ID_pk 3 and 8).
All duplicates by Customer_ID but with not matching LNAME and/or FNAME (Row with ID_pk 1, 2, 4 and 5) (these rows have to be sent back to sales reps for validation.)
Doing things like this relies heavily on nested queries, the GROUP BY clause, and the COUNT function.
Part 1 - Unique rows
This query will show you all the rows where the customer ID has matching data.
SELECT Customer_ID, Customer_FNAME, Customer_LNAME FROM dbo.customers WHERE Customer_ID IN (
SELECT Customer_ID FROM (
SELECT DISTINCT Customer_ID, Customer_FNAME, Customer_LNAME FROM dbo.customers
GROUP BY Customer_ID, Customer_FNAME, Customer_LNAME
) Customers
GROUP BY Customer_ID
HAVING COUNT(Customer_ID) = 1
)
GROUP BY Customer_ID, Customer_FNAME, Customer_LNAME
Part 2 - Duplicates
This query will show you all the rows that have the same data entered more than once.
SELECT Customer_ID, Customer_FNAME, Customer_LNAME
FROM dbo.customers
GROUP BY Customer_ID, Customer_FNAME, Customer_LNAME
HAVING COUNT(Customer_ID) > 1
Part 3 - Mismatched Data
This query is basically the same as the first, just looking for a different COUNT value.
SELECT Customer_ID, Customer_FNAME, Customer_LNAME FROM dbo.customers WHERE Customer_ID IN (
SELECT Customer_ID FROM (
SELECT DISTINCT Customer_ID, Customer_FNAME, Customer_LNAME FROM dbo.customers
GROUP BY Customer_ID, Customer_FNAME, Customer_LNAME
) Customers
GROUP BY Customer_ID
HAVING COUNT(Customer_ID) > 1
)
GROUP BY Customer_ID, Customer_FNAME, Customer_LNAME
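The three parts above can be tried out with the sketch below, which uses SQLite from Python. The names and customer IDs are invented to match the description in the question (rows 1, 2, 4, and 5 mismatched, rows 3 and 8 exact duplicates, row 7 unique), and the table is called customers rather than dbo.customers because SQLite has no dbo schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (ID_pk INTEGER PRIMARY KEY, Customer_ID INTEGER, "
    "Customer_FNAME TEXT, Customer_LNAME TEXT)"
)
# Hypothetical rows shaped like the question's description.
conn.executemany("INSERT INTO customers VALUES (?, ?, ?, ?)", [
    (1, 100, "John", "Smith"), (2, 100, "Jon", "Smith"),   # same ID, names differ
    (3, 200, "Ann", "Lee"),    (8, 200, "Ann", "Lee"),     # exact duplicates
    (4, 300, "Bob", "Ray"),    (5, 300, "Bob", "Rey"),     # same ID, names differ
    (7, 400, "Eve", "Fox"),                                # entered once
])

def rows_by_name_variants(comparison):
    """Rows whose Customer_ID has a count of distinct name combinations
    satisfying `comparison` (a fixed string such as '= 1' or '> 1')."""
    sql = f"""
        SELECT Customer_ID, Customer_FNAME, Customer_LNAME FROM customers
        WHERE Customer_ID IN (
            SELECT Customer_ID FROM (
                SELECT DISTINCT Customer_ID, Customer_FNAME, Customer_LNAME FROM customers
            ) AS c
            GROUP BY Customer_ID
            HAVING COUNT(*) {comparison}
        )
        GROUP BY Customer_ID, Customer_FNAME, Customer_LNAME
        ORDER BY Customer_ID, Customer_FNAME, Customer_LNAME
    """
    return conn.execute(sql).fetchall()

unique_rows = rows_by_name_variants("= 1")   # Part 1: consistent data per ID
mismatched = rows_by_name_variants("> 1")    # Part 3: conflicting names per ID
duplicates = conn.execute("""
    SELECT Customer_ID, Customer_FNAME, Customer_LNAME
    FROM customers
    GROUP BY Customer_ID, Customer_FNAME, Customer_LNAME
    HAVING COUNT(*) > 1
""").fetchall()                              # Part 2: same data entered twice

print(unique_rows, duplicates, mismatched, sep="\n")
```

Parts 1 and 3 differ only in the HAVING comparison, so the sketch builds both from one helper.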
You may use a CTE (Common Table expression): https://msdn.microsoft.com/en-us/library/ms175972.aspx
;WITH checkDup AS (
SELECT Customer_ID, ROW_NUMBER() OVER (PARTITION BY Customer_ID ORDER BY Customer_ID) AS 'RN'
FROM Table)
SELECT Customer_ID FROM checkDup
WHERE RN = 1;
This will give you your example output.
You may manipulate the CTE to get the other results you seek.

SQL Statement for counting how many times a record is changed

I have a requirement involving 3 audit tables: any time a change is made, a new record is created for the change. Now I need to create a view that returns the distinct order IDs and how many times each was changed across the 3 respective tables.
This is the SQL I have developed:
select distinct orderid as salesorderid,COUNT (*) AS SALESORDERUPDATE
FROM foobar
where ModifiedDateTime>'2015-11-01 20:44:55.107' group by Orderid
union all
select distinct orderid as salesorderid,count(*) AS SALESORDERUPDATE
from foobar1
where ModifiedDateTime>'2015-11-01 20:44:55.107' group by OrderID
union all
select distinct orderid as salesorderid,count(*)AS SALESORDERUPDATE
from foobar2
where ModifiedDateTime>'2015-11-01 20:44:55.107' group by Orderid;
Now the results I get are like this:
For orderid 1, I have 3 entries showing different values for count(*). I want to sum those up into one line,
so it would be something like
orderid 123 salesorderupdate(20)
as opposed to
orderId 123 salesorderupdate(10)
orderid 123 salesorderupdate(7)
orderid 123 salesorderupdate(3)
Thanks,
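Like this (the inner query aliases orderid as salesorderid, so the outer query groups on that name):
select salesorderid, sum(salesorderupdate)
from
(the query from your question goes here) temp
group by salesorderid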
Make your query the subquery, then select the ID and the sum:
select salesorderid, sum(SALESORDERUPDATE) as SALESORDERUPDATE
from (
select orderid as salesorderid, count(*) as SALESORDERUPDATE
from foobar
where ModifiedDateTime > '2015-11-01 20:44:55.107' group by orderid
union all
select orderid as salesorderid, count(*) as SALESORDERUPDATE
from foobar1
where ModifiedDateTime > '2015-11-01 20:44:55.107' group by orderid
union all
select orderid as salesorderid, count(*) as SALESORDERUPDATE
from foobar2
where ModifiedDateTime > '2015-11-01 20:44:55.107' group by orderid
) changes group by salesorderid;
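To see the roll-up in action, here is a small sketch using SQLite from Python. The audit rows are invented: order 123 has 3, 2, and 1 recent changes across the three tables plus one change before the cutoff, and order 456 has a single change. The table names and the cutoff timestamp come from the question; timestamps are stored as ISO-style text, which is an assumption that makes string comparison match chronological order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
tables = ("foobar", "foobar1", "foobar2")
for table in tables:
    conn.execute(f"CREATE TABLE {table} (orderid INTEGER, ModifiedDateTime TEXT)")

# Hypothetical audit rows: one timestamp after the cutoff, one before it.
recent, old = "2015-11-02 10:00:00.000", "2015-10-01 00:00:00.000"
conn.executemany("INSERT INTO foobar VALUES (?, ?)", [(123, recent)] * 3 + [(123, old)])
conn.executemany("INSERT INTO foobar1 VALUES (?, ?)", [(123, recent)] * 2 + [(456, recent)])
conn.executemany("INSERT INTO foobar2 VALUES (?, ?)", [(123, recent)])

# One per-table count, glued together with UNION ALL so no partial count is lost.
per_table = " UNION ALL ".join(
    f"SELECT orderid AS salesorderid, COUNT(*) AS salesorderupdate "
    f"FROM {table} WHERE ModifiedDateTime > :cutoff GROUP BY orderid"
    for table in tables
)

# The outer query sums the partial counts into one row per order.
totals = conn.execute(
    f"SELECT salesorderid, SUM(salesorderupdate) FROM ({per_table}) AS changes "
    "GROUP BY salesorderid ORDER BY salesorderid",
    {"cutoff": "2015-11-01 20:44:55.107"},
).fetchall()
print(totals)
```

UNION ALL matters here: plain UNION would merge two tables that happen to report the same count for an order, and the sum would come out too low.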

sql - check for uniqueness of COMPOSITE key

Can somebody please help me with this difficulty I am having?
I would like to check some data whether it is valid, so a small part of the validation consists of entity integrity where I check that my primary key is unique
SELECT order_id, COUNT(order_id)
FROM temp_order
GROUP BY order_id
HAVING ( COUNT(order_id) > 1 )
in this case, order_id is the primary key. This query works fine.
The problem:
I now have another table temp_orditem which has a composite primary key made up of 2 fields: order_id, product_id.
How can I check whether the primary key is unique (i.e. the combination of the 2 fields together)? Can I do the following?
SELECT order_id, product_id, COUNT(order_id), COUNT(product_id)
FROM temp_orditem
GROUP BY order_id, product_id
HAVING ( COUNT(order_id) > 1 AND COUNT(product_id)>1)
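I would just write this (referencing the alias x in HAVING works in MySQL, but most other databases require repeating COUNT(*) there):
SELECT order_id, product_id, COUNT(*) AS x
FROM temp_orditem
GROUP BY order_id, product_id
HAVING x > 1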
This is the query you need:
select order_id, product_id, count(*)
from temp_orditem
group by order_id, product_id
having count(*) > 1
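Here is a minimal sketch of that check, run against an in-memory SQLite table from Python. The rows and the quantity column are hypothetical; the table is created without a primary key constraint, on the assumption that temp_orditem is a staging table where duplicates can actually occur.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# No PRIMARY KEY here on purpose: the point is to find rows that would violate it.
conn.execute("CREATE TABLE temp_orditem (order_id INTEGER, product_id INTEGER, quantity INTEGER)")
conn.executemany("INSERT INTO temp_orditem VALUES (?, ?, ?)", [
    (1, 10, 2), (1, 11, 1), (2, 10, 5),   # distinct (order_id, product_id) pairs
    (1, 10, 3),                           # repeats the pair (1, 10)
    (3, 12, 1), (3, 12, 1), (3, 12, 4),   # the pair (3, 12) appears three times
])

# Group on both key columns together; any group with more than one row is a violation.
violations = conn.execute("""
    SELECT order_id, product_id, COUNT(*) AS occurrences
    FROM temp_orditem
    GROUP BY order_id, product_id
    HAVING COUNT(*) > 1
    ORDER BY order_id, product_id
""").fetchall()
print(violations)
```

An empty result means the combination of order_id and product_id is unique. Order 1 and product 10 each appear several times on their own, but only repeated pairs are reported.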