Distinct Values in SQL Query - Advanced

Distinct Values in SQL Query - Advanced - sql

I have searched high and low and have tried for hours to manipulate the various other queries that seemed to fit but I've had no joy.
I have several Tables in Microsoft SQL Server 2005 that I'm trying to join, an example of which is:
Company Table (Comp_CompanyId, Comp_Name)
GroupCode_Link Table (gcl_c_groupcodelinkid, gcl_c_groupcodeid, gcl_c_companyid)
GroupCode Table (grp_c_groupcodeid, grp_c_groupcode, grp_c_name)
ItemCode Table (itm_c_itemcodeid, itm_c_name, itm_c_itemcode, itm_c_group)
ItemCode_Link Table (icl_c_itemcodelinkid, icl_c_companyid, icl_c_groupcodeid, icl_c_itemcodeid)
I'm using Link Tables to associate a Group to a Company, and an Item to a Group, so a Company could have multiple groups, with multiple items in each group.
Now, I'm trying to create an Advanced Find Function that will allow a user to enter, for example, an Item Code and the result should display those companies that have that item, sounds nice and simple!
However, I haven't done something right, if I use the following query ' if the company has this item OR this item, display it's name', I get the company appearing twice in the result set, once for each item.
What I need is to be able to say is:
"Show me a list of companies that have these items (displaying each company only once!)"
I've had a go at using COUNT, DISTINCT and HAVING but have failed on each as my query knowledge isn't up to it!

First, from your description it sounds like you might have a problem with your E-R (entity-relationship) model. Your description tells me that your E-R model looks something like this:
Associative entities (CompanyGroup, GroupItem) exist to implement many-to-many relationships (since many-to-many isn't supported directly by relational databases).
Nothing wrong with that if a group can exist within multiple companies or an item across multiple groups. It would seem more likely that, at least, each group is specific to a company (I can see items existing across multiple companies and/or groups: more than one company retails, for instance, Cuisinart food processors). If that is the case, a better E-R model would be to make each group a dependent entity with a CompanyID that is a component of its primary key. It's a dependent entity because the group doesn't have an independent existence: it's created by/on behalf of and exists for its parent company. If the company goes away, the group(s) tied to it go away. No your E-R model looks like this:
From that, we can write the query you need:
select *
from Company c
where exists ( select *
from GroupItem gi
where gi.ItemID in ( desired-itemid-1 , ... , desired-itemid-n )
and gi.CompanyID = c.CompanyID
)
As you can see, dependent entities are a powerful thing. Because of the key propagation, queries tend to get simpler. With the original data model, the query would be somewhat more complex:
select *
from Company c
where exists ( select *
from CompanyGroup cg
join GroupItem gi on gi.GroupId = cg.GroupID
where gi.ItemID in ( desired-itemid-1 , ... , desired-itemid-n )
and cg.CompanyID = c.CompanyID
)
Cheers!

SELECT *
FROM company c
WHERE (
SELECT COUNT(DISTINCT icl_c_itemcodeid)
FROM GroupCode_Link gl
JOIN ItemCode_Link il
ON il.icl_c_groupcodeid = gcl_c_groupcodeid
WHERE gl.gcl_c_companyid = c.Comp_CompanyId
AND icl_c_companyid = c.Comp_CompanyId
AND icl_c_itemcodeid IN (#Item1, #Item2)
) >= 2
Replace >= 2 with >= 1 if you want "any item" instead of "all items".

If you need to show companies that have item1 AND item2, you can use Quassnoi's answer.
If you need to show companies that have item1 OR item2, then you can use this:
SELECT
*
FROM
company
WHERE EXISTS
(
SELECT
icl_c_itemcodeid
FROM
GroupCode_Link
INNER JOIN
ItemCode_Link
ON icl_c_groupcodeid = gcl_c_groupcodeid
AND icl_c_itemcodeid IN (#item1, #item2)
WHERE
gcl_c_companyid = company.Comp_CompanyId
AND
icl_c_companyid = company.Comp_CompanyId
)

I would write something like the code below:
SELECT
c.Comp_Name
FROM
Company AS c
WHERE
EXISTS (
SELECT
1
FROM
GroupCode_Link AS gcl
JOIN
ItemCode_Link AS icl
ON
gcl.gcl_c_groupcodeid = icl.icl_c_groupcodeid
JOIN
ItemCode AS itm
ON
icl.icl_c_itemcodeid = itm.itm_c_itemcodeid
WHERE
c.Comp_CompanyId = gcl.gcl_c_companyid
AND
itm.itm_c_itemcode IN (...) /* here provide list of one or more Item Codes to look for */
);
but I see there's a icl_c_companyid column in the ItemCode_Link so using GroupCode_Link table is not necessary?

Related

How do I write an SQL query whose where clause is contained in another query?

I have two tables, one for Customer and one for Item.
In Customer, I have a column called "preference", which stores a list of hard criteria expressed as a WHERE clause in SQL e.g. "item.price<20 and item.category='household'".
I'd like a query that works like this:
SELECT * FROM item WHERE interpret('SELECT preference FROM customer WHERE id = 1')
Which gets translated to this:
SELECT * FROM item WHERE item.price < 20 and item.category = 'household'
Example data model:
CREATE TABLE customer (
cust_id INT
preference VARCHAR
);
CREATE TABLE item (
item_id INT
price DECIMAL(19,4)
category VARCHAR
);
# Additional columns omitted for brevity
I've looked up casting and dynamic SQL but I haven't been able to figure out how I should do this.
I'm using PostgreSQL 9.5.1

I'm going to assume that preference is the same as my made up item_id column. You may need to tweak it to fit your case. For future questions like this it may pay to give us the table structures you are working with, it really helps us out!
What you are asking for is a subquery:
select *
from item
where item_id in (select
preference
from
customer
where id = 1)
What I would suggest though is a join:
select item.*
from item
join customer on item.item_id = customer.preference
where item.price<20 and
item.category='household'
customer.id = 1

I decided to change my schema instead, as it was getting pretty messy to store the criteria in preferences in that manner.
I restricted the kinds of preferences that could be specified, then stored them as columns in Customer.
After that, all the queries I wanted could be expressed as joins.

Querying records that meet muliple criteria

Hi I’m trying to write a query and I’m struggling to figure out how to go about it.
I have a suppliers table and a supplier parts table I want to write a query that lists suppliers that have specified related Parts in the supplier parts table. If a supplier doesn’t have all specified related parts then they should not be listed.
At the moment I have written a very basic query that lists the supplier if they have a related supplier part that meets the criteria.
SELECT id ,name
FROM
efacdb.dbo.suppliers INNER JOIN [efacdb].[dbo].[spmatrix] ON
id = spmsupp
WHERE spmpart
IN ('ALUM_5083', 'ALUM_6082')
I only want to show the supplier if they have both parts related. Does anyone know how I could do this?

Use a subquery with counting distinct occurences:
select * from suppliers s
where 2 = (select count(distinct spmpart) from spmatrix
where id = spmsupp and spmpart in ('ALUM_5083', 'ALUM_6082'))

As a note, you can modify your query to get what you want, just by using an aggregation:
SELECT id, name
FROM efacdb.dbo.suppliers INNER JOIN
[efacdb].[dbo].[spmatrix]
ON id = spmsupp
WHERE spmpart IN ('ALUM_5083', 'ALUM_6082')
GROUP BY id, name
HAVING MIN(spmpart) <> MAX(spmpart);
If you know there are no duplicates, then having count(*) = 2 also solves the problem.

Select all in table where a column in table 1 exists in another table and a second column equals a variable in second table

I know the title is confusing but its the best I could explain it. Basically im developing a cinema listings website for a company which owns two cinemas. So I have a database which has the two tables "Films" and "Listings" with data for both cinemas in them.
I'm trying to select all films and their data for one cinema if the films name shows up in the listings (since the two cinemas share all films but in the table but the may not have the same films showing)
Here is what i have come up with but I run into a problem as when the "SELECT DISTINCT" returns more than one result it obviously cant be matched with the FilmName on tbl Films.
How can i check this value for all FilmNames on tblFilms?
SELECT *
FROM tblFilms
WHERE FilmName = (SELECT DISTINCT FilmName FROM tblListings WHERE Cimema = 1)

use IN if the subquery return multiple values,
SELECT *
FROM tblFILMS
WHERE FilmName IN (SELECT DISTINCT FilmName FROM tblListings WHERE Cimema = 1)
Another way to solve thius is by using JOIN (which I recommend)
SELECT DISTINCT a.*
FROM tblFILMS a
INNER JOIN tblListings b
ON a.FilmName = b.FilmName AND
b.Cimema = 1
for faster query execution, add an INDEX on FilmName on both tables.

If you have your schemas for the tables, that would help.
That said, I believe what you want to look at is the JOIN keyword. (inner/outer/left/etc). That's exactly what JOIN is meant to do (ie your title).

MS Access Distinct Records in Recordset

So, I once again seem to have an issue with MS Access being finicky, although it seems to also be an issue when trying similar queries in SSMS (SQL Server Management Studio).
I have a collection of tables, loosely defined as follows:
table widget_mfg { id (int), name (nvarchar) }
table widget { id (int), name (nvarchar), mfg_id (int) }
table widget_component { id (int), name (nvarchar), widget_id (int), component_id }
table component { id (int), name (nvarchar), ... } -- There are ~25 columns in this table
What I'd like to do is query the database and get a list of all components that a specific manufacturer uses. I've tried some of these queries:
SELECT c.*, wc.widget_id, w.mfg_id
FROM ((widget_component wc INNER JOIN widget w ON wc.widget_id = w.id)
INNER JOIN widget_manufacturer wm on w.mfg_id = wm.id)
INNER JOIN component c on c.id = wc.component_id
WHERE wm.id = 1
The previous example displays duplicates of any part that is contained in multiple widget_component lists for different widgets.
I've also tried doing:
SELECT DISTINCT c.id, c.name, wc.widget_id, w.mfg_id
FROM component c, widget_component wc, widget w, widget_manufacturer wm
WHERE wm.id=w.mfg_id AND wm.id = 1
This doesn't display anything at all. I was reading about sub-queries, but I do not understand how they work or how they would apply to my current application.
Any assistance in this would be beneficial.
As an aside, I am not very good with either MS Access or SQL in general. I know the basics, but not a lot beyond that.
Edit:
I just tried this code, and it works to get all the component.id's while limiting them to a single entry each. How do I go about using the results of this to get a list of all the rest of the component data (component.*) where the id's from the first part are used to select this data?
SELECT DISTINCT c.part_no
FROM component c, widget w, widget_component wc, widget_manufacturer wm
WHERE(((c.id=wc.component_id AND wc.widget_id=w.id AND w.mfg_id=wm.id AND wm.id=1)))
(P.S. this is probably not the best way to do this, but I am still learning SQL.)

What I'd like to do is query the database and get a list of all
components that a specific manufacturer uses
There are several ways to do this. IN is probably the easiest to write
SELECT c.*
FROM component c
WHERE c.id IN (SELECT c.component_id
FROM widget w
INNER JOIN widget_component c
ON w.id = c.widget_id
WHERE w.mfg_id = 123)
The IN sub query finds all the component ids that a specific manufacturer uses. The outer query then selects any component.id that is that result. It doesn't matter if its in there once or 1000 times it will only get the component record once.
The other ways of doing this are using an EXISTS sub query or using a join to the query (but then you do need to de-dup it)

It sounds like your component -to- widget relationship is one-to-many. Hence the duplicates. (i.e., the same component is used by more than one widget).
Your Select is almost OK --
SELECT c.*, wc.widget_id, w.mfg_id
but the wc.widget_id is causing the duplicates (per the assumption above).
So remove wc.widget_id from the SELECT, or else aggregate it (min, max, count, etc.). Removing is easier. If you agregate, remember to add a group by clause.
Try this:
SELECT DISTINCT c.*, w.mfg_id
Also -- FWIW, it's generally a better practice to use field names, instead of the *

Fetch last item in a category that fits specific criteria

Let's assume I have a database with two tables: categories and articles. Every article belongs to a category.
Now, let's assume I want to fetch the latest article of each category that fits a specific criteria (read: the article does). If it weren't for that extra criteria, I could just add a column called last_article_id or something similar to the categories table - even though that wouldn't be properly normalized.
How can I do this though? I assume there's something using GROUP BY and HAVING?

Try with:
SELECT *
FROM categories AS c
LEFT JOIN (SELECT * FROM articles ORDER BY id DESC) AS a
ON c.id = a.id_category
AND /criterias about joining/
WHERE /more criterias/
GROUP BY c.id

If you provide us with the Tables schemas, we could be a little more specific, but you could try something like (12.2.9.6. EXISTS and NOT EXISTS, SELECT Syntax for LIMIT)
SELECT *
FROM articles a
WHERE EXISTS (
SELECT 1
FROM articles
where category_id = a.category_id
AND <YourCriteria Here>
ORDER BY <Order Required : ID DESC, LastDate DESC or something?
LIMIT 1
)

Assuming the id's in the articles table represent always increasing numbers, this should work. Using the id is not semantically correct IMHO, you should actually use a time/date tamp field if one is available.
SELECT * FROM articles WHERE article_id IN
(
SELECT
MAX(article_id)
FROM
articles
WHERE [your filters here]
GROUP BY
category_id
)

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas