SQL with nested joins and sums - sql

Hoping someone can give me a little bit of help with a query that I'm stuck on.
I'm using MS SQL Server 2012.
This is part of a larger query, but for the purposes of my question I'm only concerned with 4 tables: account, user, product, productstats
And a simplified layout of each table is as follows:
account: id, parentaccountID, name
user: id, accountID, email
product: id, accountID
productstats: id, productID, views
So user links to the account table, and the account table can link to itself through the parentaccountID field. The product table links to the account table (via accountID) and the productstats table links to the product table.
The productstats contains statistics on each product. In my example above we have how many times someone has viewed a product.
I want to get the sum of all product views under each parent account, including its child accounts. However, when people search for an account they can search either via account.name or user.email,
so if they search by user.email, I want to include all products from that user's account, and any child or parent account(s) that it's part of.
One note: the parent/child account structure is only 1 level deep, meaning an account is either a parent or a child, never both. Parent accounts have a null value for ParentAccountID.
SELECT a2.ParentAccountID, a.id, a.Name, SUM(ps.PageViews)
FROM account a
LEFT JOIN account a2 ON a.id = a2.ParentAccountID
LEFT JOIN product p ON a.id = p.AccountID OR p.AccountID = a2.ID
LEFT JOIN ProductStatistic ps ON p.id = ps.ProductID
WHERE a.Name LIKE 'test'
GROUP BY a.id, a2.ParentAccountID, a.Name
That's a simplified version of the query - I haven't even included the user table yet since i haven't gotten it working this far yet.
The values I get back on that query are:
ParentAccountID =4, ID =4, name=test, sum=1617
When I run the following query
SELECT SUM(pageviews) FROM ProductStatistic WHERE ProductID IN (
SELECT id FROM product WHERE AccountID IN (4, 32, 112, 3757, 3794))
I get 453 back as the result. Those account IDs are the parent account ID and its 4 child accounts. I have no idea how it's getting 1617, since that's not even a multiple of 453.

When you break up your query into some smaller parts, it will become a lot clearer.
First obtain the accounts involved
Then determine the relevant products
Only then join in the stats table to obtain the view counts
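To make those three steps concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than SQL Server, with made-up data (one parent, two children) and the table and column names from the question's query. It first reproduces the fan-out in the original join, where the parent row repeats once per child so the parent's own products are counted once per child, and then computes the total step by step. The CTE name `family` and the `root_id` column are illustrative assumptions, not part of the original schema. As a side note, the reported numbers are consistent with this explanation: with 4 children, 4P + C = 1617 and P + C = 453 give P = 388 views on the parent's products and C = 65 on the children's.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account (id INTEGER PRIMARY KEY, ParentAccountID INTEGER, Name TEXT);
CREATE TABLE product (id INTEGER PRIMARY KEY, AccountID INTEGER);
CREATE TABLE ProductStatistic (id INTEGER PRIMARY KEY, ProductID INTEGER, PageViews INTEGER);
-- Made-up data: parent account 4 with two children, 32 and 112.
INSERT INTO account VALUES (4, NULL, 'test'), (32, 4, 'child a'), (112, 4, 'child b');
INSERT INTO product VALUES (1, 4), (2, 32), (3, 112);
INSERT INTO ProductStatistic VALUES (1, 1, 100), (2, 2, 10), (3, 3, 5);
""")

# Original join shape: the parent row is repeated once per child, and the OR
# condition attaches the parent's products to every one of those rows.
fan_out = conn.execute("""
SELECT SUM(ps.PageViews)
FROM account a
LEFT JOIN account a2 ON a.id = a2.ParentAccountID
LEFT JOIN product p ON a.id = p.AccountID OR p.AccountID = a2.id
LEFT JOIN ProductStatistic ps ON p.id = ps.ProductID
WHERE a.Name LIKE 'test'
""").fetchone()[0]

# Step by step: 1) map every account to its root, 2) pick up the products of
# each account in the family, 3) only then join the statistics.
fixed = conn.execute("""
WITH family AS (
    SELECT id AS account_id, COALESCE(ParentAccountID, id) AS root_id
    FROM account
)
SELECT r.id, r.Name, SUM(ps.PageViews)
FROM account r
JOIN family f ON f.root_id = r.id
LEFT JOIN product p ON p.AccountID = f.account_id
LEFT JOIN ProductStatistic ps ON ps.ProductID = p.id
WHERE r.Name LIKE 'test'
GROUP BY r.id, r.Name
""").fetchone()

print(fan_out, fixed)
```

With this toy data the first query counts the parent's 100 views twice, while the second counts every product exactly once.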
Have a look at this sql fiddle.
[EDIT]
Added a new fiddle that addresses your comments. Not so simple anymore, but I think it does what you need.

Insert columns from two tables to a new table in PostgreSQL

I am building an application to manage an inventory and I have a problem when creating my tables for the database (I am using PostgreSQL). My problem is the following:
I have two tables, one called 'products' and one called 'users'. Each one with its columns (See image). I want to create a third table called 'product_act_register' , which will keep a record of activity of the products and has with it the columns id, activity_type, quantity, date. But, I want to add other columns which are taken from the table 'users' and 'products'.
It should look like this (Image)
Where product_id, product_name, product_category, product_unit are taken from the table 'products' and the column 'user_id' is taken from the table 'users'.
How can I do this with PostgreSQL ?
Your description and your precise goals are very unclear. You didn't tell us what should happen in the different cases (only a product exists, only a user exists, neither exists, or both exist). You also didn't tell us how the other columns not coming from these tables should be filled. Furthermore, you didn't tell us how the tables users and products depend on each other. Basically, you can do something like this if you only want to do an insert when both tables have an entry:
INSERT INTO product_acts_register
SELECT 1,'ActivityType1',p.id, p.name, p.category,
p.unit, u.id, 100, CURRENT_DATE
FROM products p JOIN users u ON p.id = u.id;
(Since you didn't tell us how or if to join them, I assumed to join on their id column)
If you don't care about this, but want to insert an entry for any possible combination of users and products, you can just select both tables without joining:
INSERT INTO product_acts_register
SELECT 1,'ActivityType1',p.id, p.name, p.category,
p.unit, u.id, 100, CURRENT_DATE
FROM products p, users u;
You can replicate this here and try out other commands: db<>fiddle
Please be more precise and give us more information when asking the next question.
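As a small illustration of the two inserts above, here is a sketch that runs them against SQLite through Python's sqlite3 module instead of PostgreSQL. The column layout of product_acts_register, the `activity_date` column name and the sample rows are assumptions made for the example, since the real table definitions were only shown as images. Unlike the statements above, it names the target columns explicitly and leaves out the id so that the database can generate it, which avoids inserting the same hard-coded id for every row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, category TEXT, unit TEXT);
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
-- Assumed layout; the id is generated by the database.
CREATE TABLE product_acts_register (
    id INTEGER PRIMARY KEY, activity_type TEXT,
    product_id INTEGER, product_name TEXT, product_category TEXT,
    product_unit TEXT, user_id INTEGER, quantity INTEGER, activity_date TEXT);
INSERT INTO products VALUES (1, 'bolt', 'hardware', 'box'), (2, 'nut', 'hardware', 'box');
INSERT INTO users VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
""")

insert_sql = """
INSERT INTO product_acts_register
    (activity_type, product_id, product_name, product_category,
     product_unit, user_id, quantity, activity_date)
SELECT 'ActivityType1', p.id, p.name, p.category, p.unit, u.id, 100, CURRENT_DATE
FROM products p {join} users u {condition}
"""

def rows_inserted(join, condition):
    """Run the insert with the given join and return how many rows it created."""
    conn.execute("DELETE FROM product_acts_register")
    conn.execute(insert_sql.format(join=join, condition=condition))
    return conn.execute("SELECT COUNT(*) FROM product_acts_register").fetchone()[0]

joined = rows_inserted("JOIN", "ON p.id = u.id")   # only matching ids
crossed = rows_inserted("CROSS JOIN", "")          # every product with every user
print(joined, crossed)
```

With 2 products and 3 users, the joined variant inserts 2 rows and the unjoined variant inserts 6, one per combination.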

Combining table information

I have a simple database with three tables:
contributes
payment
user
Whereby contributes is a relationship table between the two user and payment tables. My problem is that when executing an SQL statement to retrieve relationship properties - such as the 'paid' value - and thus include the contributes table in the statement, the results from the query seem to be returned twice. For example, SELECT * FROM user, payment, contributes; produces:
Whereas SELECT * FROM user, payment; produces:
My only guess is that the SELECT statement is simply combining EVERY row of users with EVERY row of payments with EVERY row of contributes, much like a power set?
Forgive me if I'm missing anything obvious, any help would be much appreciated. Also, apologies for the weird table name formatting in the images, that's just how phpMyAdmin exported them!
SELECT u.id, u.email, u.first_name, u.last_name, c.host, c.paid, p.name, p.total, p.portion
FROM user u
INNER JOIN contributes c
ON u.id = c.user_id
INNER JOIN payment p
ON c.payment_id = p.id
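The guess in the question is essentially right: listing tables separated by commas without a join condition produces the Cartesian product (every combination of rows), not a power set. A minimal sketch with SQLite via Python's sqlite3 module shows the difference in row counts; the sample rows and the reduced column set are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user (id INTEGER PRIMARY KEY, email TEXT);
CREATE TABLE payment (id INTEGER PRIMARY KEY, name TEXT, total REAL);
CREATE TABLE contributes (user_id INTEGER, payment_id INTEGER, paid INTEGER);
INSERT INTO user VALUES (1, 'a@example.com'), (2, 'b@example.com');
INSERT INTO payment VALUES (1, 'rent', 900.0), (2, 'food', 120.0);
INSERT INTO contributes VALUES (1, 1, 1), (2, 1, 0), (2, 2, 1);
""")

# No join condition: 2 users x 2 payments x 3 contributes rows.
cross_rows = conn.execute("SELECT * FROM user, payment, contributes").fetchall()

# Join through the relationship table: one row per contributes entry.
joined_rows = conn.execute("""
SELECT u.id, u.email, c.paid, p.name, p.total
FROM user u
INNER JOIN contributes c ON u.id = c.user_id
INNER JOIN payment p ON c.payment_id = p.id
""").fetchall()

print(len(cross_rows), len(joined_rows))
```

The unconditioned query returns 12 rows, while the joined query returns the 3 real relationships.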

Relational division - SQL

I have 4 tables.
Owner(owner_id, name)
House(code, owner_id, price)
Buyer(buyer_id, name)
Bought(buyer_id, code, price_bought, date_bought)
I have the following query:
List the names of the buyers that bought all the houses from some owner?
I know how to find if someone bought all the houses from a particular owner (say owner with id = 1):
SELECT name
FROM buyer
WHERE NOT EXISTS (SELECT code
FROM house
WHERE owner_id = 1
AND code NOT IN (SELECT code
FROM bought
WHERE bought.buyer_id= buyer.buyer_id))
How can I make this work for all owners?
The sentence "List the names of the buyers that bought all the houses from some owner?" can be interpreted in two ways: (1) all the houses the buyer bought are from one owner, or (2) all the houses sold by one owner went to the same buyer.
The following answers (1):
select b.buyer_id
from bought b join
house h
on b.code = h.code
group by b.buyer_id
having min(h.owner_id) = max(h.owner_id);
The answer to the second question is similar. However, the focus is on owners rather than buyers.
select min(b.buyer_id)
from bought b join
house h
on b.code = h.code
group by h.owner_id
having min(b.buyer_id) = max(b.buyer_id);
EDIT:
In both cases, the logic is quite similar, but let's look at the second query. The join is just combining the buyer and owner ids together (not really interesting).
The group by is creating a single row for each owner_id. The having clause then adds the condition that the query only returns the owner id when the minimum buyer and the maximum buyer are the same -- meaning there is only one value. You can also express this condition as count(distinct buyer_id) = 1, but min() and max() generally perform a bit better than count(distinct).
The select clause then returns those buyers. You could also include the owner to see whose house(s) they bought.
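There is also a third, stricter reading: a buyer qualifies only if they bought every house that some owner has in the House table, including houses nobody else bought. One way to get that is to take the NOT EXISTS query from the question and replace the fixed owner_id = 1 with a correlation to the Owner table. The following is only a sketch of that idea, run on SQLite through Python's sqlite3 module with made-up rows; the extra EXISTS guard is an assumption on my part, added so that an owner with no houses does not make every buyer qualify vacuously.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE owner (owner_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE house (code TEXT PRIMARY KEY, owner_id INTEGER, price INTEGER);
CREATE TABLE buyer (buyer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE bought (buyer_id INTEGER, code TEXT, price_bought INTEGER, date_bought TEXT);
-- Made-up data: Otto owns no houses, nobody bought house C.
INSERT INTO owner VALUES (1, 'Olga'), (2, 'Omar'), (3, 'Otto');
INSERT INTO house VALUES ('A', 1, 100), ('B', 1, 120), ('C', 2, 90);
INSERT INTO buyer VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO bought VALUES (1, 'A', 100, '2020-01-01'), (1, 'B', 120, '2020-02-01'),
                          (2, 'A', 105, '2021-01-01');
""")

# For each (buyer, owner) pair: the owner has at least one house, and there is
# no house of that owner which the buyer did not buy.
names = [row[0] for row in conn.execute("""
SELECT DISTINCT b.name
FROM buyer b
CROSS JOIN owner o
WHERE EXISTS (SELECT 1 FROM house h WHERE h.owner_id = o.owner_id)
  AND NOT EXISTS (
        SELECT 1 FROM house h
        WHERE h.owner_id = o.owner_id
          AND NOT EXISTS (SELECT 1 FROM bought x
                          WHERE x.buyer_id = b.buyer_id AND x.code = h.code))
ORDER BY b.name
""")]

print(names)
```

Here only Ann bought both of Olga's houses, so she is the only buyer returned; Bob bought just one of them.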

Issues with subqueries for stored procedure

The query I am trying to perform is
With getusers As
(Select userID from userprofspecinst_v where institutionID IN
(select institutionID, professionID from userprofspecinst_v where userID=@UserID)
and professionID IN
(select institutionID, professionID from userprofspecinst_v where userID=@UserID))
select username from user where userID IN (select userID from getusers)
Here's what I'm trying to do. Given a userID and a view which contains the userID and the ID of their institution and profession, I want to get the list of other userIDs who also have the same institutionID and professionID. Then, with that list of userIDs, I want to get the usernames that correspond to each userID from another table (user). The error I am getting when I try to create the procedure is, "Only one expression can be specified in the select list when the subquery is not introduced with EXISTS.". Am I taking the correct approach to building this query?
The following query should do what you want to do:
SELECT u.username
FROM user AS u
INNER JOIN userprofspecinst_v AS up ON u.userID = up.userID
INNER JOIN (SELECT institutionID, professionID FROM userprofspecinst_v
WHERE userID = @userID) AS ProInsts
ON (up.institutionID = ProInsts.institutionID
AND up.professionID = ProInsts.professionID)
Effectively, the crucial part is the last INNER JOIN: it creates a derived table of the institution IDs and profession IDs that the user ID belongs to. We then get all matching rows in the view with the same institution ID and profession ID (the ON condition) and link these back to the user table on the corresponding user IDs (the first JOIN).
You can either run this for each user id you are interested in, or JOIN onto the result of a query (your getusers) (it depends on what database engine you are running).
If you aren't familiar with JOINs, Jeff Atwood's introductory post is a good starting place.
The JOIN statement effectively allows you to exploit the logical links between your tables: the userID, institutionID and professionID are all examples of candidates for foreign keys. So, rather than having to constantly subquery each table and piece the results together, you can link all the tables together and filter down to the rows you want. It's usually a cleaner, more maintainable approach (although that is opinion).
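To see the join-based approach run end to end, here is a small sketch on SQLite through Python's sqlite3 module. The view is stood in for by a plain table, the sample users are invented, and the named parameter `:user_id` plays the role of the stored procedure's parameter. It matches institutionID and professionID as a pair, and it also returns the user you started from, so add a filter on userID if that is not wanted. The original error itself comes from the IN subqueries selecting two columns; an IN subquery may only select one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user (userID INTEGER PRIMARY KEY, username TEXT);
-- Stand-in for the view: one row per user/institution/profession combination.
CREATE TABLE userprofspecinst_v (userID INTEGER, institutionID INTEGER, professionID INTEGER);
INSERT INTO user VALUES (1, 'alice'), (2, 'bob'), (3, 'carol'), (4, 'dave');
INSERT INTO userprofspecinst_v VALUES
    (1, 10, 5),   -- alice: institution 10, profession 5
    (2, 10, 5),   -- bob: same pair as alice
    (3, 10, 6),   -- carol: same institution, different profession
    (4, 11, 5);   -- dave: same profession, different institution
""")

usernames = [row[0] for row in conn.execute("""
SELECT DISTINCT u.username
FROM user AS u
INNER JOIN userprofspecinst_v AS up ON u.userID = up.userID
INNER JOIN (SELECT institutionID, professionID
            FROM userprofspecinst_v
            WHERE userID = :user_id) AS ProInsts
        ON up.institutionID = ProInsts.institutionID
       AND up.professionID = ProInsts.professionID
ORDER BY u.username
""", {"user_id": 1})]

print(usernames)
```

Only alice and bob share both the institution and the profession, so carol and dave are filtered out.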

Distinct Values in SQL Query - Advanced

I have searched high and low and have tried for hours to manipulate the various other queries that seemed to fit but I've had no joy.
I have several Tables in Microsoft SQL Server 2005 that I'm trying to join, an example of which is:
Company Table (Comp_CompanyId, Comp_Name)
GroupCode_Link Table (gcl_c_groupcodelinkid, gcl_c_groupcodeid, gcl_c_companyid)
GroupCode Table (grp_c_groupcodeid, grp_c_groupcode, grp_c_name)
ItemCode Table (itm_c_itemcodeid, itm_c_name, itm_c_itemcode, itm_c_group)
ItemCode_Link Table (icl_c_itemcodelinkid, icl_c_companyid, icl_c_groupcodeid, icl_c_itemcodeid)
I'm using Link Tables to associate a Group to a Company, and an Item to a Group, so a Company could have multiple groups, with multiple items in each group.
Now, I'm trying to create an Advanced Find Function that will allow a user to enter, for example, an Item Code and the result should display those companies that have that item, sounds nice and simple!
However, I haven't done something right. If I use a query along the lines of 'if the company has this item OR this item, display its name', I get the company appearing twice in the result set, once for each item.
What I need to be able to say is:
"Show me a list of companies that have these items (displaying each company only once!)"
I've had a go at using COUNT, DISTINCT and HAVING but have failed on each as my query knowledge isn't up to it!
First, from your description it sounds like you might have a problem with your E-R (entity-relationship) model. Your description tells me that your E-R model looks something like this:
Associative entities (CompanyGroup, GroupItem) exist to implement many-to-many relationships (since many-to-many isn't supported directly by relational databases).
Nothing wrong with that if a group can exist within multiple companies or an item across multiple groups. It would seem more likely that, at least, each group is specific to a company (I can see items existing across multiple companies and/or groups: more than one company retails, for instance, Cuisinart food processors). If that is the case, a better E-R model would be to make each group a dependent entity with a CompanyID that is a component of its primary key. It's a dependent entity because the group doesn't have an independent existence: it's created by/on behalf of and exists for its parent company. If the company goes away, the group(s) tied to it go away. Now your E-R model looks like this:
From that, we can write the query you need:
select *
from Company c
where exists ( select *
from GroupItem gi
where gi.ItemID in ( desired-itemid-1 , ... , desired-itemid-n )
and gi.CompanyID = c.CompanyID
)
As you can see, dependent entities are a powerful thing. Because of the key propagation, queries tend to get simpler. With the original data model, the query would be somewhat more complex:
select *
from Company c
where exists ( select *
from CompanyGroup cg
join GroupItem gi on gi.GroupId = cg.GroupID
where gi.ItemID in ( desired-itemid-1 , ... , desired-itemid-n )
and cg.CompanyID = c.CompanyID
)
Cheers!
SELECT *
FROM company c
WHERE (
SELECT COUNT(DISTINCT icl_c_itemcodeid)
FROM GroupCode_Link gl
JOIN ItemCode_Link il
ON il.icl_c_groupcodeid = gcl_c_groupcodeid
WHERE gl.gcl_c_companyid = c.Comp_CompanyId
AND icl_c_companyid = c.Comp_CompanyId
AND icl_c_itemcodeid IN (@Item1, @Item2)
) >= 2
Replace >= 2 with >= 1 if you want "any item" instead of "all items".
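A reduced sketch of the difference between a plain join, "any item" and "all items", using SQLite through Python's sqlite3 module. It assumes, for brevity, that ItemCode_Link alone is enough to tie an item to a company (through icl_c_companyid) and leaves out the group tables; the company names and item ids are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Company (Comp_CompanyId INTEGER PRIMARY KEY, Comp_Name TEXT);
CREATE TABLE ItemCode_Link (icl_c_companyid INTEGER, icl_c_itemcodeid INTEGER);
INSERT INTO Company VALUES (1, 'Acme'), (2, 'Bolt'), (3, 'Cog');
-- Acme has items 7 and 8, Bolt has only 7, Cog has 9.
INSERT INTO ItemCode_Link VALUES (1, 7), (1, 8), (2, 7), (3, 9);
""")

# A plain join returns one row per matching item, so Acme shows up twice.
joined = [r[0] for r in conn.execute("""
SELECT c.Comp_Name
FROM Company c
JOIN ItemCode_Link il ON il.icl_c_companyid = c.Comp_CompanyId
WHERE il.icl_c_itemcodeid IN (7, 8)
ORDER BY c.Comp_Name
""")]

count_sql = """
SELECT c.Comp_Name
FROM Company c
WHERE (SELECT COUNT(DISTINCT il.icl_c_itemcodeid)
       FROM ItemCode_Link il
       WHERE il.icl_c_companyid = c.Comp_CompanyId
         AND il.icl_c_itemcodeid IN (7, 8)) >= ?
ORDER BY c.Comp_Name
"""
all_items = [r[0] for r in conn.execute(count_sql, (2,))]  # has both 7 and 8
any_item = [r[0] for r in conn.execute(count_sql, (1,))]   # has 7 or 8, listed once

print(joined, all_items, any_item)
```

Because the counting happens inside a correlated subquery, each company appears at most once in the outer result, whichever threshold is used.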
If you need to show companies that have item1 AND item2, you can use Quassnoi's answer.
If you need to show companies that have item1 OR item2, then you can use this:
SELECT
*
FROM
company
WHERE EXISTS
(
SELECT
icl_c_itemcodeid
FROM
GroupCode_Link
INNER JOIN
ItemCode_Link
ON icl_c_groupcodeid = gcl_c_groupcodeid
AND icl_c_itemcodeid IN (@item1, @item2)
WHERE
gcl_c_companyid = company.Comp_CompanyId
AND
icl_c_companyid = company.Comp_CompanyId
)
I would write something like the code below:
SELECT
c.Comp_Name
FROM
Company AS c
WHERE
EXISTS (
SELECT
1
FROM
GroupCode_Link AS gcl
JOIN
ItemCode_Link AS icl
ON
gcl.gcl_c_groupcodeid = icl.icl_c_groupcodeid
JOIN
ItemCode AS itm
ON
icl.icl_c_itemcodeid = itm.itm_c_itemcodeid
WHERE
c.Comp_CompanyId = gcl.gcl_c_companyid
AND
itm.itm_c_itemcode IN (...) /* here provide list of one or more Item Codes to look for */
);
but I see there's an icl_c_companyid column in the ItemCode_Link table, so perhaps using the GroupCode_Link table is not necessary?