can't join table on request - sql

I'm having issues on table joins in a sql query. I've looked it up on SO but nothing matched my problem.
Here are the two tables:
Charge
| tache_id (int) | semaine_id (int) | login (varchar) | charge (float) |
Tache
| tache_id (int) | type_id (int) | charge (float) |
Basically, every task (Tache) has a charge attribute which represents the amount of work (in days) necessary to complete the task.
The user can then plan it on several weeks (Charge).
What I want to do is display all the tasks that haven't been completely planned, and then display (in another window) those that have.
I thought of doing something like this:
select Tache.tache_id, Tache.charge
left join Charge on Charge.tache_id = Tache.tache_id
where sum(Charge.charge) < Tache.charge
group by Tache.tache_id
But I get an 'invalid use of group function' error.

Something along these lines, if Tache.tache_id is unique:
select Tache.tache_id, max(Tache.charge)
from Tache
left join Charge on Charge.tache_id = Tache.tache_id
group by Tache.tache_id
having sum(Charge.charge) < max(Tache.charge)

What you want is a HAVING clause: you can't use aggregate functions in the WHERE clause, because WHERE filters rows before they are grouped.
select Tache.tache_id, max(Tache.charge)
from Tache
left join Charge on Charge.tache_id = Tache.tache_id
group by Tache.tache_id
having sum(Charge.charge) < max(Tache.charge)
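One caveat with the LEFT JOIN version: a task with no rows in Charge gets SUM(...) = NULL, and NULL < x is never true, so a task that hasn't been planned at all would be filtered out. Here is a minimal sketch of the same query with COALESCE added, run through Python's built-in sqlite3 module so it can be tried without a server. The sample rows and logins are made up for illustration; the original question was about MySQL, where the same HAVING clause should behave the same way.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Tache  (tache_id INTEGER PRIMARY KEY, type_id INTEGER, charge REAL);
    CREATE TABLE Charge (tache_id INTEGER, semaine_id INTEGER, login TEXT, charge REAL);
    -- Made-up rows: task 1 is fully planned, task 2 partly, task 3 not at all.
    INSERT INTO Tache  VALUES (1, 1, 5.0), (2, 1, 4.0), (3, 2, 2.0);
    INSERT INTO Charge VALUES (1, 10, 'ann', 3.0), (1, 11, 'ann', 2.0), (2, 10, 'bob', 1.5);
""")

# Aggregates are compared in HAVING (after grouping), not in WHERE.
# COALESCE turns the NULL sum of an unplanned task into 0.
unplanned = conn.execute("""
    SELECT Tache.tache_id, MAX(Tache.charge) AS charge
    FROM Tache
    LEFT JOIN Charge ON Charge.tache_id = Tache.tache_id
    GROUP BY Tache.tache_id
    HAVING COALESCE(SUM(Charge.charge), 0) < MAX(Tache.charge)
    ORDER BY Tache.tache_id
""").fetchall()

print(unplanned)  # tasks 2 and 3 are not completely planned
```

For the "completely planned" window, flipping the comparison to >= should be enough.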

Related

How to create two JOIN-tables so that I can compare attributes within?

I take a Database course in which we have listings of AirBnBs and need to be able to do some SQL queries in the Relationship-Model we made from the data, but I struggle with one in particular :
I have two tables that we are interested in, Billing and Amenities. The first one has the id and price of listings, the second has id and wifi (let's say, to simplify, that it equals 1 if there is Wifi, 0 otherwise). Both have other attributes that we don't really care about here.
So the query is, "What is the difference in the average price of listings with and without Wifi ?"
My idea was to build two JOIN-tables, one with listings that have wifi, the other without, and compare them easily:
SELECT avg(B.price - A.price) as averagePrice
FROM (
SELECT Billing.price, Billing.id
FROM Billing
INNER JOIN Amenities
ON Billing.id = Amenities.id
WHERE Amenities.wifi = 0
) A, (
SELECT Billing.price, Billing.id
FROM Billing
INNER JOIN Amenities
ON Billing.id = Amenities.id
WHERE Amenities.wifi = 1) B
WHERE A.id = B.id;
Obviously this doesn't work. I am pretty sure there is a far easier solution, though; what am I missing?
(And by the way, is there a way to compute the absolute value of the price difference?)
I hope that I was clear enough, thank you for your time!
Edit: As mentioned in the comments, I forgot to say that both tables have id as their primary key, so there is one row per listing.
Just use conditional aggregation:
SELECT AVG(CASE WHEN a.wifi = 0 THEN b.price END) as avg_no_wifi,
AVG(CASE WHEN a.wifi = 1 THEN b.price END) as avg_wifi
FROM Billing b JOIN
Amenities a
ON b.id = a.id
WHERE a.wifi IN (0, 1);
You can subtract one AVG expression from the other (wrapped in ABS() for the absolute value) if you want the difference instead of the two separate values.
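As a rough sketch of the "difference" variant, here is the conditional aggregation with the two averages subtracted and wrapped in ABS(). It runs against an in-memory SQLite database via Python's sqlite3 module; the three sample listings are invented for the example, and only the table and column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Billing   (id INTEGER PRIMARY KEY, price REAL);
    CREATE TABLE Amenities (id INTEGER PRIMARY KEY, wifi INTEGER);
    -- Made-up listings: 1 and 2 have wifi, 3 does not.
    INSERT INTO Billing   VALUES (1, 1500), (2, 1700), (3, 1800);
    INSERT INTO Amenities VALUES (1, 1), (2, 1), (3, 0);
""")

# CASE without ELSE yields NULL, and AVG ignores NULLs, so each AVG
# only sees the listings from "its" group.
(diff,) = conn.execute("""
    SELECT ABS(AVG(CASE WHEN a.wifi = 1 THEN b.price END)
             - AVG(CASE WHEN a.wifi = 0 THEN b.price END)) AS price_diff
    FROM Billing b
    JOIN Amenities a ON b.id = a.id
""").fetchone()

print(diff)
```

If one of the two groups is empty, its AVG is NULL and so is the difference, which is arguably the right answer in that case.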
Let's assume we're working with data like the following (problems with your data model are noted below):
Billing
+------------+---------+
| listing_id | price |
+------------+---------+
| 1 | 1500.00 |
| 2 | 1700.00 |
| 3 | 1800.00 |
| 4 | 1900.00 |
+------------+---------+
Amenities
+------------+------+
| listing_id | wifi |
+------------+------+
| 1 | 1 |
| 2 | 1 |
| 3 | 0 |
+------------+------+
Notice that I changed "id" to "listing_id" to make it clear what it was (using "id" as an attribute name is problematic anyways). Also, note that one listing doesn't have an entry in the Amenities table. Depending on your data, that may or may not be a concern (again, refer to the bottom for a discussion of your data model).
Based on this data, your averages should be as follows:
Listings with wifi average $1600 (Listings 1 and 2)
Listings without wifi (just listing 3) average $1800.
So the difference would be $200.
To achieve this result in SQL, it may be helpful to first get the average cost per amenity (whether wifi is offered). This would be obtained with the following query:
SELECT
Amenities.wifi AS has_wifi,
AVG(Billing.price) AS avg_cost
FROM Billing
INNER JOIN Amenities ON
Amenities.listing_id = Billing.listing_id
GROUP BY Amenities.wifi
which gives you the following results:
+----------+-----------------------+
| has_wifi | avg_cost |
+----------+-----------------------+
| 0 | 1800.0000000000000000 |
| 1 | 1600.0000000000000000 |
+----------+-----------------------+
So far so good. So now we need to calculate the difference between these 2 rows. There are a number of different ways to do this, but one is to use a CASE expression to make one of the values negative, and then simply take the SUM of the result (note that I'm using a CTE, but you can also use a sub-query):
WITH
avg_by_wifi(has_wifi, avg_cost) AS
(
SELECT Amenities.wifi, AVG(Billing.price)
FROM Billing
INNER JOIN Amenities ON
Amenities.listing_id = Billing.listing_id
GROUP BY Amenities.wifi
)
SELECT
ABS(SUM
(
CASE
WHEN has_wifi = 1 THEN avg_cost
ELSE -1 * avg_cost
END
))
FROM avg_by_wifi
which gives us the expected value of 200.
Now regarding your data model:
If both your Billing and Amenities table only have 1 row for each listing, it makes sense to combine them into 1 table. For example: Listings(listing_id, price, wifi)
However, this is still problematic, because you probably have a bunch of other amenities you want to model (pool, sauna, etc.) So you might want to model a many-to-many relationship between listings and amenities using an intermediate table:
Listings(listing_id, price)
Amenities(amenity_id, amenity_name)
ListingsAmenities(listing_id, amenity_id)
This way, you could list multiple amenities for a given listing without having to add additional columns. It also becomes easy to store additional information about an amenity: What's the wifi password? How deep is the pool? etc.
Of course, using this model makes your original query (difference in average cost of listings by wifi) a bit trickier, but definitely still doable.
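To give an idea of what "still doable" could look like, here is one possible sketch against the three-table model, run in SQLite through Python's sqlite3 module. The rows are made up, and the amenity name 'wifi' is an assumption about how the Amenities table would be filled. Each listing is flagged with EXISTS first, then averaged per flag.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Listings (listing_id INTEGER PRIMARY KEY, price REAL);
    CREATE TABLE Amenities (amenity_id INTEGER PRIMARY KEY, amenity_name TEXT);
    CREATE TABLE ListingsAmenities (listing_id INTEGER, amenity_id INTEGER);
    -- Made-up data: listings 1 and 2 have wifi, 3 only a pool, 4 nothing.
    INSERT INTO Listings VALUES (1, 1500), (2, 1700), (3, 1800), (4, 1900);
    INSERT INTO Amenities VALUES (1, 'wifi'), (2, 'pool');
    INSERT INTO ListingsAmenities VALUES (1, 1), (2, 1), (2, 2), (3, 2);
""")

# Flag every listing with 1/0 depending on whether a wifi row exists,
# then average the price per flag.
rows = conn.execute("""
    WITH flagged AS (
        SELECT l.price,
               EXISTS (SELECT 1
                       FROM ListingsAmenities la
                       JOIN Amenities a ON a.amenity_id = la.amenity_id
                       WHERE la.listing_id = l.listing_id
                         AND a.amenity_name = 'wifi') AS has_wifi
        FROM Listings l
    )
    SELECT has_wifi, AVG(price) AS avg_cost
    FROM flagged
    GROUP BY has_wifi
    ORDER BY has_wifi
""").fetchall()

print(rows)
```

Note that in this model a listing with no amenity rows at all (listing 4 here) counts as "no wifi", whereas with the two-table layout it simply dropped out of the inner join.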

SQL Spatial Subquery Issue

Greetings Benevolent Gods of Stackoverflow,
I am presently struggling to get a spatially enabled query to work for a SQL assignment I am working on. The wording is as follows:
SELECT PURCHASES.TotalPrice, STORES.GeoLocation, STORES.StoreName
FROM MuffinShop
join (SELECT SUM(PURCHASES.TotalPrice) AS StoreProfit, STORES.StoreName
FROM PURCHASES INNER JOIN STORES ON PURCHASES.StoreID = STORES.StoreID
GROUP BY STORES.StoreName
HAVING (SUM(PURCHASES.TotalPrice) > 600))
What I am trying to do with this query is perform a function query (like avg, sum etc) and get the spatial information back as well. Another example of this would be:
SELECT STORES.StoreName, AVG(REVIEWS.Rating),Stores.Shape
FROM REVIEWS CROSS JOIN
STORES
GROUP BY STORES.StoreName;
This returns the error message "Column 'STORES.Shape' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause."
I know I require a sub query to perform this task, I am just having endless trouble getting it to work. Any help at all would be wildly appreciated.
There are two parts to this question. I would tackle the first problem with the following logic:
List all the store names and their respective geolocations
Get the profit for each store
With that in mind, you need to use the STORES table as your base, then bolt the profit onto it through a sub query or an apply:
SELECT s.StoreName
,s.GeoLocation
,p.StoreProfit
FROM STORES s
INNER JOIN (
SELECT pu.StoreId
,StoreProfit = SUM(pu.TotalPrice)
FROM PURCHASES pu
GROUP BY pu.StoreID
) p
ON p.StoreID = s.StoreID;
This one may be a little more efficient, depending on the execution plan:
SELECT s.StoreName
,s.GeoLocation
,profit.StoreProfit
FROM STORES s
CROSS APPLY (
SELECT StoreProfit = SUM(p.TotalPrice)
FROM PURCHASES p
WHERE p.StoreID = s.StoreID
GROUP BY p.StoreID
) profit;
Now for the second part, the error that you are receiving tells you that you need to GROUP BY all columns in your select statement with the exception of your aggregate function(s).
In your second example, you are asking SQL to take an average rating for each store based on an ID, but you are also trying to return another column without including that inside the grouping. I will try to show you what you are asking SQL to do and where the issue lies with the following examples:
-- Data
Id | Rating | Shape
1 | 1 | Triangle
1 | 4 | Triangle
1 | 1 | Square
2 | 1 | Triangle
2 | 5 | Triangle
2 | 3 | Square
SQL Server, please give me the average rating for each store:
SELECT Id, AVG(Rating)
FROM Store
GROUP BY Id;
-- Result
Id | Avg(Rating)
1 | 2
2 | 3
SQL Server, please give me the average rating for each store and show its shape in the result (but don't group by it):
SELECT Id, AVG(Rating), Shape
FROM Store
GROUP BY Id;
-- Result
Id | Avg(Rating) | Shape
1 | 2 | Do I show Triangle or Square ...... ERROR!!!!
2 | 3 |
It needs to be told to get the average for each store and shape:
SELECT Id, AVG(Rating), Shape
FROM Store
GROUP BY Id, Shape;
-- Result
Id | Avg(Rating) | Shape
1 | 2.5 | Triangle
1 | 1 | Square
2 | 3 | Triangle
2 | 3 | Square
As in any spatial query, you need an idea of what your final geometry will be. It looks like you are attempting to group by individual stores but deliver an average rating from the subquery. So if I'm reading it right, you are just looking to get the store's shape info associated with the average ratings?
Query the stores table for the shape field and join the query you use to get the average rating
select a.shape,
b.*
from stores a inner join (your Average rating query with group by here) b
on a.StoreID = b.Storeid
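Filled in with a concrete aggregate, the pattern above might look like the sketch below. It uses SQLite through Python's sqlite3 module purely so it can be run anywhere. Two things are assumptions rather than facts from the question: that REVIEWS carries a StoreID column to join on, and the sample rows themselves. SQLite has no geometry type, so Shape is a plain text placeholder here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Stores  (StoreID INTEGER PRIMARY KEY, StoreName TEXT, Shape TEXT);
    CREATE TABLE Reviews (StoreID INTEGER, Rating REAL);
    -- Made-up rows; Shape stands in for the real geometry column.
    INSERT INTO Stores  VALUES (1, 'North', 'POINT(1 1)'), (2, 'South', 'POINT(2 5)');
    INSERT INTO Reviews VALUES (1, 4), (1, 2), (2, 5);
""")

# Aggregate in the subquery (grouped by the key only), then join the
# result back to Stores to pick up the non-aggregated Shape column.
rows = conn.execute("""
    SELECT a.StoreName, a.Shape, b.AvgRating
    FROM Stores a
    INNER JOIN (SELECT StoreID, AVG(Rating) AS AvgRating
                FROM Reviews
                GROUP BY StoreID) b
        ON a.StoreID = b.StoreID
    ORDER BY a.StoreID
""").fetchall()

print(rows)
```

Because the geometry never appears inside the GROUP BY query, the "not contained in either an aggregate function or the GROUP BY clause" error does not come up.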

result repetition in SQL inner join with one to many relationship

I am implementing an application that provides the opening hours of several venues. A simplified version of my DB implementation consists of two tables:
+-----------+ +------------------+
| Venue | | opening_hour |
+-----------+ +------------------+
| venue_id | | opening_hour_id |
| name | | day |
+-----------+ | close_time |
| open_time |
| venue_id |
+------------------+
In this case there is a one-to-many relationship between venue and opening hour.
Now, I would like to retrieve a list of all venues available in the database and their corresponding opening hours. To solve this problem I am using the following code:
SELECT ven.name as name, oh.day as day
FROM venue ven INNER JOIN opening_hour oh
ON oh.venue_id = ven.venue_id
With this implementation, for each day's opening hours I get a result row with the venue name and the day value. This means that if a venue is open 6 days a week, I receive 6 rows with the same name and the corresponding day. As a result I find myself with a lot of repeated data that I have to manipulate on the server side.
The only two solutions I can think of with my limited DB knowledge are to either keep the current solution or to extract all venues and then perform a single query for each one of them in order to extract their opening hours. The latter is clearly the worse solution since it would require a ridiculous number of DB requests.
Can anyone think of a better approach? The ideal would be to receive a row containing the venue name and an array formed by all the opening hours.
note: Not sure if this is relevant in this case, but I am using a PostgreSQL database.
This will give the venue name and an array of all days when the venue is open:
SELECT ven.name, array_agg(oh.day)
FROM venue ven
NATURAL JOIN opening_hour oh
GROUP BY ven.name;
For those using MSSQL:
SELECT
v.name,
REPLACE(REPLACE(REPLACE('['+(
SELECT
'''' + convert(nvarchar(max),s2.open_time) + '''' as a
FROM opening_hour s2
WHERE s2.venue_id= s.venue_id
FOR XML PATH('')
) + ']','<a>',''),'</a>',','),',]',']') as opening_hours
FROM opening_hour s
INNER JOIN Venue v on v.venue_id = s.venue_id
GROUP BY s.venue_id,v.name
Just to note: this does not return an array data type. It is just a string in array format.
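For a quick local experiment with the "one row per venue" idea, here is a small sketch in SQLite via Python's sqlite3 module. SQLite has no array type, so group_concat stands in for PostgreSQL's array_agg and the string is split in Python afterwards; that swap, and the sample venues, are assumptions made for the example. It groups by venue_id as well as name, so two venues that happen to share a name are not merged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE venue (venue_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE opening_hour (
        opening_hour_id INTEGER PRIMARY KEY,
        day TEXT, open_time TEXT, close_time TEXT, venue_id INTEGER
    );
    -- Made-up venues and opening hours.
    INSERT INTO venue VALUES (1, 'Cafe'), (2, 'Bar');
    INSERT INTO opening_hour VALUES
        (1, 'Mon', '08:00', '17:00', 1),
        (2, 'Tue', '08:00', '17:00', 1),
        (3, 'Fri', '18:00', '23:00', 2);
""")

# One row per venue; the days are collapsed into a comma-separated string.
rows = conn.execute("""
    SELECT ven.venue_id, ven.name, group_concat(oh.day) AS days
    FROM venue ven
    JOIN opening_hour oh ON oh.venue_id = ven.venue_id
    GROUP BY ven.venue_id, ven.name
    ORDER BY ven.venue_id
""").fetchall()

# group_concat does not promise an order, so sort after splitting.
hours = {name: sorted(days.split(",")) for _, name, days in rows}
print(hours)
```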

SQL: SUM of MAX values WHERE date1 <= date2 returns "wrong" results

Hi stackoverflow users
I'm having a bit of a problem trying to combine SUM, MAX and WHERE in one query and after an intense Google search (my search engine skills usually don't fail me) you are my last hope to understand and fix the following issue.
My goal is to count people in a certain period of time and because a person can visit more than once in said period, I'm using MAX. Due to the fact that I'm defining people as male (m) or female (f) using a string (for statistic purposes), CHAR_LENGTH returns the numbers I'm in need of.
SELECT SUM(max_pers) AS "People"
FROM (
SELECT "guests"."id", MAX(CHAR_LENGTH("guests"."gender")) AS "max_pers"
FROM "guests"
GROUP BY "guests"."id")
So far, so good. But now, as stated before, I'd like to only count the guests which visited in a certain time interval (for statistic purposes as well).
SELECT "statistic"."id", SUM(max_pers) AS "People"
FROM (
SELECT "guests"."id", MAX(CHAR_LENGTH("guests"."gender")) AS "max_pers"
FROM "guests"
GROUP BY "guests"."id"),
"statistic", "guests"
WHERE ( "guests"."arrival" <= "statistic"."from" AND "guests"."departure" >= "statistic"."to")
GROUP BY "statistic"."id"
This query returns the following, x = desired result:
x * (x+1)
So if the result should be 3, it's 12. If it should be 5, it's 30 etc.
I could probably solve this algebraically, but I'd rather understand what I'm doing wrong and learn from it.
Thanks in advance and I'm certainly going to answer all further questions.
PS: I'm using LibreOffice Base.
EDIT: An example
guests table:
ID | arrival | departure | gender |
10 | 1.1.14 | 10.1.14 | mf |
10 | 15.1.14 | 17.1.14 | m |
11 | 5.1.14 | 6.1.14 | m |
12 | 10.2.14 | 24.2.14 | f |
13 | 27.2.14 | 28.2.14 | mmmmmf |
statistic table:
ID | from | to | name |
1 | 1.1.14 | 31.1.14 |January | expected result: 3
2 | 1.2.14 | 28.2.14 |February| expected result: 7
MAX(...) is the wrong function: You want COUNT(DISTINCT ...).
Add proper join syntax, simplify (and remove unnecessary quotes) and this should work:
SELECT s.id, COUNT(DISTINCT g.id) AS People
FROM statistic s
LEFT JOIN guests g ON g.arrival <= s."from" AND g.departure >= s."to"
GROUP BY s.id
Note: Using LEFT join means you'll get a result of zero for statistics ids that have no guests. If you would rather no row at all, remove the LEFT keyword.
You have a very strange data structure. In any case, I think you want:
SELECT g.stat_id, sum(g.numpersons) AS People
FROM (select s.id as stat_id, g.id, max(char_length(g.gender)) as numpersons
from guests g join
statistic s
on g.arrival <= s."from" AND g.departure >= s."to"
group by s.id, g.id
) g
GROUP BY g.stat_id;
Thanks for all your inputs. I wasn't familiar with JOIN but it was necessary to solve my problem.
Since my database is designed in German, I made quite a big mistake while translating it, and I'm sorry if this caused confusion.
Selecting guests.id and later on grouping by guests.id wouldn't make any sense since the id is unique. What I actually wanted to do is select and group by guests.adr_id, which links a visiting guest to an address database.
The correct solution to my problem is the following code:
SELECT statname, SUM (numpers) FROM (
SELECT statistic.name AS statname, guests.adr_id, MAX( CHAR_LENGTH( guests.gender ) ) AS numpers
FROM guests
JOIN statistic ON (guests.arrival <= statistic."to" AND guests.departure >= statistic."from" )
GROUP BY guests.adr_id, statistic.name )
GROUP BY statname
I also noted that my database structure is a mess but I created it learning by doing and haven't found any time to rewrite it yet. Next time posting, I'll try better.
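To check the final query against the example from the question, here is a sketch that loads the sample rows into SQLite through Python's sqlite3 module. A few things are adapted and should be read as assumptions: the ID column of the example is treated as adr_id, dates are stored as ISO strings so they compare correctly as text, and SQLite's LENGTH replaces CHAR_LENGTH. The join condition is the overlap test from the final solution (arrival on or before the end of the period, departure on or after its start).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE guests (adr_id INTEGER, arrival TEXT, departure TEXT, gender TEXT);
    CREATE TABLE statistic (id INTEGER PRIMARY KEY, "from" TEXT, "to" TEXT, name TEXT);
    -- Sample rows from the question, with dates rewritten as YYYY-MM-DD.
    INSERT INTO guests VALUES
        (10, '2014-01-01', '2014-01-10', 'mf'),
        (10, '2014-01-15', '2014-01-17', 'm'),
        (11, '2014-01-05', '2014-01-06', 'm'),
        (12, '2014-02-10', '2014-02-24', 'f'),
        (13, '2014-02-27', '2014-02-28', 'mmmmmf');
    INSERT INTO statistic VALUES
        (1, '2014-01-01', '2014-01-31', 'January'),
        (2, '2014-02-01', '2014-02-28', 'February');
""")

# Inner query: largest party size per address within each period.
# Outer query: add those maxima up per period.
rows = conn.execute("""
    SELECT statname, SUM(numpers) AS people
    FROM (
        SELECT s.id AS stat_id, s.name AS statname, g.adr_id,
               MAX(LENGTH(g.gender)) AS numpers
        FROM guests g
        JOIN statistic s ON g.arrival <= s."to" AND g.departure >= s."from"
        GROUP BY s.id, s.name, g.adr_id
    ) AS per_address
    GROUP BY stat_id, statname
    ORDER BY stat_id
""").fetchall()

print(rows)
```

With this data the totals come out as 3 for January and 7 for February, matching the expected results listed in the question.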

SQL query - joins + counting

Here's my problem: I have a query that joins multiple tables to show details of some orders. The query result in a table with columns:
order ID | name | count | price | location | date
It's a hospital database and what I want to do is add another column that says how many patients were at that location on the given date.
There's another table that shows patient stays - I need to count those.
patient ID | location | dateFrom | dateTo
The thing is that the STAYS table shows 2 dates - FROM and TO so I need to count every patient that was present at given location (ward) when order was placed.
Here's the initial query I need to update:
SELECT
AP_ZAMPOZ.ID_TOW AS IDTowaru, --merchandiseID
GMSL_TOW.NAZWA_TOW AS Nazwa, --name
GMSL_TOW.MNOZNIK_SYN AS Mnoznik, --quantity
AP_ZAMPOZ.ZAM_CENA_S AS Cena, --price
AP_ZAMPOZ.ZAM_IL AS Ilosc, --count
AP_ZAMNAG.ZAM_DATE AS DataZam, --date
GMSL_MAG.NAZWA_MAG AS Magazyn, --location
APSL_TOW_PROD.PROD_NAZWA AS Producent, --producer
APSL_TOW_ATC.NAZWA AS Grupa --group
FROM
AP_ZAMPOZ
JOIN
GMSL_TOW ON AP_ZAMPOZ.ID_TOW = GMSL_TOW.ID_TOW
JOIN
AP_ZAMNAG ON AP_ZAMNAG.ZAM_ID_NAG = AP_ZAMPOZ.ZAM_ID_NAG
JOIN
GMSL_MAG ON AP_ZAMNAG.ID_MAG = GMSL_MAG.ID_MAG
JOIN
APSL_TOW ON AP_ZAMPOZ.ID_TOW = APSL_TOW.ID_TOW
LEFT JOIN
APSL_TOW_PROD ON APSL_TOW.ID_PROD = APSL_TOW_PROD.ID_PROD
LEFT JOIN
APSL_TOW_ATC ON APSL_TOW.KOD = APSL_TOW_ATC.KOD
The table with stays is called POBYT and has these relevant columns:
| ID_POB (ID) | IDK_JOS (location identifier) | DT_OD (date From) | DT_DO (date To) |
Rows that I would like to see should look like those in my present query + number of patients at given location at given date.
Anyone have any ideas how to achieve this? I'm stuck...
The problem was solved by adding a subquery as another SELECT column.
Here's the code:
SELECT
.
.
.
APSL_TOW_ATC.NAZWA AS Grupa,
(SELECT Count(*)
FROM pobyt
WHERE (TO_DATE( AP_ZAMNAG.ZAM_DATE, 'YY/MM/DD') >= TO_DATE(DT_OD, 'YY-MM-DD') AND (TO_DATE( AP_ZAMNAG.ZAM_DATE, 'YY/MM/DD') <= TO_DATE(DT_DO, 'YY-MM-DD') OR dt_do IS NULL))
AND IDK_JOS = GMSL_MAG.KOD_MAG) AS LiczbaPacjentow --no. of patients at given date
FROM AP_ZAMPOZ
.
.
.
Works great.
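For anyone who wants to try the idea in isolation, here is a stripped-down sketch of the same correlated subquery, run in SQLite via Python's sqlite3 module. The orders and stays tables and all their columns are hypothetical stand-ins for the real hospital schema, and dates are ISO strings instead of going through TO_DATE. An open-ended stay (no end date yet) is counted as still present, as in the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified tables (not the real schema).
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, location TEXT, order_date TEXT);
    CREATE TABLE stays  (patient_id INTEGER, location TEXT, date_from TEXT, date_to TEXT);
    INSERT INTO orders VALUES (1, 'A', '2020-03-10'), (2, 'B', '2020-03-10');
    INSERT INTO stays VALUES
        (100, 'A', '2020-03-01', '2020-03-15'),  -- present on the order date
        (101, 'A', '2020-03-09', NULL),          -- still admitted
        (102, 'A', '2020-03-11', '2020-03-12'),  -- arrived after the order
        (103, 'B', '2020-02-01', '2020-02-20');  -- left before the order
""")

# The subquery runs once per order row and counts the stays that cover
# the order date at the same location.
rows = conn.execute("""
    SELECT o.order_id, o.location, o.order_date,
           (SELECT COUNT(*)
            FROM stays s
            WHERE s.location = o.location
              AND s.date_from <= o.order_date
              AND (s.date_to >= o.order_date OR s.date_to IS NULL)) AS patients
    FROM orders o
    ORDER BY o.order_id
""").fetchall()

print(rows)
```

Since the count is computed per output row, it does not multiply the order rows the way joining the stays table directly would.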