Find a value based on a table result - sql

First of all, sorry for the title. Couldn't think of any better title.
This is what I got:
SELECT study FROM old_employee;
study
---------
STUDY1
STUDY2
STUDY3
STUDY1
STUDY2
SELECT id,name_string FROM studies;
id | name_string
----+-------------------
1 | STUDY1
2 | STUDY2
3 | STUDY3
Now I would like to find the ids based on the first output. This is what I've attempted, but obviously it's not working.
SELECT id FROM studies WHERE name_string LIKE (SELECT study FROM old_employee);
My desired output:
id
----
1
2
3
1
2
Edit: I'm saving old_employee as a view, and I'm wondering if there's a smarter way of including it in the answers below instead of creating this view first.
CREATE VIEW old_employee AS
SELECT *
FROM dblink('dbname=mydb', 'select study from personnel')
AS t1(study char(10));

This can be done without the SQL LIKE operator; comparing the two columns for equality is enough. Here is the query:
SELECT s.id
FROM studies s,
old_employee o
WHERE s.name_string = o.study;
Second query (as suggested by @a_horse_with_no_name):
SELECT studies.id
FROM studies
INNER JOIN old_employee
ON studies.name_string = old_employee.study
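To try the join against the sample rows from the question, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for the PostgreSQL/dblink setup, which is an assumption made only so the example is self-contained. The ORDER BY on rowid is SQLite-specific and is there just to reproduce the row order of the desired output; without an ORDER BY the order is not guaranteed.

```python
import sqlite3

# In-memory SQLite database standing in for the real PostgreSQL tables (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE old_employee (study TEXT);
    CREATE TABLE studies (id INTEGER PRIMARY KEY, name_string TEXT);
    INSERT INTO old_employee (study) VALUES
        ('STUDY1'), ('STUDY2'), ('STUDY3'), ('STUDY1'), ('STUDY2');
    INSERT INTO studies (id, name_string) VALUES
        (1, 'STUDY1'), (2, 'STUDY2'), (3, 'STUDY3');
""")

# One output row per old_employee row, mapped to the matching studies.id.
rows = conn.execute("""
    SELECT s.id
    FROM old_employee AS o
    INNER JOIN studies AS s ON s.name_string = o.study
    ORDER BY o.rowid  -- SQLite-only trick to keep insertion order
""").fetchall()
ids = [row[0] for row in rows]
print(ids)
```

Because the join produces one row for every matching pair, duplicate study names in old_employee yield duplicate ids, which is what the desired output asks for.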

Related

SQL Query to Show When Golfer Not Attached to an Event/Year

I am working on a school assignment that has downright stumped me for days. The task is to, using a view (VAvailableGolfers), populate a list box with Golfers who are not tied to a given event/year selected from a combo box. Here is the data in the tables:
The expected output on the form, then, would be:
2015 shows Goldstein available
2016 shows no one available
2017 shows both Goldstein and Everett available
so, in other words, where there isn't a record in TGolferEventYears for a golfer for a particular year
I have tried left joins, full outer joins, exists, not in, not exists, etc and I cannot seem to nail down the SQL to make it happen.
Here is the VB Form and the SQL backing it. I cannot figure out what to code in the view:
"SELECT intGolferID, strLastName FROM vAvailableGolfers WHERE intEventYearID = " & cboEvents.SelectedValue.ToString
Here is the view, which I know isn't giving correct output:
select tg.intGolferID, strLastName, intEventYearID
from TGolferEventYears TGEY, TGolfers TG
Where tgey.intGolferID = tg.intGolferID
and intEventYearID not IN
(select intEventYearID
from TEventYears
where intEventYearID not in
(select intEventYearID
from TGolferEventYears))
Appreciate any help
I usually approach this type of question by using a cross join to generate all possible combinations and then a left join/where to filter out the ones that already exist:
select g.intGolferID, g.strLastName, ey.intEventYearID
from TEventYears ey cross join
TGolfers g left join
TGolferEventYears gey
on gey.intGolferID = g.intGolferID and
gey.intEventYearID = ey.intEventYearID
where gey.intGolferID is null;
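Here is a small runnable sketch of the cross join plus left join approach, wrapped in a view. The table data was not included in the question, so the rows below are assumed: they were reconstructed only from the expected output (Goldstein free in 2015, nobody free in 2016, both free in 2017). SQLite via Python's sqlite3 module stands in for the real database, and available_golfers is a hypothetical helper name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TGolfers (intGolferID INTEGER PRIMARY KEY, strLastName TEXT);
    CREATE TABLE TEventYears (intEventYearID INTEGER PRIMARY KEY, intEventYear INTEGER);
    CREATE TABLE TGolferEventYears (intGolferID INTEGER, intEventYearID INTEGER);

    -- Assumed sample rows, reconstructed from the expected output only.
    INSERT INTO TGolfers VALUES (1, 'Goldstein'), (2, 'Everett');
    INSERT INTO TEventYears VALUES (1, 2015), (2, 2016), (3, 2017);
    INSERT INTO TGolferEventYears VALUES (2, 1), (1, 2), (2, 2);

    -- Every golfer/year pair, minus the pairs that already exist.
    CREATE VIEW VAvailableGolfers AS
    SELECT g.intGolferID, g.strLastName, ey.intEventYearID
    FROM TEventYears AS ey
    CROSS JOIN TGolfers AS g
    LEFT JOIN TGolferEventYears AS gey
           ON gey.intGolferID = g.intGolferID
          AND gey.intEventYearID = ey.intEventYearID
    WHERE gey.intGolferID IS NULL;
""")


def available_golfers(event_year_id):
    """Return (id, last name) of golfers not yet tied to the given event year."""
    return conn.execute(
        "SELECT intGolferID, strLastName FROM VAvailableGolfers "
        "WHERE intEventYearID = ? ORDER BY intGolferID",
        (event_year_id,),
    ).fetchall()


for year_id in (1, 2, 3):
    print(year_id, available_golfers(year_id))
```

The lookup passes the event year as a bound parameter rather than concatenating it into the SQL string, which is generally the safer habit for form-driven queries.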
Try this query:
SELECT tg.intGolferID, strLastName, tey.intEventYearID, tey.intEventYear
FROM TGolfers tg, TEventYears tey
WHERE tg.intGolferID NOT IN (
SELECT DISTINCT tgey.intGolferID
FROM TGolferEventYears tgey
WHERE tgey.intEventYearID = tey.intEventYearID
)
Explanation
Since you are trying to get combinations of data that are not in TGolferEventYears, you cannot use it in your outermost SELECT, as any of its columns would be NULL. Therefore, you need to SELECT FROM the tables that are the sources of that data and, going through each joined record, filter out the combinations that are in TGolferEventYears.
Main query
Select the data you need:
SELECT tg.intGolferID, strLastName, tey.intEventYearID, tey.intEventYear
...from TGolfers, cross join with TEventYears:
FROM TGolfers tg, TEventYears tey
...where the golfer ID does not exist in the following collection:
WHERE tg.intGolferID NOT IN ( ... )
Subquery
Select unique golfer IDs:
SELECT DISTINCT tgey.intGolferID
...from TGolferEventYears:
FROM TGolferEventYears tgey
...where the year is the current year of the outer query:
WHERE tgey.intEventYearID = tey.intEventYearID
Result
+-------------+-------------+----------------+--------------+
| intGolferID | strLastName | intEventYearID | intEventYear |
+-------------+-------------+----------------+--------------+
| 1 | Goldstein | 1 | 2015 |
| 1 | Goldstein | 3 | 2017 |
| 2 | Everett | 3 | 2017 |
+-------------+-------------+----------------+--------------+

SQL Spatial Subquery Issue

Greetings Benevolent Gods of Stackoverflow,
I am presently struggling to get a spatially enabled query to work for a SQL assignment I am working on. The wording is as follows:
SELECT PURCHASES.TotalPrice, STORES.GeoLocation, STORES.StoreName
FROM MuffinShop
join (SELECT SUM(PURCHASES.TotalPrice) AS StoreProfit, STORES.StoreName
FROM PURCHASES INNER JOIN STORES ON PURCHASES.StoreID = STORES.StoreID
GROUP BY STORES.StoreName
HAVING (SUM(PURCHASES.TotalPrice) > 600))
What I am trying to do with this query is perform a function query (like avg, sum etc) and get the spatial information back as well. Another example of this would be:
SELECT STORES.StoreName, AVG(REVIEWS.Rating),Stores.Shape
FROM REVIEWS CROSS JOIN
STORES
GROUP BY STORES.StoreName;
This returns the error message: Column 'STORES.Shape' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
I know I require a sub query to perform this task, I am just having endless trouble getting it to work. Any help at all would be wildly appreciated.
There are two parts to this question. I would tackle the first problem with the following logic:
List all the store names and their respective geolocations
Get the profit for each store
With that in mind, you need to use the STORES table as your base, then bolt the profit onto it through a sub query or an apply:
SELECT s.StoreName
,s.GeoLocation
,p.StoreProfit
FROM STORES s
INNER JOIN (
SELECT pu.StoreId
,StoreProfit = SUM(pu.TotalPrice)
FROM PURCHASES pu
GROUP BY pu.StoreID
) p
ON p.StoreID = s.StoreID;
This one is a little more efficient:
SELECT s.StoreName
,s.GeoLocation
,profit.StoreProfit
FROM STORES s
CROSS APPLY (
SELECT StoreProfit = SUM(p.TotalPrice)
FROM PURCHASES p
WHERE p.StoreID = s.StoreID
GROUP BY p.StoreID
) profit;
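For anyone who wants to run the derived-table version, here is a minimal sketch using Python's sqlite3 module. Everything about the data is assumed: the store names, prices, and the plain-text GeoLocation column are placeholders (SQL Server would use a real spatial type), and SQLite has no CROSS APPLY, so only the INNER JOIN form is shown. The HAVING filter is the one from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema and rows; GeoLocation is plain text here.
    CREATE TABLE STORES (StoreID INTEGER PRIMARY KEY, StoreName TEXT, GeoLocation TEXT);
    CREATE TABLE PURCHASES (PurchaseID INTEGER PRIMARY KEY, StoreID INTEGER, TotalPrice REAL);
    INSERT INTO STORES VALUES (1, 'North', 'POINT(0 0)'), (2, 'South', 'POINT(1 1)');
    INSERT INTO PURCHASES (StoreID, TotalPrice) VALUES
        (1, 400.0), (1, 350.0), (2, 100.0);
""")

# Aggregate purchases per store first, then join the totals back to STORES
# so the non-aggregated columns (name, location) never enter the GROUP BY.
rows = conn.execute("""
    SELECT s.StoreName, s.GeoLocation, p.StoreProfit
    FROM STORES AS s
    INNER JOIN (
        SELECT StoreID, SUM(TotalPrice) AS StoreProfit
        FROM PURCHASES
        GROUP BY StoreID
        HAVING SUM(TotalPrice) > 600
    ) AS p ON p.StoreID = s.StoreID
    ORDER BY s.StoreName
""").fetchall()
print(rows)
```

Only the store whose purchases sum to more than 600 survives the join; its location comes along without being grouped.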
Now for the second part, the error that you are receiving tells you that you need to GROUP BY all columns in your select statement with the exception of your aggregate function(s).
In your second example, you are asking SQL to take an average rating for each store based on an ID, but you are also trying to return another column without including that inside the grouping. I will try to show you what you are asking SQL to do and where the issue lies with the following examples:
-- Data
Id | Rating | Shape
1 | 1 | Triangle
1 | 4 | Triangle
1 | 1 | Square
2 | 1 | Triangle
2 | 5 | Triangle
2 | 3 | Square
SQL Server, please give me the average rating for each store:
SELECT Id, AVG(Rating)
FROM Store
GROUP BY Id;
-- Result
Id | Avg(Rating)
1 | 2
2 | 3
SQL Server, please give me the average rating for each store and show its shape in the result (but don't group by it):
SELECT Id, AVG(Rating), Shape
FROM Store
GROUP BY Id;
-- Result
Id | Avg(Rating) | Shape
1 | 2 | Do I show Triangle or Square ...... ERROR!!!!
2 | 3 |
It needs to be told to get the average for each store and shape:
SELECT Id, AVG(Rating), Shape
FROM Store
GROUP BY Id, Shape;
-- Result
Id | Avg(Rating) | Shape
1 | 2.5 | Triangle
1 | 1 | Square
2 | 3 | Triangle
2 | 3 | Square
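The two valid groupings can be checked against the sample data with a short sketch. It uses Python's sqlite3 module as a stand-in for SQL Server (an assumption). Note that SQLite does not reject an ungrouped column the way SQL Server does, so the error case itself is not reproduced here, and SQLite's AVG always returns a floating-point value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Store (Id INTEGER, Rating INTEGER, Shape TEXT);
    INSERT INTO Store VALUES
        (1, 1, 'Triangle'), (1, 4, 'Triangle'), (1, 1, 'Square'),
        (2, 1, 'Triangle'), (2, 5, 'Triangle'), (2, 3, 'Square');
""")

# One row per store: Shape cannot be selected because it varies within a group.
per_store = conn.execute(
    "SELECT Id, AVG(Rating) FROM Store GROUP BY Id ORDER BY Id"
).fetchall()

# One row per (store, shape) pair: Shape is now part of the grouping key.
per_store_and_shape = conn.execute(
    "SELECT Id, Shape, AVG(Rating) FROM Store "
    "GROUP BY Id, Shape ORDER BY Id, Shape"
).fetchall()

print(per_store)
print(per_store_and_shape)
```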
As in any spatial query, you need an idea of what your final geometry will be. It looks like you are attempting to group by individual stores but delivering an average rating from the subquery. So if I'm reading it right, you are just looking to get the store's shape info associated with the average ratings?
Query the stores table for the shape field and join it to the query you use to get the average rating:
select a.shape,
b.*
from stores a inner join (your Average rating query with group by here) b
on a.StoreID = b.StoreID

SQL: SUM of MAX values WHERE date1 <= date2 returns "wrong" results

Hi stackoverflow users
I'm having a bit of a problem trying to combine SUM, MAX and WHERE in one query and after an intense Google search (my search engine skills usually don't fail me) you are my last hope to understand and fix the following issue.
My goal is to count people in a certain period of time, and because a person can visit more than once in said period, I'm using MAX. Because I record people as male (m) or female (f) in a string (for statistical purposes), CHAR_LENGTH returns the numbers I need.
SELECT SUM(max_pers) AS "People"
FROM (
SELECT "guests"."id", MAX(CHAR_LENGTH("guests"."gender")) AS "max_pers"
FROM "guests"
GROUP BY "guests"."id")
So far, so good. But now, as stated before, I'd like to count only the guests who visited in a certain time interval (also for statistical purposes).
SELECT "statistic"."id", SUM(max_pers) AS "People"
FROM (
SELECT "guests"."id", MAX(CHAR_LENGTH("guests"."gender")) AS "max_pers"
FROM "guests"
GROUP BY "guests"."id"),
"statistic", "guests"
WHERE ( "guests"."arrival" <= "statistic"."from" AND "guests"."departure" >= "statistic"."to")
GROUP BY "statistic"."id"
This query returns the following, x = desired result:
x * (x+1)
So if the result should be 3, it's 12. If it should be 5, it's 30 etc.
I could probably solve this algebraically, but I'd rather understand what I'm doing wrong and learn from it.
Thanks in advance and I'm certainly going to answer all further questions.
PS: I'm using LibreOffice Base.
EDIT: An example
guests table:
ID | arrival | departure | gender |
10 | 1.1.14 | 10.1.14 | mf |
10 | 15.1.14 | 17.1.14 | m |
11 | 5.1.14 | 6.1.14 | m |
12 | 10.2.14 | 24.2.14 | f |
13 | 27.2.14 | 28.2.14 | mmmmmf |
statistic table:
ID | from | to | name |
1 | 1.1.14 | 31.1.14 |January | expected result: 3
2 | 1.2.14 | 28.2.14 |February| expected result: 7
MAX(...) is the wrong function: You want COUNT(DISTINCT ...).
Add proper join syntax, simplify (and remove unnecessary quotes) and this should work:
SELECT s.id, COUNT(DISTINCT g.id) AS People
FROM statistic s
LEFT JOIN guests g ON g.arrival <= s."from" AND g.departure >= s."too"
GROUP BY s.id
Note: Using LEFT join means you'll get a result of zero for statistics ids that have no guests. If you would rather no row at all, remove the LEFT keyword.
You have a very strange data structure. In any case, I think you want:
SELECT g.stat_id, sum(g.numpersons) AS People
FROM (select s.id as stat_id, g.id, max(char_length(g.gender)) as numpersons
from guests g join
statistic s
on g.arrival <= s."from" AND g.departure >= s."too"
group by s.id, g.id
) g
GROUP BY g.stat_id;
Thanks for all your input. I wasn't familiar with JOIN, but it was necessary to solve my problem.
Since my database is designed in German, I made quite a big mistake while translating it, and I'm sorry if this caused confusion.
Selecting guests.id and later grouping by guests.id wouldn't make any sense, since the id is unique. What I actually wanted to do is select and group by guests.adr_id, which links a visiting guest to an address database.
The correct solution to my problem is the following code:
SELECT statname, SUM (numpers) FROM (
SELECT statistic.name AS statname, guests.adr_id, MAX( CHAR_LENGTH( guests.gender ) ) AS numpers
FROM guests
JOIN statistic ON (guests.arrival <= statistic.too AND guests.departure >= statistic."from" )
GROUP BY guests.adr_id, statistic.name )
GROUP BY statname
I also noted that my database structure is a mess, but I created it while learning by doing and haven't found the time to rewrite it yet. Next time I post, I'll try to do better.
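For readers who want to check the numbers, here is a sketch that runs the same idea against the example tables from the question. It makes several assumptions: SQLite (via Python's sqlite3) replaces LibreOffice Base, so LENGTH stands in for CHAR_LENGTH; the dates are rewritten as ISO strings so that text comparison orders them correctly; the ID column of the example is treated as adr_id; and the period columns are named "from" and "to" as in the example table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE guests (adr_id INTEGER, arrival TEXT, departure TEXT, gender TEXT);
    CREATE TABLE statistic (id INTEGER PRIMARY KEY, "from" TEXT, "to" TEXT, name TEXT);
    INSERT INTO guests VALUES
        (10, '2014-01-01', '2014-01-10', 'mf'),
        (10, '2014-01-15', '2014-01-17', 'm'),
        (11, '2014-01-05', '2014-01-06', 'm'),
        (12, '2014-02-10', '2014-02-24', 'f'),
        (13, '2014-02-27', '2014-02-28', 'mmmmmf');
    INSERT INTO statistic VALUES
        (1, '2014-01-01', '2014-01-31', 'January'),
        (2, '2014-02-01', '2014-02-28', 'February');
""")

# Inner query: largest party size per address within each period
# (a stay counts if it overlaps the period).
# Outer query: add those maxima up per period.
rows = conn.execute("""
    SELECT statname, SUM(numpers) AS people
    FROM (
        SELECT s.name AS statname, s.id AS stat_id, g.adr_id,
               MAX(LENGTH(g.gender)) AS numpers
        FROM guests AS g
        JOIN statistic AS s
          ON g.arrival <= s."to" AND g.departure >= s."from"
        GROUP BY s.id, s.name, g.adr_id
    ) AS per_address
    GROUP BY stat_id, statname
    ORDER BY stat_id
""").fetchall()
print(rows)
```

With the example data this gives 3 for January (2 for address 10 plus 1 for address 11) and 7 for February, matching the expected results listed in the question.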

SQL Views - Modify Returned Result

I'm a little stuck here. I'm trying to modify a returned View based on a condition. I'm fairly green on SQL and am having a bit of difficulty with the returned result. Here's a partial component of the view I wrote:
WITH A AS (
SELECT
ROW_NUMBER() OVER (PARTITION BY fkidContract,fkidTemplateItem ORDER BY bStdActive DESC, dtdateplanned ASC) AS RANK,
tblWorkItems.fkidContract AS ContractNo,
....
FROM tblWorkItems
WHERE fkidTemplateItem IN
(2895,2905,2915,2907,2908,
2909,3047,2930,2923,2969,
2968,2919,2935,2936,2927,
2970,2979)
AND ...
)
SELECT * FROM A WHERE RANK = 1
The return result is similar to the following:
ContractNo| ItemNumber | Planned | Complete
001 | 100 | 01/01/1900 | 02/01/1900
001 | 101 | 03/04/1900 | 02/01/1901
001 | 102 | 03/06/1901 | 02/08/1900
002 | 100 | 01/03/1911 | 02/08/1913
This gives me the results I expect, but due to a nightmare Crystal Report I need to alter this view slightly. I want to take this returned result set and modify an existing column with a value pulled from the same table and the same Contract relationship, something like the following:
UPDATE A
SET A.Completed = ( SELECT R.Completed
FROM myTable R
INNER JOIN A
ON A.ContractNo = R.ContractNo
WHERE A.ItemNumber = 100 AND R.ItemNumber = 101
)
What I'm trying to do is modify the "Completed Date" of one task and make it the complete date of another task if they both share the same ContractNo field value.
I'm not sure about the ItemNumber relationships between A and R (perhaps it was just for testing...), but it seems like you don't really want to UPDATE anything, but you want to use a different value under some circumstances. So, maybe you just want to change the non-cte part of your query to something like:
SELECT A.ContractNo, A.ItemNumber, A.Planned,
COALESCE(R.Completed,A.Completed) as Completed
FROM A
LEFT OUTER JOIN myTable R
ON A.ContractNo = R.ContractNo
AND A.ItemNumber = 100 AND R.ItemNumber = 101 -- I'm not sure about this part
WHERE A.Rank = 1
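To see what the COALESCE/LEFT JOIN substitution does, here is a minimal sketch on the four sample rows from the question. It is simplified under stated assumptions: a single hypothetical table work_items plays the role of both the CTE A and myTable, the Rank filter is left out, dates are kept as the literal strings shown in the question, and SQLite (via Python's sqlite3) stands in for the real server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table holding the sample result set from the question.
    CREATE TABLE work_items (ContractNo TEXT, ItemNumber INTEGER, Planned TEXT, Completed TEXT);
    INSERT INTO work_items VALUES
        ('001', 100, '01/01/1900', '02/01/1900'),
        ('001', 101, '03/04/1900', '02/01/1901'),
        ('001', 102, '03/06/1901', '02/08/1900'),
        ('002', 100, '01/03/1911', '02/08/1913');
""")

# Item 100 borrows the Completed value of item 101 on the same contract;
# every other row (and item 100 without a sibling 101) keeps its own value.
rows = conn.execute("""
    SELECT a.ContractNo, a.ItemNumber, a.Planned,
           COALESCE(r.Completed, a.Completed) AS Completed
    FROM work_items AS a
    LEFT OUTER JOIN work_items AS r
      ON a.ContractNo = r.ContractNo
     AND a.ItemNumber = 100 AND r.ItemNumber = 101
    ORDER BY a.ContractNo, a.ItemNumber
""").fetchall()
for row in rows:
    print(row)
```

Nothing is updated in the table; the replacement value only appears in the query result, which is usually what a reporting view needs.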
So it turns out that actually reading the vendor documentation helps :)
SELECT
column1,
column2 =
case
when date > 1999 then 'some value'
when date < 1999 then 'other value'
else 'back to the future'
end
FROM ....
For reference, the total query did a triple inner join over ~5 million records and this case statement was surprisingly performant.
I suggest that this gets closed as a duplicate.

sybase - values from one table that aren't on another, on opposite ends of a 3-table join

Hypothetical situation: I work for a custom sign-making company, and some of our clients have submitted more sign designs than they're currently using. I want to know what signs have never been used.
3 tables involved:
table A - signs for a company
sign_pk (unique) | company_pk | sign_description
1 | 1 | small
2 | 1 | large
3 | 2 | medium
4 | 2 | jumbo
5 | 3 | banner
table B - company locations
company_pk | company_location (unique)
1 | 987
1 | 876
2 | 456
2 | 123
table C - signs at locations (it's a bit of a stretch, but each row can have 2 signs, and it's a one to many relationship from company location to signs at locations)
company_location | front_sign | back_sign
987 | 1 | 2
987 | 2 | 1
876 | 2 | 1
456 | 3 | 4
123 | 4 | 3
So, a.company_pk = b.company_pk and b.company_location = c.company_location. What I want to try and find is how to query and get back that sign_pk 5 isn't at any location. Querying each sign_pk against all of the front_sign and back_sign values is a little impractical, since all the tables have millions of rows. Table a is indexed on sign_pk and company_pk, table b on both fields, and table c only on company locations. The way I'm trying to write it is along the lines of "each sign belongs to a company, so find the signs that are not the front or back sign at any of the locations that belong to the company tied to that sign."
My original plan was:
Select a.sign_pk
from a, b, c
where a.company_pk = b.company_pk
and b.company_location = c.company_location
and a.sign_pk *= c.front_sign
group by a.sign_pk having count(c.front_sign) = 0
just to do the front sign, and then repeat for the back, but that won't run because c is an inner member of an outer join, and also in an inner join.
This whole thing is fairly convoluted, but if anyone can make sense of it, I'll be your best friend.
How about something like this:
SELECT DISTINCT sign_pk
FROM table_a
WHERE sign_pk NOT IN
(
SELECT DISTINCT front_sign sign
FROM table_c
UNION
SELECT DISTINCT back_sign sign
FROM table_c
)
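The NOT IN approach can be tried against the sample rows with a short sketch. It uses Python's sqlite3 module in place of Sybase (an assumption) and the short table names a and c from the question's own query. Table b is not needed for this variant, because a sign that appears in no row of c is unused regardless of company.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (sign_pk INTEGER PRIMARY KEY, company_pk INTEGER, sign_description TEXT);
    CREATE TABLE c (company_location INTEGER, front_sign INTEGER, back_sign INTEGER);
    INSERT INTO a VALUES
        (1, 1, 'small'), (2, 1, 'large'), (3, 2, 'medium'),
        (4, 2, 'jumbo'), (5, 3, 'banner');
    INSERT INTO c VALUES
        (987, 1, 2), (987, 2, 1), (876, 2, 1), (456, 3, 4), (123, 4, 3);
""")

# Collect every sign used as a front or back sign, then keep the signs not in that set.
rows = conn.execute("""
    SELECT sign_pk
    FROM a
    WHERE sign_pk NOT IN (
        SELECT front_sign FROM c
        UNION
        SELECT back_sign FROM c
    )
    ORDER BY sign_pk
""").fetchall()
unused_signs = [row[0] for row in rows]
print(unused_signs)
```

One caveat: if front_sign or back_sign can be NULL, NOT IN returns no rows at all, so the subqueries would need a WHERE ... IS NOT NULL filter, or the query could be rewritten with NOT EXISTS.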
ANSI outer join is your friend here. The *= operator has dodgy semantics and should be avoided.
select distinct a.sign_pk, a.company_pk
from a join b on a.company_pk = b.company_pk
left outer join c on b.company_location = c.company_location
and (a.sign_pk = c.front_sign or a.sign_pk = c.back_sign)
where c.company_location is null
Note that the where clause is a filter on the rows returned by the join, so it says "do the joins, but give me only the rows that didn't join to c".
Outer join is almost always faster than NOT EXISTS and NOT IN
I would be tempted to create a Temp table for the inner join and then outer join that.
But it really depends on the size of your data sets.
Yes, the schema design is flawed, but we can't always fix that!