Oracle SQL query comparing multiple rows with same identifier - sql

I'm honestly not sure how to title this - so apologies if it is unclear.
I have two tables I need to compare. One table contains tree names and nodes that belong to that tree. Each Tree_name/Tree_node combo will have its own line. For example:
Table: treenode
| TREE_NAME | TREE_NODE |
|-----------|-----------|
| 1 | A |
| 1 | B |
| 1 | C |
| 1 | D |
| 1 | E |
| 2 | A |
| 2 | B |
| 2 | D |
| 3 | C |
| 3 | D |
| 3 | E |
| 3 | F |
I have another table that contains names of queries and what tree_nodes they use. Example:
Table: queryrecord
| QUERY | TREE_NODE |
|---------|-----------|
| Alpha | A |
| Alpha | B |
| Alpha | D |
| BRAVO | A |
| BRAVO | B |
| BRAVO | D |
| CHARLIE | A |
| CHARLIE | B |
| CHARLIE | F |
I need to create an SQL where I input the QUERY name, and it returns any ‘TREE_NAME’ that includes all the nodes associated with the query. So if I input ‘ALPHA’, it would return TREE_NAME 1 & 2. If I ask it for CHARLIE, it would return nothing.
I only have read access, and don’t believe I can create temp tables, so I’m not sure if this is possible. Any advice would be amazing. Thank you!

You can use group by and having as follows:
Select t.tree_name
From tree_node t
join query_record q
on t.tree_node = q.tree_node
WHERE q.query = 'ALPHA'
Group by t.tree_name
Having count(distinct t.tree_node)
= (Select count(distinct q.tree_node) From query_record q WHERE q.query = 'ALPHA');
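To check the GROUP BY / HAVING approach against the sample data, here is a minimal sketch in Python using the standard-library sqlite3 module. It is an illustration only: it runs on SQLite rather than Oracle, uses the table names from the question (treenode, queryrecord), and the helper name trees_covering is made up for this example.

```python
import sqlite3

# In-memory SQLite database loaded with the question's sample rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE treenode (tree_name INTEGER, tree_node TEXT);
    CREATE TABLE queryrecord (query TEXT, tree_node TEXT);
    INSERT INTO treenode VALUES
        (1,'A'),(1,'B'),(1,'C'),(1,'D'),(1,'E'),
        (2,'A'),(2,'B'),(2,'D'),
        (3,'C'),(3,'D'),(3,'E'),(3,'F');
    INSERT INTO queryrecord VALUES
        ('Alpha','A'),('Alpha','B'),('Alpha','D'),
        ('BRAVO','A'),('BRAVO','B'),('BRAVO','D'),
        ('CHARLIE','A'),('CHARLIE','B'),('CHARLIE','F');
""")

# Keep only the tree rows whose node is used by the query, then keep the
# trees whose matched-node count equals the query's total node count.
DIVISION_SQL = """
    SELECT t.tree_name
    FROM treenode t
    JOIN queryrecord q ON t.tree_node = q.tree_node
    WHERE q.query = :q
    GROUP BY t.tree_name
    HAVING COUNT(DISTINCT t.tree_node) =
           (SELECT COUNT(DISTINCT tree_node) FROM queryrecord WHERE query = :q)
    ORDER BY t.tree_name
"""

def trees_covering(query_name):
    """Return the tree names that contain every node of the given query."""
    rows = conn.execute(DIVISION_SQL, {"q": query_name}).fetchall()
    return [row[0] for row in rows]

print(trees_covering("Alpha"))    # trees 1 and 2
print(trees_covering("CHARLIE"))  # no tree has A, B and F
```

Note that the comparison is case sensitive here, so 'ALPHA' matches nothing in the sample data while 'Alpha' does.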

Using an IN condition (a semi-join, which saves time over a join):
with prep (tree_node) as (select tree_node from queryrecord where query = :q)
select tree_name
from treenode
where tree_node in (select tree_node from prep)
group by tree_name
having count(*) = (select count(*) from prep)
;
:q in the prep subquery (in the with clause) is the bind variable to which you will assign the various QUERY values at runtime.
EDIT
I don't generally set up the test case on online engines; but in a comment below this answer, the OP said the query didn't work for him. So, I set up the example on SQLFiddle, here:
http://sqlfiddle.com/#!4/b575e/2
A couple of notes: for some reason, SQLFiddle thinks table names should be at most eight characters, so I had to change the second table name to queryrec (instead of queryrecord). I changed the name in the query, too, of course. And, second, I don't know how I can give bind values on SQLFiddle; I hard-coded the name 'Alpha'. (Note also that in the OP's sample data, this query value is not capitalized, while the other two are; of course, text values in SQL are case sensitive, so one should pay attention when testing.)

You can do this with a join and aggregation. The trick is to count the number of nodes in query_record before joining:
select qr.query, t.tree_name
from (select qr.*,
count(*) over (partition by query) as num_tree_node
from query_record qr
) qr join
tree_node t
on t.tree_node = qr.tree_node
where qr.query = 'ALPHA'
group by qr.query, t.tree_name, qr.num_tree_node
having count(*) = qr.num_tree_node;
Here is a db<>fiddle.

Related

SQL Join to the latest record in MS ACCESS

I want to join tables in MS Access in such a way that it fetches only the latest record from one of the tables. I've looked at the other solutions available on the site, but discovered that they only work for other versions of SQL. Here is a simplified version of my data:
PatientInfo Table:
+-----+------+
| ID | Name |
+-----+------+
| 1 | John |
| 2 | Tom |
| 3 | Anna |
+-----+------+
Appointments Table
+----+-----------+
| ID | Date |
+----+-----------+
| 1 | 5/5/2001 |
| 1 | 10/5/2012 |
| 1 | 4/20/2018 |
| 2 | 4/5/1999 |
| 2 | 8/8/2010 |
| 2 | 4/9/1982 |
| 3 | 7/3/1997 |
| 3 | 6/4/2015 |
| 3 | 3/4/2017 |
+----+-----------+
And here is a simplified version of the results that I need after the join:
+----+------+------------+
| ID | Name | Date |
+----+------+------------+
| 1 | John | 4/20/2018 |
| 2 | Tom | 8/8/2010 |
| 3 | Anna | 3/4/2017 |
+----+------+------------+
Thanks in advance for reading and for your help.
You can use aggregation and JOIN:
select pi.id, pi.name, max(a.date)
from appointments as a inner join
patientinfo as pi
on a.id = pi.id
group by pi.id, pi.name;
something like this:
select P.ID, P.name, max(A.Date) as Dt
from PatientInfo P inner join Appointments A
on P.ID=A.ID
group by P.ID, P.name
Both Bing's and Gordon's answers work if your summary only needs one field from the joined table (the Max(Date)), but it gets trickier if you also want to report other fields from that table, since you would need to either aggregate them or group by them as well.
E.g., if you want your summary to also include the assessment they were given at their last appointment, GROUP BY is not the way to go.
A more versatile structure may be something like
SELECT Patient.ID, Patient.Name, Appointment.Date, Appointment.Assessment
FROM Patient INNER JOIN Appointment ON Patient.ID=Appointment.ID
WHERE Appointment.Date = (SELECT Max(Appointment.Date) FROM Appointment WHERE Appointment.ID = Patient.ID)
;
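A minimal sketch of that correlated-subquery pattern, run on SQLite through Python's sqlite3 module rather than on Access. The column name ApptDate, the ISO date strings (so that MAX compares correctly as text) and the Assessment values are assumptions made up for this illustration; the IDs, names and dates come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PatientInfo (ID INTEGER PRIMARY KEY, Name TEXT);
    -- Assessment is a hypothetical extra column; its values are invented.
    CREATE TABLE Appointments (ID INTEGER, ApptDate TEXT, Assessment TEXT);
    INSERT INTO PatientInfo VALUES (1,'John'),(2,'Tom'),(3,'Anna');
    INSERT INTO Appointments VALUES
        (1,'2001-05-05','initial'),(1,'2012-10-05','follow-up'),(1,'2018-04-20','stable'),
        (2,'1999-04-05','follow-up'),(2,'2010-08-08','improving'),(2,'1982-04-09','initial'),
        (3,'1997-07-03','initial'),(3,'2015-06-04','follow-up'),(3,'2017-03-04','discharged');
""")

# For each patient, keep only the appointment whose date equals that
# patient's latest date; other columns of that row come along for free.
LATEST_SQL = """
    SELECT p.ID, p.Name, a.ApptDate, a.Assessment
    FROM PatientInfo p
    JOIN Appointments a ON p.ID = a.ID
    WHERE a.ApptDate = (SELECT MAX(a2.ApptDate)
                        FROM Appointments a2
                        WHERE a2.ID = p.ID)
    ORDER BY p.ID
"""

latest_rows = conn.execute(LATEST_SQL).fetchall()
for row in latest_rows:
    print(row)
```

If a patient has two appointments on the same latest date, this form returns both rows.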
As an aside, you may want to think about whether you should use a field named 'ID' to refer to the ID of another table (in this case, the Appointments.ID field refers to the Patient.ID). You may make your db more readable if you leave the 'ID' field as an identifier specific to that table and refer to that field in other tables as OtherTableID or similar, i.e. PatientID in this case. Or go all the way and include the name of the actual table in its own ID field.
Edited after comment:
Not quite sure why it would crash. I just ran an equivalent query on 2 tables I have which are about 10,000 records each and it was pretty much instantaneous. Are your ID fields (i) unique numbers and (ii) indexed?
Another structure which should do the same thing (adapted for your field names and assuming that there is an ID field in Appointments which is unique) would be something like:
SELECT PatientInfo.UID, PatientInfo.Name, Appointments.StartDateTime, Appointments.Assessment
FROM PatientInfo INNER JOIN Appointments ON PatientInfo.UID = Appointments.PatientFID
WHERE Appointments.ID = (SELECT TOP 1 ID FROM Appointments WHERE Appointments.PatientFID = PatientInfo.UID ORDER BY StartDateTime DESC)
;
But that is starting to look a bit contrived. On my data they both produce the same result (as they should!) and are both almost instantaneous.
Always difficult to troubleshoot Access when it crashes - I guess you see no error codes or similar? Is this against a native .accdb database or another server?

SQL code to find if a series of lists do NOT contain a particular value

I have two tables
Jobs
+-----+------+
| Job | Name |
+-----+------+
| 1 | Foo |
| 2 | Bar |
| 3 | Baz |
| 4 | Qwe |
+-----+------+
Job_Operations
+-----+--------------+
| Job | Work_Center |
+-----+--------------+
| 1 | SomeCenter |
| 1 | Full Kit |
| 2 | SomeCenter |
| 3 | SomeCenter |
| 3 | Full Kit |
+-----+--------------+
The tables are linked on the Job column. How can I find the entries in Jobs without a corresponding 'Full Kit' entry in Job_Operations?
Desired Results
+-----+------+
| Job | Name |
+-----+------+
| 2 | Bar |
| 4 | Qwe |
+-----+------+
This seems like a straightforward NOT EXISTS query:
SELECT J.*
FROM Jobs J
WHERE NOT EXISTS(SELECT *
FROM Job_Operations JO
WHERE JO.Job = J.Job
AND JO.Work_Center = 'Full Kit')
Select * from
(
select Jobs.* , job_Operations.Work_Center as wc
from Jobs
left join Job_Operations on Jobs.Job=Job_Operations.Job and Job_Operations.Work_Center='Full Kit'
) as sub1 where wc is null
In the subselect, the left join tells the SQL server to return a row for every row in the Jobs table, even if it does not find a corresponding row in Job_Operations. Only the Job_Operations rows that contain your 'Full Kit' are considered for the join. If the join finds no match, the server just returns NULL for the Job_Operations fields. The outer select then fetches only those rows where wc is null.
Another way is to use EXISTS; see how that works in the other answer. But if you want to learn SQL, try to get an understanding of how left, right, inner and outer/full joins work.
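The two anti-join forms can be compared side by side on the sample data. This is a small sketch using Python's sqlite3 module (SQLite, so not necessarily the asker's engine); the variable names are just for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Jobs (Job INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Job_Operations (Job INTEGER, Work_Center TEXT);
    INSERT INTO Jobs VALUES (1,'Foo'),(2,'Bar'),(3,'Baz'),(4,'Qwe');
    INSERT INTO Job_Operations VALUES
        (1,'SomeCenter'),(1,'Full Kit'),(2,'SomeCenter'),
        (3,'SomeCenter'),(3,'Full Kit');
""")

# LEFT JOIN restricted to 'Full Kit' rows; jobs without such a row get
# NULLs on the right-hand side and are the ones we keep.
LEFT_JOIN_SQL = """
    SELECT j.Job, j.Name
    FROM Jobs j
    LEFT JOIN Job_Operations jo
           ON jo.Job = j.Job AND jo.Work_Center = 'Full Kit'
    WHERE jo.Job IS NULL
    ORDER BY j.Job
"""

# Same question asked with NOT EXISTS.
NOT_EXISTS_SQL = """
    SELECT j.Job, j.Name
    FROM Jobs j
    WHERE NOT EXISTS (SELECT 1
                      FROM Job_Operations jo
                      WHERE jo.Job = j.Job
                        AND jo.Work_Center = 'Full Kit')
    ORDER BY j.Job
"""

left_join_rows = conn.execute(LEFT_JOIN_SQL).fetchall()
not_exists_rows = conn.execute(NOT_EXISTS_SQL).fetchall()
print(left_join_rows)
print(not_exists_rows)
```

Both queries return jobs 2 and 4 here. Job 4 has no rows at all in Job_Operations, which is why an inner join would lose it.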
Simple solution in code below.
Also keep in mind that "working" doesn't mean "high performance".
Check the execution plan on your specific DB.
select j.*
from Jobs j
where j.job not in (select jo.job
from Job_Operations jo
where jo.Work_Center = 'Full Kit');

How to select table with a concatenated column?

I have the following data:
select * from art_skills_table;
+----+------+---------------------------+
| ID | Name | skills |
+----+------+---------------------------+
| 1 | Anna | ["painting","photography"]|
| 2 | Bob | ["drawing","sculpting"] |
| 3 | Cat | ["pastel"] |
+----+------+---------------------------+
select * from computer_table;
+------+------+-------------------------+
| ID | Name | skills |
+------+------+-------------------------+
| 1 | Anna | ["word","typing"] |
| 2 | Cat | ["code","editing"] |
| 3 | Bob | ["excel","code"] |
+------+------+-------------------------+
I would like to write an SQL statement which results in the following table.
+------+------+-----------------------------------------------+
| ID | Name | skills |
+------+------+-----------------------------------------------+
| 1 | Anna | ["painting","photography","word","typing"] |
| 2 | Bob | ["drawing","sculpting","excel","code"] |
| 3 | Cat | ["pastel","code","editing"] |
+------+------+-----------------------------------------------+
I've tried something like SELECT * from art_skills_table LEFT JOIN computer_table ON name. However it doesn't give what I need. I've read about array_cat but I'm having a bit of trouble implementing it.
If the skills columns in both tables are arrays, then you should be able to get away with this:
SELECT a.ID, a.name, array_cat(a.skills, c.skills)
FROM art_skills_table a LEFT JOIN computer_table c
ON c.id = a.id
That said, while you used a LEFT join in your sample, I think either an INNER or FULL (OUTER) join might serve you better.
First, I wondered why the data is stored in such a model.
I was of the opinion that NoSQL databases lack the ability for joins and ...
... a semantic triple would be in the form of subject–predicate–object.
... key-value (KV) stores use associative arrays.
... a relational database would be normalized.
Some information about the use case would have helped.
Nevertheless, you can select the data with CONCAT and REPLACE for the desired form.
SELECT art_skills_table.ID, computer_table.name,
CONCAT(
REPLACE(art_skills_table.skills, '}',','),
REPLACE(computer_table.skills, '{','')
)
FROM art_skills_table JOIN computer_table ON art_skills_table.ID = computer_table.ID
The query returns the following result:
+----+------+--------------------------------------------+
| ID | Name | Skills |
+----+------+--------------------------------------------+
| 1 | Anna | {"painting","photography","word","typing"} |
| 2 | Cat | {"drawing","sculpting","code","editing"} |
| 3 | Bob | {"pastel","excel","code"} |
+----+------+--------------------------------------------+
I've used the ID for the JOIN, even though Bob and Cat have different IDs in the two tables, which is why their skills are paired with the wrong names above.
The JOIN should probably be done on the name:
JOIN computer_table ON art_skills_table.Name = computer_table.Name
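As a rough check of the name-based join, here is a sketch on SQLite via Python's sqlite3 module. It assumes the skills columns are plain text holding the bracketed lists exactly as shown in the question, and it uses SQLite's || operator in place of CONCAT; neither assumption is confirmed by the asker, who did not name the engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE art_skills_table (ID INTEGER, Name TEXT, skills TEXT);
    CREATE TABLE computer_table (ID INTEGER, Name TEXT, skills TEXT);
    INSERT INTO art_skills_table VALUES
        (1,'Anna','["painting","photography"]'),
        (2,'Bob','["drawing","sculpting"]'),
        (3,'Cat','["pastel"]');
    INSERT INTO computer_table VALUES
        (1,'Anna','["word","typing"]'),
        (2,'Cat','["code","editing"]'),
        (3,'Bob','["excel","code"]');
""")

# Turn the closing bracket of the first list into a comma, drop the opening
# bracket of the second list, and glue the two strings together.
MERGE_SQL = """
    SELECT a.ID, a.Name,
           REPLACE(a.skills, ']', ',') || REPLACE(c.skills, '[', '') AS skills
    FROM art_skills_table a
    JOIN computer_table c ON c.Name = a.Name
    ORDER BY a.ID
"""

merged_rows = conn.execute(MERGE_SQL).fetchall()
for row in merged_rows:
    print(row)
```

This string trick breaks if either list is empty or if a skill itself contains a bracket, so real array or JSON functions are preferable where the engine offers them.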
BTW, you need to tell us what SQL engine you're running on.

1 to Many Query: Help Filtering Results

Problem: SQL Query that looks at the values in the "Many" relationship, and doesn't return values from the "1" relationship.
Tables Example: (this shows two different tables).
+---------------+----------------------------+-------+
| Unique Number | <-- Table 1 -- Table 2 --> | Roles |
+---------------+----------------------------+-------+
| 1 | | A |
| 2 | | B |
| 3 | | C |
| 4 | | D |
| 5 | | |
| 6 | | |
| 7 | | |
| 8 | | |
| 9 | | |
| 10 | | |
+---------------+----------------------------+-------+
When I run my query, I get multiple, unique numbers that show all of the roles associated to each number like so.
+---------------+-------+
| Unique Number | Roles |
+---------------+-------+
| 1 | C |
| 1 | D |
| 2 | A |
| 2 | B |
| 3 | A |
| 3 | B |
| 4 | C |
| 4 | A |
| 5 | B |
| 5 | C |
| 5 | D |
| 6 | D |
| 6 | A |
+---------------+-------+
I would like to be able to run my query and be able to say, "When the role of A is present, don't even show me the unique numbers that have the role of A".
Maybe if SQL could look at the roles and say, WHEN role A comes up, grab unique number and remove it from column 1.
Based on what I would "like" to happen (I put that in quotations as this might not even be possible) the following is what I would expect my query to return.
+---------------+-------+
| Unique Number | Roles |
+---------------+-------+
| 1 | C |
| 1 | D |
| 5 | B |
| 5 | C |
| 5 | D |
+---------------+-------+
UPDATE:
Query Example: I am querying 8 tables, but I condensed it to 4 for simplicity.
SELECT
c.UniqueNumber,
cp.pType,
p.pRole,
a.aRole
FROM c
JOIN cp ON cp.uniqueVal = c.uniqueVal
JOIN p ON p.uniqueVal = cp.uniqueVal
LEFT OUTER JOIN a ON a.uniqueVal = p.uniqueVal
WHERE
--I do some basic filtering to get to the relevant clients data but nothing more than that.
ORDER BY
c.uniqueNumber
Table sizes: these tables can have anywhere from 50,000 rows to 500,000+
Pretending the table name is t and the column names are alpha and numb:
SELECT t.numb, t.alpha
FROM t
LEFT JOIN t AS s ON t.numb = s.numb
AND s.alpha = 'A'
WHERE s.numb IS NULL;
You can also do a subselect:
SELECT numb, alpha
FROM t
WHERE numb NOT IN (SELECT numb FROM t WHERE alpha = 'A');
Or one of the following if the subselect is materializing more than once (pick the one that is faster, i.e., the one with the smaller subtable size):
SELECT t.numb, t.alpha
FROM t
JOIN (SELECT numb FROM t GROUP BY numb HAVING SUM(alpha = 'A') = 0) AS s USING (numb);
SELECT t.numb, t.alpha
FROM t
LEFT JOIN (SELECT numb FROM t GROUP BY numb HAVING SUM(alpha = 'A') > 0) AS s USING (numb)
WHERE s.numb IS NULL;
But the first one is probably faster and better[1]. Any of these methods can be folded into a larger query with multiple additional tables being joined in.
[1] Straight joins tend to be easier to read and faster to execute than queries involving subselects. The common exceptions are exceptionally rare for self-referential joins, as they require a large mismatch in the size of the tables. You might hit those exceptions, though, if the number of rows that reference the 'A' alpha value is exceptionally small and the column is indexed properly.
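To see the self-join and the NOT IN variant agree on the question's data, here is a small sketch using Python's sqlite3 module. It keeps the pretend names t, numb and alpha from above and runs on SQLite, so it says nothing about relative speed on the asker's engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (numb INTEGER, alpha TEXT);
    INSERT INTO t VALUES
        (1,'C'),(1,'D'),(2,'A'),(2,'B'),(3,'A'),(3,'B'),
        (4,'C'),(4,'A'),(5,'B'),(5,'C'),(5,'D'),(6,'D'),(6,'A');
""")

# Self anti-join: s holds only the 'A' rows, so any numb that has an 'A'
# finds a match and is filtered out by the IS NULL test.
SELF_JOIN_SQL = """
    SELECT t.numb, t.alpha
    FROM t
    LEFT JOIN t AS s ON t.numb = s.numb AND s.alpha = 'A'
    WHERE s.numb IS NULL
    ORDER BY t.numb, t.alpha
"""

# Uncorrelated subselect listing every numb that has an 'A'.
NOT_IN_SQL = """
    SELECT numb, alpha
    FROM t
    WHERE numb NOT IN (SELECT numb FROM t WHERE alpha = 'A')
    ORDER BY numb, alpha
"""

self_join_rows = conn.execute(SELF_JOIN_SQL).fetchall()
not_in_rows = conn.execute(NOT_IN_SQL).fetchall()
print(self_join_rows)
print(not_in_rows)
```

Both return only the rows for numbers 1 and 5, matching the expected output in the question.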
There are many ways to do it, and the trade-offs depend on factors such as the size of the tables involved and what indexes are available. On general principles, my first instinct is to avoid a correlated subquery such as another, now-deleted answer proposed, but if the relationship table is small then it probably doesn't matter.
This version instead uses an uncorrelated subquery in the where clause, in conjunction with the not in operator:
select num, role
from one_to_many
where num not in (select otm2.num from one_to_many otm2 where otm2.role = 'A')
That form might be particularly effective if there are many rows in one_to_many, but only a small proportion have role A. Of course you can add an order by clause if the order in which result rows are returned is important.
There are also alternatives involving joining inline views or CTEs, and some of those might have advantages under particular circumstances.

Oracle 10 SQL: FULL JOIN through Cross Reference Table

http://sqlfiddle.com/#!4/24637/1
I have three tables (better details/data shown in the sqlfiddle link), one replacing another, and a cross reference table in between. One of the fields in each of the tables uses the cross reference (version), and another one of the fields in each of the tables is the same (changeID).
I need a query that, when passed a list of new_version values, returns each new_version + new_changeType along with the equivalent original_version + old_changeType (if there is an old version equivalent), PLUS any old changeIDs that were 'missed' in the conversion of data.
TABLES (fields on the same line are equivalent)
OLD_table | XREF_table | NEW_Table
original_version | original_version |
changeID | | changeID
OLD_changeType | |
| new_version | new_version
| | NEW_changeType
DATA
111,1,CT1 | 111,AAA | AAA,1,ONE
111,2,CT2 | 222,BBB | AAA,2,TWO
222,1,CT1 | 333,DDD | BBB,1,ONE
222,2,CT2 | | BBB,2,TWO
222,3,CT3 | | CCC,1,ONE
333,1,CT1 | |
444,1,CT1 | |
If passed the following list, the result set should look like so (order doesn't matter).
AAA,BBB,CCC
| NEW_VERSION | NEW_CHANGE_TYPE| ORIGINAL_VERSION | CHANGEID | OLD_CHANGE_TYPE |
|-------------|----------------|------------------|----------|-----------------|
| AAA | ONE | 111 | 1 | CT1 |
| AAA | TWO | 111 | 2 | CT2 |
| BBB | ONE | 222 | 1 | CT1 |
| BBB | TWO | 222 | 2 | CT2 |
| CCC | ONE | (null) | (null) | (null) |
| (null) | (null) | 222 | 3 | CT3 |
I'm having trouble getting ALL the data required. With the queries I've tried, I seem to either 1) miss a row or 2) get additional rows not matching the requirements.
The queries I've played with are as follows.
select
a.new_version,
a.Change_type,
c.original_version,
c.changeID,
c.OLD_Change_type
from NEW_TABLE a
LEFT OUTER JOIN XREF_TABLE b on a.new_version = b.new_version
FULL OUTER JOIN OLD_TABLE c on
b.original_version = c.original_version and a.changeID = c.changeID
where (b.new_version in ('AAA','BBB','CCC') or b.new_version is null);
select
a.new_version,
a.Change_type,
c.original_version,
c.changeID,
c.OLD_Change_type
from NEW_TABLE a
FULL JOIN XREF_TABLE b on a.new_version = b.new_version
FULL JOIN OLD_TABLE c on
b.original_version = c.original_version and a.changeID = c.changeID
where (a.new_version in ('AAA','BBB','CCC'));
The first returns one 'extra' row with the 333,DDD data, which is not specified from the input.
The second returns one row too few (the one with the changeID from the old table that was "missed" when this data was converted over).
Any thoughts or suggestions on how to solve this?
First inner join old_table and xref_table, as you are not interested in any old_table entries without an xref_table entry. Then full outer join new_table. In your WHERE clause be aware that new_table.new_version can be null, so use coalesce to use xref_table.new_version in this case to limit your results to AAA, BBB and CCC. That's all.
select
coalesce(n.new_version, x.new_version) as new_version,
n.change_type,
o.original_version,
o.changeid,
o.old_change_type
from old_table o
inner join xref_table x
on x.original_version = o.original_version
full outer join new_table n
on n.new_version = x.new_version
and n.changeid = o.changeid
where coalesce(n.new_version, x.new_version) in ('AAA','BBB','CCC')
order by 1,2,3,4,5
;
Here is your fiddle: http://sqlfiddle.com/#!4/24637/11.
BTW: It is better not to use arbitrary aliases like a, b and c that don't indicate which table is meant. That makes the query harder to understand. Use the table's first letter(s) or an acronym instead.
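The same join order can be tried outside Oracle. The sketch below uses Python's sqlite3 module and, because older SQLite builds have no FULL OUTER JOIN, it swaps in an emulation: a LEFT JOIN from new_table unioned with the old/xref rows that found no partner. That substitution and the CTE name ox are choices made for this illustration, not part of the answer's query; the table and column names follow the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE old_table (original_version INTEGER, changeid INTEGER,
                            old_change_type TEXT);
    CREATE TABLE xref_table (original_version INTEGER, new_version TEXT);
    CREATE TABLE new_table (new_version TEXT, changeid INTEGER,
                            change_type TEXT);
    INSERT INTO old_table VALUES
        (111,1,'CT1'),(111,2,'CT2'),(222,1,'CT1'),(222,2,'CT2'),
        (222,3,'CT3'),(333,1,'CT1'),(444,1,'CT1');
    INSERT INTO xref_table VALUES (111,'AAA'),(222,'BBB'),(333,'DDD');
    INSERT INTO new_table VALUES
        ('AAA',1,'ONE'),('AAA',2,'TWO'),('BBB',1,'ONE'),
        ('BBB',2,'TWO'),('CCC',1,'ONE');
""")

# ox = old rows that have a cross reference (the inner join step).
# Branch 1: every requested new row, with its old partner if any.
# Branch 2: requested old/xref rows with no partner in new_table.
EMULATED_FULL_JOIN_SQL = """
    WITH ox AS (
        SELECT o.original_version, o.changeid, o.old_change_type,
               x.new_version
        FROM old_table o
        JOIN xref_table x ON x.original_version = o.original_version
    )
    SELECT n.new_version, n.change_type,
           ox.original_version, ox.changeid, ox.old_change_type
    FROM new_table n
    LEFT JOIN ox ON ox.new_version = n.new_version
                AND ox.changeid = n.changeid
    WHERE n.new_version IN ('AAA','BBB','CCC')
    UNION ALL
    SELECT ox.new_version, NULL,
           ox.original_version, ox.changeid, ox.old_change_type
    FROM ox
    WHERE ox.new_version IN ('AAA','BBB','CCC')
      AND NOT EXISTS (SELECT 1
                      FROM new_table n
                      WHERE n.new_version = ox.new_version
                        AND n.changeid = ox.changeid)
    ORDER BY 1, 2
"""

full_join_rows = conn.execute(EMULATED_FULL_JOIN_SQL).fetchall()
for row in full_join_rows:
    print(row)
```

As in the answer's query, the 'missed' 222/3/CT3 row carries BBB in the first column (taken from the cross reference) rather than null, and the 333/DDD and 444 rows are excluded.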