Select all entries that have multiple associated versions - sql

I want a SELECT statement that returns ID and Version. Each ID has multiple versions, so I want every ID in the table together with every version associated with it.
If I just do:
select ID, Version FROM Table
then I only get each ID with its most recent version; I don't get multiple rows for ID = X, one for each version associated with that ID.
Example of data I want:
ID = 1, Version = 0
ID = 1, Version = 1.0
ID = 1, Version = 2.0
ID = 2, Version = 0
ID = 2, Version = 1.0
ID = 2, Version = 2.0
ID = 2, Version = 3.0
ID = 3, Version = 0
ID = 4, Version = 0
ID = 4, Version = 1.0
etc etc

If your table has two columns, ID (see Note) and version, and a row exists for every ID/version combination, then SELECT ID, version would list all the rows.
e.g. CREATE TABLE IF NOT EXISTS details (id INTEGER, version TEXT);
And it is populated as follows :-
Then the result of SELECT id,version from details; would be :-
Or if the query were SELECT 'ID= ' ||id||', Version= ' || version AS vercol FROM details; it would be (to cater for how you have shown you want the result) :-
Note where the column named id is not an alias of rowid (i.e it is not defined as id INTEGER PRIMARY KEY or id INTEGER PRIMARY KEY AUTOINCREMENT).
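As a rough illustration of the structure described above, here is a minimal sketch using Python's sqlite3 module. The rows are copied from the example data in the question and are only an assumption about what the table actually holds; the ORDER BY is added so the output is grouped by ID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE IF NOT EXISTS details (id INTEGER, version TEXT)")

# One row per ID/version combination (assumed data, taken from the question).
rows_in = [
    (1, "0"), (1, "1.0"), (1, "2.0"),
    (2, "0"), (2, "1.0"), (2, "2.0"), (2, "3.0"),
    (3, "0"),
    (4, "0"), (4, "1.0"),
]
conn.executemany("INSERT INTO details VALUES (?, ?)", rows_in)

# Plain two-column result: every version of every ID comes back.
plain = conn.execute(
    "SELECT id, version FROM details ORDER BY id, version"
).fetchall()

# Same rows, formatted as a single text column.
formatted = [row[0] for row in conn.execute(
    "SELECT 'ID= ' || id || ', Version= ' || version AS vercol "
    "FROM details ORDER BY id, version"
)]

for line in formatted:
    print(line)
```

Because id is a plain INTEGER column here and not an alias of rowid, nothing prevents several rows from sharing the same id.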
However
If the structure and usage is not based upon the above then you will likely encounter issues.
For example the following stores the latest version (and other versions) :-
Then SELECT id, version FROM details_alt; would result in what you appear to be describing i.e (the otherversion columns are ignored in this usage):-
Manipulating such a structure, either by calculating the versions below the highest one or by including the other columns, would be quite complex. For example, this goes part of the way towards the former :-
SELECT
CASE
WHEN version < 1 THEN 'ID= '||id||', Version=0'
WHEN version < 2 THEN 'ID= '||id||', Version=0, 1.0'
WHEN version < 3 THEN 'ID= '||id||', Version=0, 1.0, 2.0'
WHEN version < 4 THEN 'ID= '||id||', Version=0, 1.0, 2.0, 3.0'
END AS idcol
FROM details_alt
and would result in :-
In short, if your table isn't structured as initially shown, which appears to be the case, you are introducing unnecessary complexity and are not following well-established database design techniques.
The answer, again if you are not using the initial structure, is to adopt a design along the lines of the one initially shown.
Otherwise you should amend your question to show the table's structure and how it is used, with an example of actual data.

It's as simple as
SELECT ID , Version FROM Table WHERE ID='X'
Here X is your ID number

Honestly, your question is a bit confusing, because
select ID, Version From table
would return all the data in the table. I think the table is big, and that's why it looks like it doesn't show all versions for each ID. Try using an ORDER BY:
select ID, Version FROM Table
order by ID, Version

Related

How to find bad references in a table in Oracle

I have a data problem I need to clean up. Basically I have two tables storing "package" information, one table for documents and one table for audit information. I have entries in the package tables that reference documents that no longer exist and have been replaced (same name but different id) and I want to write a query to find all the bad ones and which new document should replace them. The only thing linking these two is a string value in the audit table which stores the document name (not id).
I've set up a sample schema here: http://sqlfiddle.com/#!4/997bda/1
package_s is the single values for a package in our application
package_r is the repeating values for a package in our application
(these are joined with the same value in the id column)
audit_info is all the audit information in a package
docs is all the documents that can be attached to a package
This query finds the packages with bad attachments (may be more than one per package)
select distinct ps.pkgname, pr.doc_list
from package_s ps, package_r pr
where ps.id = pr.id
and not exists (
select 1 from docs
where pr.doc_list = id
)
order by 1,2 asc
;
I need to build a query with the following rules:
I need to return at least the package id, the position value and the new document id (I will build an update statement to put this new document id in the row matching the package id / position in the package_r table)
the way to get the document name from the audit information is:
SUBSTR(description,0,INSTR(description,'[')-2)
If the document was Added and then Removed, it should be ignored (string_1)
string_2 must not be 'Supporting'
the new document must match
state = 'Master'
latest = 1
pub = '0'
Right now I have a semi-working script that works on a per package basis, but the problem is affecting 2000+ packages. I find the audit entries that don't match documents correctly attached to the package and then search for those names in the document table. The problem with this is since there is no direct link between the package and document tables, if there are multiple problem attachments on one package, each "new" document is returned once per position value, i.e.
package id  bad doc id  position  new doc id
p1          d1          -1        d1-new
p1          d1          -1        d4-new
p1          d4          -2        d1-new
p1          d4          -2        d4-new
It doesn't matter which new id goes into which position value, but duplicated results like this make it hard to mass-generate update scripts; some manual filtering would be required.
This is a somewhat complex and unique data issue, so any help would be greatly appreciated.
This query works according to the information provided:
with ai as (
select a1.audited_id id, dc.id doc_id, dc.docname,
row_number() over (partition by a1.audited_id order by dc.id) rn
from audit_info a1
join docs dc
on dc.state = 'Master' and dc.latest = 1 and dc.pub = '0'
and dc.docname = substr(a1.description, 1, instr(a1.description, '[')-2)
where string_1 = 'Added' and string_2 <> 'Supporting'
and not exists (
select * from audit_info a2
where a2.audited_id = a1.audited_id and string_1 = 'Removed'
and a2.description = a1.description )
and not exists ( -- here matching docs are eliminated
select 1 from package_r pr
where pr.id = a1.audited_id and pr.doc_list = dc.id ) ),
p as (
select ps.id, ps.pkgname, pr.doc_list, pr.position,
row_number() over (partition by ps.id order by doc_list) rn
from package_s ps
join package_r pr on pr.id = ps.id
where not exists ( select * from docs where pr.doc_list = docs.id )
)
select p.id, p.pkgname, p.doc_list, p.position
, ai.docname, ai.doc_id
from p join ai on ai.id = p.id and p.rn = ai.rn
order by p.id, p.doc_list, ai.doc_id
Output:
ID  PKGNAME  DOC_LIST  POSITION  DOCNAME  DOC_ID
--  -------  --------  --------  -------  ------
p1  000001   d3        -3        doc3     d3-new
p1  000001   d4        -4        doc4     d4-new
p2  000002   d5        -2        doc5     d5-new
p4  000004   d6        -1        doc6     d6-new
Edit: Answers to issues reported in comments
"it is identifying packages that do not have bad values, and then the doc_list column is blank"
Note that the query for identifying packages (my subquery p) is basically your query; I just added a counter to it.
I guess that some process or application, or someone working manually, cleared the doc_list column in package_r.
If you don't want such entries, just add the condition and trim(doc_list) is not null in subquery p.
"for the ones it gets right on the package part (they have a bad value) it is bringing back the wrong docname/doc_id to replace the bad value with, it is a different doc_id in the list."
I understand this only partially. Can you add such entries to your examples (in Fiddle or just edit your question and add problematic input rows and expected output for them?)
"It doesn't matter which new id goes into which position value".
I made the assignment this way: if we had two old docs named "ABC" and "DEF", and the corrected docs were named "XXA" and "DE12",
then they would be linked as "ABC"->"DE12" and "DEF"->"XXA" (alphabetical ordering seems more rational than totally random).
To make the assignment random, change order by ... to order by null in both row_number() functions.
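The pairing idea behind the two row_number() counters can be sketched in isolation. This is a minimal illustration using Python's sqlite3 module (window functions need SQLite 3.25 or later); the tables bad and fix and their columns are hypothetical stand-ins, not the schema from the fiddle.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE bad (pkg TEXT, doc TEXT, position INTEGER);  -- hypothetical: broken attachments
CREATE TABLE fix (pkg TEXT, new_doc TEXT);                -- hypothetical: replacement candidates
INSERT INTO bad VALUES ('p1', 'd1', -1), ('p1', 'd4', -2);
INSERT INTO fix VALUES ('p1', 'd1-new'), ('p1', 'd4-new');
""")

# Joining on the package alone pairs every bad doc with every replacement.
cross = conn.execute("""
    SELECT b.pkg, b.doc, b.position, f.new_doc
    FROM bad b JOIN fix f ON f.pkg = b.pkg
""").fetchall()

# Numbering both sides per package and joining on that number as well
# pairs the rows one-to-one.
paired = conn.execute("""
    WITH b AS (SELECT pkg, doc, position,
                      ROW_NUMBER() OVER (PARTITION BY pkg ORDER BY doc) AS rn
               FROM bad),
         f AS (SELECT pkg, new_doc,
                      ROW_NUMBER() OVER (PARTITION BY pkg ORDER BY new_doc) AS rn
               FROM fix)
    SELECT b.pkg, b.doc, b.position, f.new_doc
    FROM b JOIN f ON f.pkg = b.pkg AND f.rn = b.rn
    ORDER BY b.pkg, b.doc
""").fetchall()

print(len(cross), paired)
```

The first query reproduces the four-row duplication described in the question; the second returns one replacement per bad position.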

Get Only One Row For Each ID With the Highest Value

I have a query
SELECT
*
FROM
mgr.MF_AGREEMENT_LGR TABLE1
INNER JOIN
(SELECT
MAX(VALUE_DATE) AS VALUE_DATE,
REGISTRATION_NO AS REGISTRATION_NO
FROM
mgr.MF_AGREEMENT_LGR
GROUP BY
REGISTRATION_NO) AS TABLE2 ON TABLE1.REGISTRATION_NO = TABLE2.REGISTRATION_NO
WHERE
TABLE1.VALUE_DATE = TABLE2.VALUE_DATE
AND TABLE1.TRX_CODE = 'LCLR'
ORDER BY
TABLE1.REGISTRATION_NO
This returns the rows with the latest date for each REGISTRATION_NO. Some REGISTRATION_NO values return three or more rows because they have more than one transaction on the same date.
Also, each row has its DOC_NO field.
My question is: how can I get only one row for each REGISTRATION_NO, the one with the highest DOC_NO?
By the way, DOC_NO is a varchar.
Example values for this field are: Amort 1, Amort 12, Amort 5
If those examples belong to one REGISTRATION_NO, I only need the row with the highest amort, which is Amort 12.
I am using SQL Server 2000.
SQL Server 2000 has not been supported in years. You really should upgrade to supported software.
You can get what you want with not exists:
SELECT al.*
FROM mgr.MF_AGREEMENT_LGR al
WHERE NOT EXISTS (SELECT 1
FROM mgr.MF_AGREEMENT_LGR al2
WHERE al2.registration_no = al.registration_no and
(al2.VALUE_DATE > al.VALUE_DATE or
al2.VALUE_DATE = al.VALUE_DATE and al2.DOC_NO > al.DOC_NO
)
) AND
al.TRX_CODE = 'LCLR';
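You probably want the condition on 'LCLR' in the subquery as well. However, that is not in your original query, so I'm leaving it out. Also note that DOC_NO is a varchar, so the comparison on DOC_NO is a string comparison: 'Amort 5' sorts after 'Amort 12'. To rank by the number, the numeric part would have to be extracted and cast to an integer.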
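Here is a minimal sketch of the NOT EXISTS pattern with a numeric comparison on DOC_NO, run through Python's sqlite3 module rather than SQL Server. The table name lgr, the sample rows and the assumption that every DOC_NO starts with the six-character prefix 'Amort ' are all illustrative and not taken from the real data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE lgr (registration_no TEXT, value_date TEXT, doc_no TEXT, trx_code TEXT);
INSERT INTO lgr VALUES
  ('R1', '2015-01-10', 'Amort 1',  'LCLR'),
  ('R1', '2015-02-10', 'Amort 5',  'LCLR'),
  ('R1', '2015-02-10', 'Amort 12', 'LCLR'),
  ('R2', '2015-03-01', 'Amort 3',  'LCLR');
""")

# Keep a row only if no other row for the same registration has a later date,
# or the same date and a higher amort number.
# SUBSTR(doc_no, 7) drops the assumed 'Amort ' prefix; the cast makes 12 > 5.
query = """
SELECT a.registration_no, a.doc_no
FROM lgr a
WHERE a.trx_code = 'LCLR'
  AND NOT EXISTS (
    SELECT 1
    FROM lgr b
    WHERE b.registration_no = a.registration_no
      AND (b.value_date > a.value_date
           OR (b.value_date = a.value_date
               AND CAST(SUBSTR(b.doc_no, 7) AS INTEGER)
                 > CAST(SUBSTR(a.doc_no, 7) AS INTEGER))))
ORDER BY a.registration_no
"""
latest = conn.execute(query).fetchall()
print(latest)
```

On SQL Server the same cast would presumably be written with SUBSTRING and CAST(... AS INT); that translation is untested here.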

MSSQLSRV - filtering out results with duplicate row

I'm having a frustrating issue with SQL Server. I need to create a view from a table containing details of files loaded through ETL. The table contains a file id (unique), filename, serverid (relating to the server it has been loaded onto).
The first 2 letters of the filename is a country code, i.e. US, UK, GB, DE - there are multiple files loaded per country. I want to get the record with the highest file id for each country. The below query does this but it returns the highest record PER SERVER, so there may be multiple file ids - i.e. it would return the highest file id for that country on server1 and server2 - I only want the highest record full stop.
I've played with an equivalent query on MySQL and got it working by commenting out the last line (GROUP BY t.[server_id]), which seemed to work fine, but of course MSSQLSRV needs all non-aggregates in the SELECT to be placed in the GROUP BY statement.
So, how can I get the same result in SQL Server - i.e. get one result, with the highest file_id, without getting a duplicate row for a different server_id?
Hope I'm making myself clear.
SELECT MAX(t.[file_id]) AS FID
,LEFT(t.[full_file_name], 2) AS COUNTRYCODE
,t.[server_id]
FROM [tracking_files] t
WHERE t.server_id IS NOT NULL
AND t.[server_id] = (
SELECT TOP 1 [server_id]
FROM [tracking_files] md
WHERE md.[file_id] = t.file_id
)
GROUP BY LEFT(t.[full_file_name], 2)
,t.[server_id]
EDIT:
Here is the sample data I've been playing with in MySQL, along with the result I got (which is the desired result).
In SQL Server, as I can't comment out that last GROUP BY clause, we're seeing e.g. two file_ids for GB (one for server 1 and one for server 2)
If you are using SQL Server 2005 or later you can use ROW_NUMBER():
SELECT t.File_ID,
t.full_file_name,
t.CountryCode,
t.Server_ID
FROM ( SELECT t.[File_ID],
t.full_file_name,
CountryCode = LEFT(t.full_file_name, 2),
t.Server_ID,
RowNumber = ROW_NUMBER() OVER(PARTITION BY LEFT(t.full_file_name, 2) ORDER BY [File_ID] DESC)
FROM [tracking_files] t
) t
WHERE t.RowNumber = 1;
If you are using a previous version you will need to use a subquery to get the maximum file ID per country code, then join back to your main table:
SELECT t.[File_ID],
t.full_file_name,
CountryCode = LEFT(t.full_file_name, 2),
t.Server_ID
FROM [tracking_files] t
INNER JOIN
( SELECT MaxFileID = MAX([File_ID])
FROM [tracking_files] t
GROUP BY LEFT(t.full_file_name, 2)
) MaxT
ON MaxT.MaxFileID = t.[File_ID];
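The join-back approach can be tried on a small data set. This is a minimal sketch using Python's sqlite3 module, with SUBSTR standing in for LEFT; the file names and server ids are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tracking_files (
    file_id INTEGER PRIMARY KEY, full_file_name TEXT, server_id INTEGER);
INSERT INTO tracking_files VALUES
  (1, 'GB_a.csv', 1), (2, 'GB_b.csv', 2), (3, 'US_a.csv', 1),
  (4, 'GB_c.csv', 1), (5, 'DE_a.csv', 2), (6, 'US_b.csv', 2);
""")

# The derived table finds the highest file_id per country code; joining back
# on file_id picks up server_id without having to group by it.
top_per_country = conn.execute("""
    SELECT t.file_id,
           SUBSTR(t.full_file_name, 1, 2) AS country_code,
           t.server_id
    FROM tracking_files t
    JOIN (SELECT MAX(file_id) AS max_file_id
          FROM tracking_files
          GROUP BY SUBSTR(full_file_name, 1, 2)) m
      ON m.max_file_id = t.file_id
    ORDER BY country_code
""").fetchall()

print(top_per_country)
```

GB has files on both servers in this sample, yet only one GB row comes back, because server_id is no longer part of the grouping.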

Selecting Maximum Version Number from Two Columns

This relates to another question I asked previously. You may have a better understanding of this if you quickly scan it.
Version Numbers float, decimal or double
I have two columns and a foreign key in a database table: a [Version] column and a [Revision] column. Together these make up a version number, e.g. Version 1, Revision 2 = v1.2
What I need to do is grab the maximum version number for a particular foreign key.
Here's what I have so far:
SELECT f.[pkFileID]
,x.[fkDocumentHeaderID]
,f.[fkDocumentID]
,x.[Version]
,x.[Revision]
,f.[FileURL]
,f.[UploadedBy]
,f.[UploadedDate]
FROM
(
SELECT
docs.[fkDocumentHeaderID]
,MAX([Version]) AS Version
,MAX([Revision]) AS Revision
FROM
[ClinicalGuidanceV2].[dbo].[tbl_DocumentFiles]
INNER JOIN
dbo.tbl_Documents docs ON [fkDocumentID] = [pkDocumentID]
GROUP BY
docs.[fkDocumentHeaderID]
)
AS x
INNER JOIN
dbo.tbl_DocumentFiles f ON
f.[fkDocumentHeaderID] = x.[fkDocumentHeaderID] AND
f.[Version] = x.[Version] AND
f.[Revision] = x.[Revision]
Basically, I'm grabbing the maximum and joining back to the same table. This obviously doesn't work, because if I have version numbers 1.1, 1.2 and 2.0, the maximum value returned by the above query is 2.2 (which doesn't exist).
What I need to do (I think) is select the maximum [Version] and then select the maximum [Revision] for that [Version] but I can't quite figure how to do this.
Any help, suggestions, questions are all welcome.
Thanks.
You could change it to
SELECT f.[pkFileID]
,x.[fkDocumentHeaderID]
,f.[fkDocumentID]
,x.[Version]
,x.[Revision]
,f.[FileURL]
,f.[UploadedBy]
,f.[UploadedDate]
FROM (
SELECT docs.[fkDocumentHeaderID]
,MAX([Version] * 100000 + [Revision]) AS [VersionRevision]
FROM [ClinicalGuidanceV2].[dbo].[tbl_DocumentFiles]
INNER JOIN dbo.tbl_Documents docs
ON [fkDocumentID] = [pkDocumentID]
GROUP BY
docs.[fkDocumentHeaderID]
)AS x
INNER JOIN dbo.tbl_DocumentFiles f
ON f.[fkDocumentHeaderID] = x.[fkDocumentHeaderID]
AND f.[Version] * 100000 + f.[Revision] = x.[VersionRevision]
The idea is to multiply the Version by a constant large enough that it never collides with the Revision (I have used 100,000, but any value greater than the largest possible Revision would do).
After that, your JOIN does the same to retrieve the record.
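The difference between taking the two maxima independently and taking the maximum of a combined key can be checked with the version numbers from the question (1.1, 1.2 and 2.0). This is a minimal sketch using Python's sqlite3 module; the single table files and its column names are simplified stand-ins for the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE files (header_id INTEGER, version INTEGER, revision INTEGER);
INSERT INTO files VALUES (1, 1, 1), (1, 1, 2), (1, 2, 0);
""")

# Independent maxima combine the highest version with the highest revision,
# even though no row holds both values.
independent = conn.execute(
    "SELECT MAX(version), MAX(revision) FROM files GROUP BY header_id"
).fetchone()

# A single combined key keeps version and revision from the same row.
combined = conn.execute("""
    SELECT f.version, f.revision
    FROM files f
    JOIN (SELECT header_id, MAX(version * 100000 + revision) AS vr
          FROM files
          GROUP BY header_id) x
      ON x.header_id = f.header_id
     AND f.version * 100000 + f.revision = x.vr
""").fetchone()

print(independent, combined)
```

The first result is the non-existent 2.2; the second is the real latest row, 2.0.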
The below should work to extract the top revision.
SELECT TOP 1 f.[pkFileID]
,x.[fkDocumentHeaderID]
,f.[fkDocumentID]
,x.[Version]
,f.[Revision]
,f.[FileURL]
,f.[UploadedBy]
,f.[UploadedDate]
FROM
(
SELECT
docs.[fkDocumentHeaderID]
,MAX([Version]) AS Version
-- Comment this out ,MAX([Revision]) AS Revision
FROM
[ClinicalGuidanceV2].[dbo].[tbl_DocumentFiles]
INNER JOIN
dbo.tbl_Documents docs ON [fkDocumentID] = [pkDocumentID]
GROUP BY
docs.[fkDocumentHeaderID]
)
AS x
INNER JOIN
dbo.tbl_DocumentFiles f ON
f.[fkDocumentHeaderID] = x.[fkDocumentHeaderID] AND
f.[Version] = x.[Version]
ORDER BY f.[Revision] DESC
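Namely, it extracts only the records with the max version into table x, then orders these records by revision in descending order and takes the topmost one. Note that TOP 1 returns a single row for the whole result, so to get the latest file for a particular header you would add a WHERE clause on fkDocumentHeaderID.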

Mysql many to many query

Having a mental block with going around this query.
I have the following tables:
review_list: has most of the data, but in this case the only important thing is review_id, the id of the record that I am currently interested in (int)
variant_list: model (varchar), enabled (bool)
variant_review: model (varchar), id (int)
variant_review is a many-to-many table linking the review_id in review_list to the model(s) in variant_list, and contains (e.g.):
..
test1,22
test2,22
test4,22
test1,23
test2,23... etc
variant_list is a list of all possible models and whether they are enabled and contains (eg):
test1,TRUE
test2,TRUE
test3,TRUE
test4,TRUE
What I am after in MySQL is a query that, when given a review_id (e.g. 22), returns a result set listing each value in variant_list.model and whether it is present for the given review_id, such as:
test1,1
test2,1
test3,0
test4,1
or similar, which I can farm off to some webpage with a list of checkboxes for the types. This would show all the models available and whether each one was present in the table
Given a bit more information about the column names:
Select variant_list.model
, Case When variant_review.model Is Not Null Then 1 Else 0 End As HasReview
From variant_list
Left join variant_review
On variant_review.model = variant_list.model
And variant_review.review_id = 22
Just for completeness, if it is the case that you can have multiple rows in the variant_review table with the same model and review_id, then you need to do it differently:
Select variant_list.model
, Case
When Exists (
Select 1
From variant_review As VR
Where VR.model = variant_list.model
And VR.review_id = 22
) Then 1
Else 0
End As HasReview
From variant_list
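The left-join version can be run against the sample data from the question. This is a minimal sketch using Python's sqlite3 module instead of MySQL; the helper name models_for_review is hypothetical, and the review column is assumed to be called review_id as in the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE variant_list (model TEXT, enabled INTEGER);
CREATE TABLE variant_review (model TEXT, review_id INTEGER);
INSERT INTO variant_list VALUES
  ('test1', 1), ('test2', 1), ('test3', 1), ('test4', 1);
INSERT INTO variant_review VALUES
  ('test1', 22), ('test2', 22), ('test4', 22), ('test1', 23), ('test2', 23);
""")


def models_for_review(review_id):
    """Return every model with 1 if it is linked to review_id, else 0."""
    # The review_id filter sits in the ON clause, so models with no matching
    # row are kept by the LEFT JOIN and get a NULL, which becomes 0.
    return conn.execute("""
        SELECT vl.model,
               CASE WHEN vr.model IS NOT NULL THEN 1 ELSE 0 END AS has_review
        FROM variant_list vl
        LEFT JOIN variant_review vr
          ON vr.model = vl.model
         AND vr.review_id = ?
        ORDER BY vl.model
    """, (review_id,)).fetchall()


print(models_for_review(22))
```

Moving the review_id condition into a WHERE clause would drop the unmatched models, which is why it belongs in the join condition.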