How to query only old and duplicate data from a database in SQL

How to query only old and duplicate data from a database in SQL - sql

I'm trying to query my database to pull only duplicate/old data to write to a scratch section in excel (Using a macro passing SQL to the DB).
For now, I'm currently testing in Access alone to only filter out the old data.
First, I'm trying to filter my database by a specifed WorkOrder, RunNumber, and Row.
The code below only filters by Work Order, RunNumber, and Row. ...but SQL doesn't like when I tack on a 2nd AND statement; so this currently isn't working.
SELECT *
FROM DataPoints
WHERE (((DataPoints.[WorkOrder])=[WO2]) AND ((DataPoints.[RunNumber])=6) AND ((DataPoints.[Row]=1)
Once I figure that portion out....
Then if there is only 1 entry with specified WorkOrder, RunNumber, and Row, then I want filter it out. (its not needed in the scratch section, because its data is already written to the main section of my report)
If there are 2 or more entries with said criteria(WO, RN, and Row), then I want to filter out the newest entry based on RunDate and RunTime, and only keep all older entries.
For instance, in the clip below. The only item remaining in my filtered query will be the top entry with the timestamp 11:47:00AM.
.
Are there any recommended commands to complete this problem? Any ideas are helpful. Thank you.

I would suggest something along the lines of the following:
select t.*
from datapoints t
where
t.workorder = [WO2] and
t.runnumber = 6 and
t.row = 1 and
exists
(
select 1
from datapoints u
where
u.workorder = t.workorder and
u.runnumber = t.runnumber and
u.row = t.row and
(u.rundate > t.rundate or (u.rundate = t.rundate and u.runtime > t.runtime))
)
Here, if the correlated subquery within the where clause finds a record with the same workorder, runnumber and row, but with either a later rundate or the same rundate and a later runtime, then the record is returned by the main query.

You need two more )'s at the end of your code snippet. Or you can delete the parentheses completely in this example, MS Access will ad them back in as it deems necessary.
M.S. Access SQL can be tricky as it is not standards compliant and either doesn't allow for super complex queries, or it needs an ugly work around, like having a parentheses nesting nightmare when trying to join more than two tables.
For these reasons, I suggest using multiple Access queries to produce your results.

Related

How to run a sql query multiple times and combine the results into single output?

I have a list of 2500 obj numbers stored in Excel for which I need to run the below SQL:
SELECT
a.objno,
a.table_comment,
b.queue_comment
FROM
aq$_queue_tables a
JOIN
AQ$_QUEUES b ON a.objno = b.table_objno
WHERE
a.objno = 19551;
Is there any way I can write a loop on above SQL with objno feeding from a list or from a different table? I also want to store/produce all the results from each loop run as a single output.
I considered the option to upload the numbers into a new table and add a where condition:
a.objno=(SELECT newtab.objectno FROM newtab);
However, the logic I'll be writing in the query would exclude certain objectno results. Let's say that the associated objectno has certain queue_comment as of certain date associated with that objectno. I do not want to pull that record. This condition would match with some objectno and wouldn't match with others. Having that condition and running the query against all the objectno is returning 0 results. I couldn't share the original logic as it would reveal certain business rules and it'll be a violation of some policy.
So, I need to run the query on each objectno separately and combine the results.
I'm totally new to SQL and got this task assigned. I'm aware of the regular loop, for in SQL, but I don't think I can apply them in this situation.
Any guidance or reference links to helpful topics is much appreciated as well.
Thanks in advance for the help.

One option is to upload the object numbers from Excel sheet to a table in the database and run the query as following. Assuming newtab is the table where the objectno are uploaded.
SELECT
a.objno,
a.table_comment,
b.queue_comment
FROM
aq$_queue_tables a JOIN AQ$_QUEUES b on a.objno = b.table_objno
WHERE
a.objno IN (SELECT newtab.objectno FROM newtab);
I have used a subquery here, join to the aq$ can work as well.

Reading the comments and all I think you need to enhance your Excel with 2 additional columns and load to a new table.
IN can be used in the following way too:
SELECT
a.objno,
a.table_comment,
b.queue_comment
FROM
aq$_queue_tables a
JOIN
AQ$_QUEUES b ON a.objno = b.table_objno
WHERE
(a.objno,a.table_comment,b.queue_comment) IN (19551,'something','something');
so with the new table will be:
WHERE
(a.objno,a.table_comment,b.queue_comment) IN
(select n.objno, n.table_comment, n.queue_comment from new_table n)

SQL - LEFT JOIN and WHERE statement to show just first row

I read many threads but didn't get the right solution to my problem. It's comparable to this Thread
I have a query, which gathers data and writes it per shell script into a csv file:
SELECT
'"Dose History ID"' = d.dhs_id,
'"TxFieldPoint ID"' = tp.tfp_id,
'"TxFieldPointHistory ID"' = tph.tph_id,
...
FROM txfield t
LEFT JOIN txfielpoint tp ON t.fld_id = tp.fld_id
LEFT JOIN txfieldpoint_hst tph ON fh.fhs_id = tph.fhs_id
...
WHERE d.dhs_id NOT IN ('1000', '10000')
AND ...
ORDER BY d.datetime,...;
This is based on an very big database with lots of tables and machine values. I picked my columns of interest and linked them by their built-in table IDs. Now I have to reduce my result where I get many rows with same values and just the IDs are changed. I just need one(first) row of "tph.tph_id" with the mechanics like
WHERE "Rownumber" is 1
or something like this. So far i couldn't implement a proper subquery or use the ROW_NUMBER() SQL function. Your help would be very appreciated. The Result looks like this and, based on the last ID, I just need one row for every og this numbers (all IDs are not strictly consecutive).
A01";261511;2843119;714255;3634457;
A01";261511;2843113;714256;3634457;
A01";261511;2843113;714257;3634457;
A02";261512;2843120;714258;3634464;
A02";261512;2843114;714259;3634464;
....

I think "GROUP BY" may suit your needs.
You can group rows with the same values for a set of columns into a single row

Order by in subquery behaving differently than native sql query?

So I am honestly a little puzzled by this!
I have a query that returns a set of transactions that contain both repair costs and an odometer reading at the time of repair on the master level. To get an accurate Cost per mile reading I need to do a subquery to get both the first meter reading between a starting date and an end date, and an ending meter.
(select top 1 wf2.ro_num
from wotrans wotr2
left join wofile wf2
on wotr2.rop_ro_num = wf2.ro_num
and wotr2.rop_fac = wf2.ro_fac
where wotr.rop_veh_num = wotr2.rop_veh_num
and wotr.rop_veh_facility = wotr2.rop_veh_facility
AND ((#sdate = '01/01/1900 00:00:00' and wotr2.rop_tran_date = 0)
OR ([dbo].[udf_RTA_ConvertDateInt](#sdate) <= wotr2.rop_tran_date
AND [dbo].[udf_RTA_ConvertDateInt](#edate) >= wotr2.rop_tran_date))
order by wotr2.rop_tran_date asc) as highMeter
The reason I have the tables aliased as xx2 is because those tables are also used in the main query, and I don't want these to interact with each other except to pull the correct vehicle number and facility.
Basically when I run the main query it returns a value that is not correct; it returns the one that is second(keep in mind that the first and second have the same date.) But when I take the subquery and just copy and paste it into it's own query and run it, it returns the correct value.
I do have a work around for this, but I am just curious as to why this happening. I have searched quite a bit and found not much(other than the fact that people don't like order bys in subqueries). Talking to one of my friends that also does quite a bit of SQL scripting, it looks to us as if the subquery is ordering differently than the subquery by itsself when you have multiple values that are the same for the order by(i.e. 10 dates of 08/05/2016).
Any ideas would be helpful!
Like I said I have a work around that works in this one case, but don't know yet if it will work on a larger dataset.
Let me know if you want more code.

SQL MIN() returns multiple values?

I am using SQL server 2005, querying with Web Developer 2010, and the min function appears to be returning more than one value (for each ID returned, see below). Ideally I would like it to just return the one for each ID.
SELECT Production.WorksOrderOperations.WorksOrderNumber,
MIN(Production.WorksOrderOperations.OperationNumber) AS Expr1,
Production.Resources.ResourceCode,
Production.Resources.ResourceDescription,
Production.WorksOrderExcel_ExcelExport_View.PartNumber,
Production.WorksOrderOperations.PlannedQuantity,
Production.WorksOrderOperations.PlannedSetTime,
Production.WorksOrderOperations.PlannedRunTime
FROM Production.WorksOrderOperations
INNER JOIN Production.Resources
ON Production.WorksOrderOperations.ResourceID = Production.Resources.ResourceID
INNER JOIN Production.WorksOrderExcel_ExcelExport_View
ON Production.WorksOrderOperations.WorksOrderNumber = Production.WorksOrderExcel_ExcelExport_View.WorksOrderNumber
WHERE Production.WorksOrderOperations.WorksOrderNumber IN
( SELECT WorksOrderNumber
FROM Production.WorksOrderExcel_ExcelExport_View AS WorksOrderExcel_ExcelExport_View_1
WHERE (WorksOrderSuffixStatus = 'Proposed'))
AND Production.Resources.ResourceCode IN ('1303', '1604')
GROUP BY Production.WorksOrderOperations.WorksOrderNumber,
Production.Resources.ResourceCode,
Production.Resources.ResourceDescription,
Production.WorksOrderExcel_ExcelExport_View.PartNumber,
Production.WorksOrderOperations.PlannedQuantity,
Production.WorksOrderOperations.PlannedSetTime,
Production.WorksOrderOperations.PlannedRunTime
If you can get your head around it, I am selecting certain columns from multiple tables where the WorksOrderNumber is also contained within a subquery, and numerous other conditions.
Result set looks a little like this, have blurred out irrelevant data.
http://i.stack.imgur.com/5UFIp.png (Wouldn't let me embed image).
The highlighted rows are NOT supposed to be there, I cannot explicitly filter them out, as this result set will be updated daily and it is likely to happen with a different record.
I have tried casting and converting the OperationNumber to numerous other data types, varchar type returns '100' instead of the '30'. Also tried searching search engines, no one seems to have the same problem.
I did not structure the tables (they're horribly normalised), and it is not possible to restructure them.
Any ideas appreciated, many thanks.

The MIN function returns the minimum within the group.
If you want the minimum for each ID you need to get group on just ID.
I assume that by "ID" you are referring to Production.WorksOrderOperations.WorksOrderNumber.
You can add this as a "table" in your SQL:
(SELECT Production.WorksOrderOperations.WorksOrderNumber,
MIN(Production.WorksOrderOperations.OperationNumber)
FROM Production.WorksOrderOperations
GROUP BY Production.WorksOrderOperations.WorksOrderNumber)

SQL statement HAVING MAX(some+thing)=some+thing

I'm having trouble with Microsoft Access 2003, it's complaining about this statement:
select cardnr
from change
where year(date)<2009
group by cardnr
having max(time+date) = (time+date) and cardto='VIP'
What I want to do is, for every distinct cardnr in the table change, to find the row with the latest (time+date) that is before year 2009, and then just select the rows with cardto='VIP'.
This validator says it's OK, Access says it's not OK.
This is the message I get: "you tried to execute a query that does not include the specified expression 'max(time+date)=time+date and cardto='VIP' and cardnr=' as part of an aggregate function."
Could someone please explain what I'm doing wrong and the right way to do it? Thanks
Note: The field and table names are translated and do not collide with any reserved words, I have no trouble with the names.

Try to think of it like this - HAVING is applied after the aggregation is done.
Therefore it can not compare to unaggregated expressions (neither for time+date, nor for cardto).
However, to get the last (principle is the same for getting rows related to other aggregated functions as weel) time and date you can do something like:
SELECT cardnr
FROM change main
WHERE time+date IN (SELECT MAX(time+date)
FROM change sub
WHERE sub.cardnr = main.cardnr AND
year(date)<2009
AND cardto='VIP')
(assuming that date part on your time field is the same for all the records; having two fields for date/time is not in your best interest and also using reserved words for field names can backfire in certain cases)
It works because the subquery is filtered only on the records that you are interested in from the outer query.
Applying the same year(date)<200 and cardto='VIP' to the outer query can improve performance further.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

How to query only old and duplicate data from a database in SQL - sql

Related

How to run a sql query multiple times and combine the results into single output?

SQL - LEFT JOIN and WHERE statement to show just first row

Order by in subquery behaving differently than native sql query?

SQL MIN() returns multiple values?

SQL statement HAVING MAX(some+thing)=some+thing

Categories

Resources