Iterating through rows where rows could be "missing" - sql

I have a table with multiple rows with a column for ordering these rows. And then I have an expired column and a data column:
|ordering| expired | data |
------------------------------
| 1 |2020-12-31|whatever|
| 2 |2020-12-31|whatever|
| 3 |2020-12-31|whatever|
| 4 |2010-01-01|whatever|
| 5 |2020-12-31|whatever|
I also have a table that stores the number of the row I fetched last:
|number|
|------|
| 2 |
I read that number and add 1 to get the next row I need. Then I check whether the result is higher than the row count and go back to 1 if it is. I store the new value back into number so I can use it the next time I need a row.
This works perfectly, except that I filter the results by expired. In the above example row 4 will not be selected, so when I ask for the row with ordering=4 the result is empty. Of course this is easy to fix: if I get an empty result I can just try the next value. But that takes extra time, and a lot of extra time if there are many expired rows.
I also need to make sure that I don't lose a row if a row I previously fetched happens to expire between two calls.
What is the most effective way to iterate through rows when they could be constantly changing?

Running the query twice is possibly the best approach. But, you can also order all the results, starting with the first value after the number and wrapping around:
order by (case when t1.ordering > t2.number then 1 else 2 end),
t1.ordering
If you have 10 rows, then this is fine. If you have 10,000 then you incur the overhead of sorting all the data, and two queries are probably faster.

You could perform a query using Min(ordering):
SELECT @nextOrderingNum = Coalesce(Min(t1.ordering), 0)
FROM table1 t1
WHERE t1.expired > GETDATE()
AND t1.ordering > @prevOrderingNum;
If @nextOrderingNum is 0, set @prevOrderingNum = @nextOrderingNum (that is, 0) and run it again.
Edit: even better, instead of running twice just add a sub-query to the Coalesce:
SELECT @nextOrderingNum = Coalesce(Min(t1.ordering),
(SELECT Min(t2.ordering) FROM table1 t2
WHERE t2.expired > GETDATE()))
FROM table1 t1
WHERE t1.expired > GETDATE()
AND t1.ordering > @prevOrderingNum;
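The COALESCE variant above can be tried out with a small harness. This is only a sketch: it uses Python's sqlite3 module instead of SQL Server, passes the current date in as a parameter instead of calling GETDATE(), and the function name next_ordering is made up for illustration.

```python
import sqlite3

# Hypothetical in-memory copy of the question's table (SQLite, not SQL Server).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (ordering INTEGER PRIMARY KEY, expired TEXT, data TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [
        (1, "2020-12-31", "whatever"),
        (2, "2020-12-31", "whatever"),
        (3, "2020-12-31", "whatever"),
        (4, "2010-01-01", "whatever"),  # already expired
        (5, "2020-12-31", "whatever"),
    ],
)


def next_ordering(conn, prev, today):
    """Return the next unexpired ordering after prev, wrapping to the smallest one."""
    row = conn.execute(
        """
        SELECT COALESCE(
            MIN(t1.ordering),
            (SELECT MIN(t2.ordering) FROM table1 t2 WHERE t2.expired > :today)
        )
        FROM table1 t1
        WHERE t1.expired > :today AND t1.ordering > :prev
        """,
        {"today": today, "prev": prev},
    ).fetchone()
    return row[0]  # None when every row has expired
```

Because the lookup is driven by the last ordering value rather than by a row count, an expired row in the middle is skipped without an extra round trip, and a previously returned row that expires later does not shift the position.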

sql query count rows per id to by selecting range between 2 min dates in different columns

temp
|id|received |changed |ur|context|
|33|2019-02-18|2019-11-18|
|33|2019-08-02|2019-09-18|
|33|2019-12-27|2019-12-18|
|18|2019-07-14|2019-10-18|
|50|2019-03-20|2019-05-26|
|50|2019-01-19|2019-06-26|
temp2
|id|min_received |min_changed |
|33|2019-02-18 |2019-09-18 |
|18|2019-04-14 |2019-09-18 |
|50|2019-01-11 |2019-05-25 |
The 'temp' table shows users who received a request for an activity. A user can make multiple requests, hence the received column has multiple dates showing when the requests were received. The 'changed' column shows when the status was changed. There are also multiple values for it.
There is another table, temp2, which shows the min dates for received and changed. I need to count the total requests per user within the range of values in temp2.
The expected result should look like this. The third row of id 33 should not be selected because the received date is after the changed date.
|id|total_requests_sent|
|33|2 |
|18|1 |
|50|2 |
I tried creating two CTEs for the MIN date values and joining them with the original table.
I may be really over-simplifying your task, but wouldn't something like this work?
select
t.id, count (*) as total_requests_sent
from
temp t
join temp2 t2 on
t.id = t2.id
where
t.received between t2.min_received and t2.min_changed
group by
t.id
I believe the output will match your example on the use case you listed, but with a limited dataset it's hard to be sure.
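To check the query against the sample rows from the question, here is a minimal sketch using Python's sqlite3 module. It assumes the dates are stored as ISO text (so BETWEEN compares them correctly) and leaves out the unused ur and context columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE temp (id INTEGER, received TEXT, changed TEXT)")
conn.execute("CREATE TABLE temp2 (id INTEGER, min_received TEXT, min_changed TEXT)")

# Sample rows copied from the question.
conn.executemany(
    "INSERT INTO temp VALUES (?, ?, ?)",
    [
        (33, "2019-02-18", "2019-11-18"),
        (33, "2019-08-02", "2019-09-18"),
        (33, "2019-12-27", "2019-12-18"),
        (18, "2019-07-14", "2019-10-18"),
        (50, "2019-03-20", "2019-05-26"),
        (50, "2019-01-19", "2019-06-26"),
    ],
)
conn.executemany(
    "INSERT INTO temp2 VALUES (?, ?, ?)",
    [
        (33, "2019-02-18", "2019-09-18"),
        (18, "2019-04-14", "2019-09-18"),
        (50, "2019-01-11", "2019-05-25"),
    ],
)

# Count the requests whose received date falls inside the per-user range.
counts = dict(
    conn.execute(
        """
        SELECT t.id, COUNT(*) AS total_requests_sent
        FROM temp t
        JOIN temp2 t2 ON t.id = t2.id
        WHERE t.received BETWEEN t2.min_received AND t2.min_changed
        GROUP BY t.id
        """
    ).fetchall()
)
```

Note that an inner join with a WHERE filter drops users who have no request inside their range; a LEFT JOIN with the range test moved into the ON clause would be needed if such users should appear with a count of 0.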

SQL to return records that do not have a complete set according to a second table

I have two tables. I want to find the erroneous records in the first table based on the fact that they aren't complete set as determined by the second table. eg:
custID service transID
1 20 1
1 20 2
1 50 2
2 49 1
2 138 1
3 80 1
3 140 1
comboID combinations
1 Y00020Y00050
2 Y00049Y00138
3 Y00020Y00049
4 Y00020Y00080Y00140
So in this example I would want a query to return the first row of the first table because it does not have a matching 49 or 50 or (80 and 140), and the last two rows as well (because there is no 20). The second transaction is fine, and the second customer is fine.
I couldn't figure this out with a query, so I wound up writing a program that loads the services per customer and transid into an array, iterates over them, and ensures that there is at least one matching combination record where all the services in the combination are present in the initially loaded array. Even that came off as hamfisted, but it was less of a nightmare than the awkward outer joining of multiple joins I was trying to accomplish with SQL.
Taking a step back, I think I need to restructure the combinations table into something more accommodating, but I still can't think of what the approach would be.
I do not have DB2, so I tested on Oracle. However, the listagg function should be available there as well. The table service is the first table and comb the second one. I assume the service numbers are sorted the same way as in the combinations column.
select service.*
from service
join
(
select S.custid, S.transid
from
(
-- build 'Y' plus the service number zero-padded to five digits, e.g. 138 -> Y00138
select custid, transid, listagg('Y' || lpad(service, 5, '0')) within group(order by service) as agg
from service
group by custid, transid
) S
where not exists
(
select *
from comb
where S.agg = comb.combinations
)
) NOT_F on NOT_F.custid = service.custid and NOT_F.transid = service.transid
I dare to say that your database design does not conform to the first normal form since the combinations column is not atomic. Think about it.
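One way the restructuring could look is a table with one row per combination and service, which turns the check into a relational-division query. The sketch below is an assumption-laden illustration, not a DB2 solution: it runs on SQLite through Python, the table name combo_service is invented, and it follows the rule from the question (a transaction is fine if at least one combination has all of its services present).

```python
import re
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE service (custid INTEGER, service INTEGER, transid INTEGER)")
conn.execute("CREATE TABLE combo_service (comboid INTEGER, service INTEGER)")  # hypothetical

conn.executemany(
    "INSERT INTO service VALUES (?, ?, ?)",
    [(1, 20, 1), (1, 20, 2), (1, 50, 2), (2, 49, 1), (2, 138, 1), (3, 80, 1), (3, 140, 1)],
)

# Split each packed string such as 'Y00020Y00050' into one row per service.
combinations = {
    1: "Y00020Y00050",
    2: "Y00049Y00138",
    3: "Y00020Y00049",
    4: "Y00020Y00080Y00140",
}
for comboid, packed in combinations.items():
    for code in re.findall(r"Y(\d{5})", packed):
        conn.execute("INSERT INTO combo_service VALUES (?, ?)", (comboid, int(code)))

# Keep a service row when no combination is fully covered by its transaction.
incomplete = conn.execute(
    """
    SELECT s.custid, s.service, s.transid
    FROM service s
    WHERE NOT EXISTS (
        SELECT 1
        FROM (SELECT DISTINCT comboid FROM combo_service) k
        WHERE NOT EXISTS (           -- combination k has no missing service
            SELECT 1
            FROM combo_service c
            WHERE c.comboid = k.comboid
              AND NOT EXISTS (       -- service c is absent from the transaction
                  SELECT 1
                  FROM service s2
                  WHERE s2.custid = s.custid
                    AND s2.transid = s.transid
                    AND s2.service = c.service
              )
        )
    )
    ORDER BY s.custid, s.transid, s.service
    """
).fetchall()
```

With the sample data this returns the first row and the last two rows of the first table, which is what the question asks for. Unlike the listagg comparison, it does not depend on the order of services inside a combination.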

In Sql Server 2014 ORDER BY clause with OFFSET FETCH NEXT returns weird results

I am currently using SQL Server 2014 Professional and the current version is (12.0.4100). I have a view and I am trying to SELECT 10 rows with a specific offset. My view looks like this:
BeginTime | EndTime | Duration | Name
09:00:00.0000000|16:00:00.0000000| 1 | some_name1
09:00:00.0000000|16:00:00.0000000| 2 | some_name2
09:00:00.0000000|16:00:00.0000000| 3 | some_name3
09:00:00.0000000|16:00:00.0000000| 4 | some_name4
09:00:00.0000000|16:00:00.0000000| 5 | some_name5
09:00:00.0000000|16:00:00.0000000| 6 | some_name6
09:00:00.0000000|16:00:00.0000000| 7 | some_name7
There are 100 rows like these and all have the exact same value in BeginTime and EndTime. Duration is incremented from 1 to 100 in the related table. If the query is only:
SELECT * FROM View_Name
the ResultSet is correct. I can tell by checking the Duration column.
If I fetch only 10 rows starting from 0, the ResultSet is correct, and it stays correct for offsets up to 18. When I fetch 10 rows starting from offset 19 or more, the ResultSet contains irrelevant rows, for example with Duration reversed, and it never returns rows with a Duration greater than 11.
The query that I used to fetch specific rows is as follows:
SELECT * FROM View_Name ORDER BY BeginTime ASC OFFSET 20 ROWS FETCH NEXT 10 ROWS ONLY
There is also something strange in this situation: if I specify USE master, the problem disappears, but if I specify USE [mydb_name], the problem appears again. By the way, on my local PC I am using SQL Server 2014 Professional v(12.0.2269), and there the problem does not occur in the above situation.
PS: I cannot use USE master because I am creating and listing the view dynamically in stored procedures. Any help, answer or comment is welcome. Thank you!
The documentation explains:
To achieve stable results between query requests using OFFSET and
FETCH, the following conditions must be met:
. . .
The ORDER BY clause contains a column or combination of columns that are guaranteed to be unique.
What happens in your case is that BeginTime is not unique. Databases in general -- and SQL Server in particular -- do not implement stable sorts. A stable sort is one where the rows are in the same order when the keys are the same. This is rather obvious, because tables and result sets represent unordered sets. They have no inherent ordering.
So, you need a unique key to make the sort stable. Given your data, this would seem to be either duration, name, or both:
SELECT *
FROM View_Name
ORDER BY BeginTime ASC, Duration, Name
OFFSET 20 ROWS FETCH NEXT 10 ROWS ONLY;
Your ORDER BY should be unique, otherwise you will get non-deterministic results (in your case BeginTime is not unique, so you are not guaranteed to get the same results every time). Try changing your query to the one below to make it unique:
SELECT * FROM View_Name ORDER BY duration OFFSET 20 ROWS FETCH NEXT 10 ROWS ONLY
In addition, the result set of your first query (SELECT * FROM View_Name) is not guaranteed to come back in the same order every time unless you add an outer ORDER BY.
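The effect of a unique tiebreaker can be illustrated with a small paging sketch. It is written against SQLite through Python, so it uses LIMIT ... OFFSET where SQL Server uses OFFSET ... FETCH NEXT, and the table name view_rows and the helper fetch_page are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE view_rows (begin_time TEXT, end_time TEXT, duration INTEGER, name TEXT)"
)
# 100 rows that all share the same begin_time, as in the question.
conn.executemany(
    "INSERT INTO view_rows VALUES (?, ?, ?, ?)",
    [("09:00:00", "16:00:00", d, f"some_name{d}") for d in range(1, 101)],
)


def fetch_page(conn, offset, size=10):
    """Return the durations of one page, ordered by begin_time with duration as tiebreaker."""
    rows = conn.execute(
        "SELECT duration FROM view_rows ORDER BY begin_time, duration LIMIT ? OFFSET ?",
        (size, offset),
    ).fetchall()
    return [r[0] for r in rows]


# Walking through all pages visits every row exactly once, in a fixed order.
all_pages = [d for offset in range(0, 100, 10) for d in fetch_page(conn, offset)]
```

With only begin_time in the ORDER BY, the engine is free to return the tied rows in any order on each call, so consecutive pages can overlap or skip rows.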

Multicriteria Insert/Update

I'm trying to create a query that will insert new records to a table or update already existing records, but I'm getting stuck on the filtering and grouping for the criteria I want.
I have two tables: tbl_PartInfo, and dbo_CUST_BOOK_LINE.
I want to select from dbo_CUST_BOOK_LINE based upon the combination of CUST_ORDER_ID, CUST_ORDER_LINE_NO, and REVISION_ID. Each customer order can have multiple lines, and each line can have multiple revisions. I'm trying to select the unique combinations of each order and its connected lines, but take the connected information from the row with the highest value in the revision column.
I want to insert/update from dbo_CUST_BOOK_LINE the following columns:
CUST_ORDER_ID
PART_ID
USER_ORDER_QTY
UNIT_PRICE
I want to insert/update them into tbl_PartInfo as the following columns respectively:
JobID
DrawingNumber
Quantity
UnitPrice
So if I have the following rows in dbo_CUST_BOOK_LINE (PART_ID omitted for example)
CUST_ORDER_ID CUST_ORDER_LINE_NO REVISION_ID USER_ORDER_QTY UNIT_PRICE
SCabc 1 1 0 100
SCabc 1 2 4 150
SCabc 1 3 4 125
SCabc 2 3 2 200
SCxyz 1 1 0 0
SCxyz 1 2 3 50
It would return
CUST_ORDER_ID CUST_ORDER_LINE_NO (REVISION_ID) USER_ORDER_QTY UNIT_PRICE
SCabc 1 3 4 125
SCabc 2 3 2 200
SCxyz 1 2 3 50
but with PART_ID included and without REVISION_ID
So far, my code is just for the insert portion as I was trying to get the correct records selected, but I keep getting duplicates of CUST_ORDER_ID and CUST_ORDER_LINE_NO.
INSERT INTO tbl_PartInfo ( JobID, DrawingNumber, Quantity, UnitPrice, ProductFamily, ProductCategory )
SELECT dbo_CUST_BOOK_LINE.CUST_ORDER_ID, dbo_CUST_BOOK_LINE.PART_ID, dbo_CUST_BOOK_LINE.USER_ORDER_QTY, dbo_CUST_BOOK_LINE.UNIT_PRICE, dbo_CUST_BOOK_LINE.CUST_ORDER_LINE_NO, Max(dbo_CUST_BOOK_LINE.REVISION_ID) AS MaxOfREVISION_ID
FROM dbo_CUST_BOOK_LINE, tbl_PartInfo
GROUP BY dbo_CUST_BOOK_LINE.CUST_ORDER_ID, dbo_CUST_BOOK_LINE.PART_ID, dbo_CUST_BOOK_LINE.USER_ORDER_QTY, dbo_CUST_BOOK_LINE.UNIT_PRICE, dbo_CUST_BOOK_LINE.CUST_ORDER_LINE_NO;
This has been far more complicated than anything I've done so far, so any help would be greatly appreciated. Sorry about the long column names, I didn't get to choose them.
I did some research and think I found a way to make it work, but I'm still testing it. Right now I'm using three queries, but it should be easily simplified into two when complete.
The first is an append query that takes the two columns I want distinct combinations of and selects them using "group by," while also selecting the max of the revision column. It appends them to another table that I'm using called tbl_TempDrop. This table is only being used right now to reduce the number of results before the next part.
The second is an update query that updates tbl_TempDrop to include all the other columns I wanted by setting the criteria equal to the three selected columns from the first query. This took an EXTREMELY long time to complete when I had 700,000 records to work with, hence the use of the tbl_TempDrop.
The third query is a basic append query that appends the rows of tbl_TempDrop to the end destination, tbl_PartInfo.
All that's left is to run all three in a row.
I didn't want to include the full details of any tables or queries yet until I ensure that it works as desired, and because some of the names are vague since I will be using this method for multiple query searches.
This website helped me a little to make sure I had the basic idea down. http://www.techonthenet.com/access/queries/max_query2_2007.php
Let me know if you see any flaws with the approach!
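For comparison, the selection step (the row with the highest revision per order line) can also be written as a single join against a grouped subquery. The following is a rough sketch on SQLite through Python, not the Access queries described above. PART_ID is left out, as in the question's example rows, and the insert into tbl_PartInfo is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE dbo_CUST_BOOK_LINE (
        CUST_ORDER_ID TEXT,
        CUST_ORDER_LINE_NO INTEGER,
        REVISION_ID INTEGER,
        USER_ORDER_QTY INTEGER,
        UNIT_PRICE INTEGER
    )
    """
)
# Example rows from the question.
conn.executemany(
    "INSERT INTO dbo_CUST_BOOK_LINE VALUES (?, ?, ?, ?, ?)",
    [
        ("SCabc", 1, 1, 0, 100),
        ("SCabc", 1, 2, 4, 150),
        ("SCabc", 1, 3, 4, 125),
        ("SCabc", 2, 3, 2, 200),
        ("SCxyz", 1, 1, 0, 0),
        ("SCxyz", 1, 2, 3, 50),
    ],
)

# Join each line to its highest revision, so only the latest row per line survives.
latest = conn.execute(
    """
    SELECT b.CUST_ORDER_ID, b.CUST_ORDER_LINE_NO, b.USER_ORDER_QTY, b.UNIT_PRICE
    FROM dbo_CUST_BOOK_LINE AS b
    INNER JOIN (
        SELECT CUST_ORDER_ID, CUST_ORDER_LINE_NO, MAX(REVISION_ID) AS MaxRev
        FROM dbo_CUST_BOOK_LINE
        GROUP BY CUST_ORDER_ID, CUST_ORDER_LINE_NO
    ) AS m
        ON b.CUST_ORDER_ID = m.CUST_ORDER_ID
       AND b.CUST_ORDER_LINE_NO = m.CUST_ORDER_LINE_NO
       AND b.REVISION_ID = m.MaxRev
    ORDER BY b.CUST_ORDER_ID, b.CUST_ORDER_LINE_NO
    """
).fetchall()
```

Grouping only by the order and line columns is what avoids the duplicates: grouping by quantity and price as well, as in the original INSERT, produces one group per distinct revision.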

SQL Update each record with its position in an ordered select

I'm using Access via OleDb. I have a table with columns ID, GroupID, Time and Place. An application inserts new records into the table, but unfortunately the Place isn't calculated correctly.
I want to update each record in a group with its correct place according to its time ascending.
So assume the following data:
ID GroupId Time Place
Chuck 1 10:01 2
Alice 1 09:01 3
Bob 1 09:31 1
should result in:
ID GroupId Time Place
Chuck 1 10:01 3
Alice 1 09:01 1
Bob 1 09:31 2
I could come up with a solution using a cursor but that's AFAIK not possible in Access.
I just did a search on performing "ranking in Access" and I got this support.microsoft result.
It seems you create a query with a field that has the following expression:
Place: (Select Count(*) from table1 Where [Time] < [table1alias].[Time]) + 1
I can't test this, so I hope it works.
Using this you may be able to do (where queryAbove is the above query):
UPDATE table1
SET [Place] = queryAbove.[Place]
FROM queryAbove
WHERE table1.ID = queryAbove.ID
It's a long shot but please give it a go.
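The counting idea behind that ranking expression can be sketched as follows. This runs on SQLite through Python, not on Access, which may handle a subquery in an UPDATE differently, so treat it as an illustration of the logic only. The table name results and the column name finish_time are made up, and the times are assumed to be zero-padded HH:MM text so that string comparison matches time order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE results (id TEXT, group_id INTEGER, finish_time TEXT, place INTEGER)"
)
# Rows from the question, with the incorrect places.
conn.executemany(
    "INSERT INTO results VALUES (?, ?, ?, ?)",
    [("Chuck", 1, "10:01", 2), ("Alice", 1, "09:01", 3), ("Bob", 1, "09:31", 1)],
)

# A row's place is 1 plus the number of rows in the same group with an earlier time.
conn.execute(
    """
    UPDATE results
    SET place = (
        SELECT COUNT(*)
        FROM results AS earlier
        WHERE earlier.group_id = results.group_id
          AND earlier.finish_time < results.finish_time
    ) + 1
    """
)

places = dict(conn.execute("SELECT id, place FROM results").fetchall())
```

Rows with identical times in the same group receive the same place under this scheme, and the following place is skipped.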
I don't think Time is a number or time-formatted column; unfortunately it is a text string containing the digits and delimiters of the time format. This is why sorting on the Time column is not valid. Removing the delimiters ":" and ",", casting to integer and then sorting numerically could do the job.