Process SQL Table with no Unique Column - sql

We have a table that keeps a log of internet usage inside our company. The table is filled by software we bought, and we cannot make any changes to it. It has no unique key or index (to make writes faster, according to its developers).
I need to read the data in this table to create real-time reports of internet usage by our users.
Currently I'm reading data from this table in chunks of 1000 records. My problem is keeping track of the last record I have read, so that I can read the next 1000 records.
What is the best solution to this problem?
By the way, the software may delete earlier records as needed if the database file gets too big.

Depending on your version of SQL Server, you can use row_number(). Once the row_number() is assigned, then you can page through the records:
select *
from
(
select *,
row_number() over(order by id) rn
from yourtable
) src
where rn between 1 and 1000
Then when you want to get the next set of records, you could change the values in the WHERE clause to:
where rn between 1001 and 2000
Based on your comment that the data gets deleted, I would do the following.
First, insert the data into a temptable:
select *, row_number() over(order by id) rn
into #temp
from yourtable
Then you can select the data by row number in any block as needed.
select *
from #temp
where rn between 1 and 1000

This would also help:
declare @numRecords int = 1000 --Number of records needed per request
declare @requestCount int = 0 --Request number, starting from 0 and increasing by 1
select top (@numRecords) *
from
(
select *, row_number() over(order by id) rn
from yourtable
) T
where rn > @requestCount*@numRecords
order by rn
EDIT: As per comments
CREATE PROCEDURE [dbo].[select_myrecords]
--Number of records needed per request
@NumRecords int, --(= 1000 )
--Datetime of the LAST RECORD of previous result-set or null for first request
@LastDateTime datetime = null
AS
BEGIN
select top (@NumRecords) *
from yourtable
where LOGTime < isnull(@LastDateTime,getdate())
order by LOGTime desc
END

Without any index you cannot efficiently select the "last" records. The solution will not scale. You cannot use "real-time" and "repeated table scans of a big logging table" in the same sentence.
Actually, without any unique identification attribute for each row you cannot even determine what's new (proof: say, you had a table full of thousands of booleans. How would you determine which ones are new? They cannot be told apart! You cannot find out.). There must be something you can use, like a combination of DateTime, IP or so. Or, you can add an IDENTITY column which is likely to be transparent to the software you use.
Probably, the software you use will tolerate you creating an index on some ID or DateTime column as this is transparent to the software. It might create more load, so be sure to test it (my guess: you'll be fine).
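A minimal sketch of the "add an IDENTITY column and remember where you stopped" idea, assuming such a column can in fact be added without upsetting the logging software. It uses Python's built-in sqlite3 as a stand-in for SQL Server; the table name usage_log, the columns log_id, user_name and log_time, and the helper read_next_chunk are all hypothetical.

```python
import sqlite3


def read_next_chunk(conn, last_id, chunk_size=1000):
    """Return up to chunk_size rows whose log_id is greater than last_id, oldest first."""
    return conn.execute(
        "SELECT log_id, user_name, log_time FROM usage_log "
        "WHERE log_id > ? ORDER BY log_id LIMIT ?",
        (last_id, chunk_size),
    ).fetchall()


conn = sqlite3.connect(":memory:")
# log_id plays the role of the IDENTITY column suggested above (an assumption).
conn.execute(
    "CREATE TABLE usage_log ("
    " log_id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " user_name TEXT, log_time TEXT)"
)
conn.executemany(
    "INSERT INTO usage_log (user_name, log_time) VALUES (?, ?)",
    [(f"user{i % 7}", f"2024-01-01 00:{i // 60:02d}:{i % 60:02d}") for i in range(2500)],
)
# Simulate the logging software purging old rows; the watermark is unaffected.
conn.execute("DELETE FROM usage_log WHERE log_id <= 300")

last_id = 0          # watermark: highest log_id processed so far
chunk_sizes = []
while True:
    rows = read_next_chunk(conn, last_id)
    if not rows:
        break
    chunk_sizes.append(len(rows))
    last_id = rows[-1][0]  # remember the last id that was read
```

Because each request filters on log_id > last_id instead of numbering the whole table, deleted rows do not shift the position of the next chunk, and an index on the column lets the engine seek rather than scan.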

Related

SQL How to update every nth row which meets requirement

I have a table in which I would like to update one column on every nth row that meets a condition.
My table has many columns, but the key is Object_Id (in case this is useful for creating a temp table).
The column I'm trying to update is online_status. The table looks like the picture below, but on a bigger scale: I usually have 10 rows that share the same time and all have %Online% in them, and in total around 2000 rows with Online (and about another 2000 with Offline). I just need to update every 2-4 rows of each group of 10 that repeat.
Table picture here (for some reason the table formatting doesn't come out well): [image: Table]
What I tried so far: this pulls a list of every 3rd record that matches the criterion Online. I just need a way to update those rows, but I can't get it to work.
SELECT * FROM (SELECT *, row_number() over() rn FROM people
WHERE online_status LIKE '%Online%') foo WHERE online_status LIKE '%Online%' AND foo.rn % 3 =0
I also tried the following. However, it updated every single row, not just the ones I needed.
UPDATE people
SET online_status = 'Offline 00:00-24:00'
WHERE people.Object_id IN
(SELECT *
FROM
(SELECT people.Object_id, row_number() over() rn FROM people
WHERE online_status LIKE '%Online%') foo WHERE people LIKE '%Online%' AND foo.rn % 3 =0);
Is there a way to take the list from the SELECT above and simply update those rows? Alternatively, could I run a few scripts: one that stores the object ids in something like a temp table, and a next one that updates the main table wherever the object id matches the temp table?
Thank you for any help :)
In the subquery used by WHERE people.Object_id IN (..), select only Object_id and no other columns:
UPDATE people
SET online_status = 'Offline 00:00-24:00'
WHERE Object_id IN
( SELECT Object_id
FROM
( SELECT p.Object_id, row_number() over() rn
FROM people p
WHERE p.online_status LIKE '%Online%') foo
WHERE foo.rn % 3 = 0
);
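A small runnable sketch of the same UPDATE, using Python's sqlite3 as a stand-in engine. The sample rows are made up, and the ORDER BY object_id inside the window is an added assumption so that "every third row" is deterministic; with an empty over() the numbering order is up to the engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (object_id INTEGER PRIMARY KEY, online_status TEXT)")
# Hypothetical data: odd ids are Online, even ids are Offline.
conn.executemany(
    "INSERT INTO people (object_id, online_status) VALUES (?, ?)",
    [(i, "Online 08:00-16:00" if i % 2 else "Offline 00:00-24:00") for i in range(1, 21)],
)

conn.execute(
    """
    UPDATE people
    SET online_status = 'Offline 00:00-24:00'
    WHERE object_id IN (
        SELECT object_id              -- only the key column, not rn
        FROM (
            SELECT object_id,
                   row_number() OVER (ORDER BY object_id) AS rn
            FROM people
            WHERE online_status LIKE '%Online%'
        ) AS numbered
        WHERE rn % 3 = 0              -- every third Online row
    )
    """
)

still_online = [
    row[0]
    for row in conn.execute(
        "SELECT object_id FROM people WHERE online_status LIKE '%Online%' ORDER BY object_id"
    )
]
```

Of the ten Online rows, the third, sixth and ninth (object ids 5, 11 and 17 in this sample) are switched to Offline; the rest are left alone.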

how to create an order column in sql server

I want ordering information for every row in my table. The following is my table. The ID column is an identity column and the primary key. The Order column is computed from the ID column.
ID -- Name -- Order
1 kiwi 1
2 banana 2
3 apple 3
Everything is fine and I have an order column, but I can't switch the order of rows.
For example, I can't say that from now on kiwi's order becomes 2 and banana's order becomes 1.
In other words, if a computed column could be updated, my problem would be solved.
If I don't create the order column as a computed column, then for every new entry I have to find the largest order so that I can write (largest order) + 1 as the new entry's order. I don't want to calculate the largest number for every entry, since that is costly.
So what should I do now?
I have searched, and the solutions I found involve creating a trigger and so on. I don't want to do that either.
I might not have understood the question; I don't think it's very clear.
But why use a counter to order the set? Couldn't you just store a timestamp for each row and use that to decide which one is more recent?
CREATE TABLE dbo.Test (
ID INT IDENTITY(1,1),
Name varchar(50),
OrderTime Datetime
)
INSERT INTO dbo.TEST (Name,OrderTime)
VALUES ('kiwi',Getdate())
SELECT *
FROM dbo.TEST
ORDER BY OrderTime
If you need an integer based on the order time, you can use the ROW_NUMBER function to return one:
SELECT *,
ROW_NUMBER() OVER (ORDER BY OrderTime Desc) as OrderInt
FROM dbo.TEST

SQL Help: A query that returns a row where a later row exists

I'm trying to get a count of items in a particular component of a pipeline. Each time it moves within the pipeline, there is an entry created in this particular table.
It's stored something like this:
ID:int-pk, ObjectId:varchar(25), EventType:int, Time:DateTime
For example, I'm looking at time = 10:00.
So if object 1 has event 1 at 9 AM and event 2 at 10 AM, then I'd like to get the ObjectId (1).
Characteristics
The ObjectId is a unique id for the items going through the pipeline, so there will actually be very few rows per object (1 entry for each pipeline component, of which there are roughly 10)
Expecting ~10K inserts/day
Performance is a bit of a requirement (so EXISTS(...) might not be an option)
Hardware is solid, it's a datacenter SQL machine, but it's shared with lots of other teams/processes.
Problems I've had/what I'm trying:
It's a design right now, so I don't have actual data. I should have a proof of concept db up in a bit to test
Here's a bit of what I've been thinking of trying:
select objectid, time, eventtype
from objects
where -- can't use time < @t because I won't get the later events
group by objectid
having --
or
select objectid as oid, time, eventtype
from objects
where eventtype = 1
and time < @t
and exists (select objectid, eventtype, time
where objectid = oid -- not sure if this is legal
and eventtype = 2
and time > @t)
As you can probably tell, I don't write a whole lot of SQL so I've forgotten a bit.
Example
ID objectid eventtype time
1 12345 1 09:00 AM
2 12345 2 10:00 AM
eventtypeid description
1 "enter house"
2 "leave house"
3 "enter work"
So, the subject entered the house at 9am and left at 11am and I'm trying to see if they were in the house at 10am. 12345 is the subject's "name/number"
In this example, I'm trying to query to see if the subject was in the house at 10:00 AM. It's entirely possible that the subject entered the house but never left, and I don't want those for this query.
Questions
Am I on the right track?
How could I estimate the expected performance of the second query (assuming it works)?
Pointers? Suggestions? Examples?
Everything is appreciated.
For a given subject and a given time, you can do:
select top 1 o.*
from objects o
where eventtime < @t and
objectid = @objectid
order by eventtime desc;
Extending this to multiple objects is easiest with window functions:
select o.*
from (select o.*,
row_number() over (partition by objectid order by eventtime desc) as seqnum
from objects o
where eventtime < @t
) o
where seqnum = 1;
These both give you information about the last event before (strictly before) a given time.
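As a sketch of how the windowed query behaves, here it is run through Python's sqlite3 against made-up rows. The eventtime column name follows this answer (the question calls it Time), and the event types follow the question's example (1 = enter house, 2 = leave house, 3 = enter work). The final filter on event type 1 is an added step, assuming "in the house" means the latest event before the cutoff was an "enter house".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE objects ("
    " id INTEGER PRIMARY KEY, objectid TEXT, eventtype INTEGER, eventtime TEXT)"
)
# Hypothetical events; ISO-formatted timestamps compare correctly as text.
conn.executemany(
    "INSERT INTO objects (objectid, eventtype, eventtime) VALUES (?, ?, ?)",
    [
        ("12345", 1, "2024-01-01 09:00"),
        ("12345", 2, "2024-01-01 11:00"),
        ("67890", 1, "2024-01-01 08:00"),
        ("67890", 2, "2024-01-01 09:30"),
        ("67890", 3, "2024-01-01 09:45"),
    ],
)

query = """
SELECT objectid, eventtype, eventtime
FROM (
    SELECT o.*,
           row_number() OVER (PARTITION BY objectid ORDER BY eventtime DESC) AS seqnum
    FROM objects AS o
    WHERE eventtime < ?
) AS latest
WHERE seqnum = 1
ORDER BY objectid
"""
# Last event strictly before 10:00 for every object.
last_events = conn.execute(query, ("2024-01-01 10:00",)).fetchall()

# Objects whose most recent event before the cutoff was "enter house".
in_house = [objectid for objectid, eventtype, _ in last_events if eventtype == 1]
```

Object 12345 entered at 09:00 and had no later event before 10:00, so it counts as in the house; object 67890 had already left and entered work by then.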
It's hard to follow exactly what you're after, but returning a row only where a later row exists can be done simply in SQL Server 2012 and later using the LEAD function:
DROP TABLE #test
CREATE TABLE #test (VALUE CHAR(25))
INSERT INTO #test VALUES('abcde'),('asaf'),('dogs'),(NULL),('')
SELECT Value, LEAD(Value,1,'Last Record') OVER (ORDER BY Value)
FROM #test
SELECT *
FROM (SELECT Value, LEAD(Value,1,'Last Record') OVER (ORDER BY Value) AS Last_Flag
FROM #test
)sub
WHERE Last_Flag <> 'Last Record'
Using the test table created above, the first query pulls a value from the next row for each line (the second parameter, 1, is the offset: how many rows forward you want to look). 'Last Record' is supplied as the default value when there is no next row (the default is NULL, which is fine unless your data contains NULLs; I like to seed a value just in case). The last query then selects everything except the row that has no rows after it.
You can add a PARTITION BY clause if you want to know this for each objectID, i.e.:
LEAD(Value,1,'Last Record') OVER (PARTITION BY objectID ORDER BY Value)
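A minimal sketch of the partitioned variant, run through Python's sqlite3 (which also provides lead()). The events table and its rows are hypothetical, and this version relies on the default NULL instead of a seeded 'Last Record' value, which is safe here only because the sample timestamps contain no NULLs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (objectid TEXT, eventtime TEXT)")
conn.executemany(
    "INSERT INTO events (objectid, eventtime) VALUES (?, ?)",
    [("a", "09:00"), ("a", "10:00"), ("a", "11:00"), ("b", "09:30")],
)

rows_with_later_row = conn.execute(
    """
    SELECT objectid, eventtime
    FROM (
        SELECT objectid,
               eventtime,
               -- next event time for the same object, NULL if there is none
               lead(eventtime) OVER (PARTITION BY objectid ORDER BY eventtime) AS next_time
        FROM events
    ) AS sub
    WHERE next_time IS NOT NULL
    ORDER BY objectid, eventtime
    """
).fetchall()
```

Only the rows of object "a" that are followed by a later row survive; its final row and the single row of object "b" are filtered out.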
I am a bit confused by your SQL, but from what you describe it appears you want the most recent object in a given time frame that already exists, based on the object reference. I may be barking up the wrong tree, but it appears you want objects grouped by a common objectid that is not always unique, plus a generic type.
This may or may not help. It accounts for dupes by counting them per object. I was not sure whether you also limit scope by typ, so I left that in just in case. In the second step the rows are partitioned by obj, and the scope is optionally limited by type through a variable, in case you only care about one type. You could do this in the first step as well. If the variable is null, the type is taken to mean 'everything'. I have used similar methods in production environments, so it should be solid as long as you have indexes in the proper places, namely on the type and datetime fields. The example is self-contained and will run as is in SQL Server Management Studio 2008 and higher, using table variables that it populates itself.
declare @Object Table ( objectId int , typ varchar(2), obj varchar(8), dt datetime);
insert into @Object values (1, 'A', 'Brett', getdate() - 0.8) ,(1,'A','Sean', getdate() - 0.4),(1,'A','Brett', getdate() - 0.08),(2,'A','Michael', getdate() - 0.04)
,(2,'B','Ray', getdate() - 0.008),(3, 'B', 'Erik', getdate() - 0.004),(3, 'C', 'Ray', getdate() - 0.0001);
-- objects as they are
Select *
from @Object
;
-- Find dupe objects by two distinctions
select
obj
, count(objectId) over(partition by obj) as rowOccurencesByTyp
, count(objectId) over(partition by typ, obj) as rowOccurencesByTypAndObj
from @Object
;
-- limit scope by type
declare
-- CHANGE LINE AS NEEDED TO TEST HOW IT WORKS FOR 'A', 'B' OR NULL
@Type varchar(2) = NULL
-- Scope range of datetime too if you want
, @dt datetime
;
-- Find dupes first
with dupes as
(
select
obj
, typ
, dt
, count(objectId) over(partition by obj) as rowOccurencesByTyp
, count(objectId) over(partition by typ, obj) as rowOccurencesByTypAndObj
-- I made the Ray occurrence be in DIFFERENT types so this would be an edge case you may not want
from @Object
-- WHERE CLAUSE WOULD BE HERE WITH DATE RANGE. I was lazy in my example and made it small but you could
-- easily limit scope of dupes by a date range of 'dt between @Start and @End' or 'dt < @dt' or 'dt > @dt'
)
-- if you merely want to get the most recent objects you can do a windowed function to get them quite easily
, a as
(
select
*
, row_number() over(partition by obj order by dt desc) as rwn
-- I am finding the ranking by shared obj and then ordering by date descending (most current first).
-- You may wish to also add 'typ' before obj in the partition, as I was not sure
from dupes
where typ = isnull(@Type, typ) -- limit scope by type potentially
and rowOccurencesByTyp > 1
-- you may set up other rowOccurrences here if that suits you better.
)
select *
from a
where rwn = 1
-- recently inserted double is a dupe, determining scope of dupe is done by
-- the most recent 'rwn' finding a repeat insert of a row from part 1
-- ordered by date descending and grouped by it's object

Getting the last record in SQL in WHERE condition

I have a table loanTable that contains two fields, loan_id and status:
loan_id status
==============
1 0
2 9
1 6
5 3
4 5
1 4 <-- How do I select this??
4 6
In this situation I need to show the last status of loan_id 1, i.e. status 4. Can you please help me with this query?
Since the 'last' row for ID 1 is neither the minimum nor the maximum, you are living in a state of mild confusion. Rows in a table have no order. So, you should be providing another column, possibly the date/time when each row is inserted, to provide the sequencing of the data. Another option could be a separate, automatically incremented column which records the sequence in which the rows are inserted. Then the query can be written.
If the extra column is called status_id, then you could write:
SELECT L1.*
FROM LoanTable AS L1
WHERE L1.Status_ID = (SELECT MAX(Status_ID)
FROM LoanTable AS L2
WHERE L2.Loan_ID = 1);
(The table aliases L1 and L2 could be omitted without confusing the DBMS or experienced SQL programmers.)
As it stands, there is no reliable way of knowing which is the last row, so your query is unanswerable.
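A short sketch of the "add a sequencing column" suggestion, using Python's sqlite3 and the rows from the question. The auto-incremented status_id column is the hypothetical extra column discussed above; it does not exist in the asker's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# status_id is the assumed extra column that records insertion order.
conn.execute(
    "CREATE TABLE loanTable ("
    " status_id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " loan_id INTEGER, status INTEGER)"
)
conn.executemany(
    "INSERT INTO loanTable (loan_id, status) VALUES (?, ?)",
    [(1, 0), (2, 9), (1, 6), (5, 3), (4, 5), (1, 4), (4, 6)],
)

# Last status for one loan: highest status_id among that loan's rows.
last_status_loan_1 = conn.execute(
    "SELECT status FROM loanTable WHERE loan_id = ? ORDER BY status_id DESC LIMIT 1",
    (1,),
).fetchone()[0]

# Last status for every loan, using a correlated MAX(status_id) subquery.
last_status_per_loan = conn.execute(
    """
    SELECT l1.loan_id, l1.status
    FROM loanTable AS l1
    WHERE l1.status_id = (SELECT MAX(l2.status_id)
                          FROM loanTable AS l2
                          WHERE l2.loan_id = l1.loan_id)
    ORDER BY l1.loan_id
    """
).fetchall()
```

With the sequencing column in place, "last" has a definite meaning, and the row marked in the question (loan 1, status 4) is the one returned.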
Does your table happen to have a primary id or a timestamp? If not then what you want is not really possible.
If yes then:
SELECT TOP 1 status
FROM loanTable
WHERE loan_id = 1
ORDER BY primaryId DESC
-- or
-- ORDER BY yourTimestamp DESC
I assume that with "last status" you mean the record that was inserted most recently? AFAIK there is no way to make such a query unless you add timestamp into your table where you store the date and time when the record was added. RDBMS don't keep any internal order of the records.
If "last" means "last inserted", that is not possible with the current schema. Once a primary key (PK) is added, you can write:
select top 1 status, loan_id
from loanTable
where loan_id = 1
order by id desc -- PK
Use a data reader and keep the value from each row as you loop. Once the while loop exits, the reader is positioned past the last row, so the value has to be captured inside the loop. As the other posters stated, unless you put a sort on the query, the row order could change. Even if there is a clustered index on the table, it might not return the rows in that order (without a sort on the clustered index).
SqlDataReader rdr = SQLcmd.ExecuteReader();
string lastVal = null;
while (rdr.Read())
{
    // Overwritten on every row, so after the loop it holds the last row's value.
    lastVal = rdr[0].ToString();
}
rdr.Close();
You could also use ROW_NUMBER(), but that requires a sort, and you cannot use ROW_NUMBER() directly in the WHERE clause. You can work around that by computing it in a derived table. The rdr solution above is faster.
In oracle database this is very simple.
select * from (select * from loanTable order by rownum desc) where rownum=1
Hi, in case this has not been solved yet:
To get the last record for any field in a table, the easiest way is to add an ID to each record, say pID. Say that in your table you would like to get the last record for each 'Name'; then run this simple query:
SELECT Name, MAX(pID) as LastID
INTO [TableName]
FROM [YourTableName]
GROUP BY [Name]/[Any other field you would like your last records to appear by]
You should now have a table containing the Names in one column and the last available ID for that Name.
Now you can use a join to get the other details from your primary table, say this is some price or date then run the following:
SELECT a.*,b.Price/b.date/b.[Whatever other field you want]
FROM [TableName] a LEFT JOIN [YourTableName] b
ON a.Name = b.Name and a.LastID = b.pID
This should then give you the last record for each Name. For the first record, run the same queries as above but replace MAX with MIN.
This should be easy to follow and should run quickly as well.
If you don't have any identifying columns you could use to get the insert order, you can always do it like this. But it's hacky, and not very pretty.
select
t.row1,
t.row2,
ROW_NUMBER() OVER (ORDER BY t.[count]) AS rownum from (
select
tab.row1,
tab.row2,
1 as [count]
from table tab) t
So basically you get the 'natural order' if you can call it that, and add some column with all the same data. This can be used to sort by the 'natural order', giving you an opportunity to place a row number column on the next query.
Personally, if the system you are using hasn't got a time stamp/identity column, and the current users are using the 'natural order', I would quickly add a column and use this query to create some sort of time stamp/incremental key. Rather than risking having some automation mechanism change the 'natural order', breaking the data needed.
I think this code may help you:
WITH cte_Loans
AS
(
SELECT LoanID
,[Status]
,ROW_NUMBER() OVER(ORDER BY (SELECT 1)) AS RN
FROM LoanTable
)
SELECT LoanID
,[Status]
FROM cte_Loans L1
WHERE RN = ( SELECT max(RN)
FROM cte_Loans L2
WHERE L2.LoanID = L1.LoanID)

Fetch two next and two previous entries in a single SQL query

I want to display an image gallery, and on the view page one should be able to see a strip of thumbnails: the current picture, surrounded by the two previous entries and the two next ones.
The problem with fetching the two next/previous entries is that I can't (unless I'm mistaken) select something like MAX(id) WHERE id < xx.
Any idea?
Note: of course the ids are not consecutive, since the rows are the result of filtering with multiple WHERE conditions.
Thanks
Marshall
You'll have to forgive the SQL Server style variable names, I don't remember how MySQL does variable naming.
(SELECT *
FROM photos
WHERE photo_id = @current_photo_id)
UNION ALL
(SELECT *
FROM photos
WHERE photo_id > @current_photo_id
ORDER BY photo_id ASC
LIMIT 2)
UNION ALL
(SELECT *
FROM photos
WHERE photo_id < @current_photo_id
ORDER BY photo_id DESC
LIMIT 2);
This query assumes that you might have non-contiguous IDs. The parentheses are needed so that each ORDER BY and LIMIT applies to its own SELECT rather than to the whole UNION. It could become problematic in the long run, though, if you have a lot of photos in your table, since LIMIT is often evaluated after the entire result set has been retrieved from the database. YMMV.
In a high load scenario, I would probably use these queries, but I would also prematerialize them on a regular basis so that each photo had a PreviousPhotoOne, PreviousPhotoTwo, etc column. It's a bit more maintenance, but it works well when you have a lot of static data and need performance.
If your IDs are continuous you could do
where id >= @id-2 and id <= @id+2
Otherwise I think you'd have to union 3 queries: one to get the record with the given id and two others using top and order by inside derived tables, like this
select *
from table
where id = @id
union
select *
from (select top 2 *
from table
where id < @id
order by id desc) prev_rows
union
select *
from (select top 2 *
from table
where id > @id
order by id) next_rows
Performance will not be too bad as you aren't retrieving massive sets of data but it won't be great due to using a union.
If you find performance starts being a problem you could add columns to hold the ids of the previous and next items; calculating the ids using a trigger or overnight process or something. This will mean you only do the hard query once rather than each time you need it.
I think this method should work fine for non-contiguous IDs and should be more efficient than using UNIONs. currentID would be set either using a constant in SQL or by passing it in from your program.
SELECT * FROM photos WHERE ID = currentID OR ID IN (
SELECT ID FROM photos WHERE ID < currentID ORDER BY ID DESC LIMIT 2
) OR ID IN (
SELECT ID FROM photos WHERE ID > currentID ORDER BY ID ASC LIMIT 2
) ORDER BY ID ASC
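The query above can be tried out with Python's sqlite3, as sketched below. The sample ids and the neighbours helper are made up for illustration. SQLite accepts LIMIT inside an IN subquery; not every engine does (MySQL has historically rejected it), so treat this as a sketch of the approach rather than something portable as written.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE photos (id INTEGER PRIMARY KEY, title TEXT)")
# Deliberately non-contiguous ids.
conn.executemany(
    "INSERT INTO photos (id, title) VALUES (?, ?)",
    [(i, f"photo {i}") for i in (2, 3, 7, 8, 12, 15, 21)],
)


def neighbours(conn, current_id, span=2):
    """Return the ids of the current photo plus up to `span` photos on each side."""
    sql = """
        SELECT id FROM photos
        WHERE id = :cur
           OR id IN (SELECT id FROM photos WHERE id < :cur ORDER BY id DESC LIMIT :span)
           OR id IN (SELECT id FROM photos WHERE id > :cur ORDER BY id ASC LIMIT :span)
        ORDER BY id ASC
    """
    return [row[0] for row in conn.execute(sql, {"cur": current_id, "span": span})]


middle = neighbours(conn, 8)    # two on each side
first = neighbours(conn, 2)     # nothing before the first photo
last = neighbours(conn, 21)     # nothing after the last photo
```

At either end of the gallery the query simply returns fewer rows, so the caller does not need a special case for the first or last photo.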
If you are just interested in the previous and next records by id, couldn't you just restrict the query to id = xx-2, xx-1, xx, xx+1, xx+2, using multiple OR conditions or WHERE id IN (...)?