Fetch two next and two previous entries in a single SQL query - sql

I want to display an image gallery, and on the view page, one should be able to have a look at a bunch of thumbnails: the current picture, wrapped with the two previous entries and the two next ones.
The problem with fetching the two next/previous entries is that I can't (unless I'm mistaken) select something like MAX(id) WHERE id < xx.
Any idea?
Note: of course the ids are not consecutive, since the rows are the result of several WHERE conditions.
Thanks
Marshall

You'll have to forgive the SQL Server style variable names, I don't remember how MySQL does variable naming.
-- Parentheses make each ORDER BY ... LIMIT apply to its own branch of the UNION
(SELECT *
FROM photos
WHERE photo_id = @current_photo_id)
UNION ALL
(SELECT *
FROM photos
WHERE photo_id > @current_photo_id
ORDER BY photo_id ASC
LIMIT 2)
UNION ALL
(SELECT *
FROM photos
WHERE photo_id < @current_photo_id
ORDER BY photo_id DESC
LIMIT 2);
This query assumes that you might have non-contiguous IDs. It could become problematic in the long run, though, if you have a lot of photos in your table, since LIMIT (like TOP in SQL Server) is often evaluated after the entire result set has been retrieved from the database. YMMV.
In a high load scenario, I would probably use these queries, but I would also prematerialize them on a regular basis so that each photo had a PreviousPhotoOne, PreviousPhotoTwo, etc column. It's a bit more maintenance, but it works well when you have a lot of static data and need performance.
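To try the UNION ALL idea without a MySQL server, here is a minimal sketch using Python's built-in sqlite3 module. The photos schema and the function names make_gallery and neighbours are assumptions made up for the illustration. SQLite does not accept ORDER BY ... LIMIT directly on a UNION branch, so each limited branch is wrapped in a subquery instead of parentheses.

```python
import sqlite3

def make_gallery(ids):
    """Create an in-memory photos table (hypothetical schema) with the given ids."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE photos (photo_id INTEGER PRIMARY KEY, title TEXT)")
    conn.executemany(
        "INSERT INTO photos (photo_id, title) VALUES (?, ?)",
        [(i, f"photo {i}") for i in ids],
    )
    return conn

def neighbours(conn, current_id):
    """Return the ids of the current photo plus up to two before and two after it."""
    sql = """
        SELECT photo_id FROM (
            SELECT photo_id FROM photos WHERE photo_id = :cur
            UNION ALL
            SELECT photo_id FROM (
                SELECT photo_id FROM photos
                WHERE photo_id > :cur ORDER BY photo_id ASC LIMIT 2)
            UNION ALL
            SELECT photo_id FROM (
                SELECT photo_id FROM photos
                WHERE photo_id < :cur ORDER BY photo_id DESC LIMIT 2)
        )
        ORDER BY photo_id
    """
    return [row[0] for row in conn.execute(sql, {"cur": current_id})]
```

With ids 3, 7, 8, 15, 21, 40 in the table, asking for the neighbours of 15 gives the five thumbnails in display order, and asking at either end of the gallery simply returns fewer rows.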

If your IDs are contiguous, you could do
where id >= @id - 2 and id <= @id + 2
Otherwise I think you'd have to union three queries: one to get the record with the given id, and two others using top and order by, like this
-- "table" is a placeholder for your table name
select *
from table
where id = @id
union
select *
from (select top 2 *
      from table
      where id < @id
      order by id desc) as previous_rows
union
select *
from (select top 2 *
      from table
      where id > @id
      order by id) as next_rows
Performance will not be too bad as you aren't retrieving massive sets of data but it won't be great due to using a union.
If you find performance starts being a problem you could add columns to hold the ids of the previous and next items; calculating the ids using a trigger or overnight process or something. This will mean you only do the hard query once rather than each time you need it.
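The precomputed-neighbour idea could look roughly like this. It is only a sketch in Python with sqlite3: the items table, the prev_id and next_id columns, and the function names are all assumptions, and it stores one neighbour per side (a second step in either direction would follow the stored id again, or use extra columns).

```python
import sqlite3

def build_items(ids):
    """Create an in-memory items table (hypothetical schema) with empty neighbour columns."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE items (id INTEGER PRIMARY KEY, prev_id INTEGER, next_id INTEGER)"
    )
    conn.executemany("INSERT INTO items (id) VALUES (?)", [(i,) for i in ids])
    return conn

def refresh_neighbour_ids(conn):
    """Do the expensive lookup once for every row, e.g. from a nightly job."""
    conn.execute("""
        UPDATE items
        SET prev_id = (SELECT MAX(p.id) FROM items AS p WHERE p.id < items.id),
            next_id = (SELECT MIN(n.id) FROM items AS n WHERE n.id > items.id)
    """)
    conn.commit()

def neighbour_ids(conn, item_id):
    """Cheap primary-key lookup of the stored neighbours."""
    return conn.execute(
        "SELECT prev_id, next_id FROM items WHERE id = ?", (item_id,)
    ).fetchone()
```

The first and last rows get NULL (None in Python) for the missing side, and the refresh has to be rerun, or replaced by a trigger, whenever rows are added or deleted.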

I think this method should work fine for non-contiguous IDs and should be more efficient than using UNIONs. currentID would be set either as a constant in the SQL or passed in from your program.
SELECT * FROM photos WHERE ID = currentID OR ID IN (
SELECT ID FROM photos WHERE ID < currentID ORDER BY ID DESC LIMIT 2
) OR ID IN (
SELECT ID FROM photos WHERE ID > currentID ORDER BY ID ASC LIMIT 2
) ORDER BY ID ASC

If you are just interested in the previous and next records by id, couldn't you just restrict the query to id = xx-2, xx-1, xx, xx+1, xx+2, either with several OR conditions or with WHERE id IN (...)?

Related

How to use LIMIT and IN together to have a default row in SQL?

I am exploring SQL on the W3Schools site, and I have a requirement to limit a query to a certain number of rows while also including a default row within that limit.
Here I want a default row where the customer name is Alfreds, then grab the remaining 29 rows to complete the query, regardless of what their names are.
I tried looking at other SO questions, but they are too complicated to understand and use different syntax.
What you are looking for is a specific order clause.
Try this
SELECT * FROM Customers order by (case when CustomerName in ('Alfreds Futterkiste') then 0 else CustomerId end) limit 30 ;
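A small runnable variation on that ordering trick, sketched with Python's sqlite3. The two-column Customers table and the function names are assumptions for the example. It sorts on a 0/1 flag first and on CustomerId second, so it does not depend on every CustomerId being greater than 0.

```python
import sqlite3

def make_customers(names):
    """Create an in-memory Customers table (hypothetical, two columns only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Customers (CustomerId INTEGER PRIMARY KEY, CustomerName TEXT)"
    )
    conn.executemany(
        "INSERT INTO Customers (CustomerName) VALUES (?)", [(n,) for n in names]
    )
    return conn

def with_default_first(conn, default_name, limit):
    """Return up to `limit` names, with the default customer always sorted first."""
    sql = """
        SELECT CustomerName FROM Customers
        ORDER BY CASE WHEN CustomerName = ? THEN 0 ELSE 1 END, CustomerId
        LIMIT ?
    """
    return [row[0] for row in conn.execute(sql, (default_name, limit))]
```

Because the default row sorts ahead of everything else, it survives the LIMIT no matter where it sits in the table.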
If you're going to have a default row in SQL you should really have that row in the table with a known primary key, and then UNION it onto your query:
--default row, that is always included as long as the table has a PK 1
SELECT *
FROM Customers
WHERE CustomerId = 1
UNION ALL
--other rows, a variable number of
SELECT *
FROM Customers
WHERE CustomerId <> 1 AND ...
LIMIT 30
The LIMIT presented in this way applies to the result of the UNION.
If you ever want to union together limited sets in other combinations, you might want to look at a form like
(... LIMIT 2)
UNION ALL
(... LIMIT 28)
Use UNION to combine the two queries.
(SELECT *
FROM Customers
WHERE CustomerName != 'Alfreds Futterkiste'
LIMIT 29)
UNION
(SELECT *
FROM Customers
WHERE CustomerName = 'Alfreds Futterkiste')

Random sample table with Hive, but including matching rows

I have a large table containing a userID column and other user variable columns, and I would like to use Hive to extract a random sample of users based on their userID. Furthermore, sometimes these users will be on multiple rows and if a randomly selected userID is contained in other parts of the table I would like to extract those rows too.
I had a look at the Hive sampling documentation and I see that something like this can be done to extract a 1% sample:
SELECT * FROM source
TABLESAMPLE (1 PERCENT) s;
but I am not sure how to add the constraint where I would like all other instances of those 1% userIDs selected too.
You can use rand() to split the data randomly, with the proper percentage of userIDs in each category. I recommend rand() because setting the seed makes the results repeatable.
select c.*
from
(select userID
-- 0.1 keeps about 10% of userIDs; use 0.01 for a 1% sample
, if(rand(5555)<0.1, 'test','train') as type
from
(select userID
from mytable
group by userID
) a
) b
right outer join
(select *
from mytable
) c
on b.userID=c.userID
where b.type='test'
;
This is set up for entity level modeling purposes, which is why I have test and train as types.
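The same entity-level idea can be sketched outside Hive. Note that this swaps in a different technique: instead of a seeded rand(), it hashes each userID with a salt, so a user's membership does not depend on row order. The function names, the salt value, and the row layout (dicts with a userID key) are assumptions for illustration only.

```python
import hashlib

def in_sample(user_id, fraction, salt="5555"):
    """Deterministically decide whether a user belongs to the sample."""
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).hexdigest()
    # Map the first 32 bits of the hash to a value in [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    return bucket < fraction

def sample_rows(rows, fraction, salt="5555"):
    """Keep every row of each sampled user, so a user is never split."""
    return [row for row in rows if in_sample(row["userID"], fraction, salt)]
```

Since the decision depends only on the userID and the salt, every row of a selected user is kept, and rerunning with the same salt returns the same sample.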

Process SQL Table with no Unique Column

We have a table that keeps a log of internet usage inside our company. This table is filled by software that we bought, and we cannot make any changes to the table. It does not have a unique key or index (to make writing data faster, as its developers say).
I need to read the data in this table to create real-time reports of internet usage by our users.
Currently I'm reading data from this table in chunks of 1000 records. My problem is keeping track of the last record I have read, so that I can read the next 1000 records.
What is the best possible solution to this problem?
By the way, earlier records may get deleted by the software as needed if the database file gets big.
Depending on your version of SQL Server, you can use row_number(). Once the row_number() is assigned, then you can page through the records:
select *
from
(
select *,
row_number() over(order by id) rn
from yourtable
) src
where rn between 1 and 1000
Then when you want to get the next set of records, you could change the values in the WHERE clause to:
where rn between 1001 and 2000
Based on your comment that the data gets deleted, I would do the following.
First, insert the data into a temp table:
select *, row_number() over(order by id) rn
into #temp
from yourtable
Then you can select the data by row number in any block as needed.
select *
from #temp
where rn between 1 and 1000
This would also help:
declare @numRecords int = 1000 --Number of records needed per request
declare @requestCount int = 0 --Request number, starting from 0 and increasing by 1
select top (@numRecords) *
from
(
select *, row_number() over(order by id) rn
from yourtable
) T
where rn > @requestCount*@numRecords
order by rn --without this, TOP may return an arbitrary set of rows
EDIT: As per comments
CREATE PROCEDURE [dbo].[select_myrecords]
--Number of records needed per request
@NumRecords int, --(= 1000 )
--Datetime of the LAST RECORD of previous result-set or null for first request
@LastDateTime datetime = null
AS
BEGIN
select top (@NumRecords) *
from yourtable
where LOGTime < isnull(@LastDateTime,getdate())
order by LOGTime desc
END
Without any index you cannot efficiently select the "last" records. The solution will not scale. You cannot use "real-time" and "repeated table scans of a big logging table" in the same sentence.
Actually, without any unique identification attribute for each row you cannot even determine what's new (proof: say, you had a table full of thousands of booleans. How would you determine which ones are new? They cannot be told apart! You cannot find out.). There must be something you can use, like a combination of DateTime, IP or so. Or, you can add an IDENTITY column which is likely to be transparent to the software you use.
Probably, the software you use will tolerate you creating an index on some ID or DateTime column as this is transparent to the software. It might create more load, so be sure to test it (my guess: you'll be fine).
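For what it's worth, here is a rough sketch of paging by timestamp with an index on the time column, written with Python's sqlite3 rather than SQL Server. The usage_log table, its columns, and the function names are assumptions. It also assumes no two rows share a LOGTime at a page boundary; with ties, rows can be skipped, which is exactly why some unique combination of columns is needed.

```python
import sqlite3

def make_log(entries):
    """Create an in-memory log table (hypothetical schema) indexed on LOGTime."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE usage_log (LOGTime TEXT, username TEXT)")
    conn.execute("CREATE INDEX idx_usage_log_time ON usage_log (LOGTime)")
    conn.executemany("INSERT INTO usage_log VALUES (?, ?)", entries)
    return conn

def next_page(conn, page_size, last_time=None):
    """Return up to page_size rows older than last_time, newest first."""
    if last_time is None:
        sql = "SELECT LOGTime, username FROM usage_log ORDER BY LOGTime DESC LIMIT ?"
        params = (page_size,)
    else:
        sql = (
            "SELECT LOGTime, username FROM usage_log "
            "WHERE LOGTime < ? ORDER BY LOGTime DESC LIMIT ?"
        )
        params = (last_time, page_size)
    return conn.execute(sql, params).fetchall()
```

The caller passes the LOGTime of the last row of the previous page as last_time. Rows deleted by the logging software in the meantime do not shift the pages, which is the advantage over row_number() offsets.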

How do I check if all posts from a joined table has the same value in a column?

I'm building a BI report for a client where there is a 1-n related join involved.
The joined table has a field for employee ID (EmplId).
The query that I've built for this report is supposed to give a 1 in its field "OneEmployee" if all the related posts have the same employee in the EmplId field, null if it's different employees, i.e:
TaskTrans
TaskTransHours > EmplId: 'John'
TaskTransHours > EmplId: 'John'
This should give a 1 in the said field in the query
TaskTrans
TaskTransHours > EmplId: 'John'
TaskTransHours > EmplId: 'George'
This should leave the said field blank
The idea is to create a field where a CASE expression checks this and returns the correct value. But my problem is whether there is a way to check for this in SQL.
select not count(*) from your_table
where employee_id = GIVEN_ID
and your_field not in ( select min(your_field)
from your_table
where employee_id = GIVEN_ID);
Note: my first idea was to use LIMIT 1 in the inner query, but MySQL didn't like it, so MIN it was. The point is to pick any value, but only one. MIN should work, but the field should be indexed; then this query will execute rather fast, as only indexes would be used (obviously employee_id should also be indexed).
Note 2: Don't get confused by the NOT in front of COUNT(*). You want 1 when no row is different; I count the different ones and return NOT COUNT(*), which is 1 if the count is 0, and 0 otherwise.
Seems a job for a window COUNT():
SELECT
…,
CASE COUNT(DISTINCT TaskTransHours.EmplId) OVER () WHEN 1 THEN 1 END
AS OneEmployee
FROM …
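If the database at hand does not accept DISTINCT inside a window aggregate, a plain GROUP BY can express the same check. This is only a sketch with Python's sqlite3: the TaskId join column and the function names are assumptions, since the question does not name the key. It compares MIN(EmplId) with MAX(EmplId), which are equal exactly when all non-NULL values in the group are the same.

```python
import sqlite3

def make_db(hours):
    """Create an in-memory TaskTransHours table (TaskId is a hypothetical join key)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE TaskTransHours (TaskId INTEGER, EmplId TEXT)")
    conn.executemany("INSERT INTO TaskTransHours VALUES (?, ?)", hours)
    return conn

def one_employee_flags(conn):
    """Map each TaskId to 1 if all its rows share one EmplId, else None (NULL)."""
    sql = """
        SELECT TaskId,
               CASE WHEN MIN(EmplId) = MAX(EmplId) THEN 1 END AS OneEmployee
        FROM TaskTransHours
        GROUP BY TaskId
        ORDER BY TaskId
    """
    return dict(conn.execute(sql).fetchall())
```

The grouped result can then be joined back to TaskTrans to fill the OneEmployee field per task.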

Fetch top X users, plus a specific user (if they're not in the top X)

I have a list of ranked users, and would like to select the top 50. I also want to make sure one particular user is in this result set, even if they aren't in the top 50. Is there a sensible way to do this in a single mysql query? Or should I just check the results for the particular user and fetch him separately, if necessary?
Thanks!
If I understand correctly, you could do:
(select * from users order by rank desc limit 0, 49)
union
(select * from users where user = x)
This way you get 49 top users plus your particular user.
Regardless of whether a single, fancy SQL query could be made, the most maintainable code would probably be two queries:
select user from users where id = "fred";
select user from users where id != "fred" order by rank limit 49;
Of course "fred" (or whomever) would usually be replaced by a placeholder but the specifics depend on the environment.
declare @topUsers table(
userId int primary key,
username varchar(25)
)
insert into @topUsers
select top 50
userId,
userName
from Users
order by rank desc
insert into @topUsers
select
userID,
userName
from Users
where userID = 1234 --userID of special user
and userID not in (select userId from @topUsers) --avoid a primary key violation if already in the top 50
select * from @topUsers
The simplest solution depends on your requirements, and what your database supports.
If you don't mind the possibility of having duplicate results, then a simple union (as Mariano Conti demonstrated) is fine.
Otherwise, you could do something like
select distinct <columnlist>
from ((select * from users order by rank desc limit 0, 49)
union
(select * from users where user = x)) as combined
if your database supports it.
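Here is a single-query version that can be run locally, sketched with Python's sqlite3. The users schema and the function names are assumptions, and the column is called score here as a stand-in for the question's rank column. UNION (not UNION ALL) drops the duplicate row when the special user is already in the top N.

```python
import sqlite3

def make_users(scored):
    """Create an in-memory users table (hypothetical schema) from (name, score) pairs."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, score INTEGER)"
    )
    conn.executemany("INSERT INTO users (name, score) VALUES (?, ?)", scored)
    return conn

def top_plus_user(conn, top_n, special_id):
    """Return the names of the top_n users by score, plus the special user."""
    sql = """
        SELECT id, name, score FROM (
            SELECT id, name, score FROM users ORDER BY score DESC LIMIT :n
        )
        UNION
        SELECT id, name, score FROM users WHERE id = :special
        ORDER BY score DESC
    """
    rows = conn.execute(sql, {"n": top_n, "special": special_id}).fetchall()
    return [name for _id, name, _score in rows]
```

The result has top_n rows when the special user is already among them and top_n + 1 rows otherwise, so the caller may want to request one row fewer if the total must stay fixed.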