changing sorting criteria after the first result - sql

I am selecting from a database of news articles, and I'd prefer to do it all in one query if possible. In the results, I need a sorting criterion that applies ONLY to the first result.
In my case, the first result must have an image, but the others should be sorted without caring about their image status.
Is this something I can do with some sort of conditionals or user variables in a MySQL query?

Even if you manage to write something that looks like one query, it is logically going to be two. If you really must have a single statement, have a look at MySQL's UNION: select one article with an image (LIMIT 1) in the first part and the remaining articles in the second part.
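A minimal sketch of the UNION idea, run against an in-memory SQLite database through Python's sqlite3 module as a stand-in for MySQL. The table layout, the has_image column and the pinned helper column are assumptions made for illustration; they are not taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE news (id INTEGER PRIMARY KEY, title TEXT, newsdate TEXT, has_image TEXT);
    INSERT INTO news VALUES
        (1, 'old, with image',   '2009-01-01', 'Y'),
        (2, 'newer, no image',   '2009-01-03', 'N'),
        (3, 'newer, with image', '2009-01-02', 'Y'),
        (4, 'newest, no image',  '2009-01-04', 'N');
""")

# First branch: the newest article that has an image (pinned = 0).
# Second branch: every other article (pinned = 1).
# COALESCE(..., -1) keeps the second branch working when no article
# has an image at all (assumes ids are never negative).
query = """
    SELECT id, title, newsdate, 0 AS pinned
    FROM (SELECT * FROM news WHERE has_image = 'Y' ORDER BY newsdate DESC LIMIT 1)
    UNION ALL
    SELECT id, title, newsdate, 1 AS pinned
    FROM news
    WHERE id <> COALESCE(
        (SELECT id FROM news WHERE has_image = 'Y' ORDER BY newsdate DESC LIMIT 1), -1)
    ORDER BY pinned, newsdate DESC
"""
rows = conn.execute(query).fetchall()
ordered_ids = [row[0] for row in rows]
print(ordered_ids)
```

The subquery that finds the pinned article appears twice, which is the "two logical queries" point in practice: the statement is single, the work is not.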

Something like this ensures an article with an image on the top.
SELECT
    id,
    title,
    newsdate,
    article
FROM
    news
ORDER BY
    CASE WHEN HasImage = 'Y' THEN 0 ELSE 1 END,
    newsdate DESC
Unless you define "the first result" more precisely, of course. This query puts all articles with images first; articles without images appear at the end.
Another variant (thanks to le dorfier, who deleted his answer for some reason) would be this:
SELECT
    id,
    title,
    newsdate,
    article
FROM
    news
ORDER BY
    CASE WHEN id = (
        SELECT MIN(id) FROM news WHERE HasImage = 'Y'
    ) THEN 0 ELSE 1 END,
    newsdate DESC
This sorts the earliest (assuming MIN(id) means "earliest") article with an image to the top.

I don't think it's possible in a single pass, as it's effectively two queries (one that sorts the table to find the first result, and a second for the rest, which doesn't need that ordering), so you might as well run two queries with a LIMIT 1 on the first.
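For comparison, the two-query approach could look roughly like this. It is only a sketch using Python's sqlite3 module with an in-memory SQLite database in place of MySQL; the front_page function, the has_image column and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE news (id INTEGER PRIMARY KEY, title TEXT, newsdate TEXT, has_image TEXT);
    INSERT INTO news VALUES
        (1, 'old, with image',   '2009-01-01', 'Y'),
        (2, 'newer, no image',   '2009-01-03', 'N'),
        (3, 'newer, with image', '2009-01-02', 'Y'),
        (4, 'newest, no image',  '2009-01-04', 'N');
""")

def front_page(conn, limit=10):
    """Return up to `limit` rows: newest article with an image first, then the rest by date."""
    # Query 1: the single lead article, which must have an image.
    lead = conn.execute(
        "SELECT id, title, newsdate FROM news "
        "WHERE has_image = 'Y' ORDER BY newsdate DESC LIMIT 1"
    ).fetchall()
    # -1 is a sentinel for "no lead found" (assumes ids are never negative).
    lead_id = lead[0][0] if lead else -1
    # Query 2: everything else, sorted without regard to images.
    rest = conn.execute(
        "SELECT id, title, newsdate FROM news "
        "WHERE id <> ? ORDER BY newsdate DESC LIMIT ?",
        (lead_id, limit - len(lead)),
    ).fetchall()
    return lead + rest

page_ids = [row[0] for row in front_page(conn)]
short_ids = [row[0] for row in front_page(conn, limit=2)]
print(page_ids, short_ids)
```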

Related

I need to query a database based on keywords

I need to query a PostgreSQL database where keywords are stored in the same row as the data I am trying to query. If the search matches a row's keyword, that row should be more likely, but not guaranteed, to be returned. I want to fetch about 10 items at a time, but I'm pretty sure I know how to do that (select top 10). So basically, if the keyword is present, the row is more likely but not guaranteed to be returned. How do I do this?
I have a year of experience as a database developer but I don't know how to solve this problem. I would also be open to switching software if there are better suggestions. Thanks!!
So for example if the user searches on Apples then Data2 is more likely, but not guaranteed to be queried.
You want to select 10 rows, preferring those matching the keyword. So, order by match, then restrict to ten rows:
select *
from mytable
order by
    case when keyword1 = 'Apples' then 0 else 1 end +
    case when keyword2 = 'Apples' then 0 else 1 end +
    case when keyword3 = 'Apples' then 0 else 1 end
fetch first 10 rows only;
Demo: https://dbfiddle.uk/?rdbms=postgres_8.4&fiddle=34758b94fe725f7f51a476e80c97187c
A row with a matching keyword is more likely, but not guaranteed, to be selected, because the query picks ten rows and makes arbitrary choices in case of ties. The linked demo shows one situation with fewer than ten matches and one with more than ten.
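The same ordering can be tried locally. The sketch below uses Python's sqlite3 module, so LIMIT replaces FETCH FIRST, and it adds id as a final sort key so that ties are broken predictably instead of arbitrarily. The id and data columns, the sample rows and the search helper are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (id INTEGER PRIMARY KEY, data TEXT, "
    "keyword1 TEXT, keyword2 TEXT, keyword3 TEXT)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Data1", "Pears", "Plums", None),
        (2, "Data2", "Apples", "Pears", "Apples"),
        (3, "Data3", None, "Apples", None),
        (4, "Data4", "Kiwis", None, None),
    ],
)

def search(keyword, limit):
    """Return up to `limit` data values, rows with more keyword matches first."""
    # Each CASE contributes 0 for a match and 1 otherwise, so a lower sum
    # means more matches. A NULL keyword never matches and counts as 1.
    query = """
        SELECT data
        FROM mytable
        ORDER BY
            CASE WHEN keyword1 = :kw THEN 0 ELSE 1 END +
            CASE WHEN keyword2 = :kw THEN 0 ELSE 1 END +
            CASE WHEN keyword3 = :kw THEN 0 ELSE 1 END,
            id
        LIMIT :n
    """
    return [row[0] for row in conn.execute(query, {"kw": keyword, "n": limit})]

top_three = search("Apples", 3)
top_two = search("Apples", 2)
print(top_three, top_two)
```

Whether a stable tiebreaker is wanted depends on the use case; if the selection among ties is supposed to vary, leave it out.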

What is "Select -1", and how is it different from "Select 1"?

I have the following query that is part of a common table expression. I don't understand the function of the "Select -1" statement. It is obviously different than the "Select 1" that is used in "EXISTS" statements. Any ideas?
select days_old,
       count(express_cd),
       count(*),
       case
           when round(count(express_cd)*100.0/count(*),2) < 1 then '0'
           else ''
       end ||
       cast(decimal(round(count(express_cd)*100.0/count(*),2),5,2) as varchar(7)) ||
       '%'
from foo.bar
group by days_old
union all
select -1, -- Selecting the -1 here
       count(express_cd),
       count(*),
       case
           when round(count(express_cd)*100.0/count(*),2) < 1 then '0'
           else ''
       end ||
       cast(decimal(round(count(express_cd)*100.0/count(*),2),5,2) as varchar(7)) ||
       '%'
from foo.bar
where days_old between 1 and 7
It's just selecting the number "minus one" for each row returned, just like "select 1" will select the number "one" for each row returned.
There is nothing special about the "select 1" syntax used in EXISTS statements, by the way; it's just selecting an arbitrary value, because EXISTS requires a record to be returned and a record needs data; the number 1 is sufficient.
Why you would do this, I have no idea.
When you have a union statement, each part of the union must contain the same columns. From what I read when I look at this, the first statement gives you one row for each days_old value, with some stats for each. The second part of the union gives you a summary of all the records that are a week old or less. Since the days_old column is not relevant there, they put in a fake value as a placeholder in order to do the union. Of course, this is just a guess based on reading thousands of queries through the years. To be sure, I would need to actually run the code.
Since you say this is a CTE, to really understand why this is happening, you may need to look at the data it generates and how that data is used in the next query that uses the CTE. That might answer your question.
What you have asked is basically about a business rule unique to your company. The true answer should lie in any requirements documents for the original creation of the code. You should go look for them and read them. We can make guesses based on our own experience but only people in your company can answer the why question here.
If you can't find the documentation, then you need to talk (Yes directly talk, preferably in person) to the Stakeholders who use the data and find out what their needs were. Only do this after running the code and analyzing the results to better understand the meaning of the data returned.
Based on your query, all the records with days_old between 1 and 7 are aggregated into a single row whose first column is -1. That is all select -1 does; there is nothing special here. Likewise, in EXISTS there is no difference between select -1 and select 1: each just returns a constant for every row, and EXISTS only checks whether any row comes back.
Back to your query: both branches of the union all select the same four columns. My guess is that the task is to list the results per days_old value and then append one combined row for everything with days_old between 1 and 7.
It is just grouping logic.
Your query returns aggregated data (counts and rounded percentages) grouped by the days_old column, plus one more group for rows where days_old is between 1 and 7.
So -1 is just a label for that additional group; it cannot be 1 because days_old = 1 is another valid group.
The result will look like this:
row1: days_old=1 count(*)=2 ...
row2: days_old=3 count(*)=5 ...
row3: days_old=9 count(*)=6 ...
row4: days_old=-1 count(*)=7
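To see the extra group appear, here is a cut-down version of the query run with Python's sqlite3 module. It is only a sketch: the percentage column is dropped, the table is called bar without a schema prefix, and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bar (days_old INTEGER, express_cd TEXT)")
conn.executemany(
    "INSERT INTO bar VALUES (?, ?)",
    [(1, "X"), (1, None), (3, "X"), (3, "X"), (9, None)],
)

# First branch: one row per days_old value.
# Second branch: one summary row for days 1..7, labelled with the
# placeholder -1 so that both branches have the same number of columns.
query = """
    SELECT days_old, COUNT(express_cd), COUNT(*)
    FROM bar
    GROUP BY days_old
    UNION ALL
    SELECT -1, COUNT(express_cd), COUNT(*)
    FROM bar
    WHERE days_old BETWEEN 1 AND 7
    ORDER BY 1
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The summary row carries -1 in the first column, so whatever consumes the result can tell it apart from a real days_old value.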

Vague count in sql select statements

I guess this has been asked in the site before but I can't find it.
I've seen that some sites show a vague count for search results. For example, here on Stack Overflow a search sometimes says "+5000 results", Gmail says "hundreds" when you search by keyword, and Google says "approx. X results". Is this just a way to show the user a huge number in an easy-to-understand form, or is it actually a fast way to count results that can be used in a database (I'm learning Oracle at the moment, version 10g)? Something like "hey, if you get more than 1k results, just stop and tell me there are more than 1k".
Thanks
PS. I'm new to databases.
Usually this is just a nice way to display a number.
I don't believe there is a way to do what you are asking for in SQL - count does not have an option for counting up until some number.
I also would not assume this is coming from SQL in either gmail, or stackoverflow.
Most search engines will return a total number of matches to a search, and then let you page through results.
As for making an exact number more human readable, here is an example from Rails:
http://api.rubyonrails.org/classes/ActionView/Helpers/NumberHelper.html#method-i-number_to_human
With Oracle, you can always resort to analytical functions in order to calculate the exact number of rows about to be returned. This is an example of such a query:
SELECT inner.*, MAX(ROWNUM) OVER(PARTITION BY 1) as TOTAL_ROWS
FROM (
    [... your own, sorted search query ...]
) inner
This will give you the total number of rows for your specific subquery. When you want to apply paging as well, you can further wrap these SQL parts as such:
SELECT outer.* FROM (
    SELECT * FROM (
        SELECT inner.*, ROWNUM as RNUM, MAX(ROWNUM) OVER(PARTITION BY 1) as TOTAL_ROWS
        FROM (
            [... your own, sorted search query ...]
        ) inner
    )
    WHERE ROWNUM < :max_row
) outer
WHERE outer.RNUM > :min_row
Replace min_row and max_row by meaningful values. But beware that calculating the exact number of rows can be expensive when you're not filtering using UNIQUE SCAN or relatively narrow RANGE SCAN operations on indexes. Read more about this here: Speed of paged queries in Oracle
As others have said, you can always put an absolute upper limit, such as 5000, on your query using a ROWNUM <= 5000 filter and then just indicate that there are 5000 or more results. Note that Oracle can be very good at optimising queries when you apply ROWNUM filtering. Find some info on that subject here:
http://www.dba-oracle.com/t_sql_tuning_rownum_equals_one.htm
A vague count works as a buffer that can be displayed promptly. If the user wants to see more results, they can request more.
It's a performance measure: after displaying the first results, sites like Google keep searching for more.
I don't know how fast this will run, but you can try:
SELECT NULL FROM your_tables WHERE your_condition AND ROWNUM <= 1001
If the result contains 1001 rows, then the total number of matching rows is greater than 1000.
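A rough sketch of that capped count, using Python's sqlite3 module. SQLite has no ROWNUM, so LIMIT inside a subquery plays the same role here; the posts table, the LIKE filter and the capped_count helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO posts (title) VALUES (?)",
    [(f"sql question {i}",) for i in range(2500)]
    + [("oracle tip",), ("oracle paging",), ("oracle hints",)],
)

def capped_count(conn, pattern, cap):
    """Count rows matching `pattern`, but never look at more than cap + 1 of them."""
    # The inner query stops after cap + 1 rows, so the outer COUNT(*)
    # is cheap even when the real number of matches is huge.
    (n,) = conn.execute(
        "SELECT COUNT(*) FROM (SELECT 1 FROM posts WHERE title LIKE ? LIMIT ?)",
        (pattern, cap + 1),
    ).fetchone()
    return f"{cap}+" if n > cap else str(n)

many = capped_count(conn, "%sql%", 1000)
few = capped_count(conn, "%oracle%", 1000)
print(many, few)
```

Fetching cap + 1 rows rather than cap is what makes it possible to tell "exactly 1000" apart from "more than 1000".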
this question gives some pretty good information
When you run an SQL query you can set a
LIMIT 0, 100
for example, and you will get at most the first hundred rows. You can then tell the user that there are 100+ results for their request (to be sure that more exist, fetch one extra row and check whether it comes back).
For Google, I couldn't say whether they really know that there are more than 27'000'000'000 results for a request, but I believe they do. Some standard requests have their results stored, and the updates are done in the background.

Include a blank row in query results

Is there a way to include a blank row at the top of a sql query, eg if it is meant for a dropdown list? (MS Sql Server 2005 or 2008)
Select *
FROM datStatus
ORDER BY statusName
Where I want something like
-1 (please choose one)
1 Beginning
2 Middle
3 Ending
4 Canceled
From a table that is ordinarily just the above, but without the top row?
I feel it's nicer to do it outside SQL, but if you insist...
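-- the aliases name the result columns so that ORDER BY statusName resolves (statusID is an assumed name)
SELECT -1 AS statusID, '(please choose one)' AS statusName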
UNION
SELECT * FROM datStatus
ORDER BY statusName
I have found that it is better to do this in the presentation layer of your application, as you might have different requirements based on the context. In general I try to keep my data service layer free of these sorts of implementation-specific rules. So in your case I would usually just add a new item at the first position of the list, after I had loaded it with data from my service layer.
Enjoy!
How about unioning the first row together with the rest of the query?
Select -1 AS statusID, '(please choose one)' AS statusName
union all
select * FROM datStatus ORDER BY statusName
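If the placeholder has to come first regardless of how its text sorts, one option is an explicit sort key. The sketch below runs on SQLite through Python's sqlite3 module rather than on SQL Server, and the statusID and sortKey column names are assumptions, since the question only names statusName.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE datStatus (statusID INTEGER PRIMARY KEY, statusName TEXT);
    INSERT INTO datStatus VALUES
        (1, 'Beginning'), (2, 'Middle'), (3, 'Ending'), (4, 'Canceled');
""")

# sortKey is 0 for the placeholder and 1 for real rows, so the placeholder
# is first no matter how '(please choose one)' compares with the names.
query = """
    SELECT statusID, statusName
    FROM (
        SELECT -1 AS statusID, '(please choose one)' AS statusName, 0 AS sortKey
        UNION ALL
        SELECT statusID, statusName, 1 FROM datStatus
    ) AS options
    ORDER BY sortKey, statusName
"""
options = conn.execute(query).fetchall()
for option in options:
    print(option)
```

Wrapping the UNION in a derived table keeps the sort key out of the final column list, so the dropdown only receives the ID and the label.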

Paging in SQL with LIMIT/OFFSET sometimes results in duplicates on different pages

I'm developing an online gallery with voting and have separate tables for pictures and votes (for every vote I'm storing the ID of the picture and the ID of the voter). The tables are related like this: PICTURE <--(1:n, using VOTE.picture_id)-- VOTE. I would like to query the pictures table and sort the output by number of votes. This is what I do:
SELECT
    picture.votes_number,
    picture.creation_date,
    picture.author_id,
    picture.author_nickname,
    picture.id,
    picture.url,
    picture.name,
    picture.width,
    picture.height,
    coalesce(anon_1."totalVotes", 0)
FROM picture
LEFT OUTER JOIN
    (SELECT
        vote.picture_id as pid,
        count(*) AS "totalVotes"
    FROM vote
    WHERE vote.device_id = <this is the query parameter>
    GROUP BY pid) AS anon_1
ON picture.id = anon_1.pid
ORDER BY picture.votes_number DESC
LIMIT 10
OFFSET 0
OFFSET is different for different pages, of course.
However, there are pictures with the same ID that are displayed on the different pages. I guess the reason is the sorting, but can't construct any better query, which will not allow duplicates. Could anybody give me a hint?
Thanks in advance!
Do you execute one query per page? If so, I suspect that the database doesn't guarantee a consistent order for items with the same number of votes. The first query may return { item 1, item 2 } and a second query may return { item 2, item 1 } if both items have the same number of votes. If the tied items are actually items 10 and 11, the same item may appear on page 1 and then again on page 2.
I had such a problem once. If that's also your case, append an extra clause to the order by to ensure a consistent ordering of items with same vote number, e.g.:
ORDER BY picture.votes_number DESC, picture.id
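A small sketch of paging with a unique tiebreaker, using Python's sqlite3 module and a stripped-down picture table with many tied vote counts. The sample data and the page helper are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE picture (id INTEGER PRIMARY KEY, votes_number INTEGER)")
# 25 pictures whose vote counts are only 0, 1 or 2, so ties are everywhere.
conn.executemany(
    "INSERT INTO picture VALUES (?, ?)",
    [(i, i % 3) for i in range(1, 26)],
)

def page(number, size=10):
    """Return the ids on the given zero-based page."""
    # The unique id as a second sort key makes the overall order total,
    # so consecutive pages can neither overlap nor skip rows.
    rows = conn.execute(
        "SELECT id FROM picture ORDER BY votes_number DESC, id LIMIT ? OFFSET ?",
        (size, number * size),
    ).fetchall()
    return [row[0] for row in rows]

seen = page(0) + page(1) + page(2)
print(page(0))
print(len(seen), len(set(seen)))
```

This only removes duplicates caused by unstable ordering; rows inserted or re-voted between two page requests can still shift items across page boundaries.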
The simplest explanation is that some data was added or some votes occurred while you were looking at different pages.
I am sure that if you sorted by ID or creation_date, this issue would go away.
I.e. there is no issue with your code.
In my case this problem was due to NULL values in the ORDER BY column; I solved it by adding a unique ID field to the ORDER BY clause along with the other field.