So I came across the following at work, and I can tell right away what it's for but I want to find any documentation for it and can't find anything online!
with details as
(
select *,
row_number() over (order by CREATED_DATE) as [Row]
from
(
select top 10 * from MyTable
) t
)
select *
from details
where [Row] > @lowLimit and [Row] < @highLimit
This looks to me like it's for paging functionality. However, I don't know exactly what structure I'm looking at within the SQL syntax. Does anyone recognize this syntax, and can you point me to where I can read more about it?
Thanks!
That's a common table expression. These are used as temporary result sets for single queries. They are treated by the following query much like a view. You can do some neat stuff with them, like recursion!
Here's a brief description of their functionality from the link:
Create a recursive query.
Substitute for a view when the general use of a view is not required; that is, you do not have to store the definition in metadata.
Enable grouping by a column that is derived from a scalar subselect, or a function that is either not deterministic or has external access.
Reference the resulting table multiple times in the same statement.
Regarding semicolons, please check out this answer for a really useful tip - why you should always preface CTEs with semicolons.
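To make two of the uses listed above concrete (recursion, and referencing the same result set more than once), here is a small sketch. It runs against SQLite through Python's sqlite3 module rather than SQL Server, and the employees table and its rows are invented purely for illustration.

```python
import sqlite3

# Hypothetical sample data: a tiny reporting hierarchy.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, manager_id INTEGER)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(1, "Ann", None), (2, "Bob", 1), (3, "Cid", 2), (4, "Dee", 1)],
)

# Recursive CTE: start at the root and walk down the reporting chain.
chain = conn.execute("""
    WITH RECURSIVE reports(id, name, depth) AS (
        SELECT id, name, 0 FROM employees WHERE manager_id IS NULL
        UNION ALL
        SELECT e.id, e.name, r.depth + 1
        FROM employees e JOIN reports r ON e.manager_id = r.id
    )
    SELECT name, depth FROM reports ORDER BY depth, name
""").fetchall()

# Plain CTE referenced twice in one statement: pairs sharing a manager.
pairs = conn.execute("""
    WITH direct AS (
        SELECT id, name, manager_id FROM employees WHERE manager_id IS NOT NULL
    )
    SELECT a.name, b.name
    FROM direct a JOIN direct b
      ON a.manager_id = b.manager_id AND a.id < b.id
    ORDER BY a.name
""").fetchall()

print(chain)
print(pairs)
```

SQLite needs the RECURSIVE keyword shown here; SQL Server writes recursive CTEs with a plain WITH.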
I have a repository of SQL queries and I want to understand which queries use certain tables or fields.
Let's say I want to understand what queries use the email field, how can I write it?
Example SQL query:
select
users.email as email_user
,users.email as email_user_too
,email as email_user_too_2,
email as email_user_too_3,
back_email as wrong_email -- wrong field
from users
So to state the problem more accurately, you are sorting through a list of SQL queries [as text], and you now need to find the queries that use certain fields using SQL & RegEx (Regular Expressions) in PostgreSQL. (please tag the question so that StackOverflow indexes your question correctly, more importantly, readers have more context about the question)
PostgreSQL has Regular Expression support OOTB (Out Of The Box), so we can skip exploring other ways to do this. (If you are reading this as a Microsoft SQL Server person, I strongly suggest reading this brilliant article on Microsoft's website on defining a Table-Valued UDF (User Defined Function).)
The simplest way I could think of to approach your problem, is to throw away what we don't want out of the query text first, and then filter out what's left.
This way, after throwing away the stuff you don't need, you will be left with a set of "tokens" that you can easily filter, and I'm putting token in quotes since we are not really parsing the SQL language, but if we did that would be the first step: to extract tokens.. (:
Take this query for example:
With Queries (
Id
, QueryText
) As (
values (1, 'select
users.email as email_user
,users.email as email_user_too
,email as email_user_too_2,
email as email_user_too_3,
back_email as wrong_email -- wrong field
from users')
)
Select QueryText
, found
From (
Select Id
, QueryText
, regexp_split_to_table (QueryText, '(--[\s\w]+|select|from|as|where|[ \s\n,])') As found
From Queries
) As Result
Where found != ''
And found = 'back_email'
I have sourced the concept of a "query repository" with a WITH statement for ease of doing the pseudo-code.
I have also selected a few words/characters to split QueryText on, like select, where etc. We don't need these in our 'found' set.
And in the end, as you can see above, I simply used found as what's left and filtered it with the field name you are looking for. (Assuming that you know the field you are looking for)
You could improve upon the RegEx I did, or change the method as you wish to make it better. But I think the general concept addresses what you need to achieve. One problem I can see with my solution right off the bat is the fact that you can search for anything really, not just names of the selected fields - which begs the question, why use RegEx, and not Like statements? But again, as I mentioned, you can improve upon the RegEx and address specific requirements you may have. Using Like might limit you in that direction. (In other words, only you know what's good for you. I can't say that from here.)
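For experimenting with the same throw-away-then-filter idea outside the database, here is a rough Python sketch. It is not the PostgreSQL query above: the names tokens and uses_field are made up, and the pattern is a variation that uses word boundaries so that keywords such as "as" are not split out of the middle of longer identifiers.

```python
import re

QUERY = """select
users.email as email_user
,users.email as email_user_too
,email as email_user_too_2,
email as email_user_too_3,
back_email as wrong_email -- wrong field
from users"""

# Things to throw away: line comments, a few keywords (whole words only),
# and runs of whitespace or commas.
SPLIT = re.compile(r"--[^\n]*|\b(?:select|from|as|where)\b|[\s,]+", re.IGNORECASE)


def tokens(sql):
    """Return what is left of the query text after splitting on SPLIT."""
    return [t for t in SPLIT.split(sql) if t]


def uses_field(sql, field):
    """True if a token is the field itself or a qualified name ending in .field."""
    return any(t == field or t.endswith("." + field) for t in tokens(sql))


print(tokens(QUERY))
print(uses_field(QUERY, "email"), uses_field(QUERY, "phone"))
```

Like the SQL version, this only looks at tokens and does not parse SQL, so aliases and table names can match as well as column names.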
You can play with the query online here: db-fiddle query and use https://regex101.com/ for testing your RegEx.
Disclaimer: I'm not a PostgreSQL developer. There must be other, perhaps better ways of doing this. (:
Analyzing an Oracle DB of an application of mine, I always run queries ending with the very same "order by" clause, given that every table has a date type "DT_EXTRACTION" column.
Is there a way to define an alias for String "order by DT_EXTRACTION desc" (say, equals to $DD) and write my query like this?
select *
from foo
$DD;
Since you're using SQL Developer you could (ab)use substitution variables for this:
define DD='order by DT_EXTRACTION desc'
select * from your_table
&DD;
but you'd have to either define that string in each script/session, or add it to a login script to make it always available (which you can choose from Tools->Preferences->Database).
That would work in SQL*Plus too.
SQL Developer also has 'snippets', which you can view and manage from the panel revealed by View->Snippets. You can add your own snippet for that order by clause, and can then drag-and-drop it from the snippets panel into your code wherever you need to use it. Not quite what you asked for but still useful. @thatjeffsmith has a write up with pictures, so I won't repeat those details here, since it's not quite what you need.
You may find code templates useful too. From Tools->Preferences->Database choose SQL Editor Code Templates, and define a new one (here with the ID DD) for your string.
Then in the worksheet, type as far as:
select * from your_table DD
hit control-space and it will expand automatically to
select * from your_table order by dt_extraction desc
This is probably a little thing,
but I am trying to use this SQL statement:
SELECT * FROM Colors
WHERE colorHueWarmth < 0
AND colorV >=0.7
AND (fk_subCategory=4 OR fk_subCategory=5 OR fk_subCategory=11)
In the results I get the right colorHueWarmth and colorV values, but I also get rows with fk_subCategory values other than 4, 5 or 11.
I tried changing the values but it made no difference. Is it even possible to write such a statement?
Does anyone know what I am doing wrong?
Thanks in advance
You've actually got multiple options, although I'd point out that the query in your question actually works for me (see this Sql Fiddle):
SELECT
*
FROM
Colors
WHERE
colorHueWarmth < 0
AND colorV >=0.7
AND (fk_subCategory=4 OR fk_subCategory=5 OR fk_subCategory=11)
As stated in one of the comments, I would guess that your original didn't have parentheses around the fk_subCategory conditions (the third table in my previous fiddle). Because AND binds more tightly than OR, parentheses matter a great deal in boolean logic and should always be used to group such conditions together.
The easiest solution is as follows:
SELECT
*
FROM
Colors
WHERE
colorHueWarmth < 0
AND colorV >=0.7
AND (fk_subCategory IN(4,5,11));
You will find loads of documentation online regarding the IN clause; here are a few links you might find useful:
http://webcheatsheet.com/sql/interactive_sql_tutorial/sql_in.php
http://www.w3schools.com/sql/sql_in.asp (note W3Schools can't always be taken at face value and is often excluded from suggested links due to the errors/omissions it often contains)
http://msdn.microsoft.com/en-gb/library/ms177682.aspx
Given the small number of values (4, 5 or 11), the IN clause is a reasonable option. If you have other queries doing something similar with large collections, this can become quite inefficient, in which case you could create a temporary table that contains the IDs and INNER JOIN onto it. (Here is a question regarding alternatives to IN.)
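If you want to see the effect of the parentheses for yourself, here is a small sketch. It uses SQLite through Python's sqlite3 module rather than your database, and the five rows are invented test data, so treat it as an illustration of operator precedence only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Colors (id INTEGER PRIMARY KEY, colorHueWarmth REAL, "
    "colorV REAL, fk_subCategory INTEGER)"
)
# Made-up rows: only ids 1 and 4 satisfy all three conditions.
conn.executemany(
    "INSERT INTO Colors VALUES (?, ?, ?, ?)",
    [
        (1, -1.0, 0.8, 4),
        (2, -1.0, 0.8, 7),
        (3, 1.0, 0.1, 5),
        (4, -2.0, 0.9, 11),
        (5, -1.0, 0.5, 4),
    ],
)


def ids(where):
    """Return the ids of the rows matching the given WHERE clause."""
    sql = "SELECT id FROM Colors WHERE " + where + " ORDER BY id"
    return [row[0] for row in conn.execute(sql)]


base = "colorHueWarmth < 0 AND colorV >= 0.7 AND "
with_parens = ids(base + "(fk_subCategory=4 OR fk_subCategory=5 OR fk_subCategory=11)")
with_in = ids(base + "fk_subCategory IN (4, 5, 11)")
# Without parentheses the ORs escape the ANDs, so row 3 slips through.
no_parens = ids(base + "fk_subCategory=4 OR fk_subCategory=5 OR fk_subCategory=11")

print(with_parens, with_in, no_parens)
```

The parenthesised OR chain and the IN list select the same rows; the unparenthesised version also returns a row that fails the colorHueWarmth and colorV tests.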
I've seen some people use EXISTS (SELECT 1 FROM ...) rather than EXISTS (SELECT id FROM ...) as an optimization--rather than looking up and returning a value, SQL Server can simply return the literal it was given.
Is SELECT(1) always faster? Would Selecting a value from the table require work that Selecting a literal would avoid?
In SQL Server, it does not make a difference whether you use SELECT 1 or SELECT * within EXISTS. You are not actually returning the contents of the rows; you are only testing whether the set determined by the WHERE clause is non-empty. Try running the two queries side by side with SET STATISTICS IO ON and you can prove that the approaches are equivalent. Personally I prefer SELECT * within EXISTS.
For google's sake, I'll update this question with the same answer as this one (Subquery using Exists 1 or Exists *) since (currently) an incorrect answer is marked as accepted. Note the SQL standard actually says that EXISTS via * is identical to a constant.
No. This has been covered a bazillion times. SQL Server is smart and knows it is being used for an EXISTS, and returns NO DATA to the system.
Quoth Microsoft:
http://technet.microsoft.com/en-us/library/ms189259.aspx?ppud=4
The select list of a subquery introduced by EXISTS almost always consists of an asterisk (*). There is no reason to list column names because you are just testing whether rows that meet the conditions specified in the subquery exist.
Also, don't believe me? Try running the following:
SELECT whatever
FROM yourtable
WHERE EXISTS( SELECT 1/0
FROM someothertable
WHERE a_valid_clause )
If it was actually doing something with the SELECT list, it would throw a div by zero error. It doesn't.
EDIT: Note, the SQL Standard actually talks about this.
ANSI SQL 1992 Standard, pg 191 http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt
3) Case:
a) If the <select list> "*" is simply contained in a <subquery> that is immediately contained in an <exists predicate>, then the <select list> is equivalent to a <value expression> that is an arbitrary <literal>.
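As a quick way to convince yourself that the select list inside EXISTS does not change the result, here is a sketch. It runs on SQLite through Python's sqlite3 module, not SQL Server, so it says nothing about SQL Server's plans or performance; the customers and orders tables are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cid');
    INSERT INTO orders VALUES (10, 1), (11, 1), (12, 3);
""")


def customers_with_orders(select_list):
    """Run the same EXISTS query with a different subquery select list."""
    sql = (
        "SELECT name FROM customers c WHERE EXISTS ("
        f"SELECT {select_list} FROM orders o WHERE o.customer_id = c.id"
        ") ORDER BY name"
    )
    return [row[0] for row in conn.execute(sql)]


# Try a literal, an asterisk, NULL and a real column.
results = {s: customers_with_orders(s) for s in ("1", "*", "NULL", "o.id")}
for select_list, names in results.items():
    print(select_list, names)
```

Every variant returns the same customers, because EXISTS only asks whether the subquery produces at least one row.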
When you use SELECT 1, you clearly show (to whoever is reading your code later) that you are testing whether the record exists. Even if there is no performance gain (which is to be discussed), there is gain in code readability and maintainability.
Yes, because when you select a literal it does not need to read from disk (or even from cache).
It doesn't matter what you select in an EXISTS clause. Most people use SELECT *, and SQL Server then automatically picks the best index.
As someone pointed out sql server ignores the column selection list in EXISTS so it doesn't matter. I personally tend to use "SELECT null ..." to indicate that the value is not used at all.
If you look at the execution plan for
select COUNT(1) from master..spt_values
and look at the stream aggregate you will see that it calculates
Scalar Operator(Count(*))
So the 1 actually gets converted to *
However I have read somewhere in the "Inside SQL Server" series of books that * might incur a very slight overhead for checking column permissions. Unfortunately the book didn't go into any more detail than that as I recall.
Select 1 should be better to use in your example. Select * gets all the metadata associated with the objects before runtime, which adds overhead during the compilation of the query, though you may not see any difference in the execution plans when running both types of queries.
I am doing a simple SELECT statement in an Oracle DB and need to select the columns in a somewhat-specific order. Example:
Table A has 100 attributes, one of which is "chapter" that occurs somewhere in the order of columns in the table. I need to select the data with "chapter" first and the remaining columns after in no particular order. Essentially, my statement needs to read something like:
SELECT a.chapter, a. *the remaining columns* FROM A
Furthermore, I cannot simply type:
SELECT a.chapter, a.*
because this will select "chapter" twice.
I know the SQL statement seems simple, but if I know how to solve this problem, I can extrapolate this thought into more complicated areas. Also, let's assume that I can't just scroll over to find the "chapter" column and drag it to the beginning.
Thanks.
You should not select * in a program. As your schema evolves it will bring in things you do not know about yet. Think about what happens when someone adds a column with the whole book in it: the query you thought would be very cheap suddenly starts to bring in megabytes of data.
That means you have to specify every column you need.
Your best bet is just to select each column explicitly.
A quick way to get around this would be SELECT a.chapter AS chapterCol, a.* FROM table a; This means there will be one column named chapterCol (assuming there's not already a column named chapterCol. ;))
If you're going to embed the 'SELECT *' into program code, then I would strongly recommend against doing that. As noted by the previous authors, you're setting up the code to break if a column is ever added to (or removed from) the table. The simple advice is don't do it.
If you're using this in development tools (viewing the data, and the like), then I'd recommend creating a view with the specific column order you need. Capture the output from 'SELECT COLUMN_NAME FROM ALL_TAB_COLUMNS' and create a select statement for the view with the column order you need.
This is how I would build your query without having to type all the names in, but with some manual effort.
Start with "Select a.chapter"
Now perform another select on your database as follows:
select ','|| column_name
from user_tab_cols
where table_name = 'YOUR_REAL_TABLE_NAME' -- unquoted names are stored in upper case
and column_name <> 'CHAPTER';
Now take the output from that, cut and paste it, and append it to what you started with. Now run that query. It should be what you asked for.
Ta-da!
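The same cut-and-paste steps can be scripted. Here is a rough sketch of the idea using SQLite through Python's sqlite3 module, where PRAGMA table_info stands in for Oracle's user_tab_cols; the function name select_with_first and the four-column table A are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the real 100-column table.
conn.execute("CREATE TABLE A (title TEXT, author TEXT, chapter INTEGER, pages INTEGER)")


def select_with_first(conn, table, first):
    """Build a SELECT that lists `first`, then every other column of `table`."""
    # Row layout of PRAGMA table_info: (cid, name, type, notnull, dflt_value, pk).
    cols = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    rest = [c for c in cols if c.lower() != first.lower()]
    return f"SELECT {', '.join([first] + rest)} FROM {table}"


query = select_with_first(conn, "A", "chapter")
print(query)

# The generated statement runs and puts chapter in the first position.
column_order = [d[0] for d in conn.execute(query).description]
print(column_order)
```

The table name is pasted straight into the SQL text, so only use this with names you control.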
Unless you have a very good reason to do so, you should not use SELECT * in queries. It will break your application every time the schema changes.