select 1 from someTable where someColumn = #
or
select top 1 someColumn1 from someTable where someColumn2 = #
which one will be faster on a large scale table...
got no indexes at all on that table so that wont work.
thanks.
First one selects one column with value of literal 1 (a number) and as many rows as there are while second returns all the column but only for the first row.
It is not possible to compare the performance since they are doing different things.
Related
How to select top multiple of 10 entries?
I have a data in SQL table that is meaningful if only seen as bunch of 10 entries. I want to write a query that does this for ex. Select top 10*n from table where condition.
If for ex. 53 entries satisfy condition, I want only 50 to be seen and last 3 to be discarded.
Plz help.
Kbv
How about:
declare #rows int;
set #rows = ((select count(*) from table where condition)/10)*10
select top #rows * from table where condition
Try this:
with CTE AS (
SELECT * FROM Table WHERE Condition
)
Select top(((SELECT COUNT(*) FROM CTE)/10)*10) * From CTE
Please consider the following...
SELECT orderField,
field1,
...
FROM tblTable
WHERE condition
ORDER BY orderField
LIMIT 10 * numberOfGroups;
When constructing your query first decide which fields you want. In a simple one table query you can use SELECT * fairly safely, but if you are referring to a JOINed dataset then you should consider specifying which fields you are going to use and assigning aliases to those fields.
Make sure that whatever your orderField or orderFields are called they are covered by a wildcard such as * or by being explicitly specified.
The above code first selects all records that meet your criteria. It then sorts the resulting list based upon which field or fields you specify for ORDER BY. Note : The above assumes that you are sorting based upon existing values in your table. If you need to sort based on computed values then another (minor) level of complexity may need to be added.
LIMIT will then grab the first specified number of records from the sorted list. LIMIT accepts simply computed values such as 2 * 2 or 10 * numberOfGroups, where numberOfGroups is a variable set previously in the code or a value that explicitly replace numberOfGroups (i.e. 10 * #numberOfGroups where #numberOfGroups has previously been set to 5 or 10 * 5).
If you have any questions or comments, then please feel free to post a Comment accordingly.
I'd like to query the database as to whether or not one or more rows exist that satisfy a given predicate. However, I am not interested in the distinction between there being one such row, two rows or a million - just if there are 'zero' or 'one or more'. And I do not want Postgres to waste time producing an exact count that I do not need.
In DB2, I would do it like this:
SELECT 1 FROM SYSIBM.SYSDUMMY1 WHERE EXISTS
(SELECT 1 FROM REAL_TABLE WHERE COLUMN = 'VALUE')
and then checking if zero rows or one row was returned from the query.
But Postgres has no dummy table available, so what is the best option?
If I create a one-row dummy table myself and use that in place of SYSIBM.SYSDUMMY1, will the query optimizer be smart enough to not actually read that table when running the query, and otherwise 'do the right thing'?
PostgreSQL doesn't have a dummy table because you don't need one.
SELECT 1 WHERE EXISTS
(SELECT 1 FROM REAL_TABLE WHERE COLUMN = 'VALUE')
Alternatively if you want a true/false answer:
SELECT EXISTS(SELECT 1 FROM REAL_TABLE WHERE COLUMN = 'VALUE')
How about just doing this?
SELECT (CASE WHEN EXISTS (SELECT 1 FROM REAL_TABLE WHERE COLUMN = 'VALUE') THEN 1 ELSE 0 END)
1 means there is a value. 0 means no value.
This will always return one row.
If you are happy with "no row" if no row matches, you can even just:
SELECT 1 FROM real_table WHERE column = 'VALUE' LIMIT 1;
Performance is basically the same as with EXISTS. Key to performance for big tables is a matching index.
I want to check whether a table contains a row or not. Which is faster?
IF EXISTS(SELECT * FROM TABLE)
or
IF EXISTS(SELECT TOP 1 * FROM TABLE)
There is no difference between the queries!
The columns in the select don't get evaluated.
If you recall Logical Query processing, the from clause is executed first. The select clause is executed in the last step (actually Order By is, but that is a cosmetic thing).
So when the from clause gets executed, there are rows returned, regardless of the column names.
You have to add columnnames because otherwise you get syntax errors
IF EXISTS(SELECT 1 FROM TABLE)
is faster
there are some more suggestions
IF EXISTS(SELECT null FROM TABLE)
Obviously SELECT TOP 1 * FROM TABLE is faster.
since the index scan is reduced to one,number of rows return is one and
the estimated operation cost is also much less.
but if there is only on row in the table both the queries will show same operation cost.
SELECT *
FROM (SELECT top 1 *
FROM ms_data) temp
SELECT TOP 1 *
FROM ms_data
both the above queries have same operation cost.
How can I get distinct values from multiple fields within one table with just one request.
Option 1
SELECT WM_CONCAT(DISTINCT(FIELD1)) FIELD1S,WM_CONCAT(DISTINCT(FIELD2)) FIELD2S,..FIELD10S
FROM TABLE;
WM_CONCAT is LIMITED
Option 2
select DISTINCT(FIELD1) FIELDVALUE, 'FIELD1' FIELDNAME
FROM TABLE
UNION
select DISTINCT(FIELD2) FIELDVALUE, 'FIELD2' FIELDNAME
FROM TABLE
... FIELD 10
is just too slow
if you were scanning a small range in the data (not full scanning the whole table) you could use WITH to optimise your query
e.g:
WITH a AS
(SELECT field1,field2,field3..... FROM TABLE WHERE condition)
SELECT field1 FROM a
UNION
SELECT field2 FROM a
UNION
SELECT field3 FROM a
.....etc
For my problem, I had
WL1 ... WL2 ... correlation
A B 0.8
B A 0.8
A C 0.9
C A 0.9
how to eliminate the symmetry from this table?
select WL1, WL2,correlation from
table
where least(WL1,WL2)||greatest(WL1,WL2) = WL1||WL2
order by WL1
this gives
WL1 ... WL2 ... correlation
A B 0.8
A C 0.9
:)
The best option in the SQL is the UNION, though you may be able to save some performance by taking out the distinct keywords:
select FIELD1 FROM TABLE
UNION
select FIELD2 FROM TABLE
UNION provides the unique set from two tables, so distinct is redundant in this case. There simply isn't any way to write this query differently to make it perform faster. There's no magic formula that makes searching 200,000+ rows faster. It's got to search every row of the table twice and sort for uniqueness, which is exactly what UNION will do.
The only way you can make it faster is to create separate indexes on the two fields (maybe) or pare down the set of data that you're searching across.
Alternatively, if you're doing this a lot and adding new fields rarely, you could use a materialized view to store the result and only refresh it periodically.
Incidentally, your second query doesn't appear to do what you want it to. Distinct always applies to all of the columns in the select section, so your constants with the field names will cause the query to always return separate rows for the two columns.
I've come up with another method that, experimentally, seems to be a little faster. In affect, this allows us to trade one full-table scan for a Cartesian join. In most cases, I would still opt to use the union as it's much more obvious what the query is doing.
SELECT DISTINCT CASE lvl WHEN 1 THEN field1 ELSE field2 END
FROM table
CROSS JOIN (SELECT LEVEL lvl
FROM DUAL
CONNECT BY LEVEL <= 2);
It's also worthwhile to add that I tested both queries on a table without useful indexes containing 800,000 rows and it took roughly 45 seconds (returning 145,000 rows). However, most of that time was spent actually fetching the records, not running the query (the query took 3-7 seconds). If you're getting a sizable number of rows back, it may simply be the number of rows that is causing the performance issue you're seeing.
When you get distinct values from multiple columns, then it won't return a data table. If you think following data
Column A Column B
10 50
30 50
10 50
when you get the distinct it will be 2 rows from first column and 1 rows from 2nd column. It simply won't work.
And something like this?
SELECT 'FIELD1',FIELD1, 'FIELD2',FIELD2,...
FROM TABLE
GROUP BY FIELD1,FIELD2,...
I'm using sql-server 2005 and ASP.NET with C#.
I have Users table with
userId(int),
userGender(tinyint),
userAge(tinyint),
userCity(tinyint)
(simplified version of course)
I need to select always two fit to userID I pass to query users of opposite gender, in age range of -5 to +10 years and from the same city.
Important fact is it always must be two, so I created condition if ##rowcount<2 re-select without age and city filters.
Now the problem is that I sometimes have two returned result sets because I use first ##rowcount on a table. If I run the query.
Will it be a problem to use the DataReader object to read from always second result set? Is there any other way to check how many results were selected without performing select with results?
Can you simplify it by using SELECT TOP 2 ?
Update: I would perform both selects all the time, union the results, and then select from them based on an order (using SELECT TOP 2) as the union may have added more than two. Its important that this next select selects the rows in order of importance, ie it prefers rows from your first select.
Alternatively, have the reader logic read the next result-set if there is one and leave the SQL alone.
To avoid getting two separate result sets you can do your first SELECT into a table variable and then do your ##ROWCOUNT check. If >= 2 then just select from the table variable on its own otherwise select the results of the table variable UNION ALLed with the results of the second query.
Edit: There is a slight overhead to using table variables so you'd need to balance whether this was cheaper than Adam's suggestion just to perform the 'UNION' as a matter of routine by looking at the execution stats for both approaches
SET STATISTICS IO ON
Would something along the following lines be of use...
SELECT *
FROM (SELECT 1 AS prio, *
FROM my_table M1 JOIN my_table M2
WHERE M1.userID = supplied_user_id AND
M1.userGender <> M2.userGender AND
M1.userAge - 5 >= M2.userAge AND
M1.userAge + 15 <= M2.userAge AND
M1.userCity = M2.userCity
LIMIT TO 2 ROWS
UNION
SELECT 2 AS prio, *
FROM my_table M1 JOIN my_table M2
WHERE M1.userID = supplied_user_id AND
M1.userGender <> M2.userGender
LIMIT TO 2 ROWS)
ORDER BY prio
LIMIT TO 2 ROWS;
I haven't tried it as I have no SQL Server and there may be dialect issues.