SQL query: number = (select query) in where clause

I am not able to understand this query:
SELECT FIELD1 FROM TABLE1 T1
WHERE 3 = (
SELECT COUNT(FIELD1)
FROM TABLE1 T2
WHERE T2.FIELD1 <= T1.FIELD1
);
This query runs properly without any error. The inner count query returns 363.
If I put 3 = (select ... in the where clause, I get one result. If I put 4 = (select ..., no records come back. If I put 363 = (select ..., 3 records come back.
I am confused by this. Please help me understand it.

The subquery counts how many FIELD1 values in the whole table are less than or equal to the current one in the outer query (T1.FIELD1). Therefore the whole query works like this:
Return FIELD1 values from table TABLE1 if there are exactly 3 (or 4 or
whatever number you put there) FIELD1 values in the table TABLE1
which are smaller or equal, counting the value itself.
Note that it uses <=, which means the subquery will always return at least 1 for any non-NULL value.
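A minimal sketch of this behaviour, assuming Python's built-in sqlite3 module and made-up sample values. The table and column names come from the question; the data and the helper rows_with_count are hypothetical.

```python
import sqlite3

# In-memory database with made-up FIELD1 values; 40 appears twice (a tie).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TABLE1 (FIELD1 INTEGER)")
conn.executemany("INSERT INTO TABLE1 VALUES (?)",
                 [(10,), (20,), (30,), (40,), (40,), (50,)])

def rows_with_count(n):
    """Return FIELD1 values for which exactly n values in the table are <= them."""
    sql = """
        SELECT FIELD1 FROM TABLE1 T1
        WHERE ? = (SELECT COUNT(FIELD1)
                   FROM TABLE1 T2
                   WHERE T2.FIELD1 <= T1.FIELD1)
        ORDER BY FIELD1
    """
    return [row[0] for row in conn.execute(sql, (n,))]

print(rows_with_count(3))  # [30]
print(rows_with_count(4))  # []
print(rows_with_count(5))  # [40, 40]
```

The tie at 40 is why 4 matches nothing here: both tied rows see five values less than or equal to their own, so the count jumps from 3 straight to 5.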

The query returns the record(s) in table1 that sit at position n from the bottom with respect to the ordering implied by the values of field1, where n represents the literal number in the where clause. A non-empty result set also asserts that exactly n tuples meet the comparison criterion.
You therefore compute the occupants of the nth rank in a tournament, provided that the rank can be issued unequivocally.
example:
imagine a tournament result as follows:
1. scooby doo
2. donald duck
3. mickey mouse
5. calvin
5. hobbes
6. minnie mouse
...
363. whoever
This ranking would be compatible with your results (of course you would commonly label calvin & hobbes' rank as 4 instead of 5, as you'd have to if you used your query to determine the top n contestants).

Related

How can select different records from a list with duplicated values?

I'm new to the SQL/Oracle universe and I would like to ask for your help. This is a very simple question that I'm stuck on.
So, let me give you a picture. I have a regular table, let's call it "table1". The PK is the first column, "c1". Let's suppose that I would like to make the following select:
select (1) from table1 where c1 in ('1','2','3')
This will give me
    (1)
1   1
2   1
3   1
However, if I make the following select
select (1) from table1 where c1 in ('1','2','2')
this will give me
    (1)
1   1
2   1
My question is: why are there not 3 records in the second case? Can I modify the second query to give 3 records? In other words, how can I prevent the selection from acting like a "distinct" clause?
I know that it may be a dummy question, so let me thank you all in advance.
The where clause filters rows generated by the from clause.
Conditions in the where clause only specify whether or not a given row is in the result set. They do not specify how many times a given row is in the result set.
If you want to "multiply" the number of rows, you would need to use a join with a derived table that has duplicate values.
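As a rough illustration of the join approach, here is a sketch using Python's built-in sqlite3 module. The three-row table1 and the derived-table alias wanted are assumptions for the example; in Oracle each branch of the derived table would select FROM dual.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (c1 TEXT PRIMARY KEY)")
conn.executemany("INSERT INTO table1 VALUES (?)", [("1",), ("2",), ("3",)])

# IN only decides whether a row qualifies, so the repeated '2' changes nothing.
in_rows = conn.execute(
    "SELECT c1 FROM table1 WHERE c1 IN ('1', '2', '2') ORDER BY c1"
).fetchall()

# Joining to a derived table that lists '2' twice yields that row twice.
join_rows = conn.execute("""
    SELECT t.c1
    FROM table1 t
    JOIN (SELECT '1' AS v UNION ALL SELECT '2' UNION ALL SELECT '2') wanted
      ON wanted.v = t.c1
    ORDER BY t.c1
""").fetchall()

print(in_rows)    # [('1',), ('2',)]
print(join_rows)  # [('1',), ('2',), ('2',)]
```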

distinct values from multiple fields within one table ORACLE SQL

How can I get distinct values from multiple fields within one table with just one request?
Option 1
SELECT WM_CONCAT(DISTINCT(FIELD1)) FIELD1S,WM_CONCAT(DISTINCT(FIELD2)) FIELD2S,..FIELD10S
FROM TABLE;
WM_CONCAT is LIMITED
Option 2
select DISTINCT(FIELD1) FIELDVALUE, 'FIELD1' FIELDNAME
FROM TABLE
UNION
select DISTINCT(FIELD2) FIELDVALUE, 'FIELD2' FIELDNAME
FROM TABLE
... FIELD 10
is just too slow
If you are scanning a small range of the data (rather than full-scanning the whole table), you could use WITH to optimise your query, e.g.:
WITH a AS
(SELECT field1,field2,field3..... FROM TABLE WHERE condition)
SELECT field1 FROM a
UNION
SELECT field2 FROM a
UNION
SELECT field3 FROM a
.....etc
For my problem, I had
WL1 ... WL2 ... correlation
A B 0.8
B A 0.8
A C 0.9
C A 0.9
How can the symmetry be eliminated from this table?
select WL1, WL2,correlation from
table
where least(WL1,WL2)||greatest(WL1,WL2) = WL1||WL2
order by WL1
this gives
WL1 ... WL2 ... correlation
A B 0.8
A C 0.9
:)
The best option in SQL is UNION, though you may be able to save some performance by taking out the distinct keywords:
select FIELD1 FROM TABLE
UNION
select FIELD2 FROM TABLE
UNION provides the unique set of rows from the two queries, so distinct is redundant in this case. There simply isn't any way to write this query differently to make it perform faster. There's no magic formula that makes searching 200,000+ rows faster. It's got to search every row of the table twice and sort for uniqueness, which is exactly what UNION will do.
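To see that UNION already removes duplicates, here is a small sketch with Python's built-in sqlite3 module. The table name t and its three rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (FIELD1 INTEGER, FIELD2 INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(10, 50), (30, 50), (10, 50)])

# UNION de-duplicates across both branches without any DISTINCT keyword.
union_rows = conn.execute(
    "SELECT FIELD1 AS v FROM t UNION SELECT FIELD2 FROM t ORDER BY v"
).fetchall()

# UNION ALL keeps every row from both branches.
union_all_rows = conn.execute(
    "SELECT FIELD1 AS v FROM t UNION ALL SELECT FIELD2 FROM t ORDER BY v"
).fetchall()

print(union_rows)      # [(10,), (30,), (50,)]
print(union_all_rows)  # [(10,), (10,), (30,), (50,), (50,), (50,)]
```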
The only way you can make it faster is to create separate indexes on the two fields (maybe) or pare down the set of data that you're searching across.
Alternatively, if you're doing this a lot and adding new fields rarely, you could use a materialized view to store the result and only refresh it periodically.
Incidentally, your second query doesn't appear to do what you want it to. Distinct always applies to all of the columns in the select section, so your constants with the field names will cause the query to always return separate rows for the two columns.
I've come up with another method that, experimentally, seems to be a little faster. In effect, this allows us to trade one full-table scan for a Cartesian join. In most cases, I would still opt to use the union as it's much more obvious what the query is doing.
SELECT DISTINCT CASE lvl WHEN 1 THEN field1 ELSE field2 END
FROM table
CROSS JOIN (SELECT LEVEL lvl
FROM DUAL
CONNECT BY LEVEL <= 2);
It's also worthwhile to add that I tested both queries on a table without useful indexes containing 800,000 rows and it took roughly 45 seconds (returning 145,000 rows). However, most of that time was spent actually fetching the records, not running the query (the query took 3-7 seconds). If you're getting a sizable number of rows back, it may simply be the number of rows that is causing the performance issue you're seeing.
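The cross-join idea can be tried out in SQLite through Python's built-in sqlite3 module. This is only a sketch: SQLite has no CONNECT BY, so a two-row derived table (alias gen, an assumption) stands in for the LEVEL generator, and the table t and its rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (field1 INTEGER, field2 INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(10, 50), (30, 50), (10, 50)])

# Each table row is paired with lvl = 1 and lvl = 2; CASE picks the column.
cross_rows = conn.execute("""
    SELECT DISTINCT CASE lvl WHEN 1 THEN field1 ELSE field2 END AS v
    FROM t
    CROSS JOIN (SELECT 1 AS lvl UNION ALL SELECT 2) gen
    ORDER BY v
""").fetchall()

# The plain UNION for comparison.
union_rows = conn.execute(
    "SELECT field1 AS v FROM t UNION SELECT field2 FROM t ORDER BY v"
).fetchall()

print(cross_rows)                # [(10,), (30,), (50,)]
print(cross_rows == union_rows)  # True
```

Both queries return the same set here; which one is faster depends on the database and the data, so this says nothing about Oracle timings.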
When you get distinct values from multiple columns, the result is not a rectangular data table. Consider the following data
Column A Column B
10 50
30 50
10 50
When you take the distinct values, you get 2 rows from the first column and 1 row from the second column. They simply won't line up as a single table.
And something like this?
SELECT 'FIELD1',FIELD1, 'FIELD2',FIELD2,...
FROM TABLE
GROUP BY FIELD1,FIELD2,...

Why doesn't something like select count((select * from producers)) from producers; work?

I have defined the following query:
select count((select * from producers)) from producers;
Assuming a producers table with 3 columns (A, B and C) and 2 rows:
A B C
-----
0 1 2
3 4 5
I'd expect the following output:
2
2
It doesn't work. While the query itself is basically useless (even if it worked, it wouldn't yield any useful output), I'd like to try to understand why this doesn't run.
(select * from producers)
This would yield a list of rows with information on all the attributes on the producers table.
select count((select * from producers)) from producers;
For each row in producers, I'd expect this one to show the number 2 (the number of rows in producers).
Why doesn't it work? SQL limitation? Is there anything wrong with the logic I'm following here?
Thanks
It is a limitation of SQL, as far as I know. Subqueries are not allowed in the COUNT expression. Obviously (select * from producers) is a subquery, so it's not allowed there.
I think your misunderstanding is that you're thinking that you would call the function like COUNT(SELECT * FROM producers) whereas in SQL it's like SELECT COUNT(*) FROM producers.
Functions like MAX, MIN, SUM, and COUNT are aggregate functions, meaning that they take a scalar argument but execute once for each row, accumulating results every iteration. So SELECT MAX(column) FROM table executes the MAX function once for each row in table, while you might be thinking that MAX executes once and gets passed in every row in table.
Contrast this with operators like IN, EXISTS, ANY, and ALL, which have a subquery as an argument. They are effectively passed all the results of their subquery every time they are invoked.
It should just be
Select count(1) from producers;
If you are asking about inner selects, then the inner select must be part of the from clause, e.g.
Select count(1) from (select * from producers)
Both of these do the same thing, but the first is more efficient.
COUNT() function expects only one value.
This will return what you want:
SELECT COUNT(*)
FROM producers p1
CROSS JOIN producers p2
GROUP BY p1.A
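A quick check of that query against the two-row producers table from the question, sketched with Python's built-in sqlite3 module. The second query, which puts a scalar subquery in the select list, is an alternative added here for comparison and is not part of the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE producers (A INTEGER, B INTEGER, C INTEGER)")
conn.executemany("INSERT INTO producers VALUES (?, ?, ?)",
                 [(0, 1, 2), (3, 4, 5)])

# Cross join pairs every row with every row; grouping by p1.A counts the pairs.
cross_counts = [row[0] for row in conn.execute("""
    SELECT COUNT(*)
    FROM producers p1
    CROSS JOIN producers p2
    GROUP BY p1.A
""")]

# Alternative: a scalar subquery evaluated for each outer row.
scalar_counts = [row[0] for row in conn.execute(
    "SELECT (SELECT COUNT(*) FROM producers) FROM producers"
)]

print(cross_counts)   # [2, 2]
print(scalar_counts)  # [2, 2]
```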
"Why does Count() only expect a value? Doesn't Count() take as input a set of rows? "
Count accepts an expression. If the expression evaluates to null, then it isn't counted. Otherwise it is.
If a table has five rows in it, and one column has three actual values and two null values, then a count of that column would return three.
ID Colour Size
1 Red 30
2 Blue <null>
3 <null> 20
4 <null> <null>
5 Blue 10
SELECT COUNT(COLOUR), COUNT(SIZE) would give 3 and 3.
COUNT(*) is special in that it gives the number of rows in the table, irrespective of any nulls.
COUNT can't/won't work with (select * from producers) or even (1,3) as they are not expressions that can be interpreted as null or not null.
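The NULL handling described above can be reproduced with the five-row example, here sketched with Python's built-in sqlite3 module. The table name items is an assumption, since the answer does not name the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (ID INTEGER, Colour TEXT, Size INTEGER)")
conn.executemany("INSERT INTO items VALUES (?, ?, ?)", [
    (1, "Red", 30),
    (2, "Blue", None),
    (3, None, 20),
    (4, None, None),
    (5, "Blue", 10),
])

# COUNT(*) counts rows; COUNT(column) skips rows where the column is NULL.
total, colours, sizes = conn.execute(
    "SELECT COUNT(*), COUNT(Colour), COUNT(Size) FROM items"
).fetchone()

print(total, colours, sizes)  # 5 3 3
```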

Assistance with SQL statement

I'm using sql-server 2005 and ASP.NET with C#.
I have Users table with
userId(int),
userGender(tinyint),
userAge(tinyint),
userCity(tinyint)
(simplified version of course)
I always need to select two users that fit the userID I pass to the query: users of the opposite gender, within an age range of -5 to +10 years, and from the same city.
An important fact is that it must always be two, so I created a condition: if @@rowcount<2, re-select without the age and city filters.
Now the problem is that I sometimes get two result sets returned when I run the query, because I first check @@rowcount on a select from the table.
Will it be a problem to use the DataReader object to always read from the second result set? Is there any other way to check how many rows were selected without performing a select that returns results?
Can you simplify it by using SELECT TOP 2 ?
Update: I would perform both selects all the time, union the results, and then select from them based on an order (using SELECT TOP 2), as the union may have added more than two. It's important that this next select picks the rows in order of importance, i.e. it prefers rows from your first select.
Alternatively, have the reader logic read the next result-set if there is one and leave the SQL alone.
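The union-then-pick-two idea above can be sketched as follows, using Python's built-in sqlite3 module in place of SQL Server (so LIMIT 2 stands in for TOP 2). The sample users, the prio column, and the helper pick_two are hypothetical, and the age window is assumed to mean the candidate's age lies between the user's age minus 5 and plus 10.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE Users (
    userId INTEGER, userGender INTEGER, userAge INTEGER, userCity INTEGER)""")
conn.executemany("INSERT INTO Users VALUES (?, ?, ?, ?)", [
    (1, 0, 30, 1),
    (2, 1, 28, 1),
    (3, 1, 50, 2),
    (4, 1, 60, 3),
    (5, 0, 30, 1),
])

def pick_two(user_id):
    """Return up to two candidate ids, preferring full matches (prio 1)."""
    sql = """
        SELECT userId FROM (
            SELECT 1 AS prio, m2.userId
            FROM Users m1 JOIN Users m2 ON m1.userGender <> m2.userGender
            WHERE m1.userId = :uid
              AND m2.userAge BETWEEN m1.userAge - 5 AND m1.userAge + 10
              AND m1.userCity = m2.userCity
            UNION ALL
            SELECT 2 AS prio, m2.userId
            FROM Users m1 JOIN Users m2 ON m1.userGender <> m2.userGender
            WHERE m1.userId = :uid
        )
        GROUP BY userId            -- one row per candidate, best priority wins
        ORDER BY MIN(prio), userId
        LIMIT 2
    """
    return [row[0] for row in conn.execute(sql, {"uid": user_id})]

print(pick_two(1))  # [2, 3]
print(pick_two(2))  # [1, 5]
```

Grouping by userId keeps a candidate who satisfies both branches from being returned twice.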
To avoid getting two separate result sets you can do your first SELECT into a table variable and then do your @@ROWCOUNT check. If >= 2 then just select from the table variable on its own, otherwise select the results of the table variable UNION ALLed with the results of the second query.
Edit: There is a slight overhead to using table variables, so you'd need to balance whether this was cheaper than Adam's suggestion just to perform the 'UNION' as a matter of routine, by looking at the execution stats for both approaches:
SET STATISTICS IO ON
Would something along the following lines be of use...
SELECT TOP 2 userId
FROM (SELECT 1 AS prio, M2.userId -- preferred: age and city also match
FROM my_table M1 JOIN my_table M2
ON M1.userGender <> M2.userGender
WHERE M1.userId = supplied_user_id AND
M2.userAge >= M1.userAge - 5 AND
M2.userAge <= M1.userAge + 10 AND
M1.userCity = M2.userCity
UNION ALL
SELECT 2 AS prio, M2.userId -- fallback: opposite gender only
FROM my_table M1 JOIN my_table M2
ON M1.userGender <> M2.userGender
WHERE M1.userId = supplied_user_id) candidates
GROUP BY userId -- a user matching both branches is returned once
ORDER BY MIN(prio);
I haven't tried it as I have no SQL Server and there may be dialect issues.

Returning more than one value from a sql statement

I was looking at sql inner queries (bit like the sql equivalent of a C# anon method), and was wondering, can I return more than one value from a query?
For example, return the number of rows in a table as one output value, and also, as another output value, return the distinct number of rows?
Also, how does distinct work? Is this based on whether one field may be the same as another (thus classified as "distinct")?
I am using Sql Server 2005. Would there be a performance penalty if I return one value from one query, rather than two from one query?
Thanks
You could do your first question by doing this:
SELECT
COUNT(field1),
COUNT(DISTINCT field2)
FROM table
(For the first field you could do * if needed to count null values.)
DISTINCT means what the word says: it eliminates duplicate rows from the result.
Returning 2 values instead of 1 would depend on what the values were, if they were indexed or not and other undetermined possible variables.
If you mean subqueries within the select list, then no, each one can only return 1 value. If you want more than 1 value you will have to use the subquery as a join.
If the inner query is inline in the SELECT, you may struggle to select multiple values. However, it is often possible to JOIN to a sub-query instead; that way, the sub-query can be named and you can get multiple results
SELECT a.Foo, a.Bar, x.[Count], x.[Avg]
FROM a
INNER JOIN (SELECT Something, COUNT(1) AS [Count], AVG(SomeValue) AS [Avg]
FROM b -- b and SomeValue are placeholder names
GROUP BY Something) x
ON x.Something = a.Something
Which might help.
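A runnable sketch of joining to a named sub-query that returns several values per key, using Python's built-in sqlite3 module. The tables a and b, their columns, and the sample rows are all made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (Foo TEXT, Something INTEGER)")
conn.execute("CREATE TABLE b (Something INTEGER, Val REAL)")
conn.executemany("INSERT INTO a VALUES (?, ?)", [("x", 1), ("y", 2)])
conn.executemany("INSERT INTO b VALUES (?, ?)",
                 [(1, 10.0), (1, 20.0), (2, 5.0)])

# The sub-query is aliased as x, so both of its aggregates can be selected.
rows = conn.execute("""
    SELECT a.Foo, x.cnt, x.avg_val
    FROM a
    INNER JOIN (SELECT Something,
                       COUNT(*) AS cnt,
                       AVG(Val) AS avg_val
                FROM b
                GROUP BY Something) x
      ON x.Something = a.Something
    ORDER BY a.Foo
""").fetchall()

print(rows)  # [('x', 2, 15.0), ('y', 1, 5.0)]
```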
DISTINCT does what it says. IIRC, you can SELECT COUNT(DISTINCT Foo) etc to query distinct data.
You can return multiple results in 3 ways (off the top of my head):
By having a select with multiple values, e.g.: select col1, col2, col3
With multiple queries, e.g.: select 1 ; select "2" ; select colA. You would get to them in a datareader by calling .NextResult()
Using output parameters: declare the parameters before executing the query, then get the value from them afterwards, e.g.: set @param1 = "2" . string myparam2 = sqlcommand.Parameters["@param1"].Value.ToString()
Distinct filters the resulting rows to be unique.
Inner queries in the form:
SELECT * FROM tbl WHERE fld in (SELECT fld2 FROM tbl2 WHERE tbl.fld = tbl2.fld2)
can only return a single column from the inner query (though it may return many rows). When you need multiple columns from a secondary query, you usually need to do an inner join on the other query.
rows:
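SELECT count(*), count(distinct fld) from tbl
will return a dataset with one row containing two columns. Column 1 is the total number of rows in the table. Column 2 counts only the distinct non-null values of fld. (count(distinct *) is not valid syntax; to count distinct whole rows, count over a SELECT DISTINCT sub-query instead.)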
Distinct means the returned dataset will not have any duplicate rows. Distinct can only appear once usually directly after the select. Thus a query such as:
SELECT distinct a, b, c FROM table
might have this result:
a1 b1 c1
a1 b1 c2
a1 b2 c2
a1 b3 c2
Note that values are duplicated across the whole result set but each row is unique.
I'm not sure what your last question means. You should return from a query all the data relevant to the query. As for faster, only benchmarking can tell you which approach is faster.