SQL - Can I have a Group By clause after a nested Select? - sql

For example:
Select max(date)
From table A
Where max(date) < any (select..
...)
Group By Book_Name,Client_Name
The idea is that max(date) would be compared against the result of the nested Select, as if the grouping of the outer Select had already been applied.

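What you want is typically done with the HAVING clause. WHERE filters individual rows before grouping, so an aggregate such as max(date) cannot appear there; HAVING filters the groups after they have been formed.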
Select Book_Name,Client_Name, max(date)
From table A
Group By Book_Name,Client_Name
HAVING max(date) < any (select..
...)
I removed reference to the other answer. I don't think it was correct and doesn't really help because I think HAVING is what you need.
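As a rough illustration of the HAVING approach, here is a minimal sketch using Python's built-in sqlite3 module. The tables loans and cutoffs and their columns are hypothetical, invented for this example. SQLite has no ANY operator, so the sketch relies on the fact that "x < ANY (subquery)" holds exactly when x is below the largest value the subquery returns.

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE loans (book_name TEXT, client_name TEXT, loan_date TEXT);
INSERT INTO loans VALUES
  ('Dune', 'Ann', '2020-01-05'),
  ('Dune', 'Ann', '2020-03-01'),
  ('Emma', 'Bob', '2021-06-10');
CREATE TABLE cutoffs (cutoff_date TEXT);
INSERT INTO cutoffs VALUES ('2019-01-01'), ('2020-12-31');
""")

# The aggregate is filtered in HAVING, after the groups are formed.
# "< ANY (subquery)" is written here as "< (SELECT MAX(...))".
having_rows = conn.execute("""
SELECT book_name, client_name, MAX(loan_date) AS last_date
FROM loans
GROUP BY book_name, client_name
HAVING MAX(loan_date) < (SELECT MAX(cutoff_date) FROM cutoffs)
""").fetchall()

print(having_rows)
```

Only the Dune/Ann group survives, because its latest date is earlier than the largest cutoff.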

Related

Get a new column with updated values, where each row change in time depending on the actual column?

I have some data with an ID column, a Date column, and a Place column denoted by a number. I need to simulate a real-time update by creating a new column that says how many different places have been seen so far, so that each time a new place appears in the Place column, the new column changes its value to reflect it.
This is just a little piece of the original table with hundreds of millions of rows.
Here is an example, the left table is the original one and the right table is what I need.
I tried to do it with this piece of code but I cannot use the function DISTINCT with the OVER clause.
SELECT ID, Dates, Place,
count (distinct(Place)) OVER (PARTITION BY Place ORDER BY Dates) AS
DiffPlaces
FROM #informacion_prendaria_muestra
order by ID;
I think this should be possible using DENSE_RANK() in SQL Server.
You can try this:
SELECT ID, Dates, Place,
DENSE_RANK() OVER(ORDER BY Place) AS
DiffPlaces
FROM #informacion_prendaria_muestra
I think you can use a self join query like this, without using window functions:
select
t.ID, t.[Date], t.Place,
count(distinct tt.Place) diffPlace
from
yourTable t left join
yourTable tt on t.ID = tt.ID and t.[Date] >= tt.[Date]
group by
t.ID, t.[Date], t.Place
order by
Id, [Date];
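To see what the self join computes, here is a small sketch run through Python's sqlite3 module. The table visits and its sample rows are made up for the example, and the column names are simplified from the question. Note that the join pairs every row with all earlier rows of the same ID, so on a table with hundreds of millions of rows this may be expensive.

```python
import sqlite3

# Hypothetical sample data: one ID visiting places 10, 10, 20, 30, 20 over time.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE visits (id INTEGER, dt TEXT, place INTEGER);
INSERT INTO visits VALUES
  (1, '2020-01-01', 10),
  (1, '2020-01-02', 10),
  (1, '2020-01-03', 20),
  (1, '2020-01-04', 30),
  (1, '2020-01-05', 20);
""")

# Each row t is joined to every row tt of the same ID dated on or before it,
# then the distinct places among those earlier rows are counted.
running_rows = conn.execute("""
SELECT t.id, t.dt, t.place, COUNT(DISTINCT tt.place) AS diff_places
FROM visits t
LEFT JOIN visits tt ON t.id = tt.id AND t.dt >= tt.dt
GROUP BY t.id, t.dt, t.place
ORDER BY t.id, t.dt
""").fetchall()

running_counts = [row[3] for row in running_rows]
print(running_counts)
```

The count only increases on the dates where a place appears for the first time.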
SQL Fiddle Demo

Get minimum without using row number/window function in Bigquery

I have a table like as shown below
What I would like to do is get the minimum of each subject. Though I am able to do this with row_number function, I would like to do this with groupby and min() approach. But it doesn't work.
row_number approach - works fine
SELECT * FROM (select subject_id,value,id,min_time,max_time,time_1,
row_number() OVER (PARTITION BY subject_id ORDER BY value) AS rank
from table A) WHERE RANK = 1
min() approach - doesn't work
select subject_id,id,min_time,max_time,time_1,min(value) from table A
GROUP BY SUBJECT_ID,id
As you can see, the two columns subject_id and id are enough to group the items together; they differentiate the groups. But why am I not able to use the other columns in the select clause? If I add the other columns to the grouping, I may not get the expected output because time_1 has different values.
I expect my output to be like as shown below
In BigQuery you can use aggregation for this:
SELECT ARRAY_AGG(a ORDER BY value LIMIT 1)[SAFE_OFFSET(0)].*
FROM table A
GROUP BY SUBJECT_ID;
This uses ARRAY_AGG() to aggregate each record (the a in the argument list). ARRAY_AGG() allows you to order the result (by value) and to limit the size of the array. The latter is important for performance.
After you aggregate the records into the array, you want its first element, which is at offset 0 because OFFSET indexing is zero-based. The .* transforms the record referred to by a into its component columns.
I'm not sure why you don't want to use ROW_NUMBER(). If the problem is the lingering rank column, you can easily remove it:
SELECT a.* EXCEPT (rank)
FROM (SELECT a.*,
ROW_NUMBER() OVER (PARTITION BY subject_id ORDER BY value) AS rank
FROM A
) a
WHERE RANK = 1;
Are you looking for something like the query below?
SELECT
A.subject_id,
A.id,
A.min_time,
A.max_time,
A.time_1,
A.value
FROM table A
INNER JOIN(
SELECT subject_id, MIN(value) Value
FROM table
GROUP BY subject_id
) B ON A.subject_id = B.subject_id
AND A.Value = B.Value
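The join-back pattern above can be tried out with a short sketch in Python's sqlite3 module. The table readings and its rows are hypothetical. One caveat worth keeping in mind: if two rows of the same subject share the minimum value, this pattern returns both of them.

```python
import sqlite3

# Hypothetical data: two subjects, two readings each.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE readings (subject_id INTEGER, id INTEGER, time_1 TEXT, value INTEGER);
INSERT INTO readings VALUES
  (1, 100, 't1', 5),
  (1, 100, 't2', 3),
  (2, 200, 't3', 7),
  (2, 200, 't4', 9);
""")

# Compute the minimum per subject in a derived table, then join back
# to recover the other columns of the row that holds that minimum.
min_rows = conn.execute("""
SELECT a.subject_id, a.id, a.time_1, a.value
FROM readings a
INNER JOIN (
    SELECT subject_id, MIN(value) AS value
    FROM readings
    GROUP BY subject_id
) b ON a.subject_id = b.subject_id AND a.value = b.value
ORDER BY a.subject_id
""").fetchall()

print(min_rows)
```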
If you do not need to select the Time_1 column's value, the following query will work (as far as I can see, the values in columns min_time and max_time are the same within each group):
SELECT
A.subject_id,A.id,A.min_time,A.max_time,
--A.time_1,
MIN(A.value)
FROM table A
GROUP BY
A.subject_id,A.id,A.min_time,A.max_time
Finally, if it is acceptable to keep only the date part of the time column, you can apply something like CAST(Time_1 AS DATE). This groups on the date regardless of the time of day. The query would be:
SELECT
A.subject_id,A.id,A.min_time,A.max_time,
CAST(A.time_1 AS DATE) Time_1,
MIN(A.value)
FROM table A
GROUP BY
A.subject_id,A.id,A.min_time,A.max_time,
CAST(A.time_1 AS DATE)
-- Check that the CAST ... AS DATE syntax in BigQuery
-- matches what is written here; it may differ slightly.
The query below is for BigQuery Standard SQL and is, in my experience, the most efficient way to handle cases like the one in your question:
#standardSQL
SELECT AS VALUE ARRAY_AGG(t ORDER BY value LIMIT 1)[OFFSET(0)]
FROM `project.dataset.table` t
GROUP BY subject_id
Using ROW_NUMBER is not efficient here and in many cases can lead to a "Resources exceeded" error.
Note: a self join is also a very inefficient way of achieving your objective.
A bit late to the party, but here is a cte-based approach which made sense to me:
with mins as (
select subject_id, id, min(value) as min_value
from table
group by subject_id, id
)
select distinct t.subject_id, t.id, t.time_1, t.min_time, t.max_time, m.min_value
from table t
join mins m on m.subject_id = t.subject_id and m.id = t.id

DISTINCT with CAST and GROUP BY

I'm trying to get a DISTINCT of the column FeedbackDT, but I can't seem to figure out why it doesn't work.
SQL Query:
SELECT COUNT(FeedbackID) as FeedbackID,
(SELECT DISTINCT CAST(feedbackDateTime AS DATE)) as FeedbackDT
FROM Feedback
WHERE feedBackDateTime <= GETDATE()
GROUP BY (feedbackDateTime)
The result of the executed query
I searched high and low but to no avail..
Appreciate any help, thanks..
Because your current query doesn't make much sense. When you use GROUP BY, you get the distinct values of the column you are grouping by (or the combination of columns, if you are using more than one). There's no need for the SELECT DISTINCT subquery that you are using.
It seems to me that you need to use a simple GROUP BY:
SELECT CAST(feedbackDateTime AS DATE) FeedbackDT,
COUNT(FeedbackID) as FeedbackID
FROM Feedback
WHERE feedBackDateTime <= GETDATE()
GROUP BY CAST(feedbackDateTime AS DATE)
;
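To make the effect of grouping on the date part concrete, here is a small sketch using Python's sqlite3 module. The sample rows are invented, and SQLite's date() function stands in for SQL Server's CAST(... AS DATE), which SQLite does not support in the same way.

```python
import sqlite3

# Hypothetical feedback rows: two on the same day, one on the next day.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Feedback (FeedbackID INTEGER, feedbackDateTime TEXT);
INSERT INTO Feedback VALUES
  (1, '2024-01-01 09:00:00'),
  (2, '2024-01-01 17:30:00'),
  (3, '2024-01-02 08:15:00');
""")

# Grouping by the date part collapses all timestamps of one day into one row.
# Grouping by the full timestamp would instead yield one row per feedback.
per_day = conn.execute("""
SELECT date(feedbackDateTime) AS FeedbackDT,
       COUNT(FeedbackID) AS FeedbackCount
FROM Feedback
GROUP BY date(feedbackDateTime)
ORDER BY FeedbackDT
""").fetchall()

print(per_day)
```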

Order by not working in Oracle subquery

I'm trying to return 7 events from a table, starting from today's date, and have them in date order:
SELECT ID
FROM table
where ID in (select ID from table
where DATEFIELD >= trunc(sysdate)
order by DATEFIELD ASC)
and rownum <= 7
If I remove the 'order by' it returns the IDs just fine and the query works, but it's not in the right order. Would appreciate any help with this since I can't seem to figure out what I'm doing wrong!
(edit) for clarification, I was using this before, and the order returned was really out:
select ID
from TABLE
where DATEFIELD >= trunc(sysdate)
and rownum <= 7
order by DATEFIELD
Thanks
The values for the ROWNUM "function" are assigned before the ORDER BY is processed. That is why it doesn't work the way you used it: the query picks 7 arbitrary rows first and only then sorts them. (See the manual for a similar explanation.)
When limiting a query using ROWNUM and an ORDER BY is involved, the ordering must be done in an inner select and the limit must be applied in the outer select:
select *
from (
select *
from table
where datefield >= trunc(sysdate)
order by datefield ASC
)
where rownum <= 7
You cannot use order by in a where id in (select id from ...) kind of subquery. It wouldn't make sense anyway: this condition only checks whether id is in the subquery's result. If it affects the order of the output, that is only incidental. With different data the query execution plan might be different, and the output order would be different as well. Use an explicit order by at the end of the main query.
It is a well-known 'feature' of Oracle that rownum doesn't play nicely with order by. See http://www.adp-gmbh.ch/ora/sql/examples/first_rows.html for more information. In your case you should use something like:
SELECT ID
FROM (select ID, row_number() over (order by DATEFIELD ) r
from table
where DATEFIELD >= trunc(sysdate))
WHERE r <= 7
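The row_number() variant can be sketched outside Oracle as well. The example below uses Python's sqlite3 module (window functions need SQLite 3.25 or later), with a hypothetical events table, a fixed date standing in for trunc(sysdate), and a limit of 3 instead of 7 to keep the data small.

```python
import sqlite3

# Hypothetical events inserted deliberately out of date order.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE events (id INTEGER, datefield TEXT);
INSERT INTO events VALUES
  (1, '2024-01-05'),
  (2, '2024-01-02'),
  (3, '2023-12-31'),
  (4, '2024-01-03'),
  (5, '2024-01-01'),
  (6, '2024-01-04');
""")

today = "2024-01-01"  # stands in for trunc(sysdate); an assumption for the demo

# The inner query numbers the rows in date order; the outer query
# keeps the first three and sorts them explicitly by that number.
top_ids = [row[0] for row in conn.execute("""
SELECT id
FROM (SELECT id, ROW_NUMBER() OVER (ORDER BY datefield) AS r
      FROM events
      WHERE datefield >= ?)
WHERE r <= 3
ORDER BY r
""", (today,))]

print(top_ids)
```

The numbering happens after the ordering, so the limit picks the earliest qualifying events rather than an arbitrary subset.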
See also:
http://www.orafaq.com/faq/how_does_one_select_the_top_n_rows_from_a_table
http://www.oracle.com/technetwork/issue-archive/2006/06-sep/o56asktom-086197.html
http://asktom.oracle.com/pls/asktom/f?p=100:11:507524690399301::::P11_QUESTION_ID:127412348064
See also other similar questions on SO, eg.:
Oracle SELECT TOP 10 records
Oracle/SQL - Select specified range of sequential records
Your outer query can't "see" the ORDER BY of the inner query, and in this case ordering the inner query doesn't make sense because it is only being used to create a subset of data for the WHERE of the outer one, so the order of this subset doesn't matter.
Maybe if you explain better what you want to do, we can help you.
ORDER BY clause in subqueries:
The ORDER BY clause is not allowed inside a subquery, with the exception of inline views. If you attempt to include an ORDER BY clause, you receive an error message.
An inline view is a query in the FROM clause.
SELECT t.*
FROM (SELECT id, name FROM student) t

SQL grammar for SELECT MIN(DATE)

I have a table with structure:
id(INT PK), title(VARCHAR), date(DATE)
How do I select all distinct titles with their earliest date?
Apparently, SELECT DISTINCT title, MIN(date) FROM table doesn't work.
You need to use GROUP BY instead of DISTINCT if you want to use aggregation functions.
SELECT title, MIN(date)
FROM table
GROUP BY title
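A quick way to check the GROUP BY version is to run it against a few rows. The sketch below uses Python's sqlite3 module with a hypothetical posts table matching the structure in the question.

```python
import sqlite3

# Hypothetical rows: title 'a' appears twice with different dates.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT, date TEXT);
INSERT INTO posts VALUES
  (1, 'a', '2020-02-01'),
  (2, 'a', '2020-01-01'),
  (3, 'b', '2021-05-05');
""")

# One output row per title, carrying the earliest date of that title.
earliest = conn.execute("""
SELECT title, MIN(date) AS first_date
FROM posts
GROUP BY title
ORDER BY title
""").fetchall()

print(earliest)
```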
In standard SQL, selecting a plain column alongside an aggregate function requires a GROUP BY on that column.
This is "Get minimum date per title" in plain language
SELECT title, MIN(date) FROM table GROUP BY title
Most RDBMSs and the standard require that every selected column is either in the GROUP BY or wrapped in an aggregate function (MIN, COUNT, etc.). MySQL is the notable exception, with some extensions that give unpredictable behaviour.
You are missing a GROUP BY here.
SELECT title, MIN (date) FROM table GROUP BY title
Above should fix this. And you don't even need a DISTINCT now.
If you want the most recent date per title instead, you can use the following query.
SELECT title, MAX(date) FROM table GROUP BY title
SELECT MIN(Date) AS Date FROM tbl_Employee /*To get First date Of Employee*/
To get the titles whose earliest date falls within the last week (with date_key_no stored as a number in YYYYMMDD form), use this:
SELECT title, MIN(date_key_no) AS intro_date FROM table GROUP BY title HAVING MIN(date_key_no) >= TO_NUMBER(TO_CHAR(SYSDATE - 7, 'YYYYMMDD'))
SELECT MIN(t.date)
FROM table t