I'm having trouble understanding how this subquery calculates the number of rides for each bike station.
Can someone help me break it down and understand it?
SELECT
station_id,
name,
number_of_rides
FROM
(
SELECT
start_station_id,
COUNT(*) number_of_rides
FROM
bigquery-public-data.new_york.citibike_trips
GROUP BY
start_station_id
)
INNER JOIN
bigquery-public-data.new_york_citibike.citibike_stations ON
station_id = start_station_id
ORDER BY
number_of_rides DESC
Below I attach the result of the query I ran.
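One way to break it down is to run the inner query on its own first, then the whole thing. Here is a minimal sketch using Python's built-in sqlite3 with a handful of made-up rows (the table contents and the station names 'Alpha', 'Beta', 'Gamma' are assumptions for illustration, not the real BigQuery data). The subquery collapses the trips to one row per start_station_id with its COUNT(*), and the outer INNER JOIN only attaches the station name to each of those rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE citibike_trips (start_station_id INTEGER);
CREATE TABLE citibike_stations (station_id INTEGER, name TEXT);
-- Hypothetical data: station 1 has 3 rides, station 2 has 2, station 3 has 1.
INSERT INTO citibike_trips VALUES (1), (1), (1), (2), (2), (3);
INSERT INTO citibike_stations VALUES (1, 'Alpha'), (2, 'Beta'), (3, 'Gamma');
""")

# Step 1: the subquery alone. GROUP BY produces one row per station,
# and COUNT(*) counts the trips that fell into each group.
per_station = conn.execute("""
SELECT start_station_id, COUNT(*) AS number_of_rides
FROM citibike_trips
GROUP BY start_station_id
ORDER BY start_station_id
""").fetchall()

# Step 2: the full query. The subquery result is treated like a table
# and joined to the stations table to pick up the name.
rows = conn.execute("""
SELECT station_id, name, number_of_rides
FROM (
    SELECT start_station_id, COUNT(*) AS number_of_rides
    FROM citibike_trips
    GROUP BY start_station_id
) AS rides
INNER JOIN citibike_stations ON station_id = start_station_id
ORDER BY number_of_rides DESC
""").fetchall()

print(per_station)
print(rows)
```

So the counting happens entirely inside the parentheses; the outer SELECT never sees individual trips.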
How can I group the following query to the time frame in CrateDB?
SELECT * FROM (
SELECT
(
SELECT
date_bin('1 day'::INTERVAL, time_index, 0) AS time_frame,
count(*) FROM schema.status
WHERE processstatus IN ('State_01')
GROUP BY time_frame
ORDER BY time_frame DESC
) AS parts_good,
(
SELECT
date_bin('1 day'::INTERVAL, time_index, 0) AS time_frame,
count(*) FROM schema.status
WHERE processstatus IN ('State_02')
GROUP BY time_frame
ORDER BY time_frame DESC
) AS parts_bad
)
At the moment I'm getting the following error:
Error! UnsupportedFeatureException[Subqueries with more than 1 column are not supported.]
Maybe with a JOIN I can make it work, but I would like, if possible, to avoid the declaration of date_bin(), GROUP BY and ORDER BY in each SELECT statement, any suggestions?
Thanks!
I am not entirely sure what you are trying to achieve, but the following query returns the good and bad part counts for every time_frame:
SELECT
date_bin('1 day'::INTERVAL, time_index, 0) AS time_frame,
count(*) FILTER ( WHERE processstatus = 'State_01') AS "parts_good",
count(*) FILTER ( WHERE processstatus = 'State_02') AS "parts_bad"
FROM schema.status
GROUP BY time_frame
ORDER BY time_frame DESC
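To see the single-pass idea in action outside CrateDB, here is a rough sketch with Python's sqlite3. It is not CrateDB syntax: date() stands in for date_bin(), and SUM(CASE ...) is used as a portable equivalent of count(*) FILTER (WHERE ...). The sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE status (time_index TEXT, processstatus TEXT);
-- Hypothetical sample data spanning two days.
INSERT INTO status VALUES
    ('2023-01-01 08:00:00', 'State_01'),
    ('2023-01-01 09:00:00', 'State_01'),
    ('2023-01-01 10:00:00', 'State_02'),
    ('2023-01-02 08:00:00', 'State_02'),
    ('2023-01-02 09:00:00', 'State_03');
""")

# One scan, one GROUP BY: each conditional sum only counts the rows
# whose processstatus matches, so both columns share the same time_frame.
rows = conn.execute("""
SELECT date(time_index) AS time_frame,
       SUM(CASE WHEN processstatus = 'State_01' THEN 1 ELSE 0 END) AS parts_good,
       SUM(CASE WHEN processstatus = 'State_02' THEN 1 ELSE 0 END) AS parts_bad
FROM status
GROUP BY time_frame
ORDER BY time_frame DESC
""").fetchall()

print(rows)
```

The binning expression, GROUP BY and ORDER BY each appear only once, which is what the question was after.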
I am getting errors when I try to show the movie(s) with the highest number of reviews.
My query is as follows:
SELECT movieName, Count(*) NoOfReviews
FROM MovieReviews
where Count(*) NoOfReviews = (Select MAX(NoOfReviews))
Group by movieName
It keeps giving me an error but I am not sure why. Any input would be appreciated.
You can just order by and limit:
SELECT movieName, Count(*) NoOfReviews
FROM MovieReviews
GROUP BY movieName
ORDER BY NoOfReviews DESC
FETCH FIRST ROW WITH TIES
This gives you the movie with most reviews, ties included.
Note that the row limiting clause is only available starting with Oracle 12. In earlier versions, one option is RANK():
SELECT movieName, NoOfReviews
FROM (
SELECT movieName, Count(*) NoOfReviews, RANK() OVER(ORDER BY Count(*) DESC) rn
FROM MovieReviews
GROUP BY movieName
) t
WHERE rn = 1
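For anyone who wants to try the RANK() approach without an Oracle instance, here is a small sketch using Python's sqlite3 (window functions need SQLite 3.25 or later). The data is made up, and the counts are computed in a CTE first, which is a simplification on my part rather than the exact query above. It also shows why the sort direction inside OVER matters: descending puts the most-reviewed movies at rank 1, ascending puts the least-reviewed there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MovieReviews (movieName TEXT, review TEXT);
-- Hypothetical data: A and B tie with 2 reviews each, C has 1.
INSERT INTO MovieReviews VALUES
    ('A', 'x'), ('A', 'y'), ('B', 'x'), ('B', 'y'), ('C', 'x');
""")

QUERY = """
WITH counts AS (
    SELECT movieName, COUNT(*) AS NoOfReviews
    FROM MovieReviews
    GROUP BY movieName
)
SELECT movieName, NoOfReviews
FROM (
    SELECT movieName, NoOfReviews,
           RANK() OVER (ORDER BY NoOfReviews {direction}) AS rn
    FROM counts
) t
WHERE rn = 1
ORDER BY movieName
"""

# RANK() gives tied rows the same rank, so both top movies survive rn = 1.
most_reviewed = conn.execute(QUERY.format(direction="DESC")).fetchall()
least_reviewed = conn.execute(QUERY.format(direction="ASC")).fetchall()

print(most_reviewed)
print(least_reviewed)
```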
You filter the results of an aggregation query using a HAVING clause, not WHERE. That said, this is rather cumbersome to do with HAVING, and window functions are a better solution anyway:
SELECT *
FROM (SELECT movieName, Count(*) as NoOfReviews,
MAX(count(*)) OVER () as max_NoOfReviews
FROM MovieReviews
GROUP BY movieName
) mr
WHERE NoOfReviews = max_NoOfReviews
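For comparison, this is roughly what the "cumbersome" HAVING version looks like. It is only a sketch, run here through Python's sqlite3 with made-up rows: the aggregation has to be written twice, once for the result and once nested to find the maximum count.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MovieReviews (movieName TEXT, review TEXT);
-- Hypothetical data: Up and Heat tie with 3 reviews, Alien has 1.
INSERT INTO MovieReviews VALUES
    ('Up', 'a'), ('Up', 'b'), ('Up', 'c'),
    ('Heat', 'a'), ('Heat', 'b'), ('Heat', 'c'),
    ('Alien', 'a');
""")

# HAVING filters groups after aggregation; the scalar subquery
# recomputes every group's count just to find the largest one.
rows = conn.execute("""
SELECT movieName, COUNT(*) AS NoOfReviews
FROM MovieReviews
GROUP BY movieName
HAVING COUNT(*) = (
    SELECT MAX(cnt)
    FROM (SELECT COUNT(*) AS cnt FROM MovieReviews GROUP BY movieName)
)
ORDER BY movieName
""").fetchall()

print(rows)
```

It returns the same tied rows as the window-function query, just with the GROUP BY repeated.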
The results are below. I need to get the records (seller and purchaser) with the max count, grouped by purchaser (marked in yellow).
You can use window functions:
with q as (
<your query here>
)
select q.*
from (select q.*,
row_number() over (order by seller desc) as seqnum_s,
row_number() over (order by purchaser desc) as seqnum_p
from q
) q
where seqnum_s = 1 or seqnum_p = 1;
Try this:
SELECT COUNT,seller,purchaser FROM YourTable ORDER BY seller,purchaser DESC
SELECT T2.MaxCount,T2.purchaser,T1.Seller FROM <Yourtable> T1
Inner JOIN
(
Select Max(Count) as MaxCount, purchaser
FROM <Yourtable>
GROUP BY Purchaser
)T2
On T2.Purchaser=T1.Purchaser AND T2.MaxCount=T1.Count
First you select the Seller from <Yourtable>, which gives you a list of all 5 sellers. Then you write another query that selects only the Purchaser and the Max(Count), grouped by Purchaser, which gives you the two yellow-marked lines. Join the two queries on Purchaser and Max(Count), and add the columns from the joined query to your first query.
I can't think of a faster way, but this works pretty fast even with rather large queries. You can further order the fields as needed.
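Here is a small runnable sketch of that join-to-the-group-maximum pattern, using Python's sqlite3. The table name sales, the column cnt (standing in for the Count column) and the five rows are all assumptions, since the original data was only posted as an image.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (seller TEXT, purchaser TEXT, cnt INTEGER);
-- Hypothetical data: 5 sellers, 2 purchasers.
INSERT INTO sales VALUES
    ('S1', 'P1', 5), ('S2', 'P1', 9),
    ('S3', 'P2', 4), ('S4', 'P2', 7), ('S5', 'P2', 2);
""")

# T2 holds one row per purchaser with that purchaser's highest count.
# Joining back on (purchaser, count) recovers the seller for that row.
rows = conn.execute("""
SELECT T2.MaxCount, T2.purchaser, T1.seller
FROM sales T1
INNER JOIN (
    SELECT MAX(cnt) AS MaxCount, purchaser
    FROM sales
    GROUP BY purchaser
) T2
  ON T2.purchaser = T1.purchaser AND T2.MaxCount = T1.cnt
ORDER BY T2.purchaser
""").fetchall()

print(rows)
```

Note that if two sellers share the maximum count for the same purchaser, both rows come back.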
I have a few sales offices, together with their sales. I am trying to set up a report that shows how each office is performing. Getting SUMs and COUNTs is easy, but I am struggling to get the rank of a single office.
I would like this query to return the rank of a single office, over the entire period and/or a specified time range (e.g. BETWEEN '2015-01-01' AND '2015-01-15').
I also need to exclude some offices from the ranking (e.g. OfficeName NOT IN ('GGG','QQQ')), so using the sample data, the rank of office 'XYZ' would be 5.
If OfficeName = 'XYZ' is included in the WHERE clause, the RANK is obviously 1, because SQL filters out the rows that do not match the WHERE clause before it evaluates the rest of the query.
Is there any way of doing this without using a temporary table?
SELECT OfficeName, SUM(Value) as SUM,
RANK() OVER (ORDER BY SUM(VALUE) DESC) AS Rank
FROM Transactions t
JOIN Office o ON t.TransID=o.ID
WHERE OfficeName NOT IN ('GGG','QQQ')
--AND OfficeName = 'XYZ'
GROUP BY OfficeName
ORDER BY 2 DESC;
I am using MS SQL server 2008.
SQL Fiddle with some random data is here: http://sqlfiddle.com/#!3/fac7a/35
Many thanks for help!
If I understand you correctly, you want to do this:
SELECT *
FROM (
SELECT OfficeName, SUM(Value) as SUM,
RANK() OVER (ORDER BY SUM(VALUE) DESC) AS Rank
FROM Transactions t
JOIN Office o ON t.TransID=o.ID
WHERE OfficeName NOT IN ('GGG','QQQ')
GROUP BY OfficeName
) dat
WHERE OfficeName = 'XYZ';
You just need to wrap your code in a derived table or use a CTE like this, and then apply the filter for OfficeName = 'XYZ' outside it.
;WITH CTE AS
(
SELECT OfficeName, SUM(Value) as SUM,
RANK() OVER (ORDER BY SUM(VALUE) DESC) AS Rank
FROM Transactions t
JOIN Office o ON t.TransID=o.ID
WHERE OfficeName NOT IN ('GGG','QQQ')
GROUP BY OfficeName
)
SELECT *
FROM CTE
WHERE OfficeName = 'XYZ';
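The difference between filtering before and after the ranking is easy to check on toy data. The sketch below uses Python's sqlite3 (SQLite 3.25+ for window functions) with a single hypothetical Sales table instead of the Transactions/Office join, and computes the totals in a CTE before ranking; both are simplifications of mine, not the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sales (OfficeName TEXT, Value INTEGER);
-- Hypothetical data; GGG is the office excluded from the ranking.
INSERT INTO Sales VALUES
    ('AAA', 100), ('AAA', 50), ('BBB', 120),
    ('CCC', 90), ('XYZ', 80), ('GGG', 500);
""")

# Filter inside: only XYZ reaches RANK(), so it trivially ranks first.
filtered_first = conn.execute("""
WITH totals AS (
    SELECT OfficeName, SUM(Value) AS Total
    FROM Sales
    WHERE OfficeName NOT IN ('GGG', 'QQQ') AND OfficeName = 'XYZ'
    GROUP BY OfficeName
)
SELECT OfficeName, Total, RANK() OVER (ORDER BY Total DESC) AS Rnk
FROM totals
""").fetchall()

# Filter outside: every remaining office is ranked, then XYZ is picked out.
ranked_first = conn.execute("""
WITH totals AS (
    SELECT OfficeName, SUM(Value) AS Total
    FROM Sales
    WHERE OfficeName NOT IN ('GGG', 'QQQ')
    GROUP BY OfficeName
),
ranked AS (
    SELECT OfficeName, Total, RANK() OVER (ORDER BY Total DESC) AS Rnk
    FROM totals
)
SELECT OfficeName, Total, Rnk FROM ranked WHERE OfficeName = 'XYZ'
""").fetchall()

print(filtered_first)
print(ranked_first)
```

With these rows XYZ comes out as rank 1 when filtered first and rank 4 when ranked first, which is the behaviour the derived table or CTE is meant to fix.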
Here is an amusing way to do this without a subquery:
SELECT TOP 1 OfficeName, SUM(Value) as SUM,
RANK() OVER (ORDER BY SUM(VALUE) DESC) AS Rank
FROM Transactions t JOIN
Office o
ON t.TransID = o.ID
WHERE OfficeName NOT IN ('GGG','QQQ')
GROUP BY OfficeName
ORDER BY (CASE WHEN OfficeName = 'XYZ' THEN 1 ELSE 2 END);
I'm using ROW_NUMBER() and a derived table to fetch data from the derived table result.
However, I get the error message telling me I don't have the appropriate columns in the GROUP BY clause.
Here's the error:
Column 'tblCompetition.objID' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause
What column am I missing? Or am I doing something else wrong? Below are the query that is not working and the simpler query that is working.
SQL Server 2008.
Query that isn't working:
SELECT
objID,
objTypeID,
userID,
datAdded,
count,
sno
FROM
(
SELECT scc.objID,scc.objTypeID,scc.userID,scc.datAdded,
COUNT(sci.favID) as count,
ROW_NUMBER() OVER(PARTITION BY scc.userID ORDER BY scc.unqID DESC) as sno
FROM tblCompetition scc
LEFT JOIN tblFavourites sci
ON sci.favID = scc.objID
AND sci.datTimeStamp BETWEEN #datStart AND #datEnd
) as t
WHERE sno <= 2 AND objTypeID = #objTypeID
AND datAdded BETWEEN #datStart AND #datEnd
GROUP BY objID,objTypeID,userID,datAdded,count,sno
Simple query that is working:
SELECT objId,objTypeID,userId,datAdded FROM
(
SELECT objId,objTypeID,userId,datAdded,
ROW_NUMBER() OVER(PARTITION BY userId ORDER BY unqid DESC) as sno
FROM tblRdbCompetition
) as t
WHERE sno<=2 AND objtypeid=#objTypeID
AND datAdded BETWEEN #datStart AND #datEnd
Thank you!
You need the GROUP BY in your subquery, since that's where the aggregate is:
SELECT
objID,
objTypeID,
userID,
datAdded,
count,
sno
FROM
(
SELECT scc.objID,scc.objTypeID,scc.userID,scc.datAdded,
COUNT(sci.favID) as count,
ROW_NUMBER() OVER(PARTITION BY scc.userID ORDER BY scc.unqID DESC) as sno
FROM tblCompetition scc
LEFT JOIN tblFavourites sci
ON sci.favID = scc.objID
AND sci.datTimeStamp BETWEEN #datStart AND #datEnd
GROUP BY scc.objID,scc.objTypeID,scc.userID,scc.datAdded,scc.unqID) as t
WHERE sno <= 2 AND objTypeID = #objTypeID
AND datAdded BETWEEN #datStart AND #datEnd
You cannot have count in a GROUP BY clause. In fact, the count is derived from the other fields in the GROUP BY. Remove count from your GROUP BY.
In the innermost query you are using
COUNT(sci.favID) as count,
which is an aggregate, and you select other non-aggregating columns along with it.
I believe you wanted an analytic COUNT instead:
SELECT objID,
objTypeID,
userID,
datAdded,
count,
sno
FROM (
SELECT scc.objID,scc.objTypeID,scc.userID,scc.datAdded,
COUNT(sci.favID) OVER (PARTITION BY scc.userID ) AS count,
ROW_NUMBER() OVER (PARTITION BY scc.userID ORDER BY scc.unqID DESC) as sno
FROM tblCompetition scc
LEFT JOIN
tblFavourites sci
ON sci.favID = scc.objID
AND sci.datTimeStamp BETWEEN #datStart AND #datEnd
) as t
WHERE sno = 1
AND objTypeID = #objTypeID
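To get a feel for what the analytic COUNT returns, here is a cut-down sketch using Python's sqlite3 (SQLite 3.25+). The tables are reduced to the columns that matter, the date and objTypeID filters are left out, the alias cnt replaces count, and the rows are invented, so treat it as an illustration rather than the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblCompetition (unqID INTEGER, objID INTEGER, userID INTEGER);
CREATE TABLE tblFavourites (favID INTEGER);
-- Hypothetical data: user 1 owns objects 10 and 11, user 2 owns object 12.
INSERT INTO tblCompetition VALUES (1, 10, 1), (2, 11, 1), (3, 12, 2);
-- Object 10 was favourited twice, object 11 once, object 12 never.
INSERT INTO tblFavourites VALUES (10), (10), (11);
""")

# COUNT(...) OVER (PARTITION BY userID) counts matching favourites per user
# without collapsing rows, so no GROUP BY is needed next to ROW_NUMBER().
rows = conn.execute("""
SELECT objID, userID, cnt, sno
FROM (
    SELECT scc.objID, scc.userID,
           COUNT(sci.favID) OVER (PARTITION BY scc.userID) AS cnt,
           ROW_NUMBER() OVER (PARTITION BY scc.userID
                              ORDER BY scc.unqID DESC) AS sno
    FROM tblCompetition scc
    LEFT JOIN tblFavourites sci ON sci.favID = scc.objID
) AS t
WHERE sno = 1
ORDER BY userID
""").fetchall()

print(rows)
```

Keep in mind that the count here is per user, not per object, and that the LEFT JOIN produces one row per favourite, so ROW_NUMBER() is numbering joined rows. If you need the count per object, the GROUP BY version in the other answer is the one to use.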