I have the following SQL query and I want to get the value just below the maximum from a table.
select max(card_no),vehicle_number
FROM WBG.WBG_01_01
group by vehicle_number
With this query I get the maximum card number for each vehicle, but I want the value just before that maximum. For example,
if a vehicle has card numbers 21, 19, 17, 10, 5, 6, 1, I want to get 19.
Can anyone please tell me how I can do this in SQL?
Another idea would be to use analytics, something like this:
select
vehicle_number,
prev_card_no
from (
select
card_no,
vehicle_number,
lag(card_no) over
(partition by vehicle_number order by card_no) as prev_card_no,
max(card_no) over
(partition by vehicle_number) as max_card_no
FROM WBG.WBG_01_01
)
where max_card_no = card_no;
Of course, this doesn't take into account your seemingly arbitrary ordering from your question, nor would it work with duplicate maximum numbers.
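Try this one (the subquery is correlated on vehicle_number so that it returns a single maximum per vehicle):
select max(t.card_no), t.vehicle_number
FROM WBG.WBG_01_01 t
where t.card_no < (select max(m.card_no) from WBG.WBG_01_01 m where m.vehicle_number = t.vehicle_number)
group by t.vehicle_number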
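To check the "largest value below the maximum" idea on the numbers from the question, here is a minimal sketch that runs a correlated-subquery version against an in-memory SQLite database. The table name wbg, the second vehicle V2 and the use of SQLite are assumptions made only for this illustration; the original table is WBG.WBG_01_01 on an unspecified database.

```python
import sqlite3

# In-memory stand-in for the real table (hypothetical name: wbg).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE wbg (vehicle_number TEXT, card_no INTEGER)")
rows = [("V1", n) for n in (21, 19, 17, 10, 5, 6, 1)] + [("V2", n) for n in (8, 3)]
conn.executemany("INSERT INTO wbg VALUES (?, ?)", rows)

# For each vehicle, keep only cards below that vehicle's own maximum,
# then take the maximum of what is left.
second_highest = conn.execute("""
    SELECT t.vehicle_number, MAX(t.card_no)
    FROM wbg t
    WHERE t.card_no < (SELECT MAX(m.card_no)
                       FROM wbg m
                       WHERE m.vehicle_number = t.vehicle_number)
    GROUP BY t.vehicle_number
    ORDER BY t.vehicle_number
""").fetchall()

print(second_highest)  # [('V1', 19), ('V2', 3)]
```

A vehicle with only one card number drops out of the result entirely, because no row is below its maximum.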
Related
I have a table 'exam_table' containing : User_ID, Exam_date, Exam_status.
Exam_status = ['Success' or 'Fail']
The question is :
Based on the above data, propose an SQL
query that finds the 5 candidates with the most failures. In case
of a tie, we want to list first the students whose first exam date is the furthest back in time.
I found the 5 candidates with the most failures but I still don't know how to sort them according to exam_date in case of equality.
Do you have any suggestions? Thank you in advance for your help!
Your ORDER BY clause takes a comma-separated list of ordering criteria, so you can easily add another criterion, like below:
SELECT User_ID, count(exam_status) as nb_Failures
FROM exam_table
WHERE Exam_status = 'Fail'
GROUP BY User_ID
ORDER BY nb_Failures DESC, min(exam_date)
LIMIT 5;
UPDATED:
Corrected to take the date of the first exam into account:
SELECT
user_id,
MIN (exam_date) AS first_exam_date,
SUM (
CASE exam_status
WHEN 'Fail' THEN 1
ELSE 0
END
) AS nb_failures
FROM exam_table
GROUP BY user_id
ORDER BY nb_failures DESC, first_exam_date ASC
LIMIT 5;
or like this:
SELECT
user_id,
MIN (exam_date) AS first_exam_date,
COUNT(exam_status) AS nb_failures
FROM exam_table
WHERE exam_status = 'Fail'
GROUP BY user_id
ORDER BY nb_failures DESC, first_exam_date ASC
LIMIT 5;
PS: an aggregate function must also be applied to the date, because exam_date is not part of the GROUP BY.
PPS: note that the two queries above give different results. The first takes the date of the user's first exam, regardless of whether it was a success or a failure. The second takes only the date of the first failed exam.
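The difference between the CASE-based and the WHERE-based query is easiest to see on a tiny dataset. The sketch below uses an in-memory SQLite database with three hypothetical users (A, B, C) and invented dates; the status value 'Fail' follows the question's description of Exam_status.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE exam_table (user_id TEXT, exam_date TEXT, exam_status TEXT)")
conn.executemany(
    "INSERT INTO exam_table VALUES (?, ?, ?)",
    [
        ("A", "2020-01-01", "Fail"),
        ("A", "2020-02-01", "Fail"),
        ("B", "2019-01-01", "Success"),  # B's very first exam was a success
        ("B", "2020-03-01", "Fail"),
        ("B", "2020-04-01", "Fail"),
        ("C", "2020-05-01", "Fail"),
    ],
)

# Variant 1: first exam date over ALL exams, failures counted with CASE.
by_first_exam = [r[0] for r in conn.execute("""
    SELECT user_id,
           MIN(exam_date) AS first_exam_date,
           SUM(CASE exam_status WHEN 'Fail' THEN 1 ELSE 0 END) AS nb_failures
    FROM exam_table
    GROUP BY user_id
    ORDER BY nb_failures DESC, first_exam_date ASC
    LIMIT 5
""")]

# Variant 2: only failed exams survive the WHERE, so MIN() is the first FAILED exam.
by_first_failure = [r[0] for r in conn.execute("""
    SELECT user_id,
           MIN(exam_date) AS first_exam_date,
           COUNT(exam_status) AS nb_failures
    FROM exam_table
    WHERE exam_status = 'Fail'
    GROUP BY user_id
    ORDER BY nb_failures DESC, first_exam_date ASC
    LIMIT 5
""")]

print(by_first_exam)     # ['B', 'A', 'C']
print(by_first_failure)  # ['A', 'B', 'C']
```

A and B both have two failures, so the tie-break decides the order, and it flips depending on which "first exam" date is used.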
I have a table with user, date and a column each for messages sent and messages received:
I want to get, for each date, the maximum ratio of messages sent to messages received, along with the user who has that ratio. So this is the output I expect:
Andrew Lean 10/2/2020 10
Andrew Harp 10/1/2020 6
This is my query:
SELECT
ds.date, ds.user_name, max(ds.ratio) from
(select a.user_name, a.date, a.message_sent/ a.message_received as ratio
from messages a
group by a.user_name, a.date) ds
group by ds.date
But the output I get is:
Andrew Lean 10/2/2020 10
Jalinn Kim 10/1/2020 6
In the above output 6 is the correct max ratio for the date grouped but the user is wrong. What am I doing wrong?
With a recent version of most databases, you could do something like this.
This assumes, as in your data, there's one row per user per day. If you have more rows per user per day, you'll need to provide a little more detail about how to combine them or ignore some rows. You might want to SUM them, but it's hard to know without more detail.
WITH cte AS (
select a.user_name, a.date
, a.message_sent / a.message_received AS ratio
, ROW_NUMBER() OVER (PARTITION BY a.date ORDER BY a.message_sent / a.message_received DESC) as rn
from messages a
)
SELECT t.user_name, t.date, t.ratio
FROM cte AS t
WHERE t.rn = 1
;
Note: There's no attempt to handle ties, where more than one user has the same ratio. We could use RANK (or other methods) for that, if your database supports it.
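As a rough illustration of the tie handling mentioned in the note, the sketch below swaps ROW_NUMBER for RANK so that every user sharing the top ratio on a date is returned. It runs on an in-memory SQLite database (window functions need SQLite 3.25 or later); the sample rows, including the tie on 2020-10-01, are invented for the example, and the ratio is multiplied by 1.0 because SQLite would otherwise do integer division.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages (user_name TEXT, date TEXT, "
    "message_sent INTEGER, message_received INTEGER)"
)
conn.executemany(
    "INSERT INTO messages VALUES (?, ?, ?, ?)",
    [
        ("Andrew Lean", "2020-10-02", 20, 2),  # ratio 10
        ("Jalinn Kim", "2020-10-02", 6, 3),    # ratio 2
        ("Andrew Harp", "2020-10-01", 12, 2),  # ratio 6
        ("Jalinn Kim", "2020-10-01", 18, 3),   # ratio 6, ties with Andrew Harp
    ],
)

top_per_date = conn.execute("""
    WITH ratios AS (
        SELECT user_name, date,
               1.0 * message_sent / message_received AS ratio
        FROM messages
    ),
    ranked AS (
        SELECT user_name, date, ratio,
               RANK() OVER (PARTITION BY date ORDER BY ratio DESC) AS rnk
        FROM ratios
    )
    SELECT date, user_name, ratio
    FROM ranked
    WHERE rnk = 1          -- RANK gives 1 to every row tied for the top ratio
    ORDER BY date, user_name
""").fetchall()

for row in top_per_date:
    print(row)
```

With ROW_NUMBER in place of RANK, only one of the two tied users on 2020-10-01 would be kept, and which one is not defined unless the ORDER BY includes a tie-breaker.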
Here, I am just calculating the ratio for each row in the CTE.
In the second part, I take the maximum of that ratio per date. This assumes each user has one row for each date.
Grouping by date and applying max() ensures that we always get the highest ratio for each date.
There could be ties between the ratios. For that we can use `ROW_NUMBER()` or `RANK()` to rank each row by whatever tie-breaking criteria we like, and then filter on the generated rank.
with data as (
select
date,
user_id,
messages_sent / messages_recieved as ratio
from [table name]
)
select
date,
max(ratio) as highest_ratio_per_date
from data
group by 1
Using the dataset hosted on Google (MLB data) as an example, here is what I am trying to do: obtain the running total over the last 3 weeks for a given venue.
My aggregated dataset looks like this, without the strikes_3wk column:
The logic for the strikes_3wk column is to partition the aggregated dataset by venueName, order by the YearWeek column, and then sum the aggregated strikes over the last 3 weeks.
Here is the query I have written so far. I see that the windowing function is where I need to modify the logic. So, is there a way to add grouping within the windowing function? Is there any alternative way of doing this?
In the image I added a new column 'expected', showing values for two weeks.
select inr.*
,sum(inr.strikes) over (Venue_Week rows between current row and 2 following) as strikes_3wk
from
(
select seasonType
,gameStatus
,homeTeamName
,awayTeamName
,venueName
,CAST(
CONCAT(
CAST(EXTRACT(YEAR FROM createdAt) as string)
,CAST(EXTRACT(WEEK(Monday) FROM createdAt) as string)
) as INT64)
as YearWeek
,sum(homeFinalRuns) as homeFinalRuns
,sum(strikes) as strikes
from `bigquery-public-data.baseball.games_wide`
where createdAt is not null
group by seasonType
,gameStatus
,homeTeamName
,awayTeamName
,venueName
,YearWeek
)inr
window Venue_Week as (
partition by inr.venueName
order by inr.YearWeek desc
)
So you are looking for strikes per venue regardless of who did them, right?
Maybe something like:
SELECT INR.*, STATS.strikes_3wk
FROM `bigquery-public-data.baseball.games_wide` INR
LEFT JOIN (
SELECT venueName, SUM(strikes) as strikes_3wk
FROM `bigquery-public-data.baseball.games_wide` INR2
WHERE YearWeek IN (
SELECT TOP 3 YearWeek
FROM `bigquery-public-data.baseball.games_wide`
WHERE venueName = INR2.venueName
ORDER BY YearWeek DESC
)
GROUP BY venueName
) STATS
ON INR.venueName = STATS.venueName
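One way to get "grouping inside the window" is to aggregate to one row per venue and week first, and only then apply the window function. The sketch below shows that two-step shape on an in-memory SQLite database (3.25 or later) rather than BigQuery; the games table, its three columns and the sample values are simplified stand-ins invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE games (venueName TEXT, yearWeek INTEGER, strikes INTEGER)")
conn.executemany(
    "INSERT INTO games VALUES (?, ?, ?)",
    [
        ("A", 201601, 10), ("A", 201601, 5),   # two games in the same week
        ("A", 201602, 7),
        ("A", 201603, 3),
        ("A", 201604, 1),
        ("B", 201601, 4),
    ],
)

rolling = conn.execute("""
    WITH weekly AS (
        -- Step 1: collapse to exactly one row per venue and week.
        SELECT venueName, yearWeek, SUM(strikes) AS strikes
        FROM games
        GROUP BY venueName, yearWeek
    )
    -- Step 2: the window now sees one row per week, so a 3-row frame
    -- covers the current week and the two weeks before it.
    SELECT venueName, yearWeek, strikes,
           SUM(strikes) OVER (
               PARTITION BY venueName
               ORDER BY yearWeek
               ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
           ) AS strikes_3wk
    FROM weekly
    ORDER BY venueName, yearWeek
""").fetchall()

for row in rolling:
    print(row)
```

The frame counts rows, not calendar weeks, so a venue with a missing week would silently reach further back; that case would need a calendar table or a range-based frame.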
I only have basic SQL skills. I'm working in SQL in Navicat. I've looked through the threads of people who were also trying to get the latest date, but I have not yet been able to apply them to my situation.
I am trying to get the latest date for each name, for each chemical. I think of it this way: "Within each chemical, look at data for each name, choose the most recent one."
I have tried using max(date(date)) but it needs to be nested or subqueried within chemical.
I also tried ranking by date(date) DESC, then using LIMIT 1. But I was not able to nest this within chemical either.
When I try to write it as a subquery, I keep getting an error on the ( . I've switched it up so that I am beginning the subquery a number of different ways, but the error returns near that area always.
Here is what the data looks like:
Here is one of my failed queries:
SELECT
WELL_NAME,
CHEMICAL,
RESULT,
APPROX_LAT,
APPROX_LONG,
DATE
FROM
data_all
ORDER BY
CHEMICAL ASC,
date( date ) DESC (
SELECT
WELL_NAME,
CHEMICAL,
APPROX_LAT,
APPROX_LONG,
DATE
FROM
data_all
WHERE
WELL_NAME = WELL_NAME
AND CHEMICAL = CHEMICAL
AND APPROX_LAT = APPROX_LAT
AND APPROX_LONG = APPROX_LONG,
LIMIT 2
)
If someone does have a response, it would be great if it is in as lay language as possible. I've only had one coding class. Thanks very much.
Maybe something like this?
SELECT WELL_NAME, CHEMICAL, MAX(DATE)
FROM data_all
GROUP BY WELL_NAME, CHEMICAL
If you want all the columns, then use the ANSI-standard ROW_NUMBER():
SELECT da.*
FROM (SELECT da.*,
ROW_NUMBER() OVER (PARTITION BY chemical, well_name ORDER BY date DESC) as seqnum
FROM data_all da
) da
WHERE seqnum = 1;
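In plain terms, ROW_NUMBER() numbers the rows inside each (chemical, well) group from newest to oldest, and keeping only number 1 leaves the most recent row of each group. The sketch below tries this on an in-memory SQLite database (3.25 or later); the sample wells, results and ISO-formatted dates are invented, and only a few of the question's columns are included.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE data_all (WELL_NAME TEXT, CHEMICAL TEXT, RESULT REAL, "DATE" TEXT)'
)
conn.executemany(
    "INSERT INTO data_all VALUES (?, ?, ?, ?)",
    [
        ("W1", "Arsenic", 1.0, "2019-01-01"),
        ("W1", "Arsenic", 2.0, "2020-06-01"),  # newest Arsenic row for W1
        ("W1", "Lead", 0.5, "2018-03-01"),
        ("W2", "Arsenic", 3.0, "2017-05-05"),  # newest Arsenic row for W2
        ("W2", "Arsenic", 4.0, "2016-05-05"),
    ],
)

latest = conn.execute("""
    SELECT WELL_NAME, CHEMICAL, RESULT, "DATE"
    FROM (
        SELECT da.*,
               ROW_NUMBER() OVER (
                   PARTITION BY CHEMICAL, WELL_NAME   -- one numbering per chemical and well
                   ORDER BY "DATE" DESC               -- newest row gets number 1
               ) AS seqnum
        FROM data_all da
    )
    WHERE seqnum = 1
    ORDER BY CHEMICAL, WELL_NAME
""").fetchall()

for row in latest:
    print(row)
```

Sorting text dates this way only works when they are stored in a sortable form such as YYYY-MM-DD; other formats would need converting first.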
I have a table with:
id desc total
1 baskets 25
2 baskets 15
3 baskets 75
4 noodles 10
I would like a query that outputs the rows whose totals sum to 40.
The output would look like this:
id desc total
1 baskets 25
2 baskets 15
I believe this will get you a list of the results you're looking for, but not with your example dataset, because no desc group in your example dataset has a total sum of 40.
-- desc is a reserved word, so it is quoted here (backticks, as in MySQL)
SELECT id, `desc`, total
FROM mytable
WHERE `desc` IN (
SELECT `desc`
FROM mytable
GROUP BY `desc`
HAVING SUM(total) = 40
)
Select Desc,SUM(Total) as SumTotal
from Table
group by desc
having SUM(Total) >= 40
Not quite sure what you want, but this may get you started
SELECT `desc`, SUM(Total) Total
FROM TableName
GROUP BY `desc`
HAVING SUM(Total) = 40
From reading your question, it sounds like you want a query that returns any subset of rows that share the same description and whose totals add up to a certain target value.
There is no simple way to do this. This moves into algorithmic territory.
Assuming I am correct about what you are after, GROUP BY and aggregate functions will not solve your problem. They aggregate whole groups; they cannot try out subsets of the data until all possible combinations are exhausted and the sums that match your requirement are found.
You will have to mix an algorithm into your SQL, e.g. in a stored procedure.
Or simply fetch all the rows that match the desc from the database and then run your algorithm on them in code.
I recall from a CS algorithms class I took that this is a known problem, the subset sum problem:
http://en.wikipedia.org/wiki/Subset_sum_problem
I believe you could just adapt a working version of this algorithm to solve your problem.
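As a sketch of the "fetch the rows, then run the algorithm in code" route, here is a brute-force subset-sum search over the example rows from the question. The function name subsets_with_sum and the tuple layout (id, desc, total) are made up for this illustration, and the approach is deliberately naive: it tries every combination, so it is only reasonable for small groups.

```python
from itertools import combinations

# Example rows from the question: (id, desc, total)
rows = [
    (1, "baskets", 25),
    (2, "baskets", 15),
    (3, "baskets", 75),
    (4, "noodles", 10),
]

def subsets_with_sum(rows, target):
    """Return every set of rows sharing a desc whose totals add up to target."""
    # Group the rows by description first.
    by_desc = {}
    for row in rows:
        by_desc.setdefault(row[1], []).append(row)

    found = []
    for group in by_desc.values():
        # Try subsets of every size: 1 row, 2 rows, ..., the whole group.
        for size in range(1, len(group) + 1):
            for combo in combinations(group, size):
                if sum(r[2] for r in combo) == target:
                    found.append(list(combo))
    return found

matches = subsets_with_sum(rows, 40)
print(matches)  # [[(1, 'baskets', 25), (2, 'baskets', 15)]]
```

The number of subsets doubles with every extra row in a group, which is why this is usually handled in application code or with a dynamic-programming variant rather than in a single query.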
select desc
from (select desc, sum(total) as ct group by desc)