Problem with MySQL query for high scores leaderboard - sql

I have a MySQL high scores table for a game that shows the daily high score for each of the past days of the year. Right now I am using a PHP for-loop that makes a separate query for each day, but the table is becoming too large for that, so I would like to condense it into one simple MySQL statement.
Here is my current query (date_submitted is a timestamp):
SELECT date(date_submitted) as subDate, name, score FROM highScores WHERE date_submitted > "2009-07-16" GROUP BY subDate ORDER BY subDate DESC, score DESC LIMIT 10;
output:
+------------+------------+--------+
| subDate    | name       | score  |
+------------+------------+--------+
| 2010-07-18 | krissy     | 959976 |
| 2010-07-10 | claire     | 260261 |
| 2010-07-05 | krissy     | 771416 |
| 2010-06-19 | krissy     | 698031 |
| 2010-06-18 | otli       | 264898 |
| 2010-06-15 | robbie     |  82303 |
| 2010-06-01 | dad        | 480469 |
| 2010-05-29 | vicente    | 124149 |
| 2010-05-27 | dad        | 564007 |
| 2010-05-26 | caleb      | 502623 |
+------------+------------+--------+
My problem is that when it grouped by subDate, it did not take the highest score of the day; it took the score with the earliest timestamp of that day, as you can see in the next query:
SELECT name, score, date_submitted FROM highScores WHERE date(date_submitted)='2010-06-15' GROUP BY name ORDER BY score DESC;
output:
+--------+--------+---------------------+
| name   | score  | date_submitted      |
+--------+--------+---------------------+
| john   | 304095 | 2010-06-15 22:58:02 |
| april  | 247126 | 2010-06-15 21:25:31 |
| orli   | 166021 | 2010-06-15 21:25:31 |
| robbie |  82303 | 2010-06-15 11:38:39 |
+--------+--------+---------------------+
As you can see, poor john should have been the leader for 2010-06-15. Can anyone help? Hopefully it is something real simple I am overlooking. I tried using max(score) before the FROM part in the 1st query and it gave me the correct score but didn't carry over the name.
Thank you for any help.

SELECT userName, userScore, subDate FROM (
    SELECT
        userName,
        userScore,
        DATE(submitDate) AS subDate,
        -- rn restarts at 1 whenever the day changes
        @rn := CASE WHEN @subDate = DATE(submitDate)
                    THEN @rn + 1
                    ELSE 1
               END AS rn,
        @subDate := DATE(submitDate)
    FROM (SELECT @subDate := NULL, @rn := 0) vars, highScores
    -- sort by day (not by the full timestamp) so each day's top score comes first
    ORDER BY DATE(submitDate), userScore DESC
) deriv
WHERE rn = 1;
See also the answer to another 'highest record per something'-question

Add a
ORDER BY userScore DESC
at the end of the second query.
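For comparison, here is a minimal sketch of another common way to get the top row per day: join the table to a subquery that holds each day's maximum score. It runs against an in-memory SQLite database from Python rather than MySQL, so treat it as an illustration of the pattern, not a drop-in query. The rows come from the question's output, except the time of day on the 2010-06-18 row, which is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE highScores (name TEXT, score INTEGER, date_submitted TEXT)"
)
conn.executemany(
    "INSERT INTO highScores VALUES (?, ?, ?)",
    [
        ("john", 304095, "2010-06-15 22:58:02"),
        ("april", 247126, "2010-06-15 21:25:31"),
        ("orli", 166021, "2010-06-15 21:25:31"),
        ("robbie", 82303, "2010-06-15 11:38:39"),
        ("otli", 264898, "2010-06-18 09:00:00"),  # time of day is an assumption
    ],
)

# Inner query: best score per calendar day.
# Outer join: fetch the full row (including the name) that holds that score.
daily_leaders = conn.execute(
    """
    SELECT date(h.date_submitted) AS subDate, h.name, h.score
    FROM highScores AS h
    JOIN (
        SELECT date(date_submitted) AS d, MAX(score) AS best
        FROM highScores
        GROUP BY date(date_submitted)
    ) AS m
      ON date(h.date_submitted) = m.d AND h.score = m.best
    ORDER BY subDate DESC
    """
).fetchall()

for row in daily_leaders:
    print(row)
```

With this data john is reported as the leader for 2010-06-15. If two players tie for a day's best score, this pattern returns both rows.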

Related

Cumulative SUM in a query (SQL access)

Using MS Access SQL, I have a query (actually a UNION made of multiple queries) and need a cumulative sum (actually a statement of account whose items are in chronological order).
How do I get a cumulative sum?
Since there are duplicate dates, I have to add a new ID; however, SQL in MS Access does not seem to have ROW_ID or similar.
So, we need to sort donation data into chronological order across multiple tables with duplicates. First, combine all the donator tables in one query, which keeps the syntax simple. Then, to put things in order, we need a rule for ordering the duplicate dates. The dataset has two natural ways to break ties on date: the donator and the amount. For instance, we could decide that, within a date, bigger donations come first. If the rule is complicated enough, we can abstract it into a public function in a code module and include it in the query so that we can sort by it:
'Sorted Donations:'
SELECT (BestDonator(q.donator)) as BestDonator, *
FROM tblCountries as q
UNION SELECT (BestDonator(j.donator)) as BestDonator, *
FROM tblIndividuals as j
ORDER BY EvDate Asc, Amount DESC , BestDonator DESC;
Public Function BestDonator(donator As String) As Long
BestDonator = Len(donator) 'longer names are better :)'
End Function
With sorted donations, we have settled on an order for the duplicate dates and have combined both individual and country donations, so now we can calculate the running sum directly using either DSum or a subquery. There is no need to calculate a row ID. The tricky part is getting the syntax correct. I ended up abstracting the running sum calculation into a function and omitting BestDonator, because I couldn't easily paste this query together in the query designer and I ran out of time to fix bugs:
Public Function RunningSum(EvDate As Date, Amount As Currency)
RunningSum = DSum("Amount", "Sorted Donations", "(EvDate < #" & [EvDate] & "#) OR (EvDate = #" & [EvDate] & "# AND Amount >= " & [Amount] & ")")
End Function
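Carefully note the OR in the DSum criteria of the RunningSum calculation: a row's running sum includes every row with an earlier date, plus the rows on the same date whose amount is at least as large (those sort ahead of it). This is the tricky part of summing the right amounts.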
'output
-------------------------------------------------------------------------------------
| donator | EvDate | Amount | RunningSum |
-------------------------------------------------------------------------------------
| Reiny | 1/10/2020 | 321 | 321 |
-------------------------------------------------------------------------------------
| Czechia | 3/1/2020 | 7455 | 7776 |
-------------------------------------------------------------------------------------
| Germany | 3/18/2020 | 4222 | 11998 |
-------------------------------------------------------------------------------------
| Jim | 3/18/2020 | 222 | 12220 |
-------------------------------------------------------------------------------------
| Australien | 4/15/2020 | 13423 | 25643 |
-------------------------------------------------------------------------------------
| Mike | 5/31/2020 | 345 | 25988 |
-------------------------------------------------------------------------------------
| Portugal | 6/6/2020 | 8755 | 34743 |
-------------------------------------------------------------------------------------
| Slovakia | 8/31/2020 | 3455 | 38198 |
-------------------------------------------------------------------------------------
| Steve | 9/6/2020 | 875 | 39073 |
-------------------------------------------------------------------------------------
| Japan | 10/10/2020 | 5234 | 44307 |
-------------------------------------------------------------------------------------
| John | 10/11/2020 | 465 | 44772 |
-------------------------------------------------------------------------------------
| Slowenia | 11/11/2020 | 4665 | 49437 |
-------------------------------------------------------------------------------------
| Spain | 11/22/2020 | 7677 | 57114 |
-------------------------------------------------------------------------------------
| Austria | 11/22/2020 | 3221 | 60335 |
-------------------------------------------------------------------------------------
| Bill | 11/22/2020 | 767 | 61102 |
-------------------------------------------------------------------------------------
| Bert | 12/1/2020 | 755 | 61857 |
-------------------------------------------------------------------------------------
| Hungaria | 12/24/2020 | 9996 | 71853 |
-------------------------------------------------------------------------------------
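The same tie-breaking condition can also be written as a correlated subquery instead of DSum. The sketch below is only an illustration under assumptions: it uses an in-memory SQLite database from Python rather than Access, a single hypothetical `donations` table in place of the UNION query, ISO-formatted date strings, and four rows taken from the output above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE donations (donator TEXT, EvDate TEXT, Amount INTEGER)")
conn.executemany(
    "INSERT INTO donations VALUES (?, ?, ?)",
    [
        ("Jim", "2020-03-18", 222),
        ("Reiny", "2020-01-10", 321),
        ("Germany", "2020-03-18", 4222),
        ("Czechia", "2020-03-01", 7455),
    ],
)

# For each row, sum everything dated earlier, plus same-date rows whose
# amount is at least as large (the same OR condition as in RunningSum).
running = conn.execute(
    """
    SELECT d.donator, d.EvDate, d.Amount,
           (SELECT SUM(p.Amount)
            FROM donations AS p
            WHERE p.EvDate < d.EvDate
               OR (p.EvDate = d.EvDate AND p.Amount >= d.Amount)) AS RunningSum
    FROM donations AS d
    ORDER BY d.EvDate ASC, d.Amount DESC
    """
).fetchall()

for row in running:
    print(row)
```

Note that two donations with the same date and the same amount would each count the other, so a further tie-breaker (such as the donator) would be needed in that case.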

compare two columns in PostgreSQL show only highest value

This is my table
I'm trying to find which urban area has the highest girls-to-boys ratio.
Thank you in advance for your help.
| urban | allgirls | allboys |
| :---- | :------: | :-----: |
| Ran | 100 | 120 |
| Ran | 110 | 105 |
| dhanr | 80 | 73 |
| dhanr | 140 | 80 |
| mohan | 180 | 73 |
| mohan | 25 | 26 |
This is the query I used, but I did not get the expected results
SELECT urban, Max(allboys) as high_girls,Max(allgirls) as high_boys
from table_urban group by urban
Expected results
| urban | allgirls | allboys |
| :---- | :------: | :-----: |
| dhanr | 220 | 153 |
First of all, your expected result doesn't seem correct, because the girls-to-boys ratio is highest in "mohan" and not in "dhanr" (assuming that what you are really looking for is the highest ratio and not the highest number of girls).
You need to first group by urban and sum each column, then compute the ratio (divide one sum by the other) and take the first row.
-- cast to numeric so the division is not truncated to an integer
SELECT foo.urban AS urban, foo.girls::numeric / foo.boys AS ratio
FROM (
    SELECT urban, SUM(allboys) AS boys, SUM(allgirls) AS girls
    FROM table_urban
    GROUP BY urban
) AS foo
ORDER BY ratio DESC
LIMIT 1
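
SELECT urban, SUM(allboys) AS boys, SUM(allgirls) AS girls
FROM table_urban
GROUP BY urban
-- PostgreSQL does not accept output aliases inside an ORDER BY expression,
-- so the sums are repeated here; the cast avoids integer division
ORDER BY SUM(allgirls)::numeric / SUM(allboys) DESC
LIMIT 1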
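
To check the arithmetic on the sample rows from the question, here is a small sketch that runs the group-then-divide idea against an in-memory SQLite database from Python. SQLite is only a stand-in for PostgreSQL here; multiplying by 1.0 is the SQLite way to force a non-integer division.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_urban (urban TEXT, allgirls INTEGER, allboys INTEGER)"
)
conn.executemany(
    "INSERT INTO table_urban VALUES (?, ?, ?)",
    [
        ("Ran", 100, 120),
        ("Ran", 110, 105),
        ("dhanr", 80, 73),
        ("dhanr", 140, 80),
        ("mohan", 180, 73),
        ("mohan", 25, 26),
    ],
)

# Sum per urban area, then rank by girls / boys as a real-valued ratio.
top = conn.execute(
    """
    SELECT urban,
           SUM(allgirls) AS girls,
           SUM(allboys) AS boys,
           1.0 * SUM(allgirls) / SUM(allboys) AS ratio
    FROM table_urban
    GROUP BY urban
    ORDER BY ratio DESC
    LIMIT 1
    """
).fetchone()

print(top)
```

On this data the top row is mohan, with 205 girls and 99 boys.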

SQL GROUPING with conditional

I am sure this is easy to accomplish but after spending the whole day trying I had to give up and ask for your help.
I have a table that looks like this
| PatientID | VisitId | DateOfVisit | FollowUp(Y/N) | FollowUpWks |
----------------------------------------------------------------------
| 123456789 | 2222222 | 20180802 | Y | 2 |
| 123456789 | 3333333 | 20180902 | Y | 4 |
| 234453656 | 4443232 | 20180506 | N | NULL |
| 455344243 | 2446364 | 20180618 | Y | 12 |
----------------------------------------------------------------------
Basically, I have a list of PatientIDs; each patient can have multiple visits (VisitId and DateOfVisit). FollowUp(Y/N) specifies whether the patient has to be seen again, and FollowUpWks in how many weeks.
Now, what I need is a query that extracts PatientID, DateOfVisit (the most recent one, and only if FollowUp is YES) and the FollowUpWks field.
Final result should look like this
| PatientID | VisitId | DateOfVisit | FollowUp(Y/N) | FollowUpWks |
----------------------------------------------------------------------
| 123456789 | 3333333 | 20180902 | Y | 4 |
| 455344243 | 2446364 | 20180618 | Y | 12 |
----------------------------------------------------------------------
The closest I could get was with this code
SELECT PatientID,
Max(DateOfVisit) AS LastVisit
FROM mytable
WHERE FollowUp = True
GROUP BY PatientID;
The problem is that when I try adding the FollowUpWks field to the SELECT, I get the following error: "The query does not include the specified expression as part of an aggregate function." However, if I add FollowUpWks to the GROUP BY clause, then I get all visits, not just the most recent ones.
You need to match back to the most recent visit. One method uses a correlated subquery:
SELECT t.*
FROM mytable AS t
WHERE t.FollowUp = True AND
      t.DateOfVisit = (SELECT MAX(t2.DateOfVisit)
                       FROM mytable AS t2
                       WHERE t2.PatientID = t.PatientID
                      );
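As a quick check of that correlated subquery on the sample rows, here is a sketch using an in-memory SQLite database from Python. The assumptions are mine: FollowUp is stored as the text 'Y' or 'N' instead of a boolean, and DateOfVisit is a yyyymmdd string, which sorts correctly as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE mytable (
        PatientID INTEGER, VisitId INTEGER, DateOfVisit TEXT,
        FollowUp TEXT, FollowUpWks INTEGER
    )
    """
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?, ?)",
    [
        (123456789, 2222222, "20180802", "Y", 2),
        (123456789, 3333333, "20180902", "Y", 4),
        (234453656, 4443232, "20180506", "N", None),
        (455344243, 2446364, "20180618", "Y", 12),
    ],
)

# Keep a visit only if it is the patient's latest visit and needs a follow-up.
latest_followups = conn.execute(
    """
    SELECT t.PatientID, t.VisitId, t.DateOfVisit, t.FollowUpWks
    FROM mytable AS t
    WHERE t.FollowUp = 'Y'
      AND t.DateOfVisit = (SELECT MAX(t2.DateOfVisit)
                           FROM mytable AS t2
                           WHERE t2.PatientID = t.PatientID)
    ORDER BY t.PatientID
    """
).fetchall()

for row in latest_followups:
    print(row)
```

Because the subquery looks at all visits, a patient whose most recent visit has FollowUp = 'N' is left out entirely, even if an earlier visit required a follow-up.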

How to optimize nested inner Hive query

I have a table with the following stock data, with columns such as date, ticker, open and close (stock prices).
I want to know, for each stock, on which date it gave the highest margin. So if I have 516 different stocks, my query should return 516 rows of ticker, date, open, close and a new column Margin (which will be max(close-open)).
| deep_stocks.date_ | deep_stocks.ticker | deep_stocks.open | deep_stocks.close |
+--------------------+---------------------+-------------------+--------------------+--+
| 20100721 | A | 27.68 | 27.58 |
| 20100722 | A | 27.95 | 28.72 |
| 20100723 | A | 28.56 | 29.3 |
| 20100726 | A | 29.22 | 29.64 |
| 20100727 | A | 29.73 | 28.87 |
| 20100728 | A | 28.79 | 28.78 |
| 20100729 | A | 28.97 | 28.15 |
| 20100730 | A | 27.78 | 27.93 |
| 20100802 | A | 28.35 | 28.82 |
| 20100803 | A | 28.7 | 27.84 |
I have written a query where my approach was:
Step 1 - Get the difference between Close and Open prices (Inner/Sub query)
Step 2 - Get the maximum of margin for every stock (used group by with max function)
Step 3 - Join the results with Main Table and get the data.
I'll post my query as an answer or in the comments. Can someone please correct it, as it is taking too long? I would also like to know whether there is an alternative approach.
Following the approach described above, here is my query:
SELECT ds.ticker, ds.date_, ds.close, ds.open, ds.Margin FROM
(SELECT ticker, date_, close, open, case(close-open)>0 when true then round(close-open,2) else 0 end as Margin FROM DataStocks) ds
JOIN
(SELECT dsIn.ticker, max(dsIn.Margin) mxMargin FROM
(select ticker, case(close-open)>0 when true then round(close-open,2) else 0 end as Margin FROM DataStocks ) dsIn group by dsIn.ticker) dsEx
ON ds.ticker=dsEx.ticker AND ds.Margin=dsEx.mxMargin ORDER BY ds.Margin;
Are there any alternatives to this query, or is it possible to optimize it?

Extract data from largest date

I have records as below:
---------------------------------------------------------------------
| AcntNo | Date1 | Balance1 | Date2 | Balance2 | Date3 | Balance3 |
|--------------------------------------------------------------------
| 123 | 50282 | 3456 | 45465 | 56557 | 4556 | 324235 |
| 123 | 56757 | 23434 | 234235 | 344324 | 56476 | 5676 |
| 123 | 435 | 2434 | 2343 | 234545 | 24245 | 2423424 |
---------------------------------------------------------------------
For example, for each AcntNo there will be several rows of balance and date data.
I need to get the balance for the largest date.
I'm using PL/SQL Developer and an Oracle database.
If you want the row with the greatest date:
select
  *
from
  YourTable y
where
  greatest(y.date1, y.date2, y.date3) =
    (select max(greatest(yx.date1, yx.date2, yx.date3))
     from
       YourTable yx)
If you do actually need the balance matching the greatest date on that row:
select
  greatest(y.date1, y.date2, y.date3) as GreatestDate,
  case greatest(y.date1, y.date2, y.date3)
    when y.Date1 then
      y.balance1
    when y.date2 then
      y.balance2
    when y.date3 then
      y.balance3
  end as GreatestDateBalance
from
  YourTable y
where
  greatest(y.date1, y.date2, y.date3) =
    (select max(greatest(yx.date1, yx.date2, yx.date3))
     from
       YourTable yx)
But I think what you really need, is to reconsider your table design. :)
I'm not sure why you have multiple dates and balances in your table; however, the query below should give you something to work from:
SELECT *
FROM YourTable T
WHERE NOT EXISTS (
    SELECT *
    FROM YourTable T2
    WHERE T2.AcntNo = T.AcntNo
      AND T2.Date1 > T.Date1
)
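To see what the NOT EXISTS query returns, here is a small sketch against an in-memory SQLite database from Python (a stand-in for Oracle). It keeps only the AcntNo, Date1 and Balance1 columns, treats the dates as plain integers as they appear in the question, and adds a second account (456) that is made up to show the per-account behaviour.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE YourTable (AcntNo INTEGER, Date1 INTEGER, Balance1 INTEGER)"
)
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?)",
    [
        (123, 50282, 3456),
        (123, 56757, 23434),
        (123, 435, 2434),
        (456, 100, 10),  # hypothetical second account, not in the question
        (456, 200, 20),  # hypothetical second account, not in the question
    ],
)

# A row survives only if no other row for the same account has a later Date1.
latest = conn.execute(
    """
    SELECT T.AcntNo, T.Date1, T.Balance1
    FROM YourTable AS T
    WHERE NOT EXISTS (
        SELECT 1
        FROM YourTable AS T2
        WHERE T2.AcntNo = T.AcntNo
          AND T2.Date1 > T.Date1
    )
    ORDER BY T.AcntNo
    """
).fetchall()

for row in latest:
    print(row)
```

Each account comes back with the row that holds its largest Date1; if two rows of an account share that date, both are returned.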