Same query selects different data with row_number() - sql

Let's say I have two tables. The first table represents Accounts and the second table represents Characters. The tables are linked by the "AccountId" column. In the game, characters belong to accounts and every account can contain 4 characters.
On my website I have two pages. The first page is called "Characterlist.php" and lists all characters in the game whose account has one specific Authority (a column in the Accounts table), namely 0.
I'm using this query:
$query = "WITH
Acc AS
(SELECT AccountId, Authority
FROM Account
WHERE Authority = '0'),
Char AS
(SELECT Character.*
FROM Character, Acc
WHERE Character.AccountId = Acc.AccountId),
Res AS
(SELECT *,
ROW_NUMBER() OVER
(ORDER BY Reput DESC) AS row_number
FROM Char)
SELECT Res.Class,
Res.Name,
Res.Level,
Res.HeroLevel,
Res.Reput,
Res.row_number
FROM Res
WHERE Res.row_number >= '$charlistpagemin'
AND Res.row_number <= '$charlistpagemax'";
$charlistpagemin and $charlistpagemax are PHP variables that I'm using to divide the whole character list into several pages.
When I search characters of my testing account, their position according to their Reputation generated and ordered by row_number() is alright.
Then I have the second page userpanel.php which is shown only to signed users where they can see their game characters in another list.
I'm using almost the same query, just with different rules in the end.
$query =
"WITH
Acc AS
(SELECT AccountId,
Authority
FROM Account
WHERE Authority = '0'),
Char AS
(SELECT Character.*
FROM Character, Acc
WHERE Character.AccountId = Acc.AccountId),
Res AS
(SELECT *,
ROW_NUMBER()
OVER (ORDER BY Reput DESC) AS row_number FROM Char)
SELECT Res.AccountId,
Res.Class,
Res.Name,
Res.Level,
Res.HeroLevel,
Res.Reput,
Res.row_number
FROM Res
WHERE Res.AccountId = '" . $_SESSION["accountid"] . "'";
And here is the problem: their positions by Reputation are different (wrong) compared with characterlist.php. What is causing this?
Edit:
The table "Account" looks like:
AccountId | Authority | ... |
The table "Character" looks like:
AccountId | Class | Name | Level | HeroLevel | Reput | ... |
My testing account has
AccountId: xxx | Authority: 0 | ... |
And has Characters
AccountId: XXX | Class: Dobrodruh | Name: Kryploij1 | Level: 15 | Herolevel: 0 | Reput: 0 | ... |
AccountId: XXX | Class: Dobrodruh | Name: Kryploid2 | Level: 15 | Herolevel: 0 | Reput: 0 | ... |
AccountId: XXX | Class: Dobrodruh | Name: Kryploij3 | Level: 15 | Herolevel: 0 | Reput: 0 | ... |
AccountId: XXX | Class: Dobrodruh | Name: Grakonecek<3 | Level: 15 | Herolevel: 0 | Reput: 0 | ... |
The expected result that is shown in characterlist.php is
Position: 139 | Class: Dobrodruh | Name: Kryploij1 | Level: 15 | Herolevel: 0 | Reput: 0 |
Position: 140 | Class: Dobrodruh | Name: Kryploid2 | Level: 15 | Herolevel: 0 | Reput: 0 |
Position: 141 | Class: Dobrodruh | Name: Kryploij3 | Level: 15 | Herolevel: 0 | Reput: 0 |
Position: 142 | Class: Dobrodruh | Name: Grakonecek<3 | Level: 15 | Herolevel: 0 | Reput: 0 |
But the bad result in userpanel.php is:
Position: 110 | Class: Dobrodruh | Name: Kryploij1 | Level: 15 | Herolevel: 0 | Reput: 0 |
Position: 111 | Class: Dobrodruh | Name: Kryploid2 | Level: 15 | Herolevel: 0 | Reput: 0 |
Position: 112 | Class: Dobrodruh | Name: Kryploij3 | Level: 15 | Herolevel: 0 | Reput: 0 |
Position: 113 | Class: Dobrodruh | Name: Grakonecek<3 | Level: 15 | Herolevel: 0 | Reput: 0 |
Simply put, my problem is that the positions change in userpanel.php. The correct positions are the ones shown in characterlist.php. To calculate the position by Reputation I'm using the row_number() function, as shown in my queries.

Your data has ties for reput and your row_number() expression is:
ROW_NUMBER() OVER (ORDER BY Reput DESC) AS row_number
Ordering in SQL is unstable. That means that rows with ties are in an arbitrary and indeterminate order -- and this order might change from one execution to the next.
Why is sorting unstable? Easy. SQL tables represent unordered sets, so there is no "underlying" ordering to define what happens in the case of ties.
The simple solution is to add another key. In your case, I think Name might be sufficient:
ROW_NUMBER() OVER (ORDER BY Reput DESC, Name) AS row_number
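To see the effect of the extra key, here is a minimal sketch using Python's built-in sqlite3 module (it needs SQLite 3.25 or newer for window functions). The Characters table, its four rows and the pos alias are made up for illustration, and the sketch assumes Name is unique. It numbers the whole list once and then filters it, the way the two pages do:

```python
import sqlite3

# Hypothetical miniature of the character table: three rows tie on Reput = 0.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Characters (Name TEXT, Reput INTEGER)")
conn.executemany(
    "INSERT INTO Characters VALUES (?, ?)",
    [("Zed", 0), ("Top", 50), ("Amy", 0), ("Mia", 0)],
)

# Name breaks the ties, so the numbering no longer depends on physical row order.
ranked_sql = """
    SELECT Name,
           ROW_NUMBER() OVER (ORDER BY Reput DESC, Name) AS pos
    FROM Characters
"""

# Whole list, as on characterlist.php.
full_list = dict(conn.execute(ranked_sql).fetchall())

# Same numbering, filtered afterwards, as on userpanel.php.
filtered = dict(
    conn.execute(
        f"SELECT Name, pos FROM ({ranked_sql}) WHERE Name IN ('Amy', 'Zed')"
    ).fetchall()
)

print(full_list)
print(filtered)
```

With the tiebreaker in place, both queries agree on the position of every character, whichever filter is applied afterwards.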

This is really hard to answer without sample data and inputs, but I think it is because the two queries do different things.
The first query paginates through the results, and the pagination imposes the ordering - presumably, $charlistpagemin and $charlistpagemax are 0 and 10 when you first load the page, so you get the first ten results.
The second query has no ordering - it's just a list of results, in undefined order.
If you add
order by Res.row_number desc
at the end of the query, it should work.

Related

Cloudera / Impala / SQL: finding all rows with unique value in specific column

Hopefully a simple question for some of you: I have a table adsb_table as follows (apologies for the formatting of the table):
callsign | time | speed|
A | 23421 | 431 |
A | 23422 | 426 |
A | 23423 | 459 |
B | 23424 | 521 |
B | 23425 | 601 |
B | 23426 | 401 |
C | 23427 | 454 |
C | 23428 | 499 |
C | 23429 | 621 |
I want the resulting output to be the first row for each unique value of callsign:
A 23421 431
B 23424 521
C 23427 454
I have tried the following without success:
SELECT callsign, time, speed FROM adsb_table WHERE speed>400 ORDER BY callsign GROUP by callsign
I don't know if the fact that I am using Impala makes the difference in the query. No output is generated - if I remove the "GROUP BY" clause all ordered records are listed....so I am using the GROUP BY incorrectly I guess. Help.
If you always want the first row per callsign, you can use ROW_NUMBER(), ordered by time so that "first" is well defined:
WITH cte AS (
SELECT
callsign,
time,
speed,
ROW_NUMBER() OVER (PARTITION BY callsign ORDER BY time) AS row_no
FROM adsb_table
WHERE speed > 400
)
SELECT *
FROM cte
WHERE row_no = 1
ORDER BY callsign
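As a quick check of this pattern, the sketch below runs it against the sample rows from the question, using Python's built-in sqlite3 module rather than Impala (SQLite 3.25 or newer is needed for window functions). Ordering by time inside the window is an assumption about what "first" means here:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE adsb_table (callsign TEXT, time INTEGER, speed INTEGER)")
conn.executemany(
    "INSERT INTO adsb_table VALUES (?, ?, ?)",
    [
        ("A", 23421, 431), ("A", 23422, 426), ("A", 23423, 459),
        ("B", 23424, 521), ("B", 23425, 601), ("B", 23426, 401),
        ("C", 23427, 454), ("C", 23428, 499), ("C", 23429, 621),
    ],
)

# Number the rows inside each callsign by time, then keep only row 1.
first_rows = conn.execute(
    """
    WITH cte AS (
        SELECT callsign, time, speed,
               ROW_NUMBER() OVER (PARTITION BY callsign ORDER BY time) AS row_no
        FROM adsb_table
        WHERE speed > 400
    )
    SELECT callsign, time, speed
    FROM cte
    WHERE row_no = 1
    ORDER BY callsign
    """
).fetchall()

print(first_rows)
```

This yields one row per callsign, matching the output asked for in the question.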

PowerBI / SQL Query to verify records

I am working on a PowerBI report that is grabbing information from SQL and I cannot find a way to solve my problem using PowerBI or how to write the required code. My first table, Certifications, includes a list of certifications and required trainings that must be obtained in order to have an active certification.
My second table, UserCertifications, includes a list of UserIDs, certifications, and the trainings associated with a certification.
How can I write SQL code or a PowerBI measure to tell whether a user has all required trainings for a certification? I.e., if UserID 1 has the A certification, how can I verify that they have the TrainingIDs 1, 10, and 150 associated with it?
Certifications:
CertificationsTable
UserCertifications:
UserCertificationsTable
This is a DAX pattern to test whether one set of values contains at least all the values of another.
| Certifications |
|----------------|------------|
| Certification | TrainingID |
|----------------|------------|
| A | 1 |
| A | 10 |
| A | 150 |
| B | 7 |
| B | 9 |
| UserCertifications |
|--------------------|---------------|----------|
| UserID | Certification | Training |
|--------------------|---------------|----------|
| 1 | A | 1 |
| 1 | A | 10 |
| 1 | A | 300 |
| 2 | A | 150 |
| 2 | B | 9 |
| 2 | B | 90 |
| 3 | A | 7 |
| 4 | A | 1 |
| 4 | A | 10 |
| 4 | A | 150 |
| 4 | A | 1000 |
In the above scenario, DAX needs to find out whether the mandatory trainings (Certifications[TrainingID]) for each Certifications[Certification] have been completed within each UserCertifications[UserID] and UserCertifications[Certification] partition.
Here, DAX should return true only for UserCertifications[UserID] = 4, as it is the only user that completed at least all the mandatory trainings.
The way to achieve this is through the following measure
areAllMandatoryTrainingCompleted =
VAR _alreadyCompleted =
CONCATENATEX (
UserCertifications,
UserCertifications[Training],
"-",
UserCertifications[Training]
) // what the user has completed in the fact table; the fourth argument matters because it sets the sort order
VAR _0 =
MAX ( UserCertifications[Certification] )
VAR _supposedToComplete =
CONCATENATEX (
FILTER ( Certifications, Certifications[Certification] = _0 ),
Certifications[TrainingID],
"-",
Certifications[TrainingID]
) // what must be completed according to the Certifications table; the fourth argument matters because it sets the sort order
VAR _isMandatoryTrainingCompleted =
CONTAINSSTRING ( _alreadyCompleted, _supposedToComplete ) // CONTAINSSTRING ( <Within Text>, <Search Text> ) returns TRUE or FALSE
RETURN
_isMandatoryTrainingCompleted
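The measure compares sorted, concatenated strings, so it relies on the required IDs ending up next to each other in the user's list (a user who completed 1, 5, 10 and 150 would give "1-5-10-150", which does not contain "1-10-150"). To sanity-check the expected answers outside Power BI, here is a rough sketch of the same question as a plain set-subset test in Python. It is not DAX and not part of the measure above; the variable names are made up and the rows are copied from the two sample tables:

```python
from collections import defaultdict

# Rows copied from the Certifications and UserCertifications samples above.
certifications = [("A", 1), ("A", 10), ("A", 150), ("B", 7), ("B", 9)]
user_certifications = [
    (1, "A", 1), (1, "A", 10), (1, "A", 300),
    (2, "A", 150), (2, "B", 9), (2, "B", 90),
    (3, "A", 7),
    (4, "A", 1), (4, "A", 10), (4, "A", 150), (4, "A", 1000),
]

# Mandatory training IDs per certification.
required = defaultdict(set)
for cert, training_id in certifications:
    required[cert].add(training_id)

# Trainings actually completed per (user, certification) pair.
completed = defaultdict(set)
for user_id, cert, training in user_certifications:
    completed[(user_id, cert)].add(training)

# A pair passes when every mandatory ID is among the completed ones.
all_mandatory_done = {
    pair: required[pair[1]] <= trainings for pair, trainings in completed.items()
}

print(all_mandatory_done)
```

Only the pair (4, "A") passes, which agrees with the result described for the measure.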

How to select oldest date row from each product using SQL

I would like to get only the oldest row for every type of product, plus the sum of the prices, for the products listed in the listofproduct table. I also want to search only among products that have at least one piece in stock.
With the SQL below I managed to get all the products that have at least one piece in stock, but I am stuck on the rest.
My plan was to do the sum later, but if you have a better idea, feel free to write.
Here is my data:
+-------------+----------------+---------------+----------+
| IDProizvoda | NazivProizvoda | DatumKupovine | NaLageru |
+-------------+----------------+---------------+----------+
| 77 | Cokolada | 25-Feb-20 | 2 |
| 44 | fgyhufrthr | 06-Aug-20 | 5 |
| 55 | Auto | 06-Aug-23 | 0 |
| 55 | Auto | 11-Aug-20 | 200 |
| 77 | Cokolada | 06-Aug-27 | 0 |
| 77 | Cokolada | 25-Feb-20 | 10 |
| 77 | Cokolada | 25-Jan-20 | 555 |
| 77 | Cokolada | 25-Mar-20 | 40 |
+-------------+----------------+---------------+----------+
Access.ExeQuery("SELECT * FROM Products " &
"WHERE IDProizvoda IN (SELECT value FROM STRING_SPLIT(#listofproduct, ',')) " &
"AND NaLageru > 0 ")
I tried to add GROUP BY and HAVING, but it did not work because I select the whole table. I still need the product ID and the stock field so that I can edit them later, to subtract one from the stock for those products.
I would like to get the result:
+-------------+----------------+---------------+----------+
| IDProizvoda | NazivProizvoda | DatumKupovine | NaLageru |
+-------------+----------------+---------------+----------+
| 44 | fgyhufrthr | 06-Aug-20 | 5 |
| 55 | Auto | 11-Aug-20 | 200 |
| 77 | Cokolada | 25-Jan-20 | 555 |
+-------------+----------------+---------------+----------+
Thank you for all the help.
You can do it with a Cross Apply, this would be your SQL query:
Select P.IDProizvoda,
P.NazivProizvoda,
N.DatumKupovine,
N.NaLageru,
N.IDKupovine,
N.CenaPoKomadu
From
products P
Cross Apply
(
Select top 1 DatumKupovine,
NaLageru,
IDKupovine,
CenaPoKomadu
From products P2
where P2.IDProizvoda = P.IDProizvoda
and P2.NaLageru > 0
order by DatumKupovine
) N
group by P.IDProizvoda, P.NazivProizvoda, N.DatumKupovine, N.NaLageru, N.IDKupovine, N.CenaPoKomadu
And this is your ExeQuery:
Access.ExeQuery("Select P.IDProizvoda, P.NazivProizvoda, N.DatumKupovine, N.NaLageru, N.IDKupovine, N.CenaPoKomadu From products P " &
" Cross Apply( Select top 1 DatumKupovine, NaLageru, IDKupovine, CenaPoKomadu From products P2 where P2.IDProizvoda = P.IDProizvoda and P2.NaLageru > 0 order by DatumKupovine) N " &
" where P.IDProizvoda in (Select value From STRING_SPLIT(#listofproduct, ',')) " &
" group by P.IDProizvoda, P.NazivProizvoda, N.DatumKupovine, N.NaLageru, N.IDKupovine, N.CenaPoKomadu " )
I think this is just aggregation with a filter:
SELECT IDProizvoda, NazivProizvoda, MIN(DatumKupovine),
SUM(NaLageru)
FROM Products p
WHERE NaLageru > 0
GROUP BY IDProizvoda, NazivProizvoda;
This should do it:
with cte as (
SELECT *, row_number() over (
partition by NazivProizvoda
order by DatumKupovine
) as rn
FROM Products
WHERE IDProizvoda IN (
SELECT value
FROM STRING_SPLIT(#listofproduct, ',')
)
AND NaLageru > 0
)
select *
from cte
where rn = 1;
By way of explanation, I'm using a common table expression to select the superset of the data you want by criteria, adding a column that enumerates each row within a group (a group being defined here as rows with the same NazivProizvoda) in order of DatumKupovine. With that done, any row with the value 1 for that enumeration is the oldest in its group. If your data is such that more than one row can be the oldest and you want every tied row returned, use rank() instead of row_number().
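For anyone who wants to try the CTE without a SQL Server instance, here is a rough sketch with Python's built-in sqlite3 module (SQLite 3.25 or newer). Several things are assumptions made for the sketch: the dates are rewritten as ISO strings so that they sort chronologically as text, the product list is passed as bound parameters because SQLite has no STRING_SPLIT, and the final ORDER BY is only there to make the output stable:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Products ("
    "IDProizvoda INTEGER, NazivProizvoda TEXT, DatumKupovine TEXT, NaLageru INTEGER)"
)
# Sample rows from the question, with dates converted to ISO format (assumption).
conn.executemany(
    "INSERT INTO Products VALUES (?, ?, ?, ?)",
    [
        (77, "Cokolada", "2020-02-25", 2),
        (44, "fgyhufrthr", "2020-08-06", 5),
        (55, "Auto", "2023-08-06", 0),
        (55, "Auto", "2020-08-11", 200),
        (77, "Cokolada", "2027-08-06", 0),
        (77, "Cokolada", "2020-02-25", 10),
        (77, "Cokolada", "2020-01-25", 555),
        (77, "Cokolada", "2020-03-25", 40),
    ],
)

product_ids = [44, 55, 77]  # stands in for the listofproduct parameter
placeholders = ",".join("?" * len(product_ids))

# Number the in-stock rows of each product by purchase date, keep the oldest.
oldest_in_stock = conn.execute(
    f"""
    WITH cte AS (
        SELECT *,
               ROW_NUMBER() OVER (
                   PARTITION BY NazivProizvoda
                   ORDER BY DatumKupovine
               ) AS rn
        FROM Products
        WHERE IDProizvoda IN ({placeholders})
          AND NaLageru > 0
    )
    SELECT IDProizvoda, NazivProizvoda, DatumKupovine, NaLageru
    FROM cte
    WHERE rn = 1
    ORDER BY IDProizvoda
    """,
    product_ids,
).fetchall()

print(oldest_in_stock)
```

The three rows returned correspond to the result table requested in the question.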

Rails combine group by and min

Assume we have a table called Activities
+-----------+-----------+------------+--------------+
| player_id | device_id | event_date | games_played |
+-----------+-----------+------------+--------------+
| 1 | 2 | 2016-03-01 | 5 |
| 1 | 2 | 2016-05-02 | 6 |
| 2 | 3 | 2017-06-25 | 1 |
| 3 | 1 | 2016-03-02 | 0 |
| 3 | 4 | 2018-07-03 | 5 |
+-----------+-----------+------------+--------------+
I want to find each player_id and its first event_date as first_date.
SQL:
SELECT Activities.player_id, min(Activities.event_date) as first_date
FROM `activities`
GROUP BY `activities`.`player_id`
Result table:
+-----------+-------------+
| player_id | first_date  |
+-----------+-------------+
| 1 | 2016-03-01 |
| 2 | 2017-06-25 |
| 3 | 2016-03-02 |
+-----------+-------------+
How to do it in Rails?
I've tried this, but it returns an Activity collection that only shows player_id.
Activity.select('Activities.player_id, min(Activities.event_date) as first_date')
.group(:player_id)
Like this
[#<Activity:0x00007f94923bb888 player_id: 1>, #<Activity:0x00007f94923bb608 player_id: 2>, #<Activity:0x00007f94923b9ba0 player_id: 3>]
Actually, your query result above
result = [#<Activity:0x00007f94923bb888 player_id: 1>, #<Activity:0x00007f94923bb608 player_id: 2>, #<Activity:0x00007f94923b9ba0 player_id: 3>]
does load the first_date column. It is just not displayed, because first_date is not an attribute of the Activity model and Rails does not show virtual columns in this output.
You can access it using this syntax
result.last.first_date
If you need to see date value in result objects then modify your query like this
Activity.select('Activities.player_id, min(Activities.event_date) as event_date').group(:player_id)
then you will be able to get your desired result
<Activity:0x00007f94923bb608 player_id: 2, event_date: 'date value' >, #<Activity:0x00007f94923b9ba0 player_id: 3, event_date: 'date value' >]
In the result you got only player_id because when you do Activity.select('..'), rails returns ActiveRecord model object.
You probably want to run a custom query and convert the output to an array like this:
result = ActiveRecord::Base.connection.execute("SELECT Activities.player_id, min(Activities.event_date) as first_date FROM activities GROUP BY activities.player_id")
result.to_a # Converts PG::Result to an array, 'result.as_json' converts to json
Hope this helps.

Apply Limit for a Condition

I have a query that returns the credit notes (CN) and debit notes (DN) of an operation, each CN is accompanied by two or more DN (referenced by the field payment_plan_id). At the time of paging, for example I must bring 10 operations, that is 10 CN and their DN, but if I leave the limit at 10, it will also count the debit notes of the transaction that I must return in the query. So, it will only bring me 2, 3 or 4 operations depending on the number of DNs that accompany the credit note.
SELECT
value, installment, payment_plan_id, model,
creation_date, operation
FROM payment_plant
WHERE model != 'IMMEDIATE'
AND operation IN ('CN', 'DN')
AND creation_date BETWEEN '2017-06-12' AND '2017-07-12 23:59:59'
ORDER BY
model,
creation_date,
operation
LIMIT 10
OFFSET 1
Example of the table, omitting some fields:
| id | payment_plan_id | value | installment | operation |
---------------------------------------------------------
| 1 | b3cdaede | 12 | 1 | NC |
| 2 | b3cdaede | 3.5 | 1 | ND |
| 3 | b3cdaede | 1.2 | 1 | ND |
| 4 | e1d7f051 | 36 | 1 | NC |
| 5 | e1d7f051 | 5.9 | 1 | ND |
| 6 | 00e6a0b4 | 15 | 1 | NC |
| 7 | 00e6a0b4 | 1 | 1 | ND |
| 8 | 00e6a0b4 | 3.6 | 1 | ND |
How can I restrict the LIMIT so that it only counts the NCs?
Well, the query you give above doesn't do remotely what you describe. I'll assume you actually want "the last 10 CN and their DN". You also don't explain which fields CN and DN have in common, so I'm going to assume they are payment_plan_id and installment. Given that, here's how you would get it:
WITH last_10_cn AS (
SELECT
value, installment, payment_plan_id, model,
creation_date
FROM payment_plant
WHERE model != 'IMMEDIATE'
AND operation = 'CN'
AND creation_date BETWEEN '2017-06-12' AND '2017-07-12 23:59:59'
ORDER BY
model,
creation_date,
operation
LIMIT 10
OFFSET 1 )
SELECT last_10_cn.*,
dn.value as dn_value, dn.model as dn_model,
dn.creation_date as dn_creation_date
FROM last_10_cn JOIN payment_plant as dn
ON last_10_cn.payment_plan_id = dn.payment_plan_id
AND last_10_cn.installment = dn.installment
AND dn.operation = 'DN'
ORDER BY
last_10_cn.model,
last_10_cn.creation_date,
dn.creation_date;
Adjust the above according to the actual join conditions and how you really want things to be sorted.
BTW, your table structure is what's giving you trouble here. DNs should really be a separate table with a foreign key to CNs. I realize that's not how most GLs do it, but the GL model predates relational databases.
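To make the "limit the credit notes first, then attach their debit notes" idea concrete, here is a small sketch with Python's built-in sqlite3 module, using the sample rows from the question. Some parts are assumptions: the sample table has no model or creation_date columns, so the page is ordered by id; the operation codes are NC and ND as they appear in the sample rows; and page_of_cn and page_size are made-up names:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE payment_plant ("
    "id INTEGER, payment_plan_id TEXT, value REAL, installment INTEGER, operation TEXT)"
)
conn.executemany(
    "INSERT INTO payment_plant VALUES (?, ?, ?, ?, ?)",
    [
        (1, "b3cdaede", 12, 1, "NC"), (2, "b3cdaede", 3.5, 1, "ND"),
        (3, "b3cdaede", 1.2, 1, "ND"), (4, "e1d7f051", 36, 1, "NC"),
        (5, "e1d7f051", 5.9, 1, "ND"), (6, "00e6a0b4", 15, 1, "NC"),
        (7, "00e6a0b4", 1, 1, "ND"), (8, "00e6a0b4", 3.6, 1, "ND"),
    ],
)

page_size = 2  # number of operations (credit notes) per page

# The LIMIT applies to credit notes only; debit notes are joined in afterwards.
page = conn.execute(
    """
    WITH page_of_cn AS (
        SELECT id, payment_plan_id, installment, value
        FROM payment_plant
        WHERE operation = 'NC'
        ORDER BY id
        LIMIT ?
    )
    SELECT cn.payment_plan_id, cn.value, dn.id, dn.value
    FROM page_of_cn AS cn
    JOIN payment_plant AS dn
      ON dn.payment_plan_id = cn.payment_plan_id
     AND dn.installment = cn.installment
     AND dn.operation = 'ND'
    ORDER BY cn.id, dn.id
    """,
    (page_size,),
).fetchall()

for row in page:
    print(row)
```

With a page size of 2 this returns the first two operations together with all three of their debit notes, so the number of debit notes no longer eats into the page.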