SQL Server: need to find min/max/count value given time partition and conditional parameters

I have a database containing results from horse races.
For each horse in each race I want to find its previous best time (TOTAL_TIME), given that the parameters were the same as in the current race/row (DISTANCE, DIVISION, START_MODE).
I also want to count the number of times the horse won a race before the current row's date.
Finally, for each race and horse, I want to RANK the competing horses based on the values calculated in 1 and 2.
SELECT b.datum AS RACE_DATE,       /* date of race */
       b.id AS RACE_ID,
       c.namn AS HORSE_NAME,
       a.plac AS PLACEMENT,        /* what position the horse finished in */
       a.rank AS RANK_ODDS,
       a.tid AS TOTAL_TIME,        /* total time it took the horse to finish the race */
       b.bana AS TRACK,            /* geographical race track */
       b.distans AS DISTANCE,      /* short, long, etc. */
       b.division AS DIVISION,     /* what type of breed the competition was for, etc. */
       b.startsatt AS START_MODE   /* way of starting the race, car or manual */
FROM trav.prog a
JOIN trav.tvl b
  ON a.tvlid = b.id
JOIN trav.horse c
  ON a.horseid = c.id
ORDER BY race_date DESC,
         race_id DESC
Sample output:
+------------+---------+------------------+-----------+-----------+------------+-------+----------+----------+------------+
| RACE_DATE  | RACE_ID | HORSE_NAME       | PLACEMENT | RANK_ODDS | TOTAL_TIME | TRACK | DISTANCE | DIVISION | START_MODE |
+------------+---------+------------------+-----------+-----------+------------+-------+----------+----------+------------+
| 2017-03-28 | 166700  | YANKEE FRECEL*   | 9         | 3         | 161        | MO    | K        | V        | A          |
| 2017-03-28 | 166700  | ALCYONE LOVE     | 3         | 9         | 152        | MO    | K        | V        | A          |
| 2017-03-28 | 166700  | GIANT STAR*      | 6         | 6         | 155        | MO    | K        | V        | A          |
| 2017-03-28 | 166697  | RUBY FRONTLINE   | 6         | 11        | 188        | MO    | M        | U        | V          |
| 2017-03-28 | 166696  | CAN´T STAND STIL | 4         | 3         | 150        | MO    | K        | U        | A          |
| 2017-03-28 | 166696  | MISS MIRCHI*     | 3         | 5         | 149        | MO    | K        | U        | A          |
| 2017-03-28 | 166695  | LYNLINN*         | 4         | 6         | 262        | MO    | K        | U        | A          |
| 2017-03-28 | 166695  | SIKVELANDS SVAR  | 3         | 1         | 257        | MO    | K        | U        | A          |
| 2017-03-28 | 166692  | NOK´EN FRÆKKERT  | 1         | 6         | 134        | J     | M        | V        | A          |
| 2017-03-28 | 166692  | FLEX LANE        | 5         | 4         | 137        | J     | M        | V        | A          |
| 2017-03-28 | 166692  | EDWARD ALE*      | 4         | 3         | 137        | J     | M        | V        | A          |
| 2017-03-28 | 166692  | ATTACK DIABLO    | 3         | 1         | 136        | J     | M        | V        | A          |
| 2017-03-28 | 166692  | SOLVATO          | 2         | 2         | 136        | J     | M        | V        | A          |
| 2017-03-28 | 166692  | CALIBER T.T.     | 7         | 8         | 140        | J     | M        | V        | A          |
| 2017-03-28 | 166692  | KASH´S CANTAB    | 6         | 5         | 137        | J     | M        | V        | A          |
| 2017-03-28 | 166692  | CAPELLO BOB      | 8         | 7         | 142        | J     | M        | V        | A          |
| 2017-03-28 | 166691  | WILDINTENTION    | 4         | 2         | 125        | J     | K        | V        | A          |
| 2017-03-28 | 166691  | MR ARKANSAS      | 3         | 10        | 124        | J     | K        | V        | A          |
| 2017-03-28 | 166691  | KAMANDA          | 7         | 6         | 127        | J     | K        | V        | A          |
| 2017-03-28 | 166691  | APOLLO DAMGÅRD*  | 6         | 8         | 127        | J     | K        | V        | A          |
+------------+---------+------------------+-----------+-----------+------------+-------+----------+----------+------------+
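One possible approach in SQL Server (a sketch only, not tested against the real schema; it assumes plac = 1 marks a win, that b.datum orders the races, and that same-day ties may be resolved arbitrarily by the ROWS frame) is to compute the two history columns with window functions and then rank within each race:
WITH base AS (
    SELECT b.datum     AS race_date,
           b.id        AS race_id,
           a.horseid,
           c.namn      AS horse_name,
           a.plac      AS placement,
           a.tid       AS total_time,
           b.distans   AS distance,
           b.division  AS division,
           b.startsatt AS start_mode
    FROM trav.prog a
    JOIN trav.tvl b   ON a.tvlid = b.id
    JOIN trav.horse c ON a.horseid = c.id
),
hist AS (
    SELECT base.*,
           /* 1. best (lowest) earlier time under the same distance/division/start mode */
           MIN(total_time) OVER (
               PARTITION BY horseid, distance, division, start_mode
               ORDER BY race_date
               ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS prev_best_time,
           /* 2. number of wins before this race (assumes placement = 1 means a win) */
           SUM(CASE WHEN placement = 1 THEN 1 ELSE 0 END) OVER (
               PARTITION BY horseid
               ORDER BY race_date
               ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS prior_wins
    FROM base
)
SELECT hist.*,
       /* 3. rank the field within each race; horses with no prior time (NULL) sort first here */
       RANK() OVER (PARTITION BY race_id ORDER BY prev_best_time)  AS rank_by_prev_best,
       RANK() OVER (PARTITION BY race_id ORDER BY prior_wins DESC) AS rank_by_wins
FROM hist
ORDER BY race_date DESC, race_id DESC;
If races run on the same date must be excluded strictly, a correlated OUTER APPLY (or a self-join on race_date < the current race_date) can replace the ROWS frames.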

Related

How can I add a count to rank null values in SQL Hive?

This is what I have right now:
| time  | car_id | order | in_order |
|-------|--------|-------|----------|
| 12:31 | 32     | null  | 0        |
| 12:33 | 32     | null  | 0        |
| 12:35 | 32     | null  | 0        |
| 12:37 | 32     | 123   | 1        |
| 12:38 | 32     | 123   | 1        |
| 12:39 | 32     | 123   | 1        |
| 12:41 | 32     | 123   | 1        |
| 12:43 | 32     | 123   | 1        |
| 12:45 | 32     | null  | 0        |
| 12:47 | 32     | null  | 0        |
| 12:49 | 32     | 321   | 1        |
| 12:51 | 32     | 321   | 1        |
I'm trying to rank orders, including those that have null values, in this case by car_id.
This is the result I'm looking for:
| time  | car_id | order | in_order | row |
|-------|--------|-------|----------|-----|
| 12:31 | 32     | null  | 0        | 1   |
| 12:33 | 32     | null  | 0        | 1   |
| 12:35 | 32     | null  | 0        | 1   |
| 12:37 | 32     | 123   | 1        | 2   |
| 12:38 | 32     | 123   | 1        | 2   |
| 12:39 | 32     | 123   | 1        | 2   |
| 12:41 | 32     | 123   | 1        | 2   |
| 12:43 | 32     | 123   | 1        | 2   |
| 12:45 | 32     | null  | 0        | 3   |
| 12:47 | 32     | null  | 0        | 3   |
| 12:49 | 32     | 321   | 1        | 4   |
| 12:51 | 32     | 321   | 1        | 4   |
I just don't know how to manage a count for the null values.
Thanks!
You can count the number of non-NULL values before each row and then use dense_rank():
select t.*,
       dense_rank() over (partition by car_id order by grp) as row
from (select t.*,
             count(order) over (partition by car_id order by time) as grp
      from t
     ) t;
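As written, count(order) increases on every non-NULL row, so the dense_rank above can split one order block into several ranks rather than producing the single block numbers shown in the desired output. A hedged alternative (assuming Hive, where order, time and row are keywords and need backticks, and using lag() with the NULL-safe <=> comparison) is to flag the start of each block and take a running sum:
select t.*,
       sum(chg) over (partition by car_id order by `time`
                      rows between unbounded preceding and current row) as `row`
from (select t.*,
             /* 1 when this row starts a new block: first row, or order changed (NULLs included) */
             case when lag(`time`) over (partition by car_id order by `time`) is null
                       or not (`order` <=> lag(`order`) over (partition by car_id order by `time`))
                  then 1 else 0 end as chg
      from t
     ) t;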

How do I get around aggregate function error?

I have the following SQL to calculate a % total:
SELECT tblTourns_atp.ID_Ti,
       Sum([FS_1] / (SELECT Sum(FSOF_1)
                     FROM stat_atp
                     WHERE stat_atp.ID_T = tblTourns_atp.ID_T)) AS S1_IP
FROM stat_atp
INNER JOIN tblTourns_atp ON stat_atp.ID_T = tblTourns_atp.ID_T
GROUP BY tblTourns_atp.ID_Ti
I'm getting the aggregate error because it wants the ID_T field either grouped or wrapped in an aggregate function. I've read loads of examples, but none of them seem to apply when the offending field is the subject of the WHERE clause.
Tables and output as follows:
+----------+------+--------+--+---------------+-------+--+--------+--------+
| stat_atp |      |        |  | tblTourns_atp |       |  | Output |        |
+----------+------+--------+--+---------------+-------+--+--------+--------+
| ID_T     | FS_1 | FSOF_1 |  | ID_T          | ID_Ti |  | ID_Ti  | S1_IP  |
| 1        | 20   | 40     |  | 1             | 1     |  | 1      | 31.03% |
| 2        | 30   | 100    |  | 2             | 1     |  | 2      | 28.57% |
| 3        | 40   | 150    |  | 3             | 1     |  | 3      | 33.33% |
| 4        | 30   | 100    |  | 4             | 2     |  |        |        |
| 5        | 30   | 100    |  | 5             | 2     |  |        |        |
| 6        | 40   | 150    |  | 6             | 2     |  |        |        |
| 7        | 20   | 40     |  | 7             | 3     |  |        |        |
| 8        | 30   | 100    |  | 8             | 3     |  |        |        |
| 9        | 40   | 150    |  | 9             | 3     |  |        |        |
| 10       | 20   | 40     |  | 10            | 3     |  |        |        |
+----------+------+--------+--+---------------+-------+--+--------+--------+
Since you already have an inner join between the two tables, a separate subquery isn't required:
select t.id_ti, sum(s.fs_1)/sum(s.fsof_1) as pct
from tbltourns_atp t inner join stat_atp s on t.id_t = s.id_t
group by t.id_ti
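Depending on the database, dividing one integer SUM by another may truncate to zero (integer division), and the sample output is expressed as a percentage, so a hedged variant of the same query (the 100.0 literal forces decimal arithmetic) is:
select t.id_ti,
       100.0 * sum(s.fs_1) / sum(s.fsof_1) as s1_ip
from tbltourns_atp t
inner join stat_atp s on t.id_t = s.id_t
group by t.id_ti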

Pandas - Grouping Rows With Same Value in Dataframe

Here is the dataframe in question:
| City | District | Population | Code | ID |
| A    | 4        | 2000       | 3    | 21 |
| A    | 8        | 7000       | 3    | 21 |
| A    | 38       | 3000       | 3    | 21 |
| A    | 7        | 2000       | 3    | 21 |
| B    | 34       | 3000       | 6    | 84 |
| B    | 9        | 5000       | 6    | 84 |
| C    | 4        | 9000       | 1    | 28 |
| C    | 21       | 1000       | 1    | 28 |
| C    | 32       | 5000       | 1    | 28 |
| C    | 46       | 20         | 1    | 28 |
I want to group the population counts by city to get this kind of output:
| City | Population | Code | ID |
| A    | 14000      | 3    | 21 |
| B    | 8000       | 6    | 84 |
| C    | 15020      | 1    | 28 |
df = df.groupby(['City', 'Code', 'ID'], as_index=False)['Population'].sum()
You can group by 'City', 'Code' and 'ID', then take the sum of 'Population'; as_index=False keeps the grouping keys as regular columns, which matches the desired output.

Transpose data using sql oracle

I have the following data in a table:
+--------+-----+----------+
| Zip_cd | id  | assignmt |
+--------+-----+----------+
| 1812   | 777 | S        |
| 1812   | 111 | P        |
| 1451   | 878 | S        |
| 55     | 45  | x        |
| 55     | 646 | T        |
| 55     | 455 | Z        |
+--------+-----+----------+
I want to transpose it as follows:
+--------+-----+----------+---------+-----+-----------+---------+-----+-----------+
| Zip_cd | id  | assignmt | Zip_cd1 | id1 | assignmt1 | Zip_cd2 | id2 | assignmt2 |
+--------+-----+----------+---------+-----+-----------+---------+-----+-----------+
| 1812   | 777 | S        | 1812    | 111 | P         |         |     |           |
| 1451   | 878 | S        |         |     |           |         |     |           |
| 55     | 45  | X        | 55      | 646 | T         | 55      | 455 | Z         |
+--------+-----+----------+---------+-----+-----------+---------+-----+-----------+
So I basically want to transpose based on the zip code: if two rows have the same zip code, they need to end up in a single row.
A query using the pivot functionality:
select *
from table_name
pivot ( max(id)       id
      , max(assignmt) assignmt
        FOR assignmt IN ('S' S,
                         'P' P) );
+--------+------+------------+------+------------+
| ZIP_CD | S_ID | S_ASSIGNMT | P_ID | P_ASSIGNMT |
+--------+------+------------+------+------------+
| 1812   | 777  | S          | 111  | P          |
| 1451   | 878  | S          | NULL | NULL       |
+--------+------+------------+------+------------+
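If the goal is the one-row-per-zip layout from the question (up to three id/assignment pairs per zip code), a hedged alternative is to number the rows within each zip code first and pivot on that position instead; the ORDER BY id inside ROW_NUMBER is an arbitrary choice here:
select *
from (select zip_cd, id, assignmt,
             row_number() over (partition by zip_cd order by id) as pos
      from table_name)
pivot ( max(id)       as id
      , max(assignmt) as assignmt
        for pos in (1 as c1, 2 as c2, 3 as c3) );
Each zip code then comes out on a single row (columns C1_ID, C1_ASSIGNMT, C2_ID, and so on), with NULLs where it has fewer than three entries.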

Access Query for Ranking/Assigning Priority Values

I am doing data conversion from a previous system that was keyed in without validation rules. I am working with a table of Emergency Contacts and trying to assign a primary contact flag (Y/N) when the field is blank or duplicated (i.e. someone put Y or N for multiple contacts, in which case I want to assign the primary arbitrarily). I will also add a new column with an alphabetic sequence (a, b, c, etc.) based on the priority designated in the other column.
Every ID must have only one priority 'Y'.
Current Table:
+--------+---------+----------+
| id     | fname   | pri_cont |
+--------+---------+----------+
| 001000 | Rox     | Y        |
| 001000 | Dan     | N        |
| 001002 | May     | Y        |
| 001007 | Lee     | Y        |
| 001007 | Clive   | Y        |
| 001008 | Max     | Y        |
| 001008 | Kim     | N        |
| 001013 | Sam     | Y        |
| 001013 | Ann     |          |
| 001014 | Nat     | Y        |
| 001018 | Bruce   | Y        |
| 001018 | Mel     |          |
| 001020 | Wilson  | Y        |
| 001022 | Goi     | Y        |
| 001022 | Adele   | N        |
| 001022 | Gary    | N        |
+--------+---------+----------+
What I want:
+--------+---------+----------+----------+
| id     | fname   | pri_cont | priority |
+--------+---------+----------+----------+
| 001000 | Rox     | Y        | a        |
| 001000 | Dan     | N        | b        |
| 001002 | May     | Y        | a        |
| 001007 | Lee     | Y        | a        |
| 001007 | Clive   | N        | b        |
| 001008 | Max     | Y        | a        |
| 001008 | Kim     | N        | b        |
| 001013 | Sam     | Y        | a        |
| 001013 | Ann     | N        | b        |
| 001014 | Nat     | Y        | a        |
| 001018 | Bruce   | Y        | a        |
| 001018 | Mel     | N        | b        |
| 001020 | Wilson  | Y        | a        |
| 001022 | Goi     | Y        | a        |
| 001022 | Adele   | N        | b        |
| 001022 | Gary    | N        | c        |
+--------+---------+----------+----------+
How can I do that?
Well, as I see it, your cleanup requires several queries (note that the queries assume the Emergency Contacts table has a unique autonumber column, dbID):
First, a SELECT query to count the Y and N instances. The same query can also calculate the Priority column, using Chr() to convert the running count to a letter:
SELECT t1.ID, t1.fname, t1.pri_cont,
       (SELECT Count(*)
        FROM EmergContacts t2
        WHERE t1.dbID >= t2.dbID AND t1.ID = t2.ID
          AND t1.pri_cont = t2.pri_cont AND t1.pri_cont = 'Y') AS YCount,
       (SELECT Count(*)
        FROM EmergContacts t3
        WHERE t1.dbID >= t3.dbID AND t1.ID = t3.ID
          AND t1.pri_cont = t3.pri_cont AND t1.pri_cont = 'N') AS NCount,
       (SELECT Chr(Count(t2.ID) + 96)
        FROM EmergContacts t2
        WHERE t1.dbID >= t2.dbID AND t1.ID = t2.ID) AS Priority
FROM EmergContacts AS t1;
With output such as below:
ID   | fname  | pri_cont | YCount | NCount | Priority
1000 | Rox    | Y        | 1      | 0      | a
1000 | Dan    | N        | 0      | 1      | b
1002 | May    | Y        | 1      | 0      | a
1007 | Lee    | Y        | 1      | 0      | a
1007 | Clive  | Y        | 2      | 1      | b
1008 | Max    | Y        | 1      | 0      | a
1008 | Kim    | N        | 0      | 1      | b
1013 | Sam    | Y        | 1      | 0      | a
1013 | Ann    |          | 0      | 1      | b
1014 | Nat    | Y        | 1      | 0      | a
1018 | Bruce  | Y        | 1      | 0      | a
1018 | Mel    |          | 0      | 1      | b
1020 | Wilson | Y        | 1      | 0      | a
1022 | Goi    | Y        | 1      | 0      | a
1022 | Adele  | N        | 0      | 1      | b
1022 | Gary   | N        | 0      | 2      | c
From there you run three update queries:
To clean up Nulls:
UPDATE EmergContacts
SET pri_cont = 'N'
WHERE pri_cont Is Null;
To clean up IDs with more than one 'Y':
UPDATE EmergContacts
SET pri_cont = 'N'
WHERE ID IN (SELECT ID FROM EmergContPrCount WHERE YCount > 1)
AND fName IN (SELECT fName FROM EmergContPrCount WHERE YCount > 1);
And to clean up IDs with no 'Y':
UPDATE EmergContacts
SET pri_cont = 'Y'
WHERE (ID IN (SELECT ID FROM EmergContPrCount WHERE YCount = 0)
AND fName IN (SELECT Max(fName) FROM EmergContPrCount WHERE YCount = 0));
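After the three updates, a quick sanity check (assuming the table is still EmergContacts) is to confirm that every ID ends up with exactly one 'Y'; this should return no rows:
SELECT ID, Count(*) AS YCount
FROM EmergContacts
WHERE pri_cont = 'Y'
GROUP BY ID
HAVING Count(*) <> 1;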