SQL: how to select the lack of a condition?

I have this table structure and data for keeping track of horse race results:
T_RACE_HISTORY
==============
HORSE_ID RACE_DT PLACE
-------- ---------- -----
1 2014-05-03 1
1 2014-07-22 1
1 2016-06-10 3
2 2016-06-10 2
3 2016-06-10 1
I want a query that returns each unique horse id and either the date of the latest race won by that horse, or null if the horse has never won.
In other words, I want a query with this output:
HORSE_ID RACE_DT
-------- ----------
1 2014-07-22
2 (null)
3 2016-06-10
I can get the winning horses with a query like this:
SELECT HORSE_ID,
MAX(RACE_DT)
FROM T_RACE_HISTORY
WHERE PLACE = 1
GROUP BY HORSE_ID
But I have no idea how to look for the lack of any won races.

You can use conditional aggregation:
select horse_id,
max(case when place = 1 then race_dt end)
from t_race_history
group by horse_id
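To check the conditional aggregation against the sample data from the question, here is a minimal sketch that runs it in an in-memory SQLite database through Python's sqlite3 module. The choice of SQLite and the column alias last_win_dt are assumptions made for illustration; the question does not say which database is in use.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t_race_history (horse_id INTEGER, race_dt TEXT, place INTEGER)"
)
conn.executemany(
    "INSERT INTO t_race_history VALUES (?, ?, ?)",
    [
        (1, "2014-05-03", 1),
        (1, "2014-07-22", 1),
        (1, "2016-06-10", 3),
        (2, "2016-06-10", 2),
        (3, "2016-06-10", 1),
    ],
)

# The CASE expression yields NULL for races that were not won, and MAX
# ignores NULLs, so a horse with no wins ends up with NULL (None in Python).
last_wins = conn.execute(
    """
    SELECT horse_id,
           MAX(CASE WHEN place = 1 THEN race_dt END) AS last_win_dt
    FROM t_race_history
    GROUP BY horse_id
    ORDER BY horse_id
    """
).fetchall()

print(last_wins)
```

Because there is no WHERE clause, every horse keeps its group, and only the value inside the aggregate depends on the condition.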

Related

Count values separately until certain amount of duplicates SQL

I need a statement that selects all patients and the number of their appointments. When 3 or more appointments take place on the same date, they should be counted as one appointment.
This is what my statement looks like so far:
SELECT PATSuchname, Count(DISTINCT AKTDATUM) AS AKTAnz
FROM tblAktivitaeten
LEFT OUTER JOIN tblPatienten ON (tblPatienten.PATID=tblAktivitaeten.PATID)
WHERE (AKTDeleted<>'J' OR AKTDeleted IS Null)
GROUP BY PATSuchname
ORDER BY AKTAnz DESC
The result should look like this
PATSuchname Appointments
----------------------------------------
Joey Patner 13
Billy Jean 15
Example Name 13
As you can see, Joey Patner has 13 appointments. In the real table he has 15 appointments, but three of them have the same date, so they are only counted as 1.
So how can I write a statement that does exactly that?
(I am new to Stack Overflow; sorry if the format I use is wrong, and please tell me if it is.)
In the table it looks like this.
tblPatienten
------------
PATSuchname    PATID
--------------------
Joey Patner    1
Billy Jean     2
Example Name   3
tblAktivitaeten
---------------
AKTDatum     PATID  AKTID
-------------------------
08.02.2021   1      1000   --
08.02.2021   1      1001   -- these 3 should be counted as 1
08.02.2021   1      1002   --
09.05.2021   1      1003
09.07.2021   2      1004   -- these 2 shouldn't be counted as 1
09.07.2021   2      1005   --
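Two levels of GROUP BY should do it. The inner query counts appointments per patient and date, collapsing three or more on the same date to 1, and the outer query sums those counts per patient: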
SELECT
x.PATID, PATSuchname, SUM(ApptCount)
FROM (
SELECT
PATID, AKTDatum, CASE WHEN COUNT(*) < 3 THEN COUNT(*) ELSE 1 END AS ApptCount
FROM tblAktivitaeten
GROUP BY
PATID, AKTDatum
) AS x
LEFT JOIN tblPatienten ON tblPatienten.PATID = x.PATID
GROUP BY
x.PATID, PATSuchname
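As a rough check of the two-level grouping, the sketch below loads the sample rows from the question into an in-memory SQLite database and runs the same query shape. SQLite, the alias Appointments, the table alias p, and the added ORDER BY are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblPatienten (PATSuchname TEXT, PATID INTEGER)")
conn.execute("CREATE TABLE tblAktivitaeten (AKTDatum TEXT, PATID INTEGER, AKTID INTEGER)")
conn.executemany(
    "INSERT INTO tblPatienten VALUES (?, ?)",
    [("Joey Patner", 1), ("Billy Jean", 2), ("Example Name", 3)],
)
conn.executemany(
    "INSERT INTO tblAktivitaeten VALUES (?, ?, ?)",
    [
        ("08.02.2021", 1, 1000),
        ("08.02.2021", 1, 1001),
        ("08.02.2021", 1, 1002),
        ("09.05.2021", 1, 1003),
        ("09.07.2021", 2, 1004),
        ("09.07.2021", 2, 1005),
    ],
)

# Inner query: one row per patient and date; 3 or more appointments on the
# same date collapse to 1. Outer query: sum the per-date counts per patient.
appointments = conn.execute(
    """
    SELECT x.PATID, p.PATSuchname, SUM(x.ApptCount) AS Appointments
    FROM (
        SELECT PATID, AKTDatum,
               CASE WHEN COUNT(*) < 3 THEN COUNT(*) ELSE 1 END AS ApptCount
        FROM tblAktivitaeten
        GROUP BY PATID, AKTDatum
    ) AS x
    LEFT JOIN tblPatienten AS p ON p.PATID = x.PATID
    GROUP BY x.PATID, p.PATSuchname
    ORDER BY x.PATID
    """
).fetchall()

print(appointments)
```

With this data, patient 1 gets 2 (the three appointments on 08.02.2021 count once, plus one on 09.05.2021) and patient 2 gets 2. Patient 3 has no rows in tblAktivitaeten and therefore does not appear, since the join starts from the activity side.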

How to query: "for which do these values apply"?

I'm trying to match and align data or, put differently, count occurrences and then list the values for which those occurrences occur.
Or, in a question: "How many times does each ID value occur, and for what names?"
For example, with this input
Name ID
-------------
jim 123
jim 234
jim 345
john 123
john 345
jane 234
jane 345
jan 45678
I want the output to be:
count ID name name name
------------------------------------
3 345 jim john jane
2 123 jim john
2 234 jim jane
1 45678 jan
Or similarly, the input could be (noticing that the ID values are not aligned),
jim   john  jane  jan
-----------------------
123   345   234   45678
234   123   345
345
but that seems to complicate things.
The closest I have come to the desired result is this SQL:
select ID, count(ID) as count
from table
group by ID
order by count desc
which outputs
ID count
------------
345 3
123 2
234 2
45678 1
I'd appreciate any help.
You seem to want a pivot. In SQL, you have to specify the number of columns in advance (unless you construct the query as a string).
But the idea is:
select ID, count(*) as cnt,
       max(case when seqnum = 1 then name end) as name_1,
       max(case when seqnum = 2 then name end) as name_2,
       max(case when seqnum = 3 then name end) as name_3
from (select t.*,
             row_number() over (partition by id order by id) as seqnum -- arbitrary ordering
      from table t
     ) t
group by ID
order by cnt desc;
If you have an unknown number of columns, you can aggregate the values into an array:
select ID, count(*) as cnt,
       array_agg(name order by name) as names
from table t
group by ID
order by cnt desc;
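For a quick local check of the "aggregate the names per ID" idea, here is a sketch against an in-memory SQLite database. SQLite has no array_agg, so GROUP_CONCAT is swapped in as the nearest equivalent; the table name people is hypothetical (the answer uses the placeholder "table"), and the names are sorted in Python because this form of GROUP_CONCAT does not guarantee an order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "people" is a made-up table name standing in for the question's table.
conn.execute("CREATE TABLE people (name TEXT, id INTEGER)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [
        ("jim", 123), ("jim", 234), ("jim", 345),
        ("john", 123), ("john", 345),
        ("jane", 234), ("jane", 345),
        ("jan", 45678),
    ],
)

# One row per ID: how often it occurs and which names it occurs for.
rows = conn.execute(
    """
    SELECT id, COUNT(*) AS cnt, GROUP_CONCAT(name) AS names
    FROM people
    GROUP BY id
    ORDER BY cnt DESC, id
    """
).fetchall()

# Split the comma-separated names and sort them for a stable result.
result = [(cnt, id_, sorted(names.split(","))) for id_, cnt, names in rows]
for line in result:
    print(line)
```

The secondary sort on id is only there to make the order of the two IDs with a count of 2 deterministic.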
The query would look similar to this, if that is what you're looking for:
SELECT
name,
id,
COUNT(id) as count
FROM
dataSet
WHERE
dataSet.name = 'input'
AND dataSet.id = 'input'
GROUP BY
name,
id

How do I create a frequency distribution?

I'm trying to create a frequency distribution to show how many customers have transacted 1x, 2x, 3x, etc.
I have a table transactions with a column user_id. Each row indicates a transaction, and if a user_id shows up in multiple rows, that user has done multiple transactions.
Now I'd like to get a list that looks something like this:
Tra. | Freq.
0 | 345
1 | 543
2 | 45
3 | 20
4 | 0
5 | 3
etc
Currently I have this, but it just shows a list of users and how many transactions they have had.
SELECT user_id, COUNT(user_id) as number_of_transactions
FROM transactions
GROUP BY user_id
ORDER BY number_of_transactions DESC;
I did some digging, and it was suggested that generate_series might help, but I'm stuck and don't know how to move forward.
Use the first result as input to an outer query where you apply the count again, but this time grouping on number_of_transactions:
SELECT number_of_transactions, COUNT(*) AS freq
FROM (
SELECT user_id, COUNT(user_id) as number_of_transactions
FROM transactions
GROUP BY user_id
) A
GROUP BY number_of_transactions;
This would transform a result like:
user_id number_of_transactions
----------- ----------------------
1 2
2 1
3 2
4 4
to this:
number_of_transactions freq
---------------------- -----------
1 1
2 2
4 1
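The transformation above can be reproduced with a small sketch using Python's sqlite3 module. The transaction rows are made up so that users 1 to 4 have 2, 1, 2 and 4 transactions, matching the example; SQLite and the added ORDER BY are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (user_id INTEGER)")
# Illustrative rows: user 1 has 2 transactions, user 2 has 1,
# user 3 has 2, and user 4 has 4.
conn.executemany(
    "INSERT INTO transactions VALUES (?)",
    [(1,), (1,), (2,), (3,), (3,), (4,), (4,), (4,), (4,)],
)

# Inner query: transactions per user. Outer query: users per transaction count.
freq = conn.execute(
    """
    SELECT number_of_transactions, COUNT(*) AS freq
    FROM (
        SELECT user_id, COUNT(user_id) AS number_of_transactions
        FROM transactions
        GROUP BY user_id
    ) A
    GROUP BY number_of_transactions
    ORDER BY number_of_transactions
    """
).fetchall()

print(freq)
```

Note that counts nobody has (3 in this data) produce no row at all, so a line like "4 | 0" from the question would need an extra series of numbers to join against.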

Elasticsearch Join columns from different index with condition

I have 2 different indexes in Elasticsearch, indx1 and indx2, which I have indexed from an SQL database using the river plugin.
indx1
----------
id | Amt
1 2
2 3
3 2
1 9
2 4
----------
indx2
----------
id | Name
1 Alex
2 Joe
3 MARY
----------
I now want to create a new index that calculates the average amount from indx1 and joins everything into a single index.
So the final structure of the index should look like this:
indx_final
----------
id | Name | Avg Amt | Status
1 Alex 5.5 High
2 Joe 3.5 Med
3 Mary 2.0 Low
----------
The status is set according to the average amount: if avg amt > 4, status = High; if avg amt > 3, status = Med; if avg amt < 2.5, status = Low.
Is this possible to do in Elasticsearch only? If not, I would have to do the calculation in SQL and then index the data again.
Any help would be appreciated. Thanks!

SQL sort that distributes results

Given a table of products like this:
ID Name Seller ID Updated at
-- ---- --------- ----------
1 First 3 2012-01-01 12:00:10
2 Second 3 2012-01-01 12:00:09
3 Third 4 2012-01-01 12:00:08
4 Fourth 4 2012-01-01 12:00:07
5 Fifth 5 2012-01-01 12:00:06
I want to construct a query to sort the products like this:
ID
---
1
3
5
2
4
In other words, the query should show most recently updated products, distributed by seller to minimize the likelihood of continuous sequences of products from the same seller.
Any ideas on how to best accomplish this? (Note that the code for this application is Ruby, but I'd like to do this in pure SQL if possible.)
EDIT:
Note that the query should handle this case, too:
ID Name Seller ID Updated at
-- ---- --------- ----------
1 First 3 2012-01-01 12:00:06
2 Second 3 2012-01-01 12:00:07
3 Third 4 2012-01-01 12:00:08
4 Fourth 4 2012-01-01 12:00:09
5 Fifth 5 2012-01-01 12:00:10
to produce the following results:
ID
---
5
4
2
3
1
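One option is to rank each seller's products by recency and then sort by that rank, so every seller's newest product comes before any seller's second newest: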
select subq.*
from (
select rank() over (partition by seller_id order by updated_at desc) rnk,
p.*
from products p) subq
order by rnk, updated_at desc;
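To see how the rank-then-sort query behaves on both tables from the question, here is a sketch that runs it in an in-memory SQLite database. It assumes an SQLite build with window function support (3.25 or later) and column names id, name, seller_id and updated_at; the helper distributed_order is hypothetical and exists only for this illustration.

```python
import sqlite3

QUERY = """
    SELECT subq.id
    FROM (
        SELECT RANK() OVER (PARTITION BY seller_id ORDER BY updated_at DESC) AS rnk,
               p.*
        FROM products p
    ) subq
    ORDER BY rnk, updated_at DESC
"""


def distributed_order(rows):
    """Load rows into a fresh products table and return the ids in query order."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE products (id INTEGER, name TEXT, seller_id INTEGER, updated_at TEXT)"
    )
    conn.executemany("INSERT INTO products VALUES (?, ?, ?, ?)", rows)
    ids = [row[0] for row in conn.execute(QUERY)]
    conn.close()
    return ids


first_case = [
    (1, "First", 3, "2012-01-01 12:00:10"),
    (2, "Second", 3, "2012-01-01 12:00:09"),
    (3, "Third", 4, "2012-01-01 12:00:08"),
    (4, "Fourth", 4, "2012-01-01 12:00:07"),
    (5, "Fifth", 5, "2012-01-01 12:00:06"),
]
second_case = [
    (1, "First", 3, "2012-01-01 12:00:06"),
    (2, "Second", 3, "2012-01-01 12:00:07"),
    (3, "Third", 4, "2012-01-01 12:00:08"),
    (4, "Fourth", 4, "2012-01-01 12:00:09"),
    (5, "Fifth", 5, "2012-01-01 12:00:10"),
]

print(distributed_order(first_case))
print(distributed_order(second_case))
```

The timestamps are stored as ISO-style text, which sorts chronologically as plain strings. Rows with rank 1 (each seller's most recent product) come first, ordered by recency, followed by the rank 2 rows.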