Query to find value in column dependent on a different column in table being the minimum date - sql

I have a dataset that looks like this. I would like to pull a distinct id, the minimum date and value on the minimum date.
id date value
1 01/01/2020 0.5
1 02/01/2020 1
1 03/01/2020 2
2 01/01/2020 3
2 02/01/2020 4
2 03/01/2020 5
This code will pull the id and the minimum date
select Distinct(id), min(nav_date)
from table
group by id
How can I get the value on the minimum date so the output of my query looks like this?
id date value
1 01/01/2020 0.5
2 01/01/2020 3

Use distinct on:
select distinct on (id) t.*
from t
order by id, date;
This can take advantage of an index on (id, date) and is typically the fastest way to do this operation in Postgres.
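Since distinct on is specific to Postgres, here is a rough sketch of a portable alternative: join each row to the minimum date for its id. It runs against an in-memory SQLite database through Python's standard sqlite3 module. The table name t, the ISO date strings, and the reading of the sample dates as day/month/year are assumptions made for illustration.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
# Dates are stored as ISO strings (an assumption) so MIN() orders them correctly.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, date TEXT, value REAL)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (1, "2020-01-01", 0.5),
        (1, "2020-01-02", 1.0),
        (1, "2020-01-03", 2.0),
        (2, "2020-01-01", 3.0),
        (2, "2020-01-02", 4.0),
        (2, "2020-01-03", 5.0),
    ],
)

# Derive the minimum date per id, then join back to pick up the value.
query = """
    SELECT t.id, t.date, t.value
    FROM t
    JOIN (SELECT id, MIN(date) AS min_date FROM t GROUP BY id) AS m
      ON m.id = t.id AND m.min_date = t.date
    ORDER BY t.id
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Unlike distinct on, this form returns more than one row per id if several rows share the same minimum date.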

Related

Select value where max(date)

I have a dataset with several values for the same medical procedure. I want to select the value where the date is the maximum, but I can't figure out how to do that. Below is an example of the dataset:
PROC_CODE | VALUE | DATE
123456 20.90 2020-01-01
123456 30.00 2021-01-01
123456 15.47 2022-06-01
I want to return only the last row of the dataset, which holds the VALUE for the most recent date:
PROC_CODE | VALUE | DATE
123456 15.47 2022-06-01
I tried the following code but it returns an error. What am I missing in my logic?
SELECT
PROC_CODE, VALUE
FROM MY_TABLE
WHERE MAX(DATE)
GROUP BY PROC_CODE -- Only grouping by PROC_CODE because grouping by PROC_CODE and VALUE returns the 3 lines of the dataset
You can use a subquery in your WHERE clause to do this:
SELECT PROC_CODE, VALUE, DATE
FROM MY_TABLE
WHERE DATE = (SELECT MAX(DATE) FROM MY_TABLE);
If you want the value for the max date for each proc_code, then a correlated subquery will work as well:
SELECT PROC_CODE, VALUE, DATE
FROM MY_TABLE as MT
WHERE DATE = (SELECT MAX(DATE) FROM MY_TABLE WHERE PROC_CODE = MT.PROC_CODE);
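As a minimal sketch of how the correlated subquery behaves with more than one procedure code, the script below runs it against an in-memory SQLite database using Python's standard sqlite3 module. The second code (654321) and its two rows are made up for illustration and are not part of the original dataset.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (proc_code INTEGER, value REAL, date TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [
        (123456, 20.90, "2020-01-01"),
        (123456, 30.00, "2021-01-01"),
        (123456, 15.47, "2022-06-01"),
        # Hypothetical second procedure, added only to show the per-code behaviour.
        (654321, 7.50, "2020-05-01"),
        (654321, 8.00, "2021-03-15"),
    ],
)

# For each outer row, the subquery finds the latest date of that row's proc_code.
query = """
    SELECT proc_code, value, date
    FROM my_table AS mt
    WHERE date = (SELECT MAX(date)
                  FROM my_table
                  WHERE proc_code = mt.proc_code)
    ORDER BY proc_code
"""
latest = conn.execute(query).fetchall()
print(latest)
```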

Hive: Query to get max count per word per date

Here's the data I have:
date | word | count
01/01/2020 #abc 1
01/01/2020 #xyz 2
02/05/2020 #ghi 2
02/05/2020 #def 1
02/04/2020 #pqr 4
02/04/2020 #cde 3
01/01/2020 #lmn 1
Here's the result that I want:
date | word | count
01/01/2020 #xyz 2
02/04/2020 #pqr 4
02/05/2020 #ghi 2
So basically, I want the word with maximum count on each particular date.
Can someone help me out with the query?
Use the row_number window function with partition by and order by clauses, and keep only the first row of each partition (the one with the maximum count):
SELECT date,word,count
FROM (
SELECT date,word,count,row_number() over (partition by date order by count desc) as rn
from <table_name>) sq
WHERE sq.rn = 1;
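The same pattern can be tried outside Hive. The sketch below runs it in SQLite (3.25 or later, which added window functions) through Python's standard sqlite3 module. The table name word_counts and the column names dt and cnt are assumptions chosen to avoid keyword clashes, and the dates are rewritten as ISO strings on the assumption that the originals are month/day/year.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE word_counts (dt TEXT, word TEXT, cnt INTEGER)")
conn.executemany(
    "INSERT INTO word_counts VALUES (?, ?, ?)",
    [
        ("2020-01-01", "#abc", 1),
        ("2020-01-01", "#xyz", 2),
        ("2020-02-05", "#ghi", 2),
        ("2020-02-05", "#def", 1),
        ("2020-02-04", "#pqr", 4),
        ("2020-02-04", "#cde", 3),
        ("2020-01-01", "#lmn", 1),
    ],
)

# Number the rows within each date from highest count to lowest,
# then keep only the first row of every partition.
query = """
    SELECT dt, word, cnt
    FROM (
        SELECT dt, word, cnt,
               ROW_NUMBER() OVER (PARTITION BY dt ORDER BY cnt DESC) AS rn
        FROM word_counts
    ) AS sq
    WHERE rn = 1
    ORDER BY dt
"""
top_words = conn.execute(query).fetchall()
print(top_words)
```

If two words tie for the highest count on a date, ROW_NUMBER keeps an arbitrary one of them; RANK would keep all tied rows instead.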

How to get the count of distinct values until a time period Impala/SQL?

I have a raw table recording customer ids coming to a store over a particular time period. Using Impala, I would like to calculate the number of distinct customer IDs coming to the store until each day. (e.g., on day 3, 5 distinct customers visited so far)
Here is a simple example of the raw table I have:
Day ID
1 1234
1 5631
1 1234
2 1234
2 4456
2 5631
3 3482
3 3452
3 1234
3 5631
3 1234
Here is what I would like to get:
Day Count(distinct ID) until that day
1 2
2 3
3 5
Is there a way to do this easily in a single query?
I am not 100% sure this will work on Impala.
It requires a table of days, or a way to create a derived table of days on the fly in Impala:
CREATE TABLE days ("DayC" int);
INSERT INTO days
("DayC")
VALUES (1), (2), (3);
OR
CREATE TABLE days AS
SELECT DISTINCT "Day"
FROM sales
You can then use this query (demo tested on SQL Fiddle in PostgreSQL):
SELECT "DayC", COUNT(DISTINCT "ID")
FROM sales
cross JOIN days
WHERE "Day" <= "DayC"
GROUP BY "DayC"
Output:
| DayC | count |
|------|-------|
| 1 | 2 |
| 2 | 3 |
| 3 | 5 |
Updated version, which derives the list of days with a subquery instead of a separate table:
SELECT T."DayC", COUNT(DISTINCT "ID")
FROM sales
cross JOIN (SELECT DISTINCT "Day" as "DayC" FROM sales) T
WHERE "Day" <= T."DayC"
GROUP BY T."DayC"
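To check the cumulative behaviour on the sample data, here is a small sketch that runs the subquery version in SQLite through Python's standard sqlite3 module. Whether the same statement performs well on Impala is not verified here, and the lowercase column names day and id are a simplification of the quoted names used above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (day INTEGER, id INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [
        (1, 1234), (1, 5631), (1, 1234),
        (2, 1234), (2, 4456), (2, 5631),
        (3, 3482), (3, 3452), (3, 1234), (3, 5631), (3, 1234),
    ],
)

# Pair every sale with each day on or after it, then count distinct ids per day.
cumulative_sql = """
    SELECT d.day_c, COUNT(DISTINCT s.id)
    FROM sales AS s
    CROSS JOIN (SELECT DISTINCT day AS day_c FROM sales) AS d
    WHERE s.day <= d.day_c
    GROUP BY d.day_c
    ORDER BY d.day_c
"""
cumulative = conn.execute(cumulative_sql).fetchall()

# For contrast: a plain GROUP BY counts distinct ids within each day only.
per_day = conn.execute(
    "SELECT day, COUNT(DISTINCT id) FROM sales GROUP BY day ORDER BY day"
).fetchall()

print(cumulative)
print(per_day)
```

The cross join grows with the number of days times the number of rows, so on a large table it may be worth restricting the date range first.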
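Try this one (note that it counts distinct IDs within each day, not cumulatively, so on the sample data day 3 yields 4 rather than 5):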
select day, count(distinct(id)) from yourtable group by day

SQL Show All column by one column distinct

I have a table with duplicate items.
I need to show all columns, but without the duplicate items.
For example, I have this table:
ID CODE RANK TIME
1 12345 2 10:00
2 12345 2 11:00
3 98765 3 20:00
4 98765 3 22:00
5 66666 2 10:00
6 55555 5 11:00
This is the result I need:
ID CODE RANK TIME
1 12345 2 10:00
3 98765 3 20:00
5 66666 2 10:00
6 55555 5 11:00
The TIME column is not important; only one of the rows for each CODE must be shown.
try this:
SELECT * FROM myTable WHERE ID IN(SELECT MIN(ID) FROM myTable GROUP BY Code)
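A quick way to confirm that the MIN(ID) approach returns the requested rows is to run it on the sample data. The sketch below does so in SQLite through Python's standard sqlite3 module; the table name my_table and the column types are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "rank" and "time" are quoted so they cannot be mistaken for function names.
conn.execute(
    'CREATE TABLE my_table (id INTEGER, code INTEGER, "rank" INTEGER, "time" TEXT)'
)
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?)",
    [
        (1, 12345, 2, "10:00"),
        (2, 12345, 2, "11:00"),
        (3, 98765, 3, "20:00"),
        (4, 98765, 3, "22:00"),
        (5, 66666, 2, "10:00"),
        (6, 55555, 5, "11:00"),
    ],
)

# Keep, for every code, only the row with the smallest id.
query = """
    SELECT *
    FROM my_table
    WHERE id IN (SELECT MIN(id) FROM my_table GROUP BY code)
    ORDER BY id
"""
unique_rows = conn.execute(query).fetchall()
print(unique_rows)
```

Because whole rows are selected by id, every column in the result comes from the same original row.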
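If it does not matter which ID is shown (just as with the TIME column), and the ID and TIME columns are always sorted the same way, this should work. Note that MIN(id) and MIN(time) are computed independently, so they are not guaranteed to come from the same row.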
SELECT MIN(id), code, rank, MIN(time)
FROM table
GROUP BY code, rank
So you only want rows where the CODE is not duplicated in the table.
SELECT "CODE"
FROM table1
GROUP BY "CODE"
HAVING COUNT(*) = 1
This returns the CODE values that occur exactly once. Because they are unique, you can use them in an IN subquery to fetch the whole rows:
SELECT *
FROM table1
WHERE "CODE" IN (
SELECT "CODE"
FROM table1
GROUP BY "CODE"
HAVING COUNT(*) = 1
)
I think you are looking for the DISTINCT clause:
SELECT DISTINCT
column_1,
column_2
FROM
tbl_name;

max data in one column based on another column in sql

Hello I am very new to SQL programming, started last week. I am trying to select a userID and Maxdate from a table that looks like this for example:
Key USERID Date
1 111 12/1/2014
2 202 4/1/2014
3 111 3/8/2014
4 111 2/5/2014
5 202 2/10/2014
I want to make a query that would end up with the following results:
USERID DATE
111 12/1/2014
202 4/1/2014
Simply use a GROUP BY clause with the aggregate function MAX:
SELECT USERID, MAX(Date) AS Date
FROM tableA
GROUP BY USERID
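One caveat worth checking is how the dates are stored: MAX only gives the latest date if the column is a real date type or a sortable text format. The sketch below, which uses SQLite through Python's standard sqlite3 module, assumes ISO-formatted strings and a table named table_a with a column row_key standing in for Key (all of these are illustrative choices), and then shows what happens when the original M/D/YYYY strings are compared as plain text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_a (row_key INTEGER, userid INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO table_a VALUES (?, ?, ?)",
    [
        (1, 111, "2014-12-01"),
        (2, 202, "2014-04-01"),
        (3, 111, "2014-03-08"),
        (4, 111, "2014-02-05"),
        (5, 202, "2014-02-10"),
    ],
)

# Latest date per user; ISO strings sort in chronological order.
max_dates = conn.execute(
    "SELECT userid, MAX(date) AS max_date FROM table_a "
    "GROUP BY userid ORDER BY userid"
).fetchall()
print(max_dates)

# Pitfall: the same dates for user 111 kept as M/D/YYYY text compare
# character by character, so MAX does not pick the latest date.
text_max = conn.execute(
    "SELECT MAX(d) FROM (SELECT '12/1/2014' AS d "
    "UNION ALL SELECT '3/8/2014' UNION ALL SELECT '2/5/2014')"
).fetchone()[0]
print(text_max)
```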