SQL: Get the first value - sql

I have two tables:
patients(ID, Firstname, Lastname, ...)
records(ID, Date, Time, Version)
I want to (inner) join these tables so that I get the records together with the patient data, but in the Version column I always want the first value that was recorded for the patient (that is, the one with the minimum date and time per patient ID). I tried a subquery, but HANA doesn't allow an ORDER BY or LIMIT clause in subqueries.
How can I implement this with SQL? (HANA SQL)
Kind regards and thanks in advance.

HANA supports window functions, so you can join against a derived table that picks the first version:
select p.*, r.id, r.date, r.time, r.version
from patients p
join (
select id, date, time, version, patient_id,
row_number() over (partition by patient_id order by date, time) as rn
from records
) r on p.id = r.patient_id and r.rn = 1
The above assumes that the records table has a column patient_id that contains the id of the patient to which that record belongs.
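The question also asks to keep every record while showing the first recorded version next to it. A minimal sketch of that variant follows, using FIRST_VALUE over a per-patient window. It runs against SQLite through Python's sqlite3 module (SQLite 3.25 or newer is assumed for window functions), not against HANA, so the syntax may need adapting. The sample rows are invented, and patient_id is the same assumed column as in the answer above.

```python
import sqlite3

# Hypothetical sample data; patient_id is an assumed column on records.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE patients (id INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT);
CREATE TABLE records  (id INTEGER PRIMARY KEY, patient_id INTEGER,
                       date TEXT, time TEXT, version TEXT);
INSERT INTO patients VALUES (1, 'Ada', 'Lovelace'), (2, 'Alan', 'Turing');
INSERT INTO records VALUES
  (10, 1, '2021-03-01', '09:00', 'v2'),
  (11, 1, '2021-01-15', '14:30', 'v1'),
  (12, 1, '2021-01-15', '08:00', 'v0'),
  (13, 2, '2021-02-01', '10:00', 'v7');
""")

# Every record row is kept; first_version is the version of the earliest
# record (ordered by date, then time) of the same patient.
rows = conn.execute("""
SELECT p.id AS patient_id, r.id AS record_id, r.date, r.time,
       FIRST_VALUE(r.version) OVER (
           PARTITION BY r.patient_id ORDER BY r.date, r.time
       ) AS first_version
FROM patients p
JOIN records r ON r.patient_id = p.id
ORDER BY r.id
""").fetchall()

for row in rows:
    print(row)
```

With this data, all three records of patient 1 carry the version of the record from 2021-01-15 08:00.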

Related

SQL How to pull in all records that don't contain

This is a bit of a tricky question to explain, but I'll try my best.
The essence of the question is that I have an employee salary table with the following columns: Employee ID, Month of Salary, Salary (Currency).
I want to run a select that will show me all of the employees that don't have a record for X month.
I have attached an image to help visualise this, and here is an example of what I would want from this data:
Let's say from this small example that I want to see all of the employees that weren't paid on the 1st October 2021. From looking I know that employee 3 was the only one paid and 1 and 2 were not paid. How would I be able to query this on a much larger range of data without knowing which month it could be that they weren't paid?
You need to join your EmployeeSalary table against a list of expected EmployeeID/MonthOfSalary values and determine the gaps - the instances where there is no matching record in the EmployeeSalary table. A LEFT OUTER JOIN can be used here: whenever there is no matching record in your EmployeeSalary table, the LEFT OUTER JOIN gives you NULL.
The following query shows how to perform the LEFT OUTER JOIN. Note, however, that I've joined your table to itself to get the list of EmployeeID and MonthOfSalary values. You would be better off taking these from other tables; for example, I assume you have an Employee table with all the IDs in it, which would be more efficient (and more accurate) than building the ID list from the EmployeeSalary table as I've done here.
SELECT EmployeeList.EmployeeID, MonthList.MonthOfSalary
FROM (SELECT DISTINCT MonthOfSalary FROM EmployeeSalary) MonthList
CROSS JOIN (SELECT DISTINCT EmployeeID FROM EmployeeSalary) EmployeeList
LEFT OUTER JOIN EmployeeSalary
ON MonthList.MonthOfSalary = EmployeeSalary.MonthOfSalary
AND EmployeeList.EmployeeID = EmployeeSalary.EmployeeID
WHERE EmployeeSalary.EmployeeID IS NULL
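As a sketch of the suggestion to take the IDs from a separate table, the following builds the expected employee/month pairs from a hypothetical Employee table and keeps only the pairs with no salary row. It runs on SQLite through Python's sqlite3 module; the Employee table and all sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Employee is a hypothetical master table; the question only shows EmployeeSalary.
CREATE TABLE Employee (EmployeeID INTEGER PRIMARY KEY);
CREATE TABLE EmployeeSalary (EmployeeID INTEGER, MonthOfSalary TEXT, Salary REAL);
INSERT INTO Employee VALUES (1), (2), (3);
INSERT INTO EmployeeSalary VALUES
  (1, '2021-09-01', 1000), (2, '2021-09-01', 1200),
  (3, '2021-09-01', 900),  (3, '2021-10-01', 900);
""")

# Expected pairs = every employee x every month seen in the data.
# The LEFT JOIN leaves NULLs where no salary row matches a pair.
missing = conn.execute("""
SELECT e.EmployeeID, m.MonthOfSalary
FROM Employee e
CROSS JOIN (SELECT DISTINCT MonthOfSalary FROM EmployeeSalary) m
LEFT JOIN EmployeeSalary s
       ON s.EmployeeID = e.EmployeeID
      AND s.MonthOfSalary = m.MonthOfSalary
WHERE s.EmployeeID IS NULL
ORDER BY m.MonthOfSalary, e.EmployeeID
""").fetchall()

print(missing)
```

Here employees 1 and 2 are reported for October 2021, mirroring the example in the question. A month in which nobody was paid would still be missed, because the month list is derived from the salary rows themselves.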
You first need to get the latest paid month for each employee, then calculate how long ago it was, and filter on that difference. Because the latest month is already aggregated inside the CTE, a plain WHERE clause is enough for the filter.
As a starting point I propose the following, which you may need to adapt, at least to cast some values according to your column types.
with latest_pay as (
-- For each employee, get the latest paid month
-- (Salary is left out: it is neither grouped nor aggregated, so it cannot be selected here)
select Employee_ID, max(Month) as latest_pay_month
from your_table
group by Employee_ID
)
-- Look for employees not paid for more than 'your_threshold' months
-- (SQL Server argument order assumed: datediff(datepart, start, end))
select Employee_ID, latest_pay_month, datediff(month, latest_pay_month, getdate()) as latest_paid_month_delay
from latest_pay
where datediff(month, latest_pay_month, getdate()) > your_threshold
Btw, I know it's an example, but avoid using column names such as Month, which can be confused with SQL keywords and lead to errors.
This is ideally where you would use a calendar table - having one available is handy for tasks such as this, where you need to find missing dates.
You can build one on the fly, as I have done in this example; normally, however, you would have a permanent table to use.
In order to determine which rows are missing, you need to generate a list of expected rows; an outer join to your actual data will then reveal the missing rows.
So here we have a CTE that generates a list of dates (based on a date range you can set), followed by another that gives a list of all the EmployeeId values.
You expect each EmployeeId to have a row for each month, so we do a cross join to generate the list of expected results. We then outer join with the actual data and filter to the null rows; these are the employees who have not been paid for that month.
See example DB<>Fiddle
declare @from date='20210101', @to date='20211001';
with dates as (
select DateAdd(month,n,@from) dt from (
select top(100) Row_Number() over(order by (select null))-1 n from master.dbo.spt_values
)v
), e as (select distinct employeeId from t)
select dt, e.EmployeeId
from dates d cross join e
left join t on DatePart(month,d.dt)=DatePart(month,t.PaidDate) and t.EmployeeId=e.EmployeeId
where d.dt<=@to
and t.EmployeeId is null
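For readers without SQL Server at hand, here is a rough equivalent of the on-the-fly calendar, sketched for SQLite through Python's sqlite3 module. It swaps the spt_values numbering trick for a recursive CTE and compares year and month together, which is a deliberate change from the query above. The table t, its columns and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (EmployeeId INTEGER, PaidDate TEXT);
INSERT INTO t VALUES
  (1, '2021-08-01'), (1, '2021-09-01'),
  (2, '2021-08-01'), (2, '2021-10-01');
""")

sql = """
WITH RECURSIVE dates(dt) AS (
    -- one row per month, from :from_date up to and including :to_date
    SELECT :from_date
    UNION ALL
    SELECT date(dt, '+1 month') FROM dates WHERE dt < :to_date
),
e AS (SELECT DISTINCT EmployeeId FROM t)
SELECT d.dt, e.EmployeeId
FROM dates d
CROSS JOIN e
LEFT JOIN t
       ON t.EmployeeId = e.EmployeeId
      AND strftime('%Y-%m', t.PaidDate) = strftime('%Y-%m', d.dt)
WHERE t.EmployeeId IS NULL
ORDER BY d.dt, e.EmployeeId
"""

gaps = conn.execute(sql, {"from_date": "2021-08-01",
                          "to_date": "2021-10-01"}).fetchall()
print(gaps)
```

Because the calendar is generated independently of the salary data, a month in which nobody was paid still shows up as a gap for every employee.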

select rows in sql with latest date from 3 tables in each group

I'm creating PREDICATE system for my application.
Please see image that I already
My question is: how can I select the rows with the latest date in the "Taken On" column for each "QuizESId"? I already understand how to do this with a single table, which I learned from this question:
select rows in sql with latest date for each ID repeated multiple times
Here is what I have already tried
SELECT tt.*
FROM myTable tt
INNER JOIN
(SELECT ID, MAX(Date) AS MaxDateTime
FROM myTable
GROUP BY ID) groupedtt ON tt.ID = groupedtt.ID
AND tt.Date = groupedtt.MaxDateTime
What I am confused about is how to select from 3 tables. I hope you can guide me; ideally I need a solution that is both a clean query and efficient.
Thanks
This is for SQL Server (you didn't specify exactly what RDBMS you're using):
if you want to get the "latest row for each QuizId" - this sounds like you need a CTE (Common Table Expression) with a ROW_NUMBER() value - something like this (updated: you obviously want to "partition" not just by QuizId, but also by UserName):
WITH BaseData AS
(
SELECT
mAttempt.Id AS Id,
mAttempt.QuizModelId AS QuizId,
mAttempt.StartedAt AS StartsOn,
mUser.UserName,
mDetail.Score AS Score,
RowNum = ROW_NUMBER() OVER (PARTITION BY mAttempt.QuizModelId, mUser.UserName
ORDER BY mAttempt.TakenOn DESC)
FROM
UserQuizAttemptModels mAttempt
INNER JOIN
AspNetUsers mUser ON mAttempt.UserId = mUser.Id
INNER JOIN
QuizAttemptDetailModels mDetail ON mDetail.UserQuizAttemptModelId = mAttempt.Id
)
SELECT *
FROM BaseData
WHERE QuizId = 10053
AND RowNum = 1
The BaseData CTE basically selects the data (as you did) - but it also adds a ROW_NUMBER() column. This will "partition" your data into groups of data - based on the QuizModelId and the UserName - and it will number all the rows inside each data group, starting at 1, and ordered by the second condition - the ORDER BY clause. You said you want to order by "Taken On" date - but there's no such date visible in your query - so I just guessed it might be on the UserQuizAttemptModels table - change and adapt as needed.
Now you can select from that CTE with your original WHERE condition - and you specify that you want only the first row for each data group (for each combination of quiz and user) - the one with the most recent "Taken On" date value.
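The same pattern can be tried out end to end with a small sketch on SQLite through Python's sqlite3 module (SQLite 3.25 or newer assumed). The simplified table names users, attempts and details, their columns and the sample rows are all hypothetical stand-ins for the three tables in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical, simplified stand-ins for the three tables in the question.
CREATE TABLE users    (id INTEGER PRIMARY KEY, user_name TEXT);
CREATE TABLE attempts (id INTEGER PRIMARY KEY, quiz_id INTEGER,
                       user_id INTEGER, taken_on TEXT);
CREATE TABLE details  (attempt_id INTEGER, score INTEGER);
INSERT INTO users VALUES (1, 'ann'), (2, 'bob');
INSERT INTO attempts VALUES
  (100, 10053, 1, '2021-05-01'), (101, 10053, 1, '2021-06-01'),
  (102, 10053, 2, '2021-04-15'), (103, 20000, 1, '2021-07-01');
INSERT INTO details VALUES (100, 55), (101, 80), (102, 70), (103, 90);
""")

# Number the attempts per (quiz, user), newest first, then keep row 1.
latest = conn.execute("""
WITH base AS (
    SELECT a.id, a.quiz_id, u.user_name, d.score,
           ROW_NUMBER() OVER (PARTITION BY a.quiz_id, u.user_name
                              ORDER BY a.taken_on DESC) AS row_num
    FROM attempts a
    JOIN users u   ON u.id = a.user_id
    JOIN details d ON d.attempt_id = a.id
)
SELECT id, user_name, score
FROM base
WHERE quiz_id = 10053 AND row_num = 1
ORDER BY user_name
""").fetchall()

print(latest)
```

If an attempt can have several detail rows, the join multiplies the rows before they are numbered, so only one detail row per group would survive; whether that is wanted depends on the real schema.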

SQL Server : get latest date from 2 tables

I have two tables P and G and want to write a query that will get the latest date from table G and will not pull in duplicate client IDs:
Table P
Table G
I want to get this result from the query:
So far I have joined the tables, but I am unable to get the intended result.
Any help would be appreciated.
Not sure how your tables are related other than your column ClientID, but you would want to join the two tables on those columns:
select p.clientid,
max(g.created_on) latest_created_on,
max(p.info) as info
from tableP p
left join tableG g on p.ClientID = g.ClientID
group by p.clientid;
SQL Fiddle Demo
You can use ROW_NUMBER() with an OVER (PARTITION BY ...) clause to take the record with the most recent date for each ClientID.
In this case, I would write:
SELECT g.ClientID,
g.created_on,
g.INFO
FROM (
SELECT ClientID,
created_on,
INFO,
row_number() OVER ( PARTITION BY ClientID ORDER BY created_on DESC) AS RowNum
FROM Table_G
) AS g
WHERE g.RowNum = 1
The subquery creates a table with all the columns you want, and the row_number() function assigns each record a row_number. PARTITION BY says what to group by, and ORDER BY says how to sort within that partition.
In this case, you want the record with the most recent date for each ClientID. We group by ClientID, sort by date to assign row numbers, and then in the main query, we select only the first row in each group, using WHERE g.RowNum = 1
This is a guide for PostgreSQL, but it helped me understand OVER (PARTITION BY ...).
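One practical difference between the two answers is worth seeing on data: independent MAX() calls can combine values from different rows, while ROW_NUMBER() returns one whole row. The sketch below illustrates this on SQLite through Python's sqlite3 module. For simplicity it aggregates INFO from Table_G itself, whereas the first answer aggregates info from table P, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Table_G (ClientID INTEGER, created_on TEXT, INFO TEXT);
INSERT INTO Table_G VALUES
  (1, '2021-01-10', 'zeta'),
  (1, '2021-03-05', 'alpha'),
  (2, '2021-02-20', 'beta');
""")

# Independent MAX() calls: the date and INFO may come from different rows.
by_max = conn.execute("""
SELECT ClientID, MAX(created_on), MAX(INFO)
FROM Table_G
GROUP BY ClientID
ORDER BY ClientID
""").fetchall()

# ROW_NUMBER(): INFO is taken from the same row as the latest date.
by_rownum = conn.execute("""
SELECT ClientID, created_on, INFO
FROM (SELECT ClientID, created_on, INFO,
             ROW_NUMBER() OVER (PARTITION BY ClientID
                                ORDER BY created_on DESC) AS RowNum
      FROM Table_G) AS g
WHERE RowNum = 1
ORDER BY ClientID
""").fetchall()

print(by_max)
print(by_rownum)
```

For client 1 the aggregate version pairs the latest date with 'zeta', which belongs to the older row, while the ROW_NUMBER() version returns 'alpha' from the latest row.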

Query to return

I need to write a query that returns results along the following lines. Given three tables (customers, products and order history), I want to find the latest order for each product for each customer. Any help appreciated.
The latest order for each product for each customer - you can achieve this by using the ROW_NUMBER() window function and partitioning your data by those two criteria.
So try something like this (just guessing table and column names, since you haven't provided anything to go on):
;WITH NewestData AS
(
SELECT
oh.OrderDate,
c.CustomerName,
p.ProductName,
RowNum = ROW_NUMBER() OVER (PARTITION BY oh.CustomerID, oh.ProductID
ORDER BY oh.OrderDate DESC)
FROM
dbo.OrderHistory oh
INNER JOIN
dbo.Customer c ON oh.CustomerID = c.CustomerID
INNER JOIN
dbo.Product p ON oh.ProductID = p.ProductID
)
SELECT
OrderDate, CustomerName, ProductName
FROM
NewestData
WHERE
RowNum = 1
OK, explanation time:
The CTE (Common Table Expression) basically joins the three tables (guessing what the table and column names are, and how they are connected) and selects some of the columns from those tables (you could add more columns, if you need them, of course!)
The RowNum is a consecutive number, starting at 1, for each "partition" of data; the PARTITION BY clause expresses that for each combination of (CustomerID, ProductID), you want to have a "partition" which gets numbered (1, 2, 3, 4, ...) based on the ORDER BY clause - here, the most recent order in each partition (for that customer+product) gets the number 1.
So in the end, all you need to do is select from that CTE, keeping only those rows that have RowNum = 1 - those are the most recent orders for each "partition" of (CustomerID, ProductID) in your order history table.
This works from SQL Server 2005 on and newer versions - it is not supported in 2000 .....
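Where window functions are not available, one common fallback is to join the table against its own per-group MAX(OrderDate), the same shape as the single-table query quoted in an earlier question on this page. The sketch below uses SQLite through Python's sqlite3 module only to show the query shape; it has not been checked against SQL Server 2000, and the table, columns and rows are guesses just as in the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE OrderHistory (CustomerID INTEGER, ProductID INTEGER, OrderDate TEXT);
INSERT INTO OrderHistory VALUES
  (1, 7, '2021-01-01'), (1, 7, '2021-04-01'),
  (1, 8, '2021-02-01'), (2, 7, '2021-03-01');
""")

# The derived table finds the newest date per (customer, product);
# joining back on that date returns the full matching rows.
newest = conn.execute("""
SELECT oh.CustomerID, oh.ProductID, oh.OrderDate
FROM OrderHistory oh
JOIN (SELECT CustomerID, ProductID, MAX(OrderDate) AS MaxDate
      FROM OrderHistory
      GROUP BY CustomerID, ProductID) m
  ON m.CustomerID = oh.CustomerID
 AND m.ProductID  = oh.ProductID
 AND m.MaxDate    = oh.OrderDate
ORDER BY oh.CustomerID, oh.ProductID
""").fetchall()

print(newest)
```

Unlike ROW_NUMBER(), this form returns several rows for a customer and product when two orders share the same latest date.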

help with query in DB2

I would like your help with my query. I have a table employee.details with the following columns:
branch_name, firstname, lastname, age_float.
I want this query to list all the distinct values of the age_float attribute, one in each row of the result table, and beside each, in the second field, show the number of people in the details table whose age is less than or equal to that value.
Any ideas? Thank you!
You can use OLAP functions:
SELECT DISTINCT age_float,
COUNT(lastname) OVER(ORDER BY age_float) AS number
FROM employee_details
COUNT(lastname) OVER(ORDER BY age_float) AS number orders the rows by age and returns the number of employees whose age is <= the current row's age.
or a simple join:
SELECT A.age_float, count(lastname)
FROM (SELECT DISTINCT age_float FROM employee_details) A
JOIN employee_details AS ED ON ED.age_float <= A.age_float
GROUP BY A.age_float
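To check that both forms give the same counts, including for ages that occur more than once, here is a small sketch on SQLite through Python's sqlite3 module rather than DB2. The sample rows are invented, and the result column is named cnt instead of number purely as a precaution against keyword clashes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee_details (lastname TEXT, age_float REAL);
INSERT INTO employee_details VALUES
  ('A', 25.0), ('B', 30.5), ('C', 30.5), ('D', 41.0);
""")

# Window form: with only ORDER BY, the default frame runs from the start of
# the ordering through the current row and its peers (rows of equal age).
windowed = conn.execute("""
SELECT DISTINCT age_float,
       COUNT(lastname) OVER (ORDER BY age_float) AS cnt
FROM employee_details
ORDER BY age_float
""").fetchall()

# Join form: pair each distinct age with every row at or below it.
joined = conn.execute("""
SELECT a.age_float, COUNT(ed.lastname)
FROM (SELECT DISTINCT age_float FROM employee_details) a
JOIN employee_details ed ON ed.age_float <= a.age_float
GROUP BY a.age_float
ORDER BY a.age_float
""").fetchall()

print(windowed)
print(joined)
```

Both employees aged 30.5 are counted for that age in either form, so the duplicate does not cause a difference between the two queries.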