Fill null value by previous value and group by Postgresql - sql

I have a table and I want to fill each null value with the previous non-null value, ordered by date, but the rows are also split into groups.
For example:
Table X:
Date      Group  value
----------------------
1/1/2023  A      null
2/1/2023  A      Kevin
3/1/2023  A      null
4/1/2023  A      Tom
5/1/2023  A      null
6/1/2023  A      null
1/1/2023  B      Sara
2/1/2023  B      null
So I want to group by the Group column and fill the null values of the value column. There can be multiple groups, and the date is unique per group. I want the result to look like this:
Date      Group  value
----------------------
1/1/2023  A      null
2/1/2023  A      Kevin
3/1/2023  A      Kevin
4/1/2023  A      Tom
5/1/2023  A      Tom
6/1/2023  A      Tom
1/1/2023  B      Sara
2/1/2023  B      Sara
How can I do this in PostgreSQL? Please help me.
I have tried and I really don't know how to do it. I'm just a newbie too.

If you can have more than one NULL value consecutively, the LAG function won't help you much. A generalized solution would use:
the COUNT window function to generate partitions made of one non-null value and the consecutive null values that follow it,
the MAX window function to reassign the NULL values within each partition.
WITH cte AS (
    SELECT *,
           COUNT(CASE WHEN value_ IS NOT NULL THEN 1 END) OVER(
               PARTITION BY Group_
               ORDER BY Date_
           ) AS rn
    FROM tab
)
SELECT Date_, Group_, MAX(value_) OVER(PARTITION BY Group_, rn) AS value_
FROM cte
ORDER BY Group_, Date_
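The COUNT/MAX approach above can be tried out without a PostgreSQL server. The following is only a minimal sketch: it assumes SQLite (through Python's sqlite3 module, with SQLite 3.25 or newer for window functions) as a stand-in for PostgreSQL, and it assumes the sample dates are stored as ISO strings so that they sort correctly. The names tab, date_, group_ and value_ follow the answer; grp is just a renamed rn.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (date_ TEXT, group_ TEXT, value_ TEXT)")

# Sample data from the question; dates rewritten as ISO strings (assumption).
rows = [
    ("2023-01-01", "A", None),
    ("2023-01-02", "A", "Kevin"),
    ("2023-01-03", "A", None),
    ("2023-01-04", "A", "Tom"),
    ("2023-01-05", "A", None),
    ("2023-01-06", "A", None),
    ("2023-01-01", "B", "Sara"),
    ("2023-01-02", "B", None),
]
conn.executemany("INSERT INTO tab VALUES (?, ?, ?)", rows)

query = """
WITH cte AS (
    SELECT *,
           -- running count of non-null values: it increases only on rows
           -- that carry a value, so each value and the NULLs after it
           -- end up with the same number
           COUNT(CASE WHEN value_ IS NOT NULL THEN 1 END) OVER (
               PARTITION BY group_ ORDER BY date_
           ) AS grp
    FROM tab
)
SELECT date_, group_,
       -- each (group_, grp) bucket holds at most one non-null value
       MAX(value_) OVER (PARTITION BY group_, grp) AS value_
FROM cte
ORDER BY group_, date_
"""
filled = conn.execute(query).fetchall()

for row in filled:
    print(row)
```

The first row of group A stays NULL because no earlier value exists in that group, which matches the expected result in the question.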

If the input data is always in this form, we can use GREATEST and LAG:
SELECT
  xdate,
  xgroup,
  -- look back one row within the same group, in date order
  GREATEST(xvalue, LAG(xvalue) OVER (PARTITION BY xgroup ORDER BY xdate)) AS xvalue
FROM X
ORDER BY xgroup, xdate;
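GREATEST returns the largest of its arguments and, in PostgreSQL, ignores arguments that are NULL. LAG returns the value from the previous row. Note that this only looks back a single row, so in a run of consecutive NULLs only the first one gets filled, and if both rows hold a value the larger one wins.
If this is not sufficient for your scenario because the real input data is more complex, please edit your question to add the further cases that should be covered.
In this answer, the columns were renamed by adding an x prefix because the original names are SQL keywords and should be avoided if possible.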
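How far a one-row look-back gets on the sample data can be checked with a small sketch. Several things here are assumptions and not part of the answer: SQLite (via Python's sqlite3, version 3.25 or newer) stands in for PostgreSQL, COALESCE is swapped in for GREATEST because SQLite has no GREATEST function, the window uses an explicit PARTITION BY and ORDER BY so that "previous row" is well defined, and the dates are stored as ISO strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE x (xdate TEXT, xgroup TEXT, xvalue TEXT)")

# Sample data from the question; dates rewritten as ISO strings (assumption).
rows = [
    ("2023-01-01", "A", None),
    ("2023-01-02", "A", "Kevin"),
    ("2023-01-03", "A", None),
    ("2023-01-04", "A", "Tom"),
    ("2023-01-05", "A", None),
    ("2023-01-06", "A", None),
    ("2023-01-01", "B", "Sara"),
    ("2023-01-02", "B", None),
]
conn.executemany("INSERT INTO x VALUES (?, ?, ?)", rows)

query = """
SELECT xdate, xgroup,
       -- take the row's own value, else the raw value one row earlier
       COALESCE(xvalue,
                LAG(xvalue) OVER (PARTITION BY xgroup ORDER BY xdate)) AS xvalue
FROM x
ORDER BY xgroup, xdate
"""
one_step = conn.execute(query).fetchall()

for row in one_step:
    print(row)
```

The row for 2023-01-06 in group A stays NULL, because LAG sees the original (unfilled) value of the row before it. That is the case where a look-back of one row is not enough.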

Related

SQL Server find results within partition

I have the following table:
ID Date
-------------------
1 Null
1 1/2/2020
2 Null
2 12/2/2020
3 Null
For every ID which has at least one non-null date, I need to classify as 'accounted'.
Result set should look like below:
id Date AccountFlag
----------------------------
1 Null Accounted
1 1/2/2020 Accounted
2 Null Accounted
2 12/2/2020 Accounted
3 Null Unaccounted
You can use window functions to check if the same id has at least one non-null date, and a case expression to set the flag accordingly. Window aggregate functions come in handy for this:
select id, date,
    case when max(date) over(partition by id) is not null
        then 'Accounted'
        else 'Unaccounted'
    end as accountflag
from mytable
max() ignores null values, so it returns null if and only if all values in the partition are null. This would work just the same with min().
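As a quick check of this behaviour, here is a minimal sketch that runs the same query shape on the sample rows. It assumes SQLite (via Python's sqlite3, version 3.25 or newer) in place of SQL Server, and stores the dates as ISO strings; both are assumptions made only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(1, None), (1, "2020-01-02"), (2, None), (2, "2020-12-02"), (3, None)],
)

query = """
SELECT id, date,
       -- MAX over the partition is NULL only when every date for the id is NULL
       CASE WHEN MAX(date) OVER (PARTITION BY id) IS NOT NULL
            THEN 'Accounted'
            ELSE 'Unaccounted'
       END AS accountflag
FROM mytable
ORDER BY id, date
"""
flagged = conn.execute(query).fetchall()

for row in flagged:
    print(row)
```

Only id 3, which has no non-null date at all, ends up flagged as 'Unaccounted'.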

Multiple rows with null values - i want one row with not null

My query result below:
ID desc Year pid
0006845503 tes1 null null
0006845503 null 2017 null
0006845503 null null 90
0006845503 null null null
I want to show these results:
ID desc year pid
0006845503 TEST1 2017 90
If a value for a column appears only once (only one value is available), then a simple GROUP BY should do the task:
-- group only by id: grouping by year as well would keep the NULL and 2017 rows apart
SELECT t.id, MAX(t."desc") AS "desc", MAX(t.year) AS year, MAX(t.pid) AS pid
FROM YourTable t
GROUP BY t.id
If you know that only one row per column would have a value, you can use an aggregate function like min or max that would skip the nulls and return the only value in the column. E.g.:
-- "desc" is a reserved word, so it is quoted; only id is used for grouping
SELECT id, MAX("desc") AS "desc", MAX(year) AS year, MAX(pid) AS pid
FROM mytable
GROUP BY id
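The collapsing effect of MAX can be seen on the four sample rows. This is a minimal sketch that assumes SQLite (via Python's sqlite3) and groups by id only, since the sample shows a single id; the table name mytable is taken from the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "desc" is a reserved word, so it has to be quoted as a column name.
conn.execute('CREATE TABLE mytable (id TEXT, "desc" TEXT, year INTEGER, pid INTEGER)')
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?)",
    [
        ("0006845503", "tes1", None, None),
        ("0006845503", None, 2017, None),
        ("0006845503", None, None, 90),
        ("0006845503", None, None, None),
    ],
)

query = """
SELECT id,
       MAX("desc") AS "desc",  -- aggregates skip NULLs, so the single
       MAX(year)   AS year,    -- non-null value of each column survives
       MAX(pid)    AS pid
FROM mytable
GROUP BY id
"""
collapsed = conn.execute(query).fetchall()
print(collapsed)
```

If a column held two different non-null values for the same id, MAX would silently pick the larger one, so this only works under the "one value per column" condition stated above.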

Count of group for null is always 0 (zero)

In T-SQL, what is the recommended approach for grouping data containing nulls?
Example of the type of query:
Select [Group], Count([Group])
From [Data]
Group by [Group]
It appears that the count(*) and count(Group) both result in the null group displaying 0.
Example of the expected table data:
Id, Group
---------
1 , Alpha
2 , Null
3 , Beta
4 , Null
Example of the expected result:
Group, Count
---------
Alpha, 1
Beta, 1
Null, 0
This is the desired result, which can be obtained by count(Id). Is this the best way to get this result, and why do count(*) and count(Group) return an "incorrect" result?
Group, Count
---------
Alpha, 1
Beta, 1
Null, 2
edit: I don't remember why I thought count(*) did this, it may be the answer I'm looking for..
The best approach is to use count(*) which behaves exactly like count(1) or any other constant.
The * will ensure every row is counted.
Select [Group], Count(*)
From [Data]
Group by [Group]
The reason null shows 0 instead of 2 in this case is that Count([Group]) only counts the rows where [Group] is not NULL. In the NULL group every [Group] value is NULL, so nothing is counted and the result is 0.
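The difference between the count variants can be checked on the four sample rows. This is a minimal sketch assuming SQLite (via Python's sqlite3) in place of SQL Server; the names data and grp are chosen for this illustration, with grp replacing the reserved word Group.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (id INTEGER, grp TEXT)")
conn.executemany(
    "INSERT INTO data VALUES (?, ?)",
    [(1, "Alpha"), (2, None), (3, "Beta"), (4, None)],
)

query = """
SELECT grp,
       COUNT(*)   AS all_rows,      -- counts every row in the group
       COUNT(grp) AS non_null_grp,  -- counts only rows where grp IS NOT NULL
       COUNT(id)  AS non_null_id    -- id is never NULL here, so same as COUNT(*)
FROM data
GROUP BY grp
"""
# Map each group value to its three counts.
counts = {row[0]: row[1:] for row in conn.execute(query)}

for group, values in counts.items():
    print(group, values)
```

For the NULL group, COUNT(*) and COUNT(id) both give 2, while COUNT(grp) gives 0.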
Just do
SELECT [group], count([group])
FROM [Data]
GROUP BY [group]
Count(id) doesn't give the expected result mentioned in the question; it gives a value of 2 for the NULL group.
try this..
Select [Group], Count(isNull([Group],0))
From [Data]
Group by [Group]
COUNT(*) should work:
SELECT Grp,COUNT(*)
FROM tab
GROUP BY Grp
One more solution could be following:
SELECT Grp, COUNT(COALESCE(Grp, ' '))
FROM tab
GROUP BY Grp

SQL: How to get the AVG(MIN(number))?

I am looking for the AVERAGE (overall) of the MINIMUM number (grouped by person).
My table looks like this:
Rank Name
1 Amy
2 Amy
3 Amy
2 Bart
1 Charlie
2 David
5 David
1 Ed
2 Frank
4 Frank
5 Frank
I want to know the AVERAGE of the lowest scores. For these people, the lowest scores are:
Rank Name
1 Amy
2 Bart
1 Charlie
2 David
1 Ed
2 Frank
Giving me a final answer of 1.5 - because three people have a MIN(Rank) of 1 and the other three have a MIN(Rank) of 2. That's what I'm looking for - a single number.
My real data has a couple hundred rows, so it's not terribly big. But I can't figure out how to do this in a single, simple statement. Thank you for any help.
Try this:
;WITH MinScores
AS
(
    SELECT
        "Rank",
        Name,
        ROW_NUMBER() OVER(PARTITION BY Name ORDER BY "Rank") row_num
    FROM Table1
)
SELECT
    CAST(SUM("Rank") AS DECIMAL(10, 2)) /
    COUNT("Rank")
FROM MinScores
WHERE row_num = 1;
Selecting the set of minimum values is straightforward. The cast() is necessary to avoid integer division later. You could also avoid integer division by casting to float instead of decimal. (But you should be aware that floats are "useful approximations".)
select name, cast(min(rank) as decimal) as min_rank
from Table1
group by name
Now you can use the minimums as a common table expression, and select from it.
with minimums as (
select name, cast(min(rank) as decimal) as min_rank
from Table1
group by name
)
select avg(min_rank) avg_min_rank
from minimums
If you happen to need to do the same thing on a platform that doesn't support common table expressions, you can a) create a view of minimums, and select from that view, or b) use the minimums as a derived table.
You might try using a derived table to get the minimums, then get the average minimum in the outer query, as in:
-- Get the avg min rank as a decimal
select avg(MinRank * 1.0) as AvgRank
from (
    -- Get everyone's min rank
    select min([Rank]) as MinRank
    from MyTable
    group by Name
) as a
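The derived-table form can be checked against the sample data from the question. This is a minimal sketch assuming SQLite (via Python's sqlite3) in place of SQL Server; in SQLite, AVG always returns a floating-point value, so the explicit cast or the multiplication by 1.0 is not needed there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE Table1 ("Rank" INTEGER, Name TEXT)')
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?)",
    [
        (1, "Amy"), (2, "Amy"), (3, "Amy"),
        (2, "Bart"),
        (1, "Charlie"),
        (2, "David"), (5, "David"),
        (1, "Ed"),
        (2, "Frank"), (4, "Frank"), (5, "Frank"),
    ],
)

query = """
SELECT AVG(MinRank) AS AvgRank
FROM (
    -- inner query: one row per person with that person's lowest rank
    SELECT MIN("Rank") AS MinRank
    FROM Table1
    GROUP BY Name
) AS a
"""
avg_min_rank = conn.execute(query).fetchone()[0]
print(avg_min_rank)
```

The six minimums are 1, 2, 1, 2, 1 and 2, so the average is 9 / 6 = 1.5, as stated in the question.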
I think the easiest way is:
for max
select name, max_rank = max(rank)
from table
group by name;
for average
select name, avg_rank = avg(rank)
from table
group by name;

Unclear on LAST_VALUE - Preceding

I have a table that looks like this,
Date Value
01/01/2010 03:59:00 324.44
01/02/2010 09:31:00 NULL
01/02/2010 09:32:00 NULL
.
.
.
01/02/2010 11:42:00 NULL
I want the first valid value to appear in all following rows. This is what I did,
select date,
nvl(value, LAST_VALUE(value IGNORE NULLS) over (order by value RANGE BETWEEN 1 PRECEDING AND CURRENT ROW)) value
from
table
This shows no difference at all, but if I say RANGE BETWEEN 3 PRECEDING AND CURRENT ROW it copies the data to all the rows. I'm not clear why this is happening. Can anyone explain if I'm misunderstanding how to use preceding?
Analytic functions still work on sets of data. They do not process one row at a time; you would need PL/SQL or MODEL to do that. PRECEDING refers to the last X rows as they were before the analytic function was applied.
These problems can be confusing in SQL because you have to build the logic into defining the set, instead of trying to pass data from one row to another. That's why I used CASE with LAST_VALUE in my previous answer.
Edit:
I've added a simple data set so we can all run the exact same query. VALUE1 seems to work for me, am I missing something? Part of the problem with VALUE2 is that the analytic ORDER BY uses VALUE instead of the date.
select id, the_date, value
,last_value(value ignore nulls) over
(partition by id order by the_date) value1
,nvl(value, LAST_VALUE(value IGNORE NULLS) over
(order by value RANGE BETWEEN 1 PRECEDING AND CURRENT ROW)) value2
from
(
select 1 id, date '2011-01-01' the_date, 100 value from dual union all
select 1 id, date '2011-01-02' the_date, null value from dual union all
select 1 id, date '2011-01-03' the_date, null value from dual union all
select 1 id, date '2011-01-04' the_date, null value from dual union all
select 1 id, date '2011-01-05' the_date, 200 value from dual
)
order by the_date;
Results:
ID  THE_DATE  VALUE  VALUE1  VALUE2
1   1/1/2011  100    100     100
1   1/2/2011         100
1   1/3/2011         100
1   1/4/2011         100
1   1/5/2011  200    200     200
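To make the VALUE1 column easier to follow, here is a plain-Python sketch of what LAST_VALUE(value IGNORE NULLS) with PARTITION BY id ORDER BY the_date computes for each row: the most recent non-null value seen so far within the same id. The function name last_value_ignore_nulls is hypothetical, and this is only an illustration of the semantics, not how Oracle evaluates the window.

```python
def last_value_ignore_nulls(rows):
    """rows: iterable of (id, the_date, value) tuples.

    Returns (id, the_date, value, value1) tuples, where value1 is the
    latest non-null value up to and including the current row, per id.
    """
    last_seen = {}  # id -> most recent non-null value
    result = []
    # Sort by id, then date; ISO date strings sort chronologically.
    for id_, the_date, value in sorted(rows, key=lambda r: (r[0], r[1])):
        if value is not None:
            last_seen[id_] = value
        result.append((id_, the_date, value, last_seen.get(id_)))
    return result


# Same data set as in the query above, with ISO date strings.
data = [
    (1, "2011-01-01", 100),
    (1, "2011-01-02", None),
    (1, "2011-01-03", None),
    (1, "2011-01-04", None),
    (1, "2011-01-05", 200),
]
filled = last_value_ignore_nulls(data)

for row in filled:
    print(row)
```

The ordering key is the date, not the value, which is the difference between VALUE1 and VALUE2 in the query.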
It is possible to copy one row at a time; I have done that using Java logic and an SQL query:
import java.sql.*;

Statement sta;
ResultSet rs, rslast;
try {
    // Connection creation code is omitted; "con" is an existing java.sql.Connection.
    sta = con.createStatement();
    rs = sta.executeQuery("SELECT * FROM TABLENAME");
    // Note: running a second query on the same Statement closes rs.
    rslast = sta.executeQuery("SELECT * FROM TABLENAME WHERE ID = (SELECT MAX(ID) FROM TABLENAME)");
    if (rslast.next()) {
        String digit = rslast.getString("ID");
        System.out.print("ID" + digit); // prints the ID of the last record
    }
} catch (SQLException e) {
    e.printStackTrace();
}
Instead of this method, you can also use ORDER BY on the date in descending order.
From there, I hope you can write the logic that inserts only the last record.