SQL - How to keep the last row if it meets a condition, but remove other rows beforehand which meet the same condition

I'm coming across an SQL issue which I could do with some advice on.
I have an example below showing actions taken between different energy suppliers for dispute cases. The action_time_start is when supplier 1 sends an action, and action_time_end is when supplier 2 sends a response.
The row number is not included in the main table but I have added it in here for visibility.
dispute_id | supplier_1_action_sent | supplier_2_action_response | action_time_start | action_time_end | row_num
847294 | Proposal received (P) | Accept Proposal | 2023-01-23 | 2023-01-23 | 4
847294 | Agreement made (Y) | NULL | 2023-01-24 | NULL | 3
847294 | Agreement made (Y) | Close Dispute | 2023-01-25 | 2023-02-03 | 2
847294 | Proposal received (P) | NULL | 2023-02-03 | NULL | 1
I need to:
Include rows 2 and 4 in the results.
Include row 1 in the results (the last row), where action_time_end is null.
Remove row 3 from the results, where action_time_end is null.
For the table overall, I need to remove any rows where action_time_end is null, except when it is the last row for each dispute_id. I also need to keep all rows where action_time_end is not null.
If the last row has a non-null action_time_end, it needs to be kept, and all earlier rows where action_time_end is null removed.
Any suggestions here?
I have tried a number of different solutions, including:
Using MAX(COALESCE(TO_DATE(action_time_end), DATE '9999-01-01')) and filtering out instances where action_time_start < action_time_end and action_time_end != '9999-01-01'.
Including row_num and filtering where row_num = 1 and action_time_end is not null
Doing a complex CASE WHEN in the last where clause of the query
The issue is that I'm not sure how to keep in the last row but remove all the others when a certain condition is met.

We can use CASE expressions here:
SELECT
    CASE WHEN action_time_end IS NULL THEN dispute_id END AS dispute_id,
    supplier_1_action_sent,
    CASE WHEN action_time_end IS NOT NULL THEN supplier_2_action_response END AS supplier_2_action_response,
    action_time_start
FROM yourTable;
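Note that a CASE expression in the SELECT list only blanks out values; it doesn't drop rows. If the goal is to filter rows, one possible sketch is to rank the rows per dispute_id and keep those that either have an action_time_end or are the latest one. It is run here against an in-memory SQLite database from Python purely so that it can be executed as-is (window functions need SQLite 3.25 or later). The table name dispute_actions, and the assumption that "last row" means the latest action_time_start per dispute_id, are mine and not from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE dispute_actions (  -- hypothetical table name
        dispute_id INTEGER,
        supplier_1_action_sent TEXT,
        supplier_2_action_response TEXT,
        action_time_start TEXT,
        action_time_end TEXT
    )
    """
)
# Sample rows taken from the question.
conn.executemany(
    "INSERT INTO dispute_actions VALUES (?, ?, ?, ?, ?)",
    [
        (847294, "Proposal received (P)", "Accept Proposal", "2023-01-23", "2023-01-23"),
        (847294, "Agreement made (Y)", None, "2023-01-24", None),
        (847294, "Agreement made (Y)", "Close Dispute", "2023-01-25", "2023-02-03"),
        (847294, "Proposal received (P)", None, "2023-02-03", None),
    ],
)

query = """
WITH ranked AS (
    SELECT d.*,
           -- Assumption: the "last" row is the one with the latest action_time_start.
           ROW_NUMBER() OVER (
               PARTITION BY dispute_id
               ORDER BY action_time_start DESC
           ) AS row_num
    FROM dispute_actions AS d
)
SELECT dispute_id, supplier_1_action_sent, action_time_start, action_time_end, row_num
FROM ranked
WHERE action_time_end IS NOT NULL   -- keep every answered action
   OR row_num = 1                   -- plus the last row, answered or not
ORDER BY dispute_id, action_time_start
"""

kept = conn.execute(query).fetchall()
for row in kept:
    print(row)
```

With the sample data this keeps rows 4, 2 and 1 and drops row 3. The same WHERE condition should carry over to other engines that support ROW_NUMBER(), though date handling may differ.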

Related

Trying to understand simple SQL query with case statement

I am trying to understand this query:
SELECT *
FROM servers
ORDER BY
    CASE
        WHEN status = "ACTIVE" THEN 1
        WHEN status = "INACTIVE" THEN 2
        ELSE 3
    END
I know this selects all rows from the servers table and orders them so that rows where status = "ACTIVE" come first, followed by rows where status = "INACTIVE".
What does the syntax THEN 1...THEN 2 ELSE 3 END mean? I know END closes the case statement, but what are 1, 2, and 3?
Your CASE clause is in the ORDER BY section - it doesn't become part of the output, it's just used by the SQL engine for sorting.
The 1,2,3 are sortable values.
Basically it's saying to put the ACTIVE rows first (1), then the INACTIVE rows (2), then any rows that are neither (3) at the end.
Given that ACTIVE and INACTIVE already sort in that order alphabetically, I guess there are other values in the table that would not sort where you want them (maybe CLOSED or DORMANT, which would come before INACTIVE).
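To see the difference, here is a small sketch using an in-memory SQLite database from Python. The server names and the extra CLOSED and DORMANT statuses are made up for illustration, and the secondary sort on status is my addition so that the order of the "everything else" group is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE servers (name TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO servers VALUES (?, ?)",
    [
        ("web1", "INACTIVE"),
        ("web2", "CLOSED"),
        ("web3", "ACTIVE"),
        ("web4", "DORMANT"),
    ],
)

# Plain alphabetical ordering puts CLOSED and DORMANT ahead of INACTIVE.
alphabetical = [
    status for (status,) in conn.execute("SELECT status FROM servers ORDER BY status")
]

# The CASE expression maps each status to a sort key: 1, 2, or 3.
custom = [
    status
    for (status,) in conn.execute(
        """
        SELECT status
        FROM servers
        ORDER BY
            CASE
                WHEN status = 'ACTIVE' THEN 1
                WHEN status = 'INACTIVE' THEN 2
                ELSE 3
            END,
            status  -- tie-breaker within the same key (not in the original query)
        """
    )
]

print(alphabetical)
print(custom)
```

The 1, 2, 3 never appear in the output; they only decide which group of rows comes first.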

Ignore a column in MS ACCESS query if the WHERE clause is not set

Problem
I've got a dropdown list which shows all the Article_Group_IDs that are linked to a specific brand, using the following query:
SELECT TbArticle.Article_Group_ID, TbArticle.Article_Brand_ID
FROM TbArticle
GROUP BY TbArticle.Article_Group_ID, TbArticle.Article_Brand_ID
HAVING (((TbArticle.Article_Brand_ID)=1))
This works as expected, it returns the following:
Query results
Article_Brand_ID | Article_Group_ID
1 | 1
1 | 2
But, if a user does not wish to specify a specific Article_Brand_ID, the query results look like this:
Query results
Article_Brand_ID | Article_Group_ID
1 | 1
2 | 1
3 | 1
1 | 2
As you can see, the same Article_Group_ID is returned three times. Because of this, the user now sees the same group three times, instead of just once. If I were to remove the Article_Brand_ID from the query, the results would look like this:
Article_Group_ID
1
2
Is there any way to achieve the same behavior by "ignoring" the Article_Brand_ID column if its WHERE clause is not set?
Database layout
TbArticle
Article_Brand_ID | Article_Group_ID
1 | 1
2 | 1
3 | 1
1 | 2
A single query cannot return a variable number of columns. So, strictly speaking you cannot do what you want with a single query. However, if you are willing to accept the second column as NULL when the brand is not provided, then you can adjust the aggregation.
Let me denote the parameter by ?:
SELECT a.Article_Group_ID,
       IIF(? IS NOT NULL, a.Article_Brand_ID, NULL) as Article_Brand_ID
FROM TbArticle as a
WHERE a.Article_Brand_ID = ? OR
      ? IS NULL
GROUP BY a.Article_Group_ID,
         IIF(? IS NOT NULL, a.Article_Brand_ID, NULL);
Note: It is usually better to filter before aggregating (i.e. using WHERE) rather than filtering afterwards (i.e. using HAVING).
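As a rough check of the optional-parameter pattern, here is a sketch that runs the same idea against an in-memory SQLite database from Python, using the TbArticle rows from the question. SQLite stands in for Access only so the example is runnable; I have written the IIF as a portable CASE expression and used a named parameter (:brand) in place of ?, and the helper name groups_for is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TbArticle (Article_Brand_ID INTEGER, Article_Group_ID INTEGER)"
)
# Rows from the "Database layout" in the question: (brand, group).
conn.executemany(
    "INSERT INTO TbArticle VALUES (?, ?)",
    [(1, 1), (2, 1), (3, 1), (1, 2)],
)

QUERY = """
SELECT a.Article_Group_ID,
       CASE WHEN :brand IS NOT NULL THEN a.Article_Brand_ID END AS Article_Brand_ID
FROM TbArticle AS a
WHERE a.Article_Brand_ID = :brand
   OR :brand IS NULL
GROUP BY a.Article_Group_ID,
         CASE WHEN :brand IS NOT NULL THEN a.Article_Brand_ID END
ORDER BY a.Article_Group_ID
"""


def groups_for(brand):
    """Return (group, brand) rows; pass None when no brand is selected."""
    return conn.execute(QUERY, {"brand": brand}).fetchall()


with_brand = groups_for(1)        # brand chosen: groups for that brand only
without_brand = groups_for(None)  # no brand: each group once, brand shown as NULL

print(with_brand)
print(without_brand)
```

When no brand is supplied, every row collapses to one line per Article_Group_ID because the second grouping expression is NULL for all of them.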

SQL - concatenate values in columns but keep only first non-null value

I have the following table in Postgresql. The first 4 records are the base data and the others were generated with the ROLLUP function.
I want to add a column "grp1" that displays the first non-null value of the columns grp1_l1, grp1_l2 and grp1_l3.
I can get the desired result by nesting 3 "case" expressions using the SQL below, but my real table has 4 groups with 8 to 10 columns each (so a lot of nested "case" expressions).
sql:
SELECT grp1_l1, grp1_l2, grp1_l3,
       case when grp1_l1 is not null then grp1_l1
            else case when grp1_l2 is not null then grp1_l2
                 else case when grp1_l3 is not null then grp1_l3
                      else null end end end as grp1,
       value
FROM public.query_test;
Is there a better and more scalable way to handle this requirement? Any suggestions are welcome.
The id will not always have 3 digits, that is just the case in my example here
Use coalesce(). It's defined as "returns the first of its arguments that is not null", which is exactly what you want.
coalesce(grp1_l1, grp1_l2, grp1_l3)
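A quick way to convince yourself that coalesce() matches the CASE version is to run both side by side. The sketch below does that with an in-memory SQLite database from Python rather than PostgreSQL, just so it is self-contained; the sample rows are invented, since the original table was not included in the post.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE query_test (grp1_l1 TEXT, grp1_l2 TEXT, grp1_l3 TEXT, value INTEGER)"
)
# Made-up rows covering each position of the first non-null column.
conn.executemany(
    "INSERT INTO query_test VALUES (?, ?, ?, ?)",
    [
        ("a", "b", "c", 1),
        (None, "b", "c", 2),
        (None, None, "c", 3),
        (None, None, None, 4),
    ],
)

rows = conn.execute(
    """
    SELECT COALESCE(grp1_l1, grp1_l2, grp1_l3) AS grp1,
           CASE WHEN grp1_l1 IS NOT NULL THEN grp1_l1
                WHEN grp1_l2 IS NOT NULL THEN grp1_l2
                WHEN grp1_l3 IS NOT NULL THEN grp1_l3
           END AS grp1_case,
           value
    FROM query_test
    ORDER BY value
    """
).fetchall()

for grp1, grp1_case, value in rows:
    print(value, grp1, grp1_case)
```

Both columns agree on every row, and adding more levels is just a matter of appending arguments to COALESCE.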

Updating columns based on a combine rows value on the same table

Please assist if possible. I have used STUFF to combine rows into a single row based on other columns. However, I want to turn each of the unique items into its own column with a number showing whether it exists (1 or 0), and then do the same for all subsequent rows.
I have been able to create the columns, but I can't get them to update based on what's in the one column.
I also want it to be dynamic, so that no matter how many different names appear in categories, it creates a new column and adds 1 or 0 depending on whether the name appears.
How about something like this for SQL Server?
strSQL = "SELECT Category, CASE WHEN Category IS NOT NULL THEN 1 ELSE 0 END AS IsCategoryExist FROM MyTable"
Sample data (the 2nd column shows as 1 if the first column is non-blank):
Cars, 1
[Blank], 0
Airplanes, 1
Radios, 1
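The sample output above can be reproduced with a small sketch like this one, run against an in-memory SQLite database from Python instead of SQL Server so it is self-contained. It assumes that "[Blank]" means NULL (an empty string would count as non-null and get a 1), and the id column is my addition to keep the rows in insertion order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# id is not in the original; it only preserves the order of the sample rows.
conn.execute("CREATE TABLE MyTable (id INTEGER PRIMARY KEY, Category TEXT)")
conn.executemany(
    "INSERT INTO MyTable (Category) VALUES (?)",
    [("Cars",), (None,), ("Airplanes",), ("Radios",)],
)

flags = conn.execute(
    """
    SELECT Category,
           CASE WHEN Category IS NOT NULL THEN 1 ELSE 0 END AS IsCategoryExist
    FROM MyTable
    ORDER BY id
    """
).fetchall()

for category, exists in flags:
    print(category, exists)
```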

Return 0 in Sheets Query if there is no data

I need some advice in google query language.
I want to count rows depending on date and a condition. But if the condition is not met, it should return 0.
What I'm trying to achieve:
Date Starts
05.09.2018 0
06.09.2018 3
07.09.2018 0
What I get:
Date Starts
06.09.2018 3
The query looks like =Query(Test!$A2:P; "select P, count(B) where (B contains 'starts') group by P label count(B) 'Starts'")
P contains ascending datevalues and B an event (like start in this case).
How can I force output a 0 for the dates with no entry containing "start"?
The main point is to get all the needed data in one table in ascending order. But this only works if every day has an entry. If there is no entry for a day, the results for "start" do not match the date value in column A; the 3 in column D would then end up in the first row of the table.
I need it like this:
A B C D
Date Logins Sessions Starts
05.09.2018 1 2 0
06.09.2018 3 4 3
07.09.2018 4 5 0
Maybe this is easy to fix, but I don't see it.
Thanks in advance!
You can do some pre-processing before the query. For example, check whether column B contains 'start' with regexmatch and use a double unary (--) to force the boolean values into 1's and 0's. Then use query to sum.
=Query(Arrayformula({--regexmatch(Test!$B2:B; "start")\ Test!$A2:P}); "select Col17, sum(Col1) where Col17 is not null group by Col17 label sum(Col1) 'Starts'")
Change ranges to suit.
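The reason this returns a 0 for dates without a match is that no row is filtered out any more: every row contributes either 1 or 0 to the sum for its date. Here is the same logic as a plain Python sketch, outside of Sheets. The sample rows are hypothetical stand-ins for columns P (date) and B (event), and they are assumed to be in ascending date order already.

```python
from collections import defaultdict

# Hypothetical (date, event) pairs standing in for columns P and B.
rows = [
    ("05.09.2018", "login"),
    ("06.09.2018", "starts"),
    ("06.09.2018", "starts"),
    ("06.09.2018", "session"),
    ("06.09.2018", "starts"),
    ("07.09.2018", "login"),
]

starts_per_date = defaultdict(int)
for date, event in rows:
    # Counterpart of --regexmatch(B; "start"): 1 if the event contains "start", else 0.
    starts_per_date[date] += int("start" in event)

# Dicts keep insertion order, so the dates stay in the order they appeared.
result = list(starts_per_date.items())

for date, starts in result:
    print(date, starts)
```

Filtering with "where B contains 'starts'" drops the 05.09 and 07.09 rows before grouping, which is why those dates disappeared from the original query's output.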