What SQL query should I run to get the expected result set?

What SQL query should I run to get the expected result set below, given the first element of the chain (2) as input, or any other element of it?
table name: changes
+----+---------------+---------------+
| id | new_record_id | old_record_id |
+----+---------------+---------------+
|  1 |             4 |             2 |
|      -- non relevant data --       |
|  6 |             7 |             4 |
|      -- non relevant data --       |
| 11 |            13 |             7 |
| 12 |            14 |            13 |
|      -- non relevant data --       |
| 31 |            20 |            14 |
+----+---------------+---------------+
Result set expected:
+--+
| 2|
| 4|
| 7|
|13|
|14|
|20|
+--+
I know I should consider changing my data model, but what if I can't?
Thank you in advance!

The following T-SQL should work to get your result:
CREATE TABLE #return (val int)

DECLARE @VAL int
SET @VAL = 2   -- the starting element of the chain

-- Walk the chain while the current value has a successor
WHILE (SELECT COUNT(*) FROM [changes] WHERE old_record_id = @VAL) > 0
BEGIN
    INSERT INTO #return VALUES (@VAL)
    SELECT @VAL = new_record_id FROM [changes] WHERE old_record_id = @VAL
END

-- Insert the final element of the chain
INSERT INTO #return VALUES (@VAL)

SELECT * FROM #return

Since you haven't specified the database, here is some good literature:
MySQL
Look around section 7; it covers hierarchies and recursive functions:
http://www.artfulsoftware.com/mysqlbook/sampler/mysqled1ch20.html
MSSQL
A good article on the same topic (you'll find functions are the common approach, though MSSQL can use a recursive WITH):
http://www.sqlservercurry.com/2009/06/simple-family-tree-query-using.html
PostgreSQL
The same type of article. All of these share the same premise: working your way up a tree from child to parent.
http://www.paragoncorporation.com/ArticleDetail.aspx?ArticleID=24
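If your database supports recursive CTEs (SQL Server 2005+, PostgreSQL 8.4+, MySQL 8.0+), the whole chain can be walked in one query. A minimal sketch, assuming the changes table from the question and 2 as the starting element (SQL Server omits the RECURSIVE keyword):
WITH RECURSIVE chain (record_id) AS (
    SELECT 2                   -- the starting element of the chain
    UNION ALL
    SELECT c.new_record_id     -- follow each link to its successor
    FROM changes c
    JOIN chain ON c.old_record_id = chain.record_id
)
SELECT record_id FROM chain;
For the sample data this returns 2, 4, 7, 13, 14, 20.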

If you have data in the middle that you do not want to update, then the only solution would be a series of explicit updates:
update table set id=2 where id=4;
update table set id=4 where id=7;
update table set id=7 where id=13;
update table set id=13 where id=14;
update table set id=14 where id=20;
This single update might also work, provided the updates are applied in the order the rows appear, which generally happens when id is the table's primary key:
update table set id=(select min(id) from table b where b.id>table.id)
You can also try to force the order by adding an ORDER BY id at the end, if your database allows that in an UPDATE.
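If you go the explicit route, it is worth wrapping the statements in a transaction so a failure partway through does not leave the chain half-shifted. A minimal sketch, reusing the updates above (syntax may vary slightly by database):
BEGIN;
UPDATE table SET id = 2  WHERE id = 4;
UPDATE table SET id = 4  WHERE id = 7;
UPDATE table SET id = 7  WHERE id = 13;
UPDATE table SET id = 13 WHERE id = 14;
UPDATE table SET id = 14 WHERE id = 20;
COMMIT;  -- or ROLLBACK if any step failed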

Related

Updating a different unique value for each group

I have a table where every row that has the same classification_id and application_id has the same group_id.
id |classification_id |application_id |authorisation_id |group_id |
------------------------------------+------------------------------------+------------------------------------+------------------------------------+------------------------------------+
54f614f3-7582-4ae9-a07e-5ff6d29e7a3b|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|25a7e1f7-4d8c-4e12-a10f-3654d7ef5ee9|8e563f95-ff0c-41e7-b211-d5ac6f78d056|
a01571a1-4f04-4ff9-9a7b-3a720736b9ec|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|302b23f1-ce57-4219-bcae-7bdbc3b86cb4|8e563f95-ff0c-41e7-b211-d5ac6f78d056|
3e18f2d0-4d5f-41b3-baf5-ba0feac8f43e|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|5e3bce60-b0d8-436c-9d33-b3a1d4c9a308|8e563f95-ff0c-41e7-b211-d5ac6f78d056|
b2ebe2ee-ffed-4e32-8abe-cd8b7d400646|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|a4edd12d-c19e-4e0d-badd-d3cf5e6d6d82|8e563f95-ff0c-41e7-b211-d5ac6f78d056|
ef01e6f7-f6ad-4d4d-b129-9c756734bef5|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|5e3bce60-b0d8-436c-9d33-b3a1d4c9a308|8e563f95-ff0c-41e7-b211-d5ac6f78d056|
7d340811-b679-49fd-bdd6-32a1bb9bbfed|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|25a7e1f7-4d8c-4e12-a10f-3654d7ef5ee9|8e563f95-ff0c-41e7-b211-d5ac6f78d056|
c45d7bb6-2146-48d0-a804-929cc42484cd|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|a4edd12d-c19e-4e0d-badd-d3cf5e6d6d82|8e563f95-ff0c-41e7-b211-d5ac6f78d056|
ddec5929-a08f-4f48-97f8-ccc2b85531ac|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|302b23f1-ce57-4219-bcae-7bdbc3b86cb4|8e563f95-ff0c-41e7-b211-d5ac6f78d056|
ae9edbb2-def3-4c4e-9a27-72454a09e146|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|a4edd12d-c19e-4e0d-badd-d3cf5e6d6d82|8e563f95-ff0c-41e7-b211-d5ac6f78d056|
3a3fd904-1988-4f8c-bf27-8cdf349b8431|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|25a7e1f7-4d8c-4e12-a10f-3654d7ef5ee9|8e563f95-ff0c-41e7-b211-d5ac6f78d056|
27c669b9-763c-49cf-887a-b9b1f85dc1ab|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|302b23f1-ce57-4219-bcae-7bdbc3b86cb4|8e563f95-ff0c-41e7-b211-d5ac6f78d056|
03820732-32c4-4cd4-910b-4e27fdd44bdf|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|5e3bce60-b0d8-436c-9d33-b3a1d4c9a308|8e563f95-ff0c-41e7-b211-d5ac6f78d056|
I've managed to sort out subgroups of this group by authorisation_id, and I've created a group_helper column which basically shows my end goal. From this data set I want to get three different groups:
id |classification_id |application_id |authorisation_id |group_id |group_helper|
------------------------------------+------------------------------------+------------------------------------+------------------------------------+------------------------------------+------------+
54f614f3-7582-4ae9-a07e-5ff6d29e7a3b|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|25a7e1f7-4d8c-4e12-a10f-3654d7ef5ee9|8e563f95-ff0c-41e7-b211-d5ac6f78d056| 2|
a01571a1-4f04-4ff9-9a7b-3a720736b9ec|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|302b23f1-ce57-4219-bcae-7bdbc3b86cb4|8e563f95-ff0c-41e7-b211-d5ac6f78d056| 2|
3e18f2d0-4d5f-41b3-baf5-ba0feac8f43e|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|5e3bce60-b0d8-436c-9d33-b3a1d4c9a308|8e563f95-ff0c-41e7-b211-d5ac6f78d056| 2|
b2ebe2ee-ffed-4e32-8abe-cd8b7d400646|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|a4edd12d-c19e-4e0d-badd-d3cf5e6d6d82|8e563f95-ff0c-41e7-b211-d5ac6f78d056| 2|
ef01e6f7-f6ad-4d4d-b129-9c756734bef5|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|5e3bce60-b0d8-436c-9d33-b3a1d4c9a308|8e563f95-ff0c-41e7-b211-d5ac6f78d056| 3|
7d340811-b679-49fd-bdd6-32a1bb9bbfed|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|25a7e1f7-4d8c-4e12-a10f-3654d7ef5ee9|8e563f95-ff0c-41e7-b211-d5ac6f78d056| 3|
c45d7bb6-2146-48d0-a804-929cc42484cd|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|a4edd12d-c19e-4e0d-badd-d3cf5e6d6d82|8e563f95-ff0c-41e7-b211-d5ac6f78d056| 3|
ddec5929-a08f-4f48-97f8-ccc2b85531ac|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|302b23f1-ce57-4219-bcae-7bdbc3b86cb4|8e563f95-ff0c-41e7-b211-d5ac6f78d056| 3|
ae9edbb2-def3-4c4e-9a27-72454a09e146|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|a4edd12d-c19e-4e0d-badd-d3cf5e6d6d82|8e563f95-ff0c-41e7-b211-d5ac6f78d056| |
3a3fd904-1988-4f8c-bf27-8cdf349b8431|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|25a7e1f7-4d8c-4e12-a10f-3654d7ef5ee9|8e563f95-ff0c-41e7-b211-d5ac6f78d056| |
27c669b9-763c-49cf-887a-b9b1f85dc1ab|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|302b23f1-ce57-4219-bcae-7bdbc3b86cb4|8e563f95-ff0c-41e7-b211-d5ac6f78d056| |
03820732-32c4-4cd4-910b-4e27fdd44bdf|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|5e3bce60-b0d8-436c-9d33-b3a1d4c9a308|8e563f95-ff0c-41e7-b211-d5ac6f78d056| |
Now, I want each of those groups to have a different group_id. I don't have to update the rows where group_helper is NULL, since their group_id is already unique. I want to give every row with group_helper = 2 the same UUID (different from the NULL group's), every row with group_helper = 3 another UUID (different from both), and so on. This has to work for any number of group_helper values, because there can be many more than two.
So my end goal would look like this:
id |classification_id |application_id |authorisation_id |group_id |group_helper|
------------------------------------+------------------------------------+------------------------------------+------------------------------------+------------------------------------+------------+
54f614f3-7582-4ae9-a07e-5ff6d29e7a3b|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|25a7e1f7-4d8c-4e12-a10f-3654d7ef5ee9|fd3e63d1-d59c-477f-b58b-3ae3726c7992| 2|
a01571a1-4f04-4ff9-9a7b-3a720736b9ec|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|302b23f1-ce57-4219-bcae-7bdbc3b86cb4|fd3e63d1-d59c-477f-b58b-3ae3726c7992| 2|
3e18f2d0-4d5f-41b3-baf5-ba0feac8f43e|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|5e3bce60-b0d8-436c-9d33-b3a1d4c9a308|fd3e63d1-d59c-477f-b58b-3ae3726c7992| 2|
b2ebe2ee-ffed-4e32-8abe-cd8b7d400646|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|a4edd12d-c19e-4e0d-badd-d3cf5e6d6d82|fd3e63d1-d59c-477f-b58b-3ae3726c7992| 2|
ef01e6f7-f6ad-4d4d-b129-9c756734bef5|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|5e3bce60-b0d8-436c-9d33-b3a1d4c9a308|ed3ff96c-2f93-4182-8e4f-4594cb20cbb6| 3|
7d340811-b679-49fd-bdd6-32a1bb9bbfed|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|25a7e1f7-4d8c-4e12-a10f-3654d7ef5ee9|ed3ff96c-2f93-4182-8e4f-4594cb20cbb6| 3|
c45d7bb6-2146-48d0-a804-929cc42484cd|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|a4edd12d-c19e-4e0d-badd-d3cf5e6d6d82|ed3ff96c-2f93-4182-8e4f-4594cb20cbb6| 3|
ddec5929-a08f-4f48-97f8-ccc2b85531ac|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|302b23f1-ce57-4219-bcae-7bdbc3b86cb4|ed3ff96c-2f93-4182-8e4f-4594cb20cbb6| 3|
ae9edbb2-def3-4c4e-9a27-72454a09e146|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|a4edd12d-c19e-4e0d-badd-d3cf5e6d6d82|8e563f95-ff0c-41e7-b211-d5ac6f78d056| |
3a3fd904-1988-4f8c-bf27-8cdf349b8431|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|25a7e1f7-4d8c-4e12-a10f-3654d7ef5ee9|8e563f95-ff0c-41e7-b211-d5ac6f78d056| |
27c669b9-763c-49cf-887a-b9b1f85dc1ab|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|302b23f1-ce57-4219-bcae-7bdbc3b86cb4|8e563f95-ff0c-41e7-b211-d5ac6f78d056| |
03820732-32c4-4cd4-910b-4e27fdd44bdf|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|5e3bce60-b0d8-436c-9d33-b3a1d4c9a308|8e563f95-ff0c-41e7-b211-d5ac6f78d056| |
You can create a CTE which generates a new group_id, selecting a single value for each distinct group_helper, then use UPDATE ... FROM ...:
with grouper(helper, gid) as
  (select distinct on (group_helper)
          group_helper,
          gen_random_uuid()
     from sometable
    where group_helper is not null
    order by group_helper
  ) -- select * from grouper
update sometable
   set group_id = gid
  from grouper
 where helper = group_helper;
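As a sanity check after the update (this query is my addition, not part of the answer above), every non-null group_helper value should now map to exactly one distinct group_id:
select group_helper, count(distinct group_id) as distinct_ids
from sometable
where group_helper is not null
group by group_helper;
-- expect distinct_ids = 1 on every row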

Update/merge single value in group by where all columns have identical values in each group

I have a data quality task where I need to allow one row in each group to remain unchanged, and then update the 'duplicate' column in the Delta table for the rest of the rows in the group to 'true'. Initially, when the data is loaded into the table, all values in the 'duplicate' column are 'false'.
The data is stored in Delta format and I am using Spark in Databricks.
An example is shown below. I want to run a query that updates the 'duplicate' column to 'true' for all but one row in each group, leaving a single row as 'false', so a downstream pipeline can still pick up one of the values where we have duplicates.
The table starts out like this:
ID | Value1 | Value2 | Duplicate
---+--------+--------+----------
23 | a      | b      | false
23 | a      | b      | false
24 | c      | d      | false
25 | e      | f      | false
26 | g      | h      | false
26 | g      | h      | false
and I need to end up with this:
ID | Value1 | Value2 | Duplicate
---+--------+--------+----------
23 | a      | b      | false
23 | a      | b      | true
24 | c      | d      | false
25 | e      | f      | false
26 | g      | h      | false
26 | g      | h      | true
I could of course simply say spark.table("myTable").dropDuplicates(), but I would like to assess how big the duplicate problem is: an external supplier provides the data, and we need to understand whether excessive retries on their system are pushing up costs.
I am struggling to find a way to change all but one entry when I do not have any unique identifier to use. I have tried many different approaches, but have failed. I have included what I would like to achieve below (this will clearly fail).
WITH CTE AS (
    SELECT
        ID,
        Duplicate,
        row_number() OVER (PARTITION BY ID) AS rn
    FROM myTable
)
UPDATE myTable SET Duplicate = 'true' WHERE (SELECT rn FROM CTE) > 1
I am unsure if this is even possible without some unique identifier. If possible, I would like to avoid using non-deterministic hashes just to accomplish this, as the dataset is very large.
I can use any of Scala, PySpark or SQL within Spark so language isn't vital.
Any pointers would be greatly appreciated.
For SQL, something like this should do the trick.
WITH duplicates AS (
    SELECT
        ID,
        Duplicate,
        ROW_NUMBER() OVER (PARTITION BY ID ORDER BY ID) AS rn
    FROM myTable
)
UPDATE duplicates
SET Duplicate = 'true'   -- mark every row after the first in each group
WHERE rn > 1
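Note that Spark SQL does not support an UPDATE through a CTE. On Databricks with Delta, one alternative (a sketch of my own, assuming the table is a Delta table named myTable) is to recompute the flag and overwrite the table in place; Delta's snapshot isolation lets the SELECT read the data as it was before the overwrite:
INSERT OVERWRITE myTable
SELECT
    ID,
    Value1,
    Value2,
    -- the first row per ID keeps 'false', the rest become 'true'
    CASE WHEN ROW_NUMBER() OVER (PARTITION BY ID ORDER BY ID) > 1
         THEN 'true' ELSE 'false'
    END AS Duplicate
FROM myTable;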
In PySpark you can try it with row_number as below:
from pyspark.sql.functions import col, row_number, when
from pyspark.sql.window import Window

w = Window.partitionBy("ID").orderBy("ID")
df1.withColumn("rank", row_number().over(w)) \
   .withColumn("duplicate", when(col("rank") > 1, True).otherwise(False)) \
   .drop("rank") \
   .show()
#output
+---+------+------+---------+
| Id|value1|value2|duplicate|
+---+------+------+---------+
| 23| a| b| false|
| 23| a| b| true|
| 24| c| d| false|
| 25| e| f| false|
| 26| g| h| false|
| 26| g| h| true|
+---+------+------+---------+

Spark: how to apply a loop function to dataframes

I have two dataframes, shown below. I'm trying to look up rows in the second dataframe by foreign key and then generate a new dataframe. I was thinking of doing spark.sql("""select history.value as previous_year_1 from df1, history where df1.key = history.key and history.date = add_months($currentdate, -1*12)"""), but then I would need to repeat it, say, 10 times for 10 previous years and join the results back together. How can I create a function for this? Many thanks. Quite new here.
dataframe one:
+---+---+-----------+
|key|val| date |
+---+---+-----------+
| 1|100| 2018-04-16|
| 2|200| 2018-04-16|
+---+---+-----------+
dataframe two : historical data
+---+---+-----------+
|key|val| date |
+---+---+-----------+
| 1|10 | 2017-04-16|
| 1|20 | 2016-04-16|
+---+---+-----------+
The result I want to generate is
+---+----------+-----------------+-----------------+
|key|date | previous_year_1 | previous_year_2 |
+---+----------+-----------------+-----------------+
| 1|2018-04-16| 10 | 20 |
| 2|null | null | null |
+---+----------+-----------------+-----------------+
To solve this, the following approach can be applied:
1) Join the two dataframes by key.
2) Filter out all the rows where previous dates are not exactly years before reference dates.
3) Calculate the years difference for the row and put the value in a dedicated column.
4) Pivot the DataFrame around the column calculated in the previous step and aggregate on the value of the respective year.
import org.apache.spark.sql.Column
import org.apache.spark.sql.functions._
import spark.implicits._ // for the $"column" syntax

private def generateWhereForPreviousYears(nbYears: Int): Column =
  (-1 to -nbYears by -1) // loop over each backwards year value
    .map(yearsBack =>
      /*
       * Each years-back count is turned into an expression for the
       * WHERE clause. This is equivalent to
       * "history.date = add_months($currentdate, -1*12)"
       * in the comment in the question.
       */
      add_months($"df1.date", 12 * yearsBack) === $"df2.date"
    )
    /*
     * The .map call above produces a sequence of Column expressions;
     * we need to concatenate them with "or" to obtain a single Spark
     * Column reference. The .reduce() function is most appropriate here.
     */
    .reduce(_ or _) or $"df2.date".isNull // the last "or" keeps empty lines in the result

val nbYearsBack = 3
val result = sourceDf1.as("df1")
  .join(sourceDf2.as("df2"), $"df1.key" === $"df2.key", "left")
  .where(generateWhereForPreviousYears(nbYearsBack))
  .withColumn("diff_years", concat(lit("previous_year_"), year($"df1.date") - year($"df2.date")))
  .groupBy($"df1.key", $"df1.date")
  .pivot("diff_years")
  .agg(first($"df2.val")) // "val" is the value column in the sample dataframes
  .drop("null")           // drop the unwanted extra column holding null diff_years
The output is:
+---+----------+---------------+---------------+
|key|date |previous_year_1|previous_year_2|
+---+----------+---------------+---------------+
|1 |2018-04-16|10 |20 |
|2 |2018-04-16|null |null |
+---+----------+---------------+---------------+
Let me "read between the lines" and give you a "similar" solution to what you are asking:
val df1Pivot = df1.groupBy("key").pivot("date").agg(max("val"))
val df2Pivot = df2.groupBy("key").pivot("date").agg(max("val"))
val result = df1Pivot.join(df2Pivot, Seq("key"), "left")
result.show
+---+----------+----------+----------+
|key|2018-04-16|2016-04-16|2017-04-16|
+---+----------+----------+----------+
| 1| 100| 20| 10|
| 2| 200| null| null|
+---+----------+----------+----------+
Feel free to manipulate the data a bit if you really need to change the column names.
Or even better:
df1.union(df2).groupBy("key").pivot("date").agg(max("val")).show
+---+----------+----------+----------+
|key|2016-04-16|2017-04-16|2018-04-16|
+---+----------+----------+----------+
| 1| 20| 10| 100|
| 2| null| null| 200|
+---+----------+----------+----------+
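For completeness, the same union-and-pivot idea can also be written in Spark SQL (2.4+). This sketch is my addition and assumes the two dataframes are registered as temp views named df1 and df2; SQL PIVOT requires the date values to be listed explicitly:
SELECT * FROM (
    SELECT key, val, date FROM df1
    UNION ALL
    SELECT key, val, date FROM df2
)
PIVOT (
    MAX(val) FOR date IN ('2016-04-16', '2017-04-16', '2018-04-16')
);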

How to turn row data into source and target zigzag pairs using Hive or Pig

Input
id,name,time
1,home,10:20
1,product,10:21
1,mobile,10:22
2,id,10:24
2,bag,10:30
2,home,10:21
3,keyboard,10:32
3,home,10:33
3,welcome,10:36
I want to turn the name column into source and target pairs, based on the output below.
Earlier I tried this with Pig. The steps were:
a = load --> b = asc --> c = desc --> then join the data
I got output like this:
(1,home,10:20,1,product,10:21)
(2,bag,10:30,2,id,10:24)
(3,home,10:32,3,welcome,10:36)
The output I want:
1,home,product
1,product,mobile
2,id,bag
2,bag,home
3,keyboard,home
3,home,welcome
In Hive (and in Spark), you can use the window function LEAD. Note that LEAD needs an ORDER BY in the window specification for a deterministic result (Spark requires one):
with t as
  ( select id, name,
           lead(name) over (partition by id order by time) as zigzag
    from table )
select * from t where t.zigzag is not null
This should give you the output:
+---+--------+-------+
| id| name| zigzag|
+---+--------+-------+
| 1| home|product|
| 1| product| mobile|
| 2| bag| home|
| 2| home| id|
| 3|keyboard| home|
| 3| home|welcome|
+---+--------+-------+

How to write a SQL query to generate a report

I need to generate a report based on the following tables:
CALLS_FOR_PROPOSALS
ID|Name
--+------
1|Call 1
2|Call 2
3|Call 3
PROPOSALS
ID|Call ID|Title
--+-------+----------
1| 1|Proposal 1
2| 2|Proposal 2
3| 2|Proposal 3
PROPOSAL_STATUSES
ID|Proposal ID|Status ID
--+-----------+---------
1| 1| 1
2| 2| 1
3| 3| 1
4| 3| 2
STATUSES
ID|NAME
--+------------
1|Not Reviewed
2|Processing
3|Accepted
4|Rejected
With this sample data, there are 3 Calls for Proposals. There are three Proposals; one for Call 1, and two for Call 2. (Call 3 does not have any proposals.) Each proposal has at least one status assigned to it. When a row is inserted into the PROPOSALS table, a corresponding row is inserted into PROPOSAL_STATUSES, giving the Proposal an initial default status of 1 (Not Reviewed). Each time the status is changed, a new row is inserted into the PROPOSAL_STATUSES table, so that the history of status changes is preserved. I need to generate a report that shows for each Call, the number of Proposals submitted, and the number of Proposals that have had more than one status (i.e. the status has been changed from the default at least once.) For the sample data above, the results would look like this:
Call Name|Proposals Submitted|Proposals Reviewed|
---------+-------------------+------------------+
Call 1 | 1| 0|
Call 2 | 2| 1|
Call 3 | 0| 0|
How would I write the SQL query to generate this report based on the above table structure? Thanks for your help.
Something like this should do the trick:
SELECT Name AS 'Call Name',
       COUNT(proposal_id) AS 'Proposals Submitted',
       SUM(CASE WHEN statusCount > 1 THEN 1 ELSE 0 END) AS 'Proposals Reviewed'
FROM
    (SELECT cfp.Name,
            p.ID AS proposal_id,
            COUNT(ps.ID) AS statusCount   -- status rows per proposal
     FROM CALLS_FOR_PROPOSALS cfp
     LEFT JOIN PROPOSALS p ON cfp.ID = p.Call_ID
     LEFT JOIN PROPOSAL_STATUSES ps ON ps.Proposal_ID = p.ID
     GROUP BY cfp.Name, p.ID) AS s
GROUP BY Name
Grouping the inner query per proposal means each proposal with more than one status row counts once towards 'Proposals Reviewed', and COUNT(proposal_id) ignores the NULLs produced by calls with no proposals, matching the expected output above.