SQL project state pipeline query - sql

I am working with a database that includes a history table showing when projects have moved to the next state of development. Here is some example data.
dbo.history_table
| ProjectID | State | Timestamp |
| 1 | 1 | 2018-03-22 10:38:27.000 |
| 1 | 2 | 2018-03-23 10:22:56.000 |
| 1 | 3 | 2018-03-24 12:18:32.000 |
| 2 | 1 | 2018-03-24 11:01:17.000 |
| 1 | 4 | 2018-03-25 10:32:41.000 |
| 2 | 4 | 2018-03-26 12:39:03.000 |
There are a number of states that look something like this:
| State # | Description | Notes
| State 1 | Planning |
| State 2 | Pre-Production |
| State 3 | Production |
| State 4 | Post-Production |
| State 5 | Successful | Terminal state
| State 6 | Unsuccessful | Terminal state
| State 7 | Cancelled | Terminal state
These states are roughly sequential, but not quite, and not always. For instance, each project can only end in one of the three terminal states. Furthermore, although all projects begin in State 1, not all projects hit all states before terminating.
I need to write a query that looks at a particular state and shows the next state those records move to. The data I need from this query would look something like this:
Report: State 2 Pipeline
| Total: | 100 |
| Moved to State 3: | 50 |
| Moved to State 4: | 0 |
| Moved to State 5: | 25 |
| Moved to State 6: | 0 |
| Moved to State 7: | 25 |

This applies the LEAD logic to find the next row's state and utilizes GROUPING SETS to add the grand total:
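with cte as
(
    select
        State,
        -- next state of the same project, in chronological order
        lead(State)
            over (partition by ProjectID
                  order by Timestamp) as nextState
    from myTable
)
select
    -- cast so both CASE branches are strings; mixing 'Total' with an
    -- integer column raises a conversion error in SQL Server
    case when grouping(nextState) = 1 then 'Total'
         else cast(nextState as varchar(10)) end as movedTo,
    count(*) as projects
from cte
where State = 2
group by grouping sets ((nextState), ())
order by grouping(nextState) desc, nextState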
However, it will not return a row for states that never occur as the next state. To include those with a count of 0, start from the table describing the existing states and left join this select to it.
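The left join can be sketched as follows. This is only an illustration under stated assumptions: it runs in an in-memory SQLite database (3.25 or later, for window functions) rather than SQL Server, the `states` lookup table and the `pipeline` helper are hypothetical names, and the total is summed in Python because SQLite has no GROUPING SETS. The sample rows are the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE history_table (ProjectID INTEGER, State INTEGER, Timestamp TEXT);
INSERT INTO history_table VALUES
  (1, 1, '2018-03-22 10:38:27'),
  (1, 2, '2018-03-23 10:22:56'),
  (1, 3, '2018-03-24 12:18:32'),
  (2, 1, '2018-03-24 11:01:17'),
  (1, 4, '2018-03-25 10:32:41'),
  (2, 4, '2018-03-26 12:39:03');
-- Hypothetical lookup table listing every state.
CREATE TABLE states (State INTEGER PRIMARY KEY, Description TEXT);
INSERT INTO states VALUES
  (1, 'Planning'), (2, 'Pre-Production'), (3, 'Production'),
  (4, 'Post-Production'), (5, 'Successful'), (6, 'Unsuccessful'),
  (7, 'Cancelled');
""")

QUERY = """
WITH cte AS (
    SELECT State,
           LEAD(State) OVER (PARTITION BY ProjectID ORDER BY Timestamp) AS nextState
    FROM history_table
),
moved AS (
    SELECT nextState, COUNT(*) AS cnt
    FROM cte
    WHERE State = :state AND nextState IS NOT NULL
    GROUP BY nextState
)
SELECT s.State, COALESCE(m.cnt, 0) AS cnt
FROM states AS s
LEFT JOIN moved AS m ON m.nextState = s.State
WHERE s.State > :state   -- assumption: only later states are reported
ORDER BY s.State
"""

def pipeline(state):
    """Return (next_state, count) rows, including states with a count of 0."""
    return conn.execute(QUERY, {"state": state}).fetchall()

report = pipeline(1)
total_moved = sum(cnt for _, cnt in report)
for next_state, cnt in report:
    print(f"Moved to State {next_state}: {cnt}")
```

Because the query starts from `states`, every candidate state gets a row, and `COALESCE` turns the missing counts into 0.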

Related

PowerBI / SQL Query to verify records

I am working on a PowerBI report that is grabbing information from SQL and I cannot find a way to solve my problem using PowerBI or how to write the required code. My first table, Certifications, includes a list of certifications and required trainings that must be obtained in order to have an active certification.
My second table, UserCertifications, includes a list of UserIDs, certifications, and the trainings associated with a certification.
How can I write SQL code or a PowerBI measure to tell whether a user has all required trainings for a certification? For example, if UserID 1 has the A certification, how can I verify that they have TrainingIDs 1, 10, and 150 associated with it?
Certifications:
CertificationsTable
UserCertifications:
UserCertificationsTable
This is a DAX pattern to test whether a set of values contains at least a given list of values.
| Certifications |
|----------------|------------|
| Certification | TrainingID |
|----------------|------------|
| A | 1 |
| A | 10 |
| A | 150 |
| B | 7 |
| B | 9 |
| UserCertifications |
|--------------------|---------------|----------|
| UserID | Certification | Training |
|--------------------|---------------|----------|
| 1 | A | 1 |
| 1 | A | 10 |
| 1 | A | 300 |
| 2 | A | 150 |
| 2 | B | 9 |
| 2 | B | 90 |
| 3 | A | 7 |
| 4 | A | 1 |
| 4 | A | 10 |
| 4 | A | 150 |
| 4 | A | 1000 |
In the above scenario, DAX needs to find out whether the mandatory trainings (Certifications[TrainingID]) for each Certifications[Certification] have been completed within each UserCertifications[UserID] && UserCertifications[Certification] partition.
In the above scenario, DAX should only return true for UserCertifications[UserID] = 4, as it is the only user that completed at least all the mandatory trainings.
The way to achieve this is through the following measure:
areAllMandatoryTrainingCompleted =
VAR _alreadyCompleted =
    CONCATENATEX (
        UserCertifications,
        UserCertifications[Training],
        "-",
        UserCertifications[Training]
    ) // what is completed in the fact table; the fourth argument is very important as it decides the sort order
VAR _0 =
    MAX ( UserCertifications[Certification] )
VAR _supposedToComplete =
    CONCATENATEX (
        FILTER ( Certifications, Certifications[Certification] = _0 ),
        Certifications[TrainingID],
        "-",
        Certifications[TrainingID]
    ) // what must be completed according to the Certifications table; the fourth argument is very important as it decides the sort order
VAR _isMandatoryTrainingCompleted =
    CONTAINSSTRING ( _alreadyCompleted, _supposedToComplete ) // CONTAINSSTRING ( <Within Text>, <Search Text> ); returns TRUE or FALSE
RETURN
    _isMandatoryTrainingCompleted
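Since the question also asks for SQL, here is a minimal sketch of a set-based check, run in an in-memory SQLite database through Python. It is a different technique from the measure above: it counts how many required trainings each user holds per certification and compares that with the number required. One reason to consider it is that the string comparison appears to depend on the required IDs forming one contiguous run in the sorted list, so a user holding 1, 5, 10 and 150 would presumably not match "1-10-150". The table and column names follow the question; the `all_completed` alias is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Certifications (Certification TEXT, TrainingID INTEGER);
INSERT INTO Certifications VALUES
  ('A', 1), ('A', 10), ('A', 150), ('B', 7), ('B', 9);
CREATE TABLE UserCertifications (UserID INTEGER, Certification TEXT, Training INTEGER);
INSERT INTO UserCertifications VALUES
  (1, 'A', 1), (1, 'A', 10), (1, 'A', 300),
  (2, 'A', 150), (2, 'B', 9), (2, 'B', 90),
  (3, 'A', 7),
  (4, 'A', 1), (4, 'A', 10), (4, 'A', 150), (4, 'A', 1000);
""")

QUERY = """
WITH required AS (
    -- number of mandatory trainings per certification
    SELECT Certification, COUNT(DISTINCT TrainingID) AS needed
    FROM Certifications
    GROUP BY Certification
)
SELECT u.UserID,
       u.Certification,
       -- only trainings that are mandatory for this certification are counted
       COUNT(DISTINCT c.TrainingID) = r.needed AS all_completed
FROM UserCertifications AS u
JOIN required AS r
  ON r.Certification = u.Certification
LEFT JOIN Certifications AS c
  ON c.Certification = u.Certification
 AND c.TrainingID = u.Training
GROUP BY u.UserID, u.Certification, r.needed
ORDER BY u.UserID, u.Certification
"""

rows = conn.execute(QUERY).fetchall()
for user_id, certification, all_completed in rows:
    print(user_id, certification, bool(all_completed))
```

With the sample data, only UserID 4 comes back with `all_completed` set to 1, which agrees with the result described for the measure.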

join two views and detect missing entries where the matching condition is in the next row of the other view/table (using SQLITE)

I am running a science test and logging my data inside two sqlite tables.
I have selected the data needed into two separate and independent views (RX and TX views).
Now I need to analyze the measurements and create a 3rd table view with the results with the following points in mind:
1- For each test at TX side (Table-1) there might be a corresponding entry at RX side (Table-2).
2- If the timestamp at the RX side is less than the timestamp at the next row of the TX table view, we consider them to be associated with one record in the 3rd view/table and calculate the time difference; otherwise it is a miss.
Question: How should I write the SQL query in SQLite to produce the analysis and test result given in Table 3?
Thanks a lot in advance.
TX View - Table (1)
id | time | measurement
------------------------
1 | 09:40:10.221 | 100
2 | 09:40:15.340 | 60
3 | 09:40:21.100 | 80
4 | 09:40:25.123 | 90
5 | 09:40:29.221 | 45
RX View -Table (2)
time | measurement
------------------------
09:40:15.7 | 65
09:40:21.560 | 80
09:40:30.414 | 50
Test Result View - Table (3)
id |TxTime |RxTime | delta_time(s)| delta_value
------------------------------------------------------------------------
1 | 09:40:10.221 | NULL |NULL | NULL (i.e. missed)
2 | 09:40:15.340 | 09:40:15.7 |0.360 | 5
3 | 09:40:21.100 | 09:40:21.560 |0.460 | 0
4 | 09:40:25.123 | NULL |NULL | NULL (i.e. missed)
5 | 09:40:29.221 | 09:40:30.414 |1.193 | 5
Use window function LEAD() to get the next time of each row in TX and join the views on your conditions:
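SELECT t.id, t.time TxTime, r.time RxTime,
       -- julianday() returns days, so multiply by 24 * 60 * 60 to get seconds
       ROUND((julianday(r.time) - julianday(t.time)) * 24 * 60 * 60, 3) [delta_time(s)],
       r.measurement - t.measurement delta_value
FROM (
  -- attach the time of the next TX row to each TX row
  SELECT *, LEAD(time) OVER (ORDER BY time) next
  FROM TX
) t
-- an RX row matches when it falls between this TX time and the next one
LEFT JOIN RX r ON r.time >= t.time AND (r.time < t.next OR t.next IS NULL)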
Results:
> id | TxTime | RxTime | delta_time(s) | delta_value
> -: | :----------- | :----------- | :------------ | :----------
> 1 | 09:40:10.221 | null | null | null
> 2 | 09:40:15.340 | 09:40:15.7 | 0.36 | 5
> 3 | 09:40:21.100 | 09:40:21.560 | 0.46 | 0
> 4 | 09:40:25.123 | null | null | null
> 5 | 09:40:29.221 | 09:40:30.414 | 1.193 | 5

How to add data or change schema to production database

I am new to working with databases and I want to make sure I understand the best way to add or remove data from a database without making a mess of any related data.
Here is a scenario I am working with:
I have a Tags table, with an Identity ID column. The Tags can be selected via the web application to categorize stories that are submitted by a user. When the database was first seeded, like tags were seeded together in order. As you can see, all the Campuses (cities) are 1-4, the Colleges (subjects) are 5-7, and Populations are 8-11.
If this database is live in production and the client wants to add a new Campus (City) tag, what is the best way to do this?
All the other city tags are sort of organized at the top, so it seems like the only option is to insert any new tags at the bottom of the table, where they will end up taking whatever the next available ID is. I suppose this is fine because the DisplayCategory column will allow us to know which categories these new tags actually belong to.
Is this typical? Are there better ways to set up the database or handle this situation such that everything remains more organized?
Thank you
+----+------------------+---------------+-----------------+--------------+--------+----------+
| ID | DisplayName | DisplayDetail | DisplayCategory | DisplayOrder | Active | ParentID |
+----+------------------+---------------+-----------------+--------------+--------+----------+
| 1 | Albany | NULL | 1 | 0 | 1 | NULL |
| 2 | Buffalo | NULL | 1 | 1 | 1 | NULL |
| 3 | New York City | NULL | 1 | 2 | 1 | NULL |
| 4 | Syracuse | NULL | 1 | 3 | 1 | NULL |
| 5 | Business | NULL | 2 | 0 | 1 | NULL |
| 6 | Dentistry | NULL | 2 | 1 | 1 | NULL |
| 7 | Law | NULL | 2 | 2 | 1 | NULL |
| 8 | Student-Athletes | NULL | 3 | 0 | 1 | NULL |
| 9 | Alumni | NULL | 3 | 1 | 1 | NULL |
| 10 | Faculty | NULL | 3 | 2 | 1 | NULL |
| 11 | Staff | NULL | 3 | 3 | 1 | NULL |
+----+------------------+---------------+-----------------+--------------+--------+----------+
The terms "top" and "bottom" which you use aren't really applicable. "Albany" isn't at the "top" of the table; it's merely at the top of the specific result you see when you query the table without specifying a meaningful sort order. Without an ORDER BY, the row order is not guaranteed; in practice it often follows the Id or an internal ROWID, which isn't the logical way to show this data.
Data in the table isn't inherently ordered. If you want to view your tags organized by their category, simply order your query by DisplayCategory (and probably by DisplayOrder afterwards), and you'll see your data properly organized. You can even create a persistent View that sorts it that way for your convenience, although some databases do not guarantee that a view's ordering carries over to queries against it, so an ORDER BY in the final query is the safer habit.
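As a small illustration of the point, here is a sketch in an in-memory SQLite database driven from Python. It assumes a trimmed copy of the Tags table (DisplayDetail and ParentID are left out) and a hypothetical new campus, 'Rochester'. The new row takes the next available ID, yet it still appears with the other campuses once the query is ordered.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Tags (
    ID INTEGER PRIMARY KEY AUTOINCREMENT,
    DisplayName TEXT,
    DisplayCategory INTEGER,
    DisplayOrder INTEGER,
    Active INTEGER
);
INSERT INTO Tags (DisplayName, DisplayCategory, DisplayOrder, Active) VALUES
  ('Albany', 1, 0, 1), ('Buffalo', 1, 1, 1), ('New York City', 1, 2, 1),
  ('Syracuse', 1, 3, 1), ('Business', 2, 0, 1), ('Dentistry', 2, 1, 1),
  ('Law', 2, 2, 1), ('Student-Athletes', 3, 0, 1), ('Alumni', 3, 1, 1),
  ('Faculty', 3, 2, 1), ('Staff', 3, 3, 1);
-- Hypothetical new campus: it simply receives the next available ID (12).
INSERT INTO Tags (DisplayName, DisplayCategory, DisplayOrder, Active)
VALUES ('Rochester', 1, 4, 1);
""")

# The ID plays no part in the presentation; the ORDER BY does.
rows = conn.execute("""
    SELECT ID, DisplayName
    FROM Tags
    ORDER BY DisplayCategory, DisplayOrder
""").fetchall()

campuses = rows[:5]
for tag_id, name in campuses:
    print(tag_id, name)
```

The ID stays a meaningless surrogate key, while DisplayCategory and DisplayOrder carry the organization.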

SQLite3 select last event by user

I have the following table 'events'.
| id | event_type | by_user | asset | time |
| 1 | owner | a | 10 | 1111111111 |
| 2 | updated | b | 20 | 1111111112 |
| 3 | owner | a | 30 | 1111111113 |
| 4 | owner | c | 20 | 1111111114 |
| 5 | updated | a | 10 | 1111111115 |
| 6 | owner | a | 20 | 1111111118 |
I would like to select the assets where user 'a' was the last user with an 'owner' event_type. So in this example, the ids 1, 3 and 6 (the assets 10, 20 and 30 are owned by user 'a').
Basically, based on the events, I want to find the assets owned by user 'a'.
This is what correlated subqueries are for:
SELECT * FROM events e
WHERE event_type = 'owner'
  AND time = (SELECT MAX(e_inner.time) FROM events e_inner
              WHERE e_inner.asset = e.asset AND e_inner.event_type = 'owner')
This gives you, for each asset, its last ownership event. If you want it for specific assets or specific owners, just add an appropriate condition to the outer WHERE clause.
Your question is ripe for breakage if you aren't guaranteeing uniqueness of {time, event_type, asset}. This will return all n rows if you have n users being assigned ownership at the exact same time.
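For the specific case in the question (assets currently owned by user 'a'), a minimal sketch with the owner filter added could look like this. It runs the query in an in-memory SQLite database from Python using the sample rows from the question; the `assets_owned_by` helper is a hypothetical name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE events (
    id INTEGER PRIMARY KEY,
    event_type TEXT,
    by_user TEXT,
    asset INTEGER,
    time INTEGER
);
INSERT INTO events VALUES
  (1, 'owner',   'a', 10, 1111111111),
  (2, 'updated', 'b', 20, 1111111112),
  (3, 'owner',   'a', 30, 1111111113),
  (4, 'owner',   'c', 20, 1111111114),
  (5, 'updated', 'a', 10, 1111111115),
  (6, 'owner',   'a', 20, 1111111118);
""")

QUERY = """
SELECT e.id, e.asset
FROM events AS e
WHERE e.event_type = 'owner'
  AND e.by_user = ?            -- filter on the outer query only
  AND e.time = (SELECT MAX(e_inner.time)
                FROM events AS e_inner
                WHERE e_inner.asset = e.asset
                  AND e_inner.event_type = 'owner')
ORDER BY e.id
"""

def assets_owned_by(user):
    """Return (event id, asset) for assets whose latest 'owner' event is by user."""
    return conn.execute(QUERY, (user,)).fetchall()

print(assets_owned_by("a"))
```

The user filter belongs in the outer query only. Putting it inside the subquery would instead find the last time that user owned each asset, even if someone else took ownership afterwards.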

Updating a record using data from the same table

I have a joining table that logs the changes in a File's state (Active/Deleted/Archived etc).
e.g.
+----+---------+------------+-------+
| PK | File_FK | Date | State |
+----+---------+------------+-------+
| 1 | 100 | 11-9-2015 | 1 |
| 2 | 200 | 14-09-2015 | 2 |
| 3 | 300 | 14-07-2015 | 0 |
| 4 | 300 | 12-9-2015 | 2 |
| 5 | 200 | 14-09-2015 | 2 |
| 6 | 300 | 14-09-2015 | 0 |
| 7 | 300 | 13-09-2015 | 1 |
+----+---------+------------+-------+
There are a number of records which were not inserted correctly on a certain date, e.g. July 15.
Is there a way to automate the updating of certain records?
I want to change the state to the last option it was (latest date), if the state is 0 and if there are no previous 0 as states (for the same file foreign key) without amending previous states which had a zero.
To try and simplify (taking File_FK = 300 as example):
I want:
The state of the file to update to be 1.
State '0' is invalid since it was made on 14-09-2015 (the last one)
Prevent updating the '0' that was made on 14-07-2015.
Basically, I want to change the state of a file to the state it had before it became '0', without changing the 0s that were set earlier.