Queried top 15 faults, need the accumulated downtime from another column - sql

I'm currently trying to query up a list of the top 15 occurring faults on a PLC in the warehouse. I've gotten that part down:
Select top 15 fault_number, fault_message, count(*) as FaultCount
from Faults_Stator
where T_stamp > dateadd(hour, -18, getdate())
Group by Fault_number, Fault_Message
Order by FaultCount desc
However, I now need to find the accumulated downtime of the faults in that top-15 list; the information is in another column, "Fault_duration". How would I go about doing this? Thanks in advance, you've all helped me so much already.
+--------------+---------------------------------------------+------------+
| Fault Number | Fault Message | FaultCount |
+--------------+---------------------------------------------+------------+
| 122 | ST10: Part A&B Failed | 23 |
| 4 | ST16: Part on Table B | 18 |
| 5 | ST7: No Spring Present on Part A | 15 |
| 6 | ST7: No Spring Present on Part B | 12 |
| 8 | ST3: No Pin Present B | 8 |
| 1 | ST5: No A Housing | 5 |
| 71 | ST4: Shuttle Right Not Loaded | 4 |
| 144 | ST15: Vertical Cylinder did not Retract | 3 |
| 98 | ST8: Plate Loader Can not Retract | 3 |
| 72 | ST4: Shuttle Left Not Loaded | 2 |
| 94 | ST8: Spring Gripper Cylinder did not Extend | 2 |
| 60 | ST8: Plate Loader Can not Retract | 1 |
| 83 | ST6: No A Spring Present | 1 |
| 2 | ST5: No B Housing | 1 |
| 51 | ST4: Vertical Cylinder did not Extend | 1 |
+--------------+---------------------------------------------+------------+
I know I wouldn't be using the same query, but I'm at a loss as to how to do this next step.
Fault_duration is a column that records how long each fault lasted, in ms. I'm trying to have those durations accumulated next to the corresponding fault, so the first offender would have its 23 individual fault occurrences summed next to it, in another column.

You should be able to use the SUM aggregate function:
Select top 15 fault_number, fault_message, count(*) as FaultCount,
       SUM(Fault_duration) as FaultDuration
from Faults_Stator
where T_stamp > dateadd(hour, -18, getdate())
Group by Fault_number, Fault_Message
Order by FaultCount desc
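If you later want the list ranked by accumulated downtime rather than by occurrence count, the only change is the ORDER BY; a minimal variant of the same query (SQL Server syntax, as in the question):

Select top 15 fault_number, fault_message, count(*) as FaultCount,
       SUM(Fault_duration) as FaultDuration
from Faults_Stator
where T_stamp > dateadd(hour, -18, getdate())
Group by Fault_number, Fault_Message
Order by FaultDuration desc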

Related

Join two views and detect missing entries where the matching condition is in the next row of the other view/table (using SQLite)

I am running a science test and logging my data in two SQLite tables.
I have selected the data needed into two separate and independent views (RX and TX).
Now I need to analyze the measurements and create a third view/table with the results, with the following points in mind:
1. For each test at the TX side (Table 1) there might be a corresponding entry at the RX side (Table 2).
2. If the time stamp at the RX side is less than the time stamp in the next row of the TX view, we consider them to be associated with one record in the third view/table and calculate the time difference; otherwise it is a miss.
Question: How should I write the SQL query in SQLite to produce the analysis and test result given in Table 3?
Thanks a lot in advance.
TX View - Table (1)
id | time | measurement
------------------------
1 | 09:40:10.221 | 100
2 | 09:40:15.340 | 60
3 | 09:40:21.100 | 80
4 | 09:40:25.123 | 90
5 | 09:40:29.221 | 45
RX View - Table (2)
time | measurement
------------------------
09:40:15.7 | 65
09:40:21.560 | 80
09:40:30.414 | 50
Test Result View - Table (3)
id | TxTime       | RxTime       | delta_time(s) | delta_value
---+--------------+--------------+---------------+--------------------
 1 | 09:40:10.221 | NULL         | NULL          | NULL (i.e. missed)
 2 | 09:40:15.340 | 09:40:15.7   | 0.360         | 5
 3 | 09:40:21.100 | 09:40:21.560 | 0.460         | 0
 4 | 09:40:25.123 | NULL         | NULL          | NULL (i.e. missed)
 5 | 09:40:29.221 | 09:40:30.414 | 1.193         | 5
Use the window function LEAD() to get the next time for each row in TX, and join the views on your conditions:
SELECT t.id, t.time TxTime, r.time RxTime,
       ROUND((julianday(r.time) - julianday(t.time)) * 24 * 60 * 60, 3) [delta_time(s)],
       r.measurement - t.measurement delta_value
FROM (
  SELECT *, LEAD(time) OVER (ORDER BY time) next
  FROM TX
) t
LEFT JOIN RX r ON r.time >= t.time AND (r.time < t.next OR t.next IS NULL)
Results:
id | TxTime       | RxTime       | delta_time(s) | delta_value
---+--------------+--------------+---------------+------------
 1 | 09:40:10.221 | null         | null          | null
 2 | 09:40:15.340 | 09:40:15.7   | 0.36          | 5
 3 | 09:40:21.100 | 09:40:21.560 | 0.46          | 0
 4 | 09:40:25.123 | null         | null          | null
 5 | 09:40:29.221 | 09:40:30.414 | 1.193         | 5
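Note that LEAD() is a window function, which requires SQLite 3.25 or newer. If you want this result available as the third view the question asks for, the same query can be wrapped in CREATE VIEW; a minimal sketch, assuming the source views are named TX and RX as above:

CREATE VIEW test_result AS
SELECT t.id, t.time TxTime, r.time RxTime,
       ROUND((julianday(r.time) - julianday(t.time)) * 24 * 60 * 60, 3) [delta_time(s)],
       r.measurement - t.measurement delta_value
FROM (
  SELECT *, LEAD(time) OVER (ORDER BY time) next
  FROM TX
) t
LEFT JOIN RX r ON r.time >= t.time AND (r.time < t.next OR t.next IS NULL);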

Specifying condition operator (AND/OR) for a column based on another column value in SQL

I have a recipe table with a many-to-many to a recipe_filter table. Here's some sample data:
recipe:
id | name
----+-----------
1 | test 2019
12 | slug-14
8 | dfadsfd
6 | test 4
4 | test 2
11 | slug-11
10 | Testology
13 | slug-15
5 | test 3
14 | slug-16
(10 rows)
recipe_filter_join:
recipeId | recipeFilterId
----------+----------------
1 | 1
2 | 2
3 | 3
4 | 1
6 | 5
7 | 6
8 | 4
9 | 7
6 | 8
14 | 9
14 | 4
5 | 9
5 | 38
filter:
id | slug | name | label
----+----------------------+-------------+----------------
2 | fdsfa | fdsfa | Category
3 | dsfds | dsfds | Category
6 | fdsaf | fdsaf | Category
7 | dfad | dfad | Category
8 | product-spice-2 | Spice #2 | Product
9 | product-spice-3 | Spice #3 | Product
5 | product-spice-4 | Spice #4 | Product
4 | product-spice-5 | Spice #5 | Product
1 | product-spice-6 | Spice #6 | Product
10 | product-spice-1 | Spice #1 | Product
40 | diet-halal | Halal | Diet
38 | diet-keto | Keto | Diet
41 | diet-gluten-free | Gluten free | Diet
37 | diet-vegan | Vegan | Diet
39 | diet-diabetic | Diabetic | Diet
42 | cooking-method-bake | Bake | Cooking method
43 | cooking-method-fry | Fry | Cooking method
44 | cooking-method-steam | Steam | Cooking method
45 | cooking-method-roast | Roast | Cooking method
(19 rows)
The input to my query is a list of filter slugs, for example product-spice-1, product-spice-5, cooking-method-fry, cooking-method-steam.
For the above example, I want to write a query that gets all recipes where the filter slug is (product-spice-1 or product-spice-5) and (cooking-method-fry or cooking-method-steam).
How do I create a generic query from the example above?
Update: In case it's not clear: for the given list of filters, I want to group them by label and apply an OR between members of a group and an AND between the groups, if that makes any sense.
You want to INTERSECT two queries:
SELECT rfj."recipeId"
FROM recipe_filter_join rfj
JOIN filter ON filter.id = rfj."recipeFilterId"
WHERE filter.slug IN ('product-spice-1', 'product-spice-5')
INTERSECT
SELECT rfj."recipeId"
FROM recipe_filter_join rfj
JOIN filter ON filter.id = rfj."recipeFilterId"
WHERE filter.slug IN ('cooking-method-fry', 'cooking-method-steam')
And this is quite generalizable. As you can see, the only difference between the two parts is in the WHERE clause. If you have other conditions on Diet or Category, you can generate the appropriate query string with the variation on filter.slug and join the parts with INTERSECT as the separator, in your programming language of choice.
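For example, if Diet filters are also selected, you would simply append one more branch to the query above (a sketch; diet-keto stands in for whatever Diet slugs the user picked):

INTERSECT
SELECT rfj."recipeId"
FROM recipe_filter_join rfj
JOIN filter ON filter.id = rfj."recipeFilterId"
WHERE filter.slug IN ('diet-keto')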
I want to group them based on label and apply an OR between group members and an AND condition for other groups.
If you would prefer to have your application code call the query with just a list of slugs, then the following solution is more general.
Restate the problem as: we want recipes whose filters intersect the provided slug list, and whose distinct matched labels equal the distinct labels derived from the slug list (this last part is handled by the HAVING clause).
Then we can write:
WITH distinct_labels AS (
  SELECT ARRAY_AGG(DISTINCT label ORDER BY label) distinct_labels_filtered
  FROM filter
  WHERE slug IN ('product-spice-1', 'product-spice-5', 'cooking-method-fry', 'cooking-method-steam')
)
SELECT rfj."recipeId"
FROM filter
JOIN recipe_filter_join rfj
  ON filter.id = rfj."recipeFilterId"
WHERE slug IN ('product-spice-1', 'product-spice-5', 'cooking-method-fry', 'cooking-method-steam')
GROUP BY 1
HAVING ARRAY_AGG(DISTINCT label ORDER BY label) = (SELECT distinct_labels_filtered FROM distinct_labels)
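If the application passes the slug list as a single array parameter, the query only needs it in two places; a sketch, assuming PostgreSQL (which the quoted identifiers and ARRAY_AGG suggest) and a text[] parameter $1:

WITH distinct_labels AS (
  SELECT ARRAY_AGG(DISTINCT label ORDER BY label) distinct_labels_filtered
  FROM filter
  WHERE slug = ANY($1)
)
SELECT rfj."recipeId"
FROM filter
JOIN recipe_filter_join rfj
  ON filter.id = rfj."recipeFilterId"
WHERE slug = ANY($1)
GROUP BY 1
HAVING ARRAY_AGG(DISTINCT label ORDER BY label) = (SELECT distinct_labels_filtered FROM distinct_labels)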

influxdb/SQL get field count

I have an InfluxDB table, let's call it my_table.
my_table is structured like this (simplified):
+------+----+----+
| Time | m1 | m2 |
+======+====+====+
|  1   |  8 |  4 |
+------+----+----+
|  2   |  1 | 12 |
+------+----+----+
|  3   |  6 | 18 |
+------+----+----+
|  4   |  4 |  1 |
+------+----+----+
However, I was wondering whether it is possible to find out how many of the metrics are larger than a certain (dynamic) threshold at each time.
So let's say I want to know how many of the metrics (columns) are higher than 5;
I would want to do something like this:
select fieldcount(/m*/) from my_table where /m*/ > 5
Returning:
1
1
2
0
I am relatively restricted in structuring the database, as I'm using the diamond collector (Python), which takes care of all data collection for me and flushes it to my InfluxDB without me specifying what the tables should look like.
EDIT
I am aware of a possible solution if I hardcode the threshold and add a third metric named mGreaterThan5:
+------+----+----+---------------+
| Time | m1 | m2 | mGreaterThan5 |
+======+====+====+===============+
|  1   |  8 |  4 |       1       |
+------+----+----+---------------+
|  2   |  1 | 12 |       1       |
+------+----+----+---------------+
|  3   |  6 | 18 |       2       |
+------+----+----+---------------+
|  4   |  4 |  1 |       0       |
+------+----+----+---------------+
However, this means that I can't easily change the threshold to 6 or any other number, which is why I would prefer a better solution if there is one.
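For comparison, in an ordinary SQL engine this per-time count could be computed on the fly with CASE expressions, which keeps the threshold dynamic. This is only a sketch of the standard-SQL equivalent; as far as I know, InfluxQL itself offers no comparable conditional construct:

SELECT Time,
       (CASE WHEN m1 > 5 THEN 1 ELSE 0 END) +
       (CASE WHEN m2 > 5 THEN 1 ELSE 0 END) AS metrics_over_threshold
FROM my_table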
EDIT2
Another, similar problem occurs when trying to retrieve the highest x metrics. E.g.:
On Jan 1st, what were the highest 3 values of m? Given the table:
+------+----+----+----+----+----+----+
| Time | m1 | m2 | m3 | m4 | m5 | m6 |
+======+====+====+====+====+====+====+
| 1/1  |  8 |  4 |  1 |  7 |  2 |  0 |
+------+----+----+----+----+----+----+
Am I screwed if I keep the table structured this way?
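Again for comparison only: in a standard SQL engine the per-row top-x question is usually handled by unpivoting the columns first. A PostgreSQL-flavored sketch, not InfluxQL (the '1/1' literal and column names are illustrative):

SELECT m.val
FROM my_table t
CROSS JOIN LATERAL (VALUES (t.m1), (t.m2), (t.m3), (t.m4), (t.m5), (t.m6)) AS m(val)
WHERE t.Time = '1/1'
ORDER BY m.val DESC
LIMIT 3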

Find a subset of numbers that matches a target weighted average and target sum

There is a SQL Server table containing 1 million rows. Sample data is shown below.
The Percentage column is computed as (Y / X) * 100.
+----+--------+-------------+-----+-----+-------------+
| ID | Amount | Percentage | X | Y | Z |
+----+--------+-------------+-----+-----+-------------+
| 1 | 10 | 9.5 | 100 | 9.5 | 95 |
| 2 | 20 | 9.5 | 100 | 9.5 | 190 |
| 3 | 40 | 5 | 100 | 5 | 200 |
| 4 | 50 | 5.555555556 | 90 | 5 | 277.7777778 |
| 5 | 70 | 8.571428571 | 70 | 6 | 600 |
| 6 | 100 | 9.230769231 | 65 | 6 | 923.0769231 |
| 7 | 120 | 7.058823529 | 85 | 6 | 847.0588235 |
| 8 | 60 | 10.52631579 | 95 | 10 | 631.5789474 |
| 9 | 80 | 10 | 100 | 10 | 800 |
| 10 | 95 | 10 | 100 | 10 | 950 |
+----+--------+-------------+-----+-----+-------------+
Now I need to find rows such that their Amount values add up to a given target Amount and their weighted average matches the given target Percentage.
For example, if the target Amount =365 and target Percentage=9.84, then from the given dataset, we can say that rows with ID=1,2,6,8,9,10 form the subset which will match the given targets.
Amount = 10+20+100+60+80+95
= 365
Percentage = Sum of (product of Amount and Percentage)/Sum of (Amount)
(I am using Z column to store the products of Amount and Percentage to make the calculations easier)
= ((10*9.5)+(20*9.5)+(100*9.23077)+(60*10.5264)+(80*10)+(95*10))/ (10+20+100+60+80+95)
= 9.834673618
So the rows 1,2,6,8,9,10 matches the given target sum and target weighted average.
The proposed algorithm should work on the 1 million rows, and the main objective is to match the weighted average (Percentage), with the total Amount as close as possible to the target Amount.
I found a few questions on Stack Overflow related to matching a target sum, but my problem is to match two target attributes: the sum and the weighted average.
Which algorithm can be used to achieve this?
Since the target "Percentage" is only approximate (therefore not an actual constraint), let's try removing it and find a solution for Amount. This can only make the problem easier.
What's left is the Subset Sum Problem, which is NP-complete. There are simple exponential-time solutions, and sneaky pseudo-polynomial-time solutions, but I don't think any of them will be practical for a table with 10^6 rows.
If this is an academic exercise, I suggest you write up the cleverest pseudo-polynomial-time solution you can come up with. If it's a task in the real world, I suggest you go back to the person who gave it to you, explain that an exact solution is impractical, and negotiate for an approximate solution.
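For reference, here is what a pseudo-polynomial subset-sum check can look like when sketched in T-SQL (the question mentions SQL Server). It only decides whether the target Amount is reachable, assumes integer Amounts, and uses a hypothetical table name MyTable; its cost is O(rows * target), which is exactly why it will not scale to 10^6 rows:

DECLARE @target int = 365;

CREATE TABLE #reachable (total int PRIMARY KEY);
INSERT INTO #reachable VALUES (0);  -- the empty subset sums to 0

DECLARE @amount int;
DECLARE c CURSOR LOCAL FAST_FORWARD FOR SELECT Amount FROM MyTable;
OPEN c;
FETCH NEXT FROM c INTO @amount;
WHILE @@FETCH_STATUS = 0
BEGIN
    -- extend every sum reachable so far by this row's Amount (each row used at most once)
    INSERT INTO #reachable (total)
    SELECT r.total + @amount
    FROM #reachable r
    WHERE r.total + @amount <= @target
      AND NOT EXISTS (SELECT 1 FROM #reachable x WHERE x.total = r.total + @amount);
    FETCH NEXT FROM c INTO @amount;
END;
CLOSE c;
DEALLOCATE c;

SELECT CASE WHEN EXISTS (SELECT 1 FROM #reachable WHERE total = @target)
            THEN 'target Amount is reachable'
            ELSE 'target Amount is not reachable'
       END AS result;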

Sybase select distinct on one column, do not care about others

I have seen many similar questions but none that meet my needs exactly, and I cannot seem to deduce a solution on my own from inspecting the other questions.
I have the following (mock) table below. My actual table has many more columns.
TableA:
ID | color | feel | size | alive | age
------------------------------------------
1 | blue | soft | large | true | 36
2 | red | soft | large | true | 36
2 | blue | hard | small | false | 37
2 | blue | soft | large | true | 36
2 | blue | soft | small | false | 39
15 | blue | soft | medium | true | 04
15 | blue | soft | large | true | 04
15 | green | soft | large | true | 15
40 | pink | sticky | large | true | 83
51 | brown | rough | tiny | false | 01
51 | gray | soft | tiny | true | 59
34 | blue | soft | large | true | 02
I want the result to look like:
Result of query on TableA:
ID | color | feel | size | alive | age
-------------------------------------------
1 | blue | soft | large | true | 36
2 | red | soft | large | true | 36
15 | blue | soft | medium | true | 04
40 | pink | sticky | large | true | 83
51 | brown | rough | tiny | false | 01
34 | blue | soft | large | true | 02
I want one row for every unique ID, but I do not want to check the other columns. I need the other columns returned in my result set, but I do not want to filter on them. I just need one row for every unique ID - I do not care which row.
In my example, I selected the first row of every unique ID.
I have tried variations of
select *
from TableA
group by ID having ID = max(ID)
Most examples I have seen with group by and max and/or min functions involve only 2 columns. I have many more columns, however.
I have also seen examples using CTE, but I am not using SQL Server (I am using Sybase).
How can I achieve the result set described?
EDIT
We are using Sybase version 15.1.
Your solution with MIN has a drawback: it doesn't return a specific row but the MIN values from each group of rows, so you can get result rows that do not exist in the database. Is that OK for you?
ROW_NUMBER is supported in Sybase 15.2:
http://infocenter.sybase.com/help/index.jsp?topic=/com.sybase.infocenter.dc38151.1520/html/iqrefbb/iqrefbb262.htm
It's sad if it is not supported in 15.1. In that case you can use an identity column and a temporary table to achieve what you want, as sketched below.
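A minimal sketch of that identity/temp-table approach, assuming Sybase ASE's select ... into syntax (the temp-table and rownum names are illustrative):

-- number the rows once with an identity column in a temp table
select rownum = identity(10), ID, color, feel, size, alive, age
into #ordered
from TableA

-- keep the first numbered row per ID
select ID, color, feel, size, alive, age
from #ordered o
where rownum = (select min(rownum) from #ordered i where i.ID = o.ID)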
There are a variety of ways to do this. If you have a more recent version of Sybase, you can use row_number():

select t.*
from (select a.*, row_number() over (partition by ID order by ID) as seqnum
      from TableA a
     ) t
where seqnum = 1;
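Note that order by ID inside a partition by ID is arbitrary (every row in the partition shares the same ID), which matches "I do not care which row". If you ever do care, order by a real column; a small variant that, for example, keeps the oldest row per ID:

select t.*
from (select a.*, row_number() over (partition by ID order by age desc) as seqnum
      from TableA a
     ) t
where seqnum = 1;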
The solution I have come up with is below.
It "feels" like a poor solution - I am still open to new answers:
SELECT ID,
       min(color),
       min(feel),
       min(size),
       min(alive),
       min(age)
FROM TableA
GROUP BY ID
I do not like how verbose I am with the application of the min function to every column, but this returns the desired result set.