CloudWatch Logs Group By and getting last time value - amazon-cloudwatch

I need to map which endpoints are taking the longest, based on a log.
I have a query that catches the slowest endpoints, but the results may contain duplicate endpoints with different request times.
My query:
fields request.url as URL, response.content.manager.company.label as Empresa, timestamp as Data, response.status as Status, request.time as TEMPO
| filter #logStream = 'production'
| sort request.time DESC
| limit 30
Result:
# | ENDPOINT | TIMESTAMP | COMPANY | STATUS CODE | TIME REQUEST
1 | /api/v1/login | 2020-02-01T11:14:00 | company-label | 200 | 0.9876
2 | /api/v1/register | 2020-02-01T11:11:00 | company-label | 200 | 0.5687
3 | /api/v1/login | 2020-02-01T00:00:00 | company-label | 200 | 0.2345
I need to unify by endpoint, for example:
# | ENDPOINT | TIMESTAMP | COMPANY | STATUS CODE | TIME REQUEST
1 | /api/v1/login | 2020-02-01T11:14:00 | company-label | 200 | 0.9876
2 | /api/v1/register | 2020-02-01T11:11:00 | company-label | 200 | 0.5687
Unify by endpoint and show the last "time" for each one.
Thank you!

I found the solution to this question.
filter #logStream = 'production'
| filter ispresent(request.time)
| stats avg(request.time) as MEDIA by request.uri as ENDPOINT
| sort MEDIA DESC
| limit 30
Using stats avg(request.time) as MEDIA groups the data and computes an average time (MEDIA) per ENDPOINT.
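The grouping the stats clause performs can be sketched outside CloudWatch as well. This is a minimal Python sketch with hypothetical sample rows, averaging request time per endpoint and sorting descending, mirroring stats avg(request.time) by request.uri | sort MEDIA DESC:

```python
from collections import defaultdict

# Hypothetical log rows: (endpoint, request_time)
rows = [
    ("/api/v1/login", 0.9876),
    ("/api/v1/register", 0.5687),
    ("/api/v1/login", 0.2345),
]

# Group times by endpoint, like `stats ... by request.uri`
times = defaultdict(list)
for endpoint, t in rows:
    times[endpoint].append(t)

# Average per endpoint, sorted descending like `sort MEDIA DESC`
media = sorted(
    ((ep, sum(ts) / len(ts)) for ep, ts in times.items()),
    key=lambda pair: pair[1],
    reverse=True,
)
print(media)  # one (endpoint, average) pair per endpoint
```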


PowerBI / SQL Query to verify records

I am working on a PowerBI report that is grabbing information from SQL and I cannot find a way to solve my problem using PowerBI or how to write the required code. My first table, Certifications, includes a list of certifications and required trainings that must be obtained in order to have an active certification.
My second table, UserCertifications, includes a list of UserIDs, certifications, and the trainings associated with a certification.
How can I write a SQL code or PowerBI measure to tell if a user has all required trainings for a certification? ie, if UserID 1 has the A certification, how can I verify that they have the TrainingIDs of 1, 10, and 150 associated with it?
Certifications: (table image)
UserCertifications: (table image)
This is a DAX pattern to test whether a table contains at least a required set of values.
Certifications:

| Certification | TrainingID |
|---------------|------------|
| A             | 1          |
| A             | 10         |
| A             | 150        |
| B             | 7          |
| B             | 9          |

UserCertifications:

| UserID | Certification | Training |
|--------|---------------|----------|
| 1      | A             | 1        |
| 1      | A             | 10       |
| 1      | A             | 300      |
| 2      | A             | 150      |
| 2      | B             | 9        |
| 2      | B             | 90       |
| 3      | A             | 7        |
| 4      | A             | 1        |
| 4      | A             | 10       |
| 4      | A             | 150      |
| 4      | A             | 1000     |
In the above scenario, DAX needs to find out whether the mandatory trainings (Certifications[TrainingID]) for a given Certifications[Certification] have been completed within each UserCertifications[UserID] && UserCertifications[Certification] partition.
DAX should return true only for UserCertifications[UserID] = 4, as that is the only user who completed at least all of the mandatory trainings.
The way to achieve this is through the following measure:
areAllMandatoryTrainingCompleted =
VAR _alreadyCompleted =
    CONCATENATEX (
        UserCertifications,
        UserCertifications[Training],
        "-",
        UserCertifications[Training]
    ) // what is completed in the fact table; the fourth argument is important as it decides the sort order
VAR _0 =
    MAX ( UserCertifications[Certification] )
VAR _supposedToComplete =
    CONCATENATEX (
        FILTER ( Certifications, Certifications[Certification] = _0 ),
        Certifications[TrainingID],
        "-",
        Certifications[TrainingID]
    ) // what must be completed per the Certifications table; sorted the same way
VAR _isMandatoryTrainingCompleted =
    CONTAINSSTRING ( _alreadyCompleted, _supposedToComplete ) // CONTAINSSTRING ( <within_text>, <find_text> ); returns TRUE/FALSE
RETURN
    _isMandatoryTrainingCompleted
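The underlying check is really a set-containment test: is the set of required trainings a subset of the user's completed trainings? A minimal Python sketch of that logic, using the sample data from the tables above (the data structures and function name are illustrative, not part of any DAX or SQL API):

```python
# Required trainings per certification, from the Certifications table
certifications = {
    "A": {1, 10, 150},
    "B": {7, 9},
}

# (user_id, certification, training) rows from UserCertifications
user_certifications = [
    (1, "A", 1), (1, "A", 10), (1, "A", 300),
    (2, "A", 150), (2, "B", 9), (2, "B", 90),
    (3, "A", 7),
    (4, "A", 1), (4, "A", 10), (4, "A", 150), (4, "A", 1000),
]

def all_mandatory_completed(user_id, certification):
    """True when the user's trainings cover every required training."""
    completed = {
        t for (u, c, t) in user_certifications
        if u == user_id and c == certification
    }
    return certifications[certification] <= completed  # subset test

print(all_mandatory_completed(4, "A"))  # True: user 4 has 1, 10, 150
print(all_mandatory_completed(1, "A"))  # False: user 1 is missing 150
```

A subset test also sidesteps the edge cases of comparing delimited strings (e.g. one training ID being a prefix of another).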

querying across multiple fields and returning minimum values of sets

I have a complex (at least for me!) SQL question that I've been trying to solve.
The structure of my table is:
Query: describe shipping_zones
+------------+--------+---------+
| name | type | comment |
+------------+--------+---------+
| key_id | int | |
| carrier | string | |
| origin_zip | int | |
| dest_zip | int | |
| zone | int | |
+------------+--------+---------+
There are 3 types of "carriers"
Query: select DISTINCT(carrier) FROM shipping_zones
+---------+
| carrier |
+---------+
| fedex |
| ups |
| usps |
+---------+
Fetched 3 row(s) in 0.42s
I have this query that returns two zones for each carrier:
SELECT carrier, zone
FROM shipping_zones
WHERE (origin_zip = 402 OR origin_zip = 950) AND dest_zip = 978;
+---------+------+
| carrier | zone |
+---------+------+
| ups | 4 |
| ups | 7 |
| fedex | 7 |
| fedex | 4 |
| usps | 8 |
| usps | 4 |
+---------+------+
The problem is I only want to return the lowest number from each carrier, not both. Can I use Min() or is there some better way to do it?
Thanks for the help in understanding SQL!
Sounds like you're looking for aggregate functions. However, whenever you use an aggregate function, you have to tell your database engine what to do with every other field you select. If there's a text field you just want collapsed wherever there are duplicates, you can use the GROUP BY clause.
SELECT carrier, min(zone)
FROM shipping_zones
WHERE (origin_zip = 402 OR origin_zip = 950) AND dest_zip = 978
group by carrier;
Yes, you can use the MIN() function on each carrier after you group by carriers. Grouping essentially collapses the rows for each distinct value of the grouped column (carrier, in this case), and the aggregate function you apply (MIN, in this case) summarizes the remaining column for each group:
SELECT carrier, MIN(zone) as min_zone
FROM shipping_zones
WHERE (origin_zip = 402 OR origin_zip = 950)
AND dest_zip = 978
GROUP BY carrier
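The GROUP BY answer above is easy to try out locally. A minimal sketch using an in-memory SQLite database loaded with hypothetical rows matching the sample output:

```python
import sqlite3

# Build a throwaway copy of shipping_zones with sample rows
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE shipping_zones (
        key_id INTEGER, carrier TEXT,
        origin_zip INTEGER, dest_zip INTEGER, zone INTEGER
    )
""")
rows = [
    (1, "ups",   402, 978, 4), (2, "ups",   950, 978, 7),
    (3, "fedex", 402, 978, 7), (4, "fedex", 950, 978, 4),
    (5, "usps",  402, 978, 8), (6, "usps",  950, 978, 4),
]
conn.executemany("INSERT INTO shipping_zones VALUES (?, ?, ?, ?, ?)", rows)

# One row per carrier, keeping only the lowest zone
result = conn.execute("""
    SELECT carrier, MIN(zone) AS min_zone
    FROM shipping_zones
    WHERE (origin_zip = 402 OR origin_zip = 950) AND dest_zip = 978
    GROUP BY carrier
    ORDER BY carrier
""").fetchall()
print(result)  # [('fedex', 4), ('ups', 4), ('usps', 4)]
```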

Update multiple records SQL with condition (FIFO)

Apologies in advance if the title doesn't represent my problem here.
The scenario:
I ship out (outbound) 5 items of ItemA.
Table FIFO:
| date       | item  | inbound | outbound |
|------------|-------|---------|----------|
| 13/11/2015 | itemA | 2       |          |
| 15/11/2015 | itemA | 8       |          |
What my UPDATE script produces now (wrong):
| date       | item  | inbound | outbound |
|------------|-------|---------|----------|
| 13/11/2015 | itemA | 2       | 5        |
| 15/11/2015 | itemA | 8       |          |
The expected result (right):
| date       | item  | inbound | outbound |
|------------|-------|---------|----------|
| 13/11/2015 | itemA | 2       | 2        |
| 15/11/2015 | itemA | 8       | 3        |
I am using SQL Server 2008. My script only updates the first row. How can I achieve this with SQL?
I created the scenario on a fiddle here. Some of the script gives an error there, but it works in SQL Server.
Thanks in advance.
I think you need to process record by record. I hope the following code helps you find the logic.
Assumption: only one record per date, and there is an inbound row for all the items.
DECLARE @ItemCount INT = 5, @dat DATE, @inbound INT

WHILE @ItemCount > 0
BEGIN
    SELECT TOP 1 @dat = [date], @inbound = inbound
    FROM FIFO
    WHERE item = 'itemA' AND outbound IS NULL
    ORDER BY [date] ASC

    IF (@inbound <= @ItemCount)
        UPDATE FIFO SET outbound = @inbound WHERE item = 'itemA' AND [date] = @dat
    ELSE
        UPDATE FIFO SET outbound = @ItemCount WHERE item = 'itemA' AND [date] = @dat

    SET @ItemCount = @ItemCount - @inbound
END
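The row-by-row FIFO allocation can be sketched in plain Python as well: walk the rows oldest-first, and give each row the smaller of its inbound quantity and whatever is still left to allocate. The list layout and function name here are illustrative only:

```python
# Hypothetical FIFO rows, oldest first: [date, item, inbound, outbound]
fifo = [
    ["13/11/2015", "itemA", 2, None],
    ["15/11/2015", "itemA", 8, None],
]

def allocate_outbound(rows, item, quantity):
    """Spread `quantity` across rows of `item`, oldest first (FIFO)."""
    remaining = quantity
    for row in rows:
        if remaining <= 0:
            break
        if row[1] != item or row[3] is not None:
            continue
        taken = min(row[2], remaining)  # cap at the row's inbound
        row[3] = taken
        remaining -= taken
    return rows

allocate_outbound(fifo, "itemA", 5)
print(fifo)  # first row gets 2, second row gets the remaining 3
```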

T-SQL: Looking up results based on the result of the table prior to it

I'm not sure if I'm phrasing this question title correctly since I'm still very new to SQL, but this is the best that I can come up with...
So I have two tables a Policy table and a Claims table.
In the policy table, I have the following relevant fields:
[Policy_NO], [Creation_Date], [Limit], [Limit_Date]. Now each policy can have multiple limits, so you might get something that looks like the following table:
+-----------+---------------+-------+------------+
| Policy_NO | Creation_Date | Limit | Limit_Date |
+-----------+---------------+-------+------------+
| A00001 | 8/31/2015 | 1000 | 8/31/2015 |
| A00001 | 8/31/2015 | 2000 | 9/30/2015 |
| A00001 | 8/31/2015 | 5000 | 10/22/2015 |
| A00001 | 8/31/2015 | 500 | 11/17/2015 |
| A00003 | 9/21/2015 | 3000 | 1/1/2016 |
+-----------+---------------+-------+------------+
The claims table has the following relevant fields of: [Policy_NO], [Claim_NO], and [Claim_Date]
+-----------+----------+------------+
| Policy_NO | Claim_NO | Claim_Date |
+-----------+----------+------------+
| A00001 | CL00001 | 11/16/2015 |
| A00003 | CL00002 | 2/2/2016 |
+-----------+----------+------------+
So as per the examples above, you should interpret this as: a policy was created on 8/31/2015, and the policy holder requested limit increases on 9/30/2015 and 10/22/2015. On 11/16/2015, a claim came in, and the limit dropped from 5000 to 500 on 11/17/2015.
I want to create a query (I only have read access btw), that will give me a combined list associated with the correct limit, which a simple right/left join wouldn't be able to do.
So as per the example above, the result in my table regarding Policy A00001 should look like:
+-----------+---------------+-------+---------------+----------+------------+
| Policy_NO | Creation_Date | Limit | Limit_Date | Claim_NO | Claim_Date |
+-----------+---------------+-------+---------------+----------+------------+
| A00001 | 8/31/2015 | 1000 | 8/31/2015 | | |
| A00001 | 8/31/2015 | 2000 | 9/30/2015 | | |
| A00001 | 8/31/2015 | 5000 | 10/22/2015 | CL00001 | 11/16/2015 |
| A00001 | 8/31/2015 | 500 | 11/17/2015 | | |
+-----------+---------------+-------+---------------+----------+------------+
Basically, I want a way of easily associating each claim with the right limit date. I've thought about using a WHERE clause to get Claims_Date >= Limit_Date, but that only solves part of my problem. After a claim, the limit would most likely go down, but it could go up again a few months later; my current code would then display the claim multiple times with different limits, which is incorrect. Bottom line: the claim should only be associated with a policy once, so I was hoping there might be some iterative process I could use.
Any help/suggestion is greatly appreciated!
For each claim you can get the policy row using outer apply:
select c.*, p.*
from claims c outer apply
(select top 1 p.*
from policies p
where p.policy_no = c.policy_no and p.limit_date <= c.claim_date
order by p.limit_date desc
) p;
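The OUTER APPLY pattern can be tried locally even without SQL Server. SQLite has no OUTER APPLY, but a correlated subquery expresses the same "latest limit dated on or before the claim" lookup. This sketch uses hypothetical ISO-formatted dates (so string comparison works) and renames the Limit column to limit_amt because LIMIT is a reserved word in SQLite:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE policies (policy_no TEXT, limit_amt INTEGER, limit_date TEXT);
    CREATE TABLE claims   (policy_no TEXT, claim_no TEXT, claim_date TEXT);
    INSERT INTO policies VALUES
        ('A00001', 1000, '2015-08-31'),
        ('A00001', 2000, '2015-09-30'),
        ('A00001', 5000, '2015-10-22'),
        ('A00001',  500, '2015-11-17');
    INSERT INTO claims VALUES ('A00001', 'CL00001', '2015-11-16');
""")

# For each claim, pick the most recent limit dated on or before the claim
result = conn.execute("""
    SELECT c.claim_no, c.claim_date,
           (SELECT p.limit_amt
            FROM policies p
            WHERE p.policy_no = c.policy_no
              AND p.limit_date <= c.claim_date
            ORDER BY p.limit_date DESC
            LIMIT 1) AS limit_at_claim
    FROM claims c
""").fetchall()
print(result)  # [('CL00001', '2015-11-16', 5000)]
```

The claim on 11/16 correctly picks up the 5000 limit from 10/22, not the 500 limit set the day after.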
A "normal" JOIN between the two tables using Policy_NO as the join field should do what you requested:
SELECT p.Policy_NO, p.Creation_Date, p.Limit, p.Limit_Date, c.Claim_NO, c.Claim_Date
FROM policy AS p, claims AS c
WHERE c.Policy_NO = p.Policy_NO
ORDER BY c.Claim_NO, p.Limit_Date ASC

SQL - Combining 3 rows per group in a logging scenario

I have reworked our API's logging system to use Azure Table Storage from using SQL storage for cost and performance reasons. I am now migrating our legacy logs to the new system. I am building a SQL query per table that will map the old fields to the new ones, with the intention of exporting to CSV then importing into Azure.
So far, so good. However, one artifact of the previous system is that it logged 3 times per request - call begin, call response and call end - and the new one logs the call as just one log (again, for cost and performance reasons).
Some fields are common to all three related logs, e.g. the Session which uniquely identifies the call.
Some fields I only want the first log's value, e.g. Date which may be a few seconds different in the second and third log.
Some fields are shared for the three different purposes, e.g. Parameters gives the Input Model for Call Begin, Output Model for Call Response, and HTTP response (e.g. OK) for Call End.
Some fields are unused for two of the purposes, e.g. ExecutionTime is -1 for Call Begin and Call Response, and a value in ms for Call End.
How can I "roll up" the sets of 3 rows into one row per set? I have tried using DISTINCT and GROUP BY, but the fact that some of the information collides is making it very difficult. I apologize that my SQL isn't really good enough to really explain what I'm asking for - so perhaps an example will make it clearer:
Example of what I have:
SQL:
SELECT * FROM [dbo].[Log]
Results:
+---------+---------------------+-------+------------+---------------+---------------+-----------------+--+
| Session | Date | Level | Context | Message | ExecutionTime | Parameters | |
+---------+---------------------+-------+------------+---------------+---------------+-----------------+--+
| 84248B7 | 2014-07-20 19:16:15 | INFO | GET v1/abc | Call Begin | -1 | {"Input":"xx"} | |
| 84248B7 | 2014-07-20 19:16:15 | INFO | GET v1/abc | Call Response | -1 | {"Output":"yy"} | |
| 84248B7 | 2014-07-20 19:16:15 | INFO | GET v1/abc | Call End | 123 | OK | |
| F76BCBB | 2014-07-20 19:16:17 | ERROR | GET v1/def | Call Begin | -1 | {"Input":"ww"} | |
| F76BCBB | 2014-07-20 19:16:18 | ERROR | GET v1/def | Call Response | -1 | {"Output":"vv"} | |
| F76BCBB | 2014-07-20 19:16:18 | ERROR | GET v1/def | Call End | 456 | BadRequest | |
+---------+---------------------+-------+------------+---------------+---------------+-----------------+--+
Example of what I want:
SQL:
[Need to write this query]
Results:
+---------------------+-------+------------+----------+---------------+----------------+-----------------+--------------+
| Date | Level | Context | Message | ExecutionTime | InputModel | OutputModel | HttpResponse |
+---------------------+-------+------------+----------+---------------+----------------+-----------------+--------------+
| 2014-07-20 19:16:15 | INFO | GET v1/abc | Api Call | 123 | {"Input":"xx"} | {"Output":"yy"} | OK |
| 2014-07-20 19:16:17 | ERROR | GET v1/def | Api Call | 456 | {"Input":"ww"} | {"Output":"vv"} | BadRequest |
+---------------------+-------+------------+----------+---------------+----------------+-----------------+--------------+
SELECT L1.Session, L1.Date, L1.Level, L1.Context, 'Api Call' AS Message,
       L3.ExecutionTime,
       L1.Parameters AS InputModel,
       L2.Parameters AS OutputModel,
       L3.Parameters AS HttpResponse
FROM Log L1
INNER JOIN Log L2 ON L1.Session = L2.Session
INNER JOIN Log L3 ON L1.Session = L3.Session
WHERE L1.Message = 'Call Begin'
  AND L2.Message = 'Call Response'
  AND L3.Message = 'Call End'
This would work in your sample.
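The triple self-join can be checked end to end with an in-memory SQLite database loaded with the sample rows from the question, which is handy before running it against the real Log table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Log (
        Session TEXT, Date TEXT, Level TEXT, Context TEXT,
        Message TEXT, ExecutionTime INTEGER, Parameters TEXT
    )
""")
rows = [
    ("84248B7", "2014-07-20 19:16:15", "INFO",  "GET v1/abc", "Call Begin",    -1,  '{"Input":"xx"}'),
    ("84248B7", "2014-07-20 19:16:15", "INFO",  "GET v1/abc", "Call Response", -1,  '{"Output":"yy"}'),
    ("84248B7", "2014-07-20 19:16:15", "INFO",  "GET v1/abc", "Call End",      123, "OK"),
    ("F76BCBB", "2014-07-20 19:16:17", "ERROR", "GET v1/def", "Call Begin",    -1,  '{"Input":"ww"}'),
    ("F76BCBB", "2014-07-20 19:16:18", "ERROR", "GET v1/def", "Call Response", -1,  '{"Output":"vv"}'),
    ("F76BCBB", "2014-07-20 19:16:18", "ERROR", "GET v1/def", "Call End",      456, "BadRequest"),
]
conn.executemany("INSERT INTO Log VALUES (?, ?, ?, ?, ?, ?, ?)", rows)

# Roll each session's three rows (Begin / Response / End) into one
result = conn.execute("""
    SELECT L1.Date, L1.Level, L1.Context, 'Api Call' AS Message,
           L3.ExecutionTime,
           L1.Parameters AS InputModel,
           L2.Parameters AS OutputModel,
           L3.Parameters AS HttpResponse
    FROM Log L1
    JOIN Log L2 ON L1.Session = L2.Session
    JOIN Log L3 ON L1.Session = L3.Session
    WHERE L1.Message = 'Call Begin'
      AND L2.Message = 'Call Response'
      AND L3.Message = 'Call End'
    ORDER BY L1.Date
""").fetchall()
for row in result:
    print(row)  # one combined row per session
```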