MariaDB convert created_at timestamp in total hours from now - sql

I have the following query which returns created_at timestamps. I would like to convert them to total hours from now. Is there an easy way to make that conversion and print the result as total hours?
MariaDB version 10.5.12-MariaDB-1:10.5.12+maria~focal-log
MariaDB [nova]> select hostname, uuid, instances.created_at, instances.deleted_at, json_extract(flavor, '$.cur.*."name"') AS FLAVOR from instances join instance_extra on instances.uuid = instance_extra.instance_uuid WHERE (vm_state='active' OR vm_state='stopped');
+----------+--------------------------------------+---------------------+------------+--------------+
| hostname | uuid                                 | created_at          | deleted_at | FLAVOR       |
+----------+--------------------------------------+---------------------+------------+--------------+
| vm1      | ef6380b4-5455-48f8-9e4b-3d04199be3f5 | 2023-01-05 14:25:51 | NULL       | ["tempest2"] |
+----------+--------------------------------------+---------------------+------------+--------------+
1 row in set (0.001 sec)

Try it like this:
SELECT hostname, uuid, instances.created_at,
       TIMESTAMPDIFF(HOUR, instances.created_at, NOW()) AS HOURDIFF,
       instances.deleted_at,
       JSON_EXTRACT(flavor, '$.cur.*."name"') AS FLAVOR
FROM instances
JOIN instance_extra ON instances.uuid = instance_extra.instance_uuid
WHERE (vm_state='active' OR vm_state='stopped');
Demo fiddle
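Note that TIMESTAMPDIFF truncates to whole units, so HOURDIFF above is an integer count of full hours. If you want fractional hours instead, a small variation (not part of the answer above) is to take the difference in minutes and divide by 60:

-- Sketch: fractional hours since created_at, e.g. 2.5 instead of 2
SELECT hostname, uuid, instances.created_at,
       TIMESTAMPDIFF(MINUTE, instances.created_at, NOW()) / 60.0 AS HOURS_FROM_NOW,
       instances.deleted_at,
       JSON_EXTRACT(flavor, '$.cur.*."name"') AS FLAVOR
FROM instances
JOIN instance_extra ON instances.uuid = instance_extra.instance_uuid
WHERE (vm_state='active' OR vm_state='stopped');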

Remove old duplicate rows in BQ based on timestamp

I have a BQ table with duplicate (x2 times) rows of the same ad_id.
I want to delete old rows (ts older than 120 minutes) where there is a newer row with the same ad_id. (The schema contains timestamp, ad_id, value, but there is no rowId.)
This is my try; is there a nicer way to do it?
DELETE FROM {table_full_name} o
WHERE timestamp < TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 120 MINUTE)
  AND timestamp IN (SELECT MIN(timestamp)
                    FROM {table_full_name} i
                    WHERE i.ad_id = o.ad_id
                    GROUP BY ad_id)
Data example:
ad_id | ts               | value
------+------------------+---------
1     | Sep-1-2021 12:01 | Scanned
2     | Sep-1-2021 12:02 | Error
1     | Sep-1-2021 12:03 | Removed
I want to clean it up to be:
ad_id | ts               | value
------+------------------+---------
2     | Sep-1-2021 12:02 | Error
1     | Sep-1-2021 12:03 | Removed
I saw this post, but BQ doesn't support auto-increment for row-id.
I also saw this post, but how can I modify it without the ts interval (as it's unknown)?
You can try this script. It uses COUNT() with HAVING to pull duplicate records, and TIMESTAMP_DIFF to restrict the delete to rows whose timestamp is older than 120 minutes from the current time.
DELETE
FROM `table_full_name`
WHERE ad_id IN (SELECT ad_id
                FROM `table_full_name`
                GROUP BY ad_id
                HAVING COUNT(ad_id) > 1)
  AND TIMESTAMP_DIFF(CURRENT_TIMESTAMP(), timestamp, MINUTE) > 120
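Note that this deletes every duplicated row older than 120 minutes, even when all copies of an ad_id are old. If you want to always keep the newest row per ad_id, a correlated variant (a sketch in the spirit of your own attempt, not the answer above; it assumes BigQuery can decorrelate the scalar subquery) would be:

-- Sketch: delete rows older than 120 minutes that are not the newest
-- row for their ad_id; the newest row per ad_id always survives.
DELETE FROM `table_full_name` o
WHERE TIMESTAMP_DIFF(CURRENT_TIMESTAMP(), o.timestamp, MINUTE) > 120
  AND o.timestamp < (SELECT MAX(i.timestamp)
                     FROM `table_full_name` i
                     WHERE i.ad_id = o.ad_id)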

Postgres find sequence with same intervals

=> \d test_table;
           Table "public.test_table"
   Column    |  Type   | Collation | Nullable | Default
-------------+---------+-----------+----------+---------
 timestamp   | bigint  |           |          |
 source      | inet    |           |          |
 destination | inet    |           |          |
 type        | integer |           |          |
Let's say there's a table with the schema above that contains network connection information, with many more IP pairs and types than shown below.
rawdb=> select * from test_table order by timestamp;
 timestamp  |   source    | destination | type
------------+-------------+-------------+------
 1586940900 | 192.168.1.1 | 192.168.1.2 |    1
 1586940960 | 192.168.1.1 | 192.168.1.2 |    1
 1586941020 | 192.168.1.1 | 192.168.1.2 |    1
 1586941080 | 192.168.1.1 | 192.168.1.2 |    1
 1586941140 | 192.168.1.1 | 192.168.1.2 |    1
(5 rows)
Out of the table, I need to find the connection pairs that keep connecting at some fixed interval x. For example, in the rows above, the connection between the IP pair 192.168.1.1 and 192.168.1.2 happens at 60-second intervals.
The table above would be the answer to the question "how many ip pairs are connecting every 60s over the last 5 min?"
Question
How do I extract those periodic connections for various intervals, for the same type and same IP pair? For example, connections that repeat every 1 minute, every 5 minutes, or every 30 minutes.
The baseline is that I can provide the x to search for (e.g. every 60s, every 5 min, every 1 hour, etc.); the best-case solution would find x without it being provided.
The result format I need is the same as the table above.
Is it possible to do all of this in SQL? I have done some research on gap analysis on Postgres tables, but this is not about finding the gaps, rather the continuous sequence.
Not sure exactly what output you are after, but the difference between two connections of the same source/destination combination can be calculated using a window function.
Something like:
select distinct source, destination
from (
  select *,
         lead("timestamp") over w - "timestamp" as diff
  from test_table
  window w as (partition by source, destination order by "timestamp")
) t
where diff = 60
lead("timestamp") over w - "timestamp" calculates the difference between the current row's timestamp and the next one for the same source/destination pair. I moved the window definition into the FROM clause to make the expression that calculates the diff more readable.

MSSQL Sum of values referenced by another table

I'm attempting to create a report on the total money spent per day.
In the database there are these two tables. They are matched using a "UID" generated at creation.
I've created this query but it results in duplicate dates.
SELECT LEFT(f.timestamp, 10) timestamp, SUM(s.Total) Total
FROM dbo.purchasing AS f
JOIN (SELECT uid, SUM(CONVERT(DECIMAL(18,2), (CONVERT(DECIMAL(18,4), qty) * price))) Total
      FROM dbo.purchasingitems
      GROUP BY uid) AS s
  ON f.uid = s.uid
GROUP BY TIMESTAMP
purchasing:
+--+---------+------------+--------+---+
|ID|   UID   | timestamp  | contact|...|
+--+---------+------------+--------+---+
| 1|abr92nas9| 01/01/2018 | ROB    |...|
| 2|nsa93m187| 02/02/2018 | ROB    |...|
+--+---------+------------+--------+---+
purchasingitems:
+--+---------+-----+--------+---+
|ID|   UID   | QTY | Price  |...|
+--+---------+-----+--------+---+
| 1|abr92nas9| 20  | 0.2435 |...|
| 2|abr92nas9| 5   | 0.5    |...|
| 3|nsa93m187| 1   | 100    |...|
| 4|nsa93m187| 4   | 15.5   |...|
+--+---------+-----+--------+---+
You need to group by the expression:
SELECT LEFT(f.timestamp, 10) as timestamp, sum(s.Total) as Total
FROM dbo.purchasing f JOIN
     (SELECT uid, SUM(CONVERT(DECIMAL(18,2), (CONVERT(DECIMAL(18,4), qty) * price))) as Total
      FROM dbo.purchasingitems
      GROUP BY uid
     ) s
     ON f.uid = s.uid
GROUP BY LEFT(f.timestamp, 10);
Notes:
You should not be storing date/time values as strings (unless you have a really good reason). If timestamp is a date, you should use cast(timestamp as date).
You should not be using string functions on date/times.
timestamp is a keyword in SQL Server (although not reserved), so it is not a good choice for a column name.
Your problem is that you think that GROUP BY timestamp refers to the expression in the SELECT. SQL Server does not allow column aliases in GROUP BY, so it can only refer to the column of that name.
I don't see a reason to convert to decimal for the multiplication. You might have a good reason.
You probably want order by as well, to ensure that the result set is in a sensible order.
The data you posted does NOT produce duplicates.
There is no reason for the subquery:
SELECT LEFT(f.timestamp, 10) timestamp,
       SUM(CONVERT(DECIMAL(18,2), (CONVERT(DECIMAL(18,4), s.qty) * s.price))) Total
FROM dbo.purchasing AS f
JOIN dbo.purchasingitems s
  ON f.uid = s.uid
GROUP BY f.timestamp
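Following the notes on the first answer, if timestamp actually holds datetime values rather than strings (an assumption; the sample only shows dates), grouping on a date cast avoids string functions entirely. A sketch:

-- Sketch: group per calendar day via CAST(... AS date) instead of LEFT()
SELECT CAST(f.timestamp AS date) AS purchase_date,
       SUM(CONVERT(DECIMAL(18,2), CONVERT(DECIMAL(18,4), s.qty) * s.price)) AS Total
FROM dbo.purchasing AS f
JOIN dbo.purchasingitems AS s
  ON f.uid = s.uid
GROUP BY CAST(f.timestamp AS date)
ORDER BY purchase_date;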

Insert dates between Date Ranges in Postgresql

I've been searching for quite a while and can't find an answer to this question:
I have a PostgreSQL table that is staged the following way:
Start Date | End Date   | Name    | Team
-----------+------------+---------+------
2017-10-01 | 2017-10-10 | Person  | 1
And what I would like is to have a row for each day between the start date and end date, with the corresponding name and team of the person:
Date        | Name    | Team
------------+---------+-------
2017-10-01  | Person  | 1
------------+---------+-------
2017-10-02  | Person  | 1
------------+---------+-------
2017-10-03  | Person  | 1
Is it even possible to do this with PostgreSQL? I'm currently running PostgreSQL 9.3.
You can use generate_series() for that:
select t.dt::date, p.name, p.team
from person p, generate_series(p.start_date, p.end_date, interval '1' day) as t(dt)
order by t.dt::date;
I don't have 9.3 around any more, but I think that should also work with that old version.
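For reference, a self-contained sketch of the same idea (the setup below just mirrors the question; table and column names are assumptions):

-- Hypothetical table matching the question's layout
create table person (start_date date, end_date date, name text, team int);
insert into person values ('2017-10-01', '2017-10-10', 'Person', 1);

-- Same query, written with an explicit LATERAL join (available since 9.3)
select t.dt::date as "Date", p.name, p.team
from person p
cross join lateral generate_series(p.start_date, p.end_date, interval '1 day') as t(dt)
order by t.dt::date;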

Django returns wrong results when selecting from a postgres view

I have a view defined in Postgres, in a separate schema from the data it uses.
It contains three columns:
mydb=# \d "my_views"."results"
      View "my_views.results"
  Column   |         Type          | Modifiers
-----------+-----------------------+-----------
 Date      | date                  |
 Something | character varying(60) |
 Result    | numeric               |
When I query it from psql or adminer, I get results like these:
bb_adminpanel=# select * from "my_views"."results";
    Date    |          Something          |    Result
------------+-----------------------------+--------------
 2015-09-14 | Foo                         |  -3.36000000
 2015-09-14 | Bar                         | -16.34000000
 2015-09-12 | Foo                         | -11.55000000
 2015-09-12 | Bar                         |  11.76000000
 2015-09-11 | Bar                         |   2.48000000
However, querying it through django, I get a different set:
(c is a cursor object on the database)
c.execute('SELECT * from "my_views"."results"')
c.fetchall()
[(datetime.date(2015, 9, 14), 'foo', Decimal('-3.36000000')),
(datetime.date(2015, 9, 14), 'bar', Decimal('-16.34000000')),
(datetime.date(2015, 9, 11), 'foo', Decimal('-11.55000000')),
(datetime.date(2015, 9, 11), 'bar', Decimal('14.24000000'))]
This doesn't match at all: the first two rows are correct, but the last two are really weird. They have a shifted date, and the Result of the last record is the sum of the last two rows from the psql output.
I have no idea why that's happening, any suggestions welcome.
Here is the view definition:
SELECT a."Timestamp"::date AS "Date",
a."Something",
sum(a."x") AS "Result"
FROM my_views.another_view a
WHERE a.status::text = ANY (ARRAY['DONE'::character varying::text, 'CLOSED'::character varying::text])
GROUP BY a."Timestamp"::date, a."Something"
ORDER BY a."Timestamp"::date DESC;
and "another_view" looks like this:
          Column           |           Type           | Modifiers
---------------------------+--------------------------+-----------
 Timestamp                 | timestamp with time zone |
 Something                 | character varying(60)    |
 x                         | numeric                  |
 status                    | character varying(100)   |
(some columns omitted)
The simple explanation of the problem is: timezones.
In detail: you're not declaring any timezone setting when connecting from the PostgreSQL console, but Django does on each connection. That way, the timestamp of some records will fall on a different day depending on the timezone used. For example, with this data:
+-------------------------+-----------+-------+--------+
| timestamp               | something | x     | status |
+-------------------------+-----------+-------+--------+
| 2015-09-11 12:00:00 UTC | foo       |  2.48 | DONE   |
| 2015-09-12 00:50:00 UTC | foo       | 11.76 | DONE   |
+-------------------------+-----------+-------+--------+
A query on your view executed with timezone UTC will give you two rows, but a query executed with timezone GMT-2 will give you only one row, because in GMT-2 the timestamp of the second row still falls on 2015-09-11.
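A quick way to confirm this is to compare the session timezone in the two clients. Django sets the connection's timezone itself (UTC when USE_TZ is enabled, otherwise settings.TIME_ZONE), while psql uses the server or client default. For example, in psql:

-- Show which timezone the "Timestamp"::date cast is currently using
SHOW TimeZone;

-- Reproduce what Django sees by matching its connection timezone
-- (assuming Django's default of UTC; adjust if your settings differ)
SET TIME ZONE 'UTC';
SELECT * FROM "my_views"."results";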
To fix that, you can edit your view so that it always groups days according to a specified timezone:
SELECT (a."Timestamp" AT TIME ZONE 'UTC')::date AS "Date",
a."Something",
sum(a."x") AS "Result"
FROM my_views.another_view a
WHERE a.status::text = ANY (ARRAY['DONE'::character varying::text, 'CLOSED'::character varying::text])
GROUP BY (a."Timestamp" AT TIME ZONE 'UTC'), a."Something"
ORDER BY (a."Timestamp" AT TIME ZONE 'UTC') DESC;
That way, days will always be counted according to the 'UTC' timezone.