I need to build a report from our ticket database showing the number of tickets closed per day by each tech. My SQL query looks something like this:
select
i.DT_CLOSED,
rp.NAME
from INCIDENTS i
join REPS rp on (rp.ID = i.id_assignee)
where i.DT_CLOSED > #StartDate
DT_CLOSED is the date and time in ISO format, and NAME is the rep name. I also have a calculated field in my dataset called TICKETSDAY, calculated as =DateValue(Fields!DT_CLOSED.Value), which gives me the day without the time.
Right now I have a table set up that is grouped by NAME, then by TICKETSDAY, and I would like the last column to be a count of how many tickets there are. But when I set the last column to =Count(DT_CLOSED) it lists a 1 on each row for each ticket instead of aggregating, so my table looks like this:
┌───────────┬───────────┬──────────────┐
│Name │Day │Tickets Closed│
├───────────┼───────────┼──────────────┤
│JOHN SMITH │11/01/2013 │ 1│
│ │ ├──────────────┤
│ │ │ 1│
│ │ ├──────────────┤
│ │ │ 1│
│ ├───────────┼──────────────┤
│ │11/02/2013 │ 1│
└───────────┴───────────┴──────────────┘
And I need it to be:
┌───────────┬───────────┬──────────────┐
│Name │Day │Tickets Closed│
├───────────┼───────────┼──────────────┤
│JOHN SMITH │11/01/2013 │ 3│
│ ├───────────┼──────────────┤
│ │11/02/2013 │ 1│
└───────────┴───────────┴──────────────┘
Any idea what I'm doing wrong? Any help would be greatly appreciated.
I believe that Marc B is correct. You need to group by the non-aggregate columns in your select statement. Try something along these lines:
select
i.DT_CLOSED,
rp.NAME,
COUNT(i.ID) as TICKETS_CLOSED
from INCIDENTS i
join REPS rp on (rp.ID = i.id_assignee)
where i.DT_CLOSED > #StartDate
GROUP BY rp.NAME, i.DT_CLOSED
Without a GROUP BY to aggregate your rows, the query returns every row individually, so each one counts as 1. I'm unfamiliar with how Report Builder works, but try adding the GROUP BY clause manually and see what you get.
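One thing to watch: DT_CLOSED includes the time, so grouping on it directly will still give one row per ticket. If you want the counts rolled up per day in SQL itself (the same thing your TICKETSDAY expression does), you could group on just the date part. A rough sketch, assuming your database supports CAST(... AS DATE) (SQL Server does):
select
cast(i.DT_CLOSED as date) as CLOSED_DAY,
rp.NAME,
COUNT(i.ID) as TICKETS_CLOSED
from INCIDENTS i
join REPS rp on (rp.ID = i.id_assignee)
where i.DT_CLOSED > #StartDate
GROUP BY rp.NAME, cast(i.DT_CLOSED as date)
With the data already aggregated per rep and per day, the report only has to display the rows.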
Let me know if I can clarify anything.
Related
I used to use Google BigQuery and would select from multiple wildcard tables with a query like this:
SELECT *
FROM project.dataset.events_*
WHERE _TABLE_SUFFIX BETWEEN "20220704" AND "20220731"
and it selects all tables between these two dates.
Is it possible in ClickHouse to query multiple tables with _TABLE_SUFFIX or some analog of it, if I only have a bunch of tables like
1. events_20210501
2. events_20210502
3. events_20210503
...
with table engine ReplicatedMergeTree?
Is it possible to create a wildcard analog in ClickHouse?
Yes, see the Merge table engine and the merge table function:
https://clickhouse.com/docs/en/engines/table-engines/special/merge
https://clickhouse.com/docs/en/sql-reference/table-functions/merge
create table A1 Engine=Memory as select 1 a;
create table A2 Engine=Memory as select 2 a;
select * from merge(currentDatabase(), '^A.$');
┌─a─┐
│ 1 │
└───┘
┌─a─┐
│ 2 │
└───┘
select * from merge(currentDatabase(), 'A');
┌─a─┐
│ 1 │
└───┘
┌─a─┐
│ 2 │
└───┘
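To get something closer to BigQuery's _TABLE_SUFFIX range filter, you can combine merge() with its _table virtual column, which holds the name of the table each row came from. A sketch, assuming the events_YYYYMMDD naming from the question:
select *
from merge(currentDatabase(), '^events_[0-9]{8}$')
where _table between 'events_20210501' and 'events_20210503';
Because the suffix is a zero-padded date, a plain string comparison on _table behaves like the BETWEEN on _TABLE_SUFFIX.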
I am trying to calculate the time difference in milliseconds between the timestamps of two tables, like this:
SELECT value, (table1.time - table2.time) AS time_delta
but I get an error:
Illegal types DateTime64(9) and DateTime64(9) of arguments of function minus
so I can't subtract DateTime64 values in ClickHouse.
The second way I tried was dateDiff, but that function is limited to seconds, and I need milliseconds.
This is supported, but I get zeros in the diff, because the difference is too small (a few milliseconds):
SELECT value, dateDiff(SECOND , table1.time, table2.platform_time) AS time_delta
This is not supported:
SELECT value, dateDiff(MILLISECOND , table1.time, table2.time) AS time_delta
What's a better way to resolve my problem?
P.S. I also tried converting the values to Float64; it works, but looks strange:
SELECT value, (toFloat64(table1.time) - toFloat64(table2.time)) AS time_delta
As a result I get something like this:
value        time_delta
51167477     -0.10901069641113281
#ditrauth Try casting to Float64, as the subsecond portion that you are looking for is stored as a decimal. Also, you want DateTime64(3) for milliseconds; see the docs and the example below:
CREATE TABLE dt(
start DateTime64(3, 'Asia/Istanbul'),
end DateTime64(3, 'Asia/Istanbul')
)
ENGINE = MergeTree ORDER BY end
insert into dt values (1546300800123, 1546300800125),
(1546300800123, 1546300800133)
SELECT
start,
CAST(start, 'Float64'),
end,
CAST(end, 'Float64'),
CAST(end, 'Float64') - CAST(start, 'Float64') AS diff
FROM dt
┌───────────────────start─┬─CAST(start, 'Float64')─┬─────────────────────end─┬─CAST(end, 'Float64')─┬─────────────────diff─┐
│ 2019-01-01 03:00:00.123 │ 1546300800.123 │ 2019-01-01 03:00:00.125 │ 1546300800.125 │ 0.002000093460083008 │
│ 2019-01-01 03:00:00.123 │ 1546300800.123 │ 2019-01-01 03:00:00.133 │ 1546300800.133 │ 0.009999990463256836 │
└─────────────────────────┴────────────────────────┴─────────────────────────┴──────────────────────┴──────────────────────┘
2 rows in set. Elapsed: 0.001 sec.
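If you would rather get an integer number of milliseconds instead of a float, another option (assuming your ClickHouse version has it) is toUnixTimestamp64Milli, which converts a DateTime64 to milliseconds since the epoch before subtracting:
SELECT
start,
end,
toUnixTimestamp64Milli(end) - toUnixTimestamp64Milli(start) AS diff_ms
FROM dt
For the two rows above this returns 2 and 10.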
I have two columns as below:
names: Array(String)
['name_one','name_2','name3']
values:Array(Float64)
[1000,2000,3000]
For example, I am interested in getting the value of 'name_2'. I want to retrieve 2000.
My guess is that I should first identify the location of 'name_2' in names, and then use it to retrieve the value in column values?
Would you use JSON to get to the solution?
P.S. I have just started to learn SQL and am only familiar with the basics at the moment. I have read some documentation but I am really struggling with this one (I keep getting errors).
I am using ClickHouse.
Thanks for the help!
If you need to extract the value for multiple occurrences of the name:
SELECT arrayFilter((x, y) -> (y = 'name_2'), values, names)
FROM
(
SELECT
1 AS id,
['name_one', 'name_2', 'name3', 'name_2'] AS names,
[1000, 2000, 3000, 4000] AS values
)
┌─arrayFilter(lambda(tuple(x, y), equals(y, 'name_2')), values, names)─┐
│ [2000,4000] │
└──────────────────────────────────────────────────────────────────────┘
If there is only a single occurrence:
SELECT values[indexOf(names, 'name_2')]
FROM
(
SELECT
1 AS id,
['name_one', 'name_2', 'name3'] AS names,
[1000, 2000, 3000] AS values
)
┌─arrayElement(values, indexOf(names, 'name_2'))─┐
│ 2000 │
└────────────────────────────────────────────────┘
Consider using the arrayZip function:
SELECT
arrayZip(names, values) AS zipped,
zipped[2] AS second_pair,
second_pair.1 AS second_name,
second_pair.2 AS second_value
FROM
(
SELECT
['name_one', 'name_2', 'name3'] AS names,
[1000, 2000, 3000] AS values
)
/*
┌─zipped─────────────────────────────────────────────┬─second_pair─────┬─second_name─┬─second_value─┐
│ [('name_one',1000),('name_2',2000),('name3',3000)] │ ('name_2',2000) │ name_2 │ 2000 │
└────────────────────────────────────────────────────┴─────────────────┴─────────────┴──────────────┘
*/
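If you would rather not hard-code position 2, the zipped array can also be indexed by name, reusing indexOf from the earlier answer (a small sketch over the same sample data):
SELECT
arrayZip(names, values)[indexOf(names, 'name_2')] AS pair,
pair.2 AS value_of_name_2
FROM
(
SELECT
['name_one', 'name_2', 'name3'] AS names,
[1000, 2000, 3000] AS values
)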
The ARRAY JOIN clause can probably be useful too:
SELECT *
FROM
(
SELECT
1 AS id,
['name_one', 'name_2', 'name3'] AS names,
[1000, 2000, 3000] AS values
)
ARRAY JOIN
names,
values
/*
┌─id─┬─names────┬─values─┐
│ 1 │ name_one │ 1000 │
│ 1 │ name_2 │ 2000 │
│ 1 │ name3 │ 3000 │
└────┴──────────┴────────┘
*/
Look at Nested Data Structures to store paired values.
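A minimal sketch of that idea, with hypothetical table and column names: a Nested column is stored as parallel arrays, so the lookup is the same indexOf pattern as above.
CREATE TABLE named_values
(
id UInt32,
pairs Nested(name String, value Float64)
)
ENGINE = MergeTree ORDER BY id;

INSERT INTO named_values VALUES (1, ['name_one', 'name_2', 'name3'], [1000, 2000, 3000]);

SELECT pairs.value[indexOf(pairs.name, 'name_2')] AS value_of_name_2
FROM named_values;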
ClickHouse:
┌─name──────────┬─type──────────┐
│ FieldUUID │ UUID │
│ EventDate │ Date │
│ EventDateTime │ DateTime │
│ Metric │ String │
│ LabelNames │ Array(String) │
│ LabelValues │ Array(String) │
│ Value │ Float64 │
└───────────────┴───────────────┘
Row 1:
──────
FieldUUID: 499ca963-2bd4-4c94-bc60-e60757ccaf6b
EventDate: 2021-05-13
EventDateTime: 2021-05-13 09:24:18
Metric: cluster_cm_agent_physical_memory_used
LabelNames: ['host']
LabelValues: ['test01']
Value: 104189952
Grafana:
SELECT
EventDateTime,
Value AS cluster_cm_agent_physical_memory_used
FROM
$table
WHERE
Metric = 'cluster_cm_agent_physical_memory_used'
AND $timeFilter
ORDER BY
EventDateTime
No data points.
Question: is this the correct way to use it via Grafana?
Example:
cluster_cm_agent_physical_memory_used{host='test01'} 104189952
Grafana expects your SQL to return data in a "time series" format for most visualizations:
1. one column of type DateTime, Date, DateTime64, or UInt32 that describes the timestamp
2. one or several columns with numeric types (Float, Int*, UInt*) holding the metric values (each column name is used as a time series name)
3. optionally, one String column that can describe multiple time series names
There is also an advanced "time series" format, where the first column is the timestamp and the second column is Array(Tuple(String, Numeric)); the String element is used as the time series name (this is usually built with an aggregate such as groupArray).
So, select metrics.shell as the table and EventDateTime as the date/time field in the drop-downs in the query editor; your query could then look like this:
SELECT
EventDateTime,
Value AS cluster_cm_agent_physical_memory_used
FROM
$table
WHERE
Metric = 'cluster_cm_agent_physical_memory_used'
AND $timeFilter
ORDER BY
EventDateTime
The SQL query from your post can be visualized without changes only with the Table plugin, and there you should switch the format from "time series" to "table" so that the data is transformed properly on the Grafana side.
The analog of the PromQL query
cluster_cm_agent_physical_memory_used{host='test01'} 104189952
should look like
SELECT
EventDateTime,
Value AS cluster_cm_agent_physical_memory_used
FROM
$table
WHERE
Metric = 'cluster_cm_agent_physical_memory_used'
AND LabelValues[indexOf(LabelNames,'host')] = 'test01'
AND $timeFilter
ORDER BY
EventDateTime
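If you need one panel to show several hosts at once, the advanced "time series" format mentioned above can be built with groupArray. A hypothetical sketch (it assumes at most one Value per host per EventDateTime):
SELECT
EventDateTime,
groupArray((LabelValues[indexOf(LabelNames, 'host')], Value)) AS series
FROM
$table
WHERE
Metric = 'cluster_cm_agent_physical_memory_used'
AND $timeFilter
GROUP BY
EventDateTime
ORDER BY
EventDateTime
Here each tuple pairs the host label with its value, so Grafana draws one series per host.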
I am working in Postgres 9.4. I have a table containing medications, as follows:
bnf_code │ character varying(15) │ not null
pills_per_day │ double precision │
For example, this table might contain a medication with code of 04030201, with a recommended pills per day of 4, and one with code 04030202 and recommended pills per day of 2.
And I also have a table containing numbers of prescriptions, with a foreign key to the table above:
code │ character varying(15) │ not null
num_pills │ double precision │ not null
processing_date │ date │ not null
practice_id │ character varying(6) │ not null
Foreign-key constraints:
FOREIGN KEY (code) REFERENCES medications(bnf_code) DEFERRABLE INITIALLY DEFERRED
Now I need to work out how many daily doses were prescribed for all codes starting 0403. The daily dose is defined as the number of pills actually prescribed, divided by the recommended pills per day.
I know how to do this for the two particular codes above:
SELECT (SUM(num_pills) FILTER (WHERE code='04030201') / 4) +
(SUM(num_pills) FILTER (WHERE code='04030202') / 2)
FROM prescriptions
But that's only because I can hard-code the pills-per-day values.
Can I extend this to divide by the appropriate pills_per_day for all codes starting 0403? There might be several hundred, but I'd prefer to use a single SQL query if possible.
I am not sure if this is what you are looking for:
SELECT SUM(p.num_pills / m.pills_per_day) AS daily_doses
FROM prescriptions p
INNER JOIN medications m ON p.code = m.bnf_code
WHERE p.code LIKE '0403%'
I am assuming that num_pills is the number of pills prescribed and pills_per_day is the recommended number of pills per day.
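If you also want to see how much each code contributes to the total, the same join can be grouped per code (a sketch under the same assumption that the pills_per_day column lives in the medications table):
SELECT m.bnf_code,
       SUM(p.num_pills / m.pills_per_day) AS daily_doses
FROM prescriptions p
INNER JOIN medications m ON p.code = m.bnf_code
WHERE p.code LIKE '0403%'
GROUP BY m.bnf_code
ORDER BY m.bnf_code;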