Select a range of TimeStamp in SQL

Please help me here.
SELECT TOP 200 [TimeStamp]
,[Id]
,[Serial]
,[Server]
,[Message]
,[Station]
,ISNULL([P1],'Active Directory') as 'Category'
,ISNULL([P2],'Item Bold') as 'ItemName'
FROM [data].[dbo].[Message]
WHERE TimeStamp >= '2017-11-13' AND TimeStamp <= '2017-12-30'
ORDER BY TimeStamp Desc
I am trying to get data in a specific range of "TimeStamp". I have a UI where the user can select two timestamps to define the range (see code). But my problem is that for a specific TimeStamp there is a lot of identical data. For example, "2017-12-30" has 5 entries, but they differ in terms of "Category".
Now my question is: how would I know which row the user actually picked from "TimeStamp", given that the timestamps are identical?

DATE(expr) extracts the date part of the date or datetime expression expr, but it is a MySQL function. Your query uses SQL Server syntax (TOP, ISNULL), and SQL Server has no DATE() function, so cast the column to date instead. Either way, comparing on the date part keeps every row on the boundary days regardless of its time component:
SELECT TOP 200 [TimeStamp]
,[Id]
,[Serial]
,[Server]
,[Message]
,[Station]
,ISNULL([P1],'Active Directory') as 'Category'
,ISNULL([P2],'Item Bold') as 'ItemName'
FROM [data].[dbo].[Message]
WHERE CAST(TimeStamp AS date) >= '2017-11-13' AND CAST(TimeStamp AS date) <= '2017-12-30'
ORDER BY TimeStamp Desc
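The difference between the two predicates shows up at the range's upper boundary. Here is a minimal sqlite3 sketch (table name and rows are made up; sqlite's date() stands in for the date-extraction of your dialect):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Message (TimeStamp TEXT)")
con.executemany("INSERT INTO Message VALUES (?)", [
    ("2017-11-13 08:00:00",),
    ("2017-12-30 00:00:00",),   # rows on the boundary day
    ("2017-12-30 17:45:00",),
])

# Raw comparison: any '2017-12-30 hh:mm:ss' string sorts after '2017-12-30',
# so every row on the boundary day is lost.
raw = con.execute("""
    SELECT COUNT(*) FROM Message
    WHERE TimeStamp >= '2017-11-13' AND TimeStamp <= '2017-12-30'
""").fetchone()[0]

# Comparing only the date part keeps the whole boundary day.
by_date = con.execute("""
    SELECT COUNT(*) FROM Message
    WHERE date(TimeStamp) >= '2017-11-13' AND date(TimeStamp) <= '2017-12-30'
""").fetchone()[0]
```

With these three rows, the raw comparison matches only the November row, while the date-part comparison matches all three.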

Related

Get totals from difference between rows

I have a table, with the following structure:
(
id SERIAL PRIMARY KEY,
user_id integer NOT NULL REFERENCES user(id) ON UPDATE CASCADE,
status text NOT NULL,
created_at timestamp with time zone NOT NULL,
updated_at timestamp with time zone NOT NULL
)
Example data:
"id","user_id","status","created_at","updated_at"
416,38,"ONLINE","2018-08-07 14:40:51.813+00","2018-08-07 14:40:51.813+00"
417,39,"ONLINE","2018-08-07 14:45:00.717+00","2018-08-07 14:45:00.717+00"
418,38,"OFFLINE","2018-08-07 15:43:22.678+00","2018-08-07 15:43:22.678+00"
419,38,"ONLINE","2018-08-07 16:21:30.725+00","2018-08-07 16:21:30.725+00"
420,38,"OFFLINE","2018-08-07 16:49:10.3+00","2018-08-07 16:49:10.3+00"
421,38,"ONLINE","2018-08-08 11:37:53.639+00","2018-08-08 11:37:53.639+00"
422,38,"OFFLINE","2018-08-08 12:29:08.234+00","2018-08-08 12:29:08.234+00"
423,39,"ONLINE","2018-08-14 15:22:00.539+00","2018-08-14 15:22:00.539+00"
424,39,"OFFLINE","2018-08-14 15:22:02.092+00","2018-08-14 15:22:02.092+00"
When a user on my application goes online, a new row is inserted with status ONLINE. When they go offline, a row with status OFFLINE is inserted. There are other entries created to record different events, but for this query only OFFLINE and ONLINE are important.
I want to produce a chart, showing the total number of users online over a time period (e.g 5 minutes), within a date range. If a user is online for any part of that period they should be counted.
Example:
datetime, count
2019-05-22T12:00:00+0000, 53
2019-05-22T12:05:00+0000, 47
2019-05-22T12:10:00+0000, 49
2019-05-22T12:15:00+0000, 55
2019-05-22T12:20:00+0000, 59
2019-05-22T12:25:00+0000, 56
I'm able to produce a similar chart for an individual user by fetching all status rows within the date range then processing manually, however this approach won't scale to all users.
I believe something like this could be accomplished with window functions, but I'm not really sure where to start
As your question is very vague, nobody can really help you 100%. Still, you can probably achieve what you want with a combination of "with" clauses and window functions. With the "with" clause you can easily break a big problem down into small parts. Maybe the following query (which ignores performance entirely) may help; replace public.tbl_test with your table:
with temp_online as (
    select *
    from public.tbl_test
    where public.tbl_test.status ilike 'online'
    order by created_at
),
temp_offline as (
    select *
    from public.tbl_test
    where public.tbl_test.status ilike 'offline'
    order by created_at
),
temp_change as (
    select *,
        (
            select temp_offline.created_at
            from temp_offline
            where temp_offline.created_at > temp_online.created_at
              and temp_offline.user_id = temp_online.user_id
            order by created_at asc
            limit 1
        ) as go_offline
    from temp_online
),
temp_result as (
    select *,
        go_offline - created_at as online_duration
    from temp_change
),
temp_series as (
    select (generate_series || ' minute')::interval + '2019-05-22 00:00:00'::timestamp as temp_date
    from generate_series(0, 1440, 5)
)
select
    temp_series.temp_date,
    (
        select count(*)
        from temp_result
        where temp_result.created_at <= temp_series.temp_date
          and temp_result.go_offline >= temp_series.temp_date
    ) as count_users
from temp_series
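The approach above (pair each ONLINE row with the next OFFLINE row, then count interval overlaps per bucket) can be cross-checked in a self-contained sqlite3 sketch. The table and rows are made up from the question's schema, and a recursive CTE stands in for Postgres's generate_series:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE events (user_id INT, status TEXT, created_at TEXT);
INSERT INTO events VALUES
 (38, 'ONLINE',  '2018-08-07 14:40:51'),
 (38, 'OFFLINE', '2018-08-07 15:43:22'),
 (39, 'ONLINE',  '2018-08-07 14:45:00'),
 (39, 'OFFLINE', '2018-08-07 16:00:00');
""")

rows = con.execute("""
WITH RECURSIVE
online AS (                      -- pair each ONLINE row with the next OFFLINE row
  SELECT e.user_id, e.created_at AS went_online,
         (SELECT MIN(o.created_at) FROM events o
          WHERE o.user_id = e.user_id AND o.status = 'OFFLINE'
            AND o.created_at > e.created_at) AS went_offline
  FROM events e
  WHERE e.status = 'ONLINE'
),
buckets(t) AS (                  -- 5-minute grid over the window of interest
  SELECT '2018-08-07 14:40:00'
  UNION ALL
  SELECT datetime(t, '+5 minutes') FROM buckets
  WHERE t < '2018-08-07 16:00:00'
)
SELECT b.t,
       (SELECT COUNT(*) FROM online s       -- session overlaps [t, t+5min]
        WHERE s.went_online <= datetime(b.t, '+5 minutes')
          AND s.went_offline >= b.t) AS online_users
FROM buckets b
""").fetchall()
```

Both users overlap the first bucket, and only user 39 is still online in the last one. Note the overlap test uses closed bounds, so a user counts if they are online for any part of the bucket, as the question asks.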

Analytics in sql

I have a table with the following structure:
user_id (int) - event (str) - time (timestamp) - value (int)
Event can take several values: install, login, buy, etc.
I need to get all user records from before they updated the application.
For example, the release date of my application's new version is 1 January 2019, but users may install the new version on any day.
How can I get sum(value) for the first and second versions?
I tried self-joining the table, but I don't think that is the best solution.
Help me, please.
Here is the definition of your table (as I understood it from your comments and description):
CREATE TABLE user_events (
user_id integer,
event varchar,
time timestamp without time zone,
value integer
);
Here is the query you asked for:
SELECT
COUNT(user_id),
SUM(value)
FROM (
SELECT
DISTINCT ON (user_id)
user_id,time,value
FROM user_events
WHERE event='install'
ORDER BY user_id, time DESC
) last_installations
WHERE
time BETWEEN date '2018-01-01' AND date '2019-01-01';
Some explanations:
the inner query (last_installations) selects the last install event for each user
the outer query keeps only installations of the first and second versions, and calculates SUM(value) (as you asked) and COUNT(user_id) (which I added for clarity: how many users are on versions 1 and 2 now)
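Postgres's DISTINCT ON is not portable; the same "last install per user" selection can be written with ROW_NUMBER(), which works in most databases. A sqlite3 sketch with made-up rows:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE user_events (user_id INT, event TEXT, time TEXT, value INT)")
con.executemany("INSERT INTO user_events VALUES (?,?,?,?)", [
    (1, "install", "2018-03-01", 10),
    (1, "install", "2018-09-15", 20),   # user 1's latest install
    (2, "install", "2017-06-01", 5),
    (1, "login",   "2018-09-16", 0),    # non-install events are filtered out
])

# ROW_NUMBER() ranks each user's installs newest-first; rn = 1 keeps the latest,
# mirroring DISTINCT ON (user_id) ... ORDER BY user_id, time DESC.
last = con.execute("""
    SELECT user_id, time, value
    FROM (
        SELECT user_id, time, value,
               ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY time DESC) AS rn
        FROM user_events
        WHERE event = 'install'
    )
    WHERE rn = 1
    ORDER BY user_id
""").fetchall()
```

This keeps exactly one row per user: the most recent install.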
UPDATE
sum value for all events by version
SELECT
event,
CASE
WHEN time BETWEEN date '2018-01-01' AND timestamp '2018-05-30 23:59:59' THEN 1
WHEN time BETWEEN date '2018-06-01' AND timestamp '2018-12-31 23:59:59' THEN 2
WHEN time >= date '2019-01-01' THEN 3
ELSE 0 -- unknown version
END AS version,
SUM(value)
FROM user_events
GROUP BY 1,2

Using date_add function with a timestamp in bigquery results in null for output

I'm trying to use the date_add function with a timestamp in bigquery, but I'm getting 'null' as a result from the output. I've used date_add successfully before, so I don't understand what the problem is. Here's a bit of code.
SELECT
userId,
MAX(most_recent_session) most_recent_session,
date_add(MAX(most_recent_session), 24, 'HOUR') as added_a_day
FROM
(
SELECT
userId,
LAG(time, 0) OVER (PARTITION BY userId ORDER BY time) as most_recent_session,
LAG(time, 1) OVER (PARTITION BY userId ORDER BY time) as previous_session
FROM TABLE_DATE_RANGE(dataset.tablename_, TIMESTAMP(DATE_ADD(CURRENT_TIMESTAMP(), -30, "DAY")), CURRENT_TIMESTAMP())
GROUP BY
userId,
time
)
group by
userID
So what I would expect to get out would be three columns, the first containing userId, the second containing a time stamp for that users most recent session, and then a third with 24 hours added on to it. But in the third column instead of getting the value in the 2nd column with 24 hours added on to it, I get 'null'.
Any thoughts?
I figured out the solution to the problem. You need to wrap the 'most_recent_session' in the outer level of the SQL with a USEC_TO_TIMESTAMP function. That struck me as odd, because BigQuery recognized the field as a timestamp, but it works.

How to use window functions to get meterics for today, last 7 days, last 30 days for each value of the date?

My problem seems simple on paper:
For a given date, give me active users for that given date, active users in given_Date()-7, active users in a given_Date()-30
i.e. sample data.
"timestamp" "user_public_id"
"23-Sep-15" "805a47023fa611e58ebb22000b680490"
"28-Sep-15" "d842b5bc5b1711e5a84322000b680490"
"01-Oct-15" "ac6b5f70b95911e0ac5312313d06dad5"
"21-Oct-15" "8c3e91e2749f11e296bb12313d086540"
"29-Nov-15" "b144298810ee11e4a3091231390eb251"
For 01-Oct the count for today would be 1, last_7_days would be 3, and last_30_days would be 3+n (where n is the count of user_ids that fall on dates preceding Oct 1st within a 30-day window).
I am on redshift amazon. Can somebody provide a sample sql to help me get started?
The output should look like this:
"timestamp" "users_today", "users_last_7_days", "users_30_days"
"01-Oct-15" 1 3 (3+n)
I know asking for help with incomplete solutions is frowned upon, but this is not getting any other attention, so I thought I would do my bit.
I have been pulling my hair out trying to nut this one out, alas, I am a beginner and something is not clicking for me. Perhaps yourself or others will be able to drastically improve my answer, but I think I am on the right track.
SELECT replace(convert(varchar, [timestamp], 111), '/','-') AS [timestamp], -- to get date in same format as you require
(SELECT COUNT([TIMESTAMP]) FROM #SIMPLE WHERE ([TIMESTAMP]) = ([timestamp])) AS users_today,
(SELECT COUNT([TIMESTAMP]) FROM #SIMPLE WHERE [TIMESTAMP] BETWEEN DATEADD(DY,-7,[TIMESTAMP]) AND [TIMESTAMP]) AS users_last_7_days ,
(SELECT COUNT([TIMESTAMP]) FROM #SIMPLE WHERE [TIMESTAMP] BETWEEN DATEADD(DY,-30,[TIMESTAMP]) AND [timestamp]) AS users_last_30_days
FROM #SIMPLE
GROUP BY [timestamp]
Starting with this:
CREATE TABLE #SIMPLE (
[timestamp] datetime, user_public_id varchar(32)
)
INSERT INTO #SIMPLE
VALUES('23-Sep-15','805a47023fa611e58ebb22000b680490'),
('28-Sep-15','d842b5bc5b1711e5a84322000b680490'),
('01-Oct-15','ac6b5f70b95911e0ac5312313d06dad5'),
('21-Oct-15','8c3e91e2749f11e296bb12313d086540'),
('29-Nov-15','b144298810ee11e4a3091231390eb251')
The problem I am having is that each row contains the same counts, despite my grouping by [timestamp].
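The identical counts come from the correlated subqueries: inside each one, [TIMESTAMP] resolves to the inner #SIMPLE's own column, so every row is compared against itself. Giving the inner table an alias and correlating against the outer row fixes it. A sqlite3 sketch of the corrected shape (column renamed ts, and sqlite's date(..., '-6 days') standing in for DATEADD):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE simple (ts TEXT, user_public_id TEXT)")
con.executemany("INSERT INTO simple VALUES (?,?)", [
    ("2015-09-23", "a"),
    ("2015-09-28", "b"),
    ("2015-10-01", "c"),
    ("2015-10-21", "d"),
    ("2015-11-29", "e"),
])

# outer_t supplies the anchor date; s is a separately aliased scan of the same
# table, so the windows move with each outer row instead of collapsing to one value.
rows = con.execute("""
    SELECT outer_t.ts,
           (SELECT COUNT(*) FROM simple s WHERE s.ts = outer_t.ts) AS users_today,
           (SELECT COUNT(*) FROM simple s
            WHERE s.ts BETWEEN date(outer_t.ts, '-6 days') AND outer_t.ts) AS users_last_7_days,
           (SELECT COUNT(*) FROM simple s
            WHERE s.ts BETWEEN date(outer_t.ts, '-29 days') AND outer_t.ts) AS users_last_30_days
    FROM (SELECT DISTINCT ts FROM simple) AS outer_t
    ORDER BY outer_t.ts
""").fetchall()
```

Now each anchor date gets its own trailing-window counts; for 2015-10-01 the three counts differ, which is exactly what the all-identical version could not produce.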
Step 1-- Create a table which has daily counts.
create temp table daily_mobile_Sessions as
select "timestamp" ,
count(user_public_id) over (partition by "timestamp" ) as "today"
from mobile_sessions
group by 1, mobile_sessions.user_public_id
order by 1 DESC
Step 2 -- From the table above. We create yet another table which can use the "today" field, and we apply the window function to Sum the counts.
select "timestamp", today,
sum(today) over (order by "timestamp" rows between 6 PRECEDING and CURRENT ROW) as "last_7days",
sum(today) over (order by "timestamp" rows between 29 PRECEDING and CURRENT ROW) as "last_30days"
from daily_mobile_Sessions group by "timestamp" , 2 order by 1 desc
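The two-step idea above (daily counts, then a rolling frame) can be sketched with sqlite3's window functions. One caveat worth making explicit: ROWS BETWEEN 6 PRECEDING counts rows, not calendar days, so the result is only a true "last 7 days" when the daily table has exactly one row per day with no gaps. Made-up data, one row per day:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE daily (d TEXT, today INT)")
# ten consecutive days, one session each
con.executemany("INSERT INTO daily VALUES (?,?)",
                [(f"2015-10-{day:02d}", 1) for day in range(1, 11)])

# Rolling 7-row sum over the date-ordered grid; valid as a 7-day sum only
# because the grid is dense (one row per day).
rows = con.execute("""
    SELECT d, today,
           SUM(today) OVER (ORDER BY d ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) AS last_7days
    FROM daily
    ORDER BY d
""").fetchall()
```

The rolling sum ramps up from 1 and plateaus at 7 once a full week of rows is in the frame.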

Efficient way of counting entries in a column or two over a selected time period

I need to list the number of column1 entries that have been added to the database over a selected time period (counting back from the day the list is requested): daily, weekly (last 7 days), monthly (last 30 days) and quarterly (last 3 months). For example, below is the table I created to perform this task.
Column | Type | Modifiers
------------------+-----------------------------+-----------------------------------------------------
column1 character varying (256) not null default nextval
date timestamp without time zone not null default now()
column2 character varying(256) ..........
Now, I need the total count of entries in column1 with respect to the selected time period.
Like,
Column 1 | Date | Coloumn2
------------------+-----------------------------+-----------------------------------------------------
abcdef 2013-05-12 23:03:22.995562 122345rehr566
njhkepr 2013-04-10 21:03:22.337654 45hgjtron
ffb3a36dce315a7 2013-06-14 07:34:59.477735 jkkionmlopp
abcdefgggg 2013-05-12 23:03:22.788888 22345rehr566
From the above data, for the daily time period the count should be 2.
I have tried doing this query
select count(column1) from table1 where date='2012-05-12 23:03:22';
and got exactly one record matching that timestamp. But I believe this is not an efficient or correct way of retrieving the count. I'd be grateful if someone could show me the right and efficient way of writing such a query. I am new to the database world, and I am trying to be efficient in any query I write.
Thanks!
[EDIT]
Each query currently takes 175854 ms to process. What would be an efficient way to reduce that time? Any help would be really great. I am using PostgreSQL.
To be efficient, conditions should compare values of the same type as the column being compared. In this case, the column being compared - Date - has type timestamp, so we need to use a range of timestamp values.
In keeping with this, you should use current_timestamp for the "now" value; as confirmed by the documentation, subtracting an interval from a timestamp yields a timestamp, so...
For the last 1 day:
select count(*) from table1
where "Date" > current_timestamp - interval '1 day'
For the last 7 days:
select count(*) from table1
where "Date" > current_timestamp - interval '7 days'
For the last 30 days:
select count(*) from table1
where "Date" > current_timestamp - interval '30 days'
For the last 3 months:
select count(*) from table1
where "Date" > current_timestamp - interval '3 months'
Make sure you have an index on the Date column.
If you find that the index is not being used, try converting the condition to a between, eg:
where "Date" between current_timestamp - interval '3 months' and current_timestamp
Logically the same, but may help the optimizer to choose the index.
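The interval arithmetic above can be sanity-checked in a small sqlite3 sketch. Table and rows are made up; a fixed "now" keeps the example deterministic, and a computed cutoff stands in for current_timestamp - interval '7 days':

```python
import sqlite3
from datetime import datetime, timedelta

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE table1 (col1 TEXT, ts TEXT)")

now = datetime(2013, 6, 15, 12, 0, 0)          # fixed "now" for determinism
for days_ago in (0, 3, 10, 45):                # rows spread across the windows
    con.execute("INSERT INTO table1 VALUES (?, ?)",
                ("x", (now - timedelta(days=days_ago)).strftime("%Y-%m-%d %H:%M:%S")))

def count_last(days):
    # equivalent of: WHERE ts > current_timestamp - interval 'N days'
    cutoff = (now - timedelta(days=days)).strftime("%Y-%m-%d %H:%M:%S")
    return con.execute("SELECT COUNT(*) FROM table1 WHERE ts > ?",
                       (cutoff,)).fetchone()[0]

last_7 = count_last(7)      # rows from 0 and 3 days ago
last_30 = count_last(30)    # rows from 0, 3 and 10 days ago
```

The half-open "greater than cutoff" predicate is what lets a plain index on the timestamp column be used.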
Note that column1 is irrelevant to the question; being unique there is no possibility of the row count being different from the number of different values of column1 found by any given criteria.
Also, the choice of "Date" for the column name is poor, because a) it is a reserved word, and b) it is not in fact a date.
If you want to count number of records between two dates:
select count(*)
from Table1
where "Date" >= '2013-05-12' and "Date" < '2013-05-13'
-- count for one day, upper bound not included
select count(*)
from Table1
where "Date" >= '2013-05-12' and "Date" < '2013-06-12'
-- count for one month, upper bound not included
select count(*)
from Table1
where
"Date" >= current_date and
"Date" < current_date + interval '1 day'
-- current date
What I understand from your wording is
select date_trunc('day', "date"), count(*)
from t
where "date" >= '2013-01-01'
group by 1
order by 1
Replace 'day' for 'week', 'month', 'quarter' as needed.
http://www.postgresql.org/docs/current/static/functions-datetime.html#FUNCTIONS-DATETIME-TRUNC
Create an index on the "date" column.
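sqlite has no date_trunc, but its date() function gives the per-day equivalent of date_trunc('day', ...); for 'week' or 'month' you would switch to strftime patterns such as '%Y-%m'. A sketch with made-up rows:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (ts TEXT)")
con.executemany("INSERT INTO t VALUES (?)", [
    ("2013-05-12 23:03:22",),
    ("2013-05-12 08:15:00",),   # same calendar day as the row above
    ("2013-06-14 07:34:59",),
])

# Truncate each timestamp to its day, then count per group,
# mirroring date_trunc('day', "date") ... GROUP BY 1 ORDER BY 1.
rows = con.execute("""
    SELECT date(ts) AS day, COUNT(*)
    FROM t
    WHERE ts >= '2013-01-01'
    GROUP BY 1
    ORDER BY 1
""").fetchall()
```

The two May rows collapse into one daily bucket with count 2.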
select count(distinct column1) from table1 where date > '2012-05-12 23:03:22';
I assume "number of column1" means "number of distinct values in column1".
Edit:
Regarding your second question (speed of the query): I would assume that an index on the date column should speed up the runtime. Depending on the data content, this could even be declared unique.
To throw another option into the mix...
Add a column of type "date" and index that -- named "datecol" for this example:
create index tbl_datecol_idx on tbl (datecol);
analyze tbl;
Then your query can use an equality operator:
select count(*) from tbl where datecol = current_date - 1; --yesterday
Or if you can't add the date datatype column, you could create a functional index on the existing column:
create index tbl_date_fbi on tbl ( ("date"::DATE) );
analyze tbl;
select count(*) from tbl where "date"::DATE = current_date - 1;
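The functional-index idea carries over to sqlite, which also supports expression indexes. A small sketch with made-up rows, where date(ts) stands in for "date"::DATE:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE tbl (ts TEXT)")
con.executemany("INSERT INTO tbl VALUES (?)", [
    ("2013-05-11 09:00:00",),
    ("2013-05-12 23:03:22",),   # two rows on 2013-05-12
    ("2013-05-12 01:30:00",),
])

# Expression index on the date part, analogous to the Postgres functional index.
con.execute("CREATE INDEX tbl_date_fbi ON tbl (date(ts))")

# The query's predicate matches the indexed expression exactly,
# so the planner can satisfy it from the index.
n = con.execute(
    "SELECT COUNT(*) FROM tbl WHERE date(ts) = '2013-05-12'"
).fetchone()[0]
```

The equality on the indexed expression finds both rows from that day without scanning the raw timestamps.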
Note1: you do not need to query "column1" directly as every row has that attribute filled due to the NOT NULL.
Note2: Creating a column named "date" is poor form, and even worse that it is of type TIMESTAMP.