How to update or insert a record in a Postgres table from the result of another query? - sql

I want to write a simple statistics tool that runs some queries and saves the results in another table in the same database.
Mainly I want to track the number of items in different tables, the number of items touched during a month, and so on. This would give me some analytics on how the system is used, information that I cannot get just by looking at the state of the database at a single moment.
Let's say that I have this query:
select count(*) as mytab_mcount from mytab where updated > CURRENT_DATE - INTERVAL '1 months';
Now I do want to store the result of this query in a stats table so I can query it in order to get some trend data.
Clearly I could code this in some programming language, but I am wondering if I can do it in SQL alone, using the Postgres dialect.
I want to put the result in a table like
date mytab_mcount some_stat
2013-09-01 1234 NULL
Clearly the SQL should insert a new row or update the existing one.
Is this possible? Can you give a basic example?
If this could be done in a single query, it would be very easy to automate, keeping all the logic in one place and having a cron job execute it.

Have you tried something like:
INSERT INTO stat_table (stat_date, table_name, row_count, some_stat)
SELECT CURRENT_DATE, 'mytab', count(*), 2+3
FROM mytab
WHERE updated > CURRENT_DATE - INTERVAL '1 months';
Or
UPDATE stat_table
SET row_count = (SELECT count(*) FROM mytab WHERE updated > CURRENT_DATE - INTERVAL '1 months'),
stat_date = CURRENT_DATE,
some_stat = (SELECT 1+3)
WHERE table_name = 'mytab';
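The two statements above can be combined into a single insert-or-update. This is a minimal sketch, assuming PostgreSQL 9.5 or later (where INSERT ... ON CONFLICT is available) and assuming one row per day and per table; the table definition and its unique constraint are assumptions made for illustration, not part of the original question.

```sql
-- Assumed layout: one row per (stat_date, table_name).
-- The UNIQUE constraint is what ON CONFLICT targets below.
CREATE TABLE IF NOT EXISTS stat_table (
    stat_date  date   NOT NULL,
    table_name text   NOT NULL,
    row_count  bigint NOT NULL,
    some_stat  integer,
    UNIQUE (stat_date, table_name)
);

-- Insert today's count, or overwrite it if today's row already exists.
INSERT INTO stat_table (stat_date, table_name, row_count)
SELECT CURRENT_DATE, 'mytab', count(*)
FROM mytab
WHERE updated > CURRENT_DATE - INTERVAL '1 month'
ON CONFLICT (stat_date, table_name)
DO UPDATE SET row_count = EXCLUDED.row_count;
```

Running the statement several times on the same day leaves a single row for that day, which suits a cron job that may be retried.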

Related

Writing Scheduled Queries using the run_date vs current_date

I have created a scheduled query that returns a count of users, and transactions on each day. Here is the code:
SELECT
event_date,
COUNT(DISTINCT user_id) users,
COUNT(DISTINCT transaction_id) transactions,
FROM `xyz.events`
WHERE
event_date = current_date
GROUP BY event_date
ORDER BY event_date
The query shown above works when I execute it manually. But when I use it as a scheduled query, it doesn't update the destination table as it should, even though the run history shows that the query ran successfully for that day.
The query shown below however does the trick and runs exactly as intended. It updates the daily count of users and transactions in the destination table.
SELECT
DATE_SUB(@run_date, INTERVAL 1 DAY) event_date,
COUNT(DISTINCT user_id) users,
COUNT(DISTINCT transaction_id) transactions,
FROM `xyz.events`
WHERE
event_date = DATE_SUB(@run_date, INTERVAL 1 DAY)
GROUP BY event_date
ORDER BY event_date
So I wanted to understand why this is happening, because when run manually both queries give the same output.
Welcome Anxiety,
When you call the CURRENT_DATE() function, you must add the opening and closing parentheses at the end: (). Leaving them off your function call is why this query fails when it runs as a scheduled query.
As to why it runs in a regular BigQuery query window, I am not certain, but I assume the UI has some built-in logic to work around the missing parentheses that is not available to scheduled queries.

Dynamic start date from specific column in a table (sysdate)

I am pretty new in this field, trying to learn slowly so please be patient with me :)
My database contains a table called t_usage_interval. In this table there is a column named ID_Interval. Each month a new random 10-digit number is created in this column.
This is the query I am using
I would like to find out if there is a way to pull the latest interval by using the column DT_START together with SYSDATE. I guess it would be a dynamic query that searches back from SYSDATE to display the latest ID_Interval?
Thank you,
A
This is how I understood the question.
A straightforward query returns the row(s) whose dt_start is the latest one in the table that is less than or equal to sysdate (you might also use trunc(sysdate) if you don't care about the time component). The drawback of this query is that it scans the t_usage_interval table twice.
select *
from t_usage_interval a
where a.dt_start = (select max(b.dt_start)
from t_usage_interval b
where b.dt_start <= sysdate
);
A somewhat less intuitive option is to "rank" the rows (those whose dt_start is no later than sysdate) by dt_start, and then return the row(s) that rank "highest". This option scans the table only once, so it should perform better.
with temp as
(select a.*,
rank() over (order by a.dt_start desc) rn
from t_usage_interval a
where a.dt_start <= sysdate
)
select t.*
from temp t
where t.rn = 1;
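As a variation on the ranking idea, and assuming Oracle 12c or later (the question does not state the version), the row-limiting clause expresses the same thing without an analytic function. This is a sketch only; WITH TIES keeps every row that shares the latest dt_start, which mirrors what rank() does above.

```sql
-- Assumes Oracle 12c+; FETCH FIRST is not available in earlier releases.
select *
from t_usage_interval
where dt_start <= sysdate
order by dt_start desc
fetch first 1 row with ties;
```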

Oracle sub-partition shows data, but oracle query using filter does not show data

In Oracle 11g, I have created a fact table partitioned by date and sub-partitioned by site_id.
An analyze job runs on this table daily, processing one day's interval at a time.
In SQL DEVELOPER tool, when I open table definition, under partition tab, I am able to see the partition as 23-JAN-2016. For all site_ids, I am able to see sub-partition.
Select * from NPM.EH_MODEM_HIST_PRFRM_FACT subpartition(SYS_SUBP1256625);
When I run the above query, I am able to see the data.
But when I run the query below from the report SQL, it does not return any data:
select * from NPM.EH_MODEM_HIST_PRFRM_FACT
where time_stamp ='23-JAN-16' and site_id =580
Is there any problem in managing this table?
Probably, what you're actually after is something like:
select *
from NPM.EH_MODEM_HIST_PRFRM_FACT
where time_stamp >= to_date('23/01/2016', 'dd/mm/yyyy')
and time_stamp < to_date('23/01/2016', 'dd/mm/yyyy') + 1
and site_id = 580;
The above assumes that the datatype for the time_stamp column is DATE. If it's actually TIMESTAMP then you should use the SQL below:
select *
from NPM.EH_MODEM_HIST_PRFRM_FACT
where time_stamp >= to_date('23/01/2016', 'dd/mm/yyyy')
and time_stamp < to_date('23/01/2016', 'dd/mm/yyyy') + interval '1' day
and site_id = 580;
Note also that I have specified the date with a four digit year. Two digit years are just soooo pre-y2k! *{;-)
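The range predicate matters because an Oracle DATE (and a TIMESTAMP) always carries a time of day, so an equality test against a bare date only matches rows stamped exactly at midnight. A quick way to check whether that is the case here is to display the full value for a few rows of the subpartition from the question. This is a diagnostic sketch; the format mask and the column alias are arbitrary choices.

```sql
-- Show the stored time component for a sample of rows.
select to_char(time_stamp, 'dd-mon-yyyy hh24:mi:ss') as time_stamp_full,
       site_id
from NPM.EH_MODEM_HIST_PRFRM_FACT subpartition (SYS_SUBP1256625)
where rownum <= 10;
```

If the times shown are not 00:00:00, the equality filter in the original query cannot match them, and the range form is the one to use.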

Can we have a where clause for delete from <tab> partition <part.>?

I have a table partitioned by timestamp (for example, partition1 holds data that is 9 months old, partition2 holds data that is 6 months old, partition3 holds data that is 3 months old, and so on).
I need to delete data based on a condition on a specific partition.
delete from table1 partition partition3
where group id in ( select * from table1 where group_code='pilot');
Will this operation delete only from partition3?
First of all, the syntax is:
delete from table1 partition(partition3)
Secondly, you shouldn't refer to partition names. It creates a dependency on the physical layout of the table. It may be partitioned monthly today, but it may be repartitioned on weeks or days sometime in the future. When this happens, your code will break. Oracle can infer the partition just fine from a predicate on the partitioning column. Try to forget about the table being partitioned.
Third, your subselect will fail because of select * being compared to ID.
The statement you are looking for likely looks something like this:
delete
from table1
where group_code = 'pilot'
and partition_column = 'some value';
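To answer the title question directly: the partition-extended form does accept a WHERE clause, and only rows in the named partition are considered. This is a minimal sketch, assuming group_code is a column of table1 as in the statement above; the caveat about depending on partition names still applies.

```sql
-- Deletes matching rows from partition3 only; other partitions are untouched.
delete from table1 partition (partition3)
where group_code = 'pilot';
```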
Partitioning works transparently, meaning that you can simply do a regular
delete table1
where group_id in (select group_id from table2 where group_code = 'pilot')
and your_date_column
between trunc(sysdate,'mm') - interval '3' month
and trunc(sysdate,'mm') - interval '2' month
The optimizer will know that the second predicate means that only data from partition3 needs deleting.
Regards,
Rob.

How to do this query in one query?

I don't know if what I want to do is possible, but I am curious.
I have a time table that looks like this:
timeTable
---------
_id
date
I have another table that stores actions along with time information, through a date_id column that is a foreign key to timeTable:
myTable
-------
action
date_id
Now, what I would like to do is update date_id with the id of the same date in the next month.
UPDATE myTable
SET date_id = (SELECT date + interval '1 month' FROM timeTable INNER JOIN myTable ON _id = date_id)
;
I know that this query can't work because the subquery returns a date where a uuid is expected.
My question: is it possible to retrieve the date_id of the same date in the next month with one query, without another subquery? Below is my actual query; I don't see how I could do it. Any ideas?
SELECT date + interval '1 month' FROM dimtime INNER JOIN roadmap ON _id = date_id
Maybe something like this:
UPDATE myTable t1
SET date_id = (SELECT t2._id FROM timeTable t2 WHERE t2.date = (SELECT t3.date + interval '1 month' FROM timeTable t3 WHERE t3._id = t1.date_id));
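The nested subqueries can also be written as a join. This is a sketch assuming PostgreSQL (the interval '1 month' syntax and the uuid key suggest it, but the question does not say so) and its UPDATE ... FROM extension; the aliases m, cur and nxt are arbitrary.

```sql
-- cur is the row currently referenced; nxt is the row one month later.
UPDATE myTable m
SET date_id = nxt._id
FROM timeTable cur
JOIN timeTable nxt ON nxt.date = cur.date + interval '1 month'
WHERE cur._id = m.date_id;
```

One difference in behavior: rows whose next-month date does not exist in timeTable are left unchanged here, whereas the subquery form sets their date_id to NULL.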
It's a nasty hack, but if your RDBMS supports a deterministic conversion between integer and date datatypes, there is a way to do this without any subqueries at all - but it is almost certain to require changes to existing data.
The solution requires that you use the integer representation of the date as the key in timeTable (rather than arbitrarily assigned values). If you do this, then you can perform date calculations without the need to look up the date value from timeTable. This removes the need to include timeTable in the query at all (unless you need to validate that the new date exists there). In this scenario, pseudo-SQL for your query would become:
UPDATE myTable
SET date_id = to_int(to_date(date_id) + interval '1 month')
;
If the int <-> date conversion is implicit in your RDBMS, it gets even shorter:
UPDATE myTable
SET date_id = date_id + interval '1 month'
;
However, if you're at the point where you can choose to implement this design, you might be better off discarding the link to timeTable and storing a native date type in the target table rather than the integer FK.