Group by Week beginning Sunday - sql

I have a table with eventdatetime, userid, etc. Data is inserted into the table daily.
For a report, I need the count of userid and projectid grouped by week (Tue-Mon), for a one-month range at a time.
I need help grouping the data by week within the month. I'm using Oracle.
select count(distinct( table1.projectid))as Projects, count(distinct( table2.userid)) as Users,??
from table1
join table2
on table1.a= table2.a
where table1.e='1'
and table1.eventdatetime between sysdate-30 and sysdate-1
group by ??
I want the output to be grouped by week, like:
WeekBegin
2013-04-14
2013-04-21

http://www.techonthenet.com/oracle/functions/to_char.php Use the TO_CHAR function with the IW format element to get the ISO week number. Then you can GROUP BY that IW value.
Note that IW weeks always start on Monday, whatever the database settings. It is the D format element and TRUNC(date, 'DAY') whose week start depends on the NLS_TERRITORY setting: some territories start the week on Sunday and some on Monday. You'll have to look at your session settings to see which applies. If the week already starts on Sunday there, then you're in luck!
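For a week that begins on Sunday regardless of the NLS settings, one option is to lean on the ISO week, which always starts on Monday, and shift by a day. This is a minimal sketch rather than a tested solution; it reuses the table and column names from the question and assumes eventdatetime is a DATE column.

```sql
-- Sketch only: table1, table2, a and e come from the question;
-- eventdatetime is assumed to be a DATE.
-- TRUNC(d, 'IW') returns the Monday of d's ISO week, so shifting d forward one
-- day before truncating and back one day afterwards yields the Sunday that
-- starts d's Sunday-to-Saturday week.
SELECT TRUNC(table1.eventdatetime + 1, 'IW') - 1 AS weekbegin,
       COUNT(DISTINCT table1.projectid)          AS projects,
       COUNT(DISTINCT table2.userid)             AS users
FROM   table1
JOIN   table2
ON     table1.a = table2.a
WHERE  table1.e = '1'
AND    table1.eventdatetime BETWEEN SYSDATE - 30 AND SYSDATE - 1
GROUP  BY TRUNC(table1.eventdatetime + 1, 'IW') - 1
ORDER  BY weekbegin;
```

With this expression every date from 2013-04-14 (a Sunday) through 2013-04-20 maps to 2013-04-14. For Tue-Mon weeks the same trick would be TRUNC(table1.eventdatetime - 1, 'IW') + 1.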

If the example you have posted is your work-in-progress version, then before worrying about the days of the week you should get the basics of the query right.
You are selecting e.projectid and u.userid, but there are no tables named e or u in your query. It looks like you want to alias them as e and u?
The where clause of your query also refers to the table e, which isn't present.
In that case you should change
from table1
join table2
on table1.a= table2.a
to
from table1 e -- select from table1 using alias e
join table2 u -- join table2 using alias u
on ( e.a = u.a ) -- joining on column a from table1 (e) = a from table2 (u)
Once you have replaced the a's in the on clause with the column names you want to join on, it might well run after you remove the last column (", ??") from the select. Perhaps something along these lines:
select
count (e.projectid) PROJECTS,
count (u.userid) USERS
from table1 e
join table2 u
on ( e.a = u.a )
where e.FILTERING_COLUMN = '1'
and e.eventdatetime >= sysdate-30
Note that sysdate is the current time on the server (depending on localisation and session settings), so you can use >= sysdate-30 instead of between, which may well give the query optimiser an easier time if the table is suitably indexed.
The basic rule for grouping is that to select a column you need to either group by it or wrap it in an aggregate function such as COUNT().
So you'll probably want something like
select
count (e.projectid) PROJECTS,
count (u.userid) USERS,
to_char(e.eventdatetime,'MM') MONTH
from table1 e
join table2 u
on ( e.a = u.a )
where e.FILTERING_COLUMN = '1'
and e.eventdatetime >= sysdate-30
group by to_char(e.eventdatetime,'MM')
This won't be the most optimal way to do it, though; it would be easier to help if you posted the schemas involved in the issue.

Related

Bigquery replacing empty results or null values with some 000

I see that data is missing for a few dates. For the dates where no data exists, I would like to return zero instead of no results. I tried the query below and got the present output:
select trvl_details.strt_dte as cre_dte,
trvl_typ_cde,
coalesce(count(1),0) as createdcount
from project.dataset.tableid JOIN UNNEST(trvl_details)trvl_details
WHERE trvl_details.strt_dte >= "2020-12-24" and trvl_typ_cde='AIR' group by 1,2
Can someone please help me with this?
You can use GENERATE_DATE_ARRAY to create a list of dates and then left join generated list of dates with your results:
WITH your_data AS (
select trvl_details.strt_dte as cre_dte, trvl_typ_cde, coalesce(count(1),0) as createdcount
from project.dataset.tableid JOIN UNNEST(trvl_details)trvl_details
WHERE trvl_details.strt_dte >= "2020-12-24" and trvl_typ_cde='AIR'
group by 1,2
)
SELECT day, your_data.trvl_typ_cde, IFNULL(your_data.createdcount, 0) AS createdcount
FROM UNNEST(GENERATE_DATE_ARRAY('2020-12-01', '2020-12-31')) as day
LEFT JOIN your_data
ON day = your_data.cre_dte

Sum certain rows in sql

I want to sum different rows of my table into a different column. I have an Id column where each Id has multiple times. When I try to sum the time column, it says it cannot sum the time data type.
Here is what my query looks like atm:
select t.Id,
a.EndTime,
cast(A.Endtime as time)[time]
from [plugin.tickets].Ticket as T
join
[plugin.tickets].TicketActivity as TA ON TA.TicketId = T.Id
join
dbo.Activity as A on A.Id = TA.ActivityId
Here is the output of this query:
As you can see there are id numbers which have multiple times. How can I sum these values?
EDIT:
I have changed my query as following:
select t.Id
,a.EndTime
,convert (varchar(5), EndTime,108) as Tijd
from [plugin.tickets].Ticket as T
join
[plugin.tickets].TicketActivity as TA ON TA.TicketId = T.Id
join
dbo.Activity as A on A.Id = TA.ActivityId
I just want to sum these where the ID number is the same.
Thanks,
Shabby
You should look into the two functions DATEPART and DATEADD. I'm assuming you're using T-SQL here. You would take the time field and convert it to hours with DATEPART, which would give you 17 for example, and then use DATEADD on the EndTime field.
https://learn.microsoft.com/en-us/sql/t-sql/functions/datepart-transact-sql?view=sql-server-2017
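To make that concrete: SUM does not accept the time type, so one approach is to turn each time into a number of seconds with DATEPART, sum those per Id, and format the total afterwards. This is a rough sketch, assuming EndTime is a datetime column whose time-of-day part is the value you want to add up; the table names are taken from the question.

```sql
-- Sketch only: treats the time-of-day part of EndTime as a duration (an assumption).
-- Each time is converted to seconds so that SUM can operate on plain integers.
SELECT T.Id,
       SUM(DATEPART(HOUR,   A.EndTime) * 3600
         + DATEPART(MINUTE, A.EndTime) * 60
         + DATEPART(SECOND, A.EndTime)) AS TotalSeconds
FROM [plugin.tickets].Ticket AS T
JOIN [plugin.tickets].TicketActivity AS TA ON TA.TicketId = T.Id
JOIN dbo.Activity AS A ON A.Id = TA.ActivityId
GROUP BY T.Id;
```

TotalSeconds / 3600 then gives the whole hours and (TotalSeconds % 3600) / 60 the remaining minutes. Converting the total back with DATEADD(SECOND, TotalSeconds, 0) also works, but it only shows a sensible time of day while the total stays under 24 hours.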

PostgreSQL, NOT IN clause

I want to calculate DAU and exclude users that we don't consider "real" (employees, beta testers etc).
It worked fine previously when I wrote the filtering in the query:
SELECT
count(distinct user_id) AS daily,
e.event_timestamp::DATE AS date
FROM
"public"."events" AS e
WHERE
user_id IN (SELECT
distinct id
from
"user"."user"
WHERE
username IS NOT NULL AND position IS NOT NULL )
GROUP BY date
When I change it to the query below, it should give more or less the same count (basically, instead of defining the 4000 "real users", I define the 1000 "non-users" I want to exclude). However, this gives me much higher counts. It's as if the distinct isn't working.
I added the NOT NULL to the subquery, but that doesn't change the result. Does NOT IN with a subquery work differently from the IN clause?
SELECT
count(distinct e.user_id) AS daily,
e.event_timestamp::DATE AS date
FROM
"public"."events" AS e
WHERE
e.user_id NOT IN (SELECT distinct id FROM "public"."non_users" WHERE id IS NOT NULL)
GROUP BY
date
ORDER BY
date
Yes. If any of the values in the subquery are NULL, then NOT IN returns no rows. For this reason, I strongly recommend that you always use NOT EXISTS with a subquery; it behaves as expected.
You seem to know this, because you are using a NULL comparison in the WHERE. So, the difference is probably due to the other condition. So, include it as well:
SELECT count(distinct e.user_id) AS daily,
e.event_timestamp::DATE AS date
FROM "public"."events" e
WHERE NOT EXISTS (SELECT 1
FROM "public"."non_users" nu
WHERE e.user_id = nu.id AND
nu.position IS NOT NULL
)
GROUP BY date
ORDER BY date;
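The NULL behaviour is easy to check in isolation. Here is a small sketch with literal values only; none of the tables from the question are involved, and v and id are just illustrative names.

```sql
-- NOT IN against a list containing NULL: "1 <> NULL" is unknown, so the
-- whole predicate can never be true and the row is filtered out.
SELECT 1 AS x
WHERE 1 NOT IN (2, NULL);            -- returns no rows

-- NOT EXISTS only asks whether a matching row exists, so the NULL is harmless.
SELECT 1 AS x
WHERE NOT EXISTS (
    SELECT 1
    FROM (VALUES (2), (NULL)) AS v(id)
    WHERE v.id = 1
);                                   -- returns one row
```

Since the subquery in the question already filters out NULL ids, this effect would not by itself explain the higher counts there.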

Using a stored procedure in Teradata to build a summarial history table

I am using Teradata SQL Assistant connected to an enterprise DW. I have written the query below to show an inventory of outstanding items as of a specific point in time. The table referenced loads and stores new records as changes are made to their state by load date (and does not delete historical records). The output of my query is 1 row for the specified date. Can I create a stored procedure or recursive query of some sort to build a history of these summary rows (with 1 new row per day)? I have not used such functions in the past; links to pertinent previously answered questions or suggestions on how I could get on the right track in researching other possible solutions are totally fine if applicable; just trying to bridge this gap in my knowledge.
SELECT
'2017-10-02' as Dt
,COUNT(DISTINCT A.RECORD_NBR) as Pending_Records
,SUM(A.PAY_AMT) AS Total_Pending_Payments
FROM DB.RECORD_HISTORY A
INNER JOIN
(SELECT MAX(LOAD_DT) AS LOAD_DT
,RECORD_NBR
FROM DB.RECORD_HISTORY
WHERE LOAD_DT <= '2017-10-02'
GROUP BY RECORD_NBR
) B
ON A.RECORD_NBR = B.RECORD_NBR
AND A.LOAD_DT = B.LOAD_DT
WHERE
A.RECORD_ORDER =1 AND Final_DT Is Null
GROUP BY Dt
ORDER BY 1 desc
Here is my interpretation of your query:
For the most recent load_dt (up until 2017-10-02) for record_order #1,
return
1) the number of different pending records
2) the total amount of pending payments
Is this correct? If you're looking for this info, but one row for each "Load_Dt", you just need to remove that INNER JOIN:
SELECT
load_Dt,
COUNT(DISTINCT record_nbr) AS Pending_Records,
SUM(pay_amt) AS Total_Pending_Payments
FROM DB.record_history
WHERE record_order = 1
AND final_Dt IS NULL
GROUP BY load_Dt
ORDER BY 1 DESC
If you want to get the summary info per record_order, just add record_order as a grouping column:
SELECT
load_Dt,
record_order,
COUNT(DISTINCT record_nbr) AS Pending_Records,
SUM(pay_amt) AS Total_Pending_Payments
FROM DB.record_history
WHERE final_Dt IS NULL
GROUP BY load_Dt, record_order
ORDER BY 1,2 DESC
If you want to get one row per day (if there are calendar days with no corresponding "load_dt" days), then you can SELECT from the sys_calendar.calendar view and LEFT JOIN the query above on the "load_dt" field:
SELECT cal.calendar_date, src.Pending_Records, src.Total_Pending_Payments
FROM sys_calendar.calendar cal
LEFT JOIN (
SELECT
load_Dt,
COUNT(DISTINCT record_nbr) AS Pending_Records,
SUM(pay_amt) AS Total_Pending_Payments
FROM DB.record_history
WHERE record_order = 1
AND final_Dt IS NULL
GROUP BY load_Dt
) src ON cal.calendar_date = src.load_Dt
WHERE cal.calendar_date BETWEEN <start_date> AND <end_date>
ORDER BY 1 DESC
I don't have access to a TD system, so you may get syntax errors. Let me know if that works or you're looking for something else.
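If what you need is the original point-in-time snapshot repeated for every day, a stored procedure may not be necessary: joining the calendar view to the history table on LOAD_DT <= calendar_date and keeping the latest load per record and day should give the same result set based. The following is an untested sketch under that assumption; the date range is an arbitrary example and the alias names (cal, h, latest) are made up.

```sql
-- Sketch only: for each calendar day, keep each record's most recent row
-- loaded on or before that day, then apply the original filters and aggregate.
SELECT Dt,
       COUNT(DISTINCT RECORD_NBR) AS Pending_Records,
       SUM(PAY_AMT)               AS Total_Pending_Payments
FROM (
    SELECT cal.calendar_date AS Dt,
           h.RECORD_NBR,
           h.PAY_AMT,
           h.RECORD_ORDER,
           h.Final_DT
    FROM sys_calendar.calendar cal
    JOIN DB.RECORD_HISTORY h
      ON h.LOAD_DT <= cal.calendar_date
    WHERE cal.calendar_date BETWEEN DATE '2017-10-01' AND DATE '2017-10-31'
    -- RANK keeps ties, like the MAX(LOAD_DT) join in the original query
    QUALIFY RANK() OVER (PARTITION BY cal.calendar_date, h.RECORD_NBR
                         ORDER BY h.LOAD_DT DESC) = 1
) latest
WHERE RECORD_ORDER = 1
  AND Final_DT IS NULL
GROUP BY Dt
ORDER BY Dt DESC;
```

The join multiplies the history rows by the number of days in the range, so on a large table it is probably worth running it for short ranges and inserting the results into a summary table.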

Unpivot date columns to a single column of a complex query in Oracle

Hi guys, I am stuck on a stubborn problem that I am unable to solve. I am trying to compile a report in which the dates coming from different tables need to go into a single date field. Of course, it is the max, i.e. the most recent, of all these date columns that should go into the report's single date column. The report would be generated for multiple users across multiple branches/courses.
There are multiple blogs, and the latest date needs to be grouped by blog title, i.e. max(date_value) over the six date columns should give the latest date for that blog title.
Expected Result:
select u.batch_uid as ext_person_key, u.user_id, cm.batch_uid as ext_crs_key, cm.crs_id,
ir.role_id as insti_role,
(CASE when b.JOURNAL_IND = 'N' then 'BLOG' else 'JOURNAL' end) as item_type,
gm.title as item_name, gm.disp_title as ITEM_DISP_NAME,
be.blog_pk1 as be_blogPk1, bc.blog_entry_pk1 as bc_blog_entry_pk1, bc.pk1,
b.ENTRY_mod_DATE as b_ENTRY_mod_DATE ,b.CMT_mod_DATE as BlogCmtModDate, be.CMT_mod_DATE as be_cmnt_mod_Date,
b.UPDATE_DATE as BlogUpDate, be.UPDATE_DATE as be_UPDATE_DATE,
bc.creation_date as bc_creation_date,
be.CREATOR_USER_ID as be_CREATOR_USER_ID , bc.creator_user_id as bc_creator_user_id,
b.TITLE as BlogTitle, be.TITLE as be_TITLE,
be.DESCRIPTION as be_DESCRIPTION, bc.DESCRIPTION as bc_DESCRIPTION
FROM users u
INNER JOIN insti_roles ir on u.insti_roles_pk1 = ir.pk1
INNER JOIN crs_users cu ON u.pk1 = cu.users_pk1
INNER JOIN crs_mast cm on cu.crsmast_pk1 = cm.pk1
INNER JOIN blogs b on b.crsmast_pk1 = cm.pk1
INNER JOIN blog_entry be on b.pk1=be.blog_pk1 AND be.creator_user_id = cu.pk1
LEFT JOIN blog_CMT bc on be.pk1=bc.blog_entry_pk1 and bc.CREATOR_USER_ID=cu.pk1
JOIN gradeledger_mast gm ON gm.crsmast_pk1 = cm.pk1 and b.grade_handler = gm.linkId
WHERE cu.ROLE='S' AND BE.STATUS='2' AND B.ALLOW_GRADING='Y' AND u.row_status='0'
AND u.available_ind ='Y' and cm.row_status='0' and u.batch_uid='userA_157'
I am getting a resultset for the above query with multiple date columns, which I want to put into a single column. The dates have to be the most recent, i.e. the max of the dates in the date columns.
I have successfully done the Unpivot by using a view to store the above
resultset and put all the dates in one column. However, I do not
want to use a view or a table to store the resultset and then do
Unpivot simply because I cannot keep creating views for every user
one would query for.
The max(date_value) from the date columns need to be put in one single column. They are as follows:
1) b.entry_mod_date, 2) b.cmt_mod_date, 3) be.cmt_mod_date, 4) b.update_date, 5) be.update_date, 6) bc.creation_date
Apologies that I could not provide the desc of all the tables and the
fields being used.
Any help to get the above mentioned max of the dates from these
multiple date columns into a single column without using a view or a
table would be greatly appreciated.
It is not clear what results you want, but the easiest solution is to use greatest().
with t as (
YOURQUERYHERE
)
select t.*,
greatest(b_ENTRY_mod_DATE, BlogCmtModDate, be_cmnt_mod_Date, BlogUpDate,
be_UPDATE_DATE, bc_creation_date
) as greatestdate
from t;
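One thing to watch with greatest(): in Oracle it returns NULL as soon as any argument is NULL, and blog_CMT is LEFT JOINed in the question, so bc_creation_date can be NULL. Here is a small self-contained sketch with made-up dates; the column aliases follow the question's query, and the fallback date is an arbitrary assumption.

```sql
-- Sketch only: three of the six date columns, with invented values.
WITH t AS (
    SELECT DATE '2013-04-14'  AS b_entry_mod_date,
           DATE '2013-04-21'  AS be_update_date,
           CAST(NULL AS DATE) AS bc_creation_date
    FROM dual
)
SELECT GREATEST(b_entry_mod_date, be_update_date, bc_creation_date)
           AS raw_greatest,   -- NULL, because one argument is NULL
       GREATEST(b_entry_mod_date, be_update_date,
                NVL(bc_creation_date, DATE '0001-01-01'))
           AS latest_date     -- 21 April 2013
FROM t;
```

To get one latest date per blog title rather than per row, the same expression can be wrapped as MAX(GREATEST(...)) with a GROUP BY on BlogTitle.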
select <columns>,
case
when greatest (b_ENTRY_mod_DATE) >= greatest (BlogCmtModDate) and greatest(b_ENTRY_mod_DATE) >= greatest(BlogUpDate)
then greatest( b_ENTRY_mod_DATE )
--<same comparison repeated for BlogCmtModDate and BlogUpDate, each time returning the greatest date>
end
,<columns>
FROM table
<rest of the query>
UNION ALL
Select <columns>,
case
when greatest (be_cmnt_mod_Date) >= greatest (be_UPDATE_DATE)
then greatest( be_cmnt_mod_Date )
when greatest (be_UPDATE_DATE) >= greatest (be_cmnt_mod_Date)
then greatest( be_UPDATE_DATE )
end
,<columns>
FROM table
<rest of the query>
UNION ALL
Select <columns>,
GREATEST(bc_creation_date)
,<columns>
FROM table
<rest of the query>