sql oracle group by subquery

I get the same ecommerce count for every date. I am trying to get the ecommerce count per date, which should differ from date to date: the total is only 105 for the whole of October, not 391958.
Any idea how to group by the output of a subquery?
Thank you!
SELECT to_char(wcs1.start_tms,'DD/MM/YYYY') as dates,
(
SELECT count(*)
FROM ft_t_wcs1 wcs1,ft_t_stup stup
WHERE stup.modl_id='ECOMMERC'
AND stup.CROSS_REF_ID=wcs1.acct_id
AND stup.end_tms IS NULL
) AS ecommerce
FROM ft_t_wcs1 wcs1, ft_t_stup stup
WHERE wcs1.scenario='CREATE'
AND wcs1.acct_id IS NOT NULL
AND wcs1.start_tms BETWEEN add_months(TRUNC(SYSDATE,'mm'),-1) AND LAST_DAY(add_months(TRUNC(SYSDATE,'mm'),-1))
GROUP BY to_char(wcs1.start_tms,'DD/MM/YYYY')
ORDER BY to_char(wcs1.start_tms,'DD/MM/YYYY');
(output screenshot omitted)

Try the modified queries below:
select to_char(wcs1.start_tms,'DD/MM/YYYY') as dates, count(*) as ecommerce
from ft_t_wcs1 wcs1, ft_t_stup stup
where stup.modl_id='ECOMMERC'
  and stup.CROSS_REF_ID=wcs1.acct_id
  and stup.end_tms is null
  and wcs1.scenario='CREATE'
  and wcs1.acct_id is not null
  and wcs1.start_tms between add_months(TRUNC(SYSDATE,'mm'),-1)
                         and LAST_DAY(add_months(TRUNC(SYSDATE,'mm'),-1))
group by to_char(wcs1.start_tms,'DD/MM/YYYY')
order by to_char(wcs1.start_tms,'DD/MM/YYYY');
-- Another way using JOIN clause
select to_char(wcs1.start_tms,'DD/MM/YYYY') as dates, count(*) as ecommerce
from ft_t_wcs1 wcs1
join ft_t_stup stup
  on stup.CROSS_REF_ID=wcs1.acct_id
where stup.modl_id='ECOMMERC'
  and stup.end_tms is null
  and wcs1.scenario='CREATE'
  and wcs1.acct_id is not null
  and wcs1.start_tms between add_months(TRUNC(SYSDATE,'mm'),-1)
                         and LAST_DAY(add_months(TRUNC(SYSDATE,'mm'),-1))
group by to_char(wcs1.start_tms,'DD/MM/YYYY')
order by to_char(wcs1.start_tms,'DD/MM/YYYY');

It's hard to suggest an answer without understanding your table relationships, but I can tell that your problem is that there is no relationship between your subquery and your main query. Your subquery simply returns a count of rows where modl_id='ECOMMERC', so that value will always be the same - in your case, 105. You need to add a JOIN criterion to the subquery that ties its value to the current row of your main query. You'll also want to alias the table names differently so it is clear which reference belongs to the subquery and which to the outer query.
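For illustration only, a minimal sketch of one way that correlation could look - it guesses that the goal is a per-date count of ECOMMERC setups, uses aliases inside the subquery that differ from the outer ones, and ties the subquery to the outer row on the date:
SELECT d.dates,
       (SELECT COUNT(*)
          FROM ft_t_wcs1 w2                                 -- aliases distinct from the outer query
          JOIN ft_t_stup s ON s.CROSS_REF_ID = w2.acct_id
         WHERE s.modl_id = 'ECOMMERC'
           AND s.end_tms IS NULL
           AND TRUNC(w2.start_tms) = d.dt                   -- correlation back to the outer row
       ) AS ecommerce
  FROM (SELECT DISTINCT TRUNC(wcs1.start_tms) AS dt,
               to_char(wcs1.start_tms,'DD/MM/YYYY') AS dates
          FROM ft_t_wcs1 wcs1
         WHERE wcs1.scenario = 'CREATE'
           AND wcs1.acct_id IS NOT NULL
           AND wcs1.start_tms BETWEEN add_months(TRUNC(SYSDATE,'mm'),-1)
                                  AND LAST_DAY(add_months(TRUNC(SYSDATE,'mm'),-1))
       ) d
 ORDER BY d.dt;
Whether the inner count should also be restricted to scenario='CREATE' (or to the same month) depends on what exactly is being counted, which only the original poster can say.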

You are doing unnecessary joins when you just want a correlated subquery:
SELECT to_char(wcs1.start_tms,'DD/MM/YYYY') as dates,
(SELECT count(*)
FROM ft_t_stup stup
WHERE stup.modl_id= 'ECOMMERC' AND
stup.CROSS_REF_ID = wcs1.acct_id AND
stup.end_tms IS NULL
) AS ecommerce
FROM ft_t_wcs1 wcs1
WHERE wcs1.scenario = 'CREATE' AND
wcs1.acct_id IS NOT NULL AND
wcs1.start_tms BETWEEN add_months(TRUNC(SYSDATE,'mm'),-1) AND LAST_DAY(add_months(TRUNC(SYSDATE,'mm'),-1))
GROUP BY to_char(wcs1.start_tms, 'DD/MM/YYYY')
ORDER BY to_char(wcs1.start_tms, 'DD/MM/YYYY');

Related

SQL Aggregating data based on condition containing the key fields for aggregation

I am new to SQL (Oracle SQL, if it makes a difference), but it so happens I have to use it. I need to aggregate data by some key fields (CustId, AppId). I also have AppDate, PDate, and Amount columns (initial data shown in a screenshot).
What I need to do is aggregate but for each key field combination I need to aggregate the data from other rows with the following conditions:
CustID = CustID, aka take only information for this CustId
AppId != AppId, aka take only information for applications different from the current one
AppDate >= PDate, aka take only information available at the time of application
From a quick look at the SQL language, my approach was:
select CustId, AppId,
       Sum(case when custid = custid and Appid != Appid and AppDate >= PDate then Amount else 0 end) as SumAmount
From Table
Group by CustId, AppId
Unfortunately, the results I get are all 0 for SumAmount. My guess is that this is because of the last two conditions.
The results I want to get from the example table are shown in the Results screenshot.
Also, I would probably add a condition to exclude rows where the current AppDate minus the other AppId's AppDate is more than 6 months.
P.S. I am really sorry for the substandard formatting and probably bad code. I am not really experienced on how to do it.
Edit: I've found a solution as follows:
select distinct a.CustId, a.AppId, a.AppDate, b.PDate, b.Amount
from table a
inner join (select CustId, AppId, Amount, PDate from Table) b
on a.CustId = b.CustId and a.AppId != b.AppId
where a.AppDate >= b.PDate
After that I aggregate by AppId summing the amount.
Basically, I just append the same information based on a condition and since I get a lot of full duplicates I deduplicate with distinct.
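For what it's worth, the join and the aggregation can also be combined into a single statement. The sketch below is only a guess at the same logic: your_table stands in for the real table name, one AppDate per (CustId, AppId) is assumed, and the optional six-month cut-off mentioned above is expressed with Oracle's MONTHS_BETWEEN:
select a.CustId, a.AppId, sum(b.Amount) as SumAmount
from (select distinct CustId, AppId, AppDate from your_table) a   -- one row per application
join your_table b
  on  a.CustId = b.CustId
 and  a.AppId <> b.AppId                                          -- other applications only
 and  a.AppDate >= b.PDate                                        -- payments known at application time
 and  months_between(a.AppDate, b.AppDate) <= 6                   -- optional 6-month cut-off
group by a.CustId, a.AppId;
Unlike the DISTINCT-based version above, this counts two identical payments of another application twice rather than collapsing them, which may or may not be what you want.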

PostgreSQL, NOT IN clause

I want to calculate DAU and exclude users that we don't consider "real" (employees, beta testers, etc.).
It worked fine previously when I wrote the filtering in the query:
SELECT
count(distinct user_id) AS daily,
e.event_timestamp::DATE AS date
FROM
"public"."events" AS e
WHERE
user_id IN (SELECT
distinct id
from
"user"."user"
WHERE
username IS NOT NULL AND position IS NOT NULL )
GROUP BY date
When I change it to the query below, it should give more or less the same count (basically, instead of defining the 4000 "real users", I define the 1000 "non-users" I want to exclude). However, this gives me way higher counts. It's as if the DISTINCT isn't working.
I added the NOT NULL to the subquery, but it doesn't change the result. Is there something about NOT IN with a subquery that works differently from the IN clause?
SELECT
count(distinct e.user_id) AS daily,
e.event_timestamp::DATE AS date
FROM
"public"."events" AS e
WHERE
e.user_id NOT IN (SELECT distinct id FROM "public"."non_users" WHERE id IS NOT NULL)
GROUP BY
date
ORDER BY
date
Yes. If any of the values in the subquery are NULL, then NOT IN returns no rows. For this reason, I strongly recommend that you always use NOT EXISTS - it behaves as expected.
You seem to know this, because you are using a NULL comparison in the WHERE. So the difference is probably due to the other condition; include it as well:
SELECT count(distinct e.user_id) AS daily,
e.event_timestamp::DATE AS date
FROM "public"."events" e
WHERE NOT EXISTS (SELECT 1
FROM "public"."non_users" nu
WHERE e.user_id = nu.id AND
nu.position IS NOT NULL
)
GROUP BY date
ORDER BY date;
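As a minimal illustration of the NULL pitfall (not part of the original answer; plain PostgreSQL you can paste into psql):
-- 1 NOT IN (2, NULL) evaluates to unknown, because 1 <> NULL is unknown, so no row comes back:
SELECT 1 WHERE 1 NOT IN (2, NULL);
-- The NOT EXISTS form returns one row, as intuition suggests:
SELECT 1 WHERE NOT EXISTS (SELECT 1 FROM (VALUES (2), (NULL)) AS t(id) WHERE t.id = 1);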

How to get datetime duplicate rows in SQL Server?

I'm trying to find duplicate DATETIME rows in a table.
My column has datetime values such as 2015-01-11 11:24:10.000.
I need to find the duplicates at minute precision, i.e. 2015-01-11 11:24; the rest is not important. I can get the right value when I use SELECT with convert(nvarchar(16),column,121), but when I put this in my query I have to use it in the GROUP BY as well.
My code is:
SELECT ID,
RECEIPT_BARCODE,
convert(nvarchar(16),TRANS_DATE,121),
PTYPE
FROM TRANSACTION_HEADER
WHERE TRANS_DATE BETWEEN '11.01.2015' AND '12.01.2015'
GROUP BY ID,RECEIPT_BARCODE,convert(nvarchar(16),TRANS_DATE,121),PTYPE
HAVING COUNT(convert(nvarchar(16),TRANS_DATE,121)) > 1
Since SQL forces me to include convert(nvarchar(16),TRANS_DATE,121) in the GROUP BY clause, I can't get the duplicate values.
Any idea for this?
Thanks in advance.
If you want the actual rows that are duplicated, then use window functions instead:
SELECT th.*, convert(nvarchar(16),TRANS_DATE,121)
FROM (SELECT th.*, COUNT(*) OVER (PARTITION BY convert(nvarchar(16),TRANS_DATE,121)) as cnt
FROM TRANSACTION_HEADER th
WHERE TRANS_DATE BETWEEN '11.01.2015' AND '12.01.2015'
) th
WHERE cnt > 1;
SELECT ID,RECEIPT_BARCODE,convert(nvarchar(16),TRANS_DATE,121), PTYPE ,COUNT(*)
FROM TRANSACTION_HEADER
WHERE TRANS_DATE BETWEEN '11.01.2015' AND '12.01.2015'
GROUP BY ID,RECEIPT_BARCODE,convert(nvarchar(16),TRANS_DATE,121), PTYPE
HAVING COUNT(*)>1;
I think you can use COUNT(*) directly here; try the query above.
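As a quick sanity check of the conversion itself (illustrative only): style 121 renders a datetime as yyyy-mm-dd hh:mi:ss.mmm, so keeping the first 16 characters leaves exactly minute precision.
-- Returns '2015-01-11 11:24'
SELECT CONVERT(nvarchar(16), CAST('2015-01-11 11:24:10.000' AS datetime), 121);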

Unpivot date columns to a single column of a complex query in Oracle

Hi guys, I am stuck with a stubborn problem which I am unable to solve. I am trying to compile a report wherein all the dates coming from different tables need to go into a single date field. Of course, the max (i.e. most recent) of all these date columns is what needs to go into that single date column. I have multiple users across multiple branches/courses for whom the report would be generated.
There are multiple blogs, and the latest date w.r.t. the blog title needs to be grouped, i.e. max(date_value) over the six date columns should give the greatest (latest) date for that blog title.
Expected result: (shown in a screenshot)
select u.batch_uid as ext_person_key, u.user_id, cm.batch_uid as ext_crs_key, cm.crs_id,
       ir.role_id as insti_role,
       (CASE when b.JOURNAL_IND = 'N' then 'BLOG' else 'JOURNAL' end) as item_type,
       gm.title as item_name, gm.disp_title as ITEM_DISP_NAME,
       be.blog_pk1 as be_blogPk1, bc.blog_entry_pk1 as bc_blog_entry_pk1, bc.pk1,
       b.ENTRY_mod_DATE as b_ENTRY_mod_DATE, b.CMT_mod_DATE as BlogCmtModDate, be.CMT_mod_DATE as be_cmnt_mod_Date,
       b.UPDATE_DATE as BlogUpDate, be.UPDATE_DATE as be_UPDATE_DATE,
       bc.creation_date as bc_creation_date,
       be.CREATOR_USER_ID as be_CREATOR_USER_ID, bc.creator_user_id as bc_creator_user_id,
       b.TITLE as BlogTitle, be.TITLE as be_TITLE,
       be.DESCRIPTION as be_DESCRIPTION, bc.DESCRIPTION as bc_DESCRIPTION
FROM users u
INNER JOIN insti_roles ir on u.insti_roles_pk1 = ir.pk1
INNER JOIN crs_users cu ON u.pk1 = cu.users_pk1
INNER JOIN crs_mast cm on cu.crsmast_pk1 = cm.pk1
INNER JOIN blogs b on b.crsmast_pk1 = cm.pk1
INNER JOIN blog_entry be on b.pk1 = be.blog_pk1 AND be.creator_user_id = cu.pk1
LEFT JOIN blog_CMT bc on be.pk1 = bc.blog_entry_pk1 and bc.CREATOR_USER_ID = cu.pk1
JOIN gradeledger_mast gm ON gm.crsmast_pk1 = cm.pk1 and b.grade_handler = gm.linkId
WHERE cu.ROLE='S' AND BE.STATUS='2' AND B.ALLOW_GRADING='Y' AND u.row_status='0'
  AND u.available_ind ='Y' and cm.row_status='0' and u.batch_uid='userA_157'
I am getting a resultset for the above query with multiple date columns, which I want to combine into a single column. The dates have to be the most recent, i.e. the max of the dates in the date columns.
I have successfully done the unpivot by using a view to store the above resultset and put all the dates in one column. However, I do not want to use a view or a table to store the resultset and then UNPIVOT it, simply because I cannot keep creating views for every user one would query for.
The max(date_value) of the date columns needs to be put into one single column. The columns are:
1) b.entry_mod_date, 2) b.cmt_mod_date, 3) be.cmt_mod_date, 4) b.update_date, 5) be.update_date, 6) bc.creation_date
Apologies that I could not provide the description of all the tables and fields being used.
Any help to get the max of the dates from these multiple date columns into a single column, without using a view or a table, would be greatly appreciated.
It is not clear what results you want, but the easiest solution is to use greatest().
with t as (
      YOURQUERYHERE
)
select t.*,
       greatest(b_ENTRY_mod_DATE, BlogCmtModDate, be_cmnt_mod_Date,
                BlogUpDate, be_UPDATE_DATE, bc_creation_date
       ) as greatestdate
from t;
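One caveat worth adding (not part of the original answer): in Oracle, GREATEST returns NULL as soon as any argument is NULL, and several of these columns can be NULL here (bc comes from a LEFT JOIN, and the comment/update dates may not always be populated). A defensive variant wraps the nullable columns, for example:
-- same CTE t as above; DATE '0001-01-01' is just an arbitrary "very old" sentinel
select t.*,
       greatest(coalesce(b_ENTRY_mod_DATE, DATE '0001-01-01'),
                coalesce(BlogCmtModDate,   DATE '0001-01-01'),
                coalesce(be_cmnt_mod_Date, DATE '0001-01-01'),
                coalesce(BlogUpDate,       DATE '0001-01-01'),
                coalesce(be_UPDATE_DATE,   DATE '0001-01-01'),
                coalesce(bc_creation_date, DATE '0001-01-01')) as greatestdate
from t;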
select <columns>,
       case
         when b_ENTRY_mod_DATE >= BlogCmtModDate and b_ENTRY_mod_DATE >= BlogUpDate
           then b_ENTRY_mod_DATE
         --<same comparisons for BlogCmtModDate and BlogUpDate to pick whichever is the greatest 'date'>
       end as date_value,
       <columns>
FROM table
<rest of the query>
UNION ALL
select <columns>,
       case
         when be_cmnt_mod_Date >= be_UPDATE_DATE
           then be_cmnt_mod_Date
         when be_UPDATE_DATE >= be_cmnt_mod_Date
           then be_UPDATE_DATE
       end as date_value,
       <columns>
FROM table
<rest of the query>
UNION ALL
select <columns>,
       bc_creation_date as date_value,
       <columns>
FROM table
<rest of the query>

Trying to select multiple columns on an inner join query with group and where clauses

I'm trying to run a query that gives me one SUM, selects two columns from a joined table, and then groups the data by the unique ID I gave them. This is my original query, and it works.
SELECT Sum (Commission_Paid)
FROM [INTERN_DB2].[dbo].[PaymentList]
INNER JOIN [INTERN_DB2]..[RealEstateAgentList]
ON RealEstateAgentList.AgentID = PaymentList.AgentID
WHERE Close_Date >= '1/1/2013' AND Close_Date <= '12/31/2013'
GROUP BY RealEstateAgentList.AgentID
I've tried the query below, but I keep getting an error and I don't know why. It says it's a syntax error.
SELECT Sum (Commission_Paid)
FROM [INTERN_DB2].[dbo].[PaymentList]
INNERJOIN [INTERN_DB2]..[RealEstateAgentList](
Select First_Name, Last_Name
From [Intern_DB2]..[RealEstateAgentList]
Group By Last_name
)
ON RealEstateAgentList.AgentID = PaymentList.AgentID
WHERE Close_Date >= '1/1/2013' AND Close_Date <= '12/31/2013'
GROUP BY RealEstateAgentList.AgentID
Your query has multiple problems:
SELECT rl.AgentId, rl.first_name, rl.last_name, Sum(Commission_Paid)
FROM [INTERN_DB2].[dbo].[PaymentList] pl inner join
(Select agent_id, min(first_name) as first_name, min(last_name) as last_name
From [Intern_DB2]..[RealEstateAgentList]
GROUP BY agent_id
) rl
ON rl.AgentID = pl.AgentID
WHERE Close_Date >= '2013-01-01' AND Close_Date <= '2013-12-31'
GROUP BY rl.AgentID, rl.first_name, rl.last_name;
Here are some changes:
INNERJOIN --> inner join.
Fixed the syntax of the subquery next to the table name.
Removed columns for first and last name. They are not used.
Changed the subquery to include agent_id.
Added agent_id, first_name, and last_name to the outer aggregation, so you can tell where the values are coming from.
Changed the date formats to a less ambiguous standard form.
Added table alias for subquery.
I suspect the subquery on the agent list is not important. You can probably do:
SELECT rl.AgentId, rl.first_name, rl.last_name, Sum(pl.Commission_Paid)
FROM [INTERN_DB2].[dbo].[PaymentList] pl inner join
[Intern_DB2]..[RealEstateAgentList] rl
ON rl.AgentID = pl.AgentID
WHERE pl.Close_Date >= '2013-01-01' AND pl.Close_Date <= '2013-12-31'
GROUP BY rl.AgentID, rl.first_name, rl.last_name;
EDIT:
I'm glad this solution helped. As you continue to write queries, try to always do the following:
Use table aliases that are abbreviations of the table names.
Always use table aliases when referring to columns.
When using date constants, either use the "YYYY-MM-DD" format or use convert() to convert a string using an explicitly specified format. (The latter is actually the safer method, but the former is more convenient and works in almost all databases.) See the sketch after this list.
Pay attention to the error messages; they can be informative in SQL Server (unfortunately, other databases are not so clear).
Format your query so other people can understand it. This will help you understand and debug your queries as well. I have a very particular formatting style (which no one is going to change at this point); the important thing is not the particular style but being able to "see" what the query is doing. My style is documented in my book "Data Analysis Using SQL and Excel".
There are other rules, but these are a good way to get started.
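To illustrate the date-constant point above, a small sketch (SQL Server syntax assumed; not part of the original answer):
-- Unambiguous date constants for the 2013 range:
SELECT CONVERT(date, '20130101', 112)                 AS start_of_2013,  -- style 112 = yyyymmdd
       CONVERT(datetime, '2013-12-31 23:59:59', 120)  AS end_of_2013;    -- style 120 = yyyy-mm-dd hh:mi:ss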
SELECT Sum(Commission_Paid)
FROM [INTERN_DB2].[dbo].[PaymentList] pl
INNER JOIN (
    Select AgentID, First_Name, Last_Name
    From [Intern_DB2]..[RealEstateAgentList]
    Group By AgentID, Last_name
) x ON x.AgentID = pl.AgentID
WHERE Close_Date >= '1/1/2013'
  AND Close_Date <= '12/31/2013'
GROUP BY x.AgentID
This is how the query should look... however, if you subquery first and last name, you'll also have to include them in the group by. Assuming Close_Date is in the PaymentList table, this is how I would write the query:
SELECT
al.AgentID,
al.FirstName,
al.LastName,
Sum(pl.Commission_Paid) AS Commission_Paid
FROM [INTERN_DB2].[dbo].[PaymentList] pl
INNER JOIN [Intern_DB2].dbo.[RealEstateAgentList] al ON al.AgentID = pl.AgentID
WHERE YEAR(pl.Close_Date) = '2013'
GROUP BY al.AgentID, al.FirstName, al.LastName
Subqueries are evil, for the most part. There's no need for one here, because you can just get the columns from the join.