Invalid column reference in hive - sql

When I run the query below, I get the error "Invalid column reference: cnt". Any suggestions would be great!
select count(customer) as cnt from (
select customer, concat(visid, lowid), count(name)
from tab1 where date_time between '2017-05-01 00:00:00' and '2017-05-31 23:59:59' and name in ('payment: Complete', 'check: Complete')
group by evar71, concat(visid, lowid)) t1
where cnt > 1;

Another way to do it.
select count(customer) as cnt from (
select customer, concat(visid, lowid), count(name)
from tab1 where date_time between '2017-05-01 00:00:00' and '2017-05-31 23:59:59' and name in ('payment: Complete', 'check: Complete')
group by evar71, concat(visid, lowid)) t1
having count(customer) > 1;

The WHERE filter is applied before aggregation, which is why where cnt > 1 does not work. The HAVING keyword introduces a condition on aggregations; it works as a filter after aggregation.
select count(customer) cnt
...
where rows_filter_condition_here --before aggregation
having count(customer) > 1 --aggregation results filter
order by cnt desc --this works after aggregation
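The ordering of WHERE and HAVING described above can be checked with a small sketch. It uses Python's built-in sqlite3 module rather than Hive, a cut-down tab1 with only two columns, and made-up rows, so all of that is an assumption for illustration; the clause semantics shown are standard SQL.

```python
import sqlite3

# In-memory SQLite database standing in for Hive (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab1 (customer TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO tab1 VALUES (?, ?)",
    [
        ("a", "payment: Complete"),
        ("a", "check: Complete"),
        ("b", "payment: Complete"),
        ("c", "other"),
        ("c", "other"),
    ],
)

# WHERE removes rows before grouping; HAVING removes groups after counting.
rows = conn.execute(
    """
    SELECT customer, COUNT(name) AS cnt
    FROM tab1
    WHERE name IN ('payment: Complete', 'check: Complete')  -- row filter
    GROUP BY customer
    HAVING COUNT(name) > 1                                   -- group filter
    ORDER BY cnt DESC                                        -- runs last
    """
).fetchall()
print(rows)
```

Customer c has two rows, but both are dropped by WHERE before any counting happens, so only customer a survives the HAVING filter.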

I think hive prefers aliases in the group by. In addition, several column aliases are not correct:
select count(customer) as cnt
from (select customer, concat(visid, lowid) as ids, count(name) as cc
from tab1
where date_time >= '2017-05-01' and date_time < '2017-06-01' and
name in ('payment: Complete', 'check: Complete')
group by customer, ids
) t1
where cc > 1;

Related

SQL - One Table with Two Date Columns. Count and Join

I have a table (vOKPI_Tickets) that has the following columns:
|CreationDate | CompletionDate|
I'd like to get a count on each of those columns, and group them by date. It should look something like this when complete:
| Date | Count-Created | Count-Completed |
I can get each of the counts individually, by doing something like this:
SELECT COUNT(TicketId)
FROM vOKPI_Tickets
GROUP BY CreationDate
and
SELECT COUNT(TicketId)
FROM vOKPI_Tickets
GROUP BY CompletionDate
How can I combine the output into one table? I should also note that this will become a View.
Thanks in advance
Simple generic approach:
select
coalesce(crte.creationdate, cmpl.CompletionDate) as theDate,
crte.cnt as created,
cmpl.cnt as completed
from
(select creationdate, count (*) as cnt from vOKPI_Tickets where creationdate is not null group by creationdate) crte
full join
(select CompletionDate, count (*) as cnt from vOKPI_Tickets where CompletionDate is not null group by CompletionDate) cmpl
on crte.creationdate = cmpl.CompletionDate
You can unpivot and aggregate. A general method is:
select dte, sum(created), sum(completed)
from ((select creationdate as dte, 1 as created, 0 as completed
from vOKPI_Tickets
) union all
(select CompletionDate as dte, 0 as created, 1 as completed
from vOKPI_Tickets
)
) t
group by dte;
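As a rough check of the unpivot-and-aggregate idea, here is a sketch that runs the same UNION ALL shape in SQLite through Python's sqlite3 module. The three sample tickets are invented, the dates are stored as text, and the `WHERE dte IS NOT NULL` filter is an addition so that tickets without a completion date do not produce a NULL group.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE vOKPI_Tickets "
    "(TicketId INTEGER, CreationDate TEXT, CompletionDate TEXT)"
)
conn.executemany(
    "INSERT INTO vOKPI_Tickets VALUES (?, ?, ?)",
    [
        (1, "2020-01-01", "2020-01-02"),
        (2, "2020-01-01", None),          # still open, no completion date
        (3, "2020-01-02", "2020-01-02"),
    ],
)

# Each ticket contributes one "created" row and one "completed" row;
# summing the 0/1 flags per date gives both counts side by side.
daily_counts = conn.execute(
    """
    SELECT dte, SUM(created) AS created, SUM(completed) AS completed
    FROM (
        SELECT CreationDate AS dte, 1 AS created, 0 AS completed
        FROM vOKPI_Tickets
        UNION ALL
        SELECT CompletionDate AS dte, 0 AS created, 1 AS completed
        FROM vOKPI_Tickets
    ) t
    WHERE dte IS NOT NULL
    GROUP BY dte
    ORDER BY dte
    """
).fetchall()
print(daily_counts)
```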
In SQL Server, you can use cross apply for this:
select d.dt, sum(d.is_created) count_created, sum(d.is_completed) count_completed
from vokpi_tickets t
cross apply (values (CreationDate, 1, 0), (CompletionDate, 0, 1)) as d(dt, is_created, is_completed)
where d.dt is not null
group by d.dt

SQLite Getting multiple results with LIMIT 1

I have the following problem.
Part of a task is to determine the visitor(s) with the most money spent between 2000 and 2020.
It just looks like this.
SELECT UserEMail FROM Visitor
JOIN Ticket ON Visitor.UserEMail = Ticket.VisitorUserEMail
where Ticket.Date> date('2000-01-01') AND Ticket.Date < date ('2020-12-31')
Group by Ticket.VisitorUserEMail
order by SUM(Price) DESC;
Is it possible to output more than one person if both have spent the same amount?
Use rank():
SELECT VisitorUserEMail
FROM (SELECT VisitorUserEMail, SUM(PRICE) as sum_price,
RANK() OVER (ORDER BY SUM(Price) DESC) as seqnum
FROM Ticket t
WHERE t.Date >= date('2000-01-01') AND t.Date < date('2021-01-01')
GROUP BY t.VisitorUserEMail
) t
WHERE seqnum = 1;
Note: You don't need the JOIN, assuming that ticket buyers are actually visitors. If that assumption is not true, then use the JOIN.
Use a CTE that returns all the total prices for each email and with NOT EXISTS select the rows with the top total price:
WITH cte AS (
SELECT VisitorUserEMail, SUM(Price) SumPrice
FROM Ticket
WHERE Date >= '2000-01-01' AND Date <= '2020-12-31'
GROUP BY VisitorUserEMail
)
SELECT c.VisitorUserEMail
FROM cte c
WHERE NOT EXISTS (
SELECT 1 FROM cte
WHERE SumPrice > c.SumPrice
)
or:
WITH cte AS (
SELECT VisitorUserEMail, SUM(Price) SumPrice
FROM Ticket
WHERE Date >= '2000-01-01' AND Date <= '2020-12-31'
GROUP BY VisitorUserEMail
)
SELECT VisitorUserEMail
FROM cte
WHERE SumPrice = (SELECT MAX(SumPrice) FROM cte)
Note that you don't need the function date() because the result of date('2000-01-01') is '2000-01-01'.
Also I think that the conditions in the WHERE clause should include the =, right?
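A small sketch of the MAX-based variant, run through Python's sqlite3 module, shows tied top spenders both being returned. The e-mail addresses, dates and prices are invented sample data, and Price is stored as an integer here to keep the comparison exact.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Ticket (VisitorUserEMail TEXT, Date TEXT, Price INTEGER)"
)
conn.executemany(
    "INSERT INTO Ticket VALUES (?, ?, ?)",
    [
        ("ann@example.com", "2005-06-01", 30),
        ("ann@example.com", "2010-06-01", 20),
        ("bob@example.com", "2015-03-03", 50),
        ("cat@example.com", "2019-01-01", 10),
        ("cat@example.com", "2021-05-05", 100),  # outside the date range
    ],
)

# Total per visitor inside the range, then keep every visitor whose
# total equals the maximum total, so ties are all returned.
top_spenders = conn.execute(
    """
    WITH cte AS (
        SELECT VisitorUserEMail, SUM(Price) AS SumPrice
        FROM Ticket
        WHERE Date >= '2000-01-01' AND Date <= '2020-12-31'
        GROUP BY VisitorUserEMail
    )
    SELECT VisitorUserEMail
    FROM cte
    WHERE SumPrice = (SELECT MAX(SumPrice) FROM cte)
    ORDER BY VisitorUserEMail
    """
).fetchall()

# date() on an ISO date string returns the same string.
same_date = conn.execute("SELECT date('2000-01-01')").fetchone()[0]
print(top_spenders, same_date)
```

Ann and Bob both total 50, so both rows come back, which LIMIT 1 would not do.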

Getting ORA-00928: missing SELECT keyword error

I am getting an error
ORA-00928: missing SELECT keyword
while running this query:
WITH Dups AS
(
SELECT
ID, AMOUNT, BATCH_ID, PROCESS_DATE, ITEM_NUMBER, ERROR_TYPE, INSERTED_DATE,
ROW_NUMBER() OVER(PARTITION BY ID, ERROR_TYPE ORDER BY ID) AS rn
FROM
ERROR_TABLE
WHERE
inserted_date >= TRIM(TO_DATE('01-AUG-17', 'DD-MON-YY'))
AND inserted_date <= TRIM(TO_DATE('11-AUG-17', 'DD-MON-YY'))
)
DELETE FROM Dups
WHERE rn > 1
This is not how to delete duplicates in Oracle. The approach is inspired, but it does not work in that database: Oracle expects a SELECT after the WITH clause, not a DELETE. Something like this:
delete error_table et
where et.inserted_date >= date '2017-08-01' and
et.inserted_date <= date '2017-08-11' and
rowid > (select min(et2.rowid)
from error_table et2
where et2.inserted_date >= date '2017-08-01' and
et2.inserted_date <= date '2017-08-11' and
et2.id = et.id and
et2.error_type = et.error_type
);
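The keep-the-lowest-rowid pattern can be tried out in SQLite, which also exposes a rowid for ordinary tables. This is only an analogy: SQLite's integer rowid is not the same thing as Oracle's ROWID, and the table here is reduced to three columns with invented rows and text dates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE error_table (id INTEGER, error_type TEXT, inserted_date TEXT)"
)
conn.executemany(
    "INSERT INTO error_table VALUES (?, ?, ?)",
    [
        (1, "A", "2017-08-02"),
        (1, "A", "2017-08-03"),  # duplicate of (1, 'A') inside the range
        (1, "B", "2017-08-03"),
        (2, "A", "2017-08-05"),
        (2, "A", "2017-08-05"),  # duplicate of (2, 'A') inside the range
        (2, "A", "2017-09-01"),  # outside the range, must survive
    ],
)

# Within the date range, delete every row whose rowid is larger than the
# smallest rowid sharing the same (id, error_type).
conn.execute(
    """
    DELETE FROM error_table
    WHERE inserted_date >= '2017-08-01'
      AND inserted_date <= '2017-08-11'
      AND rowid > (SELECT MIN(et2.rowid)
                   FROM error_table et2
                   WHERE et2.inserted_date >= '2017-08-01'
                     AND et2.inserted_date <= '2017-08-11'
                     AND et2.id = error_table.id
                     AND et2.error_type = error_table.error_type)
    """
)

remaining = conn.execute(
    "SELECT id, error_type, inserted_date FROM error_table ORDER BY rowid"
).fetchall()
print(remaining)
```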

SQL Server : combining COUNT(*) for different tables at once

I have a few queries that I would like to combine into ONE query in order to not have to call out to the server multiple times.
An example of the queries I am using:
SELECT COUNT(*) AS mailCount1
FROM [WebContact].[dbo].[memberEmails]
WHERE contactdatetime > '01/01/06'
AND contactdatetime < '02/01/06'
SELECT COUNT(*) AS mailCount2
FROM [WebContact].[dbo].[otherEmails]
WHERE contactdatetime > '01/01/06'
AND contactdatetime < '02/01/06'
SELECT COUNT(*) AS mailCount3
FROM [WebContact].[dbo].[memberEmails]
WHERE contactdatetime > '02/01/06'
AND contactdatetime < '03/01/06'
SELECT COUNT(*) AS mailCount4
FROM [WebContact].[dbo].[otherEmails]
WHERE contactdatetime > '02/01/06'
AND contactdatetime < '03/01/06'
etc etc...
So as the examples above, only thing that changes are:
The FROM (memberEmails & otherEmails)
The > & < months (01/01/06, 02/01/06 | 02/01/06, 03/01/06 | etc...)
Is this possible to do with a single query?
First, use group by and just use two queries:
select year(contactdatetime) as yyyy, month(contactdatetime) as mm, count(*)
from [WebContact].[dbo].[memberEmails]
group by year(contactdatetime), month(contactdatetime);
and:
select year(contactdatetime) as yyyy, month(contactdatetime) as mm, count(*)
from [WebContact].[dbo].[otherEmails]
group by year(contactdatetime), month(contactdatetime);
Then, if you like, you can combine these into a single query:
select coalesce(me.yyyy, oe.yyyy) as yyyy, coalesce(me.mm, oe.mm) as mm,
coalesce(me.cnt, 0) as memberemailcnt,
coalesce(oe.cnt, 0) as otheremailcnt
from (select year(contactdatetime) as yyyy, month(contactdatetime) as mm, count(*) as cnt
from [WebContact].[dbo].[memberEmails]
group by year(contactdatetime), month(contactdatetime)
) me full outer join
(select year(contactdatetime) as yyyy, month(contactdatetime) as mm, count(*) as cnt
from [WebContact].[dbo].[otherEmails]
group by year(contactdatetime), month(contactdatetime)
) oe
on me.yyyy = oe.yyyy and me.mm = oe.mm;
A full outer join is not necessary if both tables have data for all months.
declare @emailCount table(tablename varchar(20), year int, month int, qty int)
insert into @emailCount
select 'memberEmails', year(contactdatetime), month(contactdatetime), count(*)
from [WebContact].[dbo].[memberEmails]
group by year(contactdatetime), month(contactdatetime)
insert into @emailCount
select 'otherEmails', year(contactdatetime), month(contactdatetime), count(*)
from [WebContact].[dbo].[otherEmails]
group by year(contactdatetime), month(contactdatetime)
select tablename, year, month, qty from @emailCount
Add WHERE clause if needed to restrict date ranges. (edit- simplified to use year() and month() functions.)
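The answers above all come down to counting per year and month in each table and then lining the counts up. A minimal sketch of that idea follows, run in SQLite through Python's sqlite3 module. It is not the SQL Server code: it swaps the full outer join for UNION ALL plus conditional 0/1 flags, uses SQLite's strftime() in place of year() and month(), and the column names member_cnt and other_cnt and the sample dates are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE memberEmails (contactdatetime TEXT)")
conn.execute("CREATE TABLE otherEmails (contactdatetime TEXT)")
conn.executemany(
    "INSERT INTO memberEmails VALUES (?)",
    [("2006-01-05",), ("2006-01-20",), ("2006-02-10",)],
)
conn.executemany(
    "INSERT INTO otherEmails VALUES (?)",
    [("2006-02-11",), ("2006-03-01",)],
)

# Tag each row with which table it came from, stack both tables,
# then sum the tags per (year, month) in one pass.
monthly = conn.execute(
    """
    SELECT yyyy, mm, SUM(member_cnt), SUM(other_cnt)
    FROM (
        SELECT CAST(strftime('%Y', contactdatetime) AS INTEGER) AS yyyy,
               CAST(strftime('%m', contactdatetime) AS INTEGER) AS mm,
               1 AS member_cnt, 0 AS other_cnt
        FROM memberEmails
        UNION ALL
        SELECT CAST(strftime('%Y', contactdatetime) AS INTEGER),
               CAST(strftime('%m', contactdatetime) AS INTEGER),
               0, 1
        FROM otherEmails
    ) AS e
    GROUP BY yyyy, mm
    ORDER BY yyyy, mm
    """
).fetchall()
print(monthly)
```

Months that appear in only one table still show up, with 0 for the other table, which is the job the full outer join and coalesce do in the SQL Server version.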
I haven't checked the syntax or performance, but you can do something like this:
WITH cte (
countvalue
,description
)
AS (
SELECT COUNT(*)
,'mailCount1'
FROM [WebContact].[dbo].[memberEmails]
WHERE contactdatetime > '01/01/06'
AND contactdatetime < '02/01/06'
UNION ALL
SELECT COUNT(*)
,'mailCount2'
FROM [WebContact].[dbo].[otherEmails]
WHERE contactdatetime > '01/01/06'
AND contactdatetime < '02/01/06'
UNION ALL
SELECT COUNT(*)
,'mailCount3'
FROM [WebContact].[dbo].[memberEmails]
WHERE contactdatetime > '02/01/06'
AND contactdatetime < '03/01/06'
UNION ALL
SELECT COUNT(*)
,'mailCount4'
FROM [WebContact].[dbo].[otherEmails]
WHERE contactdatetime > '02/01/06'
AND contactdatetime < '03/01/06'
)
SELECT mailCount1
,mailCount2
,mailCount3
,mailCount4
FROM (
SELECT countvalue
,description
FROM cte
) d
pivot(max(countvalue) FOR description IN (mailCount1, mailCount2, mailCount3, mailCount4)) piv;
Hope this helps..

Oracle sub error on query

I added the following code to a SQL Server query and now have to do the same in Oracle. I need to do the grouping in the view rather than in the C#. I get this error message:
ORA-01747 Invalid user.table.column or column specification.
How must I code this to work in Oracle?
SELECT CTE.FACILITY_KEY, CTE.DATE, CTE.PATIENT_STATUS, COUNT(*) AS [COUNT]
FROM CTE
GROUP BY CTE.FACILITY_KEY, CTE.DATE, CTE.PATIENT_STATUS;
Here is the full code, with the CTE at the beginning of the query:
CREATE OR REPLACE VIEW DBD_V_CDL_CHANGES AS
WITH CTE AS
(
SELECT TR.FACILITY_KEY
, MV.VALUE_CODE
, CAST(COUNT(*) AS NUMERIC(9, 0)) COUNT
FROM OPTC.THS_T_TRANSACTIONS1 TR
JOIN OPTC.THS_M_MENU2 M
ON M.MENU_ID = TR.MENU_ID
JOIN OPTC.THS_M_VALUES MV
ON MV.MENU_ID = TR.MENU_ID_VALUE
JOIN OPTC.THS_M_VALUES MV2
ON MV2.MENU_ID = TR.PREVIOUS_MENU_ID_VALUE
JOIN OGEN.GEN_M_PATIENT_MAST PM
ON PM.PAT_NUMBER = TR.PAT_NUMBER
WHERE TR.TR_DATETIME BETWEEN TRUNC(SYSDATE)
AND TRUNC(SYSDATE) + 86399 / 86400
AND TR.EDIT_NO < 0
AND MV.VALUE_TYPE IS NULL
AND MV2.VALUE_TYPE IS NULL
AND MV.VALUE_CODE >= 0
AND MV2.VALUE_CODE >= 0
AND M.SUB_SYS_EXT = 'G1'
AND ABS(MV.VALUE_CODE - MV2.VALUE_CODE) > 1
AND (PM.DISCHARGE_DATE IS NULL OR PM.DISCHARGE_DATE < SYSDATE)
GROUP BY TR.FACILITY_KEY, MV.VALUE_CODE)
SELECT CTE.FACILITY_KEY, CTE.DATE, CTE.PATIENT_STATUS, COUNT(*) AS [COUNT] FROM CTE
GROUP BY CTE.FACILITY_KEY, CTE.DATE, CTE.PATIENT_STATUS;
I see a few things wrong with your code.
First, you are selecting the following three columns FACILITY_KEY, VALUE_CODE and the count in the CTE:
SELECT TR.FACILITY_KEY ,
MV.VALUE_CODE ,
COUNT(*) as Count -- note there is no need to CAST(COUNT(*) AS NUMERIC(9, 0)) this
FROM OPTC.THS_T_TRANSACTIONS1 TR
But then when you select from the CTE you are selecting columns that you are not returning in the CTE:
with cte as
(
-- your query here does not return DATE or PATIENT_STATUS
)
SELECT CTE.FACILITY_KEY,
CTE.DATE,
CTE.PATIENT_STATUS,
COUNT(*) AS COUNT
FROM CTE
GROUP BY CTE.FACILITY_KEY, CTE.DATE, CTE.PATIENT_STATUS;
Where do PATIENT_STATUS and DATE come from, since you are not including them in your CTE? They do not exist when you try to select them.
I replicated your error by including columns in the list that were not selected in the CTE query.
The second issue is the CTE.DATE column. DATE is a reserved word, so place it in double quotes: CTE."DATE"
...AS [COUNT] is SQL Server syntax, not Oracle syntax, and will not work. Simply remove the [ ], and use NUMBER, Oracle's native numeric type, instead of NUMERIC. There is no need to CAST COUNT(): the function always returns a number, either 0 or some positive number.
This is valid syntax in Oracle:
SELECT deptno, count(*) total_count_by_dept -- no need to cast or AS --
FROM scott.emp
GROUP BY deptno
/
Try not to use reserved words such as COUNT for aliases:
SELECT CTE.FACILITY_KEY, CTE."DATE", CTE.PATIENT_STATUS, COUNT(*) AS total_cnt -- 'AS' is for clarity only, not required
FROM CTE
GROUP BY CTE.FACILITY_KEY, CTE."DATE", CTE.PATIENT_STATUS
/
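The first point raised in the answers, that the outer query can only reference columns the CTE actually returns, can be reproduced with a sketch like the one below. It runs in SQLite through Python's sqlite3 module, so the error text differs from Oracle's ORA- messages; the transactions table and its rows are hypothetical stand-ins for the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (facility_key INTEGER, value_code INTEGER)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?)",
    [(1, 10), (1, 10), (2, 20)],
)

# The CTE exposes exactly three columns: facility_key, value_code, total_cnt.
cte = (
    "WITH cte AS ("
    " SELECT facility_key, value_code, COUNT(*) AS total_cnt"
    " FROM transactions"
    " GROUP BY facility_key, value_code) "
)

# Selecting columns the CTE returns works.
ok_rows = conn.execute(
    cte + "SELECT facility_key, value_code, total_cnt FROM cte ORDER BY facility_key"
).fetchall()

# Selecting a column the CTE never returned fails.
try:
    conn.execute(cte + "SELECT facility_key, patient_status FROM cte")
    error_message = None
except sqlite3.OperationalError as exc:
    error_message = str(exc)

print(ok_rows)
print(error_message)
```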