I'm trying to replicate the following SQL in Rails3 active record, nothing I've found so far comes close. So, any help would be appreciated.
SELECT AVG(DAILY_AVG) FROM (
SELECT user_code, (COUNT(actioned_at) / 200) as DAILY_AVG
FROM transactions
GROUP BY user_code
) TMP
I'm currently executing this directly using ...connection.select_value(sql) but would really like to figure out the active record way of doing this.
The inner query can be written as:
Transaction.group(:user_code).select("COUNT(actioned_at) / 200 AS daily_avg")
And then to nest this to get the average we can do:
Transaction.select("AVG(daily_avg)").from(Transaction.group(:user_code).select("COUNT(actioned_at) / 200 AS daily_avg"))[0].avg.to_f
Related
I am trying to replicate the Google Analyitcs data in Big Query but couldnt do that.
Basically I am using Custom Dimension 40 (user subscription status)
but I am getting wrong numbers in BQ.
Can someone help me on this?
I am using this query but couldn't find it out the exact one.
SELECT
(SELECT value FROM hits.customDimensions where index=40) AS UserStatus,
COUNT(hits.transaction.transactionId) AS Unique_Purchases
FROM
`xxxxxxxxxxxxx.ga_sessions_2020*` AS GA, --new rollup
UNNEST(GA.hits) AS hits
WHERE
(SELECT value FROM hits.customDimensions where index=40) IN ("xx001","xxx002")
GROUP BY 1
I am getting this from big query which is wrong.
I have check out the dates also but dont know why its wrong.
Your question is rather unclear. But because you want something to be unique and numbers are mysteriously not what you want, I would suggest using COUNT(DISTINCT):
COUNT(DISTINCT hits.transaction.transactionId) AS Unique_Purchases
As far as I understand, you imported Google Analytics data into Bigquery and you are trying to group the custom dimension with index 40 and values ("xx001","xxx002") in order to know how many hit transactions were performed in function of these dimension values.
Replicating your scenario and trying to execute the query you posted, I got the following error.
However, I created a query that could help with your use-case. At first, it selects the transactionId and dimension values with the transactionId different from null and with index value equal to 40, then the grouping is done by the dimension value, filtered with values equals to "xx001"&"xxx002".
WITH tx AS (
SELECT
HIT.transaction.transactionId,
CD.value
FROM
`xxxxxxxxxxxxx.ga_sessions_2020*` AS GA,
UNNEST(GA.hits) AS HIT,
UNNEST(HIT.customDimensions) AS CD
WHERE
HIT.transaction.transactionId IS NOT NULL
AND
CD.index = 40
)
SELECT tx.value AS UserStatus, count(tx.transactionId) AS Unique_Purchases
FROM tx
WHERE tx.value IN ("xx001","xx002")
GROUP BY tx.value
For further details about the format and schema of the data that is imported into BigQuery, I found this document.
I do have a problem with a task because my division value is different when I use it alone and when I use it in full code. Let's say I do this code:
SELECT (count(paimta))::numeric / count(distinct paimta) as average
FROM Stud.Egzempliorius;
and finally a number I get is 2.(6)7, but when I use it in full code which is:
SELECT Stud.Egzempliorius.Paimta, COUNT(PAIMTA) as PaimtaKnyga
FROM Stud.Skaitytojas, Stud.Egzempliorius
WHERE Stud.Skaitytojas.Nr=Stud.Egzempliorius.Skaitytojas
GROUP BY Stud.Egzempliorius.Paimta
HAVING count(paimta) > (count(paimta))::numeric / count(distinct paimta);
it's value changes because division is not working anymore and let's say instead of having
HAVING count(paimta) > (count(paimta))::numeric / count(distinct paimta);
my codes turns into
HAVING count(paimta) > (count(paimta))::numeric;
and these values are equal, so I can't get final answer. That's database I use https://klevas.mif.vu.lt/~baronas/dbvs/biblio/show-table.php?table=Stud.Egzempliorius
I was struggling for 10 hours now and finally I've lost my patience... So, my question is what I have to do that this code:
SELECT (count(paimta))::numeric / count(distinct paimta) as average
FROM Stud.Egzempliorius;
value doesn't change in full code?
Picture how it changes Photo
Your solution fails because the two queries operate on a different groups of rows. The first query does a computation over the whole dataset, while the second one groups by paimta.
One option would have been to use window functions, but as far as concerns Postgres does not support count(distinct) as a window function.
I think that the simplest approach is to use a subquery :
select e.paimta, count(paimta) as paimtaknyga
from stud.skaitytojas s
inner join stud.egzempliorius e on s.nr = e.skaitytojas
group by e.paimta
having count(paimta) > (
select (count(paimta))::numeric / count(distinct paimta) from stud.egzempliorius
)
i am building a partition based table in a dataset and i am trying to query those partitions using a date range.
Here is an example of the data:
Dataset:
logs
Tables:
logs_20170501
logs_20170502
logs_20170503
i am trying first the TABLE_RANGE_DATE
SELECT count(*) FROM TABLE_DATE_RANGE([logs.logs_],
TIMESTAMP("2017-05-01"),
TIMESTAMP("2017-05-03")) as logs_count
i am keep getting : "ERROR:Error evaluating subsidiary query"
i tried those options as well:
single comma:
SELECT count(*) FROM TABLE_DATE_RANGE([logs.logs_],
TIMESTAMP('2017-05-01'),
TIMESTAMP('2017-05-03')) as logs_count
Add Project ID:
SELECT count(*) FROM TABLE_DATE_RANGE([main_sys_logs:logs.logs_],
TIMESTAMP('2017-05-01'),
TIMESTAMP('2017-05-03')) as logs_count
And it didn't worked.
So i tried to use TABLE_SUFFIX
SELECT
count(*)
FROM [main_sys_logs:logs.logs_*]
WHERE _TABLE_SUFFIX BETWEEN '20170501' AND '20170503'
And i got this error :
Invalid table name:'main_sys_logs:logs.logs_*
i have been switching SQL Dialect between legacy SQL ON/Off and i just got different errors on the table name part.
Is there any tips or help for this matter ?
maybe my table name is build wrong with the "_" at the end and this is causing the problem ? thanks for any help.
So i tried this Query and it worked :
SELECT count(*) FROM TABLE_DATE_RANGE(logs.logs_,
TIMESTAMP("2017-05-01"),
TIMESTAMP("2017-05-03")) as logs_count
it started to work after i run this query , i don't know if this is the reason .. but i just query the TABLES data for the dataset
SELECT *
FROM logs__TABLES__
I'm looking for a bit of support regarding using a value from a separate query in an update query. The background is that i have a query calle qry_AvgOfXCoeff which calculates the average of tbl_ConvertToDouble.XCoeff. What i would like to do is replace any Xcoeff value that is greater than 0 with the avg calculated in the first query. At present i cannot use the qry directly in an Update query as i received the dreaded 'Must use a updateable query' error.
qry_AvgOfXCoeff:
SELECT Avg(tbl_ConvertToDouble.XCoeff) AS [Avg]
FROM tbl_ConvertToDouble;
Now i've been informed that i should be able to do this by using an IN condition in the update query, but im really stumped with this one and cannot seem to find any examples of how i would implement this. I've had a play with some code as per below, but please can someone help with this. It seems such a simple thing.
UPDATE qry_AvgOfXCoeff, tbl_ConvertToDouble SET tbl_ConvertToDouble.[Xcoeff]
WHERE (( ( tbl_ConvertToDouble.[xcoeff] ) IN (SELECT [qry_AvgOfCoeff].[Avg]
FROM [qry_AvgOfCoeff] AS Tmp
Where [tbl_ConvertToDouble].[Xcoeff] > 0) ))
ORDER BY tbl_calcreg.[xcoeff];
Thank you kindly in advance.
Donna
Access offers Domain Aggregate Functions that can be helpful in avoiding the "Operation must use an updateable query" issue. In this case, you can use the DAvg() function
UPDATE tbl_ConvertToDouble
SET XCoeff = DAvg("XCoeff", "tbl_ConvertToDouble")
WHERE XCoeff>0
iam not a big ORACLE - SQL Expert, so i hope someone knows a good way to find the "duplicate" record wich is causing the: single-row subquery returns more than one row error.
This my Statement:
SELECT
CAST(af.SAP_SID AS VARCHAR2(4000)) APP_ID,
(SELECT DR_OPTION
FROM
DR_OPTIONS
WHERE DR_OPTIONS.ID = (
select dr_option from applications where applications.sap_sid = af.sap_sid)) DR_OPTION
FROM
APPLICATIONS_FILER_VIEW af
it works on my test system, so iam "sure" there must be an error inside the available data records, but i have no idea how to find those ..
Try with this query:
select applications.sap_sid, count(dr_option)
from applications
group by applications.sap_sid
having count(dr_option) > 1
This should give you the sap_sid of the duplicated rows
I'd suggest simplifying your query:
SELECT CAST(af.SAP_SID AS VARCHAR2(4000)) APP_ID,
dr.DR_OPTION
FROM APPLICATIONS_FILER_VIEW af
INNER JOIN applications a ON af.sap_sid = a.sap_sid
INNER JOIN DR_OPTIONS dr ON a.dr_option = dr.ID
I would investigate what you get when you run:
select dr_option from applications where applications.sap_sid = af.sap_sid
but you could force only one row to be returned (I see this as being a fudge and would not recommend using it at least add an order by to have some control over the row being returned) with something like:
SELECT
CAST(af.SAP_SID AS VARCHAR2(4000)) APP_ID,
(SELECT DR_OPTION
FROM
DR_OPTIONS
WHERE DR_OPTIONS.ID = (
select dr_option
from applications
where applications.sap_sid = af.sap_sid
and rownumber = 1)
) DR_OPTION
FROM
APPLICATIONS_FILER_VIEW af
(not tested just googled how to limit results in oracle)
If you fix the data issue (as per A.B.Cades comment) then I would recommend converting it to use joins as per weenoid's answer. this would also highlight other data issues that may arise in the future.
IN SHORT: I have never fixed anything in this way.. the real answer is to investigate the multiple rows returned and decide what you want to do maybe:
add more where clauses
order the results and only select top row
actually keep the duplicates as they represent a scenario you have not thought of before