How to CREATE TABLE in BigQuery?

I'm trying to create a temporary table in BigQuery, but an error keeps popping up. Portfolio_Covid_Data is the dataset and percentage_population_vaccinated is the table I'm creating. The code I'm running is:
DROP TABLE IF EXISTS Portfolio_Covid_Data.percentage_population_vaccinated
CREATE TABLE Portfolio_Covid_Data.percentage_population_vaccinated
(dea.continent STRING,
dea.location STRING,
dea.date DATE,
dea.population NUMERIC,
vac.new_vaccinations NUMERIC,
rolling_people_vaccinated NUMERIC)
INSERT INTO Portfolio_Covid_Data.percentage_population_vaccinated
SELECT
dea.continent,
dea.location,
dea.date,
dea.population,
vac.new_vaccinations,
SUM(vac.new_vaccinations) OVER (PARTITION BY dea.location ORDER BY dea.location,dea.date ) AS rolling_people_vaccinated
FROM
`big-dataset.Portfolio_Covid_Data.covid_deaths` AS dea
JOIN
`big-dataset.Portfolio_Covid_Data.covid_vaccinations` AS vac
ON dea.location = vac.location
AND dea.date = vac.date
SELECT
*,
ROUND((rolling_people_vaccinated/population)*100,2) AS percentage_population_vaccinated
FROM
Portfolio_Covid_Data.percentage_population_vaccinated

What do dea and vac refer to in your CREATE TABLE statement?
This worked for me (drop the dea./vac. alias prefixes from the column names):
CREATE TABLE Portfolio_Covid_Data.percentage_population_vaccinated
(continent STRING,
location STRING,
date DATE,
population NUMERIC,
new_vaccinations NUMERIC,
rolling_people_vaccinated NUMERIC)
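A simpler alternative (a sketch along the same lines — I haven't run it against your dataset) is to let BigQuery infer the schema with CREATE OR REPLACE TABLE ... AS SELECT, which collapses the separate DROP/CREATE/INSERT steps into a single statement:

```sql
-- Create (or replace) the table directly from the SELECT;
-- column names and types are inferred from the query output.
CREATE OR REPLACE TABLE Portfolio_Covid_Data.percentage_population_vaccinated AS
SELECT
  dea.continent,
  dea.location,
  dea.date,
  dea.population,
  vac.new_vaccinations,
  SUM(vac.new_vaccinations) OVER (PARTITION BY dea.location
                                  ORDER BY dea.location, dea.date) AS rolling_people_vaccinated
FROM `big-dataset.Portfolio_Covid_Data.covid_deaths` AS dea
JOIN `big-dataset.Portfolio_Covid_Data.covid_vaccinations` AS vac
  ON dea.location = vac.location
  AND dea.date = vac.date;
```

Also note that when you run several statements together as a script, each one must end with a semicolon.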

Related

ORA-00979: not a GROUP BY expression ORACLE

select to_char(r.rut_cli,'99G999G999')||'-'|| r.dv_cli RUT_CLIENTE ,
to_char(ROUND(r.monto_compras),'$999,999,999,999') AS "MontoComprasRegistrado"
, to_char(ROUND(sum(s.total_sali)),'$999,999,999') AS "MontoComprasCalculado"
, to_char(ROUND(sum(s.total_sali) - r.monto_compras),'$999,999,999,999') AS "Dif.MontoCalculado|Registrado"
from resumen_vta_cliente r join det_despacho d on r.rut_cli=d.rut_cli
join salida s on d.cod_sali=s.cod_sali
join det_salida z on (s.cod_sali=z.cod_sali)
where d.estado_des='E'
and z.cod_prod = (select cod_prod from producto where cod_tipo ='&CODIGO_TIPO')
and s.estado_sali='V'
group by r.rut_cli, r.dv_cli
having r.monto_compras > (select min(total_sali) from salida)
order by r.rut_cli,r.dv_cli;
I don't know why I'm getting this error
ORA-00979: not a GROUP BY expression
Error at line: 26, column: 9
Line 26 is: having r.monto_compras > (select min(total_sali) from salida)
Let's start by removing r.monto_compras from your SQL; this should work:
select to_char(r.rut_cli,'99G999G999')||'-'|| r.dv_cli RUT_CLIENTE
-- , to_char(ROUND(r.monto_compras),'$999,999,999,999') AS "MontoComprasRegistrado"
, to_char(ROUND(sum(s.total_sali)),'$999,999,999') AS "MontoComprasCalculado"
-- , to_char(ROUND(sum(s.total_sali) - r.monto_compras),'$999,999,999,999') AS "Dif.MontoCalculado|Registrado"
from resumen_vta_cliente r join det_despacho d on r.rut_cli=d.rut_cli
join salida s on d.cod_sali=s.cod_sali
join det_salida z on (s.cod_sali=z.cod_sali)
where d.estado_des='E'
and z.cod_prod = (select cod_prod from producto where cod_tipo ='&CODIGO_TIPO')
and s.estado_sali='V'
group by r.rut_cli, r.dv_cli
-- having r.monto_compras > (select min(total_sali) from salida)
order by r.rut_cli,r.dv_cli;
You should get a result from that.
Now remove the comments and add r.monto_compras to the GROUP BY clause:
group by r.rut_cli, r.dv_cli, r.monto_compras
having r.monto_compras > (select min(total_sali) from salida)
I'm not sure whether that works, since I don't have scripts for your tables.
If you can post the table scripts, that will make your problem much easier to solve.
Here is what I've tested (note: I tested this on PostgreSQL and Oracle):
create table resumen_vta_cliente(
rut_cli integer,
dv_cli varchar(10),
monto_compras integer
);
create table det_despacho(
rut_cli integer,
cod_sali integer,
estado_des varchar(10)
);
create table salida(
cod_sali integer,
estado_sali varchar(10),
total_sali integer
);
create table det_salida(
cod_sali integer,
estado_sali varchar(10),
cod_prod varchar(10)
);
create table producto(
cod_prod varchar(10),
cod_tipo varchar(10)
);

Aggregate hstore in Postgres within GROUP BY

I have data with an hstore like this:
|brand|account|likes|views |
|-----|-------|-----|----------------------|
|Ford |ford_uk|1 |"3"=>"100" |
|Ford |ford_us|2 |"3"=>"200", "5"=>"10" |
|Jeep |jeep_uk|3 |"3"=>"300" |
|Jeep |jeep_us|4 |"3"=>"400", "5"=>"20" |
I would like to be able to sum the hstores by key, grouped by brand:
|brand|likes|views |
|-----|-----|----------------------|
|Ford |3 |"3"=>"300", "5"=>"10" |
|Jeep |7 |"3"=>"700", "5"=>"20" |
This answer gives a good solution for how to do this without a GROUP BY. Adapting it to this situation gives something like:
SELECT
sum(likes) AS total_likes,
(SELECT hstore(array_agg(key), array_agg(value::text))
FROM (
SELECT s.key, sum(s.value::integer)
FROM (
SELECT((each(views)).*)
) AS s(key, value)
GROUP BY key
) x(key, value)) AS total_views
FROM my_table
GROUP BY brand
However this gives:
ERROR: subquery uses ungrouped column "my_table.views" from outer query
Any help appreciated!
That's because the subquery uses the views column from the outer query without an aggregate function, while the outer query is grouped.
A very quick workaround:
with my_table(brand,account,likes,views) as (
values
('Ford', 'ford_uk', 1, '"3"=>"100"'::hstore),
('Ford', 'ford_us', 2, '"3"=>"200", "5"=>"10"'),
('Jeep', 'jeep_uk', 3, '"3"=>"300"'::hstore),
('Jeep', 'jeep_us', 4, '"3"=>"400", "5"=>"20"'))
SELECT
brand,
sum(likes) AS total_likes,
(SELECT hstore(array_agg(key), array_agg(value::text))
FROM (
SELECT s.key, sum(s.value::integer)
FROM
unnest(array_agg(views)) AS h, --<< aggregate views according to the group by, then unnest it into the table
each(h) as s(key,value)
GROUP BY key
) x(key, value)) AS total_views
FROM my_table
GROUP BY brand
Update
You can also create an aggregate for such tasks:
--drop aggregate if exists hstore_sum(hstore);
--drop function if exists hstore_sum_ffunc(hstore[]);
create function hstore_sum_ffunc(hstore[]) returns hstore language sql immutable as $$
select hstore(array_agg(key), array_agg(value::text))
from
(select s.key, sum(s.value::numeric) as value
from unnest($1) as h, each(h) as s(key, value) group by s.key) as t
$$;
create aggregate hstore_sum(hstore)
(
SFUNC = array_append,
STYPE = hstore[],
FINALFUNC = hstore_sum_ffunc,
INITCOND = '{}'
);
After that, your query becomes simpler and more "canonical":
select
brand,
sum(likes) as total_likes,
hstore_sum(views) as total_views
from my_table
group by brand;
Update 2
Even without CREATE AGGREGATE, the hstore_sum_ffunc function can be useful on its own:
select
brand,
sum(likes) as total_likes,
hstore_sum_ffunc(array_agg(views)) as total_views
from my_table
group by brand;
If you create an aggregate for hstore, this gets a bit easier:
create aggregate hstore_agg(hstore)
(
sfunc = hs_concat,
stype = hstore
);
Then you can do this:
with totals as (
select t1.brand,
hstore(k, sum(v::int)::text) as views
from my_table t1, each(views) x(k,v)
group by brand, k
)
select brand,
(select sum(likes) from my_table t2 where t1.brand = t2.brand) as likes,
hstore_agg(views) as views
from totals t1
group by brand;
Another option is to move the correlated sub-query, which might be slow, into a CTE:
with vals as (
select t1.brand,
hstore(k, sum(v::int)::text) as views
from my_table t1, each(views) x(k,v)
group by brand, k
), view_totals as (
select brand,
hstore_agg(views) as views
from vals
group by brand
), like_totals as (
select brand,
sum(likes) as likes
from my_table
group by brand
)
select vt.brand,
lt.likes,
vt.views
from view_totals vt
join like_totals lt on vt.brand = lt.brand
order by brand;

ORDER BY items must appear in the select list if SELECT DISTINCT is specified

I need to return a temp table in SQL, joining another temp table, using DISTINCT and an ORDER BY clause.
I have declared a table which returns a few things.
Declare #GrpItems TABLE (ID INT,
Name NVARCHAR(32),
Date DATETIME,
City NVARCHAR(32),
CityCode NVARCHAR(8),
CurrencySort NVARCHAR(16)
)
INSERT INTO #GrpItems
SELECT
ID, Name, Date ,
CityCodeorCaption --this can be two type based on User input CityCode or CityCaption
FROM
RepeatItemTable
Now I have a different table into which I want to insert, and the procedure returns that table as the final result.
DECLARE #CurrencyTable TABLE (RowNumber INT Identity (1,1),
FK_Currency INT,
Value INT,
CityCode NVARCHAR(16),
CityCaption NVARCHAR(16)
)
INSERT INTO #CurrencyTable
SELECT DISTINCT
gb.FK_Currency, cv.Value,
c.CityCode, c.CityCaption
FROM
Balance b
JOIN
Currency c ON c.PK_Currency = b.FK_Currency
JOIN
#GrpItems gi ON c.FK_Grpitem = gi.PK_Grpitem
ORDER BY
gi.CityCodeorName
I know I need a GROUP BY somewhere, or maybe a subquery in the WHERE filter, but I am not sure. I think the ORDER BY should be something like:
ORDER BY
    CASE gi.CityCodeOrNAME
        WHEN 'City' THEN City
        ELSE CityCode
    END ASC
which does not seem to work. I need the DISTINCT because removing it might break some other logic.
Select * from #CurrencyTable
You can always use group by instead of select distinct. That will solve your problem:
SELECT gb.FK_Currency, cv.Value, c.CityCode, c.CityCaption
FROM Balance b JOIN
Currency c
ON c.PK_Currency = b.FK_Currency JOIN
#GrpItems gi
ON c.FK_Grpitem = gi.PK_Grpitem
GROUP BY gb.FK_Currency, cv.Value, c.CityCode, c.CityCaption
ORDER BY MAX(gi.CityCodeorName) ;
Note the use of the aggregation function in the ORDER BY.
ORDER BY CASE WHEN CityCodeOrNAME = 'City'
THEN City
ELSE CityCode
END
If you need different sort orders, you can also separate them:
ORDER BY CASE WHEN CityCodeOrNAME = 'City' THEN City END DESC,
CASE WHEN CityCodeOrNAME <> 'City' THEN CityCode END ASC
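Combining the two answers — GROUP BY in place of DISTINCT, plus the conditional sort wrapped in an aggregate — might look like this (a sketch only; I'm assuming 'City' should sort by the gi.City column, per the question):

```sql
SELECT gb.FK_Currency, cv.Value, c.CityCode, c.CityCaption
FROM Balance b
JOIN Currency c ON c.PK_Currency = b.FK_Currency
JOIN #GrpItems gi ON c.FK_Grpitem = gi.PK_Grpitem
GROUP BY gb.FK_Currency, cv.Value, c.CityCode, c.CityCaption
-- MAX() makes the CASE expression legal alongside GROUP BY
ORDER BY MAX(CASE WHEN gi.CityCodeorName = 'City'
             THEN gi.City ELSE gi.CityCode END);
```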

Bringing all the rows from left join in Postgresql

Here are two tables, reseller and domain_name. I want all the resellers with the count of domain_name.extensions they have (even if the extension count is zero).
The problem is that this query returns only those resellers whose extension count is not null:
SELECT row_number() over (order by 1) as id, *, 123 as total FROM crosstab(
$$
select domain_name.invoicing_party , domain_name.extension_id, count(*) as total
from reseller LEFT JOIN domain_name
ON domain_name.invoicing_party_id = reseller.id
where domain_name.registration_date::date = '2015-05-25'
group by extension_id,invoicing_party
order by invoicing_party, extension_id
$$,
$$ SELECT m FROM generate_series(1,9) m $$
) AS (
invoicing_party varchar, "com" varchar, "my" varchar, "idn" varchar, "net" varchar,
"org" varchar, "edu" varchar, "mil" varchar, "gov" varchar, "name" varchar
) ;
Any help would be of great value.
Edit 1
Even if I remove the WHERE clause, I still get only those resellers who have some domain extensions.
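One likely cause (not confirmed against your data): the WHERE condition on the left-joined table discards the NULL-extended rows, effectively turning the LEFT JOIN into an inner join. A sketch of the inner query with the filter moved into the ON clause, which preserves resellers with zero domains:

```sql
select reseller.id as invoicing_party,  -- assuming reseller.id is the key you want
       domain_name.extension_id,
       count(domain_name.invoicing_party_id) as total  -- yields 0 for unmatched resellers
from reseller
left join domain_name
  on domain_name.invoicing_party_id = reseller.id
 and domain_name.registration_date::date = '2015-05-25'  -- filter moved into ON
group by reseller.id, domain_name.extension_id
order by reseller.id, domain_name.extension_id
```

Note that count(*) would still count the NULL-extended row as 1, so counting a column from the left-joined table matters here.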

Generate id row for a view with grouping

I'm trying to create a view with row numbers like so:
create or replace view daily_transactions as
select
generate_series(1, count(t)) as id,
t.ic,
t.bio_id,
t.wp,
date_trunc('day', t.transaction_time)::date transaction_date,
min(t.transaction_time)::time time_in,
w.start_time wp_start,
w.start_time - min(t.transaction_time)::time in_diff,
max(t.transaction_time)::time time_out,
w.end_time wp_end,
max(t.transaction_time)::time - w.end_time out_diff,
count(t) total_transactions,
calc_att_status(date_trunc('day', t.transaction_time)::date,
min(t.transaction_time)::time,
max(t.transaction_time)::time,
w.start_time, w.end_time ) status
from transactions t
left join wp w on (t.wp = w.wp_name)
group by ic, bio_id, t.wp, date_trunc('day', transaction_time),
w.start_time, w.end_time;
I ended up with duplicate rows. SELECT DISTINCT doesn't work either. Any ideas?
Transaction Table:
create table transactions(
id serial primary key,
ic text references users(ic),
wp text references wp(wp_name),
serial_no integer,
bio_id integer,
node integer,
finger integer,
transaction_time timestamp,
transaction_type text,
transaction_status text
);
WP table:
create table wp(
id serial unique,
wp_name text primary key,
start_time time,
end_time time,
description text,
status text
);
The view could work like this:
CREATE OR REPLACE VIEW daily_transactions as
SELECT row_number() OVER () AS id
, t.ic
, t.bio_id
, t.wp
, t.transaction_time::date AS transaction_date
, min(t.transaction_time)::time AS time_in
, w.start_time AS wp_start
, w.start_time - min(t.transaction_time)::time AS in_diff
, max(t.transaction_time)::time AS time_out
, w.end_time AS wp_end
, max(t.transaction_time)::time - w.end_time AS out_diff
, count(*) AS total_transactions
, calc_att_status(t.transaction_time::date, min(t.transaction_time)::time
, max(t.transaction_time)::time
, w.start_time, w.end_time) AS status
FROM transactions t
LEFT JOIN wp w ON t.wp = w.wp_name
GROUP BY t.ic, t.bio_id, t.wp, t.transaction_time::date
, w.start_time, w.end_time;
Major points
generate_series() is applied after aggregate functions, but produces multiple rows, thereby multiplying all output rows.
The window function row_number() is also applied after aggregate functions, but only generates a single number per row. You need PostgreSQL 8.4 or later for that.
date_trunc() is redundant in date_trunc('day', t.transaction_time)::date.
t.transaction_time::date achieves the same, simpler and faster.
Use count(*) instead of count(t). Same result here, but a bit faster.
Some other minor changes.
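The first two points can be illustrated with a toy query (a sketch): a set-returning function like generate_series(1, count(*)) emits one row per series element and so multiplies each group, while row_number() yields exactly one number per output row.

```sql
-- Group 'a' aggregates 3 rows, group 'b' aggregates 1 row.
-- generate_series version: group 'a' explodes into 3 output rows.
SELECT generate_series(1, count(*)) AS id, x
FROM (VALUES ('a'), ('a'), ('a'), ('b')) v(x)
GROUP BY x;

-- row_number version: exactly one output row per group.
SELECT row_number() OVER () AS id, x
FROM (VALUES ('a'), ('a'), ('a'), ('b')) v(x)
GROUP BY x;
```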