I have a dataset.table in BigQuery. It has return_code and response_time columns besides a few other columns. I need to add a new column 'normal_abnormal' and
fill it with the value 'abnormal' if:
response_time is >= the 99th percentile, or
return_code starts with 4 or 5;
otherwise
fill the column with 'normal'.
I am trying to write a query like the one below, but it looks like I can't use UPDATE with WITH ... AS.
I am new to BigQuery, so please bear with me.
WITH
temp_table AS (
SELECT
CASE
WHEN ROUND(PERCENT_RANK() OVER( ORDER BY response_time ),2)*100 >= 99 THEN 'abnormal'
WHEN starts_with(return_code, '4') THEN 'abnormal'
WHEN starts_with(return_code, '5') THEN 'abnormal'
ELSE 'normal'
END
AS normality
FROM `access_log`)
UPDATE <=== Can't use UPDATE here.
Any guidance is really appreciated.
I have time series data that looks something like the table below. All I need is to generate the Sequence column for 'Low' values only. Also, the sequence should increment only when there is a change in the 'MODE' value.
I am trying to do this using PySpark. Ideas or actual code to implement this would be really appreciated.
Thanks,
Nash.
I have done something like this before in SQL and have translated it to Spark SQL. If you are okay with that, it is easy enough to create a temp view, run spark.sql against it, and get a data frame back. I am sure it can also be done directly in PySpark.
df.createOrReplaceTempView("data") # Name your temp view
query = """
SELECT time, temp, pressure, mode, CASE WHEN mode='Low' THEN sequence ELSE NULL END AS sequence FROM (
SELECT *
, SUM(increment) OVER(order by time) as sequence
FROM (
Select *
, CASE WHEN mode!='Low' and LEAD(mode) OVER(order by time) != 'Low' THEN 0
WHEN mode='Low' AND lag(mode) OVER(order by time) = 'Low' then 0
WHEN SUM(CASE WHEN mode='Low' THEN 1 ELSE 0 END) over(order by time) > 0 THEN 1 ELSE 0 END as increment -- Don't want to start incrementing if there aren't any "lows" yet, like rows 1 and 2 in your table
FROM data -- the name of your temp view
)
)
"""
dfSequence = spark.sql(query)
In the above query, sequence is NULL if the mode is not 'Low', per the example you provided above.
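For comparison, here is a minimal plain-Python sketch of the same idea. It is not a translation of the query above: it assumes the rows are already sorted by time and that the goal is to number each consecutive run of 'Low' as 1, 2, 3, ... while every other row gets None. The helper name low_run_sequence and the 'High'/'Medium' mode values are made up for illustration.

```python
from itertools import groupby


def low_run_sequence(modes):
    """Number each consecutive run of 'Low'; all other rows get None.

    Assumes `modes` is already ordered by time.
    """
    sequence = []
    run_number = 0
    # groupby yields one group per run of identical consecutive values
    for mode, run in groupby(modes):
        run_length = len(list(run))
        if mode == "Low":
            run_number += 1  # a new run of 'Low' starts here
            sequence.extend([run_number] * run_length)
        else:
            sequence.extend([None] * run_length)
    return sequence


# Hypothetical mode column, ordered by time
modes = ["High", "High", "Low", "Low", "High", "Low", "Medium", "Low", "Low"]
result = low_run_sequence(modes)
print(result)  # [None, None, 1, 1, None, 2, None, 3, 3]
```

If the expected numbering in your table differs (for example, if it should also advance on changes between non-'Low' modes), the counter update is the only line that needs to change.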
I'm looking for a way to get an overview of how well my table is populated for each variable grouped by a specific variable so something like:
SELECT AVG(VAR IS NOT NULL) *
FROM my_table
GROUP BY my_var;
or in pandas:
my_table.groupby('my_var').apply(lambda x : x.isnull().mean())
I hope you can help; I'm pretty new to SQL.
SELECT my_var, AVG(CASE WHEN value IS NULL THEN 0.0 ELSE 1.0 END) AS ratio
FROM my_table
GROUP BY my_var
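To see what this query returns, here is a small self-contained sketch that runs it against an in-memory SQLite database from Python. The sample rows are invented, and SQLite is used only so the example runs anywhere; your database may differ in details.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (my_var TEXT, value REAL)")
# Invented sample data: group 'a' has 3 of 4 values filled, group 'b' has 1 of 2
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [("a", 1.0), ("a", None), ("a", 3.0), ("a", 4.0), ("b", None), ("b", 2.0)],
)

rows = conn.execute(
    """
    SELECT my_var,
           AVG(CASE WHEN value IS NULL THEN 0.0 ELSE 1.0 END) AS ratio
    FROM my_table
    GROUP BY my_var
    ORDER BY my_var
    """
).fetchall()

fill_ratio = dict(rows)
print(fill_ratio)  # {'a': 0.75, 'b': 0.5}
```

To cover several variables, repeat the AVG(CASE ...) expression once per column. Note that isnull().mean() in the pandas snippet gives the missing fraction, which is 1 minus this ratio.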
I have a table where there are two columns, which should technically just be one column. Let's say there are 20 rows in my table, id_col1 has data till row 15, then id_col2 has data from rows 16-20. So in my new table, I'm creating a new column that has data from these 2 columns by using a case statement. But, the new column accepts data from id_col1, but doesn't from id_col2, it's just blank when it should have data from id_col2. My code is as follows:
select
case
when id_col1 is null then id_col2
else id_col1 end as 'newcol',
cusip,
cast (Date as date) as 'Date',
Price,
Evaluator,
Yes_No as 'Accepted'
into #cloudtemptbl
from MasterData
where Date >= '2017-01-01'
select * from #cloudtemptbl
My theory is that the data in id_col2 is binary not string. Any help is highly appreciated. Thank you.
I would suggest this:
when id_col1 is null or id_col1 = '' then id_col2
else id_col1 end as 'newcol'
You might need to use CONVERT or CAST to get the data types to match across both tables.
Work out the data type you're planning on using in the final, combined table. Then use CAST or CONVERT on any data that doesn't match that type, to get it into the correct format.
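As a rough illustration of why the empty-string check can matter, the sketch below compares the two CASE expressions on invented rows. It assumes that id_col1 holds empty strings (not NULL) in the rows where id_col2 is populated, which the question does not confirm, and it uses an in-memory SQLite database from Python rather than your actual server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MasterData (id_col1 TEXT, id_col2 TEXT)")
# Invented rows: a normal id, a blank id_col1, and a NULL id_col1
conn.executemany(
    "INSERT INTO MasterData VALUES (?, ?)",
    [("A1", None), ("", "B2"), (None, "C3")],
)

# Original expression: only NULL falls through to id_col2
null_only = [row[0] for row in conn.execute(
    """
    SELECT CASE WHEN id_col1 IS NULL THEN id_col2 ELSE id_col1 END
    FROM MasterData ORDER BY rowid
    """
)]

# Suggested expression: NULL and empty string both fall through
null_or_blank = [row[0] for row in conn.execute(
    """
    SELECT CASE WHEN id_col1 IS NULL OR id_col1 = '' THEN id_col2 ELSE id_col1 END
    FROM MasterData ORDER BY rowid
    """
)]

print(null_only)      # ['A1', '', 'C3']
print(null_or_blank)  # ['A1', 'B2', 'C3']
```

If the second list is what you expected to see, the blanks in your new column are empty strings in id_col1 rather than a data type problem in id_col2.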
I want to retrieve a full table with some of the values sorted. The sorted values should all appear before the unsorted values. I thought I could pull this off with a UNION, but ORDER BY is only valid after unioning the tables, and my data isn't set up in a way that makes that useful here. I want rows with a column value of 0-6 to show up sorted in DESC order, and then the rest of the results to show up after that. Is there some way to specify a condition in the ORDER BY clause? I saw something that looked close to what I wanted to do, but I couldn't get the equality condition working in SQL. I'm going to try to write a query using CASE WHEN, but I'm not sure if there's a way to specify a condition like currentValue <= 6. If anyone has any suggestions, that would be awesome.
You could do something like this:
order by (case when currentValue <= 6 then 1 else 0 end) desc,
(case when currentValue <= 6 then column end) desc
The first puts the values you care about first. The second puts them in sorted order. The rest will be ordered arbitrarily.
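Here is a quick way to check that behaviour, run against an in-memory SQLite database from Python. The table name yourdata and the sample values are invented, and the sketch assumes the column being sorted is currentValue itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourdata (currentValue INTEGER)")
conn.executemany(
    "INSERT INTO yourdata VALUES (?)",
    [(9,), (3,), (0,), (12,), (6,), (7,)],
)

ordered = [row[0] for row in conn.execute(
    """
    SELECT currentValue
    FROM yourdata
    ORDER BY (CASE WHEN currentValue <= 6 THEN 1 ELSE 0 END) DESC,
             (CASE WHEN currentValue <= 6 THEN currentValue END) DESC
    """
)]

# Rows with currentValue <= 6 come first, in descending order
print(ordered[:3])  # [6, 3, 0]
# The remaining rows follow in no guaranteed order
print(sorted(ordered[3:]))  # [7, 9, 12]
```

If the remaining rows need a stable order as well, add a third sort key after the two CASE expressions.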
Try this:
SELECT *
FROM yourdata
ORDER BY CASE WHEN yourColumn BETWEEN 0 AND 6 THEN yourColumn ELSE -1 END DESC
One RDBMS-agnostic solution would be to add a second field that takes the same value as the field you wish to sort when that field is less than or equal to six. Then just sort by that field.
I need to calculate the net total of a column-- sounds simple. The problem is that some of the values should be negative, as are marked in a separate column. For example, the table below would yield a result of (4+3-5+2-2 = 2). I've tried doing this with subqueries in the select clause, but it seems unnecessarily complex and difficult to expand when I start adding in analysis for other parts of my table. Any help is much appreciated!
Sign Value
Pos 4
Pos 3
Neg 5
Pos 2
Neg 2
Using a CASE expression should work in most versions of SQL:
SELECT SUM( CASE
WHEN t.Sign = 'Pos' THEN t.Value
ELSE t.Value * -1
END
) AS Total
FROM YourTable AS t
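As a sanity check, this sketch loads the five rows from the question into an in-memory SQLite database from Python and runs the CASE-based sum. The table name YourTable is the placeholder from the query above; SQLite is used only to keep the example self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (Sign TEXT, Value INTEGER)")
# The sample rows from the question
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?)",
    [("Pos", 4), ("Pos", 3), ("Neg", 5), ("Pos", 2), ("Neg", 2)],
)

total = conn.execute(
    """
    SELECT SUM(CASE
                 WHEN t.Sign = 'Pos' THEN t.Value
                 ELSE t.Value * -1
               END) AS Total
    FROM YourTable AS t
    """
).fetchone()[0]

print(total)  # 2, i.e. 4 + 3 - 5 + 2 - 2
```

Because the sign handling lives inside a single aggregate, the same expression can sit next to other aggregates or under a GROUP BY without extra subqueries.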
Try this:
SELECT SUM(IF(sign = 'Pos', Value, Value * (-1))) as total FROM table
I am adding rows from a single field in a table based on values from another field in the same table, using Oracle 11g as the database and SQL Developer as the user interface.
This works:
SELECT COUNTRY_ID, SUM(
CASE
WHEN ACCOUNT IN 'PTBI' THEN AMOUNT
WHEN ACCOUNT IN 'MLS_ENT' THEN AMOUNT
WHEN ACCOUNT IN 'VAL_ALLOW' THEN AMOUNT
WHEN ACCOUNT IN 'RSC_DEV' THEN AMOUNT * -1
END) AS TI
FROM SAMP_TAX_F4
GROUP BY COUNTRY_ID;
select a= sum(Value) where Sign like 'pos'
select b = sum(Value) where Sign like 'neg'
select total = a-b
This is a bit SQL-agnostic, since you didn't say which DB you are using, but it should be easy to adapt it to any DB out there.