Iterative function in SQL, postgres - sql

I have a gigantic script that I would like to rewrite iteratively (with a while or for loop) so that it becomes easier to read and much shorter. It seems like this should be doable in SQL, but so far I have not succeeded. My current workaround is a long list of SELECT statements that I UNION together into one table.
I want to iterate through the years: starting from 1995, while the year is lower than 2017, execute the query with the year as a variable.
In other words, I need an iterative function that fills in every year in the following code and combines all the results in one table. I will keep trying myself and update the code if I make progress.
SELECT
regio, 1995 as year, sum("0") as "0", sum("1") as "1", sum("2") as "2", sum("3") as "3", sum("4") as "4", sum("5") as "5", sum("6") as "6", sum("7") as "7", sum("8") as "8", sum("9") as "9", sum("10") as "10"
FROM
source
where
year = 1995 OR "year-1" = 1995 OR "year-2" = 1995 OR "year-3" = 1995 OR "year-4" = 1995
group by
regio
UNION
SELECT
regio, 1996 as year, sum("0") as "0", sum("1") as "1", sum("2") as "2", sum("3") as "3", sum("4") as "4", sum("5") as "5", sum("6") as "6", sum("7") as "7", sum("8") as "8", sum("9") as "9", sum("10") as "10"
FROM
source
where
year = 1996 OR "year-1" = 1996 OR "year-2" = 1996 OR "year-3" = 1996 OR "year-4" = 1996
group by
regio

You would seem to want:
SELECT regio, g.yyyy as year, sum("0") as "0", sum("1") as "1",
sum("2") as "2", sum("3") as "3", sum("4") as "4",
sum("5") as "5", sum("6") as "6", sum("7") as "7",
sum("8") as "8", sum("9") as "9", sum("10") as "10"
FROM source CROSS JOIN
generate_series(1995, 2017) g(yyyy)
WHERE g.yyyy IN (year, "year-1", "year-2", "year-3", "year-4")
GROUP BY regio, g.yyyy;
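As a rough illustration of why a cross join against a series of years can stand in for the chain of per-year SELECTs, here is a minimal Python sketch using the standard-library sqlite3 module. Everything specific in it is an assumption for the example, not part of the original: the sample rows are invented, the table is cut down to two lag columns and two value columns, the year range is shortened to 1995..1997, and a recursive CTE stands in for generate_series, which a stock SQLite build may not provide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Cut-down, hypothetical version of the "source" table from the question.
conn.execute(
    'CREATE TABLE source (regio TEXT, year INT, "year-1" INT, "year-2" INT, "0" INT, "1" INT)'
)
conn.executemany(
    "INSERT INTO source VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("north", 1996, 1995, 1994, 10, 1),
        ("north", 1997, 1996, 1995, 20, 2),
        ("south", 1996, 1995, 1994, 5, 3),
    ],
)

query = """
WITH RECURSIVE g(yyyy) AS (          -- stand-in for generate_series(1995, 1997)
    SELECT 1995
    UNION ALL
    SELECT yyyy + 1 FROM g WHERE yyyy < 1997
)
SELECT s.regio, g.yyyy AS year, SUM(s."0") AS "0", SUM(s."1") AS "1"
FROM source AS s CROSS JOIN g
WHERE g.yyyy IN (s.year, s."year-1", s."year-2")
GROUP BY s.regio, g.yyyy
ORDER BY s.regio, g.yyyy
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
# ('north', 1995, 30, 3)
# ('north', 1996, 30, 3)
# ('north', 1997, 20, 2)
# ('south', 1995, 5, 3)
# ('south', 1996, 5, 3)
```

Each source row is paired with every year in the series, the WHERE clause keeps only the pairs where that year appears in one of the row's year columns, and the GROUP BY then produces one summed row per region and year.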

Related

Laravel query sum group by week and get 0 for weeks not existent in the dataset

I am trying to get the sum of quantity grouped by week for the current year.
Here is my query, which works:
Sale::selectRaw('sum(qty) as y')
->selectRaw(DB::raw('WEEK(order_date,1) as x'))
->whereYear('order_date',Carbon::now()->format('Y'))
->groupBy('x')
->orderBy('x', 'ASC')
->get();
The response looks like this, where x is the week number and y is the summed value:
[
{
"y": "50",
"x": 2
},
{
"y": "4",
"x": 14
}
]
I want to get a 0 value for every week that has no value for y.
My desired result looks like this:
[
{
"y": "0",
"x": 1
},
{
"y": "50",
"x": 2
},
...
...
...
{
"y": "4",
"x": 14
}
]

BigQuery string-formatting to json

Is the following a full list of all value types as they're passed to json in BigQuery? I've gotten this by trial and error but haven't been able to find this in the documentation:
select
NULL as NullValue,
FALSE as BoolValue,
DATE '2014-01-01' as DateValue,
INTERVAL 1 year as IntervalValue,
DATETIME '2014-01-01 01:02:03' as DatetimeValue,
TIMESTAMP '2014-01-01 01:02:03' as TimestampValue,
"Hello" as StringValue,
B"abc" as BytesValue,
123 as IntegerValue,
NUMERIC '3.14' as NumericValue,
3.14 as FloatValue,
TIME '12:30:00.45' as TimeValue,
[1,2,3] as ArrayValue,
STRUCT('Mark' as first, 'Thomas' as last) as StructValue,
[STRUCT(1 as x, 2 as y), STRUCT(5 as x, 6 as y)] as ArrayStructValue,
STRUCT(1 as x, [1,2,3] as y, ('a','b','c') as z) as StructNestedValue
{
"NullValue": null,
"BoolValue": "false", // why not just false without quotes?
"DateValue": "2014-01-01",
"IntervalValue": "1-0 0 0:0:0",
"DatetimeValue": "2014-01-01T01:02:03",
"TimestampValue": "2014-01-01T01:02:03Z",
"StringValue": "Hello",
"BytesValue": "YWJj",
"IntegerValue": "123",
"NumericValue": "3.14",
"FloatValue": "3.14",
"TimeValue": "12:30:00.450000",
"ArrayValue": ["1", "2", "3"],
"StructValue": {
"first": "Mark",
"last": "Thomas"
},
"ArrayStructValue": [
{"x": "1", "y": "2"},
{"x": "5", "y": "6"}
],
"StructNestedValue": {
"x": "1",
"y": ["1", "2", "3"],
"z": {"a": "a", "b": "b", "c": "c"}
}
}
Honestly, it seems to me that other than the null value and the array [] or struct {} container, everything is string-enclosed, which seems a bit odd.
According to this document, JSON is built on two structures:
A collection of name/value pairs. In various languages, this is realized as an object, record, struct, dictionary, hash table, keyed list, or associative array.
An ordered list of values. In most languages, this is realized as an array, vector, list, or sequence.
The result of the SELECT query is in JSON format, where [] denotes an array, {} denotes an object, and double quotes (" ") denote a string value, as in the query itself.
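To make the contrast with the string-enclosed output concrete, here is a small Python sketch using only the standard json module. It is not a statement about how BigQuery serializes values. It only shows that JSON itself has native null, boolean, and number literals, and that a consumer receiving the string-enclosed form has to cast the values back. The dictionary contents and the `cast_row` helper are hypothetical.

```python
import json

# Native Python values serialize to unquoted JSON literals.
native = {"NullValue": None, "BoolValue": False, "IntegerValue": 123, "FloatValue": 3.14}
encoded = json.dumps(native)
print(encoded)
# {"NullValue": null, "BoolValue": false, "IntegerValue": 123, "FloatValue": 3.14}

# The string-enclosed form from the question parses to plain strings.
received = json.loads('{"NullValue": null, "BoolValue": "false", "IntegerValue": "123"}')

def cast_row(row):
    """Hypothetical helper: cast known string-enclosed fields back to native types."""
    return {
        "NullValue": row["NullValue"],
        "BoolValue": row["BoolValue"] == "true",
        "IntegerValue": int(row["IntegerValue"]),
    }

typed = cast_row(received)
print(typed)
# {'NullValue': None, 'BoolValue': False, 'IntegerValue': 123}
```

Note that the quoted "false" is a non-empty string and therefore truthy in Python, which is why the sketch compares against "true" instead of calling bool() on it.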

delete row number in dictionary format from pandas dataframe column

Please help me with the following conversion. I have a pandas DataFrame in the following format:
id            location
{ "0": "5",   "0": "Charlotte, North Carolina",
"1": "5",     "1": "N/A",
"2": "5",     "2": "Portland, Oregon",
"3": "5",     "3": "Jonesborough, Tennessee",
"4": "5",     "4": "Rockville, Indiana",
"5": "5",}    "5": "Dallas, Texas",
and would like to convert this into the following format:
A header   Another header
"5"        "Charlotte, North Carolina"
"5"        "N/A"
"5"        "Portland, Oregon"
"5"        "Jonesborough, Tennessee"
"5"        "Rockville, Indiana"
"5"        "Dallas, Texas"
Please help
You can try this.
import pandas as pd
import re
df = pd.DataFrame([['{ "0": "5",', '"0": "Charlotte, North Carolina",'], ['"1": "5",','"1": "N/A",']], columns=['id', 'location'])
# Extract all integers with a regex and keep the second one (the value after the row-number key)
df['id'] = df['id'].apply(lambda x: re.findall(r'\d+', x)[1])
# Split once on the first ':' and keep the value part, dropping the leading space and trailing comma
df['location'] = df['location'].apply(lambda x: x.split(':', 1)[1].strip().rstrip(','))
print(df)
Output:
id location
0 5 "Charlotte, North Carolina"
1 5 "N/A"

Flattening a JSONB column into a string in PostgreSQL

I am working with JSONB columns, which are quite new to me.
Below is a sample data set:
UUID Survey_id Employee_id Employee_Response Status
f212 2 17 [{"q_id": "5", "answer": {"value": "Agree"}, "q_type": "radio-buttons"}, {"q_id": "6", "answer": {"value": "4"}, "q_type": "star-ratings"}, {"q_id": "7", "answer": {"value": "9"}, "q_type": "slider-type"}] active
a3f5 2 46 [{"q_id": "5", "answer": {"value": "Agree"}, "q_type": "radio-buttons"}, {"q_id": "6", "answer": {"value": "4"}, "q_type": "star-ratings"}, {"q_id": "7", "answer": {"value": "8"}, "q_type": "slider-type"}] active
2db8 2 32 [{"q_id": "5", "answer": {"value": "Agree"}, "q_type": "radio-buttons"}, {"q_id": "6", "answer": {"value": "3"}, "q_type": "star-ratings"}, {"q_id": "7", "answer": {"value": "9"}, "q_type": "slider-type"}] active
d2bd 2 40 [{"q_id": "5", "answer": {"value": "Disagree"}, "q_type": "radio-buttons"}, {"q_id": "6", "answer": {"value": "2"}, "q_type": "star-ratings"}, {"q_id": "7", "answer": {"value": "3"}, "q_type": "slider-type"}] active
g632 2 31 [{"q_id": "5", "answer": {"value": "Strongly Agree"}, "q_type": "radio-buttons"}, {"q_id": "6", "answer": {"value": "3"}, "q_type": "star-ratings"}, {"q_id": "7", "answer": {"value": "6"}, "q_type": "slider-type"}] active
Expected output
UUID Survey_id Employee_id Q_5 Q_6 q_7
f212 2 17 Agree 4 9
a3f5 2 46 Agree 4 8
2db8 2 32 Agree 3 9
d2bd 2 40 Disagree 2 3
g632 2 31 Strongly Agree 3 6
Can you please suggest how to achieve this?
I have tried the queries below and various other methods, but have had no luck achieving it in SQL:
SELECT
survey_id,
response::jsonb->'answer'->>'value' as name
FROM survey_resposnes
select survey_id,user_id,
response-> 'q_id' #> '[5]' as q0--,
from survey_resposnes
You can use jsonb_path_query_first() for this:
select r."uuid",
r.survey_id,
r.employee_id,
jsonb_path_query_first(r.employee_response, '$[*] ? (@.q_id == "5").answer.value') #>> '{}' as q_5,
jsonb_path_query_first(r.employee_response, '$[*] ? (@.q_id == "6").answer.value') #>> '{}' as q_6,
jsonb_path_query_first(r.employee_response, '$[*] ? (@.q_id == "7").answer.value') #>> '{}' as q_7
from survey_responses r
jsonb_path_query_first() returns a jsonb value. To convert that into a proper text value, the #>> '{}' operator is applied. If you are fine with a jsonb value as the result, you can remove that expression.
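For readers who want to check the lookup logic outside the database, the same idea (take the first array element whose q_id matches, then read its answer.value as text) can be sketched in plain Python. This is only an illustration of what the path expression selects, not a replacement for the query. The `first_answer` helper is hypothetical, and the sample string is the first row from the question.

```python
import json

def first_answer(response_json, q_id):
    """Return answer.value of the first element whose q_id matches, or None if there is none."""
    for item in json.loads(response_json):
        if item.get("q_id") == q_id:
            return item.get("answer", {}).get("value")
    return None

# employee_response of the first sample row (UUID f212).
row = (
    '[{"q_id": "5", "answer": {"value": "Agree"}, "q_type": "radio-buttons"}, '
    '{"q_id": "6", "answer": {"value": "4"}, "q_type": "star-ratings"}, '
    '{"q_id": "7", "answer": {"value": "9"}, "q_type": "slider-type"}]'
)

flat = {f"q_{q}": first_answer(row, q) for q in ("5", "6", "7")}
print(flat)
# {'q_5': 'Agree', 'q_6': '4', 'q_7': '9'}
```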

Hive Extract Data in a Array

I need to extract the 5th value from a string array in Hive:
arr = ("abc", "123-4567", "10", "ax", "cdpp asd", "00", "q", "na", "avail", "n", "n", "na")
How can I extract "cdpp asd", i.e. the 5th value?
We can use SUBSTR and INSTR, but is there any other way to achieve this?
If your array is stored in a string column, you can remove the brackets and double quotes with regexp_replace and split the resulting string with split() to get an array:
select split(regexp_replace('("abc", "123-4567", "10", "ax", "cdpp asd", "00", "q", "na", "avail", "n", "n", "na")','^\\(|\\)$|"',''),', *')[4];
OK
cdpp asd
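The strip-then-split steps can be tried outside Hive with a short Python sketch that mirrors the same two regular expressions. It is only meant to show what each step produces. Python's re module is used here as a stand-in, so it is an assumption that the patterns behave the same way in Hive for this input.

```python
import re

raw = '("abc", "123-4567", "10", "ax", "cdpp asd", "00", "q", "na", "avail", "n", "n", "na")'

# Step 1: drop the leading "(", the trailing ")" and every double quote.
stripped = re.sub(r'^\(|\)$|"', "", raw)

# Step 2: split on a comma followed by optional spaces.
parts = re.split(r", *", stripped)

# Index 4 is the 5th element because indexing is zero-based.
fifth = parts[4]
print(fifth)
# cdpp asd
```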
arr = ("abc", "123-4567", "10", "ax", "cdpp asd", "00", "q", "na", "avail", "n", "n", "na")
Select arr[4] from tablename;
Output:
cdpp asd
1. Maybe you can write a UDF to cast this string to an array arr; then you can use arr[4] to access the 5th value.
2. Or you can get the 5th value the following way:
select tf.* from (
select regexp_replace('("abc", "123-4567", "10", "ax", "cdpp asd", "00", "q", "na", "avail", "n", "n", "na")','\\(|\\)|"','') as str
) t lateral view posexplode(split(str,', ')) tf as pos,val
where tf.pos = 4;
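Note: this approach requires that the array elements themselves contain no brackets or double quotes, because regexp_replace strips those characters everywhere in the string.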