PostgreSQL: Frequency Table Expansion

Does anyone know how to expand a frequency table in PostgreSQL?
For example, transform table x:
data | frequency
-------+-----------
string | 4
into
data | index
-------+-------
string | 1
string | 2
string | 3
string | 4
Set up code:
CREATE TABLE x (
data TEXT,
frequency INTEGER
);
INSERT INTO x VALUES ('string',4);

This is amazingly simple with generate_series():
SELECT data, generate_series(1, frequency) AS index
FROM x;
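For readers who want to see the row-by-row effect outside the database, here is a minimal Python sketch of the same expansion. The function name expand_frequency_table is made up for illustration, and the list x simply mirrors the sample table; it is not part of the answer above.

```python
def expand_frequency_table(rows):
    """Yield (data, index) pairs, one per unit of frequency.

    Mirrors what generate_series(1, frequency) does in the SELECT list:
    each input row is repeated once for every integer from 1 to frequency.
    """
    for data, frequency in rows:
        for index in range(1, frequency + 1):
            yield data, index


# Same content as the sample table x from the question.
x = [("string", 4)]
expanded = list(expand_frequency_table(x))
print(expanded)
```

A row whose frequency is 0 produces no output rows at all, which matches how an empty series drops the row in the SQL version.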

Related

How to convert binary string to integer in BigQuery

I would need to convert a binary string (actually string) column to its integer representation:
----------------------------------------------------------------
| binary_string (string) | integer (int64) <- What I need |
----------------------------------------------------------------
| '1011011100111000' | 46904 |
| '1111111111101011' | 65515 |
| '0111111001001010' | 32330 |
----------------------------------------------------------------
In my case the values are 16 bit max.
It's equivalent to the JavaScript code: parseInt('0111111001001010', 2)
Thanks
Consider below approach
create temp function binary2int(x string) returns int64
language js as r'''
return parseInt(x, 2);
''';
select *, binary2int(binary_string) as integer
from your_table
if applied to sample data in your question as
with your_table as (
select '1011011100111000' binary_string union all
select '1111111111101011' union all
select '0111111001001010'
)
the output matches the integer column in your question: 46904, 65515 and 32330.
Try bqutil.fn.from_binary:
select bqutil.fn.from_binary('1011011100111000')
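As a quick sanity check outside BigQuery, the expected values from the question can be reproduced with Python's built-in base-2 parsing. This is only a cross-check sketch; the helper name binary2int is borrowed from the temp function above, and the samples dictionary is copied from the question's table.

```python
# Expected values copied from the table in the question.
samples = {
    "1011011100111000": 46904,
    "1111111111101011": 65515,
    "0111111001001010": 32330,
}


def binary2int(x: str) -> int:
    """Parse a string of 0/1 characters as a base-2 integer."""
    return int(x, 2)


converted = {s: binary2int(s) for s in samples}
print(converted)
```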

How to create a column that increments in steps of 4 in Postgresql

I am trying to add a column to my table that increments in steps of four which would look like this:
1
1
1
1
2
2
2
2
3
3
3
3
etc.
I have been reading about CREATE SEQUENCE, but that does not seem to be what I need.
Does anyone have any suggestions how best to do this?
You could use row_number() and integer division:
select
t.*,
(3 + row_number() over(order by id)) / 4 rn
from mytable t
This assumes that you have an ordering column called id. I would not actually recommend storing this derived information. You can compute it on the fly, or put it in a view.
You can still use a regular sequence for the default value, but do the following instead:
CREATE TABLE test (col1 int, col2 text);
CREATE SEQUENCE test_col1_seq OWNED BY test.col1;
ALTER TABLE test ALTER COLUMN col1 SET DEFAULT ceil(nextval('test_col1_seq')/4::numeric);
SELECT * FROM test;
col1 | col2
------+------
1 | a
1 | b
1 | c
1 | d
2 | e
2 | f
2 | g
2 | h
3 | i
(9 rows)
This just divides the sequence value by 4, and then rounds the result up.
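The two formulas used in the answers, (3 + n) / 4 with integer division and ceil(n / 4), should give the same group number for every row number n. A small Python check of that arithmetic, assuming n starts at 1 as row_number() and a fresh sequence both do:

```python
import math

# Row numbers / sequence values 1..12, i.e. three full groups of four.
ns = range(1, 13)

# Integer-division form, as in the row_number() answer.
by_integer_division = [(3 + n) // 4 for n in ns]

# Ceiling form, as in the sequence default.
by_ceiling = [math.ceil(n / 4) for n in ns]

print(by_integer_division)
print(by_ceiling)
```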

SQL Server : drop zeros from col1 and concat with col2 into new View

I need to reconcile the article1 (top) and article2 tables into a View displaying differences. But before that I need to drop the leading zeros from column 'type' and create a new ID column equal to filenumber + type, so that the resulting column can be used as an index. All columns share the same data type.
Columns needed:
ID
C0016
C0029
C00311
You can utilize below script in SQL Server to get the format you want:
Reference SO post on removing padding 0
SELECT CONCAT(filenumber,type) AS filenumber, type, cost
FROM
(
SELECT
filenumber,
SUBSTRING(type, PATINDEX('%[^0]%',type),
LEN(type)- PATINDEX('%[^0]%',type)+ 1) AS type, cost
FROM
(
VALUES
('C001','00006',40),
('C002','00009',80),
('C003','00011',120)
) as t(filenumber,type, cost)
) AS t
Resultset
+------------+------+------+
| filenumber | type | cost |
+------------+------+------+
| C0016 | 6 | 40 |
| C0029 | 9 | 80 |
| C00311 | 11 | 120 |
+------------+------+------+
You can use try_convert() :
alter table table_name
add id as concat(filenumber, try_convert(int, type)) persisted -- physical storage
If you want a view :
create view view_name
as
select t.*, concat(filenumber, try_convert(int, type)) as id
from table_name t;
try_convert() returns NULL when the conversion fails.
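To make the intended transformation concrete, here is a small Python sketch that strips the leading zeros from type and appends the result to filenumber. The helper name make_id is hypothetical, and the rows are the sample values used in the first answer.

```python
# Sample rows (filenumber, type, cost) taken from the first answer.
rows = [
    ("C001", "00006", 40),
    ("C002", "00009", 80),
    ("C003", "00011", 120),
]


def make_id(filenumber: str, type_: str) -> str:
    """Concatenate filenumber with type after dropping leading zeros.

    An all-zero type collapses to "0", which mirrors what converting
    the string to an integer would give.
    """
    stripped = type_.lstrip("0") or "0"
    return filenumber + stripped


ids = [make_id(filenumber, type_) for filenumber, type_, _cost in rows]
print(ids)
```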

SQL query for accumulated sum using window function in postgresql

I've set up a pretty simple table, representing points in a 2D environment. The Id column is the id of each point and geom column is a binary representation of the point into the space:
Table public.foo
Column | Type | Modifiers
--------+----------------------+--------------------------------------------
id | integer | not null default nextval('mseq'::regclass)
geom | geometry(Point,2100) |
Indexes:
"foo_pkey" PRIMARY KEY, btree (id)
"foo_index_gist_geom" gist (geom)
To find the distance from each point to the next I am using this window function :
select
id,
st_distance(geom,lag(geom,1) over (order by id asc)) distance
from
foo;
which results in the following ( st_distance(geom,geom) gives the distance between two geom data type):
id | distance
----+------------------
1 |
2 | 27746.1563439608
3 | 57361.8216245281
4 | 34563.3607734946
5 | 23421.2022073633
6 | 41367.8247514439
....
distance(1) -> null since its the first point
distance(2) -> ~28km from point 1 to point 2
distance(3) -> ~57km from point 2 to point 3
and etc..
My objective is to find the accumulative distance from each point to the next from the start for each node. eg like this mock table below:
id | distance | acc
----+------------------+-----
1 | |
2 | 27746.1563439608 | 27746.1563439608
3 | 57361.8216245281 | 85107.97797
4 | 34563.3607734946 | 119671.33874
where acc(1) is null because it is the first node,
acc(2) = acc(1) + dist(2)
acc(3) = acc(2) + dist(3)
and etc..
I tried combining the sum and lag functions but postgresql says that window functions cannot be nested. I'm completely baffled on how to proceed. Can anyone help me?
Since you cannot have a window function over another window function ("cannot be nested"), you need to add a subquery layer (or a CTE):
SELECT id, sum(distance) OVER (ORDER BY id) AS cum_dist
FROM (
SELECT id, st_distance(geom, lag(geom, 1) OVER (ORDER BY id)) AS distance
FROM foo
) sub
ORDER BY id;
This assumes that id is unique - which is guaranteed by your primary key.
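The two layers can also be pictured as two passes over the data: first the per-step distances, then a running total over them. A minimal Python sketch of the second pass, using the distance values from the question; the function name running_sum is made up for illustration, and None stands in for SQL NULL.

```python
# Per-step distances from the question; the first point has no predecessor.
distances = [None, 27746.1563439608, 57361.8216245281, 34563.3607734946]


def running_sum(values):
    """Cumulative sum that skips None, like sum(...) OVER (ORDER BY id).

    The total stays None until the first non-None value is seen.
    """
    total = None
    result = []
    for value in values:
        if value is not None:
            total = value if total is None else total + value
        result.append(total)
    return result


acc = running_sum(distances)
print(acc)
```

The first entry stays empty and the later entries agree with the acc column of the mock table in the question, up to rounding.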

SQLite - select the newest row with a certain field value

I have an SQLite question which essentially boils down to the following problem.
id | key | data
1 | A | x
2 | A | x
3 | B | x
4 | B | x
5 | A | x
6 | A | x
New data is appended to the end of the table with an auto-incremented id.
Now, I want to create a query which returns the latest row for each key, like this:
id | key | data
4 | B | x
6 | A | x
I've tried some different queries but I have been unsuccessful. How do you select only the latest rows for each "key" value in the table?
Use this SQL query:
select * from tbl where id in (select max(id) from tbl group by key);
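A quick way to try this is an in-memory SQLite database from Python. The sketch below assumes the table is called tbl, as in the query above, and loads the six sample rows from the question; the ORDER BY id is added only to make the printed order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE tbl (id INTEGER PRIMARY KEY AUTOINCREMENT, "key" TEXT, data TEXT)'
)
# Sample rows from the question; ids 1..6 are assigned automatically.
conn.executemany(
    'INSERT INTO tbl ("key", data) VALUES (?, ?)',
    [("A", "x"), ("A", "x"), ("B", "x"), ("B", "x"), ("A", "x"), ("A", "x")],
)

# Keep only the rows whose id is the largest id within its key.
latest = conn.execute(
    'SELECT * FROM tbl '
    'WHERE id IN (SELECT max(id) FROM tbl GROUP BY "key") '
    'ORDER BY id'
).fetchall()
print(latest)  # [(4, 'B', 'x'), (6, 'A', 'x')]
```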
You could split the task into two steps.
First find the largest id for each key (here, for the A and B keys).
Then join those ids back to the table to fetch the full rows:
SELECT *
FROM mytable
JOIN
( SELECT MAX(id) AS maxid
FROM mytable
GROUP BY "key"
) AS grp
ON grp.maxid = mytable.id
Side note: it's best not to use reserved words like key as identifiers (for tables, fields, etc.)
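Without nested SELECTs or JOINs, but only if the field determining "newest" is the primary key (e.g. autoincrement). When max() is the only aggregate in the query, SQLite takes the other columns from the row that holds the maximum, at the cost of one extra max(id) column in the output:
SELECT *, max(id) FROM tbl GROUP BY "key";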