Find a character within a string - SQL

I have a string column called day with indexes of the days of the week, as follows:
1,2,3,4,5,6,7
I want to check whether, for example, 1 exists in the string within the column.
The logic should be like this:
select * from table where 1 is in day;
How can I achieve this with PostgreSQL?

select * from table
where position('1' in day) > 0
But in general it's a bad idea to store comma-separated items in a column; it will only cause you lots of trouble.

You shouldn't be storing comma separated values in the first place.
The solution using like or position() can fail if you have two-digit numbers in those values; e.g. '12,13' would be matched by position('1' in day) as well.
To avoid that, convert the de-normalized value into an array, then search in the array:
select *
from the_table
where '1' = any(string_to_array(day, ','));
Online example: https://rextester.com/GEMN57014
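To see the pitfall in action, here is a quick illustration (my own example, not part of the original answer): with the value '12,13', the position() check reports a false positive, while the array comparison does not.
select position('1' in '12,13') > 0;                 -- true, although 1 is not an element of the list
select '1' = any(string_to_array('12,13', ','));     -- false, the correct answer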

Another way: you can try to use like.
PostgreSQL 9.6 Schema Setup:
CREATE TABLE T(col1 varchar(50));
INSERT INTO T VALUES ('1,2,3,4,5,6,7');
INSERT INTO T VALUES ('2,3,4,5,6,7,1');
INSERT INTO T VALUES ('5,6,7');
Query 1:
select *
from T
WHERE col1 like concat('%','1',',%') or col1 like concat('%,','1','%')
Results:
| col1 |
|---------------|
| 1,2,3,4,5,6,7 |
| 2,3,4,5,6,7,1 |

Related

How can I alter a data type and the data in it, numbers separated with commas

I have a table that has a field named AMOUNT; the amount holds a number, the table has 1.4m records, and I need to update them all. I would like to change the NUMBER to varchar and make the amount data look comma-separated, e.g. 76543 -> 76,543. How can I do it?
1 - Create the new column at the end of the table.
2 - Run an update to populate the new table column
(for the thousands separator in this step, see Thousand Separator function in oracle?)
3 - Drop the old table column.
4 - Rename the new column to the original column name.
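A minimal sketch of those four steps (the table and column names are assumptions; adjust the VARCHAR2 size to your largest value):
ALTER TABLE your_table ADD (amount_txt VARCHAR2(20));                      -- 1: new column
UPDATE your_table SET amount_txt = TO_CHAR(amount, 'FM999G999G999G990');   -- 2: populate it
ALTER TABLE your_table DROP COLUMN amount;                                 -- 3: drop the old column
ALTER TABLE your_table RENAME COLUMN amount_txt TO amount;                 -- 4: rename to the original name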
I need to update them all
Don't; if you have a numeric value then store it as a NUMBER.
I would like to change the NUMBER to varchar and make the amount data look comma-separated, e.g. 76543 -> 76,543. How can I do it?
Just change how you are displaying the value rather than changing how you are storing the value.
If you have the table and data:
CREATE TABLE table_name ( amount NUMBER(12,0) );
INSERT INTO table_name ( amount ) VALUES ( 76543 );
If you want to do it in a SELECT statement then use TO_CHAR and include sufficient digits to format the largest number you can hold:
SELECT amount,
TO_CHAR(amount, 'FM999G999G999G990') AS formatted_amount
FROM table_name;
Outputs:
AMOUNT | FORMATTED_AMOUNT
-------+-----------------
 76543 | 76,543
If you want to do that in the table then add a virtual column:
ALTER TABLE table_name
ADD formatted_amount VARCHAR2(16)
GENERATED ALWAYS AS ( TO_CHAR(amount, 'FM999G999G999G990') );
Then, after adding the virtual column:
SELECT * FROM table_name;
Outputs:
AMOUNT | FORMATTED_AMOUNT
-------+-----------------
 76543 | 76,543
db<>fiddle here
You can use to_char():
select to_char(col, 'FM999,990')

Generate new ID for every new combination of column 1 and column 2

I would like to generate a new ID number for every new combination of column 1 and column 2.
For example:
ID | column 1 | column 2
1 | peter | blue
2 | mark | red
1 | peter | blue
As there will be new rows added over time, with new values, this should be able to auto-update.
I tried DENSE_RANK(), which seemed to work. But it gave an error when I put it as a statement in a calculated column, so I guess this is not possible? (Still very new to SQL).
Thanks for any help!
Error message: #1901 - Function or expression 'dense_rank()' cannot be
used in the GENERATED ALWAYS AS clause of `ProductIdentifier`
EDIT:
What I basically want is to link a row to another table based on the 2 columns. I could also concatenate the two columns, of course, but I read somewhere that doing this with strings will be slower. It will ultimately be a big table, currently at 200,000+ rows and growing to millions. Is this something I could/should do?
I don't understand. If you want column1/column2 to be unique, then they should be in their own table:
create table t12 (
    t12_id int auto_increment primary key,
    column1 int,
    column2 int,
    unique (column1, column2)
);
This gives you the unique value for the pair that you seem to want.
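If the goal is to link rows in another table to such a pair, a minimal sketch (the table fact and its columns are hypothetical) is to reference t12_id as a foreign key instead of repeating the two columns:
create table fact (
    fact_id int auto_increment primary key,
    t12_id int not null,
    foreign key (t12_id) references t12 (t12_id)   -- link to the pair table instead of storing column1/column2 again
);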
You can't use window functions in a generated column, as you found out. You can, however, compute the information on the fly in a view:
create view myview as
select id, column1, column2, dense_rank() over(order by column1, column2) rn
from mytable
Then you can query the view instead of the table, like:
select * from myview;
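For illustration (my own example based on the sample rows above), the view assigns the same rn to identical pairs:
id | column1 | column2 | rn
 1 | peter   | blue    |  2
 2 | mark    | red     |  1
 1 | peter   | blue    |  2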

Adding column to sqlite database and distribute rows based on primary key

I have some data elements containing a timestamp and information about Item X sales related to this timestamp.
e.g.
timestamp | items X sold
------------------------
1 | 10
4 | 40
7 | 20
I store this data in an SQLite table. Now I want to add to this table, especially when I get data about another item Y.
The item Y data might or might not have different timestamps, but I want to insert this data into the existing table so that it looks like this:
timestamp | items X sold | items Y sold
------------------------------------------
1 | 10 | 5
2 | NULL | 10
4 | 40 | NULL
5 | NULL | 3
7 | 20 | NULL
Later on additional sales data (columns) must be added with the same scheme.
Is there an easy way to accomplish this with SQLite?
In the end I want to fetch data by timestamp and get an overview of which items were sold at that time. Most examples consider the use case of adding a complete row (one record), or a complete column if it perfectly matches the other columns.
Or is SQLite the wrong tool altogether, and should I rather use CSV or Excel?
(Using pythons sqlite3 package to create and manipulate the DB)
Thanks!
Dynamically adding columns is not a good design. You could add them using
ALTER TABLE your_table ADD COLUMN the_column_name TEXT
The column, for existing rows, would be populated with NULLs, although you could specify a DEFAULT value and the existing rows would then be populated with that value.
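For instance, a hypothetical variant (the names are placeholders) that back-fills existing rows with 0 instead of NULL:
ALTER TABLE your_table ADD COLUMN the_column_name INTEGER DEFAULT 0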
e.g. the following demonstrates the above :-
DROP TABLE IF EXISTS soldv1;
CREATE TABLE IF NOT EXISTS soldv1 (timestamp INTEGER PRIMARY KEY, items_sold_x INTEGER);
INSERT INTO soldv1 VALUES(1,10),(4,40),(7,20);
SELECT * FROM soldv1 ORDER BY timestamp;
ALTER TABLE soldv1 ADD COLUMN items_sold_y INTEGER;
UPDATE soldv1 SET items_sold_y = 5 WHERE timestamp = 1;
INSERT INTO soldv1 VALUES(2,null,10),(5,null,3);
SELECT * FROM soldv1 ORDER BY timestamp;
resulting in the first query returning :-
timestamp | items_sold_x
----------+--------------
        1 |           10
        4 |           40
        7 |           20
and the second query returning :-
timestamp | items_sold_x | items_sold_y
----------+--------------+--------------
        1 |           10 |            5
        2 |         NULL |           10
        4 |           40 |         NULL
        5 |         NULL |            3
        7 |           20 |         NULL
However, as stated, the above is not considered a good design, as the schema is dynamic.
You could alternatively manage an equivalent of the above either by adding a new column (which also becomes part of the primary key) or by prefixing/suffixing the timestamp with a type.
Consider, as an example, the following :-
DROP TABLE IF EXISTS soldv2;
CREATE TABLE IF NOT EXISTS soldv2 (type TEXT, timestamp INTEGER, items_sold INTEGER, PRIMARY KEY(timestamp,type));
INSERT INTO soldv2 VALUES('x',1,10),('x',4,40),('x',7,20);
INSERT INTO soldv2 VALUES('y',1,5),('y',2,10),('y',5,3);
INSERT INTO soldv2 VALUES('z',1,15),('z',2,5),('z',9,25);
SELECT * FROM soldv2 ORDER BY timestamp;
This has replicated, data-wise, your original data and additionally added another type ('z', which would have been the column items_sold_z in the first design) without having to change the table's schema (and without the additional complication of needing an update rather than an insert, as was required when applying timestamp 1 / items_sold_y = 5).
The result from the query being :-
type | timestamp | items_sold
-----+-----------+------------
x    |         1 |         10
y    |         1 |          5
z    |         1 |         15
y    |         2 |         10
z    |         2 |          5
x    |         4 |         40
y    |         5 |          3
x    |         7 |         20
z    |         9 |         25
Or is SQLite the wrong tool altogether, and should I rather use CSV or Excel?
SQLite is a valid tool. What you then do with the data can probably be done as easily as in Excel (perhaps more simply), and probably much more simply than trying to process the data in CSV format.
For example, say you wanted the total items sold per timestamp and how many types were sold then :-
SELECT timestamp, count(items_sold) AS number_of_item_types_sold, sum(items_sold) AS total_sold FROM soldv2 GROUP BY timestamp ORDER BY timestamp;
would result in :-
timestamp | number_of_item_types_sold | total_sold
----------+---------------------------+-----------
        1 |                         3 |         30
        2 |                         2 |         15
        4 |                         1 |         40
        5 |                         1 |          3
        7 |                         1 |         20
        9 |                         1 |         25

How to lookup based on ranged values

I have a table like:
id name
001to005 ABC
006to210 PQR
211to300 XYZ
This is not the final table; I can structure it any way I want. I would like to look up an id in this data and extract the name, e.g. if the id is in the range 001-005 then ABC, if it is in the range 006-210 then PQR, and so on.
My approach would be, store id as regular expression in table like this:
id name
[0][0][1-5] ABC
[0-2][0-9][0-9] PQR
[2-3][0-9][0-9] XYZ
and then query:
select * from table where '004' ~ id
This query will return ABC, which is correct, but when the ranges get bigger my input value can match both the 2nd and 3rd rows.
For example:
select * from table where '299' ~ id
This query will result in 2 rows, so my question is: what regular expression should I use to make it more restrictive, or is there any other approach to solve this?
Do not store regular expressions for simple ranges; that would be extremely expensive and could not use an index: every single expression in the table would have to be evaluated for every query.
You could use range types like #a_horse commented. But since you don't need the added functionality of range types, this simple layout is smaller and faster:
CREATE TABLE tbl (
id_lo int NOT NULL
, id_hi int NOT NULL
, name text NOT NULL
);
INSERT INTO tbl VALUES
( 1, 5, 'ABC')
, ( 6, 210, 'PQR')
, (211, 300, 'XYZ');
CREATE UNIQUE INDEX foo ON tbl (id_lo, id_hi DESC);
Two integers occupy 8 bytes, while an int4range value occupies 17 bytes. Size matters in tables and indexes.
Query:
SELECT * FROM tbl
WHERE 4 BETWEEN id_lo AND id_hi;
Lower (id_lo) and upper (id_hi) bounds are included in the range like your sample data suggests.
Note that range types exclude the upper bound by default.
Also assuming that leading zeros are insignificant, so we can operate with plain integers.
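For comparison, a minimal sketch of the range-type variant mentioned above (my own illustration; the names are assumptions), using inclusive bounds and the containment operator:
CREATE TABLE tbl_range (
  id_range int4range NOT NULL
, name     text      NOT NULL
);
INSERT INTO tbl_range VALUES
  ('[1,5]',     'ABC')
, ('[6,210]',   'PQR')
, ('[211,300]', 'XYZ');
SELECT * FROM tbl_range WHERE id_range @> 4;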
Related:
PostgreSQL daterange not using index correctly
Optimizing queries on a range of timestamps (two columns)
Find overlapping date ranges in PostgreSQL
To enforce distinct ranges in the table:
Preventing adjacent/overlapping entries with EXCLUDE in PostgreSQL
You still don't need a range type in the table for this:
Postgres: How to find nearest tsrange from timestamp outside of ranges?

return all rows, in which at least one column contains a specific string

I have a legacy database with a table that has around 80 columns. The problem is that the columns are named in German, so I can't understand them.
I need to get the id of all rows where a specific column contains the string "NETTO". The problem is that I don't know which of the columns it is.
So now I wonder if I can check whether any of the columns contains this string.
I thought of using "or" and spelling out all the columns, but that is not a good solution for me (there are more than 80).
sample:
t=# create table n(a text, i int, b text);
CREATE TABLE
t=# insert into n values ('abc def',1,'here NETTO exists'), ('abc',2,'brutto');
INSERT 0 2
You can check the text representation of a row:
t=# select n from n;
n
-----------------------------------
("abc def",1,"here NETTO exists")
(abc,2,brutto)
(2 rows)
thus your query would be like:
t=# select * from n where n::text like '%NETTO%';
a | i | b
---------+---+-------------------
abc def | 1 | here NETTO exists
(1 row)
Of course this will always use a sequential scan and is thus suboptimal.
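Applied to your case, a minimal sketch (legacy_table and id are assumed names) that returns only the id of the matching rows:
select id from legacy_table t where t::text like '%NETTO%';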