Exclude "grouped" data from query - sql

I have a table that looks like this (simplified):
CREATE TABLE IF NOT EXISTS records (
    user_id uuid NOT NULL,
    ts timestamptz NOT NULL,
    op_type text NOT NULL,
    PRIMARY KEY (user_id, ts, op_type)
);
For practical purposes, I cannot change the PRIMARY KEY.
I'm trying to write a query that gets all records for a given user_id where, for a specific record, the ts and the op_type don't match an array of exclusions.
I'm not exactly sure of the right postgres terminology so let me see if this example makes my constraint clearer:
This array looks something like this (in JavaScript):
var excludes = [
[DATE1, 'OP1'],
[DATE2, 'OP2']
]
If, for a given user id, there are rows that look like this in the database:
ts | op_type
----------------------------+-------------
DATE1 | OP1
DATE2 | OP2
DATE1 | OP3
DATE2 | OP1
OTHER DATE | OP1
OTHER DATE | OP2
Then, with the excludes from above, I'd like to run a query that returns everything EXCEPT the first two rows, since they match exactly.
My attempt was to do this:
client.query(`
  SELECT * FROM records
  WHERE
    user_id = $1
    AND (ts, op_type) NOT IN ($2)
`, [userId, excluding])
But I get "input of anonymous composite types is not implemented". I'm not sure how to properly type excluding or if this is even the right way to do this.

The query may look like this
SELECT *
FROM records
WHERE user_id = 'a0eebc999c0b4ef8bb6d6bb9bd380a11'
AND (ts, op_type) NOT IN (('2016-01-01', 'OP1'), ('2016-01-02', 'OP2'));
so if you want to pass the conditions as a single parameter then excluding should be a string in the format:
('2016-01-01', 'OP1'), ('2016-01-02', 'OP2')
It seems there is no simple way to pass the whole condition string into query() as a parameter. You can write a function that produces the string in the correct format (I'm not a JS developer, but this piece of code seems to work well; a parameterized alternative using arrays is sketched after this answer's code):
// Builds a string like (('2016-01-01','OP1'),('2016-01-02','OP2')) from the nested array
var excluding = function(exc) {
  var s = '(';
  for (var i = 0; i < exc.length; i++)
    s = s + '(\'' + exc[i][0] + '\',\'' + exc[i][1] + '\'),';
  return s.slice(0, -1) + ')';
};
var excludes = [
['2016-01-01', 'OP1'],
['2016-01-02', 'OP2']
];
// ...
client.query(
'SELECT * FROM records '+
'WHERE user_id = $1 '+
'AND (ts, op_type) NOT IN ' + excluding(excludes),
[userId])
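A hedged alternative that avoids concatenating values into the SQL at all is to split the pairs into two parallel arrays (one of timestamps, one of op_types) and bind them as real parameters; the query then unpacks them with unnest. This is only a sketch, and it assumes the driver (node-postgres here) can bind JavaScript arrays to the PostgreSQL array parameters $2 and $3:

-- $1 = user_id, $2 = timestamptz[] of excluded ts values, $3 = text[] of excluded op_types;
-- $2[i] and $3[i] together form one excluded (ts, op_type) pair
SELECT *
FROM records
WHERE user_id = $1
  AND (ts, op_type) NOT IN (
    SELECT * FROM unnest($2::timestamptz[], $3::text[])
  );

The two arrays would simply be the first and second elements of each pair in excludes.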

PostgreSQL import from CSV NULL values are text - Need null

I had exported a bunch of tables (>30) as CSV files from a MySQL database using phpMyAdmin. These CSV files contain NULL values like:
"id","sourceType","name","website","location"
"1","non-commercial","John Doe",NULL,"California"
I imported many of these CSVs into a PostgreSQL database with TablePlus. However, the NULL values in the columns actually appear as text rather than null.
When my application fetches the data from these columns, it retrieves the text 'NULL' rather than a null value.
Also, a SQL query with IS NULL does not retrieve these rows, probably because the values are stored as text rather than as nulls.
Is there a SQL command I can do to convert all text NULL values in all the tables to actual NULL values? This would be the easiest way to avoid re-importing all the tables.
PostgreSQL's COPY command has a NULL 'some_string' option that lets you specify which string should be treated as a NULL value: https://www.postgresql.org/docs/current/sql-copy.html
This would of course require re-importing all your tables.
Example with your data:
The CSV:
"id","sourceType","name","website","location"
"1","non-commercial","John Doe",NULL,"California"
"2","non-commercial","John Doe",NULL,"California"
The table:
CREATE TABLE import_with_null (id integer, source_type varchar(50), name varchar(50), website varchar(50), location varchar(50));
The COPY statement:
COPY import_with_null (id, source_type, name, website, location) from '/tmp/import_with_NULL.csv' WITH (FORMAT CSV, NULL 'NULL', HEADER);
Test of the correct import of NULL strings as SQL NULL:
SELECT * FROM import_with_null WHERE website IS NULL;
id | source_type | name | website | location
----+----------------+----------+---------+------------
1 | non-commercial | John Doe | | California
2 | non-commercial | John Doe | | California
(2 rows)
The important part that transforms the NULL strings into SQL NULL values is NULL 'NULL'; any other string works the same way, e.g. NULL 'whatever string'.
UPDATE For whoever comes here looking for a solution
See the answers for two potential solutions.
One solution uses the SQL COPY method and has to be applied when (re)doing the import itself. That solution, provided by Michal T and marked as the accepted answer, is the better way to prevent this from happening in the first place.
My solution below uses a script in my application (built in Laravel/PHP) and can be run after the import is already done.
Note: see the comments in the code; you can likely work out a similar solution in other languages/frameworks.
Thanks to @BjarniRagnarsson's suggestion in the comments above, I came up with a short PHP Laravel script that runs update queries on all columns of type 'string' or 'text' to replace the 'NULL' text with real NULL values.
public function convertNULLStringToNULL()
{
    $tables = DB::connection()->getDoctrineSchemaManager()->listTableNames(); // Get list of all tables
    $results = []; // an array to store the output results
    foreach ($tables as $table) { // Loop through each table
        $columnNames = DB::getSchemaBuilder()->getColumnListing($table); // Get list of all columns
        $columnResults = []; // array to store the results per column
        foreach ($columnNames as $column) { // Loop through each column
            $columnType = DB::getSchemaBuilder()->getColumnType($table, $column); // Get the column type
            if (
                $columnType == 'string' || // check if column type is string or text
                $columnType == 'text'
            ) {
                $query = "update " . $table . " set \"" . $column . "\"=NULL where \"" . $column . "\"='NULL'"; // Build the update query as mentioned in comments above
                $r = DB::update($query); // perform the update query
                array_push($columnResults, [
                    $column => $r
                ]); // Push the column results
            }
        }
        array_push($results, [
            $table => $columnResults
        ]); // push the table results
    }
    dd($results); // Output the results
}
Note I was using Laravel 8 for this.
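A pure-SQL alternative to the Laravel script above (a sketch, assuming PostgreSQL and that all affected tables live in the public schema) is a DO block that walks information_schema.columns and nulls out the literal 'NULL' strings in every text-like column:

DO $$
DECLARE
    col record;
BEGIN
    FOR col IN
        SELECT table_name, column_name
        FROM information_schema.columns
        WHERE table_schema = 'public'
          AND data_type IN ('text', 'character varying')
    LOOP
        -- dynamically builds: UPDATE "table" SET "column" = NULL WHERE "column" = 'NULL'
        EXECUTE format('UPDATE %I SET %I = NULL WHERE %I = %L',
                       col.table_name, col.column_name, col.column_name, 'NULL');
    END LOOP;
END $$;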

LIKE with integers in PostgreSQL using R

I need to download a table from Postgres into R, but filtered by part of an INT.
I have been trying:
library(RPostgreSQL)
con <- dbConnect(PostgreSQL(), user= "#####", dbname="######",password="#####"
,host="#####", port='######')
vetor_id <- c("83052407","10406587","12272377")
match_id <- dbGetQuery(con,paste("
SELECT *
FROM public.data2015
WHERE id IN ('", paste(vetor_id,collapse = "','"),"')
",sep = ""))
dbDisconnect(con)
I also tried CONTAINS but it didn't work: WHERE Contains(id,", paste(vetor_id, collapse = " OR "), "')
id is an INT and vetor_id holds just part of the values. I mean, vetor_id = 83052407 must find id = 83052407000132.
How can I use something like LIKE and put vetor_id% ?
Is this what you want?
WHERE id::text like ? || '%'
This converts the integer id to a string, and attempts to match it against the parameter. If id starts with the parameter, the condition is satisfied.
Note that this uses a legitimate query parameter (represented by the question mark): you should get used to parameterizing your queries rather than concatenating variables into the query string; this is cleaner, more efficient and safer.
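If several prefixes need to be matched at once, as with vetor_id above, a possible variation is LIKE ANY with an array of patterns (a sketch; LIKE ANY is PostgreSQL-specific, and the patterns are just the ids from vetor_id with '%' appended):

SELECT *
FROM public.data2015
WHERE id::text LIKE ANY (ARRAY['83052407%', '10406587%', '12272377%']);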

SQL query 'IN' operator not working

I am using CodeIgniter (CI), and the 'in' operator is not working in my SQL query. Please check it and share your ideas...
table
 id | coach_name
----+------------
  9 | GS
 10 | SLR
view and function
$coachID = explode(',',$list['coach']);
$coachname = $this->rail_ceil_model->display_coach_name($coachID);
Result shown:
SLR
Result needed:
GS,SLR
Last query:
SELECT coach_name FROM mcc_coach WHERE id IN('9', '10')
CI code
public function display_coach_name($coachID = '')
{
    $db2 = $this->load->database('rail', TRUE);
    $db2->select('coach_name');
    $db2->from('mcc_coach');
    $db2->where_in('id', $coachID);
    $query = $db2->get();
    echo $db2->last_query(); die;
    if ($query->num_rows() > 0):
        //return $query->row()->coach_name;
    else:
        return 0;
    endif;
}
You must provide an array to the IN operator, so $coachID must be an array, not a string.
If you write this query:
SELECT coach_name FROM mcc_coach WHERE id IN('9,10')
it means you are applying the IN operator to a single id that contains a comma-separated value.
So the right query is:
SELECT coach_name FROM mcc_coach WHERE id IN('9', '10')

How to use where in list items

I have a database as below:
TABLE_B:
ID Name LISTID
1 NameB1 1
2 NameB2 1,10
3 NameB3 1025,1026
To select data from the table by ID, I used:
public static List<ListData> GetDataById(string id)
{
var db = Connect.GetDataContext<DataContext>("NameConnection");
var sql = (from tblB in db.TABLE_B
where tblB.LISTID.Contains(id)
select new ListData
{
Name= tblB.Name,
});
return sql.ToList();
}
When I call the function:
GetDataById("10") ==> Data return "NameB2, NameB3" are not correct.
The data correct is "NameB2". Please help me about that?
Thanks!
The value 10 will cause unintended matches because LISTID is a string/varchar type, as you already saw, and the Contains function does not know that there are delimiters that should be taken into account.
The fix could be very simple: surround both the id that you are looking for and LISTID with extra commas.
So you will now be looking for ,10,.
The value ,10, will be found in ,1,10, and not in ,1025,1026,
The LINQ where clause then becomes this:
where ("," + tblB.LISTID + ",").Contains("," + id + ",")

Update multiple rows with different values in a single SQL query

I have a SQLite database with table myTable and columns id, posX, posY. The number of rows changes constantly (might increase or decrease). If I know the value of id for each row, and the number of rows, can I perform a single SQL query to update all of the posX and posY fields with different values according to the id?
For example:
---------------------
myTable:
id posX posY
1 35 565
3 89 224
6 11 456
14 87 475
---------------------
SQL query pseudocode:
UPDATE myTable SET posX[id] = #arrayX[id], posY[id] = #arrayY[id] "
#arrayX and #arrayY are arrays which store new values for the posX and posY fields.
If, for example, arrayX and arrayY contain the following values:
arrayX = { 20, 30, 40, 50 }
arrayY = { 100, 200, 300, 400 }
... then the database after the query should look like this:
---------------------
myTable:
id posX posY
1 20 100
3 30 200
6 40 300
14 50 400
---------------------
Is this possible? I'm updating one row per query right now, but it's going to take hundreds of queries as the row count increases. I'm doing all this in AIR by the way.
There are a couple of ways to accomplish this reasonably efficiently.
First -
If possible, you can do some sort of bulk insert into a temporary table. This depends somewhat on your RDBMS/host language, but at worst it can be accomplished with simple dynamic SQL (using a VALUES() clause) and then a standard update-from-another-table, as sketched below. Most systems provide utilities for bulk load, though.
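A minimal sketch of that first approach, assuming SQLite and the example data from the question (tmp_pos is just an illustrative name for the temporary table):

CREATE TEMP TABLE tmp_pos (id INTEGER PRIMARY KEY, posX INTEGER, posY INTEGER);
INSERT INTO tmp_pos (id, posX, posY)
VALUES (1, 20, 100), (3, 30, 200), (6, 40, 300), (14, 50, 400);

-- correlated-subquery form of update-from-another-table; works on older SQLite as well
UPDATE myTable
SET posX = (SELECT posX FROM tmp_pos WHERE tmp_pos.id = myTable.id),
    posY = (SELECT posY FROM tmp_pos WHERE tmp_pos.id = myTable.id)
WHERE id IN (SELECT id FROM tmp_pos);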
Second -
This is somewhat RDBMS-dependent as well: you can construct a dynamic UPDATE statement in which the VALUES(...) clause inside the CTE is created on the fly:
WITH Tmp(id, px, py) AS (VALUES(id1, newPosX1, newPosY1),
(id2, newPosX2, newPosY2),
......................... ,
(idN, newPosXN, newPosYN))
UPDATE TableToUpdate SET posX = (SELECT px
FROM Tmp
WHERE TableToUpdate.id = Tmp.id),
posY = (SELECT py
FROM Tmp
WHERE TableToUpdate.id = Tmp.id)
WHERE id IN (SELECT id
FROM Tmp)
(According to the documentation, this should be valid SQLite syntax, but I can't get it to work in a fiddle)
One way: SET x=CASE..END (any SQL)
Yes, you can do this, but I doubt that it would improve performance, unless your query has really high latency.
If the query is indexed on the search value (e.g. if id is the primary key), then locating the desired tuple is very, very fast and after the first query the table will be held in memory.
So, multiple UPDATEs in this case aren't all that bad.
If, on the other hand, the condition requires a full table scan, and even worse, the table's memory impact is significant, then having a single complex query will be better, even if evaluating the UPDATE is more expensive than a simple UPDATE (which gets internally optimized).
In this latter case, you could do:
UPDATE table SET posX=CASE
WHEN id=id[1] THEN posX[1]
WHEN id=id[2] THEN posX[2]
...
ELSE posX END [, posY = CASE ... END]
WHERE id IN (id[1], id[2], id[3]...);
The total cost is given more or less by: NUM_QUERIES * (COST_QUERY_SETUP + COST_QUERY_PERFORMANCE). This way, you knock down NUM_QUERIES (from N separate ids to 1), but COST_QUERY_PERFORMANCE goes up (about 3x in MySQL 5.28; I haven't yet tested in MySQL 8).
Otherwise, I'd try with indexing on id, or modifying the architecture.
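In the indexed case the fix is a one-liner (a sketch, assuming SQLite and the myTable example from the question):

CREATE INDEX IF NOT EXISTS idx_mytable_id ON myTable(id);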
This is an example with PHP, where I suppose we have a condition that already requires a full table scan, and which I can use as a key:
// Multiple update rules
$updates = [
"fldA='01' AND fldB='X'" => [ 'fldC' => 12, 'fldD' => 15 ],
"fldA='02' AND fldB='X'" => [ 'fldC' => 60, 'fldD' => 15 ],
...
];
The fields updated in the right-hand expressions can be one or many, but must always be the same (always fldC and fldD in this case). This restriction can be removed, but it would require a modified algorithm.
I can then build the single query through a loop:
$where = [ ];
$set = [ ];
foreach ($updates as $when => $then) {
    $where[] = "({$when})";
    foreach ($then as $fld => $value) {
        if (!array_key_exists($fld, $set)) {
            $set[$fld] = [ ];
        }
        $set[$fld][] = $value;
    }
}
$set1 = [ ];
foreach ($set as $fld => $values) {
    $set2 = "{$fld} = CASE";
    foreach ($values as $i => $value) {
        $set2 .= " WHEN {$where[$i]} THEN {$value}";
    }
    $set2 .= ' END';
    $set1[] = $set2;
}
// Single query
$sql = 'UPDATE table SET '
     . implode(', ', $set1)
     . ' WHERE '
     . implode(' OR ', $where);
Another way: ON DUPLICATE KEY UPDATE (MySQL)
In MySQL I think you could do this more easily with a multiple-row INSERT ... ON DUPLICATE KEY UPDATE, assuming that id is a primary key, and keeping in mind that nonexistent ids ("id = 777" when there is no 777) will get inserted into the table and may cause an error if, for example, other required columns (declared NOT NULL) aren't specified in the query:
INSERT INTO tbl (id, posx, posy, bazinga)
VALUES (id1, posX1, posY1, 'DELETE'),
...
ON DUPLICATE KEY UPDATE posx=VALUES(posx), posy=VALUES(posy);
DELETE FROM tbl WHERE bazinga='DELETE';
The 'bazinga' trick above allows deleting any rows that might have been unwittingly inserted because their id was not present (in other scenarios you might want the inserted rows to stay, though).
For example, a periodic update from a set of gathered sensors, but some sensors might not have been transmitted:
INSERT INTO monitor (id, value)
VALUES (sensor1, value1), (sensor2, 'N/A'), ...
ON DUPLICATE KEY UPDATE value=VALUES(value), reading=NOW();
(This is a contrived case, it would probably be more reasonable to LOCK the table, UPDATE all sensors to N/A and NOW(), then proceed with INSERTing only those values we do have).
A third way: CTE (PostgreSQL, not sure about SQLite3)
This is conceptually almost the same as the INSERT MySQL trick. As written, it works in PostgreSQL 9.6:
WITH updated(id, posX, posY) AS (VALUES
(id1, posX1, posY1),
(id2, posX2, posY2),
...
)
UPDATE myTable
SET
posX = updated.posX,
posY = updated.posY
FROM updated
WHERE (myTable.id = updated.id);
Something like this might work for you:
"UPDATE myTable SET ... ;
UPDATE myTable SET ... ;
UPDATE myTable SET ... ;
UPDATE myTable SET ... ;"
If any of the posX or posY values are the same, then they could be combined into one query
UPDATE myTable SET posX='39' WHERE id IN('2','3','40');
In recent versions of SQLite (starting with 3.24.0, released in 2018) you can use the UPSERT clause. Assuming that only existing rows identified by a unique id column are updated, you can use this approach, which is similar to @LSerni's ON DUPLICATE suggestion:
INSERT INTO myTable (id, posX, posY) VALUES
( 1, 35, 565),
( 3, 89, 224),
( 6, 11, 456),
(14, 87, 475)
ON CONFLICT (id) DO UPDATE SET
posX = excluded.posX, posY = excluded.posY
I could not actually make @Clockwork-Muse's answer work, but I could make this variation work:
WITH Tmp AS (SELECT * FROM (VALUES (id1, newPosX1, newPosY1),
(id2, newPosX2, newPosY2),
......................... ,
(idN, newPosXN, newPosYN)) d(id, px, py))
UPDATE t
SET posX = (SELECT px FROM Tmp WHERE t.id = Tmp.id),
posY = (SELECT py FROM Tmp WHERE t.id = Tmp.id)
FROM TableToUpdate t
I hope this works for you too!
Use a comma ","
e.g.:
UPDATE my_table SET rowOneValue = rowOneValue + 1, rowTwoValue = rowTwoValue + ( (rowTwoValue / (rowTwoValue) ) + ?) * (v + 1) WHERE value = ?
To update a table with different values for column1, depending on the values in column2, you can do the following in SQLite:
"UPDATE table SET column1=CASE WHEN column2<>'something' THEN 'val1' ELSE 'val2' END"
Try with "update tablet set (row='value' where id=0001'), (row='value2' where id=0002'), ...