HSQLDB (HyperSQL): Changing column type in a TEXT table

For the CsvCruncher project,
I am loading a CSV file into HSQLDB.
CREATE TEXT TABLE concat_1 ( Op VARCHAR(255), id VARCHAR(255), uuid VARCHAR(255), session_id VARCHAR(255) )
SET TABLE concat_1 SOURCE '.../concat_1.csv;encoding=UTF-8;cache_rows=50000;cache_size=10240000;ignore_first=true;fs=,;qc=\quote'
At the time of creating the table and loading, I don't know anything about the column values.
To speed the SELECTs up, I am trying to convert the columns (after loading) to other types, relying on this HSQLDB feature:
"HyperSQL allows changing the type if all the existing values can be cast
into the new type without string truncation or loss of significant digits."
ALTER TABLE concat_1 ALTER COLUMN id SET DATA TYPE BIGINT
But when I try that, I get:
operation is not allowed on text table with data in statement
Is this possible with HSQLDB without duplicating the TEXT table into a normal (native) table?
Here's the code, for your imagination:
for (String colName : colNames) {
    String sqlTypeUsed = null;
    for (String sqlType : new String[]{"TIMESTAMP","UUID","BIGINT","INTEGER","SMALLINT","BOOLEAN"}) {
        String sqlCol = String.format("ALTER TABLE %s ALTER COLUMN %s SET DATA TYPE %s",
                tableName, colName, sqlTypeUsed = sqlType);
        log.info("Column change attempt SQL: " + sqlCol);
        try (Statement st = this.conn.createStatement()) {
            st.execute(sqlCol);
            log.info(String.format("Column %s.%s converted to %s", tableName, colName, sqlTypeUsed));
        } catch (SQLException ex) {
            log.info(String.format("Column %s.%s values don't fit to %s.\n  %s",
                    tableName, colName, sqlTypeUsed, ex.getMessage()));
        }
    }
}

I figured it out. Although it's not documented, TEXT tables can't be altered while they are bound to a CSV file.
What I did:
1) Instead of trying ALTER with each type, I queried SELECT CAST (<col> AS <type>).
2) I collected all types that the column can fit in and chose the most specific and smallest.
3) Then I detached the table - SET TABLE <table> SOURCE OFF.
4) Then I did the ALTER COLUMN.
5) Lastly, reattach - SET TABLE <table> SOURCE ON.
This way the table ends up with the most fitting type, and the caches and indexes work more efficiently.
For large tables, though, it could be worth flipping the resulting table into a native CACHED (disk-based) table.
Code coming when I clean it up.
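Until then, here is a minimal SQL sketch of steps 1 and 3-5, using the concat_1 table and id column from the question (the BIGINT target is just illustrative):
-- Step 1: probe whether every value fits the candidate type (this fails if any value can't be cast).
SELECT CAST (id AS BIGINT) FROM concat_1;
-- Steps 3-5: detach the table from the CSV source, alter the column, reattach.
SET TABLE concat_1 SOURCE OFF;
ALTER TABLE concat_1 ALTER COLUMN id SET DATA TYPE BIGINT;
SET TABLE concat_1 SOURCE ON;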

Related

Best way to compress xml text column using SQL?

Using Microsoft SQL Server 2019.
I have two columns: one text column representing some XML that I need to compress, and another varbinary(max) column representing already compressed XML.
Please assume I cannot change the source data, but conversions can be made as necessary in the code.
I'd like to compress the text column, and initially it works fine, but if I try to save the result into a temp table to be used further along in the process, I get weird characters like ‹ or tŠÌK'À3û€Í‚;jw. The first temp table I make stores it just fine: I can select from the initial table and it displays the compressed value correctly. But if I pull it into a secondary temp table or a variable from there, it turns into a mess.
I've tried converting into several different formats, converting later in the process, and bringing in the source data for the column at the very last stage. My end goal is to populate a variable that will be converted into JSON, and it always ends up garbled there as well. I just need the compressed version of the columns to display properly when viewing the JSON variable I've made.
Any suggestions on how to tackle this?
Collation issue?
This smells like a collation issue. tempdb is actually its own database, with its own default collation and other settings.
In one database with default CollationA you call COMPRESS(NvarcharData) and that produces some VARBINARY.
In another database (tempdb) with default CollationB you call CONVERT(NVARCHAR(MAX), DECOMPRESS(CompressedData)). Now, what happens under the hood is:
CompressedData gets decompressed into VARBINARY representing NvarcharData in CollationA
that VARBINARY is converted to NVARCHAR assuming the binary data represents NVARCHAR data in CollationB, which is not true!
Try to be more explicit (collation, data type) with conversions between XML, VARBINARY and (N)VARCHAR.
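For instance, here is a minimal round trip that is explicit about the data type on both sides (the variable names are just illustrative):
-- Compress from NVARCHAR(MAX) and decompress back to the same explicit type.
DECLARE @source NVARCHAR(MAX) = N'<root>data</root>';
DECLARE @compressed VARBINARY(MAX) = COMPRESS(@source);
SELECT CONVERT(NVARCHAR(MAX), DECOMPRESS(@compressed)) AS RoundTripped;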
Double compression?
I have also noticed "representing already compressed xml, that I need to compress". If you are double-compressing, maybe you forgot to double-decompress?
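If that's the case, something like this (the table and column names are hypothetical) would be needed to get back to readable XML:
-- Undo both layers of compression before converting to XML.
SELECT CONVERT(XML, DECOMPRESS(DECOMPRESS(XmlDoubleCompressed))) FROM dbo.SomeTable;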
Example?
You are sadly missing an example, but I have produced a minimal example of converting between XML and compressed data that works for me.
BEGIN TRANSACTION
GO
CREATE TABLE dbo.XmlData_Base (
    PrimaryKey INTEGER NOT NULL IDENTITY(1, 1),
    XmlCompressed VARBINARY(MAX) NULL
);
GO
CREATE OR ALTER VIEW dbo.XmlData
WITH SCHEMABINDING
AS
SELECT
    BASE.PrimaryKey,
    CONVERT(XML, DECOMPRESS(BASE.XmlCompressed)) AS XmlData
FROM
    dbo.XmlData_Base AS BASE;
GO
CREATE OR ALTER TRIGGER dbo.TR_XmlData_instead_I
ON dbo.XmlData
INSTEAD OF INSERT
AS
BEGIN
    INSERT INTO dbo.XmlData_Base
        (XmlCompressed)
    SELECT
        COMPRESS(CONVERT(VARBINARY(MAX), I.XmlData))
    FROM
        Inserted AS I;
END;
GO
CREATE OR ALTER TRIGGER dbo.TR_XmlData_instead_U
ON dbo.XmlData
INSTEAD OF UPDATE
AS
BEGIN
    UPDATE BASE
    SET
        BASE.XmlCompressed = COMPRESS(CONVERT(VARBINARY(MAX), I.XmlData))
    FROM
        dbo.XmlData_Base AS BASE
        JOIN Inserted AS I ON I.PrimaryKey = BASE.PrimaryKey;
END;
GO
INSERT INTO dbo.XmlData
    (XmlData)
VALUES
    (CONVERT(XML, N'<this><I>I call upon thee!</I></this>'));
SELECT
    *
FROM
    dbo.XmlData;
SELECT
    PrimaryKey,
    XmlCompressed,
    CONVERT(XML, DECOMPRESS(XmlCompressed))
FROM
    dbo.XmlData_Base;
UPDATE dbo.XmlData
SET
    XmlData = CONVERT(XML, N'<that><I>I call upon thee!</I></that>');
SELECT
    *
FROM
    dbo.XmlData;
SELECT
    PrimaryKey,
    XmlCompressed,
    CONVERT(XML, DECOMPRESS(XmlCompressed))
FROM
    dbo.XmlData_Base;
GO
ROLLBACK TRANSACTION;

Convert table column data type from image to varbinary

I have a table like:
create table tbl (
    id int,
    data image
)
It turns out that the values in the data column are very small and could be stored in varbinary(200).
So the new table would be:
create table tbl (
    id int,
    data varbinary(200)
)
How can I migrate this table to the new design without losing the data in it?
Just do two separate ALTER TABLEs, since you can only convert image to varbinary(max), but you can, afterwards, change its length:
create table tbl (
    id int,
    data image
)
go
insert into tbl(id, data) values
    (1, 0x0101010101),
    (2, 0x0204081632)
go
alter table tbl alter column data varbinary(max)
go
alter table tbl alter column data varbinary(200)
go
select * from tbl
Result:
id data
----------- ---------------
1 0x0101010101
2 0x0204081632
You can use this ALTER statement to convert the existing IMAGE column to VARBINARY(MAX):
ALTER Table tbl ALTER COLUMN DATA VARBINARY(MAX)
After this conversion, you will surely get your data back.
NOTE: Don't forget to take a backup before execution.
The IMAGE data type is deprecated and will be removed in a future version of SQL Server, so it should be converted to VARBINARY(MAX) wherever possible.
How about you create a NewTable with the varbinary column, then copy the data from the OldTable into it?
INSERT INTO [dbo].[NewTable] ([id], [data])
SELECT [id], [data] FROM [dbo].[OldTable]
First of all, from BOL:
image: Variable-length binary data from 0 through 2^31-1 (2,147,483,647) bytes.
The image data type is essentially an alias for varbinary(2GB), so converting it to varbinary(max) should not lead to data loss.
But to be sure, take these steps (a sketch follows the list):
back up your existing data
add a new field (varbinary(max))
copy data from old field to new field
swap the fields with sp_rename
test
after successful test, drop the old column
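A minimal T-SQL sketch of those steps, assuming the tbl and data names from the question (don't drop anything until the test passes):
-- Add a new varbinary(max) column alongside the old image column.
ALTER TABLE tbl ADD data_new VARBINARY(MAX) NULL;
GO
-- Copy the data across; image converts to varbinary(max) without loss.
UPDATE tbl SET data_new = CONVERT(VARBINARY(MAX), data);
GO
-- Swap the column names.
EXEC sp_rename 'tbl.data', 'data_old', 'COLUMN';
EXEC sp_rename 'tbl.data_new', 'data', 'COLUMN';
GO
-- After a successful test:
-- ALTER TABLE tbl DROP COLUMN data_old;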

Unable to create table in hive

I am creating a table in Hive like:
CREATE TABLE SEQUENCE_TABLE(
    SEQUENCE_NAME VARCHAR2(225) NOT NULL,
    NEXT_VAL NUMBER NOT NULL
);
But the result is a parse exception: it is unable to read VARCHAR2(225) NOT NULL.
Can anyone guide me on how to create a table like the one above, and on how to provide a path for it?
There's no such thing as VARCHAR2, a field width, or a NOT NULL clause in Hive.
CREATE TABLE SEQUENCE_TABLE( SEQUENCE_NAME string, NEXT_VAL bigint);
Please read this for CREATE TABLE syntax:
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-CreateTable
Anyway, Hive is "SQL-like" but it's not SQL. I wouldn't use it for things such as a sequence table, as you don't have support for transactions, locking, keys, and everything else you are familiar with from Oracle (though I think newer versions have basic support for transactions, updates, deletes, etc.).
I would consider using a normal OLTP database for whatever you are trying to achieve.
The only option you have here is something like:
CREATE TABLE SEQUENCE_TABLE(SEQUENCE_NAME String, NEXT_VAL bigint) row format delimited fields terminated by ',' stored as textfile;
PS: Again, it depends on the type of data you are going to load into Hive.
Use the following syntax:
CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.] table_name
[(col_name data_type [COMMENT col_comment], ...)]
[COMMENT table_comment]
[ROW FORMAT row_format]
[STORED AS file_format]
And an example of a Hive CREATE TABLE:
CREATE TABLE IF NOT EXISTS employee ( eid int, name String,
    salary String, destination String)
COMMENT 'Employee details'
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n'
STORED AS TEXTFILE;

Declare variable in SQLite and use it

I want to declare a variable in SQLite and use it in insert operation.
Like in MS SQL:
declare @name as varchar(10)
set @name = 'name'
select * from table where name = @name
For example, I will need to get last_insert_rowid() and use it in an insert.
I have found something about binding, but I didn't really fully understand it.
SQLite doesn't support native variable syntax, but you can achieve virtually the same using an in-memory temp table.
I've used the approach below in large projects and it works like a charm.
/* Create in-memory temp table for variables */
BEGIN;
PRAGMA temp_store = 2; /* 2 means use in-memory */
CREATE TEMP TABLE _Variables(Name TEXT PRIMARY KEY, RealValue REAL, IntegerValue INTEGER, BlobValue BLOB, TextValue TEXT);
/* Declaring a variable */
INSERT INTO _Variables (Name) VALUES ('VariableName');
/* Assigning a variable (pick the right storage class) */
UPDATE _Variables SET IntegerValue = ... WHERE Name = 'VariableName';
/* Getting variable value (use within expression) */
... (SELECT coalesce(RealValue, IntegerValue, BlobValue, TextValue) FROM _Variables WHERE Name = 'VariableName' LIMIT 1) ...
DROP TABLE _Variables;
END;
For a read-only variable (that is, a constant value set once and used anywhere in the query), use a Common Table Expression (CTE).
WITH const AS (SELECT 'name' AS name, 10 AS more)
SELECT table.cost, (table.cost + const.more) AS newCost
FROM table, const
WHERE table.name = const.name
SQLite WITH clause
Herman's solution works, but it can be simplified because SQLite allows storing a value of any type in any field.
Here is a simpler version that uses one Value field, declared as TEXT, to store any value:
CREATE TEMP TABLE IF NOT EXISTS Variables (Name TEXT PRIMARY KEY, Value TEXT);
INSERT OR REPLACE INTO Variables VALUES ('VarStr', 'Val1');
INSERT OR REPLACE INTO Variables VALUES ('VarInt', 123);
INSERT OR REPLACE INTO Variables VALUES ('VarBlob', x'12345678');
SELECT Value
FROM Variables
WHERE Name = 'VarStr'
UNION ALL
SELECT Value
FROM Variables
WHERE Name = 'VarInt'
UNION ALL
SELECT Value
FROM Variables
WHERE Name = 'VarBlob';
Herman's solution worked for me, but the ... had me mixed up for a bit. I'm including the demo I worked up based on his answer. The additional features in my answer include foreign key support, auto incrementing keys, and use of the last_insert_rowid() function to get the last auto generated key in a transaction.
My need for this information came up when I hit a transaction that required three foreign keys but I could only get the last one with last_insert_rowid().
PRAGMA foreign_keys = ON; -- sqlite foreign key support is off by default
PRAGMA temp_store = 2; -- store temp table in memory, not on disk
CREATE TABLE Foo(
    Thing1 INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL
);
CREATE TABLE Bar(
    Thing2 INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
    FOREIGN KEY(Thing2) REFERENCES Foo(Thing1)
);
BEGIN TRANSACTION;
CREATE TEMP TABLE _Variables(Key TEXT, Value INTEGER);
INSERT INTO Foo(Thing1)
VALUES(2);
INSERT INTO _Variables(Key, Value)
VALUES('FooThing', last_insert_rowid());
INSERT INTO Bar(Thing2)
VALUES((SELECT Value FROM _Variables WHERE Key = 'FooThing'));
DROP TABLE _Variables;
END TRANSACTION;
To use the one from denverCR in your example:
WITH tblCTE AS (SELECT 'Joe' AS namevar)
SELECT * FROM table, tblCTE
WHERE name = namevar
As a beginner I found other answers too difficult to understand, hope this works
Creating "VARIABLE" for use in SQLite SELECT (and some other) statements
CREATE TEMP TABLE IF NOT EXISTS variable AS SELECT '2002' AS _year; --creating the "variable" named "_year" with value "2002"
UPDATE variable SET _year = '2021'; --changing the variable named "_year" assigning "new" value "2021"
SELECT _year FROM variable; --viewing the variable
SELECT 'TEST', (SELECT _year FROM variable) AS _year; --using the variable
SELECT taxyr FROM owndat WHERE taxyr = (SELECT _year FROM variable); --another example of using the variable
SELECT DISTINCT taxyr FROM owndat WHERE taxyr IN ('2022',(SELECT _year FROM variable)); --another example of using the variable
DROP TABLE IF EXISTS variable; --releasing the "variable" if needed to be released
After reading all the answers I prefer something like this:
select *
from table, (select 'name' as name) const
where table.name = const.name
Try using binding values. You cannot use variables as you do in T-SQL, but you can use "parameters". I hope the following link is useful: Binding Values.
I found one solution for assigning a variable to a COLUMN or TABLE name (string concatenation for the identifier, a bound parameter for the value):
import sqlite3

conn = sqlite3.connect('database.db')
cursor = conn.cursor()
z = "Cash_payers"       # the value to search for: customers who pay cash
sorgu_y = "Customers"   # the column name
query1 = "SELECT * FROM Table_1 WHERE " + sorgu_y + " LIKE ? "
print(query1)
cursor.execute(query1, (z,))
Don't forget the space between WHERE and the closing double quote, and between the opening double quote and LIKE.

Load Data Infile - negative decimal truncated (to positive number)

I am having trouble loading decimal data into a database - specifically, my negative numbers are getting truncated, and I can't figure it out.
Here is what my query looks like:
CREATE TABLE IF NOT EXISTS mytable (id INT(12) NOT NULL AUTO_INCREMENT,
    mydecimal DECIMAL(13,2), PRIMARY KEY(id));
LOAD DATA INFILE 'data.dat' INTO TABLE mytable FIELDS TERMINATED BY ';';
And the data.dat that I'm loading:
;000000019.50 ;
;000000029.50-;
;000000049.50 ;
When it completes, it gives me the warning "Data truncated for column 'mydecimal' at row 2." And when I look at the data, it's stored as a positive number. Any ideas how to fix this?
The best way to handle data abnormalities like this in the input file is to load them into a local variable, then set the actual column value based on a transformation of the local variable.
In your case, you can load the strings into a local variable, then either leave it alone or multiply by negative one depending on whether it ends with a minus sign.
Something like this should work for you:
LOAD DATA INFILE 'data.dat'
INTO TABLE mytable FIELDS TERMINATED BY ';'
(id, @mydecimal)
set mydecimal = IF(@mydecimal like '%-', @mydecimal * -1, @mydecimal);
I'm not sure why you're putting the minus sign after the number rather than before it. Does it work when you place the '-' sign at the start of the line?
You can consider this:
CREATE TABLE IF NOT EXISTS mytable (id INT(12) NOT NULL AUTO_INCREMENT,
    mydecimal varchar(255), PRIMARY KEY(id));
LOAD DATA INFILE 'data.dat' INTO TABLE mytable FIELDS TERMINATED BY ';';
update mytable set mydecimal =
    cast(mydecimal as decimal(13,2)) * if(substring(mydecimal, -1)='-', -1, 1);
alter table mytable modify column mydecimal decimal(13,2) signed;