I am trying to extract XML into a table output, separated by rows.
The data is a CLOB field in an Oracle database, as follows:
<emailInfo>
  <recipientList>
    <recipientName>ATS</recipientName>
    <recipientEmailList>
      <emailAddress>wp@act.com.au</emailAddress>
      <statusFlag>F1AC</statusFlag>
    </recipientEmailList>
    <contactEmailList>
      <emailAddress>wp@act.com.au</emailAddress>
      <statusFlag>F1AC</statusFlag>
    </contactEmailList>
    <contactEmailList>
      <emailAddress>wp2@act.com.au</emailAddress>
      <statusFlag>F1AC</statusFlag>
    </contactEmailList>
    <escalationEmailList>
      <emailAddress>pw@wp.com.au</emailAddress>
      <statusFlag>F1AC</statusFlag>
    </escalationEmailList>
  </recipientList>
  <recipientList>
    <recipientName>ERG</recipientName>
    <recipientEmailList>
      <emailAddress>erg@wp.com.au</emailAddress>
      <statusFlag>F1AC</statusFlag>
    </recipientEmailList>
    <contactEmailList>
      <emailAddress>erg@wp.com.au</emailAddress>
      <statusFlag>F1AC</statusFlag>
    </contactEmailList>
    <escalationEmailList>
      <emailAddress>sl@wp.com.au</emailAddress>
      <statusFlag>F1AC</statusFlag>
    </escalationEmailList>
    <escalationEmailList>
      <emailAddress>sl2@wp.com.au</emailAddress>
      <statusFlag>F1AC</statusFlag>
    </escalationEmailList>
  </recipientList>
</emailInfo>
EDIT2: My updated SQL query is as follows:
SELECT t.*, m.*, p.*, l.*
FROM cisadm.F1_ext_lookup_val exval,
XMLTABLE ('/emailInfo/recipientList'
PASSING XMLTYPE (exval.bo_data_area)
COLUMNS recipient_name VARCHAR2 (4000) PATH 'recipientName',
recipient_email_list XMLTYPE PATH '/recipientEmailList',
contact_email_list XMLTYPE PATH '/contactEmailList',
escalation_email_list XMLTYPE PATH '/escalationEmailList') t,
XMLTABLE ('/recipientEmailList'
PASSING (t.recipient_email_list)
COLUMNS recipient_email_address VARCHAR2 (4000) PATH '/emailAddress',
rec_email_status_flg VARCHAR2 (10) PATH '/statusFlag') m,
XMLTABLE ('/contactEmailList'
PASSING (t.contact_email_list)
COLUMNS contact_email_address VARCHAR2 (4000) PATH 'contactEmailList/emailAddress',
contact_email_status_flg VARCHAR2 (10) PATH 'contactEmailList/statusFlag'
) p,
XMLTABLE('/escalationEmailList'
PASSING (t.escalation_email_list)
COLUMNS esc_email_address VARCHAR2(4000) PATH 'escalationEmailList/emailAddress',
esc_email_status_flg VARCHAR2(10) PATH 'escalationEmailList/statusFlag'
) l
I am trying to provision for the fact that there may be multiple values for each Recipient email list, contact email list, and escalation email list.
Sample output should be:
Any help would be so appreciated!
For future readers, here are general-purpose, open-source programming solutions for migrating XML data from a CLOB field into CSV (tabular) format.
Built around the OP's data needs, these approaches are not tied to any particular RDBMS and hence can be used with other database connections. Additionally, they work around the limitations of SQL, since XPath expressions, arrays, and loops are all available:
Python (using cx_Oracle):
#!/usr/bin/python
import os
import cx_Oracle
import csv
import lxml.etree as ET

# SET DIRECTORY PATH
cd = os.path.dirname(os.path.abspath(__file__))

# DB CONNECTION AND QUERY
db = cx_Oracle.connect("uid/pwd@database")
cur = db.cursor()
# fetchone() returns a tuple; read() materializes the CLOB as a string
clob = cur.execute("SELECT CLOBfield FROM OracleTable").fetchone()[0].read()

# CLOSE CURSOR AND DATABASE
cur.close()
db.close()

# PARSE XML CONTENT
dom = ET.fromstring(clob)

# DEFINE COLUMNS
columns = ['RECIPIENT_NAME', 'RECIPIENT_EMAIL_ADDRESS', 'REC_EMAIL_STATUS_FLG',
           'CONTACT_EMAIL_ADDRESS', 'CONTACT_EMAIL_STATUS_FLG',
           'ESC_EMAIL_ADDRESS', 'ESC_EMAIL_STATUS_FLG']
emailnodes = ['recipientEmailList', 'contactEmailList', 'escalationEmailList']

# OPEN CSV FILE
with open(os.path.join(cd, 'CLOB_Py.csv'), 'w', newline='') as m:
    writer = csv.writer(m)
    writer.writerow(columns)
    nodexpath = dom.xpath('//recipientList')
    for j in range(1, len(nodexpath) + 1):
        dataline = []
        dataline.append(dom.xpath('//recipientList[{0}]/recipientName'.format(j))[0].text)
        for n in emailnodes:
            # EMAILS
            childxpath = dom.xpath('//recipientList[{0}]/{1}[1]/*[1]'.format(j, n))
            for elem in childxpath:
                dataline.append(elem.text)
            if not childxpath:
                dataline.append('')
            # FLAGS
            childxpath = dom.xpath('//recipientList[{0}]/{1}[1]/*[2]'.format(j, n))
            for elem in childxpath:
                dataline.append(elem.text)
            if not childxpath:
                dataline.append('')
        writer.writerow(dataline)
PHP (using PDO Oracle OCI)
// Set directory path
$cd = dirname(__FILE__);

// Open db connection
$db_username = "your_username";
$db_password = "your_password";
$db = "oci:dbname=your_sid";

try {
    $dbh = new PDO($db, $db_username, $db_password);
    $dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    $sql = "SELECT CLOBfield FROM OracleTable";
    $STH = $dbh->query($sql);
    // fetchColumn() returns the CLOB value itself rather than a row array;
    // PDO_OCI may hand back a stream, so materialize it if needed
    $clob = $STH->fetchColumn();
    if (is_resource($clob)) {
        $clob = stream_get_contents($clob);
    }
}
catch(PDOException $e) {
    echo $e->getMessage();
    exit;
}

// Close db connection
$dbh = null;

// Load XML source
$xpath = simplexml_load_string($clob);

// Write column headers
$columns = array('RECIPIENT_NAME', 'RECIPIENT_EMAIL_ADDRESS', 'REC_EMAIL_STATUS_FLG',
                 'CONTACT_EMAIL_ADDRESS', 'CONTACT_EMAIL_STATUS_FLG',
                 'ESC_EMAIL_ADDRESS', 'ESC_EMAIL_STATUS_FLG');
$emailnodes = array('recipientEmailList', 'contactEmailList', 'escalationEmailList');

$fs = fopen($cd.'/CLOB_PHP.csv', 'w');
fputcsv($fs, $columns);
fclose($fs);

// Write data lines
$i = 1;
$values = [];
$node = $xpath->xpath('//recipientList');
foreach ($node as $n) {
    $child = $xpath->xpath('//recipientList['. $i .']/recipientName');
    foreach ($child as $value) {
        $values[] = (string)$value;
    }
    foreach ($emailnodes as $e) {
        // EMAILS
        $child = $xpath->xpath('//recipientList['. $i .']/'. $e .'[1]/*[1]');
        if (count($child) > 0) {
            foreach ($child as $value) {
                $values[] = (string)$value;
            }
        }
        else {
            $values[] = '';
        }
        // FLAGS
        $child = $xpath->xpath('//recipientList['. $i .']/'. $e .'[1]/*[2]');
        if (count($child) > 0) {
            foreach ($child as $value) {
                $values[] = (string)$value;
            }
        }
        else {
            $values[] = '';
        }
    }
    $fs = fopen($cd.'/CLOB_PHP.csv', 'a');
    fputcsv($fs, $values);
    fclose($fs);
    $values = [];
    $i++;
}
R (using ROracle):
library(XML)
library(ROracle)

setwd("C:\\Path\\To\\R\\Script")

# OPEN DATABASE AND QUERY
drv <- dbDriver("Oracle")
conn <- dbConnect(drv, username = "", password = "", dbname = "")
clobdf <- dbGetQuery(conn, "SELECT CLOBfield FROM OracleTable")
dbDisconnect(conn)

# PARSE XML CONTENT OF THE CLOB (asText since the input is a string, not a file)
doc <- xmlParse(clobdf[[1,1]], asText = TRUE)

emailnodes <- c('recipientEmailList', 'contactEmailList', 'escalationEmailList')

# EXTRACT NODE VALUES INTO LISTS
recipientNamesList <- xpathSApply(doc, "//recipientList/recipientName", xmlValue)
for (e in emailnodes){
  assign(e, xpathSApply(doc, paste0("//recipientList/", e, "[1]/*[1]"), xmlValue))
}
for (e in emailnodes){
  assign(paste0(e, "flg"), xpathSApply(doc, paste0("//recipientList/", e, "[1]/*[2]"), xmlValue))
}

# COMBINE LISTS TO DATA FRAME (one row per recipientList node)
n <- length(recipientNamesList)
xmldf <- data.frame(RECIPIENT_NAME = matrix(unlist(recipientNamesList), nrow=n, byrow=TRUE),
                    RECIPIENT_EMAIL_ADDRESS = matrix(unlist(recipientEmailList), nrow=n, byrow=TRUE),
                    REC_EMAIL_STATUS_FLG = matrix(unlist(recipientEmailListflg), nrow=n, byrow=TRUE),
                    CONTACT_EMAIL_ADDRESS = matrix(unlist(contactEmailList), nrow=n, byrow=TRUE),
                    CONTACT_EMAIL_STATUS_FLG = matrix(unlist(contactEmailListflg), nrow=n, byrow=TRUE),
                    ESC_EMAIL_ADDRESS = matrix(unlist(escalationEmailList), nrow=n, byrow=TRUE),
                    ESC_EMAIL_STATUS_FLG = matrix(unlist(escalationEmailListflg), nrow=n, byrow=TRUE))

# OUTPUT TO CSV
write.csv(xmldf, "CLOB_R.csv", na = "", row.names=FALSE)
This query returns the data with one row per recipientList node:
select
extractvalue(s.column_value, '/*/recipientName') as recipient_name,
extractvalue(s.column_value, '/*/recipientEmailList/emailAddress') as recipient_email_address,
extractvalue(s.column_value, '/*/recipientEmailList/statusFlag') as rec_email_status_flg,
extractvalue(s.column_value, '/*/contactEmailList/emailAddress') as contact_email_address,
extractvalue(s.column_value, '/*/contactEmailList/statusFlag') as contact_email_status_flg,
extractvalue(s.column_value, '/*/escalationEmailList/emailAddress') as esc_email_address,
extractvalue(s.column_value, '/*/escalationEmailList/statusFlag') as esc_email_status_flg
from tmp, table(xmlsequence(EXTRACT(XMLTYPE(tmp.bo_data_area), '/emailInfo/recipientList'))) s
and this query extracts each email on a separate line:
select recipient_name, email_address, status_flag
from
(
select
recipient_name,
extractvalue(x.column_value, '/*/emailAddress') as email_address,
extractvalue(x.column_value, '/*/statusFlag') as status_flag
from
(
select
extractvalue(s.column_value, '/*/recipientName') as recipient_name,
EXTRACT(s.column_value, '/*') recipients
from tmp, table(xmlsequence(EXTRACT(XMLTYPE(tmp.bo_data_area), '/emailInfo/recipientList'))) s
) v, table(xmlsequence(EXTRACT(v.recipients, '/*/*'))) x
)
where (email_address is not null or status_flag is not null)
You may try XMLTABLE:
SELECT *
FROM XMLTable('/emailInfo/recipientList' PASSING XMLTYPE('<emailInfo>
<recipientList>
<recipientName>ATS</recipientName>
<recipientEmailList>
<emailAddress>wp@act.com.au</emailAddress>
<statusFlag>F1AC</statusFlag>
</recipientEmailList>
<contactEmailList>
<emailAddress>wp@act.com.au</emailAddress>
<statusFlag>F1AC</statusFlag>
</contactEmailList>
<escalationEmailList>
<emailAddress>pw@wp.com.au</emailAddress>
<statusFlag>F1AC</statusFlag>
</escalationEmailList>
</recipientList>
<recipientList>
<recipientName>ERG</recipientName>
<recipientEmailList>
<emailAddress>erg@wp.com.au</emailAddress>
<statusFlag>F1AC</statusFlag>
</recipientEmailList>
<contactEmailList>
<emailAddress>erg@wp.com.au</emailAddress>
<statusFlag>F1AC</statusFlag>
</contactEmailList>
<escalationEmailList>
<emailAddress>sl@wp.com.au</emailAddress>
<statusFlag>F1AC</statusFlag>
</escalationEmailList>
</recipientList>
</emailInfo>')
COLUMNS recipient_name VARCHAR2(4000) PATH 'recipientName',
recipient_email_address VARCHAR2(4000) PATH 'recipientEmailList/emailAddress',
rec_email_status_flg VARCHAR2(10) PATH 'recipientEmailList/statusFlag',
contact_email_address VARCHAR2(4000) PATH 'contactEmailList/emailAddress',
contact_email_status_flg VARCHAR2(10) PATH 'contactEmailList/statusFlag',
esc_email_address VARCHAR2(4000) PATH 'escalationEmailList/emailAddress',
esc_email_status_flg VARCHAR2(10) PATH 'escalationEmailList/statusFlag'
) t
The same, reading from the table:
SELECT *
FROM tmp,XMLTable('/emailInfo/recipientList' PASSING XMLTYPE(tmp.bo_data_area)
COLUMNS recipient_name VARCHAR2(4000) PATH 'recipientName',
recipient_email_address VARCHAR2(4000) PATH 'recipientEmailList/emailAddress',
rec_email_status_flg VARCHAR2(10) PATH 'recipientEmailList/statusFlag',
contact_email_address VARCHAR2(4000) PATH 'contactEmailList/emailAddress',
contact_email_status_flg VARCHAR2(10) PATH 'contactEmailList/statusFlag',
esc_email_address VARCHAR2(4000) PATH 'escalationEmailList/emailAddress',
esc_email_status_flg VARCHAR2(10) PATH 'escalationEmailList/statusFlag'
) t
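To get one row per email address when a list repeats (as escalationEmailList does for ERG above), the chained-XMLTABLE approach from the question works once the column paths are kept relative. A minimal sketch for the escalation list alone, assuming the same tmp table with its bo_data_area CLOB; the recipient and contact lists chain the same way:
SELECT t.recipient_name, e.esc_email_address, e.esc_email_status_flg
FROM tmp,
     XMLTABLE('/emailInfo/recipientList'
              PASSING XMLTYPE(tmp.bo_data_area)
              COLUMNS recipient_name VARCHAR2(4000) PATH 'recipientName',
                      esc_list       XMLTYPE        PATH 'escalationEmailList') t,
     XMLTABLE('/escalationEmailList'
              PASSING t.esc_list
              COLUMNS esc_email_address    VARCHAR2(4000) PATH 'emailAddress',
                      esc_email_status_flg VARCHAR2(10)   PATH 'statusFlag') e
Note that if a recipient has no escalation entries at all, the inner XMLTABLE yields no rows for it; on 12c and later, an OUTER APPLY around the second XMLTABLE keeps such recipients.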
Context
Currently I am using Snowflake as a Data Warehouse and AWS' S3 as a data lake. The majority of the files that land on S3 are in the Parquet format. For these, I am using a new limited feature by Snowflake (documented here) that automatically detects the schema from the parquet files on S3, which I can use to generate a CREATE TABLE statement with the correct column names and inferred data types. This feature currently only works for Apache Parquet, Avro, and ORC files. I would like to find a way that achieves the same desired objective but for CSV files.
What I have tried to do
This is how I currently infer the schema for Parquet files:
select generate_column_description(array_agg(object_construct(*)), 'table') as columns
from table (infer_schema(location=>'${LOCATION}', file_format=>'${FILE_FORMAT}'))
However, if I try specifying the FILE_FORMAT as CSV, that approach fails.
Other approaches I have considered:
Transferring all files that land on S3 to Parquet (this involves more code and infra setup, so it wouldn't be my top choice, especially as I'd like to keep some files in their native format on S3).
Having a script (using libraries like Pandas in Python, for example) that infers the schema for files in S3 (this also involves more code, and would be odd in that Parquet files are handled in Snowflake but non-Parquet files are handled by a script on AWS).
Using a Snowflake UDF to infer the schema. I haven't fully considered my options there yet.
Desired Behaviour
As a new csv file lands on S3 (on a pre-existing STAGE), I would like to infer the schema, and be able to generate a CREATE TABLE statement with the inferred data types. Preferably, I would like to do that within Snowflake as the existing aforementioned schema-inference solution exists there. Happy to add further information if needed.
UPDATE: I modified the SP that infers data types in untyped (all string type columns) tables and it now works directly against Snowflake stages. The project code is available here: https://github.com/GregPavlik/InferSchema
I wrote a stored procedure to assist with this; however, its only goal is to infer the data types of untyped columns. It works as follows:
Load the CSV into a table with all columns defined as varchars.
Call the SP with a query against the new table (main point is to get only the columns you want and limit the row count to keep type inference times reasonable).
Also in the SP call is the DB, schema, and table for the old and new locations -- old with all varchar and new with the inferred types.
The SP will then infer the data types and create two SQL statements. One statement will create the new table with the inferred data types. One statement will copy from the untyped (all varchar) table to the new table with appropriate wrappers such as try_multi_timestamp(), a UDF that extends try_to_timestamp() to try various common formats.
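try_multi_timestamp() itself ships with the linked project rather than with Snowflake. Purely to illustrate its shape, a minimal sketch of such a UDF might cascade try_to_timestamp() over an assumed (not exhaustive) list of formats:
-- Hypothetical sketch only; the real UDF lives in the linked repo.
create or replace function try_multi_timestamp(s string)
returns timestamp
as
$$
  coalesce(try_to_timestamp(s),
           try_to_timestamp(s, 'YYYY-MM-DD HH24:MI:SS'),
           try_to_timestamp(s, 'MM/DD/YYYY HH24:MI:SS'),
           try_to_timestamp(s, 'DD-MON-YYYY'))
$$;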
I meant to extend this so that it didn't require the untyped (all varchar) table at all, but haven't gotten around to it. Since it's come up here, I may circle back and update the SP with that capability. You can specify a query that reads directly from the stage, but you'd have to use $1, $2... with aliases for the column names (or else the DDL will try to create column names like $1). If the query runs directly against a stage, for the old DB, schema, and table, you could put in whatever because that's only used to generate an insert from select statement.
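For that direct-from-stage variant, the call might look like the following sketch; the stage @my_stage and file format my_csv_format are hypothetical, and the old DB/schema/table arguments are the "whatever" placeholders described above:
call infer_data_types(
  'select $1 as col1, $2 as col2, $3 as col3 from @my_stage/file.csv (file_format => ''my_csv_format'') limit 10000',
  'ANY_DB', 'ANY_SCHEMA', 'ANY_TABLE',
  'TEST', 'PUBLIC', 'MY_TYPED_TABLE');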
-- This shows how to use it on the Snowflake TPCH sample, but it could be any query.
-- Keep the row count down to reduce the time it takes to infer the types.
call infer_data_types('select * from SNOWFLAKE_SAMPLE_DATA.TPCH_SF1.LINEITEM limit 10000',
'SNOWFLAKE_SAMPLE_DATA', 'TPCH_SF1', 'LINEITEM',
'TEST', 'PUBLIC', 'LINEITEM');
create or replace procedure INFER_DATA_TYPES(SOURCE_QUERY string,
DATABASE_OLD string,
SCHEMA_OLD string,
TABLE_OLD string,
DATABASE_NEW string,
SCHEMA_NEW string,
TABLE_NEW string)
returns string
language javascript
as
$$
/****************************************************************************************************
* *
* DataType Classes
* *
****************************************************************************************************/
class Query{
constructor(statement){
this.statement = statement;
}
}
class DataType {
constructor(db, schema, table, column, sourceQuery) {
this.db = db;
this.schema = schema;
this.table = table;
this.sourceQuery = sourceQuery
this.column = column;
this.insert = '"#~COLUMN~#"';
this.totalCount = 0;
this.notNullCount = 0;
this.typeCount = 0;
this.blankCount = 0;
this.minTypeOf = 0.95;
this.minNotNull = 1.00;
}
setSQL(sqlTemplate){
this.sql = sqlTemplate;
this.sql = this.sql.replace(/#~DB~#/g, this.db);
this.sql = this.sql.replace(/#~SCHEMA~#/g, this.schema);
this.sql = this.sql.replace(/#~TABLE~#/g, this.table);
this.sql = this.sql.replace(/#~COLUMN~#/g, this.column);
}
getCounts(){
var rs;
rs = GetResultSet(this.sql);
rs.next();
this.totalCount = rs.getColumnValue("TOTAL_COUNT");
this.notNullCount = rs.getColumnValue("NON_NULL_COUNT");
this.typeCount = rs.getColumnValue("TO_TYPE_COUNT");
this.blankCount = rs.getColumnValue("BLANK");
}
isCorrectType(){
return (this.typeCount / (this.notNullCount - this.blankCount) >= this.minTypeOf);
}
isNotNull(){
return (this.notNullCount / this.totalCount >= this.minNotNull);
}
}
class TimestampType extends DataType{
constructor(db, schema, table, column, sourceQuery){
super(db, schema, table, column, sourceQuery)
this.syntax = "timestamp";
this.insert = 'try_multi_timestamp(trim("#~COLUMN~#"))';
this.sourceQuery = SOURCE_QUERY;
this.setSQL(GetCheckTypeSQL(this.insert, this.sourceQuery));
this.getCounts();
}
}
class IntegerType extends DataType{
constructor(db, schema, table, column, sourceQuery){
super(db, schema, table, column, sourceQuery)
this.syntax = "number(38,0)";
this.insert = 'try_to_number(trim("#~COLUMN~#"), 38, 0)';
this.setSQL(GetCheckTypeSQL(this.insert, this.sourceQuery));
this.getCounts();
}
}
class DoubleType extends DataType{
constructor(db, schema, table, column, sourceQuery){
super(db, schema, table, column, sourceQuery)
this.syntax = "double";
this.insert = 'try_to_double(trim("#~COLUMN~#"))';
this.setSQL(GetCheckTypeSQL(this.insert, this.sourceQuery));
this.getCounts();
}
}
class BooleanType extends DataType{
constructor(db, schema, table, column, sourceQuery){
super(db, schema, table, column, sourceQuery)
this.syntax = "boolean";
this.insert = 'try_to_boolean(trim("#~COLUMN~#"))';
this.setSQL(GetCheckTypeSQL(this.insert, this.sourceQuery));
this.getCounts();
}
}
// Catch all is STRING data type
class StringType extends DataType{
constructor(db, schema, table, column, sourceQuery){
super(db, schema, table, column, sourceQuery)
this.syntax = "string";
this.totalCount = 1;
this.notNullCount = 0;
this.typeCount = 1;
this.minTypeOf = 0;
this.minNotNull = 1;
}
}
/****************************************************************************************************
* *
* Main function *
* *
****************************************************************************************************/
var pass = 0;
var column;
var typeOf;
var ins = '';
var newTableDDL = '';
var insertDML = '';
var columnRS = GetResultSet(GetTableColumnsSQL(DATABASE_OLD, SCHEMA_OLD, TABLE_OLD));
while (columnRS.next()){
pass++;
if(pass > 1){
newTableDDL += ",\n";
insertDML += ",\n";
}
column = columnRS.getColumnValue("COLUMN_NAME");
typeOf = InferDataType(DATABASE_OLD, SCHEMA_OLD, TABLE_OLD, column, SOURCE_QUERY);
newTableDDL += '"' + typeOf.column + '" ' + typeOf.syntax;
ins = typeOf.insert;
insertDML += ins.replace(/#~COLUMN~#/g, typeOf.column);
}
return GetOpeningComments() +
GetDDLPrefixSQL(DATABASE_NEW, SCHEMA_NEW, TABLE_NEW) +
newTableDDL +
GetDDLSuffixSQL() +
GetDividerSQL() +
GetInsertPrefixSQL(DATABASE_NEW, SCHEMA_NEW, TABLE_NEW) +
insertDML +
GetInsertSuffixSQL(DATABASE_OLD, SCHEMA_OLD, TABLE_OLD) ;
/****************************************************************************************************
* *
* Helper functions *
* *
****************************************************************************************************/
function InferDataType(db, schema, table, column, sourceQuery){
var typeOf;
typeOf = new IntegerType(db, schema, table, column, sourceQuery);
if (typeOf.isCorrectType()) return typeOf;
typeOf = new DoubleType(db, schema, table, column, sourceQuery);
if (typeOf.isCorrectType()) return typeOf;
typeOf = new BooleanType(db, schema, table, column, sourceQuery); // May want to do a distinct and look for two values
if (typeOf.isCorrectType()) return typeOf;
typeOf = new TimestampType(db, schema, table, column, sourceQuery);
if (typeOf.isCorrectType()) return typeOf;
typeOf = new StringType(db, schema, table, column, sourceQuery);
if (typeOf.isCorrectType()) return typeOf;
return null;
}
/****************************************************************************************************
* *
* SQL Template Functions *
* *
****************************************************************************************************/
function GetCheckTypeSQL(insert, sourceQuery){
var sql =
`
select count(1) as TOTAL_COUNT,
count("#~COLUMN~#") as NON_NULL_COUNT,
count(${insert}) as TO_TYPE_COUNT,
sum(iff(trim("#~COLUMN~#")='', 1, 0)) as BLANK
--from "#~DB~#"."#~SCHEMA~#"."#~TABLE~#";
from (${sourceQuery})
`;
return sql;
}
function GetTableColumnsSQL(dbName, schemaName, tableName){
var sql =
`
select COLUMN_NAME
from ${dbName}.INFORMATION_SCHEMA.COLUMNS
where TABLE_CATALOG = '${dbName}' and
TABLE_SCHEMA = '${schemaName}' and
TABLE_NAME = '${tableName}'
order by ORDINAL_POSITION;
`;
return sql;
}
function GetOpeningComments(){
return `
/**************************************************************************************************************
* *
* Copy and paste into a worksheet to create the typed table and insert into the new table from the old one. *
* *
**************************************************************************************************************/
`;
}
function GetDDLPrefixSQL(db, schema, table){
var sql =
`
create or replace table "${db}"."${schema}"."${table}"
(
`;
return sql;
}
function GetDDLSuffixSQL(){
return "\n);";
}
function GetDividerSQL(){
return `
/**************************************************************************************************************
* *
* The SQL statement below this attempts to copy all rows from the string table to the typed table.          *
* *
**************************************************************************************************************/
`;
}
function GetInsertPrefixSQL(db, schema, table){
var sql =
`\ninsert into "${db}"."${schema}"."${table}" select\n`;
return sql;
}
function GetInsertSuffixSQL(db, schema, table){
var sql =
`\nfrom "${db}"."${schema}"."${table}" ;`;
return sql;
}
//function GetInsertSuffixSQL(db, schema, table){
//var sql = '\nfrom "${db}"."${schema}"."${table}";';
//return sql;
//}
/****************************************************************************************************
* *
* SQL functions *
* *
****************************************************************************************************/
function GetResultSet(sql){
cmd1 = {sqlText: sql};
stmt = snowflake.createStatement(cmd1);
var rs;
rs = stmt.execute();
return rs;
}
function ExecuteNonQuery(queryString) {
var out = '';
cmd1 = {sqlText: queryString};
stmt = snowflake.createStatement(cmd1);
var rs;
rs = stmt.execute();
}
function ExecuteSingleValueQuery(columnName, queryString) {
var out;
cmd1 = {sqlText: queryString};
stmt = snowflake.createStatement(cmd1);
var rs;
try{
rs = stmt.execute();
rs.next();
return rs.getColumnValue(columnName);
}
catch(err) {
if (err.message.substring(0, 18) == "ResultSet is empty"){
throw "ERROR: No rows returned in query.";
} else {
throw "ERROR: " + err.message.replace(/\n/g, " ");
}
}
return out;
}
function ExecuteFirstValueQuery(queryString) {
var out;
cmd1 = {sqlText: queryString};
stmt = snowflake.createStatement(cmd1);
var rs;
try{
rs = stmt.execute();
rs.next();
return rs.getColumnValue(1);
}
catch(err) {
if (err.message.substring(0, 18) == "ResultSet is empty"){
throw "ERROR: No rows returned in query.";
} else {
throw "ERROR: " + err.message.replace(/\n/g, " ");
}
}
return out;
}
function getQuery(sql){
var cmd = {sqlText: sql};
var query = new Query(snowflake.createStatement(cmd));
try {
query.resultSet = query.statement.execute();
} catch (err) {
throw "ERROR: " + err.message.replace(/\n/g, " ");
}
return query;
}
$$;
Have you tried STAGES?
Create 2 stages: one with no header and the other with a header; see the examples below.
Then a bit of SQL and voila, your DDL.
Only issue: you need to know the number of columns in order to put in the correct number of t.$'s.
If someone could automate that, we'd have an almost automatic DDL generator for CSVs.
Obviously, once you have the SQL statement, just add the CREATE OR REPLACE TABLE to the front (see the sketch after the statements below) and your table is nicely created with all the names from the CSV.
:-)
create or replace stage CSV_NO_HEADER
URL = 's3://xxx-x-dev-landing/xxx/'
STORAGE_INTEGRATION = "xxxLAKE_DEV_S3_INTEGRATION"
FILE_FORMAT = ( TYPE = CSV SKIP_HEADER = 1 FIELD_OPTIONALLY_ENCLOSED_BY = '"' );
create or replace stage CSV
URL = 's3://xxx-xxxlake-dev-landing/xxx/'
STORAGE_INTEGRATION = "xxxLAKE_DEV_S3_INTEGRATION"
FILE_FORMAT = ( TYPE = CSV FIELD_OPTIONALLY_ENCLOSED_BY = '"' );
select concat('select t.$1 ', t.$1, ',t.$2 ', t.$2, ',t.$3 ', t.$3, ',t.$4 ', t.$4, ',t.$5 ', t.$5, ',t.$6 ', t.$6, ',t.$7 ', t.$7, ',t.$8 ', t.$8, ',t.$9 ', t.$9,
',t.$10 ', t.$10, ',t.$11 ', t.$11, ',t.$12 ', t.$12, ',t.$13 ', t.$13, ',t.$14 ', t.$14, ',t.$15 ', t.$15, ',t.$16 ', t.$16, ',t.$17 ', t.$17, ' from @xxxx_NO_HEADER/SUB_TRANSACTION_20201204.csv t') from
--- CHANGE TABLE ---
@xxx/SUB_TRANSACTION_20201204.csv t limit 1;
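As a sketch of that final step (the two column aliases here are invented for illustration; a real run would paste the full generated list from the query above):
create or replace table SUB_TRANSACTION as
select t.$1 ID, t.$2 AMOUNT  -- ...remaining generated columns...
from @xxxx_NO_HEADER/SUB_TRANSACTION_20201204.csv t;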
I have a problem: I need to delete a column from my SQLite database. I wrote this query
alter table table_name drop column column_name
but it does not work. Please help me.
Update: SQLite 2021-03-12 (3.35.0) now supports DROP COLUMN. The FAQ on the website is still outdated.
From: http://www.sqlite.org/faq.html:
(11) How do I add or delete columns from an existing table in SQLite.
SQLite has limited ALTER TABLE support that you can use to add a
column to the end of a table or to change the name of a table. If you
want to make more complex changes in the structure of a table, you
will have to recreate the table. You can save existing data to a
temporary table, drop the old table, create the new table, then copy
the data back in from the temporary table.
For example, suppose you have a table named "t1" with columns names
"a", "b", and "c" and that you want to delete column "c" from this
table. The following steps illustrate how this could be done:
BEGIN TRANSACTION;
CREATE TEMPORARY TABLE t1_backup(a,b);
INSERT INTO t1_backup SELECT a,b FROM t1;
DROP TABLE t1;
CREATE TABLE t1(a,b);
INSERT INTO t1 SELECT a,b FROM t1_backup;
DROP TABLE t1_backup;
COMMIT;
Instead of dropping the backup table, just rename it...
BEGIN TRANSACTION;
CREATE TABLE t1_backup(a,b);
INSERT INTO t1_backup SELECT a,b FROM t1;
DROP TABLE t1;
ALTER TABLE t1_backup RENAME TO t1;
COMMIT;
For simplicity, why not create the backup table from the select statement?
CREATE TABLE t1_backup AS SELECT a, b FROM t1;
DROP TABLE t1;
ALTER TABLE t1_backup RENAME TO t1;
This option works only if you can open the DB in a DB browser like DB Browser for SQLite.
In DB Browser for SQLite:
Go to the tab "Database Structure"
Select your table
Select "Modify Table" (just under the tabs)
Select the column you want to delete
Click on "Remove Field" and click OK
=> Create a new table directly with the following query:
CREATE TABLE table_name (Column_1 TEXT, Column_2 TEXT);
=> Now copy the data into table_name from existing_table with the following query:
INSERT INTO table_name (Column_1, Column_2) SELECT Column_1, Column_2 FROM existing_table;
=> Now drop existing_table with the following query:
DROP TABLE existing_table;
PRAGMA foreign_keys=off;
BEGIN TRANSACTION;
ALTER TABLE table1 RENAME TO _table1_old;
CREATE TABLE table1 (
column1 datatype [ NULL | NOT NULL ],
column2 datatype [ NULL | NOT NULL ],
...
);
INSERT INTO table1 (column1, column2, ... column_n)
SELECT column1, column2, ... column_n
FROM _table1_old;
COMMIT;
PRAGMA foreign_keys=on;
For more info:
https://www.techonthenet.com/sqlite/tables/alter_table.php
I've made a Python function where you enter the table and column to remove as arguments:
def removeColumn(table, column):
    # Collect the remaining column names via PRAGMA table_info
    columns = []
    for row in c.execute('PRAGMA table_info(' + table + ')'):
        columns.append(row[1])
    columns.remove(column)
    columns = ", ".join(columns)
    # Rebuild the table without the dropped column
    c.execute('CREATE TABLE temptable AS SELECT ' + columns + ' FROM ' + table)
    c.execute('DROP TABLE ' + table)
    c.execute('ALTER TABLE temptable RENAME TO ' + table)
    conn.commit()
As per the info in Duda's and MeBigFatGuy's answers, this won't work if there is a foreign key on the table, but that can be fixed with two lines of code (creating a new table and not just renaming the temporary table).
For SQLite3 in C++:
void GetTableColNames( tstring sTableName , std::vector<tstring> *pvsCols )
{
UASSERT(pvsCols);
CppSQLite3Table table1;
tstring sDML = StringOps::std_sprintf(_T("SELECT * FROM %s") , sTableName.c_str() );
table1 = getTable( StringOps::tstringToUTF8string(sDML).c_str() );
for ( int nCol = 0 ; nCol < table1.numFields() ; nCol++ )
{
const char* pch1 = table1.fieldName(nCol);
pvsCols->push_back( StringOps::UTF8charTo_tstring(pch1));
}
}
bool ColExists( tstring sColName )
{
bool bColExists = true;
try
{
tstring sQuery = StringOps::std_sprintf(_T("SELECT %s FROM MyOriginalTable LIMIT 1;") , sColName.c_str() );
ShowVerbalMessages(false);
CppSQLite3Query q = execQuery( StringOps::tstringTo_stdString(sQuery).c_str() );
ShowVerbalMessages(true);
}
catch (CppSQLite3Exception& e)
{
bColExists = false;
}
return bColExists;
}
void DeleteColumns( std::vector<tstring> *pvsColsToDelete )
{
UASSERT(pvsColsToDelete);
execDML( StringOps::tstringTo_stdString(_T("begin transaction;")).c_str() );
std::vector<tstring> vsCols;
GetTableColNames( _T("MyOriginalTable") , &vsCols );
CreateFields( _T("TempTable1") , false );
tstring sFieldNamesSeperatedByCommas;
for ( int nCol = 0 ; nCol < vsCols.size() ; nCol++ )
{
tstring sColNameCurr = vsCols.at(nCol);
bool bUseCol = true;
for ( int nColsToDelete = 0; nColsToDelete < pvsColsToDelete->size() ; nColsToDelete++ )
{
if ( pvsColsToDelete->at(nColsToDelete) == sColNameCurr )
{
bUseCol = false;
break;
}
}
if ( bUseCol )
sFieldNamesSeperatedByCommas+= (sColNameCurr + _T(","));
}
if ( sFieldNamesSeperatedByCommas.at( int(sFieldNamesSeperatedByCommas.size()) - 1) == _T(','))
sFieldNamesSeperatedByCommas.erase( int(sFieldNamesSeperatedByCommas.size()) - 1 );
tstring sDML;
sDML = StringOps::std_sprintf(_T("insert into TempTable1 SELECT %s FROM MyOriginalTable;\n") , sFieldNamesSeperatedByCommas.c_str() );
execDML( StringOps::tstringTo_stdString(sDML).c_str() );
sDML = StringOps::std_sprintf(_T("ALTER TABLE MyOriginalTable RENAME TO MyOriginalTable_old\n") );
execDML( StringOps::tstringTo_stdString(sDML).c_str() );
sDML = StringOps::std_sprintf(_T("ALTER TABLE TempTable1 RENAME TO MyOriginalTable\n") );
execDML( StringOps::tstringTo_stdString(sDML).c_str() );
sDML = ( _T("DROP TABLE MyOriginalTable_old;") );
execDML( StringOps::tstringTo_stdString(sDML).c_str() );
execDML( StringOps::tstringTo_stdString(_T("commit transaction;")).c_str() );
}
In case anyone needs a (nearly) ready-to-use PHP function, the following is based on this answer:
/**
 * Remove a column from a table.
 *
 * @param string $tableName  The table to remove the column from.
 * @param string $columnName The column to remove from the table.
 */
public function DropTableColumn($tableName, $columnName)
{
// --
// Determine all columns except the one to remove.
$columnNames = array();
$statement = $pdo->prepare("PRAGMA table_info($tableName);");
$statement->execute(array());
$rows = $statement->fetchAll(PDO::FETCH_OBJ);
$hasColumn = false;
foreach ($rows as $row)
{
if(strtolower($row->name) !== strtolower($columnName))
{
array_push($columnNames, $row->name);
}
else
{
$hasColumn = true;
}
}
// Column does not exist in table, no need to do anything.
if ( !$hasColumn ) return;
// --
// Actually execute the SQL.
$columns = implode('`,`', $columnNames);
$statement = $pdo->exec(
"CREATE TABLE `t1_backup` AS SELECT `$columns` FROM `$tableName`;
DROP TABLE `$tableName`;
ALTER TABLE `t1_backup` RENAME TO `$tableName`;");
}
In contrast to other answers, the SQL used in this approach seems to preserve the data types of the columns, whereas something like the accepted answer seems to result in all columns being of type TEXT.
Update 1:
The SQL used has the drawback that autoincrement columns are not preserved.
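A way around both drawbacks (TEXT-only columns from CREATE TABLE ... AS SELECT, and lost AUTOINCREMENT) is to declare the backup table explicitly. A sketch, assuming a hypothetical table t1(a INTEGER PRIMARY KEY AUTOINCREMENT, b TEXT, c TEXT) from which c is dropped:
BEGIN TRANSACTION;
-- Re-declare the surviving columns with their full definitions.
CREATE TABLE t1_backup(a INTEGER PRIMARY KEY AUTOINCREMENT, b TEXT);
INSERT INTO t1_backup(a, b) SELECT a, b FROM t1;
DROP TABLE t1;
ALTER TABLE t1_backup RENAME TO t1;
COMMIT;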
Just in case it could help someone like me: based on the official website and the accepted answer, I wrote C# code using the System.Data.SQLite NuGet package.
This code also preserves the primary key and foreign keys.
CODE in C#:
CODE in C#:
void RemoveColumnFromSqlite (string tableName, string columnToRemove) {
try {
var mSqliteDbConnection = new SQLiteConnection ("Data Source=db_folder\\MySqliteBasedApp.db;Version=3;Page Size=1024;");
mSqliteDbConnection.Open ();
// Reads all columns definitions from table
List<string> columnDefinition = new List<string> ();
var mSql = $"SELECT type, sql FROM sqlite_master WHERE tbl_name='{tableName}'";
var mSqliteCommand = new SQLiteCommand (mSql, mSqliteDbConnection);
string sqlScript = "";
using (var mSqliteReader = mSqliteCommand.ExecuteReader ()) {
while (mSqliteReader.Read ()) {
sqlScript = mSqliteReader["sql"].ToString ();
break;
}
}
if (!string.IsNullOrEmpty (sqlScript)) {
// Gets string within first '(' and last ')' characters
int firstIndex = sqlScript.IndexOf ("(");
int lastIndex = sqlScript.LastIndexOf (")");
if (firstIndex >= 0 && lastIndex <= sqlScript.Length - 1) {
sqlScript = sqlScript.Substring (firstIndex, lastIndex - firstIndex + 1);
}
string[] scriptParts = sqlScript.Split (new string[] { "," }, StringSplitOptions.RemoveEmptyEntries);
foreach (string s in scriptParts) {
if (!s.Contains (columnToRemove)) {
columnDefinition.Add (s);
}
}
}
string columnDefinitionString = string.Join (",", columnDefinition);
// Reads all columns from table
List<string> columns = new List<string> ();
mSql = $"PRAGMA table_info({tableName})";
mSqliteCommand = new SQLiteCommand (mSql, mSqliteDbConnection);
using (var mSqliteReader = mSqliteCommand.ExecuteReader ()) {
while (mSqliteReader.Read ()) columns.Add (mSqliteReader["name"].ToString ());
}
columns.Remove (columnToRemove);
string columnString = string.Join (",", columns);
mSql = "PRAGMA foreign_keys=OFF";
mSqliteCommand = new SQLiteCommand (mSql, mSqliteDbConnection);
int n = mSqliteCommand.ExecuteNonQuery ();
// Removes a column from the table
using (SQLiteTransaction tr = mSqliteDbConnection.BeginTransaction ()) {
using (SQLiteCommand cmd = mSqliteDbConnection.CreateCommand ()) {
cmd.Transaction = tr;
string query = $"CREATE TEMPORARY TABLE {tableName}_backup {columnDefinitionString}";
cmd.CommandText = query;
cmd.ExecuteNonQuery ();
cmd.CommandText = $"INSERT INTO {tableName}_backup SELECT {columnString} FROM {tableName}";
cmd.ExecuteNonQuery ();
cmd.CommandText = $"DROP TABLE {tableName}";
cmd.ExecuteNonQuery ();
cmd.CommandText = $"CREATE TABLE {tableName} {columnDefinitionString}";
cmd.ExecuteNonQuery ();
cmd.CommandText = $"INSERT INTO {tableName} SELECT {columnString} FROM {tableName}_backup;";
cmd.ExecuteNonQuery ();
cmd.CommandText = $"DROP TABLE {tableName}_backup";
cmd.ExecuteNonQuery ();
}
tr.Commit ();
}
mSql = "PRAGMA foreign_keys=ON";
mSqliteCommand = new SQLiteCommand (mSql, mSqliteDbConnection);
n = mSqliteCommand.ExecuteNonQuery ();
} catch (Exception ex) {
HandleExceptions (ex);
}
}
In Python 3.8...
Preserves primary key and column types.
Takes 3 inputs:
a sqlite cursor: db_cur,
a table name: t, and
a list of columns to junk: columns_to_junk
def removeColumns(db_cur, t, columns_to_junk):
    # Obtain column information
    sql = "PRAGMA table_info(" + t + ")"
    record = query(db_cur, sql)
    # Initialize two strings: one for column names + column types and one just
    # for column names
    cols_w_types = "("
    cols = ""
    # Build the strings, filtering for the columns to throw out
    for r in record:
        if r[1] not in columns_to_junk:
            if r[5] == 0:
                cols_w_types += r[1] + " " + r[2] + ","
            if r[5] == 1:
                cols_w_types += r[1] + " " + r[2] + " PRIMARY KEY,"
            cols += r[1] + ","
    # Cut potentially trailing commas
    if cols_w_types[-1] == ",":
        cols_w_types = cols_w_types[:-1]
    if cols[-1] == ",":
        cols = cols[:-1]
    # Execute SQL
    sql = "CREATE TEMPORARY TABLE xfer " + cols_w_types + ")"
    db_cur.execute(sql)
    sql = "INSERT INTO xfer SELECT " + cols + " FROM " + t
    db_cur.execute(sql)
    sql = "DROP TABLE " + t
    db_cur.execute(sql)
    sql = "CREATE TABLE " + t + cols_w_types + ")"
    db_cur.execute(sql)
    sql = "INSERT INTO " + t + " SELECT " + cols + " FROM xfer"
    db_cur.execute(sql)
You'll find a reference to a query() function. Just a helper...
Takes two inputs:
a sqlite cursor: db_cur and
the query string: query
def query(db_cur, query):
    r = db_cur.execute(query).fetchall()
    return r
Don't forget to include a "commit()"!
What's the best way to incrementally build an XML document/string using PL/pgSQL? Consider the following desired XML output:
<Directory>
<Person>
<Name>Bob</Name>
<Address>1234 Main St</Address>
<MagicalAddressFactor1>3</MagicalAddressFactor1>
<MagicalAddressFactor2>8</MagicalAddressFactor2>
<MagicalAddressFactor3>1</MagicalAddressFactor3>
<IsMagicalAddress>Y</IsMagicalAddress>
</Person>
<Person>
<Name>Joshua</Name>
<Address>100 Broadway Blvd</Address>
<MagicalAddressFactor1>2</MagicalAddressFactor1>
<MagicalAddressFactor2>1</MagicalAddressFactor2>
<MagicalAddressFactor3>4</MagicalAddressFactor3>
<IsMagicalAddress>Y</IsMagicalAddress>
</Person>
</Directory>
Where:
Person name and address is based on a simple person table.
MagicalAddressFactor 1, 2, and 3 are all based on some complex links and calculations to other tables from the Person table.
IsMagicalAddress is based on the sum of the three MagicalAddressFactors being greater than 10.
How could I generate this with PL/pgSQL using XML functions to ensure a well-formed XML element? Without using XML functions the code would look like this:
DECLARE
v_sql text;
v_rec RECORD;
v_XML xml;
v_factor1 integer;
v_factor2 integer;
v_factor3 integer;
v_IsMagical varchar;
BEGIN
v_XML := '<Directory>';
v_sql := 'select * from person;'
FOR v_rec IN v_sql LOOP
v_XML := v_XML || '<Name>' || v_rec.name || '</Name>' ||
'<Address>' || v_rec.Address || '</Address>';
v_factor1 := get_factor_1(v_rec);
v_factor2 := get_factor_2(v_rec);
v_factor3 := get_factor_3(v_rec);
v_IsMagical := case
when (v_factor1 + v_factor2 + v_factor3) > 10 then
'Y'
else
'N'
end;
v_XML := v_XML || '<MagicalAddressFactor1>' || v_factor1 || '</MagicalAddressFactor1>' ||
'<MagicalAddressFactor2>' || v_factor2 || '</MagicalAddressFactor2>' ||
'<MagicalAddressFactor3>' || v_factor3 || '</MagicalAddressFactor3>' ||
'<IsMagicalAddress>' || v_IsMagical || '</IsMagicalAddress>';
v_XML := v_XML || '</Person>'
END LOOP;
v_XML := v_XML || '</Directory>'
END;
For the OP and future readers, consider a general-purpose language whenever you need to migrate database content to XML documents: simply connect via a database driver, retrieve the query results, and write out the XML document. For the OP's needs, the calculations can be incorporated into one select query (or a stored procedure that returns a resultset) and the host language can import the records for document building.
Below are open-source solutions, including Java, each connecting with the corresponding PostgreSQL driver (which requires installation). The SQL queries assume get_factor_1(), get_factor_2(), and get_factor_3() are inline database functions and that Persons maintains a unique ID in its first column.
Java (using the PostgreSQL JDBC driver)
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.OutputKeys;
import java.sql.* ;
import java.util.Properties;
import java.io.IOException;
import java.io.File;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
public class SQLtoXML {
public static void main(String[] args) {
String currentDir = new File("").getAbsolutePath();
try {
String url = "jdbc:postgresql://localhost/test";
Properties props = new Properties();
props.setProperty("user","sqluser");
props.setProperty("password","secret");
props.setProperty("ssl","true");
Connection conn = DriverManager.getConnection(url, props);
String url = "jdbc:postgresql://localhost/test?user=sqlduser&password=secret&ssl=true";
Connection conn = DriverManager.getConnection(url);
Statement stmt = conn.createStatement();
ResultSet rs = stmt.executeQuery("SELECT name, address, "
+ "get_factor_1(v_rec) As v_factor1, "
+ "get_factor_2(v_rec) As v_factor2, "
+ "get_factor_3(v_rec) As v_factor3, "
+ " CASE WHEN (get_factor_1(v_rec) + "
+ " get_factor_2(v_rec) + "
+ " get_factor_3(v_rec)) > 10 "
+ " THEN 'Y' ELSE 'N' END As v_isMagical "
+ " FROM Persons;");
// Write to XML document
DocumentBuilderFactory docFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = docFactory.newDocumentBuilder();
Document doc = docBuilder.newDocument();
// Root element
Element rootElement = doc.createElement("Directory");
doc.appendChild(rootElement);
// Export table data
ResultSetMetaData rsmd = rs.getMetaData();
int columnsNumber = rsmd.getColumnCount();
while (rs.next()) {
// Data rows (columns 1-6: name, address, three factors, flag)
Element personNode = doc.createElement("Person");
rootElement.appendChild(personNode);
Element nameNode = doc.createElement("Name");
nameNode.appendChild(doc.createTextNode(rs.getString(1)));
personNode.appendChild(nameNode);
Element addressNode = doc.createElement("Address");
addressNode.appendChild(doc.createTextNode(rs.getString(2)));
personNode.appendChild(addressNode);
Element magicaladd1Node = doc.createElement("MagicalAddressFactor1");
magicaladd1Node.appendChild(doc.createTextNode(rs.getString(3)));
personNode.appendChild(magicaladd1Node);
Element magicaladd2Node = doc.createElement("MagicalAddressFactor2");
magicaladd2Node.appendChild(doc.createTextNode(rs.getString(4)));
personNode.appendChild(magicaladd2Node);
Element magicaladd3Node = doc.createElement("MagicalAddressFactor3");
magicaladd3Node.appendChild(doc.createTextNode(rs.getString(5)));
personNode.appendChild(magicaladd3Node);
Element isMagicalNode = doc.createElement("IsMagicalAddress");
isMagicalNode.appendChild(doc.createTextNode(rs.getString(6)));
personNode.appendChild(isMagicalNode);
}
rs.close();
stmt.close();
conn.close();
// Output content to xml file
TransformerFactory transformerFactory = TransformerFactory.newInstance();
Transformer transformer = transformerFactory.newTransformer();
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");
DOMSource source = new DOMSource(doc);
StreamResult result = new StreamResult(new File(currentDir + "\\PostgreXML_java.xml"));
transformer.transform(source, result);
System.out.println("Successfully created xml file!");
} catch (ParserConfigurationException pce) {
System.out.println(pce.getMessage());
} catch (TransformerException tfe) {
System.out.println(tfe.getMessage());
} catch (SQLException err) {
System.out.println(err.getMessage());
}
}
}
Python (using the psycopg2 module)
import psycopg2
import os
import lxml.etree as ET
cd = os.path.dirname(os.path.abspath(__file__))
# DB CONNECTION AND QUERY
db = psycopg2.connect("dbname=test user=postgres")
cur = db.cursor()
cur.execute("SELECT name, address, \
get_factor_1(v_rec) As v_factor1, \
get_factor_2(v_rec) As v_factor2, \
get_factor_3(v_rec) As v_factor3, \
CASE WHEN (get_factor_1(v_rec) + \
get_factor_2(v_rec) + \
get_factor_3(v_rec)) > 10 \
THEN 'Y' ELSE 'N' END As v_isMagical \
FROM Persons;")
# WRITING XML FILE
root = ET.Element('Directory')
for row in cur.fetchall():
    personNode = ET.SubElement(root, "Person")
    ET.SubElement(personNode, "Name").text = row[0]
    ET.SubElement(personNode, "Address").text = row[1]
    ET.SubElement(personNode, "MagicalAddressFactor1").text = str(row[2])
    ET.SubElement(personNode, "MagicalAddressFactor2").text = str(row[3])
    ET.SubElement(personNode, "MagicalAddressFactor3").text = str(row[4])
    ET.SubElement(personNode, "IsMagicalAddress").text = row[5]
# CLOSE CURSOR AND DATABASE
cur.close()
db.close()
# OUTPUT XML
tree_out = (ET.tostring(root, pretty_print=True, xml_declaration=True, encoding="UTF-8"))
xmlfile = open(os.path.join(cd, 'PostgreXML_py.xml'),'wb')
xmlfile.write(tree_out)
xmlfile.close()
print("Successfully migrated SQL to XML data!")
PHP (using the PostgreSQL PDO driver)
<?php
$cd = dirname(__FILE__);
// create a dom document with encoding utf8
$domtree = new DOMDocument('1.0', 'UTF-8');
$domtree->formatOutput = true;
$domtree->preserveWhiteSpace = false;
# Opening db connection
$host = "localhost";
$dbname = "test";
$dbuser = "*****";
$dbpass = "*****";
try {
$dbh = new PDO("pgsql:dbname=$dbname;host=$host", $dbuser, $dbpass);
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$sql = "SELECT name, address,
get_factor_1(v_rec) As v_factor1,
get_factor_2(v_rec) As v_factor2,
get_factor_3(v_rec) As v_factor3,
CASE WHEN (get_factor_1(v_rec) +
get_factor_2(v_rec) +
get_factor_3(v_rec)) > 10
THEN 'Y' ELSE 'N' END As v_isMagical
FROM Persons;";
$STH = $dbh->query($sql);
$STH->setFetchMode(PDO::FETCH_ASSOC);
}
catch(PDOException $e) {
echo $e->getMessage();
exit;
}
/* create the root element of the xml tree */
$xmlRoot = $domtree->createElement("Directory");
$xmlRoot = $domtree->appendChild($xmlRoot);
/* loop query results through child elements */
while($row = $STH->fetch()) {
$personNode = $xmlRoot->appendChild($domtree->createElement('Person'));
$nameNode = $personNode->appendChild($domtree->createElement('Name', $row['name']));
$addNode = $personNode->appendChild($domtree->createElement('Address', $row['address']));
$magadd1Node = $personNode->appendChild($domtree->createElement('MagicalAddressFactor1', $row['v_factor1']));
$magadd2Node = $personNode->appendChild($domtree->createElement('MagicalAddressFactor2', $row['v_factor2']));
$magadd3Node = $personNode->appendChild($domtree->createElement('MagicalAddressFactor3', $row['v_factor3']));
$ismagicalNode = $personNode->appendChild($domtree->createElement('IsMagicalAddress', $row['v_isMagical']));
}
file_put_contents($cd. "/PostgreXML_php.xml", $domtree->saveXML());
echo "\nSuccessfully migrated SQL data into XML!\n";
# Closing db connection
$dbh = null;
exit;
?>
R (using the RPostgreSQL package)
library(RPostgreSQL)
library(XML)
#setwd("C:/path/to/working/folder")
# OPEN DATABASE AND QUERY
drv <- dbDriver("PostgreSQL")
conn <- dbConnect(drv, dbname="tempdb")
df <- dbGetQuery(conn, "SELECT name, address,
get_factor_1(v_rec) As v_factor1,
get_factor_2(v_rec) As v_factor2,
get_factor_3(v_rec) As v_factor3,
CASE WHEN (get_factor_1(v_rec) +
get_factor_2(v_rec) +
get_factor_3(v_rec)) > 10
THEN 'Y' ELSE 'N' END As v_isMagical
FROM Persons;")
dbDisconnect(conn)
# CREATE XML FILE
doc = newXMLDoc()
root = newXMLNode("Directory", doc = doc)
# WRITE XML NODES AND DATA
for (i in 1:nrow(df)){
personNode = newXMLNode("Person", parent = root)
nameNode = newXMLNode("name", df$name[i], parent = personNode)
addressNode = newXMLNode("address", df$address[i], parent = personNode)
magicaladdress1Node = newXMLNode("MagicalAddressFactor1", df$v_factor1[i], parent = personNode)
magicaladdress2Node = newXMLNode("MagicalAddressFactor2", df$v_factor2[i], parent = personNode)
magicaladdress3Node = newXMLNode("MagicalAddressFactor3", df$v_factor3[i], parent = personNode)
ismagicalNode = newXMLNode("IsMagicalAddress", df$v_isMagical[i], parent = personNode)
}
# OUTPUT XML CONTENT TO FILE
saveXML(doc, file="PostgreXML_R.xml")
print("Successfully migrated SQL to XML data!")
Your code has three issues:
FOR IN variable LOOP doesn't work - if you really need dynamic SQL then you have to use the form FOR IN EXECUTE variable, but it is better to write the SQL query directly.
It cannot be fast if there are more than a few persons:
iteration over an expensive cycle body is slow,
string concatenation is expensive.
The output XML can be wrong, because you are missing escaping.
The last two points are solved pretty well by the SQL/XML functions - I'll write only a simple example - but it is a really strong ANSI SQL feature (supported by Postgres).
SELECT xmlelement(NAME "Directory",
xmlagg(xmlelement(NAME "Person",
xmlforest(name AS "Name",
address AS "Address"))))
FROM persons;
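Extending that example to the full document from the question, a sketch that assumes get_factor_1/2/3 accept a person row (as the question's code passes v_rec):
SELECT xmlelement(NAME "Directory",
         xmlagg(
           xmlelement(NAME "Person",
             xmlforest(p.name AS "Name",
                       p.address AS "Address",
                       f.f1 AS "MagicalAddressFactor1",
                       f.f2 AS "MagicalAddressFactor2",
                       f.f3 AS "MagicalAddressFactor3",
                       CASE WHEN f.f1 + f.f2 + f.f3 > 10
                            THEN 'Y' ELSE 'N' END AS "IsMagicalAddress"))))
FROM person p
CROSS JOIN LATERAL
     (SELECT get_factor_1(p) AS f1,
             get_factor_2(p) AS f2,
             get_factor_3(p) AS f3) f;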
I'm trying to create the following XML using a SQL query (Oracle):
<Changes>
<Description>Some static test</Description>
<Notes>Some static test</Notes>
<UserChange>
<Operation>Static Text</Operation>
<User>VALUE from Table - record #1</User>
<BusinessSource>VALUE from Table #1</BusinessSource>
<ApplicationRole>VALUE from Table #1</ApplicationRole>
</UserChange>
<UserChange>
<Operation>Static Text</Operation>
<User>VALUE from Table - record #2</User>
<BusinessSource>VALUE from Table #2</BusinessSource>
<ApplicationRole>VALUE from Table #2</ApplicationRole>
</UserChange>
<UserChange>
<Operation>Static Text</Operation>
<User>VALUE from Table - record #3</User>
<BusinessSource>VALUE from Table #3</BusinessSource>
<ApplicationRole>VALUE from Table #3</ApplicationRole>
</UserChange>
</Changes>
The table I'm using looks like this:
ID USER SOURCE ROLE
1 test1 src1 role1
2 test1 src1 role1
3 test1 src1 role2
4 user2 src role
5 user3 src role
6 user1 src role
I want to write a query that will create a dynamic XML based on the values in the table.
For example:
The query should only take the values where user='test1' and the output will be the following XML:
<Changes>
<Description>Some static test</Description>
<Notes>Some static test</Notes>
<UserChange>
<Operation>Static Text</Operation>
<User>user1</User>
<BusinessSource>src1</BusinessSource>
<ApplicationRole>role1</ApplicationRole>
</UserChange>
<UserChange>
<Operation>Static Text</Operation>
<User>user1</User>
<BusinessSource>src1</BusinessSource>
<ApplicationRole>role1</ApplicationRole>
</UserChange>
<UserChange>
<Operation>Static Text</Operation>
<User>user1</User>
<BusinessSource>src1</BusinessSource>
<ApplicationRole>role2</ApplicationRole>
</UserChange>
</Changes>
I've started to write the query:
SELECT XMLElement("Changes",
XMLElement("Description", 'sometext'),
XMLElement("Notes", 'sometext'),
XMLElement("FulfillmentDate", 'Some Date'),
XMLElement("UserChange",
XMLElement("Operation", 'sometext'),
XMLElement("User", 'sometext'),
XMLElement("BusinessSource", 'sometext'),
XMLElement("ApplicationRole", 'sometext')
)).GETSTRINGVAL() RESULTS
FROM DUAL;
I need to iterate on the other values and make them part of the complete XML.
Appreciate your help.
Thanks
I was able to find a solution:
select XMLElement("Changes",
XMLElement("Description", 'sometext'),
XMLElement("Notes", 'sometext'),
XMLElement("FulfillmentDate", 'Some Date'),
XMLAgg(XML_CANDIDATE) ).GETSTRINGVAL() RESULTS
from
(
select XMLAGG(
XMLElement("UserChange",
XMLElement("Operation", 'sometext'),
XMLElement("User", 'sometext'),
XMLElement("BusinessSource", 'sometext'),
XMLElement("ApplicationRole", 'sometext'))) XML_CANDIDATE
from
your_table);
For future readers, here are open-source programming solutions to transfer a SQL query to an XML document, using the OP's data needs as the example.
The code examples below are not restricted to any SQL dialect (i.e., they are transferrable to other RDBMSs by swapping in the corresponding connection modules; as written below they are Oracle-specific).
For Python (using cx_Oracle and lxml modules):
import os
import cx_Oracle
import lxml.etree as ET
# Set current directory
cd = os.path.dirname(os.path.abspath(__file__))
# DB CONNECTION AND QUERY
db = cx_Oracle.connect("uid/pwd@database")
cur = db.cursor()
cur.execute("SELECT * FROM OracleData where user='test1'")
# WRITING XML FILE
root = ET.Element('Changes')
DescNode = ET.SubElement(root, "Description").text = 'Some static test'
NotesNode = ET.SubElement(root, "Notes").text = 'Some static test'
# LOOPING THROUGH QUERY RESULTS TO WRITE CHILD ELEMENTS
for row in cur.fetchall():
    UCNode = ET.SubElement(root, "UserChange")
    ET.SubElement(UCNode, "Operation").text = 'Static Text'
    ET.SubElement(UCNode, "User").text = row[1]
    ET.SubElement(UCNode, "BusinessSource").text = row[2]
    ET.SubElement(UCNode, "ApplicationRole").text = row[3]
# CLOSE CURSOR AND DATABASE
cur.close()
db.close()
tree_out = (ET.tostring(root, pretty_print=True, xml_declaration=True, encoding="UTF-8"))
xmlfile = open(os.path.join(cd, 'OracleXML.xml'),'wb')
xmlfile.write(tree_out)
xmlfile.close()
For PHP (using PDO Oracle OCI and DOMDocument)
// Set current directory
$cd = dirname(__FILE__);
// create a dom document with encoding utf8
$domtree = new DOMDocument('1.0', 'UTF-8');
$domtree->formatOutput = true;
$domtree->preserveWhiteSpace = false;
// Opening db connection
$db_username = "your_username";
$db_password = "your_password";
$db = "oci:dbname=your_sid";
try {
$dbh = new PDO($db,$db_username,$db_password);
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$sql = "SELECT * FROM OracleData where user='test1'";
$STH = $dbh->query($sql);
$STH->setFetchMode(PDO::FETCH_ASSOC);
}
catch(PDOException $e) {
echo $e->getMessage();
exit;
}
/* create the root element of the xml tree */
$xmlRoot = $domtree->createElement("Changes");
$xmlRoot = $domtree->appendChild($xmlRoot);
$DescNode = $xmlRoot->appendChild($domtree->createElement('Description', 'Some static test'));
$NotesNode = $xmlRoot->appendChild($domtree->createElement('Notes', 'Some static test'));
/* loop query results through child elements */
while($row = $STH->fetch()) {
$UCNode = $xmlRoot->appendChild($domtree->createElement('UserChange'));
$operationNode = $UCNode->appendChild($domtree->createElement('Operation', 'Some static text'));
$userNode = $UCNode->appendChild($domtree->createElement('User', $row['USER']));
$sourceNode = $UCNode->appendChild($domtree->createElement('BusinessSource', $row['SOURCE']));
$roleNode = $UCNode->appendChild($domtree->createElement('ApplicationRole', $row['ROLE']));
}
file_put_contents($cd. "/OracleXML.xml", $domtree->saveXML());
# Closing db connection
$dbh = null;
exit;
For R (using ROracle and XML packages):
library(XML)
library(ROracle)
# SET CURRENT DIRECTORY
setwd("C:\\Path\\To\\R\\Script")
# OPEN DATABASE AND QUERY
drv <- dbDriver("Oracle")
conn <- dbConnect(drv, username = "", password = "", dbname = "")
df <- dbGetQuery(conn, "select * from OracleData where user = 'test1'")
dbDisconnect(conn)
# CREATE XML FILE
doc = newXMLDoc()
root = newXMLNode("Changes", doc = doc)
descNode = newXMLNode("Description", "Some static test", parent = root)
notesNode = newXMLNode("Notes", "Some static test", parent = root)
# WRITE XML NODES AND DATA
for (i in 1:nrow(df)){
UCNode = newXMLNode("UserChange", parent = root)
operationNode = newXMLNode("Operation", "Some static text", parent = UCNode)
userNode = newXMLNode("User", df$USER[i], parent = UCNode)
sourceNode = newXMLNode("BusinessSource", df$SOURCE[i], parent = UCNode)
roleNode = newXMLNode("ApplicationRole", df$ROLE[i], parent = UCNode)
}
# OUTPUT XML CONTENT TO FILE
saveXML(doc, file="OracleXML.xml")
You can use this query:
select xmlelement("Changes",
xmlforest(
'Some Static Text' "Description"
, 'Some Static Text' "Notes")
, xmlagg(
xmlelement("UserChange",
xmlforest('Static Text' "Operation",
"USER" "User",
SOURCE "BusinessSource",
ROLE "ApplicationRole")
)
)
).getclobval()
from your_table
where "USER" = 'test1';
But remember that the XMLAGG function is an aggregate function. In this case every selected column from the table is included in the aggregate, so no GROUP BY is needed. However, if you wanted to include some column of the table outside of the XMLAGG, you would need to include it in a GROUP BY clause, as in the sketch below. Also, since USER is a reserved word, it needs to be surrounded by double quotes to be used as a column reference.
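For instance, a sketch that keeps "USER" outside the aggregate, producing one <Changes> element per user, needs the GROUP BY:
select "USER",
       xmlelement("Changes",
         xmlagg(
           xmlelement("UserChange",
             xmlforest('Static Text' "Operation",
                       SOURCE "BusinessSource",
                       ROLE "ApplicationRole")))).getclobval()
from your_table
group by "USER";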
I currently use INFORMATION_SCHEMA.PARAMETERS to get information about stored procedure parameters. Now I need to know the default values of the parameters. Is there a way to get the default values of the parameters of a given stored procedure?
Parse the SQL code if you are doing it via SQL commands...
The information isn't stored in a system table. In the documentation for sys.parameters (where you'd expect it), under the has_default_value and default_value columns, we're told to parse the SQL:
SQL Server only maintains default values for CLR objects in this catalog view; therefore, this column has a value of 0 for Transact-SQL objects. To view the default value of a parameter in a Transact-SQL object, query the definition column of the sys.sql_modules catalog view, or use the OBJECT_DEFINITION system function.
If has_default_value is 1, the value of this column is the value of the default for the parameter; otherwise, NULL.
To prove:
CREATE PROC dbo.Paramtest (@foo int = 42)
AS
SET NOCOUNT ON;
GO
SELECT OBJECT_NAME(object_id), has_default_value, default_value
FROM sys.parameters
WHERE name = '@foo' AND object_id = OBJECT_ID('dbo.Paramtest')
-- gives Paramtest, 0, NULL
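Following that advice, the default only shows up in the module text, which you can pull with OBJECT_DEFINITION and then parse:
-- Returns the full CREATE PROC text, including "@foo int = 42";
-- the default must then be parsed out of the string.
SELECT OBJECT_DEFINITION(OBJECT_ID('dbo.Paramtest'));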
DECLARE @pSProcName NVARCHAR(MAX) = 'ProcedureName'
DECLARE @SQLTEXT NVARCHAR(MAX), @start INT, @end INT, @SearchCode NVARCHAR(MAX)
SELECT @SQLTEXT = OBJECT_DEFINITION(OBJECT_ID(@pSProcName))
SELECT @start = CHARINDEX('@', @SQLTEXT, 1)
SELECT @end = MIN(val) FROM (
SELECT PATINDEX('%'+CHAR(10)+'AS'+CHAR(10)+'%',@SQLTEXT ) AS val
UNION ALL SELECT PATINDEX('%'+CHAR(10)+'AS'+CHAR(13)+'%',@SQLTEXT )
UNION ALL SELECT PATINDEX('%'+CHAR(13)+'AS'+CHAR(10)+'%',@SQLTEXT )
UNION ALL SELECT PATINDEX('%'+CHAR(13)+'AS'+CHAR(13)+'%',@SQLTEXT )
UNION ALL SELECT PATINDEX('%'+CHAR(10)+'AS'+CHAR(32)+'%',@SQLTEXT )
UNION ALL SELECT PATINDEX('%'+CHAR(32)+'AS'+CHAR(10)+'%',@SQLTEXT )
UNION ALL SELECT PATINDEX('%'+CHAR(32)+'AS'+CHAR(32)+'%',@SQLTEXT )
UNION ALL SELECT PATINDEX('%'+CHAR(13)+'AS'+CHAR(32)+'%',@SQLTEXT )
UNION ALL SELECT PATINDEX('%'+CHAR(32)+'AS'+CHAR(13)+'%',@SQLTEXT )
) S
WHERE S.val <> 0
SELECT @SearchCode = SUBSTRING(@SQLTEXT, @start, @end - @start)
SELECT S2.parameter_id,S2.ParameterName,S2.DataType,S1.Default_Value FROM
(
SELECT CASE WHEN Data like '%=%' then RIGHT(Data,len(Data)-CHARINDEX('=',Data,1)) ELSE '' END as Default_Value,
CASE WHEN Data like '%=%' then 1 ELSE 0 END as Has_default_Value
,Data FROM(
SELECT LTRIM(RTRIM(Split.a.value('.', 'VARCHAR(100)'))) AS Data
FROM
(
SELECT CAST ('<M>' + REPLACE(@SearchCode, ',', '</M><M>') + '</M>' AS XML) AS Data
) AS A CROSS APPLY Data.nodes ('/M') AS Split(a))s
)S1
INNER JOIN
(
SELECT p.parameter_id, p.name AS ParameterName, UPPER(t.name) AS DataType FROM sys.all_parameters p
INNER JOIN sys.types t
ON t.user_type_id = p.user_type_id
WHERE OBJECT_NAME(p.object_id) = @pSProcName
) S2
ON S1.Data LIKE '%'+S2.ParameterName+'%'+S2.DataType+'%'
@N.Dinesh.Reddy has an amazing answer. I reworked it a little to support newlines between parameter definitions, as well as parameters whose default values contain XML characters (e.g. @Parameter1 NVARCHAR(MAX) = N'<').
DECLARE @ProcedureName NVARCHAR(MAX) = N'dbo.TestProcedure';
DECLARE @ObjectId INT = OBJECT_ID(@ProcedureName);
DECLARE @DefinitionText NVARCHAR(MAX) = REPLACE(REPLACE(OBJECT_DEFINITION(@ObjectId), CHAR(10), N' '), CHAR(13), N' ');
DECLARE @FirstParameterIndex INT = CHARINDEX('@', @DefinitionText, 1);
-- Pull out only the parameters, and xml-encode them.
SET @DefinitionText = (SELECT SUBSTRING(@DefinitionText, @FirstParameterIndex, PATINDEX(N'% AS %', @DefinitionText) - @FirstParameterIndex) FOR XML PATH(N''));
-- Find the parameter names.
SELECT b.parameter_id, b.name, b.system_type_id, b.user_type_id, b.max_length, b.is_output, a.has_default_value FROM (
SELECT LEFT(ParameterDefinition, CHARINDEX(N' ', ParameterDefinition, 1)) parameter_name, CAST(CASE WHEN EqualSignIndex = 0 THEN 0 ELSE 1 END AS BIT) has_default_value FROM (
SELECT ParameterDefinition, CHARINDEX(N'=', ParameterDefinition, 1) EqualSignIndex FROM (
SELECT LTRIM(RTRIM(Split.ParameterDefinition.value(N'.', N'NVARCHAR(100)'))) ParameterDefinition FROM (
SELECT CAST(CONCAT('<a>', REPLACE(@DefinitionText, ',', '</a><a>'), '</a>') AS XML) Xml
) a CROSS APPLY Xml.nodes('/a') AS Split(ParameterDefinition)
) a
) a
) a
FULL JOIN sys.all_parameters b ON a.parameter_name = b.name
WHERE b.object_id = @ObjectId;
I poked around the disassembly of SQL Server Management Studio to find out how Microsoft themselves do it, because I wanted to make sure my approach was fully correct.
As the other posters have surmised, and to my surprise (read: appalled shock), SSMS runs a regex against sys.sql_modules.definition to extract the parameter information that isn't available in sys.parameters.
As of SQL Server Management Studio 18, the logic for this lives in Microsoft.SqlServer.SqlEnum.dll, specifically in the class Microsoft.SqlServer.Management.Smo.PostProcessParam. You'll find other useful regexes in there too.
Here are the patterns:
If your procedure was saved with SET QUOTED_IDENTIFIER ON, this pattern is used:
new Regex( "(/\\*(([^/\\*])|(\\*(?=[^/]))|(/(?=[^\\*])))*|(/\\*(?>/\\*(?<DEPTH>)|\\*/(?<-DEPTH>)|(.|[\n])?)*(?(DEPTH)(?!))\\*/)\\*/)|(--[^\n]*)|(\"((\"\")|[^\"])*\")|(//[^\n]*)|(?<delim>\\b((AS)|(RETURNS))\\b)|(?:(?<param>#[\\w_][\\w\\d_$$##]*)((\\s)|((--[^\n]*))|((/\\*(([^/\\*])|(\\*(?=[^/]))|(/(?=[^\\*])))*|(/\\*(?>/\\*(?<DEPTH>)|\\*/(?<-DEPTH>)|(.|[\n])?)*(?(DEPTH)(?!))\\*/)\\*/)))*(AS){0,1})|(?<val>(((\"((\"\")|[^\"])*\"))|((N{0,1}'(('')|[^'])*)')|((0x[0-9a-f]+))|(((\\+|\\-){0,1}((\\d+\\.\\d*)|(\\d*\\.\\d+)|(\\d+))(e((\\+)|(\\-))\\d+){0,1}))|((\\[((\\]\\])|[^\\]])*\\]))|(([\\w_][\\w;\\d_]*))))|(?<comma>,)|(?<eq>=)|(\\([\\d, ]*\\))", RegexOptions.IgnoreCase | RegexOptions.ExplicitCapture);
If your procedure was not saved with SET QUOTED_IDENTIFIER ON, this pattern is used:
new Regex( "(/\\*(([^/\\*])|(\\*(?=[^/]))|(/(?=[^\\*])))*|(/\\*(?>/\\*(?<DEPTH>)|\\*/(?<-DEPTH>)|(.|[\n])?)*(?(DEPTH)(?!))\\*/)\\*/)|(--[^\n]*)|(//[^\n]*)|(?<delim>\\b((AS)|(RETURNS))\\b)|(?:(?<param>#[\\w_][\\w\\d_$$##]*)((\\s)|((--[^\n]*))|((/\\*(([^/\\*])|(\\*(?=[^/]))|(/(?=[^\\*])))*|(/\\*(?>/\\*(?<DEPTH>)|\\*/(?<-DEPTH>)|(.|[\n])?)*(?(DEPTH)(?!))\\*/)\\*/)))*(AS){0,1})|(?<val>(((\"((\"\")|[^\"])*\"))|((N{0,1}'(('')|[^'])*)')|((0x[0-9a-f]+))|(((\\+|\\-){0,1}((\\d+\\.\\d*)|(\\d*\\.\\d+)|(\\d+))(e((\\+)|(\\-))\\d+){0,1}))|((\\[((\\]\\])|[^\\]])*\\]))|(([\\w_][\\w;\\d_]*))))|(?<comma>,)|(?<eq>=)|(\\([\\d, ]*\\))", RegexOptions.IgnoreCase | RegexOptions.ExplicitCapture);
To get the default values of each parameter, run the appropriate regex against sys.sql_modules.definition and watch for the val group.
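Which of the two patterns applies can be checked per module, since the QUOTED_IDENTIFIER setting is recorded at creation time; a minimal sketch (reusing dbo.Paramtest from above):
SELECT m.[definition],
OBJECTPROPERTY(m.object_id, 'ExecIsQuotedIdentOn') AS uses_quoted_identifier
FROM sys.sql_modules AS m
WHERE m.object_id = OBJECT_ID(N'dbo.Paramtest');
-- 1 means use the first pattern, 0 means use the second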
TL;DR:
Here are the full instructions needed to get this working:
Requirements:
On-prem SQL Server (tested with SQL Server 2014, 2016, and 2017). This does not work on Azure SQL because Azure SQL does not support SQL-CLR.
You do not need Visual Studio - just command-line access to csc.exe so you can compile the SQL-CLR assembly yourself.
Copy and paste the GetParameterDefaults.cs file described below.
Run this command in a terminal/command-prompt to compile GetParameterDefaults.cs to a SQL-CLR assembly ProcParamDefs.dll:
csc.exe /noconfig /nowarn:1701,1702,2008 /fullpaths /nostdlib+ /errorreport:prompt /warn:4 /define:DEBUG;TRACE /errorendlocation /preferreduilang:en-US /highentropyva+ /reference:"C:\Program Files (x86)\Reference Assemblies\Microsoft\Framework\.NETFramework\v4.8\mscorlib.dll" /reference:"C:\Program Files (x86)\Reference Assemblies\Microsoft\Framework\.NETFramework\v4.8\System.Data.dll" /reference:"C:\Program Files (x86)\Reference Assemblies\Microsoft\Framework\.NETFramework\v4.8\System.dll" /reference:"C:\Program Files (x86)\Reference Assemblies\Microsoft\Framework\.NETFramework\v4.8\System.Xml.dll" /debug+ /debug:full /optimize- /out:ProcParamDefs.dll /subsystemversion:6.00 /target:library /warnaserror- /utf8output /langversion:7.3 GetParameterDefaults.cs
Locate the ProcParamDefs.dll you created in Step 2 and convert it to a hex-bin string (Base 16).
There are websites that will do it for free, or use this PowerShell command:
( Get-Content ".\ProcParamDefs.dll" | Format-Hex | Select-Object -Expand Bytes | ForEach-Object { '{0:x2}' -f $_ }) -join ''
Copy and paste the Install.sql file (at the end of this post) into an SSMS session.
Replace <PASTE BASE-16 (HEX) DLL HERE KEEP THE LEADING 0x PREFIX> with the Base16/hex from step 3. Note that the leading 0x needs to appear exactly once (it is already part of the template), as does the trailing ;, so it should look something like this:
CREATE ASSEMBLY [ProcParamDefs]
AUTHORIZATION [dbo]
FROM 0x4D5A90000300000004000000FFFF0000B800etc
;
Run it and you should be all set to use the UDFs dbo.GetParameterDefaults, dbo.GetParameterDefaultsByProcedureObjectId, and dbo.ParseParameterDefaultValues.
Note that it only lists parameters with defaults. It will not list parameters without any default defined.
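A usage sketch, reusing dbo.Paramtest from earlier (whose @foo parameter does carry a textual default):
SELECT ParameterName, DefaultValueExpr
FROM dbo.GetParameterDefaults(N'dbo.Paramtest');
-- expected: one row with ParameterName = '@foo' and DefaultValueExpr = '42'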
GetParameterDefaults.cs
using System;
using System.Collections;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using System.Text.RegularExpressions;
using Microsoft.SqlServer.Server;
/// <summary>
/// This SQL-CLR table-valued UDF will parse and expose the default-values for SQL Server stored procedures as this information is not normally available through INFORMATION_SCHEMA nor sys.parameters.<br />
/// By Dai Rees on StackOverflow: https://stackoverflow.com/questions/6992561/is-there-a-solution-for-getting-the-default-value-of-the-parameters-of-a-given-s<br />
/// This also may find its way onto my GitHub eventually too.<br />
/// The crucial regular-expressions inside this UDF were copied from Microsoft's SqlEnum.dll (they're *the exact same* regexes that SSMS uses to show parameter information) btw.<br />
/// I guess the regexes are Microsoft's copyright, but SSMS is given-away for free and Microsoft made no attempt to obscure/hide them, so I guess this is fair-use? If so, then consider my code as MIT licensed, so happy forking!
/// </summary>
public partial class UserDefinedFunctions
{
private static readonly Regex _paramRegexQI = new Regex("(/\\*(([^/\\*])|(\\*(?=[^/]))|(/(?=[^\\*])))*|(/\\*(?>/\\*(?<DEPTH>)|\\*/(?<-DEPTH>)|(.|[\n])?)*(?(DEPTH)(?!))\\*/)\\*/)|(--[^\n]*)|(\"((\"\")|[^\"])*\")|(//[^\n]*)|(?<delim>\\b((AS)|(RETURNS))\\b)|(?:(?<param>@[\\w_][\\w\\d_$$@#]*)((\\s)|((--[^\n]*))|((/\\*(([^/\\*])|(\\*(?=[^/]))|(/(?=[^\\*])))*|(/\\*(?>/\\*(?<DEPTH>)|\\*/(?<-DEPTH>)|(.|[\n])?)*(?(DEPTH)(?!))\\*/)\\*/)))*(AS){0,1})|(?<val>(((\"((\"\")|[^\"])*\"))|((N{0,1}'(('')|[^'])*)')|((0x[0-9a-f]+))|(((\\+|\\-){0,1}((\\d+\\.\\d*)|(\\d*\\.\\d+)|(\\d+))(e((\\+)|(\\-))\\d+){0,1}))|((\\[((\\]\\])|[^\\]])*\\]))|(([\\w_][\\w;\\d_]*))))|(?<comma>,)|(?<eq>=)|(\\([\\d, ]*\\))", RegexOptions.IgnoreCase | RegexOptions.ExplicitCapture | RegexOptions.Compiled );
private static readonly Regex _paramRegex = new Regex("(/\\*(([^/\\*])|(\\*(?=[^/]))|(/(?=[^\\*])))*|(/\\*(?>/\\*(?<DEPTH>)|\\*/(?<-DEPTH>)|(.|[\n])?)*(?(DEPTH)(?!))\\*/)\\*/)|(--[^\n]*)|(//[^\n]*)|(?<delim>\\b((AS)|(RETURNS))\\b)|(?:(?<param>@[\\w_][\\w\\d_$$@#]*)((\\s)|((--[^\n]*))|((/\\*(([^/\\*])|(\\*(?=[^/]))|(/(?=[^\\*])))*|(/\\*(?>/\\*(?<DEPTH>)|\\*/(?<-DEPTH>)|(.|[\n])?)*(?(DEPTH)(?!))\\*/)\\*/)))*(AS){0,1})|(?<val>(((\"((\"\")|[^\"])*\"))|((N{0,1}'(('')|[^'])*)')|((0x[0-9a-f]+))|(((\\+|\\-){0,1}((\\d+\\.\\d*)|(\\d*\\.\\d+)|(\\d+))(e((\\+)|(\\-))\\d+){0,1}))|((\\[((\\]\\])|[^\\]])*\\]))|(([\\w_][\\w;\\d_]*))))|(?<comma>,)|(?<eq>=)|(\\([\\d, ]*\\))", RegexOptions.IgnoreCase | RegexOptions.ExplicitCapture | RegexOptions.Compiled );
private const String _tableDefinition = @"ParameterName nvarchar(100), DefaultValueExpr nvarchar(4000)";
[SqlFunction(
DataAccess = DataAccessKind.Read,
FillRowMethodName = nameof(FillRow),
IsDeterministic = false,
Name = nameof(GetParameterDefaults),
SystemDataAccess = SystemDataAccessKind.Read,
TableDefinition = _tableDefinition
)]
public static IEnumerable/*<(String parameterName, String defaultValueExpr)>*/ GetParameterDefaults( String procedureName )
{
// Despite the fact the function returns an IEnumerable and the SQLCLR docs saying how great streaming is, it's actually very difficult to use `yield return`:
// See here: https://stackoverflow.com/questions/591191/sqlfunction-fails-to-open-context-connection-despite-dataaccesskind-read-present
// SQLCLR will handle ArgumentExceptions just fine:
// https://learn.microsoft.com/en-us/archive/blogs/sqlprogrammability/server-side-error-handling-part-2-errors-and-error-messages
// https://learn.microsoft.com/en-us/archive/blogs/sqlprogrammability/exception-handling-in-sqlclr
if( procedureName is null ) throw new ArgumentNullException( paramName: nameof(procedureName) );
if( String.IsNullOrWhiteSpace( procedureName ) ) throw new ArgumentException( message: "Value cannot be empty nor whitespace.", paramName: nameof(procedureName) );
//
( Boolean ok, String definition, Boolean isQuotedId ) = TryGetProcedureDefinitionFromName( procedureName );
if( !ok )
{
throw new ArgumentException( message: "Could not find the definition of a procedure with the name \"" + procedureName + "\".", paramName: nameof(procedureName) );
}
// We can't do this, boo:
// foreach( var t in ParseParams( definition, quotedId ? _paramRegexQI : _paramRegex ) ) yield return t;
return ParseParameterDefaultValues( definition, isQuotedId );
}
[SqlFunction(
DataAccess = DataAccessKind.Read,
FillRowMethodName = nameof(FillRow),
IsDeterministic = false,
Name = nameof(GetParameterDefaultsByProcedureObjectId),
SystemDataAccess = SystemDataAccessKind.Read,
TableDefinition = _tableDefinition
)]
public static IEnumerable/*<(String parameterName, String defaultValueExpr)>*/ GetParameterDefaultsByProcedureObjectId( Int32 procedureObjectId )
{
( Boolean ok, String definition, Boolean isQuotedId ) = TryGetProcedureDefinitionFromId( procedureObjectId );
if( !ok )
{
throw new ArgumentException( message: "Could not find the definition of a procedure with OBJECT_ID = " + procedureObjectId.ToString(), paramName: nameof(procedureObjectId) );
}
// We can't do this, boo:
// foreach( var t in ParseParams( definition, quotedId ? _paramRegexQI : _paramRegex ) ) yield return t;
return ParseParameterDefaultValues( definition, isQuotedId );
}
[SqlFunction(
DataAccess = DataAccessKind.Read,
FillRowMethodName = nameof(FillRow),
IsDeterministic = false,
Name = nameof(ParseParameterDefaultValues),
SystemDataAccess = SystemDataAccessKind.Read,
TableDefinition = _tableDefinition
)]
public static IEnumerable/*<(String parameterName, String defaultValueExpr)>*/ ParseParameterDefaultValues( String procedureDefinition, Boolean isQuotedId )
{
List<(String parameterName, String defaultValueExpr)> list = new List<(String parameterName, String defaultValueExpr)>();
foreach( (String name, String value) t in ParseParams( procedureDefinition, isQuotedId ? _paramRegexQI : _paramRegex ) )
{
list.Add( t );
}
return list;
}
private static ( Boolean ok, String definition, Boolean quotedId ) TryGetProcedureDefinitionFromName( String procedureName )
{
using( SqlConnection c = new SqlConnection( "context connection=true" ) )
{
c.Open();
using( SqlCommand cmd = c.CreateCommand() )
{
cmd.CommandText = #"
SELECT
c.[definition],
CONVERT( bit, OBJECTPROPERTY( c.object_id, N'ExecIsQuotedIdentOn' ) ) AS IsQuotedId
FROM
sys.sql_modules AS c
WHERE
c.object_id = OBJECT_ID( @procedureName );";
_ = cmd.Parameters.Add( new SqlParameter( "@procedureName", SqlDbType.NVarChar ) { Value = procedureName } );
using( SqlDataReader rdr = cmd.ExecuteReader() )
{
if( rdr.Read() )
{
String definition = rdr.GetString(0);
Boolean quotedId = rdr.GetBoolean(1);
return ( ok: true, definition, quotedId );
}
else
{
// Validate the object-name:
return ( ok: false, definition: null, quotedId: default );
}
}
}
/*
using( SqlCommand cmdPostMortem = c.CreateCommand() )
{
cmdPostMortem.CommandText = #"
SELECT OBJECT_ID( #procedureName ) AS oid";
_ = cmdPostMortem.Parameters.Add( new SqlParameter( "#procedureName", SqlDbType.NVarChar ) { Value = procedureName } );
Object objectId = cmdPostMortem.ExecuteScalar();
if( objectId is null || objectId == DBNull.Value )
{
}
}
*/
}
}
private static ( Boolean ok, String definition, Boolean quotedId ) TryGetProcedureDefinitionFromId( Int32 procedureObjectId )
{
using( SqlConnection c = new SqlConnection( "context connection=true" ) )
{
c.Open();
using( SqlCommand cmd = c.CreateCommand() )
{
cmd.CommandText = #"
SELECT
c.[definition],
CONVERT( bit, OBJECTPROPERTY( c.object_id, N'ExecIsQuotedIdentOn' ) ) AS IsQuotedId
FROM
sys.sql_modules AS c
WHERE
c.object_id = @objId;";
_ = cmd.Parameters.Add( new SqlParameter( "@objId", SqlDbType.Int ) { Value = procedureObjectId } ); // `OBJECT_ID` returns `int` values btw.
using( SqlDataReader rdr = cmd.ExecuteReader() )
{
if( rdr.Read() )
{
String definition = rdr.GetString(0);
Boolean quotedId = rdr.GetBoolean(1);
return ( ok: true, definition, quotedId );
}
else
{
return ( ok: false, definition: null, quotedId: default );
}
}
}
}
}
private static IEnumerable<(String name, String value)> ParseParams( String definition, Regex r )
{
Boolean inParam = false;
String currentParameterName = null;
Match m = r.Match( definition );
while( m.Success && !m.Groups["delim"].Success )
{
if( m.Groups["eq"].Success )
{
inParam = true;
}
if( m.Groups["comma"].Success )
{
inParam = false;
currentParameterName = null;
}
if( inParam && currentParameterName != null && m.Groups["val"].Success )
{
String defaultValue = m.Groups["val"].Value;
inParam = false;
yield return ( currentParameterName, defaultValue );
}
if( m.Groups["param"].Success )
{
currentParameterName = m.Groups["param"].Value;
}
m = m.NextMatch();
}
}
public static void FillRow( Object tupleObj, out SqlString parameterName, out SqlString defaultValueExpr )
{
if( tupleObj is ValueTuple<String,String> vt )
{
parameterName = vt.Item1;
defaultValueExpr = vt.Item2;
}
else if( tupleObj is null )
{
throw new ArgumentNullException( paramName: nameof(tupleObj) );
}
else
{
throw new ArgumentException( message: "Expected first argument to be of type ValueTuple<String,String> but encountered " + tupleObj.GetType().FullName, paramName: nameof(tupleObj) );
}
}
}
Install.sql
/* You may need to run these statements too:
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'clr strict security', 0;
RECONFIGURE;
EXEC sp_configure 'clr enabled', 1;
RECONFIGURE;
GO
*/
CREATE ASSEMBLY [ProcParamDefs] AUTHORIZATION [dbo]
FROM 0x<PASTE BASE-16 (HEX) DLL HERE KEEP THE LEADING 0x PREFIX>
WITH PERMISSION_SET = SAFE
GO
CREATE FUNCTION [dbo].[GetParameterDefaults] (@procedureName [nvarchar](MAX))
RETURNS TABLE (ParameterName nvarchar(100), DefaultValueExpr nvarchar(4000))
AS EXTERNAL NAME [ProcParamDefs].[UserDefinedFunctions].[GetParameterDefaults];
GO
CREATE FUNCTION [dbo].[GetParameterDefaultsByProcedureObjectId] (@procedureObjectId [int])
RETURNS TABLE (ParameterName nvarchar(100), DefaultValueExpr nvarchar(4000))
AS EXTERNAL NAME [ProcParamDefs].[UserDefinedFunctions].[GetParameterDefaultsByProcedureObjectId];
GO
CREATE FUNCTION [dbo].[ParseParameterDefaultValues] (@procedureDefinition [nvarchar](MAX), @isQuotedId [bit])
RETURNS TABLE (ParameterName nvarchar(100), DefaultValueExpr nvarchar(4000))
AS EXTERNAL NAME [ProcParamDefs].[UserDefinedFunctions].[ParseParameterDefaultValues];
GO
As said before, this is not supported via T-SQL.
You'll have to use some kind of programming language to implement this (unless you want to deal with text parsing in T-SQL).
I found it very easy to use SMO for this.
For example, in PowerShell:
param
(
[string]$InstanceName = $env:COMPUTERNAME,
[string]$DBName = "MyDB",
[string]$ProcedureSchema = "dbo",
[string]$ProcedureName = "MyProcedure"
)
[System.Reflection.Assembly]::LoadWithPartialName('Microsoft.SqlServer.SMO') | Out-Null
$serverInstance = New-Object ('Microsoft.SqlServer.Management.Smo.Server') $InstanceName
$serverInstance.SetDefaultInitFields([Microsoft.SqlServer.Management.Smo.StoredProcedure], $false)
$procedure = $serverInstance.Databases[$DBName].StoredProcedures[$ProcedureName, $ProcedureSchema];
$procedure.Parameters | Select-Object Parent, Name, DataType, DefaultValue, @{Name="Properties";Expression={$_.Properties | Select-Object Name, Value}}
Or, using C#:
Server svr = new Server(new ServerConnection(new SqlConnection(ConfigurationManager.ConnectionStrings["MyConnectionString"].ConnectionString)));
svr.SetDefaultInitFields(typeof(StoredProcedure), false);
StoredProcedure sp = svr.Databases["MyDatabase"].StoredProcedures["mySproc", "myScheme"];
Dictionary<string, string> defaultValueLookup = new Dictionary<string, string>();
foreach (StoredProcedureParameter parameter in sp.Parameters)
{
string defaultValue = parameter.DefaultValue;
string parameterName = parameter.Name;
defaultValueLookup.Add(parameterName, defaultValue);
}
(source for the last one: https://stackoverflow.com/a/9977237/3114728)