Regex to extract fields and data types from sql statement - sql

I have this sql statement:
CREATE TABLE [dbo].[User]( [UserId] [int] IDENTITY(1,1) NOT NULL,
[FirstName] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL, [MiddleName]
[varchar](50) COLLATE SQL_Latin1_General_CP1_CI_A
What I want is a regex that I can use to get all the field names and their data types.
So it should return something like this:
FirstName varchar
MiddleName varchar
Notes:
The sql statement will always have this format.
I am using .NET to run this regex.

You didn't mention whether the SQL statement is in a string on one line or if it's spanning multiple lines.
Assuming it's on one line, this may fit your request:
' Requires: Imports System.Text.RegularExpressions
Dim input As String = "CREATE TABLE [dbo].[User]( [UserId] [int] IDENTITY(1,1) NOT NULL, " & _
"[FirstName] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL, [MiddleName] " & _
"[varchar](50) COLLATE SQL_Latin1_General_CP1_CI_A"
' Each match is a bracketed field name followed by a bracketed type name.
For Each m As Match In Regex.Matches(input, "\[(?<Field>\w+)\]\s*\[(?<Type>\w+)\]")
    Console.WriteLine("{0} : {1}", m.Groups("Field").Value, m.Groups("Type").Value)
Next

I don't know anything about .NET. In other regex flavors, the following could handle the search portion of the operation:
\[(\w+)\]\s+\[(\w+)\]\((\d+)\)
Use that as the search pattern in your .NET regex call (whatever form that takes) and write your own output code. Here \w+ keeps each capture inside a single pair of brackets, and \s+ already covers line breaks between the name and the type. If line breaks can occur mid-word, this could have problems. Columns without a length in parentheses, such as [UserId] [int], are not matched. Note that the above also captures the type's length, so it would produce
MiddleName varchar 50
To do without the third capture group, either ignore it in your output (wasteful) or use
\[(\w+)\]\s+\[(\w+)\]\(\d+\)
There are lots of fine ways to do it. As usual, just make sure you understand how much the input can vary.
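For readers who want to try the first answer's pattern outside .NET, here is a minimal sketch in Python rather than VB.NET. The function name extract_columns is made up for illustration, and the input is the truncated statement from the question, assumed to be on one line.

```python
import re

# The CREATE TABLE fragment from the question, joined onto one line.
sql = (
    "CREATE TABLE [dbo].[User]( [UserId] [int] IDENTITY(1,1) NOT NULL, "
    "[FirstName] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL, [MiddleName] "
    "[varchar](50) COLLATE SQL_Latin1_General_CP1_CI_A"
)

# A bracketed name followed (after optional whitespace) by a bracketed type.
# [dbo].[User] is skipped because a dot, not whitespace, separates the two.
PATTERN = re.compile(r"\[(\w+)\]\s*\[(\w+)\]")


def extract_columns(statement):
    """Return (field, type) pairs in order of appearance (hypothetical helper)."""
    return PATTERN.findall(statement)


for field, type_name in extract_columns(sql):
    print(field, type_name)
```

Because \s also matches line breaks, the same pattern should cope with a statement that spans several lines, as long as no name or type is split mid-word.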

Related

How to retrieve German characters from a large CSV File into SQL Server 2017 script

I have a CSV file containing a list of employees, some of whose names include German characters like 'ö'. I need to create a temp table in my SQL Server 2017 script and fill it with the content of the CSV file. My script is:
CREATE TABLE #AllAdUsers(
[PhysicalDeliveryOfficeName] [NVARCHAR](255) NULL,
[Name] [NVARCHAR](255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[DisplayName] [NVARCHAR](255) NULL,
[Company] [NVARCHAR](255) NULL,
[SAMAccountName] [NVARCHAR](255) NULL
)
--import AD users
BULK INSERT #AllAdUsers
FROM 'C:\Employees.csv'
WITH
(
FIRSTROW = 2,
FIELDTERMINATOR = ',', --CSV field delimiter
ROWTERMINATOR = '\n', --Use to shift the control to next row
TABLOCK
)
However, even though I use the NVARCHAR data type with the collation SQL_Latin1_General_CP1_CI_AS, the German characters do not come through correctly. For instance, "Kösker" appears as:
"K├╢sker"
I've tried many other collations but couldn't find a fix for it. Any help would be very much appreciated.
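The garbled value is consistent with a UTF-8 file being decoded with the OEM code page 437. That is an assumption about Employees.csv, not something the question confirms. The short Python sketch below reproduces the exact garbling under that assumption.

```python
# Assumption: the CSV is saved as UTF-8 and read as OEM code page 437.
name = "Kösker"

# In UTF-8, 'ö' is stored as the two bytes C3 B6.
utf8_bytes = name.encode("utf-8")

# Decoding those bytes as code page 437 turns each byte into a box-drawing character.
garbled = utf8_bytes.decode("cp437")

print(utf8_bytes)
print(garbled)
```

If that is what is happening, the collation is not the culprit. The CODEPAGE option of BULK INSERT (for example CODEPAGE = '65001' for UTF-8 input) is probably the setting to look at.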

.NET framework error when enabling where clause in sql query

I am facing a weird issue: depending on which condition I enable in the WHERE clause, my SELECT query throws a .NET Framework error.
Here is the CREATE table script.
Table test_classes:
CREATE TABLE [dbo].[test_classes]
(
[CLASSID] [int] NOT NULL,
[PARENTID] [int] NULL,
[CATID] [int] NOT NULL,
[CLASS_NAME] [nvarchar](255) NOT NULL,
[ORIGINAL_NAME] [nvarchar](255) NULL,
[GEOMETRY] [tinyint] NOT NULL,
[READ_ONLY] [bit] NOT NULL,
[DISPLAY_STYLES] [image] NULL,
[FEATURE_COUNT] [int] NOT NULL,
[TEMPOWNER] [int] NULL,
[OPTIONS] [int] NOT NULL,
[POLYGON_TYPE] [int] NULL,
[CLASS_EXTRA] [nvarchar](1024) NULL,
[MAPID] [int] NULL
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
Table test_polygon:
CREATE TABLE [dbo].[test_polygon]
(
[FID] [nvarchar](36) NOT NULL,
[EXTENT_L] [float] NOT NULL,
[EXTENT_T] [float] NOT NULL,
[EXTENT_R] [float] NOT NULL,
[EXTENT_B] [float] NOT NULL,
[COORDINATES] [image] NULL,
[CHAINS] [smallint] NOT NULL,
[CLASSID] [int] NOT NULL,
[SPATIAL_KEY] [bigint] NULL
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
Due to word limitation (due to image datatype), here is the INSERT input: GDrive SQL Link
SELECT SQL query:
select
Class_Name, FID,
geometry::STGeomFromWKB(b1+b2,0) as polygon,
Class_ID, Original_Name
from
(Select
cl.Class_Name, p.FID,
substring(CAST(p.Coordinates AS varbinary(max)),1,1) as b1,
substring(CAST(p.Coordinates AS varbinary(max)),3,999999) as b2,
cl.ClassID as Class_ID,
cl.Original_Name
From
test_polygon p
Inner Join
test_classes cl on cl.ClassID = p.ClassID) s_polygon
--where Class_ID = 215 --Filter#1
--where Class_Name = 'L1_County' --Filter#2
To note, Class_ID 215 represents 'L1_County' class_name.
The problem is that if I enable only Filter#1, the output is as expected, but if I enable only Filter#2, the query fails with a .NET error.
Expected output :
Class_Name FID polygon Class_ID Original_Name
----------- ---------------- ------------- ----------- ------------------------
L1_County Northamptonshire <long value> 215 B8USR_4DB8184E88092424
Error I get :
Msg 6522, Level 16, State 1, Line 4
A .NET Framework error occurred during execution of user-defined routine or aggregate "geometry":
System.FormatException: 24119: The Polygon input is not valid because the start and end points of the exterior ring are not the same. Each ring of a polygon must have the same start and end points.
System.FormatException:
at Microsoft.SqlServer.Types.GeometryValidator.ValidatePolygonRing(Int32 iRing, Int32 cPoints, Double firstX, Double firstY, Double lastX, Double lastY)
at Microsoft.SqlServer.Types.Validator.Execute(Transition transition)
at Microsoft.SqlServer.Types.ForwardingGeoDataSink.EndFigure()
at Microsoft.SqlServer.Types.WellKnownBinaryReader.ReadLineStringPoints(ByteOrder byteOrder, UInt32 cPoints, Boolean readZ, Boolean readM)
at Microsoft.SqlServer.Types.WellKnownBinaryReader.ReadLinearRing(ByteOrder byteOrder, Boolean readZ, Boolean readM)
at Microsoft.SqlServer.Types.WellKnownBinaryReader.ParseWkbPolygonWithoutHeader(ByteOrder byteOrder, Boolean readZ, Boolean readM)
at Microsoft.SqlServer.Types.WellKnownBinaryReader.ParseWkb(OpenGisType type)
at Microsoft.SqlServer.Types.WellKnownBinaryReader.Read(OpenGisType type, Int32 srid)
at Microsoft.SqlServer.Types.SqlGeometry.GeometryFromBinary(OpenGisType type, SqlBytes binary, Int32 srid)
My question is: why do I get the error when the WHERE clause filters on Class_Name, but not when it filters on Class_ID?
I am using SQL Server 2012 Enterprise edition. Error replicates in SQL Server 2008 as well.
edit:
Estimated Execution plan for Filter#1 :
Estimated Execution plan for Filter#2 :
I will summarise comments:
You are seeing this issue because your table contains invalid data. The reason you do not see it when searching by test_polygon.Class_ID is that Class_ID is passed as a predicate to the table scan. When test_classes.Class_Name is used as filter the search predicate is applied to test_classes table.
Since geometry::STGeomFromWKB "Compute Scalar" happens before "Join" it causes all rows of test_polygon to be evaluated by this function, including rows containing invalid data.
Update: Even though the plans look the same, they are not, as predicate conditions are different for different filters (WHERE conditions) and therefore outputs of table scans operators are different.
There is no standard way to force the order of evaluation in a SQL Server query; by design, you are not supposed to.
There are two options:
1. Materialise (store in a table) the result of the sub-query. This simply splits the query into two separate queries: one to find the records, and a second to compute data on the found results. The intermediate results are stored in a (temp) table.
2. Use "hacks" that coerce SQL Server into evaluating the query in a certain way.
Below is an example of a "hack":
select
Class_Name, FID,
CASE WHEN Class_Name = Class_Name THEN geometry::STGeomFromWKB(b1+b2,0) ELSE NULL END as polygon,
Class_ID, Original_Name
from
(Select
cl.Class_Name, p.FID,
substring(CAST(p.Coordinates AS varbinary(max)),1,1) as b1,
substring(CAST(p.Coordinates AS varbinary(max)),3,999999) as b2,
cl.ClassID as Class_ID,
cl.Original_Name
From
test_polygon p
Inner Join
test_classes cl on cl.ClassID = p.ClassID) s_polygon
--where Class_ID = 215 --Filter#1
where Class_Name = 'L1_County' --Filter#2
By adding a dummy CASE expression that looks at test_classes.Class_Name we are forcing SQL Server to evaluate it after the JOIN has been resolved.
The plan:
Useful Article:
http://dataeducation.com/cursors-run-just-fine/
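The first option above (materialising the sub-query) can be sketched as follows. This is a SQLite stand-in driven from Python, not T-SQL. The tables, the sample rows and the parse_ring function (standing in for geometry::STGeomFromWKB) are all made up for illustration.

```python
import sqlite3


def parse_ring(coords):
    # Hypothetical stand-in for geometry::STGeomFromWKB: rejects invalid input.
    if coords == "bad":
        raise ValueError("invalid polygon")
    return "POLYGON(" + coords + ")"


conn = sqlite3.connect(":memory:")
conn.create_function("parse_ring", 1, parse_ring)
conn.executescript("""
    CREATE TABLE classes (classid INTEGER, class_name TEXT);
    CREATE TABLE polygons (fid TEXT, classid INTEGER, coords TEXT);
    INSERT INTO classes VALUES (215, 'L1_County'), (300, 'Other');
    INSERT INTO polygons VALUES ('a', 215, 'ring1'), ('b', 300, 'bad');
""")

# Step 1: find the matching rows and store them, without touching the geometry.
conn.execute("""
    CREATE TEMP TABLE matched AS
    SELECT cl.class_name, p.fid, p.coords
    FROM polygons p
    JOIN classes cl ON cl.classid = p.classid
    WHERE cl.class_name = 'L1_County'
""")

# Step 2: run the fragile conversion only on the stored rows.
rows = conn.execute(
    "SELECT class_name, fid, parse_ring(coords) FROM matched"
).fetchall()
print(rows)
```

Since the second query can only see the rows stored by the first, the row with invalid data never reaches the conversion function, whatever plan the optimizer picks.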

sqlite3 Error Executing SQL From File

I am trying to create tables in an SQLite database with sqlite3.
The command $ sqlite3 mydb < mytables.sql produces the following error: Incomplete SQL: ??C.
mytables.sql is:
CREATE TABLE SizeCulture (
SizeCultureID INTEGER PRIMARY KEY ASC,
SizeID INTEGER NULL,
CultureID TEXT NULL,
Name TEXT NULL,
Description TEXT NULL,
Abbreviation TEXT NULL,
);
CREATE TABLE Size(
SizeID INTEGER PRIMARY KEY ASC ,
Creation TEXT NOT NULL,
Modification TEXT NOT NULL,
Deleted INTEGER NOT NULL,
);
/****** Object: Table [Ordering].[BarCode] Script Date: 11/09/2011 14:58:19 ******/
CREATE TABLE BarCode(
BarCodeID INTEGER PRIMARY KEY ASC NOT NULL,
BarCodeValue TEXT NOT NULL,
);
This was modified from a script generated by SQL Server, where some tables need to be replicated on an Android device.
The above is just a set of repeating create table statements. From what I understand, SQLite follows standard SQL (like MySQL or postgres).
Though I can't test it at the moment, I think it's the trailing commas that are confusing it (for example, the comma at the end of Abbreviation TEXT NULL,). Try removing all those trailing commas.
Edit: To be clear, I'm talking about all of these commas:
Abbreviation TEXT NULL,
...
Deleted INTEGER NOT NULL,
...
BarCodeValue TEXT NOT NULL,
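To check the trailing-comma theory without the original file, here is a small sketch using Python's sqlite3 module on the BarCode table from the question. The helper strip_trailing_commas is hypothetical and deliberately naive: it would also rewrite a ",)" that appears inside a string literal.

```python
import re
import sqlite3

ddl = """CREATE TABLE BarCode(
BarCodeID INTEGER PRIMARY KEY ASC NOT NULL,
BarCodeValue TEXT NOT NULL,
);"""


def strip_trailing_commas(sql):
    # Drop a comma that is followed only by whitespace and a closing parenthesis.
    return re.sub(r",\s*\)", ")", sql)


conn = sqlite3.connect(":memory:")

# The statement as written is rejected by SQLite.
try:
    conn.execute(ddl)
    original_error = None
except sqlite3.OperationalError as exc:
    original_error = str(exc)

# With the trailing comma removed, the table is created.
conn.execute(strip_trailing_commas(ddl))
columns = [row[1] for row in conn.execute("PRAGMA table_info(BarCode)")]

print(original_error)
print(columns)
```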
I had the same problem, but for a different reason (so I'm commenting because Google led me here). Turns out you can also encounter this error if your file has a weird encoding (like UCS-2 instead of UTF8).
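The "??C" in the reported error would fit this explanation: a UTF-16 (UCS-2) file normally starts with a two-byte byte order mark, followed by the first character of the script. The Python sketch below shows those leading bytes and one way to re-encode such content as UTF-8. The sample statement is made up, and little-endian byte order is an assumption.

```python
import codecs

text = "CREATE TABLE Size(SizeID INTEGER PRIMARY KEY ASC);"

# What a little-endian UTF-16 file with a byte order mark would contain.
ucs2_bytes = codecs.BOM_UTF16_LE + text.encode("utf-16-le")

# Two BOM bytes, then 'C' followed by a zero byte.
print(ucs2_bytes[:4])

# Decoding as UTF-16 consumes the BOM; re-encode as UTF-8 for sqlite3.
recovered = ucs2_bytes.decode("utf-16")
utf8_bytes = recovered.encode("utf-8")
print(utf8_bytes[:6])
```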

Computed column based on nullable columns

I want to create a computed column that is the concatenation of several other columns. In the below example, fulladdress is null in the result set when any of the 'real' columns is null. How can I adjust the computed column function to take into account the nullable columns?
CREATE TABLE Locations
(
[id] [int] IDENTITY(1,1) NOT NULL PRIMARY KEY,
[fulladdress] AS (((([address]+[address2])+[city])+[state])+[zip]),
[address] [varchar](50) NULL,
[address2] [varchar](50) NULL,
[city] [varchar](50) NULL,
[state] [varchar](50) NULL,
[zip] [varchar](50) NULL
)
Thanks in advance
This gets messy pretty quick, but here's a start:
ISNULL(address,'') + ' '
+ ISNULL(address2,'') + ' '
+ ISNULL(city,'') + ' '
+ ISNULL(state,'') + ' '
+ ISNULL(zip,'')
(If ISNULL doesn't work, you can try COALESCE. If neither works, share which DBMS you're using.)
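The effect is easy to see in a small experiment. The sketch below uses SQLite through Python as a stand-in, so it is written with || and COALESCE instead of T-SQL's + and ISNULL. The sample row is made up and only three of the columns are used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Locations (address TEXT, address2 TEXT, city TEXT)")
conn.execute("INSERT INTO Locations VALUES ('1 Main St', NULL, 'Springfield')")

# Plain concatenation: a single NULL makes the whole result NULL.
naive = conn.execute(
    "SELECT address || address2 || city FROM Locations"
).fetchone()[0]

# Replacing each NULL with an empty string keeps the other parts.
safe = conn.execute(
    "SELECT COALESCE(address, '') || ' ' || COALESCE(address2, '') || ' ' || COALESCE(city, '') "
    "FROM Locations"
).fetchone()[0]

print(repr(naive))
print(repr(safe))
```

Note the doubled space where address2 is missing. That is part of why this gets messy: tidy output needs the separator to be conditional as well.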
You shouldn't have a full address column (which is a duplicate of other columns) stored in your database unless you have a good reason. The correct way would be to construct the full address string in your queries. By creating the field dynamically you reduce redundancy in the table and you have one less column to maintain (which would need to be updated anytime any other column changes).
In your query you would do something like
SELECT CONCAT(ISNULL(address,''), ISNULL(address2,''), ISNULL(city,''), ISNULL(state,''), ISNULL(zip,'')) AS fulladdress FROM Locations;
The CONCAT() function performs the concatenation, and ISNULL() returns your string if it is not null, or the second parameter (passed here as '') if it is null.

Sql Server - Insufficient result space to convert uniqueidentifier value to char

I am getting the error below when I run a SQL query that copies data from one table to another:
Msg 8170, Level 16, State 2, Line 2
Insufficient result space to convert
uniqueidentifier value to char.
My sql query is,
INSERT INTO dbo.cust_info (
uid,
first_name,
last_name
)
SELECT
NEWID(),
first_name,
last_name
FROM dbo.tmp_cust_info
My create table scripts are,
CREATE TABLE [dbo].[cust_info](
[uid] [varchar](32) NOT NULL,
[first_name] [varchar](100) NULL,
[last_name] [varchar](100) NULL)
CREATE TABLE [dbo].[tmp_cust_info](
[first_name] [varchar](100) NULL,
[last_name] [varchar](100) NULL)
I am sure the problem is with NEWID(): if I take it out and replace it with a string literal, the query works.
I appreciate any help. Thanks in advance.
A guid needs 36 characters (because of the dashes). You only provide a 32 character column. Not enough, hence the error.
You need to use one of 3 alternatives:
1. A uniqueidentifier column, which stores it internally as 16 bytes. When you select from this column, it automatically renders it for display using the 8-4-4-4-12 format.
CREATE TABLE [dbo].[cust_info](
[uid] uniqueidentifier NOT NULL,
[first_name] [varchar](100) NULL,
[last_name] [varchar](100) NULL)
2. (Not recommended) Change the field to char(36) so that it fits the format, including dashes.
CREATE TABLE [dbo].[cust_info](
[uid] char(36) NOT NULL,
[first_name] [varchar](100) NULL,
[last_name] [varchar](100) NULL)
3. (Not recommended) Store it without the dashes, as just the 32 hexadecimal characters.
INSERT INTO dbo.cust_info (
uid,
first_name,
last_name
)
SELECT
replace(NEWID(),'-',''),
first_name,
last_name
FROM dbo.tmp_cust_info
I received this error when I was trying to perform simple string concatenation on the GUID. Apparently a VARCHAR without an explicit length is not big enough.
I had to change:
SET @foo = 'Old GUID: {' + CONVERT(VARCHAR, @guid) + '}';
to:
SET @foo = 'Old GUID: {' + CONVERT(NVARCHAR(36), @guid) + '}';
...and all was good. Huge thanks to the prior answers on this one!
Increase the length of your uid column from varchar(32) to varchar(36), because a GUID takes 36 characters.
In .NET, Guid.NewGuid().ToString() returns 36 characters.
Example output: 12345678-1234-1234-1234-123456789abc
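The character counts quoted in these answers are easy to verify. Here is a quick check using Python's uuid module as a stand-in for NEWID() and Guid.NewGuid(); the value itself is random.

```python
import uuid

guid = uuid.uuid4()

# Canonical 8-4-4-4-12 form: 32 hex digits plus 4 dashes.
dashed = str(guid)

# The same value without dashes.
compact = guid.hex

print(dashed, len(dashed))
print(compact, len(compact))
```

So a varchar(32) column can only hold the dash-free form, while the default textual form needs 36 characters.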
You can try this; it worked for me.
Specify a length for VARCHAR when you cast or convert a value. For a uniqueidentifier, use VARCHAR(36) as below:
SELECT CONVERT(varchar(36), NEWID()) AS NEWID
If no length is specified in CAST/CONVERT, VARCHAR defaults to a length of 30, which is too short for the 36 characters of a GUID.
Credit : Krishnakumar S
Reference : https://social.msdn.microsoft.com/Forums/en-US/fb24a153-f468-4e18-afb8-60ce90b55234/insufficient-result-space-to-convert-uniqueidentifier-value-to-char?forum=transactsql