SQL parser in C - sql

I want to parse and store the columns and values of a SQL DML (INSERT, UPDATE, DELETE) statement in C. Need the URL of the open source code or a library with which I can link my C program. The platform is SUSE Linux. Have tried to make and use libSQL unsuccessfully. A detailed answer is appreciated. Thanks.
Additional Notes: Please suggest a library/code that I can link with my C program. In my program I want to use the functions of this library to parse and use the tokens for further processing.

You can have a look at the source code for SQLite. It uses a parser called Lemon.
Links:
SQLite architecture
Lemon parser
You can also look at the source code for postgresql-plpython3. Looks like it has a pure C based SQL parser.
Link:
postgresql-plpython3 # github

I would suggest to start from the real parser of a real DBMS. There are several in free software. For instance, the parser of PostgreSQL is in the directory src/backend/parser of the distribution and is written in C and Yacc.

See the Chapter "Parsing SQL" in "Lex & Yacc"(O'Reilly) in google books http://books.google.fr/books?id=YrzpxNYegEkC&lpg=PT1&dq=bison%20flex%20sql%20grammar&client=firefox-a&hl=en&pg=PA109

Have you looked at SQLite ? It certainly does have the code to parse SQL, so maybe you could avoid reimplementing it..

ANTLR can target C, among other languages, and its catalog of premade grammars has a bunch of SQL dialects - notably MySQL and Oracle.

µSQL for C++
What is µSQL ?
µSQL is a SQL parser engine for C++ to develop SQL based applications
easily, and it supports other SQL like domain specific languages such
as UnQL and GQL too. Because µSQL is written only in old standard C++
library such as STL with ANTLR, then you can use it with many C++
compilers and platforms.
Repo on Github

The standalone SQL parser of the Hyrise database system is open source and, even though it is written in C++, it can be accessed from C and is easy to understand and modify. It utilizes bison and flex.

Not sure is there any mature C sql parser can do that easily.
Here is a Java version SQL library can do what you need exactly, it can be used to Retrieve/Refactor table & column name from a complex SQL query easily.

Have you considered writing your own using lex and yacc? (the hacker - hardcore approach)
Not trivial .. but this site might help you get started

Related

pylint equivalent for SQL?

python having pylint
scala having Scalastyle
I searched around but didn't find a style checker for SQL. Does it exist?
Thank you.
You don't require any error checker for Sql, as Sql is not a programming language. They IDE you use will help you to understand the issue in the query and can be formatted accordingly. Please choose appropriate IDE (Sql developer/ db weaver)
I also found this sql style guide
https://www.sqlstyle.guide/
so it's more of a team wide consensus on which SQL style to go with, but not a linter to spot it out.

Sql objects spell checker

I am just wondering that whether there is any utility which can help in determining the spelling mistakes in sql objects.
I can only think of getting information of all the sql objects and the save them in any other system (e.g. excel file) and then run a spell checker on it.
I am looking forward for a better way for this, like any plugin for MSSQL.
You can use spell check functionality built in Office as CLR function. Example see here:
http://msdn.microsoft.com/en-us/library/office/aa537153%28v=office.11%29.aspx
To use this, you need MS Word to be installed on server side.

Any tools to check for duplicate VB.NET code?

I wish to get a quick feeling for how much “copy and paste” coding we have, there are many tools for C# / Java to check for this type of thing. Are there any such tools that work well with VB.NET?
(I have seen what looks like lots of repeated code, but wish to get some number to help me make a case for sorting it out)
Update on progress.
I have just tried Simian.
It does not seem to be able to produce a nicely formatted report I can sent by email
It does not cope when the names of local variables, or parameters etc may have been changed, e.g it just matches on lines of text being the same.
Clone Doctor does not support VB.NET (only C# and VB 6 and lot of other)
October 2010: VB.net added to langauges supported by CloneDR
Clone Detective for Visual Studio only supports C#
SolidSDD - Source Code Duplication Detector only supports C, C++, C# and Java
DuplicateFinder is open source, but otherwise looks very match like Simian, e.g it just works on lines of text
ConQAT - Continuous Quality Assessment Toolkit seems to have a clone detector that works for VB.NET (not tried it yet)
Gendarme is a bit like FXCop and has a AvoidCodeDuplicatedInSameClassRule rule, this looks very promising, as it avoids the problem of working at the text level. Just tried it, it is the best solution so far, pity it does not search with a greater scope.
Before claiming that this question is a duplicate, please check that the other question addresses VB.NET, as a lot of tools that work well for C# don't work so well for VB.NET. (However it would not surprise me if this question is a real duplicate)
CodeRush 11.2 introduced a new feature called Duplicate Detection and Consolidation (DDC)
http://community.devexpress.com/blogs/markmiller/archive/2011/11/29/duplicate-detection-and-consolidation-in-coderush-for-visual-studio.aspx
Make sure to check out the options for it as well, as you can have it run when so many lines are changed, certainly time has passed, etc.
They've posted some decent videos on the DevExpress site too.
Simian: http://www.redhillconsulting.com.au/products/simian/
[I'm the author of CloneDR ("Clone Doctor").]
CloneDR is parameterized by a full grammar for the programming language in question. So it doesn't just match lines. Rather, it can find clones which are syntactically well-formed, with variations that are more than just identifier changes, regardless of where they stop or start in a line.
The engine on which CloneDR rests, The DMS Software Reengineering Toolkit" is a tool for analyzing large scale systems in any programming language, and uses language descriptions to drive the analysis. DMS has a wide variety of language front ends already available.
Presently it has VBScript and VB6 (as dialects of "Visual Basic"). It doesn't have VB.net, but that would be pretty straightforward to do given the DMS infrastructure and our experience with lots of other languages.
So, CloneDR could do this just fine, with a small bit of effort on our part.
EDIT October 2010: VB.net added as a language CloneDR can process.
Atomiq supports vb.net amongst other languages, and the results are nicely presented.
JetBrains published console tool set Resharper Console Tools to run duplication analysis. Once installed it allows you to do the same analysis as TeamCity does and generate duplicates report locally and even include duplicates search into custom build process with MSBuild. This tool does exactly what you need. More details you can find here at JetBrains blog post
Try Simian:
Simian (Similarity Analyser) identifies duplication in Java, C#, C, C++, COBOL, Ruby, JSP, ASP, HTML, XML, Visual Basic, Groovy source code and even plain text files.
I once saw an impressive demo of Pattern Insight; its CP Miner may be what you’re looking for: http://patterninsight.com/products/cp-miner.php. It seems to be language-independent, though I couldn’t find anything explicit about languages other than C/C++.
Roll up your sleeves and write your own parser to use it with CPD?
See the question for the tools I found.

How to create a compiler in vb.net

Before answering this question, understand that I am not asking how to create my own programming language, I am asking how, using vb.net code, I can create a compiler for a language like vb.net itself. Essentially, the user inputs code, they get a .exe. By NO MEANS do I want to write my own language, as it seems other compiler related questions on here have asked. I also do not want to use the vb.net compiler itself, nor do I wish to duplicate the IDE.
The exact purpose of what I wish to do is rather hard to explain, but all I need is a nudge in the right direction for writing a compiler (from scratch if possible) which can simply take input and create a .exe. I have opened .exe files as plain text before (my own programs) to see if I could derive some meaning from what I assumed would be human readable text, yet I was obviously sorely disappointed to see the random ascii, though it is understandable why this is all I found.
I know that a .exe file is simply lines of code, being parsed by the computer it is on, but my question here really boils down to this: What code makes up a .exe? How could I go about making one in a plain text editor if I wanted to? (No, I do not want to do that, but if I understand the process my goals will be much easier to achieve.) What makes an executable file an executable file? Where does the logic of the code fit in?
This is intended to be a programming question as opposed to a computer question, which is why I did not post it on SuperUser. I know plenty of information about the System.IO namespace, so I know how to create a file and write to it, I simply do not know what exactly I would be placing inside this file to get it to work as an executable file.
I am sorry if this question is "confusing", "dumb", or "obvious", but I have not been able to find any information regarding the actual contents of an executable file anywhere.
One of my google searches
Something that looked promising
EDIT: The second link here, while it looked good, was an utter fail. I am not going to waste hours of my time hitting keys and recording the results. "Use the "Alt" and the 3-digit combinations to create symbols that do not appear on the keyboard but that you need in the program." (step 4) How the heck do I know what symbols I need???
Thank you very much for your help, and my apologies if this question is a nooby or "bad" one.
To sum this up simply: I want to create a program in vb.net that can compile code in a particular language to a single executable file. What methods exist that can allow me to do this, and if there are none, how can I go about writing my own from scratch?
What you're asking is a pretty complex question. Sure, at its core it seems pretty basic:
Interpret the code itself
Write out the interpreted code
but each of those steps can be pretty intense. Step 1 should be somewhat achievable with some time and a LOT of elbow grease - you need to parse the code into a number of control statements based upon the specification of the language. Check out http://en.wikipedia.org/wiki/Parsing for more information on this step. In essence you're converting the typed code into a common format that represents the functionality you want.
Once you've parsed the code, the next step would be to take that parsed code and convert it into machine-runnable code (typically assembly, though with VB.NET you can write Microsoft Intermediate Language code as the output and then run it in the CLR). This is what will actually create the executable file in a manner that lets the computer run the program.
Unfortunately, the best advice for solving this problem is to either:
Go purchase several books on a programming language, machine code, assembly language, compilers, and so on, then spend several months or years reading and experimenting until the knowledge you gain from the books results in your writing a successful compiler.
Enroll in a computer science program at a local university. Writing compilers and programming languages are usually covered at a rudimentary level during the second or third year of a BS in computer science, and then in much more depth at a graduate level.
Good luck!
EDIT: If all you're looking for is a way to write code and then write a way to execute it, you might try looking at writing an interpreter for one of the scripting languages that already exist - Ruby, Python, Lua, etc.
Process.Start(String.Format("vbc.exe {0}", sourceFilePath))
Here is one book that explains it all at a very basic level. Including sample code that you can get working in a matter of hours.
As with many programming endevors, it will take you many months of study to accomplish what you want. Good luck!!
Let me start by saying this is totally doable. Ignore the naysayers.
OK, so it seems you bit off more than you could chew here. I would suggest slightly re-evaluating your goal. It seems that you want to learn how compilers work, and writing a VB.net compiler is your project to get started.
Try instead to write an interpreter, which is a much smaller and simpler task. Here's how you do it:
Write a parser for a very very small language, maybe only supporting assignment statements. Use a parser toolkit, like Antlr. This will build an AST ("abstract syntax tree" - ie a few classes or objects representing the program's syntax, without the crap like semi-colons, as a tree), which will look something like this:
then you go through the AST, and just execute it. Google for "AST walker". So for the assignment x = 5, create a hashtable entry for 'x', assign 5 to it, then move on to the next statement.
keep adding features as you go.
Once you've got the full language, you'll probably have learned enough on the way to understand the compiler books. Don't use the dragon book, try Appel's book instead, or Cooper/Torczon. There are online books if you prefer, I've never tried them.
When you go to write the compiler, you'll just be changing the bit that executes the AST into one which generates assembly (or C if you prefer) which will do the same action when it's run.
I'll grant that it seems daunting, but if you stick to a something simple to start, you'll get something working in a few days at most. In a few months, you'll have built it up to something like what you're looking for. Good luck.
To the final part of your question:
It seems that you don't know how executables are made from code. People haven't messed with hex in their compilers since the 80s, and hopefully not much even then.
Basically, after parsing the code, you go through a series of steps which make the code progressively simpler. At the end of that, you have something that is quite close to assembly. You then generate assembly, and the assembler and linker conspire to make it into an executable.
The Visual Basic .NET compiler is shipped for free as part of the .NET Framework - you don't even need the SDK or Express Editions. The compiler for VB (and C#) is located at c:\windows\microsoft.net\framework[version]\vbc.exe (or csc.exe for C#). Therefore, any computer which can run a VB.NET program can compile one. The .NET Framework also includes the System.CodeDom namespace which provides a way from within a program to compile a program, either from a document model or from a string (or file) into a .NET assembly (.exe or .dll) and generate code in both VB and C#.
Regards,
Anthony D. Green | Program Manager | Visual Basic Compiler
You create a compiler for vb.net the same way you create any other compiler. You need a lexer/parser. Entire books have been written on this topic, the most famous probably being The Dragon Book.
To provide a definitive answer: No, you cannot create a decent compiler that will generate an executable file using Notepad. You need a compiler to convert from human-readable text into the machine language (assembly or IL) that a linker or interpreter can then execute.
You can try checking out my tutorial at http://www.icemanind.com
It is a tutorial on creating your own virtual machine and assembler, written in 100% C#.
Cyclone, I'm wondering exactly what are you trying to achieve? You say "the exact purpose of what I wish to do is rather hard to explain" and "essentially, the user inputs code, they get a .exe".
If you just want the user to be able to enter code and then execute it, you might consider an existing scripting language. VBScript is built into Windows and the language is fairly similar to VB.Net, or there are various excellent free languages you can download like Python.
If the user really needs to be able to create a .exe - I think it's likely scripting might do - then why not use an existing free compiler like FreeBasic, or even Visual Basic.Net Express Edition.
I am making a tutorial for that in my website, http://dgblogs.weebly.com. It is written with C# and you can create your own computer programming language!
You will never see this syntax in my tutorials:
Console.Readline();
My website is currently offline by now so, GOOD LUCK!
Thanks,
DgBlogs
Im quite shocked that this question was never fully answered ; Recently i came across some tutorials on the subject of developing a programming language from scratch https://www.youtube.com/c/DmitrySoshnikov-education/playlists
Although the person uses Java Script the technique used for creating the tokenizer and the parser to consume the output from the tokenizer producing the AST for the transcribed language which i would consider to be GOLD! ...
Another tutorial or person https://youtube.com/playlist?list=PLSq9OFrD2Q3DasoOa54Vm9Mr8CATyTbLF
Toby Ho! Has also produce various methods using the Nearly parser (plugin) to build a language much easier but not a Pure code coder like myself(does not like extensions to do the work which i can do myself)....
https://github.com/spydaz/SpydazWeb-AI-_Emulators
I was able to do some different experiments designing a stack machine (runs mini assembly language). for my "Toy" Programming language(basic) could be executed on. (the VM) - Using Visual Basic(My personal Lang - I think in VB)
Since revisiting the tutorials ;
https://youtube.com/playlist?list=PLRAdsfhKI4OWNOSfS7EUu5GRAVmze1t2y
Immo Landwerth! Had a few more experiments This time with C#
A bit more Spaghetti to mull over; but actually revealing (after knowing a lot more);
Yes IT is possible to design a Compiler / VM with VB.net Quite easily ... without external tools .
For myself as an AI developer (CONVERSATIONAL AI / NLP) i was interested in building a compiler for the English language if it would be possible to use the technique for parsing syntax trees and building syntax trees from text using the Tokenizer/Parser to AST Method , as well as designing the transpiler to allow for the syntax tree to be used for translation to other languages. as with today's programming languages merely being a Higher lever language interfacing with a low level(VM) language. in-fact we should be developing more natural language - programming languages and dispelling the more cryptic languages such as C# (Lol) and Java and Go and the like (so many brackets and semi colons / tabs etc - truly not needed!) and that is to say that we programmers still think in code! the journey as a AI developer crosses many domains. forces many off topic research pathways .. So Again a big YES! and we Still could say that ANy programming language can be used to create another programming language it just depends on which language you "THINK IN" .... hence not having too many languages using (i understand most programming languages - but i will always think in VB)...

Creating your own language

If I were looking to create my own language are there any tools that would help me along? I have heard of yacc but I'm wondering how I would implement features that I want in the language.
Closely related questions (all taken by searching on [compiler] on stackoverflow):
Learning Resources on Parsers, Interpreters, and Compilers
Learning to write a compiler
Constructing a simple interpreter
...
And similar topics (from the same search):
Bootstrapping a language
How much of the compiler should we know?
Writing a compiler in its own language
...
Edit: I know the stackoverflow related question search isn't what we'd like it to be, but
did we really need the nth iteration of this topic? Meh!
The first tool I would recommend is the Dragon Book. That is the reference for building compilers. Designing a language is no easy task, implementing it is even more difficult. The dragon book helps there. The book even reference to the standard unix tools lex and yacc. The gnu equivalent tools are called flex and bison. They both generate lexer and parser. There exist also more modern tools for generating lexer and parser, e.g. for java there are ANTLR (I also remember javacc and CUP, but I used myself only ANTLR). The fact that ANTLR combines parser and lexer and that eclipse plugin is availabe make it very comfortable to use. But to compare them, the type of parser you need, and know for what you need them, you should read the Dragon book. There are also other things you have to consider, like runtime environment, programming paradigm, ....
If you have already certain design ideas and need help for a certain step or detail the anwsers could be more helpful.
ANTLR is a very nice parser generator written in Java. There's a terrific book available, too.
I like Flex (Fast Lex) [Lexical scanner]
and Bison (A Hairy Yacc) [Yet another compiler compiler]
Both are free and available on all *NIX installations. For Windows just install cygwin.
But I old school.
By using these tools you can also find the lex rules and yacc gramers for a lot of popular languages on the internet. Thus providing you with a quick way to get up and running and then you can customize the grammers as you go.
Example: Arithmetic expression handling [order of precedence etc is a done to death problem] you can quickly get the grammer for this from the web.
An alternative to think about is to write a front-end extension to GCC.
Non Trivial but if you want a compiled language it saves a lot of work in the code generation section (you will still need to know love and understand flex/bison).
I never finished the complete language, I had used rply and llvmlite implements a simple foxbase language, in https://github.com/acekingke/foxbase_compiler
so if you want use python, rply or llvmlite is helpful.
if you want use golang, goyacc maybe useful. But you should write a lexical analyzer by hard coding by hand. Or you can use https://github.com/acekingke/lexergo to simplify it.