incremental query vs. continuous query - sql

I know that a continuous query is a query that is registered once and then evaluated continuously over a data stream. But I don't understand what an incremental query is. I am reading about continuous data streams and the way we query for a specific pattern in the stream.
Can anyone explain what an incremental query is? An explanation with an example would be really helpful.
After googling a lot I found some definitions, but none of them explains it clearly.
UPDATE:
I can't find the exact paper in which I first saw this term, but it does appear in this paper, on page 6.

You might already have researched incremental algorithms; I think that is what you're looking for.
I have never heard of an 'incremental' query. However, it sounds a lot like Doctrine's schema update command, described here in Symfony's docs.
Food for thought until someone comes up with a better answer :)
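To make the "incremental algorithm" idea concrete, here is a minimal sketch in Python. It assumes one common reading of the term, which this thread does not confirm: the result is updated from the previous result plus the newly arrived item, instead of being recomputed over everything seen so far. The running average and all names are made up for illustration.

```python
from typing import List


def full_average(history: List[float]) -> float:
    """Re-evaluate the query from scratch over everything seen so far."""
    return sum(history) / len(history)


class IncrementalAverage:
    """Keep a small running state and update it with each new item only."""

    def __init__(self) -> None:
        self.count = 0
        self.total = 0.0

    def update(self, value: float) -> float:
        self.count += 1
        self.total += value
        return self.total / self.count


stream = [4.0, 8.0, 6.0, 2.0]  # hypothetical data stream
history: List[float] = []
inc = IncrementalAverage()
results = []
for value in stream:
    history.append(value)
    # Both approaches give the same answer; the incremental one never
    # looks at old items again.
    results.append((full_average(history), inc.update(value)))
```

In both cases the query is evaluated every time an item arrives; the difference is only in how much work each evaluation repeats.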

Related

Is there a way to compare the similarity between sentences in sql?

Is there a way to compare the similarity between sentences in SQL? I have a large dataset and I need to identify instances where similar words appear in two or more sentences.
How do I tell SQL to only return the values below?
From what I have googled, there may be a way to do this using Full-Text Search and Semantic Search, but I have not been able to find an article that addresses what I am trying to achieve.
Could someone in the group provide an example or point me to an article that could help? Better yet, is what I am trying to do even achievable in SQL?
No, there is not.
Part of the problem is that "similarity" is complex to define, and this requires a program to analyze the sentence, POSSIBLY with months of programming. You give pretty simplistic examples - grats. Even that is not as easy as you think. What about "the small boy wear red t-shirt" - would "small boy" be a difference or not?
This requires a LOT of work, and a LOT of definition, or a LOT of training of possibly a multi layer neural network.
SQL generally is awful at string manipulation - the best you get is SOUNDEX, and that just compares 4 letters of the first word (RTFM, it is actually QUITE interesting how it works, but it makes it absolutely unsuitable for anything like comparing sentences).
So, no - this is simply way outside the scope of anything in SQL, you will have to download the data and use an out of SQL approach (which is also a LOT more fit for this type of work).
You can obviously work around that with simplistic SQL such as @ASH was suggesting - but this is not looking for "similar sentences" but working around specific markers that ARE SPECIFIC FOR YOUR DATA SET. This is overfitting and bypassing the question you have asked.
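As a rough illustration of the "out of SQL" route, here is a minimal Python sketch. It assumes that shared words are an acceptable stand-in for "similar", which is exactly the definition problem raised above, and it uses Jaccard overlap of word sets. The function names and sample sentences are made up.

```python
import re


def tokens(sentence: str) -> set:
    """Lower-case the sentence and split it into a set of words."""
    return set(re.findall(r"[a-z0-9']+", sentence.lower()))


def jaccard(a: str, b: str) -> float:
    """Shared words divided by all distinct words across both sentences."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0  # two empty sentences are treated as identical
    return len(ta & tb) / len(ta | tb)


# Hypothetical sample data.
s1 = "the boy wears a red t-shirt"
s2 = "the small boy wears a red t-shirt"
s3 = "stock prices fell sharply today"

print(jaccard(s1, s2))  # high overlap
print(jaccard(s1, s3))  # no overlap
```

This ignores word order, synonyms and inflection ("wear" vs "wears"), so it is a starting point for experimenting with a threshold, not a real semantic comparison.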
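You can try the SOUNDEX function. Google SOUNDEX and then see whether it works for your case. A query comparing each pair of rows would be:
SELECT a.Sentence, b.Sentence
FROM your_table a
JOIN your_table b
  ON SOUNDEX(a.Sentence) = SOUNDEX(b.Sentence)
WHERE a.Sentence < b.Sentence;  -- skip self-matches and mirrored pairs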

COPY CSV files, then analyze? step by step?

I just started uploading CSV files and creating tables in the database of the company I work for. Would anyone be so kind as to explain the proper steps to ensure the copy was complete and there are no mistakes?
My boss told me a few steps of how they do things:
CREATE TABLE -> COPY
or
INSERT DATA -> CREATE INDEX/CONSTRAINTS (if necessary) -> TABLE ANALYSIS
The table analysis part is the confusing part for me. They told me to analyze the table, then check for errors, then get the estimated rows. What do I do with the estimated rows? I ran ANALYZE table_name, but nothing really shows in the data output.
Please help!
My answer is going to take a slightly different tack.
Clearly your boss has given you instructions and you don't understand them. In my opinion it is important that you go back to your boss and keep asking questions until you understand.
There are a number of important reasons for this:
1. You understand what you are being asked to do (rather than us guessing).
2. If it goes wrong you have done what you were asked, and
3. You might learn something.
The attitude that asking questions ("asking noob questions again") makes you look stupid or ignorant is very dangerous and will, in fact, make you stupid and ignorant.
After 30 years developing some very complex software systems, I still ask questions whenever I don't understand something. The result? In the end, I understand.
This is the only way to actually get better. None of us was born knowing how to do everything.
This sounds like a big misunderstanding. Your boss probably just wants you to run
ANALYZE table_name;
for every table to update the statistics (including row estimates). The query planner uses those statistics to choose how best to execute queries. Read the fine manual about ANALYZE.
Better ask your boss next time if you don't understand instructions.
Typically you'll want to check for a few things:
The number of rows inserted is consistent
Data types are sufficient (ensure you didn't allocate too much or too little space to a variable)
Data types are consistent (i.e. you're using integer data types for integers if that's in the design)
There weren't errors caused by special characters (you shouldn't have this problem if you used a proper delimiter)
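The first checks in that list can be scripted. Below is a minimal sketch in Python that uses an in-memory SQLite database purely as a stand-in for the real server; the payments table, its columns and the CSV content are all hypothetical. On the real database you would load with COPY and run the same kind of COUNT(*) query there.

```python
import csv
import io
import sqlite3

# Hypothetical CSV content; in practice this would be the file you loaded.
csv_text = "id,name,amount\n1,Alice,10.50\n2,Bob,7.25\n3,Carol,n/a\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))

conn = sqlite3.connect(":memory:")
# amount is loaded as text first so that bad values can be inspected.
conn.execute("CREATE TABLE payments (id INTEGER, name TEXT, amount TEXT)")
conn.executemany(
    "INSERT INTO payments (id, name, amount) VALUES (:id, :name, :amount)",
    rows,
)

# Check 1: the table holds as many rows as the CSV has records.
loaded = conn.execute("SELECT COUNT(*) FROM payments").fetchone()[0]


def is_number(text: str) -> bool:
    try:
        float(text)
        return True
    except ValueError:
        return False


# Check 2: list the ids whose amount would not fit a numeric column.
bad_amounts = [
    row_id
    for row_id, amount in conn.execute("SELECT id, amount FROM payments")
    if not is_number(amount)
]

print(loaded == len(rows), bad_amounts)
```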
My assumption is that "get estimate rows" simply means he wants the number of rows returned. I'll leave it up to you to figure out how to determine that.
If the CSV file was created correctly, I wouldn't sweat it too much. Don't be afraid to ask for help or advice from your colleagues, that's how you learn!
Best of luck!

Math needed for Sql Server

I've been working in SQL Server jobs for 2 years now. Although I like it, I sometimes get the feeling that I stall too much on some tasks, and I seem to be easily discouraged by things that involve relatively simple logic. It's as if, at some point, I must repeat a logical condition inside my head more than 2 or 3 times in order to understand it completely.
I have the feeling that this might be because of my lack of math knowledge. Can anyone please let me know what area of mathematics I can study that would improve my SQL Server coding skills?
Thank you.
The field of maths most likely to be useful to you is Boolean logic.
Set theory is a good second choice; however, it will often go into more detail than you are likely to need in understanding most SQL queries.
A quick cheat that you may find useful: if you feed a boolean expression into Wolfram Alpha, it will spit out a truth table for you, which some find a much easier way of visualising the expression.
http://www.wolframalpha.com/input/?i=a+or+not+b
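If you would rather build the truth table yourself, here is a minimal Python sketch for the same expression as in the link, a or not b. The helper name truth_table is made up, and this is plain two-valued logic, so SQL's NULL handling is not covered.

```python
from itertools import product


def truth_table(expr, names):
    """Evaluate expr for every True/False combination of its inputs."""
    rows = []
    for values in product([True, False], repeat=len(names)):
        rows.append((*values, expr(*values)))
    return rows


table = truth_table(lambda a, b: a or not b, ["a", "b"])
for a, b, result in table:
    print(f"{a!s:5} {b!s:5} -> {result}")
```

The only row that comes out False is a = False, b = True, which is a quick way to see what the condition actually excludes.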
I recommend you study symbolic logic.
I'd suggest reading up on set-based math.
See this link: http://weblogs.sqlteam.com/jeffs/archive/2007/04/30/thinking-set-based-or-not.aspx
Set theory helped me somewhat. Studied it in college years before I got into SQL, but being able to think of a bunch of numbers as a semi-amorphous blob of data and not as an ordered list of items really helps.
Get a copy of this book. It should prove to be most useful: The Art of SQL, by Stephane Faroult.

How do you think while formulating Sql Queries. Is it an experience or a concept?

I have been working on SQL Server and front-end coding and have usually faced problems formulating queries.
I do understand most of the SQL concepts needed for formulating queries, but whenever some new functionality comes up that can be done using a SQL query, I usually fail to work it out.
I am very comfortable with SELECT queries using joins and the like, but when it comes to DML operations I usually fail.
I am usually uncomfortable creating any query that I have never written before. Whenever I go for an interview I face this problem.
Is there some concept behind the approach to formulating SQL queries?
Eg.
I need to create a SQL query such that:
A table contains a single column with duplicate records. I need to remove the duplicate records.
I know I can find the solution to this very easily by googling, but I want to know how everyone arrives at the desired result.
Is it a case of practice makes perfect, i.e. once you have done it you will be able to formulate it next time, or is there some logic or concept behind it?
I could have got an answer to the above problem simply by posting it on Stack Overflow, and I would have had one within 5 to 10 minutes, but I want to know the reasoning. How do you work on any new kind of query? Is it mostly experience, or the application of concepts?
Whenever I learn something new in coding, I try to use it wherever I can. But here the situation seems different, perhaps because I am lacking some concepts.
EDIT
How could I test my knowledge and concepts in SQL and related SQL queries?
Typically, the first time you need to open a child proof bottle of pills, you have a hard time, but after that you are prepared for what it might/will entail.
So it is with programming (methinks).
You find problems, research best practices, and beat your head against a couple of rocks, but in the process you will come to have a handy set of tools.
Also, reading what others tried/did is a good way to avoid major obstacles.
All in all, with a lot of practice/coding, you will see patterns quicker, and learn to notice where to make use of what tool.
I have a somewhat methodical method of constructing queries in general, and it is something I use elsewhere with any problem solving I need to do.
The first step is ALWAYS listing out any bits of information I have in a request. Information is essentially anything that tells me something about something.
"A table contain single column having duplicate record. I need to remove duplicate"
I have a table (I'll call it table1)
I have a column on table table1 (I'll call it col1)
I have duplicates in col1 on table table1
I need to remove duplicates.
The next step of my query construction is identifying the action I'll take from the information I have.
I'll look for certain keywords (e.g. remove, create, edit, show, etc...) along with the standard insert, update, delete to determine the action.
In the example this would be DELETE because of remove.
The next step is isolation.
Answer the question "the action determined above should only be valid for ______..?" This part is almost always the most difficult part of constructing any query because it's usually abstract.
In the above example you're listing "duplicate records" as a piece of information, but that's really an abstract concept of something (anything where a specific value is not unique in usage).
Isolation is also where I test my action using a SELECT statement.
Every new query I run gets thrown through a select first!
The next step is execution, or essentially the "how do I get this done" part of a request.
A lot of times you'll figure out the how during the isolation step, but in some instances (yours included) how you isolate something and how you fix it are not the same thing.
Showing duplicated values is different than removing a specific duplicate.
The last step is implementation. This is just where I take everything and make the query...
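Applied to the duplicate example, the isolation and execution steps might look like the sketch below. It runs against an in-memory SQLite database from Python so it can be tried as-is; table1 and col1 are the names chosen above, the sample values are made up, and rowid is a SQLite-specific way to tell otherwise identical rows apart (other databases need a different row identifier, so treat that part as an assumption).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (col1 TEXT)")
conn.executemany(
    "INSERT INTO table1 (col1) VALUES (?)",
    [("a",), ("b",), ("a",), ("c",), ("b",), ("a",)],
)

# Isolation: run it through a SELECT first. Which values are duplicated?
duplicates = conn.execute(
    "SELECT col1, COUNT(*) FROM table1 "
    "GROUP BY col1 HAVING COUNT(*) > 1 ORDER BY col1"
).fetchall()

# Execution: keep the first row of each value and delete the others.
conn.execute(
    "DELETE FROM table1 WHERE rowid NOT IN "
    "(SELECT MIN(rowid) FROM table1 GROUP BY col1)"
)

remaining = [r[0] for r in conn.execute("SELECT col1 FROM table1 ORDER BY col1")]
print(duplicates, remaining)
```

Note how the SELECT that shows the duplicates and the DELETE that removes them are different queries, which is the isolation versus execution point made above.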
Summing it all up... for me to construct a query I'll pick out all information that I have in the request. Using the information I'll figure out what I need to do (the action), and what I need to do it on (isolation). Once I know what I need to do with what I figure out the execution.
Every single time I'm starting a new "query" I'll run it through these general steps to get an idea for what I'm going to do at an abstract level.
For specific implementations of an actual request you'll have to have some knowledge (or access to google) to go further than this.
Kris
I think in the same way I cook dinner. I have some ingredients (tables, columns etc.), some cooking methods (SELECT, UPDATE, INSERT, GROUP BY etc.) then I put them together in the way I know how.
Sometimes I will do something weird and find it tastes horrible, or that it is amazing.
Occasionally I will pick up new recipes from the internet or friends, then use parts of these in my own.
I also save my recipes in handy repositories, broken down into reusable chunks.
On the "Delete a duplicate" example, I'd come to the result by googling it. This scenario is so rare if the DB is designed properly that I wouldn't bother keeping this information in my head. Why bother, when there is a good resource available for me to look it up when I need it?
For other queries, it really is practice makes perfect.
Over time, you get to remember frequently used patterns just because they ARE frequently used. Rare cases should be kept in a reference material. I've simply got too much other stuff to remember.
Find good documentation for your software. I am using MySQL a lot, and MySQL has an excellent documentation site with a decent search function, so you get many answers just by reading the docs. If you do NOT get your answer, at least you are learning something.
Then I set up an example database (or use the one I am working on) and gradually build my SQL. I tend to separate the problem into small pieces and solve it step by step. This is very successful if you are building queries including many JOINs: it is best to start with some particular case and "pollute" your SQL with many conditions like WHERE id = "123", which you take out as you work towards your solution.
The best and fastest way to learn good SQL is to work with someone else, preferably someone who knows more than you, but that is not a necessary condition. It can be replaced by studying mature code written by others.
Your example is a test of how well you understand the DISTINCT keyword and the GROUP BY clause, which are SQL's ways of dealing with duplicate data.
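For reference, a small sketch of what those two features do with duplicated values, run through Python's sqlite3 module so it is self-contained. The table and data are made up, and neither query removes anything from the table; they only change what is returned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (col1 TEXT)")
conn.executemany(
    "INSERT INTO table1 (col1) VALUES (?)", [("x",), ("y",), ("x",), ("x",)]
)

# DISTINCT collapses repeated values in the result set.
distinct_values = [
    r[0] for r in conn.execute("SELECT DISTINCT col1 FROM table1 ORDER BY col1")
]

# GROUP BY does the same collapsing but also lets you aggregate per value.
grouped = conn.execute(
    "SELECT col1, COUNT(*) FROM table1 GROUP BY col1 ORDER BY col1"
).fetchall()

print(distinct_values, grouped)
```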
Examples and experience. You look at other people's examples and you create your own code, and once it clicks, you don't need to think about it again.
I would have a look at the Mere Mortals book - I think it's the one by Hernandez. I remember that when I first started seriously with SQL Server 6.5, moving from manual ISAM databases and Access database systems using VB4, that it was difficult to understand the syntax, the joins and the declarative style. And the SQL queries, while powerful, were very intimidating to understand - because typically, I was looking at generated code in Microsoft Access.
However, once I had developed a relatively systematic approach to building queries in a consistent and straightforward fashion, my skills and confidence quickly moved forward.
From seeing your responses you have two options.
Have a copy of the specification for whatever you're working on (the SQL spec and the documentation for your SQL implementation: SQLite, SQL Server, etc.).
Use Google, SO, books, etc. as a resource to find answers.
You can't formulate an answer to a problem without doing one of the above. The first option is to become well versed in the capabilities of whatever you are working on.
The second option allows you to find answers that you may not even fully know how to ask. Your example is fairly simplistic, so if you read the spec/implementation documentation you would know the answer right away. But there are times when, even if you read the spec/documentation, you don't know the answer. You only know that it IS possible, just not how to do it.
Remember that as far as jobs and supervisors go, being able to resolve a problem is important, but the faster you can do it the better, and that can often be achieved with option 2.

Fastest way to become a MySQL expert?

I have been using MySQL for years, mainly on smaller projects until the last year or so. I'm not sure if it's the nature of the language or my lack of real tutorials that gives me the feeling of being unsure if what I'm writing is the proper way for optimization purposes and scaling purposes.
Although self-taught in PHP, I'm very sure of myself and the code I write, and can easily compare it to others' code.
With MySQL, I'm not sure whether (and in what cases) an INNER JOIN or LEFT JOIN should be used, nor am I aware of the large amount of functionality that it has. While I've written code for databases that handled tens of millions of records, I don't know if it's optimum. I often find that a small tweak will make a query take less than 1/10 of the original time... but how do I know that my current query isn't also slow?
I would like to become completely confident in my ability to optimize databases and make them scalable. Use is not a problem -- I use it on a daily basis in a number of different ways.
So, the question is, what's the path? Reading a book? Website/tutorials? Recommendations?
EXPLAIN is your friend for one. If you learn to use this tool, you should be able to optimize your queries very effectively.
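To show the kind of feedback this gives, here is a minimal sketch that compares the plan for one query before and after adding an index. It uses SQLite's EXPLAIN QUERY PLAN from Python only so that it runs without a server; MySQL's EXPLAIN output has a different format and columns, so treat this as an analogy rather than MySQL output. The movies table and index name are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE movies (id INTEGER PRIMARY KEY, title TEXT, year INTEGER)"
)


def plan(sql: str) -> str:
    # The last column of each plan row is a readable description of a step.
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


query = "SELECT title FROM movies WHERE year = 1999"

before = plan(query)  # no usable index: the whole table is scanned
conn.execute("CREATE INDEX idx_movies_year ON movies (year)")
after = plan(query)  # the planner can now search via the new index

print(before)
print(after)
```

Comparing the two lines is the habit worth building: change one thing, rerun the plan, and see whether the database's strategy actually changed.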
Scan the MySQL manual and read Paul DuBois' MySQL book.
Use EXPLAIN SELECT, SHOW VARIABLES, SHOW STATUS and SHOW PROCESSLIST.
Learn how the query optimizer works.
Optimize your table formats.
Maintain your tables (myisamchk, CHECK TABLE, OPTIMIZE TABLE).
Use MySQL extensions to get things done faster.
Write a MySQL UDF function if you notice that you would need some function in many places.
Don't use GRANT on table level or column level if you don't really need it.
http://dev.mysql.com/tech-resources/presentations/presentation-oscon2000-20000719/index.html
The only way to become an expert in something is experience, and that usually takes time. It also takes good mentors who are better than you, to teach you what you are missing. The problem is you don't know what you don't know.
Research and experience - if you don't have the projects to warrant the research, make them. Make three tables with related data and make up scenarios.
E.g.
Make a table of movies and their data
Make a table of users
Make a table of ratings from users
Spend time learning how joins work, how to get movies of a particular rating range in one query, how to search the movies table (LIKE, regex) - as mentioned, use EXPLAIN to see how different things affect speed. Make a day of it; I guarantee your handle on it will be greatly increased.
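A possible starting point for that exercise is sketched below, using an in-memory SQLite database from Python so it runs anywhere; the same SQL should carry over to MySQL with little change, though that is not tested here. All table names, columns and rows are made up. It also shows the practical difference between INNER JOIN and LEFT JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE movies  (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE users   (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE ratings (user_id  INTEGER REFERENCES users(id),
                      movie_id INTEGER REFERENCES movies(id),
                      stars    INTEGER);
INSERT INTO movies  VALUES (1, 'Alpha'), (2, 'Beta'), (3, 'Gamma');
INSERT INTO users   VALUES (1, 'ann'), (2, 'bob');
INSERT INTO ratings VALUES (1, 1, 5), (2, 1, 4), (1, 2, 2);
""")

# INNER JOIN: only movies with at least one rating in the wanted range.
in_range = conn.execute("""
    SELECT DISTINCT m.title
    FROM movies m
    INNER JOIN ratings r ON r.movie_id = m.id
    WHERE r.stars BETWEEN 4 AND 5
    ORDER BY m.title
""").fetchall()

# LEFT JOIN: every movie, including ones nobody has rated (average is NULL).
averages = conn.execute("""
    SELECT m.title, AVG(r.stars)
    FROM movies m
    LEFT JOIN ratings r ON r.movie_id = m.id
    GROUP BY m.id, m.title
    ORDER BY m.title
""").fetchall()

print(in_range)
print(averages)
```

The unrated movie only shows up in the LEFT JOIN result, which is usually the deciding question when choosing between the two joins: do rows without a match still belong in the answer?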
If you're still struggling for case-scenarios, start looking here on SO for questions and try out those scenarios yourself.
I don't know if MIT open courseware has anything about databases... Well whaddya know? They do: http://ocw.mit.edu/OcwWeb/Electrical-Engineering-and-Computer-Science/6-830Fall-2005/CourseHome/
I would recommend that as one source based only on MIT's reputation. If you can take a formal course from a university you may find that helpful. Also, a good understanding of fundamental discrete mathematics/logic certainly would do no harm.
As others have said, time and practice are the only real approach.
More practically, I found that EXPLAIN worked wonders for me personally. Learning to read the output of that was probably the biggest single leap I made in being able to write efficient queries.
The second thing I found really helpful was SQL Tuning by Dan Tow, which describes a fairly formal methodology for extracting performance. It's a bit involved, but works well in lots of situations. And if nothing else, it will give you a much better understanding of the way joins are processed.
Start with a class like this one: https://www.udemy.com/sql-mysql-databases/
Then use what you've learned to create and manage a number of SQL databases and run queries. Getting to the expert level is really about practice. But of course you need to learn the pieces before you can practice.